All in one: structural, nucleotide, and epigenetic variations via ultra-long nanopore reads with phase information.
Third-generation sequencing technologies, such as those from Oxford Nanopore Technologies (ONT), have overcome the limits of earlier short-read sequencing, increasing the capacity to detect, annotate, and phase variants of all sizes, from SNPs to SVs. Additional information in the raw data, such as the methylation state of cytosines, can also be extracted and phased, so that several layers of information come from a single type of data. In this case study we tested the ability to detect genetic and epigenetic variation (SNPs, SVs, and methylation levels) and to phase it using a single set of ultra-long reads per sample, obtained with a PromethION 2 sequencer. We compared ONT reads from two varieties of Vitis vinifera, for which T2T haplotype-phased de novo assemblies have also been produced, against the genome of a third, highly homozygous line of the same species. The pipeline combines several tools (Clair3, WhatsHap, Sniffles, cuteSV, modkit) to obtain robust results and phase information. With this approach, SNPs and SVs were assigned precisely to one haplotype or the other, and methylation state was linked to haplotype-specific structural variants as well as to highly repetitive regions (e.g. centromeres and gene clusters). The relatively low cost of ultra-long nanopore reads and the multiple data types obtainable from this technology make the proposed pipeline a competitive alternative to de novo genome assembly, both economically and in terms of the information derived.
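As a rough illustration of what linking methylation to haplotypes involves, the minimal Python sketch below computes a methylation level per haplotype and position. It assumes that per-read methylation calls have already been extracted and that each read already carries a haplotype label, for example from a phased alignment. The function name, the tuple layout, and the toy values are hypothetical and are not part of the pipeline presented in the talk.

```python
from collections import defaultdict


def haplotype_methylation(calls):
    """Return the fraction of methylated calls per (haplotype, position).

    `calls` is an iterable of (haplotype, position, is_methylated) tuples,
    one tuple per read covering that cytosine (hypothetical layout).
    """
    # (haplotype, position) -> [methylated count, total count]
    counts = defaultdict(lambda: [0, 0])
    for hap, pos, is_meth in calls:
        entry = counts[(hap, pos)]
        entry[0] += int(is_meth)
        entry[1] += 1
    return {key: meth / total for key, (meth, total) in counts.items()}


# Toy input, for illustration only: five reads covering one cytosine.
toy_calls = [
    (1, 100, True), (1, 100, True), (1, 100, False),  # haplotype 1
    (2, 100, False), (2, 100, False),                 # haplotype 2
]
levels = haplotype_methylation(toy_calls)
```

With this toy input, the level differs between the two haplotypes at the same position, which is the kind of haplotype-specific signal the abstract describes.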
Introduction: Davide Scaglione, PhD - CEO - IGA Technology Services
Speaker: Mario Liva, PhD Student - IGA Technology Services