WATBIO is a collaborative research project, funded under the European Union's Seventh Framework Programme (FP7), focused on developing drought-tolerant biomass crops for Europe. IGATech is the leading SME partner in WATBIO and conducts all the next-generation sequencing work. A large proportion of the WATBIO DNA and RNA sequencing work is done on poplar, and the project has enabled IGATech to drive significant innovation in the type of sequencing involved.
A genome-wide association study (GWAS) involves scanning the entire genome across a wide range of individuals to find associations between genetic variants and observed traits (the phenotype). Given the high number of genes in a complex organism, this is like searching for a needle in a haystack, so the number of genomic regions tested needs to be extremely high to tackle the problem properly. Here, the challenge presented by WATBIO was to provide adequate genotypic data for an ambitious study of black poplar. A previously developed array was available, but it did not carry enough genetic markers. Using markers that are not tightly linked to functional genes risks low statistical power, and unexpected diversity may need to be explored.
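The marker-by-marker scan described above can be illustrated with a minimal sketch. It assumes genotypes are coded as allele dosages (0, 1 or 2) and scores each marker by the squared Pearson correlation with a quantitative trait. The marker names, the toy data and the helper functions `pearson_r` and `scan_markers` are hypothetical and are not part of the WATBIO analysis, which would also need to account for population structure and multiple testing.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # monomorphic marker or constant trait: no signal
    return cov / sqrt(var_x * var_y)


def scan_markers(genotypes, phenotype):
    """Return (marker, r^2) pairs sorted from strongest to weakest association."""
    scores = [
        (name, pearson_r(dosages, phenotype) ** 2)
        for name, dosages in genotypes.items()
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy data (invented for illustration): allele dosage for six individuals
genotypes = {
    "snp_a": [0, 1, 2, 0, 1, 2],
    "snp_b": [0, 0, 1, 1, 2, 2],
    "snp_c": [1, 1, 1, 1, 1, 1],  # monomorphic, carries no information
}
phenotype = [1.0, 2.1, 2.9, 1.2, 1.9, 3.1]

ranked = scan_markers(genotypes, phenotype)
for name, r2 in ranked:
    print(f"{name}: r^2 = {r2:.3f}")
```

In a real study the same test is repeated over hundreds of thousands of markers, which is why marker density and linkage to functional genes matter for statistical power.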
IGATech pioneered the use of the enrichment technology developed by NuGEN Technologies Inc., single primer enrichment technology (SPET), which is based on single-primer extension. This NuGEN technology was developed to deliver sequencing libraries uniformly enriched for genomic features such as genes or exons in model organisms. We, however, needed to optimize sequencing coverage at a very large number of well-defined nucleotide positions. With minor modifications to the original protocols, we adapted the standard preparation kit to shift the assay from an exploratory paradigm to a focused one, ensuring that most of the sequences are informative for a chosen set of polymorphisms. The polymorphisms in our original target set (90,000) were all located in or near coding genes, covering all the coding genes of the black poplar genome, and were chosen from the limited sequencing data available for some accessions of the germplasm.
The degree of genome complexity reduction differs between approaches according to the level of genotyping required, and GWAS is among the most demanding applications.
For the first time, a single primer enrichment system enabled the genotyping of both target and non-target polymorphisms in black poplar at more than 500k nucleotide positions: some 66k came from the target set of 90k, while the remainder were identified de novo from the sequence data at each locus. Data quality was confirmed by the successful discovery, by project partners at the University of Southampton, of markers associated with the phenotypic traits under study.
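The figures quoted above imply a simple breakdown, sketched here as a back-of-the-envelope calculation. It treats "more than 500k" as a lower bound of 500,000 and "some 66k" as 66,000, so the derived values are approximate; the variable names are illustrative only.

```python
# Rounded figures taken from the text; both are approximations.
target_designed = 90_000    # polymorphisms in the original target set
target_genotyped = 66_000   # "some 66k" of the targets were genotyped
total_genotyped = 500_000   # "more than 500k", used here as a lower bound

# Fraction of the designed targets that yielded a genotype
target_recovery = target_genotyped / target_designed

# Lower bound on polymorphisms discovered de novo around the target loci
de_novo_min = total_genotyped - target_genotyped

# Share of all genotyped positions that were not in the original design
de_novo_share = de_novo_min / total_genotyped

print(f"target recovery: {target_recovery:.1%}")
print(f"de novo polymorphisms: at least {de_novo_min:,}")
print(f"de novo share of all genotyped positions: {de_novo_share:.1%}")
```

Under these assumptions, roughly three quarters of the designed targets were recovered, and most genotyped positions were discovered from the sequence reads rather than specified in advance.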
Following this pilot experiment, NuGEN decided to promote its technology for large-scale genotyping as well, optimizing the kit ad hoc to reduce sequencing costs for this type of application and to improve multiplexing capacity for very large cohorts of samples. IGATech now routinely offers genotyping services using this technology, at reduced cost thanks to further optimization of the chemistry, a faster protocol and the higher throughput now available from the manufacturer. This method has a high chance of becoming a leading genotyping-by-sequencing solution for medium to large numbers of polymorphisms (5,000-90,000) in medium to large cohorts of samples (>500).