16S rRNA sequencing is a fast and cost-effective approach for investigating microbial communities, yet it has some inefficiencies. The major problem is eukaryotic host contamination: more than 80% of 16S reads can be wasted on off-target amplification of the host’s genome. While this problem is well known to those who work with plants, it seems to be underestimated in human microbiome research.
In fact, according to Walker et al., human contamination in 16S rRNA metabarcoding is quite common. The amount of off-target amplification depends on the ratio of human to bacterial DNA, so the issue has been underreported, mostly because it does not affect stool and skin samples, which contain lower amounts of human DNA (stool <10% and skin <90%). Nevertheless, contamination can heavily impact the analysis of biopsies, where over 97% of the DNA present is of human origin.
How can it be fixed?
The first remedy is primer choice. Degenerate 16S primers increase coverage of different bacterial species, but they also allow off-target amplification of host DNA. Primers need to be generic enough to amplify target DNA from most taxa (including those still unknown), but also specific enough to avoid amplifying DNA from the non-target host.
In breast cancer biopsy samples, for example, a primer set targeting the V1–V2 region performs much better than one targeting the V3–V4 region, yielding on average 80% fewer reads that align to the human genome.
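To make the trade-off behind degenerate primers more concrete, here is a minimal Python sketch that expands a degenerate primer into every concrete sequence it encodes, using the standard IUPAC ambiguity codes. The primer string is a made-up toy sequence, not a real 16S primer, and the function names are our own. Each additional variant is one more oligo that could, in principle, anneal somewhere in the host genome.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count

def expand_primer(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]

# Hypothetical toy primer (NOT a real 16S primer): N = any base, W = A or T.
toy_primer = "ACGNGGCWG"
print(degeneracy(toy_primer), "variants")
for variant in expand_primer(toy_primer):
    print(variant)
```

In practice, each expanded variant could then be checked against the host genome with an alignment tool of your choice to see which ones are likely to cause off-target amplification.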
Another possibility is the use of peptide nucleic acid (PNA) PCR clamps, i.e. synthetic oligomers that bind specifically to a targeted region of the host genome and physically block its amplification during PCR. We successfully used this system to substantially enrich microbial sequences from plant samples, without introducing bias.
Finally, you can sequence more, i.e. produce more reads per sample to compensate for the low yield. It will certainly cost more, but it will avoid loss of data.
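As a rough guide to how much more sequencing is needed, the following Python sketch estimates the total reads per sample required to reach a desired number of bacterial reads. It assumes, as a simplification, that host reads make up a fixed fraction of the output regardless of depth; the function name and the target of 10,000 bacterial reads are illustrative choices of ours, not recommendations.

```python
import math
from fractions import Fraction

def required_total_reads(target_bacterial_reads, host_fraction):
    """Total reads needed so that the non-host share reaches the target.

    host_fraction is the expected share of reads lost to host
    amplification, e.g. 0.8 for 80%. Assumes the share is constant
    with sequencing depth (a simplification).
    """
    if not 0 <= host_fraction < 1:
        raise ValueError("host_fraction must be in the range [0, 1)")
    # Exact arithmetic avoids floating-point rounding before the ceiling.
    usable = 1 - Fraction(str(host_fraction))
    return math.ceil(target_bacterial_reads / usable)

# Illustrative target: 10,000 bacterial reads per sample.
for host_fraction in (0.1, 0.8, 0.97):
    total = required_total_reads(10_000, host_fraction)
    print(f"host fraction {host_fraction:.0%}: {total} total reads")
```

With 80% of reads lost to the host, five times as many reads are needed as in a contamination-free run, which gives a sense of how quickly the extra sequencing cost grows.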
We hope that this post will help you plan your experiment efficiently.