Revealing the Complexities of Metabarcoding with a Diverse Arthropod Mock Community
Abstract
DNA metabarcoding is an attractive approach for monitoring biodiversity. However, it is subject to biases that often impede detection of all species in a sample. In particular, the proportion of sequences recovered from each species depends on its biomass, its mitome copy number, and the primer set employed for PCR. To examine these variables, we constructed a mock community of terrestrial arthropods comprising 374 BINs (Barcode Index Numbers), a species proxy. We used this community to examine how species recovery was affected when amplicon pools were constructed in four ways. The first two protocols involved the construction of bulk DNA extracts from different body partitions (Bulk Abdomen, Bulk Leg). The other two protocols involved DNA extracts from single legs, which were either merged prior to PCR (Composite Leg) or PCR-amplified separately and then pooled (Single Leg). The amplicons generated by these four treatments were then sequenced on three platforms (Illumina MiSeq, Ion Torrent PGM, and Ion Torrent S5). The choice of sequencing platform did not substantially influence species recovery, but other variables did. As expected, the best recovery was obtained from the Single Leg treatment, but the Bulk Abdomen treatment produced a more uniform read abundance than the Bulk Leg or Composite Leg treatments. Primer choice also influenced species recovery. Our results reveal how variation in protocols can have substantive impacts on perceived diversity unless sequencing coverage is sufficient to reach an asymptote. Although metabarcoding is a powerful approach, further optimization of analytical protocols is crucial to obtain reproducible results and increase its cost-effectiveness.