High throughput amplicon sequencing to assess within- and between-host genetic diversity in plant viruses

MAUI-seq: Metabarcoding using amplicons with unique molecular identifiers to improve error correction

10.1101/538587 ◽

2019 ◽

Author(s):

Bryden Fields ◽

Sara Moeskjær ◽

Ville-Petri Friman ◽

Stig U. Andersen ◽

J. Peter W. Young

Keyword(s):

Genetic Diversity ◽

16S rRNA ◽

Error Correction ◽

High Throughput ◽

Environmental DNA ◽

Amplicon Sequencing ◽

Efficient Elimination

Abstract. Background: Sequencing and PCR errors are a major challenge when characterising genetic diversity using high-throughput amplicon sequencing (HTAS). Results: We have developed a multiplexed HTAS method, MAUI-seq, which uses unique molecular identifiers (UMIs) to improve error correction by exploiting variation among sequences associated with a single UMI. We show that two main advantages of this approach are efficient elimination of chimeric and other erroneous reads, outperforming DADA2 and UNOISE3, and the ability to confidently recognise genuine alleles that are present at low abundance or resemble chimeras. Conclusions: The method provides sensitive and flexible profiling of diversity and is readily adaptable to most HTAS applications, including microbial 16S rRNA profiling and metabarcoding of environmental DNA.
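
The idea of exploiting variation among sequences that share a UMI can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `umi_filter`, the input format, and the `max_secondary_ratio` value of 0.7 are assumptions made for illustration. For each UMI, the most frequent sequence is counted as "primary" and the runner-up as "secondary"; a sequence that turns up as a secondary too often relative to its primary count is treated as a likely error.

```python
from collections import Counter, defaultdict


def umi_filter(reads, max_secondary_ratio=0.7):
    """Illustrative UMI-based filter (hypothetical, not the MAUI-seq code).

    reads: iterable of (umi, sequence) pairs.
    Returns {sequence: number of UMIs in which it was the primary sequence}
    for sequences that pass the secondary/primary ratio check.
    """
    # Count how often each sequence was read for each UMI.
    by_umi = defaultdict(Counter)
    for umi, seq in reads:
        by_umi[umi][seq] += 1

    primary = Counter()
    secondary = Counter()
    for counts in by_umi.values():
        ranked = counts.most_common(2)
        # Skip UMIs whose two most frequent sequences tie (ambiguous).
        if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
            continue
        primary[ranked[0][0]] += 1
        if len(ranked) == 2:
            secondary[ranked[1][0]] += 1

    # Keep sequences that are rarely seen as the minority within a UMI.
    return {
        seq: n
        for seq, n in primary.items()
        if secondary[seq] <= max_secondary_ratio * n
    }
```

A sequence produced mainly by PCR or sequencing errors tends to appear as the minority sequence under UMIs dominated by its parent, so its secondary count is high relative to its primary count and it is dropped.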

Handling of targeted amplicon sequencing data focusing on index hopping and demultiplexing using a nested metabarcoding approach in ecology

Scientific Reports ◽

10.1038/s41598-021-98018-4 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Yasemin Guenay-Greunke ◽

David A. Bohan ◽

Michael Traugott ◽

Corinna Wallinger

Keyword(s):

High Throughput ◽

High Throughput Sequencing ◽

Cost Effective ◽

Amplicon Sequencing ◽

Sequencing Depth ◽

Sequencing Error ◽

Sequencing Data ◽

Large Sample ◽

Sequencing Errors ◽

Plant Feeding

Abstract. High-throughput sequencing platforms are increasingly being used for targeted amplicon sequencing because they enable cost-effective sequencing of large sample sets. For meaningful interpretation of targeted amplicon sequencing data and comparison between studies, it is critical that bioinformatic analyses do not introduce artefacts and rely on detailed protocols to ensure that all methods are properly performed and documented. The analysis of large sample sets and the use of predefined indexes create challenges, such as adjusting the sequencing depth across samples and taking sequencing errors or index hopping into account. However, the potential biases these factors introduce to high-throughput amplicon sequencing data sets and how they may be overcome have rarely been addressed. Using the example of a nested metabarcoding analysis of 1920 carabid beetle regurgitates to assess plant feeding, we investigated: (i) the variation in sequencing depth of individually tagged samples and the effect of library preparation on the data output; (ii) the influence of sequencing errors within index regions and its consequences for demultiplexing; and (iii) the effect of index hopping. Our results demonstrate that despite library quantification, large variation in read counts and sequencing depth occurred among samples and that the sequencing error rate in bioinformatic software is essential for accurate adapter/primer trimming and demultiplexing. Moreover, setting an index hopping threshold to avoid incorrect assignment of samples is highly recommended.
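
One simple way to apply an index hopping threshold is sketched below. The abstract does not say how the threshold was defined, so the rule used here (drop a per-sample detection when its read count is below a fixed fraction of that sequence's total reads across the run), the function name `apply_hopping_threshold`, and the default `fraction` of 0.001 are all assumptions.

```python
def apply_hopping_threshold(counts, fraction=0.001):
    """Illustrative index-hopping filter (hypothetical rule and default).

    counts: dict mapping (sample, sequence_id) -> read count.
    Returns a new dict without detections whose read count is below
    `fraction` of the total reads of that sequence across all samples.
    """
    # Total reads per sequence across the whole run.
    totals = {}
    for (_sample, seq_id), n in counts.items():
        totals[seq_id] = totals.get(seq_id, 0) + n

    # Keep only detections at or above the per-sequence threshold.
    return {
        key: n
        for key, n in counts.items()
        if n >= fraction * totals[key[1]]
    }
```

The reasoning behind this kind of rule is that a sequence abundant in one sample can leak into others at a low rate, so very small counts of a sequence that is abundant elsewhere in the same run are treated as suspect.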

Next-generation sequencing reveals cryptic Symbiodinium diversity within Orbicella faveolata and Orbicella franksi at the Flower Garden Banks, Gulf of Mexico

10.7287/peerj.preprints.246 ◽

2014 ◽

Author(s):

Elizabeth Green ◽

Sarah W. Davies ◽

Mikhail V. Matz ◽

Mónica Medina

Keyword(s):

Genetic Diversity ◽

Local Community ◽

Coral Species ◽

Geographic Location ◽

Amplicon Sequencing ◽

Coral Host ◽

Clade B ◽

Orbicella Faveolata ◽

Host Genetic ◽

Physiological Variations

The genetic composition of the resident Symbiodinium endosymbionts appears to strongly modulate the physiological performance of reef-building corals. Here, we used deep amplicon sequencing to quantitatively assess Symbiodinium genetic diversity for the two mountainous star corals, Orbicella franksi and Orbicella faveolata, from two reefs separated by 19 kilometers of deep water. We aimed to determine if symbiont diversity is largely partitioned with respect to coral host species or geographic location. Our results demonstrate that across the two reefs both coral species contained only Symbiodinium identifiable as clade B type B1, represented by five distinct haplotypes. Three of these haplotypes have not been previously described and may be endemic to the Flower Garden Banks. No consistent differences in symbiont composition were detected between the two coral species. However, significant quantitative differences were observed between the east and west banks for two of the five haplotypes. These results highlight the need for consistent molecular genotyping techniques to assess local community assemblages of Symbiodinium-host relationships, which could be largely irrespective of host genetic background. This deep-sequencing approach used to sensitively characterize cryptic genetic diversity of Symbiodinium will potentially contribute to the understanding of physiological variations among coral populations.

MAUI-seq: Metabarcoding using amplicons with unique molecular identifiers to improve error correction

10.21203/rs.2.21630/v1 ◽

2020 ◽

Author(s):

Bryden Fields ◽

Sara Moeskjær ◽

Ville-Petri Friman ◽

Stig U. Andersen ◽

J. Peter W. Young

Keyword(s):

Genetic Diversity ◽

16S rRNA ◽

Error Correction ◽

High Throughput ◽

Environmental DNA ◽

Amplicon Sequencing ◽

Efficient Elimination

Abstract. Background: Sequencing and PCR errors are a major challenge when characterising genetic diversity using high-throughput amplicon sequencing (HTAS). Results: We have developed a multiplexed HTAS method, MAUI-seq, which uses unique molecular identifiers (UMIs) to improve error correction by exploiting variation among sequences associated with a single UMI. We show that two main advantages of this approach are efficient elimination of chimeric and other erroneous reads, outperforming DADA2 and UNOISE3, and the ability to confidently recognise genuine alleles that are present at low abundance or resemble chimeras. Conclusions: The method provides sensitive and flexible profiling of diversity and is readily adaptable to most HTAS applications, including microbial 16S rRNA profiling and metabarcoding of environmental DNA.

Next-generation sequencing reveals cryptic Symbiodinium diversity within Orbicella faveolata and Orbicella franksi at the Flower Garden Banks, Gulf of Mexico

10.7287/peerj.preprints.246v1 ◽

2014 ◽

Author(s):

Elizabeth Green ◽

Sarah W. Davies ◽

Mikhail V. Matz ◽

Mónica Medina

Keyword(s):

Genetic Diversity ◽

Local Community ◽

Coral Species ◽

Geographic Location ◽

Amplicon Sequencing ◽

Coral Host ◽

Clade B ◽

Orbicella Faveolata ◽

Host Genetic ◽

Physiological Variations

The genetic composition of the resident Symbiodinium endosymbionts appears to strongly modulate the physiological performance of reef-building corals. Here, we used deep amplicon sequencing to quantitatively assess Symbiodinium genetic diversity for the two mountainous star corals, Orbicella franksi and Orbicella faveolata, from two reefs separated by 19 kilometers of deep water. We aimed to determine if symbiont diversity is largely partitioned with respect to coral host species or geographic location. Our results demonstrate that across the two reefs both coral species contained only Symbiodinium identifiable as clade B type B1, represented by five distinct haplotypes. Three of these haplotypes have not been previously described and may be endemic to the Flower Garden Banks. No consistent differences in symbiont composition were detected between the two coral species. However, significant quantitative differences were observed between the east and west banks for two of the five haplotypes. These results highlight the need for consistent molecular genotyping techniques to assess local community assemblages of Symbiodinium-host relationships, which could be largely irrespective of host genetic background. This deep-sequencing approach used to sensitively characterize cryptic genetic diversity of Symbiodinium will potentially contribute to the understanding of physiological variations among coral populations.

A multispecies amplicon sequencing approach for genetic diversity assessment in grassland plant species

10.1101/2021.07.26.453819 ◽

2021 ◽

Author(s):

Miguel Loera-Sanchez ◽

Bruno Studer ◽

Roland Koelliker

Keyword(s):

Genetic Diversity ◽

Plant Species ◽

High Throughput ◽

Large Scale ◽

Trifolium Pratense ◽

Amplicon Sequencing ◽

Pathogen Resistance ◽

Production Plant ◽

Diversity Assessment ◽

Grassland Plants

Grasslands are widespread and economically relevant ecosystems at the basis of sustainable roughage production. Plant genetic diversity (PGD; i.e., within-species diversity) is related to many beneficial effects on the ecosystem functioning of grasslands. The monitoring of PGD in temperate grasslands is complicated by the multiplicity of species present and by a shortage of methods for large-scale assessment. However, the continuous advancement of high-throughput DNA sequencing approaches has improved the prospects of broad, multispecies PGD monitoring. Among them, amplicon sequencing stands out as a robust and cost-effective method. Here we report a set of twelve multispecies primer pairs that can be used for high-throughput PGD assessment in multiple grassland plant species. The loci targeted by the amplicons were selected and tested in two phases: a "discovery phase" based on a sequence capture assay (611 target nuclear loci assessed in 16 grassland plant species), which resulted in the selection of eleven loci; and a "validation phase", in which the selected loci were targeted and sequenced using twelve multispecies primers in test populations of Dactylis glomerata L., Lolium perenne L., Festuca pratensis Huds., Trifolium pratense L. and T. repens L. The resulting multispecies amplicons had overall nucleotide diversities per species ranging from 5.19 × 10⁻³ to 1.29 × 10⁻², which is in the range of flowering-related genes but slightly lower than pathogen resistance genes. We conclude that the methodology, the DNA sequence resources, and the amplicon-specific primer pairs reported in this study provide the basis for large-scale, multispecies PGD monitoring in grassland plants.
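
For readers unfamiliar with the statistic, nucleotide diversity is commonly defined as the average number of pairwise differences per site among a set of aligned sequences. The sketch below computes it under that standard definition; it assumes equal-length, gap-free aligned sequences and is not taken from the study's own pipeline. The function name `nucleotide_diversity` is hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Average pairwise differences per site (illustrative sketch).

    sequences: list of aligned sequences of equal length, no gaps assumed.
    """
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be non-empty and of equal length")

    # Sum mismatching sites over every unordered pair of sequences.
    total_diffs = 0
    n_pairs = 0
    for a, b in combinations(sequences, 2):
        total_diffs += sum(x != y for x, y in zip(a, b))
        n_pairs += 1

    # Average over pairs, then normalise by alignment length.
    return total_diffs / n_pairs / length
```

For example, three 4-bp sequences where one differs from the other two at a single site give two differing pairs out of three, so the value is (2 / 3) / 4, roughly 0.167.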

Whole genome resequencing and custom genotyping unveil clonal lineages in ‘Malbec’ grapevines (Vitis vinifera L.)

Scientific Reports ◽

10.1038/s41598-021-87445-y ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Luciano Calderón ◽

Nuria Mauri ◽

Claudio Muñoz ◽

Pablo Carbonell-Bejerano ◽

Laura Bree ◽

...

Keyword(s):

Genetic Diversity ◽

Somatic Mutations ◽

Clonal Propagation ◽

Variant Calling ◽

Vitis Vinifera L ◽

Whole Genome ◽

Single Nucleotide Variants ◽

Genome Resequencing ◽

Diversity Pattern ◽

Whole Genome Resequencing

Abstract. Grapevine cultivars are clonally propagated to preserve their varietal attributes. However, genetic variations accumulate due to the occurrence of somatic mutations. This process is anthropically influenced through plant transportation, clonal propagation and selection. Malbec is a cultivar that is well appreciated for the production of red wine. It originated in Southwestern France and was introduced in Argentina during the 1850s. In order to study the clonal genetic diversity of Malbec grapevines, we generated whole-genome resequencing data for four accessions with different clonal propagation records. A stringent variant calling procedure was established to identify reliable polymorphisms among the analyzed accessions. The latter procedure retrieved 941 single nucleotide variants (SNVs). A reduced set of the detected SNVs was corroborated through Sanger sequencing, and employed to custom-design a genotyping experiment. We successfully genotyped 214 Malbec accessions using 41 SNVs, and identified 14 genotypes that clustered in two genetically divergent clonal lineages. These lineages were associated with the time span of clonal propagation of the analyzed accessions in Argentina and Europe. Our results show the usefulness of this approach for the study of the scarce intra-cultivar genetic diversity in grapevines. We also provide evidence on how human actions might have driven the accumulation of different somatic mutations, ultimately shaping the Malbec genetic diversity pattern.

Extremely Halophilic Biohydrogen Producing Microbial Communities from High-Salinity Soil and Salt Evaporation Pond

Fuels ◽

10.3390/fuels2020014 ◽

2021 ◽

Vol 2 (2) ◽

pp. 241-252

Author(s):

Dyah Asri Handayani Taroepratjeka ◽

Tsuyoshi Imai ◽

Prapaipid Chairattanamanokorn ◽

Alissara Reungsang

Keyword(s):

Microbial Communities ◽

High Throughput ◽

High Throughput Sequencing ◽

High Salinity ◽

Amplicon Sequencing ◽

Spatial Proximity ◽

Lignocellulosic Waste ◽

Evaporation Pond ◽

Operational Taxonomic Units ◽

Determining Factor

Extreme halophiles offer the advantage of saving on the costs of sterilization and water for biohydrogen production from lignocellulosic waste after the pretreatment process, thanks to their ability to withstand extreme salt concentrations. This study identifies the dominant hydrogen-producing genera and species among the acclimatized, extremely halotolerant microbial communities taken from two salt-damaged soil locations in Khon Kaen and one location from the salt evaporation pond in Samut Sakhon, Thailand. The V3–V4 regions of the 16S rRNA gene of the microbial communities were analyzed using high-throughput amplicon sequencing. A total of 345 operational taxonomic units were obtained and the high-throughput sequencing confirmed that Firmicutes was the dominant phylum of the three communities. Halanaerobium fermentans and Halanaerobacter lacunarum were the dominant hydrogen-producing species of the communities. Spatial proximity was not found to be a determining factor for similarities between these extremely halophilic microbial communities. Through the study of the microbial communities, strategies can be developed to increase biohydrogen molar yield.

Estimating sequencing error rates using families

BioData Mining ◽

10.1186/s13040-021-00259-6 ◽

2021 ◽

Vol 14 (1) ◽

Author(s):

Kelley Paskov ◽

Jae-Yoon Jung ◽

Brianna Chrisman ◽

Nate T. Stockham ◽

Peter Washington ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Exome Sequencing ◽

Genome Sequencing ◽

Variant Calling ◽

Error Rates ◽

Sequencing Error ◽

Whole Genome ◽

Sequencing Data ◽

Sequencing Platform ◽

Whole Exome

Abstract. Background: As next-generation sequencing technologies make their way into the clinic, knowledge of their error rates is essential if they are to be used to guide patient care. However, sequencing platforms and variant-calling pipelines are continuously evolving, making it difficult to accurately quantify error rates for the particular combination of assay and software parameters used on each sample. Family data provide a unique opportunity for estimating sequencing error rates since they allow us to observe a fraction of sequencing errors as Mendelian errors in the family, which we can then use to produce genome-wide error estimates for each sample. Results: We introduce a method that uses Mendelian errors in sequencing data to make highly granular per-sample estimates of precision and recall for any set of variant calls, regardless of sequencing platform or calling methodology. We validate the accuracy of our estimates using monozygotic twins, and we use a set of monozygotic quadruplets to show that our predictions closely match the consensus method. We demonstrate our method’s versatility by estimating sequencing error rates for whole genome sequencing, whole exome sequencing, and microarray datasets, and we highlight its sensitivity by quantifying performance increases between different versions of the GATK variant-calling pipeline. We then use our method to demonstrate that: (1) sequencing error rates between samples in the same dataset can vary by over an order of magnitude; (2) variant calling performance decreases substantially in low-complexity regions of the genome; (3) variant calling performance in whole exome sequencing data decreases with distance from the nearest target region; (4) variant calls from lymphoblastoid cell lines can be as accurate as those from whole blood; and (5) whole-genome sequencing can attain microarray-level precision and recall at disease-associated SNV sites. Conclusion: Genotype datasets from families are powerful resources that can be used to make fine-grained estimates of sequencing error for any sequencing platform and variant-calling methodology.
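
The notion of a Mendelian error can be made concrete with a small sketch. It only shows how an inconsistent genotype in a parent-child trio might be detected at a biallelic autosomal site; it does not reproduce the paper's estimation of precision and recall. Genotypes are assumed to be encoded as the number of alternate alleles (0, 1, or 2), and the function names are hypothetical.

```python
# Alleles (0 = reference, 1 = alternate) a parent can transmit,
# keyed by the parent's alternate-allele count.
GAMETES = {0: {0}, 1: {0, 1}, 2: {1}}


def is_mendelian_error(mother, father, child):
    """True if the child's genotype cannot arise from the parents' genotypes.

    Genotypes are alternate-allele counts (0, 1, or 2) at a biallelic
    autosomal site; de novo mutation is ignored in this sketch.
    """
    possible = {m + f for m in GAMETES[mother] for f in GAMETES[father]}
    return child not in possible


def mendelian_error_rate(trio_genotypes):
    """Fraction of sites with a Mendelian error.

    trio_genotypes: list of (mother, father, child) tuples, one per site.
    """
    if not trio_genotypes:
        return 0.0
    errors = sum(is_mendelian_error(m, f, c) for m, f, c in trio_genotypes)
    return errors / len(trio_genotypes)
```

Only some genotyping errors produce a visible inconsistency (a wrong call in a child of two heterozygous parents is always compatible), which is why the abstract speaks of observing a fraction of sequencing errors as Mendelian errors.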

A9 Deep sequencing analysis to investigate the importance of within host genetic diversity and evolution of influenza A viruses for the development of resistance against neuraminidase inhibitors

Virus Evolution ◽

10.1093/ve/vew036.008 ◽

2017 ◽

Vol 3 (suppl_1) ◽

Author(s):

R. Roosenhoff ◽

A. van der Linden ◽

M. Schutten ◽

R.A.M. Fouchier

Keyword(s):

Genetic Diversity ◽

Deep Sequencing ◽

Influenza A ◽

Sequencing Analysis ◽

Neuraminidase Inhibitors ◽

Influenza A Viruses ◽

Development Of Resistance ◽

Host Genetic ◽

Deep Sequencing Analysis
