High throughput amplicon sequencing to assess within- and between-host genetic diversity in plant viruses

2017 ◽  
Author(s):  
Sylvain Piry ◽  
Catherine Wipf-Scheibel ◽  
Jean-François Martin ◽  
Maxime Galan ◽  
Karine Berthier

Abstract: Molecular epidemiology approaches at the landscape scale require studying the genetic diversity of viral populations from numerous hosts and characterizing mixed infections. In this context, high-throughput amplicon sequencing (HTAS) techniques create interesting opportunities, as they allow distinct variants to be identified within the same host while simultaneously genotyping a large number of samples. Validating variants produced by HTAS may, however, remain difficult due to biases occurring at different steps of the data-generating process (e.g. environmental contamination and sequencing errors). Here, we focused on Endive necrotic mosaic virus (ENMV), a member of the family Potyviridae, genus Potyvirus, to develop an HTAS approach and to characterize genetic diversity at the intra- and inter-host levels from 430 samples collected over an area of 1660 km² in south-eastern France. We demonstrate how, by incorporating various controls in the experimental design and by performing independent sample replicates, it is possible to estimate potential biases in HTAS results and to implement an automated and robust variant calling procedure.

Highlights: High-throughput amplicon sequencing to assess plant virus genetic diversity. Estimating bias in high throughput amplicon sequencing results. Automated variant calling procedure for robust high throughput amplicon sequencing.
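The abstract does not spell out the variant calling rules, so the following is only a minimal sketch of the general idea of combining negative controls with independent replicates. The function name `call_variants`, the dictionary layout, and the use of a single read-count threshold taken from the controls are all assumptions made for illustration, not the authors' procedure.

```python
from typing import Dict, Set


def call_variants(
    rep_a: Dict[str, int],
    rep_b: Dict[str, int],
    max_control_reads: int,
) -> Set[str]:
    """Return variants supported by both replicates of one sample.

    rep_a, rep_b: read counts per variant in two independent replicates.
    max_control_reads: highest read count seen for any variant in the
        negative controls (hypothetical way of setting the noise floor).
    """
    return {
        variant
        for variant, reads in rep_a.items()
        # A variant must exceed the control-derived noise floor in BOTH replicates.
        if reads > max_control_reads and rep_b.get(variant, 0) > max_control_reads
    }


if __name__ == "__main__":
    replicate_1 = {"v1": 500, "v2": 3, "v3": 40}
    replicate_2 = {"v1": 450, "v3": 2}
    print(call_variants(replicate_1, replicate_2, max_control_reads=5))
```

Requiring agreement between replicates is a conservative choice: a variant that passes the threshold in only one replicate is dropped rather than trusted.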

2019 ◽  
Author(s):  
Bryden Fields ◽  
Sara Moeskjær ◽  
Ville-Petri Friman ◽  
Stig U. Andersen ◽  
J. Peter W. Young

Abstract. Background: Sequencing and PCR errors are a major challenge when characterising genetic diversity using high-throughput amplicon sequencing (HTAS). Results: We have developed a multiplexed HTAS method, MAUI-seq, which uses unique molecular identifiers (UMIs) to improve error correction by exploiting variation among sequences associated with a single UMI. We show that two main advantages of this approach are efficient elimination of chimeric and other erroneous reads, outperforming DADA2 and UNOISE3, and the ability to confidently recognise genuine alleles that are present at low abundance or resemble chimeras. Conclusions: The method provides sensitive and flexible profiling of diversity and is readily adaptable to most HTAS applications, including microbial 16S rRNA profiling and metabarcoding of environmental DNA.
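As a rough illustration of "exploiting variation among sequences associated with a single UMI", the sketch below groups reads by UMI, treats the most abundant sequence under each UMI as primary and the runner-up as secondary, and discards sequences that occur as secondary too often relative to how often they occur as primary. This is a simplified sketch and not the MAUI-seq implementation; the function name `umi_filter` and the threshold value are placeholders chosen for the example.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def umi_filter(
    reads: Iterable[Tuple[str, str]],
    max_secondary_ratio: float = 0.5,  # placeholder value, not taken from the paper
) -> Dict[str, int]:
    """Count UMIs per sequence, dropping sequences that look like errors.

    reads: (umi, sequence) pairs.
    Returns {sequence: number of UMIs in which it was the primary sequence}.
    """
    by_umi = defaultdict(Counter)
    for umi, seq in reads:
        by_umi[umi][seq] += 1

    primary: Counter = Counter()
    secondary: Counter = Counter()
    for seq_counts in by_umi.values():
        # most_common sorts by count; ties keep first-seen order.
        ranked = seq_counts.most_common(2)
        primary[ranked[0][0]] += 1
        if len(ranked) > 1:
            secondary[ranked[1][0]] += 1

    # Keep a sequence only if it is rarely the runner-up under another UMI.
    return {
        seq: n
        for seq, n in primary.items()
        if secondary[seq] <= max_secondary_ratio * n
    }


if __name__ == "__main__":
    demo = [
        ("u1", "AAAA"), ("u1", "AAAA"), ("u1", "AAAT"),
        ("u2", "AAAA"), ("u2", "AAAA"),
        ("u3", "CCCC"), ("u3", "CCCC"), ("u3", "AAAT"),
        ("u4", "AAAT"), ("u4", "AAAT"),
    ]
    print(umi_filter(demo))
```

In the demo, `AAAT` is primary under one UMI but secondary under two others, so it is treated as a likely error and removed.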


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Yasemin Guenay-Greunke ◽  
David A. Bohan ◽  
Michael Traugott ◽  
Corinna Wallinger

Abstract: High-throughput sequencing platforms are increasingly being used for targeted amplicon sequencing because they enable cost-effective sequencing of large sample sets. For meaningful interpretation of targeted amplicon sequencing data and comparison between studies, it is critical that bioinformatic analyses do not introduce artefacts and that they rely on detailed protocols to ensure that all methods are properly performed and documented. The analysis of large sample sets and the use of predefined indexes create challenges, such as adjusting the sequencing depth across samples and taking sequencing errors or index hopping into account. However, the potential biases these factors introduce into high-throughput amplicon sequencing data sets, and how they may be overcome, have rarely been addressed. Using the example of a nested metabarcoding analysis of 1920 carabid beetle regurgitates to assess plant feeding, we investigated: (i) the variation in sequencing depth of individually tagged samples and the effect of library preparation on the data output; (ii) the influence of sequencing errors within index regions and its consequences for demultiplexing; and (iii) the effect of index hopping. Our results demonstrate that, despite library quantification, large variation in read counts and sequencing depth occurred among samples, and that accounting for the sequencing error rate in bioinformatic software is essential for accurate adapter/primer trimming and demultiplexing. Moreover, setting an index hopping threshold to avoid incorrect assignment of samples is highly recommended.
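One simple way to picture an index hopping threshold is to discard, for each detected sequence, any per-sample read count that is a very small fraction of that sequence's total across the run. The sketch below assumes this fraction-based rule; the function name `apply_hopping_threshold`, the data layout, and the default fraction are hypothetical and are not taken from the study.

```python
from typing import Dict


def apply_hopping_threshold(
    counts: Dict[str, Dict[str, int]],
    fraction: float = 0.001,  # illustrative default, not a recommended value
) -> Dict[str, Dict[str, int]]:
    """Drop per-sample counts that fall below a fraction of the run-wide total.

    counts: {sample: {sequence: reads}}.
    """
    # Run-wide total reads for each sequence.
    totals: Dict[str, int] = {}
    for per_sample in counts.values():
        for seq, reads in per_sample.items():
            totals[seq] = totals.get(seq, 0) + reads

    # Keep a count only if it reaches the sequence-specific threshold.
    return {
        sample: {
            seq: reads
            for seq, reads in per_sample.items()
            if reads >= fraction * totals[seq]
        }
        for sample, per_sample in counts.items()
    }


if __name__ == "__main__":
    run = {"s1": {"plantA": 9990}, "s2": {"plantA": 10, "plantB": 500}}
    print(apply_hopping_threshold(run, fraction=0.01))
```

In the demo, the 10 reads of `plantA` in sample `s2` are below 1% of the 10,000 reads seen for that sequence in the run, so they are treated as likely misassigned.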


2014 ◽  
Author(s):  
Elizabeth Green ◽  
Sarah W. Davies ◽  
Mikhail V. Matz ◽  
Mónica Medina

The genetic composition of the resident Symbiodinium endosymbionts appears to strongly modulate the physiological performance of reef-building corals. Here, we used deep amplicon sequencing to quantitatively assess Symbiodinium genetic diversity for the two mountainous star corals, Orbicella franksi and Orbicella faveolata, from two reefs separated by 19 kilometers of deep water. We aimed to determine if symbiont diversity is largely partitioned with respect to coral host species or geographic location. Our results demonstrate that across the two reefs both coral species contained only Symbiodinium identifiable as clade B type B1, represented by five distinct haplotypes. Three of these haplotypes have not been previously described and may be endemic to the Flower Garden Banks. No consistent differences in symbiont composition were detected between the two coral species. However, significant quantitative differences were observed between the east and west banks for two of the five haplotypes. These results highlight the need for consistent molecular genotyping techniques to assess local community assemblages of Symbiodinium-host relationships, which could be largely irrespective of host genetic background. This deep-sequencing approach used to sensitively characterize cryptic genetic diversity of Symbiodinium will potentially contribute to the understanding of physiological variations among coral populations.


2020 ◽  
Author(s):  
Bryden Fields ◽  
Sara Moeskjær ◽  
Ville-Petri Friman ◽  
Stig U. Andersen ◽  
J. Peter W. Young

Abstract. Background: Sequencing and PCR errors are a major challenge when characterising genetic diversity using high-throughput amplicon sequencing (HTAS). Results: We have developed a multiplexed HTAS method, MAUI-seq, which uses unique molecular identifiers (UMIs) to improve error correction by exploiting variation among sequences associated with a single UMI. We show that two main advantages of this approach are efficient elimination of chimeric and other erroneous reads, outperforming DADA2 and UNOISE3, and the ability to confidently recognise genuine alleles that are present at low abundance or resemble chimeras. Conclusions: The method provides sensitive and flexible profiling of diversity and is readily adaptable to most HTAS applications, including microbial 16S rRNA profiling and metabarcoding of environmental DNA.


2021 ◽  
Author(s):  
Miguel Loera-Sanchez ◽  
Bruno Studer ◽  
Roland Koelliker

Grasslands are widespread and economically relevant ecosystems at the basis of sustainable roughage production. Plant genetic diversity (PGD; i.e., within-species diversity) is associated with many beneficial effects on the ecosystem functioning of grasslands. The monitoring of PGD in temperate grasslands is complicated by the multiplicity of species present and by a shortage of methods for large-scale assessment. However, the continuous advancement of high-throughput DNA sequencing approaches has improved the prospects of broad, multispecies PGD monitoring. Among them, amplicon sequencing stands out as a robust and cost-effective method. Here we report a set of twelve multispecies primer pairs that can be used for high-throughput PGD assessment in multiple grassland plant species. The loci targeted by the amplicons were selected and tested in two phases: a "discovery phase" based on a sequence capture assay (611 target nuclear loci assessed in 16 grassland plant species), which resulted in the selection of eleven loci; and a "validation phase", in which the selected loci were targeted and sequenced using twelve multispecies primers in test populations of Dactylis glomerata L., Lolium perenne L., Festuca pratensis Huds., Trifolium pratense L. and T. repens L. The resulting multispecies amplicons had overall nucleotide diversities per species ranging from 5.19 × 10⁻³ to 1.29 × 10⁻², which is in the range of flowering-related genes but slightly lower than that of pathogen resistance genes. We conclude that the methodology, the DNA sequence resources, and the amplicon-specific primer pairs reported in this study provide the basis for large-scale, multispecies PGD monitoring in grassland plants.
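For readers unfamiliar with the statistic, nucleotide diversity is commonly computed as the average number of differences per site over all pairs of aligned sequences. The sketch below implements that textbook definition; the abstract does not state which estimator or software the authors used, so the function `nucleotide_diversity` and its input format (equal-length, gap-free aligned strings) are assumptions for illustration.

```python
from itertools import combinations
from typing import Sequence


def nucleotide_diversity(seqs: Sequence[str]) -> float:
    """Average pairwise differences per site among aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")

    total_diffs = 0
    n_pairs = 0
    for a, b in combinations(seqs, 2):
        # Count mismatching positions between the two aligned sequences.
        total_diffs += sum(x != y for x, y in zip(a, b))
        n_pairs += 1
    return total_diffs / (n_pairs * length)


if __name__ == "__main__":
    print(nucleotide_diversity(["ACGT", "ACGA", "ACGT"]))
```

With three sequences of length 4 and two mismatches in total over the three pairs, the demo gives 2 / (3 × 4).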


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Luciano Calderón ◽  
Nuria Mauri ◽  
Claudio Muñoz ◽  
Pablo Carbonell-Bejerano ◽  
Laura Bree ◽  
...  

Abstract: Grapevine cultivars are clonally propagated to preserve their varietal attributes. However, genetic variations accumulate due to the occurrence of somatic mutations. This process is anthropically influenced through plant transportation, clonal propagation and selection. Malbec is a cultivar that is highly valued for red wine production. It originated in Southwestern France and was introduced to Argentina during the 1850s. In order to study the clonal genetic diversity of Malbec grapevines, we generated whole-genome resequencing data for four accessions with different clonal propagation records. A stringent variant calling procedure was established to identify reliable polymorphisms among the analyzed accessions. This procedure retrieved 941 single nucleotide variants (SNVs). A reduced set of the detected SNVs was corroborated through Sanger sequencing and used to custom-design a genotyping experiment. We successfully genotyped 214 Malbec accessions using 41 SNVs, and identified 14 genotypes that clustered in two genetically divergent clonal lineages. These lineages were associated with the time span of clonal propagation of the analyzed accessions in Argentina and Europe. Our results show the usefulness of this approach for the study of the scarce intra-cultivar genetic diversity in grapevines. We also provide evidence on how human actions might have driven the accumulation of different somatic mutations, ultimately shaping the Malbec genetic diversity pattern.


Fuels ◽  
2021 ◽  
Vol 2 (2) ◽  
pp. 241-252
Author(s):  
Dyah Asri Handayani Taroepratjeka ◽  
Tsuyoshi Imai ◽  
Prapaipid Chairattanamanokorn ◽  
Alissara Reungsang

Because they withstand extreme salt concentrations, extreme halophiles offer the advantage of saving on the costs of sterilization and water in biohydrogen production from lignocellulosic waste after the pretreatment process. This study identifies the dominant hydrogen-producing genera and species among the acclimatized, extremely halotolerant microbial communities taken from two salt-damaged soil locations in Khon Kaen and one location from the salt evaporation pond in Samut Sakhon, Thailand. The V3–V4 regions of the 16S rRNA gene of the microbial communities were analyzed using high-throughput amplicon sequencing. A total of 345 operational taxonomic units were obtained, and the high-throughput sequencing confirmed that Firmicutes was the dominant phylum of the three communities. Halanaerobium fermentans and Halanaerobacter lacunarum were the dominant hydrogen-producing species of the communities. Spatial proximity was not found to be a determining factor for similarities between these extremely halophilic microbial communities. Through the study of the microbial communities, strategies can be developed to increase biohydrogen molar yield.


2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Kelley Paskov ◽  
Jae-Yoon Jung ◽  
Brianna Chrisman ◽  
Nate T. Stockham ◽  
Peter Washington ◽  
...  

Abstract. Background: As next-generation sequencing technologies make their way into the clinic, knowledge of their error rates is essential if they are to be used to guide patient care. However, sequencing platforms and variant-calling pipelines are continuously evolving, making it difficult to accurately quantify error rates for the particular combination of assay and software parameters used on each sample. Family data provide a unique opportunity for estimating sequencing error rates since they allow us to observe a fraction of sequencing errors as Mendelian errors in the family, which we can then use to produce genome-wide error estimates for each sample. Results: We introduce a method that uses Mendelian errors in sequencing data to make highly granular per-sample estimates of precision and recall for any set of variant calls, regardless of sequencing platform or calling methodology. We validate the accuracy of our estimates using monozygotic twins, and we use a set of monozygotic quadruplets to show that our predictions closely match the consensus method. We demonstrate our method’s versatility by estimating sequencing error rates for whole genome sequencing, whole exome sequencing, and microarray datasets, and we highlight its sensitivity by quantifying performance increases between different versions of the GATK variant-calling pipeline. We then use our method to demonstrate that: 1) Sequencing error rates between samples in the same dataset can vary by over an order of magnitude. 2) Variant calling performance decreases substantially in low-complexity regions of the genome. 3) Variant calling performance in whole exome sequencing data decreases with distance from the nearest target region. 4) Variant calls from lymphoblastoid cell lines can be as accurate as those from whole blood. 5) Whole-genome sequencing can attain microarray-level precision and recall at disease-associated SNV sites.
Conclusion: Genotype datasets from families are powerful resources that can be used to make fine-grained estimates of sequencing error for any sequencing platform and variant-calling methodology.
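The raw signal this approach builds on, a Mendelian error, can be illustrated with a small check on a mother-father-child trio. The sketch below only counts inconsistent sites at autosomal biallelic positions, with genotypes encoded as the number of alternate alleles (0, 1 or 2). It does not reproduce the statistical model the authors use to turn such counts into precision and recall estimates, and the function names are hypothetical.

```python
from typing import Iterable, Set, Tuple


def _gametes(genotype: int) -> Set[int]:
    """Alleles (0 = ref, 1 = alt) a parent with this genotype can transmit."""
    if genotype not in (0, 1, 2):
        raise ValueError("genotype must be 0, 1 or 2 alternate alleles")
    if genotype == 0:
        return {0}
    if genotype == 2:
        return {1}
    return {0, 1}


def is_mendelian_consistent(mother: int, father: int, child: int) -> bool:
    """True if the child genotype can arise from one allele of each parent."""
    if child not in (0, 1, 2):
        raise ValueError("genotype must be 0, 1 or 2 alternate alleles")
    return any(
        m + f == child for m in _gametes(mother) for f in _gametes(father)
    )


def count_mendelian_errors(trios: Iterable[Tuple[int, int, int]]) -> int:
    """Number of (mother, father, child) sites that violate Mendelian inheritance."""
    return sum(not is_mendelian_consistent(m, f, c) for m, f, c in trios)


if __name__ == "__main__":
    sites = [(0, 0, 1), (0, 2, 1), (1, 1, 2), (2, 2, 1)]
    print(count_mendelian_errors(sites))
```

Only some genotyping errors surface this way, which is why the abstract speaks of observing "a fraction of sequencing errors" as Mendelian errors.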

