A Primer on the Analysis of High-Throughput Sequencing Data for Detection of Plant Viruses

2021 ◽  
Vol 9 (4) ◽  
pp. 841
Author(s):  
Denis Kutnjak ◽  
Lucie Tamisier ◽  
Ian Adams ◽  
Neil Boonham ◽  
Thierry Candresse ◽  
...  

High-throughput sequencing (HTS) technologies have become indispensable tools assisting plant virus diagnostics and research thanks to their ability to detect any plant virus in a sample without prior knowledge. Because HTS technologies rely heavily on bioinformatic analysis of the large volume of generated sequences, it is of utmost importance that researchers can rely on efficient and reliable bioinformatic tools and can understand the principles, advantages, and disadvantages of the tools used. Here, we present a critical overview of the steps involved in HTS as employed for plant virus detection and virome characterization. We start from sample preparation and nucleic acid extraction as appropriate to the chosen HTS strategy, which is followed by basic data analysis requirements, an extensive overview of the in-depth data processing options, and taxonomic classification of viral sequences detected. By presenting the bioinformatic tools and a detailed overview of the consecutive steps that can be used to implement a well-structured HTS data analysis in an easy and accessible way, this paper is targeted at both beginners and expert scientists engaging in HTS plant virome projects.

Genomics ◽  
2017 ◽  
Vol 109 (2) ◽  
pp. 83-90 ◽  
Author(s):  
Yan Guo ◽  
Yulin Dai ◽  
Hui Yu ◽  
Shilin Zhao ◽  
David C. Samuels ◽  
...  

2019 ◽  
Vol 109 (5) ◽  
pp. 716-725 ◽  
Author(s):  
D. E. V. Villamor ◽  
T. Ho ◽  
M. Al Rwahnih ◽  
R. R. Martin ◽  
I. E. Tzanetakis

Over the last decade, virologists have discovered an unprecedented number of viruses using high-throughput sequencing (HTS), which has advanced our knowledge of the diversity of viruses in nature, particularly by unraveling the virome of many agricultural crops. However, these new virus discoveries have often widened the gaps in our understanding of virus biology, foremost among them the actual role of a new virus in disease, if any. Yet, when used critically in etiological studies, HTS is a powerful tool to establish disease causality between the virus and its host. Conversely, with globalization, movement of plant material is increasingly common and often a point of dispute between countries. HTS could potentially resolve these issues given its capacity to detect and discover. Although many pipelines are available for plant virus discovery, all share a common backbone. A description of the process of plant virus detection and discovery from HTS data is presented, providing a summary of the different pipelines available for scientists to use in their research.


2019 ◽  
Author(s):  
Ayman Yousif ◽  
Nizar Drou ◽  
Jillian Rowe ◽  
Mohammed Khalfan ◽  
Kristin C Gunsalus

Abstract Background As high-throughput sequencing applications continue to evolve, the rapid growth in quantity and variety of sequence-based data calls for the development of new software libraries and tools for data analysis and visualization. Often, effective use of these tools requires computational skills beyond those of many researchers. To ease this computational barrier, we have created a dynamic web-based platform, NASQAR (Nucleic Acid SeQuence Analysis Resource). Results NASQAR offers a collection of custom and publicly available open-source web applications that make extensive use of a variety of R packages to provide interactive data analysis and visualization. The platform is publicly accessible at http://nasqar.abudhabi.nyu.edu/. Open-source code is on GitHub at https://github.com/nasqar/NASQAR, and the system is also available as a Docker image at https://hub.docker.com/r/aymanm/nasqarall. NASQAR is a collaboration between the core bioinformatics teams of the NYU Abu Dhabi and NYU New York Centers for Genomics and Systems Biology. Conclusions NASQAR empowers non-programming experts with a versatile and intuitive toolbox to easily and efficiently explore, analyze, and visualize their Transcriptomics data interactively. Popular tools for a variety of applications are currently available, including Transcriptome Data Preprocessing, RNA-seq Analysis (including Single-cell RNA-seq), Metagenomics, and Gene Enrichment.


Viruses ◽  
2018 ◽  
Vol 10 (8) ◽  
pp. 436 ◽  
Author(s):  
Varvara Maliogka ◽  
Angelantonio Minafra ◽  
Pasquale Saldarelli ◽  
Ana Ruiz-García ◽  
Miroslav Glasa ◽  
...  

Perennial crops, such as fruit trees, are infected by many viruses, which are transmitted through vegetative propagation and grafting of infected plant material. Some of these pathogens cause severe crop losses and often reduce the productive life of the orchards. Detection and characterization of these agents in fruit trees is challenging; however, in recent years, the wide application of high-throughput sequencing (HTS) technologies has significantly facilitated this task. In this review, we present recent advances in the discovery, detection, and characterization of fruit tree viruses and virus-like agents accomplished by HTS approaches. A high number of new viruses have been described in the last 5 years, some of them exhibiting novel genomic features that have led to the proposal of the creation of new genera, and the revision of the current virus taxonomy status. Interestingly, several of the newly identified viruses belong to virus genera previously unknown to infect fruit tree species (e.g., Fabavirus, Luteovirus), a fact that challenges our perspective of plant viruses in general. Finally, applied methodologies, including the use of different molecules as templates, as well as advantages and disadvantages and future directions of HTS in fruit tree virology are discussed.


PLoS ONE ◽  
2014 ◽  
Vol 9 (1) ◽  
pp. e85879 ◽  
Author(s):  
Fabrice P. A. David ◽  
Julien Delafontaine ◽  
Solenne Carat ◽  
Frederick J. Ross ◽  
Gregory Lefebvre ◽  
...  

2016 ◽  
Vol 62 (8) ◽  
pp. 692-703 ◽  
Author(s):  
Gregory B. Gloor ◽  
Gregor Reid

A workshop held at the 2015 annual meeting of the Canadian Society of Microbiologists highlighted compositional data analysis methods and the importance of exploratory data analysis for the analysis of microbiome data sets generated by high-throughput DNA sequencing. A summary of the content of that workshop, a review of new methods of analysis, and information on the importance of careful analyses are presented herein. The workshop focussed on explaining the rationale behind the use of compositional data analysis, and a demonstration of these methods for the examination of 2 microbiome data sets. A clear understanding of bioinformatics methodologies and the type of data being analyzed is essential, given the growing number of studies uncovering the critical role of the microbiome in health and disease and the need to understand alterations to its composition and function following intervention with fecal transplant, probiotics, diet, and pharmaceutical agents.
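As a rough illustration of what a compositional approach to sequencing counts can look like, the sketch below applies a centered log-ratio (CLR) transform to one sample. The abstract does not name a specific transform, so the choice of CLR, the function name `clr`, the pseudocount of 0.5, and the example counts are all assumptions made for illustration only.

```python
import math

def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's taxon counts.

    The pseudocount (an assumed value) avoids log(0) for unobserved taxa.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    # Each value is expressed relative to the sample's geometric mean,
    # so the transformed values always sum to zero.
    return [v - mean_log for v in logs]

# Hypothetical read counts for four taxa in a single sample
sample = [120, 30, 0, 850]
transformed = clr(sample)
```

Because each value is taken relative to the geometric mean of its own sample, the result does not depend on the total number of reads sequenced, which is the property that motivates treating such data as compositional.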


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Gwenna Breton ◽  
Anna C. V. Johansson ◽  
Per Sjödin ◽  
Carina M. Schlebusch ◽  
Mattias Jakobsson

Abstract Background Population genetic studies of humans make increasing use of high-throughput sequencing in order to capture diversity in an unbiased way. There is an abundance of sequencing technologies and bioinformatic tools, and the number of available genomes is increasing. Studies have evaluated and compared some of these technologies and tools, such as the Genome Analysis Toolkit (GATK) and its “Best Practices” bioinformatic pipelines. However, studies often focus on a few genomes of Eurasian origin in order to detect technical issues. We instead surveyed the use of the GATK tools and established a pipeline for processing high coverage full genomes from a diverse set of populations, including Sub-Saharan African groups, in order to reveal challenges from human diversity and stratification. Results We surveyed 29 studies using high-throughput sequencing data, and compared their strategies for data pre-processing and variant calling. We found that processing of data is very variable across studies and that the GATK “Best Practices” are seldom followed strictly. We then compared three versions of a GATK pipeline, differing in the inclusion of an indel realignment step and with a modification of the base quality score recalibration step. We applied the pipelines on a diverse set of 28 individuals. We compared the pipelines in terms of count of called variants and overlap of the callsets. We found that the pipelines resulted in similar callsets, in particular after callset filtering. We also ran one of the pipelines on a larger dataset of 179 individuals. We noted that including more individuals at the joint genotyping step resulted in different counts of variants. At the individual level, we observed that the average genome coverage was correlated to the number of variants called.
Conclusions We conclude that applying the GATK “Best Practices” pipeline, including their recommended reference datasets, to underrepresented populations does not lead to a decrease in the number of called variants compared to alternative pipelines. We recommend aiming for coverage of >30X if identifying most variants is important, and working with large sample sizes at the variant calling stage, including for underrepresented individuals and populations.
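The comparison of pipelines by variant count and callset overlap described above can be sketched in a few lines. This is a minimal illustration, not the authors' method: the function name `callset_overlap`, the use of the Jaccard index as the overlap measure, the `(chromosome, position, ref, alt)` variant key, and the example variants are all assumptions.

```python
def callset_overlap(a, b):
    """Return variant counts and Jaccard overlap for two callsets.

    Each callset is a set of hashable variant keys.
    """
    shared = a & b
    union = a | b
    return {
        "count_a": len(a),
        "count_b": len(b),
        "shared": len(shared),
        # Two empty callsets are treated as identical by convention.
        "jaccard": len(shared) / len(union) if union else 1.0,
    }

# Hypothetical variants keyed as (chromosome, position, ref, alt)
pipeline_a = {("chr1", 1000, "A", "G"), ("chr1", 2000, "C", "T"), ("chr2", 500, "G", "A")}
pipeline_b = {("chr1", 1000, "A", "G"), ("chr2", 500, "G", "A"), ("chr2", 900, "T", "C")}
result = callset_overlap(pipeline_a, pipeline_b)
```

In practice the variant keys would be read from each pipeline's VCF output; keeping them as plain tuples here keeps the sketch free of external dependencies.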


PLoS ONE ◽  
2019 ◽  
Vol 14 (10) ◽  
pp. e0222512
Author(s):  
Edoardo Morandi ◽  
Matteo Cereda ◽  
Danny Incarnato ◽  
Caterina Parlato ◽  
Giulia Basile ◽  
...  
