htSeqTools: high-throughput sequencing quality control, processing and visualization in R

Evarist Planet; Camille Stephan-Otto Attolini; Oscar Reina; Oscar Flores; David Rossell

doi:10.1093/bioinformatics/btr700

DNA-Based Herbal Teas’ Authentication: An ITS2 and psbA-trnH Multi-Marker DNA Metabarcoding Approach

Plants, 2021, Vol 10 (10), pp. 2120. doi:10.3390/plants10102120

Author(s): Jessica Frigerio; Giulia Agostinetto; Valerio Mezzasalma; Fabrizio De Mattia; Massimo Labra; ...

Keyword(s): Quality Control; Quantitative Analysis; Medicinal Plants; High Throughput; High Throughput Sequencing; The Other; Plant Component; Identification Rate; DNA Metabarcoding; Therapeutic Properties

Medicinal plants have been widely used in traditional medicine due to their therapeutic properties. Although they are mostly used as herbal infusions and tinctures, their use as ingredients in food supplements is increasing. However, fraud and adulteration are widespread issues. In this study, we aimed to evaluate DNA metabarcoding as a tool to identify product composition. To this end, we analyzed fifteen commercial products with DNA metabarcoding, using two barcode regions: psbA-trnH and ITS2. On average, 70% (44–100) of the declared ingredients were identified. The ITS2 marker appears to identify more species (n = 60) than psbA-trnH (n = 35), with ingredient identification rates of 52% and 45%, respectively. Some species were identified by only one of the two markers. Additionally, to evaluate the quantitative ability of high-throughput sequencing (HTS), that is, how well the assigned sequences reflect the corresponding plant component, we created six mock mixtures of plants in the laboratory, starting from both biomass and gDNA. Our analysis also supports the application of DNA metabarcoding for relative quantitative analysis. These results move towards the application of HTS analysis for studying the composition of herbal teas for the traceability and quality control of medicinal plants.


PathoQC: Computationally Efficient Read Preprocessing and Quality Control for High-Throughput Sequencing Data Sets

Cancer Informatics, 2014, Vol 13s1, pp. CIN.S13890. doi:10.4137/cin.s13890. Cited By ~ 1

Author(s): Changjin Hong; Solaiappan Manimaran; William Evan Johnson

Keyword(s): Quality Control; High Throughput; High Performance; High Throughput Sequencing; Next Generation Sequencing Data; Data Sets; Sequencing Data; Computationally Efficient; High Throughput Sequencing Data; Downstream Analysis

Quality control and read preprocessing are critical steps in the analysis of data sets generated from high-throughput genomic screens. In the most extreme cases, improper preprocessing can negatively affect downstream analyses and may lead to incorrect biological conclusions. Here, we present PathoQC, a streamlined toolkit that seamlessly combines the benefits of several popular quality control software approaches for preprocessing next-generation sequencing data. PathoQC provides a variety of quality control options appropriate for most high-throughput sequencing applications. PathoQC is primarily developed as a module in the PathoScope software suite for metagenomic analysis. However, PathoQC is also available as an open-source Python module that can run as a stand-alone application or can be easily integrated into any bioinformatics workflow. PathoQC achieves high performance by supporting parallel computation and is an effective tool that removes technical sequencing artifacts and facilitates robust downstream analysis. The PathoQC software package is available at http://sourceforge.net/projects/PathoScope/ .


Rqc: A Bioconductor Package for Quality Control of High-Throughput Sequencing Data

Journal of Statistical Software, 2018, Vol 87 (Code Snippet 2). doi:10.18637/jss.v087.c02. Cited By ~ 2

Author(s): Wélliton de Souza; Benilton de Sá Carvalho; Iscia Lopes-Cendes

Keyword(s): Quality Control; High Throughput; High Throughput Sequencing; Bioconductor Package; Sequencing Data; High Throughput Sequencing Data


Qualimap 2: advanced multi-sample quality control for high-throughput sequencing data

Bioinformatics, 2015, pp. btv566. doi:10.1093/bioinformatics/btv566. Cited By ~ 210

Author(s): Konstantin Okonechnikov; Ana Conesa; Fernando García-Alcalde

Keyword(s): Quality Control; High Throughput; High Throughput Sequencing; Sequencing Data; Sample Quality; High Throughput Sequencing Data


Broom: Application for non-redundant storage of High Throughput Sequencing data

2018. doi:10.1101/312306

Author(s): Levent Albayrak; Kamil Khanipov; George Golovko; Yuriy Fofanov

Keyword(s): Data Storage; High Throughput; High Throughput Sequencing; Data Generation; Sequencing Data; High Throughput Sequencing Data; Sequencing Quality; Redundant Storage; Recent Trends; The Cost

Motivation: The data generation capabilities of High Throughput Sequencing (HTS) instruments have exponentially increased over the last few years, while the cost of sequencing has dramatically decreased, allowing this technology to become widely used in biomedical studies. For small labs and individual researchers, however, storage and transfer of large amounts of HTS data present a significant challenge. The recent trends in increased sequencing quality and genome coverage can be used to reconsider HTS data storage strategies.

Results: We present Broom, a stand-alone application designed to select and store only high-quality sequencing reads at extremely high compression rates. Written in C++, the application accepts single and paired-end reads in FASTQ and FASTA formats and decompresses data in FASTA format.

Availability: C++ code available at https://scsb.utmb.edu/labgroups/fofanov/[email protected]


HTSQualC is a flexible and one-step quality control software for high-throughput sequencing data analysis

Scientific Reports, 2021, Vol 11 (1). doi:10.1038/s41598-021-98124-3

Author(s): Renesh Bedre; Carlos Avila; Kranthi Mandadi

Keyword(s): Quality Control; High Throughput; High Throughput Sequencing; Science Research; Control Analysis; Sequencing Data; Quality Control Analysis; High Throughput Sequencing Data; One Step; Automated Quality Control

Use of high-throughput sequencing (HTS) has become indispensable in life science research. Raw HTS data contain several sequencing artifacts, and as a first step it is imperative to remove these artifacts for reliable downstream bioinformatics analysis. Although multiple stand-alone tools can perform the various quality control steps separately, an integrated tool that allows one-step, automated quality control analysis of HTS datasets would significantly improve the handling of large numbers of samples in parallel. Here, we developed HTSQualC, a stand-alone, flexible, and easy-to-use software for one-step quality control analysis of raw HTS data. HTSQualC can evaluate HTS data quality and perform filtering and trimming analysis in a single run. We evaluated the performance of HTSQualC in a batch analysis of HTS datasets comprising 322 samples with an average of ~1 M (paired-end) sequence reads per sample. HTSQualC completed the QC analysis in ~3 h in distributed mode and ~31 h in shared mode, underscoring its utility and robust performance. In addition to command-line execution, we integrated HTSQualC into the free, open-source CyVerse cyberinfrastructure resource as a GUI, for wider access by experimental biologists who have limited computational resources and/or programming abilities.


HTSeqQC: A Flexible and One-Step Quality Control Software for High-throughput Sequence Data Analysis

2020. doi:10.1101/2020.07.23.214536

Author(s): Renesh Bedre; Carlos Avila; Kranthi Mandadi

Keyword(s): Quality Control; High Throughput; High Throughput Sequencing; Sequence Data; Supplementary Information; Supplementary File; Control Analysis; Link Type; Quality Control Analysis; One Step

Motivation: Use of high-throughput sequencing (HTS) has become indispensable in life science research. Raw HTS data contain several sequencing artifacts, and as a first step it is imperative to remove these artifacts for reliable downstream bioinformatics analysis. Although multiple stand-alone tools can perform the various quality control steps separately, an integrated tool that allows one-step, automated quality control analysis of HTS datasets would significantly improve the handling of large numbers of samples in parallel.

Results: Here, we developed HTSeqQC, a stand-alone, flexible, and easy-to-use software for one-step quality control analysis of raw HTS data. HTSeqQC can evaluate HTS data quality and perform filtering and trimming analysis in a single run. We evaluated the performance of HTSeqQC in a batch analysis of 322 HTS sample datasets with an average of ∼1 M (paired-end) sequence reads per sample. HTSeqQC completed the QC analysis in ∼3 hours in distributed mode and ∼31 hours in shared mode, underscoring its utility and robust performance.

Availability and implementation: HTSeqQC software, Docker image and Nextflow template are available for download at https://github.com/reneshbedre/HTSeqQC and a graphical user interface (GUI) is available at CyVerse Discovery Environment (DE) (https://cyverse.org/). Documentation is available at https://reneshbedre.github.io/blog/htseqqc.html and https://cyverse-htseqqc-cyverse-tutorial.readthedocs-hosted.com/en/latest/ (for CyVerse).

Contact: Kranthi Mandadi ([email protected])

Supplementary information: Supplementary information provided in Supplementary File 1.


On the optimal trimming of high-throughput mRNA sequence data

2013. doi:10.1101/000422. Cited By ~ 1

Author(s): Matthew D. MacManes

Keyword(s): High Throughput; High Throughput Sequencing; Sequence Data; Deep Understanding; Sequencing Technologies; Software Packages; Sequencing Quality; Functional Biology; Genome Level; Optimal Strength

The widespread and rapid adoption of high-throughput sequencing technologies has afforded researchers the opportunity to gain a deep understanding of the genome-level processes that underlie evolutionary change and, perhaps more importantly, the links between genotype and phenotype. In particular, researchers interested in functional biology and adaptation have used these technologies to sequence mRNA transcriptomes of specific tissues, which in turn are often compared to other tissues, or to other individuals with different phenotypes. While these techniques are extremely powerful, careful attention to data quality is required. In particular, because high-throughput sequencing is more error-prone than traditional Sanger sequencing, quality trimming of sequence reads should be an important step in all data processing pipelines. While several software packages for quality trimming exist, no general guidelines for the specifics of trimming have been developed. Here, using empirically derived sequence data, I provide general recommendations regarding the optimal strength of trimming, specifically in mRNA-Seq studies. Although very aggressive quality trimming is common, this study suggests that gentler trimming, specifically of those nucleotides whose Phred score is <2 or <5, is optimal for most studies across a wide variety of metrics.
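To make the idea of a gentle Phred threshold concrete, the following is a minimal Python sketch of 3'-end quality trimming. It is an illustration only, not the procedure or software used in the study: the function names `phred_scores` and `trim_3prime` are hypothetical, the Phred+33 quality encoding is an assumption, and plain trailing-base removal stands in for whatever trimming algorithm a real tool applies.

```python
def phred_scores(quality: str, offset: int = 33) -> list[int]:
    """Decode a FASTQ quality string into Phred scores.

    Assumes Phred+33 encoding (an assumption, not stated in the abstract).
    """
    return [ord(ch) - offset for ch in quality]


def trim_3prime(seq: str, quality: str, threshold: int = 5) -> tuple[str, str]:
    """Remove trailing bases whose Phred score is below `threshold`.

    Hypothetical helper: scans from the 3' end and stops at the first
    base whose score is at least `threshold`.
    """
    if len(seq) != len(quality):
        raise ValueError("sequence and quality strings differ in length")
    scores = phred_scores(quality)
    end = len(scores)
    while end > 0 and scores[end - 1] < threshold:
        end -= 1
    return seq[:end], quality[:end]


# Example read: the last two bases have Phred scores 2 ('#') and 0 ('!').
read_seq = "ACGTACGT"
read_qual = "IIIIII#!"
gentle = trim_3prime(read_seq, read_qual, threshold=2)
stricter = trim_3prime(read_seq, read_qual, threshold=5)
```

With a threshold of 2 only the final base is removed, while a threshold of 5 removes the last two bases; higher thresholds discard progressively more of each read, which is the trade-off the abstract refers to.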


AlmostSignificant: simplifying quality control of high-throughput sequencing data

Bioinformatics, 2016, Vol 32 (24), pp. 3850-3851. doi:10.1093/bioinformatics/btw559. Cited By ~ 3

Author(s): Joseph Ward; Christian Cole; Melanie Febrer; Geoffrey J. Barton

Keyword(s): Quality Control; High Throughput; High Throughput Sequencing; Sequencing Data; High Throughput Sequencing Data


Quality control of the traditional Chinese medicine Ruyi jinhuang powder based on high-throughput sequencing and real-time PCR

Scientific Reports, 2018, Vol 8 (1). doi:10.1038/s41598-018-26520-3. Cited By ~ 8

Author(s): Qiang Li; Ying Sun; Huijun Guo; Feng Sang; Hongyu Ma; ...

Keyword(s): Quality Control; Chinese Medicine; Traditional Chinese Medicine; Real Time; High Throughput; Real Time PCR; High Throughput Sequencing
