NGseqBasic - a single-command UNIX tool for ATAC-seq, DNaseI-seq, Cut-and-Run, and ChIP-seq data mapping, high-resolution visualisation, and quality control

2018 ◽  
Author(s):  
Jelena Telenius ◽  
Jim R. Hughes ◽  

ABSTRACT With the decreasing cost of next-generation sequencing (NGS), we are observing a rapid rise in the volume of ‘big data’ in the academic research, healthcare and drug discovery sectors. The present bottleneck for extracting value from these ‘big data’ sets is data processing and analysis. Despite this, there is still a lack of reliable, automated and easy-to-use tools that allow experimentalists to assess the quality of sequenced libraries and explore the data first hand, without investing substantial time of computational core analysts in the early stages of analysis. NGseqBasic is an easy-to-use single-command analysis tool for chromatin accessibility (ATAC, DNaseI) and ChIP sequencing data, also supporting newer techniques such as low-cell-number sequencing and Cut-and-Run. It takes in fastq, fastq.gz or bam files, conducts all quality control, trimming and mapping steps, generates quality control and data processing statistics, and combines all of this into a single-click loadable UCSC data hub, with an integral statistics HTML page providing detailed reports from the analysis tools and quality control metrics. The tool is easy to set up, and no installation is needed. A wide variety of parameters are provided to fine-tune the analysis, with optional settings to generate DNase footprint or high-resolution ChIP-seq tracks. A tester script is provided to help with the setup, along with a test data set and downloadable example use cases. NGseqBasic has been used in the routine analysis of next-generation sequencing (NGS) data in high-impact publications 1,2. The code is actively developed under Git version control in a GitHub repository. 
Here we demonstrate NGseqBasic analysis and features using DNaseI-seq data from GSM689849, CTCF ChIP-seq data from GSM2579421 and a Cut-and-Run CTCF data set (GSM2433142), and provide the one-click loadable UCSC data hubs generated by the tool, allowing ready exploration of the run results and quality control files generated by the tool. Availability: Download, setup and help instructions are available on the NGseqBasic web site http://userweb.molbiol.ox.ac.uk/public/telenius/NGseqBasicManual/external/. Bioconda users can install the tool as the package “ngseqbasic”. The source code with Git version control is available at https://github.com/Hughes-Genome-Group/NGseqBasic/. Contact: [email protected]

Author(s):  
Dragana Dudić ◽  
Bojana Banović Đeri ◽  
Vesna Pajić ◽  
Gordana Pavlović-Lažetić

Next-generation sequencing (NGS) analysis has become a widely used method for studying the structure of DNA and RNA, but the complexity of the procedure yields error-prone datasets that need to be cleaned in order to avoid misinterpretation of the data. We address the usage and proper interpretation of characteristic metrics for RNA sequencing (RNAseq) quality control, implemented in and reported by FastQC, and provide comprehensive guidance for their assessment in the context of total RNAseq quality control of Illumina raw reads. Additionally, we give recommendations on how to adequately perform the quality control preprocessing step on raw total RNAseq Illumina reads according to the results of the quality control evaluation step; the aim is to provide the best dataset for downstream analysis, rather than simply to obtain better FastQC results. We also tested the effects of different preprocessing approaches on the downstream analysis and recommend the most suitable approach.
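As an illustration of the kind of quality-based preprocessing discussed here, the sketch below trims the low-quality 3' tail of a read using Phred scores decoded from the Illumina 1.8+ encoding (ASCII offset 33). This is a generic, minimal example, not the authors' recommended pipeline; the threshold of 20 is an arbitrary illustrative choice.

```python
# Illustrative sketch: quality-trim a read's 3' end using Phred scores
# decoded from Illumina 1.8+ quality strings (ASCII offset 33).
def phred_scores(qual_string, offset=33):
    """Decode a FASTQ quality string into per-base Phred scores."""
    return [ord(c) - offset for c in qual_string]

def trim_3prime(seq, qual, threshold=20):
    """Trim bases from the 3' end while their quality is below threshold."""
    scores = phred_scores(qual)
    end = len(seq)
    while end > 0 and scores[end - 1] < threshold:
        end -= 1
    return seq[:end], qual[:end]

seq = "ACGTACGTAC"
qual = "IIIIIIII#!"   # 'I' = Phred 40; '#' = Phred 2; '!' = Phred 0
t_seq, t_qual = trim_3prime(seq, qual)
print(t_seq)  # ACGTACGT
```

Real trimmers (e.g. sliding-window approaches) are more sophisticated, but the core idea of cutting below a Phred threshold is the same.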


2019 ◽  
Vol 47 (21) ◽  
pp. e135-e135
Author(s):  
Maxim Ivanov ◽  
Mikhail Ivanov ◽  
Artem Kasianov ◽  
Ekaterina Rozhavskaya ◽  
Sergey Musienko ◽  
...  

Abstract As the use of next-generation sequencing (NGS) for the diagnosis of Mendelian diseases expands, the performance of this method has to be improved in order to achieve higher quality. Typically, performance measures are designed in the context of each application and, therefore, account for a spectrum of clinically relevant variants. We present EphaGen, a new computational methodology for bioinformatics quality control (QC). Given a single NGS dataset in BAM format and a pre-compiled VCF file of targeted clinically relevant variants, it associates the dataset with a single arbiter parameter. Intrinsically, EphaGen estimates the probability of missing any variant from the defined spectrum within a particular NGS dataset. This performance measure closely resembles the diagnostic sensitivity of a given NGS dataset. Here we present case studies of the use of EphaGen in the context of BRCA1/2 and CFTR sequencing in a series of 14 runs across 43 blood samples and 504 publicly available NGS datasets. EphaGen is superior to conventional bioinformatics metrics such as coverage depth and coverage uniformity. We recommend using this software as a QC step in NGS studies in the clinical context. Availability: https://github.com/m4merg/EphaGen or https://hub.docker.com/r/m4merg/ephagen.
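EphaGen's actual model is not reproduced here, but the general idea of scoring a dataset by its probability of missing a variant can be sketched with a simple binomial coverage model. All parameter choices below (allele fraction, minimum supporting reads, per-site independence) are illustrative assumptions, not EphaGen's implementation.

```python
from math import comb

def miss_probability(depth, allele_fraction, min_alt_reads):
    """P(fewer than min_alt_reads of `depth` reads carry the variant allele)
    under a simple Binomial(depth, allele_fraction) model."""
    return sum(
        comb(depth, k) * allele_fraction**k * (1 - allele_fraction)**(depth - k)
        for k in range(min_alt_reads)
    )

def dataset_sensitivity(depths, allele_fraction=0.5, min_alt_reads=4):
    """Probability that no variant in the target spectrum is missed,
    assuming independent sites with the given coverage depths."""
    p_all_detected = 1.0
    for d in depths:
        p_all_detected *= 1.0 - miss_probability(d, allele_fraction, min_alt_reads)
    return p_all_detected

# Deep, even coverage scores near 1; a single shallow site drags it down,
# which is information raw mean depth alone would hide.
print(dataset_sensitivity([60, 55, 70]))
print(dataset_sensitivity([60, 8, 70]))
```

This toy model shows why such a score can outperform mean coverage depth as a QC metric: it is dominated by the worst-covered clinically relevant sites.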


2021 ◽  
Vol 43 (2) ◽  
pp. 845-867
Author(s):  
Goldin John ◽  
Nikhil Shri Sahajpal ◽  
Ashis K. Mondal ◽  
Sudha Ananth ◽  
Colin Williams ◽  
...  

This review discusses the current testing methodologies for COVID-19 diagnosis and explores next-generation sequencing (NGS) technology for the detection of SARS-CoV-2 and monitoring phylogenetic evolution in the current COVID-19 pandemic. The review addresses the development, fundamentals, assay quality control and bioinformatics processing of the NGS data. This article provides a comprehensive review of the obstacles and opportunities facing the application of NGS technologies for the diagnosis, surveillance, and study of SARS-CoV-2 and other infectious diseases. Further, we have contemplated the opportunities and challenges inherent in the adoption of NGS technology as a diagnostic test with real-world examples of its utility in the fight against COVID-19.


2021 ◽  
Vol 7 (1) ◽  
pp. 34
Author(s):  
Marco Martínez-Sánchez ◽  
Roberto R. Expósito ◽  
Juan Touriño

Due to continuous developments in next-generation sequencing (NGS) technologies, which allow researchers to obtain larger genetic samples in less time, it is important to improve the existing algorithms aimed at enhancing the quality of the generated reads. In this work, we present a Big Data tool implemented on top of the open-source Apache Spark framework that is able to execute validated error-correction algorithms with improved performance. An experimental evaluation conducted on a multi-core cluster has shown significant improvements in execution times, providing a maximum speedup of 9.5 over existing error-correction tools when processing an NGS dataset with 25 million reads.
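The validated error-correction algorithms such tools parallelize are commonly k-mer-spectrum based: k-mers seen many times across the dataset are "trusted", and a base covered only by rare k-mers is suspected to be an error. The sketch below is a minimal single-machine illustration of that idea, not the tool's Spark implementation; the trust threshold and greedy substitution strategy are illustrative assumptions.

```python
from collections import Counter

def kmer_counts(reads, k):
    """Count every k-mer across all reads (the k-mer spectrum)."""
    counts = Counter()
    for r in reads:
        for i in range(len(r) - k + 1):
            counts[r[i:i + k]] += 1
    return counts

def correct_read(read, counts, k, min_count=2):
    """Greedy single-base correction: if a position is covered by a rare
    ('untrusted') k-mer, try each base and keep the one whose covering
    k-mers are all trusted; otherwise leave the base unchanged."""
    read = list(read)
    for i in range(len(read)):
        covering = range(max(0, i - k + 1), min(len(read) - k, i) + 1)

        def trusted(base):
            read[i] = base
            return all(counts["".join(read[j:j + k])] >= min_count
                       for j in covering)

        original = read[i]
        if not trusted(original):
            for base in "ACGT":
                if trusted(base):
                    break
            else:
                read[i] = original  # no confident fix; restore the base
    return "".join(read)

reads = ["ACGTACGT"] * 5 + ["ACGTTCGT"]  # last read: single error at position 4
counts = kmer_counts(reads, k=4)
print(correct_read("ACGTTCGT", counts, k=4))  # ACGTACGT
```

In a Spark setting, the k-mer counting step maps naturally onto a distributed reduce-by-key, which is where most of the speedup over single-node tools comes from.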


2015 ◽  
Vol 89 (16) ◽  
pp. 8540-8555 ◽  
Author(s):  
Shuntai Zhou ◽  
Corbin Jones ◽  
Piotr Mieczkowski ◽  
Ronald Swanstrom

ABSTRACT Validating the sampling depth and reducing sequencing errors are critical for studies of viral populations using next-generation sequencing (NGS). We previously described the use of Primer ID to tag each viral RNA template with a block of degenerate nucleotides in the cDNA primer. We now show that low-abundance Primer IDs (offspring Primer IDs) are generated by PCR/sequencing errors. These artifactual Primer IDs can be removed using a cutoff model for the number of reads required to make a template consensus sequence. We have modeled the fraction of sequences lost due to Primer ID resampling. For a typical sequencing run, less than 10% of the raw reads are lost to offspring Primer ID filtering and resampling. The remaining raw reads are used to correct for PCR resampling and sequencing errors. We also demonstrate that Primer ID reveals bias intrinsic to PCR, especially at low template input or utilization. cDNA synthesis and PCR convert ca. 20% of RNA templates into recoverable sequences, and 30-fold sequence coverage recovers most of these template sequences. We directly measured the residual error rate to be around 1 in 10,000 nucleotides. We use this error rate and the Poisson distribution to define a cutoff to identify preexisting drug resistance mutations at low abundance in an HIV-infected subject. Collectively, these studies show that >90% of the raw sequence reads can be used to validate template sampling depth and to dramatically reduce the error rate in assessing a genetically diverse viral population using NGS.

IMPORTANCE Although next-generation sequencing (NGS) has revolutionized sequencing strategies, it suffers from serious limitations in defining sequence heterogeneity in a genetically diverse population such as HIV-1, due to PCR resampling and PCR/sequencing errors. The Primer ID approach reveals the true sampling depth and greatly reduces errors. 
Knowing the sampling depth allows the construction of a model of how to maximize the recovery of sequences from input templates and to reduce resampling of the Primer ID so that appropriate multiplexing can be included in the experimental design. With the defined sampling depth and measured error rate, we are able to assign cutoffs for the accurate detection of minority variants in viral populations. This approach allows the power of NGS to be realized without having to guess about sampling depth or to ignore the problem of PCR resampling, while also being able to correct most of the errors in the data set.
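The Poisson-based cutoff described above can be sketched as follows, using the paper's residual error rate of roughly 1 in 10,000 nucleotides. The significance level alpha = 0.05 and the specific cutoff-search procedure are illustrative assumptions, not the authors' exact calculation.

```python
from math import exp, factorial

ERROR_RATE = 1e-4  # residual error rate reported: ~1 in 10,000 nucleotides

def poisson_sf(c, lam):
    """P(X >= c) for X ~ Poisson(lam): one minus the CDF up to c-1."""
    return 1.0 - sum(exp(-lam) * lam**k / factorial(k) for k in range(c))

def detection_cutoff(n_templates, error_rate=ERROR_RATE, alpha=0.05):
    """Smallest mutation count unlikely (< alpha) to arise from residual
    error alone across n_templates template consensus sequences."""
    lam = n_templates * error_rate
    c = 1
    while poisson_sf(c, lam) >= alpha:
        c += 1
    return c

# With 10,000 template consensus sequences, the expected number of
# errors at any position is lam = 1.0; counts at or above the cutoff
# are unlikely to be error alone and flag a real minority variant.
print(detection_cutoff(10_000))  # prints 4
```

Deeper template sampling raises the expected error count per position, so the cutoff grows with sampling depth while the detectable minority-variant frequency shrinks.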

