SnakeChunks: modular blocks to build Snakemake workflows for reproducible NGS analyses

Mapping Intimacies ◽

10.1101/165191 ◽

2017 ◽

Author(s):

Claire Rioualen ◽

Lucie Charbonnier-Khamvongsa ◽

Jacques van Helden

Keyword(s):

Next Generation Sequencing ◽

Life Sciences ◽

Supplementary Information ◽

Supplementary Data ◽

Rna Seq ◽

Genome Wide ◽

Domains Of Life ◽

Supplementary Material ◽

Next Generation Sequencing Ngs ◽

Generation Sequencing

Abstract Summary Next-Generation Sequencing (NGS) is becoming a routine approach for most domains of life sciences, yet there is a crucial need to improve the automation of processing for the huge amounts of data generated and to ensure reproducible results. We present SnakeChunks, a collection of Snakemake rules that enable users to compose modular and user-configurable workflows, and show its usage with analyses of transcriptome (RNA-seq) and genome-wide location (ChIP-seq) data. Availability The code is freely available (github.com/SnakeChunks/SnakeChunks), and documented with tutorials and illustrative demos (snakechunks.readthedocs.io). Contact [email protected], [email protected] Supplementary information Supplementary data are available at Bioinformatics online.

GTO: a toolkit to unify pipelines in genomic and proteomic research

10.1101/2020.01.07.882845 ◽

2020 ◽

Author(s):

João R. Almeida ◽

Armando J. Pinho ◽

José L. Oliveira ◽

Olga Fajarda ◽

Diogo Pratas

Keyword(s):

Next Generation Sequencing ◽

Web Site ◽

Life Sciences ◽

Supplementary Information ◽

Supplementary Data ◽

Next Generation ◽

Modular Architecture ◽

C Language ◽

Proteomic Research ◽

Generation Sequencing

Abstract Summary Next-generation sequencing triggered the production of a massive volume of publicly available data and the development of new specialised tools. These tools are dispersed over different frameworks, making the management and analyses of the data a challenging task. Additionally, new targeted tools are needed, given the dynamics and specificities of the field. We present GTO, a comprehensive toolkit designed to unify pipelines in genomic and proteomic research, which combines specialised tools for analysis, simulation, compression, development, visualisation, and transformation of the data. This toolkit combines novel tools with a modular architecture, being an excellent platform for experimental scientists, as well as a useful resource for teaching bioinformatics inquiry to students in life sciences. Availability and implementation GTO is implemented in C language and it is available, under the MIT license, at http://bioinformatics.ua.pt/[email protected] Supplementary information Supplementary data are available at publisher’s Web site.

EARRINGS: an efficient and accurate adapter trimmer entails no a priori adapter sequences

Bioinformatics ◽

10.1093/bioinformatics/btab025 ◽

2021 ◽

Author(s):

Ting-Hsuan Wang ◽

Cheng-Ching Huang ◽

Jui-Hung Hung

Keyword(s):

Open Source Software ◽

Large Scale ◽

A Priori ◽

Supplementary Information ◽

Supplementary Data ◽

Comparable Accuracy ◽

Meta Analyses ◽

Next Generation Sequencing Ngs ◽

Adapter Trimming ◽

Generation Sequencing

Abstract Motivation Cross-sample comparisons or large-scale meta-analyses based on the next generation sequencing (NGS) involve replicable and universal data preprocessing, including removing adapter fragments in contaminated reads (i.e. adapter trimming). While modern adapter trimmers require users to provide candidate adapter sequences for each sample, which are sometimes unavailable or falsely documented in the repositories (such as GEO or SRA), large-scale meta-analyses are therefore jeopardized by suboptimal adapter trimming. Results Here we introduce a set of fast and accurate adapter detection and trimming algorithms that entail no a priori adapter sequences. These algorithms were implemented in modern C++ with SIMD and multithreading to accelerate its speed. Our experiments and benchmarks show that the implementation (i.e. EARRINGS), without being given any hint of adapter sequences, can reach comparable accuracy and higher throughput than that of existing adapter trimmers. EARRINGS is particularly useful in meta-analyses of a large batch of datasets and can be incorporated in any sequence analysis pipelines in all scales. Availability and implementation EARRINGS is open-source software and is available at https://github.com/jhhung/EARRINGS. Supplementary information Supplementary data are available at Bioinformatics online.

Repli-seq: genome-wide analysis of replication timing by next-generation sequencing

10.1101/104653 ◽

2017 ◽

Cited By ~ 8

Author(s):

Claire Marchal ◽

Takayo Sasaki ◽

Daniel Vera ◽

Korey Wilson ◽

Jiao Sima ◽

...

Keyword(s):

Next Generation Sequencing ◽

Replication Timing ◽

Nucleotide Polymorphisms ◽

Robust Methods ◽

Next Generation ◽

Single Nucleotide ◽

Cellular Processes ◽

Genome Wide ◽

Next Generation Sequencing Ngs ◽

Generation Sequencing

Abstract Cycling cells duplicate their DNA content during S phase, following a defined program called replication timing (RT). Early and late replicating regions differ in terms of mutation rates, transcriptional activity, chromatin marks and sub-nuclear position. Moreover, RT is regulated during development and is altered in disease. Exploring mechanisms linking RT to other cellular processes in normal and diseased cells will be facilitated by rapid and robust methods with which to measure RT genome wide. Here, we describe a rapid, robust and relatively inexpensive protocol to analyze genome-wide RT by next-generation sequencing (NGS). This protocol yields highly reproducible results across laboratories and platforms. We also provide computational pipelines for analysis, parsing phased genomes using single nucleotide polymorphisms (SNP) for analyzing RT allelic asynchrony, and for direct comparison to Repli-chip data obtained by analyzing nascent DNA by microarrays.

ngsReports: a Bioconductor package for managing FastQC reports and other NGS related log files

Bioinformatics ◽

10.1093/bioinformatics/btz937 ◽

2019 ◽

Vol 36 (8) ◽

pp. 2587-2588 ◽

Cited By ~ 10

Author(s):

Christopher M Ward ◽

Thu-Hien To ◽

Stephen M Pederson

Keyword(s):

Quality Control ◽

R Package ◽

Supplementary Information ◽

Bioconductor Package ◽

Supplementary Data ◽

Large Sample ◽

Log Files ◽

Shiny App ◽

Next Generation Sequencing Ngs ◽

Generation Sequencing

Abstract Motivation High throughput next generation sequencing (NGS) has become exceedingly cheap, facilitating studies to be undertaken containing large sample numbers. Quality control (QC) is an essential stage during analytic pipelines and the outputs of popular bioinformatics tools such as FastQC and Picard can provide information on individual samples. Although these tools provide considerable power when carrying out QC, large sample numbers can make inspection of all samples and identification of systemic bias a challenge. Results We present ngsReports, an R package designed for the management and visualization of NGS reports from within an R environment. The available methods allow direct import into R of FastQC reports along with outputs from other tools. Visualization can be carried out across many samples using default, highly customizable plots with options to perform hierarchical clustering to quickly identify outlier libraries. Moreover, these can be displayed in an interactive shiny app or HTML report for ease of analysis. Availability and implementation The ngsReports package is available on Bioconductor and the GUI shiny app is available at https://github.com/UofABioinformaticsHub/shinyNgsreports. Supplementary information Supplementary data are available at Bioinformatics online.

Mapping DNA Topoisomerase Binding and Cleavage Genome Wide Using Next-Generation Sequencing Techniques

Genes ◽

10.3390/genes11010092 ◽

2020 ◽

Vol 11 (1) ◽

pp. 92 ◽

Cited By ~ 1

Author(s):

Shannon J. McKie ◽

Anthony Maxwell ◽

Keir C. Neuman

Keyword(s):

Next Generation Sequencing ◽

Dna Cleavage ◽

Next Generation ◽

Dna Breaks ◽

Genome Wide ◽

Extension Activity ◽

Nucleotide Resolution ◽

Next Generation Sequencing Ngs ◽

Generation Sequencing

Next-generation sequencing (NGS) platforms have been adapted to generate genome-wide maps and sequence context of binding and cleavage of DNA topoisomerases (topos). Continuous refinements of these techniques have resulted in the acquisition of data with unprecedented depth and resolution, which has shed new light on in vivo topo behavior. Topos regulate DNA topology through the formation of reversible single- or double-stranded DNA breaks. Topo activity is critical for DNA metabolism in general, and in particular to support transcription and replication. However, the binding and activity of topos over the genome in vivo was difficult to study until the advent of NGS. Over and above traditional chromatin immunoprecipitation (ChIP)-seq approaches that probe protein binding, the unique formation of covalent protein–DNA linkages associated with DNA cleavage by topos affords the ability to probe cleavage and, by extension, activity over the genome. NGS platforms have facilitated genome-wide studies mapping the behavior of topos in vivo, how the behavior varies among species and how inhibitors affect cleavage. Many NGS approaches achieve nucleotide resolution of topo binding and cleavage sites, imparting an extent of information not previously attainable. We review the development of NGS approaches to probe topo interactions over the genome in vivo and highlight general conclusions and quandaries that have arisen from this rapidly advancing field of topoisomerase research.

Abstract 4843: A next generation sequencing (NGS) genome wide copy number variation (CNV) assay for comparison of circulating tumor cell (CTC) heterogeneity

10.1158/1538-7445.am2015-4843 ◽

2015 ◽

Author(s):

Stephanie Green ◽

Mark Landers ◽

Jessica Louw ◽

Adam Jendrisak ◽

Ryan Dittamore ◽

...

Keyword(s):

Next Generation Sequencing ◽

Copy Number Variation ◽

Tumor Cell ◽

Copy Number ◽

Circulating Tumor Cell ◽

Next Generation ◽

Genome Wide ◽

Number Variation ◽

Next Generation Sequencing Ngs ◽

Generation Sequencing

Use of Next Generation Sequencing (NGS) Technologies for the Genome-Wide Detection of Transposition

Methods in Molecular Biology - Plant Transposable Elements ◽

10.1007/978-1-62703-568-2_19 ◽

2013 ◽

pp. 265-274 ◽

Cited By ~ 5

Author(s):

Moaine Elbaidouri ◽

Cristian Chaparro ◽

Olivier Panaud

Keyword(s):

Next Generation Sequencing ◽

Next Generation ◽

Genome Wide ◽

Next Generation Sequencing Ngs ◽

Generation Sequencing

Canine Genetics and Genomics

Canine Genetics, Health and Medicine ◽

10.5772/intechopen.95781 ◽

2021 ◽

Author(s):

Edo D’Agaro ◽

Andrea Favaro ◽

Davide Rosa

Keyword(s):

Heart Disease ◽

Next Generation Sequencing ◽

Genomic Data ◽

Behavioral Traits ◽

Canine Genetics ◽

Genome Wide ◽

Tremendous Progress ◽

Dog Breed ◽

Next Generation Sequencing Ngs ◽

Generation Sequencing

In the past fifteen years, tremendous progress has been made in dog genomics. Several genetic aspects of cancer, heart disease, hip dysplasia, vision and hearing problems in dogs have been investigated and studied in detail. Genome-wide associative studies have made it possible to identify several genes associated with diseases, morphological and behavioral traits. The dog genome contains an extraordinary amount of genetic variability that distinguishes the different dog breeds. As a consequence of the selective programs, applied using stringent breed standards, each dog breed represents, today, a population isolated from the others. The availability of modern next generation sequencing (NGS) techniques and the identification of millions of single functional mutations (SNPs) has enabled us to obtain new and unknown detailed genomic data of the different breeds.

The Next Generation Sequencing Techniques and Application in Drug Discovery and Development

Advances in Medical Technologies and Clinical Practice - Computer Applications in Drug Discovery and Development ◽

10.4018/978-1-5225-7326-5.ch011 ◽

2019 ◽

pp. 240-259

Author(s):

Afzal Hussain

Keyword(s):

Next Generation Sequencing ◽

Rna Sequencing ◽

Expression Profiling ◽

Massively Parallel Sequencing ◽

Life Sciences ◽

Rna Seq ◽

Next Generation ◽

Parallel Sequencing ◽

Short Period ◽

Generation Sequencing

Next-generation sequencing, or massively parallel sequencing, covers DNA sequencing, RNA sequencing, and methylation sequencing, and has had a great impact on the life sciences. Recent advances in parallel sequencing allow huge amounts of data to be generated in a very short period of time while reducing the associated computing cost. The technology plays a major role in gene expression profiling, chromosome counting, the detection of epigenetic changes, and enabling the future of personalized medicine. Here the authors describe NGS technologies and their applications, as well as the use of tools such as TopHat, Bowtie, Cufflinks, Cuffmerge, and Cuffdiff for analyzing high-throughput RNA sequencing (RNA-Seq) data.

Guidelines for diagnostic next-generation sequencing

European Journal of Human Genetics ◽

10.1038/ejhg.2015.226 ◽

2015 ◽

Vol 24 (1) ◽

pp. 2-5 ◽

Cited By ~ 209

Author(s):

Gert Matthijs ◽

Erika Souche ◽

Mariëlle Alders ◽

Anniek Corveleyn ◽

Sebastian Eck ◽

...

Keyword(s):

Next Generation Sequencing ◽

Diagnostic Tests ◽

Human Genetics ◽

Genetic Disorders ◽

Next Generation ◽

The Past ◽

Supplementary Material ◽

New Feature ◽

Next Generation Sequencing Ngs ◽

Generation Sequencing

Abstract We present, on behalf of EuroGentest and the European Society of Human Genetics, guidelines for the evaluation and validation of next-generation sequencing (NGS) applications for the diagnosis of genetic disorders. The work was performed by a group of laboratory geneticists and bioinformaticians, and discussed with clinical geneticists, industry and patients’ representatives, and other stakeholders in the field of human genetics. The statements that were written during the elaboration of the guidelines are presented here. The background document and full guidelines are available as supplementary material. They include many examples to assist the laboratories in the implementation of NGS and accreditation of this service. The work and ideas presented by others in guidelines that have emerged elsewhere in the course of the past few years were also considered and are acknowledged in the full text. Interestingly, a few new insights that have not been cited before have emerged during the preparation of the guidelines. The most important new feature is the presentation of a ‘rating system’ for NGS-based diagnostic tests. The guidelines and statements have been applauded by the genetic diagnostic community, and thus seem to be valuable for the harmonization and quality assurance of NGS diagnostics in Europe.
