A Simplest Bioinformatics Pipeline for Whole Transcriptome Sequencing: Overview of the Processing and Steps from Raw Data to Downstream Analysis

2019 ◽  
Author(s):  
Ayam Gupta ◽  
Sonal Gupta ◽  
Suresh Kumar Jatawa ◽  
Ashwani Kumar ◽  
Prashanth Suravajhala

Abstract Recent advances in next generation sequencing (NGS) technologies have heralded a new era in genomic research. From the long-established inference of differentially expressed genes (DEGs) using microarrays to the current NGS-based whole transcriptome (RNA-Seq) pipelines, there have been steady advances and improvements. With several bioinformatics pipelines for analysing RNA-Seq on the rise, inferring candidate DEGs proves cumbersome, as one may have to reach a consensus among all the pipelines. To address this, we have benchmarked the well-known cufflinks-cuffdiff pipeline on a set of datasets and outline it in the form of a protocol that researchers interested in whole transcriptome shotgun sequencing and its downstream analysis can apply to their own datasets.

2021 ◽  
Vol 5 (4) ◽  
pp. 1003-1016
Author(s):  
Sylvain Mareschal ◽  
Anna Palau ◽  
Johan Lindberg ◽  
Philippe Ruminy ◽  
Christer Nilsson ◽  
...  

Abstract Although copy number alterations (CNAs) and translocations constitute the backbone of the diagnosis and prognostication of acute myeloid leukemia (AML), techniques used for their assessment in routine diagnostics have not been reconsidered for decades. We used a combination of 2 next-generation sequencing–based techniques to challenge the currently recommended conventional cytogenetic analysis (CCA), comparing the approaches in a series of 281 intensively treated patients with AML. Shallow whole-genome sequencing (sWGS) outperformed CCA in detecting European Leukemia Net (ELN)–defining CNAs and showed that CCA overestimated monosomies and suboptimally reported karyotype complexity. Still, the concordance between CCA and sWGS for all ELN CNA–related criteria was 94%. Moreover, using in silico dilution, we showed that 1 million reads per patient would be enough to accurately assess ELN-defining CNAs. Total genomic loss, defined as a total loss ≥200 Mb by sWGS, was found to be a better marker for genetic complexity and poor prognosis compared with the CCA-based definition of complex karyotype. For fusion detection, the concordance between CCA and whole-transcriptome sequencing (WTS) was 99%. WTS had better sensitivity in identifying inv(16) and KMT2A rearrangements while showing limitations in detecting lowly expressed PML-RARA fusions. Ligation-dependent reverse transcription polymerase chain reaction was used for validation and was shown to be a fast and reliable method for fusion detection. We conclude that a next-generation sequencing–based approach can replace conventional CCA for karyotyping, provided that efforts are made to cover lowly expressed fusion transcripts.
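The total genomic loss marker described above reduces to a sum and a threshold. The following is a minimal sketch, assuming sWGS copy-number calls are available as non-overlapping segments with a loss/gain label; the `Segment` type, its field names, and the example coordinates are hypothetical and not taken from the study, and the authors' exact counting rules (for example, which chromosomes are included) are not stated in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One copy-number segment (hypothetical layout, not the study's format)."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive
    call: str   # "loss", "gain" or "neutral" (assumed labels)


# 200 Mb, the cutoff stated in the abstract.
LOSS_THRESHOLD_BP = 200_000_000


def total_loss_bp(segments):
    """Sum the lengths of all segments called as losses."""
    return sum(s.end - s.start for s in segments if s.call == "loss")


def has_total_genomic_loss(segments, threshold_bp=LOSS_THRESHOLD_BP):
    """True when the summed loss reaches the threshold."""
    return total_loss_bp(segments) >= threshold_bp


# Made-up example: two losses (130 Mb and 99 Mb) and one gain.
segments = [
    Segment("chr5", 50_000_000, 180_000_000, "loss"),
    Segment("chr7", 60_000_000, 159_000_000, "loss"),
    Segment("chr8", 0, 145_000_000, "gain"),
]

print(total_loss_bp(segments))           # 229000000
print(has_total_genomic_loss(segments))  # True
```

Gains are ignored here because the marker is defined on losses only; overlapping segments would need to be merged first to avoid double counting.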


Author(s):  
Dragana Dudić ◽  
Bojana Banović Đeri ◽  
Vesna Pajić ◽  
Gordana Pavlović-Lažetić

Next Generation Sequencing (NGS) analysis has become a widely used method for studying the structure of DNA and RNA, but the complexity of the procedure yields error-prone datasets that need to be cleansed in order to avoid misinterpretation of data. We address the usage and proper interpretation of characteristic metrics for RNA sequencing (RNAseq) quality control, implemented in and reported by FastQC, and provide comprehensive guidance for their assessment in the context of total RNAseq quality control of Illumina raw reads. Additionally, we give recommendations on how to adequately perform the quality control preprocessing step of raw total RNAseq Illumina reads according to the results of the quality control evaluation step; the aim is to provide the best dataset for downstream analysis, rather than to get better FastQC results. We also tested the effects of different preprocessing approaches on the downstream analysis and recommended the most suitable approach.


2013 ◽  
Vol 80 (3) ◽  
pp. 959-971 ◽  
Author(s):  
Shaolong Feng ◽  
Tyson P. Eucker ◽  
Mayumi K. Holly ◽  
Michael E. Konkel ◽  
Xiaonan Lu ◽  
...  

ABSTRACT We present the results of a study using high-throughput whole-transcriptome sequencing (RNA-seq) and vibrational spectroscopy to characterize and fingerprint pathogenic-bacterium injury under conditions of unfavorable stress. Two garlic-derived organosulfur compounds were found to be highly effective antimicrobial compounds against Cronobacter sakazakii, a leading pathogen associated with invasive infection of infants and causing meningitis, necrotizing enterocolitis, and bacteremia. RNA-seq shows changes in gene expression patterns and transcriptomic response, while confocal micro-Raman spectroscopy characterizes macromolecular changes in the bacterial cell resulting from this chemical stress. RNA-seq analyses showed that the bacterial response to ajoene differed from the response to diallyl sulfide. Specifically, ajoene caused downregulation of motility-related genes, while diallyl sulfide treatment caused an increased expression of cell wall synthesis genes. Confocal micro-Raman spectroscopy revealed that the two compounds appear to have the same phase I antimicrobial mechanism of binding to thiol-containing proteins/enzymes in bacterial cells generating a disulfide stretching band but different phase II antimicrobial mechanisms, showing alterations in the secondary structures of proteins in two different ways. Diallyl sulfide primarily altered the α-helix and β-sheet, as reflected in changes in amide I, while ajoene altered the structures containing phenylalanine and tyrosine. Bayesian probability analysis validated the ability of principal component analysis to differentiate treated and control C. sakazakii cells. Scanning electron microscopy confirmed cell injury, showing significant morphological variations in cells following treatments by these two compounds. Findings from this study aid in the development of effective intervention strategies to reduce the risk of C. sakazakii contamination in the food production environment and on food contact surfaces, reducing the risks to susceptible consumers.


2014 ◽  
Vol 96 ◽  
Author(s):  
NIR PILLAR ◽  
OFER ISAKOV ◽  
NOAM SHOMRON

Next-generation sequencing (NGS; also known as deep sequencing or ultra-high throughput sequencing) has probably been the most important tool for genomic research over the past few years. NGS has led to numerous discoveries and scientific breakthroughs in the genetic field. The sequencing technology that has entered the research laboratory in the past decade is now being introduced into the clinical diagnostic laboratory. Consequently, NGS results are becoming available in the medical arena as an abundance of clinically relevant variants, conferring predisposition to disease, is being discovered at a growing rate (Stanley, 2014).


Blood ◽  
2014 ◽  
Vol 124 (21) ◽  
pp. 3768-3768 ◽  
Author(s):  
Grazia Fazio ◽  
Marco Severgnini ◽  
Ingrid Cifola ◽  
Silvia Bungaro ◽  
Andrea Biondi ◽  
...  

Abstract Introduction. Acute Lymphoblastic Leukemia (ALL) is the most frequent type of childhood leukemia. It is a multi-step process, characterized by the expansion of a pre-leukemic clone, accumulating cooperative genetic events required for the full transformation and clinical manifestation. Recently, the technological advances in genome-wide profiling techniques have allowed a better understanding of its molecular basis and heterogeneity. However, incidence and cure rates greatly differ among children, reflecting diverse responses to drug treatment and distinguishing risk groups. This defines the need for molecular investigations to better understand leukemia biology and improve risk prediction. Aim. We applied a whole-transcriptome sequencing approach (RNA-seq) to characterize low- (LR) versus high-risk (HR) patients, to identify new genetic lesions associated with different early responses to therapy. Methods. Total RNA was extracted from primary leukemic blast samples of 10 pediatric ALL patients (4 LR and 6 HR, according to minimal residual disease monitoring), included in the Italian AIEOP-BFM ALL2000 protocol. Genome-wide DNA profiling was performed by Affymetrix Cyto2.7M Arrays, RNA-seq was carried out by Illumina GAIIx platform, and validations were performed using independent approaches, such as RT-PCR and FISH. Fusion events were detected using FusionMap software, followed by a custom computational pipeline for the reduction of false positives and the identification of the most likely fusion candidates. Potential interest for leukemia was explored by testing the occurrence of these candidate fusions and con-joined genes in other RNA-seq datasets from different tumors and normal blood samples (i.e., 15 melanomas, 2 melanocytes, 45 CEU individuals from 1K Genomes Project, plus 25 AML and 12 ALL). Results. We sequenced the transcriptome of 10 childhood ALL cases, not carrying other clinical or genetic risk factors. We performed a comprehensive whole-transcriptome analysis, comprising identification of fusion transcripts, alternative splicing and SNPs. Priority was given to fusion transcripts, which could originate from intra- or inter-chromosomal rearrangements, since they might represent potential prognostic markers or therapeutic targets for personalized treatments. We identified 127 fusion candidates. Strikingly, 123 out of 127 events were identified as intra-chromosomal, 119 of which involved two contiguous genes or genes with overlapping loci (the so-called “con-joined genes”). Among the four remaining intra-chromosomal events, the NUP214-ABL1 fusion, previously found in T-ALL and responsive to kinase inhibitors, was here identified and validated in one HR B-ALL patient, thus opening new perspectives for targeted treatment options. Finally, among the four inter-chromosomal events, the novel PAX5-POM121C fusion was identified and validated in one LR patient. Both intra- and inter-chromosomal fusions proved to be private or low-frequency events, not recurrent in other tumor types or in normal blood samples. Among the con-joined genes, we identified a subset of 22 events not present in melanoma or in normal blood samples, but common to the external AML and ALL datasets. Conclusion. RNA-seq represents one of the most comprehensive approaches to identify genetic alterations carried by leukemia clones. Our analyses identified novel fusion genes, originating from either inter- or intra-chromosomal rearrangements, as well as a considerable number of con-joined genes. Further evaluations will address SNPs, mutations, gene expression changes and splice variants that could be related to a different risk of relapse, and the feasibility of screening these candidates on a larger population of consecutive clinical cases. Disclosures: No relevant conflicts of interest to declare.
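The three categories used in the Results (inter-chromosomal, con-joined, and other intra-chromosomal events) can be illustrated with a small classifier. This is only a sketch of the bookkeeping, not the authors' custom pipeline or FusionMap output: the `FusionCandidate` type and the `neighbours` flag are hypothetical, the chromosome assignments come from public gene annotation rather than from the abstract, and `GENE_A`/`GENE_B` are made-up names.

```python
from collections import Counter
from typing import NamedTuple


class FusionCandidate(NamedTuple):
    """A candidate fusion (hypothetical record layout)."""
    gene_5p: str
    chrom_5p: str
    gene_3p: str
    chrom_3p: str
    neighbours: bool  # assumed flag: partner genes are contiguous or overlap


def classify(f: FusionCandidate) -> str:
    """Assign a candidate to one of the three categories in the abstract."""
    if f.chrom_5p != f.chrom_3p:
        return "inter-chromosomal"
    if f.neighbours:
        return "con-joined"
    return "intra-chromosomal"


candidates = [
    FusionCandidate("NUP214", "chr9", "ABL1", "chr9", False),
    FusionCandidate("PAX5", "chr9", "POM121C", "chr7", False),
    FusionCandidate("GENE_A", "chr1", "GENE_B", "chr1", True),  # made-up pair
]

counts = Counter(classify(f) for f in candidates)
print(dict(counts))
```

Under this scheme the reported totals add up as 119 con-joined + 4 other intra-chromosomal + 4 inter-chromosomal = 127 candidates.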


2017 ◽  
Vol 35 (15_suppl) ◽  
pp. e23118-e23118
Author(s):  
Alexandra E Gylfe ◽  
Eve Shinbrot ◽  
Boyko Kakaradov ◽  
Wayne Delport ◽  
Corine K Lau ◽  
...  

e23118 Background: Current targeted cancer therapies rely on the identification of clinically relevant somatic alterations in the tumor. Hotspot gene-panels and exome sequencing are designed to quickly assess somatic variations in frequently mutated regions and/or the coding regions of relevant genes, but they have limited ability to detect complex genomic rearrangements or novel structural variations. Here, we describe an integrative and comprehensive approach to fully characterize the genomic complexity of solid tumors using high throughput whole genome sequencing (WGS) and whole transcriptome sequencing (RNA Seq). Methods: We performed WGS and high-depth sequencing of known cancer genes in 14 paired tumor-normal samples of a variety of tumor types. Tumor-specific somatic alteration assessments included protein-coding mutations, copy number variations, gene fusions and structural variants. In addition, RNA Seq data was analyzed to identify expressed somatic alterations. Results: We identified 2 novel fusion genes as well as important structural alterations which could have clinical and therapeutic implications. We described a novel BRAF fusion gene in a cholangiocarcinoma devoid of other known driver mutations. BRAF fusions have not been described previously in cholangiocarcinoma; this fusion may represent an alternative mechanism for MAPK activation and could be a useful drug target. We also identified a novel NTRK3 fusion partner in a glioblastoma tumor. This fusion may imply a novel mechanism for NTRK3 activation. Finally, we identified numerous tandem duplications in an ovarian cancer. Recent advances describe tandem duplication hotspots in ovarian cancer as a potential driver mechanism characterizing a specific mutational signature. 
Conclusions: Comprehensive genomics assessment of paired tumor-normal samples through whole-genome and transcriptome sequencing can yield additional clinically actionable genomic characteristics that may not be detected in whole-exome or hotspot gene-panel sequencing. These findings have the potential to aid in clinical decision making.

