A cloud-based framework for applying metamorphic testing to a bioinformatics pipeline

Author(s):  
Michael Troup ◽  
Andrian Yang ◽  
Amir Hossein Kamali ◽  
Eleni Giannoulatou ◽  
Tsong Yueh Chen ◽  
...  
2013 ◽  
Vol 33 (6) ◽  
pp. 1657-1661 ◽  
Author(s):  
SONG HUANG ◽  
Ruihao DING ◽  
Hui LI ◽  
Yi YAO

2021 ◽  
Vol 3 (1) ◽  
Author(s):  
Gundula Povysil ◽  
Monika Heinzl ◽  
Renato Salazar ◽  
Nicholas Stoler ◽  
Anton Nekrutenko ◽  
...  

Abstract: Duplex sequencing is currently the most reliable method for identifying ultra-low frequency DNA variants: it groups sequence reads derived from the same DNA molecule into families, using information from both the forward and reverse strands. However, only a small proportion of reads are assembled into duplex consensus sequences (DCS), and reads with potentially valuable information are discarded at different steps of the bioinformatics pipeline, especially reads without a family. We developed a bioinformatics toolset that analyses tag and family composition in order to understand data loss and to implement modifications that maximize the data output for variant calling. Specifically, our tools show that tags contain polymerase chain reaction and sequencing errors that contribute to data loss and lower DCS yields. Our tools also identified chimeras, which likely reflect barcode collisions. Finally, we developed a tool that re-examines variant calls from raw reads and provides summary data that categorize the confidence level of a variant call using a tier-based system. With this tool, we can include reads without a family and check the reliability of the call, which substantially increases the sequencing depth for variant calling, a particularly important advantage for low-input samples or low-coverage regions.


2021 ◽  
Vol 22 (15) ◽  
pp. 8012
Author(s):  
Rongxin Zhang ◽  
Yajun Liu ◽  
Xingxing Zhang ◽  
Ke Xiao ◽  
Yue Hou ◽  
...  

G-quadruplexes are non-canonical nucleic acid structures that form preferentially in G-rich regions. These structures have been shown to be associated with many biological functions. Despite extensive work on DNA G-quadruplexes, we still have limited knowledge of RNA G-quadruplexes, especially at the transcriptome-wide level. Herein, by integrating DMS-seq with a bioinformatics pipeline, we profiled and depicted the RNA G-quadruplexes in the human transcriptome. The genes that contain RNA G-quadruplexes in their specific regions are significantly related to immune pathways and the COVID-19-related gene sets. Bioinformatics analysis reveals the potential regulatory functions of G-quadruplexes on miRNA targeting at the scale of the whole transcriptome. In addition, the G-quadruplexes are depleted in the putative, but not the real, PAS-strong poly(A) sites, which may reduce the likelihood that such sites are real cleavage sites. In brief, our study provides insight into the potential function of RNA G-quadruplexes in post-transcription.


2019 ◽  
Author(s):  
Yu Liu ◽  
Paul W Bible ◽  
Bin Zou ◽  
Qiaoxing Liang ◽  
Cong Dong ◽  
...  

Abstract. Motivation: Microbiome analyses of clinical samples with low microbial biomass are challenging because of the very small quantities of microbial DNA relative to the human host, ubiquitous contaminating DNA in sequencing experiments and the large and rapidly growing microbial reference databases. Results: We present computational subtraction-based microbiome discovery (CSMD), a bioinformatics pipeline specifically developed to generate accurate species-level microbiome profiles for clinical samples with low microbial loads. CSMD applies strategies for the maximal elimination of host sequences with minimal loss of microbial signal and effectively detects microorganisms present in the sample with minimal false positives using a stepwise convergent solution. CSMD was benchmarked in a comparative evaluation with other classic tools on previously published well-characterized datasets. It showed higher sensitivity and specificity in host sequence removal and higher specificity in microbial identification, which led to more accurate abundance estimation. All these features are integrated into a free and easy-to-use tool. Additionally, CSMD applied to cell-free plasma DNA showed that microbial diversity within these samples is substantially broader than previously believed. Availability and implementation: CSMD is freely available at https://github.com/liuyu8721/csmd. Supplementary information: Supplementary data are available at Bioinformatics online.


2021 ◽  
pp. 111040
Author(s):  
Rongjie Yan ◽  
Siqi Wang ◽  
Yixuan Yan ◽  
Hongyu Gao ◽  
Jun Yan

RNA Biology ◽  
2021 ◽  
pp. 1-6
Author(s):  
Bhaskar Shukla ◽  
Sanchita Gupta ◽  
Gaurava Srivastava ◽  
Ashok Sharma ◽  
Ashutosh K. Shukla ◽  
...  

2020 ◽  
Vol 9 (1) ◽  
pp. 2
Author(s):  
Tal Domanovich-Asor ◽  
Yair Motro ◽  
Boris Khalfin ◽  
Hillary A. Craddock ◽  
Avi Peretz ◽  
...  

Antimicrobial resistance (AMR) in Helicobacter pylori is increasing and can result in treatment failure and inappropriate antibiotic usage. This study used whole genome sequencing (WGS) to comprehensively analyze the H. pylori resistome and phylogeny in order to characterize Israeli H. pylori. Israeli H. pylori isolates (n = 48) underwent antimicrobial susceptibility testing (AST) against five antimicrobials and WGS analysis. Literature review identified 111 mutations reported to correlate with phenotypic resistance to these antimicrobials. Analysis was conducted via our in-house bioinformatics pipeline targeting point mutations in the relevant genes (pbp1A, 23S rRNA, gyrA, rdxA, frxA, and rpoB) in order to assess genotype-to-phenotype correlation. Resistance rates of study isolates were as follows: clarithromycin 54%, metronidazole 31%, amoxicillin 10%, rifampicin 4%, and levofloxacin 2%. Genotype-to-phenotype correlation was inconsistent; for every analyzed gene at least one phenotypically susceptible isolate was found to have a mutation previously associated with resistance. This was also observed regarding mutations commonly used in commercial kits to diagnose AMR in H. pylori cases. Furthermore, 11 novel point mutations associated with a resistant phenotype were detected. Analysis of a unique set of H. pylori isolates demonstrates that inferring resistance phenotypes from WGS in H. pylori remains challenging and should be optimized further.

