Gene Prediction in Metagenomic Fragments with Deep Learning

2017 ◽  
Vol 2017 ◽  
pp. 1-9 ◽  
Author(s):  
Shao-Wu Zhang ◽  
Xiang-Yang Jin ◽  
Teng Zhang

Next-generation sequencing technologies used in metagenomics yield numerous sequencing fragments that come from thousands of different species. Accurately identifying genes in metagenomic fragments is one of the most fundamental problems in metagenomics. In this article, by fusing multiple features (i.e., monocodon usage, monoamino acid usage, ORF length coverage, and Z-curve features) and using a deep stacking network learning model, we present a novel method (called Meta-MFDL) to predict metagenomic genes. Results from 10-fold cross-validation (10 CV) and independent tests show that Meta-MFDL is a powerful tool for identifying genes from metagenomic fragments.
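Two of the feature families named above can be illustrated with a minimal Python sketch. It is not the Meta-MFDL implementation: the function names `monocodon_usage` and `z_curve` are hypothetical, and the Z-curve here is a simplified whole-sequence version (published gene finders typically compute phase-specific components), so treat it only as an illustration of the general idea under those assumptions.

```python
from collections import Counter
from itertools import product

# All 64 codons in a fixed order, so every feature vector has the same layout.
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]


def monocodon_usage(orf):
    """Relative frequency of each codon in an in-frame ORF (hypothetical helper)."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3  # drop a trailing partial codon
    codons = [orf[i:i + 3] for i in range(0, usable, 3)]
    # Skip codons containing ambiguous bases such as N.
    counts = Counter(c for c in codons if set(c) <= set("ACGT"))
    total = sum(counts.values())
    return [counts[c] / total if total else 0.0 for c in CODONS]


def z_curve(seq):
    """Simplified, length-normalized Z-curve components (x, y, z).

    x: purine vs. pyrimidine, y: amino vs. keto, z: weak vs. strong hydrogen bonding.
    Assumption: computed over the whole sequence rather than per codon position.
    """
    n = Counter(seq.upper())
    length = n["A"] + n["C"] + n["G"] + n["T"]
    if length == 0:
        return (0.0, 0.0, 0.0)
    x = (n["A"] + n["G"] - n["C"] - n["T"]) / length
    y = (n["A"] + n["C"] - n["G"] - n["T"]) / length
    z = (n["A"] + n["T"] - n["G"] - n["C"]) / length
    return (x, y, z)


if __name__ == "__main__":
    fragment = "ATGAAATAA"
    features = monocodon_usage(fragment) + list(z_curve(fragment))
    print(len(features))  # 64 codon frequencies plus 3 Z-curve components
```

In a fused-feature setup, vectors like these would be concatenated with the other feature families before being passed to the classifier.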

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Jinny X. Zhang ◽  
Boyan Yordanov ◽  
Alexander Gaunt ◽  
Michael X. Wang ◽  
Peng Dai ◽  
...  

Targeted high-throughput DNA sequencing is a primary approach for genomics and molecular diagnostics and, more recently, has served as a readout for DNA information storage. Oligonucleotide probes used to enrich gene loci of interest have different hybridization kinetics, resulting in non-uniform coverage that increases sequencing costs and decreases sequencing sensitivities. Here, we present a deep learning model (DLM) for predicting Next-Generation Sequencing (NGS) depth from DNA probe sequences. Our DLM includes a bidirectional recurrent neural network that takes as input both the DNA nucleotide identities and the calculated probability of each nucleotide being unpaired. We apply our DLM to three different NGS panels: a 39,145-plex panel for human single nucleotide polymorphisms (SNP), a 2000-plex panel for human long non-coding RNA (lncRNA), and a 7373-plex panel targeting non-human sequences for DNA information storage. In cross-validation, our DLM predicts sequencing depth to within a factor of 3 with 93% accuracy for the SNP panel, and 99% accuracy for the non-human panel. In independent testing, the DLM predicts the lncRNA panel with 89% accuracy when trained on the SNP panel. The same model is also effective at predicting the measured single-plex kinetic rate constants of DNA hybridization and strand displacement.
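The "within a factor of 3" accuracy figure quoted above can be read as the fraction of probes whose predicted depth lies between one third and three times the observed depth. The sketch below is one plausible way to compute such a metric; the function name `within_factor_accuracy` and the example depths are illustrative assumptions, not taken from the paper.

```python
import math


def within_factor_accuracy(predicted, observed, factor=3.0):
    """Fraction of probes with predicted depth within `factor`-fold of observed depth.

    Assumption: non-positive depths cannot be compared on a fold scale and
    are counted as misses.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not predicted:
        raise ValueError("at least one probe is required")
    log_factor = math.log(factor)
    hits = sum(
        1
        for p, o in zip(predicted, observed)
        if p > 0 and o > 0 and abs(math.log(p / o)) <= log_factor
    )
    return hits / len(predicted)


if __name__ == "__main__":
    # Made-up depths for three probes: fold errors of 2.5x, 4x, and 1x.
    predicted = [100.0, 10.0, 500.0]
    observed = [250.0, 40.0, 500.0]
    print(within_factor_accuracy(predicted, observed))  # two of three probes are hits
```

Comparing on a log scale treats over- and under-prediction symmetrically, which suits read depths that span several orders of magnitude.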


2013 ◽  
Vol 15 (9) ◽  
pp. 721-728 ◽  
Author(s):  
Katrina A.B. Goddard ◽  
Evelyn P. Whitlock ◽  
Jonathan S. Berg ◽  
Marc S. Williams ◽  
Elizabeth M. Webber ◽  
...  

2008 ◽  
Vol 18 (10) ◽  
pp. 1638-1642 ◽  
Author(s):  
D. R. Smith ◽  
A. R. Quinlan ◽  
H. E. Peckham ◽  
K. Makowsky ◽  
W. Tao ◽  
...  

2011 ◽  
Vol 16 (11-12) ◽  
pp. 512-519 ◽  
Author(s):  
Peter M. Woollard ◽  
Nalini A.L. Mehta ◽  
Jessica J. Vamathevan ◽  
Stephanie Van Horn ◽  
Bhushan K. Bonde ◽  
...  

Genes ◽  
2018 ◽  
Vol 9 (9) ◽  
pp. 429 ◽  
Author(s):  
Daniela Barros-Silva ◽  
C. Marques ◽  
Rui Henrique ◽  
Carmen Jerónimo

DNA methylation is an epigenetic modification that plays a pivotal role in regulating gene expression and, consequently, influences a wide variety of biological processes and diseases. Advances in next-generation sequencing technologies allow genome-wide profiling of methyl marks at both single-nucleotide and single-cell resolution. These profiling approaches vary in many aspects, such as DNA input, resolution, coverage, and bioinformatics analysis. Thus, selecting the most feasible method for a project's purpose requires in-depth knowledge of these techniques. Currently, high-throughput sequencing techniques are intensively used in epigenomic profiling, which ultimately aims to find novel biomarkers for detection, diagnosis, prognosis, and prediction of response to therapy, as well as to discover new targets for personalized treatments. Here, we briefly review the evolution of next-generation sequencing methodologies for profiling DNA methylation, highlighting their potential for translational medicine and presenting significant findings in several diseases.


2021 ◽  
Author(s):  
Ahmed S Fahad ◽  
Cheng Yu Chung ◽  
Sheila N. Lopez Acevedo ◽  
Nicoleen Boyle ◽  
Bharat Madan ◽  
...  

Functional analyses of the T cell receptor (TCR) landscape can reveal critical information about protection from disease and molecular responses to vaccines. However, it has proven difficult to combine advanced next-generation sequencing technologies with methods to decode the peptide-major histocompatibility complex (pMHC) specificity of individual TCRs. Here we developed a new high-throughput approach to enable repertoire-scale functional evaluations of natively paired TCRs. In particular, we leveraged the immortalized nature of physically linked TCRα:β amplicon libraries to analyze binding against multiple recombinant pMHCs on a repertoire scale. To exemplify the utility of this approach, we also performed affinity-based functional mapping in conjunction with quantitative next-generation sequencing to track antigen-specific TCRs. These data validated a new immortalization and screening platform to facilitate detailed molecular analyses of human TCRs against diverse antigen targets associated with health, vaccination, or disease.


2018 ◽  
Vol 15 (2) ◽  
pp. 367-372
Author(s):  
Lê Ngọc Giang ◽  
Lưu Hàn Ly ◽  
Nguyễn Mai Phương ◽  
Lê Tùng Lâm ◽  
Đỗ Thị Huyền ◽  
...  

Microorganisms in the ruminant rumen, particularly bacteria, are valuable genetic resources of interest to many scientists. In recent years, next-generation sequencing technologies have made it possible to directly decode the metagenomic DNA extracted from an ecological community without culturing, increasing the efficiency of mining genes of interest. Notably, the quantity and quality of the extracted DNA play an important role in obtaining a reliable metagenome database. In this study, metagenomic DNA from goat rumen fluid was extracted by five different methods: RBB (repeated bead beating plus column), RBBC (repeated bead beating), PSP1 and PSP2 (PSP®Spin Stool DNA Kit, protocols 1 and 2, Germany), and QIA (QIAamp® DNA Stool Mini Kit, Germany). The results showed that the metagenomic DNA obtained by all methods had A260/280 ratios greater than 1.8. DNA extracted by the RBB method had a high concentration but low A260/230 values (less than 1.4) and still contained Taq polymerase inhibitors. After purification on a QIA column, the A260/230 values of RBB-extracted DNA increased significantly, up to 2.0, and the Taq polymerase inhibitors were removed from the samples. However, the concentrations decreased by 57%, to a level nearly equivalent to that of the metagenomic DNA obtained by QIA. The PSP®Spin Stool DNA Kit produced the highest DNA concentrations (from 149.7 to 195.5 ng/µl), with A260/280 ratios of 1.9 and A260/230 ratios of 1.8 to 1.9. Moreover, this method removed polymerase inhibitors and could be performed in a short time. Therefore, the PSP®Spin Stool DNA Kit is a suitable method for extracting bacterial metagenomic DNA from goat rumen. DNA obtained by this method fulfilled all quality and concentration criteria for Illumina next-generation sequencing.

