Deciphering gene regulation from gene expression dynamics using deep neural network

2018 ◽  
Author(s):  
Jingxiang Shen ◽  
Mariela D. Petkova ◽  
Yuhai Tu ◽  
Feng Liu ◽  
Chao Tang

Abstract: Complex biological functions are carried out by the interaction of genes and proteins. Uncovering the gene regulation network behind a function is one of the central themes in biology. Typically, it involves extensive experiments in genetics, biochemistry and molecular biology. In this paper, we show that much of the inference task can be accomplished by a deep neural network (DNN), a form of machine learning or artificial intelligence. Specifically, the DNN learns from the dynamics of the gene expression. The learnt DNN behaves like an accurate simulator of the system, on which one can perform in-silico experiments to reveal the underlying gene network. We demonstrate the method with two examples: biochemical adaptation and the gap-gene patterning in fruit fly embryogenesis. In the first example, the DNN can successfully find the two basic network motifs for adaptation: the negative feedback and the incoherent feed-forward. In the second and much more complex example, the DNN can accurately predict behaviors of essentially all the mutants. Furthermore, the regulation network it uncovers is strikingly similar to the one inferred from experiments. In doing so, we develop methods for deciphering the gene regulation network hidden in the DNN “black box”. Our interpretable DNN approach should have broad applications in genotype-phenotype mapping. Significance: Complex biological functions are carried out by gene regulation networks. The mapping between gene network and function is a central theme in biology. The task usually involves extensive experiments with perturbations to the system (e.g. gene deletion). Here, we demonstrate that machine learning, in the form of a deep neural network (DNN), can help reveal the underlying gene regulation for a given function or phenotype with minimal perturbation data. Specifically, after training with wild-type gene expression dynamics data and a few mutant snapshots, the DNN learns to behave like an accurate simulator for the genetic system, which can be used to predict other mutants’ behaviors. Furthermore, our DNN approach is biochemically interpretable, which helps uncover possible gene regulatory mechanisms underlying the observed phenotypic behaviors.

2019 ◽  
Vol 14 (6) ◽  
pp. 551-561
Author(s):  
Shengxian Cao ◽  
Yu Wang ◽  
Zhenhao Tang

Background: Time series expression data of genes contain relations among different genes, which are difficult to model precisely. Slime-forming bacteria are one of the three major harmful bacteria types in industrial circulating cooling water systems. Objective: This study aimed to construct a gene regulation network (GRN) for slime-forming bacteria to understand the microbial fouling mechanism. Methods: For this purpose, an Adaptive Elman Neural Network (AENN) is proposed to reveal the relationships among genes using gene expression time series. The parameters of the Elman neural network were optimized adaptively by a Genetic Algorithm (GA), and a Pearson correlation analysis was applied to discover the relationships among genes. In addition, gene expression data of slime-forming bacteria obtained by transcriptome sequencing were presented. Results: To evaluate the proposed method, we compared it with several alternative data-driven approaches, including a Neural Fuzzy Recurrent Network (NFRN), a basic Elman Neural Network (ENN), and an ensemble network. The experimental results on simulated and real datasets demonstrate that the proposed approach has promising performance for modeling GRNs. We also applied the proposed method to GRN construction for slime-forming bacteria, and a GRN for 6 genes was constructed. Conclusion: The proposed GRN construction method can effectively extract the regulations among genes. This is also the first report to construct the GRN for slime-forming bacteria.
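The Pearson correlation step named in the Methods can be illustrated with a minimal sketch. This is not the authors' AENN pipeline: the gene names, the toy time series, the `candidate_edges` helper and the 0.8 cutoff are all hypothetical, chosen only to show how pairwise correlation over expression time series could flag candidate gene relationships.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def candidate_edges(expression, threshold=0.8):
    """Return (gene1, gene2, r) for pairs with |r| >= threshold.

    The threshold is a hypothetical cutoff, not a value from the paper.
    """
    genes = sorted(expression)
    edges = []
    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            r = pearson(expression[g1], expression[g2])
            if abs(r) >= threshold:
                edges.append((g1, g2, r))
    return edges


# Made-up expression time series for three hypothetical genes.
toy_expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.0, 4.0, 6.0, 8.0, 10.0],  # scales with geneA
    "geneC": [3.0, 1.0, 4.0, 1.0, 5.0],   # only weakly related to geneA
}
edges = candidate_edges(toy_expression)
```

A correlation-based edge is undirected and says nothing about causality, so in practice it would presumably serve as a screening step alongside the dynamic model rather than as the network itself.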


2018 ◽  
Vol 49 (4) ◽  
pp. 1492-1498 ◽  
Author(s):  
Hai-Ting Xu ◽  
Jian-Chun Guo ◽  
Hua-Zhen Liu ◽  
Wan-wan Jin

Background/Aims: Major burn injury is one of the main severe forms of wound and leads to substantial clinical debilitation. This study aimed to identify core biomarkers for recovery from severe burn injury. Methods: Gene expression profiles (GSE19743) were downloaded from the Gene Expression Omnibus (GEO), followed by background correction, normalization of the raw microarray dataset, and identification of differentially expressed genes (DEGs). Soft clustering of DEGs was used to screen for gene clusters with sustained increasing (SIG) and sustained decreasing (SDG) expression profiles over the course of recovery from burn injury. The significantly enriched Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways of SIGs and SDGs were obtained through the Database for Annotation, Visualization, and Integrated Discovery (DAVID), based on which the miRNA-gene regulation networks for SIGs and SDGs were constructed from the miRWalk database. Results: Ten clusters were obtained through soft clustering. The SIGs and SDGs were found to be closely associated with biological processes of the immune system. The miRNA-gene regulation network analysis suggested different roles for SIGs and SDGs in the recovery from severe burn injury. Furthermore, a number of important biomarkers were identified, which could be helpful in the treatment of burn patients. Conclusion: Our current findings suggest an interesting molecular link to transcriptional regulation potentially involved in the immunosuppressive state after major burn injury, which warrants further exploration for use in the treatment of major burn injury.


2020 ◽  
Vol 8 (10) ◽  
pp. 766
Author(s):  
Dohan Oh ◽  
Julia Race ◽  
Selda Oterkus ◽  
Bonguk Koo

Mechanical damage is recognized as a problem that reduces the performance of oil and gas pipelines and has been the subject of continuous research. Artificial neural networks, which have recently attracted attention, are expected to offer another way to address pipeline problems. This study applies a deep neural network, a machine learning method built on the artificial neural network algorithm. The applicability of machine learning techniques such as deep neural networks to the prediction of burst pressure has been investigated for dented API 5L X-grade pipelines. To this end, supervised learning is employed; the deep neural network model has four layers, including three hidden layers, and uses fully connected layers. The burst pressure computed by the deep neural network model has been compared with the results of a parametric study based on finite element analysis and with the burst pressure obtained from experimental results. The comparison showed good agreement. Therefore, it is concluded that deep neural networks can be another solution for predicting the burst pressure of API 5L X-grade dented pipelines.
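The architecture described above (fully connected, three hidden layers) can be sketched as a plain forward pass. Everything beyond that description is an assumption made for illustration: the reading of "four layers" as three hidden layers plus one output layer, the four unnamed input features, the hidden width of 8, the ReLU activation and the random untrained weights. The output is therefore not a meaningful burst pressure, only a demonstration of the data flow.

```python
import random


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """Fully connected layer: out[j] = sum_i inputs[i] * weights[i][j] + biases[j]."""
    return [
        sum(x * row[j] for x, row in zip(inputs, weights)) + biases[j]
        for j in range(len(biases))
    ]


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases (untrained, for illustration only)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_out)] for _ in range(n_in)]
    return weights, [0.0] * n_out


def build_network(n_features, hidden_sizes, rng):
    """Hidden layers followed by a single-output regression layer."""
    sizes = [n_features] + list(hidden_sizes) + [1]
    return [init_layer(a, b, rng) for a, b in zip(sizes[:-1], sizes[1:])]


def predict(network, features):
    """Forward pass: ReLU after each hidden layer, linear output."""
    activation = list(features)
    for weights, biases in network[:-1]:
        activation = relu(dense(activation, weights, biases))
    weights, biases = network[-1]
    return dense(activation, weights, biases)[0]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Assumed sizes: 4 generic input features, three hidden layers of width 8.
network = build_network(n_features=4, hidden_sizes=(8, 8, 8), rng=rng)
estimate = predict(network, [0.5, 0.1, 0.3, 0.9])
```

In a supervised setting such as the one described, the weights would be fitted to input-output pairs (for example from the finite element parametric study) by minimizing a regression loss; that training loop is omitted here.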


2019 ◽  
Vol 10 (36) ◽  
pp. 8374-8383 ◽  
Author(s):  
Mohammad Atif Faiz Afzal ◽  
Aditya Sonpal ◽  
Mojtaba Haghighatlari ◽  
Andrew J. Schultz ◽  
Johannes Hachmann

Computational pipeline for the accelerated discovery of organic materials with high refractive index via high-throughput screening and machine learning.


2020 ◽  
Author(s):  
Muhammad Afzal ◽  
Fakhare Alam ◽  
Khalid Mahmood Malik ◽  
Ghaus M Malik

BACKGROUND Automatic text summarization (ATS) enables users to retrieve meaningful evidence from big data of biomedical repositories to make complex clinical decisions. Deep neural and recurrent networks outperform traditional machine-learning techniques in areas of natural language processing and computer vision; however, they are yet to be explored in the ATS domain, particularly for medical text summarization. OBJECTIVE Traditional approaches in ATS for biomedical text suffer from fundamental issues such as an inability to capture clinical context, quality of evidence, and purpose-driven selection of passages for the summary. We aimed to circumvent these limitations through achieving precise, succinct, and coherent information extraction from credible published biomedical resources, and to construct a simplified summary containing the most informative content that can offer a review particular to clinical needs. METHODS In our proposed approach, we introduce a novel framework, termed Biomed-Summarizer, that provides quality-aware Patient/Problem, Intervention, Comparison, and Outcome (PICO)-based intelligent and context-enabled summarization of biomedical text. Biomed-Summarizer integrates the prognosis quality recognition model with a clinical context–aware model to locate text sequences in the body of a biomedical article for use in the final summary. First, we developed a deep neural network binary classifier for quality recognition to acquire scientifically sound studies and filter out others. Second, we developed a bidirectional long-short term memory recurrent neural network as a clinical context–aware classifier, which was trained on semantically enriched features generated using a word-embedding tokenizer for identification of meaningful sentences representing PICO text sequences. 
Third, we calculated the similarity between query and PICO text sequences using Jaccard similarity with semantic enrichments, where the semantic enrichments are obtained using medical ontologies. Last, we generated a representative summary from the high-scoring PICO sequences aggregated by study type, publication credibility, and freshness score. RESULTS Evaluation of the prognosis quality recognition model using a large dataset of biomedical literature related to intracranial aneurysm showed an accuracy of 95.41% (2562/2686) in terms of recognizing quality articles. The clinical context–aware multiclass classifier outperformed the traditional machine-learning algorithms, including support vector machine, gradient boosted tree, linear regression, K-nearest neighbor, and naïve Bayes, by achieving 93% (16127/17341) accuracy for classifying five categories: aim, population, intervention, results, and outcome. The semantic similarity algorithm achieved a significant Pearson correlation coefficient of 0.61 (0-1 scale) on a well-known BIOSSES dataset (with 100 pair sentences) after semantic enrichment, representing an improvement of 8.9% over baseline Jaccard similarity. Finally, we found a highly positive correlation among the evaluations performed by three domain experts concerning different metrics, suggesting that the automated summarization is satisfactory. CONCLUSIONS By employing the proposed method Biomed-Summarizer, high accuracy in ATS was achieved, enabling seamless curation of research evidence from the biomedical literature to use for clinical decision-making.
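The similarity step in the Methods, Jaccard similarity with semantic enrichment, can be illustrated with a minimal sketch. It is not the Biomed-Summarizer implementation: the whitespace tokenizer, the `enrich` helper and the tiny synonym table are hypothetical stand-ins for the ontology-based enrichment the abstract describes.

```python
def tokenize(text):
    """Lowercase whitespace tokenization into a set (a simplifying assumption)."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two token sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def enrich(tokens, synonyms):
    """Add related terms for each token; `synonyms` stands in for an ontology lookup."""
    enriched = set(tokens)
    for token in tokens:
        enriched.update(synonyms.get(token, ()))
    return enriched


# Hypothetical synonym table; a real system would query medical ontologies.
TOY_SYNONYMS = {
    "treatment": {"therapy"},
    "therapy": {"treatment"},
    "aneurysm": {"aneurysms"},
    "aneurysms": {"aneurysm"},
}

query = tokenize("ruptured aneurysm treatment")
sentence = tokenize("therapy for ruptured aneurysms")

baseline = jaccard(query, sentence)
enriched = jaccard(enrich(query, TOY_SYNONYMS), enrich(sentence, TOY_SYNONYMS))
```

On this toy pair the baseline score is 1/6 because only "ruptured" overlaps, while enrichment raises it to 5/6, which shows why enrichment can help when a query and a sentence use different surface forms for the same concept.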


2021 ◽  
Author(s):  
Mohammed Ayub ◽  
SanLinn Kaka

Abstract: Manual first-break picking from a large volume of seismic data is extremely tedious and costly. Deployment of machine learning models makes the process fast and cost effective. However, these machine learning models require highly representative and effective features for accurate automatic picking. Therefore, a First-Break (FB) picking classification model that uses a minimal number of effective features and promises efficient performance is proposed. Variants of Recurrent Neural Networks (RNNs) such as Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) can retain contextual information from many previous time steps. We exploit this advantage for FB picking, as seismic traces are amplitude values of vibration along the time axis. We use the behavioral fluctuation of amplitude as input features for LSTM and GRU. The models are trained on noisy data and tested for generalization on original traces not seen during the training and validation process. In order to analyze real-time suitability, the performance is benchmarked using accuracy, F1-measure and three other established metrics. We have trained two RNN models and two deep Neural Network models for FB classification using only amplitude values as features. Both LSTM and GRU have an accuracy and F1-measure of 94.20%. With the same features, the Convolutional Neural Network (CNN) has an accuracy of 93.58% and an F1-score of 93.63%. The Deep Neural Network (DNN) model has scores of 92.83% and 92.59% for accuracy and F1-measure, respectively. From the experimental results, we see significantly superior performance of LSTM and GRU over CNN and DNN when the same features are used. To assess the robustness of the LSTM and GRU models, their performance is compared with that of a DNN model trained using nine features derived from seismic traces, and the RNN models are again observed to perform better. Therefore, it is safe to conclude that RNN models (LSTM and GRU) are capable of classifying FB events efficiently even when using a minimal number of features that are not computationally expensive. The novelty of our work is the capability of automatic FB classification with RNN models that incorporate contextual behavioral information without the need for sophisticated feature extraction or engineering techniques, which in turn can help reduce cost and make the classification model more robust and faster.
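The two headline metrics reported above, accuracy and F1-measure, can be computed as in the following minimal sketch. It assumes a binary labeling (1 for a first-break sample, 0 otherwise), which the abstract does not state explicitly, and the label vectors are made up for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary labeling."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of samples classified correctly."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Made-up labels: 1 marks a first-break sample, 0 marks any other sample.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
```

Accuracy and F1 coincide on this toy example, but they can diverge sharply when first-break samples are rare relative to other samples, which is one reason to report both.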

