Evolutionarily informed deep learning methods: Predicting transcript abundance from DNA sequence

2018 ◽  
Author(s):  
Jacob D. Washburn ◽  
Maria Katherine Mejia-Guerra ◽  
Guillaume Ramstein ◽  
Karl A. Kremling ◽  
Ravi Valluru ◽  
...  

Abstract Deep learning methodologies have revolutionized prediction in many fields, and show potential to do the same in molecular biology and genetics. However, applying these methods in their current forms ignores evolutionary dependencies within biological systems and can result in false positives and spurious conclusions. We developed two novel approaches that account for evolutionary relatedness in machine learning models: 1) gene-family guided splitting, and 2) ortholog contrasts. The first approach accounts for evolution by constraining the model's training and testing sets to include different gene families. The second uses evolutionarily informed comparisons between orthologous genes to both control for and leverage evolutionary divergence during the training process. The two approaches were explored and validated within the context of mRNA expression level prediction, and have prediction auROC values ranging from 0.72 to 0.94. Model weight inspections showed biologically interpretable patterns, resulting in the novel hypothesis that the 3′ UTR is more important for fine-tuning mRNA abundance levels while the 5′ UTR is more important for large-scale changes.

2019 ◽  
Vol 116 (12) ◽  
pp. 5542-5549 ◽  
Author(s):  
Jacob D. Washburn ◽  
Maria Katherine Mejia-Guerra ◽  
Guillaume Ramstein ◽  
Karl A. Kremling ◽  
Ravi Valluru ◽  
...  

Deep learning methodologies have revolutionized prediction in many fields and show potential to do the same in molecular biology and genetics. However, applying these methods in their current forms ignores evolutionary dependencies within biological systems and can result in false positives and spurious conclusions. We developed two approaches that account for evolutionary relatedness in machine learning models: (i) gene-family–guided splitting and (ii) ortholog contrasts. The first approach accounts for evolution by constraining model training and testing sets to include different gene families. The second approach uses evolutionarily informed comparisons between orthologous genes to both control for and leverage evolutionary divergence during the training process. The two approaches were explored and validated within the context of mRNA expression level prediction and have area under the ROC curve (auROC) values ranging from 0.75 to 0.94. Model weight inspections showed biologically interpretable patterns, resulting in the hypothesis that the 3′ UTR is more important for fine-tuning mRNA abundance levels while the 5′ UTR is more important for large-scale changes.
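The gene-family–guided splitting described above is straightforward to sketch. The following is a minimal illustration (not the authors' code; the gene and family IDs are hypothetical) of holding out whole gene families so that no family spans both the training and testing sets:

```python
import random

def family_guided_split(gene_to_family, test_fraction=0.2, seed=0):
    """Split genes so that no gene family spans both train and test.

    gene_to_family: dict mapping gene IDs to gene-family IDs.
    Returns (train_genes, test_genes) with whole families held out.
    """
    families = sorted({fam for fam in gene_to_family.values()})
    rng = random.Random(seed)
    rng.shuffle(families)
    n_test = max(1, int(len(families) * test_fraction))
    test_families = set(families[:n_test])
    train = [g for g, f in gene_to_family.items() if f not in test_families]
    test = [g for g, f in gene_to_family.items() if f in test_families]
    return train, test

# Toy example with hypothetical gene/family IDs:
genes = {"geneA1": "famA", "geneA2": "famA", "geneB1": "famB", "geneC1": "famC"}
train_genes, test_genes = family_guided_split(genes, test_fraction=0.34)
```

Grouping the split by family rather than by individual gene prevents a model from scoring well merely by memorizing sequence features shared among close paralogs.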


2020 ◽  
Author(s):  
Yuan Yuan ◽  
Lei Lin

Satellite image time series (SITS) classification is a major research topic in remote sensing and is relevant for a wide range of applications. Deep learning approaches have been commonly employed for SITS classification and have provided state-of-the-art performance. However, deep learning methods suffer from overfitting when labeled data is scarce. To address this problem, we propose a novel self-supervised pre-training scheme to initialize a Transformer-based network by utilizing large-scale unlabeled data. In detail, the model is asked to predict randomly contaminated observations given an entire time series of a pixel. The main idea of our proposal is to leverage the inherent temporal structure of satellite time series to learn general-purpose spectral-temporal representations related to land cover semantics. Once pre-training is completed, the pre-trained network can be further adapted to various SITS classification tasks by fine-tuning all the model parameters on small-scale task-related labeled data. In this way, the general knowledge and representations about SITS can be transferred to a label-scarce task, thereby improving the generalization performance of the model as well as reducing the risk of overfitting. Comprehensive experiments have been carried out on three benchmark datasets over large study areas. Experimental results demonstrate the effectiveness of the proposed method, leading to a classification accuracy improvement of 1.91% to 6.69%. <div><b>This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.</b></div>


Database ◽  
2019 ◽  
Vol 2019 ◽  
Author(s):  
Tao Chen ◽  
Mingfen Wu ◽  
Hexi Li

Abstract The automatic extraction of meaningful relations from biomedical literature or clinical records is crucial in various biomedical applications. Most of the current deep learning approaches for medical relation extraction require large-scale training data to prevent overfitting of the training model. We propose using a pre-trained model and a fine-tuning technique to improve these approaches without additional time-consuming human labeling. Firstly, we show the architecture of Bidirectional Encoder Representations from Transformers (BERT), an approach for pre-training a model on large-scale unstructured text. We then combine BERT with a one-dimensional convolutional neural network (1d-CNN) to fine-tune the pre-trained model for relation extraction. Extensive experiments on three datasets, namely the BioCreative V chemical disease relation corpus, traditional Chinese medicine literature corpus and i2b2 2012 temporal relation challenge corpus, show that the proposed approach achieves state-of-the-art results (giving a relative improvement of 22.2, 7.77, and 38.5% in F1 score, respectively, compared with a traditional 1d-CNN classifier). The source code is available at https://github.com/chentao1999/MedicalRelationExtraction.
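The BERT-plus-1d-CNN fine-tuning architecture described above can be sketched independently of any deep learning framework. Below is a minimal NumPy illustration, not the released code: all shapes and weights are stand-ins, and the stack of token embeddings is a placeholder for actual BERT encoder outputs. It shows a one-dimensional convolution over the sequence, followed by ReLU, global max pooling, and a linear layer producing relation-class logits:

```python
import numpy as np

def cnn_head(H, W_conv, b_conv, W_out, b_out):
    """A minimal 1d-CNN classification head over contextual embeddings.

    H:      (seq_len, d) token embeddings from a pre-trained encoder.
    W_conv: (k, d, f) convolution filters of width k with f output channels.
    """
    k, d, f = W_conv.shape
    seq_len = H.shape[0]
    # Slide a width-k window over the sequence (valid convolution).
    conv = np.stack([
        np.tensordot(H[i:i + k], W_conv, axes=([0, 1], [0, 1])) + b_conv
        for i in range(seq_len - k + 1)
    ])                                          # (seq_len - k + 1, f)
    pooled = np.maximum(conv, 0).max(axis=0)    # ReLU + global max pooling
    return pooled @ W_out + b_out               # relation-class logits

rng = np.random.default_rng(0)
H = rng.normal(size=(12, 16))                   # stand-in for BERT outputs
logits = cnn_head(H, rng.normal(size=(3, 16, 8)), np.zeros(8),
                  rng.normal(size=(8, 4)), np.zeros(4))
```

In the actual approach the convolutional head and the pre-trained encoder are fine-tuned jointly on the labeled relation-extraction corpus.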


2019 ◽  
Author(s):  
Jon M. Laurent ◽  
Riddhiman K. Garge ◽  
Ashley I. Teufel ◽  
Claus O. Wilke ◽  
Aashiq H. Kachroo ◽  
...  

Abstract Despite over a billion years of evolutionary divergence, several thousand human genes possess clearly identifiable orthologs in yeast, and many have undergone lineage-specific duplications in one or both lineages. The ortholog conjecture postulates that orthologous genes between species retain ancestral functions despite divergence over vast timescales, but duplicated genes will be free to diverge in function. However, the retention of ancestral functions among co-orthologs between species and within gene families has been difficult to test experimentally at scale. To investigate how ancestral functions are retained or lost post-duplication, we systematically replaced hundreds of essential yeast genes with their human orthologs from gene families that have undergone lineage-specific duplications, including those with single duplications (one yeast gene to two human genes, 1:2) or higher-order expansions (1:>2) in the human lineage. We observe a variable pattern of replaceability across different ortholog classes, with a clear trend toward differential replaceability within gene families; replaceability by all members of a family was rarely observed. We quantify the ability of various properties of the orthologs to predict replaceability, showing that in the case of 1:2 orthologs, replaceability is predicted largely by the divergence and tissue-specific expression of the human co-orthologs, i.e., the human proteins that are less diverged from their yeast counterpart and more ubiquitously expressed across human tissues more often replace their single yeast ortholog. These trends were consistent with in silico simulations demonstrating that when only one ortholog is replaceable, it tends to be the least diverged of the pair. Replaceability of yeast genes having more than two human co-orthologs was marked by retention of orthologous interactions in functional or protein networks as well as by more ancestral subcellular localization.
Overall, we performed >400 human gene replaceability assays revealing 56 new human-yeast complementation pairs, thus opening up avenues to further functionally characterize these human genes in a simplified organismal context.


2020 ◽  
Author(s):  
Xu Cheng ◽  
Chen Song ◽  
Yongxiang Gu ◽  
Beijing Chen ◽  
Lin Zhou ◽  
...  

Abstract Artificial intelligence has been widely applied to intelligent surveillance analysis and security problems in recent years. Although many multimedia security approaches based on deep learning models have been proposed, challenges remain in their performance that deserve in-depth research. On one hand, the high computational complexity of current deep learning methods makes them hard to apply in real-time scenarios. On the other hand, fine-tuning the network online using only the object state in the first frame makes it difficult to obtain video-specific features and fails to capture the object's rich appearance variations. To solve these two issues, this paper proposes an effective object tracking method with learned attention that achieves object localization and reduces training time within an adversarial learning framework. First, a prediction network is designed to track the object in video sequences. The object positions of the first ten frames are employed to fine-tune the prediction network, which can fully mine the specific features of an object. Second, the prediction network is integrated into a generative adversarial network framework, which randomly generates masks to capture object appearance variations by adaptively dropping out input features. Third, we present a spatial attention mechanism to improve tracking performance. The proposed network can identify the mask that maintains the most robust features of the object over a long temporal span. Extensive experiments on two large-scale benchmarks demonstrate that the proposed algorithm performs favorably against state-of-the-art methods.
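The adversarial dropout idea, randomly masking input features so the tracker must survive appearance changes, can be sketched simply. This is an illustrative NumPy fragment, not the paper's implementation; the mask-selection rule shown (keep the mask whose occlusion hurts a scoring function most) is one plausible reading of the approach:

```python
import numpy as np

def random_feature_masks(shape, n_masks, drop_rate, seed=0):
    """Generate binary masks that randomly drop input features,
    mimicking the adversarial dropout used to simulate appearance change."""
    rng = np.random.default_rng(seed)
    return (rng.random((n_masks,) + shape) >= drop_rate).astype(np.float32)

def hardest_mask(features, masks, score_fn):
    """Pick the mask whose occlusion hurts the tracker's score the most;
    training against it encourages robustness to appearance variation."""
    scores = [score_fn(features * m) for m in masks]
    return masks[int(np.argmin(scores))]

feat = np.ones((4, 4))                          # stand-in feature map
masks = random_feature_masks(feat.shape, n_masks=8, drop_rate=0.3)
worst = hardest_mask(feat, masks, score_fn=lambda f: f.sum())
```

In the full method the masks are produced by the generator of a GAN and the scoring comes from the tracking network itself, rather than the toy sum used here.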


2020 ◽  
Author(s):  
Yuan Yuan ◽  
Lei Lin

<div>Satellite image time series (SITS) classification is a major research topic in remote sensing and is relevant for a wide range of applications. Deep learning approaches have been commonly employed for SITS classification and have provided state-of-the-art performance. However, deep learning methods suffer from overfitting when labeled data is scarce. To address this problem, we propose a novel self-supervised pre-training scheme to initialize a Transformer-based network by utilizing large-scale unlabeled data. In detail, the model is asked to predict randomly contaminated observations given an entire time series of a pixel. The main idea of our proposal is to leverage the inherent temporal structure of satellite time series to learn general-purpose spectral-temporal representations related to land cover semantics. Once pre-training is completed, the pre-trained network can be further adapted to various SITS classification tasks by fine-tuning all the model parameters on small-scale task-related labeled data. In this way, the general knowledge and representations about SITS can be transferred to a label-scarce task, thereby improving the generalization performance of the model as well as reducing the risk of overfitting. Comprehensive experiments have been carried out on three benchmark datasets over large study areas. Experimental results demonstrate the effectiveness of the proposed method, leading to a classification accuracy improvement of 2.38% to 5.27%. The code and the pre-trained model will be available at https://github.com/linlei1214/SITS-BERT upon publication.</div><div><b>This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.</b></div>
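The self-supervised objective described here, predicting randomly contaminated observations in a pixel's time series, can be sketched as follows. This is an illustrative NumPy fragment under assumed shapes, not the SITS-BERT code; the Gaussian-noise corruption is one simple choice of contamination:

```python
import numpy as np

def mask_time_series(series, mask_rate, rng):
    """Randomly 'contaminate' observations in a pixel time series.

    series: (T, bands) spectral observations. Returns the corrupted
    series and a boolean mask marking timesteps to be reconstructed.
    """
    T = series.shape[0]
    mask = rng.random(T) < mask_rate
    corrupted = series.copy()
    corrupted[mask] = rng.normal(size=(mask.sum(), series.shape[1]))
    return corrupted, mask

def reconstruction_loss(pred, target, mask):
    """Mean squared error over the contaminated timesteps only."""
    return float(((pred[mask] - target[mask]) ** 2).mean())

rng = np.random.default_rng(42)
ts = rng.normal(size=(24, 10))   # two years of hypothetical 10-band observations
corrupted, mask = mask_time_series(ts, mask_rate=0.15, rng=rng)
```

During pre-training the Transformer sees `corrupted` and is penalized only on the masked timesteps, which forces it to exploit the temporal structure of the uncorrupted observations.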


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Laila Rasmy ◽  
Yang Xiang ◽  
Ziqian Xie ◽  
Cui Tao ◽  
Degui Zhi

Abstract Deep learning (DL)-based predictive models from electronic health records (EHRs) deliver impressive performance in many clinical tasks. Large training cohorts, however, are often required by these models to achieve high accuracy, hindering the adoption of DL-based models in scenarios with limited training data. Recently, bidirectional encoder representations from transformers (BERT) and related models have achieved tremendous successes in the natural language processing domain. The pretraining of BERT on a very large training corpus generates contextualized embeddings that can boost the performance of models trained on smaller datasets. Inspired by BERT, we propose Med-BERT, which adapts the BERT framework originally developed for the text domain to the structured EHR domain. Med-BERT is a contextualized embedding model pretrained on a structured EHR dataset of 28,490,650 patients. Fine-tuning experiments showed that Med-BERT substantially improves the prediction accuracy, boosting the area under the receiver operating characteristic curve (AUC) by 1.21–6.14% in two disease prediction tasks from two clinical databases. In particular, pretrained Med-BERT obtains promising performance on tasks with small fine-tuning training sets and can boost the AUC by more than 20% or obtain an AUC as high as a model trained on a training set ten times larger, compared with deep learning models without Med-BERT. We believe that Med-BERT will benefit disease prediction studies with small local training datasets, reduce data collection expenses, and accelerate the pace of artificial intelligence-aided healthcare.
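The AUC metric used throughout these fine-tuning comparisons has a compact definition worth recalling: it is the probability that a randomly chosen positive case is scored above a randomly chosen negative one. A self-contained sketch via the rank-sum (Mann–Whitney) identity:

```python
def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney identity:
    the probability a random positive outranks a random negative,
    counting ties as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Perfect ranking gives 1.0; random scoring hovers near 0.5.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

This pairwise formulation makes clear why an AUC gain of a few percent, as reported for Med-BERT, corresponds to a directly interpretable improvement in ranking patients by risk.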


2021 ◽  
Author(s):  
Geoffrey F. Schau ◽  
Hassan Ghani ◽  
Erik A. Burlingame ◽  
Guillaume Thibault ◽  
Joe W. Gray ◽  
...  

Abstract Accurate diagnosis of metastatic cancer is essential for prescribing optimal control strategies to halt further spread of metastasizing disease. While pathological inspection aided by immunohistochemistry staining provides a valuable gold standard for clinical diagnostics, deep learning methods have emerged as powerful tools for identifying clinically relevant features of whole slide histology relevant to a tumor's metastatic origin. Although deep learning models require significant training data to learn effectively, transfer learning paradigms provide mechanisms to circumvent limited training data by first training a model on related data prior to fine-tuning on smaller data sets of interest. In this work we propose a transfer learning approach that trains a convolutional neural network to infer the metastatic origin of tumor tissue from whole slide images of hematoxylin and eosin (H&E) stained tissue sections and illustrate the advantages of pre-training the network on whole slide images of primary tumor morphology. We further characterize statistical dissimilarity between primary and metastatic tumors of various indications on patch-level images to highlight limitations of our indication-specific transfer learning approach. Using a primary-to-metastatic transfer learning approach, we achieved a mean class-specific area under the receiver operating characteristic curve (AUROC) of 0.779, which outperformed comparable models trained on only images of primary tumor (mean AUROC of 0.691) or trained on only images of metastatic tumor (mean AUROC of 0.675), supporting the use of large-scale primary tumor imaging data in developing computer vision models to characterize metastatic origin of tumor lesions.
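The primary-to-metastatic transfer scheme, train on abundant related data and then fine-tune on the small target set, can be illustrated with a deliberately simple model. The sketch below warm-starts a logistic regression instead of a convolutional network, and all data are synthetic; the pattern of reusing source-task weights as the fine-tuning starting point is the same:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic(X, y, w=None, lr=0.1, steps=200):
    """Logistic regression by gradient descent; pass `w` to warm-start
    from weights learned on a related (e.g. primary-tumor) dataset."""
    w = np.zeros(X.shape[1]) if w is None else w.copy()
    for _ in range(steps):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w -= lr * grad
    return w

rng = np.random.default_rng(1)
X_src, X_tgt = rng.normal(size=(500, 5)), rng.normal(size=(30, 5))
w_true = np.array([1.0, -2.0, 0.5, 0.0, 1.5])   # shared labeling rule
y_src = (X_src @ w_true > 0).astype(float)
y_tgt = (X_tgt @ w_true > 0).astype(float)

w_pre = train_logistic(X_src, y_src)             # "pre-train" on the large set
w_ft = train_logistic(X_tgt, y_tgt, w=w_pre)     # fine-tune on the small set
```

With a deep network the analogue is initializing from the primary-tumor checkpoint before fine-tuning, rather than from random weights.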


2020 ◽  
Author(s):  
Caroline M. Nieberding ◽  
Patrícia Beldade ◽  
Véronique Baumlé ◽  
Gilles San Martin ◽  
Alok Arun ◽  
...  

Abstract Unraveling the origin of molecular pathways underlying the evolution of adaptive traits is essential for understanding how new lineages emerge, including the relative contribution of conserved, ancestral traits, and newly evolved, derived traits. Here, we investigated the evolutionary divergence of sex pheromone communication from moths (mostly nocturnal) to butterflies (mostly diurnal) that occurred ~98 million years ago. In moths, females typically emit pheromones to attract male mates, but in butterflies pheromones are emitted by males and used by females for mate choice. The molecular bases of sex pheromone communication are well understood in moths, but have remained virtually unexplored in butterflies. We used a combination of transcriptomics, real-time qPCR, and phylogenetics to identify genes involved in different steps of sex pheromone communication in the butterfly Bicyclus anynana. Our results show that the biosynthesis and reception of sex pheromones relies both on moth-specific gene families (reductases) and on more ancestral insect gene families (desaturases, olfactory receptors, odorant binding proteins). Interestingly, B. anynana further appears to use what was believed to be the moth-specific neuropeptide Pheromone Biosynthesis Activating Neuropeptide (PBAN) for regulation of sex pheromone production. Altogether, our results suggest that a mosaic pattern best explains how sex pheromone communication evolved in butterflies, with some molecular components derived from moths, and others conserved from more ancient insect ancestors. This is the first large-scale analysis of the genetic pathways underlying sex pheromone communication in a butterfly.



