Deep Learning with Implicit Handling of Tissue-Specific Phenomena Predicts Tumor DNA Accessibility and Immune Activity

Deep Learning Implicitly Handles Tissue Specific Phenomena to Predict Tumor DNA Accessibility and Immune Activity

iScience; 10.1016/j.isci.2019.09.018; 2019; Vol 20; pp. 119-136; Cited By ~ 2

Author(s): Kamil Wnuk, Jeremi Sudol, Kevin B. Givechian, Patrick Soon-Shiong, Shahrooz Rabizadeh, ...

Keyword(s): Deep Learning, Tissue Specific, Immune Activity, DNA Accessibility, Tumor DNA

Deep learning with implicit handling of tissue-specific phenomena predicts tumor DNA accessibility and immune activity

10.1101/229385; 2017; Cited By ~ 1

Author(s): Kamil Wnuk, Jeremi Sudol, Kevin B. Givechian, Patrick Soon-Shiong, Shahrooz Rabizadeh, ...

Keyword(s): Gene Expression, Chromatin State, Open Chromatin, Dynamic Feature, Tissue Specific, Immune Activity, DNA Accessibility, Immune Pathways, Patient Prognosis, The Impact

Abstract:

DNA accessibility is a key dynamic feature of chromatin regulation that can potentiate transcriptional events and tumor progression. Recently, neural networks have begun to make it possible to explore the impact of mutations on DNA accessibility and transcriptional regulation by demonstrating state-of-the-art prediction of chromatin features from DNA sequence data in specific tissue types. We demonstrate enhancements to improve such tissue-specific prediction performance, and show that by extending models with RNA-seq expression input, they can be applied to novel tissue samples whose types were not present in training. We show that our expression-informed model achieved particularly consistent accuracy predicting DNA accessibility at promoter and promoter flank regions of the genome.

Leveraging this new tool to analyze tumor genomes across tissues, we provide a first glimpse of the DNA accessibility landscape across The Cancer Genome Atlas (TCGA). Our analysis of the Lung Adenocarcinoma (LUAD) cohort reveals that viewing tumors from the perspective of accessibility at promoters uniquely highlights several immune pathways inversely correlated with an overall more open chromatin state. Further, through identification of accessibility sites linked with differential gene expression in immune-inflamed LUAD tumors and training of a classifier ensemble, we show that patterns of predicted chromatin state are discriminative of immune activity across many tumor types, with direct implications for patient prognosis. We see such models playing a significant future role in matching patients to appropriate immunotherapy treatment regimens, as well as in analysis of other conditions where epigenetic state may play a significant role.

Significance Statement:

DNA accessibility determines whether proteins have access to DNA-binding sites and is a key dynamic feature that influences regulation of gene expression that differentiates cells. We improve and extend a neural network model in a way that expands its application domain beyond studying the impact of genetic sequence and mutations on DNA accessibility in specific cell types, to tissues for which training data is unavailable.

Leveraging our tool to analyze tumor genomes, we demonstrate that in lung adenocarcinomas the accessibility perspective uniquely highlights immune pathways inversely correlated with a more accessible DNA state. Further, we show that accessibility patterns learned from even a single tumor type can discriminate immune inflammation across many cancers, often with direct relation to patient prognosis.
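As a rough illustration of what "extending models with RNA-seq expression input" could look like at the input level, the sketch below one-hot encodes a DNA window and pairs it with a per-sample expression vector. The function names, the all-zero handling of unknown bases, and the log2(x + 1) transform are assumptions made for illustration; the abstract does not describe the authors' actual encoding or architecture.

```python
import math

BASES = "ACGT"

def one_hot_dna(seq):
    """Encode a DNA string as a list of 4-element rows (A, C, G, T).

    Bases outside ACGT (e.g. N) are left as all-zero rows (an assumption).
    """
    rows = []
    for base in seq.upper():
        row = [0.0, 0.0, 0.0, 0.0]
        idx = BASES.find(base)
        if idx >= 0:
            row[idx] = 1.0
        rows.append(row)
    return rows

def build_model_input(seq, expression):
    """Pair sequence features with a sample-level expression vector.

    Hypothetical layout: the log2(x + 1) transform is an illustrative choice.
    """
    seq_features = one_hot_dna(seq)
    expr_features = [math.log2(value + 1.0) for value in expression]
    return seq_features, expr_features

# Toy inputs, not data from the paper.
seq_x, expr_x = build_model_input("ACGTN", [0.0, 3.0, 7.0])
print(len(seq_x), len(seq_x[0]), expr_x)
```

Because the expression vector describes the sample rather than the sequence, a model fed both inputs can in principle be queried for a tissue it never saw during training, which is the property the abstract highlights.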

Uncovering tissue-specific binding features from differential deep learning

10.1101/606269; 2019; Cited By ~ 1

Author(s): Mike Phuycharoen, Peyman Zarrineh, Laure Bridoux, Shilu Amin, Marta Losa, ...

Keyword(s): Deep Learning, Binding Sites, Specific Binding, Specific Expression, Tissue Specific, Dimensional Classification, Branchial Arches, Gradient Based, Differential Binding

Abstract:

Motivation: Transcription factors (TFs) can bind DNA in a cooperative manner, enabling a mutual increase in occupancy. Through this type of interaction, alternative binding sites can be preferentially bound in different tissues to regulate tissue-specific expression programmes. Recently, deep learning models have become state-of-the-art in various pattern analysis tasks, including applications in the field of genomics. We therefore investigate the application of convolutional neural network (CNN) models to the discovery of sequence features determining cooperative and differential TF binding across tissues.

Results: We analyse ChIP-seq data from MEIS, TFs which are broadly expressed across mouse branchial arches, and HOXA2, which is expressed in the second and more posterior branchial arches. By developing models predictive of MEIS differential binding in all three tissues we are able to accurately predict HOXA2 co-binding sites. We evaluate transfer-like and multitask approaches to regularising the high-dimensional classification task with a larger regression dataset, allowing for creation of deeper and more accurate models. We test the performance of perturbation and gradient-based attribution methods in identifying the HOXA2 sites from differential MEIS data. Our results show that deep regularised models significantly outperform shallow CNNs as well as k-mer methods in the discovery of tissue-specific sites bound in vivo.

Availability: For implementation and models please visit https://doi.org/10.5281/zenodo.2635463.

TS-m6A-DL: Tissue-specific identification of N6-methyladenosine sites using a universal deep learning model

Computational and Structural Biotechnology Journal; 10.1016/j.csbj.2021.08.014; 2021

Author(s): Zeeshan Abbas, Hilal Tayara, Quan Zou, Kil To Chong

Keyword(s): Deep Learning, Learning Model, Tissue Specific, Specific Identification, Deep Learning Model

Abstract 393: Predicting DNA accessibility in the pan-cancer tumor genome using RNA-Seq, WGS, and deep learning

10.1158/1538-7445.am2017-393; 2017; Cited By ~ 3

Author(s): Kamil Wnuk, Jeremi Sudol, Shahrooz Rabizadeh, Patrick Soon-Shiong, Christopher Szeto, ...

Keyword(s): Deep Learning, RNA Seq, DNA Accessibility, Cancer Tumor, Tumor Genome, Pan Cancer

Attentive deep learning-based tumor-only somatic mutation classifier achieves high accuracy agnostic of tissue type and capture kit.

10.1101/2021.12.07.471513; 2021

Author(s): R. Tyler McLaughlin, Maansi Asthana, Marc Di Meo, Michele Ceccarelli, Howard J. Jacob, ...

Keyword(s): Machine Learning, Deep Learning, State Of The Art, Variant Calling, Learning Model, Tissue Type, Germline Variants, Art Performance, Deep Learning Model, Tumor DNA

In precision oncology, reliable identification of tumor-specific DNA mutations requires sequencing tumor DNA and non-tumor DNA (so-called "matched normal") from the same patient. The normal sample allows researchers to distinguish acquired (somatic) and hereditary (germline) variants. The ability to distinguish somatic and germline variants facilitates estimation of tumor mutation burden (TMB), which is a recently FDA-approved pan-cancer marker for highly successful cancer immunotherapies; in tumor-only variant calling (i.e., without a matched normal), the difficulty in discriminating germline and somatic variants results in inflated and unreliable TMB estimates. We apply machine learning to the task of somatic vs germline classification in tumor-only samples using TabNet, a recently developed attentive deep learning model for tabular data that has achieved state-of-the-art performance in multiple classification tasks (Arik and Pfister 2019). We constructed a training set for supervised classification using features derived from tumor-only variant calling, with somatic and germline truth labels drawn from an independent pipeline incorporating the patient-matched normal samples. Our trained model achieved state-of-the-art performance on two hold-out test datasets: a TCGA dataset including sarcoma, breast adenocarcinoma, and endometrial carcinoma samples (F1-score: 88.3) and a metastatic melanoma dataset (F1-score: 79.8). Concordance between matched-normal and tumor-only TMB improves from R² = 0.006 to 0.705 with the addition of our classifier. Importantly, this approach generalizes across tumor tissue types and capture kits and has a call rate of 100%. The interpretable feature masks of the attentive deep learning model explain the reasons for misclassified variants. We reproduce the recent finding that tumor-only TMB estimates for Black patients are extremely inflated relative to those for White patients due to the racial biases of germline databases. We show that our machine learning approach appreciably reduces this racial bias in tumor-only variant calling.
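The two kinds of metric quoted in this abstract can be sketched as follows. This is a minimal illustration, assuming that F1 is computed from confusion counts for the somatic class and that R² means the squared Pearson correlation between matched-normal and tumor-only TMB; the abstract does not state which R² definition was used. The function names and the toy numbers are hypothetical and are not results from the paper. The function returns F1 as a fraction, whereas the abstract reports it on a 0-100 scale.

```python
def f1_score(tp, fp, fn):
    """F1 as the harmonic mean of precision and recall, from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def pearson_r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov * cov / (var_x * var_y)

# Made-up toy values for illustration only.
matched_normal_tmb = [1.0, 2.0, 3.0, 4.0]
tumor_only_tmb = [2.0, 4.0, 6.0, 8.0]

print(f1_score(tp=8, fp=2, fn=2))
print(pearson_r_squared(matched_normal_tmb, tumor_only_tmb))
```

Note that squared correlation only measures how linearly related the two TMB estimates are: the toy tumor-only values are uniformly doubled yet still correlate perfectly, so a high R² alone does not rule out systematic inflation.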

Tissue-specific cell-free DNA degradation quantifies circulating tumor DNA burden

Nature Communications; 10.1038/s41467-021-22463-y; 2021; Vol 12 (1)

Author(s): Guanhua Zhu, Yu A. Guo, Danliang Ho, Polly Poon, Zhong Wee Poh, ...

Keyword(s): Disease Progression, Cancer Patients, Circulating Tumor DNA, Specific Cell, Regulatory Regions, Cell Free DNA, Invasive Approach, Tissue Specific, Free DNA, Tumor DNA

Abstract: Profiling of circulating tumor DNA (ctDNA) may offer a non-invasive approach to monitor disease progression. Here, we develop a quantitative method, exploiting local tissue-specific cell-free DNA (cfDNA) degradation patterns, that accurately estimates ctDNA burden independent of genomic aberrations. Nucleosome-dependent cfDNA degradation at promoters and first exon-intron junctions is strongly associated with differential transcriptional activity in tumors and blood. A quantitative model, based on just 6 regulatory regions, could accurately predict ctDNA levels in colorectal cancer patients. Strikingly, a model restricted to blood-specific regulatory regions could predict ctDNA levels across both colorectal and breast cancer patients. Using compact targeted sequencing (<25 kb) of predictive regions, we demonstrate how the approach could enable quantitative low-cost tracking of ctDNA dynamics and disease progression.

Uncovering tissue-specific binding features from differential deep learning

Nucleic Acids Research; 10.1093/nar/gkaa009; 2020; Vol 48 (5); pp. e27-e27; Cited By ~ 2

Author(s): Mike Phuycharoen, Peyman Zarrineh, Laure Bridoux, Shilu Amin, Marta Losa, ...

Keyword(s): Deep Learning, Binding Sites, Specific Binding, Specific Expression, Tissue Specific, Dimensional Classification, Branchial Arches, Gradient Based, Differential Binding

Abstract: Transcription factors (TFs) can bind DNA in a cooperative manner, enabling a mutual increase in occupancy. Through this type of interaction, alternative binding sites can be preferentially bound in different tissues to regulate tissue-specific expression programmes. Recently, deep learning models have become state-of-the-art in various pattern analysis tasks, including applications in the field of genomics. We therefore investigate the application of convolutional neural network (CNN) models to the discovery of sequence features determining cooperative and differential TF binding across tissues. We analyse ChIP-seq data from MEIS, TFs which are broadly expressed across mouse branchial arches, and HOXA2, which is expressed in the second and more posterior branchial arches. By developing models predictive of MEIS differential binding in all three tissues, we are able to accurately predict HOXA2 co-binding sites. We evaluate transfer-like and multitask approaches to regularizing the high-dimensional classification task with a larger regression dataset, allowing for the creation of deeper and more accurate models. We test the performance of perturbation and gradient-based attribution methods in identifying the HOXA2 sites from differential MEIS data. Our results show that deep regularized models significantly outperform shallow CNNs as well as k-mer methods in the discovery of tissue-specific sites bound in vivo.
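The perturbation-based attribution mentioned in this abstract can be illustrated with a small in silico mutagenesis sketch: each base is substituted in turn and the resulting drop in a model's score is recorded. Here `toy_score` is a hypothetical stand-in for a trained CNN, and the motif string is arbitrary; neither comes from the paper, and the authors' actual attribution procedure may differ.

```python
BASES = "ACGT"

def toy_score(seq, motif="TGAT"):
    """Stand-in for a trained model: best fraction of motif positions
    matched over all windows of the sequence (hypothetical scorer)."""
    best = 0.0
    for start in range(len(seq) - len(motif) + 1):
        window = seq[start:start + len(motif)]
        matches = sum(1 for a, b in zip(window, motif) if a == b)
        best = max(best, matches / len(motif))
    return best

def perturbation_attribution(seq, score_fn):
    """For each position, the largest score drop caused by any
    single-base substitution at that position."""
    reference = score_fn(seq)
    attributions = []
    for i, original in enumerate(seq):
        drops = []
        for base in BASES:
            if base == original:
                continue
            mutated = seq[:i] + base + seq[i + 1:]
            drops.append(reference - score_fn(mutated))
        attributions.append(max(drops))
    return attributions

# Toy sequence with the arbitrary motif embedded at positions 2-5.
scores = perturbation_attribution("CCTGATCC", toy_score)
print(scores)
```

Positions whose mutation lowers the score most are the ones the scorer depends on. Gradient-based methods approximate the same quantity from a single backward pass instead of one forward pass per substitution.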

Abstract 693: Deep learning analysis of circulating tumor DNA identified a pan-tumor molecular subtype with enhanced response to durvalumab (anti-PDL1)

10.1158/1538-7445.am2019-693; 2019

Author(s): Song Wu, Sriram Sridhar, Han Si, Michael Kuziora, Judson Englert, ...

Keyword(s): Deep Learning, Molecular Subtype, Circulating Tumor DNA, Learning Analysis, Tumor DNA

Abstract 693: Deep learning analysis of circulating tumor DNA identified a pan-tumor molecular subtype with enhanced response to durvalumab (anti-PDL1)

10.1158/1538-7445.sabcs18-693; 2019

Author(s): Song Wu, Sriram Sridhar, Han Si, Michael Kuziora, Judson Englert, ...

Keyword(s): Deep Learning, Molecular Subtype, Circulating Tumor DNA, Learning Analysis, Tumor DNA
