A machine learning approach predicts essential genes and pharmacological targets in cancer

Machine learning prediction of oncology drug targets based on protein and network properties

10.21203/rs.2.15798/v1 ◽

2019 ◽

Author(s):

Zoltan Dezso ◽

Michele Ceccarelli

Keyword(s):

Machine Learning ◽

Clinical Trial ◽

Drug Target ◽

Drug Targets ◽

Validation Dataset ◽

Learning Approach ◽

Biological Functions ◽

Machine Learning Approach ◽

Network Properties ◽

Trial Drug

Abstract Background: The selection and prioritization of drug targets is a central problem in drug discovery. Computational approaches can leverage the growing volume of large-scale human genomics and proteomics data to perform in silico target identification, reducing the cost and time needed. Results: We developed a machine learning approach that scores proteins to generate a druggability score for novel targets. Our model incorporated 70 protein features, including properties derived from the sequence, features characterizing protein function, and network properties derived from the protein-protein interaction network. The advantage of this approach is that it is unbiased: even less-studied proteins with limited information about their function can score well, as most of the features are independent of the accumulated literature. We built models on a training set consisting of targets with approved drugs and a negative set of non-drug targets. The machine learning techniques help to identify the most important combination of features differentiating validated targets from non-targets. We validated our predictions on an independent set of clinical trial drug targets, achieving a high accuracy characterized by an AUC of 0.89. Our most predictive features included biological function of proteins, network centrality measures, protein essentiality, tissue specificity, localization and solvent accessibility. Our predictions, based on a small set of 102 validated oncology targets, recovered the majority of known drug targets and identified a novel set of proteins as drug target candidates. Conclusions: We developed a machine learning approach to prioritize proteins according to their similarity to approved drug targets. We have shown that the proposed method is highly predictive on a validation dataset consisting of 277 targets of clinical trial drugs, confirming that our computational approach is an efficient and cost-effective tool for drug target discovery and prioritization. Our predictions were based on oncology targets and cancer-relevant biological functions, resulting in significantly higher scores for targets of oncology clinical trial drugs than for targets of trial drugs for other indications. Our approach can be used to make indication-specific drug-target predictions by combining generic druggability features with indication-specific biological functions.
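The AUC quoted in the abstract can be read as the probability that a randomly chosen true target receives a higher score than a randomly chosen non-target. The sketch below computes that quantity directly from two lists of scores. It is only an illustration of the metric: the function name `roc_auc` and all score values are hypothetical, invented for this example, and are not taken from the paper.

```python
def roc_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical druggability scores, invented for illustration only.
trial_target_scores = [0.9, 0.8, 0.4]      # proteins that are known targets
non_target_scores = [0.7, 0.3, 0.2, 0.1]   # proteins that are not

auc = roc_auc(trial_target_scores, non_target_scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect separation of targets from non-targets.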


Machine learning prediction of oncology drug targets based on protein and network properties

10.21203/rs.2.15798/v2 ◽

2019 ◽

Author(s):

Zoltan Dezso ◽

Michele Ceccarelli

Keyword(s):

Machine Learning ◽

Clinical Trial ◽

Drug Target ◽

Drug Targets ◽

Validation Dataset ◽

Learning Approach ◽

Biological Functions ◽

Machine Learning Approach ◽

Network Properties ◽

Trial Drug


Predicting "Essential" Genes across Microbial Genomes: A Machine Learning Approach

2011 10th International Conference on Machine Learning and Applications and Workshops ◽

10.1109/icmla.2011.114 ◽

2011 ◽

Cited By ~ 8

Author(s):

K. Palaniappan ◽

S. Mukherjee

Keyword(s):

Machine Learning ◽

Essential Genes ◽

Learning Approach ◽

Microbial Genomes ◽

Machine Learning Approach


Machine learning approach to gene essentiality prediction: a review

Briefings in Bioinformatics ◽

10.1093/bib/bbab128 ◽

2021 ◽

Author(s):

Olufemi Aromolaran ◽

Damilare Aromolaran ◽

Itunuoluwa Isewon ◽

Jelili Oyelade

Keyword(s):

Machine Learning ◽

Gene Ontology ◽

Prediction Models ◽

Essential Gene ◽

Standard Procedure ◽

Essential Genes ◽

Limiting Factor ◽

Learning Approach ◽

Gene Essentiality ◽

Machine Learning Approach

Abstract Essential genes are critical for the growth and survival of any organism. The machine learning approach complements experimental methods to minimize the resources required for essentiality assays. Previous studies revealed the need to discover relevant features that significantly classify essential genes, improve the generalizability of prediction models across organisms, and construct a robust gold standard as the class label for the training data to enhance prediction. Findings also show that a significant limitation of the machine learning approach is predicting conditionally essential genes. The essentiality status of a gene can change due to a specific condition of the organism. This review examines various methods applied to the essential gene prediction task, their strengths and limitations, and the factors responsible for effective computational prediction of essential genes. We discuss categories of features and how they contribute to the classification performance of essentiality prediction models. Five categories of features, namely gene sequence, protein sequence, network topology, homology and gene ontology-based features, were generated for Caenorhabditis elegans to perform a comparative analysis of their essentiality prediction capacity. The gene ontology-based feature category outperformed the other categories, mainly because of its high correlation with the genes’ biological functions. However, the topology feature category provided the highest discriminatory power, making it more suitable for essentiality prediction. The major limiting factor in using machine learning to predict conditional gene essentiality is the unavailability of labeled data, for the conditions of interest, on which to train a classifier. Therefore, cooperative machine learning could further exploit models that can perform well in conditional essentiality predictions.
Short abstract Identification of essential genes is imperative because it provides an understanding of core structure and function and accelerates drug target discovery, among other uses. Recent studies have applied machine learning to complement the experimental identification of essential genes. However, several factors limit the performance of machine learning approaches. This review aims to present the standard procedure and resources available for predicting essential genes in organisms, and to highlight the factors responsible for the current limitations in using machine learning for conditional gene essentiality prediction. The choice of features and ML technique was identified as an important factor in predicting essential genes effectively.
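A per-category comparison of the kind described in the abstract can be sketched by evaluating the same classifier on each feature category separately. The example below is a minimal sketch under stated assumptions: it uses a nearest-centroid classifier with leave-one-out evaluation as a stand-in (the review does not name this method), and the category names `informative_toy` and `noisy_toy`, the labels and all feature values are hypothetical, invented so the contrast is visible. It does not reproduce any result from the review.

```python
from math import dist


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def loo_accuracy(features, labels):
    """Leave-one-out accuracy of a nearest-centroid classifier (labels are 0 or 1)."""
    correct = 0
    for i, (x, y) in enumerate(zip(features, labels)):
        # Hold out sample i and build class centroids from the remaining samples.
        rest = [(f, l) for j, (f, l) in enumerate(zip(features, labels)) if j != i]
        c1 = centroid([f for f, l in rest if l == 1])
        c0 = centroid([f for f, l in rest if l == 0])
        pred = 1 if dist(x, c1) < dist(x, c0) else 0
        correct += pred == y
    return correct / len(labels)


# Hypothetical labels: 1 = essential gene, 0 = non-essential gene.
labels = [1, 1, 1, 0, 0, 0]

# Hypothetical feature categories, one single-feature vector per gene.
categories = {
    "informative_toy": [[0.9], [0.8], [0.85], [0.1], [0.2], [0.15]],
    "noisy_toy": [[0.3], [0.1], [0.9], [0.6], [0.2], [0.8]],
}

accuracy_by_category = {
    name: loo_accuracy(rows, labels) for name, rows in categories.items()
}
for name, acc in accuracy_by_category.items():
    print(f"{name}: leave-one-out accuracy = {acc:.2f}")
```

Keeping the classifier and the evaluation protocol fixed while swapping only the feature columns is what makes the categories comparable. With real data, a metric that is robust to class imbalance would usually be preferable to plain accuracy.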


Machine learning prediction of oncology drug targets based on protein and network properties

10.21203/rs.2.15798/v3 ◽

2020 ◽

Author(s):

Zoltan Dezso ◽

Michele Ceccarelli

Keyword(s):

Machine Learning ◽

Clinical Trial ◽

Drug Target ◽

Drug Targets ◽

Validation Dataset ◽

Learning Approach ◽

Biological Functions ◽

Machine Learning Approach ◽

Network Properties ◽

Trial Drug


Constructing and Validating Geographically Refined HAZUS-MH4 Hurricane Wind Risk Models: A Machine Learning Approach

Advances in Hurricane Engineering ◽

10.1061/9780784412626.092 ◽

2012 ◽

Cited By ~ 2

Author(s):

D. Subramanian ◽

J. Salazar ◽

L. Duenas-Osorio ◽

R. Stein

Keyword(s):

Machine Learning ◽

Learning Approach ◽

Risk Models ◽

Hurricane Wind ◽

Machine Learning Approach


The impact of economic plans on the Chinese education system: a machine learning approach

CADMO ◽

10.3280/cad2018-001005 ◽

2018 ◽

pp. 37-49

Author(s):

Wenjun Lin ◽

Xuefu Xu ◽

Francesco Dell’Anna

Keyword(s):

Machine Learning ◽

Education System ◽

Learning Approach ◽

Chinese Education ◽

System A ◽

Machine Learning Approach ◽

The Impact


Improving Bandwidth Utilization and Fairness between TCP Flows based on a Machine-learning Approach

IEEJ Transactions on Electronics Information and Systems ◽

10.1541/ieejeiss.133.1259 ◽

2013 ◽

Vol 133 (6) ◽

pp. 1259-1268

Author(s):

Akihiro Shiozu ◽

Syunji Yazaki ◽

Kôki Abe

Keyword(s):

Machine Learning ◽

Learning Approach ◽

Bandwidth Utilization ◽

Machine Learning Approach


A Machine Learning Approach to Anaphora Resolution in Arabic

International Review on Computers and Software (IRECOS) ◽

10.15866/irecos.v9i12.4786 ◽

2014 ◽

Vol 9 (12) ◽

pp. 1956

Author(s):

Abdullatif Abolohom ◽

Nazlia Omar

Keyword(s):

Machine Learning ◽

Learning Approach ◽

Anaphora Resolution ◽

Machine Learning Approach


1552-P: Machine Learning Approach to Decision-Making for Initial Insulin Use in Japanese Patients with Type 2 Diabetes

Diabetes ◽

10.2337/db20-1552-p ◽

2020 ◽

Vol 69 (Supplement 1) ◽

pp. 1552-P

Author(s):

KAZUYA FUJIHARA ◽

MAYUKO H. YAMADA ◽

YASUHIRO MATSUBAYASHI ◽

MASAHIKO YAMAMOTO ◽

TOSHIHIRO IIZUKA ◽

...

Keyword(s):

Machine Learning ◽

Type 2 Diabetes ◽

Decision Making ◽

Japanese Patients ◽

Learning Approach ◽

Machine Learning Approach ◽

Insulin Use
