Computational prediction of cancer-gene function

Pingzhao Hu; Gary Bader; Dennis A. Wigle; Andrew Emili

doi:10.1038/nrc2036

On the automatic annotation of gene functions using observational data and phylogenetic trees

10.1101/2020.05.14.095687 ◽

2020 ◽

Author(s):

George G. Vega Yon ◽

Duncan C. Thomas ◽

John Morrison ◽

Huaiyu Mi ◽

Paul D. Thomas ◽

...

Keyword(s):

Gene Function ◽

Phylogenetic Trees ◽

Evolutionary Model ◽

Computational Prediction ◽

Gene Families ◽

R Package ◽

Biomedical Sciences ◽

Computationally Efficient ◽

Link Type ◽

Gene Functions

Abstract

Motivation: Gene function annotation is important for a variety of downstream analyses of genetic data. Yet experimental characterization of function remains costly and slow, making computational prediction an important endeavor. In this paper we use a probabilistic evolutionary model built upon phylogenetic trees and experimental Gene Ontology functional annotations that allows automated prediction of function for unannotated genes.

Results: We have developed a computationally efficient model of the evolution of gene annotations on phylogenies, based on a Bayesian framework that uses Markov Chain Monte Carlo for parameter estimation. Unlike previous approaches, our method is able to estimate parameters over many different phylogenetic trees and functions. The resulting parameters agree with biological intuition, such as the increased probability of function change following gene duplication. The method performs well on leave-one-out validation, and we further validated some of the predictions in the experimental scientific literature.

Availability: Our method has been implemented as an R package and is available online at https://github.com/USCBiostats/aphylo. Code needed to reproduce the tables and figures can be found at https://github.com/USCbiostats/aphylo-simulations.

Author summary: Understanding the individual role that genes play in life is a key issue in the biomedical sciences. While information regarding gene functions is continuously growing, the number of genes with unknown biological purpose is greater still. Because of this, scientists have dedicated much of their time to building and designing tools that automatically infer gene functions. In this paper, we present yet another attempt to do so.
While very simple, our model of gene-function evolution has some key features that have the potential to generate an impact in the field: (a) compared to other methods, ours is highly scalable, which means that it is possible to simultaneously analyze hundreds of what are known as gene families, comprising thousands of genes; (b) it supports our biological intuition, as the model's data-driven results agree with what theory dictates regarding how gene functions evolved; (c) notwithstanding its simplicity, the model's prediction accuracy is comparable to that of other, more complex alternatives; and (d) perhaps most importantly, it can be used both to support new annotations and to suggest areas in which existing annotations show inconsistencies that may indicate errors or controversies.
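
The kind of model the abstract describes can be illustrated with a small sketch. The code below is not the aphylo package and not the authors' exact model: it is a minimal two-state (has function / lacks function) pruning calculation on a toy tree, assuming hypothetical gain and loss probabilities that are larger on branches following a duplication. All names (`Node`, `RATES`, `ROOT_PRIOR`, `predict`) and all numeric values are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    children: List["Node"] = field(default_factory=list)
    duplication: bool = False    # True if this internal node is a duplication event
    state: Optional[int] = None  # 1 = has function, 0 = lacks it, None = unannotated


# Hypothetical per-branch (gain, loss) probabilities, keyed by whether the
# parent node is a duplication. Values are illustrative, not estimates.
RATES = {False: (0.05, 0.05), True: (0.20, 0.20)}
ROOT_PRIOR = 0.5  # assumed prior probability that the root has the function


def transition(parent_state: int, child_state: int, duplication: bool) -> float:
    """Probability of child_state given parent_state along one branch."""
    gain, loss = RATES[duplication]
    if parent_state == 0:
        return gain if child_state == 1 else 1.0 - gain
    return loss if child_state == 0 else 1.0 - loss


def partial_likelihood(node: Node, s: int) -> float:
    """Probability of all annotations at or below node, given node is in state s."""
    if node.state is not None and node.state != s:
        return 0.0
    like = 1.0
    for child in node.children:
        like *= sum(
            transition(s, t, node.duplication) * partial_likelihood(child, t)
            for t in (0, 1)
        )
    return like


def tree_likelihood(root: Node) -> float:
    """Likelihood of the observed annotations, summing over the root state."""
    return ((1.0 - ROOT_PRIOR) * partial_likelihood(root, 0)
            + ROOT_PRIOR * partial_likelihood(root, 1))


def predict(root: Node, leaf: Node) -> float:
    """Posterior probability that an unannotated leaf has the function."""
    original = leaf.state
    joint = []
    for s in (0, 1):
        leaf.state = s
        joint.append(tree_likelihood(root))
    leaf.state = original
    return joint[1] / (joint[0] + joint[1])


# Toy gene family: genes a and b are annotated, gene c is not.
a = Node("a", state=1)
b = Node("b", state=1)
c = Node("c")
dup = Node("dup", children=[b, c], duplication=True)
root = Node("root", children=[a, dup])

p_after_duplication = predict(root, c)

# Same tree, but treating the internal node as a speciation event instead.
dup.duplication = False
p_after_speciation = predict(root, c)
dup.duplication = True
```

Under these assumed rates the prediction for gene c is less confident when its parent node is a duplication than when it is a speciation, which mirrors the intuition stated in the abstract that function change is more likely after gene duplication.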

Progress and challenges in the computational prediction of gene function using networks: 2012-2013 update

F1000Research ◽

10.12688/f1000research.2-230.v1 ◽

2013 ◽

Vol 2 ◽

pp. 230 ◽

Cited By ~ 14

Author(s):

Paul Pavlidis ◽

Jesse Gillis

Keyword(s):

Gene Ontology ◽

Recent Work ◽

Gene Function ◽

Gold Standard ◽

Gene Network ◽

Computational Prediction ◽

Significant Part ◽

Guilt By Association

In an opinion published in 2012, we reviewed and discussed our studies of how gene network-based guilt-by-association (GBA) is impacted by confounds related to gene multifunctionality. We found that such confounds account for a significant part of the GBA signal and that, as a result, meaningfully evaluating and applying computationally guided GBA is more challenging than generally appreciated. We proposed that effort currently spent on incrementally improving algorithms would be better spent on identifying the features of data that do yield novel functional insights. We also suggested that part of the problem is the reliance by computational biologists on gold standard annotations such as the Gene Ontology. In the year since, there has been continued heavy activity in GBA-based research, including work that contributes to our understanding of the issues we raised. Here we review some of the most relevant recent work, including studies that point to new areas of progress and challenges.

The use of CRISPR/Cas9-based gene editing strategies to explore cancer gene function in mice

Current Opinion in Genetics & Development ◽

10.1016/j.gde.2020.12.005 ◽

2021 ◽

Vol 66 ◽

pp. 57-62

Author(s):

Louise van der Weyden ◽

Jos Jonkers ◽

David J Adams

Keyword(s):

Gene Function ◽

Gene Editing ◽

Cancer Gene

An Emerging Cell Kinetics Regulation Network: Integrated Control of Nucleotide Metabolism and Cancer Gene Function

ACS Symposium Series - Inosine Monophosphate Dehydrogenase ◽

10.1021/bk-2003-0839.ch004 ◽

2003 ◽

pp. 59-90 ◽

Cited By ~ 1

Author(s):

James L. Sherley

Keyword(s):

Gene Function ◽

Cell Kinetics ◽

Integrated Control ◽

Nucleotide Metabolism ◽

Cancer Gene ◽

Regulation Network

Computational prediction of SEG (single exon gene) function in humans

Frontiers in Bioscience ◽

10.2741/1627 ◽

2005 ◽

Vol 10 (1-3) ◽

pp. 1382 ◽

Cited By ~ 18

Author(s):

Meena K. Sakharkar

Keyword(s):

Gene Function ◽

Computational Prediction ◽

Single Exon Gene ◽

Exon Gene

Progress and challenges in the computational prediction of gene function using networks

F1000Research ◽

10.12688/f1000research.1-14.v1 ◽

2012 ◽

Vol 1 ◽

pp. 14 ◽

Cited By ~ 17

Author(s):

Paul Pavlidis ◽

Jesse Gillis

Keyword(s):

Network Analysis ◽

Gene Function ◽

Cross Validation ◽

Computational Prediction ◽

Function Prediction ◽

Computational Genomics ◽

Network Data ◽

Computational Approaches ◽

Genomics Research ◽

Guilt By Association

In this opinion piece, we attempt to unify recent arguments we have made that serious confounds affect the use of network data to predict and characterize gene function. The development of computational approaches to determine gene function is a major strand of computational genomics research. However, progress beyond using BLAST to transfer annotations has been surprisingly slow. We have previously argued that a large part of the reported success in using "guilt by association" in network data is due to the tendency of methods to simply assign new functions to already well-annotated genes. While such predictions will tend to be correct, they are generic; it is true, but not very helpful, that a gene with many functions is more likely to have any function. We have also presented evidence that much of the remaining performance in cross-validation cannot be usefully generalized to new predictions, making progressive improvement in analysis difficult to engineer. Here we summarize our findings about how these problems will affect network analysis, discuss some ongoing responses within the field to these issues, and consolidate some recommendations and speculation, which we hope will modestly increase the reliability and specificity of gene function prediction.
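
The multifunctionality argument above can be made concrete with a small sketch. The code below is an illustration only, not the authors' evaluation procedure: it ranks genes purely by how many other functions they already carry, uses no network data at all, and measures how well that ranking recovers membership in one target function. The toy annotation table and the names `auroc` and `multifunctionality_auroc` are hypothetical.

```python
from itertools import product

# Hypothetical toy annotation table: gene -> set of known functions.
annotations = {
    "g1": {"f1", "f2", "f3", "f4"},
    "g2": {"f1", "f2", "f3"},
    "g3": {"f2", "f4"},
    "g4": {"f1"},
    "g5": set(),
    "g6": {"f3"},
}


def auroc(scores, positives):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [scores[g] for g in scores if g in positives]
    neg = [scores[g] for g in scores if g not in positives]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative gene")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def multifunctionality_auroc(annotations, function):
    """Score each gene by its number of OTHER functions, ignoring any network."""
    positives = {g for g, fs in annotations.items() if function in fs}
    scores = {g: len(fs - {function}) for g, fs in annotations.items()}
    return auroc(scores, positives)


f1_auroc = multifunctionality_auroc(annotations, "f1")
```

On this toy table the annotation-count ranking scores above the 0.5 expected from a random ranking, even though it says nothing specific about the target function. A network-based method would need to beat such a baseline before its predictions could be credited to the network data.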
