A Phylogenomic Census of Molecular Functions Identifies Modern Thermophilic Archaea as the Most Ancient Form of Cellular Life

Archaea ◽  
2014 ◽  
Vol 2014 ◽  
pp. 1-15 ◽  
Author(s):  
Arshan Nasir ◽  
Kyung Mo Kim ◽  
Gustavo Caetano-Anollés

The origins of diversified life remain mysterious despite considerable efforts devoted to untangling the roots of the universal tree of life. Here we reconstructed phylogenies that described the evolution of molecular functions and the evolution of species directly from a genomic census of gene ontology (GO) definitions. We sampled 249 free-living genomes spanning organisms in the three superkingdoms of life, Archaea, Bacteria, and Eukarya, and used the abundance of GO terms as molecular characters to produce rooted phylogenetic trees. Results revealed an early thermophilic origin of Archaea that was followed by genome reduction events in microbial superkingdoms. Eukaryal genomes displayed extraordinary functional diversity and were enriched with hundreds of novel molecular activities not detected in the akaryotic microbial cells. Remarkably, the majority of these novel functions appeared quite late in evolution, synchronized with the diversification of the eukaryal superkingdom. The distribution of GO terms in superkingdoms confirms that Archaea appears to be the simplest and most ancient form of cellular life, while Eukarya is the most diverse and recent.

2015 ◽  
Vol 12 (4) ◽  
pp. 1235-1253 ◽  
Author(s):  
Shu-Bo Zhang ◽  
Jian-Huang Lai

Measuring the semantic similarity between pairs of terms in Gene Ontology (GO) can help to compare genes that cannot be compared by other computational methods. In this study, we proposed an integrated information-based similarity measurement (IISM) to calculate the semantic similarity between two GO terms by taking into account the multiple common ancestors that they share, and aggregating the semantic information and depth information of the non-redundant common ancestors. Our method searches for non-redundant common ancestors in an effective way. Validation experiments were conducted on both a gene expression dataset and a pathway dataset, and the experimental results suggest the superiority of our method over some existing methods.
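The idea of aggregating information over several non-redundant common ancestors can be illustrated with a minimal sketch. This is not the IISM formula, which the abstract does not give. The toy ontology, the annotation counts, the reading of "non-redundant" as "not a proper ancestor of another common ancestor", and the mean-IC aggregation are all assumptions made for illustration.

```python
import math

# Hypothetical toy ontology: child -> set of parents.
PARENTS = {
    "root": set(),
    "binding": {"root"},
    "catalysis": {"root"},
    "ion_binding": {"binding"},
    "kinase": {"catalysis", "binding"},
    "magnesium_kinase": {"kinase", "ion_binding"},
    "calcium_kinase": {"kinase", "ion_binding"},
    "lipid_kinase": {"kinase"},
}

# Hypothetical annotation counts, already propagated up to ancestors.
COUNTS = {
    "root": 100, "binding": 60, "catalysis": 50, "ion_binding": 20,
    "kinase": 25, "magnesium_kinase": 5, "calcium_kinase": 4, "lipid_kinase": 8,
}

def ancestors(term):
    """Return the term itself plus every term reachable through parent links."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def information_content(term):
    """IC(t) = -log p(t), with p(t) estimated from the annotation counts."""
    return -math.log(COUNTS[term] / COUNTS["root"])

def nonredundant_common_ancestors(a, b):
    """Common ancestors that are not a proper ancestor of another common ancestor."""
    common = ancestors(a) & ancestors(b)
    return {c for c in common
            if not any(c != d and c in ancestors(d) for d in common)}

def similarity(a, b):
    """Mean IC of the non-redundant common ancestors, scaled by the larger term IC."""
    nca = nonredundant_common_ancestors(a, b)
    shared = sum(information_content(c) for c in nca) / len(nca)
    denom = max(information_content(a), information_content(b))
    return shared / denom if denom > 0 else 1.0

print(nonredundant_common_ancestors("magnesium_kinase", "calcium_kinase"))
print(similarity("magnesium_kinase", "calcium_kinase"))
```

In this toy graph the two kinase terms share two non-redundant ancestors (`kinase` and `ion_binding`), so both contribute to the score instead of only the single most informative one.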


2017 ◽  
Author(s):  
Dat Duong ◽  
Wasi Uddin Ahmad ◽  
Eleazar Eskin ◽  
Kai-Wei Chang ◽  
Jingyi Jessica Li

Abstract. The Gene Ontology (GO) database contains GO terms that describe biological functions of genes. Previous methods for comparing GO terms have relied on the fact that GO terms are organized into a tree structure. In this paradigm, the locations of two GO terms in the tree dictate their similarity score. In this paper, we introduce two new solutions for this problem, by focusing instead on the definitions of the GO terms. We apply neural network based techniques from the natural language processing (NLP) domain. The first method does not rely on the GO tree, whereas the second indirectly depends on the GO tree. In our first approach, we compare two GO definitions by treating them as two unordered sets of words. The word similarity is estimated by a word embedding model that maps words into an N-dimensional space. In our second approach, we account for the word-ordering within a sentence. We use a sentence encoder to embed GO definitions into vectors and estimate how likely one definition entails another. We validate our methods in two ways. In the first experiment, we test the model’s ability to differentiate a true protein-protein network from a randomly generated network. In the second experiment, we test the model in identifying orthologs from randomly-matched genes in human, mouse, and fly. In both experiments, a hybrid of NLP and GO-tree based methods achieves the best classification accuracy. Availability: github.com/datduong/NLPMethods2CompareGOterms
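The first approach, comparing two definitions as unordered sets of words through word embeddings, can be sketched roughly as follows. The three-dimensional vectors are made up, and the symmetric best-match average is one common way to score two word sets; the abstract does not state the exact scoring function the authors use, so treat every name and number here as an assumption.

```python
import math

# Hypothetical 3-dimensional word vectors; a trained model would supply these.
EMBEDDINGS = {
    "protein": (0.9, 0.1, 0.0),
    "enzyme": (0.8, 0.3, 0.1),
    "binding": (0.1, 0.9, 0.2),
    "attachment": (0.2, 0.8, 0.3),
    "membrane": (0.0, 0.2, 0.9),
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

def tokens(definition):
    """Lowercase, split on whitespace, keep only words with a known vector."""
    return {w for w in definition.lower().split() if w in EMBEDDINGS}

def directed_score(words_a, words_b):
    """Average, over words in A, of the best cosine match found in B."""
    best = [max(cosine(EMBEDDINGS[a], EMBEDDINGS[b]) for b in words_b)
            for a in words_a]
    return sum(best) / len(best)

def definition_similarity(def_a, def_b):
    """Symmetric best-match average; word order is ignored by construction."""
    a, b = tokens(def_a), tokens(def_b)
    if not a or not b:
        return 0.0
    return 0.5 * (directed_score(a, b) + directed_score(b, a))

print(definition_similarity("protein binding", "enzyme attachment"))
print(definition_similarity("protein binding", "membrane"))
```

Because the definitions are reduced to sets, "protein binding" and "binding protein" score identically, which is the limitation the second, order-aware approach in the abstract is meant to address.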


2021 ◽  
Vol 41 (2) ◽  
Author(s):  
Takuya Morikawa ◽  
Hiroaki Ohishi ◽  
Kengo Kosaka ◽  
Tomofumi Shimojo ◽  
Akihiro Nagano ◽  
...  

Abstract. We have previously reported a novel homozygous 4-bp deletion in DDHD1 as the responsible variant for spastic paraplegia type 28 (SPG28; OMIM#609340). The variant causes a frameshift, resulting in a functionally null allele in the patient. DDHD1 encodes phospholipase A1 (PLA1) catalyzing phosphatidylinositol to lysophosphatidylinositol (LPI). To clarify the pathogenic mechanism of SPG28, we established Ddhd1 knockout mice (Ddhd1(−/−)) carrying a 5-bp deletion in Ddhd1, resulting in a premature termination of translation at a position similar to that of the patient. We observed a significant decrease in foot–base angle (FBA) in aged Ddhd1(−/−) (24 months of age) and a significant decrease in LPI 20:4 (sn-2) in Ddhd1(−/−) cerebra (26 months of age). These changes in FBA were not observed at 14 months of age. We also observed significant changes in the expression levels of 22 genes in the Ddhd1(−/−) cerebra (26 months of age). Gene Ontology (GO) terms relating to the nervous system and cell–cell communications were significantly enriched. We conclude that the reduced signaling of LPI 20:4 (sn-2) by PLA1 dysfunction is responsible for the locomotive abnormality in SPG28, further suggesting that the reduction of downstream signaling such as GPR55 which is agonized by LPI is involved in the pathogenesis of SPG28.


2010 ◽  
Vol 74 (4) ◽  
pp. 479-503 ◽  
Author(s):  
Trudy Torto-Alalibo ◽  
Candace W. Collmer ◽  
Michelle Gwinn-Giglio ◽  
Magdalen Lindeberg ◽  
Shaowu Meng ◽  
...  

SUMMARY Microbes form intimate relationships with hosts (symbioses) that range from mutualism to parasitism. Common microbial mechanisms involved in a successful host association include adhesion, entry of the microbe or its effector proteins into the host cell, mitigation of host defenses, and nutrient acquisition. Genes associated with these microbial mechanisms are known for a broad range of symbioses, revealing both divergent and convergent strategies. Effective comparisons among these symbioses, however, are hampered by inconsistent descriptive terms in the literature for functionally similar genes. Bioinformatic approaches that use homology-based tools are limited to identifying functionally similar genes based on similarities in their sequences. An effective solution to these limitations is provided by the Gene Ontology (GO), which provides a standardized language to describe gene products from all organisms. The GO comprises three ontologies that enable one to describe the molecular function(s) of gene products, the biological processes to which they contribute, and their cellular locations. Beginning in 2004, the Plant-Associated Microbe Gene Ontology (PAMGO) interest group collaborated with the GO consortium to extend the GO to accommodate terms for describing gene products associated with microbe-host interactions. Currently, over 900 terms that describe biological processes common to diverse plant- and animal-associated microbes are incorporated into the GO database. Here we review some unifying themes common to diverse host-microbe associations and illustrate how the new GO terms facilitate a standardized description of the gene products involved. We also highlight areas where new terms need to be developed, an ongoing process that should involve the whole community.


2008 ◽  
Vol 5 (2) ◽  
Author(s):  
Detlef Groth ◽  
Stefanie Hartmann ◽  
Georgia Panopoulou ◽  
Albert J. Poustka ◽  
Steffen Hennig

Summary. The functional annotation of genomic data has become a major task for the ever-growing number of sequencing projects. In order to address this challenge, we recently developed GOblet, a free web service for the annotation of anonymous sequences with Gene Ontology (GO) terms. However, to overcome limitations of the GO terminology, and to aid in understanding not only single components but also systemic interactions between the individual components, we have now extended the GOblet web service to also integrate pathway annotations. Furthermore, we extended and upgraded the data analysis pipeline with improved summaries, and added term enrichment and clustering algorithms. Finally, we are now making GOblet available as a stand-alone application for high-throughput processing on local machines. The advantages of this frequently requested feature are that a) the user can avoid restrictions of our web service for uploading and processing large amounts of data, and that b) confidential data can be analysed without insecure transfer to a public web server. The stand-alone version of the web service has been implemented using platform independent Tcl-scripts, which can be run with just a single runtime file utilizing the Starkit technology. The GOblet web service and the stand-alone application are freely available at http://goblet.molgen.mpg.de.


2014 ◽  
Vol 2014 ◽  
pp. 1-10 ◽  
Author(s):  
Jian Zhang ◽  
ZhiHao Xing ◽  
Mingming Ma ◽  
Ning Wang ◽  
Yu-Dong Cai ◽  
...  

Identifying disease genes is one of the most important topics in biomedicine and may facilitate studies on the mechanisms underlying disease. Age-related macular degeneration (AMD) is a serious eye disease; it typically affects older adults and results in a loss of vision due to retina damage. In this study, we attempt to develop an effective method for distinguishing AMD-related genes. Gene ontology and KEGG enrichment analyses of known AMD-related genes were performed, and a classification system was established. In detail, each gene was encoded into a vector of enrichment scores, computed between the gene set comprising the gene and its direct neighbors in STRING and each gene ontology term or KEGG pathway. Then certain feature-selection methods, including minimum redundancy maximum relevance and incremental feature selection, were adopted to extract key features for the classification system. As a result, 720 GO terms and 11 KEGG pathways were deemed the most important factors for predicting AMD-related genes.
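An enrichment score of this kind is often defined as the negative log10 of a hypergeometric tail probability. The abstract does not state the formula, so the sketch below assumes that definition. The gene identifiers, the background set, and the function names are hypothetical, and the gene set is assumed to lie inside the background.

```python
import math

def hypergeom_tail(population, successes, draws, observed):
    """P(X >= observed) when `draws` genes are sampled without replacement
    from `population` genes, of which `successes` carry the annotation."""
    total = math.comb(population, draws)
    upper = min(successes, draws)
    hits = sum(math.comb(successes, k) * math.comb(population - successes, draws - k)
               for k in range(observed, upper + 1))
    return hits / total

def enrichment_score(gene_set, term_genes, background):
    """-log10 of the tail probability for the overlap of gene_set and term_genes."""
    annotated = term_genes & background
    overlap = len(gene_set & annotated)
    p_value = hypergeom_tail(len(background), len(annotated), len(gene_set), overlap)
    return -math.log10(p_value)

# Hypothetical data: 10 background genes, 4 annotated with the term.
background = {f"g{i}" for i in range(10)}
term_genes = {"g0", "g1", "g2", "g3"}
# A query gene plus its direct network neighbours (assumed, not real STRING data).
gene_set = {"g0", "g1", "g2"}

print(enrichment_score(gene_set, term_genes, background))
```

Repeating the calculation for every GO term and KEGG pathway would give one score per feature, which is the kind of vector that feature-selection methods can then prune.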


2009 ◽  
Vol 38 (suppl_1) ◽  
pp. D204-D210 ◽  
Author(s):  
Huaiyu Mi ◽  
Qing Dong ◽  
Anushya Muruganujan ◽  
Pascale Gaudet ◽  
Suzanna Lewis ◽  
...  

2011 ◽  
Vol 09 (06) ◽  
pp. 681-695 ◽  
Author(s):  
MARCO A. ALVAREZ ◽  
CHANGHUI YAN

Existing methods for calculating semantic similarities between pairs of Gene Ontology (GO) terms and gene products often rely on external databases like Gene Ontology Annotation (GOA) that annotate gene products using the GO terms. This dependency leads to some limitations in real applications. Here, we present a semantic similarity algorithm (SSA) that relies exclusively on the GO. When calculating the semantic similarity between a pair of input GO terms, SSA takes into account the shortest path between them, the depth of their nearest common ancestor, and a novel similarity score calculated between the definitions of the involved GO terms. In our work, we use SSA to calculate semantic similarities between pairs of proteins by combining pairwise semantic similarities between the GO terms that annotate the involved proteins. The reliability of SSA was evaluated by comparing the resulting semantic similarities between proteins with the functional similarities between proteins derived from expert annotations or sequence similarity. Comparisons with existing state-of-the-art methods showed that SSA is highly competitive with the other methods. SSA provides a reliable measure for semantic similarity independent of external databases of functional-annotation observations.
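Two of the ingredients named above, the path length between two terms and the depth of their nearest common ancestor, need nothing beyond the ontology graph itself. The sketch below combines them with the classic Wu-Palmer ratio, which is swapped in here purely for illustration; it is not the SSA scoring function, and it omits the definition-based score. The toy graph is hypothetical, and the path is assumed to run through the common ancestor.

```python
from collections import deque

# Hypothetical toy ontology: child -> list of parents.
PARENTS = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "ion_binding": ["binding"],
    "kinase": ["catalysis"],
    "lipid_kinase": ["kinase"],
    "protein_kinase": ["kinase"],
}

def up_distances(term):
    """Fewest parent-link hops from `term` to each of its ancestors (BFS upward)."""
    dist = {term: 0}
    queue = deque([term])
    while queue:
        node = queue.popleft()
        for parent in PARENTS[node]:
            if parent not in dist:
                dist[parent] = dist[node] + 1
                queue.append(parent)
    return dist

def nearest_common_ancestor(a, b):
    """Common ancestor minimising the combined hop count from a and b."""
    da, db = up_distances(a), up_distances(b)
    common = set(da) & set(db)
    return min(common, key=lambda c: (da[c] + db[c], c))

def path_depth_similarity(a, b, root="root"):
    """Wu-Palmer style ratio: 2 * depth(nca) / (2 * depth(nca) + path length)."""
    if a == b:
        return 1.0
    da, db = up_distances(a), up_distances(b)
    nca = nearest_common_ancestor(a, b)
    path_length = da[nca] + db[nca]
    nca_depth = up_distances(nca)[root]
    return 2 * nca_depth / (2 * nca_depth + path_length)

print(nearest_common_ancestor("lipid_kinase", "protein_kinase"))
print(path_depth_similarity("lipid_kinase", "protein_kinase"))
```

Terms that meet only at the root score zero under this ratio, while siblings under a deep parent score close to one, which is why depth is a useful complement to raw path length.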


2012 ◽  
Vol 6 ◽  
pp. BBI.S9101
Author(s):  
Guvanch Ovezmyradov ◽  
Qianhao Lu ◽  
Martin C. Göpfert

The Gene Ontology (GO) initiative is a collaborative effort that uses controlled vocabularies for annotating genetic information. Here we present AGENDA (Application for mining Gene Ontology Data), a novel web-based tool for accessing the GO database. AGENDA allows the user to simultaneously retrieve and compare gene lists linked to different GO terms in diverse species using batch queries, facilitating comparative approaches to genetic information. The web-based application offers diverse search options and allows the user to bookmark, visualize, and download the results. AGENDA is an open source web-based application that is freely available for non-commercial use at the project homepage. URL: http://sourceforge.net/projects/bioagenda.


2018 ◽  
Vol 2018 ◽  
pp. 1-8
Author(s):  
YiMin Zhang ◽  
Li Shao ◽  
Ning Zhou ◽  
JianZhou Li ◽  
Yu Chen ◽  
...  

Background. The key gene sets involved in the progression of acute liver failure (ALF), which has a high mortality rate, remain unclear. This study aims to gain a deeper understanding of the transcriptional response of peripheral blood mononuclear cells (PBMCs) following ALF. Methods. ALF was induced by D-galactosamine (D-gal) in a porcine model. PBMCs were separated at time zero (baseline group), 36 h (failure group), and 60 h (dying group) after D-gal injection. Transcriptional profiling was performed using RNA sequencing and analysed using DAVID bioinformatics resources. Results. Compared with the baseline group, 816 and 1,845 differentially expressed genes (DEGs) were identified in the failure and dying groups, respectively. A total of five and two gene ontology (GO) term clusters were enriched in 107 GO terms in the failure group and 154 GO terms in the dying group. These GO clusters were primarily immune-related, including genes regulating the inflammasome complex and toll-like receptor signalling pathways. Specifically, GO terms related to cell death, including apoptosis, pyroptosis, and autophagy, and those related to fibrosis, coagulation dysfunction, and hepatic encephalopathy were enriched. Seven Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways (cytokine-cytokine receptor interaction, hematopoietic cell lineage, lysosome, rheumatoid arthritis, malaria, phagosome, and pertussis) were mapped for DEGs in the failure group. All seven of these KEGG pathways were among the 19 KEGG pathways mapped in the dying group. Conclusion. We found that the dramatic PBMC transcriptome changes triggered by ALF progression were predominantly related to immune responses. The enriched GO terms related to cell death, fibrosis, and so on, as indicated by PBMC transcriptome analysis, seem to be useful in elucidating potential key gene sets in the progression of ALF. A better understanding of these gene sets might be of preventive or therapeutic interest.

