Evolution of Eukaryotic Gene Repertoire and Gene Structure: Discovering the Unexpected Dynamics of Genome Evolution

2003 ◽  
Vol 68 (0) ◽  
pp. 293-302 ◽  
Author(s):  
I.B. ROGOZIN ◽  
V.N. BABENKO ◽  
N.D. FEDOROVA ◽  
J. D. JACKSON ◽  
A.R. JACOBS ◽  
...  
BMC Genomics ◽  
2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Jeanne Wilbrandt ◽  
Bernhard Misof ◽  
Kristen A. Panfilio ◽  
Oliver Niehuis

Abstract
Background: The location and modular structure of eukaryotic protein-coding genes in genomic sequences can be automatically predicted by gene annotation algorithms. These predictions are often used for comparative studies on gene structure, gene repertoires, and genome evolution. However, automatic annotation algorithms do not yet correctly identify all genes within a genome, and manual annotation is often necessary to obtain accurate gene models and gene sets. As manual annotation is time-consuming, only a fraction of the gene models in a genome is typically manually annotated, and this fraction often differs between species. To assess the impact of manual annotation efforts on genome-wide analyses of gene structural properties, we compared the structural properties of protein-coding genes in seven diverse insect species sequenced by the i5k initiative.
Results: Our results show that the subset of genes chosen for manual annotation by a research community (3.5–7% of gene models) may have structural properties (e.g., lengths and exon counts) that are not necessarily representative of a species’ gene set as a whole. Nonetheless, the structural properties of automatically generated gene models are only altered marginally (if at all) through manual annotation. Major correlative trends, for example a negative correlation between genome size and exonic proportion, can be inferred from the automatically predicted and the manually annotated gene models alike. Conversely, some previously reported trends did not appear in either the automatically predicted or the manually annotated gene sets, pointing towards insect-specific gene structural peculiarities.
Conclusions: In our analysis of gene structural properties, automatically predicted gene models proved to be sufficiently reliable to recover the same gene-repertoire-wide correlative trends that we found when focusing on manually annotated gene models only. We acknowledge that analyses at the individual gene level clearly benefit from manual curation. However, as genome sequencing and annotation projects often differ in the extent of their manual annotation and curation efforts, our results indicate that comparative studies analyzing gene structural properties in these genomes can nonetheless be justifiable and informative.
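
The structural properties named in this abstract (gene length, exon count, exonic proportion) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the 1-based inclusive coordinate convention, and the assumption that exons do not overlap within or between gene models are all illustrative choices made here.

```python
def gene_structure_summary(exons):
    """Summarize one gene model.

    exons: list of (start, end) tuples, 1-based inclusive coordinates,
    assumed not to overlap (hypothetical input format).
    """
    exons = sorted(exons)
    exonic_length = sum(end - start + 1 for start, end in exons)
    # Gene length spans from the first exon start to the last exon end.
    gene_length = exons[-1][1] - exons[0][0] + 1
    return {
        "exon_count": len(exons),
        "gene_length": gene_length,
        "exonic_length": exonic_length,
        "intron_length": gene_length - exonic_length,
    }


def exonic_proportion(gene_models, genome_size):
    """Fraction of the genome covered by exons.

    gene_models: list of exon lists, one per gene; exons of different
    genes are assumed not to overlap, so lengths can simply be summed.
    """
    total_exonic = sum(
        gene_structure_summary(exons)["exonic_length"] for exons in gene_models
    )
    return total_exonic / genome_size
```

Computing these values once from the automatically predicted models and once from the manually annotated subset would allow the kind of comparison the abstract describes.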


2016 ◽  
Vol 26 (7) ◽  
pp. 918-932 ◽  
Author(s):  
Nikolaos Vakirlis ◽  
Véronique Sarilar ◽  
Guénola Drillon ◽  
Aubin Fleiss ◽  
Nicolas Agier ◽  
...  

2008 ◽  
Vol 9 (1) ◽  
pp. R7 ◽  
Author(s):  
Brian J Haas ◽  
Steven L Salzberg ◽  
Wei Zhu ◽  
Mihaela Pertea ◽  
Jonathan E Allen ◽  
...  

2021 ◽  
Author(s):  
Lotte J U Pronk ◽  
Marnix H Medema

Metagenomics has become a prominent technology to study the functional potential of all organisms in a microbial community. Most studies focus on the bacterial content of these communities, while ignoring eukaryotic microbes. Indeed, many metagenomics analysis pipelines silently assume that all contigs in a metagenome are prokaryotic. However, because of marked differences in gene structure, prokaryotic gene prediction tools fail to accurately predict eukaryotic genes. Here, we developed a classifier that distinguishes eukaryotic from prokaryotic contigs based on foundational differences between these taxa in gene structure. We first developed a random forest classifier that uses intergenic distance, gene density and gene length as the most important features. We show that, with an estimated accuracy of 97%, this classifier with principled features grounded in biology can perform almost as well as the classifiers EukRep and Tiara, which use k-mer frequencies as features. By re-training our classifier with Tiara predictions as an additional feature, weaknesses of both types of classifiers are compensated; the result is an enhanced classifier that outperforms all individual classifiers, with an F1-score of 1.00 on precision, recall and accuracy for both eukaryotes and prokaryotes, while still being fast. In a reanalysis of metagenome data from a disease-suppressive plant endosphere microbial community, we show how using Whokaryote to select contigs for eukaryotic gene prediction facilitates the discovery of several biosynthetic gene clusters that were missed in the original study. Our enhanced classifier, which we call 'Whokaryote', is wrapped in an easily installable package and is freely available from https://git.wageningenur.nl/lotte.pronk/whokaryote.
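
The three gene-structure features named in this abstract (intergenic distance, gene density, gene length) could be derived per contig roughly as follows. This is a hedged sketch, not Whokaryote's actual code: the function name, the input format (1-based inclusive gene coordinates from any gene predictor), and the choice of genes per kilobase as the density unit are assumptions made for illustration.

```python
def contig_features(genes, contig_length):
    """Compute simple gene-structure features for one contig.

    genes: list of (start, end) tuples for predicted genes,
    1-based inclusive coordinates (hypothetical input format).
    contig_length: contig length in base pairs.
    """
    if not genes:
        return {
            "gene_density": 0.0,
            "mean_gene_length": 0.0,
            "mean_intergenic_distance": 0.0,
        }
    genes = sorted(genes)
    lengths = [end - start + 1 for start, end in genes]
    # Distance between the end of one gene and the start of the next;
    # overlapping genes are counted as a distance of zero.
    gaps = [
        max(0, next_start - prev_end - 1)
        for (_, prev_end), (next_start, _) in zip(genes, genes[1:])
    ]
    return {
        "gene_density": len(genes) * 1000 / contig_length,  # genes per kb
        "mean_gene_length": sum(lengths) / len(lengths),
        "mean_intergenic_distance": sum(gaps) / len(gaps) if gaps else 0.0,
    }
```

A feature dictionary like this, computed for each contig, is the kind of input a random forest classifier could be trained on to separate eukaryotic from prokaryotic contigs.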


Author(s):  
Shira Milo ◽  
Reut Harari Misgav ◽  
Einat Hazkani-Covo ◽  
Shay Covo

Abstract Ascomycota is the largest phylogenetic group of fungi and includes species important to human health and wellbeing. DNA repair is important for fungal survival and genome evolution. Here, we describe a detailed comparative genomic analysis of DNA repair genes in Ascomycota. We determined the DNA repair gene repertoire in Taphrinomycotina, Saccharomycotina, Leotiomycetes, Sordariomycetes, Dothideomycetes, and Eurotiomycetes. The subphyla of yeasts, Saccharomycotina and Taphrinomycotina, have a smaller DNA repair gene repertoire compared to Pezizomycotina. Some genes were absent from most, if not all, yeast species. To study the conservation of these genes in Pezizomycotina, we used the GLOOME algorithm, which provides the expectations of gain or loss of genes given the tree topology. Genes that were absent from most of the species of Taphrinomycotina or Saccharomycotina showed lower conservation in Pezizomycotina. This suggests that the absence of some DNA repair genes in yeasts is not random; genes with a tendency to be lost in other classes are missing. We ranked the conservation of DNA repair genes in Ascomycota. We found that Rad51 and its paralogs were less conserved than other recombinational proteins, suggesting that there is redundancy between Rad51 and its paralogs, at least in some species. Finally, based on the repertoire of UV repair genes, we found conditions that differentially kill the wine pathogen Brettanomyces bruxellensis and not Saccharomyces cerevisiae. In summary, our analysis provides testable hypotheses regarding the role of DNA repair proteins in the genome evolution of Ascomycota.

