Combining Shigella Tn-seq data with Gold-standard E. coli Gene Deletion Data Suggests Rare Transitions between Essential and Non-essential Gene Functionality

2016 ◽  
Author(s):  
Nikki E Freed ◽  
Dirk Bumann ◽  
Olin K Silander

Gene essentiality, that is, whether or not a gene is necessary for cell growth, is a fundamental component of gene function. It is not well established how quickly gene essentiality can change, as few studies have compared empirical measures of essentiality between closely related organisms. Here we present the results of a Tn-seq experiment designed to detect essential protein-coding genes in the bacterial pathogen Shigella flexneri 2a 2457T on a genome-wide scale. Superficial analysis of these data suggested that 451 protein-coding genes in this Shigella strain are critical for robust cellular growth on rich media. Comparison of this set of genes with a gold-standard data set of essential genes in the closely related Escherichia coli K12 BW25113 suggested that an excessive number of genes appeared essential in Shigella but non-essential in E. coli. Importantly, in the converse comparison, we found no genes that were essential in E. coli and non-essential in Shigella, suggesting that many genes were artefactually inferred as essential in Shigella. Controlling for such artefacts resulted in a much smaller set of discrepant genes. Among these, we identified three sets of functionally related genes, two of which have previously been implicated as critical for Shigella growth, but which are dispensable for E. coli growth. The data presented here highlight the small number of protein-coding genes for which we have strong evidence that their essentiality status differs between the closely related bacterial taxa E. coli and Shigella. A set of genes involved in acetate utilization provides a canonical example. These results leave open the possibility of developing strain-specific antibiotic treatments targeting such differentially essential genes, but suggest that such opportunities may be rare in closely related bacteria.
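
The two-way comparison described in this abstract can be read as a pair of set differences over the genes shared by both organisms. The following is only a minimal sketch of that idea, not the authors' pipeline: the function name `discrepant_genes` and the gene labels are hypothetical placeholders, and the sets are illustrative rather than data from the study.

```python
def discrepant_genes(essential_a, essential_b, shared):
    """Return (essential only in A, essential only in B), restricted to shared genes."""
    a = set(essential_a) & set(shared)
    b = set(essential_b) & set(shared)
    return a - b, b - a


# Placeholder gene labels for illustration only (not data from the study).
shared = {"g1", "g2", "g3", "g4", "g5"}
shigella_calls = {"g1", "g2", "g3"}   # hypothetical Tn-seq essentiality calls
ecoli_calls = {"g1", "g2"}            # hypothetical deletion-based calls

only_shigella, only_ecoli = discrepant_genes(shigella_calls, ecoli_calls, shared)
print(sorted(only_shigella))  # ['g3']
print(sorted(only_ecoli))     # []
```

An asymmetric result of this kind (genes discrepant in one direction only) is the pattern the abstract treats as a sign of artefactual essentiality calls.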

2006 ◽  
Vol 188 (23) ◽  
pp. 8259-8271 ◽  
Author(s):  
Andrew R. Joyce ◽  
Jennifer L. Reed ◽  
Aprilfawn White ◽  
Robert Edwards ◽  
Andrei Osterman ◽  
...  

ABSTRACT Genome-wide gene essentiality data sets are becoming available for Escherichia coli, but these data sets have yet to be analyzed in the context of a genome scale model. Here, we present an integrative model-driven analysis of the Keio E. coli mutant collection screened in this study on glycerol-supplemented minimal medium. Out of 3,888 single-deletion mutants tested, 119 mutants were unable to grow on glycerol minimal medium. These conditionally essential genes were then evaluated using a genome scale metabolic and transcriptional-regulatory model of E. coli, and it was found that the model made the correct prediction in ∼91% of the cases. The discrepancies between model predictions and experimental results were analyzed in detail to indicate where model improvements could be made or where the current literature lacks an explanation for the observed phenotypes. The identified set of essential genes and their model-based analysis indicates that our current understanding of the roles these essential genes play is relatively clear and complete. Furthermore, by analyzing the data set in terms of metabolic subsystems across multiple genomes, we can project which metabolic pathways are likely to play equally important roles in other organisms. Overall, this work establishes a paradigm that will drive model enhancement while simultaneously generating hypotheses that will ultimately lead to a better understanding of the organism.


2021 ◽  
Vol 6 ◽  
pp. 258
Author(s):  
Konrad Lohse ◽  
Alexander Mackintosh ◽  
Roger Vila ◽  
...  

We present a genome assembly from an individual male Aglais io (also known as Inachis io and Nymphalis io) (the European peacock; Arthropoda; Insecta; Lepidoptera; Nymphalidae). The genome sequence is 384 megabases in span. The majority (99.91%) of the assembly is scaffolded into 31 chromosomal pseudomolecules, with the Z sex chromosome assembled. Gene annotation of this assembly on Ensembl has identified 11,420 protein coding genes.


2019 ◽  
Author(s):  
Change Laura Tan

Abstract Public access to thousands of completely sequenced and annotated genomes provides a great opportunity to address the relationships of different organisms, at the molecular level and on a genome-wide scale. By comparing the phylogenetic profiles of all protein-coding genes in 317 model species described in the OrthoInspector3.0 database, we found that approximately 29.8% of the total protein-coding genes were orphan genes (genes unique to a specific species) while < 0.01% were universal genes (genes with homologs in each of the 317 species analyzed). When weighted by potential birth event, the orphan genes comprised 82% of the total, while the universal genes accounted for less than 0.00008%. Strikingly, as the number of analyzed genomes increased, the sum total of universal and nearly-universal genes plateaued while that of orphan and nearly-orphan genes grew continuously. When the comparison was extended to include 3863 bacteria, 711 eukaryotes, and 179 archaea, not one of the universal genes remained. The results speak to a previously unappreciated degree of genetic biodiversity, which we propose to quantify using the birth-event-weighted gene count method.


BMC Genomics ◽  
2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Jeanne Wilbrandt ◽  
Bernhard Misof ◽  
Kristen A. Panfilio ◽  
Oliver Niehuis

Abstract Background: The location and modular structure of eukaryotic protein-coding genes in genomic sequences can be automatically predicted by gene annotation algorithms. These predictions are often used for comparative studies on gene structure, gene repertoires, and genome evolution. However, automatic annotation algorithms do not yet correctly identify all genes within a genome, and manual annotation is often necessary to obtain accurate gene models and gene sets. As manual annotation is time-consuming, only a fraction of the gene models in a genome is typically manually annotated, and this fraction often differs between species. To assess the impact of manual annotation efforts on genome-wide analyses of gene structural properties, we compared the structural properties of protein-coding genes in seven diverse insect species sequenced by the i5k initiative. Results: Our results show that the subset of genes chosen for manual annotation by a research community (3.5–7% of gene models) may have structural properties (e.g., lengths and exon counts) that are not necessarily representative for a species’ gene set as a whole. Nonetheless, the structural properties of automatically generated gene models are only altered marginally (if at all) through manual annotation. Major correlative trends, for example a negative correlation between genome size and exonic proportion, can be inferred from either the automatically predicted or manually annotated gene models alike. Vice versa, some previously reported trends did not appear in either the automatic or manually annotated gene sets, pointing towards insect-specific gene structural peculiarities. Conclusions: In our analysis of gene structural properties, automatically predicted gene models proved to be sufficiently reliable to recover the same gene-repertoire-wide correlative trends that we found when focusing on manually annotated gene models only. We acknowledge that analyses on the individual gene level clearly benefit from manual curation. However, as genome sequencing and annotation projects often differ in the extent of their manual annotation and curation efforts, our results indicate that comparative studies analyzing gene structural properties in these genomes can nonetheless be justifiable and informative.


2013 ◽  
Vol 42 (D1) ◽  
pp. D574-D580 ◽  
Author(s):  
Hao Luo ◽  
Yan Lin ◽  
Feng Gao ◽  
Chun-Ting Zhang ◽  
Ren Zhang

2018 ◽  
Vol 6 (3) ◽  
pp. e01443-17 ◽  
Author(s):  
Vivek Kumar Ranjan ◽  
Tilak Saha ◽  
Shriparna Mukherjee ◽  
Ranadhir Chakraborty

ABSTRACT The draft genome sequence of a novel strain, Pseudomonas sp. MR 02, a pyomelanin-producing bacterium isolated from the Mahananda River at Siliguri, West Bengal, India, is reported here. This strain has a genome size of 5.94 Mb, with an overall G+C content of 62.6%. The draft genome reports 5,799 genes (mean gene length, 923 bp), among which 5,503 are protein-coding genes, including the genes required for the catabolism of tyrosine or phenylalanine for the characteristic production of homogentisic acid (HGA). Excess HGA, on excretion, auto-oxidizes and polymerizes to form pyomelanin.


2021 ◽  
Vol 6 ◽  
pp. 266
Author(s):  
Roger Vila ◽  
Alex Hayward ◽  
Konrad Lohse ◽  
Charlotte Wright ◽  
...  

We present a genome assembly from an individual male Melitaea cinxia (the Glanville fritillary; Arthropoda; Insecta; Lepidoptera; Nymphalidae). The genome sequence is 499 megabases in span. The complete assembly is scaffolded into 31 chromosomal pseudomolecules, with the Z sex chromosome assembled. Gene annotation of this assembly on Ensembl has identified 13,666 protein coding genes.


2019 ◽  
Author(s):  
Haley Wight ◽  
Junhui Zhou ◽  
Muzi Li ◽  
Sridhar Hannenhalli ◽  
Stephen M. Mount ◽  
...  

Abstract The red raspberry, Rubus idaeus, is widely distributed in all temperate regions of Europe, Asia, and North America and is a major commercial fruit valued for its taste and its high antioxidant and vitamin content. However, Rubus breeding is a long and slow process hampered by limited genomic and molecular resources. Genomic resources such as a complete genome sequence and transcriptome will be of exceptional value to improve research and breeding of this high-value crop. Using a hybrid sequence assembly approach including data from both long and short sequence reads, we present the first assembly of the Rubus idaeus genome (Joan J. variety). The de novo assembled genome consists of 2,145 scaffolds with a genome completeness of 95.3% and an N50 score of 638 KB. Leveraging a linkage map, we anchored 80.1% of the genome onto seven chromosomes. Using over 1 billion paired-end RNAseq reads, we annotated 35,566 protein-coding genes with a transcriptome completeness score of 97.2%. The Rubus idaeus genome provides an important new resource for researchers and breeders.
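
For readers unfamiliar with the N50 statistic quoted in this abstract, it is conventionally the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. The sketch below shows that standard definition only; the function name `n50` and the scaffold lengths are hypothetical and are not taken from the Rubus idaeus assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    lengths = sorted(lengths, reverse=True)  # longest first
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Stop at the first scaffold where the cumulative span reaches half the total.
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical scaffold lengths, for illustration only.
example_lengths = [100, 80, 50, 30, 20, 20]
print(n50(example_lengths))  # 80
```

In the example the total span is 300, and the two longest scaffolds (100 + 80) are the first to reach at least 150, so the N50 is 80.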


2019 ◽  
Vol 8 (40) ◽  
Author(s):  
James E. Corban ◽  
Jacob Gramer ◽  
Russell Moreland ◽  
Mei Liu ◽  
Jolene Ramsey

Escherichia coli is a Gram-negative bacterium often found in animal intestinal tracts. Here, we present the genome of the Guernseyvirinae-like E. coli 4s siphophage Snoke. The 44.4-kb genome contains 81 protein-coding genes, for which 33 functions were predicted. The capsid morphogenesis gene in Snoke contains a large intein.

