Combining Shigella Tn-seq data with Gold-standard E. coli Gene Deletion Data Suggests Rare Transitions between Essential and Non-essential Gene Functionality

10.1101/038869 ◽

2016 ◽

Author(s):

Nikki E Freed ◽

Dirk Bumann ◽

Olin K Silander

Keyword(s):

Gold Standard ◽

Essential Genes ◽

Cellular Growth ◽

Data Set ◽

Protein Coding ◽

Empirical Measures ◽

Gene Essentiality ◽

E Coli ◽

Protein Coding Genes ◽

A Genome

Gene essentiality - whether or not a gene is necessary for cell growth - is a fundamental component of gene function. It is not well established how quickly gene essentiality can change, as few studies have compared empirical measures of essentiality between closely related organisms. Here we present the results of a Tn-seq experiment designed to detect essential protein coding genes in the bacterial pathogen Shigella flexneri 2a 2457T on a genome-wide scale. A superficial analysis of these data suggested that 451 protein-coding genes in this Shigella strain are critical for robust cellular growth on rich media. Comparison of this set of genes with a gold-standard data set of essential genes in the closely related Escherichia coli K12 BW25113 suggested that an excessive number of genes appeared essential in Shigella but non-essential in E. coli. Importantly, in the converse comparison, we found no genes that were essential in E. coli and non-essential in Shigella, suggesting that many genes were artefactually inferred as essential in Shigella. Controlling for such artefacts resulted in a much smaller set of discrepant genes. Among these, we identified three sets of functionally related genes, two of which have previously been implicated as critical for Shigella growth, but which are dispensable for E. coli growth. The data presented here highlight the small number of protein coding genes for which we have strong evidence that their essentiality status differs between the closely related bacterial taxa E. coli and Shigella. A set of genes involved in acetate utilization provides a canonical example. These results leave open the possibility of developing strain-specific antibiotic treatments targeting such differentially essential genes, but suggest that such opportunities may be rare in closely related bacteria.
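The two-way comparison described in this abstract (genes essential in one organism but not the other, checked in both directions) can be pictured as set arithmetic over shared genes. The following is a minimal sketch only, not the authors' pipeline: the function name `compare_essentiality` and the gene labels `g1` to `g5` are hypothetical placeholders, and no data from the study are used.

```python
def compare_essentiality(essential_a, essential_b, shared_genes):
    """Classify genes shared by two organisms by their essentiality calls.

    All arguments are iterables of gene identifiers. Only genes present in
    shared_genes are compared, so organism-specific genes are ignored.
    """
    shared = set(shared_genes)
    a = set(essential_a) & shared
    b = set(essential_b) & shared
    return {
        "essential_in_both": a & b,
        "essential_only_in_a": a - b,
        "essential_only_in_b": b - a,
        "non_essential_in_both": shared - (a | b),
    }


# Toy placeholder labels (hypothetical, not genes from the study).
shared = {"g1", "g2", "g3", "g4", "g5"}
calls_a = {"g1", "g2", "g3"}  # e.g. calls from a transposon screen
calls_b = {"g1", "g2"}        # e.g. calls from a deletion collection

result = compare_essentiality(calls_a, calls_b, shared)
print(sorted(result["essential_only_in_a"]))  # ['g3']
print(sorted(result["essential_only_in_b"]))  # []
```

A strongly lopsided result, with many genes in one "only" category and none in the other, is the kind of asymmetry the abstract treats as a sign of artefactual essentiality calls.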

Experimental and Computational Assessment of Conditionally Essential Genes in Escherichia coli

Journal of Bacteriology ◽

10.1128/jb.00740-06 ◽

2006 ◽

Vol 188 (23) ◽

pp. 8259-8271 ◽

Cited By ~ 174

Author(s):

Andrew R. Joyce ◽

Jennifer L. Reed ◽

Aprilfawn White ◽

Robert Edwards ◽

Andrei Osterman ◽

...

Keyword(s):

Escherichia Coli ◽

Minimal Medium ◽

Essential Genes ◽

Integrative Model ◽

Scale Model ◽

Data Sets ◽

Data Set ◽

E Coli ◽

A Genome ◽

Genome Scale

Genome-wide gene essentiality data sets are becoming available for Escherichia coli, but these data sets have yet to be analyzed in the context of a genome scale model. Here, we present an integrative model-driven analysis of the Keio E. coli mutant collection screened in this study on glycerol-supplemented minimal medium. Out of 3,888 single-deletion mutants tested, 119 mutants were unable to grow on glycerol minimal medium. These conditionally essential genes were then evaluated using a genome scale metabolic and transcriptional-regulatory model of E. coli, and it was found that the model made the correct prediction in ∼91% of the cases. The discrepancies between model predictions and experimental results were analyzed in detail to indicate where model improvements could be made or where the current literature lacks an explanation for the observed phenotypes. The identified set of essential genes and their model-based analysis indicates that our current understanding of the roles these essential genes play is relatively clear and complete. Furthermore, by analyzing the data set in terms of metabolic subsystems across multiple genomes, we can project which metabolic pathways are likely to play equally important roles in other organisms. Overall, this work establishes a paradigm that will drive model enhancement while simultaneously generating hypotheses that will ultimately lead to a better understanding of the organism.

The genome sequence of the European peacock butterfly, Aglais io (Linnaeus, 1758)

Wellcome Open Research ◽

10.12688/wellcomeopenres.17204.1 ◽

2021 ◽

Vol 6 ◽

pp. 258

Author(s):

Konrad Lohse ◽

Alexander Mackintosh ◽

Roger Vila ◽

...

Keyword(s):

Genome Sequence ◽

Genome Assembly ◽

Sex Chromosome ◽

Gene Annotation ◽

Protein Coding ◽

Individual Male ◽

Protein Coding Genes ◽

A Genome ◽

Inachis Io

We present a genome assembly from an individual male Aglais io (also known as Inachis io and Nymphalis io) (the European peacock; Arthropoda; Insecta; Lepidoptera; Nymphalidae). The genome sequence is 384 megabases in span. The majority (99.91%) of the assembly is scaffolded into 31 chromosomal pseudomolecules, with the Z sex chromosome assembled. Gene annotation of this assembly on Ensembl has identified 11,420 protein coding genes.

The Absence of Universally-Conserved Protein-coding Genes

10.1101/842633 ◽

2019 ◽

Author(s):

Change Laura Tan

Keyword(s):

Public Access ◽

Orphan Genes ◽

Protein Coding ◽

Great Opportunity ◽

Protein Coding Genes ◽

Phylogenetic Profiles ◽

Gene Count ◽

A Genome ◽

Wide Scale ◽

Specific Species

Public access to thousands of completely sequenced and annotated genomes provides a great opportunity to address the relationships of different organisms, at the molecular level and on a genome-wide scale. By comparing the phylogenetic profiles of all protein-coding genes in 317 model species described in the OrthoInspector3.0 database, we found that approximately 29.8% of the total protein-coding genes were orphan genes (genes unique to a specific species) while < 0.01% were universal genes (genes with homologs in each of the 317 species analyzed). When weighted by potential birth event, the orphan genes comprised 82% of the total, while the universal genes accounted for less than 0.00008%. Strikingly, as the number of analyzed genomes increased, the sum total of universal and nearly-universal genes plateaued while that of orphan and nearly-orphan genes grew continuously. When the set of compared species was expanded to include 3863 bacteria, 711 eukaryotes, and 179 archaea, not one of the universal genes remained. The results speak to a previously unappreciated degree of genetic biodiversity, which we propose to quantify using the birth-event-weighted gene count method.

Repertoire-wide gene structure analyses: a case study comparing automatically predicted and manually annotated gene models

BMC Genomics ◽

10.1186/s12864-019-6064-8 ◽

2019 ◽

Vol 20 (1) ◽

Cited By ~ 3

Author(s):

Jeanne Wilbrandt ◽

Bernhard Misof ◽

Kristen A. Panfilio ◽

Oliver Niehuis

Keyword(s):

Structural Properties ◽

Gene Structure ◽

Comparative Studies ◽

Gene Repertoire ◽

Manual Annotation ◽

Protein Coding ◽

Protein Coding Genes ◽

Gene Sets ◽

A Genome ◽

Gene Models

Background: The location and modular structure of eukaryotic protein-coding genes in genomic sequences can be automatically predicted by gene annotation algorithms. These predictions are often used for comparative studies on gene structure, gene repertoires, and genome evolution. However, automatic annotation algorithms do not yet correctly identify all genes within a genome, and manual annotation is often necessary to obtain accurate gene models and gene sets. As manual annotation is time-consuming, only a fraction of the gene models in a genome is typically manually annotated, and this fraction often differs between species. To assess the impact of manual annotation efforts on genome-wide analyses of gene structural properties, we compared the structural properties of protein-coding genes in seven diverse insect species sequenced by the i5k initiative.

Results: Our results show that the subset of genes chosen for manual annotation by a research community (3.5–7% of gene models) may have structural properties (e.g., lengths and exon counts) that are not necessarily representative for a species’ gene set as a whole. Nonetheless, the structural properties of automatically generated gene models are only altered marginally (if at all) through manual annotation. Major correlative trends, for example a negative correlation between genome size and exonic proportion, can be inferred from either the automatically predicted or manually annotated gene models alike. Vice versa, some previously reported trends did not appear in either the automatic or manually annotated gene sets, pointing towards insect-specific gene structural peculiarities.

Conclusions: In our analysis of gene structural properties, automatically predicted gene models proved to be sufficiently reliable to recover the same gene-repertoire-wide correlative trends that we found when focusing on manually annotated gene models only. We acknowledge that analyses on the individual gene level clearly benefit from manual curation. However, as genome sequencing and annotation projects often differ in the extent of their manual annotation and curation efforts, our results indicate that comparative studies analyzing gene structural properties in these genomes can nonetheless be justifiable and informative.

Genome-wide methylation and transcriptome of blood neutrophils reveal the roles of DNA methylation in affecting transcription of protein-coding genes and miRNAs in E. coli-infected mastitis cows

BMC Genomics ◽

10.1186/s12864-020-6526-z ◽

2020 ◽

Vol 21 (1) ◽

Cited By ~ 4

Author(s):

Zhihua Ju ◽

Qiang Jiang ◽

Jinpeng Wang ◽

Xiuge Wang ◽

Chunhong Yang ◽

...

Keyword(s):

Dna Methylation ◽

Protein Coding ◽

E Coli ◽

Protein Coding Genes ◽

Genome Wide ◽

Blood Neutrophils

DEG 10, an update of the database of essential genes that includes both protein-coding genes and noncoding genomic elements

Nucleic Acids Research ◽

10.1093/nar/gkt1131 ◽

2013 ◽

Vol 42 (D1) ◽

pp. D574-D580 ◽

Cited By ~ 296

Author(s):

Hao Luo ◽

Yan Lin ◽

Feng Gao ◽

Chun-Ting Zhang ◽

Ren Zhang

Keyword(s):

Essential Genes ◽

Protein Coding ◽

Protein Coding Genes

Draft Genome Sequence of a Novel Bacterium, Pseudomonas sp. Strain MR 02, Capable of Pyomelanin Production, Isolated from the Mahananda River at Siliguri, West Bengal, India

Genome Announcements ◽

10.1128/genomea.01443-17 ◽

2018 ◽

Vol 6 (3) ◽

pp. e01443-17 ◽

Cited By ~ 1

Author(s):

Vivek Kumar Ranjan ◽

Tilak Saha ◽

Shriparna Mukherjee ◽

Ranadhir Chakraborty

Keyword(s):

Genome Sequence ◽

West Bengal ◽

Draft Genome ◽

Homogentisic Acid ◽

Draft Genome Sequence ◽

Gene Length ◽

Protein Coding ◽

Protein Coding Genes ◽

Novel Bacterium ◽

A Genome

The draft genome sequence of a novel strain, Pseudomonas sp. MR 02, a pyomelanin-producing bacterium isolated from the Mahananda River at Siliguri, West Bengal, India, is reported here. This strain has a genome size of 5.94 Mb, with an overall G+C content of 62.6%. The draft genome reports 5,799 genes (mean gene length, 923 bp), among which 5,503 are protein-coding genes, including the genes required for the catabolism of tyrosine or phenylalanine for the characteristic production of homogentisic acid (HGA). Excess HGA, on excretion, auto-oxidizes and polymerizes to form pyomelanin.
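The figures quoted in this abstract allow a rough back-of-the-envelope estimate of how much of the genome is covered by genes. The short calculation below is a sketch under simplifying assumptions that are not stated in the abstract: genes are treated as non-overlapping, and the reported mean length is applied to all 5,799 genes. The variable names are illustrative.

```python
# Values quoted in the abstract.
genome_size_bp = 5_940_000      # 5.94 Mb
total_genes = 5_799
mean_gene_length_bp = 923
protein_coding_genes = 5_503

# Assumption: genes do not overlap, so total genic length is count x mean length.
genic_bp = total_genes * mean_gene_length_bp
genic_fraction = genic_bp / genome_size_bp
protein_coding_share = protein_coding_genes / total_genes

print(genic_bp)                       # 5352477
print(f"{genic_fraction:.1%}")        # 90.1%
print(f"{protein_coding_share:.1%}")  # 94.9%
```

Under these assumptions roughly 90% of the draft genome is genic, and about 95% of the annotated genes are protein coding.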

The genome sequence of the Glanville fritillary, Melitaea cinxia (Linnaeus, 1758)

Wellcome Open Research ◽

10.12688/wellcomeopenres.17283.1 ◽

2021 ◽

Vol 6 ◽

pp. 266

Author(s):

Roger Vila ◽

Alex Hayward ◽

Konrad Lohse ◽

Charlotte Wright ◽

...

Keyword(s):

Genome Sequence ◽

Genome Assembly ◽

Sex Chromosome ◽

Gene Annotation ◽

Melitaea Cinxia ◽

Protein Coding ◽

Individual Male ◽

Protein Coding Genes ◽

A Genome

We present a genome assembly from an individual male Melitaea cinxia (the Glanville fritillary; Arthropoda; Insecta; Lepidoptera; Nymphalidae). The genome sequence is 499 megabases in span. The complete assembly is scaffolded into 31 chromosomal pseudomolecules, with the Z sex chromosome assembled. Gene annotation of this assembly on Ensembl has identified 13,666 protein coding genes.

Draft Genome Assembly and Annotation of Red Raspberry Rubus Idaeus

10.1101/546135 ◽

2019 ◽

Cited By ~ 4

Author(s):

Haley Wight ◽

Junhui Zhou ◽

Muzi Li ◽

Sridhar Hannenhalli ◽

Stephen M. Mount ◽

...

Keyword(s):

De Novo ◽

Draft Genome ◽

Rubus Idaeus ◽

Slow Process ◽

Red Raspberry ◽

Protein Coding ◽

Draft Genome Assembly ◽

Protein Coding Genes ◽

A Genome ◽

Exceptional Value

The red raspberry, Rubus idaeus, is widely distributed in all temperate regions of Europe, Asia, and North America and is a major commercial fruit valued for its taste and its high antioxidant and vitamin content. However, Rubus breeding is a long and slow process hampered by limited genomic and molecular resources. Genomic resources such as a complete genome sequence and transcriptome will be of exceptional value to improve research and breeding of this high value crop. Using a hybrid sequence assembly approach including data from both long and short sequence reads, we present the first assembly of the Rubus idaeus genome (Joan J. variety). The de novo assembled genome consists of 2,145 scaffolds with a genome completeness of 95.3% and an N50 score of 638 kb. Leveraging a linkage map, we anchored 80.1% of the genome onto seven chromosomes. Using over 1 billion paired-end RNAseq reads, we annotated 35,566 protein coding genes with a transcriptome completeness score of 97.2%. The Rubus idaeus genome provides an important new resource for researchers and breeders.
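The N50 score reported in this abstract is a standard assembly contiguity statistic: the length of the scaffold at which the cumulative length, counted from the longest scaffold downward, first reaches half of the total assembly length. The sketch below illustrates that common definition with made-up scaffold lengths. It is not the authors' tooling, and the function name `n50` is only illustrative.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Scaffolds are visited from longest to shortest, and the length of the
    scaffold at which the running total first reaches half of the summed
    length is returned.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must contain at least one scaffold")


# Made-up scaffold lengths, not values from the raspberry assembly.
toy_scaffolds = [100, 80, 60, 40, 20]
print(n50(toy_scaffolds))  # 80
```

In the toy case the total is 300, so the halfway point of 150 is reached when the second-longest scaffold is added (100 + 80 = 180), giving an N50 of 80.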

Complete Genome Sequence of Escherichia coli Siphophage Snoke

Microbiology Resource Announcements ◽

10.1128/mra.01051-19 ◽

2019 ◽

Vol 8 (40) ◽

Cited By ~ 1

Author(s):

James E. Corban ◽

Jacob Gramer ◽

Russell Moreland ◽

Mei Liu ◽

Jolene Ramsey

Keyword(s):

Escherichia Coli ◽

Genome Sequence ◽

Complete Genome Sequence ◽

Complete Genome ◽

Negative Bacterium ◽

Protein Coding ◽

Gram Negative ◽

Content Type ◽

E Coli ◽

Protein Coding Genes

Escherichia coli is a Gram-negative bacterium often found in animal intestinal tracts. Here, we present the genome of the Guernseyvirinae-like E. coli 4s siphophage Snoke. The 44.4-kb genome contains 81 protein-coding genes, for which 33 functions were predicted. The capsid morphogenesis gene in Snoke contains a large intein.