scholarly journals A versatile computational pipeline for bacterial genome annotation improvement and comparative analysis, with Brucella as a use case

2007 ◽  
Vol 35 (12) ◽  
pp. 3953-3962 ◽  
Author(s):  
G.X. Yu ◽  
E.E. Snyder ◽  
S.M. Boyle ◽  
O.R. Crasta ◽  
M. Czar ◽  
...  
PLoS ONE ◽  
2012 ◽  
Vol 7 (11) ◽  
pp. e49239 ◽  
Author(s):  
Pablo Pareja-Tobes ◽  
Marina Manrique ◽  
Eduardo Pareja-Tobes ◽  
Eduardo Pareja ◽  
Raquel Tobes

2005 ◽  
Vol 33 (Web Server) ◽  
pp. W455-W459 ◽  
Author(s):  
G. H. Van Domselaar ◽  
P. Stothard ◽  
S. Shrivastava ◽  
J. A. Cruz ◽  
A. Guo ◽  
...  

2018 ◽  
Author(s):  
Jelle Slager ◽  
Rieza Aprianto ◽  
Jan-Willem Veening

ABSTRACTA precise understanding of the genomic organization into transcriptional units and their regulation is essential for our comprehension of opportunistic human pathogens and how they cause disease. Using single-molecule real-time (PacBio) sequencing we unambiguously determined the genome sequence ofStreptococcus pneumoniaestrain D39 and revealed several inversions previously undetected by short-read sequencing. Significantly, a chromosomal inversion results in antigenic variation of PhtD, an important surface-exposed virulence factor. We generated a new genome annotation using automated tools, followed by manual curation, reflecting the current knowledge in the field. By combining sequence-driven terminator prediction, deep paired-end transcriptome sequencing and enrichment of primary transcripts by Cappable-Seq, we mapped 1,015 transcriptional start sites and 748 termination sites. Using this new genomic map, we identified several new small RNAs (sRNAs), riboswitches (including twelve previously misidentified as sRNAs), and antisense RNAs. In total, we annotated 92 new protein-encoding genes, 39 sRNAs and 165 pseudogenes, bringing theS. pneumoniaeD39 repertoire to 2,151 genetic elements. We report operon structures and observed that 9% of operons lack a 5’-UTR. The genome data is accessible in an online resource called PneumoBrowse (https://veeninglab.com/pneumobrowse) providing one of the most complete inventories of a bacterial genome to date. PneumoBrowse will accelerate pneumococcal research and the development of new prevention and treatment strategies.


2022 ◽  
Author(s):  
Caroline M. Weisman ◽  
Andrew M. Murray ◽  
Sean R Eddy

Comparisons of genomes of different species are used to identify lineage-specific genes, those genes that appear unique to one species or clade. Lineage-specific genes are often thought to represent genetic novelty that underlies unique adaptations. Identification of these genes depends not only on genome sequences, but also on inferred gene annotations. Comparative analyses typically use available genomes that have been annotated using different methods, increasing the risk that orthologous DNA sequences may be erroneously annotated as a gene in one species but not another, appearing lineage-specific as a result. To evaluate the impact of such 'annotation heterogeneity', we identified four clades of species with sequenced genomes with more than one publicly available gene annotation, allowing us to compare the number of lineage-specific genes inferred when differing annotation methods are used to those resulting when annotation method is uniform across the clade. In these case studies, annotation heterogeneity increases the apparent number of lineage-specific genes by up to 15-fold, suggesting that annotation heterogeneity is a substantial source of potential artifact.


PeerJ ◽  
2016 ◽  
Vol 4 ◽  
pp. e2056 ◽  
Author(s):  
Yevgeny Nikolaichik ◽  
Aliaksandr U. Damienikan

The majority of bacterial genome annotations are currently automated and based on a ‘gene by gene’ approach. Regulatory signals and operon structures are rarely taken into account which often results in incomplete and even incorrect gene function assignments. Here we present SigmoID, a cross-platform (OS X, Linux and Windows) open-source application aiming at simplifying the identification of transcription regulatory sites (promoters, transcription factor binding sites and terminators) in bacterial genomes and providing assistance in correcting annotations in accordance with regulatory information. SigmoID combines a user-friendly graphical interface to well known command line tools with a genome browser for visualising regulatory elements in genomic context. Integrated access to online databases with regulatory information (RegPrecise and RegulonDB) and web-based search engines speeds up genome analysis and simplifies correction of genome annotation. We demonstrate some features of SigmoID by constructing a series of regulatory protein binding site profiles for two groups of bacteria: Soft RotEnterobacteriaceae(PectobacteriumandDickeyaspp.) andPseudomonasspp. Furthermore, we inferred over 900 transcription factor binding sites and alternative sigma factor promoters in the annotated genome ofPectobacterium atrosepticum. These regulatory signals control putative transcription units covering about 40% of theP. atrosepticumchromosome. Reviewing the annotation in cases where it didn’t fit with regulatory information allowed us to correct product and gene names for over 300 loci.


Sign in / Sign up

Export Citation Format

Share Document