BASys: a web server for automated bacterial genome annotation

Abstract Insect Olfactory Receptors (ORs) are a diverse family of membrane protein receptors responsible for most insect olfactory perception and communication, and are therefore of utmost importance for developing repellents or pesticides. Accurate gene prediction of insect ORs from newly sequenced genomes is thus an important but challenging task. We have developed a dedicated web server, ‘insectOR’, to predict and validate insect OR genes using multiple gene prediction algorithms, accompanied by relevant validations. The server can be run nearly automatically to perform rapid prediction of OR gene loci from thousands of OR-protein-to-genome alignments, resolve gene boundaries for tandem OR genes and refine them further to provide more complete OR gene models. InsectOR outperformed the popular genome annotation pipelines (MAKER and NCBI eukaryotic genome annotation) in terms of overall sensitivity at the base, exon and locus levels when tested on two distantly related insect genomes. It displayed more than 95% nucleotide-level precision in both tests. Finally, given the same input data and parameters, InsectOR missed fewer than 2% of gene loci, in contrast to the 55% of loci missed by MAKER for Drosophila melanogaster. The web server is freely available at http://caps.ncbs.res.in/insectOR/. All major browsers are supported. The website is implemented in Python with Jinja2 for templating and the Bootstrap framework, which uses HTML, CSS and JavaScript/Ajax. The core pipeline is written in Perl.

Bacterial genome annotation tools. An annotated selection of World Wide Web sites relevant to the topics in Environmental Microbiology

Environmental Microbiology ◽

10.1111/j.1462-2920.2005.00794.x ◽

2005 ◽

Vol 7 (3) ◽

pp. 450-451

Author(s):

Lawrence P. Wackett

Keyword(s):

World Wide Web ◽

Web Sites ◽

Genome Annotation ◽

World Wide ◽

Bacterial Genome ◽

Environmental Microbiology ◽

Selection Of

BG7: A New Approach for Bacterial Genome Annotation Designed for Next Generation Sequencing Data

PLoS ONE ◽

10.1371/journal.pone.0049239 ◽

2012 ◽

Vol 7 (11) ◽

pp. e49239 ◽

Cited By ~ 38

Author(s):

Pablo Pareja-Tobes ◽

Marina Manrique ◽

Eduardo Pareja-Tobes ◽

Eduardo Pareja ◽

Raquel Tobes

Keyword(s):

Next Generation Sequencing ◽

Genome Annotation ◽

Bacterial Genome ◽

Next Generation Sequencing Data ◽

Next Generation ◽

Sequencing Data ◽

New Approach ◽

Generation Sequencing

Deep genome annotation of the opportunistic human pathogen Streptococcus pneumoniae D39

10.1101/283663 ◽

2018 ◽

Cited By ~ 1

Author(s):

Jelle Slager ◽

Rieza Aprianto ◽

Jan-Willem Veening

Keyword(s):

Single Molecule ◽

Genome Annotation ◽

Antigenic Variation ◽

Current Knowledge ◽

Bacterial Genome ◽

Treatment Strategies ◽

Human Pathogens ◽

Genome Data ◽

Manual Curation ◽

Automated Tools

ABSTRACT A precise understanding of the genomic organization into transcriptional units and their regulation is essential for our comprehension of opportunistic human pathogens and how they cause disease. Using single-molecule real-time (PacBio) sequencing we unambiguously determined the genome sequence of Streptococcus pneumoniae strain D39 and revealed several inversions previously undetected by short-read sequencing. Significantly, a chromosomal inversion results in antigenic variation of PhtD, an important surface-exposed virulence factor. We generated a new genome annotation using automated tools, followed by manual curation, reflecting the current knowledge in the field. By combining sequence-driven terminator prediction, deep paired-end transcriptome sequencing and enrichment of primary transcripts by Cappable-Seq, we mapped 1,015 transcriptional start sites and 748 termination sites. Using this new genomic map, we identified several new small RNAs (sRNAs), riboswitches (including twelve previously misidentified as sRNAs), and antisense RNAs. In total, we annotated 92 new protein-encoding genes, 39 sRNAs and 165 pseudogenes, bringing the S. pneumoniae D39 repertoire to 2,151 genetic elements. We report operon structures and observed that 9% of operons lack a 5’-UTR. The genome data is accessible in an online resource called PneumoBrowse (https://veeninglab.com/pneumobrowse) providing one of the most complete inventories of a bacterial genome to date. PneumoBrowse will accelerate pneumococcal research and the development of new prevention and treatment strategies.

SigmoID: a user-friendly tool for improving bacterial genome annotation through analysis of transcription control signals

PeerJ ◽

10.7717/peerj.2056 ◽

2016 ◽

Vol 4 ◽

pp. e2056 ◽

Cited By ~ 4

Author(s):

Yevgeny Nikolaichik ◽

Aliaksandr U. Damienikan

Keyword(s):

Transcription Factor ◽

Binding Sites ◽

Genome Annotation ◽

Bacterial Genome ◽

Transcription Factor Binding Sites ◽

Transcription Factor Binding ◽

Factor Binding ◽

A Genome ◽

Regulatory Information ◽

User Friendly

The majority of bacterial genome annotations are currently automated and based on a ‘gene by gene’ approach. Regulatory signals and operon structures are rarely taken into account, which often results in incomplete and even incorrect gene function assignments. Here we present SigmoID, a cross-platform (OS X, Linux and Windows) open-source application aiming at simplifying the identification of transcription regulatory sites (promoters, transcription factor binding sites and terminators) in bacterial genomes and providing assistance in correcting annotations in accordance with regulatory information. SigmoID combines a user-friendly graphical interface to well-known command line tools with a genome browser for visualising regulatory elements in genomic context. Integrated access to online databases with regulatory information (RegPrecise and RegulonDB) and web-based search engines speeds up genome analysis and simplifies correction of genome annotation. We demonstrate some features of SigmoID by constructing a series of regulatory protein binding site profiles for two groups of bacteria: Soft Rot Enterobacteriaceae (Pectobacterium and Dickeya spp.) and Pseudomonas spp. Furthermore, we inferred over 900 transcription factor binding sites and alternative sigma factor promoters in the annotated genome of Pectobacterium atrosepticum. These regulatory signals control putative transcription units covering about 40% of the P. atrosepticum chromosome. Reviewing the annotation in cases where it didn’t fit with regulatory information allowed us to correct product and gene names for over 300 loci.

Lost and Found: Re-searching and Re-scoring Proteomics Data Aids Genome Annotation and Improves Proteome Coverage

mSystems ◽

10.1128/msystems.00833-20 ◽

2020 ◽

Vol 5 (5) ◽

Author(s):

Patrick Willems ◽

Igor Fijalkowski ◽

Petra Van Damme

Keyword(s):

Genome Annotation ◽

Deinococcus Radiodurans ◽

Gene Annotation ◽

Bacterial Genome ◽

Prokaryotic Genome ◽

Ribosome Profiling ◽

Great Promise ◽

Data Sets ◽

Proteome Coverage ◽

Content Type

ABSTRACT Prokaryotic genome annotation is heavily dependent on automated gene annotation pipelines that are prone to propagate errors and underestimate genome complexity. We describe an optimized proteogenomic workflow that uses ribosome profiling (ribo-seq) and proteomic data for Salmonella enterica serovar Typhimurium to identify unannotated proteins or alternative protein forms. This data analysis encompasses the searching of cofragmenting peptides and postprocessing with extended peptide-to-spectrum quality features, including comparison to predicted fragment ion intensities. When this strategy is applied, an enhanced proteome depth is achieved, as well as greater confidence for unannotated peptide hits. We demonstrate the general applicability of our pipeline by reanalyzing public Deinococcus radiodurans data sets. Taken together, our results show that systematic reanalysis using available prokaryotic (proteome) data sets holds great promise to assist in experimentally based genome annotation. IMPORTANCE Delineation of open reading frames (ORFs) causes persistent inconsistencies in prokaryote genome annotation. We demonstrate that by advanced (re)analysis of omics data, a higher proteome coverage and sensitive detection of unannotated ORFs can be achieved, which can be exploited for conditional bacterial genome (re)annotation, which is especially relevant in view of annotating the wealth of sequenced prokaryotic genomes obtained in recent years.

Peer Review #1 of "SigmoID: a user-friendly tool for improving bacterial genome annotation through analysis of transcription control signals (v0.1)"

10.7287/peerj.2056v0.1/reviews/1 ◽

2016 ◽

Author(s):

KU Förstner

Keyword(s):

Peer Review ◽

Genome Annotation ◽

Bacterial Genome ◽

Transcription Control ◽

Control Signals ◽

User Friendly

Gene Calling and Bacterial Genome Annotation with BG7

Methods in Molecular Biology - Bacterial Pangenomics ◽

10.1007/978-1-4939-1720-4_12 ◽

2015 ◽

pp. 177-189 ◽

Cited By ~ 8

Author(s):

Raquel Tobes ◽

Pablo Pareja-Tobes ◽

Marina Manrique ◽

Eduardo Pareja-Tobes ◽

Evdokim Kovach ◽

...

Keyword(s):

Genome Annotation ◽

Bacterial Genome

Identification of Small Non-coding RNAs in Bacterial Genome Annotation Using Databases and Computational Approaches

Advances in Intelligent Systems and Computing - Advances in Computational Biology ◽

10.1007/978-3-319-01568-2_42 ◽

2014 ◽

pp. 295-300 ◽

Cited By ~ 1

Author(s):

Mauricio Corredor ◽

Oscar Murillo

Keyword(s):

Genome Annotation ◽

Bacterial Genome ◽

Computational Approaches ◽

Non Coding Rnas
