BASys: a web server for automated bacterial genome annotation

2005 ◽  
Vol 33 (Web Server) ◽  
pp. W455-W459 ◽  
Author(s):  
G. H. Van Domselaar ◽  
P. Stothard ◽  
S. Shrivastava ◽  
J. A. Cruz ◽  
A. Guo ◽  
...  
2020 ◽  
Author(s):  
Snehal D. Karpe ◽  
Vikas Tiwari ◽  
Sowdhamini Ramanathan

Abstract Insect Olfactory Receptors (ORs) are a diverse family of membrane protein receptors responsible for most insect olfactory perception and communication, and hence they are of utmost importance for developing repellents or pesticides. Accurate gene prediction of insect ORs from newly sequenced genomes is therefore an important but challenging task. We have developed a dedicated web-server, ‘insectOR’, to predict and validate insect OR genes using multiple gene prediction algorithms, accompanied by relevant validations. It is possible to employ this server nearly automatically and perform rapid prediction of the OR gene loci from thousands of OR-protein-to-genome alignments, resolve gene boundaries for tandem OR genes and refine them further to provide more complete OR gene models. InsectOR outperformed the popular genome annotation pipelines (MAKER and NCBI eukaryotic genome annotation) in terms of overall sensitivity at base, exon and locus level, when tested on two distantly related insect genomes. It displayed more than 95% nucleotide-level precision in both tests. Finally, given the same input data and parameters, InsectOR missed fewer than 2% of gene loci, in contrast to the 55% of loci missed by MAKER for Drosophila melanogaster. The web-server is freely available on the web at http://caps.ncbs.res.in/insectOR/. All major browsers are supported. The website is implemented in Python with Jinja2 for templating and the Bootstrap framework, which uses HTML, CSS and JavaScript/Ajax. The core pipeline is written in Perl.
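The base-level sensitivity and precision mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the insectOR implementation (whose core pipeline is written in Perl); the function names and the exon coordinates are hypothetical, and intervals are assumed to be 1-based and inclusive on a single strand.

```python
def to_bases(intervals):
    """Expand 1-based inclusive (start, end) intervals into a set of positions."""
    bases = set()
    for start, end in intervals:
        bases.update(range(start, end + 1))
    return bases


def base_level_metrics(predicted, reference):
    """Return (sensitivity, precision) at nucleotide level.

    sensitivity = shared bases / reference bases
    precision   = shared bases / predicted bases
    """
    pred = to_bases(predicted)
    ref = to_bases(reference)
    shared = len(pred & ref)
    sensitivity = shared / len(ref) if ref else 0.0
    precision = shared / len(pred) if pred else 0.0
    return sensitivity, precision


# Hypothetical exon coordinates, for illustration only.
reference_exons = [(100, 199), (300, 399)]   # 200 annotated bases
predicted_exons = [(100, 199), (320, 419)]   # 200 predicted bases, 180 shared

sens, prec = base_level_metrics(predicted_exons, reference_exons)
print(f"sensitivity={sens:.2f} precision={prec:.2f}")
```

Exon-level and locus-level measures follow the same pattern but count whole exons or loci as matched or missed instead of individual bases.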


PLoS ONE ◽  
2012 ◽  
Vol 7 (11) ◽  
pp. e49239 ◽  
Author(s):  
Pablo Pareja-Tobes ◽  
Marina Manrique ◽  
Eduardo Pareja-Tobes ◽  
Eduardo Pareja ◽  
Raquel Tobes

2018 ◽  
Author(s):  
Jelle Slager ◽  
Rieza Aprianto ◽  
Jan-Willem Veening

ABSTRACT A precise understanding of the genomic organization into transcriptional units and their regulation is essential for our comprehension of opportunistic human pathogens and how they cause disease. Using single-molecule real-time (PacBio) sequencing we unambiguously determined the genome sequence of Streptococcus pneumoniae strain D39 and revealed several inversions previously undetected by short-read sequencing. Significantly, a chromosomal inversion results in antigenic variation of PhtD, an important surface-exposed virulence factor. We generated a new genome annotation using automated tools, followed by manual curation, reflecting the current knowledge in the field. By combining sequence-driven terminator prediction, deep paired-end transcriptome sequencing and enrichment of primary transcripts by Cappable-Seq, we mapped 1,015 transcriptional start sites and 748 termination sites. Using this new genomic map, we identified several new small RNAs (sRNAs), riboswitches (including twelve previously misidentified as sRNAs), and antisense RNAs. In total, we annotated 92 new protein-encoding genes, 39 sRNAs and 165 pseudogenes, bringing the S. pneumoniae D39 repertoire to 2,151 genetic elements. We report operon structures and observed that 9% of operons lack a 5’-UTR. The genome data is accessible in an online resource called PneumoBrowse (https://veeninglab.com/pneumobrowse) providing one of the most complete inventories of a bacterial genome to date. PneumoBrowse will accelerate pneumococcal research and the development of new prevention and treatment strategies.


PeerJ ◽  
2016 ◽  
Vol 4 ◽  
pp. e2056 ◽  
Author(s):  
Yevgeny Nikolaichik ◽  
Aliaksandr U. Damienikan

The majority of bacterial genome annotations are currently automated and based on a ‘gene by gene’ approach. Regulatory signals and operon structures are rarely taken into account, which often results in incomplete and even incorrect gene function assignments. Here we present SigmoID, a cross-platform (OS X, Linux and Windows) open-source application aiming at simplifying the identification of transcription regulatory sites (promoters, transcription factor binding sites and terminators) in bacterial genomes and providing assistance in correcting annotations in accordance with regulatory information. SigmoID combines a user-friendly graphical interface to well-known command line tools with a genome browser for visualising regulatory elements in genomic context. Integrated access to online databases with regulatory information (RegPrecise and RegulonDB) and web-based search engines speeds up genome analysis and simplifies correction of genome annotation. We demonstrate some features of SigmoID by constructing a series of regulatory protein binding site profiles for two groups of bacteria: Soft Rot Enterobacteriaceae (Pectobacterium and Dickeya spp.) and Pseudomonas spp. Furthermore, we inferred over 900 transcription factor binding sites and alternative sigma factor promoters in the annotated genome of Pectobacterium atrosepticum. These regulatory signals control putative transcription units covering about 40% of the P. atrosepticum chromosome. Reviewing the annotation in cases where it didn’t fit with regulatory information allowed us to correct product and gene names for over 300 loci.


mSystems ◽  
2020 ◽  
Vol 5 (5) ◽  
Author(s):  
Patrick Willems ◽  
Igor Fijalkowski ◽  
Petra Van Damme

ABSTRACT Prokaryotic genome annotation is heavily dependent on automated gene annotation pipelines that are prone to propagate errors and underestimate genome complexity. We describe an optimized proteogenomic workflow that uses ribosome profiling (ribo-seq) and proteomic data for Salmonella enterica serovar Typhimurium to identify unannotated proteins or alternative protein forms. This data analysis encompasses the searching of cofragmenting peptides and postprocessing with extended peptide-to-spectrum quality features, including comparison to predicted fragment ion intensities. When this strategy is applied, an enhanced proteome depth is achieved, as well as greater confidence for unannotated peptide hits. We demonstrate the general applicability of our pipeline by reanalyzing public Deinococcus radiodurans data sets. Taken together, our results show that systematic reanalysis using available prokaryotic (proteome) data sets holds great promise to assist in experimentally based genome annotation. IMPORTANCE Delineation of open reading frames (ORFs) causes persistent inconsistencies in prokaryote genome annotation. We demonstrate that by advanced (re)analysis of omics data, a higher proteome coverage and sensitive detection of unannotated ORFs can be achieved, which can be exploited for conditional bacterial genome (re)annotation, which is especially relevant in view of annotating the wealth of sequenced prokaryotic genomes obtained in recent years.


Author(s):  
Raquel Tobes ◽  
Pablo Pareja-Tobes ◽  
Marina Manrique ◽  
Eduardo Pareja-Tobes ◽  
Evdokim Kovach ◽  
...  
