Chromatiblock: scalable whole-genome visualization of structural differences in prokaryotes

GCViT: a method for interactive, genome-wide visualization of resequencing and SNP array data

BMC Genomics ◽

10.1186/s12864-020-07217-2 ◽

2020 ◽

Vol 21 (1) ◽

Author(s):

Andrew P. Wilkey ◽

Anne V. Brown ◽

Steven B. Cannon ◽

Ethalinda K. S. Cannon

Keyword(s):

SNP Array ◽

Whole Genome ◽

Array Data ◽

SNP Data ◽

Genome Wide ◽

Genome Visualization ◽

Data Points ◽

Genomic Regions ◽

Genome Scale

Background: Large genotyping datasets have become commonplace due to efficient, cheap methods for SNP identification. Typical genotyping datasets may have thousands to millions of data points per accession, across tens to thousands of accessions. There is a need for tools to help rapidly explore such datasets, to assess characteristics such as overall differences between accessions and regional anomalies across the genome. Results: We present GCViT (Genotype Comparison Visualization Tool), for visualizing and exploring large genotyping datasets. GCViT can be used to identify introgressions, conserved or divergent genomic regions, pedigrees, and other features for more detailed exploration. The program can be used online or as a local instance for whole genome visualization of resequencing or SNP array data. The program performs comparisons of variants among user-selected accessions to identify allele differences and similarities between accessions and a user-selected reference, providing visualizations through histogram, heatmap, or haplotype views. The resulting analyses and images can be exported in various formats. Conclusions: GCViT provides methods for interactively visualizing SNP data on a whole genome scale, and can produce publication-ready figures. It can be used in online or local installations. GCViT enables users to confirm or identify genomic regions of interest associated with particular traits. GCViT is freely available at https://github.com/LegumeFederation/gcvit. The 1.0 version described here is available at 10.5281/zenodo.4008713.
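The comparison step described in this abstract (allele differences between an accession and a user-selected reference, summarized as a histogram) can be illustrated with a minimal Python sketch. This is not GCViT's implementation: the function name `diff_histogram`, the position-to-allele dictionaries, and the fixed bin size are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict


def diff_histogram(reference: Dict[int, str],
                   accession: Dict[int, str],
                   bin_size: int) -> Dict[int, int]:
    """Count positions where the accession allele differs from the reference.

    Both inputs map a chromosome position to an allele call (hypothetical
    format). The result maps a bin index (position // bin_size) to the
    number of differing sites that fall in that bin.
    """
    counts: Counter = Counter()
    for pos, ref_allele in reference.items():
        acc_allele = accession.get(pos)
        # Skip sites with no call in the accession; count only true differences.
        if acc_allele is not None and acc_allele != ref_allele:
            counts[pos // bin_size] += 1
    return dict(counts)


if __name__ == "__main__":
    ref = {100: "A", 250: "C", 1200: "G"}
    acc = {100: "T", 250: "C", 1200: "A"}
    print(diff_histogram(ref, acc, bin_size=1000))
```

Binning by position is what turns per-site differences into a genome-wide histogram track; bins with unusually high or low counts would be candidate divergent or conserved regions.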

Chromatiblock: scalable whole-genome visualization of structural differences in prokaryotes

The Journal of Open Source Software ◽

10.21105/joss.02451 ◽

2020 ◽

Vol 5 (53) ◽

pp. 2451

Author(s):

Mitchell Sullivan ◽

Harm van Bakel

Keyword(s):

Whole Genome ◽

Genome Visualization ◽

Structural Differences

Tychus: a whole genome sequencing pipeline for assembly, annotation and phylogenetics of bacterial genomes

10.1101/283101 ◽

2018 ◽

Cited By ~ 1

Author(s):

Christopher Dean ◽

Noelle Noyes ◽

Steven Lakin ◽

Pablo Rovira-Sanz ◽

Xiang Yang ◽

...

Keyword(s):

Open Source ◽

Bacterial Genome ◽

Whole Genome Sequence ◽

Whole Genome ◽

Bacterial Genomes ◽

High Confidence ◽

Comprehensive Description ◽

Variant Discovery ◽

Large Numbers ◽

Virtualization Technology

Summary: Tychus is a tool that allows researchers to perform massively parallel whole genome sequence (WGS) analysis with the goal of producing a high confidence and comprehensive description of the bacterial genome. Key features of the Tychus pipeline include the assembly, annotation, alignment, variant discovery and phylogenetic inference of large numbers of WGS isolates in parallel using open-source bioinformatics tools and virtualization technology. All prerequisite tools and dependencies come packaged together in a single suite that can be easily downloaded and installed on Linux and Mac operating systems. Availability: Tychus is freely available as an open-source package under the MIT license, and can be downloaded via GitHub (https://github.com/Abdo-Lab/Tychus). Contact: [email protected]

Employing whole genome mapping for optimal de novo assembly of bacterial genomes

BMC Research Notes ◽

10.1186/1756-0500-7-484 ◽

2014 ◽

Vol 7 (1) ◽

pp. 484 ◽

Cited By ~ 9

Author(s):

Basil Xavier ◽

Julia Sabirova ◽

Moons Pieter ◽

Jean-Pierre Hernalsteens ◽

Henri de Greve ◽

...

Keyword(s):

De Novo Assembly ◽

De Novo ◽

Genome Mapping ◽

Whole Genome ◽

Bacterial Genomes ◽

Whole Genome Mapping

SkewIT: The Skew Index Test for large-scale GC Skew analysis of bacterial genomes

PLoS Computational Biology ◽

10.1371/journal.pcbi.1008439 ◽

2020 ◽

Vol 16 (12) ◽

pp. e1008439

Author(s):

Jennifer Lu ◽

Steven L. Salzberg

Keyword(s):

Large Scale ◽

Analysis Tool ◽

Index Test ◽

Bacterial Genomes ◽

Phylogenetic Groups ◽

Bacterial Phyla ◽

GC Skew ◽

A Genome ◽

Web App

GC skew is a phenomenon observed in many bacterial genomes, wherein the two replication strands of the same chromosome contain different proportions of guanine and cytosine nucleotides. Here we demonstrate that this phenomenon, which was first discovered in the mid-1990s, can be used today as an analysis tool for the 15,000+ complete bacterial genomes in NCBI’s RefSeq library. In order to analyze all 15,000+ genomes, we introduce a new method, SkewIT (Skew Index Test), that calculates a single metric representing the degree of GC skew for a genome. Using this metric, we demonstrate how GC skew patterns are conserved within certain bacterial phyla, e.g. Firmicutes, but show different patterns in other phylogenetic groups such as Actinobacteria. We also discovered that outlier values of SkewIT highlight potential bacterial mis-assemblies. Using our newly defined metric, we identify multiple mis-assembled chromosomal sequences in previously published complete bacterial genomes. We provide a SkewIT web app (https://jenniferlu717.shinyapps.io/SkewIT/) that calculates SkewI for any user-provided bacterial sequence. The web app also provides an interactive interface for the data generated in this paper, allowing users to further investigate the SkewI values and thresholds of the RefSeq-97 complete bacterial genomes. Individual scripts for analysis of bacterial genomes are provided in the following repository: https://github.com/jenniferlu717/SkewIT.
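For readers unfamiliar with the underlying quantity, GC skew over a window is conventionally computed as (G - C) / (G + C). The Python sketch below computes that standard windowed value and its running sum. It does not reproduce the SkewI metric from the paper, whose definition is not given in this abstract; the function names and the window size are illustrative assumptions.

```python
from typing import List


def gc_skew_windows(seq: str, window: int = 1000) -> List[float]:
    """Return (G - C) / (G + C) for consecutive non-overlapping windows."""
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        # A window with no G or C has undefined skew; report 0.0 by convention here.
        skews.append((g - c) / (g + c) if (g + c) else 0.0)
    return skews


def cumulative_skew(skews: List[float]) -> List[float]:
    """Running sum of window skews, often plotted to locate strand switches."""
    total = 0.0
    out = []
    for s in skews:
        total += s
        out.append(total)
    return out


if __name__ == "__main__":
    # Toy sequence: a G-rich half followed by a C-rich half.
    toy = "GGGGA" * 4 + "CCCCA" * 4
    per_window = gc_skew_windows(toy, window=20)
    print(per_window, cumulative_skew(per_window))
```

In a typical bacterial chromosome the sign of the windowed skew flips near the replication origin and terminus, which is why an abrupt or misplaced flip can hint at a mis-assembly.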

chewBBACA: A complete suite for gene-by-gene schema creation and strain identification

10.1101/173146 ◽

2017 ◽

Cited By ~ 5

Author(s):

Mickael Silva ◽

Miguel Machado ◽

Diogo N. Silva ◽

Mirko Rossi ◽

Jacob Moran-Gilad ◽

...

Keyword(s):

Open Source ◽

Core Genome ◽

Bacterial Species ◽

Outbreak Detection ◽

Strain Identification ◽

Whole Genome ◽

The Creation ◽

Allele Calling

Abstract: Gene-by-gene approaches are becoming increasingly popular in bacterial genomic epidemiology and outbreak detection. However, there is a lack of open-source scalable software for schema definition and allele calling for these methodologies. The chewBBACA suite was designed to assist users in the creation and evaluation of novel whole-genome or core-genome gene-by-gene typing schemas and subsequent allele calling in bacterial strains of interest. The software can run on a laptop or on high-performance clusters, making it useful for both small laboratories and large reference centers. chewBBACA is available at https://github.com/B-UMMI/chewBBACA or as a Docker image at https://hub.docker.com/r/ummidock/chewbbaca/.

Data Summary: Assembled genomes used for the tutorial were downloaded from NCBI in August 2016 by selecting those submitted as Streptococcus agalactiae taxon or sub-taxa. All the assemblies have been deposited as a zip file in FigShare (https://figshare.com/s/9cbe1d422805db54cd52), where a file with the original ftp link for each NCBI directory is also available. Code for the chewBBACA suite is available at https://github.com/B-UMMI/chewBBACA while the tutorial example is found at https://github.com/B-UMMI/chewBBACA_tutorial.

Impact Statement: The chewBBACA software offers a computational solution for the creation, evaluation and use of whole genome (wg) and core genome (cg) multilocus sequence typing (MLST) schemas. It allows researchers to develop wg/cgMLST schemes for any bacterial species from a set of genomes of interest. The alleles identified by chewBBACA correspond to potential coding sequences, possibly offering insights into the correspondence between the genetic variability identified and phenotypic variability. The software performs allele calling in a matter of seconds to minutes per strain on a laptop but is easily scalable for the analysis of large datasets of hundreds of thousands of strains using multiprocessing options. The chewBBACA software thus provides an efficient and freely available open source solution for gene-by-gene methods. Moreover, the ability to perform these tasks locally is desirable when the submission of raw data to a central repository or web services is hindered by data protection policies or ethical or legal concerns.

Taxonomic status of the species Clostridium methoxybenzovorans Mechichi et al. 1999

International Journal of Systematic and Evolutionary Microbiology ◽

10.1099/ijsem.0.004951 ◽

2021 ◽

Vol 71 (8) ◽

Author(s):

Hisami Kobayashi ◽

Yasuhiro Tanizawa ◽

Mitsuo Sakamoto ◽

Moriya Ohkuma ◽

Masanori Tohno

Keyword(s):

16S rRNA ◽

16S rRNA Gene ◽

Type Strain ◽

Type Species ◽

Taxonomic Status ◽

rRNA Gene ◽

Whole Genome ◽

The 16S rRNA Gene

The taxonomic status of the species Clostridium methoxybenzovorans was assessed. The 16S rRNA gene sequence, whole-genome sequence and phenotypic characterizations suggested that the type strain deposited in the American Type Culture Collection (C. methoxybenzovorans ATCC 700855T) is a member of the species Eubacterium callanderi. Hence, C. methoxybenzovorans ATCC 700855T cannot be used as a reference for taxonomic study. The type strain deposited in the German Collection of Microorganisms and Cell Cultures GmbH (DSM 12182T) is no longer listed in its online catalogue. Also, both the 16S rRNA gene and the whole-genome sequences of the original strain SR3T showed high sequence identity with those of Lacrimispora indolis (recently reclassified from Clostridium indolis) as the most closely related species. Analysis of the two genomes showed average nucleotide identity based on BLAST and digital DNA–DNA hybridization values of 98.3 and 87.9 %, respectively. Based on these results, C. methoxybenzovorans SR3T was considered to be a member of L. indolis.

Whole genome analyses reveal no pathogenetic single nucleotide or structural differences between monozygotic twins discordant for amyotrophic lateral sclerosis

Amyotrophic Lateral Sclerosis and Frontotemporal Degeneration ◽

10.3109/21678421.2015.1040029 ◽

2015 ◽

Vol 16 (5-6) ◽

pp. 385-392 ◽

Cited By ~ 18

Author(s):

Karyn Meltz Steinberg ◽

Thomas J. Nicholas ◽

Daniel C. Koboldt ◽

Bing Yu ◽

Elaine Mardis ◽

...

Keyword(s):

Amyotrophic Lateral Sclerosis ◽

Monozygotic Twins ◽

Whole Genome ◽

Single Nucleotide ◽

Structural Differences ◽

Genome Analyses ◽

Lateral Sclerosis

Integration of Whole-Genome Sequencing into Infection Control Practices: the Potential and the Hurdles

Journal of Clinical Microbiology ◽

10.1128/jcm.00349-15 ◽

2015 ◽

Vol 53 (4) ◽

pp. 1054-1055 ◽

Cited By ~ 10

Author(s):

Elizabeth Robilotti ◽

Mini Kamboj

Keyword(s):

Infection Control ◽

Whole Genome Sequencing ◽

Genome Sequencing ◽

Clinical Epidemiology ◽

Great Promise ◽

Whole Genome

Genome sequence analyses show that Neisseria oralis is the same species as ‘Neisseria mucosa var. heidelbergensis’

International Journal of Systematic and Evolutionary Microbiology ◽

10.1099/ijs.0.052431-0 ◽

2013 ◽

Vol 63 (Pt_10) ◽

pp. 3920-3926 ◽

Cited By ~ 30

Author(s):

Julia S. Bennett ◽

Keith A. Jolley ◽

Martin C. J. Maiden

Keyword(s):

Genome Sequence ◽

Type Species ◽

Whole Genome Sequence ◽

23S rRNA ◽

rRNA Gene ◽

Whole Genome ◽

The Family ◽

Neisseria Mucosa

Phylogenies generated from whole genome sequence (WGS) data provide definitive means of bacterial isolate characterization for typing and taxonomy. The species status of strains recently defined with conventional taxonomic approaches as representing Neisseria oralis was examined by the analysis of sequences derived from WGS data, specifically: (i) 53 Neisseria ribosomal protein subunit (rps) genes (ribosomal multi-locus sequence typing, rMLST); and (ii) 246 Neisseria core genes (core genome MLST, cgMLST). These data were compared with phylogenies derived from 16S and 23S rRNA gene sequences, demonstrating that the N. oralis strains were monophyletic with strains described previously as representing ‘Neisseria mucosa var. heidelbergensis’ and that this group was of equivalent taxonomic status to other well-described species of the genus Neisseria. Phylogenetic analyses also indicated that Neisseria sicca and Neisseria macacae should be considered the same species as Neisseria mucosa and that Neisseria flavescens should be considered the same species as Neisseria subflava. Analyses using rMLST showed that some strains currently defined as belonging to the genus Neisseria were more closely related to species belonging to other genera within the family; however, whole genome analysis of a more comprehensive selection of strains from within the family Neisseriaceae would be necessary to confirm this. We suggest that strains previously identified as representing ‘N. mucosa var. heidelbergensis’ and deposited in culture collections should be renamed N. oralis. Finally, one of the strains of N. oralis was able to ferment lactose, due to the presence of β-galactosidase and lactose permease genes, a characteristic previously thought to be unique to Neisseria lactamica, which therefore cannot be thought of as diagnostic for this species; however, the rMLST and cgMLST analyses confirm that N. oralis is most closely related to N. mucosa.
