Plasmid Profiler: Comparative Analysis of Plasmid Content in WGS Data


10.1101/121350 ◽

2017 ◽

Cited By ~ 2

Author(s):

Adrian Zetner ◽

Jennifer Cabral ◽

Laura Mataseje ◽

Natalie C Knox ◽

Philip Mabon ◽

...

Keyword(s):

Comparative Analysis ◽

De Novo ◽

Sequence Data ◽

Health Agency ◽

R Package ◽

Whole Genome Sequence ◽

Reference Sequence ◽

Supplementary Information ◽

Plasmid Content ◽

Link Type

Abstract. Summary: Comparative analysis of bacterial plasmids from whole genome sequence (WGS) data generated from short read sequencing is challenging. This is due to the difficulty in identifying contigs harbouring plasmid sequence data, and further difficulty in assembling such contigs into a full plasmid. As such, few software programs and bioinformatics pipelines exist to perform comprehensive comparative analyses of plasmids within and amongst sequenced isolates. To address this gap, we have developed Plasmid Profiler, a pipeline to perform comparative plasmid content analysis without the need for de novo assembly. The pipeline is designed to rapidly identify plasmid sequences by mapping reads to a plasmid reference sequence database. Predicted plasmid sequences are then annotated with their incompatibility group, if known. The pipeline allows users to query plasmids for genes or regions of interest and visualize results as an interactive heat map. Availability and Implementation: Plasmid Profiler is freely available software released under the Apache 2.0 open source software license. A stand-alone version of the entire Plasmid Profiler pipeline is available as a Docker container at https://hub.docker.com/r/phacnml/plasmidprofiler_0_1_6/. The conda recipe for the Plasmid R package is available at https://anaconda.org/bioconda/r-plasmidprofiler. The custom Plasmid Profiler R package is also available as a CRAN package at https://cran.r-project.org/web/packages/Plasmidprofiler/index.html. Galaxy tools associated with the pipeline are available as a Galaxy tool suite at https://toolshed.g2.bx.psu.edu/repository?repository_id=55e082200d16a504. The source code is available at https://github.com/phac-nml/plasmidprofiler. The Galaxy implementation is available at https://github.com/phac-nml/plasmidprofiler-galaxy. Contact: Email: [email protected]; National Microbiology Laboratory, Public Health Agency of Canada, 1015 Arlington Street, Winnipeg, Manitoba, Canada. Supplementary information: Documentation: http://plasmid-profiler.readthedocs.io/en/latest/


TIGER: inferring DNA replication timing from whole-genome sequence data

Bioinformatics ◽

10.1093/bioinformatics/btab166 ◽

2021 ◽

Cited By ~ 1

Author(s):

Amnon Koren ◽

Dashiell J Massey ◽

Alexa N Bracci

Keyword(s):

Dna Replication ◽

Genome Sequence ◽

Genomic Dna ◽

Sequence Data ◽

Replication Timing ◽

Whole Genome Sequence ◽

Supplementary Information ◽

Whole Genome ◽

Genome Sequence Data ◽

Dna Replication Timing

Abstract. Motivation: Genomic DNA replicates according to a reproducible spatiotemporal program, with some loci replicating early in S phase while others replicate late. Despite being a central cellular process, DNA replication timing studies have been limited in scale due to technical challenges. Results: We present TIGER (Timing Inferred from Genome Replication), a computational approach for extracting DNA replication timing information from whole genome sequence data obtained from proliferating cell samples. The presence of replicating cells in a biological specimen leads to non-uniform representation of genomic DNA that depends on the timing of replication of different genomic loci. Replication dynamics can hence be observed in genome sequence data by analyzing DNA copy number along chromosomes while accounting for other sources of sequence coverage variation. TIGER is applicable to any species with a contiguous genome assembly and rivals the quality of experimental measurements of DNA replication timing. It provides a straightforward approach for measuring replication timing and can readily be applied at scale. Availability and Implementation: TIGER is available at https://github.com/TheKorenLab/TIGER. Supplementary information: Supplementary data are available at Bioinformatics online.
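
The copy-number idea in the abstract above can be illustrated with a minimal sketch. This is not TIGER's implementation: the function names, the fixed window size, and the plain z-score normalization are assumptions made for illustration, and the sketch ignores the other sources of coverage variation (for example GC content and mappability) that the abstract says must be accounted for.

```python
from statistics import mean, pstdev


def windowed_depth(read_starts, chrom_length, window):
    """Count reads whose start position falls in each fixed-size window.

    read_starts: 0-based start positions of reads on one chromosome.
    chrom_length, window: lengths in base pairs (hypothetical parameters).
    """
    n_windows = (chrom_length + window - 1) // window  # ceiling division
    counts = [0] * n_windows
    for pos in read_starts:
        if 0 <= pos < chrom_length:
            counts[pos // window] += 1
    return counts


def relative_depth(counts):
    """Z-score the window counts.

    In a proliferating sample, windows with above-average depth are
    candidates for earlier replication; that reading is an assumption
    of this sketch, valid only after other biases are removed.
    """
    mu = mean(counts)
    sigma = pstdev(counts)
    if sigma == 0:
        return [0.0] * len(counts)
    return [(c - mu) / sigma for c in counts]


if __name__ == "__main__":
    counts = windowed_depth([0, 1, 2, 10, 11, 25], chrom_length=30, window=10)
    print(counts)
    print(relative_depth(counts))
```

A real analysis would start from aligned reads and filter unreliable regions before normalizing; the toy positions here only show the shape of the computation.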


Selective ancestral sorting and de novo evolution in the agricultural invasion of Amaranthus tuberculatus

10.1101/2021.07.26.453853 ◽

2021 ◽

Author(s):

Julia M. Kreiner ◽

Amalia Caballero ◽

Stephen I. Wright ◽

John R. Stinchcombe

Keyword(s):

De Novo ◽

Sequence Data ◽

Sex Differentiation ◽

Common Garden ◽

Whole Genome Sequence ◽

Secondary Contact ◽

Natural Environments ◽

Relative Role ◽

Standing Variation ◽

Amaranthus Tuberculatus

The relative role of hybridization, de novo evolution, and standing variation in weed adaptation to agricultural environments is largely unknown. In Amaranthus tuberculatus, a widespread North American agricultural weed, adaptation is likely influenced by recent secondary contact and admixture of two previously isolated subspecies. We characterized the extent of adaptation and phenotypic differentiation accompanying the spread of A. tuberculatus into agricultural environments and the contribution of subspecies divergence. We generated phenotypic and whole-genome sequence data from a manipulative common garden experiment, using paired samples from natural and agricultural populations. We found strong latitudinal, longitudinal, and sex differentiation in phenotypes, and subtle differences among agricultural and natural environments that were further resolved with ancestry-based inference. The transition into agricultural environments has favoured southwestern var. rudis ancestry that leads to higher biomass and environment-specific phenotypes: increased biomass and earlier flowering under reduced water availability, and reduced plasticity in fitness-related traits. We also detected de novo adaptation to agricultural habitats independent of ancestry effects, including marginally higher biomass and later flowering in agricultural populations, and a time to germination home advantage. Therefore, the invasion of A. tuberculatus into agricultural environments has drawn on adaptive variation across multiple timescales—through both preadaptation via the preferential sorting of var. rudis ancestry and de novo local adaptation.


hypeR: An R Package for Geneset Enrichment Workflows

10.1101/656637 ◽

2019 ◽

Cited By ~ 1

Author(s):

Anthony Federico ◽

Stefano Monti

Keyword(s):

High Throughput Sequencing ◽

R Package ◽

Supplementary Information ◽

Sequencing Data ◽

Wide Audience ◽

Popular Method ◽

Link Type ◽

High Throughput Sequencing Data ◽

One Stop ◽

Recent Version

Abstract. Summary: Geneset enrichment is a popular method for annotating high-throughput sequencing data. Existing tools fall short in providing the flexibility to tackle the varied challenges researchers face in such analyses, particularly when analyzing many signatures across multiple experiments. We present a comprehensive R package for geneset enrichment workflows that offers multiple enrichment, visualization, and sharing methods in addition to novel features such as hierarchical geneset analysis and built-in markdown reporting. hypeR is a one-stop solution to performing geneset enrichment for a wide audience and range of use cases. Availability and implementation: The most recent version of the package is available at https://github.com/montilab/hypeR. Supplementary information: Comprehensive documentation and tutorials are available at https://montilab.github.io/hypeR-docs.
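
The abstract above does not say which statistical test the package applies. A common choice for geneset over-representation analysis is the one-sided hypergeometric test, sketched here under that assumption. The function name and arguments are hypothetical and are not hypeR's API.

```python
from math import comb


def hypergeom_enrichment(signature, geneset, background_size):
    """One-sided hypergeometric over-representation test.

    signature: genes of interest (e.g. differentially expressed genes).
    geneset: genes annotated to one pathway or term.
    background_size: total number of genes that could have been drawn;
        assumed to contain every gene in both sets.
    Returns (overlap, p_value) where p_value = P(X >= overlap).
    """
    signature = set(signature)
    geneset = set(geneset)
    n = len(signature)          # number of draws
    K = len(geneset)            # successes in the population
    N = background_size         # population size
    if N < max(n, K):
        raise ValueError("background_size must cover both gene lists")
    k = len(signature & geneset)  # observed overlap
    total = comb(N, n)
    # Sum the probability of every overlap at least as large as observed.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return k, tail / total


if __name__ == "__main__":
    overlap, p = hypergeom_enrichment(
        signature={"a", "b", "c"},
        geneset={"a", "b", "d", "e"},
        background_size=10,
    )
    print(overlap, p)
```

When many genesets are tested at once, the resulting p-values would normally be adjusted for multiple testing before interpretation.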


The Genome Sequence of the Anthelmintic-Susceptible New Zealand Haemonchus contortus

Genome Biology and Evolution ◽

10.1093/gbe/evz141 ◽

2019 ◽

Vol 11 (7) ◽

pp. 1965-1970 ◽

Cited By ~ 7

Author(s):

Nikola Palevich ◽

Paul H Maclean ◽

Abdul Baten ◽

Richard W Scott ◽

David M Leathwick

Keyword(s):

Genome Sequence ◽

Molecular Mechanisms ◽

De Novo ◽

Sequence Data ◽

Animal Health ◽

Draft Genome ◽

Whole Genome Sequence ◽

Parasitic Nematodes ◽

Hybrid Assembly ◽

Genetic Structures

Abstract. Internal parasitic nematodes are a global animal health issue causing drastic losses in livestock. Here, we report an H. contortus representative draft genome to serve as a genetic resource to the scientific community and support future experimental research of molecular mechanisms in related parasites. A de novo hybrid assembly was generated from PCR-free whole genome sequence data, resulting in a chromosome-level assembly that is 465 Mb in size encoding 22,341 genes. The genome sequence presented here is consistent with the genome architecture of the existing Haemonchus species and is a valuable resource for future studies regarding population genetic structures of parasitic nematodes. Additionally, comparative pan-genomics with other species of economically important parasitic nematodes has revealed highly open genomes and strong collinearities within the phylum Nematoda.


multiHiCcompare: joint normalization and comparative analysis of complex Hi-C experiments

Bioinformatics ◽

10.1093/bioinformatics/btz048 ◽

2019 ◽

Vol 35 (17) ◽

pp. 2916-2923 ◽

Cited By ~ 15

Author(s):

John C Stansfield ◽

Kellen G Cresswell ◽

Mikhail G Dozmorov

Keyword(s):

Comparative Analysis ◽

A Priori ◽

Three Dimensional ◽

R Package ◽

Supplementary Information ◽

Chromatin Interaction ◽

Model Framework ◽

Chromatin Interactions ◽

Loess Regression ◽

Sequencing Studies

Abstract. Motivation: With the development of chromatin conformation capture technology and its high-throughput derivative Hi-C sequencing, studies of the three-dimensional interactome of the genome that involve multiple Hi-C datasets are becoming available. To account for the technology-driven biases unique to each dataset, there is a distinct need for methods to jointly normalize multiple Hi-C datasets. Previous attempts at removing biases from Hi-C data have made use of techniques which normalize individual Hi-C datasets, or, at best, jointly normalize two datasets. Results: Here, we present multiHiCcompare, a cyclic loess regression-based joint normalization technique for removing biases across multiple Hi-C datasets. In contrast to other normalization techniques, it properly handles the Hi-C-specific decay of chromatin interaction frequencies with the increasing distance between interacting regions. multiHiCcompare uses the general linear model framework for comparative analysis of multiple Hi-C datasets, adapted for the Hi-C-specific decay of chromatin interaction frequencies. multiHiCcompare outperforms other methods when detecting a priori known chromatin interaction differences from jointly normalized datasets. Applied to the analysis of auxin-treated versus untreated experiments, and CTCF depletion experiments, multiHiCcompare was able to recover the expected epigenetic and gene expression signatures of loss of chromatin interactions and reveal novel insights. Availability and implementation: multiHiCcompare is freely available on GitHub and as a Bioconductor R package at https://bioconductor.org/packages/multiHiCcompare. Supplementary information: Supplementary data are available at Bioinformatics online.


A comparative analysis of family-based and population-based association tests using whole genome sequence data

BMC Proceedings ◽

10.1186/1753-6561-8-s1-s33 ◽

2014 ◽

Vol 8 (Suppl 1) ◽

pp. S33 ◽

Cited By ~ 6

Author(s):

Jin J Zhou ◽

Wai-Ki Yip ◽

Michael H Cho ◽

Dandi Qiao ◽

Merry-Lynn N McDonald ◽

...

Keyword(s):

Comparative Analysis ◽

Genome Sequence ◽

Sequence Data ◽

Population Based ◽

Whole Genome Sequence ◽

Whole Genome ◽

Association Tests ◽

Genome Sequence Data ◽

Family Based


Characterizing mutagenic effects of recombination through a sequence-level genetic map

Science ◽

10.1126/science.aau1043 ◽

2019 ◽

Vol 363 (6425) ◽

pp. eaau1043 ◽

Cited By ~ 62

Author(s):

Bjarni V. Halldorsson ◽

Gunnar Palsson ◽

Olafur A. Stefansson ◽

Hakon Jonsson ◽

Marteinn T. Hardarson ◽

...

Keyword(s):

Genetic Map ◽

Meiotic Recombination ◽

De Novo ◽

Sequence Data ◽

Mutagenic Effect ◽

Whole Genome Sequence ◽

De Novo Mutation ◽

Base Pairs ◽

Males And Females ◽

Mutagenic Effects

Genetic diversity arises from recombination and de novo mutation (DNM). Using a combination of microarray genotype and whole-genome sequence data on parent-child pairs, we identified 4,531,535 crossover recombinations and 200,435 DNMs. The resulting genetic map has a resolution of 682 base pairs. Crossovers exhibit a mutagenic effect, with overrepresentation of DNMs within 1 kilobase of crossovers in males and females. In females, a higher mutation rate is observed up to 40 kilobases from crossovers, particularly for complex crossovers, which increase with maternal age. We identified 35 loci associated with the recombination rate or the location of crossovers, demonstrating extensive genetic control of meiotic recombination, and our results highlight genes linked to the formation of the synaptonemal complex as determinants of crossovers.


Treponema phagedenis (ex Noguchi 1912) Brumpt 1922 sp. nov., nom. rev., isolated from bovine digital dermatitis

INTERNATIONAL JOURNAL OF SYSTEMATIC AND EVOLUTIONARY MICROBIOLOGY ◽

10.1099/ijsem.0.004027 ◽

2020 ◽

Vol 70 (3) ◽

pp. 2115-2123 ◽

Cited By ~ 11

Author(s):

Peter Kuhnert ◽

Isabelle Brodard ◽

Maher Alsaaod ◽

Adrian Steiner ◽

Michael H. Stoffel ◽

...

Keyword(s):

Type Strain ◽

Type Species ◽

Sequence Data ◽

Whole Genome Sequence ◽

Valid Species ◽

Species Identity ◽

Digital Dermatitis ◽

Content Type ◽

Link Type ◽

Specific Pcr

‘Treponema phagedenis’ was originally described in 1912 by Noguchi but the name was not validly published and no type strain was designated. The taxon was not included in the Approved Lists of Bacterial Names and hence has no standing in nomenclature. Six Treponema strains positive in a ‘T. phagedenis’ phylogroup-specific PCR test were isolated from digital dermatitis (DD) lesions of cattle and further characterized and compared with the human strain ‘T. phagedenis’ ATCC 27087. Results of phenotypic and genotypic analyses including API ZYM, VITEK2, MALDI-TOF and electron microscopy, as well as whole genome sequence data, respectively, showed that they form a cluster of species identity. Moreover, this species identity was shared with ‘T. phagedenis’-like strains reported in the literature to be regularly isolated from bovine DD. High average nucleotide identity values between the genomes of bovine and human ‘T. phagedenis’ were observed. Slight genomic as well as phenotypic variations allowed us to differentiate bovine from human isolates, indicating host adaptation. Based on the fact that this species is regularly isolated from bovine DD and that the name is well dispersed in the literature, we propose the species Treponema phagedenis sp. nov., nom. rev. The species can be identified phenotypically and genetically and is clearly separated from other Treponema species. The valid species designation will allow its role in bovine DD to be explored further. The type strain for Treponema phagedenis sp. nov., nom. rev. is B43.1T (=DSM 110455T=NCTC 14362T), isolated from a bovine DD lesion in Switzerland.


Teredinibacter haidensis sp. nov., Teredinibacter purpureus sp. nov. and Teredinibacter franksiae sp. nov., marine, cellulolytic endosymbiotic bacteria isolated from the gills of the wood-boring mollusc Bankia setacea (Bivalvia: Teredinidae) and emended description of the genus Teredinibacter

INTERNATIONAL JOURNAL OF SYSTEMATIC AND EVOLUTIONARY MICROBIOLOGY ◽

10.1099/ijsem.0.004627 ◽

2021 ◽

Author(s):

Marvin A. Altamia ◽

J. Reuben Shipway ◽

David Stein ◽

Meghan A. Betcher ◽

Jennifer M. Fung ◽

...

Keyword(s):

Type Strain ◽

Type Species ◽

Sequence Data ◽

Amino Acid Identity ◽

Whole Genome Sequence ◽

Phenotypic Characterization ◽

Rrna Gene ◽

Bacterial Strains ◽

Content Type ◽

Link Type

Here, we describe three endosymbiotic bacterial strains isolated from the gills of the shipworm, Bankia setacea (Teredinidae: Bivalvia). These strains, designated as Bs08T, Bs12T and Bsc2T, are Gram-stain-negative, microaerobic gammaproteobacteria that grow on cellulose and a variety of substrates derived from lignocellulose. Phenotypic characterization, phylogeny based on 16S rRNA gene and whole genome sequence data, amino acid identity and percentage of conserved proteins analyses show that these strains are novel and may be assigned to the genus Teredinibacter. The three strains may be differentiated and distinguished from other previously described Teredinibacter species based on a combination of four characteristics: colony colour (Bs12T, purple; others beige to brown), marine salt requirement (Bs12T, Bsc2T and Teredinibacter turnerae strains), the capacity for nitrogen fixation (Bs08T and T. turnerae strains) and the ability to respire nitrate (Bs08T). Based on these findings, we propose the names Teredinibacter haidensis sp. nov. (type strain Bs08T=ATCC TSD-121T=KCTC 62964T), Teredinibacter purpureus sp. nov. (type strain Bs12T=ATCC TSD-122T=KCTC 62965T) and Teredinibacter franksiae sp. nov. (type strain Bsc2T=ATCC TSD-123T=KCTC 62966T).


blupADC: An R package and shiny toolkit for comprehensive genetic data analysis in animal and plant breeding

10.1101/2021.09.09.459557 ◽

2021 ◽

Author(s):

Quanshun Mei ◽

Chuanke Fu ◽

Jieling Li ◽

Shuhong Zhao ◽

Tao Xiang

Keyword(s):

Genetic Analysis ◽

Plant Breeding ◽

Genomic Data ◽

R Package ◽

Genotype Imputation ◽

Supplementary Information ◽

Composition Analysis ◽

Relationship Matrix ◽

Link Type ◽

Plant Breeding Program

Abstract. Summary: Genetic analysis is a systematic and complex procedure in animal and plant breeding. With the fast development of high-throughput genotyping techniques and algorithms, animal and plant breeding has entered a genomic era. However, there is a lack of software that can perform comprehensive genetic analyses in routine animal and plant breeding programs. To make the whole genetic analysis in animal and plant breeding straightforward, we developed a powerful, robust and fast R package that includes genomic data format conversion, genomic data quality control and genotype imputation, breed composition analysis, pedigree tracing, analysis and visualization, pedigree-based and genomic-based relationship matrix construction, and genomic evaluation. In addition, to simplify the application of this package, we also developed a shiny toolkit for users. Availability and implementation: blupADC is developed primarily in R with core functions written in C++. The development version is maintained at https://github.com/TXiang-lab/blupADC. Supplementary information: Supplementary data are available online.
