Genomes of a major nosocomial pathogen Enterococcus faecium are shaped by adaptive evolution of the chromosome and plasmidome

10.1101/530725 ◽

2019 ◽

Cited By ~ 4

Author(s):

S Arredondo-Alonso ◽

J Top ◽

AC Schürch ◽

A McNally ◽

S Puranen ◽

...

Keyword(s):

Enterococcus Faecium ◽

Large Scale ◽

Population Genomics ◽

Genomic Analysis ◽

Adaptive Landscape ◽

Priority List ◽

Sequencing Technologies ◽

Long Read ◽

Pathogen Populations ◽

High Level

Abstract. Enterococcus faecium is a gut commensal of many mammals but is also recognized as a major nosocomial human pathogen, as it is listed on the WHO global priority list of multi-drug resistant organisms. Previous research has suggested that nosocomial strains have multiple zoonotic origins and are only distantly related to those involved in human commensal colonization. Here we present the first comprehensive population-wide joint genomic analysis of hospital, commensal and animal isolates using both short- and long-read sequencing techniques. This enabled us to investigate the population plasmidome, core genome variation and genome architecture in detail, using a combination of machine learning, population genomics and genome-wide co-evolution analysis. We observed a high level of genome plasticity with large-scale inversions and heterogeneous chromosome sizes, collectively painting a high-resolution picture of the adaptive landscape of E. faecium, and identified plasmids as the main indicator for host-specificity. Given the increasing availability of long-read sequencing technologies, our approach could be widely applied to other human and animal pathogen populations to unravel fine-scale mechanisms of their evolution.

Fatal Respiratory Diphtheria Caused by β-Lactam–Resistant Corynebacterium diphtheriae

Clinical Infectious Diseases ◽

10.1093/cid/ciaa1147 ◽

2020 ◽

Cited By ~ 1

Author(s):

Brian M Forde ◽

Andrew Henderson ◽

Elliott G Playford ◽

David Looke ◽

Belinda C Henderson ◽

...

Keyword(s):

Single Molecule ◽

Genomic Analysis ◽

Carbapenem Resistance ◽

Corynebacterium Diphtheriae ◽

Penicillin Binding Protein ◽

Long Read ◽

Amoxicillin Clavulanic Acid ◽

Resistance To Penicillin ◽

High Level

Abstract. Background: Diphtheria is a potentially fatal respiratory disease caused by toxigenic Corynebacterium diphtheriae. Although resistance to erythromycin has been recognized, β-lactam resistance in toxigenic diphtheria has not been described. Here, we report a case of fatal respiratory diphtheria caused by toxigenic C. diphtheriae resistant to penicillin and all other β-lactam antibiotics, and describe a novel mechanism of inducible carbapenem resistance associated with the acquisition of a mobile resistance element. Methods: Long-read whole-genome sequencing was performed using Pacific Biosciences Single Molecule Real-Time sequencing to determine the genome sequence of C. diphtheriae BQ11 and the mechanism of β-lactam resistance. To investigate the phenotypic inducibility of meropenem resistance, short-read sequencing was performed using an Illumina NextSeq500 sequencer on the strain both with and without exposure to meropenem. Results: BQ11 demonstrated high-level resistance to penicillin (benzylpenicillin minimum inhibitory concentration [MIC] ≥ 256 μg/mL), β-lactam/β-lactamase inhibitors and cephalosporins (amoxicillin/clavulanic acid MIC ≥ 256 μg/mL; ceftriaxone MIC ≥ 8 μg/L). Genomic analysis of BQ11 identified acquisition of a novel transposon carrying the penicillin-binding protein (PBP) Pbp2c, responsible for resistance to penicillin and cephalosporins. When strain BQ11 was exposed to meropenem, selective pressure drove amplification of the transposon in a tandem array and led to a corresponding change from a low-level to a high-level meropenem-resistant phenotype. Conclusions: We have identified a novel mechanism of inducible antibiotic resistance whereby isolates that appear to be carbapenem susceptible on initial testing can develop in vivo resistance to carbapenems with repeated exposure. This phenomenon could have significant implications for the treatment of C. diphtheriae infection, and may lead to clinical failure.

Complete Genome Resequencing of Thermus thermophilus Strain TMY by Hybrid Assembly of Long- and Short-Read Sequencing Technologies

Microbiology Resource Announcements ◽

10.1128/mra.00979-21 ◽

2021 ◽

Vol 10 (46) ◽

Author(s):

Kentaro Miyazaki ◽

Natsuko Tokito

Keyword(s):

Complete Genome ◽

Thermus Thermophilus ◽

Genomic Analysis ◽

Comparative Genomic ◽

Hybrid Assembly ◽

Genome Resequencing ◽

Short Read ◽

Content Type ◽

Sequencing Technologies ◽

Long Read

Complete genome resequencing was conducted for Thermus thermophilus strain TMY by hybrid assembly of Oxford Nanopore Technologies long-read and MGI short-read data. Errors in the previously reported genome sequence determined by PacBio technology alone were corrected, allowing for high-quality comparative genomic analysis of closely related T. thermophilus genomes.

Evolutionary superscaffolding and chromosome anchoring to improve Anopheles genome assemblies

BMC Biology ◽

10.1186/s12915-019-0728-3 ◽

2020 ◽

Vol 18 (1) ◽

Cited By ~ 9

Author(s):

Robert M. Waterhouse ◽

Sergey Aganezov ◽

Yoann Anselmetti ◽

Jiyoung Lee ◽

Livio Ruzzante ◽

...

Keyword(s):

Anopheles Funestus ◽

Genomic Analysis ◽

Comparative Genomic ◽

Financial Barriers ◽

Complementary Method ◽

Gene Synteny ◽

Sequencing Technologies ◽

Complementary Approach ◽

Long Read ◽

Gene Order Conservation

Abstract. Background: New sequencing technologies have lowered financial barriers to whole genome sequencing, but resulting assemblies are often fragmented and far from ‘finished’. Updating multi-scaffold drafts to chromosome-level status can be achieved through experimental mapping or re-sequencing efforts. Avoiding the costs associated with such approaches, comparative genomic analysis of gene order conservation (synteny) to predict scaffold neighbours (adjacencies) offers a potentially useful complementary method for improving draft assemblies. Results: We evaluated and employed 3 gene synteny-based methods applied to 21 Anopheles mosquito assemblies to produce consensus sets of scaffold adjacencies. For subsets of the assemblies, we integrated these with additional supporting data to confirm and complement the synteny-based adjacencies: 6 with physical mapping data that anchor scaffolds to chromosome locations, 13 with paired-end RNA sequencing (RNAseq) data, and 3 with new assemblies based on re-scaffolding or long-read data. Our combined analyses produced 20 new superscaffolded assemblies with improved contiguities: 7 for which assignments of non-anchored scaffolds to chromosome arms span more than 75% of the assemblies, and a further 7 with chromosome anchoring including an 88% anchored Anopheles arabiensis assembly and, respectively, 73% and 84% anchored assemblies with comprehensively updated cytogenetic photomaps for Anopheles funestus and Anopheles stephensi. Conclusions: Experimental data from probe mapping, RNAseq, or long-read technologies, where available, all contribute to successful upgrading of draft assemblies. Our evaluations show that gene synteny-based computational methods represent a valuable alternative or complementary approach. Our improved Anopheles reference assemblies highlight the utility of applying comparative genomics approaches to improve community genomic resources.
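
As an illustration of what a "consensus set of scaffold adjacencies" could look like in practice, the following is a minimal Python sketch. It assumes a simple support-count rule (keep an adjacency when at least two of the methods predict it); this rule, the function name `consensus_adjacencies`, the `min_support` parameter and the scaffold names are all hypothetical and are not taken from the paper, which may combine predictions differently.

```python
from collections import Counter


def consensus_adjacencies(predictions, min_support=2):
    """Return adjacencies predicted by at least `min_support` methods.

    predictions: one iterable per method, each containing (scaffold_a,
    scaffold_b) pairs. Pairs are treated as unordered, and each method
    contributes at most one vote per adjacency.
    """
    support = Counter()
    for method_adjacencies in predictions:
        # frozenset makes ("s1", "s2") and ("s2", "s1") the same adjacency.
        unique_pairs = {frozenset(pair) for pair in method_adjacencies}
        support.update(unique_pairs)
    return {adj for adj, votes in support.items() if votes >= min_support}


# Hypothetical predictions from three synteny-based methods.
method_predictions = [
    [("s1", "s2"), ("s2", "s3")],
    [("s2", "s1")],
    [("s3", "s4"), ("s2", "s3")],
]
consensus = consensus_adjacencies(method_predictions)
print(sorted(tuple(sorted(adj)) for adj in consensus))
```

A real pipeline would also need to resolve conflicts, for example when one scaffold end is assigned two different neighbours; that step is left out of this sketch.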

LoRTE: Detecting transposon-induced genomic variants using low coverage PacBio long read sequences

10.1101/073551 ◽

2016 ◽

Author(s):

Eric Disdero ◽

Jonathan Filée

Keyword(s):

Transposable Elements ◽

Reference Genome ◽

Genomic Analysis ◽

Bioinformatic Tools ◽

Sequencing Technologies ◽

Population Genomic ◽

Long Read ◽

Different Strains ◽

Low Coverage ◽

Ncbi Blast

Abstract. Motivation: Population genomic analysis of transposable elements has greatly benefited from recent advances in sequencing technologies. However, the propensity of transposable elements to nest in highly repeated regions of genomes limits the efficiency of bioinformatic tools when short-read sequencing technology is used. Results: LoRTE is the first tool able to use PacBio long-read sequences to identify transposon deletions and insertions between a reference genome and genomes of different strains or populations. Tested against Drosophila melanogaster PacBio datasets, LoRTE appears to be a reliable and broadly applicable tool to study the dynamic and evolutionary impact of transposable elements using low-coverage, long-read sequences. Availability and Implementation: LoRTE is available at http://www.egce.cnrs-gif.fr/?p=6422. It is written in Python 2.7 and only requires the NCBI BLAST+ package. LoRTE can be used on a standard computer with limited RAM resources and reasonable running time even with large datasets.

Evolutionary superscaffolding and chromosome anchoring to improve Anopheles genome assemblies

10.1101/434670 ◽

2018 ◽

Author(s):

Robert M. Waterhouse ◽

Sergey Aganezov ◽

Yoann Anselmetti ◽

Jiyoung Lee ◽

Livio Ruzzante ◽

...

Keyword(s):

Anopheles Funestus ◽

Genomic Analysis ◽

Comparative Genomic ◽

Financial Barriers ◽

Complementary Method ◽

Gene Synteny ◽

Sequencing Technologies ◽

Complementary Approach ◽

Long Read ◽

Gene Order Conservation

Abstract. Background: New sequencing technologies have lowered financial barriers to whole genome sequencing, but resulting assemblies are often fragmented and far from ‘finished’. Updating multi-scaffold drafts to chromosome-level status can be achieved through experimental mapping or re-sequencing efforts. Avoiding the costs associated with such approaches, comparative genomic analysis of gene order conservation (synteny) to predict scaffold neighbours (adjacencies) offers a potentially useful complementary method for improving draft assemblies. Results: We employed three gene synteny-based methods applied to 21 Anopheles mosquito assemblies to produce consensus sets of scaffold adjacencies. For subsets of the assemblies we integrated these with additional supporting data to confirm and complement the synteny-based adjacencies: six with physical mapping data that anchor scaffolds to chromosome locations, 13 with paired-end RNA sequencing (RNAseq) data, and three with new assemblies based on re-scaffolding or Pacific Biosciences long-read data. Our combined analyses produced 20 new superscaffolded assemblies with improved contiguities: seven for which assignments of non-anchored scaffolds to chromosome arms span more than 75% of the assemblies, and a further seven with chromosome anchoring including an 88% anchored Anopheles arabiensis assembly and, respectively, 73% and 84% anchored assemblies with comprehensively updated cytogenetic photomaps for Anopheles funestus and Anopheles stephensi. Conclusions: Experimental data from probe mapping, RNAseq, or long-read technologies, where available, all contribute to successful upgrading of draft assemblies. Our comparisons show that gene synteny-based computational methods represent a valuable alternative or complementary approach. Our improved Anopheles reference assemblies highlight the utility of applying comparative genomics approaches to improve community genomic resources.

Altered cell and RNA isoform diversity in aging Down syndrome brains

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.2114326118 ◽

2021 ◽

Vol 118 (47) ◽

pp. e2114326118

Author(s):

Carter R. Palmer ◽

Christine S. Liu ◽

William J. Romanow ◽

Ming-Hsiang Lee ◽

Jerold Chun

Keyword(s):

Down Syndrome ◽

Large Scale ◽

Cell Types ◽

Chromosome 21 ◽

Specific Cell ◽

Sequencing Technologies ◽

Isoform Diversity ◽

Long Read ◽

Single Nucleus ◽

Altered Cell

Down syndrome (DS), trisomy of human chromosome 21 (HSA21), is characterized by lifelong cognitive impairments and the development of the neuropathological hallmarks of Alzheimer’s disease (AD). The cellular and molecular modifications responsible for these effects are not understood. Here we performed single-nucleus RNA sequencing (snRNA-seq) employing both short- (Illumina) and long-read (Pacific Biosciences) sequencing technologies on a total of 29 DS and non-DS control prefrontal cortex samples. In DS, the ratio of inhibitory-to-excitatory neurons was significantly increased, which was not observed in previous reports examining sporadic AD. DS microglial transcriptomes displayed AD-related aging and activation signatures in advance of AD neuropathology, with increased microglial expression of C1q complement genes (associated with dendritic pruning) and the HSA21 transcription factor gene RUNX1. Long-read sequencing detected vast RNA isoform diversity within and among specific cell types, including numerous sequences that differed between DS and control brains. Notably, over 8,000 genes produced RNAs containing intra-exonic junctions, including amyloid precursor protein (APP) that had previously been associated with somatic gene recombination. These and related results illuminate large-scale cellular and transcriptomic alterations as features of the aging DS brain.

Statistical inference in population genomics

10.1101/2021.10.27.466171 ◽

2021 ◽

Author(s):

Parul Johri ◽

Charles F. Aquadro ◽

Mark Beaumont ◽

Brian Charlesworth ◽

Laurent Excoffier ◽

...

Keyword(s):

Statistical Inference ◽

Large Scale ◽

Population Genomics ◽

Model Fitting ◽

Genomic Data ◽

Statistical Population ◽

Biologically Relevant ◽

Sequencing Technologies ◽

Adaptive Processes ◽

Careful Exploration

The field of population genomics has grown rapidly with the recent advent of affordable, large-scale sequencing technologies. As opposed to the situation during the majority of the 20th century, in which the development of theoretical and statistical population-genetic insights out-paced the generation of data to which they could be applied, genomic data are now being produced at a far greater rate than they can be meaningfully analyzed and interpreted. With this wealth of data has come a tendency to focus on fitting specific (and often rather idiosyncratic) models to data, at the expense of a careful exploration of the range of possible underlying evolutionary processes. For example, the approach of directly investigating models of adaptive evolution in each newly sequenced population or species often neglects the fact that a thorough characterization of ubiquitous non-adaptive processes is a prerequisite for accurate inference. We here describe the perils of these tendencies, present our views on current best practices in population genomic data analysis, and highlight areas of statistical inference and theory that are in need of further attention. Thereby, we argue for the importance of defining a biologically relevant baseline model tuned to the details of each new analysis, of skepticism and scrutiny in interpreting model-fitting results, and of carefully defining addressable hypotheses and underlying uncertainties.

Human Herpesvirus Sequencing in the Genomic Era: The Growing Ranks of the Herpetic Legion

Pathogens ◽

10.3390/pathogens8040186 ◽

2019 ◽

Vol 8 (4) ◽

pp. 186 ◽

Cited By ~ 1

Author(s):

Charlotte J. Houldcroft

Keyword(s):

Epstein Barr Virus ◽

Population Genomics ◽

Human Herpesvirus ◽

Mixed Genotype ◽

Barr Virus ◽

Childhood Infections ◽

Sequencing Technologies ◽

Long Read ◽

Epstein Barr ◽

Human Herpesviruses

The nine human herpesviruses are some of the most ubiquitous pathogens worldwide, causing life-long latent infection in a variety of different tissues. Human herpesviruses range from mild childhood infections to known tumour viruses and ‘trolls of transplantation’. Epstein-Barr virus was the first human herpesvirus to have its whole genome sequenced; GenBank now includes thousands of herpesvirus genomes. This review will cover some of the recent advances in our understanding of herpesvirus diversity and disease that have come about as a result of new sequencing technologies, such as target enrichment and long-read sequencing. It will also look at the problem of resolving mixed-genotype infections, whether with short or long-read sequencing methods; and conclude with some thoughts on the future of the field as herpesvirus population genomics becomes a reality.

Near-complete Lokiarchaeota genomes from complex environmental samples using long and short read metagenomic analyses

10.1101/2019.12.17.879148 ◽

2019 ◽

Cited By ~ 3

Author(s):

Eva F. Caceres ◽

William H. Lewis ◽

Felix Homa ◽

Tom Martin ◽

Andreas Schramm ◽

...

Keyword(s):

Large Scale ◽

Phylogenetic Analyses ◽

Metagenomic Data ◽

Endosomal Sorting ◽

Sequencing Technologies ◽

Long Reads ◽

Oxford Nanopore ◽

Complete Genomes ◽

Culture Independent ◽

Long Read

Abstract. Asgard archaea is a recently proposed superphylum currently comprised of five recognised phyla: Lokiarchaeota, Thorarchaeota, Odinarchaeota, Heimdallarchaeota and Helarchaeota. Members of this group have been identified based on culture-independent approaches with several metagenome-assembled genomes (MAGs) reconstructed to date. However, most of these genomes consist of several relatively small contigs, and, until recently, no complete Asgard archaea genome was available. Large-scale phylogenetic analyses suggest that Asgard archaea represent the closest archaeal relatives of eukaryotes. In addition, members of this superphylum encode proteins that were originally thought to be specific to eukaryotes, including components of the trafficking machinery, cytoskeleton and endosomal sorting complexes required for transport (ESCRT). Yet, these findings have been questioned on the basis that the genome sequences that underpin them were assembled from metagenomic data, and could have been subjected to contamination and other assembly artefacts. Even though several lines of evidence indicate that the previously reported findings were not affected by these issues, access to high-quality and preferably fully closed Asgard archaea genomes is needed to definitively close this debate. Current long-read sequencing technologies such as Oxford Nanopore allow the generation of long reads in a high-throughput manner, making them suitable for use in metagenomics. Although the use of long reads is still limited in this field, recent analyses have shown that it is feasible to obtain complete or near-complete genomes of abundant members of mock communities and metagenomes of various levels of complexity. Here, we show that long-read metagenomics can be successfully applied to obtain near-complete genomes of low-abundance members of complex communities from sediment samples. We were able to reconstruct six MAGs from different Lokiarchaeota lineages that show high completeness and low fragmentation, with one of them being a near-complete genome consisting of only three contigs. Our analyses confirm that the eukaryote-like features previously associated with Lokiarchaeota are not the result of contamination or assembly artefacts, and can indeed be found in the newly reconstructed genomes.

proovframe: frameshift-correction for long-read (meta)genomics

10.1101/2021.08.23.457338 ◽

2021 ◽

Author(s):

Thomas Hackl ◽

Florian Trigodet ◽

A Murat Eren ◽

Steven J Biller ◽

John M Eppley ◽

...

Keyword(s):

Gene Prediction ◽

Genomic Analysis ◽

Community Members ◽

Reading Frame ◽

Complex Samples ◽

Small Indels ◽

Sequencing Technologies ◽

Long Reads ◽

Long Read ◽

Improving Accuracy

Long-read sequencing technologies hold great promise for the genomic analysis of complex samples such as microbial communities. Yet, despite improving accuracy, basic gene prediction on long-read data is still often impaired by frameshifts resulting from small indels. Consensus polishing using either complementary short reads or, to a lesser extent, the long reads themselves can mitigate this effect but requires universally high sequencing depth, which is difficult to achieve in complex samples where the majority of community members are rare. Here we present proovframe, a software tool implementing an alternative approach to overcome frameshift errors in long-read assemblies and raw long reads. We utilize protein-to-nucleotide alignments against reference databases to pinpoint indels in contigs or reads and correct them by deleting or inserting 1-2 bases, thereby conservatively restoring reading-frame fidelity in aligned regions. Using simulated and real-world benchmark data we show that proovframe performs comparably to short-read-based polishing on assembled data, works well with remote protein homologs, and can even be applied to raw reads directly. Together, our results demonstrate that protein-guided frameshift correction significantly improves the analyzability of long-read data both in combination with and as an alternative to common polishing strategies. Proovframe is available from https://github.com/thackl/proovframe.
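
The correction step described in this abstract (deleting or inserting 1-2 bases at indel positions pinpointed by an alignment) can be sketched in a few lines of Python. This is only an illustration under stated assumptions and not proovframe's actual code or interface: the function name `correct_frameshifts`, the `(position, delta)` event format and the use of `N` as the placeholder for inserted bases are all hypothetical, and the events are assumed to have already been derived from a protein-to-nucleotide alignment.

```python
def correct_frameshifts(seq, events):
    """Restore the reading frame of `seq` given frameshift events.

    events: iterable of (position, delta) pairs with 0-based positions.
      delta > 0: the sequence has `delta` extra bases starting at
                 `position`; they are deleted.
      delta < 0: the sequence is missing `-delta` bases at `position`;
                 'N' placeholders are inserted (an assumption of this sketch).
    Only corrections of 1 or 2 bases are accepted.
    """
    corrected = seq
    # Apply edits from right to left so earlier coordinates stay valid.
    for position, delta in sorted(events, reverse=True):
        if abs(delta) not in (1, 2):
            raise ValueError("this sketch only handles 1-2 base corrections")
        if delta > 0:
            corrected = corrected[:position] + corrected[position + delta:]
        else:
            corrected = corrected[:position] + "N" * (-delta) + corrected[position:]
    return corrected


# Hypothetical read with one spurious base inserted at position 4.
print(correct_frameshifts("ATGGACCTAA", [(4, 1)]))
# Hypothetical read with one base missing at position 3.
print(correct_frameshifts("ATGCCTAA", [(3, -1)]))
```

Processing the events from the highest position downward means the coordinates reported against the uncorrected sequence can be used directly, without tracking offsets.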
