An Escherichia coli ST131 pangenome atlas reveals population structure and evolution across 4,071 isolates

2019 ◽  
Author(s):  
Arun Gonzales Decano ◽  
Tim Downing

Escherichia coli ST131 is a major cause of infection with extensive antimicrobial resistance (AMR) facilitated by widespread beta-lactam antibiotic use. This drug pressure has driven extended-spectrum beta-lactamase (ESBL) gene acquisition and evolution in pathogens, so a clearer resolution of ST131’s origin, adaptation and spread is essential. E. coli ST131’s ESBL genes are typically embedded in mobile genetic elements (MGEs) that aid transfer to new plasmid or chromosomal locations, which are mobilised further by plasmid conjugation and recombination, resulting in a flexible ESBL, MGE and plasmid composition with a conserved core genome. We used population genomics to trace the evolution of AMR in ST131 more precisely by extracting all available high-quality Illumina HiSeq read libraries to investigate 4,071 globally sourced genomes, the largest ST131 collection examined so far. We applied rigorous quality control, de novo genome assembly and ESBL gene screening to resolve ST131’s population structure across three genetically distinct Clades (A, B, C) and abundant subclades from the dominant Clade C. We reconstructed their evolutionary relationships across the core and accessory genomes using published reference genomes, long-read assemblies and k-mer-based methods to contextualise pangenome diversity. The three main C subclades have co-circulated globally at relatively stable frequencies over time, suggesting that they reached an equilibrium after their origin and initial rapid spread. This contrasted with their ESBL genes, which had stronger patterns across time, geography and subclade, and were found at distinct chromosomal and plasmid locations in different isolates. Within the three C subclades, the core and accessory genome diversity levels were not correlated due to plasmid and MGE activity, unlike patterns between the three main clades, A, B and C.
This population genomic study highlights the dynamic nature of the accessory genomes in ST131, suggesting that surveillance should anticipate genetically variable outbreaks with broader antibiotic resistance levels. Our findings emphasise the potential of evolutionary pangenomics to improve our understanding of AMR gene transfer, adaptation and transmission to discover accessory genome changes linked to novel subtypes.

2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Arun Gonzales Decano ◽  
Tim Downing



2021 ◽  
Author(s):  
Teng Li ◽  
David Kainer ◽  
William J Foley ◽  
Allen Rodrigo ◽  
Carsten Kuelheim

Eucalyptus polybractea is a small, multi-stemmed tree that is widely cultivated in Australia for the production of Eucalyptus oil. We report the hybrid assembly of the E. polybractea genome using both short- and long-read technology. We generated 44 Gb of Illumina HiSeq short reads and 8 Gb of Nanopore long reads, representing approximately 83 and 15 times genome coverage, respectively. The hybrid-assembled genome, after polishing, contained 24,864 scaffolds with an accumulated length of 523 Mb (N50 = 40.3 kb; BUSCO-calculated genome completeness of 94.3%). The genome contained 35,385 predicted protein-coding genes detected by combining homology-based and de novo approaches. We have provided the first assembled genome based on hybrid sequences from the highly diverse Eucalyptus subgenus Symphyomyrtus, and demonstrated the value of including Nanopore long reads for enhancing the contiguity of the assembled genome, as well as for improving its completeness. We anticipate that the E. polybractea genome will be an invaluable resource supporting a range of studies in genetics, population genomics and evolution of related species in Eucalyptus.
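The coverage and N50 figures quoted above follow from simple arithmetic, which the minimal Python sketch below illustrates. The abstract does not state the genome size used for the coverage estimate, so the sketch assumes the 523 Mb assembled length as a stand-in; that assumption gives roughly 84x and 15x, close to the reported 83x and 15x. The function names and the toy scaffold lengths are hypothetical and chosen only for illustration.

```python
def fold_coverage(total_bases: float, genome_size: float) -> float:
    """Average sequencing depth: bases sequenced divided by genome size."""
    return total_bases / genome_size


def n50(lengths):
    """Length L such that sequences of length >= L hold at least half the total."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must not be empty")


# Assumption: the assembled length (523 Mb) stands in for the genome size.
GENOME_SIZE = 523e6
short_cov = fold_coverage(44e9, GENOME_SIZE)  # 44 Gb of Illumina reads
long_cov = fold_coverage(8e9, GENOME_SIZE)    # 8 Gb of Nanopore reads
print(f"short reads: {short_cov:.1f}x, long reads: {long_cov:.1f}x")

# Toy scaffold lengths (hypothetical), total 290, so half is 145.
toy_scaffolds = [80, 70, 50, 40, 30, 20]
print("toy N50:", n50(toy_scaffolds))
```

On the toy scaffolds the two longest sequences (80 + 70) already cover half of the total, so the N50 is 70.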


2021 ◽  
Vol 7 (9) ◽  
Author(s):  
Rebecca J. Hall ◽  
Fiona J. Whelan ◽  
Elizabeth A. Cummins ◽  
Christopher Connor ◽  
Alan McNally ◽  
...  

The pangenome contains all genes encoded by a species, with the core genome present in all strains and the accessory genome in only a subset. Coincident gene relationships are expected within the accessory genome, where the presence or absence of one gene is influenced by the presence or absence of another. Here, we analysed the accessory genome of an Escherichia coli pangenome consisting of 400 genomes from 20 sequence types to identify genes that display significant co-occurrence or avoidance patterns with one another. We present a complex network of genes that are either found together or that avoid one another more often than would be expected by chance, and show that these relationships vary by lineage. We demonstrate that genes co-occur by function, and that several highly connected gene relationships are linked to mobile genetic elements. We find that genes are more likely to co-occur with, rather than avoid, another gene in the accessory genome. This work furthers our understanding of the dynamic nature of prokaryote pangenomes and implicates both function and mobility as drivers of gene relationships.
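The core/accessory split and the idea of co-occurrence versus avoidance can be sketched on a toy presence/absence table. The Python sketch below is not the authors' pipeline: it uses a plain hypergeometric tail test, which ignores shared ancestry between genomes, whereas the abstract notes that relationships vary by lineage. All genome and gene names are hypothetical.

```python
from math import comb

# Toy presence/absence data: genome -> genes it carries (hypothetical names).
genomes = {
    "g1": {"coreA", "x", "y"}, "g2": {"coreA", "x", "y"},
    "g3": {"coreA", "x", "y"}, "g4": {"coreA", "x", "y"},
    "g5": {"coreA", "z"}, "g6": {"coreA", "z"},
    "g7": {"coreA", "z"}, "g8": {"coreA", "z"},
}


def split_pangenome(genomes):
    """Return (core, accessory): genes in every genome vs. only in some."""
    pangenome = set().union(*genomes.values())
    core = set.intersection(*genomes.values())
    return core, pangenome - core


def cooccurrence_pvalues(gene_a, gene_b, genomes):
    """Hypergeometric tails for the number of genomes carrying both genes.

    Returns (p_together, p_apart): the chance of an overlap at least as
    large, or at least as small, as observed if the genes were independent.
    """
    n_total = len(genomes)
    has_a = {g for g, genes in genomes.items() if gene_a in genes}
    has_b = {g for g, genes in genomes.items() if gene_b in genes}
    n_a, n_b, both = len(has_a), len(has_b), len(has_a & has_b)

    def pmf(k):
        return comb(n_a, k) * comb(n_total - n_a, n_b - k) / comb(n_total, n_b)

    lowest = max(0, n_a + n_b - n_total)
    highest = min(n_a, n_b)
    p_together = sum(pmf(k) for k in range(both, highest + 1))
    p_apart = sum(pmf(k) for k in range(lowest, both + 1))
    return p_together, p_apart


core, accessory = split_pangenome(genomes)
p_xy = cooccurrence_pvalues("x", "y", genomes)  # x and y always travel together
p_xz = cooccurrence_pvalues("x", "z", genomes)  # x and z never share a genome
print("core:", sorted(core), "accessory:", sorted(accessory))
print("x-y co-occurrence p:", p_xy[0], "x-z avoidance p:", p_xz[1])
```

In this toy set both the perfect co-occurrence (x with y) and the perfect avoidance (x with z) have a tail probability of 1/70 under independence. A real analysis would also need multiple-testing correction and some control for population structure.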


2017 ◽  
Author(s):  
Khalil Abudahab ◽  
Joaquín M. Prada ◽  
Zhirong Yang ◽  
Stephen D. Bentley ◽  
Nicholas J. Croucher ◽  
...  

The standard workhorse for genomic analysis of the evolution of bacterial populations is phylogenetic modelling of mutations in the core genome. However, in the current era of population genomics, a notable amount of information about evolutionary and transmission processes in diverse populations can be lost unless the accessory genome is also taken into consideration. Here we introduce PANINI, a computationally scalable method for identifying the neighbours of each isolate in a data set using unsupervised machine learning with stochastic neighbour embedding. PANINI is browser-based and integrates with the Microreact platform for rapid online visualisation and exploration of both core and accessory genome evolutionary signals together with relevant epidemiological, geographic, temporal and other metadata. Several case studies with single- and multi-clone pneumococcal populations are presented to demonstrate its ability to identify biologically important signals from gene content data. PANINI is available at http://panini.wgsa.net/ and code at http://gitlab.com/cgps/panini


2020 ◽  
Vol 69 (1) ◽  
pp. 116-122
Author(s):  
Tsam Ju ◽  
Perla Farhat ◽  
Wenjing Tao ◽  
Jibin Miao ◽  
Jialiang Li ◽  
...  

Juniperus squamata, a conifer endemic to Asia, is an ecologically and economically important shrub. Yet little is known about its genetic diversity and population structure, owing to a lack of highly polymorphic molecular markers. In this study, expressed sequence tag microsatellite (EST-SSR) markers were developed for Juniperus squamata. Illumina HiSeq data were used to reconstruct the transcriptome of this species by de novo assembly. Based on this transcriptome, 18 SSR markers were designed and successfully amplified. One locus was eliminated because null alleles were detected; the remaining 17 loci were polymorphic, generating five to 14 alleles per locus in J. squamata. Marker cross-amplification tests were successful in two species closely related to J. squamata. These markers will serve as a basis for further studies to assess the genetic diversity and population structure of J. squamata. They could also be useful in promoting sustainable forest management strategies for this species in the face of global climate change.


2021 ◽  
Author(s):  
Rebecca J Hall ◽  
Fiona J Whelan ◽  
Elizabeth A Cummins ◽  
Christopher Connor ◽  
Alan McNally ◽  
...  

The pangenome contains all genes encoded by a species, with the core genome present in all strains and the accessory genome in only a subset. Coincident gene relationships are expected within the accessory genome, where the presence or absence of one gene is influenced by the presence or absence of another. Here, we analysed the accessory genome of an Escherichia coli pangenome consisting of 400 genomes from 20 sequence types to identify genes that display significant co-occurrence or avoidance patterns with one another. We present a complex network of genes that are either found together or that avoid one another more often than would be expected by chance, and show that these relationships vary by lineage. We demonstrate that genes co-occur by function, and that several highly connected gene relationships are linked to mobile genetic elements. We find that genes are more likely to co-occur with, rather than avoid, another gene, suggesting that cooperation is more common than conflict in the accessory genome. This work furthers our understanding of the dynamic nature of prokaryote pangenomes and implicates both function and mobility as drivers of gene relationships.


Forests ◽  
2021 ◽  
Vol 12 (2) ◽  
pp. 222
Author(s):  
Bartosz Ulaszewski ◽  
Joanna Meger ◽  
Jaroslaw Burczyk

Next-generation sequencing of reduced representation genomic libraries (RRL) can provide large numbers of genetic markers for population genetic studies at relatively low cost. However, one major concern with these markers is genotyping precision, which is related to the common problem of missing data and appears to be particularly important in association and genomic selection studies. We evaluated three RRL approaches (GBS, RADseq, ddRAD) and different SNP identification methods (de novo or based on a reference genome) to find the best solutions for future population genomics studies in two economically and ecologically important broadleaved tree species, namely F. sylvatica and Q. robur. We found that the ddRAD method coupled with SNP calling based on reference genomes provided the largest numbers of markers (28 k and 36 k for beech and oak, respectively), given standard filtering criteria. Using technical replicates of samples, we demonstrated that more than 80% of SNP loci should be considered reliable markers in GBS and ddRAD, but not in RADseq data. According to the reference genomes’ annotations, more than 30% of the identified ddRAD loci appeared to be related to genes. Our findings provide solid support for using ddRAD-based SNPs for future population genomics studies in beech and oak.


Animals ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 1585
Author(s):  
Laila Darwich ◽  
Chiara Seminati ◽  
Jorge R. López-Olvera ◽  
Anna Vidal ◽  
Laia Aguirre ◽  
...  

Disease transmission among wild boars, domestic animals and humans is a public health concern, especially in areas with high wild boar densities. In this study, fecal samples of wild boars (n = 200) from different locations of the Metropolitan Area of Barcelona were analyzed by PCR to explore the frequency of β-lactamases and extended cephalosporin and carbapenem resistance genes (ESBLs) in Escherichia coli strains and the presence of toxigenic Clostridioides difficile. The prevalence of genes conferring resistance to β-lactam antimicrobials was 8.0% (16/200): blaCMY-2 (3.0%), blaTEM-1b (2.5%), blaCTX-M-14 (1.0%), blaSHV-28 (1.0%), blaCTX-M-15 (0.5%) and blaCMY-1 (0.5%). Clostridioides difficile TcdA+ was detected in two wild boars (1.0%), which is the first report of this pathogen in wild boars in Spain. Moreover, the wild boars foraging in urban and peri-urban locations were more exposed to sources of antimicrobial-resistant bacteria (AMRB) than the wild boars dwelling in natural environments. In conclusion, the detection of E. coli carrying ESBL/AmpC genes and toxigenic C. difficile in wild boars foraging in urban areas reinforces the value of this game species as a sentinel of environmental AMRB sources. In addition, these wild boars can be a public and environmental health concern by disseminating AMRB and other zoonotic agents. Although this study provides the first hints of the potential anthropogenic sources of AMR, further efforts are needed to identify and control them.

