Exploration of Survival Traits, Probiotic Determinants, Host Interactions, and Functional Evolution of Bifidobacterial Genomes Using Comparative Genomics

Vikas Sharma; Fauzul Mobeen; Tulika Prakash

doi:10.3390/genes9100477

Genes ◽

10.3390/genes9100477 ◽

2018 ◽

Vol 9 (10) ◽

pp. 477 ◽

Cited By ~ 5

Author(s):

Vikas Sharma ◽

Fauzul Mobeen ◽

Tulika Prakash

Keyword(s):

Core Genome ◽

Size Variation ◽

Genomic Islands ◽

Genome Size Variation ◽

Host Interactions ◽

The Core ◽

Pan Genome ◽

Wide Range ◽

Insertion Elements ◽

Open Nature

Members of the genus Bifidobacterium are found in a wide range of habitats and are used as important probiotics. Thus, exploration of their functional traits at the genus level is of utmost significance. Moreover, this genus has been reported to exhibit an open pan-genome, but earlier studies were based on a limited number of genomes, and the number of genomes is a crucial factor in pan-genome calculations. We analyzed the pan-genome of a comparatively larger dataset of 215 members of the genus Bifidobacterium from different habitats, which revealed an open nature. The pan-genome of the 56 probiotic and human-gut strains of this genus was also found to be open. The accessory and unique components of this pan-genome were found to be under Darwinian selection pressure. Further, their genome-size variation was predicted to be attributable to the abundance of certain functions carried by genomic islands, which are facilitated by insertion elements and prophages. In silico functional and host-microbe interaction analyses of their core genome revealed significant genomic factors for niche-specific adaptations and probiotic traits. The core survival traits include stress tolerance, biofilm formation, nutrient transport, and the Sec secretion system, whereas the core probiotic traits are imparted by factors involved in carbohydrate and protein metabolism and host immunomodulation.

First Steps in the Analysis of Prokaryotic Pan-Genomes

Bioinformatics and Biology Insights ◽

10.1177/1177932220938064 ◽

2020 ◽

Vol 14 ◽

pp. 117793222093806

Author(s):

Sávio Souza Costa ◽

Luís Carlos Guimarães ◽

Artur Silva ◽

Siomar Castro Soares ◽

Rafael Azevedo Baraúna

Keyword(s):

Genome Analysis ◽

Core Genome ◽

Bacterial Species ◽

Genomic Analysis ◽

Gene Families ◽

Specific Group ◽

The Core ◽

Pan Genome ◽

Research Areas ◽

Key Concepts

The pan-genome is defined as the set of orthologous and unique genes of a specific group of organisms. It is composed of the core genome, the accessory genome, and species- or strain-specific genes. A pan-genome is considered open or closed based on the alpha value of Heaps' law. In an open pan-genome, the number of gene families continues to increase as new genomes are added to the analysis, while in a closed pan-genome, the number of gene families does not increase considerably. The first step of a pan-genome analysis is the homogenization of genome annotation: the same software, such as GeneMark or RAST, should be used to annotate all genomes. Subsequently, one of several software tools, such as BPGA, GET_HOMOLOGUES, or PGAP, among others, is used to calculate the pan-genome. This review presents these initial steps for those who want to perform a pan-genome analysis and explains key concepts of the area. Furthermore, we present the pan-genomic analysis of 9 bacterial species, which are the species with the highest number of genomes deposited in GenBank. We also show the influence of the identity and coverage parameters on the prediction of orthologous and paralogous genes. Finally, we discuss the perspectives of several research areas where pan-genome analysis can be used to answer important questions.
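
The open/closed criterion mentioned in this abstract can be illustrated with a small sketch. The abstract does not give the formula or the threshold, so the following rests on assumptions: it uses a commonly cited power-law form, n(N) = kappa * N^(-alpha), for the number of new gene families found when the N-th genome is added, and the common convention that alpha <= 1 indicates an open pan-genome and alpha > 1 a closed one. The function names and the input list are hypothetical and are not taken from BPGA, GET_HOMOLOGUES, or PGAP.

```python
import math


def fit_heaps_alpha(new_genes_per_genome):
    """Fit n(N) = kappa * N**(-alpha) by least squares in log-log space.

    new_genes_per_genome[i] is the (hypothetical) number of new gene
    families observed when genome number i + 1 is added. Zero counts are
    skipped because their logarithm is undefined.
    Returns the tuple (kappa, alpha).
    """
    points = [
        (math.log(n_genomes), math.log(count))
        for n_genomes, count in enumerate(new_genes_per_genome, start=1)
        if count > 0
    ]
    if len(points) < 2:
        raise ValueError("need at least two non-zero counts to fit a line")

    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)

    slope = sxy / sxx                      # slope of log n versus log N
    intercept = mean_y - slope * mean_x    # equals log(kappa)
    return math.exp(intercept), -slope     # alpha is the negated slope


def classify_pan_genome(alpha):
    """Assumed convention: alpha <= 1 means open, alpha > 1 means closed."""
    return "open" if alpha <= 1.0 else "closed"
```

In practice the counts would come from averaging over many random orderings of the genomes, which this sketch leaves out; it only shows how a fitted alpha leads to the open or closed label.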

Virulence and antibiotic resistance plasticity of Arcobacter butzleri: insights on the genomic diversity of an emerging human pathogen

10.1101/775932 ◽

2019 ◽

Author(s):

Joana Isidro ◽

Susana Ferreira ◽

Miguel Pinto ◽

Fernanda Domingues ◽

Mónica Oleastro ◽

...

Keyword(s):

Antibiotic Resistance ◽

Comparative Genomics ◽

Core Genome ◽

Human Pathogen ◽

Genome Diversity ◽

Pathogenic Potential ◽

The Core ◽

Pan Genome ◽

Arcobacter Butzleri ◽

Genome Scale

Arcobacter butzleri is a food- and waterborne bacterium and an emerging human pathogen, frequently displaying a multidrug-resistant character. Still, no comprehensive genome-scale comparative analysis has been performed so far, which has limited our knowledge of A. butzleri diversification and pathogenicity. Here, we performed a deep genome analysis of A. butzleri focused on decoding its core- and pan-genome diversity and specific genetic traits underlying its pathogenic potential and diverse ecology. In total, 49 A. butzleri strains (collected from human, animal, food and environmental sources) were screened. A. butzleri (genome size 2.07-2.58 Mbp) revealed a large open pan-genome with 7474 genes (about 50% being singletons) and a small core-genome with 1165 genes. The core-genome is highly diverse (≥55% of the core genes presenting at least 40/49 alleles), being enriched with genes associated with housekeeping functions. In contrast, the accessory genome presented a high proportion of loci with an unknown function, and genes associated with defence mechanisms were particularly overrepresented in it. A. butzleri revealed a plastic virulome (including newly identified determinants), marked by the differential presence of multiple adaptation-related virulence factors, such as the urease cluster ureD(AB)CEFG (phenotypically confirmed), the hypervariable hemagglutinin-encoding hecA, a putative type I secretion system (T1SS) harboring another agglutinin potentially related to adherence, and a novel VirB/D4 T4SS likely linked to interbacterial competition and cytotoxicity. In addition, A. butzleri harbors a large repertoire of efflux pumps (EPs) (ten “core” and nine differentially present) and other antibiotic resistance determinants. We provide the first description of a genetic determinant of macrolide resistance in A. butzleri, by associating the inactivation of a TetR repressor (likely regulating an EP) with erythromycin resistance. Fluoroquinolone resistance correlated with the Thr-85-Ile substitution in GyrA, and ampicillin resistance was linked to an OXA-15-like β-lactamase. Remarkably, by decoding the polymorphism pattern of the porin- and adhesin-encoding main antigen PorA, this study strongly supports that this pathogen is able to exchange porA as a whole and/or hypervariable epitope-encoding regions separately, leading to a multitude of chimeric PorA presentations that can impact pathogen-host interaction during infection. Ultimately, our unprecedented screening of short sequence repeats detected potential phase-variable genes related to adaptation and host/environment interaction, such as lipopolysaccharide modification and motility/chemotaxis, suggesting that phase variation likely modulates key adaptive functions of A. butzleri. In summary, this study constitutes a turning point in A. butzleri comparative genomics, revealing that this human gastrointestinal pathogen is equipped with vast virulence and antibiotic resistance arsenals, which, coupled with its remarkable core- and pan-genome diversity, opens a multitude of phenotypic fingerprints for environmental/host adaptation and pathogenicity.

Impact statement: Diarrhoeal diseases are the most common cause of human illness caused by foodborne hazards, but the surveillance of diarrhoeal diseases is biased towards the most commonly searched infectious agents (namely Campylobacter jejuni and C. coli). In fact, other less studied pathogens are frequently found as the etiological agent when refined non-selective culture conditions are applied. A hallmark example is the diarrhoea-causing Arcobacter butzleri which, despite also being associated with extra-intestinal diseases, such as bacteremia in humans and mastitis in animals, and displaying high rates of antibiotic resistance, has not yet been thoroughly investigated regarding its epidemiology, diversity and pathogenicity. To overcome the general lack of knowledge on A. butzleri comparative genomics, we provide the first comprehensive genome-scale analysis of A. butzleri focused on exploring the intraspecies virulome content and diversity, resistance determinants, as well as how this pathogen shapes its genome towards ecological adaptation and host invasion. The unveiled scenario of A. butzleri rampant diversity and plasticity reinforces the pathogenic potential of this food- and waterborne hazard, while opening multiple research lines that will certainly contribute to the future development of more robust species-oriented diagnostics and molecular surveillance of A. butzleri.

Data summary: A. butzleri raw sequence reads generated in the present study were deposited in the European Nucleotide Archive (ENA) (BioProject PRJEB34441). The assembled contigs (.fasta and .gbk files), the nucleotide sequences of the predicted transcripts (CDS, rRNA, tRNA, tmRNA, misc_RNA) (.ffn files) and the respective amino acid sequences of the translated CDS sequences (.faa files) are available at http://doi.org/10.5281/zenodo.3434222. Detailed ENA accession numbers, as well as the draft genome statistics, are described in Table S1.

Heterogeneity among estimates of the core genome and pan-genome in different pneumococcal populations

10.1101/133991 ◽

2017 ◽

Cited By ~ 5

Author(s):

Andries J van Tonder ◽

James E Bray ◽

Keith A Jolley ◽

Sigríður J Quirk ◽

Gunnsteinn Haraldsson ◽

...

Keyword(s):

Bacterial Population ◽

Core Genome ◽

Bacterial Species ◽

Essential Point ◽

Genetic Lineages ◽

The Core ◽

Pan Genome ◽

Single Dataset ◽

Genomic Regions ◽

Core Genes

Background: Understanding the structure of a bacterial population is essential in order to understand bacterial evolution, or which genetic lineages cause disease, or the consequences of perturbations to the bacterial population. Estimating the core genome, the genes common to all or nearly all strains of a species, is an essential component of such analyses. The size and composition of the core genome varies by dataset, but our hypothesis was that variation between different collections of the same bacterial species should be minimal. To test this, the genome sequences of 3,121 pneumococci recovered from healthy individuals in Reykjavik (Iceland), Southampton (United Kingdom), Boston (USA) and Maela (Thailand) were analysed. Results: The analyses revealed a ‘supercore’ genome (genes shared by all 3,121 pneumococci) of only 303 genes, although 461 additional core genes were shared by pneumococci from Reykjavik, Southampton and Boston. Overall, the size and composition of the core genomes and pan-genomes among pneumococci recovered in Reykjavik, Southampton and Boston were very similar, but pneumococci from Maela were distinctly different. Inspection of the pan-genome of Maela pneumococci revealed several >25 Kb sequence regions that were homologous to genomic regions found in other bacterial species. Conclusions: Some subsets of the global pneumococcal population are highly heterogeneous and thus our hypothesis was rejected. This is an essential point of consideration before generalising the findings from a single dataset to the wider pneumococcal population.

Pan-genome of Novel Pantoea stewartii subsp. indologenes Reveal Genes Involved in Onion Pathogenicity and Evidence of Lateral Gene Transfer

10.20944/preprints202107.0400.v1 ◽

2021 ◽

Author(s):

Gaurav Agarwal ◽

Ronald D. Gitaitis ◽

Bhabesh Dutta

Keyword(s):

Gene Transfer ◽

Core Genome ◽

Foxtail Millet ◽

Evaluation Study ◽

Full Spectrum ◽

The Core ◽

Pan Genome ◽

Pantoea Stewartii ◽

Comparative Phylogenetic Analysis ◽

Core Genes

Pantoea stewartii subsp. indologenes (Psi) is a causative agent of leafspot of foxtail millet and pearl millet; however, novel strains were recently identified that are pathogenic on onion. Our recent host range evaluation study identified two pathovars, P. stewartii subsp. indologenes pv. cepacicola pv. nov. and P. stewartii subsp. indologenes pv. setariae pv. nov., that are pathogenic on onion and millets or on millets only, respectively. In the current study we developed a pan-genome using whole genome sequencing of newly identified/classified Psi strains from both pathovars [pv. cepacicola (n = 4) and pv. setariae (n = 13)]. The full spectrum of the pan-genome contained 7,030 genes. Among these, 3,546 (present in the genomes of all 17 strains) were core genes, which were a subset of the 3,682 soft-core genes (present in ≥16 strains). The accessory genome included 1,308 shell genes and 2,040 cloud genes (present in ≤2 strains). The pan-genome showed a clear linear progression with >6,000 genes, suggesting that the pan-genome of Psi is open. Comparative phylogenetic analysis showed differences in the phylogenetic clustering of Pantoea spp. using the PAVs/wgMLST approach in comparison to core genome SNP-based phylogeny. Further, we conducted a horizontal gene transfer (HGT) study including four other Pantoea species, namely P. stewartii subsp. stewartii LMG 2715T, P. ananatis LMG 2665T, P. agglomerans LMG L15, and P. allii LMG 24248T. A total of 317 HGT events among four Pantoea species were identified, with most gene transfers observed between Psi pv. cepacicola and Psi pv. setariae. Pan-GWAS analysis predicted a total of 154 genes, including seven clusters of genes, associated with the pathogenicity phenotype on onion. One of the clusters contains 11 genes with known functions and is chromosomally located.
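
The core, soft-core, shell, and cloud categories in this abstract can be sketched as a simple rule on how many strains carry each gene family. This is a minimal illustration, not the authors' pipeline: the thresholds for core (all strains), soft-core (at least 16 of 17, written here as n - 1), and cloud (at most 2 strains) follow the abstract, while treating shell as everything in between is an assumption. The function names and the toy input format are hypothetical.

```python
from collections import Counter


def classify_family(n_present, n_strains, soft_core_min=None, cloud_max=2):
    """Label one gene family by the number of strains that carry it.

    Assumed rule: "core" if present in every strain, "soft-core only" if
    present in at least soft_core_min strains but not all of them,
    "cloud" if present in at most cloud_max strains, otherwise "shell".
    """
    if soft_core_min is None:
        soft_core_min = n_strains - 1   # e.g. 16 of 17, as in the abstract
    if not 1 <= n_present <= n_strains:
        raise ValueError("n_present must be between 1 and n_strains")
    if n_present == n_strains:
        return "core"
    if n_present >= soft_core_min:
        return "soft-core only"
    if n_present <= cloud_max:
        return "cloud"
    return "shell"


def summarize(presence, n_strains):
    """Count families per category.

    presence maps a gene family name to the set of strain names that
    carry it (hypothetical input format).
    """
    return Counter(
        classify_family(len(strains), n_strains)
        for strains in presence.values()
    )
```

Under this reading the reported soft-core count includes the core genes, and the abstract's figures are consistent with one another: 3,682 soft-core + 1,308 shell + 2,040 cloud = 7,030 genes in total.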

Analysis of the Core Genome and Pan-Genome of Autotrophic Acetogenic Bacteria

Frontiers in Microbiology ◽

10.3389/fmicb.2016.01531 ◽

2016 ◽

Vol 7 ◽

Cited By ~ 25

Author(s):

Jongoh Shin ◽

Yoseb Song ◽

Yujin Jeong ◽

Byung-Kwan Cho

Keyword(s):

Core Genome ◽

The Core ◽

Pan Genome ◽

Acetogenic Bacteria

Diversification of Aeonium Species Across Macaronesian Archipelagos: Correlations Between Genome-Size Variation and Their Conservation Status

Frontiers in Ecology and Evolution ◽

10.3389/fevo.2021.607338 ◽

2021 ◽

Vol 9 ◽

Author(s):

Miguel Brilhante ◽

Guilherme Roxo ◽

Sílvia Catarino ◽

Patrícia dos Santos ◽

J. Alfredo Reyes-Betancort ◽

...

Keyword(s):

Genome Size ◽

Environmental Variables ◽

Morphological Traits ◽

Conservation Status ◽

Size Variation ◽

Ecological Niches ◽

Conservation Priorities ◽

Genome Size Variation ◽

Wide Range ◽

Macaronesian Islands

The rich endemic flora of the Macaronesian Islands places these oceanic archipelagos among the top biodiversity hotspots worldwide. The radiations that have determined the evolution of many of these insular lineages resulted in a wealth of endemic species, many of which occur in a wide range of ecological niches, but show small distribution areas in each of them. Aeonium (Crassulaceae) is the most speciose lineage in the Canary Islands (ca. 40 taxa), and as such can be considered a good model system to understand the diversification dynamics of oceanic endemic floras. The present study aims to assess the genome size variation within Aeonium distribution, i.e., the Macaronesian archipelagos of Madeira, Canaries and Cabo Verde, and analyse it together with information on distribution (i.e., geography and conservation status), taxonomy (i.e., sections), morphological traits (i.e., growth-form), geological data (i.e., island's geological age), and environmental variables (i.e., altitude, annual mean temperature, and precipitation). Based on extensive fieldwork, a cytogeographic screening of 24 Aeonium species was performed. The conservation status of these species was assessed based on IUCN criteria. 61% of the taxa were found to be threatened (4% Endangered and 57% Vulnerable). For the first time, the genome size of a comprehensive sample of Aeonium across the Macaronesian archipelagos was estimated, and considerable differences in Cx-values were found, ranging from 0.984 pg (A. dodrantale) to 2.768 pg (A. gorgoneum). An overall positive correlation between genome size and conservation status was found, with the more endangered species having the larger genomes on average. However, only slight relationships were found between genome size, morphological traits, and environmental variables. These results underscore the importance of characterizing the cytogenomic diversity and conservation status of endemic plants found in Macaronesian Islands, providing, therefore, new data to establish conservation priorities.

Pan-Genome Analyses of Geobacillus spp. Reveal Genetic Characteristics and Composting Potential

International Journal of Molecular Sciences ◽

10.3390/ijms21093393 ◽

2020 ◽

Vol 21 (9) ◽

pp. 3393

Author(s):

Mengmeng Wang ◽

Han Zhu ◽

Zhijian Kong ◽

Tuo Li ◽

Lei Ma ◽

...

Keyword(s):

Core Genome ◽

Agricultural Waste ◽

Environmental Parameters ◽

Housekeeping Genes ◽

Ecological Diversity ◽

Thermostable Enzymes ◽

Genetic Characteristics ◽

Evolutionary Mechanism ◽

The Core ◽

Pan Genome

The genus Geobacillus is ecologically diverse and is well known as a source of various thermostable enzymes. Although it is now clear that Geobacillus evolved from Bacillus, relatively little is known about its evolutionary mechanism, which might also contribute to its ecological diversity and biotechnological potential. Here, a statistical comparison of thirty-two Geobacillus genomes was performed with a specific focus on pan- and core genomes. The pan-genome of this set of Geobacillus strains contained 14,913 genes, and the core genome contained 940 genes. The Clusters of Orthologous Groups (COG) and Carbohydrate-Active Enzymes (CAZymes) analyses revealed that the Geobacillus strains have great potential for industrial application in composting for agricultural waste management. Detailed comparative analyses showed that basic functional classes and housekeeping genes were conserved in the core genome, while genes associated with environmental interaction or energy metabolism were more enriched in the pan-genome. Therefore, the evolution of Geobacillus seems to be guided by environmental parameters. In addition, horizontal gene transfer (HGT) events among different Geobacillus species were detected. Altogether, pan-genome analysis was a useful method for investigating the evolutionary mechanism, and the evolution of Geobacillus was directed by the environment and HGT events.

Genome Size and Ploidy Levels of Cercis (Redbud) Species, Cultivars, and Botanical Varieties

HortScience ◽

10.21273/hortsci.51.4.330 ◽

2016 ◽

Vol 51 (4) ◽

pp. 330-333 ◽

Cited By ~ 2

Author(s):

David J. Roberts ◽

Dennis J. Werner

Keyword(s):

Genome Size ◽

Size Variation ◽

Internal Standard ◽

Close Relative ◽

Genome Size Variation ◽

Ploidy Levels ◽

Wide Range ◽

Average Size ◽

Size Estimates ◽

Botanical Varieties

Cercis is an ancient member of Fabaceae, often cultivated as an ornamental tree, and can be found in numerous regions around the world. Previous studies have reported Cercis canadensis as being diploid with 2n = 2x = 14. However, there have been no further investigations into ploidy and genome size variation among Cercis taxa. A study was conducted to evaluate the relative genome size and ploidy levels of numerous species, cultivars, and botanical varieties of Cercis, representing taxa found in North America, Asia, and the Middle East. In addition, the genome size of Bauhinia forficata, a close relative of Cercis, was also determined. Genome size estimates (2C values) were determined by calculating the mean fluorescence of stained nuclei via flow cytometry. Propidium iodide was used as the staining agent and Glycine max was used as an internal standard for each taxon analyzed. Genome size estimates for all Cercis sampled ranged from 0.70 to 0.81 pg with an average size of 0.75 pg. The genome size of B. forficata was found to be smaller than any other Bauhinia sp. currently on record, with an average size of 0.87 pg. This study confirmed an initial estimation of the genome size of Cercis chinensis and found that floral buds of Cercis proved to be an excellent source of plant tissue for obtaining intact nuclei. All species, botanical varieties, and cultivars of Cercis surveyed for this study had remarkably similar genome sizes despite their wide range of distribution. This information can facilitate a better understanding of phylogenetic relationships within Cercideae and Cercis specifically.
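
The abstract states that 2C values were obtained from the mean fluorescence of stained nuclei with Glycine max as an internal standard, but it does not spell out the calculation. A minimal sketch of the usual ratio method is given below, assuming the sample 2C value equals the ratio of sample to standard mean fluorescence multiplied by the known 2C value of the standard. The function name and all numbers in the usage note are illustrative and are not taken from the study.

```python
def estimate_2c_pg(sample_mean_fluorescence,
                   standard_mean_fluorescence,
                   standard_2c_pg):
    """Estimate a sample's 2C genome size in picograms.

    Assumed ratio method: the sample and the internal standard are
    stained and measured together, so fluorescence is taken to be
    proportional to DNA content.
    """
    if standard_mean_fluorescence <= 0 or sample_mean_fluorescence <= 0:
        raise ValueError("mean fluorescence values must be positive")
    if standard_2c_pg <= 0:
        raise ValueError("standard 2C value must be positive")
    ratio = sample_mean_fluorescence / standard_mean_fluorescence
    return ratio * standard_2c_pg
```

For example, `estimate_2c_pg(150.0, 500.0, 2.5)` uses made-up peak means and an illustrative standard value; the 2C value actually used for Glycine max would have to be taken from the paper itself.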

Comparative genome analysis of Clostridium beijerinckii strains isolated from pit mud of Chinese strong flavor baijiu ecosystem

G3 Genes|Genome|Genetics ◽

10.1093/g3journal/jkab317 ◽

2021 ◽

Author(s):

Wei Zou ◽

Guangbin Ye ◽

Chaojie Liu ◽

Kaizheng Zhang ◽

Hehe Li ◽

...

Keyword(s):

Core Genome ◽

Carbon Sources ◽

Clostridium Beijerinckii ◽

Orthologous Genes ◽

The Core ◽

Phage Integrase ◽

Pit Mud ◽

Wide Range ◽

Current Sampling ◽

Acetate Butyrate

Clostridium beijerinckii is a well-known anaerobic solventogenic bacterium that inhabits a wide range of niches. Previously, we isolated five butyrate-producing C. beijerinckii strains from the pit mud (PM) of strong-flavor baijiu (SFB) ecosystems. Genome annotation of the five strains showed that they could assimilate various carbon sources as well as ammonium to produce acetate, butyrate, lactate, hydrogen, and esters, but did not produce the undesirable flavour compounds isopropanol and acetone, making them useful for further exploration in SFB production. Our analysis of the genomes of an additional 233 C. beijerinckii strains revealed a pangenome that is open based on current sampling and will likely change as additional genomes are added. The core genome, accessory genome, and strain-specific genes comprised 1567, 8851, and 2154 genes, respectively. A total of 298 genes were found only in the five C. beijerinckii strains from PM, among which only 77 genes were assigned to Clusters of Orthologous Genes (COG) categories. In addition, 15 transposase and 12 phage integrase families were found in all five C. beijerinckii strains from PM. Between 18 and 21 genome islands (GIs) were predicted for the five C. beijerinckii genomes. The existence of a large number of mobile genetic elements (MGEs) indicated that the genomes of the five C. beijerinckii strains evolved through the loss or insertion of DNA fragments in the PM of SFB ecosystems. This study presents a genomic framework of C. beijerinckii strains from PM that could be used for genetic diversification studies and further exploration of these strains.

Pan-genome analysis and ancestral state reconstruction of class halobacteria: probability of a new super-order

Scientific Reports ◽

10.1038/s41598-020-77723-6 ◽

2020 ◽

Vol 10 (1) ◽

Author(s):

Sonam Gaba ◽

Abha Kumari ◽

Marnix Medema ◽

Rajeev Kaushik

Keyword(s):

Salt Concentration ◽

Genome Analysis ◽

Gene Tree ◽

Halophilic Archaea ◽

Variable Component ◽

Last Common Ancestor ◽

Ancestral State ◽

The Core ◽

Pan Genome ◽

Wide Range

Halobacteria, a class of Euryarchaeota, are extremely halophilic archaea that can adapt to a wide range of salt concentrations, generally from 10% NaCl to the saturated salt concentration of 32% NaCl. The class consists of the orders Halobacteriales, Haloferacales and Natrialbales. Pan-genome analysis of class Halobacteria was done to explore the core (300) and variable components (soft-core: 998; cloud: 36,531; shell: 11,784). The core component revealed genes of replication, transcription, translation and repair, whereas a major portion of the variable component was devoted to environmental information processing. The pan-gene matrix was mapped onto the core-gene tree to find the ancestral (44.8%) and derived genes (55.1%) of the Last Common Ancestor of Halobacteria. A high percentage of derived genes, along with the presence of transformation and conjugation genes, indicates the occurrence of horizontal gene transfer during the evolution of Halobacteria. Core-gene and pan-gene trees were also constructed to infer a phylogeny, which pointed to a new super-order comprising Natrialbales and Halobacteriales.
