Recent development of antiSMASH and other computational approaches to mine secondary metabolite biosynthetic gene clusters

2017 ◽  
Vol 20 (4) ◽  
pp. 1103-1113 ◽  
Author(s):  
Kai Blin ◽  
Hyun Uk Kim ◽  
Marnix H Medema ◽  
Tilmann Weber

Abstract Many drugs are derived from small molecules produced by microorganisms and plants, so-called natural products. Natural products have diverse chemical structures, but the biosynthetic pathways producing those compounds are often organized as biosynthetic gene clusters (BGCs) and follow a highly conserved biosynthetic logic. This allows for the identification of core biosynthetic enzymes using genome mining strategies that are based on the sequence similarity of the involved enzymes/genes. However, mining for a variety of BGCs quickly reaches a level of complexity at which manual analysis is no longer feasible and automated genome mining pipelines, such as the antiSMASH software, are required. In this review, we discuss the principles underlying the predictions of antiSMASH and other tools and provide practical advice for their application. Furthermore, we discuss important caveats such as rule-based BGC detection, sequence and annotation quality and cluster boundary prediction, which all have to be considered while planning for, performing and analyzing the results of genome mining studies.
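
To make the caveats about rule-based detection and cluster boundaries concrete, here is a minimal sketch of the general idea. It is not antiSMASH's implementation: the `Gene` record, the rule names, the domain labels and the `max_gap` cutoff are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    start: int
    end: int
    domains: frozenset  # domain labels assigned by some upstream annotation step


# Hypothetical rules: cluster type -> domains that must all be found close together.
RULES = {
    "toy_pks": {"KS", "AT"},
    "toy_nrps": {"C", "A", "PCP"},
}


def _call(cluster_type, required, run):
    """Report a run of neighbouring core genes if it satisfies the rule."""
    found = set().union(*(g.domains for g in run))
    if required <= found:
        # Boundaries are simply the first and last core gene: a heuristic.
        return [(cluster_type, run[0].start, run[-1].end)]
    return []


def detect_clusters(genes, rules=RULES, max_gap=10_000):
    """Group genes carrying rule domains into runs and test each run against the rule."""
    hits = []
    for cluster_type, required in rules.items():
        core = sorted((g for g in genes if g.domains & required), key=lambda g: g.start)
        run = []
        for gene in core:
            # Start a new run when the gap to the previous core gene is too large.
            if run and gene.start - run[-1].end > max_gap:
                hits.extend(_call(cluster_type, required, run))
                run = []
            run.append(gene)
        if run:
            hits.extend(_call(cluster_type, required, run))
    return hits


EXAMPLE_GENES = [
    Gene("g1", 1_000, 4_000, frozenset({"KS"})),
    Gene("g2", 4_500, 7_000, frozenset({"AT"})),
    Gene("g3", 60_000, 62_000, frozenset({"A"})),
    Gene("g4", 62_500, 64_000, frozenset({"C"})),
]

print(detect_clusters(EXAMPLE_GENES))
```

In this toy input the second locus is not reported because no gene carries the `PCP` label. This illustrates both caveats: a rule only finds what it describes, and a missed annotation or a poorly chosen gap cutoff changes the result.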

2015 ◽  
Author(s):  
Pablo Cruz-Morales ◽  
Christian E. Martínez-Guerrero ◽  
Marco A. Morales-Escalante ◽  
Luis Yáñez-Guerra ◽  
Johannes Florian Kopp ◽  
...  

Abstract Natural products have provided humans with antibiotics for millennia. However, a decline in the pace of chemical discovery exerts pressure on human health as antibiotic resistance spreads. The empirical nature of current genome mining approaches used for natural products research limits the chemical space that is explored. By integration of evolutionary concepts related to emergence of metabolism, we have gained fundamental insights that are translated into an alternative genome mining approach, termed EvoMining. As the founding assumption of EvoMining is the evolution of enzymes, we solved two milestone problems revealing unprecedented conversions. First, we report the biosynthetic gene cluster of the ‘orphan’ metabolite leupeptin in Streptomyces roseus. Second, we discover an enzyme involved in formation of an arsenic-carbon bond in Streptomyces coelicolor and Streptomyces lividans. This work provides evidence that the bacterial chemical repertoire is underexploited, as well as an approach to accelerate the discovery of novel antibiotics from bacterial genomes.


Biomolecules ◽  
2020 ◽  
Vol 10 (7) ◽  
pp. 1027 ◽  
Author(s):  
Loïc Martinet ◽  
Aymeric Naômé ◽  
Dominique Baiwir ◽  
Edwin De Pauw ◽  
Gabriel Mazzucchelli ◽  
...  

Strain prioritization for drug discovery aims at excluding redundant strains of a collection in order to limit the repetitive identification of the same molecules. In this work, we estimated how much may remain unexploited in terms of the amount, diversity, and novelty of compounds if the search is focused on only a single representative strain of a species, taking Streptomyces lunaelactis as a model. For this purpose, we selected 18 S. lunaelactis strains taxonomically clustered with the archetype strain S. lunaelactis MM109T. Genome mining of all S. lunaelactis strains isolated from the same cave revealed that 54% of the 42 biosynthetic gene clusters (BGCs) are strain specific, and five BGCs are not present in the reference strain MM109T. In addition, even when a BGC is conserved in all strains, such as the bag/fev cluster involved in bagremycin and ferroverdin production, the compounds produced differ greatly between the strains and previously unreported compounds are not produced by the archetype MM109T. Moreover, metabolomic pattern analysis uncovered important profile heterogeneity, confirming that identical BGC predisposition between two strains does not automatically imply chemical uniformity. In conclusion, trying to avoid strain redundancy based on phylogeny and genome mining information alone can compromise the discovery of new natural products and might prevent the exploitation of the best naturally engineered producers of specific molecules.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Kirstin Scherlach ◽  
Christian Hertweck

Abstract Genetically encoded small molecules (secondary metabolites) play eminent roles in ecological interactions, as pathogenicity factors and as drug leads. Yet, these chemical mediators often evade detection, and the discovery of novel entities is hampered by low production and high rediscovery rates. These limitations may be addressed by genome mining for biosynthetic gene clusters, thereby unveiling cryptic metabolic potential. The development of sophisticated data mining methods and genetic and analytical tools has enabled the discovery of an impressive array of previously overlooked natural products. This review shows the newest developments in the field, highlighting compound discovery from unconventional sources and microbiomes.


2020 ◽  
Author(s):  
Rafael Popin ◽  
Danillo Alvarenga ◽  
Raquel Castelo-Branco ◽  
David Fewer ◽  
Kaarina Sivonen

Abstract Background Microbial natural products have unique chemical structures and diverse biological activities. Cyanobacteria commonly possess a wide range of biosynthetic gene clusters to produce natural products. Several studies have mapped the distribution of natural product biosynthetic gene clusters in cyanobacterial genomes. However, little attention has been paid to natural product biosynthesis in plasmids. Some genes encoding cyanobacterial natural product biosynthetic pathways are believed to be dispersed by plasmids through horizontal gene transfer. Thus, we examined complete cyanobacterial genomes to assess if plasmids are involved in the production and dissemination of natural products by cyanobacteria. Results The 185 analyzed genomes possessed 1 to 42 gene clusters and an average of 10. In total, 1816 biosynthetic gene clusters were found. Approximately 95% of these clusters were present in chromosomes. The remaining 5% were present in plasmids, from which homologs of the biosynthetic pathways for aeruginosin, anabaenopeptin, ambiguine, cryptophycin, hassallidin, geosmin, and microcystin were manually curated. The cryptophycin pathway was previously described as active, while the other gene clusters include all genes for biosynthesis. Approximately 12% of the 424 analyzed cyanobacterial plasmids contained homologs of genes involved in conjugation. Large plasmids, previously named “chromids”, were also observed to be widespread in cyanobacteria. Sixteen cryptic natural product biosynthetic gene clusters and geosmin biosynthetic gene clusters were located in those mobile plasmids. Conclusion Homologues of genes involved in the production of toxins, protease inhibitors, odorous compounds, antimicrobials, antitumorals, and other unidentified natural products are located in cyanobacterial plasmids. Some of these plasmids are predicted to be conjugative. The present study provides in silico evidence that plasmids are involved in the distribution of natural product biosynthetic pathways in cyanobacteria.
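
The counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures stated above. Because the percentages are reported as approximate, the derived counts are estimates rather than values taken from the study.

```python
# Figures quoted in the abstract.
genomes = 185
total_bgcs = 1816
plasmid_share = 0.05       # "the remaining 5%" (approximate)
plasmids = 424
conjugative_share = 0.12   # "approximately 12%"

# Mean clusters per genome; the abstract rounds this to "an average of 10".
mean_per_genome = total_bgcs / genomes

# Estimated counts implied by the rounded percentages.
plasmid_bgcs = round(total_bgcs * plasmid_share)
conjugative_plasmids = round(plasmids * conjugative_share)

print(f"mean BGCs per genome: {mean_per_genome:.1f}")
print(f"estimated plasmid-borne BGCs: {plasmid_bgcs}")
print(f"estimated plasmids with conjugation homologs: {conjugative_plasmids}")
```

The mean works out to roughly 9.8 clusters per genome, consistent with the stated average of 10.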


Metabolites ◽  
2021 ◽  
Vol 11 (11) ◽  
pp. 785
Author(s):  
Junyang Wang ◽  
Jens Nielsen ◽  
Zihe Liu

A wide variety of bacteria, fungi and plants can produce bioactive secondary metabolites, which are often referred to as natural products. With the rapid development of DNA sequencing technology and bioinformatics, a large number of putative biosynthetic gene clusters have been reported. However, only a limited number of natural products have been discovered, as most biosynthetic gene clusters are not expressed or are expressed at extremely low levels under conventional laboratory conditions. With the rapid development of synthetic biology, advanced genome mining and engineering strategies have been reported and they provide new opportunities for discovery of natural products. This review discusses advances in recent years that can accelerate the design, build, test, and learn (DBTL) cycle of natural product discovery, and outlines trends and key challenges for future research.


2020 ◽  
Author(s):  
Neil L Grenade ◽  
Dragos S. Chiriac ◽  
Graeme W. Howe ◽  
Avena Ross

Bacterial natural products are an immensely valuable source of therapeutics. As modern DNA sequencing efforts provide increasing numbers of microbial genomes, it is clear that the molecules produced by most natural product biosynthetic gene clusters (BGCs) remain unknown. Genome mining makes use of bioinformatic techniques to elucidate the natural products produced by these “orphan” BGCs. Here, we report the use of sequence similarity networks (SSNs) and genome neighborhood networks (GNNs) to identify an orphan BGC that is responsible for the production of the antitumor tambjamine BE-18591 in Streptomyces albus NRRL B-2362. Although BE-18591 is a close structural analogue of tambjamine YP1 produced by Pseudoalteromonas tunicata, the biosynthetic routes to produce these molecules differ significantly. Notably, the C12-alkylamine tail that is appended onto the bipyrrole core of tambjamine YP1 is derived from fatty acids siphoned from the primary metabolism of the pseudoalteromonad, whilst the S. albus NRRL B-2362 BGC encodes a dedicated system for the de novo biosynthesis of the alkylamine portion of tambjamine BE-18591. These remarkably different biosynthetic strategies represent a striking example of convergent BGC evolution, with selective pressure for the production of tambjamines seemingly leading to the emergence of separate biosynthetic pathways in pseudoalteromonads and streptomycetes that ultimately produce closely related compounds.
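
As a rough illustration of what a sequence similarity network is, the sketch below links proteins whose pairwise similarity meets a threshold and then reads off the connected groups. It is not the software used in the study: the protein labels, the scores and the threshold are invented for the example, and real SSNs are built from alignment scores over many thousands of sequences.

```python
from collections import defaultdict


def similarity_network(scores, threshold):
    """Build an adjacency map, keeping only pairs that score at or above the threshold."""
    graph = defaultdict(set)
    for (a, b), score in scores.items():
        # Touch both keys so that unlinked proteins still appear as nodes.
        graph[a]
        graph[b]
        if score >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def components(graph):
    """Return connected components as sorted lists, via depth-first search."""
    seen, result = set(), []
    for node in sorted(graph):
        if node in seen:
            continue
        stack, group = [node], set()
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(graph[current] - group)
        seen |= group
        result.append(sorted(group))
    return result


# Hypothetical pairwise similarity scores (fraction identity) between five proteins.
EXAMPLE_SCORES = {
    ("P1", "P2"): 0.82,
    ("P2", "P3"): 0.75,
    ("P1", "P3"): 0.40,
    ("P3", "P4"): 0.30,
    ("P4", "P5"): 0.90,
}

print(components(similarity_network(EXAMPLE_SCORES, threshold=0.7)))
```

Raising the threshold splits groups into smaller, more closely related families, and lowering it merges them, so the choice of cutoff shapes which candidate enzymes end up grouped together.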


2021 ◽  
Vol 118 (19) ◽  
pp. e2020230118
Author(s):  
Matthew T. Robey ◽  
Lindsay K. Caesar ◽  
Milton T. Drott ◽  
Nancy P. Keller ◽  
Neil L. Kelleher

Fungi are prolific producers of natural products, compounds which have had a large societal impact as pharmaceuticals, mycotoxins, and agrochemicals. Despite the availability of over 1,000 fungal genomes and several decades of compound discovery efforts from fungi, the biosynthetic gene clusters (BGCs) encoded by these genomes and the associated chemical space have yet to be analyzed systematically. Here, we provide detailed annotation and analyses of fungal biosynthetic and chemical space to enable genome mining and discovery of fungal natural products. Using 1,037 genomes from species across the fungal kingdom (e.g., Ascomycota, Basidiomycota, and non-Dikarya taxa), 36,399 predicted BGCs were organized into a network of 12,067 gene cluster families (GCFs). Anchoring these GCFs with reference BGCs enabled automated annotation of 2,026 BGCs with predicted metabolite scaffolds. We performed parallel analyses of the chemical repertoire of fungi, organizing 15,213 fungal compounds into 2,945 molecular families (MFs). The taxonomic landscape of fungal GCFs is largely species specific, though select families such as the equisetin GCF are present across vast phylogenetic distances with parallel diversifications in the GCF and MF. We compare these fungal datasets with a set of 5,453 bacterial genomes and their BGCs and 9,382 bacterial compounds, revealing dramatic differences between bacterial and fungal biosynthetic logic and chemical space. These genomics and cheminformatics analyses reveal the large extent to which fungal and bacterial sources represent distinct compound reservoirs. With a >10-fold increase in the number of interpreted strains and annotated BGCs, this work better regularizes the biosynthetic potential of fungi for rational compound discovery.


2020 ◽  
Author(s):  
Matthew T. Robey ◽  
Lindsay K. Caesar ◽  
Milton T. Drott ◽  
Nancy P. Keller ◽  
Neil L. Kelleher

Abstract Fungi are prolific producers of natural products, compounds which have had a large societal impact as pharmaceuticals, mycotoxins, and agrochemicals. Despite the availability of over 1000 fungal genomes and several decades of compound discovery efforts from fungi, the biosynthetic gene clusters (BGCs) encoded by these genomes and the associated chemical space have yet to be analyzed systematically. Here we provide detailed annotation and analyses of fungal biosynthetic and chemical space to enable genome mining and discovery of fungal natural products. Using 1037 genomes from species across the fungal kingdom (e.g., Ascomycota, Basidiomycota, and non-Dikarya taxa), 36,399 predicted BGCs were organized into a network of 12,067 gene cluster families (GCFs). Anchoring these GCFs with reference BGCs enabled automated annotation of 2,026 BGCs with predicted metabolite scaffolds. We performed parallel analyses of the chemical repertoire of Fungi, organizing 15,213 fungal compounds into 2,945 molecular families (MFs). The taxonomic landscape of fungal GCFs is largely species-specific, though select families such as the equisetin GCF are present across vast phylogenetic distances with parallel diversifications in the GCF and MF. We compare these fungal datasets with a set of 5,453 bacterial genomes and their BGCs and 9,382 bacterial compounds, revealing dramatic differences between bacterial and fungal biosynthetic logic and chemical space. These genomics and cheminformatics analyses reveal the large extent to which fungal and bacterial sources represent distinct compound reservoirs. With a >10-fold increase in the number of interpreted strains and annotated BGCs, this work better regularizes the biosynthetic potential of fungi for rational compound discovery. Significance Statement Fungi represent an underexploited resource for new compounds with applications in the pharmaceutical and agriscience industries. Despite the availability of >1000 fungal genomes, our knowledge of the biosynthetic space encoded by these genomes is limited and ad hoc. We present results from systematically organizing the biosynthetic content of 1037 fungal genomes, providing a resource for data-driven genome mining and large-scale comparison of the genetic and molecular repertoires produced in fungi, and we compare these to those present in bacteria.


2019 ◽  
Author(s):  
Patrick Videau ◽  
Kaitlyn Wells ◽  
Arun Singh ◽  
Jessie Eiting ◽  
Philip Proteau ◽  
...  

Cyanobacteria are prolific producers of natural products and genome mining has shown that many orphan biosynthetic gene clusters can be found in sequenced cyanobacterial genomes. New tools and methodologies are required to investigate these biosynthetic gene clusters and here we present the use of Anabaena sp. strain PCC 7120 as a host for combinatorial biosynthesis of natural products using the indolactam natural products (lyngbyatoxin A, pendolmycin, and teleocidin B-4) as a test case. We were able to successfully produce all three compounds using codon-optimized genes from Actinobacteria. We also introduce a new plasmid backbone based on the native Anabaena 7120 plasmid pCC7120ζ and show that production of teleocidin B-4 can be accomplished using a two-plasmid system, which can be introduced by co-conjugation.

