Recent development of antiSMASH and other computational approaches to mine secondary metabolite biosynthetic gene clusters

Kai Blin; Hyun Uk Kim; Marnix H Medema; Tilmann Weber

doi:10.1093/bib/bbx146

Expanding the Natural Products Heterologous Expression Repertoire in the Model Cyanobacterium Anabaena sp. Strain PCC 7120: Production of Pendolmycin and Teleocidin B-4

doi:10.26434/chemrxiv.11316098.v1

2019

Cited By ~ 1

Author(s): Patrick Videau; Kaitlyn Wells; Arun Singh; Jessie Eiting; Philip Proteau; ...

Keyword(s): Natural Products; Genome Mining; Gene Clusters; Combinatorial Biosynthesis; Test Case; Biosynthetic Gene; Biosynthetic Gene Clusters; Cyanobacterium Anabaena; Anabaena Sp; Pcc 7120

Cyanobacteria are prolific producers of natural products, and genome mining has shown that many orphan biosynthetic gene clusters can be found in sequenced cyanobacterial genomes. New tools and methodologies are required to investigate these biosynthetic gene clusters. Here we present the use of Anabaena sp. strain PCC 7120 as a host for combinatorial biosynthesis of natural products, using the indolactam natural products (lyngbyatoxin A, pendolmycin, and teleocidin B-4) as a test case. We were able to successfully produce all three compounds using codon-optimized genes from Actinobacteria. We also introduce a new plasmid backbone based on the native Anabaena 7120 plasmid pCC7120ζ and show that production of teleocidin B-4 can be accomplished using a two-plasmid system, which can be introduced by co-conjugation.

Recapitulation of the evolution of biosynthetic gene clusters reveals hidden chemical diversity on bacterial genomes

doi:10.1101/020503

2015

Cited By ~ 6

Author(s): Pablo Cruz-Morales; Christian E. Martínez-Guerrero; Marco A. Morales-Escalante; Luis Yáñez-Guerra; Johannes Florian Kopp; ...

Keyword(s): Natural Products; Chemical Space; Streptomyces Coelicolor; Genome Mining; Gene Clusters; Biosynthetic Gene Cluster; Chemical Diversity; Biosynthetic Gene; Bacterial Genomes; Biosynthetic Gene Clusters

Natural products have provided humans with antibiotics for millennia. However, a decline in the pace of chemical discovery exerts pressure on human health as antibiotic resistance spreads. The empirical nature of current genome mining approaches used for natural products research limits the chemical space that is explored. By integration of evolutionary concepts related to emergence of metabolism, we have gained fundamental insights that are translated into an alternative genome mining approach, termed EvoMining. As the founding assumption of EvoMining is the evolution of enzymes, we solved two milestone problems revealing unprecedented conversions. First, we report the biosynthetic gene cluster of the ‘orphan’ metabolite leupeptin in Streptomyces roseus. Second, we discover an enzyme involved in formation of an arsenic-carbon bond in Streptomyces coelicolor and Streptomyces lividans. This work provides evidence that bacterial chemical repertoire is underexploited, as well as an approach to accelerate the discovery of novel antibiotics from bacterial genomes.

On the Risks of Phylogeny-Based Strain Prioritization for Drug Discovery: Streptomyces lunaelactis as a Case Study

Biomolecules, 2020, Vol 10 (7), pp. 1027

doi:10.3390/biom10071027

Cited By ~ 1

Author(s): Loïc Martinet; Aymeric Naômé; Dominique Baiwir; Edwin De Pauw; Gabriel Mazzucchelli; ...

Keyword(s): Natural Products; Drug Discovery; Reference Strain; Pattern Analysis; Genome Mining; Gene Clusters; Biosynthetic Gene; Biosynthetic Gene Clusters; Representative Strain

Strain prioritization for drug discovery aims at excluding redundant strains of a collection in order to limit the repetitive identification of the same molecules. In this work, we wanted to estimate what may remain unexploited in terms of the amount, diversity, and novelty of compounds if the search is focused on a single representative strain of a species, taking Streptomyces lunaelactis as a model. For this purpose, we selected 18 S. lunaelactis strains taxonomically clustered with the archetype strain S. lunaelactis MM109T. Genome mining of all S. lunaelactis strains isolated from the same cave revealed that 54% of the 42 biosynthetic gene clusters (BGCs) are strain specific, and five BGCs are not present in the reference strain MM109T. In addition, even when a BGC is conserved in all strains, such as the bag/fev cluster involved in bagremycin and ferroverdin production, the compounds produced differ greatly between the strains, and previously unreported compounds are not produced by the archetype MM109T. Moreover, metabolomic pattern analysis uncovered important profile heterogeneity, confirming that identical BGC predisposition between two strains does not automatically imply chemical uniformity. In conclusion, trying to avoid strain redundancy based on phylogeny and genome mining information alone can compromise the discovery of new natural products and might prevent the exploitation of the best naturally engineered producers of specific molecules.

Mining and unearthing hidden biosynthetic potential

Nature Communications, 2021, Vol 12 (1)

doi:10.1038/s41467-021-24133-5

Author(s): Kirstin Scherlach; Christian Hertweck

Keyword(s): Small Molecules; Genome Mining; Gene Clusters; Ecological Interactions; Biosynthetic Gene; Biosynthetic Gene Clusters; Metabolic Potential; Analytical Tools; Drug Leads; Chemical Mediators

Genetically encoded small molecules (secondary metabolites) play eminent roles in ecological interactions, as pathogenicity factors and as drug leads. Yet, these chemical mediators often evade detection, and the discovery of novel entities is hampered by low production and high rediscovery rates. These limitations may be addressed by genome mining for biosynthetic gene clusters, thereby unveiling cryptic metabolic potential. The development of sophisticated data mining methods and genetic and analytical tools has enabled the discovery of an impressive array of previously overlooked natural products. This review shows the newest developments in the field, highlighting compound discovery from unconventional sources and microbiomes.

Mining of Cyanobacterial Genomes Indicates That Plasmids Are Involved in the Production of Natural Products

doi:10.21203/rs.3.rs-121751/v1

2020

Author(s): Rafael Popin; Danillo Alvarenga; Raquel Castelo-Branco; David Fewer; Kaarina Sivonen

Keyword(s): Natural Products; Natural Product; Biological Activities; Gene Clusters; Biosynthetic Pathways; Biosynthetic Gene; Biosynthetic Gene Clusters; Chemical Structures; Wide Range; Genes Encoding

Background: Microbial natural products have unique chemical structures and diverse biological activities. Cyanobacteria commonly possess a wide range of biosynthetic gene clusters to produce natural products. Several studies have mapped the distribution of natural product biosynthetic gene clusters in cyanobacterial genomes. However, little attention has been paid to natural product biosynthesis in plasmids. Some genes encoding cyanobacterial natural product biosynthetic pathways are believed to be dispersed by plasmids through horizontal gene transfer. Thus, we examined complete cyanobacterial genomes to assess if plasmids are involved in the production and dissemination of natural products by cyanobacteria.

Results: The 185 analyzed genomes possessed 1 to 42 gene clusters and an average of 10. In total, 1816 biosynthetic gene clusters were found. Approximately 95% of these clusters were present in chromosomes. The remaining 5% were present in plasmids, from which homologs of the biosynthetic pathways for aeruginosin, anabaenopeptin, ambiguine, cryptophycin, hassallidin, geosmin, and microcystin were manually curated. The cryptophycin pathway was previously described as active, while the other gene clusters include all genes for biosynthesis. Approximately 12% of the 424 analyzed cyanobacterial plasmids contained homologs of genes involved in conjugation. Large plasmids, previously named “chromids”, were also observed to be widespread in cyanobacteria. Sixteen cryptic natural product biosynthetic gene clusters and geosmin biosynthetic gene clusters were located in those mobile plasmids.

Conclusion: Homologues of genes involved in the production of toxins, protease inhibitors, odorous compounds, antimicrobials, antitumorals, and other unidentified natural products are located in cyanobacterial plasmids. Some of these plasmids are predicted to be conjugative. The present study provides in silico evidence that plasmids are involved in the distribution of natural product biosynthetic pathways in cyanobacteria.

Synthetic Biology Advanced Natural Product Discovery

Metabolites, 2021, Vol 11 (11), pp. 785

doi:10.3390/metabo11110785

Author(s): Junyang Wang; Jens Nielsen; Zihe Liu

Keyword(s): Natural Products; Synthetic Biology; Natural Product; Rapid Development; Genome Mining; Gene Clusters; Future Research; Biosynthetic Gene; Biosynthetic Gene Clusters; Natural Product Discovery

A wide variety of bacteria, fungi and plants can produce bioactive secondary metabolites, which are often referred to as natural products. With the rapid development of DNA sequencing technology and bioinformatics, a large number of putative biosynthetic gene clusters have been reported. However, only a limited number of natural products have been discovered, as most biosynthetic gene clusters are not expressed or are expressed at extremely low levels under conventional laboratory conditions. With the rapid development of synthetic biology, advanced genome mining and engineering strategies have been reported, and they provide new opportunities for discovery of natural products. This review discusses advances in recent years that can accelerate the design, build, test, and learn (DBTL) cycle of natural product discovery, and outlines trends and key challenges for future research.

Identification of a Tambjamine Gene Cluster in Streptomyces Reveals Convergent Evolution of the Biosynthetic Pathway

doi:10.26434/chemrxiv.12899264

2020

Author(s): Neil L Grenade; Dragos S. Chiriac; Graeme W. Howe; Avena Ross

Keyword(s): Natural Products; De Novo; Sequence Similarity; Genome Mining; Gene Clusters; Primary Metabolism; Biosynthetic Gene Clusters; Microbial Genomes; Sequence Similarity Networks; Related Compounds

Bacterial natural products are an immensely valuable source of therapeutics. As modern DNA sequencing efforts provide increasing numbers of microbial genomes, it is clear that the molecules produced by most natural product biosynthetic gene clusters (BGCs) remain unknown. Genome mining makes use of bioinformatic techniques to elucidate the natural products produced by these “orphan” BGCs. Here, we report the use of sequence similarity networks (SSNs) and genome neighborhood networks (GNNs) to identify an orphan BGC that is responsible for the production of the antitumor tambjamine BE-18591 in Streptomyces albus NRRL B-2362. Although BE-18591 is a close structural analogue of tambjamine YP1 produced by Pseudoalteromonas tunicata, the biosynthetic routes to produce these molecules differ significantly. Notably, the C12-alkylamine tail that is appended onto the bipyrrole core of tambjamine YP1 is derived from fatty acids siphoned from the primary metabolism of the pseudoalteromonad, whilst the S. albus NRRL B-2362 BGC encodes a dedicated system for the de novo biosynthesis of the alkylamine portion of tambjamine BE-18591. These remarkably different biosynthetic strategies represent a striking example of convergent BGC evolution, with selective pressure for the production of tambjamines seemingly leading to the emergence of separate biosynthetic pathways in pseudoalteromonads and streptomycetes that ultimately produce closely related compounds.

An interpreted atlas of biosynthetic gene clusters from 1,000 fungal genomes

Proceedings of the National Academy of Sciences, 2021, Vol 118 (19), pp. e2020230118

doi:10.1073/pnas.2020230118

Author(s): Matthew T. Robey; Lindsay K. Caesar; Milton T. Drott; Nancy P. Keller; Neil L. Kelleher

Keyword(s): Natural Products; Chemical Space; Genome Mining; Fold Increase; Gene Clusters; Biosynthetic Gene; Biosynthetic Gene Clusters; Automated Annotation; Fungal Genomes; Species Specific

Fungi are prolific producers of natural products, compounds which have had a large societal impact as pharmaceuticals, mycotoxins, and agrochemicals. Despite the availability of over 1,000 fungal genomes and several decades of compound discovery efforts from fungi, the biosynthetic gene clusters (BGCs) encoded by these genomes and the associated chemical space have yet to be analyzed systematically. Here, we provide detailed annotation and analyses of fungal biosynthetic and chemical space to enable genome mining and discovery of fungal natural products. Using 1,037 genomes from species across the fungal kingdom (e.g., Ascomycota, Basidiomycota, and non-Dikarya taxa), 36,399 predicted BGCs were organized into a network of 12,067 gene cluster families (GCFs). Anchoring these GCFs with reference BGCs enabled automated annotation of 2,026 BGCs with predicted metabolite scaffolds. We performed parallel analyses of the chemical repertoire of fungi, organizing 15,213 fungal compounds into 2,945 molecular families (MFs). The taxonomic landscape of fungal GCFs is largely species specific, though select families such as the equisetin GCF are present across vast phylogenetic distances with parallel diversifications in the GCF and MF. We compare these fungal datasets with a set of 5,453 bacterial genomes and their BGCs and 9,382 bacterial compounds, revealing dramatic differences between bacterial and fungal biosynthetic logic and chemical space. These genomics and cheminformatics analyses reveal the large extent to which fungal and bacterial sources represent distinct compound reservoirs. With a >10-fold increase in the number of interpreted strains and annotated BGCs, this work better regularizes the biosynthetic potential of fungi for rational compound discovery.

An Interpreted Atlas of Biosynthetic Gene Clusters from 1000 Fungal Genomes

doi:10.1101/2020.09.21.307157

2020

Author(s): Matthew T. Robey; Lindsay K. Caesar; Milton T. Drott; Nancy P. Keller; Neil L. Kelleher

Keyword(s): Natural Products; Large Scale; Ad Hoc; Chemical Space; Genome Mining; Fold Increase; Gene Clusters; Biosynthetic Gene; Biosynthetic Gene Clusters; Fungal Genomes

Fungi are prolific producers of natural products, compounds which have had a large societal impact as pharmaceuticals, mycotoxins, and agrochemicals. Despite the availability of over 1000 fungal genomes and several decades of compound discovery efforts from fungi, the biosynthetic gene clusters (BGCs) encoded by these genomes and the associated chemical space have yet to be analyzed systematically. Here we provide detailed annotation and analyses of fungal biosynthetic and chemical space to enable genome mining and discovery of fungal natural products. Using 1037 genomes from species across the fungal kingdom (e.g., Ascomycota, Basidiomycota, and non-Dikarya taxa), 36,399 predicted BGCs were organized into a network of 12,067 gene cluster families (GCFs). Anchoring these GCFs with reference BGCs enabled automated annotation of 2,026 BGCs with predicted metabolite scaffolds. We performed parallel analyses of the chemical repertoire of Fungi, organizing 15,213 fungal compounds into 2,945 molecular families (MFs). The taxonomic landscape of fungal GCFs is largely species-specific, though select families such as the equisetin GCF are present across vast phylogenetic distances with parallel diversifications in the GCF and MF. We compare these fungal datasets with a set of 5,453 bacterial genomes and their BGCs and 9,382 bacterial compounds, revealing dramatic differences between bacterial and fungal biosynthetic logic and chemical space. These genomics and cheminformatics analyses reveal the large extent to which fungal and bacterial sources represent distinct compound reservoirs. With a >10-fold increase in the number of interpreted strains and annotated BGCs, this work better regularizes the biosynthetic potential of fungi for rational compound discovery.

Significance Statement: Fungi represent an underexploited resource for new compounds with applications in the pharmaceutical and agriscience industries. Despite the availability of >1000 fungal genomes, our knowledge of the biosynthetic space encoded by these genomes is limited and ad hoc. We present results from systematically organizing the biosynthetic content of 1037 fungal genomes, providing a resource for data-driven genome mining and large-scale comparison of the genetic and molecular repertoires produced in fungi, and compare them to those present in bacteria.
