Abstract
Many microorganisms produce natural products that are frequently used in the development of medicines and crop protection agents. Genome mining has evolved into a prominent method to access this potential. antiSMASH is the most popular tool for this task. Here we present version 4 of the antiSMASH database, providing biosynthetic gene clusters detected by antiSMASH 7.1 in publicly available, dereplicated, high-quality microbial genomes via an interactive graphical user interface. In version 4, the database contains 231 534 high quality BGC regions from 592 archaeal, 35 726 bacterial and 236 fungal genomes and is available at https://antismash-db.secondarymetabolites.org/.
Subjects
Biological Products, Biosynthetic Pathways, Genetic Databases, Microbial Genome, Biosynthetic Pathways/genetics, Multigene Family, Software
Abstract
Filamentous Actinobacteria, recently renamed Actinomycetia, are the most prolific source of microbial bioactive natural products. Studies on biosynthetic gene clusters benefit from or require chromosome-level assemblies. Here, we provide DNA sequences from >1000 isolates: 881 complete genomes and 153 near-complete genomes, representing 28 genera and 389 species, including 244 likely novel species. All genomes are from filamentous isolates of the class Actinomycetia from the NBC culture collection. The largest genus is Streptomyces with 886 genomes including 742 complete assemblies. We use these data to show that analysis of complete genomes can bring biological understanding not previously derived from more fragmented sequences or less systematic datasets. We document the central and structured location of core genes and distal location of specialized metabolite biosynthetic gene clusters and duplicate core genes on the linear Streptomyces chromosome, and analyze the content and length of the terminal inverted repeats which are characteristic for Streptomyces. We then analyze the diversity of trans-AT polyketide synthase biosynthetic gene clusters, which encode the machinery of a biotechnologically highly interesting compound class. These insights have both ecological and biotechnological implications in understanding the importance of high quality genomic resources and the complex role synteny plays in Actinomycetia biology.
Subjects
Actinobacteria, Bacterial Genome, Multigene Family, Polyketide Synthases, Bacterial Genome/genetics, Actinobacteria/genetics, Actinobacteria/classification, Actinobacteria/metabolism, Polyketide Synthases/genetics, Polyketide Synthases/metabolism, Streptomyces/genetics, Streptomyces/classification, Streptomyces/metabolism, Phylogeny, Genomics/methods
Abstract
Secondary metabolites are compounds that are not essential for an organism's development but provide significant ecological and physiological benefits. These compounds have applications in medicine, biotechnology and agriculture. Their production is encoded in biosynthetic gene clusters (BGCs), groups of genes collectively directing their biosynthesis. The advent of metagenomics has allowed researchers to study BGCs directly from environmental samples, identifying numerous previously unknown BGCs encoding unprecedented chemistry. Here, we present the BGC Atlas (https://bgc-atlas.cs.uni-tuebingen.de), a web resource that facilitates the exploration and analysis of BGC diversity in metagenomes. The BGC Atlas identifies and clusters BGCs from publicly available datasets, offering a centralized database and a web interface for metadata-aware exploration of BGCs and gene cluster families (GCFs). We analyzed over 35 000 datasets from MGnify, identifying nearly 1.8 million BGCs, which were clustered into GCFs. The analysis showed that ribosomally synthesized and post-translationally modified peptides are the most abundant compound class, with most GCFs exhibiting high environmental specificity. We believe that our tool will enable researchers to easily explore and analyze the BGC diversity in environmental samples, significantly enhancing our understanding of bacterial secondary metabolites and promoting the identification of ecological and evolutionary factors shaping the biosynthetic potential of microbial communities.
Abstract
Microorganisms produce small bioactive compounds as part of their secondary or specialised metabolism. Often, such metabolites have antimicrobial, anticancer, antifungal, antiviral or other bio-activities and thus play an important role for applications in medicine and agriculture. In the past decade, genome mining has become a widely-used method to explore, access, and analyse the available biodiversity of these compounds. Since 2011, the 'antibiotics and secondary metabolite analysis shell-antiSMASH' (https://antismash.secondarymetabolites.org/) has supported researchers in their microbial genome mining tasks, both as a free to use web server and as a standalone tool under an OSI-approved open source licence. It is currently the most widely used tool for detecting and characterising biosynthetic gene clusters (BGCs) in archaea, bacteria, and fungi. Here, we present the updated version 7 of antiSMASH. antiSMASH 7 increases the number of supported cluster types from 71 to 81, as well as containing improvements in the areas of chemical structure prediction, enzymatic assembly-line visualisation and gene cluster regulation.
Subjects
Computers, Software, Bacteria/genetics, Bacteria/metabolism, Archaea/genetics, Microbial Genome, Multigene Family, Secondary Metabolism/genetics
Abstract
With an ever-increasing amount of (meta)genomic data being deposited in sequence databases, (meta)genome mining for natural product biosynthetic pathways occupies a critical role in the discovery of novel pharmaceutical drugs, crop protection agents and biomaterials. The genes that encode these pathways are often organised into biosynthetic gene clusters (BGCs). In 2015, we defined the Minimum Information about a Biosynthetic Gene cluster (MIBiG): a standardised data format that describes the minimally required information to uniquely characterise a BGC. We simultaneously constructed an accompanying online database of BGCs, which has since been widely used by the community as a reference dataset for BGCs and was expanded to 2021 entries in 2019 (MIBiG 2.0). Here, we describe MIBiG 3.0, a database update comprising large-scale validation and re-annotation of existing entries and 661 new entries. Particular attention was paid to the annotation of compound structures and biological activities, as well as protein domain selectivities. Together, these new features keep the database up-to-date, and will provide new opportunities for the scientific community to use its freely available data, e.g. for the training of new machine learning models to predict sequence-structure-function relationships for diverse natural products. MIBiG 3.0 is accessible online at https://mibig.secondarymetabolites.org/.
Subjects
Genome, Genomics, Multigene Family, Biosynthetic Pathways/genetics
Abstract
Genome analysis of Kutzneria sp. CA-103260 revealed a putative lipopeptide-encoding biosynthetic gene cluster (BGC) that was cloned into a bacterial artificial chromosome (BAC) and heterologously expressed in Streptomyces coelicolor M1152. As a result, a novel cyclic lipo-tetrapeptide containing two diaminopropionic acid residues and an exotic N,N-acetonide ring, kutzneridine A (1), was isolated and structurally characterized. Evaluation of the extraction conditions and isotope-labeling chemical modifications showed that the acetonide ring originated from acetone during isolation. The BGC was analyzed in silico and a biosynthetic pathway to 1 was proposed. Kutzneridine A displayed remarkable antibacterial activity against methicillin-resistant Staphylococcus aureus and vancomycin-resistant Enterococci.
Subjects
Anti-Bacterial Agents, Lipopeptides, Methicillin-Resistant Staphylococcus aureus, Microbial Sensitivity Tests, Multigene Family, Cyclic Peptides, Anti-Bacterial Agents/pharmacology, Anti-Bacterial Agents/chemistry, Anti-Bacterial Agents/biosynthesis, Methicillin-Resistant Staphylococcus aureus/drug effects, Lipopeptides/pharmacology, Lipopeptides/chemistry, Lipopeptides/biosynthesis, Lipopeptides/isolation & purification, Molecular Structure, Cyclic Peptides/pharmacology, Cyclic Peptides/chemistry, Cyclic Peptides/biosynthesis, Vancomycin-Resistant Enterococci/drug effects, Streptomyces coelicolor/genetics, Streptomyces coelicolor/metabolism
Abstract
As a result of the continuous evolution of drug-resistant bacteria, new antibiotics are urgently needed. Encoded by biosynthetic gene clusters (BGCs), antibiotic compounds are mostly produced by bacteria. With the exponential increase in the number of publicly available, sequenced genomes and the advancements of BGC prediction tools, genome mining algorithms have uncovered millions of uncharacterized BGCs for further evaluation. Since compound identification and characterization remain bottlenecks, a major challenge is prioritizing promising BGCs. Recently, researchers adopted self-resistance-based strategies allowing them to predict the biological activities of natural products encoded by uncharacterized BGCs. Since 2017, the Antibiotic Resistant Target Seeker (ARTS) has facilitated this so-called target-directed genome mining (TDGM) approach for the prioritization of BGCs encoding potentially novel antibiotics. Here, we present the ARTS database, available at https://arts-db.ziemertlab.com/. The ARTS database provides pre-computed ARTS results for >70,000 genomes and metagenome assembled genomes in total. Advanced search queries allow users to rapidly explore the fundamental criteria of TDGM such as BGC proximity, duplication and horizontal gene transfers of essential housekeeping genes. Furthermore, the ARTS database provides results interconnected throughout the bacterial kingdom as well as links to known databases in natural product research.
Subjects
Factual Databases, Bacterial Drug Resistance/genetics, Metagenome/genetics, Software, Anti-Bacterial Agents, Bacteria/drug effects, Bacteria/genetics, Biosynthetic Pathways/drug effects, Biosynthetic Pathways/genetics, Horizontal Gene Transfer/genetics, Bacterial Genome
Abstract
Computational analysis of biosynthetic gene clusters (BGCs) has revolutionized natural product discovery by enabling the rapid investigation of secondary metabolic potential within microbial genome sequences. Grouping homologous BGCs into Gene Cluster Families (GCFs) facilitates mapping their architectural and taxonomic diversity and provides insights into the novelty of putative BGCs, through dereplication with BGCs of known function. While multiple databases exist for exploring BGCs from publicly available data, no public resources exist that focus on GCF relationships. Here, we present BiG-FAM, a database of 29,955 GCFs capturing the global diversity of 1,225,071 BGCs predicted from 209,206 publicly available microbial genomes and metagenome-assembled genomes (MAGs). The database offers rich functionalities, such as multi-criterion GCF searches, direct links to BGC databases such as antiSMASH-DB, and rapid GCF annotation of user-supplied BGCs from antiSMASH results. BiG-FAM can be accessed online at https://bigfam.bioinformatics.nl.
Subjects
Biosynthetic Pathways/genetics, Genetic Databases, Multigene Family, Clostridium/genetics, Search Engine, Streptomyces/genetics
Abstract
Microorganisms produce natural products that are frequently used in the development of antibacterial, antiviral, and anticancer drugs, pesticides, herbicides, or fungicides. In recent years, genome mining has evolved into a prominent method to access this potential. antiSMASH is one of the most popular tools for this task. Here, we present version 3 of the antiSMASH database, providing a means to access and query precomputed antiSMASH-5.2-detected biosynthetic gene clusters from representative, publicly available, high-quality microbial genomes via an interactive graphical user interface. In version 3, the database contains 147 517 high quality BGC regions from 388 archaeal, 25 236 bacterial and 177 fungal genomes and is available at https://antismash-db.secondarymetabolites.org/.
Subjects
Data Mining, Databases as Topic, Enzymes/classification, Biosynthetic Pathways/genetics, Multigene Family, Search Engine
Abstract
Many microorganisms produce natural products that form the basis of antimicrobials, antivirals, and other drugs. Genome mining is routinely used to complement screening-based workflows to discover novel natural products. Since 2011, the "antibiotics and secondary metabolite analysis shell-antiSMASH" (https://antismash.secondarymetabolites.org/) has supported researchers in their microbial genome mining tasks, both as a free-to-use web server and as a standalone tool under an OSI-approved open-source license. It is currently the most widely used tool for detecting and characterising biosynthetic gene clusters (BGCs) in bacteria and fungi. Here, we present the updated version 6 of antiSMASH. antiSMASH 6 increases the number of supported cluster types from 58 to 71, displays the modular structure of multi-modular BGCs, adds a new BGC comparison algorithm, allows for the integration of results from other prediction tools, and more effectively detects tailoring enzymes in RiPP clusters.
Subjects
Biological Products/metabolism, Microbial Genome, Software, Bacteria/genetics, Biosynthetic Pathways/genetics, Fungi/genetics, Secondary Metabolism/genetics
Abstract
Multi-drug resistant pathogens have become a major threat to human health and new antibiotics are urgently needed. Most antibiotics are derived from secondary metabolites produced by bacteria. In order to avoid suicide, these bacteria usually encode resistance genes, in some cases within the biosynthetic gene cluster (BGC) of the respective antibiotic compound. Modern genome mining tools enable researchers to computationally detect and predict BGCs that encode the biosynthesis of secondary metabolites. The major challenge now is the prioritization of the most promising BGCs encoding antibiotics with novel modes of action. A recently developed target-directed genome mining approach allows researchers to predict the mode of action of the encoded compound of an uncharacterized BGC based on the presence of resistant target genes. In 2017, we introduced the 'Antibiotic Resistant Target Seeker' (ARTS). ARTS allows for specific and efficient genome mining for antibiotics with interesting and novel targets by rapidly linking housekeeping and known resistance genes to BGC proximity, duplication and horizontal gene transfer (HGT) events. Here, we present ARTS 2.0 available at http://arts.ziemertlab.com. ARTS 2.0 now includes options for automated target directed genome mining in all bacterial taxa as well as metagenomic data. Furthermore, it enables comparison of similar BGCs from different genomes and their putative resistance genes.
Subjects
Bacterial Drug Resistance/genetics, Bacterial Genome, Software, Biosynthetic Pathways/genetics, Data Mining, Bacterial Genes, Metagenomics
Abstract
Fueled by the explosion of (meta)genomic data, genome mining of specialized metabolites has become a major technology for drug discovery and studying microbiome ecology. In these efforts, computational tools like antiSMASH have played a central role through the analysis of Biosynthetic Gene Clusters (BGCs). Thousands of candidate BGCs from microbial genomes have been identified and stored in public databases. Interpreting the function and novelty of these predicted BGCs requires comparison with a well-documented set of BGCs of known function. The MIBiG (Minimum Information about a Biosynthetic Gene Cluster) Data Standard and Repository was established in 2015 to enable curation and storage of known BGCs. Here, we present MIBiG 2.0, which encompasses major updates to the schema, the data, and the online repository itself. Over the past five years, 851 new BGCs have been added. Additionally, we performed extensive manual data curation of all entries to improve the annotation quality of our repository. We also redesigned the data schema to ensure the compliance of future annotations. Finally, we improved the user experience by adding new features such as query searches and a statistics page, and enabled direct link-outs to chemical structure databases. The repository is accessible online at https://mibig.secondarymetabolites.org/.
Subjects
Genetic Databases, Bacterial Genome, Genomics/methods, Multigene Family, Software, Biosynthetic Pathways/genetics, Molecular Sequence Annotation
Abstract
Streptomycetes serve as major producers of various pharmacologically and industrially important natural products. Although CRISPR-Cas9 systems have been developed for more robust genetic manipulations, concerns of genome instability caused by the DNA double-strand breaks (DSBs) and the toxicity of Cas9 remain. To overcome these limitations, we report the development of the DSB-free, single-nucleotide-resolution genome editing system CRISPR-BEST (CRISPR-Base Editing SysTem), which comprises a cytidine (CRISPR-cBEST) and an adenosine (CRISPR-aBEST) deaminase-based base editor. Specifically targeted by an sgRNA, CRISPR-cBEST can efficiently convert a C:G base pair to a T:A base pair and CRISPR-aBEST can convert an A:T base pair to a G:C base pair within a window of approximately 7 and 6 nucleotides, respectively. CRISPR-BEST was validated and successfully used in different Streptomyces species. In particular, in the nonmodel actinomycete Streptomyces collinus Tü365, CRISPR-cBEST efficiently and simultaneously inactivated both copies of the kirN gene, which are located in the duplicated kirromycin biosynthetic pathways, by introducing STOP codons. Generating such a knockout mutant repeatedly failed using the conventional DSB-based CRISPR-Cas9. An unbiased, genome-wide off-target evaluation indicates the high fidelity and applicability of CRISPR-BEST. Furthermore, the system supports multiplexed editing with a single plasmid by providing a Csy4-based sgRNA processing machinery. To simplify the protospacer identification process, we also updated the CRISPy-web (https://crispy.secondarymetabolites.org), and now it allows designing sgRNAs specifically for CRISPR-BEST applications.
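The STOP codon introduction described in this abstract can be illustrated with a minimal sketch. It assumes only the C-to-T conversion stated above (seen as G-to-A when the edited cytidine lies on the opposite strand) and the standard stop codons TAA, TAG and TGA. The function name `stop_codon_candidates` and the example sequence are hypothetical. The sketch does not model the sgRNA position, the PAM requirement or the editing window of approximately 7 nucleotides, and it is not the CRISPy-web implementation.

```python
# Illustrative sketch only: lists codons in a coding sequence that a
# cytidine base editor could in principle turn into a stop codon.
# Assumption: C->T on the coding strand, or G->A when the edited C is
# on the opposite strand. sgRNA placement, PAM and window are ignored.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_codon_candidates(cds):
    """Return (codon_index, codon, edited_codon) tuples for every codon
    that becomes a stop codon through C->T or G->A substitutions."""
    cds = cds.upper()
    hits = []
    usable_length = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for start in range(0, usable_length, 3):
        codon = cds[start:start + 3]
        if codon in STOP_CODONS:
            continue  # already a stop codon, nothing to introduce
        for source, target in (("C", "T"), ("G", "A")):
            positions = [p for p, base in enumerate(codon) if base == source]
            # Try every non-empty combination of editable positions.
            for mask in range(1, 2 ** len(positions)):
                edited = list(codon)
                for k, pos in enumerate(positions):
                    if (mask >> k) & 1:
                        edited[pos] = target
                edited_codon = "".join(edited)
                if edited_codon in STOP_CODONS:
                    hits.append((start // 3, codon, edited_codon))
    return hits


if __name__ == "__main__":
    # Hypothetical example sequence: ATG CAA TGG CGA TAA
    for index, codon, edited in stop_codon_candidates("ATGCAATGGCGATAA"):
        print(f"codon {index}: {codon} -> {edited}")
```

Under these assumptions only CAA, CAG, CGA and TGG can be converted into stop codons, so a real design tool would additionally have to check that such a codon falls inside the editing window of a usable protospacer.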
Subjects
CRISPR-Cas Systems, Gene Editing, Streptomyces coelicolor/genetics, Bacterial DNA/genetics, Bacterial Gene Expression Regulation, Bacterial Genome, Genome-Wide Association Study, Plasmids
Abstract
Many drugs are derived from small molecules produced by microorganisms and plants, so-called natural products. Natural products have diverse chemical structures, but the biosynthetic pathways producing those compounds are often organized as biosynthetic gene clusters (BGCs) and follow a highly conserved biosynthetic logic. This allows for the identification of core biosynthetic enzymes using genome mining strategies that are based on the sequence similarity of the involved enzymes/genes. However, mining for a variety of BGCs quickly approaches a complexity level where manual analysis is no longer possible and automated genome mining pipelines, such as the antiSMASH software, are required. In this review, we discuss the principles underlying the predictions of antiSMASH and other tools and provide practical advice for their application. Furthermore, we discuss important caveats such as rule-based BGC detection, sequence and annotation quality and cluster boundary prediction, which all have to be considered while planning for, performing and analyzing the results of genome mining studies.
Subjects
Biosynthetic Pathways/genetics, Multigene Family, Software, Biological Products/metabolism, Computational Biology/methods, Data Mining/methods, Genetic Databases, Microbial Genome, Plant Genome, Internet, Biological Models
Abstract
Secondary metabolites produced by bacteria and fungi are an important source of antimicrobials and other bioactive compounds. In recent years, genome mining has seen broad applications in identifying and characterizing new compounds as well as in metabolic engineering. Since 2011, the 'antibiotics and secondary metabolite analysis shell-antiSMASH' (https://antismash.secondarymetabolites.org) has assisted researchers in this, both as a web server and a standalone tool. It has established itself as the most widely used tool for identifying and analysing biosynthetic gene clusters (BGCs) in bacterial and fungal genome sequences. Here, we present an entirely redesigned and extended version 5 of antiSMASH. antiSMASH 5 adds detection rules for clusters encoding the biosynthesis of acyl-amino acids, β-lactones, fungal RiPPs, RaS-RiPPs, polybrominated diphenyl ethers, C-nucleosides, PPY-like ketones and lipolanthines. For type II polyketide synthase-encoding gene clusters, antiSMASH 5 now offers more detailed predictions. The HTML output visualization has been redesigned to improve the navigation and visual representation of annotations. We have again improved the runtime of analysis steps, making it possible to deliver comprehensive annotations for bacterial genomes within a few minutes. A new output file in the standard JavaScript object notation (JSON) format is aimed at downstream tools that process antiSMASH results programmatically.
Subjects
Bacterial Genome/genetics, Fungal Genome/genetics, Genomics, Software, Bacteria/genetics, Biosynthetic Pathways/genetics, Computational Biology, Data Mining, Fungi/genetics, Internet
Abstract
Natural products originating from microorganisms are frequently used in antimicrobial and anticancer drugs, pesticides, herbicides or fungicides. In recent years, the increasing availability of microbial genome data has made it possible to access the wealth of biosynthetic clusters responsible for the production of these compounds by genome mining. antiSMASH is one of the most popular tools in this field. The antiSMASH database provides pre-computed antiSMASH results for many publicly available microbial genomes and allows for advanced cross-genome searches. The current version 2 of the antiSMASH database contains annotations for 6200 full bacterial genomes and 18,576 bacterial draft genomes and is available at https://antismash-db.secondarymetabolites.org/.
Subjects
Genetic Databases, Bacterial Genome, Molecular Sequence Annotation, Secondary Metabolism/genetics, Multigene Family, Software
Abstract
When microorganisms are cultured under standard laboratory conditions, most biosynthetic gene clusters (BGCs) are not expressed, and thus their products are not produced. To explore this biosynthetic potential, we developed a novel "semi-targeted" approach focusing on activating "silent" BGCs by concurrently introducing a group of regulator genes into streptomycetes of the Tübingen strain collection. We constructed integrative plasmids containing two classes of regulatory genes under the control of the constitutive promoter ermE*p (cluster situated regulators (CSR) and Streptomyces antibiotic regulatory proteins (SARPs)). These plasmids were introduced into Streptomyces sp. TÜ17, Streptomyces sp. TÜ10 and Streptomyces sp. TÜ102. Introduction of the CSRs-plasmid into strain S. sp. TÜ17 activated the production of mayamycin A. By using the individual regulator genes, we proved that Aur1P was responsible for the activation. In strain S. sp. TÜ102, the introduction of the SARP-plasmid triggered the production of a chartreusin-like compound. Insertion of the CSRs-plasmid into strain S. sp. TÜ10 resulted in activating the warkmycin-BGC. In both recombinants, activation of the BGCs was only possible through the simultaneous expression of aur1PR3 and griR in S. sp. TÜ102 and aur1P and pntR in S. sp. TÜ10.
Subjects
Bacterial Proteins/genetics, Benz(a)Anthracenes/metabolism, Multigene Family, Recombinant Proteins/genetics, Streptomyces/genetics, Bacterial Proteins/metabolism, Benzopyrans, Bacterial Gene Expression Regulation, Glycosides/biosynthesis, Genetic Promoter Regions, Recombinant Proteins/metabolism, Streptomyces/growth & development, Streptomyces/metabolism, Transcription Factors/metabolism, Trisaccharides/biosynthesis
Abstract
Patterns in biological sequences frequently signify interesting features in the underlying molecule. Many tools exist to search for well-known patterns. Less support is available for exploratory analysis, where no well-defined patterns are known yet. PatScanUI (https://patscan.secondarymetabolites.org/) provides a highly interactive web interface to the powerful generic pattern search tool PatScan. The complex PatScan-patterns are created in a drag-and-drop aware interface allowing researchers to do rapid prototyping of the often complicated patterns useful to identifying features of interest.
Subjects
DNA Sequence Analysis/methods, Protein Sequence Analysis/methods, Software, Bacteria/genetics, Internet, Iron, Response Elements, User-Computer Interface
Abstract
Secondary metabolites produced by microorganisms are the main source of bioactive compounds that are in use as antimicrobial and anticancer drugs, fungicides, herbicides and pesticides. In the last decade, the increasing availability of microbial genomes has established genome mining as a very important method for the identification of their biosynthetic gene clusters (BGCs). One of the most popular tools for this task is antiSMASH. However, so far, antiSMASH is limited to de novo computing results for user-submitted genomes and only partially connects these with BGCs from other organisms. Therefore, we developed the antiSMASH database, a simple but highly useful new resource to browse antiSMASH-annotated BGCs in the 3907 bacterial genomes currently in the database and perform advanced search queries combining multiple search criteria. antiSMASH-DB is available at http://antismash-db.secondarymetabolites.org/.
Subjects
Biosynthetic Pathways, Factual Databases, Microbiology, Secondary Metabolism, Biosynthetic Pathways/genetics, Computational Biology/methods, Gene Expression Regulation, Post-Translational Protein Processing, Secondary Metabolism/genetics, Web Browser
Abstract
Plant specialized metabolites are chemically highly diverse, play key roles in host-microbe interactions, have important nutritional value in crops and are frequently applied as medicines. It has recently become clear that plant biosynthetic pathway-encoding genes are sometimes densely clustered in specific genomic loci: biosynthetic gene clusters (BGCs). Here, we introduce plantiSMASH, a versatile online analysis platform that automates the identification of candidate plant BGCs. Moreover, it allows integration of transcriptomic data to prioritize candidate BGCs based on the coexpression patterns of predicted biosynthetic enzyme-coding genes, and facilitates comparative genomic analysis to study the evolutionary conservation of each cluster. Applied to 48 high-quality plant genomes, plantiSMASH identifies a rich diversity of candidate plant BGCs. These results will guide further experimental exploration of the nature and dynamics of gene clustering in plant metabolism. Moreover, spurred by the continuing decrease in costs of plant genome sequencing, they will allow genome mining technologies to be applied to plant natural product discovery. The plantiSMASH web server, precalculated results and source code are freely available from http://plantismash.secondarymetabolites.org.