BGCFlow: systematic pangenome workflow for the analysis of biosynthetic gene clusters across large genomic datasets.
Nuhamunada, Matin; Mohite, Omkar S; Phaneuf, Patrick V; Palsson, Bernhard O; Weber, Tilmann.
Affiliation
  • Nuhamunada M; The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby 2800, Denmark.
  • Mohite OS; The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby 2800, Denmark.
  • Phaneuf PV; The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby 2800, Denmark.
  • Palsson BO; The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby 2800, Denmark.
  • Weber T; Department of Bioengineering, University of California San Diego, La Jolla, CA 92093, USA.
Nucleic Acids Res; 52(10): 5478-5495, 2024 Jun 10.
Article in En | MEDLINE | ID: mdl-38686794
ABSTRACT
Genome mining is revolutionizing natural product discovery. The rapid increase in available genomes demands comprehensive computational platforms that can effectively extract the biosynthetic knowledge encoded across bacterial pangenomes. Here, we present BGCFlow, a novel systematic workflow integrating analytics for large-scale genome mining of bacterial pangenomes. BGCFlow incorporates several genome analytics and mining tools grouped into five common stages of analysis: (i) data selection, (ii) functional annotation, (iii) phylogenetic analysis, (iv) genome mining, and (v) comparative analysis. Furthermore, BGCFlow provides easy configuration of different projects, parallel distribution, scheduled job monitoring, an interactive database to visualize tables, exploratory Jupyter Notebooks, and customized reports. We demonstrate the application of BGCFlow by investigating the phylogenetic distribution of various biosynthetic gene clusters (BGCs) detected across 42 genomes of the Saccharopolyspora genus, which is known to produce industrially important secondary/specialized metabolites. The BGCFlow-guided analysis predicted more accurate dereplication of BGCs and guided the targeted comparative analysis of selected ribosomally synthesized and post-translationally modified peptides (RiPPs). The scalable, interoperable, adaptable, re-entrant, and reproducible nature of BGCFlow will provide an effective novel way to extract biosynthetic knowledge from the ever-growing genomic datasets of biotechnologically relevant bacterial species.
Subjects

Full text: 1 Database: MEDLINE Main subject: Multigene Family / Genomics / Biosynthetic Pathways / Workflow Language: En Publication year: 2024 Document type: Article