ABSTRACT
The human microbiome has emerged as a rich source of diverse and bioactive natural products, harboring immense potential for therapeutic applications. To facilitate systematic exploration and analysis of its biosynthetic landscape, we present ABC-HuMi: the Atlas of Biosynthetic Gene Clusters (BGCs) in the Human Microbiome. ABC-HuMi integrates data from major human microbiome sequence databases and provides an expansive repository of BGCs compared to the limited coverage offered by existing resources. Employing state-of-the-art BGC prediction and analysis tools, our database ensures accurate annotation and enhanced prediction capabilities. ABC-HuMi empowers researchers with advanced browsing, filtering, and search functionality, enabling efficient exploration of the resource. At present, ABC-HuMi boasts a catalog of 19 218 representative BGCs derived from the human gut, oral, skin, respiratory and urogenital systems. By capturing the intricate biosynthetic potential across diverse human body sites, our database fosters profound insights into the molecular repertoire encoded within the human microbiome and offers a comprehensive resource for the discovery and characterization of novel bioactive compounds. The database is freely accessible at https://www.ccb.uni-saarland.de/abc_humi/.
Subject(s)
Biosynthetic Pathways , Databases, Genetic , Microbiota , Multigene Family , Humans , Biosynthetic Pathways/genetics , Computational Biology/instrumentation , Internet , Microbiota/genetics , Multigene Family/genetics , Metagenome/genetics
ABSTRACT
Secondary metabolites are compounds that are not essential for an organism's development but provide significant ecological and physiological benefits. These compounds have applications in medicine, biotechnology and agriculture. Their production is encoded in biosynthetic gene clusters (BGCs), groups of genes collectively directing their biosynthesis. The advent of metagenomics has allowed researchers to study BGCs directly from environmental samples, identifying numerous previously unknown BGCs encoding unprecedented chemistry. Here, we present the BGC Atlas (https://bgc-atlas.cs.uni-tuebingen.de), a web resource that facilitates the exploration and analysis of BGC diversity in metagenomes. The BGC Atlas identifies and clusters BGCs from publicly available datasets, offering a centralized database and a web interface for metadata-aware exploration of BGCs and gene cluster families (GCFs). We analyzed over 35 000 datasets from MGnify, identifying nearly 1.8 million BGCs, which were clustered into GCFs. The analysis showed that ribosomally synthesized and post-translationally modified peptides are the most abundant compound class, with most GCFs exhibiting high environmental specificity. We believe that our tool will enable researchers to easily explore and analyze the BGC diversity in environmental samples, significantly enhancing our understanding of bacterial secondary metabolites, and promote the identification of ecological and evolutionary factors shaping the biosynthetic potential of microbial communities.
ABSTRACT
Evaluating metagenomic software is key for optimizing metagenome interpretation and is the focus of the Initiative for the Critical Assessment of Metagenome Interpretation (CAMI). The CAMI II challenge engaged the community to assess methods on realistic and complex datasets with long- and short-read sequences, created computationally from around 1,700 new and known genomes, as well as 600 new plasmids and viruses. Here we analyze 5,002 results by 76 program versions. Substantial improvements were seen in assembly, some due to long-read data. Related strains were still challenging for assembly and genome recovery through binning, as was assembly quality for the latter. Profilers markedly matured, with taxon profilers and binners excelling at higher bacterial ranks, but underperforming for viruses and Archaea. Clinical pathogen detection results revealed a need to improve reproducibility. Runtime and memory usage analyses identified efficient programs, including top performers with other metrics. The results identify challenges and guide researchers in selecting methods for analyses.
Subject(s)
Metagenome , Metagenomics , Archaea/genetics , Metagenomics/methods , Reproducibility of Results , Sequence Analysis, DNA , Software
ABSTRACT
Selecting a proper genome assembly is key for downstream analysis in genomics studies. However, the availability of many genome assembly tools and the huge variety of their running parameters make this task challenging. The existing online evaluation tools are limited to specific taxa or provide just a one-sided view of the assembly quality. We present WebQUAST, a web server for multifaceted quality assessment and comparison of genome assemblies based on the state-of-the-art QUAST tool. The server is freely available at https://www.ccb.uni-saarland.de/quast/. WebQUAST can handle an unlimited number of genome assemblies and evaluate them against a user-provided or pre-loaded reference genome or in a completely reference-free fashion. We demonstrate key WebQUAST features in three common evaluation scenarios: assembly of an unknown species, a model organism, and a close variant of it.
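As an illustration of the kind of reference-free contiguity metric reported in assembly evaluation, the following minimal Python sketch computes N50 from a list of contig lengths. The function name `n50` and the example lengths are hypothetical and chosen for illustration only; this is not WebQUAST's code.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list).

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating covered length.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


if __name__ == "__main__":
    # Hypothetical contig lengths, for illustration only.
    example = [100, 80, 50, 30, 20, 20]
    print(n50(example))
```

A single number like N50 captures only contiguity, which is why a multifaceted assessment combines it with other statistics.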
Subject(s)
Genomics , Software , Genome , High-Throughput Nucleotide Sequencing , Sequence Analysis, DNA , Internet
ABSTRACT
The lymphatic system is critical in fluid balance homeostasis. Yet, until recently, lymphatic imaging has been outside of mainstream medicine due to a lack of robust imaging and interventional options. However, during the last 20 years, both clinical lymphatic imaging and interventions have shown dramatic advancement. The key to imaging advancement has been the interstitial delivery of contrast agents through lymphatic-rich tissues. These techniques include intranodal lymphangiography and dynamic contrast-enhanced MR lymphangiography. These methods provide the ability to image and recognize lymphatic anatomy and pathologic conditions. Percutaneous thoracic duct catheterization and embolization became the first widely accepted interventional technique for the management of chyle leaks. Advances in interstitial lymphatic embolization, as well as liver and mesenteric lymphatic interventions, have broadened the scope of possible lymphatic interventions. Also, recent techniques of lymphatic decompression allow for the treatment of a variety of lymphatic disorders. Finally, immunologic studies of central lymphatic fluid reveal the potential of lymphatic interventions on immunity. These advances herald an exciting new chapter for lymphatic imaging and interventions in the coming years.
Subject(s)
Embolization, Therapeutic , Lymphatic Vessels , Humans , Contrast Media , Magnetic Resonance Imaging/methods , Lymphatic System , Lymphography/methods , Embolization, Therapeutic/methods
ABSTRACT
Long-read sequencing technologies have substantially improved the assemblies of many isolate bacterial genomes as compared to fragmented short-read assemblies. However, assembling complex metagenomic datasets remains difficult even for state-of-the-art long-read assemblers. Here we present metaFlye, which addresses important long-read metagenomic assembly challenges, such as uneven bacterial composition and intra-species heterogeneity. First, we benchmarked metaFlye using simulated and mock bacterial communities and show that it consistently produces assemblies with better completeness and contiguity than state-of-the-art long-read assemblers. Second, we performed long-read sequencing of the sheep microbiome and applied metaFlye to reconstruct 63 complete or nearly complete bacterial genomes within single contigs. Finally, we show that long-read assembly of human microbiomes enables the discovery of full-length biosynthetic gene clusters that encode biomedically important natural products.
Subject(s)
Genome, Bacterial/genetics , Genome, Human/genetics , Metagenome/genetics , Metagenomics/methods , Microbiota/genetics , Algorithms , Animals , Benchmarking , Gastrointestinal Microbiome/genetics , Humans , Sequence Analysis, DNA/methods , Sheep , Software , Species Specificity
ABSTRACT
Molecular networking has become a key method to visualize and annotate the chemical space in non-targeted mass spectrometry data. We present feature-based molecular networking (FBMN) as an analysis method in the Global Natural Products Social Molecular Networking (GNPS) infrastructure that builds on chromatographic feature detection and alignment tools. FBMN enables quantitative analysis and resolution of isomers, including from ion mobility spectrometry.
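Molecular networks connect spectra whose fragmentation patterns are similar. The sketch below shows a plain cosine score between two binned spectra in Python. It is a simplified illustration, not the scoring used by GNPS or FBMN; the function name `cosine_score`, the `bin_width` parameter, and the example peaks are all assumptions.

```python
import math


def cosine_score(spec_a, spec_b, bin_width=1.0):
    """Cosine similarity between two spectra given as (m/z, intensity) pairs.

    Peaks are grouped into fixed-width m/z bins (an assumption made for
    simplicity) and the two binned intensity vectors are compared.
    """
    def binned(spec):
        vec = {}
        for mz, intensity in spec:
            key = int(mz // bin_width)
            vec[key] = vec.get(key, 0.0) + intensity
        return vec

    a, b = binned(spec_a), binned(spec_b)
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical peak lists, for illustration only.
    s1 = [(100.1, 10.0), (150.2, 5.0), (200.3, 1.0)]
    s2 = [(100.4, 8.0), (150.6, 6.0), (250.0, 2.0)]
    print(round(cosine_score(s1, s2), 3))
```

In a network, pairs of spectra scoring above a chosen threshold would be joined by an edge.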
Subject(s)
Biological Products/chemistry , Mass Spectrometry , Computational Biology/methods , Databases, Factual , Metabolomics/methods , Software
ABSTRACT
Background Transarterial embolization (TAE) is the most common treatment for hepatocellular carcinoma (HCC); however, there remain limited data describing the influence of TAE on the tumor immune microenvironment. Purpose To characterize TAE-induced modulation of the tumor immune microenvironment in a rat model of HCC and identify factors that modulate this response. Materials and Methods TAE was performed on autochthonous HCCs induced in rats with use of diethylnitrosamine. CD3, CD4, CD8, and FOXP3 lymphocytes, as well as programmed cell death protein ligand-1 (PD-L1) expression, were examined in three cohorts: tumors from rats that did not undergo embolization (control), embolized tumors (target), and nonembolized tumors from rats that had a different target tumor embolized (nontarget). Differences in immune cell recruitment associated with embolic agent type (tris-acryl gelatin microspheres [TAGM] vs hydrogel embolics) and vascular location were examined in rat and human tissues. A generalized estimating equation model and t, Mann-Whitney U, and χ2 tests were used to compare groups. Results Cirrhosis-induced alterations in CD8, CD4, and CD25/CD4 lymphocytes were partially normalized following TAE (CD8: 38.4%, CD4: 57.6%, and CD25/CD4: 21.1% in embolized liver vs 47.7% [P = .02], 47.0% [P = .01], and 34.9% [P = .03], respectively, in cirrhotic liver [36.1%, 59.6%, and 4.6% in normal liver]). Embolized tumors had a greater number of CD3, CD4, and CD8 tumor-infiltrating lymphocytes relative to controls (191.4 cells/mm2 vs 106.7 cells/mm2 [P = .03]; 127.8 cells/mm2 vs 53.8 cells/mm2 [P < .001]; and 131.4 cells/mm2 vs 78.3 cells/mm2 [P = .01]) as well as a higher PD-L1 expression score (4.1 au vs 1.9 au [P < .001]). A greater number of CD3, CD4, and CD8 lymphocytes were found near TAGM versus hydrogel embolics (4.1 vs 2.0 [P = .003]; 3.7 vs 2.0 [P = .01]; and 2.2 vs 1.1 [P = .03], respectively). 
The number of lymphocytes adjacent to embolics differed based on vascular location (17.9 extravascular CD68+ peri-TAGM cells vs 7.0 intravascular [P < .001]; 6.4 extravascular CD68+ peri-hydrogel embolic cells vs 3.4 intravascular [P < .001]). Conclusion Transarterial embolization-induced dynamic alterations of the tumor immune microenvironment are influenced by underlying liver disease, embolic agent type, and vascular location. © RSNA, 2022 Online supplemental material is available for this article. See also the editorials by Kennedy et al and by White in this issue.
Subject(s)
Carcinoma, Hepatocellular , Liver Neoplasms , Animals , B7-H1 Antigen , Carcinoma, Hepatocellular/pathology , Humans , Hydrogels , Immunity , Liver Neoplasms/pathology , Rats , Tumor Microenvironment
ABSTRACT
MOTIVATION: Extra-long tandem repeats (ETRs) are widespread in eukaryotic genomes and play an important role in fundamental cellular processes, such as chromosome segregation. Although emerging long-read technologies have enabled ETR assemblies, the accuracy of such assemblies is difficult to evaluate since there are no tools for their quality assessment. Moreover, since the mapping of error-prone reads to ETRs remains an open problem, it is not clear how to polish draft ETR assemblies. RESULTS: To address these problems, we developed the TandemTools software that includes the TandemMapper tool for mapping reads to ETRs and the TandemQUAST tool for polishing ETR assemblies and their quality assessment. We demonstrate that TandemTools not only reveals errors in ETR assemblies but also improves the recently generated assemblies of human centromeres. AVAILABILITY AND IMPLEMENTATION: https://github.com/ablab/TandemTools. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
High-Throughput Nucleotide Sequencing , Software , Eukaryota , Humans , Sequence Analysis, DNA , Tandem Repeat Sequences
ABSTRACT
Chyluria is the leakage of intestinal lymph (chyle) into the urine. Novel lymphatic intervention techniques, such as interstitial lymphatic embolization, proved to be a useful treatment option for chyluria. However, one of the challenges of this approach is the difficulty in identifying connections between the lymphatic system and kidney collecting system. Here, embolization of the abnormal lymphatic connection through retrograde thoracic duct access in 3 chyluria patients is introduced.
Subject(s)
Chyle , Embolization, Therapeutic , Enbucrilate/administration & dosage , Lymphatic Diseases/therapy , Thoracic Duct , Adult , Aged , Chyle/diagnostic imaging , Female , Humans , Lymphatic Diseases/diagnostic imaging , Lymphatic Diseases/urine , Lymphography , Magnetic Resonance Imaging , Middle Aged , Thoracic Duct/diagnostic imaging , Treatment Outcome , Ultrasonography, Interventional
ABSTRACT
Microbial natural products are important for the understanding of microbial interactions, chemical defense and communication, and have also served as an inspirational source for numerous pharmaceutical drugs. Tropical marine cyanobacteria have been highlighted as a great source of new natural products; however, few reports have appeared wherein a multi-omics approach has been used to study their natural products potential (i.e., reports are often focused on an individual natural product and its biosynthesis). This study focuses on describing the natural product genetic potential as well as the expressed natural product molecules in benthic tropical cyanobacteria. We collected from several sites around the world and sequenced the genomes of 24 tropical filamentous marine cyanobacteria. The informatics program antiSMASH was used to annotate the major classes of gene clusters. BiG-SCAPE phylum-wide analysis revealed the most promising strains for natural product discovery among these cyanobacteria. LC-MS/MS-based metabolomics highlighted the most abundant molecules and molecular classes among 10 of these marine cyanobacterial samples. We observed that despite many genes encoding peptidic natural products, peptides were not as abundant as lipids and lipopeptides in the chemical extracts. Our results highlight a number of highly interesting biosynthetic gene clusters for genome mining among these cyanobacterial samples.
Subject(s)
Biological Products/pharmacology , Cyanobacteria/chemistry , Chromatography, High Pressure Liquid , Cyanobacteria/genetics , Genome, Bacterial , Genomics , Marine Biology , Mass Spectrometry , Metabolomics , Multigene Family , Phylogeny , Tropical Climate
ABSTRACT
Methods for assembly, taxonomic profiling and binning are key to interpreting metagenome data, but a lack of consensus about benchmarking complicates performance assessment. The Critical Assessment of Metagenome Interpretation (CAMI) challenge has engaged the global developer community to benchmark their programs on highly complex and realistic data sets, generated from ~700 newly sequenced microorganisms and ~600 novel viruses and plasmids and representing common experimental setups. Assembly and genome binning programs performed well for species represented by individual genomes but were substantially affected by the presence of related strains. Taxonomic profiling and binning programs were proficient at high taxonomic ranks, with a notable performance decrease below family level. Parameter settings markedly affected performance, underscoring their importance for program reproducibility. The CAMI results highlight current challenges but also provide a roadmap for software selection to answer specific research questions.
Subject(s)
Metagenomics , Software , Algorithms , Benchmarking , Sequence Analysis, DNA
ABSTRACT
MOTIVATION: Peptidic natural products (PNPs) are considered a promising compound class that has many applications in medicine. Recently developed mass spectrometry-based pipelines are transforming PNP discovery into a high-throughput technology. However, the current computational methods for PNP identification via database search of mass spectra are still in their infancy and could be substantially improved. RESULTS: Here we present NPS, a statistical learning-based approach for scoring PNP-spectrum matches. We incorporated NPS into two leading PNP discovery tools and benchmarked them on millions of natural product mass spectra. The results demonstrate more than 45% increase in the number of identified spectra and 20% more found PNPs at a false discovery rate of 1%. AVAILABILITY AND IMPLEMENTATION: NPS is available as a command line tool and as a web application at http://cab.spbu.ru/software/NPS. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
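To make the notion of reporting identifications "at a false discovery rate of 1%" concrete, here is a minimal Python sketch of score thresholding using a target-decoy estimate. The target-decoy approach is an assumption made for illustration; the abstract does not state how NPS estimates FDR. The function name `accept_at_fdr` and the example scores are hypothetical.

```python
def accept_at_fdr(matches, max_fdr=0.01):
    """Return target matches accepted at the given estimated FDR.

    `matches` is a list of (score, is_decoy) pairs. Matches are ranked by
    score, and the FDR of each prefix is estimated as decoys / targets
    (an assumed target-decoy estimator). The longest prefix whose
    estimate stays within `max_fdr` is accepted.
    """
    ranked = sorted(matches, key=lambda m: m[0], reverse=True)
    targets = decoys = 0
    best_cut = 0
    for i, (_score, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= max_fdr:
            best_cut = i
    return [m for m in ranked[:best_cut] if not m[1]]


if __name__ == "__main__":
    # Hypothetical scored matches: (score, is_decoy).
    example = [(9.0, False), (8.0, False), (7.0, True), (6.0, False)]
    print(len(accept_at_fdr(example)))
```

Under this scheme, a scoring function that separates correct from incorrect matches more cleanly lets more target matches pass at the same FDR cutoff.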
Subject(s)
Software , Biological Products , Databases, Factual , Mass Spectrometry , Peptides
ABSTRACT
Obstructions to replication fork progression, referred to collectively as DNA replication stress, challenge genome stability. In Saccharomyces cerevisiae, cells lacking RTT107 or SLX4 show genome instability and sensitivity to DNA replication stress and are defective in the completion of DNA replication during recovery from replication stress. We demonstrate that Slx4 is recruited to chromatin behind stressed replication forks, in a region that is spatially distinct from that occupied by the replication machinery. Slx4 complex formation is nucleated by Mec1 phosphorylation of histone H2A, which is recognized by the constitutive Slx4 binding partner Rtt107. Slx4 is essential for recruiting the Mec1 activator Dpb11 behind stressed replication forks, and Slx4 complexes are important for full activity of Mec1. We propose that Slx4 complexes promote robust checkpoint signaling by Mec1 by stably recruiting Dpb11 within a discrete domain behind the replication fork, during DNA replication stress.
Subject(s)
DNA Replication , DNA, Fungal/metabolism , Endodeoxyribonucleases/metabolism , Protein Multimerization , Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae/physiology , Cell Cycle Proteins , Histones , Intracellular Signaling Peptides and Proteins , Nuclear Proteins , Protein Binding , Protein Serine-Threonine Kinases , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism
ABSTRACT
Motivation: The emergence of high-throughput sequencing technologies revolutionized genomics in the early 2000s. The next revolution came with the era of long-read sequencing. These technological advances, along with novel computational approaches, became the next step towards automatic pipelines capable of assembling nearly complete mammalian-size genomes. Results: In this manuscript, we demonstrate the performance of state-of-the-art genome assembly software on six eukaryotic datasets sequenced using different technologies. To evaluate the results, we developed QUAST-LG, a tool that compares large genomic de novo assemblies against reference sequences and computes relevant quality metrics. Since genomes generally cannot be reconstructed completely due to complex repeat patterns and low coverage regions, we introduce a concept of upper bound assembly for a given genome and set of reads, and compute theoretical limits on assembly correctness and completeness. Using QUAST-LG, we show how close the assemblies are to the theoretical optimum, and how far this optimum is from the finished reference. Availability and implementation: http://cab.spbu.ru/software/quast-lg. Supplementary information: Supplementary data are available at Bioinformatics online.
Subject(s)
High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Software , Animals , Genomics/methods , Humans , Saccharomyces cerevisiae/genetics
ABSTRACT
Peptidic natural products (PNPs) are widely used compounds that include many antibiotics and a variety of other bioactive peptides. Although recent breakthroughs in PNP discovery raised the challenge of developing new algorithms for their analysis, identification of PNPs via database search of tandem mass spectra remains an open problem. To address this problem, natural product researchers use dereplication strategies that identify known PNPs and lead to the discovery of new ones, even in cases when the reference spectra are not present in existing spectral libraries. DEREPLICATOR is a new dereplication algorithm that enables high-throughput PNP identification and that is compatible with large-scale mass-spectrometry-based screening platforms for natural product discovery. After searching nearly one hundred million tandem mass spectra in the Global Natural Products Social (GNPS) molecular networking infrastructure, DEREPLICATOR identified an order of magnitude more PNPs (and their new variants) than any previous dereplication efforts.
Subject(s)
Algorithms , Biological Products/analysis , Databases, Chemical , Drug Discovery/methods , Peptides/analysis , Tandem Mass Spectrometry
ABSTRACT
During the past years we have witnessed the rapid development of new metagenome assembly methods. Although there are many benchmark utilities designed for single-genome assemblies, there is no well-recognized evaluation and comparison tool for metagenomic-specific analogues. In this article, we present MetaQUAST, a modification of QUAST, the state-of-the-art tool for genome assembly evaluation based on alignment of contigs to a reference. MetaQUAST addresses such features of metagenome datasets as (i) unknown species content by detecting and downloading reference sequences, (ii) huge diversity by giving comprehensive reports for multiple genomes and (iii) presence of closely related species by detecting chimeric contigs. We demonstrate MetaQUAST performance by comparing several leading assemblers on one simulated and two real datasets. AVAILABILITY AND IMPLEMENTATION: http://bioinf.spbau.ru/metaquast CONTACT: aleksey.gurevich@spbu.ru SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
Metagenomics , Software , Algorithms , Genomic Structural Variation , Metagenome
ABSTRACT
Data visualization plays an increasingly important role in NGS data analysis. With advances in both sequencing and computational technologies, it has become a new bottleneck in genomics studies. Indeed, evaluation of de novo genome assemblies is one of the areas that can benefit from the visualization. However, even though multiple quality assessment methods are now available, existing visualization tools are hardly suitable for this purpose. Here, we present Icarus, a novel genome visualizer for accurate assessment and analysis of genomic draft assemblies, which is based on the tool QUAST. Icarus can be used in studies where a related reference genome is available, as well as for non-model organisms. The tool is available online and as a standalone application. AVAILABILITY AND IMPLEMENTATION: http://cab.spbu.ru/software/icarus CONTACT: aleksey.gurevich@spbu.ru SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
Genomics , Software , Genome , Sequence Analysis, DNA
ABSTRACT
MOTIVATION: Advances in Next-Generation Sequencing technologies and sample preparation recently enabled the generation of high-quality jumping libraries that have the potential to significantly improve short-read assemblies. However, assembly algorithms have to catch up with experimental innovations to benefit from them and to produce high-quality assemblies. RESULTS: We present a new algorithm that extends the recently described exSPAnder universal repeat resolution approach to enable its application to several challenging data types, including jumping libraries generated by the recently developed Illumina Nextera Mate Pair protocol. We demonstrate that, with these improvements, bacterial genomes often can be assembled in a few contigs using only a single Nextera Mate Pair library of short reads. AVAILABILITY AND IMPLEMENTATION: Described algorithms are implemented in C++ as a part of the SPAdes genome assembler, which is freely available at bioinf.spbau.ru/en/spades. CONTACT: ap@bioinf.spbau.ru SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.