1.
Food Microbiol ; 121: 104520, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38637082

ABSTRACT

Sequence-based analysis of fermented foods and beverages' microbiomes offers insights into their impact on taste and consumer health. High-throughput metagenomics provide detailed taxonomic and functional community profiling, but bacterial and yeast genome reconstruction and mobile genetic elements tracking are to be improved. We established a pipeline for exploring fermented foods microbiomes using metagenomics coupled with chromosome conformation capture (Hi-C metagenomics). The approach was applied to analyze a collection of spontaneously fermented beers and ciders (n = 12). The Hi-C reads were used to reconstruct the metagenome-assembled genomes (MAGs) of bacteria and yeasts facilitating subsequent comparative genomic analysis, assembly scaffolding and exploration of "plasmid-bacteria" links. For a subset of beverages, yeasts were isolated and characterized phenotypically. The reconstructed Hi-C MAGs primarily belonged to the Lactobacillaceae family in beers, along with Acetobacteraceae and Enterobacteriaceae in ciders, exhibiting improved quality compared to conventional metagenomic MAGs. Comparative genomic analysis of Lactobacillaceae Hi-C MAGs revealed clustering by niche and suggested genetic determinants of survival and probiotic potential. For Pediococcus damnosus, Hi-C-based networks of contigs enabled linking bacteria with plasmids. Analyzing phylogeny and accessory genes in the context of known reference genomes offered insights into the niche specialization of beer lactobacilli. The subspecies-level diversity of cider Tatumella spp. was disentangled using a Hi-C-based graph. We obtained highly complete yeast Hi-C MAGs primarily represented by Brettanomyces and Saccharomyces, with Hi-C-facilitated chromosome-level genome assembly for the former. 
Utilizing Hi-C metagenomics to unravel the genomic content of individual species can provide a deeper understanding of the ecological interactions within the food microbiome, aid in bioprospecting beneficial microorganisms, and help improve both quality control and innovative fermented products.
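
The plasmid-bacteria linking mentioned in this abstract can be illustrated with a toy sketch. It assumes that Hi-C read pairs have already been mapped to contigs and summarized as link counts. The function name, the `min_links` threshold and the example data are hypothetical and are not taken from the paper's pipeline.

```python
from collections import defaultdict

def assign_plasmids_to_hosts(hic_links, contig_to_bin, plasmid_contigs, min_links=5):
    """Assign each plasmid contig to the bin (MAG) it shares the most Hi-C links with.

    hic_links: dict mapping (contig_a, contig_b) -> number of Hi-C read pairs
    contig_to_bin: dict mapping chromosomal contig -> bin name
    plasmid_contigs: set of contig names flagged as plasmid
    min_links: hypothetical noise threshold; weaker signals are ignored
    """
    links_per_bin = defaultdict(lambda: defaultdict(int))
    for (a, b), count in hic_links.items():
        # Consider both orientations of every contig pair.
        for plasmid, other in ((a, b), (b, a)):
            if plasmid in plasmid_contigs and other in contig_to_bin:
                links_per_bin[plasmid][contig_to_bin[other]] += count

    hosts = {}
    for plasmid, per_bin in links_per_bin.items():
        best_bin, best_count = max(per_bin.items(), key=lambda kv: kv[1])
        if best_count >= min_links:
            hosts[plasmid] = best_bin
    return hosts

links = {("p1", "c1"): 12, ("c2", "p1"): 9, ("p1", "c3"): 2, ("p2", "c3"): 1}
bins = {"c1": "Pediococcus_MAG", "c2": "Pediococcus_MAG", "c3": "Tatumella_MAG"}
print(assign_plasmids_to_hosts(links, bins, {"p1", "p2"}))
```

In this made-up example, plasmid contig `p1` is linked to the Pediococcus bin, while `p2` stays unassigned because its single link falls below the threshold.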


Subjects
Saccharomyces cerevisiae, Saccharomyces, Saccharomyces cerevisiae/genetics, Beer/microbiology, Bacteria/genetics, Plasmids, Saccharomyces/genetics, Metagenome, Metagenomics, Enterobacteriaceae/genetics
2.
Biology (Basel) ; 12(8), 2023 Jul 29.
Article in English | MEDLINE | ID: mdl-37626951

ABSTRACT

A recently published article in BMC Genomics by Fuentes-Trillo et al. compares approaches for assembling several norovirus samples using different tools and preprocessing strategies. The study used outdated versions of tools as well as tools that were not designed for viral assembly. To improve the suboptimal assemblies, the authors suggested various sophisticated preprocessing strategies that seem to make only minor contributions to the results. We have reproduced the analysis using state-of-the-art tools designed for viral assembly, and we demonstrate that tools from the SPAdes toolkit (rnaviralSPAdes and coronaSPAdes) allow one to assemble the samples from the original study into a single contig without any additional preprocessing.

3.
STAR Protoc ; 4(3): 102417, 2023 Sep 15.
Article in English | MEDLINE | ID: mdl-37405923

ABSTRACT

The analysis of metagenomic data obtained via high-throughput DNA sequencing is primarily carried out by a dedicated binning process that clusters contigs presumably belonging to the same species. Here, we present a protocol for improving the quality of binning using BinSPreader. We describe the steps of a typical metagenome assembly and binning workflow. We then detail binning refinement, its variants, output, and possible caveats. This protocol optimizes the process of reconstructing more complete genomes of the microorganisms that make up the metagenome. For complete details on the use and execution of this protocol, please refer to Tolstoganov et al. [1].


Subjects
Metagenome, Metagenomics, Metagenome/genetics, DNA Sequence Analysis/methods, Metagenomics/methods, Cluster Analysis, High-Throughput Nucleotide Sequencing
4.
Nucleic Acids Res ; 51(D1): D753-D759, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36477304

ABSTRACT

The MGnify platform (https://www.ebi.ac.uk/metagenomics) facilitates the assembly, analysis and archiving of microbiome-derived nucleic acid sequences. The platform provides access to taxonomic assignments and functional annotations for nearly half a million analyses covering metabarcoding, metatranscriptomic, and metagenomic datasets, which are derived from a wide range of different environments. Over the past 3 years, MGnify has not only grown in terms of the number of datasets contained but also increased the breadth of analyses provided, such as the analysis of long-read sequences. The MGnify protein database now exceeds 2.4 billion non-redundant sequences predicted from metagenomic assemblies. This collection is now organised into a relational database making it possible to understand the genomic context of the protein through navigation back to the source assembly and sample metadata, marking a major improvement. To extend beyond the functional annotations already provided in MGnify, we have applied deep learning-based annotation methods. The technology underlying MGnify's Application Programming Interface (API) and website has been upgraded, and we have enabled the ability to perform downstream analysis of the MGnify data through the introduction of a coupled Jupyter Lab environment.


Subjects
Microbiota, Sequence Analysis, Genomics/methods, Metagenome, Metagenomics/methods, Microbiota/genetics, Software, Sequence Analysis/methods
5.
Front Microbiol ; 13: 981458, 2022.
Article in English | MEDLINE | ID: mdl-36386613

ABSTRACT

While metagenome sequencing may provide insights into the genome sequences and composition of microbial communities, metatranscriptome analysis can be useful for studying the functional activity of a microbiome. RNA-Seq data make it possible to determine which genes are active in the community and how their expression levels depend on external conditions. Although the field of metatranscriptomics is relatively young, the number of projects related to metatranscriptome analysis increases every year and the scope of its applications expands. However, several problems complicate metatranscriptome analysis: the complexity of microbial communities, the wide dynamic range of transcriptome expression and, importantly, the lack of high-quality computational methods for assembling meta-RNA sequencing data. These factors reduce the contiguity and completeness of metatranscriptome assemblies and therefore affect downstream analysis. Here we present MetaGT, a pipeline for de novo assembly of metatranscriptomes, which is based on the idea of combining metatranscriptomic and metagenomic data sequenced from the same sample. MetaGT assembles metatranscriptomic contigs and fills in missing regions based on their alignments to the metagenome assembly. This approach makes it possible to overcome the described complexities, obtain complete RNA sequences, and additionally estimate their abundances. Using various publicly available real and simulated datasets, we demonstrate that MetaGT yields a significant improvement in coverage and completeness of metatranscriptome assemblies compared to existing methods that do not exploit metagenomic data. The pipeline is implemented in Nextflow and is freely available from https://github.com/ablab/metaGT.

6.
iScience ; 25(8): 104770, 2022 Aug 19.
Article in English | MEDLINE | ID: mdl-35992057

ABSTRACT

Despite recent advances in high-throughput sequencing, metagenome analysis of microbial populations remains a challenge. In particular, metagenome-assembled genomes (MAGs) are often fragmented due to interspecies repeats, uneven coverage, and varying strain abundance. MAGs are constructed via a binning process that uses features of the input data to cluster long contigs presumably belonging to the same species. In this work, we present BinSPreader, a binning refiner tool that exploits the assembly graph topology and other connectivity information to refine binning, correct binning errors, and propagate binning to shorter contigs. We show that BinSPreader can increase the completeness of the bins without sacrificing purity and can predict contigs belonging to several MAGs. BinSPreader is effective in binning shorter contigs, which often contain important conserved sequences that might be of great interest to researchers.
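
The idea of propagating bin labels along the assembly graph to shorter, unbinned contigs can be sketched as a simple majority vote over graph neighbours. This is a toy illustration under stated assumptions, not BinSPreader's actual algorithm: the function name, the voting rule and the example graph are hypothetical, and the sketch neither corrects existing labels nor assigns a contig to several bins.

```python
from collections import Counter

def propagate_bins(edges, initial_bins, max_rounds=10):
    """Spread bin labels from binned contigs to unbinned neighbours in an assembly graph.

    edges: iterable of (contig_a, contig_b) links from the assembly graph
    initial_bins: dict contig -> bin label for contigs the binner already placed
    Returns a new dict that also covers unbinned contigs reachable through edges.
    """
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    bins = dict(initial_bins)
    for _ in range(max_rounds):
        updates = {}
        for contig, adjacent in neighbours.items():
            if contig in bins:
                continue  # keep labels that are already assigned
            votes = Counter(bins[n] for n in adjacent if n in bins)
            if votes:
                # Majority vote; ties are broken alphabetically for determinism.
                updates[contig] = min(votes.items(), key=lambda kv: (-kv[1], kv[0]))[0]
        if not updates:
            break
        bins.update(updates)
    return bins

edges = [("long1", "short1"), ("short1", "short2"), ("long2", "short3"),
         ("long1", "short3"), ("long3", "short3")]
initial = {"long1": "binA", "long2": "binB", "long3": "binB"}
print(propagate_bins(edges, initial))
```

Here `short1` and `short2` inherit `binA` through their path to `long1`, while `short3` joins `binB` because two of its three binned neighbours belong to it.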

7.
Nat Methods ; 19(4): 429-440, 2022 04.
Article in English | MEDLINE | ID: mdl-35396482

ABSTRACT

Evaluating metagenomic software is key for optimizing metagenome interpretation and focus of the Initiative for the Critical Assessment of Metagenome Interpretation (CAMI). The CAMI II challenge engaged the community to assess methods on realistic and complex datasets with long- and short-read sequences, created computationally from around 1,700 new and known genomes, as well as 600 new plasmids and viruses. Here we analyze 5,002 results by 76 program versions. Substantial improvements were seen in assembly, some due to long-read data. Related strains still were challenging for assembly and genome recovery through binning, as was assembly quality for the latter. Profilers markedly matured, with taxon profilers and binners excelling at higher bacterial ranks, but underperforming for viruses and Archaea. Clinical pathogen detection results revealed a need to improve reproducibility. Runtime and memory usage analyses identified efficient programs, including top performers with other metrics. The results identify challenges and guide researchers in selecting methods for analyses.


Subjects
Metagenome, Metagenomics, Archaea/genetics, Metagenomics/methods, Reproducibility of Results, DNA Sequence Analysis, Software
8.
Nat Biotechnol ; 40(5): 711-719, 2022 05.
Article in English | MEDLINE | ID: mdl-34980911

ABSTRACT

Microbial communities might include distinct lineages of closely related organisms that complicate metagenomic assembly and prevent the generation of complete metagenome-assembled genomes (MAGs). Here we show that deep sequencing using long (HiFi) reads combined with Hi-C binning can address this challenge even for complex microbial communities. Using existing methods, we sequenced the sheep fecal metagenome and identified 428 MAGs with more than 90% completeness, including 44 MAGs in single circular contigs. To resolve closely related strains (lineages), we developed MAGPhase, which separates lineages of related organisms by discriminating variant haplotypes across hundreds of kilobases of genomic sequence. MAGPhase identified 220 lineage-resolved MAGs in our dataset. The ability to resolve closely related microbes in complex microbial communities improves the identification of biosynthetic gene clusters and the precision of assigning mobile genetic elements to host genomes. We identified 1,400 complete and 350 partial biosynthetic gene clusters, most of which are novel, as well as 424 (298) potential host-viral (host-plasmid) associations using Hi-C data.


Subjects
Metagenome, Microbiota, Animals, Feces, Metagenome/genetics, Metagenomics, Microbiota/genetics, DNA Sequence Analysis, Sheep
9.
Nature ; 602(7895): 142-147, 2022 02.
Article in English | MEDLINE | ID: mdl-35082445

ABSTRACT

Public databases contain a planetary collection of nucleic acid sequences, but their systematic exploration has been inhibited by a lack of efficient methods for searching this corpus, which (at the time of writing) exceeds 20 petabases and is growing exponentially [1]. Here we developed a cloud computing infrastructure, Serratus, to enable ultra-high-throughput sequence alignment at the petabase scale. We searched 5.7 million biologically diverse samples (10.2 petabases) for the hallmark gene RNA-dependent RNA polymerase and identified well over 10^5 novel RNA viruses, thereby expanding the number of known species by roughly an order of magnitude. We characterized novel viruses related to coronaviruses, hepatitis delta virus and huge phages, respectively, and analysed their environmental reservoirs. To catalyse the ongoing revolution of viral discovery, we established a free and comprehensive database of these data and tools. Expanding the known sequence diversity of viruses can reveal the evolutionary origins of emerging pathogens and improve pathogen surveillance for the anticipation and mitigation of future pandemics.


Subjects
Cloud Computing, Genetic Databases, RNA Viruses/genetics, RNA Viruses/isolation & purification, Sequence Alignment/methods, Virology/methods, Virome/genetics, Animals, Archives, Bacteriophages/enzymology, Bacteriophages/genetics, Biodiversity, Coronavirus/classification, Coronavirus/enzymology, Coronavirus/genetics, Molecular Evolution, Hepatitis Delta Virus/enzymology, Hepatitis Delta Virus/genetics, Humans, Molecular Models, RNA Viruses/classification, RNA Viruses/enzymology, RNA-Dependent RNA Polymerase/chemistry, RNA-Dependent RNA Polymerase/genetics, Software
10.
Front Microbiol ; 12: 714836, 2021.
Article in English | MEDLINE | ID: mdl-34690959

ABSTRACT

The lack of control over the usage of antibiotics leads to the propagation of microbial strains that are resistant to many antimicrobial substances. This situation is an emerging threat to public health, and therefore the development of approaches to infer the presence of resistant strains is a topic of high importance. Resistome construction for an isolate microbial species can be considered a solved task, with many state-of-the-art tools available. However, the analysis of the resistome of a microbial community (metagenome) poses many challenges that influence the accuracy and precision of the predictions. For example, the prediction sensitivity of the existing tools suffers from fragmented metagenomic assemblies caused by interspecies repeats: it is usually impossible to recover conserved parts of antibiotic resistance genes that are shared by different species because of, e.g., horizontal gene transfer or location on a plasmid. Recent advances in the development of new graph-based methods open a way to recover gene sequences of interest directly from the assembly graph without relying on a cumbersome and incomplete metagenomic assembly. We present GraphAMR, a novel computational pipeline for the recovery and identification of antibiotic resistance genes from fragmented metagenomic assemblies. The pipeline involves the alignment of profile hidden Markov models of target genes directly to the assembly graph of a metagenome, with further dereplication and annotation of the results using state-of-the-art tools. We show a significant improvement in the quality of the results obtained (both in terms of accuracy and completeness) compared to the analysis of the output of an ordinary metagenomic assembly as well as different read mapping approaches. The pipeline is freely available from https://github.com/ablab/graphamr.

11.
Metabolites ; 11(10), 2021 Oct 11.
Article in English | MEDLINE | ID: mdl-34677408

ABSTRACT

Microbial natural products are a major source of bioactive compounds for drug discovery. Among these molecules, nonribosomal peptides (NRPs) represent a diverse class of natural products that include antibiotics, immunosuppressants, and anticancer agents. Recent breakthroughs in natural product discovery have revealed the chemical structure of several thousand NRPs. However, biosynthetic gene clusters (BGCs) encoding them are known only for a few hundred compounds. Here, we developed Nerpa, a computational method for the high-throughput discovery of novel BGCs responsible for producing known NRPs. After searching 13,399 representative bacterial genomes from the RefSeq repository against 8368 known NRPs, Nerpa linked 117 BGCs to their products. We further experimentally validated the predicted BGC of ngercheumicin from Photobacterium galatheae via mass spectrometry. Nerpa supports searching new genomes against thousands of known NRP structures, and novel molecular structures against tens of thousands of bacterial genomes. The availability of these tools can enhance our understanding of NRP synthesis and the function of their biosynthetic enzymes.

12.
Bioinformatics ; 38(1): 1-8, 2021 12 22.
Article in English | MEDLINE | ID: mdl-34406356

ABSTRACT

MOTIVATION: The COVID-19 pandemic has ignited a broad scientific interest in viral research in general and coronavirus research in particular. The identification and characterization of viral species in natural reservoirs typically involves de novo assembly. However, existing genome, metagenome and transcriptome assemblers are often unable to assemble many viruses (including coronaviruses) into a single contig. Coverage variation between datasets and within a dataset, the presence of close strains, splice variants and contamination set a high bar for assemblers to process viral datasets with diverse properties. RESULTS: We developed coronaSPAdes, a novel assembler for RNA viral species recovery in general and coronaviruses in particular. coronaSPAdes leverages knowledge about viral genome structures to improve assembly, extending ideas initially implemented in biosyntheticSPAdes. We have shown that coronaSPAdes outperforms existing SPAdes modes and other popular short-read metagenome and viral assemblers in the recovery of full-length RNA viral genomes. AVAILABILITY AND IMPLEMENTATION: The coronaSPAdes version used in this article is part of the SPAdes 3.15 release and is freely available at http://cab.spbu.ru/software/spades. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
COVID-19, Software, Humans, Pandemics, Metagenome, Viral Genome
13.
Front Microbiol ; 12: 613791, 2021.
Article in English | MEDLINE | ID: mdl-33833738

ABSTRACT

Metagenomics is a segment of conventional microbial genomics dedicated to the sequencing and analysis of the combined genomic DNA of entire environmental samples. The most critical step of metagenomic data analysis is the reconstruction of individual genes and genomes of the microorganisms in the communities using metagenomic assemblers, computational programs that put together the small fragments of sequenced DNA generated by sequencing instruments. Here, we describe the challenges of metagenomic assembly and a wide spectrum of applications in which metagenomic assemblies were used to better understand the ecology and evolution of microbial ecosystems, and we present one of the most efficient microbial assemblers, SPAdes, which was upgraded to become applicable to metagenomics.

14.
Front Microbiol ; 12: 770323, 2021.
Article in English | MEDLINE | ID: mdl-35185811

ABSTRACT

Gut microbiome in critically ill patients shows profound dysbiosis. The most vulnerable is the subgroup of chronically critically ill (CCI) patients - those suffering from long-term dependence on support systems in intensive care units. It is important to investigate their microbiome as a potential reservoir of opportunistic taxa causing co-infections and a morbidity factor. We explored dynamics of microbiome composition in the CCI patients by combining "shotgun" metagenomics with chromosome conformation capture (Hi-C). Stool samples were collected at 2 time points from 2 patients with severe brain injury with different outcomes within a 1-2-week interval. The metagenome-assembled genomes (MAGs) were reconstructed based on the Hi-C data using a novel hicSPAdes method (along with the bin3c method for comparison), as well as independently of the Hi-C using MetaBAT2. The resistomes of the samples were derived using a novel assembly graph-based approach. Links of bacteria to antibiotic resistance genes, plasmids and viruses were analyzed using Hi-C-based networks. The gut community structure was enriched in opportunistic microorganisms. The binning using hicSPAdes was superior to the conventional WGS-based binning as well as to the bin3c in terms of the number, completeness and contamination of the reconstructed MAGs. Using Klebsiella pneumoniae as an example, we showed how chromosome conformation capture can aid comparative genomic analysis of clinically important pathogens. Diverse associations of resistome with antimicrobial therapy from the level of assembly graphs to gene content were discovered. Analysis of Hi-C networks suggested multiple "host-plasmid" and "host-phage" links. Hi-C metagenomics is a promising technique for investigating clinical microbiome samples. It provides a community composition profile with increased details on bacterial gene content and mobile genetic elements compared to conventional metagenomics. 
The ability of Hi-C binning to encompass the MAG's plasmid content facilitates metagenomic evaluation of virulence and drug resistance dynamics in clinically relevant opportunistic pathogens. These findings will help to identify the targets for developing cost-effective and rapid tests for assessing microbiome-related health risks.

15.
BMC Bioinformatics ; 21(1): 362, 2020 Aug 19.
Article in English | MEDLINE | ID: mdl-32814545

ABSTRACT

An amendment to this paper has been published and can be accessed via the original article.

16.
BMC Bioinformatics ; 21(Suppl 12): 303, 2020 Jul 24.
Article in English | MEDLINE | ID: mdl-32703166

ABSTRACT

BACKGROUND: Illumina paired-end reads are often used for 16S analysis in metagenomic studies. Since the DNA fragment size is usually smaller than the sum of the lengths of the paired reads, reads can be merged for downstream analysis. Despite the development of several tools for merging paired-end reads, poor quality at the 3' ends within the overlapping region prevents accurate merging of a significant portion of read pairs. Recently, CD-HIT-OTU-MiSeq was presented as a new approach for 16S analysis using paired-end reads; it avoids the read-merging process completely by clustering the paired reads separately. CD-HIT-OTU-MiSeq is a set of tools that are meant to be launched successively by auxiliary shell scripts. This launch mode is not suitable for processing the large amounts of data generated in modern omics experiments. To solve this issue, we created CDSnake, a Snakemake pipeline that uses CD-HIT tools to simplify the consecutive launch of the CD-HIT-OTU-MiSeq tools for complete processing of paired-end reads in metagenomic studies. The pipeline makes 16S analysis easier through a one-command launch and helps to yield reproducible results. RESULTS: We benchmarked our pipeline against two commonly used pipelines for OTU retrieval, DADA2 and deblur, which are incorporated into QIIME2, a popular workflow for microbiome analysis. Three mock datasets with highly overlapping paired-end 2 × 250 bp reads were used for benchmarking: Balanced, HMP, and Extreme. CDSnake output fewer OTUs than DADA2 and deblur. However, on the Balanced and HMP datasets, the number of OTUs output by CDSnake was closer to the real number of strains used to generate the mock community than the numbers output by DADA2 and deblur. Though generally slower than the other pipelines, CDSnake output higher total counts, preserving more information from the raw data. Inheriting these properties from the original CD-HIT-OTU-MiSeq utilities, CDSnake makes them handier to use thanks to simple scalability, easier automated runs and other Snakemake benefits. CONCLUSIONS: We developed a Snakemake pipeline for the CD-HIT-OTU-MiSeq utilities that simplifies and automates data analysis. Benchmarking showed that this approach can outperform popular tools under certain conditions.
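
The BACKGROUND above notes that paired reads overlap when the fragment is shorter than the combined read length. A minimal sketch of that merging step follows, assuming error-free reads and exact matching. Real merging tools also weigh base qualities, and CD-HIT-OTU-MiSeq avoids merging altogether, so the function names and the `min_overlap` default here are purely illustrative.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq))

def merge_pair(forward, reverse, min_overlap=10):
    """Merge a read pair by the longest exact suffix/prefix overlap, or return None."""
    reverse_rc = reverse_complement(reverse)
    max_len = min(len(forward), len(reverse_rc))
    for overlap in range(max_len, min_overlap - 1, -1):
        if forward[-overlap:] == reverse_rc[:overlap]:
            return forward + reverse_rc[overlap:]
    return None  # no sufficiently long overlap found

# A 40 bp fragment sequenced with two 28 bp reads: the reads overlap by 16 bp.
fragment = "ACGTTGCAAGGCTTACCGATAGCTTAGGCATCGATCGGAT"
forward = fragment[:28]
reverse = reverse_complement(fragment[-28:])
print(merge_pair(forward, reverse) == fragment)
```

With sequencing errors in the overlapping 3' ends, the exact comparison fails and the pair is lost, which is the limitation the abstract describes.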


Subjects
High-Throughput Nucleotide Sequencing, Molecular Sequence Annotation, Software, Genetic Databases, Humans, Microbiota/genetics, 16S Ribosomal RNA/genetics
17.
BMC Bioinformatics ; 21(Suppl 12): 306, 2020 Jul 24.
Article in English | MEDLINE | ID: mdl-32703258

ABSTRACT

BACKGROUND: Graph-based representation of genome assemblies has recently been used in different contexts, from improved reconstruction of plasmid sequences and refined analysis of metagenomic data to read error correction and reference-free haplotype reconstruction. While many of these applications rely heavily on the alignment of long nucleotide sequences to assembly graphs, the first general-purpose software tools for finding such alignments have been released only recently, and their deficiencies and limitations are yet to be discovered. Moreover, existing tools cannot align amino acid sequences, which could prove useful in various contexts, in particular the analysis of metagenomic sequencing data. RESULTS: In this work we present SPAligner (Saint-Petersburg Aligner), a novel tool for aligning long diverged nucleotide and amino acid sequences to assembly graphs. We demonstrate that SPAligner is an efficient solution for mapping third-generation sequencing reads onto assembly graphs of various complexity and also show how it can facilitate the identification of known genes in complex metagenomic datasets. CONCLUSIONS: Our work will help accelerate the development of graph-based approaches to the problem of aligning sequences to genome assemblies. SPAligner is implemented as part of the SPAdes tools library and is available on GitHub.


Subjects
Algorithms, Genetic Variation, Sequence Alignment, Base Sequence, Haplotypes/genetics, Humans, Software, Statistics as Topic, beta-Lactamases/chemistry
18.
Curr Protoc Bioinformatics ; 70(1): e102, 2020 06.
Article in English | MEDLINE | ID: mdl-32559359

ABSTRACT

SPAdes (St. Petersburg genome Assembler) was originally developed for de novo assembly of genome sequencing data produced for cultivated microbial isolates and for single-cell genomic DNA sequencing. With time, the functionality of SPAdes was extended to enable assembly of IonTorrent data, as well as hybrid assembly from short and long reads (PacBio and Oxford Nanopore). In this article we present protocols for five different assembly pipelines that comprise the SPAdes package and that are used for assembly of metagenomes and transcriptomes as well as assembly of putative plasmids and biosynthetic gene clusters from whole-genome sequencing and metagenomic datasets. In addition, we present guidelines for understanding results with use cases for each pipeline, and several additional support protocols that help in using SPAdes properly. © 2020 Wiley Periodicals LLC.
Basic Protocol 1: Assembling isolate bacterial datasets
Basic Protocol 2: Assembling metagenomic datasets
Basic Protocol 3: Assembling sets of putative plasmids
Basic Protocol 4: Assembling transcriptomes
Basic Protocol 5: Assembling putative biosynthetic gene clusters
Support Protocol 1: Installing SPAdes
Support Protocol 2: Providing input via command line
Support Protocol 3: Providing input data via YAML format
Support Protocol 4: Restarting previous run
Support Protocol 5: Determining strand-specificity of RNA-seq data
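
The five basic protocols correspond to different SPAdes run modes. The sketch below only builds command lines for paired-end input and does not execute anything. The mode flags reflect my understanding of the SPAdes command-line help and should be treated as assumptions to confirm with `spades.py --help` for the installed version. The helper name and file names are hypothetical.

```python
import shlex

# Mode flags as understood from the SPAdes command-line help; treat them as
# assumptions and confirm with `spades.py --help` for your version.
PIPELINE_FLAGS = {
    "isolate": "--isolate",
    "metagenome": "--meta",
    "plasmid": "--plasmid",
    "transcriptome": "--rna",
    "biosynthetic": "--bio",
}

def spades_command(pipeline, reads_1, reads_2, out_dir):
    """Build (but do not run) a SPAdes command line for paired-end reads."""
    if pipeline not in PIPELINE_FLAGS:
        raise ValueError(f"unknown pipeline: {pipeline}")
    return ["spades.py", PIPELINE_FLAGS[pipeline],
            "-1", reads_1, "-2", reads_2, "-o", out_dir]

for name in PIPELINE_FLAGS:
    cmd = spades_command(name, "reads_1.fastq.gz", "reads_2.fastq.gz", f"spades_{name}")
    print(shlex.join(cmd))
```

Returning an argument list rather than a single string keeps the command safe to pass to `subprocess.run` without shell quoting issues.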


Subjects
Algorithms, DNA Sequence Analysis/methods, Bacteria/genetics, Biosynthetic Pathways/genetics, Genetic Databases, Metagenome, Multigene Family, Plasmids/genetics, RNA-Seq, Transcriptome/genetics
19.
Nucleic Acids Res ; 48(D1): D570-D578, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31696235

ABSTRACT

MGnify (http://www.ebi.ac.uk/metagenomics) provides a free to use platform for the assembly, analysis and archiving of microbiome data derived from sequencing microbial populations that are present in particular environments. Over the past 2 years, MGnify (formerly EBI Metagenomics) has more than doubled the number of publicly available analysed datasets held within the resource. Recently, an updated approach to data analysis has been unveiled (version 5.0), replacing the previous single pipeline with multiple analysis pipelines that are tailored according to the input data, and that are formally described using the Common Workflow Language, enabling greater provenance, reusability, and reproducibility. MGnify's new analysis pipelines offer additional approaches for taxonomic assertions based on ribosomal internal transcribed spacer regions (ITS1/2) and expanded protein functional annotations. Biochemical pathways and systems predictions have also been added for assembled contigs. MGnify's growing focus on the assembly of metagenomic data has also seen the number of datasets it has assembled and analysed increase six-fold. The non-redundant protein database constructed from the proteins encoded by these assemblies now exceeds 1 billion sequences. Meanwhile, a newly developed contig viewer provides fine-grained visualisation of the assembled contigs and their enriched annotations.


Subjects
Metagenome, Microbiota, Phylogeny, Software, Archaea/classification, Archaea/genetics, Bacteria/classification, Bacteria/genetics, Ribosomal Spacer DNA/genetics, Genetic Databases, Metagenomics/methods
20.
Cell Syst ; 9(6): 600-608.e4, 2019 12 18.
Article in English | MEDLINE | ID: mdl-31629686

ABSTRACT

Ribosomally synthesized and post-translationally modified peptides (RiPPs) are an important class of natural products that contain antibiotics and a variety of other bioactive compounds. The existing methods for discovery of RiPPs by combining genome mining and computational mass spectrometry are limited to discovering specific classes of RiPPs from small datasets, and these methods fail to handle unknown post-translational modifications. Here, we present MetaMiner, a software tool for addressing these challenges that is compatible with large-scale screening platforms for natural product discovery. After searching millions of spectra in the Global Natural Products Social (GNPS) molecular networking infrastructure against just eight genomic and metagenomic datasets, MetaMiner discovered 31 known and seven unknown RiPPs from diverse microbial communities, including human microbiome and lichen microbiome, and microorganisms isolated from the International Space Station.


Subjects
Computational Biology/methods, Microbiota/genetics, Post-Translational Protein Processing/genetics, Genomics/methods, Humans, Peptides/chemistry, Ribosomes/genetics, Software