Search | PAHO/WHO Uruguay

1.

Petabase-scale sequence alignment catalyses viral discovery.

Edgar, Robert C; Taylor, Brie; Lin, Victor; Altman, Tomer; Barbera, Pierre; Meleshko, Dmitry; Lohr, Dan; Novakovsky, Gherman; Buchfink, Benjamin; Al-Shayeb, Basem; Banfield, Jillian F; de la Peña, Marcos; Korobeynikov, Anton; Chikhi, Rayan; Babaian, Artem.

Nature ; 602(7895): 142-147, 2022 02.

Article in English | MEDLINE | ID: mdl-35082445

ABSTRACT

Public databases contain a planetary collection of nucleic acid sequences, but their systematic exploration has been inhibited by a lack of efficient methods for searching this corpus, which (at the time of writing) exceeds 20 petabases and is growing exponentially. Here we developed a cloud computing infrastructure, Serratus, to enable ultra-high-throughput sequence alignment at the petabase scale. We searched 5.7 million biologically diverse samples (10.2 petabases) for the hallmark gene RNA-dependent RNA polymerase and identified well over 10^5 novel RNA viruses, thereby expanding the number of known species by roughly an order of magnitude. We characterized novel viruses related to coronaviruses, hepatitis delta virus and huge phages, respectively, and analysed their environmental reservoirs. To catalyse the ongoing revolution of viral discovery, we established a free and comprehensive database of these data and tools. Expanding the known sequence diversity of viruses can reveal the evolutionary origins of emerging pathogens and improve pathogen surveillance for the anticipation and mitigation of future pandemics.

Subject(s)

Cloud Computing , Databases, Genetic , RNA Viruses/genetics , RNA Viruses/isolation & purification , Sequence Alignment/methods , Virology/methods , Virome/genetics , Animals , Archives , Bacteriophages/enzymology , Bacteriophages/genetics , Biodiversity , Coronavirus/classification , Coronavirus/enzymology , Coronavirus/genetics , Evolution, Molecular , Hepatitis Delta Virus/enzymology , Hepatitis Delta Virus/genetics , Humans , Models, Molecular , RNA Viruses/classification , RNA Viruses/enzymology , RNA-Dependent RNA Polymerase/chemistry , RNA-Dependent RNA Polymerase/genetics , Software

2.

Critical Assessment of Metagenome Interpretation: the second round of challenges.

Meyer, Fernando; Fritz, Adrian; Deng, Zhi-Luo; Koslicki, David; Lesker, Till Robin; Gurevich, Alexey; Robertson, Gary; Alser, Mohammed; Antipov, Dmitry; Beghini, Francesco; Bertrand, Denis; Brito, Jaqueline J; Brown, C Titus; Buchmann, Jan; Buluç, Aydin; Chen, Bo; Chikhi, Rayan; Clausen, Philip T L C; Cristian, Alexandru; Dabrowski, Piotr Wojciech; Darling, Aaron E; Egan, Rob; Eskin, Eleazar; Georganas, Evangelos; Goltsman, Eugene; Gray, Melissa A; Hansen, Lars Hestbjerg; Hofmeyr, Steven; Huang, Pingqin; Irber, Luiz; Jia, Huijue; Jørgensen, Tue Sparholt; Kieser, Silas D; Klemetsen, Terje; Kola, Axel; Kolmogorov, Mikhail; Korobeynikov, Anton; Kwan, Jason; LaPierre, Nathan; Lemaitre, Claire; Li, Chenhao; Limasset, Antoine; Malcher-Miranda, Fabio; Mangul, Serghei; Marcelino, Vanessa R; Marchet, Camille; Marijon, Pierre; Meleshko, Dmitry; Mende, Daniel R; Milanese, Alessio.

Nat Methods ; 19(4): 429-440, 2022 04.

Article in English | MEDLINE | ID: mdl-35396482

ABSTRACT

Evaluating metagenomic software is key for optimizing metagenome interpretation and focus of the Initiative for the Critical Assessment of Metagenome Interpretation (CAMI). The CAMI II challenge engaged the community to assess methods on realistic and complex datasets with long- and short-read sequences, created computationally from around 1,700 new and known genomes, as well as 600 new plasmids and viruses. Here we analyze 5,002 results by 76 program versions. Substantial improvements were seen in assembly, some due to long-read data. Related strains still were challenging for assembly and genome recovery through binning, as was assembly quality for the latter. Profilers markedly matured, with taxon profilers and binners excelling at higher bacterial ranks, but underperforming for viruses and Archaea. Clinical pathogen detection results revealed a need to improve reproducibility. Runtime and memory usage analyses identified efficient programs, including top performers with other metrics. The results identify challenges and guide researchers in selecting methods for analyses.

Subject(s)

Metagenome , Metagenomics , Archaea/genetics , Metagenomics/methods , Reproducibility of Results , Sequence Analysis, DNA , Software

3.

MGnify: the microbiome sequence data analysis resource in 2023.

Richardson, Lorna; Allen, Ben; Baldi, Germana; Beracochea, Martin; Bileschi, Maxwell L; Burdett, Tony; Burgin, Josephine; Caballero-Pérez, Juan; Cochrane, Guy; Colwell, Lucy J; Curtis, Tom; Escobar-Zepeda, Alejandra; Gurbich, Tatiana A; Kale, Varsha; Korobeynikov, Anton; Raj, Shriya; Rogers, Alexander B; Sakharova, Ekaterina; Sanchez, Santiago; Wilkinson, Darren J; Finn, Robert D.

Nucleic Acids Res ; 51(D1): D753-D759, 2023 01 06.

Article in English | MEDLINE | ID: mdl-36477304

ABSTRACT

The MGnify platform (https://www.ebi.ac.uk/metagenomics) facilitates the assembly, analysis and archiving of microbiome-derived nucleic acid sequences. The platform provides access to taxonomic assignments and functional annotations for nearly half a million analyses covering metabarcoding, metatranscriptomic, and metagenomic datasets, which are derived from a wide range of different environments. Over the past 3 years, MGnify has not only grown in terms of the number of datasets contained but also increased the breadth of analyses provided, such as the analysis of long-read sequences. The MGnify protein database now exceeds 2.4 billion non-redundant sequences predicted from metagenomic assemblies. This collection is now organised into a relational database making it possible to understand the genomic context of the protein through navigation back to the source assembly and sample metadata, marking a major improvement. To extend beyond the functional annotations already provided in MGnify, we have applied deep learning-based annotation methods. The technology underlying MGnify's Application Programming Interface (API) and website has been upgraded, and we have enabled the ability to perform downstream analysis of the MGnify data through the introduction of a coupled Jupyter Lab environment.

Subject(s)

Microbiota , Sequence Analysis , Genomics/methods , Metagenome , Metagenomics/methods , Microbiota/genetics , Software , Sequence Analysis/methods

4.

Hi-C metagenomics facilitate comparative genome analysis of bacteria and yeast from spontaneous beer and cider.

Sonets, Ignat V; Solovyev, Mikhail A; Ivanova, Valeriia A; Vasiluev, Petr A; Kachalkin, Aleksey V; Ochkalova, Sofia D; Korobeynikov, Anton I; Razin, Sergey V; Ulianov, Sergey V; Tyakht, Alexander V.

Food Microbiol ; 121: 104520, 2024 Aug.

Article in English | MEDLINE | ID: mdl-38637082

ABSTRACT

Sequence-based analysis of fermented foods and beverages' microbiomes offers insights into their impact on taste and consumer health. High-throughput metagenomics provide detailed taxonomic and functional community profiling, but bacterial and yeast genome reconstruction and mobile genetic elements tracking are to be improved. We established a pipeline for exploring fermented foods microbiomes using metagenomics coupled with chromosome conformation capture (Hi-C metagenomics). The approach was applied to analyze a collection of spontaneously fermented beers and ciders (n = 12). The Hi-C reads were used to reconstruct the metagenome-assembled genomes (MAGs) of bacteria and yeasts facilitating subsequent comparative genomic analysis, assembly scaffolding and exploration of "plasmid-bacteria" links. For a subset of beverages, yeasts were isolated and characterized phenotypically. The reconstructed Hi-C MAGs primarily belonged to the Lactobacillaceae family in beers, along with Acetobacteraceae and Enterobacteriaceae in ciders, exhibiting improved quality compared to conventional metagenomic MAGs. Comparative genomic analysis of Lactobacillaceae Hi-C MAGs revealed clustering by niche and suggested genetic determinants of survival and probiotic potential. For Pediococcus damnosus, Hi-C-based networks of contigs enabled linking bacteria with plasmids. Analyzing phylogeny and accessory genes in the context of known reference genomes offered insights into the niche specialization of beer lactobacilli. The subspecies-level diversity of cider Tatumella spp. was disentangled using a Hi-C-based graph. We obtained highly complete yeast Hi-C MAGs primarily represented by Brettanomyces and Saccharomyces, with Hi-C-facilitated chromosome-level genome assembly for the former. 
Utilizing Hi-C metagenomics to unravel the genomic content of individual species can provide a deeper understanding of the ecological interactions within the food microbiome, aid in bioprospecting beneficial microorganisms, improving quality control and improving innovative fermented products.

Subject(s)

Saccharomyces cerevisiae , Saccharomyces , Saccharomyces cerevisiae/genetics , Beer/microbiology , Bacteria/genetics , Plasmids , Saccharomyces/genetics , Metagenome , Metagenomics , Enterobacteriaceae/genetics

5.

BiosyntheticSPAdes: reconstructing biosynthetic gene clusters from assembly graphs.

Meleshko, Dmitry; Mohimani, Hosein; Tracanna, Vittorio; Hajirasouliha, Iman; Medema, Marnix H; Korobeynikov, Anton; Pevzner, Pavel A.

Genome Res ; 29(8): 1352-1362, 2019 08.

Article in English | MEDLINE | ID: mdl-31160374

ABSTRACT

Predicting biosynthetic gene clusters (BGCs) is critically important for discovery of antibiotics and other natural products. While BGC prediction from complete genomes is a well-studied problem, predicting BGCs in fragmented genomic assemblies remains challenging. The existing BGC prediction tools often assume that each BGC is encoded within a single contig in the genome assembly, a condition that is violated for most sequenced microbial genomes where BGCs are often scattered through several contigs, making it difficult to reconstruct them. The situation is even more severe in shotgun metagenomics, where the contigs are often short, and the existing tools fail to predict a large fraction of long BGCs. While it is difficult to assemble BGCs in a single contig, the structure of the genome assembly graph often provides clues on how to combine multiple contigs into segments encoding long BGCs. We describe biosyntheticSPAdes, a tool for predicting BGCs in assembly graphs and demonstrate that it greatly improves the reconstruction of BGCs from genomic and metagenomics data sets.

Subject(s)

Genes, Bacterial , Metagenome , Metagenomics/methods , Multigene Family , Software , Contig Mapping , Datasets as Topic , Dental Plaque/microbiology , Gingiva/microbiology , Humans , Internet , Mouth Mucosa/microbiology , Pharynx/microbiology , Protein Biosynthesis , Tongue/microbiology

6.

coronaSPAdes: from biosynthetic gene clusters to RNA viral assemblies.

Meleshko, Dmitry; Hajirasouliha, Iman; Korobeynikov, Anton.

Bioinformatics ; 38(1): 1-8, 2021 12 22.

Article in English | MEDLINE | ID: mdl-34406356

ABSTRACT

MOTIVATION: The COVID-19 pandemic has ignited a broad scientific interest in viral research in general and coronavirus research in particular. The identification and characterization of viral species in natural reservoirs typically involves de novo assembly. However, existing genome, metagenome and transcriptome assemblers often are not able to assemble many viruses (including coronaviruses) into a single contig. Coverage variation between datasets and within dataset, presence of close strains, splice variants and contamination set a high bar for assemblers to process viral datasets with diverse properties. RESULTS: We developed coronaSPAdes, a novel assembler for RNA viral species recovery in general and coronaviruses in particular. coronaSPAdes leverages the knowledge about viral genome structures to improve assembly extending ideas initially implemented in biosyntheticSPAdes. We have shown that coronaSPAdes outperforms existing SPAdes modes and other popular short-read metagenome and viral assemblers in the recovery of full-length RNA viral genomes. AVAILABILITY AND IMPLEMENTATION: coronaSPAdes version used in this article is a part of SPAdes 3.15 release and is freely available at http://cab.spbu.ru/software/spades. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

COVID-19 , Software , Humans , Pandemics , Metagenome , Genome, Viral

7.

MGnify: the microbiome analysis resource in 2020.

Mitchell, Alex L; Almeida, Alexandre; Beracochea, Martin; Boland, Miguel; Burgin, Josephine; Cochrane, Guy; Crusoe, Michael R; Kale, Varsha; Potter, Simon C; Richardson, Lorna J; Sakharova, Ekaterina; Scheremetjew, Maxim; Korobeynikov, Anton; Shlemov, Alex; Kunyavskaya, Olga; Lapidus, Alla; Finn, Robert D.

Nucleic Acids Res ; 48(D1): D570-D578, 2020 01 08.

Article in English | MEDLINE | ID: mdl-31696235

ABSTRACT

MGnify (http://www.ebi.ac.uk/metagenomics) provides a free to use platform for the assembly, analysis and archiving of microbiome data derived from sequencing microbial populations that are present in particular environments. Over the past 2 years, MGnify (formerly EBI Metagenomics) has more than doubled the number of publicly available analysed datasets held within the resource. Recently, an updated approach to data analysis has been unveiled (version 5.0), replacing the previous single pipeline with multiple analysis pipelines that are tailored according to the input data, and that are formally described using the Common Workflow Language, enabling greater provenance, reusability, and reproducibility. MGnify's new analysis pipelines offer additional approaches for taxonomic assertions based on ribosomal internal transcribed spacer regions (ITS1/2) and expanded protein functional annotations. Biochemical pathways and systems predictions have also been added for assembled contigs. MGnify's growing focus on the assembly of metagenomic data has also seen the number of datasets it has assembled and analysed increase six-fold. The non-redundant protein database constructed from the proteins encoded by these assemblies now exceeds 1 billion sequences. Meanwhile, a newly developed contig viewer provides fine-grained visualisation of the assembled contigs and their enriched annotations.

Subject(s)

Metagenome , Microbiota , Phylogeny , Software , Archaea/classification , Archaea/genetics , Bacteria/classification , Bacteria/genetics , DNA, Ribosomal Spacer/genetics , Databases, Genetic , Metagenomics/methods

8.

CDSnake: Snakemake pipeline for retrieval of annotated OTUs from paired-end reads using CD-HIT utilities.

Kondratenko, Yulia; Korobeynikov, Anton; Lapidus, Alla.

BMC Bioinformatics ; 21(Suppl 12): 303, 2020 Jul 24.

Article in English | MEDLINE | ID: mdl-32703166

ABSTRACT

BACKGROUND: Illumina paired-end reads are often used for 16S analysis in metagenomic studies. Since the DNA fragment size is usually smaller than the sum of the lengths of the paired reads, reads can be merged for downstream analysis. Despite the development of several tools for merging paired-end reads, poor quality at the 3' ends within the overlapping region prevents accurate combining of a significant portion of read pairs. Recently, CD-HIT-OTU-Miseq was presented as a new approach for 16S analysis using paired-end reads; it completely avoids the read merging process by clustering the paired reads separately. CD-HIT-OTU-Miseq is a set of tools that are meant to be launched successively by auxiliary shell scripts. This launch mode is not suitable for processing the large amounts of data generated in modern omics experiments. To solve this issue, we created CDSnake, a Snakemake pipeline utilizing CD-HIT tools for easier consecutive launch of the CD-HIT-OTU-Miseq tools for complete processing of paired-end reads in metagenomic studies. Use of the pipeline makes 16S analysis easier thanks to its one-command launch and helps to yield reproducible results. RESULTS: We benchmarked our pipeline against two commonly used pipelines for OTU retrieval that are incorporated into QIIME2, a popular workflow for microbiome analysis: DADA2 and deblur. Three mock datasets with highly overlapping paired-end 2 × 250 bp reads were used for benchmarking: Balanced, HMP, and Extreme. CDSnake output fewer OTUs than DADA2 and deblur. However, on the Balanced and HMP datasets, the number of OTUs output by CDSnake was closer to the real number of strains used for mock community generation than the numbers output by DADA2 and deblur. Though generally slower than the other pipelines, CDSnake output higher total counts, preserving more information from the raw data. Inheriting these properties from the original CD-HIT-OTU-MiSeq utilities, CDSnake made them handier to use thanks to simple scalability, easier automated runs and other Snakemake benefits. CONCLUSIONS: We developed a Snakemake pipeline for OTU-MiSeq utilities, which simplified and automated data analysis. Benchmarking showed that this approach can outperform popular tools under certain conditions.
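
The merging argument in the abstract above comes down to simple arithmetic: two reads of length r taken from opposite ends of a fragment of length f share 2r − f bases whenever f is smaller than 2r. The following is a minimal Python sketch of that calculation, not code from CDSnake or CD-HIT. The helper name `expected_overlap` and the 253 bp fragment length are assumptions chosen only for illustration; the 2 × 250 bp read layout is the one named in the abstract.

```python
def expected_overlap(read_length: int, fragment_length: int) -> int:
    """Bases shared by two paired-end reads from one fragment (0 if none).

    Hypothetical helper for illustration; not part of CDSnake or CD-HIT.
    """
    if read_length <= 0 or fragment_length <= 0:
        raise ValueError("lengths must be positive")
    # A read cannot cover more bases than the fragment contains.
    effective = min(read_length, fragment_length)
    # The reads start at opposite ends, so any combined length beyond
    # the fragment length is sequenced twice, i.e. it is the overlap.
    return max(0, 2 * effective - fragment_length)


if __name__ == "__main__":
    # 2 x 250 bp reads on an assumed 253 bp fragment.
    print(expected_overlap(250, 253))
```

With a fragment longer than 500 bp the function returns 0, which is the case where the reads cannot be merged at all and separate clustering of the two reads is the only option.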

Subject(s)

High-Throughput Nucleotide Sequencing , Molecular Sequence Annotation , Software , Databases, Genetic , Humans , Microbiota/genetics , RNA, Ribosomal, 16S/genetics

9.

Correction to: CDSnake: Snakemake pipeline for retrieval of annotated OTUs from paired-end reads using CD-HIT utilities.

Kondratenko, Yulia; Korobeynikov, Anton; Lapidus, Alla.

BMC Bioinformatics ; 21(1): 362, 2020 Aug 19.

Article in English | MEDLINE | ID: mdl-32814545

ABSTRACT

An amendment to this paper has been published and can be accessed via the original article.

10.

SPAligner: alignment of long diverged molecular sequences to assembly graphs.

Dvorkina, Tatiana; Antipov, Dmitry; Korobeynikov, Anton; Nurk, Sergey.

BMC Bioinformatics ; 21(Suppl 12): 306, 2020 Jul 24.

Article in English | MEDLINE | ID: mdl-32703258

ABSTRACT

BACKGROUND: Graph-based representation of genome assemblies has been recently used in different contexts - from improved reconstruction of plasmid sequences and refined analysis of metagenomic data to read error correction and reference-free haplotype reconstruction. While many of these applications heavily utilize the alignment of long nucleotide sequences to assembly graphs, first general-purpose software tools for finding such alignments have been released only recently and their deficiencies and limitations are yet to be discovered. Moreover, existing tools can not perform alignment of amino acid sequences, which could prove useful in various contexts - in particular the analysis of metagenomic sequencing data. RESULTS: In this work we present a novel SPAligner (Saint-Petersburg Aligner) tool for aligning long diverged nucleotide and amino acid sequences to assembly graphs. We demonstrate that SPAligner is an efficient solution for mapping third generation sequencing reads onto assembly graphs of various complexity and also show how it can facilitate the identification of known genes in complex metagenomic datasets. CONCLUSIONS: Our work will facilitate accelerating the development of graph-based approaches in solving sequence to genome assembly alignment problem. SPAligner is implemented as a part of SPAdes tools library and is available on Github.

Subject(s)

Algorithms , Genetic Variation , Sequence Alignment , Base Sequence , Haplotypes/genetics , Humans , Software , Statistics as Topic , beta-Lactamases/chemistry

11.

metaSPAdes: a new versatile metagenomic assembler.

Nurk, Sergey; Meleshko, Dmitry; Korobeynikov, Anton; Pevzner, Pavel A.

Genome Res ; 27(5): 824-834, 2017 05.

Article in English | MEDLINE | ID: mdl-28298430

ABSTRACT

While metagenomics has emerged as a technology of choice for analyzing bacterial populations, the assembly of metagenomic data remains challenging, thus stifling biological discoveries. Moreover, recent studies revealed that complex bacterial populations may be composed from dozens of related strains, thus further amplifying the challenge of metagenomic assembly. metaSPAdes addresses various challenges of metagenomic assembly by capitalizing on computational ideas that proved to be useful in assemblies of single cells and highly polymorphic diploid genomes. We benchmark metaSPAdes against other state-of-the-art metagenome assemblers and demonstrate that it results in high-quality assemblies across diverse data sets.

Subject(s)

Contig Mapping/methods , Genomics/methods , Metagenome , Sequence Analysis, DNA/methods , Software , Genome, Bacterial

12.

Comparative genomics uncovers the prolific and distinctive metabolic potential of the cyanobacterial genus Moorea.

Leao, Tiago; Castelão, Guilherme; Korobeynikov, Anton; Monroe, Emily A; Podell, Sheila; Glukhov, Evgenia; Allen, Eric E; Gerwick, William H; Gerwick, Lena.

Proc Natl Acad Sci U S A ; 114(12): 3198-3203, 2017 03 21.

Article in English | MEDLINE | ID: mdl-28265051

ABSTRACT

Cyanobacteria are major sources of oxygen, nitrogen, and carbon in nature. In addition to the importance of their primary metabolism, some cyanobacteria are prolific producers of unique and bioactive secondary metabolites. Chemical investigations of the cyanobacterial genus Moorea have resulted in the isolation of over 190 compounds in the last two decades. However, preliminary genomic analysis has suggested that genome-guided approaches can enable the discovery of novel compounds from even well-studied Moorea strains, highlighting the importance of obtaining complete genomes. We report a complete genome of a filamentous tropical marine cyanobacterium, Moorea producens PAL, which reveals that about one-fifth of its genome is devoted to production of secondary metabolites, an impressive four times the cyanobacterial average. Moreover, possession of the complete PAL genome has allowed improvement to the assembly of three other Moorea draft genomes. Comparative genomics revealed that they are remarkably similar to one another, despite their differences in geography, morphology, and secondary metabolite profiles. Gene cluster networking highlights that this genus is distinctive among cyanobacteria, not only in the number of secondary metabolite pathways but also in the content of many pathways, which are potentially distinct from all other bacterial gene clusters to date. These findings portend that future genome-guided secondary metabolite discovery and isolation efforts should be highly productive.

Subject(s)

Cyanobacteria/genetics , Cyanobacteria/metabolism , Genome, Bacterial , Genomics , Metabolome , Metabolomics , Base Composition , Genes, Bacterial , Genomics/methods , Metabolomics/methods , Multigene Family , Nitrogen Fixation , Open Reading Frames , Phylogeny

13.

hybridSPAdes: an algorithm for hybrid assembly of short and long reads.

Antipov, Dmitry; Korobeynikov, Anton; McLean, Jeffrey S; Pevzner, Pavel A.

Bioinformatics ; 32(7): 1009-15, 2016 04 01.

Article in English | MEDLINE | ID: mdl-26589280

ABSTRACT

MOTIVATION: Recent advances in single molecule real-time (SMRT) and nanopore sequencing technologies have enabled high-quality assemblies from long and inaccurate reads. However, these approaches require high coverage by long reads and remain expensive. On the other hand, the inexpensive short reads technologies produce accurate but fragmented assemblies. Thus, a hybrid approach that assembles long reads (with low coverage) and short reads has a potential to generate high-quality assemblies at reduced cost. RESULTS: We describe hybridSPAdes algorithm for assembling short and long reads and benchmark it on a variety of bacterial assembly projects. Our results demonstrate that hybridSPAdes generates accurate assemblies (even in projects with relatively low coverage by long reads) thus reducing the overall cost of genome sequencing. We further present the first complete assembly of a genome from single cells using SMRT reads. AVAILABILITY AND IMPLEMENTATION: hybridSPAdes is implemented in C++ as a part of SPAdes genome assembler and is publicly available at http://bioinf.spbau.ru/en/spades CONTACT: d.antipov@spbu.ru SUPPLEMENTARY INFORMATION: supplementary data are available at Bioinformatics online.

Subject(s)

Algorithms , Sequence Analysis, DNA , Base Sequence , Chromosome Mapping , Genome

14.

A Maldiisotopic Approach to Discover Natural Products: Cryptomaldamide, a Hybrid Tripeptide from the Marine Cyanobacterium Moorea producens.

Kinnel, Robin B; Esquenazi, Eduardo; Leao, Tiago; Moss, Nathan; Mevers, Emily; Pereira, Alban R; Monroe, Emily A; Korobeynikov, Anton; Murray, Thomas F; Sherman, David; Gerwick, Lena; Dorrestein, Pieter C; Gerwick, William H.

J Nat Prod ; 80(5): 1514-1521, 2017 05 26.

Article in English | MEDLINE | ID: mdl-28448144

ABSTRACT

Genome sequencing of microorganisms has revealed a greatly increased capacity for natural products biosynthesis than was previously recognized from compound isolation efforts alone. Hence, new methods are needed for the discovery and description of this hidden secondary metabolite potential. Here we show that provision of heavy nitrogen 15N-nitrate to marine cyanobacterial cultures followed by single-filament MALDI analysis over a period of days was highly effective in identifying a new natural product with an exceptionally high nitrogen content. The compound, named cryptomaldamide, was subsequently isolated using MS to guide the purification process, and its structure determined by 2D NMR and other spectroscopic and chromatographic methods. Bioinformatic analysis of the draft genome sequence identified a 28.7 kB gene cluster that putatively encodes for cryptomaldamide biosynthesis. Notably, an amidinotransferase is proposed to initiate the biosynthetic process by transferring an amidino group from arginine to serine to produce the first residue to be incorporated by the hybrid NRPS-PKS pathway. The maldiisotopic approach presented here is thus demonstrated to provide an orthogonal method by which to discover novel chemical diversity from Nature.

Subject(s)

Biological Products/isolation & purification , Cyanobacteria/chemistry , Oligopeptides/biosynthesis , Oligopeptides/isolation & purification , Biological Products/chemistry , Computational Biology , Magnetic Resonance Spectroscopy , Molecular Structure , Oligopeptides/chemistry

15.

Sequencing rare marine actinomycete genomes reveals high density of unique natural product biosynthetic gene clusters.

Schorn, Michelle A; Alanjary, Mohammad M; Aguinaldo, Kristen; Korobeynikov, Anton; Podell, Sheila; Patin, Nastassia; Lincecum, Tommie; Jensen, Paul R; Ziemert, Nadine; Moore, Bradley S.

Microbiology (Reading) ; 162(12): 2075-2086, 2016 12.

Article in English | MEDLINE | ID: mdl-27902408

ABSTRACT

Traditional natural product discovery methods have nearly exhausted the accessible diversity of microbial chemicals, making new sources and techniques paramount in the search for new molecules. Marine actinomycete bacteria have recently come into the spotlight as fruitful producers of structurally diverse secondary metabolites, and remain relatively untapped. In this study, we sequenced 21 marine-derived actinomycete strains, rarely studied for their secondary metabolite potential and under-represented in current genomic databases. We found that genome size and phylogeny were good predictors of biosynthetic gene cluster diversity, with larger genomes rivalling the well-known marine producers in the Streptomyces and Salinispora genera. Genomes in the Micrococcineae suborder, however, had consistently the lowest number of biosynthetic gene clusters. By networking individual gene clusters into gene cluster families, we were able to computationally estimate the degree of novelty each genus contributed to the current sequence databases. Based on the similarity measures between all actinobacteria in the Joint Genome Institute's Atlas of Biosynthetic gene Clusters database, rare marine genera show a high degree of novelty and diversity, with Corynebacterium, Gordonia, Nocardiopsis, Saccharomonospora and Pseudonocardia genera representing the highest gene cluster diversity. This research validates that rare marine actinomycetes are important candidates for exploration, as they are relatively unstudied, and their relatives are historically rich in secondary metabolites.

Subject(s)

Actinobacteria/genetics , Actinobacteria/isolation & purification , Biological Products/metabolism , Genome, Bacterial , Seawater/microbiology , Actinobacteria/classification , Actinobacteria/metabolism , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Phylogeny , Sequence Analysis, DNA

16.

The Phormidolide Biosynthetic Gene Cluster: A trans-AT PKS Pathway Encoding a Toxic Macrocyclic Polyketide.

Bertin, Matthew J; Vulpanovici, Alexandra; Monroe, Emily A; Korobeynikov, Anton; Sherman, David H; Gerwick, Lena; Gerwick, William H.

Chembiochem ; 17(2): 164-73, 2016 Jan.

Article in English | MEDLINE | ID: mdl-26769357

ABSTRACT

Phormidolide is a polyketide produced by a cultured filamentous marine cyanobacterium and incorporates a 16-membered macrolactone. Its complex structure is recognizably derived from a polyketide synthase pathway, but possesses unique and intriguing structural features that prompted interest in investigating its biosynthetic origin. Stable isotope incorporation experiments confirmed the polyketide nature of this compound. We further characterized the phormidolide gene cluster (phm) through genome sequencing followed by bioinformatic analysis. Two discrete trans-type acyltransferase (trans-AT) ORFs along with KS-AT adaptor regions (ATd) within the polyketide synthase (PKS) megasynthases, suggest that the phormidolide gene cluster is a trans-AT PKS. Insights gained from analysis of the mode of acetate incorporation and ensuing keto reduction prompted our reevaluation of the stereochemistry of phormidolide hydroxy groups located along the linear polyketide chain.

Subject(s)

Acyltransferases/chemistry , Computational Biology , Macrolides , Multigene Family , Polyketide Synthases , Amino Acid Sequence , Conserved Sequence , Cyanobacteria/metabolism , Macrolides/chemistry , Polyketide Synthases/chemistry , Sequence Alignment

17.

Assembling short reads from jumping libraries with large insert sizes.

Vasilinetc, Irina; Prjibelski, Andrey D; Gurevich, Alexey; Korobeynikov, Anton; Pevzner, Pavel A.

Bioinformatics ; 31(20): 3262-8, 2015 Oct 15.

Article in English | MEDLINE | ID: mdl-26040456

ABSTRACT

MOTIVATION: Advances in Next-Generation Sequencing technologies and sample preparation recently enabled generation of high-quality jumping libraries that have a potential to significantly improve short read assemblies. However, assembly algorithms have to catch up with experimental innovations to benefit from them and to produce high-quality assemblies. RESULTS: We present a new algorithm that extends recently described exSPAnder universal repeat resolution approach to enable its applications to several challenging data types, including jumping libraries generated by the recently developed Illumina Nextera Mate Pair protocol. We demonstrate that, with these improvements, bacterial genomes often can be assembled in a few contigs using only a single Nextera Mate Pair library of short reads. AVAILABILITY AND IMPLEMENTATION: Described algorithms are implemented in C++ as a part of SPAdes genome assembler, which is freely available at bioinf.spbau.ru/en/spades. CONTACT: ap@bioinf.spbau.ru SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Algorithms , Gene Library , Genomics/methods , Bacterial Genome , High-Throughput Nucleotide Sequencing/methods , DNA Sequence Analysis/methods

18.

A novel uncultured heterotrophic bacterial associate of the cyanobacterium Moorea producens JHB.

Cummings, Milo E; Barbé, Debby; Leao, Tiago Ferreira; Korobeynikov, Anton; Engene, Niclas; Glukhov, Evgenia; Gerwick, William H; Gerwick, Lena.

BMC Microbiol ; 16(1): 198, 2016 Aug 30.

Article in English | MEDLINE | ID: mdl-27577966

ABSTRACT

BACKGROUND: Filamentous tropical marine cyanobacteria such as Moorea producens strain JHB possess a rich community of heterotrophic bacteria on their polysaccharide sheaths; however, these bacterial communities have not yet been adequately studied or characterized. RESULTS AND DISCUSSION: Through efforts to sequence the genome of this cyanobacterial strain, the 5.99 Mb genome of an unknown bacterium, named here Mor1, emerged from the metagenomic information. Analysis of its genome revealed that the bacterium is heterotrophic and belongs to the phylum Acidobacteria, subgroup 22; however, it is only 85% identical to the nearest cultured representative. Comparative genomics further revealed that Mor1 has a large number of genes involved in transcriptional regulation, is completely devoid of transposases, is not able to synthesize the full complement of proteinogenic amino acids and appears to lack genes for nitrate uptake. Mor1 was found to be present in lab cultures of M. producens collected from various locations, but not in other cyanobacterial species. Diverse efforts failed to culture the bacterium separately from filaments of M. producens JHB. Additionally, a co-culturing experiment between M. producens JHB possessing Mor1 and cultures of other genera of cyanobacteria indicated that the bacterium was not transferable. CONCLUSION: The data presented support a specific relationship between this novel uncultured bacterium and M. producens; however, this proposed relationship cannot be verified until the "uncultured" bacterium can be cultured.

Subject(s)

Cyanobacteria/classification , Cyanobacteria/genetics , Seawater/microbiology , Acidobacteria/classification , Acidobacteria/genetics , Base Sequence , Coculture Techniques , Cyanobacteria/metabolism , Bacterial DNA/genetics , Bacterial Genome , Heterotrophic Processes , Marine Biology , Metagenomics , Microbial Consortia , Transmission Electron Microscopy , Nitrates/metabolism , Nitrogen/metabolism , Phylogeny , Bacterial Polysaccharides/metabolism , Proteogenomics , 16S Ribosomal RNA/genetics

19.

ExSPAnder: a universal repeat resolver for DNA fragment assembly.

Prjibelski, Andrey D; Vasilinetc, Irina; Bankevich, Anton; Gurevich, Alexey; Krivosheeva, Tatiana; Nurk, Sergey; Pham, Son; Korobeynikov, Anton; Lapidus, Alla; Pevzner, Pavel A.

Bioinformatics ; 30(12): i293-301, 2014 Jun 15.

Article in English | MEDLINE | ID: mdl-24931996

ABSTRACT

Next-generation sequencing (NGS) technologies have raised a challenging de novo genome assembly problem that is further amplified in recently emerged single-cell sequencing projects. While various NGS assemblers can use information from several libraries of read-pairs, most of them were originally developed for a single library and do not fully benefit from multiple libraries. Moreover, most assemblers assume uniform read coverage, a condition that does not hold for single-cell projects, where utilization of read-pairs is even more challenging. We have developed an exSPAnder algorithm that accurately resolves repeats in the case of both single and multiple libraries of read-pairs in both standard and single-cell assembly projects. AVAILABILITY AND IMPLEMENTATION: http://bioinf.spbau.ru/en/spades

Subject(s)

Algorithms , High-Throughput Nucleotide Sequencing/methods , DNA Sequence Analysis/methods , Actinomycetales/genetics , DNA/chemistry , Gene Library , Bacterial Genome , Humans , Repetitive Nucleic Acid Sequences , Staphylococcus aureus/genetics

20.

Spongosine production by a Vibrio harveyi strain associated with the sponge Tectitethya crypta.

Bertin, Matthew J; Schwartz, Sarah L; Lee, John; Korobeynikov, Anton; Dorrestein, Pieter C; Gerwick, Lena; Gerwick, William H.

J Nat Prod ; 78(3): 493-9, 2015 Mar 27.

Article in English | MEDLINE | ID: mdl-25668560

ABSTRACT

Spongosine (1), deoxyspongosine (2), spongothymidine (Ara T) (3), and spongouridine (Ara U) were isolated from the Caribbean sponge Tectitethya crypta and given the general name "spongonucleosides". Spongosine, a methoxyadenosine derivative, has demonstrated a diverse bioactivity profile including anti-inflammatory activity and analgesic and vasodilation properties. Investigations into unusual nucleoside production by T. crypta-associated microorganisms using mass spectrometric techniques have identified a spongosine-producing strain of Vibrio harveyi and several structurally related compounds from multiple strains.

Subject(s)

Adenosine/analogs & derivatives , Porifera/microbiology , Vibrio/chemistry , Adenosine/chemistry , Adenosine/pharmacology , Animals , Caribbean Region , Molecular Structure
