Search | VHL Regional Portal

NPvis: An Interactive Visualizer of Peptidic Natural Product-MS/MS Matches.

Kunyavskaya, Olga; Mikheenko, Alla; Gurevich, Alexey.

Metabolites ; 12(8), 2022 Jul 29.

Article in English | MEDLINE | ID: mdl-36005578

ABSTRACT

Peptidic natural products (PNPs) represent a medically important class of secondary metabolites that includes antibiotics, anti-inflammatory and antitumor agents. Advances in tandem mass spectra (MS/MS) acquisition and in silico database search methods have enabled high-throughput PNP discovery. However, the resulting spectra annotations are often error-prone and their validation remains a bottleneck. Here, we present NPvis, a visualizer suitable for the evaluation of PNP-MS/MS matches. The tool interactively maps annotated spectrum peaks to the corresponding PNP fragments and allows researchers to assess the match correctness. NPvis accounts for the wide chemical diversity of PNPs that prevents the use of the existing proteomics visualizers. Moreover, NPvis works even if the exact chemical structure of the matching PNP is unknown. The tool is available online and as a standalone application. We hope that it will benefit the community by streamlining PNP data analysis and validation.

Automated annotation of human centromeres with HORmon.

Kunyavskaya, Olga; Dvorkina, Tatiana; Bzikadze, Andrey V; Alexandrov, Ivan A; Pevzner, Pavel A.

Genome Res ; 32(6): 1137-1151, 2022 06.

Article in English | MEDLINE | ID: mdl-35545449

ABSTRACT

Recent advances in long-read sequencing opened a possibility to address the long-standing questions about the architecture and evolution of human centromeres. They also emphasized the need for centromere annotation (partitioning human centromeres into monomers and higher-order repeats [HORs]). Although there was a half-century-long series of semi-manual studies of centromere architecture, a rigorous centromere annotation algorithm is still lacking. Moreover, an automated centromere annotation is a prerequisite for studies of genetic diseases associated with centromeres and evolutionary studies of centromeres across multiple species. Although the monomer decomposition (transforming a centromere into a monocentromere written in the monomer alphabet) and the HOR decomposition (representing a monocentromere in the alphabet of HORs) are currently viewed as two separate problems, we show that they should be integrated into a single framework in such a way that HOR (monomer) inference affects monomer (HOR) inference. We thus developed the HORmon algorithm that integrates the monomer/HOR inference and automatically generates the human monomers/HORs that are largely consistent with the previous semi-manual inference.

Subjects

Algorithms , Centromere , Centromere/genetics , Humans
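
The HOR decomposition mentioned in the HORmon abstract above (representing a monocentromere in the alphabet of HORs) can be illustrated with a toy sketch. This is not the HORmon algorithm: it assumes the monomers and a single canonical HOR are already known, treats each character of a string as one monomer, and greedily compresses exact HOR copies. The function name `hor_decompose` and the example strings are hypothetical.

```python
def hor_decompose(monocentromere, hor):
    """Toy HOR decomposition (illustrative assumption, not HORmon itself).

    monocentromere: string in which each character is one monomer.
    hor: string giving the canonical higher-order repeat as a monomer sequence.
    Returns a list of (unit, count) pairs, where a unit is either the full HOR
    or a single monomer that is not part of an exact HOR copy.
    """
    if not hor:
        raise ValueError("hor must contain at least one monomer")

    units = []
    k = len(hor)
    i = 0
    while i < len(monocentromere):
        if monocentromere[i:i + k] == hor:
            unit = hor                  # exact canonical HOR copy
            i += k
        else:
            unit = monocentromere[i]    # stray monomer, kept as its own unit
            i += 1
        # Merge consecutive identical units into a single run with a count.
        if units and units[-1][0] == unit:
            units[-1] = (unit, units[-1][1] + 1)
        else:
            units.append((unit, 1))
    return units


if __name__ == "__main__":
    # Two canonical copies of "ABC", a partial copy with a hybrid-like
    # monomer "D", then one more canonical copy.
    print(hor_decompose("ABCABCABDABC", "ABC"))
```

A real annotation tool has to infer the monomers and HORs themselves and tolerate sequence divergence; the sketch only shows what "rewriting in the alphabet of HORs" means once those are given.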

Complete genomic and epigenetic maps of human centromeres.

Altemose, Nicolas; Logsdon, Glennis A; Bzikadze, Andrey V; Sidhwani, Pragya; Langley, Sasha A; Caldas, Gina V; Hoyt, Savannah J; Uralsky, Lev; Ryabov, Fedor D; Shew, Colin J; Sauria, Michael E G; Borchers, Matthew; Gershman, Ariel; Mikheenko, Alla; Shepelev, Valery A; Dvorkina, Tatiana; Kunyavskaya, Olga; Vollger, Mitchell R; Rhie, Arang; McCartney, Ann M; Asri, Mobin; Lorig-Roach, Ryan; Shafin, Kishwar; Lucas, Julian K; Aganezov, Sergey; Olson, Daniel; de Lima, Leonardo Gomes; Potapova, Tamara; Hartley, Gabrielle A; Haukness, Marina; Kerpedjiev, Peter; Gusev, Fedor; Tigyi, Kristof; Brooks, Shelise; Young, Alice; Nurk, Sergey; Koren, Sergey; Salama, Sofie R; Paten, Benedict; Rogaev, Evgeny I; Streets, Aaron; Karpen, Gary H; Dernburg, Abby F; Sullivan, Beth A; Straight, Aaron F; Wheeler, Travis J; Gerton, Jennifer L; Eichler, Evan E; Phillippy, Adam M; Timp, Winston.

Science ; 376(6588): eabl4178, 2022 04.

Article in English | MEDLINE | ID: mdl-35357911

ABSTRACT

Existing human genome assemblies have almost entirely excluded repetitive sequences within and near centromeres, limiting our understanding of their organization, evolution, and functions, which include facilitating proper chromosome segregation. Now, a complete, telomere-to-telomere human genome assembly (T2T-CHM13) has enabled us to comprehensively characterize pericentromeric and centromeric repeats, which constitute 6.2% of the genome (189.9 megabases). Detailed maps of these regions revealed multimegabase structural rearrangements, including in active centromeric repeat arrays. Analysis of centromere-associated sequences uncovered a strong relationship between the position of the centromere and the evolution of the surrounding DNA through layered repeat expansions. Furthermore, comparisons of chromosome X centromeres across a diverse panel of individuals illuminated high degrees of structural, epigenetic, and sequence variation in these complex and rapidly evolving regions.

Subjects

Centromere/genetics , Chromosome Mapping , Genetic Epigenesis , Human Genome , Molecular Evolution , Genomics , Humans , Nucleic Acid Repetitive Sequences

Nerpa: A Tool for Discovering Biosynthetic Gene Clusters of Bacterial Nonribosomal Peptides.

Kunyavskaya, Olga; Tagirdzhanov, Azat M; Caraballo-Rodríguez, Andrés Mauricio; Nothias, Louis-Félix; Dorrestein, Pieter C; Korobeynikov, Anton; Mohimani, Hosein; Gurevich, Alexey.

Metabolites ; 11(10), 2021 Oct 11.

Article in English | MEDLINE | ID: mdl-34677408

ABSTRACT

Microbial natural products are a major source of bioactive compounds for drug discovery. Among these molecules, nonribosomal peptides (NRPs) represent a diverse class of natural products that include antibiotics, immunosuppressants, and anticancer agents. Recent breakthroughs in natural product discovery have revealed the chemical structure of several thousand NRPs. However, biosynthetic gene clusters (BGCs) encoding them are known only for a few hundred compounds. Here, we developed Nerpa, a computational method for the high-throughput discovery of novel BGCs responsible for producing known NRPs. After searching 13,399 representative bacterial genomes from the RefSeq repository against 8368 known NRPs, Nerpa linked 117 BGCs to their products. We further experimentally validated the predicted BGC of ngercheumicin from Photobacterium galatheae via mass spectrometry. Nerpa supports searching new genomes against thousands of known NRP structures, and novel molecular structures against tens of thousands of bacterial genomes. The availability of these tools can enhance our understanding of NRP synthesis and the function of their biosynthetic enzymes.

CentromereArchitect: inference and analysis of the architecture of centromeres.

Dvorkina, Tatiana; Kunyavskaya, Olga; Bzikadze, Andrey V; Alexandrov, Ivan; Pevzner, Pavel A.

Bioinformatics ; 37(Suppl_1): i196-i204, 2021 07 12.

Article in English | MEDLINE | ID: mdl-34252949

ABSTRACT

MOTIVATION: Recent advances in long-read sequencing technologies led to rapid progress in centromere assembly in the last year and, for the first time, opened a possibility to address the long-standing questions about the architecture and evolution of human centromeres. However, since these advances have not yet been accompanied by the development of centromere-specific bioinformatics algorithms, even the fundamental questions (e.g. centromere annotation by deriving the complete set of human monomers and high-order repeats), let alone more complex questions (e.g. explaining how monomers and high-order repeats evolved) about human centromeres remain open. Moreover, even though there was a four-decade-long series of studies aimed at cataloging all human monomers and high-order repeats, rigorous algorithmic definitions of these concepts are still lacking. Thus, the development of a centromere annotation tool is a prerequisite for follow-up personalized biomedical studies of centromeres across the human population and evolutionary studies of centromeres across various species. RESULTS: We describe CentromereArchitect, the first tool for centromere annotation in a newly sequenced genome, apply it to the recently generated complete assembly of a human genome by the Telomere-to-Telomere consortium, generate the complete set of human monomers and high-order repeats for 'live' centromeres, and reveal a vast set of hybrid monomers that may represent the focal points of centromere evolution. AVAILABILITY AND IMPLEMENTATION: CentromereArchitect is publicly available on https://github.com/ablab/stringdecomposer/tree/ismb2021. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Centromere , Genome , Algorithms , Base Sequence , Centromere/genetics , Humans , Telomere

MGnify: the microbiome analysis resource in 2020.

Mitchell, Alex L; Almeida, Alexandre; Beracochea, Martin; Boland, Miguel; Burgin, Josephine; Cochrane, Guy; Crusoe, Michael R; Kale, Varsha; Potter, Simon C; Richardson, Lorna J; Sakharova, Ekaterina; Scheremetjew, Maxim; Korobeynikov, Anton; Shlemov, Alex; Kunyavskaya, Olga; Lapidus, Alla; Finn, Robert D.

Nucleic Acids Res ; 48(D1): D570-D578, 2020 01 08.

Article in English | MEDLINE | ID: mdl-31696235

ABSTRACT

MGnify (http://www.ebi.ac.uk/metagenomics) provides a free to use platform for the assembly, analysis and archiving of microbiome data derived from sequencing microbial populations that are present in particular environments. Over the past 2 years, MGnify (formerly EBI Metagenomics) has more than doubled the number of publicly available analysed datasets held within the resource. Recently, an updated approach to data analysis has been unveiled (version 5.0), replacing the previous single pipeline with multiple analysis pipelines that are tailored according to the input data, and that are formally described using the Common Workflow Language, enabling greater provenance, reusability, and reproducibility. MGnify's new analysis pipelines offer additional approaches for taxonomic assertions based on ribosomal internal transcribed spacer regions (ITS1/2) and expanded protein functional annotations. Biochemical pathways and systems predictions have also been added for assembled contigs. MGnify's growing focus on the assembly of metagenomic data has also seen the number of datasets it has assembled and analysed increase six-fold. The non-redundant protein database constructed from the proteins encoded by these assemblies now exceeds 1 billion sequences. Meanwhile, a newly developed contig viewer provides fine-grained visualisation of the assembled contigs and their enriched annotations.

Subjects

Metagenome , Microbiota , Phylogeny , Software , Archaea/classification , Archaea/genetics , Bacteria/classification , Bacteria/genetics , Ribosomal Spacer DNA/genetics , Genetic Databases , Metagenomics/methods

SGTK: a toolkit for visualization and assessment of scaffold graphs.

Kunyavskaya, Olga; Prjibelski, Andrey D.

Bioinformatics ; 35(13): 2303-2305, 2019 07 01.

Article in English | MEDLINE | ID: mdl-30475983

ABSTRACT

SUMMARY: Scaffolding is an important step in every genome assembly pipeline: it orders contigs into longer sequences using various types of linkage information, such as mate-pair libraries and long reads. In this work, we operate with the notion of a scaffold graph, a graph whose vertices correspond to the assembled contigs and whose edges represent connections between them. We present a software package called Scaffold Graph ToolKit that allows users to construct and visualize scaffold graphs using different kinds of sequencing data. We show that the scaffold graph is useful for analyzing and assessing genome assemblies, and demonstrate several use cases that can be helpful for both assembly software developers and their users. AVAILABILITY AND IMPLEMENTATION: SGTK is implemented in C++, Python and JavaScript and is freely available at https://github.com/olga24912/SGTK. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Software , DNA Sequence Analysis
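
The scaffold graph defined in the SGTK abstract above (contigs as vertices, connections between them as edges) can be sketched as a small adjacency structure. This is a minimal illustration under stated assumptions, not SGTK's implementation or API: the function names, the `(source, target, evidence, weight)` link format, and the example data are all hypothetical.

```python
def build_scaffold_graph(contigs, links):
    """Build a directed scaffold graph as an adjacency dictionary.

    contigs: iterable of contig names (the vertices).
    links: iterable of (source, target, evidence, weight) tuples, where
           evidence is a free-text label such as "mate-pair" or "long-reads"
           (hypothetical format chosen for this sketch).
    """
    graph = {name: [] for name in contigs}
    for source, target, evidence, weight in links:
        if source not in graph or target not in graph:
            raise KeyError(f"link refers to unknown contig: {source} -> {target}")
        graph[source].append((target, evidence, weight))
    return graph


def branching_contigs(graph):
    """Return contigs with more than one distinct successor.

    Such vertices mark places where the linkage information is ambiguous
    and a scaffolder has to choose between alternatives.
    """
    result = []
    for name, edges in graph.items():
        successors = {target for target, _, _ in edges}
        if len(successors) > 1:
            result.append(name)
    return sorted(result)


if __name__ == "__main__":
    contigs = ["c1", "c2", "c3"]
    links = [
        ("c1", "c2", "mate-pair", 12),
        ("c1", "c3", "long-reads", 3),
        ("c2", "c3", "long-reads", 5),
    ]
    graph = build_scaffold_graph(contigs, links)
    print(branching_contigs(graph))
```

Keeping the evidence label on each edge lets connections from different data sources be compared side by side, which is the kind of assessment the abstract describes.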
