Results 1 - 7 of 7
1.
Nat Methods ; 17(9): 905-908, 2020 09.
Article in English | MEDLINE | ID: mdl-32839597

ABSTRACT

Molecular networking has become a key method to visualize and annotate the chemical space in non-targeted mass spectrometry data. We present feature-based molecular networking (FBMN) as an analysis method in the Global Natural Products Social Molecular Networking (GNPS) infrastructure that builds on chromatographic feature detection and alignment tools. FBMN enables quantitative analysis and resolution of isomers, including from ion mobility spectrometry.


Subjects
Biological Products/chemistry ; Mass Spectrometry ; Computational Biology/methods ; Databases, Factual ; Metabolomics/methods ; Software
2.
Nat Methods ; 13(9): 741-8, 2016 08 30.
Article in English | MEDLINE | ID: mdl-27575624

ABSTRACT

High-resolution mass spectrometry (MS) has become an important tool in the life sciences, contributing to the diagnosis and understanding of human diseases, elucidating biomolecular structural information and characterizing cellular signaling networks. However, the rapid growth in the volume and complexity of MS data makes transparent, accurate and reproducible analysis difficult. We present OpenMS 2.0 (http://www.openms.de), a robust, open-source, cross-platform software specifically designed for the flexible and reproducible analysis of high-throughput MS data. The extensible OpenMS software implements common mass spectrometric data processing tasks through a well-defined application programming interface in C++ and Python and through standardized open data formats. OpenMS additionally provides a set of 185 tools and ready-made workflows for common mass spectrometric data processing tasks, which enable users to perform complex quantitative mass spectrometric analyses with ease.


Subjects
Computational Biology/methods ; Electronic Data Processing ; Mass Spectrometry/methods ; Proteomics/methods ; Software ; Aging/blood ; Blood Proteins/chemistry ; Humans ; Molecular Sequence Annotation ; Proteogenomics/methods ; Workflow
3.
Bioinformatics ; 33(20): 3202-3210, 2017 Oct 15.
Article in English | MEDLINE | ID: mdl-28633438

ABSTRACT

SUMMARY: Nonribosomally synthesized peptides (NRPs) are natural products with widespread applications in medicine and biotechnology. Many algorithms have been developed to predict the substrate specificities of nonribosomal peptide synthetase adenylation (A) domains from DNA sequences, which enables prioritization and dereplication, and integration with other data types in discovery efforts. However, insufficient training data and a lack of clarity regarding prediction quality have impeded optimal use. Here, we introduce prediCAT, a new phylogenetics-inspired algorithm, which quantitatively estimates the degree of predictability of each A-domain. We then systematically benchmarked all algorithms on a newly gathered, independent test set of 434 A-domain sequences, showing that active-site-motif-based algorithms outperform whole-domain-based methods. Subsequently, we developed SANDPUMA, a powerful ensemble algorithm, based on newly trained versions of all high-performing algorithms, which significantly outperforms individual methods. Finally, we deployed SANDPUMA in a systematic investigation of 7635 Actinobacteria genomes, suggesting that NRP chemical diversity is much higher than previously estimated. SANDPUMA has been integrated into the widely used antiSMASH biosynthetic gene cluster analysis pipeline and is also available as an open-source, standalone tool. AVAILABILITY AND IMPLEMENTATION: SANDPUMA is freely available at https://bitbucket.org/chevrm/sandpuma and as a docker image at https://hub.docker.com/r/chevrm/sandpuma/ under the GNU Public License 3 (GPL3). CONTACT: chevrette@wisc.edu or marnix.medema@wur.nl. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
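The abstract does not say how SANDPUMA combines its component predictors, so the following is only a generic illustration of the ensemble idea, using a plain majority vote as a stand-in. The function name, the predictor names, the substrate labels, and the `min_agreement` threshold are all hypothetical and are not taken from SANDPUMA.

```python
from collections import Counter
from typing import Optional


def ensemble_call(predictions: dict, min_agreement: int = 2) -> Optional[str]:
    """Combine per-method substrate predictions by majority vote.

    `predictions` maps a (hypothetical) method name to its predicted
    substrate, or to None when that method makes no call.
    Returns the winning substrate, or None when no substrate reaches
    `min_agreement` votes or when the top two substrates are tied.
    """
    votes = Counter(p for p in predictions.values() if p is not None)
    if not votes:
        return None
    top = votes.most_common(2)
    substrate, count = top[0]
    tied = len(top) > 1 and top[1][1] == count
    if count < min_agreement or tied:
        return None
    return substrate


# Hypothetical calls for two A-domains.
domain_1 = {"method_a": "val", "method_b": "val", "method_c": "leu", "method_d": None}
domain_2 = {"method_a": "val", "method_b": "leu"}

call_1 = ensemble_call(domain_1)
call_2 = ensemble_call(domain_2)
```

Returning `None` rather than forcing a call keeps low-agreement domains visible as "no prediction", which is in the spirit of the abstract's concern about clarity of prediction quality.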


Subjects
Actinobacteria/metabolism ; Algorithms ; Computational Biology/methods ; Peptide Synthases/metabolism ; Peptides/metabolism ; Sequence Analysis, Protein/methods ; Actinobacteria/enzymology ; Actinobacteria/genetics ; Catalytic Domain ; Multigene Family ; Software ; Substrate Specificity
4.
J Proteome Res ; 15(9): 3441-8, 2016 09 02.
Article in English | MEDLINE | ID: mdl-27476824

ABSTRACT

Modern mass spectrometry setups used in today's proteomics studies generate vast amounts of raw data, calling for highly efficient data processing and analysis tools. Software for analyzing these data is either monolithic (easy to use, but sometimes too rigid) or workflow-driven (easy to customize, but sometimes complex). Thermo Proteome Discoverer (PD) is a powerful software for workflow-driven data analysis in proteomics which, in our eyes, achieves a good trade-off between flexibility and usability. Here, we present two open-source plugins for PD providing additional functionality: LFQProfiler for label-free quantification of peptides and proteins, and RNP(xl) for UV-induced peptide-RNA cross-linking data analysis. LFQProfiler interacts with existing PD nodes for peptide identification and validation and takes care of the entire quantitative part of the workflow. We show that it performs at least on par with other state-of-the-art software solutions for label-free quantification in a recently published benchmark (Ramus, C.; J. Proteomics 2016, 132, 51-62). The second workflow, RNP(xl), represents the first software solution to date for identification of peptide-RNA cross-links including automatic localization of the cross-links at amino acid resolution and localization scoring. It comes with a customized integrated cross-link fragment spectrum viewer for convenient manual inspection and validation of the results.


Subjects
Proteome/analysis ; Proteomics/methods ; Software ; Proteins/metabolism ; RNA/metabolism ; Workflow
5.
Hum Mutat ; 36(5): 513-23, 2015 May.
Article in English | MEDLINE | ID: mdl-25684150

ABSTRACT

Prioritizing missense variants for further experimental investigation is a key challenge in current sequencing studies for exploring complex and Mendelian diseases. A large number of in silico tools have been employed for the task of pathogenicity prediction, including PolyPhen-2, SIFT, FatHMM, MutationTaster-2, MutationAssessor, Combined Annotation Dependent Depletion, LRT, phyloP, and GERP++, as well as optimized methods of combining tool scores, such as Condel and Logit. Due to the wealth of these methods, an important practical question to answer is which of these tools generalize best, that is, correctly predict the pathogenic character of new variants. We here demonstrate in a study of 10 tools on five datasets that such a comparative evaluation of these tools is hindered by two types of circularity: they arise due to (1) the same variants or (2) different variants from the same protein occurring both in the datasets used for training and for evaluation of these tools, which may lead to overly optimistic results. We show that comparative evaluations of predictors that do not address these types of circularity may erroneously conclude that circularity confounded tools are most accurate among all tools, and may even outperform optimized combinations of tools.
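The two types of circularity described above can be made concrete with a small filtering sketch. It shows one possible way to build evaluation sets free of type 1 overlap (same variant in training and evaluation data) and type 2 overlap (different variants from a protein already seen in training). The `Variant` record, its field names, and the example data are hypothetical; the abstract does not specify how the authors handled this in practice.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    protein: str      # hypothetical protein identifier
    change: str       # amino acid substitution, e.g. "A10V"
    pathogenic: bool  # label used for evaluation


def remove_circular(train, evaluation):
    """Return two evaluation subsets.

    type1_free: evaluation variants that do not appear in the training set.
    type2_free: the subset of type1_free whose protein also does not
                appear in the training set.
    """
    train_variants = {(v.protein, v.change) for v in train}
    train_proteins = {v.protein for v in train}
    type1_free = [v for v in evaluation
                  if (v.protein, v.change) not in train_variants]
    type2_free = [v for v in type1_free
                  if v.protein not in train_proteins]
    return type1_free, type2_free


# Made-up data for illustration only.
train = [Variant("P1", "A10V", True), Variant("P2", "G5R", False)]
evaluation = [
    Variant("P1", "A10V", True),   # type 1 overlap: identical variant
    Variant("P1", "L20P", True),   # type 2 overlap: same protein, new variant
    Variant("P3", "K7E", False),   # no overlap with training data
]

type1_free, type2_free = remove_circular(train, evaluation)
```

Comparing a tool's accuracy on the full evaluation set against its accuracy on `type1_free` and `type2_free` would give a rough indication of how much of its apparent performance depends on overlap with its training data.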


Subjects
Computational Biology/methods ; Mutation, Missense ; Software ; Datasets as Topic ; Humans ; Internet ; Reproducibility of Results ; Web Browser
6.
Anal Chem ; 87(15): 7698-704, 2015 Aug 04.
Article in English | MEDLINE | ID: mdl-26145158

ABSTRACT

Identification of lipids in nontargeted lipidomics based on liquid chromatography coupled to mass spectrometry (LC-MS) is still a major issue. While both accurate mass and fragment spectra contain valuable information, retention time (tR) information can be used to augment these data. We present a retention time model based on machine learning approaches which enables an improved assignment of lipid structures and automated annotation of lipidomics data. In contrast to common approaches, we used a complex mixture of 201 lipids originating from fat tissue instead of a standard mixture to train a support vector regression (SVR) model including molecular structural features. The cross-validated model achieves a correlation coefficient between predicted and experimental test sample retention times of r = 0.989. Combining our retention time model with identification via accurate mass search (AMS) of lipids against the comprehensive LIPID MAPS database, retention time filtering can significantly reduce the rate of false positives in complex data sets like adipose tissue extracts. In our case, filtering with retention time information removed more than half of the potential identifications, while retaining 95% of the correct identifications. Combination of high-precision retention time prediction and accurate mass can thus significantly narrow down the number of hypotheses to be assessed for lipid identification in complex lipid patterns such as tissue profiles.
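The combination of accurate mass search and retention time filtering can be sketched as a two-stage filter. This is a minimal illustration, not the authors' implementation: the predicted retention times are supplied as plain numbers rather than coming from a trained SVR model, and the candidate names, masses, and tolerance values (`ppm_tol`, `rt_tol`) are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str            # hypothetical database entry
    mass: float          # theoretical mass of the candidate ion (Da)
    predicted_rt: float  # retention time predicted by some model (min)


def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def filter_candidates(observed_mass, observed_rt, candidates,
                      ppm_tol=5.0, rt_tol=0.5):
    """Stage 1: keep candidates within the mass tolerance.
    Stage 2: of those, keep candidates whose predicted retention time
    lies within rt_tol minutes of the observed retention time."""
    mass_hits = [c for c in candidates
                 if abs(ppm_error(observed_mass, c.mass)) <= ppm_tol]
    rt_hits = [c for c in mass_hits
               if abs(observed_rt - c.predicted_rt) <= rt_tol]
    return mass_hits, rt_hits


# Made-up candidates: lipid_A and lipid_B are isomers with identical mass.
candidates = [
    Candidate("lipid_A", 760.5851, 12.1),
    Candidate("lipid_B", 760.5851, 14.8),
    Candidate("lipid_C", 786.6007, 12.2),
]

mass_hits, rt_hits = filter_candidates(760.5860, 12.3, candidates)
```

In this toy case the mass search alone cannot separate the two isomers, while the retention time stage keeps only the candidate whose predicted elution time matches the observation.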


Subjects
Chemistry Techniques, Analytical/methods ; Lipids/analysis ; Chromatography, Liquid ; Lipids/chemistry ; Mass Spectrometry
7.
Methods Mol Biol ; 2104: 49-60, 2020.
Article in English | MEDLINE | ID: mdl-31953812

ABSTRACT

This chapter describes the open-source tool suite OpenMS. OpenMS contains more than 180 tools which can be combined to build complex and flexible data-processing workflows. The broad range of functionality and the interoperability of these tools enable complex, complete, and reproducible data analysis workflows in computational proteomics and metabolomics. We introduce the key concepts of OpenMS and illustrate its capabilities with a complete workflow for the analysis of untargeted metabolomics data, including metabolite quantification and identification.


Subjects
Computational Biology/methods ; Data Interpretation, Statistical ; Metabolomics ; Software ; Algorithms ; Databases, Factual ; Humans ; Metabolomics/methods ; Proteomics/methods ; Web Browser ; Workflow