Results 1 - 7 of 7
1.
Nat Methods ; 17(9): 905-908, 2020 09.
Article in English | MEDLINE | ID: mdl-32839597

ABSTRACT

Molecular networking has become a key method to visualize and annotate the chemical space in non-targeted mass spectrometry data. We present feature-based molecular networking (FBMN) as an analysis method in the Global Natural Products Social Molecular Networking (GNPS) infrastructure that builds on chromatographic feature detection and alignment tools. FBMN enables quantitative analysis and resolution of isomers, including from ion mobility spectrometry.


Subject(s)
Biological Products/chemistry , Mass Spectrometry , Computational Biology/methods , Databases, Factual , Metabolomics/methods , Software
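
As a rough illustration of the molecular networking idea in the abstract above, the sketch below links features whose fragment spectra are similar. It is not the GNPS or FBMN implementation: the fixed-width binning, the plain cosine score, the 0.7 threshold, and the feature names and peak values are all assumptions made for this example.

```python
import math
from itertools import combinations


def bin_spectrum(peaks, bin_width=0.01):
    """Sum peak intensities into fixed-width m/z bins (bin_width is an assumed value)."""
    binned = {}
    for mz, intensity in peaks:
        key = round(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_similarity(peaks_a, peaks_b, bin_width=0.01):
    """Cosine score between two binned fragment spectra, in the range 0 to 1."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_network(features, min_cosine=0.7):
    """Return (id_a, id_b, score) edges for every feature pair above the threshold."""
    edges = []
    for (id_a, peaks_a), (id_b, peaks_b) in combinations(sorted(features.items()), 2):
        score = cosine_similarity(peaks_a, peaks_b)
        if score >= min_cosine:
            edges.append((id_a, id_b, round(score, 3)))
    return edges


# Hypothetical features: each maps to a list of (m/z, intensity) fragment peaks.
features = {
    "feature_1": [(100.05, 50.0), (150.10, 100.0), (200.20, 25.0)],
    "feature_2": [(100.05, 45.0), (150.10, 90.0), (200.20, 30.0)],
    "feature_3": [(85.03, 100.0), (310.25, 60.0)],
}
edges = build_network(features)
print(edges)
```

In this toy input only feature_1 and feature_2 share fragment peaks, so they are the only pair connected. Because each node is a chromatographic feature rather than a bare spectrum, two isomers with near-identical spectra but different retention times would remain separate nodes, which is the property the abstract highlights.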
2.
Methods Mol Biol ; 2104: 49-60, 2020.
Article in English | MEDLINE | ID: mdl-31953812

ABSTRACT

This chapter describes the open-source tool suite OpenMS. OpenMS contains more than 180 tools which can be combined to build complex and flexible data-processing workflows. The broad range of functionality and the interoperability of these tools enable complex, complete, and reproducible data analysis workflows in computational proteomics and metabolomics. We introduce the key concepts of OpenMS and illustrate its capabilities with a complete workflow for the analysis of untargeted metabolomics data, including metabolite quantification and identification.


Subject(s)
Computational Biology/methods , Data Interpretation, Statistical , Metabolomics , Software , Algorithms , Databases, Factual , Humans , Metabolomics/methods , Proteomics/methods , Web Browser , Workflow
3.
Bioinformatics ; 33(20): 3202-3210, 2017 Oct 15.
Article in English | MEDLINE | ID: mdl-28633438

ABSTRACT

SUMMARY: Nonribosomally synthesized peptides (NRPs) are natural products with widespread applications in medicine and biotechnology. Many algorithms have been developed to predict the substrate specificities of nonribosomal peptide synthetase adenylation (A) domains from DNA sequences, which enables prioritization and dereplication, and integration with other data types in discovery efforts. However, insufficient training data and a lack of clarity regarding prediction quality have impeded optimal use. Here, we introduce prediCAT, a new phylogenetics-inspired algorithm, which quantitatively estimates the degree of predictability of each A-domain. We then systematically benchmarked all algorithms on a newly gathered, independent test set of 434 A-domain sequences, showing that active-site-motif-based algorithms outperform whole-domain-based methods. Subsequently, we developed SANDPUMA, a powerful ensemble algorithm, based on newly trained versions of all high-performing algorithms, which significantly outperforms individual methods. Finally, we deployed SANDPUMA in a systematic investigation of 7635 Actinobacteria genomes, suggesting that NRP chemical diversity is much higher than previously estimated. SANDPUMA has been integrated into the widely used antiSMASH biosynthetic gene cluster analysis pipeline and is also available as an open-source, standalone tool. AVAILABILITY AND IMPLEMENTATION: SANDPUMA is freely available at https://bitbucket.org/chevrm/sandpuma and as a docker image at https://hub.docker.com/r/chevrm/sandpuma/ under the GNU Public License 3 (GPL3). CONTACT: chevrette@wisc.edu or marnix.medema@wur.nl. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Actinobacteria/metabolism , Algorithms , Computational Biology/methods , Peptide Synthases/metabolism , Peptides/metabolism , Sequence Analysis, Protein/methods , Actinobacteria/enzymology , Actinobacteria/genetics , Catalytic Domain , Multigene Family , Software , Substrate Specificity
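
The abstract above describes an ensemble built from several individual predictors but does not say how their outputs are combined. The sketch below shows one generic way to do it, a simple majority vote with an abstention rule. It is not SANDPUMA's actual combination scheme, and the predictor names, substrate labels, and the `min_agreement` parameter are hypothetical.

```python
from collections import Counter


def ensemble_predict(predictions, min_agreement=0.5):
    """Return the majority substrate call, or None if no call wins a clear majority.

    predictions maps a predictor name to its call; None means that
    predictor abstained for this A-domain.
    """
    calls = [call for call in predictions.values() if call is not None]
    if not calls:
        return None
    substrate, votes = Counter(calls).most_common(1)[0]
    # Require strictly more than min_agreement of the non-abstaining predictors.
    if votes / len(calls) > min_agreement:
        return substrate
    return None


# Hypothetical calls from four predictors for one A-domain.
clear_case = {
    "motif_based_a": "val",
    "motif_based_b": "val",
    "whole_domain": "leu",
    "phylogeny_based": None,
}
tied_case = {"motif_based_a": "val", "whole_domain": "leu"}

print(ensemble_predict(clear_case))
print(ensemble_predict(tied_case))
```

Returning None on a tie, rather than forcing a call, mirrors the abstract's concern with knowing how predictable each domain is before trusting a prediction.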
4.
Nat Methods ; 13(9): 741-8, 2016 08 30.
Article in English | MEDLINE | ID: mdl-27575624

ABSTRACT

High-resolution mass spectrometry (MS) has become an important tool in the life sciences, contributing to the diagnosis and understanding of human diseases, elucidating biomolecular structural information and characterizing cellular signaling networks. However, the rapid growth in the volume and complexity of MS data makes transparent, accurate and reproducible analysis difficult. We present OpenMS 2.0 (http://www.openms.de), a robust, open-source, cross-platform software specifically designed for the flexible and reproducible analysis of high-throughput MS data. The extensible OpenMS software implements common mass spectrometric data processing tasks through a well-defined application programming interface in C++ and Python and through standardized open data formats. OpenMS additionally provides a set of 185 tools and ready-made workflows for common mass spectrometric data processing tasks, which enable users to perform complex quantitative mass spectrometric analyses with ease.


Subject(s)
Computational Biology/methods , Electronic Data Processing , Mass Spectrometry/methods , Proteomics/methods , Software , Aging/blood , Blood Proteins/chemistry , Humans , Molecular Sequence Annotation , Proteogenomics/methods , Workflow
5.
J Proteome Res ; 15(9): 3441-8, 2016 09 02.
Article in English | MEDLINE | ID: mdl-27476824

ABSTRACT

Modern mass spectrometry setups used in today's proteomics studies generate vast amounts of raw data, calling for highly efficient data processing and analysis tools. Software for analyzing these data is either monolithic (easy to use, but sometimes too rigid) or workflow-driven (easy to customize, but sometimes complex). Thermo Proteome Discoverer (PD) is a powerful software for workflow-driven data analysis in proteomics which, in our eyes, achieves a good trade-off between flexibility and usability. Here, we present two open-source plugins for PD providing additional functionality: LFQProfiler for label-free quantification of peptides and proteins, and RNP(xl) for UV-induced peptide-RNA cross-linking data analysis. LFQProfiler interacts with existing PD nodes for peptide identification and validation and takes care of the entire quantitative part of the workflow. We show that it performs at least on par with other state-of-the-art software solutions for label-free quantification in a recently published benchmark (Ramus, C.; J. Proteomics 2016, 132, 51-62). The second workflow, RNP(xl), represents the first software solution to date for identification of peptide-RNA cross-links including automatic localization of the cross-links at amino acid resolution and localization scoring. It comes with a customized integrated cross-link fragment spectrum viewer for convenient manual inspection and validation of the results.


Subject(s)
Proteome/analysis , Proteomics/methods , Software , Proteins/metabolism , RNA/metabolism , Workflow
6.
Anal Chem ; 87(15): 7698-704, 2015 Aug 04.
Article in English | MEDLINE | ID: mdl-26145158

ABSTRACT

Identification of lipids in nontargeted lipidomics based on liquid chromatography coupled to mass spectrometry (LC-MS) is still a major issue. While both accurate mass and fragment spectra contain valuable information, retention time (tR) information can be used to augment these data. We present a retention time model based on machine learning approaches which enables an improved assignment of lipid structures and automated annotation of lipidomics data. In contrast to common approaches, we used a complex mixture of 201 lipids originating from fat tissue instead of a standard mixture to train a support vector regression (SVR) model including molecular structural features. The cross-validated model achieves a correlation coefficient between predicted and experimental test sample retention times of r = 0.989. Combining our retention time model with identification via accurate mass search (AMS) of lipids against the comprehensive LIPID MAPS database, retention time filtering can significantly reduce the rate of false positives in complex data sets like adipose tissue extracts. In our case, filtering with retention time information removed more than half of the potential identifications, while retaining 95% of the correct identifications. Combination of high-precision retention time prediction and accurate mass can thus significantly narrow down the number of hypotheses to be assessed for lipid identification in complex lipid patterns such as tissue profiles.


Subject(s)
Chemistry Techniques, Analytical/methods , Lipids/analysis , Chromatography, Liquid , Lipids/chemistry , Mass Spectrometry
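
The two-stage filtering described in the abstract above (accurate mass search followed by a retention time check) can be sketched as follows. This is only an illustration under stated assumptions: the 5 ppm and 0.5 min tolerances, the candidate names, and every m/z and retention time value are invented placeholders, and the `predicted_rt` values are hard-coded here where a trained regression model such as the abstract's SVR would normally supply them.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def filter_candidates(observed_mz, observed_rt, candidates,
                      max_ppm=5.0, max_rt_delta=0.5):
    """Keep candidates that match on accurate mass AND on predicted retention time.

    max_ppm and max_rt_delta (minutes) are assumed tolerances, not values
    taken from the publication.
    """
    kept = []
    for cand in candidates:
        mass_ok = abs(ppm_error(observed_mz, cand["mz"])) <= max_ppm
        rt_ok = abs(observed_rt - cand["predicted_rt"]) <= max_rt_delta
        if mass_ok and rt_ok:
            kept.append(cand["name"])
    return kept


# Placeholder database hits for one observed feature (not real lipid data).
candidates = [
    {"name": "candidate_A", "mz": 760.5856, "predicted_rt": 12.1},
    {"name": "candidate_B", "mz": 760.5850, "predicted_rt": 15.0},  # mass fits, RT does not
    {"name": "candidate_C", "mz": 760.6000, "predicted_rt": 12.2},  # RT fits, mass does not
]
hits = filter_candidates(observed_mz=760.5851, observed_rt=12.3, candidates=candidates)
print(hits)
```

Here candidate_B passes the mass filter alone and is rejected only by the retention time check, which is the kind of false positive the abstract reports removing.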
7.
Hum Mutat ; 36(5): 513-23, 2015 May.
Article in English | MEDLINE | ID: mdl-25684150

ABSTRACT

Prioritizing missense variants for further experimental investigation is a key challenge in current sequencing studies for exploring complex and Mendelian diseases. A large number of in silico tools have been employed for the task of pathogenicity prediction, including PolyPhen-2, SIFT, FatHMM, MutationTaster-2, MutationAssessor, Combined Annotation Dependent Depletion, LRT, phyloP, and GERP++, as well as optimized methods of combining tool scores, such as Condel and Logit. Due to the wealth of these methods, an important practical question to answer is which of these tools generalize best, that is, correctly predict the pathogenic character of new variants. We here demonstrate in a study of 10 tools on five datasets that such a comparative evaluation of these tools is hindered by two types of circularity: they arise due to (1) the same variants or (2) different variants from the same protein occurring both in the datasets used for training and for evaluation of these tools, which may lead to overly optimistic results. We show that comparative evaluations of predictors that do not address these types of circularity may erroneously conclude that circularity-confounded tools are most accurate among all tools, and may even outperform optimized combinations of tools.


Subject(s)
Computational Biology/methods , Mutation, Missense , Software , Datasets as Topic , Humans , Internet , Reproducibility of Results , Web Browser
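
The two types of circularity named in the abstract above can be made concrete with a small filtering sketch. It drops evaluation variants that also occur in the training data (type 1) and, optionally, variants from any protein that occurs in the training data (type 2). This is an assumed, simplified procedure rather than the authors' evaluation protocol, and the record layout, protein identifiers, and amino acid changes are hypothetical.

```python
def remove_circular_variants(training, evaluation, by_protein=True):
    """Return evaluation variants that do not overlap with the training set.

    Each variant is a dict with 'protein' and 'change' keys (assumed layout).
    """
    train_variants = {(v["protein"], v["change"]) for v in training}
    train_proteins = {v["protein"] for v in training}
    clean = []
    for v in evaluation:
        if (v["protein"], v["change"]) in train_variants:
            continue  # type 1: the same variant was seen during training
        if by_protein and v["protein"] in train_proteins:
            continue  # type 2: a different variant from a training protein
        clean.append(v)
    return clean


# Hypothetical variant records.
training = [
    {"protein": "P1", "change": "A10V"},
    {"protein": "P2", "change": "G5R"},
]
evaluation = [
    {"protein": "P1", "change": "A10V"},  # type 1 overlap
    {"protein": "P1", "change": "L20P"},  # type 2 overlap
    {"protein": "P3", "change": "S7T"},   # no overlap
]

strict = remove_circular_variants(training, evaluation)
variant_only = remove_circular_variants(training, evaluation, by_protein=False)
print([v["change"] for v in strict])
print([v["change"] for v in variant_only])
```

Comparing a tool's accuracy on the unfiltered, variant-filtered, and protein-filtered evaluation sets gives a rough sense of how much of its apparent performance comes from overlap rather than generalization.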