1.
J Proteome Res ; 22(2): 350-358, 2023 02 03.
Article in English | MEDLINE | ID: mdl-36648107

ABSTRACT

Reliable peptide identification is key in mass spectrometry (MS) based proteomics. To this end, the target decoy approach (TDA) has become the cornerstone for extracting a set of reliable peptide-to-spectrum matches (PSMs) that will be used in downstream analysis. Indeed, TDA is now the default method to estimate the false discovery rate (FDR) for a given set of PSMs, and users typically view it as a universal solution for assessing the FDR in the peptide identification step. However, the TDA also relies on a minimal set of assumptions, which are typically never verified in practice. We argue that a violation of these assumptions can lead to poor FDR control, which can be detrimental to any downstream data analysis. We here therefore first clearly spell out these TDA assumptions, and introduce TargetDecoy, a Bioconductor package with all the necessary functionality to control the TDA quality and its underlying assumptions for a given set of PSMs.
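
The FDR estimate that the abstract above attributes to the TDA can be illustrated with a minimal sketch. It assumes a concatenated target-decoy search in which an incorrect match is equally likely to hit a target or a decoy, so the number of decoys above a score threshold approximates the number of incorrect targets. The function name `tda_fdr` and the toy PSM list are hypothetical; this is not the TargetDecoy package's API, and it does not perform the assumption checks the package is described as providing.

```python
from typing import List, Tuple


def tda_fdr(psms: List[Tuple[float, bool]], threshold: float) -> float:
    """Estimate the FDR among target PSMs scoring at or above `threshold`.

    Each PSM is a (score, is_decoy) pair; higher scores mean better matches.
    Assumption: decoy hits above the threshold are as frequent as incorrect
    target hits, so FDR is approximated by #decoys / #targets.
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0  # no accepted targets, so nothing can be a false discovery
    return min(1.0, decoys / targets)


# Toy PSM list, made up purely for illustration.
psms = [
    (9.1, False), (8.7, False), (8.2, True), (7.9, False),
    (7.5, False), (6.0, True), (5.5, False),
]
print(tda_fdr(psms, threshold=7.0))  # 4 targets and 1 decoy pass -> 0.25
```

If the assumption fails, for example because decoys systematically score lower than incorrect targets, this ratio underestimates the true FDR, which is the kind of problem the abstract warns about.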


Subject(s)
Peptides; Tandem Mass Spectrometry; Tandem Mass Spectrometry/methods; Peptides/analysis; Proteomics/methods; Data Analysis; Quality Control; Databases, Protein; Algorithms
2.
Mol Cell Proteomics ; 19(7): 1209-1219, 2020 07.
Article in English | MEDLINE | ID: mdl-32321741

ABSTRACT

Label-Free Quantitative mass spectrometry based workflows for differential expression (DE) analysis of proteins impose important challenges on the data analysis because of peptide-specific effects and context dependent missingness of peptide intensities. Peptide-based workflows, like MSqRob, test for DE directly from peptide intensities and outperform summarization methods which first aggregate MS1 peptide intensities to protein intensities before DE analysis. However, these methods are computationally expensive, often hard to understand for the non-specialized end-user, and do not provide protein summaries, which are important for visualization or downstream processing. In this work, we therefore evaluate state-of-the-art summarization strategies using a benchmark spike-in dataset and discuss why and when these fail compared with the state-of-the-art peptide based model, MSqRob. Based on this evaluation, we propose a novel summarization strategy, MSqRobSum, which estimates MSqRob's model parameters in a two-stage procedure circumventing the drawbacks of peptide-based workflows. MSqRobSum maintains MSqRob's superior performance, while providing useful protein expression summaries for plotting and downstream analysis. Summarizing peptide to protein intensities considerably reduces the computational complexity, the memory footprint and the model complexity, and makes it easier to disseminate DE inferred on protein summaries. Moreover, MSqRobSum provides a highly modular analysis framework, which provides researchers with full flexibility to develop data analysis workflows tailored toward their specific applications.
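
To make the idea of summarizing peptide intensities to protein intensities concrete, here is a minimal sketch of one simple strategy: the per-sample median of the observed log2 peptide intensities. This is a swapped-in illustration, not MSqRobSum's two-stage procedure and not the API of any package. The function name `summarize_protein`, the peptide labels, and the intensity values are all hypothetical.

```python
from statistics import median
from typing import Dict, List, Optional


def summarize_protein(
    peptide_log_intensities: Dict[str, List[Optional[float]]]
) -> List[Optional[float]]:
    """Return one summary value per sample for a single protein.

    Keys are peptide labels; each value holds that peptide's log2 intensity
    in every sample, with None marking a missing observation.
    """
    summary: List[Optional[float]] = []
    # zip(*rows) walks the peptide-by-sample table one sample column at a time.
    for column in zip(*peptide_log_intensities.values()):
        observed = [value for value in column if value is not None]
        summary.append(median(observed) if observed else None)
    return summary


# Toy data for one protein measured in three samples (made up).
peptides = {
    "PEPTIDEA": [20.0, 21.0, None],
    "PEPTIDEB": [18.0, None, None],
    "PEPTIDEC": [19.0, 20.0, None],
}
print(summarize_protein(peptides))  # [19.0, 20.5, None]
```

In the second sample the low-intensity peptide is missing, so the median shifts upward even though nothing in the data suggests a change in protein abundance. This is one way the peptide-specific effects and missingness mentioned in the abstract can bias a naive summary.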


Subject(s)
Mass Spectrometry/methods; Peptides/metabolism; Proteome/metabolism; Proteomics/methods; Chromatography, Liquid; Databases, Protein; Humans; Software
3.
Anal Chem ; 92(9): 6278-6287, 2020 05 05.
Article in English | MEDLINE | ID: mdl-32227882

ABSTRACT

Missing values are a major issue in quantitative data-dependent mass spectrometry-based proteomics. We therefore present an innovative solution to this key issue by introducing a hurdle model, which is a mixture between a binomial peptide count and a peptide intensity-based model component. It enables dramatically enhanced quantification of proteins with many missing values without having to resort to harmful assumptions for missingness. We demonstrate the superior performance of our method by comparing it with state-of-the-art methods in the field.


Subject(s)
Proteomics/methods; Research Design; Chromatography, High Pressure Liquid; Mass Spectrometry; Models, Theoretical; Peptides/analysis; Proteome/analysis
4.
Mol Cell Proteomics ; 16(6): 1064-1080, 2017 06.
Article in English | MEDLINE | ID: mdl-28432195

ABSTRACT

Proteogenomics is an emerging research field yet lacking a uniform method of analysis. Proteogenomic studies in which N-terminal proteomics and ribosome profiling are combined, suggest that a high number of protein start sites are currently missing in genome annotations. We constructed a proteogenomic pipeline specific for the analysis of N-terminal proteomics data, with the aim of discovering novel translational start sites outside annotated protein coding regions. In summary, unidentified MS/MS spectra were matched to a specific N-terminal peptide library encompassing protein N termini encoded in the Arabidopsis thaliana genome. After a stringent false discovery rate filtering, 117 protein N termini compliant with N-terminal methionine excision specificity and indicative of translation initiation were found. These include N-terminal protein extensions and translation from transposable elements and pseudogenes. Gene prediction provided supporting protein-coding models for approximately half of the protein N termini. Besides the prediction of functional domains (partially) contained within the newly predicted ORFs, further supporting evidence of translation was found in the recently released Araport11 genome re-annotation of Arabidopsis and computational translations of sequences stored in public repositories. Most interestingly, complementary evidence by ribosome profiling was found for 23 protein N termini. Finally, by analyzing protein N-terminal peptides, an in silico analysis demonstrates the applicability of our N-terminal proteogenomics strategy in revealing protein-coding potential in species with well- and poorly-annotated genomes.


Subject(s)
Arabidopsis/genetics; Arabidopsis/metabolism; Protein Biosynthesis/genetics; Genome, Plant; Peptide Library; Peptides/genetics; Peptides/metabolism; Plant Proteins/genetics; Plant Proteins/metabolism; Proteogenomics
5.
J Proteome Res ; 15(6): 1963-70, 2016 06 03.
Article in English | MEDLINE | ID: mdl-27089233

ABSTRACT

Shotgun proteomics experiments often take the form of a differential analysis, where two or more samples are compared against each other. The objective is to identify proteins that are either unique to a specific sample or a set of samples (qualitative differential proteomics), or that are significantly differentially expressed in one or more samples (quantitative differential proteomics). However, the success depends on the availability of a reliable protein sequence database for each sample. To perform such an analysis in the absence of a database, we here propose a novel, generic pipeline comprising an adapted spectral similarity score derived from database search algorithms that compares samples at the spectrum level to detect unique spectra. We applied our pipeline to compare two parasitic tapeworms: Taenia solium and Taenia hydatigena, of which only the former poses a threat to humans. Furthermore, because the genome of T. solium recently became available, we were able to prove the effectiveness and reliability of our pipeline a posteriori.


Subject(s)
Proteomics/methods; Taenia/chemistry; Algorithms; Animals; Databases, Protein; Genome; Species Specificity; Tandem Mass Spectrometry; Workflow