1.
Nucleic Acids Res ; 48(D1): D328-D334, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31724716

ABSTRACT

The neXtProt knowledgebase (https://www.nextprot.org) is an integrative resource providing both data on human proteins and the tools to explore them. To provide comprehensive and up-to-date data, we evaluate and add new data sets. We describe the incorporation of three new data sets that provide expression, function, protein-protein binary interaction, post-translational modification (PTM) and variant information. New SPARQL query examples illustrating uses of the new data were added. neXtProt has continued to develop tools for proteomics. We have improved the peptide uniqueness checker and have implemented a new protein digestion tool. Together, these tools make it possible to determine which proteases can be used to identify trypsin-resistant proteins by mass spectrometry. In terms of usability, we have finished revamping our web interface and completely rewritten our API. Our SPARQL endpoint now supports federated queries. All the neXtProt data are available via our user interface, API, SPARQL endpoint and FTP site, including the new PEFF 1.0 format files. Finally, the data on our FTP site are now licensed under CC BY 4.0 to promote their reuse.
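
The idea of comparing proteases for a trypsin-resistant protein can be illustrated with a minimal in silico digestion sketch. It is not the neXtProt digestion tool: the cleavage rules in `PROTEASES` are deliberately simplified assumptions, and the function names and the 7 to 30 residue length window are hypothetical choices made for illustration.

```python
# Simplified cleavage rules, assumed for illustration only:
# (residues cut on their C-terminal side, following residues that block the cut).
PROTEASES = {
    "trypsin": ("KR", "P"),  # after K or R, but not before P
    "lys-c": ("K", ""),      # after K
    "glu-c": ("E", ""),      # after E (simplified)
}


def digest(sequence, cut_after, no_cut_before=""):
    """Return the fully cleaved peptides of a protein sequence, in order."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        following = sequence[i + 1] if i + 1 < len(sequence) else ""
        blocked = following != "" and following in no_cut_before
        if residue in cut_after and not blocked:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Keep the C-terminal remainder that does not end on a cleavage site.
        peptides.append(sequence[start:])
    return peptides


def usable_peptides(sequence, protease, min_len=7, max_len=30):
    """Keep peptides whose length falls in a window assumed to suit MS detection."""
    cut_after, no_cut_before = PROTEASES[protease]
    return [
        p for p in digest(sequence, cut_after, no_cut_before)
        if min_len <= len(p) <= max_len
    ]
```

Calling `usable_peptides(seq, name)` for each entry of `PROTEASES` shows which enzymes yield peptides in the chosen length window for a given sequence; a real workflow would additionally check each peptide for uniqueness across the proteome.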


Subjects
Databases, Protein ; Knowledge Bases ; Humans ; Internet ; Mass Spectrometry ; Peptides/chemistry ; Protein Kinases/chemistry ; Protein Kinases/metabolism ; Protein Processing, Post-Translational ; Proteins/chemistry ; Proteins/genetics ; Proteins/metabolism ; Sequence Analysis, RNA ; Software ; Trypsin ; User-Computer Interface
2.
Nucleic Acids Res ; 45(D1): D177-D182, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899619

ABSTRACT

The neXtProt human protein knowledgebase (https://www.nextprot.org) continues to add new content and tools, with a focus on proteomics and genetic variation data. neXtProt now has proteomics data for over 85% of the human proteins, as well as new tools tailored to the proteomics community. Moreover, the neXtProt release 2016-08-25 includes over 8000 phenotypic observations for over 4000 variations in a number of genes involved in hereditary cancers and channelopathies. These changes are presented in the current neXtProt update. All of the neXtProt data are available via our user interface and FTP site. We also provide API access and a SPARQL endpoint for more technical applications.


Subjects
Databases, Protein ; Proteomics ; Genetic Association Studies ; Genetic Variation ; Humans ; Internet ; Phenotype ; Proteomics/methods ; Software ; Web Browser
3.
Rapid Commun Mass Spectrom ; 31(9): 753-761, 2017 May 15.
Article in English | MEDLINE | ID: mdl-28199054

ABSTRACT

RATIONALE: In peptide quantification by liquid chromatography/mass spectrometry (LC/MS), the optimization of multiple reaction monitoring (MRM) parameters is essential for sensitive detection. We have compared different approaches to build MRM assays, based either on flow injection analysis (FIA) of isotopically labelled peptides, or on the knowledge and the prediction of the best settings for MRM transitions and collision energies (CE). In this context, we introduce MRMOptimizer, an open-source software tool that processes spectra and assists the user in selecting transitions in the FIA workflow. METHODS: MS/MS spectral libraries with CE voltages from 10 to 70 V are automatically acquired in FIA mode for isotopically labelled peptides. Then MRMOptimizer determines the optimal MRM settings for each peptide. To assess the quantitative performance of our approach, 155 peptides, representing 84 proteins, were analysed by LC/MRM-MS and the peak areas were compared between: (A) the MRMOptimizer-based workflow; (B1) the SRMAtlas transition set used 'as-is'; and (B2) the same SRMAtlas set with CE parameters optimized by Skyline. RESULTS: 51% of the three most intense transitions per peptide were shown to be common to both A and B1/B2 methods, and displayed similar sensitivity and peak area distributions. The peak areas obtained with MRMOptimizer for transitions sharing either the precursor ion charge state or the fragment ions with the SRMAtlas set at unique transitions were increased 1.8- to 2.3-fold. The gain in sensitivity using MRMOptimizer for transitions with different precursor ion charge state and fragment ions (8% of the total) reaches a ~11-fold increase. CONCLUSIONS: Isotopically labelled peptides can be used to optimize MRM transitions more efficiently in FIA than by searching databases. The MRMOptimizer software is MS independent and enables the post-acquisition selection of MRM parameters. Coefficients of variation for optimal CE values are lower than those obtained with the SRMAtlas approach (B2) and one additional peptide was detected. Copyright © 2017 John Wiley & Sons, Ltd.
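
The selection step described in METHODS (choose, for each peptide, the best transitions and their collision energies from spectra acquired across a CE ramp) can be sketched as follows. This is not MRMOptimizer's algorithm: the input layout, the function name and the "highest intensity wins" criterion are assumptions made for illustration.

```python
from collections import defaultdict


def best_collision_energies(measurements, top_n=3):
    """Pick the optimal CE per transition, then the top_n transitions per peptide.

    measurements: iterable of (peptide, transition, ce_volts, intensity) tuples,
    one per transition observed at one collision energy (hypothetical layout).
    Returns {peptide: [(transition, best_ce, intensity), ...]} sorted by intensity.
    """
    # Step 1: for every (peptide, transition), keep the CE giving the highest signal.
    best = {}
    for peptide, transition, ce, intensity in measurements:
        key = (peptide, transition)
        if key not in best or intensity > best[key][1]:
            best[key] = (ce, intensity)

    # Step 2: group by peptide and keep the most intense transitions.
    per_peptide = defaultdict(list)
    for (peptide, transition), (ce, intensity) in best.items():
        per_peptide[peptide].append((transition, ce, intensity))
    return {
        peptide: sorted(rows, key=lambda row: row[2], reverse=True)[:top_n]
        for peptide, rows in per_peptide.items()
    }
```

A production tool would also need peak picking, fragment annotation and interference checks, none of which are modelled here.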


Subjects
Chromatography, Liquid/methods ; Peptide Fragments/analysis ; Tandem Mass Spectrometry/methods ; Cells, Cultured ; Databases, Factual ; Dendritic Cells/chemistry ; Humans ; Ions/analysis ; Ions/chemistry ; Linear Models ; Peptide Fragments/chemistry ; Reproducibility of Results ; Sensitivity and Specificity ; Trypsin
4.
Proteomics ; 15(15): 2568-79, 2015 Aug.
Article in English | MEDLINE | ID: mdl-25825003

ABSTRACT

Formalin-fixed paraffin-embedded (FFPE) tissue is considered an appropriate alternative to frozen/fresh tissue for proteomic analysis. Here we study formalin-induced alterations on a proteome-wide level. We compared LC-MS/MS data of FFPE and frozen human kidney tissues by two methods. First, clustering analysis revealed that the biological variation is higher than the variation introduced by the two sample processing techniques, and that clusters formed in accordance with the biological tissue origin and not with the sample preservation method. Second, we combined open modification search and spectral counting to find modifications that are more abundant in FFPE samples than in frozen samples. This analysis revealed lysine methylation (+14 Da) as the most frequent modification induced by FFPE preservation. We also detected a slight increase in methylene (+12 Da) and methylol (+30 Da) adducts as well as a putative modification of +58 Da, but they contribute less to the overall modification count. Subsequent SEQUEST analysis and X!Tandem searches of different datasets confirmed these trends. However, the modifications due to FFPE sample processing are a minor disturbance affecting 2-6% of all peptide-spectrum matches, and the peptide lists identified in FFPE and frozen tissues are still highly similar.


Subjects
Kidney/metabolism ; Lysine/metabolism ; Paraffin Embedding/methods ; Proteome/metabolism ; Proteomics/methods ; Tissue Fixation/methods ; Amino Acid Sequence ; Chromatography, Liquid ; Cluster Analysis ; Fixatives/chemistry ; Formaldehyde/chemistry ; Frozen Sections/methods ; Humans ; Methylation ; Proteome/classification ; Reproducibility of Results ; Tandem Mass Spectrometry
5.
Proteomics ; 11(20): 4085-95, 2011 Oct.
Article in English | MEDLINE | ID: mdl-21898822

ABSTRACT

The relevance of libraries of annotated MS/MS spectra is growing with the amount of proteomic data generated in high-throughput experiments. These reference libraries provide a fast and accurate way to identify newly acquired MS/MS spectra. In the context of multiple hypothesis testing, the number of false-positive identifications expected in the final result list is controlled by calculating the false discovery rate (FDR). In a classical sequence search, where experimental MS/MS spectra are compared with the theoretical peptide spectra calculated from a sequence database, the FDR is estimated by searching randomized or decoy sequence databases. Despite ongoing discussion on how exactly the FDR has to be calculated, this method is widely accepted in the proteomic community. Recently, similar approaches to control the FDR of spectrum library searches were discussed. We present in this paper a detailed analysis of the similarity between spectra of distinct peptides to set the basis of our own solution for decoy library creation (DeLiberator). It differs from the previously published results in some key points, mainly in implementing new methods that prevent decoy spectra from being too similar to the original library spectra while keeping important features of real MS/MS spectra. Using different proteomic data sets and library creation methods, we evaluate our approach and compare it with alternative methods.
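
The target-decoy estimate mentioned in the abstract can be sketched in a few lines. The abstract itself notes that the exact FDR formula is debated; the version here (decoy matches divided by target matches above a score threshold) is one common convention chosen as an assumption, and the function names are hypothetical. It says nothing about how DeLiberator builds its decoy spectra.

```python
def fdr_at_threshold(matches, threshold):
    """Estimate the FDR among matches scoring at or above threshold.

    matches: list of (score, is_decoy) pairs, one per best spectrum match.
    Uses decoys / targets, one of several conventions in use.
    """
    targets = sum(1 for score, is_decoy in matches if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


def score_cutoff(matches, max_fdr=0.01):
    """Return the lowest score threshold whose estimated FDR is within max_fdr.

    Returns None if no threshold satisfies the limit.
    """
    cutoff = None
    for threshold in sorted({score for score, _ in matches}, reverse=True):
        if fdr_at_threshold(matches, threshold) <= max_fdr:
            cutoff = threshold
    return cutoff
```

Because the loop keeps the lowest qualifying threshold, a cutoff can be accepted even if some higher threshold momentarily exceeded the limit, which is the same behaviour as filtering on q-values.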


Subjects
Algorithms ; Peptides/chemistry ; Proteomics/methods ; Software ; Tandem Mass Spectrometry ; Animals ; Databases, Protein ; Genetic Association Studies ; Humans
6.
J Proteome Res ; 10(7): 2913-21, 2011 Jul 01.
Article in English | MEDLINE | ID: mdl-21500769

ABSTRACT

MS2 library spectra are rich in reproducible information about peptide fragmentation patterns compared to theoretical spectra modeled by a sequence search tool. So far, spectrum library searches are mostly applied to detect peptides as they are present in the library. However, they also allow finding modified variants of the library peptides if the search is done with a large precursor mass window and an adapted Spectrum-Spectrum Match (SSM) scoring algorithm. We perform a thorough evaluation of the use of library spectra as opposed to theoretical peptide spectra for the identification of PTMs, analyzing spectra of a well-annotated modification-rich test data set compiled from public data repositories. These initial studies motivate the development of our modification-tolerant spectrum library search tool QuickMod, designed to identify modified variants of the peptides listed in the spectrum library without any prior input from the user estimating the modifications present in the sample. We built the search algorithm of QuickMod after carefully testing different SSM similarity scores. The final spectrum scoring scheme uses a support vector machine (SVM) on a selection of scoring features to classify correct and incorrect SSMs. After identification of a list of modified peptides at a given False Discovery Rate (FDR), the modifications need to be positioned on the peptide sequence. We present a rapid modification site assignment algorithm and evaluate its positioning accuracy. Finally, we demonstrate that QuickMod performs favorably in terms of speed and identification rate when compared to other software solutions for PTM analysis.
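
A modification-tolerant library search of the kind described (wide precursor mass window, then a score that tolerates shifted fragments) can be sketched as below. This is not QuickMod's scoring scheme, which uses an SVM over several features; the simple count of matched peaks, the function names, the 300 Da window and the assumption of singly charged fragments are all illustrative choices.

```python
def shifted_match_count(library_peaks, query_peaks, mass_delta, tol=0.02):
    """Count library fragment m/z values seen in the query, unshifted or shifted.

    A fragment that carries the modification appears displaced by mass_delta
    (singly charged fragments assumed), so both positions are checked.
    """
    def present(mz):
        return any(abs(mz - q) <= tol for q in query_peaks)

    return sum(1 for mz in library_peaks if present(mz) or present(mz + mass_delta))


def open_search(query_mass, query_peaks, library, window=300.0):
    """Return the best (peptide, mass_delta, matched_peaks) or None.

    library: list of (peptide, precursor_mass, fragment_mz_list) entries.
    Candidates are all entries whose precursor mass lies within the window.
    """
    best = None
    for peptide, mass, peaks in library:
        delta = query_mass - mass
        if abs(delta) > window:
            continue
        matched = shifted_match_count(peaks, query_peaks, delta)
        if best is None or matched > best[2]:
            best = (peptide, round(delta, 3), matched)
    return best
```

The reported mass difference is what hints at the modification: a delta close to 79.966 Da, for instance, is consistent with phosphorylation. Locating the modified residue is a separate step not covered by this sketch.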


Subjects
Algorithms ; Peptide Fragments/analysis ; Peptide Library ; Proteomics/methods ; Acetylation ; Databases, Protein ; Humans ; Mass Spectrometry ; Oxidation-Reduction ; Peptide Fragments/blood ; Phosphorylation ; Protein Processing, Post-Translational ; Research Design ; Sequence Analysis, Protein ; Software
7.
J Proteomics ; 129: 63-70, 2015 Nov 03.
Article in English | MEDLINE | ID: mdl-26141507

ABSTRACT

Mass spectrometry (MS) is a widely used and evolving technique for the high-throughput identification of molecules in biological samples. The need for sharing and reuse of code among bioinformaticians working with MS data prompted the design and implementation of MzJava, an open-source Java Application Programming Interface (API) for MS-related data processing. MzJava provides data structures and algorithms for representing and processing mass spectra and their associated biological molecules, such as metabolites, glycans and peptides. MzJava includes functionality to perform mass calculation, peak processing (e.g. centroiding, filtering, transforming), spectrum alignment and clustering, protein digestion, fragmentation of peptides and glycans as well as scoring functions for spectrum-spectrum and peptide/glycan-spectrum matches. For data import and export, MzJava implements readers and writers for commonly used data formats. For many classes, support for the Hadoop MapReduce (hadoop.apache.org) and Apache Spark (spark.apache.org) frameworks for cluster computing was implemented. The library has been developed applying best practices of software engineering. To ensure that MzJava contains code that is correct and easy to use, the library's API was carefully designed and thoroughly tested. MzJava is an open-source project distributed under the AGPL v3.0 licence. MzJava requires Java 1.7 or higher. Binaries, source code and documentation can be downloaded from http://mzjava.expasy.org and https://bitbucket.org/sib-pig/mzjava. This article is part of a Special Issue entitled: Computational Proteomics.
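
Of the functions listed, peptide mass calculation is the easiest to illustrate. The sketch below is plain Python and does not use or imitate the MzJava API; the function names are hypothetical. It sums standard monoisotopic residue masses, adds one water for the termini, and converts to m/z for a given charge.

```python
# Standard monoisotopic residue masses in daltons (unmodified amino acids).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.010565   # H2O added for the free N- and C-termini
PROTON_MASS = 1.007276   # mass of the charge carrier


def peptide_mass(sequence):
    """Monoisotopic neutral mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER_MASS


def peptide_mz(sequence, charge):
    """m/z of the peptide carrying `charge` protons."""
    return (peptide_mass(sequence) + charge * PROTON_MASS) / charge
```

For example, `peptide_mass("PEPTIDE")` evaluates to about 799.36 Da. Modified residues would need their mass shifts added to the sum.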


Subjects
Databases, Protein ; Information Storage and Retrieval/methods ; Mass Spectrometry/methods ; Programming Languages ; Proteins/chemistry ; User-Computer Interface ; Amino Acid Sequence ; Database Management Systems ; Molecular Sequence Data ; Peptide Mapping/methods ; Sequence Analysis, Protein/methods
8.
Comput Biol Chem ; 27(4-5): 481-95, 2003 Oct.
Article in English | MEDLINE | ID: mdl-14642756

ABSTRACT

Protein-related information is more often accumulated than reduced to a synthetic view. Itemising properties of protein sequences is informative, as is a list of ingredients for cooking, but without a recipe, that is, quantification and chronology, understanding is incomplete. If the goal of accumulating information is to discover or reveal the function and related biochemical mechanisms, information has to be weighed and ordered. As a guideline, the weight of a piece of information should reflect how often it consistently occurs in various contexts. We propose a common-sense approach to quantify and put data and information into perspective. Complete bacterial proteomes are individually mapped with the Pfam-A database of domains and protein family signatures in an attempt to assess the modularity of proteins at the level of a single proteome and the implications of a modular description of proteins for a functional interpretation. Poorly annotated proteins in the most documented bacteria (E. coli and B. subtilis) were considered in an attempt to formulate hypotheses on the basis of domain/module content.


Subjects
Bacterial Proteins/chemistry ; Databases, Protein ; Proteome/chemistry ; Bacillus subtilis/genetics ; Bacterial Proteins/classification ; Bacterial Proteins/genetics ; Escherichia coli/genetics ; Genome, Bacterial ; Proteome/classification ; Proteome/genetics ; Sequence Analysis, Protein ; Sequence Homology, Amino Acid
9.
Comput Biol Chem ; 27(1): 29-35, 2003 Feb.
Article in English | MEDLINE | ID: mdl-12798037

ABSTRACT

Proteomics enforces the reverse chronological order on the gene to protein dogma and imposes amino acid sequences as a starting point of an investigation relative to function. By this approach, proteomics data can confirm the presence of multiple forms of a protein. Notwithstanding variations attributed to specific individual features of organisms and tissues, from two to over ten protein forms can be identified in a given sample. The present work describes some guidelines for tracking the origin of alternative protein forms and attempts to tag the details of sequence data in the literature. Working via these guidelines, we have uncovered a third alternative form of the Pim subfamily of oncogenes. The term "form" is here combined with the qualifier "alternative" to describe any product of a given gene, including closely related paralogs. This paper also emphasizes the need for consistency checks in annotation processes, such as gene clustering, to avoid losing important details describing alternative protein forms. By identifying alternative protein forms, we illustrate the fact that rationalizing protein function via the identification of protein-protein interactions should in reality be a matter of identifying (alternative) form-form interactions.


Subjects
Proteomics/standards ; Proto-Oncogene Proteins/genetics ; Amino Acid Sequence/genetics ; Animals ; Computational Biology/methods ; Computational Biology/standards ; DNA, Complementary/classification ; DNA, Complementary/genetics ; Databases, Protein/statistics & numerical data ; Expressed Sequence Tags ; Genetic Variation ; Humans ; Molecular Sequence Data ; Multigene Family/genetics ; Protein Serine-Threonine Kinases/chemistry ; Protein Serine-Threonine Kinases/classification ; Protein Serine-Threonine Kinases/genetics ; Proteomics/methods ; Proto-Oncogene Proteins/chemistry ; Proto-Oncogene Proteins/classification ; Proto-Oncogene Proteins c-pim-1 ; Quality Control ; Sequence Alignment ; Sequence Homology, Amino Acid ; Swine
10.
Genome Inform ; 15(2): 266-75, 2004.
Article in English | MEDLINE | ID: mdl-15706512

ABSTRACT

We have studied the projection of protein family data onto single translated bacterial genomes as a solution to visualise relationships between families restricted to bacterial sequences. Any member of any type of family as defined in the Pfam database (domains, signatures, etc.) is considered as a protein module. Our first goal is to discover rules correlating the occurrence of modules with biochemical properties. To achieve this goal, we have developed a platform to quantify information found in protein databases and to support the analysis of the nature of modules, their position and corresponding frequencies of occurrence (in isolation or in combination) in association with pathway knowledge as found in KEGG. This paper focuses on two pathways, the two-component system and the aminophosphonate metabolism, that are partially but not completely documented. Proteins involved in those pathways were listed separately in each organism to analyse module composition, and rules constraining pathway interactions were identified. It is shown how these results can be used to update KEGG pathways and orthologue tables.


Subjects
Databases, Genetic ; Databases, Protein ; Genome ; Proteins ; Animals ; Computational Biology ; Computer Graphics ; Gene Expression Profiling ; Humans ; Information Storage and Retrieval ; Multigene Family ; Proteins/chemistry ; Proteins/genetics ; Proteins/metabolism ; Sequence Homology
11.
J Am Soc Mass Spectrom ; 24(12): 1862-71, 2013 Dec.
Article in English | MEDLINE | ID: mdl-24006250

ABSTRACT

Data-independent mass spectrometry activates all ion species isolated within a given mass-to-charge (m/z) window regardless of their abundance. This acquisition strategy overcomes the traditional data-dependent ion selection, boosting data reproducibility and sensitivity. However, several tandem mass (MS/MS) spectra of the same precursor ion are acquired during chromatographic elution, resulting in large data redundancy. Also, the significant number of chimeric spectra and the absence of accurate precursor ion masses hamper peptide identification. Here, we describe an algorithm to preprocess data-independent MS/MS spectra by filtering out noise peaks and clustering the spectra according to both the chromatographic elution profiles and the spectral similarity. In addition, we developed an approach to estimate the m/z value of precursor ions from clustered MS/MS spectra in order to improve database search performance. Data acquired using a small precursor mass window of 3 m/z units and multiple injections to cover an m/z range of 400-1400 were processed with our algorithm. It showed an improvement in the number of both peptide and protein identifications by 8% while reducing the number of submitted spectra by 18% and the number of peaks by 55%. We conclude that our clustering method is a valid approach for data analysis of these data-independent fragmentation spectra. The software including the source code is available for the scientific community.
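
The clustering idea (group MS/MS spectra from the same isolation window that elute close together and look alike) can be sketched with a binned cosine similarity and a greedy pass over scans ordered by retention time. This is not the published algorithm: the 1 m/z bin width, the 0.8 similarity threshold, the 0.5 min gap and all function names are assumptions, and the precursor m/z estimation step is not modelled.

```python
import math


def binned(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins: {bin_index: intensity}."""
    vector = {}
    for mz, intensity in peaks:
        index = int(mz / bin_width)
        vector[index] = vector.get(index, 0.0) + intensity
    return vector


def cosine(a, b):
    """Cosine similarity between two sparse binned spectra."""
    dot = sum(value * b.get(index, 0.0) for index, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_scans(scans, min_similarity=0.8, max_rt_gap=0.5):
    """Greedily cluster scans from one isolation window.

    scans: list of (retention_time_min, [(mz, intensity), ...]) sorted by time.
    A scan joins the current cluster if it follows the cluster's last scan
    within max_rt_gap minutes and is similar enough to it; otherwise it
    starts a new cluster. Returns the retention times of each cluster.
    """
    clusters = []
    for rt, peaks in scans:
        vector = binned(peaks)
        if clusters:
            last_rt, last_vector = clusters[-1][-1]
            if rt - last_rt <= max_rt_gap and cosine(vector, last_vector) >= min_similarity:
                clusters[-1].append((rt, vector))
                continue
        clusters.append([(rt, vector)])
    return [[rt for rt, _ in cluster] for cluster in clusters]
```

Each resulting cluster could then be merged into a single consensus spectrum before database searching, which is how redundancy would be reduced in such a scheme.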


Subjects
Proteins/chemistry ; Proteomics/methods ; Tandem Mass Spectrometry/methods ; Algorithms ; Cell Line ; Cluster Analysis ; Humans ; Software
12.
J Proteomics ; 79: 146-60, 2013 Feb 21.
Article in English | MEDLINE | ID: mdl-23277275

ABSTRACT

High-throughput protein identification and quantification based on mass spectrometry are fundamental steps in most proteomics projects. Here, we present EasyProt (available at http://easyprot.unige.ch), a new platform for mass spectrometry data processing, protein identification, quantification and unexpected post-translational modification characterization. EasyProt provides a fully integrated graphical experience to perform a large part of the proteomic data analysis workflow. Our goal was to develop a software platform that would fulfill the needs of scientists in the field, while emphasizing ease-of-use for non-bioinformatician users. Protein identification is based on OLAV scoring schemes and protein quantification is implemented for both isobaric labeling and label-free methods. Additional features are available, such as peak list processing, isotopic correction, spectra filtering, charge-state deconvolution and spectra merging. To illustrate the EasyProt platform, we present two identification and quantification workflows based on isobaric tagging and label-free methods.
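
Among the listed peak-processing features, charge-state deconvolution has a compact arithmetic core: a peak observed at m/z with charge z is converted to its neutral mass, or to the m/z it would have if singly charged. The sketch below shows only that conversion. It is not EasyProt code, the function names are hypothetical, and it assumes the charge of each peak is already known (in practice it is inferred from isotope spacing).

```python
PROTON_MASS = 1.007276  # Da, mass of one charge-carrying proton


def neutral_mass(mz, charge):
    """Neutral mass of an ion observed at m/z with `charge` protons attached."""
    return charge * (mz - PROTON_MASS)


def to_singly_charged(mz, charge):
    """m/z the same species would show with a single charge."""
    return neutral_mass(mz, charge) + PROTON_MASS


def deconvolute(peaks):
    """Convert (mz, charge, intensity) peaks to sorted singly charged (mz, intensity)."""
    return sorted((to_singly_charged(mz, charge), intensity) for mz, charge, intensity in peaks)
```

After this step, peaks from different charge states of the same fragment fall at the same m/z and can be merged.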


Subjects
Proteomics/methods ; Sequence Analysis, Protein/methods ; Software ; Mass Spectrometry/methods ; Protein Processing, Post-Translational ; Proteins/analysis