ABSTRACT
A new result report for Mascot search results is described. A greedy set cover algorithm is used to create a minimal set of proteins, which is then grouped into families on the basis of shared peptide matches. Protein families with multiple members are represented by dendrograms, generated by hierarchical clustering using the score of the nonshared peptide matches as a distance metric. The peptide matches to the proteins in a family can be compared side by side to assess the experimental evidence for each protein. If the evidence for a particular family member is considered inadequate, the dendrogram can be cut to reduce the number of distinct family members.
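The first two steps described above (a greedy set cover to pick a minimal protein list, then grouping by shared peptide matches) can be sketched as follows. This is only an illustration under stated assumptions: the function names, the toy protein and peptide labels, the alphabetical tie-break, and the "any shared peptide" family rule are hypothetical and are not taken from the Mascot report itself.

```python
def greedy_protein_cover(protein_to_peptides):
    """Greedy set cover: repeatedly pick the protein that explains the
    most peptide matches not yet explained by an earlier pick."""
    uncovered = set().union(*protein_to_peptides.values())
    chosen = []
    while uncovered:
        # Ties are broken alphabetically (an assumption, for determinism).
        best = max(sorted(protein_to_peptides),
                   key=lambda p: len(protein_to_peptides[p] & uncovered))
        chosen.append(best)
        uncovered -= protein_to_peptides[best]
    return chosen


def group_families(proteins, protein_to_peptides):
    """Group proteins into families: two proteins end up in the same
    family if they are linked by at least one shared peptide match."""
    families = []
    for p in proteins:
        merged, rest = {p}, []
        for fam in families:
            if any(protein_to_peptides[p] & protein_to_peptides[q] for q in fam):
                merged |= fam      # p links to this family, so absorb it
            else:
                rest.append(fam)
        families = rest + [merged]
    return families


# Hypothetical toy data: protein accession -> set of matched peptides.
matches = {
    "P1": {"a", "b", "c"},
    "P2": {"b", "c"},      # subset of P1, so never needed by the cover
    "P3": {"c", "d"},
    "P4": {"e"},
}
minimal = greedy_protein_cover(matches)
families = group_families(minimal, matches)
```

Here `P2` is dropped because `P1` already explains all of its peptides, while `P1` and `P3` fall into one family through their shared peptide `c`. The hierarchical clustering step that builds the dendrogram within a family is not shown.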
Subjects
Algorithms , Cluster Analysis , Proteomics/methods , Protein Sequence Analysis/methods , Amino Acid Sequence , Computer Simulation , Protein Databases , Molecular Models , Molecular Sequence Data , Molecular Weight , Peptide Fragments/chemistry , Tandem Mass Spectrometry/methods
ABSTRACT
A letter published in January 2011, "The Problem with Peptide Presumption and Low Mascot Scoring", raised concerns about the reporting of peptide identifications based on mass spectrometry data with high precursor mass accuracy. We explain why we believe these concerns are unfounded.
Subjects
Algorithms , Amino Acid Sequence , Proteins/analysis , Proteins/genetics , Software
ABSTRACT
Over the last five years, the Human Proteome Organisation Proteomics Standards Initiative (HUPO PSI) has produced and released community-accepted XML interchange formats in the fields of mass spectrometry, molecular interactions and gel electrophoresis, has led the field in the discussion of the minimum information with which such data should be annotated, and is now in the process of publishing much of this information. At this 4th Spring workshop, the emphasis was on consolidating this effort, refining and improving the existing models, and pushing these forward to align with more broadly encompassing efforts such as FuGE (Jones, A.R., Pizarro, A., Spellman, P., Miller, M., FuGE Working Group, FuGE: Functional Genomics Experiment Object Model. OMICS 2006, 10, 179-184) and the Ontology for Biomedical Investigation (OBI). The effort to merge the existing mass spectrometry XML interchange formats, mzData and mzXML, into a single standard, mzML, made significant progress. The preliminary design of AnalysisXML was also extended to include several new use cases and better support for quantification information. Finally, the Molecular Interaction group discussed the development of a molecular interaction scoring system with accompanying gold standard test data sets.
Subjects
Education , Proteomics/standards , Genomics , Humans , Proteomics/instrumentation , Proteomics/methods
ABSTRACT
Unimod is a database of protein modifications for use in mass spectrometry applications, especially protein identification and de novo sequencing. It contains accurate and verifiable values, derived from elemental compositions, for the mass differences introduced by both natural and artificial modifications.
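As a brief illustration of deriving a mass difference from an elemental composition, the sketch below sums monoisotopic element masses. The `MONOISOTOPIC` table (rounded values) and the `delta_mass` helper are assumptions made for this example and are not part of Unimod; the two compositions shown are the commonly quoted ones for phosphorylation (H O3 P) and carbamidomethylation (H3 C2 N O).

```python
# Rounded monoisotopic masses in Da (assumed values for illustration).
MONOISOTOPIC = {
    "H": 1.00782503,
    "C": 12.0,
    "N": 14.0030740,
    "O": 15.9949146,
    "P": 30.9737620,
    "S": 31.9720707,
}


def delta_mass(composition):
    """Monoisotopic mass difference for a composition given as
    {element symbol: atom count}; counts may be negative for losses."""
    return sum(MONOISOTOPIC[el] * n for el, n in composition.items())


phospho = delta_mass({"H": 1, "O": 3, "P": 1})                   # about 79.9663 Da
carbamidomethyl = delta_mass({"H": 3, "C": 2, "N": 1, "O": 1})   # about 57.0215 Da
```

Because each value is computed from a composition rather than typed in by hand, it can be re-derived and checked independently, which is the sense in which the values are verifiable.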
Subjects
Factual Databases , Mass Spectrometry/methods , Proteins/metabolism , Internet , Post-Translational Protein Processing , Proteins/chemistry , Protein Sequence Analysis , Terminology as Topic
ABSTRACT
An error-tolerant mode for database matching of uninterpreted tandem mass spectrometry data is described. Selected database entries are searched without enzyme specificity, using a comprehensive list of chemical and post-translational modifications, together with a residue substitution matrix. The modifications are tested serially, to avoid the catastrophic loss of discrimination that would occur if all the permutations of large numbers of modifications in combination were possible. The new mode has been coded as an extension to the Mascot search engine, and tested against a number of liquid chromatography-tandem mass spectrometry datasets. The results show a number of additional peptide matches, but require careful interpretation. The most significant limitation of this approach is that it can only reveal new matches to proteins that already have at least one significant peptide match.
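The reason for testing modifications serially can be seen by counting candidates. The sketch below is only a counting illustration, not Mascot's implementation: the function names and the short modification list are hypothetical, and it ignores where on the peptide each modification could sit, which would enlarge the combined case further.

```python
from itertools import combinations


def serial_candidates(mods):
    """One modification at a time: the count grows linearly."""
    return [(m,) for m in mods]


def combined_candidates(mods, max_size):
    """Every non-empty combination of up to max_size modifications:
    the count grows combinatorially with the size of the list."""
    return [combo
            for k in range(1, max_size + 1)
            for combo in combinations(mods, k)]


# Hypothetical short list; a comprehensive list would hold far more entries.
mods = ["Phospho", "Oxidation", "Acetyl", "Deamidated", "Methyl"]
n_serial = len(serial_candidates(mods))
n_combined = len(combined_candidates(mods, len(mods)))
```

With only five modifications the serial search considers 5 variants and the combined search 31. A larger candidate pool means more chance matches, which is the loss of discrimination the serial approach avoids.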