1.
Mol Cell Proteomics ; 11(8): 478-91, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22493177

ABSTRACT

Peptide identification using tandem mass spectrometry is a core technology in proteomics. Latest generations of mass spectrometry instruments enable the use of electron transfer dissociation (ETD) to complement collision induced dissociation (CID) for peptide fragmentation. However, a critical limitation to the use of ETD has been optimal database search software. Percolator is a post-search algorithm, which uses semi-supervised machine learning to improve the rate of peptide spectrum identifications (PSMs) together with providing reliable significance measures. We have previously interfaced the Mascot search engine with Percolator and demonstrated sensitivity and specificity benefits with CID data. Here, we report recent developments in the Mascot Percolator V2.0 software including an improved feature calculator and support for a wider range of ion series. The updated software is applied to the analysis of several CID and ETD fragmented peptide data sets. This version of Mascot Percolator increases the number of CID PSMs by up to 80% and ETD PSMs by up to 60% at a 0.01 q-value (1% false discovery rate) threshold over a standard Mascot search, notably recovering PSMs from high charge state precursor ions. The greatly increased number of PSMs and peptide coverage afforded by Mascot Percolator has enabled a fuller assessment of CID/ETD complementarity to be performed. Using a data set of CID and ETcaD spectral pairs, we find that at a 1% false discovery rate, the overlap in peptide identifications by CID and ETD is 83%, which is significantly higher than that obtained using either stand-alone Mascot (69%) or OMSSA (39%). We conclude that Mascot Percolator is a highly sensitive and accurate post-search algorithm for peptide identification and allows direct comparison of peptide identifications using multiple alternative fragmentation techniques.


Subjects
Algorithms, Peptides/analysis, Proteomics/methods, Software, Tandem Mass Spectrometry/methods, Artificial Intelligence, Liquid Chromatography, Protein Databases, Escherichia coli/metabolism, Escherichia coli Proteins/analysis, Fungal Proteins/analysis, Humans, Reproducibility of Results, Yeasts/metabolism
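
The abstract above reports PSM counts at a 0.01 q-value threshold. As a rough illustration of what such a threshold means, the following minimal sketch computes q-values with a plain target-decoy estimate. It is not Percolator or Mascot Percolator: there is no machine learning or rescoring here, and the function names, the `(score, is_decoy)` input format, and the simple decoys/targets FDR estimate are all assumptions made for illustration.

```python
def target_decoy_qvalues(psms):
    """Estimate q-values for PSMs given as (score, is_decoy) tuples.

    Hypothetical helper: higher scores are assumed to be better, and the
    FDR at a score threshold is estimated as decoys / targets above it.
    Returns (score, is_decoy, q_value) tuples sorted by descending score.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)

    # Walk down the ranked list and estimate the FDR at each position.
    fdrs = []
    targets = decoys = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # The q-value is the minimum FDR at this or any more permissive threshold.
    qvalues = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qvalues.append(running_min)
    qvalues.reverse()

    return [(s, d, q) for (s, d), q in zip(ranked, qvalues)]


def count_accepted_targets(psms, q_threshold=0.01):
    """Count target PSMs whose estimated q-value is at or below the threshold."""
    return sum(
        1
        for _score, is_decoy, q in target_decoy_qvalues(psms)
        if not is_decoy and q <= q_threshold
    )
```

Comparing `count_accepted_targets` for two scoring schemes on the same spectra is one way to express a gain in PSMs at a fixed q-value, under the assumptions stated above.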
2.
Genome Res ; 21(5): 756-67, 2011 May.
Article in English | MEDLINE | ID: mdl-21460061

ABSTRACT

Recent advances in proteomic mass spectrometry (MS) offer the chance to marry high-throughput peptide sequencing to transcript models, allowing the validation, refinement, and identification of new protein-coding loci. We present a novel pipeline that integrates highly sensitive and statistically robust peptide spectrum matching with genome-wide protein-coding predictions to perform large-scale gene validation and discovery in the mouse genome for the first time. In searching an excess of 10 million spectra, we have been able to validate 32%, 17%, and 7% of all protein-coding genes, exons, and splice boundaries, respectively. Moreover, we present strong evidence for the identification of multiple alternatively spliced translations from 53 genes and have uncovered 10 entirely novel protein-coding genes, which are not covered in any mouse annotation data sources. One such novel protein-coding gene is a fusion protein that spans the Ins2 and Igf2 loci to produce a transcript encoding the insulin II and the insulin-like growth factor 2-derived peptides. We also report nine processed pseudogenes that have unique peptide hits, demonstrating, for the first time, that they are not just transcribed but are translated and are therefore resurrected into new coding loci. This work not only highlights an important utility for MS data in genome annotation but also provides unique insights into the gene structure and propagation in the mouse genome. All these data have been subsequently used to improve the publicly available mouse annotation available in both the Vega and Ensembl genome browsers (http://vega.sanger.ac.uk).


Subjects
Alternative Splicing, Genes, Peptides/genetics, Proteomics/methods, Pseudogenes/genetics, Tandem Mass Spectrometry/methods, Animals, Genome, Genomics/methods, Mice, Peptides/chemistry
3.
Cancer Res ; 70(3): 883-95, 2010 Feb 01.
Article in English | MEDLINE | ID: mdl-20103622

ABSTRACT

Comparative genomic hybridization (CGH) can reveal important disease genes but the large regions identified could sometimes contain hundreds of genes. Here we combine high-resolution CGH analysis of 598 human cancer cell lines with insertion sites isolated from 1,005 mouse tumors induced with the murine leukemia virus (MuLV). This cross-species oncogenomic analysis revealed candidate tumor suppressor genes and oncogenes mutated in both human and mouse tumors, making them strong candidates for novel cancer genes. A significant number of these genes contained binding sites for the stem cell transcription factors Oct4 and Nanog. Notably, mice carrying tumors with insertions in or near stem cell module genes, which are thought to participate in cell self-renewal, died significantly faster than mice without these insertions. A comparison of the profile we identified to that induced with the Sleeping Beauty (SB) transposon system revealed significant differences in the profile of recurrently mutated genes. Collectively, this work provides a rich catalogue of new candidate cancer genes for functional analysis.


Subjects
Comparative Genomic Hybridization/methods, Genetic Predisposition to Disease/genetics, Neoplasms/genetics, Tumor Suppressor Proteins/genetics, Animals, Binding Sites/genetics, Tumor Cell Line, DNA Transposable Elements/genetics, Female, Genomics/methods, Homeodomain Proteins/metabolism, Humans, Male, Mice, Inbred C57BL Mice, Insertional Mutagenesis, Mutation, Nanog Homeobox Protein, Neoplasms/metabolism, Neoplasms/pathology, Octamer Transcription Factor-3/metabolism, Species Specificity, Stem Cells/metabolism, Tumor Suppressor Proteins/metabolism
4.
Methods Mol Biol ; 604: 43-53, 2010.
Article in English | MEDLINE | ID: mdl-20013363

ABSTRACT

A variety of methods are described in the literature to assign peptide sequences to observed tandem MS data. Typically, the identified peptides are associated only with an arbitrary score that reflects the quality of the peptide-spectrum match but not with a statistically meaningful significance measure. In this chapter, we discuss why statistical significance measures can simplify and unify the interpretation of MS-based proteomic experiments. In addition, we also present available software solutions that convert scores into sound statistical measures.


Subjects
Peptides/analysis, Software, Tandem Mass Spectrometry/methods, Protein Databases, Statistical Distributions
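
The chapter summarized above concerns converting arbitrary match scores into statistically meaningful measures. One simple way to illustrate the idea is an empirical p-value computed against a set of null scores, for example scores from decoy matches. The sketch below is an assumption-laden illustration and not the method of any particular tool mentioned in these records; the function name, the inputs, and the add-one correction are choices made here.

```python
from bisect import bisect_left


def empirical_p_values(observed_scores, null_scores):
    """Convert raw scores into empirical p-values against a null sample.

    Hypothetical helper: higher scores are assumed to be better. The
    p-value of a score is the fraction of null scores at least as high,
    with an add-one correction so that no p-value is exactly zero.
    """
    null_sorted = sorted(null_scores)
    n = len(null_sorted)
    p_values = []
    for score in observed_scores:
        # bisect_left gives the count of null scores strictly below `score`,
        # so the remainder are the null scores greater than or equal to it.
        at_least_as_high = n - bisect_left(null_sorted, score)
        p_values.append((at_least_as_high + 1) / (n + 1))
    return p_values
```

Unlike a raw score, a p-value of this kind can be compared across experiments, because it is expressed relative to the null distribution rather than on a search engine's own scale.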
5.
J Proteome Res ; 8(6): 3176-81, 2009 Jun.
Article in English | MEDLINE | ID: mdl-19338334

ABSTRACT

Sound scoring methods for sequence database search algorithms such as Mascot and Sequest are essential for sensitive and accurate peptide and protein identifications from proteomic tandem mass spectrometry data. In this paper, we present a software package that interfaces Mascot with Percolator, a well performing machine learning method for rescoring database search results, and demonstrate it to be amenable for both low and high accuracy mass spectrometry data, outperforming all available Mascot scoring schemes as well as providing reliable significance measures. Mascot Percolator can be readily used as a stand alone tool or integrated into existing data analysis pipelines.


Subjects
Peptide Fragments/analysis, Proteomics/methods, Software, Algorithms, Artificial Intelligence, Liquid Chromatography, Protein Databases, Peptide Fragments/chemistry, Reproducibility of Results, Sensitivity and Specificity, Protein Sequence Analysis, Tandem Mass Spectrometry
6.
Mol Cell Proteomics ; 7(5): 962-70, 2008 May.
Article in English | MEDLINE | ID: mdl-18216375

ABSTRACT

It is a major challenge to develop effective sequence database search algorithms to translate molecular weight and fragment mass information obtained from tandem mass spectrometry into high quality peptide and protein assignments. We investigated the peptide identification performance of Mascot and X!Tandem for mass tolerance settings common for low and high accuracy mass spectrometry. We demonstrated that sensitivity and specificity of peptide identification can vary substantially for different mass tolerance settings, but this effect was more significant for Mascot. We present an adjusted Mascot threshold, which allows the user to freely select the best trade-off between sensitivity and specificity. The adjusted Mascot threshold was compared with the default Mascot and X!Tandem scoring thresholds and shown to be more sensitive at the same false discovery rates for both low and high accuracy mass spectrometry data.


Subjects
Peptides/analysis, Proteomics/methods, Tandem Mass Spectrometry/methods, Algorithms, Animals, Cultured Cells, Mice, Reproducibility of Results, Sensitivity and Specificity
7.
PLoS Comput Biol ; 3(10): 2032-42, 2007 Oct.
Article in English | MEDLINE | ID: mdl-17967053

ABSTRACT

Network analysis transcends conventional pairwise approaches to data analysis as the context of components in a network graph can be taken into account. Such approaches are increasingly being applied to genomics data, where functional linkages are used to connect genes or proteins. However, while microarray gene expression datasets are now abundant and of high quality, few approaches have been developed for analysis of such data in a network context. We present a novel approach for 3-D visualisation and analysis of transcriptional networks generated from microarray data. These networks consist of nodes representing transcripts connected by virtue of their expression profile similarity across multiple conditions. Analysing genome-wide gene transcription across 61 mouse tissues, we describe the unusual topography of the large and highly structured networks produced, and demonstrate how they can be used to visualise, cluster, and mine large datasets. This approach is fast, intuitive, and versatile, and allows the identification of biological relationships that may be missed by conventional analysis techniques. This work has been implemented in a freely available open-source application named BioLayout Express(3D).


Subjects
Computational Biology/methods, Gene Expression Profiling/methods, Gene Expression Regulation, Oligonucleotide Array Sequence Analysis/methods, Genetic Transcription, Algorithms, Animals, Cluster Analysis, Gene Expression, Gene Regulatory Networks, Three-Dimensional Imaging, Mice, Automated Pattern Recognition, Software
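
The abstract above describes networks whose nodes are transcripts, connected when their expression profiles are similar across conditions. The following minimal sketch shows how such an edge list could be built. It is not the BioLayout Express(3D) implementation: the use of Pearson correlation as the similarity measure, the 0.9 threshold, and all names are assumptions chosen for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A flat profile has no defined correlation; treat it as unrelated.
        return 0.0
    return cov / sqrt(var_x * var_y)


def build_coexpression_edges(profiles, threshold=0.9):
    """Return (node_a, node_b, r) for every pair with correlation >= threshold.

    `profiles` maps a transcript identifier to its expression values across
    the same ordered set of conditions (hypothetical input format).
    """
    edges = []
    for a, b in combinations(sorted(profiles), 2):
        r = pearson(profiles[a], profiles[b])
        if r >= threshold:
            edges.append((a, b, r))
    return edges
```

The pairwise loop is quadratic in the number of transcripts, so a genome-wide data set would need a vectorized or otherwise optimized implementation. The resulting edge list can be passed to any graph layout or clustering tool.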