Results 1 - 7 of 7
1.
Mol Cell Proteomics ; 11(8): 478-91, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22493177

ABSTRACT

Peptide identification using tandem mass spectrometry is a core technology in proteomics. Latest generations of mass spectrometry instruments enable the use of electron transfer dissociation (ETD) to complement collision induced dissociation (CID) for peptide fragmentation. However, a critical limitation to the use of ETD has been optimal database search software. Percolator is a post-search algorithm, which uses semi-supervised machine learning to improve the rate of peptide spectrum identifications (PSMs) together with providing reliable significance measures. We have previously interfaced the Mascot search engine with Percolator and demonstrated sensitivity and specificity benefits with CID data. Here, we report recent developments in the Mascot Percolator V2.0 software including an improved feature calculator and support for a wider range of ion series. The updated software is applied to the analysis of several CID and ETD fragmented peptide data sets. This version of Mascot Percolator increases the number of CID PSMs by up to 80% and ETD PSMs by up to 60% at a 0.01 q-value (1% false discovery rate) threshold over a standard Mascot search, notably recovering PSMs from high charge state precursor ions. The greatly increased number of PSMs and peptide coverage afforded by Mascot Percolator has enabled a fuller assessment of CID/ETD complementarity to be performed. Using a data set of CID and ETcaD spectral pairs, we find that at a 1% false discovery rate, the overlap in peptide identifications by CID and ETD is 83%, which is significantly higher than that obtained using either stand-alone Mascot (69%) or OMSSA (39%). We conclude that Mascot Percolator is a highly sensitive and accurate post-search algorithm for peptide identification and allows direct comparison of peptide identifications using multiple alternative fragmentation techniques.


Subject(s)
Algorithms; Peptides/analysis; Proteomics/methods; Software; Tandem Mass Spectrometry/methods; Artificial Intelligence; Chromatography, Liquid; Databases, Protein; Escherichia coli/metabolism; Escherichia coli Proteins/analysis; Fungal Proteins/analysis; Humans; Reproducibility of Results; Yeasts/metabolism
2.
Genome Res ; 21(5): 756-67, 2011 May.
Article in English | MEDLINE | ID: mdl-21460061

ABSTRACT

Recent advances in proteomic mass spectrometry (MS) offer the chance to marry high-throughput peptide sequencing to transcript models, allowing the validation, refinement, and identification of new protein-coding loci. We present a novel pipeline that integrates highly sensitive and statistically robust peptide spectrum matching with genome-wide protein-coding predictions to perform large-scale gene validation and discovery in the mouse genome for the first time. In searching an excess of 10 million spectra, we have been able to validate 32%, 17%, and 7% of all protein-coding genes, exons, and splice boundaries, respectively. Moreover, we present strong evidence for the identification of multiple alternatively spliced translations from 53 genes and have uncovered 10 entirely novel protein-coding genes, which are not covered in any mouse annotation data sources. One such novel protein-coding gene is a fusion protein that spans the Ins2 and Igf2 loci to produce a transcript encoding the insulin II and the insulin-like growth factor 2-derived peptides. We also report nine processed pseudogenes that have unique peptide hits, demonstrating, for the first time, that they are not just transcribed but are translated and are therefore resurrected into new coding loci. This work not only highlights an important utility for MS data in genome annotation but also provides unique insights into the gene structure and propagation in the mouse genome. All these data have been subsequently used to improve the publicly available mouse annotation available in both the Vega and Ensembl genome browsers (http://vega.sanger.ac.uk).


Subject(s)
Alternative Splicing; Genes; Peptides/genetics; Proteomics/methods; Pseudogenes/genetics; Tandem Mass Spectrometry/methods; Animals; Genome; Genomics/methods; Mice; Peptides/chemistry
3.
Cancer Res ; 70(3): 883-95, 2010 Feb 01.
Article in English | MEDLINE | ID: mdl-20103622

ABSTRACT

Comparative genomic hybridization (CGH) can reveal important disease genes but the large regions identified could sometimes contain hundreds of genes. Here we combine high-resolution CGH analysis of 598 human cancer cell lines with insertion sites isolated from 1,005 mouse tumors induced with the murine leukemia virus (MuLV). This cross-species oncogenomic analysis revealed candidate tumor suppressor genes and oncogenes mutated in both human and mouse tumors, making them strong candidates for novel cancer genes. A significant number of these genes contained binding sites for the stem cell transcription factors Oct4 and Nanog. Notably, mice carrying tumors with insertions in or near stem cell module genes, which are thought to participate in cell self-renewal, died significantly faster than mice without these insertions. A comparison of the profile we identified to that induced with the Sleeping Beauty (SB) transposon system revealed significant differences in the profile of recurrently mutated genes. Collectively, this work provides a rich catalogue of new candidate cancer genes for functional analysis.


Subject(s)
Comparative Genomic Hybridization/methods; Genetic Predisposition to Disease/genetics; Neoplasms/genetics; Tumor Suppressor Proteins/genetics; Animals; Binding Sites/genetics; Cell Line, Tumor; DNA Transposable Elements/genetics; Female; Genomics/methods; Homeodomain Proteins/metabolism; Humans; Male; Mice; Mice, Inbred C57BL; Mutagenesis, Insertional; Mutation; Nanog Homeobox Protein; Neoplasms/metabolism; Neoplasms/pathology; Octamer Transcription Factor-3/metabolism; Species Specificity; Stem Cells/metabolism; Tumor Suppressor Proteins/metabolism
4.
Methods Mol Biol ; 604: 43-53, 2010.
Article in English | MEDLINE | ID: mdl-20013363

ABSTRACT

A variety of methods are described in the literature to assign peptide sequences to observed tandem MS data. Typically, the identified peptides are associated only with an arbitrary score that reflects the quality of the peptide-spectrum match but not with a statistically meaningful significance measure. In this chapter, we discuss why statistical significance measures can simplify and unify the interpretation of MS-based proteomic experiments. In addition, we also present available software solutions that convert scores into sound statistical measures.
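As an illustration of what converting raw scores into a significance measure can look like, the sketch below estimates q-values from a target-decoy search. This is an assumption made for illustration only: the abstract does not name a specific measure or method, and the function name `q_values` and the `(score, is_decoy)` input layout are hypothetical. It is a minimal sketch, not the approach of any particular software package.

```python
def q_values(psms):
    """Estimate q-values for peptide-spectrum matches (PSMs).

    psms: list of (score, is_decoy) tuples, where a higher score means a
    better match (hypothetical input layout, assumed for this sketch).
    Returns a list of (score, is_decoy, q_value) sorted by descending score.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)

    # FDR estimate at each rank: decoys accepted / targets accepted so far.
    fdrs = []
    targets = decoys = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: the smallest FDR at which this PSM would still be accepted,
    # i.e. the minimum FDR at this rank or any lower-scoring rank.
    q = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        q.append(running_min)
    q.reverse()

    return [(s, d, qv) for (s, d), qv in zip(ranked, q)]


example = [(10, False), (9, False), (8, True), (7, False), (6, False)]
result = q_values(example)
```

Unlike a raw score, a q-value threshold (for instance 0.01) has the same interpretation regardless of which search engine produced the scores.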


Subject(s)
Peptides/analysis; Software; Tandem Mass Spectrometry/methods; Databases, Protein; Statistical Distributions
5.
J Proteome Res ; 8(6): 3176-81, 2009 Jun.
Article in English | MEDLINE | ID: mdl-19338334

ABSTRACT

Sound scoring methods for sequence database search algorithms such as Mascot and Sequest are essential for sensitive and accurate peptide and protein identifications from proteomic tandem mass spectrometry data. In this paper, we present a software package that interfaces Mascot with Percolator, a well performing machine learning method for rescoring database search results, and demonstrate it to be amenable for both low and high accuracy mass spectrometry data, outperforming all available Mascot scoring schemes as well as providing reliable significance measures. Mascot Percolator can be readily used as a stand alone tool or integrated into existing data analysis pipelines.


Subject(s)
Peptide Fragments/analysis; Proteomics/methods; Software; Algorithms; Artificial Intelligence; Chromatography, Liquid; Databases, Protein; Peptide Fragments/chemistry; Reproducibility of Results; Sensitivity and Specificity; Sequence Analysis, Protein; Tandem Mass Spectrometry
6.
Mol Cell Proteomics ; 7(5): 962-70, 2008 May.
Article in English | MEDLINE | ID: mdl-18216375

ABSTRACT

It is a major challenge to develop effective sequence database search algorithms to translate molecular weight and fragment mass information obtained from tandem mass spectrometry into high quality peptide and protein assignments. We investigated the peptide identification performance of Mascot and X!Tandem for mass tolerance settings common for low and high accuracy mass spectrometry. We demonstrated that sensitivity and specificity of peptide identification can vary substantially for different mass tolerance settings, but this effect was more significant for Mascot. We present an adjusted Mascot threshold, which allows the user to freely select the best trade-off between sensitivity and specificity. The adjusted Mascot threshold was compared with the default Mascot and X!Tandem scoring thresholds and shown to be more sensitive at the same false discovery rates for both low and high accuracy mass spectrometry data.


Subject(s)
Peptides/analysis; Proteomics/methods; Tandem Mass Spectrometry/methods; Algorithms; Animals; Cells, Cultured; Mice; Reproducibility of Results; Sensitivity and Specificity
7.
PLoS Comput Biol ; 3(10): 2032-42, 2007 Oct.
Article in English | MEDLINE | ID: mdl-17967053

ABSTRACT

Network analysis transcends conventional pairwise approaches to data analysis as the context of components in a network graph can be taken into account. Such approaches are increasingly being applied to genomics data, where functional linkages are used to connect genes or proteins. However, while microarray gene expression datasets are now abundant and of high quality, few approaches have been developed for analysis of such data in a network context. We present a novel approach for 3-D visualisation and analysis of transcriptional networks generated from microarray data. These networks consist of nodes representing transcripts connected by virtue of their expression profile similarity across multiple conditions. Analysing genome-wide gene transcription across 61 mouse tissues, we describe the unusual topography of the large and highly structured networks produced, and demonstrate how they can be used to visualise, cluster, and mine large datasets. This approach is fast, intuitive, and versatile, and allows the identification of biological relationships that may be missed by conventional analysis techniques. This work has been implemented in a freely available open-source application named BioLayout Express(3D).
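The idea of connecting transcripts by expression profile similarity can be sketched as follows. The abstract does not state which similarity measure or cutoff is used, so the Pearson correlation, the `threshold` value, and the names `pearson` and `build_network` are all assumptions chosen for illustration; this is not the BioLayout Express(3D) implementation.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation information
    return cov / sqrt(var_x * var_y)


def build_network(profiles, threshold=0.9):
    """Return edges (a, b, r) for transcript pairs with correlation >= threshold.

    profiles: dict mapping a transcript name to its expression values
    across conditions (hypothetical layout, assumed for this sketch).
    """
    edges = []
    for a, b in combinations(sorted(profiles), 2):
        r = pearson(profiles[a], profiles[b])
        if r >= threshold:
            edges.append((a, b, r))
    return edges


profiles = {
    "t1": [1, 2, 3, 4],
    "t2": [2, 4, 6, 8],
    "t3": [4, 3, 2, 1],
}
edges = build_network(profiles)
```

Here `t1` and `t2` rise together across conditions and are joined by an edge, while `t3` runs opposite to both and stays unconnected. The all-pairs loop is quadratic in the number of transcripts, so a genome-wide dataset would need a vectorised or indexed approach.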


Subject(s)
Computational Biology/methods; Gene Expression Profiling/methods; Gene Expression Regulation; Oligonucleotide Array Sequence Analysis/methods; Transcription, Genetic; Algorithms; Animals; Cluster Analysis; Gene Expression; Gene Regulatory Networks; Imaging, Three-Dimensional; Mice; Pattern Recognition, Automated; Software
...