Results 1 - 6 of 6
1.
J Proteome Res ; 20(4): 2151-2156, 2021 04 02.
Article in English | MEDLINE | ID: mdl-33703904

ABSTRACT

For differential expression studies in all omics disciplines, data normalization is a crucial step that is often subject to a balance between speed and effectiveness. To keep up with the data produced by high-throughput instruments, researchers require fast and easy-to-use yet effective methods that fit into automated analysis pipelines. The CONSTANd normalization method meets these criteria, so we have made its source code available for R/BioConductor and Python. We briefly review the method and demonstrate how it can be used in different omics contexts for experiments of any scale. Widespread adoption across omics disciplines would ease data integration in multiomics experiments.


Subject(s)
Boidae, Software, Animals, Proteomics
2.
Rapid Commun Mass Spectrom ; : e8962, 2020 Oct 02.
Article in English | MEDLINE | ID: mdl-33009686

ABSTRACT

RATIONALE: The current methods for identifying peptides in mass spectral product ion data still struggle to do so for the majority of spectra. Based on the experimental setup and other assumptions, such methods restrict the search space to speed up computations, but at the cost of creating blind spots. The proteomics community would greatly benefit from a method that is capable of covering the entire search space without using any restrictions, thus establishing a baseline for identification. METHODS: We conceived the "mass pattern paradigm" (MPP) that enables the creation of such an identification method, and we implemented it into a prototype database search engine "PRiSM" (PRotein-Spectrum Matching). We then assessed its operational characteristics by applying it to publicly available high-precision mass spectra of low and high identification difficulty. We used those characteristics to gain theoretical insights into trade-offs between sensitivity and speed when trying to establish a baseline for identification. RESULTS: Of 100 low difficulty spectra, PRiSM and SEQUEST agree on 84 identifications (of which 75 are statistically significant). Of 15 of 100 spectra not identified in a previous study (using SEQUEST), 13 are considered reliable after visual inspection and represent 3 proteins (out of 9 in total) not detected previously. CONCLUSIONS: Despite leaving noise intact, the simple PRiSM prototype can make statistically reliable identifications, while controlling the false discovery rate by fitting a null distribution. It also identifies some spectra previously unidentifiable in an "extremely open" SEQUEST search, paving the way to establishing a baseline for identification in proteomics.

3.
J Proteome Res ; 18(5): 2221-2227, 2019 05 03.
Article in English | MEDLINE | ID: mdl-30942071

ABSTRACT

In the context of omics disciplines and especially proteomics and biomarker discovery, the analysis of a clinical sample using label-based tandem mass spectrometry (MS) can be affected by sample preparation effects or by the measurement process itself, resulting in an incorrect outcome. Detection and correction of these mistakes using state-of-the-art methods based on mixed models can use large amounts of (computing) time. MS-based proteomics laboratories are high-throughput and need to avoid a bottleneck in their quantitative pipeline by quickly discriminating between high- and low-quality data. To this end we developed an easy-to-use web-tool called QCQuan (available at qcquan.net ) which is built around the CONSTANd normalization algorithm. It automatically provides the user with exploratory and quality control information as well as a differential expression analysis based on conservative, simple statistics. In this document we describe in detail the scientifically relevant steps that constitute the workflow and assess its qualitative and quantitative performance on three reference data sets. We find that QCQuan provides clear and accurate indications about the scientific value of both a high- and a low-quality data set. Moreover, it performed quantitatively better on a third data set than a comparable workflow assembled using established, reliable software.


Subject(s)
Algorithms, Bacterial Proteins/isolation & purification, Data Accuracy, Pectobacterium carotovorum/chemistry, Proteomics/statistics & numerical data, Software, Animals, Cattle, Liquid Chromatography, Complex Mixtures/chemistry, Cytochromes c/isolation & purification, Datasets as Topic, Glycogen Phosphorylase/isolation & purification, Internet, Phosphopyruvate Hydratase/isolation & purification, Proteomics/methods, Quality Control, Rabbits, Bovine Serum Albumin/isolation & purification, Staining and Labeling/methods, Tandem Mass Spectrometry
4.
J Mol Biol ; 433(11): 166966, 2021 05 28.
Article in English | MEDLINE | ID: mdl-33794260

ABSTRACT

In high-throughput omics disciplines like transcriptomics, researchers face a need to assess the quality of an experiment prior to an in-depth statistical analysis. To efficiently analyze such voluminous collections of data, researchers need triage methods that are both quick and easy to use. Such a normalization method for relative quantitation, CONSTANd, was recently introduced for isobarically-labeled mass spectra in proteomics. It transforms the data matrix of abundances through an iterative, convergent process enforcing three constraints: (I) identical column sums; (II) each row sum is fixed (across matrices) and (III) identical to all other row sums. In this study, we investigate whether CONSTANd is suitable for count data from massively parallel sequencing, by qualitatively comparing its results to those of DESeq2. Further, we propose an adjustment of the method so that it may be applied to identically balanced but differently sized experiments for joint analysis. We find that CONSTANd can process large data sets at well over 1 million count records per second whilst mitigating unwanted systematic bias and thus quickly uncovering the underlying biological structure when combined with a PCA plot or hierarchical clustering. Moreover, it allows joint analysis of data sets obtained from different batches, with different protocols and from different labs but without exploiting information from the experimental setup other than the delineation of samples into identically processed sets (IPSs). CONSTANd's simplicity and applicability to proteomics as well as transcriptomics data make it an interesting candidate for integration in multi-omics workflows.
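The iterative, constraint-enforcing idea described above can be illustrated with a minimal sketch. It is not the authors' released R/BioConductor or Python code: the function name `constand_sketch`, the target value of 1 for every row and column mean, and the convergence tolerance are all assumptions chosen for illustration. The sketch alternates row and column rescaling (in the spirit of iterative proportional fitting) on a strictly positive abundance matrix until the column means are as uniform as the row means.

```python
import numpy as np

def constand_sketch(data, tol=1e-6, max_iter=200):
    """Illustrative alternating row/column scaling (hypothetical helper).

    Rows are features (e.g. peptides or genes), columns are samples.
    Assumes all entries are strictly positive. Target means of 1 are
    an assumption made for this sketch.
    """
    x = np.asarray(data, dtype=float).copy()
    for _ in range(max_iter):
        # Enforce the row constraint: every row mean becomes 1.
        x /= x.mean(axis=1, keepdims=True)
        # Check the column constraint; stop once all column means are ~1.
        col_means = x.mean(axis=0, keepdims=True)
        if np.max(np.abs(col_means - 1.0)) < tol:
            break
        # Enforce the column constraint: every column mean becomes 1.
        x /= col_means
    return x

# Usage: normalize a small random matrix of positive "abundances".
rng = np.random.default_rng(0)
raw = rng.uniform(1, 10, size=(6, 4))
normalized = constand_sketch(raw)
```

Because each step only rescales whole rows or whole columns, the procedure is cheap per iteration, which is consistent with the emphasis on speed in the abstract; the actual constraints and stopping rule of CONSTANd should be taken from the published method.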


Subject(s)
High-Throughput Nucleotide Sequencing/methods, High-Throughput Nucleotide Sequencing/standards, Animals, Protein Databases, Humans, Leishmania/metabolism, Male, Inbred C57BL Mice, Principal Component Analysis, Proteomics, Reference Standards
5.
J Mass Spectrom ; 55(8): e4471, 2020 Aug.
Article in English | MEDLINE | ID: mdl-31713933

ABSTRACT

There is a trend in the analysis of shotgun proteomics data that aims to combine information from multiple search engines to increase the number of peptide annotations in an experiment. Typically, the degree of search engine complementarity and search engine agreement is visually illustrated by means of Venn diagrams that present the findings of a database search on the level of the nonredundant peptide annotations. We argue that this practice is not fit for purpose since the diagrams do not take into account, and often conceal, the information on complementarity and agreement at the level of the spectrum identification. We promote a new type of visualization that provides insight into the peptide sequence agreement at the level of the peptide-spectrum match (PSM) as a measure of consensus between two search engines with nominal outcomes. We applied the visualizations and percentage sequence agreement to an in-house data set of our benchmark organism, Caenorhabditis elegans, and illustrated that when assessing the agreement between search engines, one should disentangle the notion of PSM confidence and PSM identity. The visualizations presented in this manuscript provide a more informative assessment of pairs of search engines and are made available as an R function in the Supporting Information.
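The contrast between peptide-level overlap and PSM-level agreement can be made concrete with a toy calculation. This is a sketch under stated assumptions, not the R function from the Supporting Information: the helper names `psm_agreement` and `peptide_overlap`, the dictionary layout (spectrum ID mapped to a peptide sequence), and the exact definition of "percentage sequence agreement" used here (identical sequences among spectra identified by both engines) are all hypothetical.

```python
def psm_agreement(engine_a, engine_b):
    """Percentage of co-identified spectra with identical peptide sequences.

    Both arguments map a spectrum ID to the peptide sequence assigned by
    one search engine (hypothetical data layout).
    """
    shared = engine_a.keys() & engine_b.keys()
    if not shared:
        return 0.0
    agree = sum(1 for s in shared if engine_a[s] == engine_b[s])
    return 100.0 * agree / len(shared)

def peptide_overlap(engine_a, engine_b):
    """Percentage overlap of the nonredundant peptide lists (Venn-style)."""
    peptides_a = set(engine_a.values())
    peptides_b = set(engine_b.values())
    union = peptides_a | peptides_b
    if not union:
        return 0.0
    return 100.0 * len(peptides_a & peptides_b) / len(union)

# Toy example: both engines report the same two peptides,
# but assign them to opposite spectra.
engine_a = {"spectrum_1": "PEPTIDEK", "spectrum_2": "ELVISK"}
engine_b = {"spectrum_1": "ELVISK", "spectrum_2": "PEPTIDEK"}
venn_view = peptide_overlap(engine_a, engine_b)
psm_view = psm_agreement(engine_a, engine_b)
```

In this toy case the peptide lists overlap completely while no individual spectrum receives the same sequence from both engines, which is the kind of disagreement a peptide-level Venn diagram would conceal.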


Subject(s)
Protein Databases, Peptides, Proteomics, Peptides/analysis, Peptides/chemistry, Peptides/classification, Proteomics/methods, Proteomics/standards, Search Engine/methods, Search Engine/standards, Tandem Mass Spectrometry
6.
Methods Mol Biol ; 1719: 141-159, 2018.
Article in English | MEDLINE | ID: mdl-29476509

ABSTRACT

In differential peptidomics, peptide profiles are compared between biological samples and the resulting expression levels are correlated to a phenotype of interest. This, in turn, allows us insight into how peptides may affect the phenotype of interest. In quantitative differential peptidomics, both label-based and label-free techniques are often employed. Label-based techniques have several advantages over label-free methods, primarily that labels allow for various samples to be pooled prior to liquid chromatography-mass spectrometry (LC-MS) analysis, reducing between-run variation. Here, we detail a method for performing quantitative peptidomics using stable amine-binding isotopic and isobaric tags.


Subject(s)
Liquid Chromatography/methods, Isotope Labeling/methods, Peptide Fragments/analysis, Peptide Fragments/metabolism, Proteomics/methods, Tandem Mass Spectrometry/methods, Humans