1.
Nat Methods ; 20(12): 1883-1886, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37996752

ABSTRACT

Cardinal v.3 is an open-source software for reproducible analysis of mass spectrometry imaging experiments. A major update from its previous versions, Cardinal v.3 supports most mass spectrometry imaging workflows. Its analytical capabilities include advanced data processing such as mass recalibration, advanced statistical analyses such as single-ion segmentation and rough annotation-based classification, and memory-efficient analyses of large-scale multitissue experiments.


Subjects
Computer-Assisted Image Processing , Software , Mass Spectrometry/methods
2.
Nat Methods ; 20(10): 1523-1529, 2023 10.
Article in English | MEDLINE | ID: mdl-37749212

ABSTRACT

Protein complexes are responsible for the enactment of most cellular functions. For the protein complex to form and function, its subunits often need to be present at defined quantitative ratios. Typically, global changes in protein complex composition are assessed with experimental approaches that tend to be time consuming. Here, we have developed a computational algorithm for the detection of altered protein complexes based on the systematic assessment of subunit ratios from quantitative proteomic measurements. We applied it to measurements from breast cancer cell lines and patient biopsies and were able to identify strong remodeling of HDAC2 epigenetic complexes in more aggressive forms of cancer. The presented algorithm is available as an R package and enables the inference of changes in protein complex states by extracting functionally relevant information from bottom-up proteomic datasets.


Subjects
Proteome , Proteomics , Humans , Proteome/metabolism , Algorithms , MCF-7 Cells , Computational Biology
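The abstract above describes detecting altered complexes by systematically assessing subunit ratios. A minimal sketch of that general idea follows, assuming replicate-matched abundances per subunit in two conditions. The function name `subunit_ratio_shifts` and the input layout are hypothetical, and this is not the algorithm implemented in the authors' R package.

```python
from itertools import combinations
from math import log2
from statistics import mean


def subunit_ratio_shifts(cond_a, cond_b):
    """Return the change in mean log2 subunit ratio between two conditions.

    cond_a, cond_b: dict mapping subunit name -> list of abundances,
    one value per replicate (hypothetical input layout).
    """
    shifts = {}
    for s1, s2 in combinations(sorted(cond_a), 2):
        # Mean log2 ratio of the pair within each condition.
        ratio_a = mean(log2(x / y) for x, y in zip(cond_a[s1], cond_a[s2]))
        ratio_b = mean(log2(x / y) for x, y in zip(cond_b[s1], cond_b[s2]))
        # A shift far from 0 suggests the stoichiometry of the pair changed.
        shifts[(s1, s2)] = ratio_b - ratio_a
    return shifts
```

A real analysis would also need a statistical test on each shift and a correction for multiple comparisons; the sketch only shows the ratio bookkeeping.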
3.
Mol Cell Proteomics ; 22(1): 100477, 2023 01.
Article in English | MEDLINE | ID: mdl-36496144

ABSTRACT

Liquid chromatography coupled with bottom-up mass spectrometry (LC-MS/MS)-based proteomics is increasingly used to detect changes in posttranslational modifications (PTMs) in samples from different conditions. Analysis of data from such experiments faces numerous statistical challenges. These include the low abundance of modified proteoforms, the small number of observed peptides that span modification sites, and confounding between changes in the abundance of PTM and the overall changes in the protein abundance. Therefore, statistical approaches for detecting differential PTM abundance must integrate all the available information pertaining to a PTM site and consider all the relevant sources of confounding and variation. In this manuscript, we propose such a statistical framework, which is versatile, accurate, and leads to reproducible results. The framework requires an experimental design, which quantifies, for each sample, both peptides with PTMs and peptides from the same proteins with no modification sites. The proposed framework supports both label-free and tandem mass tag-based LC-MS/MS acquisitions. The statistical methodology separately summarizes the abundances of peptides with and without the modification sites, by fitting separate linear mixed effects models appropriate for the experimental design. Next, model-based inferences regarding the PTM and the protein-level abundances are combined to account for the confounding between these two sources. Evaluations on computer simulations, a spike-in experiment with known ground truth, and three biological experiments with different organisms, modification types, and data acquisition types demonstrate the improved fold change estimation and detection of differential PTM abundance, as compared to currently used approaches. The proposed framework is implemented in the free and open-source R/Bioconductor package MSstatsPTM.


Subjects
Proteomics , Tandem Mass Spectrometry , Proteomics/methods , Liquid Chromatography , Post-Translational Protein Processing , Proteins , Peptides/chemistry
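The abstract above says that PTM-level and protein-level inferences are combined to account for confounding. One simple way to picture such a combination is to subtract the protein log2 fold change from the PTM log2 fold change and pool the standard errors, assuming the two estimates are independent. The sketch below illustrates only that arithmetic. The function name is hypothetical, and it is not presented as the MSstatsPTM API or its exact procedure.

```python
from math import sqrt


def adjust_ptm_fold_change(ptm_log2fc, ptm_se, protein_log2fc, protein_se):
    """Adjust a PTM log2 fold change for the change in its parent protein.

    Assumes the PTM and protein estimates are independent, so their
    variances add (an illustrative simplification).
    """
    adjusted = ptm_log2fc - protein_log2fc
    se = sqrt(ptm_se ** 2 + protein_se ** 2)
    # Test statistic for "no change in modification beyond the protein change".
    return adjusted, se, adjusted / se
```

For example, a PTM log2 fold change of 2.0 that sits on a protein whose own log2 fold change is 1.5 leaves an adjusted change of 0.5 attributable to the modification itself.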
4.
Bioinformatics ; 39(2)2023 02 03.
Article in English | MEDLINE | ID: mdl-36744928

ABSTRACT

MOTIVATION: Mass Spectrometry Imaging (MSI) analyzes complex biological samples such as tissues. It simultaneously characterizes the ions present in the tissue in the form of mass spectra, and the spatial distribution of the ions across the tissue in the form of ion images. Unsupervised clustering of ion images facilitates the interpretation in the spectral domain, by identifying groups of ions with similar spatial distributions. Unfortunately, many current methods for clustering ion images ignore the spatial features of the images, and are therefore unable to learn these features for clustering purposes. Alternative methods extract spatial features using deep neural networks pre-trained on natural image tasks; however, this is often inadequate since ion images are substantially noisier than natural images. RESULTS: We contribute a deep clustering approach for ion images that accounts for both spatial contextual features and noise. In evaluations on a simulated dataset and on four experimental datasets of different tissue types, the proposed method grouped ions from the same source into the same cluster more frequently than existing methods. We further demonstrated that using ion image clustering as a pre-processing step facilitated the interpretation of a subsequent spatial segmentation as compared to using either all the ions or one ion at a time. As a result, the proposed approach facilitated the interpretability of MSI data in both the spectral domain and the spatial domain. AVAILABILITY AND IMPLEMENTATION: The data and code are available at https://github.com/DanGuo1223/mzClustering. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Computer Neural Networks , Mass Spectrometry/methods , Cluster Analysis , Ions/analysis
5.
Bioinformatics ; 39(39 Suppl 1): i494-i503, 2023 06 30.
Article in English | MEDLINE | ID: mdl-37387179

ABSTRACT

Causal query estimation in biomolecular networks commonly selects a 'valid adjustment set', i.e. a subset of network variables that eliminates the bias of the estimator. The same query may have multiple valid adjustment sets, each with a different variance. When networks are partially observed, current methods use graph-based criteria to find an adjustment set that minimizes asymptotic variance. Unfortunately, many models that share the same graph topology, and therefore the same functional dependencies, may differ in the processes that generate the observational data. In these cases, the topology-based criteria fail to distinguish the variances of the adjustment sets. This deficiency can lead to sub-optimal adjustment sets, and to mischaracterization of the effect of the intervention. We propose an approach for deriving 'optimal adjustment sets' that takes into account the nature of the data, bias and finite-sample variance of the estimator, and cost. It empirically learns the data generating processes from historical experimental data, and characterizes the properties of the estimators by simulation. We demonstrate the utility of the proposed approach in four biomolecular case studies with different topologies and different data generation processes. The implementation and reproducible case studies are at https://github.com/srtaheri/OptimalAdjustmentSet.


Subjects
Computational Biology , Computer Simulation
6.
J Proteome Res ; 22(2): 551-556, 2023 02 03.
Article in English | MEDLINE | ID: mdl-36622173

ABSTRACT

Liquid chromatography coupled with bottom-up mass spectrometry (LC-MS/MS)-based proteomics is a versatile technology for identifying and quantifying proteins in complex biological mixtures. Postidentification, analysis of changes in protein abundances between conditions requires increasingly complex and specialized statistical methods. Many of these methods, in particular the family of open-source Bioconductor packages MSstats, are implemented in a coding language such as R. To make the methods in MSstats accessible to users with limited programming and statistical background, we have created MSstatsShiny, an R-Shiny graphical user interface (GUI) integrated with MSstats, MSstatsTMT, and MSstatsPTM. The GUI provides a point-and-click analysis pipeline applicable to a wide variety of proteomics experimental types, including label-free data-dependent acquisitions (DDAs) or data-independent acquisitions (DIAs), or tandem mass tag (TMT)-based TMT-DDAs, answering questions such as relative changes in the abundance of peptides, proteins, or post-translational modifications (PTMs). To support reproducible research, the application saves the user's selections and builds an R script that programmatically recreates the analysis. MSstatsShiny can be installed locally via GitHub and Bioconductor, or utilized on the cloud at www.msstatsshiny.com. We illustrate the utility of the platform using two experimental data sets (MassIVE IDs MSV000086623 and MSV000085565).


Subjects
Proteomics , Software , Proteomics/methods , Liquid Chromatography/methods , Tandem Mass Spectrometry/methods , Proteins/analysis
7.
J Proteome Res ; 22(8): 2641-2659, 2023 08 04.
Article in English | MEDLINE | ID: mdl-37467362

ABSTRACT

Repeated measures experimental designs, which quantify proteins in biological subjects repeatedly over multiple experimental conditions or times, are commonly used in mass spectrometry-based proteomics. Such designs distinguish the biological variation within and between the subjects and increase the statistical power of detecting within-subject changes in protein abundance. Meanwhile, proteomics experiments increasingly incorporate tandem mass tag (TMT) labeling, a multiplexing strategy that gains both relative protein quantification accuracy and sample throughput. However, combining repeated measures and TMT multiplexing in a large-scale investigation presents statistical challenges due to unique interplays of between-mixture, within-mixture, between-subject, and within-subject variation. This manuscript proposes a family of linear mixed-effects models for differential analysis of proteomics experiments with repeated measures and TMT multiplexing. These models decompose the variation in the data into the contributions from its sources as appropriate for the specifics of each experiment, enable statistical inference of differential protein abundance, and recognize a difference in the uncertainty of between-subject versus within-subject comparisons. The proposed family of models is implemented in the R/Bioconductor package MSstatsTMT v2.2.0. Evaluations of four simulated datasets and four investigations answering diverse biological questions demonstrated the value of this approach as compared to the existing general-purpose approaches and implementations.


Subjects
Research Design , Tandem Mass Spectrometry , Humans , Proteome/analysis
8.
J Proteome Res ; 22(5): 1466-1482, 2023 05 05.
Article in English | MEDLINE | ID: mdl-37018319

ABSTRACT

The MSstats R-Bioconductor family of packages is widely used for statistical analyses of quantitative bottom-up mass spectrometry-based proteomic experiments to detect differentially abundant proteins. It is applicable to a variety of experimental designs and data acquisition strategies and is compatible with many data processing tools used to identify and quantify spectral features. In the face of ever-increasing complexities of experiments and data processing strategies, the core package of the family, with the same name MSstats, has undergone a series of substantial updates. Its new version MSstats v4.0 improves the usability, versatility, and accuracy of statistical methodology, and the usage of computational resources. New converters integrate the output of upstream processing tools directly with MSstats, requiring less manual work by the user. The package's statistical models have been updated to a more robust workflow. Finally, MSstats' code has been substantially refactored to improve memory use and computation speed. Here we detail these updates, highlighting methodological differences between the new and old versions. An empirical comparison of MSstats v4.0 to its previous implementations, as well as to the packages MSqRob and DEqMS, on controlled mixtures and biological experiments demonstrated a stronger performance and better usability of MSstats v4.0 as compared to existing methods.


Subjects
Proteomics , Research Design , Proteomics/methods , Software , Mass Spectrometry/methods , Liquid Chromatography/methods
9.
Nat Methods ; 17(10): 981-984, 2020 10.
Article in English | MEDLINE | ID: mdl-32929271

ABSTRACT

MassIVE.quant is a repository infrastructure and data resource for reproducible quantitative mass spectrometry-based proteomics, which is compatible with all mass spectrometry data acquisition types and computational analysis tools. A branch structure enables MassIVE.quant to systematically store raw experimental data, metadata of the experimental design, scripts of the quantitative analysis workflow, intermediate input and output files, as well as alternative reanalyses of the same dataset.


Subjects
Protein Databases , Mass Spectrometry , Proteomics , Algorithms , Fungal Proteins/chemistry , Reproducibility of Results , Saccharomyces cerevisiae/metabolism , Software
10.
Bioinformatics ; 38(Suppl 1): i350-i358, 2022 06 24.
Article in English | MEDLINE | ID: mdl-35758817

ABSTRACT

MOTIVATION: Estimating causal queries, such as changes in protein abundance in response to a perturbation, is a fundamental task in the analysis of biomolecular pathways. The estimation requires experimental measurements on the pathway components. However, in practice many pathway components are left unobserved (latent) because they are either unknown, or difficult to measure. Latent variable models (LVMs) are well-suited for such estimation. Unfortunately, LVM-based estimation of causal queries can be inaccurate when parameters of the latent variables are not uniquely identified, or when the number of latent variables is misspecified. This has limited the use of LVMs for causal inference in biomolecular pathways. RESULTS: In this article, we propose a general and practical approach for LVM-based estimation of causal queries. We prove that, despite the challenges above, LVM-based estimators of causal queries are accurate if the queries are identifiable according to Pearl's do-calculus and describe an algorithm for its estimation. We illustrate the breadth and the practical utility of this approach for estimating causal queries in four synthetic and two experimental case studies, where structures of biomolecular pathways challenge the existing methods for causal query estimation. AVAILABILITY AND IMPLEMENTATION: The code and the data documenting all the case studies are available at https://github.com/srtaheri/LVMwithDoCalculus. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms , Calculi , Humans , Theoretical Models , Proteins
11.
J Proteome Res ; 21(1): 289-294, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34919405

ABSTRACT

Skyline Batch is a newly developed Windows forms application that enables the easy and consistent reprocessing of data with Skyline. Skyline has made previous advances in this direction; however, none enable seamless automated reprocessing of local and remote files. Skyline keeps a log of all of the steps that were taken in the document; however, reproducing these steps takes time and allows room for human error. Skyline also has a command-line interface, enabling it to be run from a batch script, but using the program in this way requires expertise in editing these scripts. By formalizing the workflow of a highly used set of batch scripts into an intuitive and powerful user interface, Skyline Batch can reprocess data stored in remote repositories just by opening and running a Skyline Batch configuration file. When run, a Skyline Batch configuration downloads all necessary remote files and then runs a four-step Skyline workflow. By condensing the steps needed to reprocess the data into one file, Skyline Batch gives researchers the opportunity to publish their processing along with their data and other analysis files. These easily run configuration files will greatly increase the transparency and reproducibility of published work. Skyline Batch is freely available at https://skyline.ms/batch.url.


Subjects
Software , User-Computer Interface , Humans , Reproducibility of Results , Workflow
12.
Clin Proteomics ; 19(1): 8, 2022 Apr 19.
Article in English | MEDLINE | ID: mdl-35439943

ABSTRACT

BACKGROUND: Mass spectrometry imaging (MSI) derives spatial molecular distribution maps directly from clinical tissue specimens and thus bears great potential for assisting pathologists with diagnostic decisions or personalized treatments. Unfortunately, progress in translational MSI is often hindered by insufficient quality control and lack of reproducible data analysis. Raw data and analysis scripts are rarely publicly shared. Here, we demonstrate the application of the Galaxy MSI tool set for the reproducible analysis of a urothelial carcinoma dataset. METHODS: Tryptic peptides were imaged in a cohort of 39 formalin-fixed, paraffin-embedded human urothelial cancer tissue cores with a MALDI-TOF/TOF device. The complete data analysis was performed in a fully transparent and reproducible manner on the European Galaxy Server. Annotations of tumor and stroma were performed by a pathologist and transferred to the MSI data to allow for supervised classifications of tumor vs. stroma tissue areas as well as for muscle-infiltrating and non-muscle infiltrating urothelial carcinomas. For putative peptide identifications, m/z features were matched to the MSiMass list. RESULTS: Rigorous quality control in combination with careful pre-processing enabled reduction of m/z shifts and intensity batch effects. High classification accuracy was found for both tumor vs. stroma and muscle-infiltrating vs. non-muscle infiltrating urothelial tumors. Some of the most discriminative m/z features for each condition could be assigned a putative identity: stromal tissue was characterized by collagen peptides and tumor tissue by histone peptides. Immunohistochemistry confirmed an increased histone H2A abundance in the tumor compared to the stroma tissues. The muscle-infiltration status was distinguished via MSI by peptides from intermediate filaments such as cytokeratin 7 in non-muscle infiltrating carcinomas and vimentin in muscle-infiltrating urothelial carcinomas, which was confirmed by immunohistochemistry. To make the study fully reproducible and to advocate the criteria of FAIR (findability, accessibility, interoperability, and reusability) research data, we share the raw data, spectra annotations as well as all Galaxy histories and workflows. Data are available via ProteomeXchange with identifier PXD026459 and Galaxy results via https://github.com/foellmelanie/Bladder_MSI_Manuscript_Galaxy_links. CONCLUSION: Here, we show that translational MSI data analysis in a fully transparent and reproducible manner is possible and we would like to encourage the community to join our efforts.

13.
Mol Cell Proteomics ; 19(2): 421-430, 2020 02.
Article in English | MEDLINE | ID: mdl-31888964

ABSTRACT

In bottom-up, label-free discovery proteomics, biological samples are acquired in a data-dependent (DDA) or data-independent (DIA) manner, with peptide signals recorded in an intact (MS1) and fragmented (MS2) form. While DDA has only the MS1 space for quantification, DIA contains both MS1 and MS2 at high quantitative quality. DIA profiles of complex biological matrices such as tissues or cells can contain quantitative interferences, and the interferences at the MS1 and the MS2 signals are often independent. When comparing biological conditions, the interferences can compromise the detection of differential peptide or protein abundance and lead to false positive or false negative conclusions. We hypothesized that the combined use of MS1 and MS2 quantitative signals could improve our ability to detect differentially abundant proteins. Therefore, we developed a statistical procedure incorporating both MS1 and MS2 quantitative information of DIA. We benchmarked the performance of the MS1-MS2-combined method to the individual use of MS1 or MS2 in DIA using four previously published controlled mixtures, as well as in two previously unpublished controlled mixtures. In the majority of the comparisons, the combined method outperformed the individual use of MS1 or MS2. This was particularly true for comparisons with low fold changes, few replicates, and situations where MS1 and MS2 were of similar quality. When applied to a previously unpublished investigation of lung cancer, the MS1-MS2-combined method increased the coverage of known activated pathways. Since recent technological developments continue to increase the quality of MS1 signals (e.g. using the BoxCar scan mode for Orbitrap instruments), the combination of the MS1 and MS2 information has a high potential for future statistical analysis of DIA data.


Subjects
Proteomics/methods , Animals , Caenorhabditis elegans , Cerebellum/metabolism , Statistical Data Interpretation , HeLa Cells , Humans , Lung/metabolism , Lung Neoplasms/metabolism , Mass Spectrometry , Mice , Saccharomyces cerevisiae
14.
Mol Cell Proteomics ; 19(6): 944-959, 2020 06.
Article in English | MEDLINE | ID: mdl-32234965

ABSTRACT

In bottom-up mass spectrometry-based proteomics, relative protein quantification is often achieved with data-dependent acquisition (DDA), data-independent acquisition (DIA), or selected reaction monitoring (SRM). These workflows quantify proteins by summarizing the abundances of all the spectral features of the protein (e.g. precursor ions, transitions or fragments) in a single value per protein per run. When abundances of some features are inconsistent with the overall protein profile (for technological reasons such as interferences, or for biological reasons such as post-translational modifications), the protein-level summaries and the downstream conclusions are undermined. We propose a statistical approach that automatically detects spectral features with such inconsistent patterns. The detected features can be separately investigated, and if necessary, removed from the data set. We evaluated the proposed approach on a series of benchmark-controlled mixtures and biological investigations with DDA, DIA and SRM data acquisitions. The results demonstrated that it could facilitate and complement manual curation of the data. Moreover, it can improve the estimation accuracy, sensitivity and specificity of detecting differentially abundant proteins, and reproducibility of conclusions across different data processing tools. The approach is implemented as an option in the open-source R-based software MSstats.


Subjects
Mass Spectrometry/methods , Proteins/analysis , Proteomics/methods , Protein Databases , Post-Translational Protein Processing , Reproducibility of Results , Sensitivity and Specificity , Software
15.
Mol Cell Proteomics ; 19(10): 1706-1723, 2020 10.
Article in English | MEDLINE | ID: mdl-32680918

ABSTRACT

Tandem mass tag (TMT) is a multiplexing technology widely used in proteomic research. It enables relative quantification of proteins from multiple biological samples in a single MS run with high efficiency and high throughput. However, experiments often require more biological replicates or conditions than can be accommodated by a single run, and involve multiple TMT mixtures and multiple runs. Such larger-scale experiments combine sources of biological and technical variation in patterns that are complex, unique to TMT-based workflows, and challenging for the downstream statistical analysis. These patterns cannot be adequately characterized by statistical methods designed for other technologies, such as label-free proteomics or transcriptomics. This manuscript proposes a general statistical approach for relative protein quantification in MS-based experiments with TMT labeling. It is applicable to experiments with multiple conditions, multiple biological replicate runs and multiple technical replicate runs, and unbalanced designs. It is based on a flexible family of linear mixed-effects models that handle complex patterns of technical artifacts and missing values. The approach is implemented in MSstatsTMT, a freely available open-source R/Bioconductor package compatible with data processing tools such as Proteome Discoverer, MaxQuant, OpenMS, and SpectroMine. Evaluation on a controlled mixture, simulated datasets, and three biological investigations with diverse designs demonstrated that MSstatsTMT balanced the sensitivity and the specificity of detecting differentially abundant proteins, in large-scale experiments with multiple biological mixtures.


Subjects
Isotope Labeling , Proteome/metabolism , Statistics as Topic , Tandem Mass Spectrometry , Humans , Proteomics
16.
Bioinformatics ; 36(Suppl_1): i300-i308, 2020 07 01.
Article in English | MEDLINE | ID: mdl-32657378

ABSTRACT

MOTIVATION: Mass spectrometry imaging (MSI) characterizes the molecular composition of tissues at spatial resolution, and has a strong potential for distinguishing tissue types, or disease states. This can be achieved by supervised classification, which takes as input MSI spectra, and assigns class labels to subtissue locations. Unfortunately, developing such classifiers is hindered by the limited availability of training sets with subtissue labels as the ground truth. Subtissue labeling is prohibitively expensive, and only rough annotations of the entire tissues are typically available. Classifiers trained on data with approximate labels have sub-optimal performance. RESULTS: To alleviate this challenge, we contribute a semi-supervised approach mi-CNN. mi-CNN implements multiple instance learning with a convolutional neural network (CNN). The multiple instance aspect enables weak supervision from tissue-level annotations when classifying subtissue locations. The convolutional architecture of the CNN captures contextual dependencies between the spectral features. Evaluations on simulated and experimental datasets demonstrated that mi-CNN improved the subtissue classification as compared to traditional classifiers. We propose mi-CNN as an important step toward accurate subtissue classification in MSI, enabling rapid distinction between tissue types and disease states. AVAILABILITY AND IMPLEMENTATION: The data and code are available at https://github.com/Vitek-Lab/mi-CNN_MSI.


Subjects
Computer Neural Networks , Mass Spectrometry
17.
Bioinformatics ; 36(Suppl_2): i745-i753, 2020 12 30.
Article in English | MEDLINE | ID: mdl-33381824

ABSTRACT

MOTIVATION: Accurate estimation of false discovery rate (FDR) of spectral identification is a central problem in mass spectrometry-based proteomics. Over the past two decades, target-decoy approaches (TDAs) and decoy-free approaches (DFAs) have been widely used to estimate FDR. TDAs use a database of decoy species to faithfully model score distributions of incorrect peptide-spectrum matches (PSMs). DFAs, on the other hand, fit two-component mixture models to learn the parameters of correct and incorrect PSM score distributions. While conceptually straightforward, both approaches lead to problems in practice, particularly in experiments that push instrumentation to the limit and generate low fragmentation-efficiency and low signal-to-noise-ratio spectra. RESULTS: We introduce a new decoy-free framework for FDR estimation that generalizes present DFAs while exploiting more search data in a manner similar to TDAs. Our approach relies on multi-component mixtures, in which score distributions corresponding to the correct PSMs, best incorrect PSMs and second-best incorrect PSMs are modeled by the skew normal family. We derive EM algorithms to estimate parameters of these distributions from the scores of best and second-best PSMs associated with each experimental spectrum. We evaluate our models on multiple proteomics datasets and a HeLa cell digest case study consisting of more than a million spectra in total. We provide evidence of improved performance over existing DFAs and improved stability and speed over TDAs without any performance degradation. We propose that the new strategy has the potential to extend beyond peptide identification and reduce the need for TDA on all analytical platforms. AVAILABILITY AND IMPLEMENTATION: https://github.com/shawn-peng/FDR-estimation. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Proteomics , Tandem Mass Spectrometry , Algorithms , Protein Databases , HeLa Cells , Humans , Peptides
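For readers unfamiliar with the target-decoy baseline that the abstract above contrasts with decoy-free estimation, the sketch below shows one common simple form of the estimate: the number of decoy matches at or above a score threshold divided by the number of target matches at or above it. The function name is hypothetical, equal-sized target and decoy databases are assumed, and this is not the decoy-free mixture method proposed in the paper.

```python
def target_decoy_fdr(target_scores, decoy_scores, threshold):
    """Estimate FDR among target PSMs scoring at or above `threshold`.

    Assumes separate target and decoy searches against databases of
    equal size, so decoy hits approximate incorrect target hits.
    """
    n_target = sum(score >= threshold for score in target_scores)
    n_decoy = sum(score >= threshold for score in decoy_scores)
    if n_target == 0:
        return 0.0  # nothing accepted, so no false discoveries to report
    # Cap at 1.0 because a rate above 100% is not meaningful.
    return min(1.0, n_decoy / n_target)
```

Raising the threshold generally lowers the estimate at the cost of accepting fewer matches, which is the trade-off both families of methods try to quantify.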
18.
Mol Cell Proteomics ; 18(9): 1836-1850, 2019 09.
Article in English | MEDLINE | ID: mdl-31289117

ABSTRACT

Protein biomarkers for epithelial ovarian cancer are critical for the early detection of the cancer to improve patient prognosis and for the clinical management of the disease to monitor treatment response and to detect recurrences. Unfortunately, the discovery of protein biomarkers is hampered by the limited availability of reliable and sensitive assays needed for the reproducible quantification of proteins in complex biological matrices such as blood plasma. In recent years, targeted mass spectrometry, exemplified by selected reaction monitoring (SRM), has emerged as a method capable of overcoming this limitation. Here, we present a comprehensive SRM-based strategy for developing plasma-based protein biomarkers for epithelial ovarian cancer and illustrate how the SRM platform, when combined with rigorous experimental design and statistical analysis, can result in detection of predictive analytes. Our biomarker development strategy first involved a discovery-driven proteomic effort to derive potential N-glycoprotein biomarker candidates for plasma-based detection of human ovarian cancer from a genetically engineered mouse model of endometrioid ovarian cancer, which accurately recapitulates the human disease. Next, 65 candidate markers selected from proteins of different abundance in the discovery dataset were reproducibly quantified with SRM assays across a large cohort of over 200 plasma samples from ovarian cancer patients and healthy controls. Finally, these measurements were used to derive a 5-protein signature for distinguishing individuals with epithelial ovarian cancer from healthy controls. The sensitivity of the candidate biomarker signature in combination with CA125 ELISA-based measurements currently used in the clinic exceeded that of CA125 ELISA-based measurements alone. The SRM-based strategy in this study is broadly applicable. It can be used in any study that requires accurate and reproducible quantification of selected proteins in a high-throughput and multiplexed fashion.


Subjects
Tumor Biomarkers/blood , Ovarian Epithelial Carcinoma/blood , Mass Spectrometry/methods , Ovarian Neoplasms/blood , Proteomics/methods , Animals , Neoplasm Antigens/blood , Blood Proteins/analysis , CA-125 Antigen/blood , Case-Control Studies , Cohort Studies , Desmoglein 2/blood , Female , Heavy Chain Disease/blood , Humans , Immunoglobulin mu-Chains/blood , Membrane Proteins/blood , Transgenic Mice , Neural Cell Adhesion Molecule L1/blood , Sensitivity and Specificity , Thrombospondin 1/blood
19.
Bioinformatics ; 35(14): i208-i217, 2019 07 15.
Article in English | MEDLINE | ID: mdl-31510675

ABSTRACT

MOTIVATION: Mass spectrometry imaging (MSI) characterizes the spatial distribution of ions in complex biological samples such as tissues. Since many tissues have complex morphology, treatments and conditions often affect the spatial distribution of the ions in morphology-specific ways. Evaluating the selectivity and the specificity of ion localization and regulation across morphology types is biologically important. However, MSI lacks algorithms for segmenting images at both single-ion and spatial resolution. RESULTS: This article contributes spatial-Dirichlet Gaussian mixture model (DGMM), an algorithm and a workflow for the analyses of MSI experiments, that detects components of single-ion images with homogeneous spatial composition. The approach extends DGMMs to account for the spatial structure of MSI. Evaluations on simulated and experimental datasets with diverse MSI workflows demonstrated that spatial-DGMM accurately segments ion images, and can distinguish ions with homogeneous and heterogeneous spatial distribution. We also demonstrated that the extracted spatial information is useful for downstream analyses, such as detecting morphology-specific ions, finding groups of ions with similar spatial patterns, and detecting changes in chemical composition of tissues between conditions. AVAILABILITY AND IMPLEMENTATION: The data and code are available at https://github.com/Vitek-Lab/IonSpattern. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms , Ions/analysis , Mass Spectrometry , Normal Distribution , Workflow
20.
Mol Cell Proteomics ; 17(5): 913-924, 2018 05.
Article in English | MEDLINE | ID: mdl-29438992

ABSTRACT

The need for assay characterization is ubiquitous in quantitative mass spectrometry-based proteomics. Among many assay characteristics, the limit of blank (LOB) and limit of detection (LOD) are two particularly useful figures of merit. LOB and LOD are determined by repeatedly quantifying the observed intensities of peptides in samples with known peptide concentrations and deriving an intensity versus concentration response curve. Most commonly, a weighted linear or logistic curve is fit to the intensity-concentration response, and LOB and LOD are estimated from the fit. Here we argue that these methods inaccurately characterize assays where observed intensities level off at low concentrations, which is a common situation in multiplexed systems. This manuscript illustrates the deficiencies of these methods, and proposes an alternative approach based on nonlinear regression that overcomes these inaccuracies. We evaluated the performance of the proposed method using computer simulations and using eleven experimental data sets acquired in Data-Independent Acquisition (DIA), Parallel Reaction Monitoring (PRM), and Selected Reaction Monitoring (SRM) mode. When the intensity levels off at low concentrations, the nonlinear model changes the estimates of LOB/LOD upwards, in some data sets by 20-40%. In absence of a low concentration intensity leveling off, the estimates of LOB/LOD obtained with nonlinear statistical modeling were identical to those of weighted linear regression. We implemented the nonlinear regression approach in the open-source R-based software MSstats, and advocate its general use for characterization of mass spectrometry-based assays.


Subjects
Mass Spectrometry/methods , Nonlinear Dynamics , Amino Acid Sequence , Biological Assay , Calibration , Humans , Limit of Detection , Theoretical Models , Peptides/chemistry , Regression Analysis
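The abstract above concerns response curves whose intensity levels off at low concentrations. One way to picture such a curve is a plateau followed by a linear rise. The sketch below defines that shape and picks the change point that best fits the data from a list of candidates. The piecewise form, the function names, and the grid search are assumptions for illustration, not the nonlinear regression implemented in MSstats.

```python
def plateau_linear(conc, floor, slope, change_point):
    """Intensity is flat at `floor` below `change_point`, linear above it."""
    return floor + slope * max(0.0, conc - change_point)


def sum_squared_error(concs, intensities, floor, slope, change_point):
    """Squared-error misfit of the plateau-linear curve to the data."""
    return sum(
        (y - plateau_linear(x, floor, slope, change_point)) ** 2
        for x, y in zip(concs, intensities)
    )


def fit_change_point(concs, intensities, floor, slope, candidates):
    """Pick the candidate change point with the smallest misfit.

    `floor` and `slope` are taken as known here to keep the sketch short;
    a real fit would estimate all three parameters jointly.
    """
    return min(
        candidates,
        key=lambda c: sum_squared_error(concs, intensities, floor, slope, c),
    )
```

A straight line fitted through such data would extend below the plateau at low concentrations, which is the kind of inaccuracy the abstract attributes to linear fits.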