ABSTRACT
To explore the biology of lung adenocarcinoma (LUAD) and identify new therapeutic opportunities, we performed comprehensive proteogenomic characterization of 110 tumors and 101 matched normal adjacent tissues (NATs) incorporating genomics, epigenomics, deep-scale proteomics, phosphoproteomics, and acetylproteomics. Multi-omics clustering revealed four subgroups defined by key driver mutations, country, and gender. Proteomic and phosphoproteomic data illuminated biology downstream of copy number aberrations, somatic mutations, and fusions and identified therapeutic vulnerabilities associated with driver events involving KRAS, EGFR, and ALK. Immune subtyping revealed a complex landscape, reinforced the association of STK11 with immune-cold behavior, and underscored a potential immunosuppressive role of neutrophil degranulation. Smoking-associated LUADs showed correlation with other environmental exposure signatures and a field effect in NATs. Matched NATs allowed identification of differentially expressed proteins with potential diagnostic and therapeutic utility. This proteogenomics dataset represents a unique public resource for researchers and clinicians seeking to better understand and treat lung adenocarcinomas.
Subject(s)
Adenocarcinoma of Lung/drug therapy , Adenocarcinoma of Lung/genetics , Lung Neoplasms/drug therapy , Lung Neoplasms/genetics , Proteogenomics , Adenocarcinoma of Lung/immunology , Adult , Aged , Aged, 80 and over , Biomarkers, Tumor/metabolism , Carcinogenesis/genetics , Carcinogenesis/pathology , DNA Copy Number Variations/genetics , DNA Methylation/genetics , Female , Humans , Lung Neoplasms/immunology , Male , Middle Aged , Mutation/genetics , Oncogene Proteins, Fusion , Phenotype , Phosphoproteins/metabolism , Proteome/metabolism
ABSTRACT
The integration of mass spectrometry-based proteomics with next-generation DNA and RNA sequencing profiles tumors more comprehensively. Here this "proteogenomics" approach was applied to 122 treatment-naive primary breast cancers accrued to preserve post-translational modifications, including protein phosphorylation and acetylation. Proteogenomics challenged standard breast cancer diagnoses, provided detailed analysis of the ERBB2 amplicon, defined tumor subsets that could benefit from immune checkpoint therapy, and allowed more accurate assessment of Rb status for prediction of CDK4/6 inhibitor responsiveness. Phosphoproteomics profiles uncovered novel associations between tumor suppressor loss and targetable kinases. Acetylproteome analysis highlighted acetylation on key nuclear proteins involved in the DNA damage response and revealed cross-talk between cytoplasmic and mitochondrial acetylation and metabolism. Our results underscore the potential of proteogenomics for clinical investigation of breast cancer through more accurate annotation of targetable pathways and biological features of this remarkably heterogeneous malignancy.
Subject(s)
Breast Neoplasms/genetics , Breast Neoplasms/pathology , Carcinogenesis/genetics , Carcinogenesis/pathology , Molecular Targeted Therapy , Proteogenomics , APOBEC Deaminases/metabolism , Adult , Aged , Aged, 80 and over , Breast Neoplasms/immunology , Breast Neoplasms/therapy , Cohort Studies , DNA Damage , DNA Repair , Female , Humans , Immunotherapy , Metabolomics , Middle Aged , Mutagenesis/genetics , Phosphorylation , Protein Kinase Inhibitors/pharmacology , Protein Kinases/metabolism , Receptor, ErbB-2/metabolism , Retinoblastoma Protein/metabolism , Tumor Microenvironment/immunology
ABSTRACT
We performed the first proteogenomic study on a prospectively collected colon cancer cohort. Comparative proteomic and phosphoproteomic analysis of paired tumor and normal adjacent tissues produced a catalog of colon cancer-associated proteins and phosphosites, including known and putative new biomarkers, drug targets, and cancer/testis antigens. Proteogenomic integration not only prioritized genomically inferred targets, such as copy-number drivers and mutation-derived neoantigens, but also yielded novel findings. Phosphoproteomics data associated Rb phosphorylation with increased proliferation and decreased apoptosis in colon cancer, which explains why this classical tumor suppressor is amplified in colon tumors and suggests a rationale for targeting Rb phosphorylation in colon cancer. Proteomics identified an association between decreased CD8 T cell infiltration and increased glycolysis in microsatellite instability-high (MSI-H) tumors, suggesting glycolysis as a potential target to overcome the resistance of MSI-H tumors to immune checkpoint blockade. Proteogenomics presents new avenues for biological discoveries and therapeutic development.
Subject(s)
Colonic Neoplasms/genetics , Colonic Neoplasms/therapy , Proteogenomics/methods , Apoptosis/genetics , Biomarkers, Tumor/genetics , Biomarkers, Tumor/metabolism , CD8-Positive T-Lymphocytes , Cell Proliferation/genetics , Colonic Neoplasms/metabolism , Genomics/methods , Glycolysis , Humans , Microsatellite Instability , Mutation , Phosphorylation , Prospective Studies , Proteomics/methods , Retinoblastoma Protein/genetics , Retinoblastoma Protein/metabolism
ABSTRACT
Somatic mutations have been extensively characterized in breast cancer, but the effects of these genetic alterations on the proteomic landscape remain poorly understood. Here we describe quantitative mass-spectrometry-based proteomic and phosphoproteomic analyses of 105 genomically annotated breast cancers, of which 77 provided high-quality data. Integrated analyses provided insights into the somatic cancer genome including the consequences of chromosomal loss, such as the 5q deletion characteristic of basal-like breast cancer. Interrogation of the 5q trans-effects against the Library of Integrated Network-based Cellular Signatures connected loss of CETN3 and SKP1 to elevated expression of epidermal growth factor receptor (EGFR), and SKP1 loss also to increased SRC tyrosine kinase. Global proteomic data confirmed a stromal-enriched group of proteins in addition to basal and luminal clusters, and pathway analysis of the phosphoproteome identified a G-protein-coupled receptor cluster that was not readily identified at the mRNA level. In addition to ERBB2, other amplicon-associated highly phosphorylated kinases were identified, including CDK12, PAK1, PTK2, RIPK2 and TLK2. We demonstrate that proteogenomic analysis of breast cancer elucidates the functional consequences of somatic mutations, narrows candidate nominations for driver genes within large deletions and amplified regions, and identifies therapeutic targets.
Subject(s)
Breast Neoplasms/genetics , Breast Neoplasms/metabolism , Genomics , Mutation/genetics , Proteomics , Signal Transduction , Breast Neoplasms/classification , Breast Neoplasms/enzymology , Calcium-Binding Proteins/deficiency , Calcium-Binding Proteins/genetics , Chromosome Deletion , Chromosomes, Human, Pair 5/genetics , Class I Phosphatidylinositol 3-Kinases , Cyclin-Dependent Kinases/genetics , Cyclin-Dependent Kinases/metabolism , ErbB Receptors/genetics , ErbB Receptors/metabolism , Female , Focal Adhesion Kinase 1/genetics , Focal Adhesion Kinase 1/metabolism , Gene Expression Regulation, Neoplastic , Humans , Mass Spectrometry , Molecular Sequence Annotation , Phosphatidylinositol 3-Kinases/genetics , Phosphoproteins/analysis , Phosphoproteins/genetics , Phosphoproteins/metabolism , Protein Kinases/genetics , Protein Kinases/metabolism , Receptor, ErbB-2/genetics , Receptor, ErbB-2/metabolism , Receptor-Interacting Protein Serine-Threonine Kinase 2/genetics , Receptor-Interacting Protein Serine-Threonine Kinase 2/metabolism , Receptors, G-Protein-Coupled/genetics , Receptors, G-Protein-Coupled/metabolism , S-Phase Kinase-Associated Proteins/genetics , S-Phase Kinase-Associated Proteins/metabolism , Tumor Suppressor Protein p53/genetics , p21-Activated Kinases/genetics , p21-Activated Kinases/metabolism , src-Family Kinases/genetics , src-Family Kinases/metabolism
ABSTRACT
Aberrant phospho-signaling is a hallmark of cancer. We investigated kinase-substrate regulation of 33,239 phosphorylation sites (phosphosites) in 77 breast tumors and 24 breast cancer xenografts. Our search discovered 2134 quantitatively correlated kinase-phosphosite pairs, enriching for and extending experimental or binding-motif predictions. Among the 91 kinases with auto-phosphorylation, elevated EGFR, ERBB2, PRKG1, and WNK1 phosphosignaling were enriched in basal, HER2-E, Luminal A, and Luminal B breast cancers, respectively, revealing subtype-specific regulation. CDKs, MAPKs, and ataxia-telangiectasia proteins were dominant, master regulators of substrate-phosphorylation, whose activities are not captured by genomic evidence. We unveiled phospho-signaling and druggable targets from 113 kinase-substrate pairs and cascades downstream of kinases, including AKT1, BRAF and EGFR. We further identified kinase-substrate-pairs associated with clinical or immune signatures and experimentally validated activated phosphosites of ERBB2, EIF4EBP1, and EGFR. Overall, kinase-substrate regulation revealed by the largest unbiased global phosphorylation data to date connects driver events to their signaling effects.
Subject(s)
Breast Neoplasms/metabolism , Protein Kinases/metabolism , Female , Humans , Phosphorylation , Signal Transduction
ABSTRACT
Clear cell renal cell carcinoma (ccRCC) is the most common type of kidney cancer, comprising approximately 75% of all kidney tumors. Recent studies by The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) have significantly advanced the molecular characterization of RCC and facilitated the development of targeted therapies. Such advances have improved the median survival of patients with advanced disease from less than 10 months prior to 2004 to 30 months by 2011. However, approximately 30% of localized ccRCC patients will nevertheless develop recurrence or metastasis after surgical resection of their tumor. Therefore, it is critical to further analyze potential tumor-associated proteins and their profiles during disease progression. Over the past decade, tremendous effort has been focused on the study of molecular pathways, including genomics, transcriptomics, and proteomics, in order to identify potential molecular biomarkers, as well as to facilitate early detection, monitor tumor progression and uncover potential therapeutic targets. In this review, we focus on recent advances in the proteomic analysis of ccRCC, current strategies and challenges, and perspectives in the field. This insight will highlight the discovery of tumor-associated proteins, and their potential clinical impact on personalized precision-based care in ccRCC.
Subject(s)
Biomarkers, Tumor/genetics , Carcinoma, Renal Cell/genetics , Proteome/genetics , Proteomics , Carcinoma, Renal Cell/pathology , Gene Expression Regulation, Neoplastic , Genomics/trends , Humans
ABSTRACT
Extensive genomic characterization of human cancers presents the problem of inference from genomic abnormalities to cancer phenotypes. To address this problem, we analysed proteomes of colon and rectal tumours characterized previously by The Cancer Genome Atlas (TCGA) and performed integrated proteogenomic analyses. Somatic variants displayed reduced protein abundance compared to germline variants. Messenger RNA transcript abundance did not reliably predict protein abundance differences between tumours. Proteomics identified five proteomic subtypes in the TCGA cohort, two of which overlapped with the TCGA 'microsatellite instability/CpG island methylation phenotype' transcriptomic subtype, but had distinct mutation, methylation and protein expression patterns associated with different clinical outcomes. Although copy number alterations showed strong cis- and trans-effects on mRNA abundance, relatively few of these extended to the protein level. Thus, proteomics data enabled prioritization of candidate driver genes. The chromosome 20q amplicon was associated with the largest global changes at both mRNA and protein levels; proteomics data highlighted potential 20q candidates, including HNF4A (hepatocyte nuclear factor 4, alpha), TOMM34 (translocase of outer mitochondrial membrane 34) and SRC (SRC proto-oncogene, non-receptor tyrosine kinase). Integrated proteogenomic analysis provides functional context to interpret genomic abnormalities and affords a new paradigm for understanding cancer biology.
Subject(s)
Colonic Neoplasms/genetics , Colonic Neoplasms/metabolism , Genomics , Proteome/metabolism , Rectal Neoplasms/genetics , Rectal Neoplasms/metabolism , Transcriptome/genetics , Chromosomes, Human, Pair 20/genetics , CpG Islands/genetics , DNA Copy Number Variations/genetics , DNA Methylation , Hepatocyte Nuclear Factor 4/genetics , Humans , Microsatellite Repeats/genetics , Mitochondrial Membrane Transport Proteins/genetics , Mitochondrial Precursor Protein Import Complex Proteins , Mutation, Missense/genetics , Neoplasm Proteins/analysis , Neoplasm Proteins/genetics , Neoplasm Proteins/metabolism , Point Mutation/genetics , Proteome/analysis , Proteome/genetics , Proteomics , Proto-Oncogene Mas , Proto-Oncogene Proteins pp60(c-src)/genetics , RNA, Messenger/analysis , RNA, Messenger/genetics , RNA, Messenger/metabolism , RNA, Neoplasm/analysis , RNA, Neoplasm/genetics , RNA, Neoplasm/metabolism
ABSTRACT
Coexpression of mRNAs under multiple conditions is commonly used to infer cofunctionality of their gene products despite well-known limitations of this "guilt-by-association" (GBA) approach. Recent advancements in mass spectrometry-based proteomic technologies have enabled global expression profiling at the protein level; however, whether proteome profiling data can outperform transcriptome profiling data for coexpression based gene function prediction has not been systematically investigated. Here, we address this question by constructing and analyzing mRNA and protein coexpression networks for three cancer types with matched mRNA and protein profiling data from The Cancer Genome Atlas (TCGA) and the Clinical Proteomic Tumor Analysis Consortium (CPTAC). Our analyses revealed a marked difference in wiring between the mRNA and protein coexpression networks. Whereas protein coexpression was driven primarily by functional similarity between coexpressed genes, mRNA coexpression was driven by both cofunction and chromosomal colocalization of the genes. Functionally coherent mRNA modules were more likely to have their edges preserved in corresponding protein networks than functionally incoherent mRNA modules. Proteomic data strengthened the link between gene expression and function for at least 75% of Gene Ontology (GO) biological processes and 90% of KEGG pathways. A web application Gene2Net (http://cptac.gene2net.org) developed based on the three protein coexpression networks revealed novel gene-function relationships, such as linking ERBB2 (HER2) to lipid biosynthetic process in breast cancer, identifying PLG as a new gene involved in complement activation, and identifying AEBP1 as a new epithelial-mesenchymal transition (EMT) marker. Our results demonstrate that proteome profiling outperforms transcriptome profiling for coexpression based gene function prediction. Proteomics should be integrated if not preferred in gene function and human disease studies.
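As an illustration of the coexpression idea described above, the following minimal Python sketch links two genes when their abundance profiles across samples are strongly positively correlated. The abstract does not state how the networks were actually constructed, so the Pearson measure, the 0.7 threshold, and the gene names and values below are all assumptions chosen for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric profiles."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def coexpression_edges(profiles, threshold=0.7):
    """Map each gene pair with correlation >= threshold to its correlation.

    The threshold is an illustrative assumption, not a value from the study.
    """
    edges = {}
    for gene_a, gene_b in combinations(sorted(profiles), 2):
        r = pearson(profiles[gene_a], profiles[gene_b])
        if r >= threshold:
            edges[(gene_a, gene_b)] = r
    return edges


# Hypothetical abundance values for three genes across four samples.
profiles = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.5],
    "GENE_C": [4.0, 1.0, 3.0, 2.0],
}
edges = coexpression_edges(profiles)
```

Running the same function once on mRNA profiles and once on protein profiles, then comparing the two edge sets, would be one simple way to look at the kind of wiring difference the abstract reports.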
Subject(s)
Gene Expression Profiling/methods , Neoplasms/genetics , Neoplasms/metabolism , Proteomics/methods , Algorithms , Chromosome Mapping , Epithelial-Mesenchymal Transition , Gene Expression Regulation, Neoplastic , Gene Regulatory Networks , Humans , Mass Spectrometry , Oligonucleotide Array Sequence Analysis , Protein Interaction Maps , Web Browser
ABSTRACT
The Human Cancer Proteome Project (Cancer-HPP) is an international initiative organized by HUPO whose key objective is to decipher the human cancer proteome through a coordinated effort by cancer proteome researchers around the world. The ultimate goal is to map the entire human cancer proteome to disclose tumor biology and drive improved diagnostics, treatment and management of cancer. Here we report the progress in the cancer proteomics field to date, and discuss future proteomic developments that will be needed to optimally delineate cancer phenotypes and advance the molecular characterization of this significant disease that is one of the leading causes of death worldwide.
ABSTRACT
Improvements in mass spectrometry (MS)-based peptide sequencing provide a new opportunity to determine whether polymorphisms, mutations, and splice variants identified in cancer cells are translated. Herein, we apply a proteogenomic data integration tool (QUILTS) to illustrate protein variant discovery using whole genome, whole transcriptome, and global proteome datasets generated from a pair of luminal and basal-like breast-cancer-patient-derived xenografts (PDX). The sensitivity of proteogenomic analysis for single nucleotide variant (SNV) expression and novel splice junction (NSJ) detection was probed using multiple MS/MS sample process replicates defined here as an independent tandem MS experiment using identical sample material. Despite analysis of over 30 sample process replicates, only about 10% of SNVs (somatic and germline) detected by both DNA and RNA sequencing were observed as peptides. An even smaller proportion of peptides corresponding to NSJ observed by RNA sequencing were detected (<0.1%). Peptides mapping to DNA-detected SNVs without a detectable mRNA transcript were also observed, suggesting that transcriptome coverage was incomplete (~80%). In contrast to germline variants, somatic variants were less likely to be detected at the peptide level in the basal-like tumor than in the luminal tumor, raising the possibility of differential translation or protein degradation effects. In conclusion, this large-scale proteogenomic integration allowed us to determine the degree to which mutations are translated and identify gaps in sequence coverage, thereby benchmarking current technology and progress toward whole cancer proteome and transcriptome analysis.
Subject(s)
Alternative Splicing , Mammary Neoplasms, Experimental/genetics , Mutation , Proteomics/methods , Sequence Analysis, DNA/methods , Sequence Analysis, RNA/methods , Animals , Computational Biology/methods , Databases, Genetic , Female , Genome , Humans , Mammary Neoplasms, Experimental/metabolism , Mice , Polymorphism, Single Nucleotide , Tandem Mass Spectrometry , Transcriptome
ABSTRACT
Clinical proteomics requires large-scale analysis of human specimens to achieve statistical significance. We evaluated the long-term reproducibility of an iTRAQ (isobaric tags for relative and absolute quantification)-based quantitative proteomics strategy using one channel for reference across all samples in different iTRAQ sets. A total of 148 liquid chromatography tandem mass spectrometric (LC-MS/MS) analyses were completed, generating six 2D LC-MS/MS data sets for human-in-mouse breast cancer xenograft tissues representative of basal and luminal subtypes. Such large-scale studies require the implementation of robust metrics to assess the contributions of technical and biological variability in the qualitative and quantitative data. Accordingly, we derived a quantification confidence score based on the quality of each peptide-spectrum match to remove quantification outliers from each analysis. After combining confidence score filtering and statistical analysis, reproducible protein identification and quantitative results were achieved from LC-MS/MS data sets collected over a 7-month period. This study provides the first quality assessment on long-term stability and technical considerations for study design of a large-scale clinical proteomics project.
Subject(s)
Breast Neoplasms/pathology , Proteomics/methods , Animals , Breast Neoplasms/chemistry , Chromatography, Liquid , Heterografts , Humans , Mice , Neoplasm Proteins/analysis , Proteome/analysis , Quality Assurance, Health Care , Tandem Mass Spectrometry
ABSTRACT
There is an increasing need in biology and clinical medicine to robustly and reliably measure tens to hundreds of peptides and proteins in clinical and biological samples with high sensitivity, specificity, reproducibility, and repeatability. Previously, we demonstrated that LC-MRM-MS with isotope dilution has suitable performance for quantitative measurements of small numbers of relatively abundant proteins in human plasma and that the resulting assays can be transferred across laboratories while maintaining high reproducibility and quantitative precision. Here, we significantly extend that earlier work, demonstrating that 11 laboratories using 14 LC-MS systems can develop, determine analytical figures of merit, and apply highly multiplexed MRM-MS assays targeting 125 peptides derived from 27 cancer-relevant proteins and seven control proteins to precisely and reproducibly measure the analytes in human plasma. To ensure consistent generation of high quality data, we incorporated a system suitability protocol (SSP) into our experimental design. The SSP enabled real-time monitoring of LC-MRM-MS performance during assay development and implementation, facilitating early detection and correction of chromatographic and instrumental problems. Low to subnanogram/ml sensitivity for proteins in plasma was achieved by one-step immunoaffinity depletion of 14 abundant plasma proteins prior to analysis. Median intra- and interlaboratory reproducibility was <20%, sufficient for most biological studies and candidate protein biomarker verification. Digestion recovery of peptides was assessed and quantitative accuracy improved using heavy-isotope-labeled versions of the proteins as internal standards. Using the highly multiplexed assay, participating laboratories were able to precisely and reproducibly determine the levels of a series of analytes in blinded samples used to simulate an interlaboratory clinical study of patient samples. 
Our study further establishes that LC-MRM-MS using stable isotope dilution, with appropriate attention to analytical validation and appropriate quality control measures, enables sensitive, specific, reproducible, and quantitative measurements of proteins and peptides in complex biological matrices such as plasma.
Subject(s)
Neoplasm Proteins/blood , Neoplasms/metabolism , Peptides/analysis , Proteomics/methods , Chromatography, Liquid/methods , Humans , Isotope Labeling , Mass Spectrometry/methods , Neoplasm Proteins/chemistry , Neoplasm Proteins/isolation & purification , Neoplasms/blood , Peptides/chemistry , Reproducibility of Results
ABSTRACT
The Clinical Proteomic Tumor Analysis Consortium (CPTAC) has produced large proteomics data sets from the mass spectrometric interrogation of tumor samples previously analyzed by The Cancer Genome Atlas (TCGA) program. The availability of the genomic and proteomic data is enabling proteogenomic study for both reference (i.e., contained in major sequence databases) and nonreference markers of cancer. The CPTAC laboratories have focused on colon, breast, and ovarian tissues in the first round of analyses; spectra from these data sets were produced from 2D liquid chromatography-tandem mass spectrometry analyses and represent deep coverage. To reduce the variability introduced by disparate data analysis platforms (e.g., software packages, versions, parameters, sequence databases, etc.), the CPTAC Common Data Analysis Platform (CDAP) was created. The CDAP produces both peptide-spectrum-match (PSM) reports and gene-level reports. The pipeline processes raw mass spectrometry data according to the following: (1) peak-picking and quantitative data extraction, (2) database searching, (3) gene-based protein parsimony, and (4) false-discovery rate-based filtering. The pipeline also produces localization scores for the phosphopeptide enrichment studies using the PhosphoRS program. Quantitative information for each of the data sets is specific to the sample processing, with PSM and protein reports containing the spectrum-level or gene-level ("rolled-up") precursor peak areas and spectral counts for label-free or reporter ion log-ratios for 4plex iTRAQ. The reports are available in simple tab-delimited formats and, for the PSM-reports, in mzIdentML. The goal of the CDAP is to provide standard, uniform reports for all of the CPTAC data to enable comparisons between different samples and cancer types as well as across the major omics fields.
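Step (4) of the pipeline above, false-discovery rate-based filtering, can be sketched in a few lines of Python. The abstract does not say how the CDAP estimates the false-discovery rate, so the target-decoy estimate used here (decoy hits divided by target hits above a score cutoff) is an assumption, as are the function name `filter_by_fdr`, the tuple layout, and the example scores.

```python
def filter_by_fdr(psms, max_fdr=0.01):
    """Keep target PSMs above the lowest score cutoff that satisfies max_fdr.

    psms: iterable of (score, is_decoy) tuples, higher score = better match.
    FDR at a cutoff is estimated as decoys / targets among PSMs at or above
    it (a common target-decoy convention, assumed here for illustration).
    """
    ranked = sorted(psms, key=lambda psm: psm[0], reverse=True)
    targets = 0
    decoys = 0
    cutoff_index = 0  # number of top-ranked PSMs to accept
    for index, (_score, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= max_fdr:
            cutoff_index = index
    return [psm for psm in ranked[:cutoff_index] if not psm[1]]


# Hypothetical peptide-spectrum matches: (score, is_decoy).
example_psms = [(10.0, False), (9.0, False), (8.0, False), (7.0, True), (6.0, False)]
accepted_strict = filter_by_fdr(example_psms, max_fdr=0.2)
accepted_loose = filter_by_fdr(example_psms, max_fdr=0.3)
```

With the stricter setting the cutoff falls just above the decoy hit; with the looser one the list extends past it, because one decoy among four targets gives an estimated rate of 0.25.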
Subject(s)
Neoplasms/diagnosis , Neoplasms/metabolism , Proteomics , Biomarkers, Tumor/metabolism , Humans , Proteome/metabolism
ABSTRACT
The NCI Clinical Proteomic Tumor Analysis Consortium (CPTAC) employed a pair of reference xenograft proteomes for initial platform validation and ongoing quality control of its data collection for The Cancer Genome Atlas (TCGA) tumors. These two xenografts, representing basal and luminal-B human breast cancer, were fractionated and analyzed on six mass spectrometers in a total of 46 replicates divided between iTRAQ and label-free technologies, spanning a total of 1095 LC-MS/MS experiments. These data represent a unique opportunity to evaluate the stability of proteomic differentiation by mass spectrometry over many months of time for individual instruments or across instruments running dissimilar workflows. We evaluated iTRAQ reporter ions, label-free spectral counts, and label-free extracted ion chromatograms as strategies for data interpretation (source code is available from http://homepages.uc.edu/~wang2x7/Research.htm ). From these assessments, we found that differential genes from a single replicate were confirmed by other replicates on the same instrument from 61 to 93% of the time. When comparing across different instruments and quantitative technologies, using multiple replicates, differential genes were reproduced by other data sets from 67 to 99% of the time. Projecting gene differences to biological pathways and networks increased the degree of similarity. These overlaps send an encouraging message about the maturity of technologies for proteomic differentiation.
Subject(s)
Heterografts/chemistry , Proteomics/methods , Proteomics/standards , Breast Neoplasms/chemistry , Breast Neoplasms/metabolism , Chromatography, Liquid , Data Interpretation, Statistical , Female , Gene Expression Profiling/methods , Humans , Metabolic Networks and Pathways , Observer Variation , Proteome , Proteomics/instrumentation , Quality Control , Reproducibility of Results , Tandem Mass Spectrometry/standards
ABSTRACT
BACKGROUND: For many years, basic and clinical researchers have taken advantage of the analytical sensitivity and specificity afforded by mass spectrometry in the measurement of proteins. Clinical laboratories are now beginning to deploy these workflows as well. For assays that use proteolysis to generate peptides for protein quantification and characterization, synthetic stable isotope-labeled internal standard peptides are of central importance. No general recommendations are currently available surrounding the use of peptides in protein mass spectrometric assays. CONTENT: The Clinical Proteomic Tumor Analysis Consortium of the National Cancer Institute has collaborated with clinical laboratorians, peptide manufacturers, metrologists, representatives of the pharmaceutical industry, and other professionals to develop a consensus set of recommendations for peptide procurement, characterization, storage, and handling, as well as approaches to the interpretation of the data generated by mass spectrometric protein assays. Additionally, the importance of carefully characterized reference materials (in particular, peptide standards for the improved concordance of amino acid analysis methods across the industry) is highlighted. The alignment of practices around the use of peptides and the transparency of sample preparation protocols should allow for the harmonization of peptide and protein quantification in research and clinical care.
Subject(s)
Clinical Laboratory Techniques , Mass Spectrometry , Peptides/analysis , Proteomics , Specimen Handling , Guidelines as Topic , Humans , Peptides/isolation & purification , Research Personnel
ABSTRACT
Multiple reaction monitoring (MRM) mass spectrometry coupled with stable isotope dilution (SID) and liquid chromatography (LC) is increasingly used in biological and clinical studies for precise and reproducible quantification of peptides and proteins in complex sample matrices. Robust LC-SID-MRM-MS-based assays that can be replicated across laboratories and ultimately in clinical laboratory settings require standardized protocols to demonstrate that the analysis platforms are performing adequately. We developed a system suitability protocol (SSP), which employs a predigested mixture of six proteins, to facilitate performance evaluation of LC-SID-MRM-MS instrument platforms, configured with nanoflow-LC systems interfaced to triple quadrupole mass spectrometers. The SSP was designed for use with low multiplex analyses as well as high multiplex approaches when software-driven scheduling of data acquisition is required. Performance was assessed by monitoring of a range of chromatographic and mass spectrometric metrics including peak width, chromatographic resolution, peak capacity, and the variability in peak area and analyte retention time (RT) stability. The SSP, which was evaluated in 11 laboratories on a total of 15 different instruments, enabled early diagnoses of LC and MS anomalies that indicated suboptimal LC-MRM-MS performance. The observed range in variation of each of the metrics scrutinized serves to define the criteria for optimized LC-SID-MRM-MS platforms for routine use, with pass/fail criteria for system suitability performance measures defined as peak area coefficient of variation <0.15, peak width coefficient of variation <0.15, standard deviation of RT <0.15 min (9 s), and the RT drift <0.5 min (30 s). The deleterious effect of a marginally performing LC-SID-MRM-MS system on the limit of quantification (LOQ) in targeted quantitative assays illustrates the use and need for an SSP to establish robust and reliable system performance. 
Use of an SSP helps to ensure that analyte quantification measurements can be replicated with good precision within and across multiple laboratories and should facilitate more widespread use of MRM-MS technology by the basic biomedical and clinical laboratory research communities.
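The pass/fail criteria quoted above lend themselves to a short check. The Python sketch below applies the four thresholds from the abstract to replicate measurements of one peptide. The function and variable names are hypothetical, the sample standard deviation is used for the coefficient of variation, and RT drift is taken as the difference between the largest and smallest retention time; the abstract does not define drift, so that last choice in particular is an assumption.

```python
from statistics import mean, stdev

# Thresholds as quoted in the abstract.
MAX_PEAK_AREA_CV = 0.15
MAX_PEAK_WIDTH_CV = 0.15
MAX_RT_SD_MIN = 0.15     # minutes
MAX_RT_DRIFT_MIN = 0.5   # minutes


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)


def check_system_suitability(peak_areas, peak_widths, retention_times):
    """Return a pass/fail flag per metric plus an overall 'pass' flag."""
    # Assumption: drift is the range of retention times across the replicates.
    rt_drift = max(retention_times) - min(retention_times)
    results = {
        "peak_area_cv": coefficient_of_variation(peak_areas) < MAX_PEAK_AREA_CV,
        "peak_width_cv": coefficient_of_variation(peak_widths) < MAX_PEAK_WIDTH_CV,
        "rt_sd": stdev(retention_times) < MAX_RT_SD_MIN,
        "rt_drift": rt_drift < MAX_RT_DRIFT_MIN,
    }
    results["pass"] = all(results.values())
    return results


# Hypothetical replicate injections of one SSP peptide (RT in minutes).
stable_run = check_system_suitability(
    peak_areas=[100.0, 105.0, 95.0, 102.0],
    peak_widths=[0.20, 0.21, 0.19, 0.20],
    retention_times=[25.00, 25.05, 24.98, 25.02],
)
drifting_run = check_system_suitability(
    peak_areas=[100.0, 105.0, 95.0, 102.0],
    peak_widths=[0.20, 0.21, 0.19, 0.20],
    retention_times=[25.0, 25.6, 25.0, 25.0],
)
```

Returning one flag per metric, not only the overall result, makes it easier to tell a chromatographic problem (RT metrics) from a signal stability problem (peak area).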
Subject(s)
Chromatography, Liquid/instrumentation , Chromatography, Liquid/methods , Mass Spectrometry/instrumentation , Mass Spectrometry/methods , Amino Acid Sequence , Animals , Cattle , Limit of Detection , Molecular Sequence Data , Peptides/chemistry , Peptides/metabolism , Reference Standards , Software , Time Factors
ABSTRACT
Protein biomarkers are needed to deepen our understanding of cancer biology and to improve our ability to diagnose, monitor, and treat cancers. Important analytical and clinical hurdles must be overcome to allow the most promising protein biomarker candidates to advance into clinical validation studies. Although contemporary proteomics technologies support the measurement of large numbers of proteins in individual clinical specimens, sample throughput remains comparatively low. This problem is amplified in typical clinical proteomics research studies, which routinely suffer from a lack of proper experimental design, resulting in analysis of too few biospecimens to achieve adequate statistical power at each stage of a biomarker pipeline. To address this critical shortcoming, a joint workshop was held by the National Cancer Institute (NCI), National Heart, Lung, and Blood Institute (NHLBI), and American Association for Clinical Chemistry (AACC) with participation from the U.S. Food and Drug Administration (FDA). An important output from the workshop was a statistical framework for the design of biomarker discovery and verification studies. Herein, we describe the use of quantitative clinical judgments to set statistical criteria for clinical relevance and the development of an approach to calculate biospecimen sample size for proteomic studies in discovery and verification stages prior to clinical validation stage. This represents a first step toward building a consensus on quantitative criteria for statistical design of proteomics biomarker discovery and verification research.
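To make the sample-size question concrete, the Python sketch below uses the textbook normal-approximation formula for comparing the mean of one protein between two equally sized groups: n per group = 2 * ((z_(1-alpha/2) + z_power) * sd / delta)^2. This is not the workshop's framework, which the abstract does not spell out; the formula choice, the function name, and the default alpha and power are all assumptions for illustration.

```python
from math import ceil
from statistics import NormalDist


def samples_per_group(effect_size, std_dev, alpha=0.05, power=0.80):
    """Biospecimens needed per group for a two-sided, two-group comparison.

    effect_size: smallest mean difference judged clinically relevant.
    std_dev: assumed common standard deviation of the measurement.
    Uses the normal approximation; small studies would need a t-based
    correction that is omitted here.
    """
    if effect_size <= 0 or std_dev <= 0:
        raise ValueError("effect_size and std_dev must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * std_dev / effect_size) ** 2
    return ceil(n)


# Hypothetical scenario: difference of one standard deviation, then half of one.
n_large_effect = samples_per_group(effect_size=1.0, std_dev=1.0)
n_small_effect = samples_per_group(effect_size=0.5, std_dev=1.0)
```

Halving the effect size roughly quadruples the required number of biospecimens, which illustrates why the clinical judgment about what difference matters has to be fixed before a study is sized.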
Subject(s)
Biomarkers, Tumor/genetics , Blood Proteins/genetics , Gene Expression Regulation, Neoplastic , Neoplasm Proteins/genetics , Neoplasms/genetics , Proteomics/statistics & numerical data , Specimen Handling/statistics & numerical data , Algorithms , Biomarkers, Tumor/metabolism , Blood Proteins/metabolism , Cohort Studies , Humans , Neoplasm Proteins/metabolism , Neoplasms/diagnosis , Neoplasms/metabolism , Research Design , Sample Size , Sensitivity and Specificity
ABSTRACT
Policies supporting the rapid and open sharing of proteomic data are being implemented by the leading journals in the field. The proteomics community is taking steps to ensure that data are made publicly accessible and are of high quality, a challenging task that requires the development and deployment of methods for measuring and documenting data quality metrics. On September 18, 2010, the United States National Cancer Institute convened the "International Workshop on Proteomic Data Quality Metrics" in Sydney, Australia, to identify and address issues facing the development and use of such methods for open access proteomics data. The stakeholders at the workshop enumerated the key principles underlying a framework for data quality assessment in mass spectrometry data that will meet the needs of the research community, journals, funding agencies, and data repositories. Attendees discussed and agreed upon two primary needs for the wide use of quality metrics: 1) an evolving list of comprehensive quality metrics and 2) standards accompanied by software analytics. Attendees stressed the importance of increased education and training programs to promote reliable protocols in proteomics. This workshop report explores the historic precedents, key discussions, and necessary next steps to enhance the quality of open access data. By agreement, this article is published simultaneously in the Journal of Proteome Research, Molecular and Cellular Proteomics, Proteomics, and Proteomics Clinical Applications as a public service to the research community. The peer review process was a coordinated effort conducted by a panel of referees selected by the journals.
Subject(s)
Access to Information , Mass Spectrometry , Proteomics , Benchmarking/methods , Benchmarking/standards , Guidelines as Topic , Mass Spectrometry/methods , Mass Spectrometry/standards , Proteomics/education , Proteomics/methods , Proteomics/standards , Research Design
ABSTRACT
Policies supporting the rapid and open sharing of proteomic data are being implemented by the leading journals in the field. The proteomics community is taking steps to ensure that data are made publicly accessible and are of high quality, a challenging task that requires the development and deployment of methods for measuring and documenting data quality metrics. On September 18, 2010, the U.S. National Cancer Institute (NCI) convened the "International Workshop on Proteomic Data Quality Metrics" in Sydney, Australia, to identify and address issues facing the development and use of such methods for open access proteomics data. The stakeholders at the workshop enumerated the key principles underlying a framework for data quality assessment in mass spectrometry data that will meet the needs of the research community, journals, funding agencies, and data repositories. Attendees discussed and agreed upon two primary needs for the wide use of quality metrics: (i) an evolving list of comprehensive quality metrics and (ii) standards accompanied by software analytics. Attendees stressed the importance of increased education and training programs to promote reliable protocols in proteomics. This workshop report explores the historic precedents, key discussions, and necessary next steps to enhance the quality of open access data. By agreement, this article is published simultaneously in Proteomics, Proteomics Clinical Applications, Journal of Proteome Research, and Molecular and Cellular Proteomics, as a public service to the research community. The peer review process was a coordinated effort conducted by a panel of referees selected by the journals.
Subject(s)
Access to Information , Mass Spectrometry , Proteomics , Benchmarking/methods , Benchmarking/standards , Guidelines as Topic , Mass Spectrometry/methods , Mass Spectrometry/standards , Proteomics/education , Proteomics/methods , Proteomics/standards , Research Design
ABSTRACT
Policies supporting the rapid and open sharing of proteomic data are being implemented by the leading journals in the field. The proteomics community is taking steps to ensure that data are made publicly accessible and are of high quality, a challenging task that requires the development and deployment of methods for measuring and documenting data quality metrics. On September 18, 2010, the U.S. National Cancer Institute (NCI) convened the "International Workshop on Proteomic Data Quality Metrics" in Sydney, Australia, to identify and address issues facing the development and use of such methods for open access proteomics data. The stakeholders at the workshop enumerated the key principles underlying a framework for data quality assessment in mass spectrometry data that will meet the needs of the research community, journals, funding agencies, and data repositories. Attendees discussed and agreed upon two primary needs for the wide use of quality metrics: (1) an evolving list of comprehensive quality metrics and (2) standards accompanied by software analytics. Attendees stressed the importance of increased education and training programs to promote reliable protocols in proteomics. This workshop report explores the historic precedents, key discussions, and necessary next steps to enhance the quality of open access data. By agreement, this article is published simultaneously in the Journal of Proteome Research, Molecular and Cellular Proteomics, Proteomics, and Proteomics Clinical Applications as a public service to the research community. The peer review process was a coordinated effort conducted by a panel of referees selected by the journals.