ABSTRACT
Thousands of interactions assemble proteins into modules that impart spatial and functional organization to the cellular proteome. Through affinity-purification mass spectrometry, we have created two proteome-scale, cell-line-specific interaction networks. The first, BioPlex 3.0, results from affinity purification of 10,128 human proteins (half the proteome) in 293T cells and includes 118,162 interactions among 14,586 proteins. The second results from 5,522 immunoprecipitations in HCT116 cells. These networks model the interactome whose structure encodes protein function, localization, and complex membership. Comparison across cell lines validates thousands of interactions and reveals extensive customization. Whereas shared interactions reside in core complexes and involve essential proteins, cell-specific interactions link these complexes, "rewiring" subnetworks within each cell's interactome. Interactions covary among proteins of shared function as the proteome remodels to produce each cell's phenotype. Viewable interactively online through BioPlexExplorer, these networks define principles of proteome organization and enable unknown protein characterization.
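The cross-cell-line comparison described above can be pictured as a set operation on undirected interaction pairs. The following is only a minimal sketch under stated assumptions: the function names and the placeholder protein labels (P1, P2, ...) are hypothetical, the edge lists are toy data, and this is not the scoring or filtering pipeline used to build BioPlex.

```python
def normalize(edges):
    """Turn (a, b) pairs into a set of unordered pairs, dropping self-loops."""
    return {frozenset(pair) for pair in edges if pair[0] != pair[1]}


def compare_networks(edges_a, edges_b):
    """Split interactions into shared and network-specific sets."""
    a, b = normalize(edges_a), normalize(edges_b)
    return {"shared": a & b, "only_a": a - b, "only_b": b - a}


# Toy edge lists with placeholder labels (hypothetical, not real data).
network_a = [("P1", "P2"), ("P2", "P3"), ("P3", "P4")]
network_b = [("P2", "P1"), ("P3", "P5")]

result = compare_networks(network_a, network_b)
print(len(result["shared"]), len(result["only_a"]), len(result["only_b"]))
```

Storing each interaction as a frozenset makes (P1, P2) and (P2, P1) compare equal, which matches the undirected nature of co-purification data.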
Subject(s)
Protein Interaction Mapping/methods , Protein Interaction Maps/genetics , Proteome/genetics , Computational Biology/methods , HCT116 Cells/metabolism , HEK293 Cells/metabolism , Humans , Mass Spectrometry/methods , Protein Interaction Maps/physiology , Proteome/metabolism , Proteomics/methods
ABSTRACT
Protein interactions form a network whose structure drives cellular function and whose organization informs biological inquiry. Using high-throughput affinity-purification mass spectrometry, we identify interacting partners for 2,594 human proteins in HEK293T cells. The resulting network (BioPlex) contains 23,744 interactions among 7,668 proteins with 86% previously undocumented. BioPlex accurately depicts known complexes, attaining 80%-100% coverage for most CORUM complexes. The network readily subdivides into communities that correspond to complexes or clusters of functionally related proteins. More generally, network architecture reflects cellular localization, biological process, and molecular function, enabling functional characterization of thousands of proteins. Network structure also reveals associations among thousands of protein domains, suggesting a basis for examining structurally related proteins. Finally, BioPlex, in combination with other approaches, can be used to reveal interactions of biological or clinical significance. For example, mutations in the membrane protein VAPB implicated in familial amyotrophic lateral sclerosis perturb a defined community of interactors.
Subject(s)
Protein Interaction Maps , Proteomics/methods , Amyotrophic Lateral Sclerosis/genetics , Humans , Mass Spectrometry , Protein Interaction Mapping , Proteins/chemistry , Proteins/isolation & purification , Proteins/metabolism
ABSTRACT
Mass-spectrometry-based phosphoproteomics has become indispensable for understanding cellular signaling in complex biological systems. Despite the central role of protein phosphorylation, the field still lacks inexpensive, regenerable, and diverse phosphopeptides with ground-truth phosphorylation positions. Here, we present Iterative Synthetically Phosphorylated Isomers (iSPI), a proteome-scale library of human-derived phosphoserine-containing phosphopeptides that is inexpensive, regenerable, and diverse, with precisely known positions of phosphorylation. We demonstrate possible uses of iSPI, including use as a phosphopeptide standard, a tool to evaluate and optimize phosphorylation-site localization algorithms, and a benchmark to compare performance across data analysis pipelines. We also present AScorePro, an updated version of the AScore algorithm specifically optimized for phosphorylation-site localization in higher energy fragmentation spectra, and the FLR viewer, a web tool for phosphorylation-site localization, to enable community use of the iSPI resource. iSPI and its associated data constitute a useful, multi-purpose resource for the phosphoproteomics community.
Subject(s)
Phosphopeptides , Proteome , Humans , Proteome/metabolism , Phosphopeptides/metabolism , Phosphoserine/metabolism , Proteomics , Mass Spectrometry , Phosphorylation
ABSTRACT
Although most tissues in an organism are genetically identical, the biochemistry of each is optimized to fulfill its unique physiological roles, with important consequences for human health and disease. Each tissue's unique physiology requires tightly regulated gene and protein expression coordinated by specialized, phosphorylation-dependent intracellular signaling. To better understand the role of phosphorylation in maintenance of physiological differences among tissues, we performed proteomic and phosphoproteomic characterizations of nine mouse tissues. We identified 12,039 proteins, including 6,296 phosphoproteins harboring nearly 36,000 phosphorylation sites. Comparing protein abundances and phosphorylation levels revealed specialized, interconnected phosphorylation networks within each tissue while suggesting that many proteins are regulated by phosphorylation independently of their expression. Our data suggest that the "typical" phosphoprotein is widely expressed yet displays variable, often tissue-specific phosphorylation that tunes protein activity to the specific needs of each tissue. We offer this dataset as an online resource for the biological research community.
Subject(s)
Gene Expression Profiling , Mice/genetics , Organ Specificity , Phosphorylation , Proteins/metabolism , Animals , Mice/metabolism , Protein Kinases/genetics , Proteomics
ABSTRACT
Rationale: Despite significant advances in precision treatments and immunotherapy, lung cancer is the most common cause of cancer death worldwide. To reduce incidence and improve survival rates, a deeper understanding of lung premalignancy and the multistep process of tumorigenesis is essential, allowing timely and effective intervention before cancer development. Objectives: To summarize existing information, identify knowledge gaps, formulate research questions, prioritize potential research topics, and propose strategies for future investigations into the premalignant progression in the lung. Methods: An international multidisciplinary team of basic, translational, and clinical scientists reviewed available data to develop and refine research questions pertaining to the transformation of premalignant lung lesions to advanced lung cancer. Results: This research statement identifies significant gaps in knowledge and proposes potential research questions aimed at expanding our understanding of the mechanisms underlying the progression of premalignant lung lesions to lung cancer in an effort to explore potential innovative modalities to intercept lung cancer at its nascent stages. Conclusions: The identified gaps in knowledge about the biological mechanisms of premalignant progression in the lung, together with ongoing challenges in screening, detection, and early intervention, highlight the critical need to prioritize research in this domain. Such focused investigations are essential to devise effective preventive strategies that may ultimately decrease lung cancer incidence and improve patient outcomes.
Subject(s)
Non-Small-Cell Lung Carcinoma , Lung Neoplasms , Precancerous Conditions , Humans , Non-Small-Cell Lung Carcinoma/pathology , Non-Small-Cell Lung Carcinoma/therapy , Disease Progression , Lung Neoplasms/pathology , Lung Neoplasms/therapy , Precancerous Conditions/pathology , Precancerous Conditions/therapy , Medical Societies , United States
ABSTRACT
Stochastic, intensity-based precursor isolation can result in isotopically enriched fragment ions. This problem is exacerbated for large peptides and stable isotope labeling experiments using deuterium or 15N. For stable isotope labeling experiments, incomplete and ubiquitous labeling strategies result in the isolation of peptide ions composed of many distinct structural isomers. Unfortunately, existing proteomics search algorithms do not account for this variability in isotopic incorporation, and thus often yield poor peptide and protein identification rates. We sought to resolve this shortcoming by deriving the expected isotopic distributions of each fragment ion and incorporating them into the theoretical mass spectra used for peptide-spectrum-matching. We adapted the Comet search platform to integrate a modified spectral prediction algorithm we term Conditional fragment Ion Distribution Search (CIDS). Comet-CIDS uses a traditional database searching strategy, but for each candidate peptide we compute the isotopic distribution of each fragment to better match the observed m/z distributions. Evaluating previously generated D2O and 15N labeled data sets, we found that Comet-CIDS identified more confident peptide spectral matches and higher protein sequence coverage compared to traditional theoretical spectra generation, with the magnitude of improvement largely determined by the amount of labeling in the sample.
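To illustrate why a fragment's isotopic pattern depends on which precursor isotopologue was isolated, the sketch below works through a simplified case. It assumes a single labeled element, fully equivalent labeling sites, and an isolation window that selects precursors carrying exactly k heavy atoms; under those assumptions the number of heavy atoms in a fragment follows a hypergeometric distribution. The function name and parameters are hypothetical, and this is not the Comet-CIDS implementation.

```python
from math import comb


def fragment_heavy_distribution(total_sites, fragment_sites, heavy_in_precursor):
    """P(fragment has j heavy atoms | precursor has k heavy atoms), j = 0..f.

    Assumes all sites are equally likely to be heavy (hypergeometric model).
    """
    n, f, k = total_sites, fragment_sites, heavy_in_precursor
    if not (0 <= f <= n and 0 <= k <= n):
        raise ValueError("need 0 <= fragment_sites, heavy_in_precursor <= total_sites")
    denom = comb(n, k)
    dist = []
    for j in range(f + 1):
        if j > k:
            dist.append(0.0)  # fragment cannot hold more heavy atoms than the precursor
        else:
            # comb returns 0 when k - j exceeds the sites outside the fragment
            dist.append(comb(f, j) * comb(n - f, k - j) / denom)
    return dist


# Precursor with 10 sites and 2 heavy atoms; fragment retains 4 of the sites.
dist = fragment_heavy_distribution(10, 4, 2)
print([round(p, 4) for p in dist])
```

For this toy case the probabilities are 15/45, 24/45, and 6/45 for zero, one, and two heavy atoms, so the fragment is more likely to carry one heavy atom than none even though it holds fewer than half the sites.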
Subject(s)
Peptides , Proteins , Peptides/chemistry , Proteins/metabolism , Amino Acid Sequence , Probability , Ions
ABSTRACT
The physiology of a cell can be viewed as the product of thousands of proteins acting in concert to shape the cellular response. Coordination is achieved in part through networks of protein-protein interactions that assemble functionally related proteins into complexes, organelles, and signal transduction pathways. Understanding the architecture of the human proteome has the potential to inform cellular, structural, and evolutionary mechanisms and is critical to elucidating how genome variation contributes to disease. Here we present BioPlex 2.0 (Biophysical Interactions of ORFeome-derived complexes), which uses robust affinity purification-mass spectrometry methodology to elucidate protein interaction networks and co-complexes nucleated by more than 25% of protein-coding genes from the human genome, and constitutes, to our knowledge, the largest such network so far. With more than 56,000 candidate interactions, BioPlex 2.0 contains more than 29,000 previously unknown co-associations and provides functional insights into hundreds of poorly characterized proteins while enhancing network-based analyses of domain associations, subcellular localization, and co-complex formation. Unsupervised Markov clustering of interacting proteins identified more than 1,300 protein communities representing diverse cellular activities. Genes essential for cell fitness are enriched within 53 communities representing central cellular functions. Moreover, we identified 442 communities associated with more than 2,000 disease annotations, placing numerous candidate disease genes into a cellular framework. BioPlex 2.0 exceeds previous experimentally derived interaction networks in depth and breadth, and will be a valuable resource for exploring the biology of incompletely characterized proteins and for elucidating larger-scale patterns of proteome organization.
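The Markov clustering step mentioned above can be sketched generically as alternating expansion (squaring a column-stochastic matrix) and inflation (element-wise powering followed by renormalization). This is a minimal illustration of the general MCL idea using NumPy; the function name, the inflation value of 2.0, the thresholds, and the toy graph are assumptions, not the parameters or software used for BioPlex 2.0.

```python
import numpy as np


def markov_cluster(adjacency, inflation=2.0, max_iter=100, tol=1e-6):
    """Return clusters (sorted lists of node indices) from an adjacency matrix."""
    a = np.asarray(adjacency, dtype=float)
    m = a + np.eye(len(a))       # add self-loops so every column has mass
    m = m / m.sum(axis=0)        # make each column sum to 1
    for _ in range(max_iter):
        prev = m
        m = m @ m                # expansion: two-step random walks
        m = m ** inflation       # inflation: strengthen strong flows
        m = m / m.sum(axis=0)    # renormalize columns
        if np.abs(m - prev).max() < tol:
            break
    clusters = set()
    for row in m:
        members = frozenset(np.flatnonzero(row > 1e-6).tolist())
        if members:              # rows with remaining flow define clusters
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Toy graph: two disconnected triangles (nodes 0-2 and 3-5).
triangle = np.ones((3, 3)) - np.eye(3)
graph = np.block([[triangle, np.zeros((3, 3))],
                  [np.zeros((3, 3)), triangle]])
communities = markov_cluster(graph)
print(communities)
```

A higher inflation value generally yields smaller, tighter clusters, which is the main tuning choice in this family of methods.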
Subject(s)
Protein Databases , Disease , Protein Interaction Mapping , Protein Interaction Maps , Proteome/metabolism , Cell Physiological Phenomena/genetics , Human Genome , Humans , Intracellular Space/metabolism , Markov Chains , Mass Spectrometry , Molecular Sequence Annotation , Open Reading Frames , Proteome/analysis , Proteome/chemistry , Proteome/genetics
ABSTRACT
Reporter ion interference remains a limitation of isobaric tag-based sample multiplexing. Advances in instrumentation and data acquisition modes, such as the recently developed real-time database search (RTS), can reduce interference. However, interference persists as does the need to benchmark upstream sample preparation and data acquisition strategies. Here, we present an updated Triple yeast KnockOut (TKO) standard as well as corresponding upgrades to the TKO viewing tool (TVT2.5, http://tko.hms.harvard.edu/). Specifically, we expand the TKO standard to incorporate the TMTpro18-plex reagents (TKO18). We also construct a variant thereof which has been digested only with LysC (TKO18L). We compare proteome coverage and interference levels of TKO18 and TKO18L data that are acquired under different data acquisition modes and analyzed using TVT2.5. Our data illustrate that RTS reduces interference while improving proteome coverage and suggest that digesting with LysC alone only modestly reduces interference, albeit at the expense of proteome depth. Collectively, the two new TKO standards coupled with the updated TVT represent a convenient and versatile platform for assessing and developing methods to reduce interference in isobaric tag-based experiments.
Subject(s)
Peptides , Proteomics , Factual Databases , Proteome , Proteomics/methods , Saccharomyces cerevisiae/genetics
ABSTRACT
Accurate assignment of monoisotopic peaks is essential for the identification of peptides in bottom-up proteomics. Misassignment or inaccurate attribution of peptidic ions leads to lower sensitivity and fewer total peptide identifications. In the present work, we present a performant, open-source, cross-platform algorithm, Monocle, for the rapid reassignment of instrument-assigned precursor peaks to monoisotopic peptide assignments. We demonstrate that the present algorithm can be integrated into many common proteomic pipelines and provides rapid conversion from multiple data source types. Finally, we show that our monoisotopic peak assignment results in up to a twofold increase in total peptide identifications compared to analyses lacking monoisotopic correction and a 44% improvement over previous monoisotopic peak correction algorithms.
Subject(s)
Proteome , Proteomics , Algorithms , Peptides , Tandem Mass Spectrometry
ABSTRACT
Conditional genetically engineered mouse models (GEMMs) of non-small cell lung cancer (NSCLC) harbor common oncogenic driver mutations of the disease, but in contrast to human NSCLC these models possess low tumor mutational burden (TMB). As a result, these models often lack tumor antigens that can elicit host adaptive immune responses, which limits their utility in immunotherapy studies. Here, we establish Kras-mutant murine models of NSCLC bearing the common driver mutations associated with the disease and increased TMB, by in vitro exposure of cell lines derived from GEMMs of NSCLC [KrasG12D (K), KrasG12D Tp53-/- (KP), KrasG12D Tp53+/- Lkb1-/- (KPL)] to the alkylating agent N-methyl-N-nitrosourea (MNU). Increasing the TMB enhanced host anti-tumor T cell responses and improved anti-PD-1 efficacy in syngeneic models across all genetic backgrounds. However, limited anti-PD-1 efficacy was observed in the KPL cell lines with increased TMB, which possessed a distinct immunosuppressed tumor microenvironment (TME) primarily composed of granulocytic myeloid-derived suppressor cells (G-MDSCs). This KPL phenotype is consistent with findings in human KRAS-mutant NSCLC where LKB1 loss is a driver of primary resistance to PD-1 blockade. In summary, these novel Kras-mutant NSCLC murine models with known driver mutations and increased TMB have distinct TMEs and recapitulate the therapeutic vulnerabilities of human NSCLC. We anticipate that these immunogenic models will facilitate the development of innovative immunotherapies in NSCLC.
Subject(s)
Non-Small-Cell Lung Carcinoma/genetics , Lung Neoplasms/genetics , Mutation/genetics , Proto-Oncogene Proteins p21(ras)/genetics , Animals , B7-H1 Antigen/genetics , Tumor Biomarkers/genetics , Tumor Cell Line , Animal Disease Models , Mice , Protein Serine-Threonine Kinases/genetics , Tumor Microenvironment/genetics , Tumor Suppressor Protein p53/genetics
ABSTRACT
Gas-phase fractionation enables better quantitative accuracy, improves signal-to-noise ratios, and increases sensitivity in proteomic analyses. However, traditional gas-phase enrichment, which relies upon a large continuous bin, results in suboptimal enrichment, as most chromatographic separations are not 100% orthogonal relative to the first MS dimension (MS1 m/z). As such, ions with similar m/z values tend to elute at the same retention time, which prevents the partitioning of narrow precursor m/z distributions into a few large continuous gas-phase enrichment bins. To overcome this issue, we developed and tested the use of notched isolation waveforms, which simultaneously isolate multiple discrete m/z windows in parallel (e.g., 650-700 m/z and 800-850 m/z). By comparison to a canonical gas-phase fractionation method, notched waveforms do not require bin optimization via in silico digestion or wasteful sample injections to isolate multiple precursor windows. Importantly, the collection of all m/z bins simultaneously using the isolation waveform does not suffer from the sensitivity and duty cycle pitfalls inherent to sequential collection of multiple m/z bins. Applying a notched injection waveform provided consistent enrichment of precursor ions, which resulted in improved proteome depth with greater coverage of low-abundance proteins. Finally, using a reductive dimethyl labeling approach, we show that notched isolation waveforms increase the number of quantified peptides with improved accuracy and precision across a wider dynamic range.
Subject(s)
Proteome , Proteomics , Chemical Fractionation , Ions , Peptides
ABSTRACT
Multiplexed quantitative analyses of complex proteomes enable deep biological insight. While a multitude of workflows have been developed for multiplexed analyses, the most quantitatively accurate method (SPS-MS3) suffers from long acquisition duty cycles. We built a new, real-time database search (RTS) platform, Orbiter, to combat the SPS-MS3 method's longer duty cycles. RTS with Orbiter eliminates SPS-MS3 scans if no peptide matches to a given spectrum. With Orbiter's online proteomic analytical pipeline, which includes RTS and false discovery rate analysis, it was possible to process a single spectrum database search in less than 10 ms. The result is a fast, functional means to identify peptide spectral matches using Comet, filter these matches, and more efficiently quantify proteins of interest. Importantly, the use of Comet for peptide spectral matching allowed for a fully featured search, including analysis of post-translational modifications, with well-known and extensively validated scoring. These data could then be used to trigger subsequent scans in an adaptive and flexible manner. In this work we tested the utility of this adaptive data acquisition platform to improve the efficiency and accuracy of multiplexed quantitative experiments. We found that RTS enabled a 2-fold increase in mass spectrometric data acquisition efficiency. Orbiter's RTS quantified more than 8000 proteins across 10 proteomes in half the time of an SPS-MS3 analysis (18 h for RTS, 36 h for SPS-MS3).
Subject(s)
Proteome , Proteomics , Factual Databases , Mass Spectrometry , Peptides
ABSTRACT
Multiplexing strategies are at the forefront of mass-spectrometry-based proteomics, with SPS-MS3 methods becoming increasingly commonplace. A known caveat of isobaric multiplexing is interference resulting from coisolated and cofragmented ions that do not originate from the selected precursor of interest. The triple knockout (TKO) standard was designed to benchmark data collection strategies to minimize interference. However, a limitation to its widespread use has been the lack of an automated analysis platform. We present a TKO Visualization Tool (TVT). The TVT viewer allows for automated, web-based, database searching of the TKO standard, returning traditional figures of merit, such as peptide and protein counts, scan-specific ion accumulation times, as well as the TKO-specific metric, the IFI (interference-free index). Moreover, the TVT viewer allows for plotting of two TKO standards to assess protocol optimizations, compare instruments, or measure degradation of instrument performance over time. We showcase the TVT viewer by probing the selection of (1) stationary phase resin, (2) MS2 isolation window width, and (3) number of synchronous precursor selection (SPS) ions for SPS-MS3 analysis. Using the TVT viewer will allow the proteomics community to search and compare TKO results to optimize user-specific data collection workflows.
Subject(s)
Internet , Proteomics/methods , Search Engine , Automation , Data Accuracy , Proteome/analysis , Proteomics/standards , User-Computer Interface
ABSTRACT
Multiplexed, isobaric tagging methods are powerful techniques to increase throughput, precision, and accuracy in quantitative proteomics. The dynamic range and accuracy of quantitation, however, can be limited by coisolation of tag-containing peptides that release reporter ions and conflate quantitative measurements across precursors. Methods to alleviate these effects often lead to the loss of protein and peptide identifications through online or offline filtering of interference containing spectra. To alleviate this effect, high-Field Asymmetric-waveform Ion Mobility Spectroscopy (FAIMS) has been proposed as a method to reduce precursor coisolation and improve the accuracy and dynamic range of multiplex quantitation. Here we tested the use of FAIMS to improve quantitative accuracy using previously established TMT-based interference standards (triple-knockout [TKO] and Human-Yeast Proteomics Resource [HYPER]). We observed that FAIMS robustly improved the quantitative accuracy of both high-resolution MS2 (HRMS2) and synchronous precursor selection MS3 (SPS-MS3)-based methods without sacrificing protein identifications. We further optimized and characterized the main factors that enable robust use of FAIMS for multiplexed quantitation. We highlight these factors and provide method recommendations to take advantage of FAIMS technology to improve isobaric-tag-quantification moving forward.
Subject(s)
Mass Spectrometry/methods , Neoplasm Proteins/metabolism , Peptides/analysis , Proteome/analysis , Proteomics/methods , Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae/metabolism , HCT116 Cells , Humans , Peptides/metabolism , Proteome/metabolism
ABSTRACT
Despite the diverse biological pathways known to be regulated by ubiquitylation, global identification of substrates that are targeted for ubiquitylation has remained a challenge. To globally characterize the human ubiquitin-modified proteome (ubiquitinome), we utilized a monoclonal antibody that recognizes diglycine (diGly)-containing isopeptides following trypsin digestion. We identify ~19,000 diGly-modified lysine residues within ~5000 proteins. Using quantitative proteomics we monitored temporal changes in diGly site abundance in response to both proteasomal and translational inhibition, indicating both a dependence on ongoing translation to observe alterations in site abundance and distinct dynamics of individual modified lysines in response to proteasome inhibition. Further, we demonstrate that quantitative diGly proteomics can be utilized to identify substrates for cullin-RING ubiquitin ligases. Interrogation of the ubiquitinome allows for not only a quantitative assessment of alterations in protein homeostasis fidelity, but also identification of substrates for individual ubiquitin pathway enzymes.
Subject(s)
Proteome/metabolism , Ubiquitin/metabolism , Cultured Cells , Cullin Proteins/metabolism , Glycylglycine/genetics , HCT116 Cells , Humans , Lysine/genetics , Proteomics , Ubiquitination
ABSTRACT
Isobaric labeling strategies for mass spectrometry-based proteomics enable multiplexed simultaneous quantification of samples and therefore substantially increase the sample throughput in proteomics. However, despite these benefits, current limits to multiplexing capacity are prohibitive for large sample sizes and impose limitations on experimental design. Here, we introduce a novel mechanism for increasing the multiplexing density of isobaric reagents. We present Combinatorial Isobaric Mass Tags (CMTs), an isobaric labeling architecture with the unique ability to generate multiple series of reporter ions simultaneously. We demonstrate that utilization of multiple reporter ion series improves multiplexing capacity of CMT with respect to a commercially available isobaric labeling reagent with preserved quantitative accuracy and depth of coverage in complex mixtures. We provide a blueprint for the realization of 16-plex reagents with 1 Da spacing between reporter ions and up to 28-plex at 6 mDa spacing using only 5 heavy isotopes per reagent. We anticipate that this improvement in multiplexing capacity will further advance the application of quantitative proteomics, particularly in high-throughput screening assays.
Subject(s)
Mass Spectrometry/methods , Peptides/analysis , Proteomics/methods , High-Throughput Screening Assays/methods , Indicators and Reagents/chemistry , Ions/chemistry
ABSTRACT
Multiplexed quantitation via isobaric chemical tags (e.g., tandem mass tags (TMT) and isobaric tags for relative and absolute quantitation (iTRAQ)) has the potential to revolutionize quantitative proteomics. However, until recently the utility of these tags was questionable due to reporter ion ratio distortion resulting from fragmentation of coisolated interfering species. These interfering signals can be negated through additional gas-phase manipulations (e.g., MS/MS/MS (MS3) and proton-transfer reactions (PTR)). These methods, however, have a significant sensitivity penalty. Using isolation waveforms with multiple frequency notches (i.e., synchronous precursor selection, SPS), we coisolated and cofragmented multiple MS2 fragment ions, thereby increasing the number of reporter ions in the MS3 spectrum 10-fold over the standard MS3 method (i.e., MultiNotch MS3). By increasing the reporter ion signals, this method improves the dynamic range of reporter ion quantitation, reduces reporter ion signal variance, and ultimately produces more high-quality quantitative measurements. To demonstrate utility, we analyzed biological triplicates of eight colon cancer cell lines using the MultiNotch MS3 method. Across all the replicates we quantified 8,378 proteins in union and 6,168 proteins in common. Taking into account that each of these quantified proteins contains eight distinct cell-line measurements, this data set encompasses 174,704 quantitative ratios each measured in triplicate across the biological replicates. Herein, we demonstrate that the MultiNotch MS3 method uniquely combines multiplexing capacity with quantitative sensitivity and accuracy, drastically increasing the informational value obtainable from proteomic experiments.
Subject(s)
Colonic Neoplasms/metabolism , Proteomics/methods , Tandem Mass Spectrometry/methods , Algorithms , Tumor Cell Line , High Pressure Liquid Chromatography/methods , HeLa Cells , Humans , Ions , Isocitrate Dehydrogenase/analysis , Isocitrate Dehydrogenase/metabolism , Principal Component Analysis , Reproducibility of Results , Sensitivity and Specificity , Smad4 Protein/analysis , Smad4 Protein/metabolism , Tandem Mass Spectrometry/instrumentation
ABSTRACT
Quantitative mass spectrometry-based proteomics is highly versatile but not easily multiplexed. Isobaric labeling strategies allow mass spectrometry-based multiplexed proteome quantification; however, ratio distortion owing to protein quantification interference is a common effect. We present a two-proteome model (mixture of human and yeast proteins) in a sixplex isobaric labeling system to fully document the interference effect, and we report that applying triple-stage mass spectrometry (MS3) almost completely eliminates interference.