Search | VHL Regional Portal

1.

A data analysis framework for combining multiple batches increases the power of isobaric proteomics experiments.

O'Brien, Jonathon J; Raj, Anil; Gaun, Aleksandr; Waite, Adam; Li, Wenzhou; Hendrickson, David G; Olsson, Niclas; McAllister, Fiona E.

Nat Methods ; 21(2): 290-300, 2024 Feb.

Article in English | MEDLINE | ID: mdl-38110636

ABSTRACT

We present a framework for the analysis of multiplexed mass spectrometry proteomics data that reduces estimation error when combining multiple isobaric batches. Variations in the number and quality of observations have long complicated the analysis of isobaric proteomics data. Here we show that the power to detect statistical associations is substantially improved by utilizing models that directly account for known sources of variation in the number and quality of observations that occur across batches. In a multibatch benchmarking experiment, our open-source software (msTrawler) increases the power to detect changes, especially in the range of less than twofold changes, while simultaneously increasing quantitative proteome coverage by utilizing more low-signal observations. Further analyses of previously published multiplexed datasets of 4 and 23 batches highlight both increased power and the ability to navigate complex missing data patterns without relying on unverifiable imputations or discarding reliable measurements.

Subject(s)

Proteomics , Software , Proteomics/methods , Mass Spectrometry/methods , Proteome/analysis

2.

Conditional Fragment Ion Probabilities Improve Database Searching for Nonmonoisotopic Precursors.

O'Brien, Jonathon J; Gadzuk-Shea, Meagan; Seitzer, Phillip M; Rad, Ramin; McAllister, Fiona E; Schweppe, Devin K.

J Proteome Res ; 22(2): 334-342, 2023 02 03.

Article in English | MEDLINE | ID: mdl-36414539

ABSTRACT

Stochastic, intensity-based precursor isolation can result in isotopically enriched fragment ions. This problem is exacerbated for large peptides and stable isotope labeling experiments using deuterium or 15N. For stable isotope labeling experiments, incomplete and ubiquitous labeling strategies result in the isolation of peptide ions composed of many distinct structural isomers. Unfortunately, existing proteomics search algorithms do not account for this variability in isotopic incorporation, and thus often yield poor peptide and protein identification rates. We sought to resolve this shortcoming by deriving the expected isotopic distributions of each fragment ion and incorporating them into the theoretical mass spectra used for peptide-spectrum-matching. We adapted the Comet search platform to integrate a modified spectral prediction algorithm we term Conditional fragment Ion Distribution Search (CIDS). Comet-CIDS uses a traditional database searching strategy, but for each candidate peptide we compute the isotopic distribution of each fragment to better match the observed m/z distributions. Evaluating previously generated D2O and 15N labeled data sets, we found that Comet-CIDS identified more confident peptide spectral matches and higher protein sequence coverage compared to traditional theoretical spectra generation, with the magnitude of improvement largely determined by the amount of labeling in the sample.

Subject(s)

Peptides , Proteins , Peptides/chemistry , Proteins/metabolism , Amino Acid Sequence , Probability , Ions

3.

Novel insights from a multiomics dissection of the Hayflick limit.

Chan, Michelle; Yuan, Han; Soifer, Ilya; Maile, Tobias M; Wang, Rebecca Y; Ireland, Andrea; O'Brien, Jonathon J; Goudeau, Jérôme; Chan, Leanne J G; Vijay, Twaritha; Freund, Adam; Kenyon, Cynthia; Bennett, Bryson D; McAllister, Fiona E; Kelley, David R; Roy, Margaret; Cohen, Robert L; Levinson, Arthur D; Botstein, David; Hendrickson, David G.

Elife ; 11, 2022 02 04.

Article in English | MEDLINE | ID: mdl-35119359

ABSTRACT

The process wherein dividing cells exhaust proliferative capacity and enter into replicative senescence has become a prominent model for cellular aging in vitro. Despite decades of study, this cellular state is not fully understood in culture and even much less so during aging. Here, we revisit Leonard Hayflick's original observation of replicative senescence in WI-38 human lung fibroblasts equipped with a battery of modern techniques including RNA-seq, single-cell RNA-seq, proteomics, metabolomics, and ATAC-seq. We find evidence that the transition to a senescent state manifests early, increases gradually, and corresponds to a concomitant global increase in DNA accessibility in nucleolar and lamin associated domains. Furthermore, we demonstrate that senescent WI-38 cells acquire a striking resemblance to myofibroblasts in a process similar to the epithelial to mesenchymal transition (EMT) that is regulated by YAP1/TEAD1 and TGF-β2. Lastly, we show that verteporfin inhibition of YAP1/TEAD1 activity in aged WI-38 cells robustly attenuates this gene expression program.

Subject(s)

Cellular Senescence , Epithelial-Mesenchymal Transition , Aged , Aging/physiology , Cell Line , Cellular Senescence/genetics , Fibroblasts/metabolism , Humans

4.

Insights into the Molecular Basis of Genome Stability and Pristine Proteostasis in Naked Mole-Rats.

Narayan, Vikram; McMahon, Mary; O'Brien, Jonathon J; McAllister, Fiona; Buffenstein, Rochelle.

Adv Exp Med Biol ; 1319: 287-314, 2021.

Article in English | MEDLINE | ID: mdl-34424521

ABSTRACT

The naked mole-rat (Heterocephalus glaber) is the longest-lived rodent, with a maximal reported lifespan of 37 years. In addition to its long lifespan - which is much greater than predicted based on its small body size (longevity quotient of ~4.2) - naked mole-rats are also remarkably healthy well into old age. This is reflected in a striking resistance to tumorigenesis and minimal declines in cardiovascular, neurological and reproductive function in older animals. Over the past two decades, researchers have been investigating the molecular mechanisms regulating the extended life- and health- span of this animal, and since the sequencing and assembly of the naked mole-rat genome in 2011, progress has been rapid. Here, we summarize findings from published studies exploring the unique molecular biology of the naked mole-rat, with a focus on mechanisms and pathways contributing to genome stability and maintenance of proteostasis during aging. We also present new data from our laboratory relevant to the topic and discuss our findings in the context of the published literature.

Subject(s)

Mole Rats , Proteostasis , Aging/genetics , Animals , Genomic Instability , Longevity/genetics , Mole Rats/genetics

5.

Automated 16-Plex Plasma Proteomics with Real-Time Search and Ion Mobility Mass Spectrometry Enables Large-Scale Profiling in Naked Mole-Rats and Mice.

Gaun, Aleksandr; Lewis Hardell, Kaitlyn N; Olsson, Niclas; O'Brien, Jonathon J; Gollapudi, Sudha; Smith, Megan; McAlister, Graeme; Huguet, Romain; Keyser, Robert; Buffenstein, Rochelle; McAllister, Fiona E.

J Proteome Res ; 20(2): 1280-1295, 2021 02 05.

Article in English | MEDLINE | ID: mdl-33499602

ABSTRACT

Performing large-scale plasma proteome profiling is challenging due to limitations imposed by lengthy preparation and instrument time. We present a fully automated multiplexed proteome profiling platform (AutoMP3) using the Hamilton Vantage liquid handling robot capable of preparing hundreds to thousands of samples. To maximize protein depth in single-shot runs, we combined 16-plex Tandem Mass Tags (TMTpro) with high-field asymmetric waveform ion mobility spectrometry (FAIMS Pro) and real-time search (RTS). We quantified over 40 proteins/min/sample, doubling the previously published rates. We applied AutoMP3 to investigate the naked mole-rat plasma proteome both as a function of the circadian cycle and in response to ultraviolet (UV) treatment. In keeping with the lack of synchronized circadian rhythms in naked mole-rats, we find few circadian patterns in plasma proteins over the course of 48 h. Furthermore, we quantify many disparate changes between mice and naked mole-rats at both 48 h and one week after UV exposure. These species differences in plasma protein temporal responses could contribute to the pronounced cancer resistance observed in naked mole-rats. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE [1] partner repository with the dataset identifier PXD022891.

Subject(s)

Ion Mobility Spectrometry , Proteomics , Animals , Apoptosis Regulatory Proteins , Mass Spectrometry , Mice , Mole Rats , Proteome

6.

Systematic identification of cancer cell vulnerabilities to natural killer cell-mediated immune surveillance.

Pech, Matthew F; Fong, Linda E; Villalta, Jacqueline E; Chan, Leanne Jg; Kharbanda, Samir; O'Brien, Jonathon J; McAllister, Fiona E; Firestone, Ari J; Jan, Calvin H; Settleman, Jeffrey.

Elife ; 8, 2019 08 27.

Article in English | MEDLINE | ID: mdl-31452512

ABSTRACT

Only a subset of cancer patients respond to T-cell checkpoint inhibitors, highlighting the need for alternative immunotherapeutics. We performed CRISPR-Cas9 screens in a leukemia cell line to identify perturbations that enhance natural killer effector functions. Our screens defined critical components of the tumor-immune synapse and highlighted the importance of cancer cell interferon-γ signaling in modulating NK activity. Surprisingly, disrupting the ubiquitin ligase substrate adaptor DCAF15 strongly sensitized cancer cells to NK-mediated clearance. DCAF15 disruption induced an inflamed state in leukemic cells, including increased expression of lymphocyte costimulatory molecules. Proteomic and biochemical analysis revealed that cohesin complex members were endogenous client substrates of DCAF15. Genetic disruption of DCAF15 was phenocopied by treatment with indisulam, an anticancer drug that functions through DCAF15 engagement. In AML patients, reduced DCAF15 expression was associated with improved survival. These findings suggest that DCAF15 inhibition may have useful immunomodulatory properties in the treatment of myeloid neoplasms.

Subject(s)

Intracellular Signaling Peptides and Proteins/genetics , Killer Cells, Natural/immunology , Leukemia, Myeloid, Acute/pathology , Cell Line, Tumor , Gene Expression Profiling , Gene Knockout Techniques , Humans , Leukemia, Myeloid, Acute/mortality , Survival Analysis

7.

eIF2B activator prevents neurological defects caused by a chronic integrated stress response.

Wong, Yao Liang; LeBon, Lauren; Basso, Ana M; Kohlhaas, Kathy L; Nikkel, Arthur L; Robb, Holly M; Donnelly-Roberts, Diana L; Prakash, Janani; Swensen, Andrew M; Rubinstein, Nimrod D; Krishnan, Swathi; McAllister, Fiona E; Haste, Nicole V; O'Brien, Jonathon J; Roy, Margaret; Ireland, Andrea; Frost, Jennifer M; Shi, Lei; Riedmaier, Stephan; Martin, Kathleen; Dart, Michael J; Sidrauski, Carmela.

Elife ; 8, 2019 01 09.

Article in English | MEDLINE | ID: mdl-30624206

ABSTRACT

The integrated stress response (ISR) attenuates the rate of protein synthesis while inducing expression of stress proteins in cells. Various insults activate kinases that phosphorylate the GTPase eIF2 leading to inhibition of its exchange factor eIF2B. Vanishing White Matter (VWM) is a neurological disease caused by eIF2B mutations that, like phosphorylated eIF2, reduce its activity. We show that introduction of a human VWM mutation into mice leads to persistent ISR induction in the central nervous system. ISR activation precedes myelin loss and development of motor deficits. Remarkably, long-term treatment with a small molecule eIF2B activator, 2BAct, prevents all measures of pathology and normalizes the transcriptome and proteome of VWM mice. 2BAct stimulates the remaining activity of mutant eIF2B complex in vivo, abrogating the maladaptive stress response. Thus, 2BAct-like molecules may provide a promising therapeutic approach for VWM and provide relief from chronic ISR induction in a variety of disease contexts.

Subject(s)

Brain Diseases/etiology , Eukaryotic Initiation Factor-2B/metabolism , Stress, Psychological/complications , White Matter/pathology , Animals , Astrocytes/pathology , Brain Diseases/pathology , Brain Diseases/prevention & control , Chronic Disease , Eukaryotic Initiation Factor-2B/genetics , Humans , Male , Mice , Mutation , Nerve Tissue Proteins/metabolism , Oligodendroglia/pathology , Phosphorylation , Protein Biosynthesis , Proteome , Weight Gain

8.

The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments.

O'Brien, Jonathon J; Gunawardena, Harsha P; Paulo, Joao A; Chen, Xian; Ibrahim, Joseph G; Gygi, Steven P; Qaqish, Bahjat F.

Ann Appl Stat ; 12(4): 2075-2095, 2018 Dec.

Article in English | MEDLINE | ID: mdl-30473739

ABSTRACT

An idealized version of a label-free discovery mass spectrometry proteomics experiment would provide absolute abundance measurements for a whole proteome, across varying conditions. Unfortunately, this ideal is not realized. Measurements are made on peptides requiring an inferential step to obtain protein level estimates. The inference is complicated by experimental factors that necessitate relative abundance estimation and result in widespread non-ignorable missing data. Relative abundance on the log scale takes the form of parameter contrasts. In a complete-case analysis, contrast estimates may be biased by missing data and a substantial amount of useful information will often go unused. To avoid problems with missing data, many analysts have turned to single imputation solutions. Unfortunately, these methods often create further difficulties by hiding inestimable contrasts, preventing the recovery of interblock information and failing to account for imputation uncertainty. To mitigate many of the problems caused by missing values, we propose the use of a Bayesian selection model. Our model is tested on simulated data, real data with simulated missing values, and on a ground truth dilution experiment where all of the true relative changes are known. The analysis suggests that our model, compared with various imputation strategies and complete-case analyses, can increase accuracy and provide substantial improvements to interval coverage.

9.

Proteome-Wide Evaluation of Two Common Protein Quantification Methods.

O'Connell, Jeremy D; Paulo, Joao A; O'Brien, Jonathon J; Gygi, Steven P.

J Proteome Res ; 17(5): 1934-1942, 2018 05 04.

Article in English | MEDLINE | ID: mdl-29635916

ABSTRACT

Proteomics experiments commonly aim to estimate and detect differential abundance across all expressed proteins. Within this experimental design, some of the most challenging measurements are small fold changes for lower abundance proteins. While bottom-up proteomics methods are approaching comprehensive coverage of even complex eukaryotic proteomes, failing to reliably quantify lower abundance proteins can limit the precision and reach of experiments to much less than the identified, let alone total, proteome. Here we test the ability of two common methods, a tandem mass tagging (TMT) method and a label-free quantitation method (LFQ), to achieve comprehensive quantitative coverage by benchmarking their capacity to measure 3 different levels of change (3-, 2-, and 1.5-fold) across an entire data set. Both methods achieved comparably accurate estimates for all 3 fold-changes. However, the TMT method detected changes that reached statistical significance three times more often due to higher precision and fewer missing values. These findings highlight the importance of refining proteome quantitation methods to bring the number of usefully quantified proteins into closer agreement with the number of total quantified proteins.

Subject(s)

Proteome/analysis , Proteomics/methods , Staining and Labeling/methods , Benchmarking , Fungal Proteins/analysis , Sensitivity and Specificity

10.

Row versus column correlations: avoiding the ecological fallacy in RNA/protein expression studies.

O'Brien, Jonathon J; Gunawardena, Harsha P; Qaqish, Bahjat F.

Brief Bioinform ; 19(5): 946-953, 2018 09 28.

Article in English | MEDLINE | ID: mdl-28369202

ABSTRACT

Biomedical researchers are often interested in computing the correlation between RNA and protein abundance. However, correlations can be computed between rows of a data matrix or between columns, and the results are not the same. The belief that these two types of correlation are estimating the same phenomenon is a special case of a well-known logical error called the ecological fallacy. In this article, we review different uses of correlation found in the literature, explain the differences between row and column correlations and argue that one of them has an undesirable interpretation in most applications. Through simulation studies and theoretical derivations, we show that the commonly used Pearson's coefficient, computed from protein and transcript data from a single sample, is only loosely related to the biological correlation that most researchers will be interested in studying. Beyond our basic exploration of the ecological fallacy, we examine how correlations are affected by relative quantification proteomics data and common normalization procedures, finding that double normalization is capable of completely masking true correlative relationships. We conclude with guidelines for properly identifying and computing consistent correlation coefficients.

Subject(s)

Proteins/genetics , Proteins/metabolism , Proteomics/statistics & numerical data , RNA/genetics , RNA/metabolism , Bias , Computational Biology/methods , Computer Simulation , Data Interpretation, Statistical , Humans , Models, Biological , Models, Statistical , Transcription, Genetic
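
The distinction drawn in the abstract of entry 10, between correlations computed along rows and along columns of a data matrix, can be illustrated with a small sketch. The toy numbers, the layout (genes as rows, samples as columns) and the helper name `pearson` are assumptions made for illustration only; none of them come from the article.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


# Hypothetical toy data: each row is a gene, each column is a sample.
rna = [[1, 2, 3], [10, 11, 12], [20, 21, 22]]
protein = [[3, 2, 1], [12, 11, 10], [22, 21, 20]]

# Per-gene correlation: one gene's RNA vs. protein across all samples.
per_gene = [pearson(r, p) for r, p in zip(rna, protein)]

# Per-sample correlation: all genes' RNA vs. protein within one sample.
per_sample = [pearson(r, p) for r, p in zip(zip(*rna), zip(*protein))]

print("per gene:  ", [round(v, 3) for v in per_gene])
print("per sample:", [round(v, 3) for v in per_sample])
```

In this constructed example every gene's RNA and protein levels move in opposite directions across samples, yet within any single sample the two measurements are almost perfectly aligned across genes, because the differences between genes dominate. The two kinds of correlation therefore answer different questions, even on the same matrices.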

11.

Compositional Proteomics: Effects of Spatial Constraints on Protein Quantification Utilizing Isobaric Tags.

O'Brien, Jonathon J; O'Connell, Jeremy D; Paulo, Joao A; Thakurta, Sanjukta; Rose, Christopher M; Weekes, Michael P; Huttlin, Edward L; Gygi, Steven P.

J Proteome Res ; 17(1): 590-599, 2018 01 05.

Article in English | MEDLINE | ID: mdl-29195270

ABSTRACT

Mass spectrometry (MS) has become an accessible tool for whole proteome quantitation with the ability to characterize protein expression across thousands of proteins within a single experiment. A subset of MS quantification methods (e.g., SILAC and label-free) monitor the relative intensity of intact peptides, where thousands of measurements can be made from a single mass spectrum. An alternative approach, isobaric labeling, enables precise quantification of multiple samples simultaneously through unique and sample specific mass reporter ions. Consequently, in a single scan, the quantitative signal comes from a limited number of spectral features (≤11). The signal observed for these features is constrained by automatic gain control, forcing codependence of concurrent signals. The study of constrained outcomes primarily belongs to the field of compositional data analysis. We show experimentally that isobaric tag proteomics data are inherently compositional and highlight the implications for data analysis and interpretation. We present a new statistical model and accompanying software that improves estimation accuracy and the ability to detect changes in protein abundance. Finally, we demonstrate a unique compositional effect on proteins with infinite changes. We conclude that many infinite changes will appear small and that the magnitude of these estimates is highly dependent on experimental design.

Subject(s)

Proteomics/methods , Models, Statistical , Software , Staining and Labeling

12.

Accelerating high-dimensional clustering with lossless data reduction.

Qaqish, Bahjat F; O'Brien, Jonathon J; Hibbard, Jonathan C; Clowers, Katie J.

Bioinformatics ; 33(18): 2867-2872, 2017 Sep 15.

Article in English | MEDLINE | ID: mdl-28520900

ABSTRACT

MOTIVATION: For cluster analysis, high-dimensional data are associated with instability, decreased classification accuracy and high-computational burden. The latter challenge can be eliminated as a serious concern. For applications where dimension reduction techniques are not implemented, we propose a temporary transformation which accelerates computations with no loss of information. The algorithm can be applied for any statistical procedure depending only on Euclidean distances and can be implemented sequentially to enable analyses of data that would otherwise exceed memory limitations. RESULTS: The method is easily implemented in common statistical software as a standard pre-processing step. The benefit of our algorithm grows with the dimensionality of the problem and the complexity of the analysis. Consequently, our simple algorithm not only decreases the computation time for routine analyses, it opens the door to performing calculations that may have otherwise been too burdensome to attempt. AVAILABILITY AND IMPLEMENTATION: R, Matlab and SAS/IML code for implementing lossless data reduction is freely available in the Appendix. CONTACT: obrienj@hms.harvard.edu.

Subject(s)

Cluster Analysis , Computational Biology/methods , Software , Algorithms , DNA Methylation , Fungal Proteins/genetics , Gene Expression Regulation, Fungal , Humans , Proteomics/methods , Yeasts/genetics , Yeasts/metabolism
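
The abstract of entry 12 describes a transformation that speeds up any procedure depending only on Euclidean distances, with no loss of information. A minimal sketch of one way such a transformation can work is given below: it assumes a Gram-Schmidt orthogonalization of the rows, which may differ from the code in the article's Appendix. The function name `reduce_rows` and the simulated data are hypothetical.

```python
import math
import random


def reduce_rows(X, tol=1e-12):
    """Re-express n rows of length p in an orthonormal basis of their span.

    Returns an n x k matrix (k <= n) whose pairwise Euclidean distances
    match those of X up to floating-point error.
    """
    basis = []   # orthonormal vectors of length p found so far
    coords = []  # coordinates of each row in that basis
    for row in X:
        residual = list(row)
        c = []
        for b in basis:
            proj = sum(r * bk for r, bk in zip(residual, b))
            c.append(proj)
            residual = [r - proj * bk for r, bk in zip(residual, b)]
        norm = math.sqrt(sum(r * r for r in residual))
        if norm > tol:  # the row adds a new direction to the span
            basis.append([r / norm for r in residual])
            c.append(norm)
        coords.append(c)
    k = len(basis)
    # Pad with zeros so every row has k coordinates.
    return [c + [0.0] * (k - len(c)) for c in coords]


# Simulated example: 5 observations in 200 dimensions.
random.seed(0)
n, p = 5, 200
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
Z = reduce_rows(X)

max_err = max(
    abs(math.dist(X[i], X[j]) - math.dist(Z[i], Z[j]))
    for i in range(n) for j in range(i)
)
print(f"reduced from {p} to {len(Z[0])} columns, max distance error {max_err:.2e}")
```

Because the new coordinates come from an orthonormal basis of the space spanned by the rows, distance-based methods such as hierarchical clustering should give the same result on `Z` as on `X`, while working with at most n columns instead of p.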
