ABSTRACT
Using multiplexed quantitative proteomics, we analyzed cell cycle-dependent changes of the human proteome. We identified >4,400 proteins, each with a six-point abundance profile across the cell cycle. Hypothesizing that proteins with similar abundance profiles are co-regulated, we clustered the proteins with abundance profiles most similar to known Anaphase-Promoting Complex/Cyclosome (APC/C) substrates to identify additional putative APC/C substrates. This protein profile similarity screening (PPSS) analysis resulted in a shortlist enriched in kinases and kinesins. Biochemical studies on the kinesins confirmed KIFC1, KIF18A, KIF2C, and KIF4A as APC/C substrates. Furthermore, we showed that the APC/C(CDH1)-dependent degradation of KIFC1 regulates the bipolar spindle formation and proper cell division. A targeted quantitative proteomics experiment showed that KIFC1 degradation is modulated by a stabilizing CDK1-dependent phosphorylation site within the degradation motif of KIFC1. The regulation of KIFC1 (de-)phosphorylation and degradation provides insights into the fidelity and proper ordering of substrate degradation by the APC/C during mitosis.
Subject(s)
Anaphase-Promoting Complex-Cyclosome/metabolism , Proteolysis , Proteomics , Amino Acid Sequence , Cell Cycle , HeLa Cells , Humans , Kinesins/metabolism , Biological Models , Molecular Sequence Data , Phosphorylation , Recombinant Fusion Proteins/metabolism , Substrate Specificity , Ubiquitination
ABSTRACT
In this study, we performed an in-depth characterization of the male pediatric infant urinary proteome by parallel proteomic analysis of normal healthy adult (n=6) and infant (n=6) males and comparison to available published data. A total of 1584 protein groups were identified. Of these, 708 proteins were identified in samples from both cohorts. Although present in both cohorts, 136 of these common proteins were significantly enriched in urine from adults and 94 proteins were significantly enriched in urine from infants. Using Gene Ontology, we found that the infant-enriched or specific subproteome (743 proteins) had an overrepresentation of proteins that are involved in translation and transcription, cellular growth and metabolic processes. In contrast, the adult enriched or specific subproteome (364 proteins) showed an overexpression of proteins involved in immune response and cell adhesion. This study demonstrates that the non-diseased male urinary proteome is quantitatively affected by age, has age-specific subproteomes, and identifies a common subproteome with no age-dependent abundance variations. These findings highlight the importance of age-matching in urinary proteomics. This article is part of a Special Issue entitled: Biomarkers: A Proteomic Challenge.
Subject(s)
Biomarkers/urine , Proteins/analysis , Proteome/analysis , Proteomics/methods , Urine/chemistry , Adult , Liquid Chromatography , Cohort Studies , Humans , Infant , Male , Subcellular Fractions , Tandem Mass Spectrometry , Young Adult
ABSTRACT
Across a host of MS-driven-omics fields, researchers witness the acquisition of ever increasing amounts of high throughput MS data and face the need for their compact yet efficiently accessible storage. Addressing the need for an open data exchange format, the Proteomics Standards Initiative and the Seattle Proteome Center at the Institute for Systems Biology independently developed the mzData and mzXML formats, respectively. In a subsequent joint effort, they defined an ontology and associated controlled vocabulary that specifies the contents of MS data files, implemented as the newer mzML format. All three formats are based on XML and are thus not particularly efficient in either storage space requirements or read/write speed. This contribution introduces mz5, a complete reimplementation of the mzML ontology that is based on the efficient, industrial strength storage backend HDF5. Compared with the current mzML standard, this strategy yields an average file size reduction to ~54% and increases linear read and write speeds ~3-4-fold. The format is implemented as part of the ProteoWizard project and is available under a permissive Apache license. Additional information and download links are available from http://software.steenlab.org/mz5.
Subject(s)
Information Storage and Retrieval , Mass Spectrometry/methods , Proteomics , High Pressure Liquid Chromatography , HeLa Cells , Humans
ABSTRACT
Currently, the reliable identification of peptides and proteins is only feasible when thoroughly annotated sequence databases are available. Although sequencing capacities continue to grow, many organisms remain without reliable, fully annotated reference genomes required for proteomic analyses. Standard database search algorithms fail to identify peptides that are not exactly contained in a protein database. De novo searches are generally hindered by their restricted reliability, and current error-tolerant search strategies are limited by global, heuristic tradeoffs between database and spectral information. We propose a Bayesian information criterion-driven error-tolerant peptide search (BICEPS) and offer an open source implementation based on this statistical criterion to automatically balance the information of each single spectrum and the database, while limiting the run time. We show that BICEPS performs as well as current database search algorithms when such algorithms are applied to sequenced organisms, whereas BICEPS only uses a remotely related organism database. For instance, we use a chicken instead of a human database corresponding to an evolutionary distance of more than 300 million years (International Chicken Genome Sequencing Consortium (2004) Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. Nature 432, 695-716). We demonstrate the successful application to cross-species proteomics with a 33% increase in the number of identified proteins for a filarial nematode sample of Litomosoides sigmodontis.
Subject(s)
Chickens/genetics , Filarioidea/genetics , Peptides/chemistry , Proteomics/methods , Software , Algorithms , Amino Acid Sequence , Animals , Bayes Theorem , Biological Evolution , Protein Databases , Humans , Internet , Mass Spectrometry , Molecular Sequence Data , Reproducibility of Results , Protein Sequence Analysis
ABSTRACT
A wide range of biomolecules, including proteins, are excreted and secreted from helminths and contribute to the parasite's successful establishment, survival, and reproduction in an adverse habitat. Excretory and secretory proteins (ESP) are active at the interface between parasite and host and comprise potential targets for intervention. The intestinal nematode Strongyloides spp. exhibits an exceptional developmental plasticity in its life cycle characterized by parasitic and free-living generations. We investigated ESP from infective larvae, parasitic females, and free-living stages of the rat parasite Strongyloides ratti, which is genetically very similar to the human pathogen, Strongyloides stercoralis. Proteomic analysis of ESP revealed 586 proteins, with the largest number of stage-specific ESP found in infective larvae (196), followed by parasitic females (79) and free-living stages (35). One hundred and forty proteins were identified in all studied stages, including anti-oxidative enzymes, heat shock proteins, and carbohydrate-binding proteins. The stage-selective ESP of (1) infective larvae included an astacin metalloproteinase, the L3 Nie antigen, and a fatty acid retinoid-binding protein; (2) parasitic females included a prolyl oligopeptidase (prolyl serine carboxypeptidase), small heat shock proteins, and a secreted acidic protein; (3) free-living stages included a lysozyme family member, a carbohydrate-hydrolyzing enzyme, and saponin-like protein. We verified the differential expression of selected genes encoding ESP by qRT-PCR. ELISA analysis revealed the recognition of ESP by antibodies of S. ratti-infected rats. A prolyl oligopeptidase was identified as abundant parasitic female-specific ESP, and the effect of pyrrolidine-based prolyl oligopeptidase inhibitors showed concentration- and time-dependent inhibitory effects on female motility. 
The characterization of stage-related ESP from Strongyloides will help to further understand the interaction of this unique intestinal nematode with its host.
Subject(s)
Helminth Proteins/metabolism , Larva/enzymology , Serine Endopeptidases/metabolism , Strongyloides ratti/enzymology , Amino Acid Sequence , Animals , Base Sequence , Culture Media/chemistry , Female , Developmental Gene Expression Regulation , Helminth Proteins/genetics , Immune Sera/chemistry , Intestines/parasitology , Larva/genetics , Larva/growth & development , Male , Molecular Sequence Data , Peptide Hydrolases/genetics , Peptide Hydrolases/metabolism , Prolyl Oligopeptidases , Protease Inhibitors/pharmacology , Protein Sorting Signals , Tertiary Protein Structure , Proteomics , Rats , Wistar Rats , Real-Time Polymerase Chain Reaction , Protein Sequence Analysis , Serine Endopeptidases/genetics , Nonparametric Statistics , Strongyloides ratti/genetics , Strongyloides ratti/growth & development , Strongyloidiasis/parasitology
ABSTRACT
MOTIVATION: Algorithms for sparse data require fast search and subset selection capabilities for the determination of point neighborhoods. A natural data representation for such cases are space partitioning data structures. However, the associated range queries assume noise-free observations and cannot take into account observation-specific uncertainty estimates that are present in e.g. modern mass spectrometry data. In order to accommodate the inhomogeneous noise characteristics of sparse real-world datasets, point queries need to be reformulated in terms of box intersection queries, where box sizes correspond to uncertainty regions for each observation. RESULTS: This contribution introduces libfbi, a standard C++, header-only template implementation for fast box intersection in an arbitrary number of dimensions, with arbitrary data types in each dimension. The implementation is applied to a data aggregation task on state-of-the-art liquid chromatography/mass spectrometry data, where it shows excellent run time properties. AVAILABILITY: The library is available under an MIT license and can be downloaded from http://software.steenlab.org/libfbi. CONTACT: marc.kirchner@childrens.harvard.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
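The reformulation described in this abstract, replacing point queries with intersection queries over per-observation uncertainty boxes, can be illustrated with a minimal Python sketch. It is not the libfbi API (libfbi is a C++ template library). The helper names `box_around`, `boxes_intersect`, and `intersecting_pairs`, the one-dimensional sweep, and the toy (m/z, retention time) values are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]   # closed interval (low, high)
Box = Sequence[Interval]         # one interval per dimension


def box_around(point: Sequence[float], uncertainty: Sequence[float]) -> List[Interval]:
    """Turn an observation and its per-dimension uncertainty into a box."""
    return [(x - u, x + u) for x, u in zip(point, uncertainty)]


def boxes_intersect(a: Box, b: Box) -> bool:
    """Two closed boxes intersect iff their intervals overlap in every dimension."""
    return all(lo_a <= hi_b and lo_b <= hi_a
               for (lo_a, hi_a), (lo_b, hi_b) in zip(a, b))


def intersecting_pairs(boxes: Sequence[Box]) -> List[Tuple[int, int]]:
    """Report all intersecting index pairs using a sweep over dimension 0.

    Boxes are visited in order of their lower bound in dimension 0; a box
    stays 'active' only while it can still overlap later boxes there.
    """
    order = sorted(range(len(boxes)), key=lambda i: boxes[i][0][0])
    active: List[int] = []
    pairs: List[Tuple[int, int]] = []
    for i in order:
        low_i = boxes[i][0][0]
        # Drop boxes that end before the current box starts in dimension 0.
        active = [j for j in active if boxes[j][0][1] >= low_i]
        for j in active:
            if boxes_intersect(boxes[i], boxes[j]):
                pairs.append((min(i, j), max(i, j)))
        active.append(i)
    return sorted(pairs)


# Invented (m/z, retention time) observations, each with its own uncertainty.
observations = [
    ((500.250, 30.0), (0.01, 0.5)),
    ((500.255, 30.4), (0.01, 0.5)),   # close to observation 0 in both dimensions
    ((500.250, 35.0), (0.01, 0.5)),   # same m/z as 0, but elutes much later
    ((700.400, 30.0), (0.02, 0.5)),
]
boxes = [box_around(p, u) for p, u in observations]
pairs = intersecting_pairs(boxes)
print(pairs)
```

Because each box carries its own size, observations with larger uncertainty automatically acquire larger neighborhoods, which a fixed-radius range query cannot express.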
Subject(s)
Algorithms , Mass Spectrometry/methods , Liquid Chromatography , Software
ABSTRACT
MOTIVATION: Alignment of multiple liquid chromatography/mass spectrometry (LC/MS) experiments is a necessity today, which arises from the need for biological and technical repeats. Due to limits in sampling frequency and poor reproducibility of retention times, current LC systems suffer from missing observations and non-linear distortions of the retention times across runs. Existing approaches for peak correspondence estimation focus almost exclusively on solving the pairwise alignment problem, yielding straightforward but suboptimal results for multiple alignment problems. RESULTS: We propose SIMA, a novel automated procedure for alignment of peak lists from multiple LC/MS runs. SIMA combines hierarchical pairwise correspondence estimation with simultaneous alignment and global retention time correction. It employs a tailored multidimensional kernel function and a procedure based on maximum likelihood estimation to find the retention time distortion function that best fits the observed data. SIMA does not require a dedicated reference spectrum, is robust with regard to outliers, needs only two intuitive parameters and naturally incorporates incomplete correspondence information. In a comparison with seven alternative methods on four different datasets, we show that SIMA yields competitive and superior performance on real-world data. AVAILABILITY: A C++ implementation of the SIMA algorithm is available from http://hci.iwr.uni-heidelberg.de/MIP/Software.
Subject(s)
Algorithms , Liquid Chromatography/methods , Mass Spectrometry/methods
ABSTRACT
Protein S-acylation (palmitoylation), a reversible post-translational modification, is critically involved in regulating protein subcellular localization, activity, stability, and multimeric complex assembly. However, proteome scale characterization of S-acylation has lagged far behind that of phosphorylation, and global analysis of the localization of S-acylated proteins within different membrane domains has not been reported. Here we describe a novel proteomics approach, designated palmitoyl protein identification and site characterization (PalmPISC), for proteome scale enrichment and characterization of S-acylated proteins extracted from lipid raft-enriched and non-raft membranes. In combination with label-free spectral counting quantitation, PalmPISC led to the identification of 67 known and 331 novel candidate S-acylated proteins as well as the localization of 25 known and 143 novel candidate S-acylation sites. Palmitoyl acyltransferases DHHC5, DHHC6, and DHHC8 appear to be S-acylated on three cysteine residues within a novel CCX(7-13)C(S/T) motif downstream of a conserved Asp-His-His-Cys cysteine-rich domain, which may be a potential mechanism for regulating acyltransferase specificity and/or activity. S-Acylation may tether cytoplasmic acyl-protein thioesterase-1 to membranes, thus facilitating its interaction with and deacylation of membrane-associated S-acylated proteins. Our findings also suggest that certain ribosomal proteins may be targeted to lipid rafts via S-acylation, possibly to facilitate regulation of ribosomal protein activity and/or dynamic synthesis of lipid raft proteins in situ. In addition, bioinformatics analysis suggested that S-acylated proteins are highly enriched within core complexes of caveolae and tetraspanin-enriched microdomains, both cholesterol-rich membrane structures. 
The PalmPISC approach and the large scale human S-acylated protein data set are expected to provide powerful tools to facilitate our understanding of the functions and mechanisms of protein S-acylation.
Subject(s)
Cell Membrane/metabolism , Membrane Microdomains/metabolism , Membrane Proteins/analysis , Proteome/analysis , Acyltransferases/metabolism , Binding Sites , Tumor Cell Line , Humans , Immunoblotting , Lipoylation , Mass Spectrometry , Membrane Proteins/classification , Membrane Proteins/metabolism , Fluorescence Microscopy , Palmitic Acid/metabolism , Proteome/metabolism , Proteomics/methods , Ribosomal Protein L10 , Ribosomal Proteins/analysis , Ribosomal Proteins/metabolism , Zinc Fingers
ABSTRACT
Disorders of iron metabolism affect over a billion people worldwide. The circulating peptide hormone hepcidin, the central regulator of iron distribution in mammals, holds great diagnostic potential for an array of iron-associated disorders, including iron loading (β-thalassemia), iron overload (hereditary hemochromatosis), and iron deficiency diseases. We describe a novel high-throughput matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) mass spectrometry assay for quantification of hepcidin in human plasma. This assay involves enrichment using a functionalized MALDI chip, a novel solvent-detergent precipitation buffer, and quantification using a stable isotope labeled internal standard. The linear range of hepcidin in plasma was 1-120 nM, with a low limit of quantification (LOQ) (1 nM), high accuracy (<15% relative error (RE)), and high precision (intraday average 5.52-18.48% coefficient of variation (CV) and interday 9.32-14.83% CV). The assay showed strong correlation with an established hepcidin immunoassay (Spearman; R(2) = 0.839 n = 93 ethylenediaminetetraacetic acid (EDTA) plasma). A collection of normal healthy pediatric samples (range 3.8-32.5 ng/mL; mean 12.9 ng/mL; n = 119) showed significant differences from an adult collection (range 1.8-48.7 ng/mL; mean 16.1 ng/mL; n = 95; P = 0.0096). We discuss these preliminary reference ranges and correlations with additional parameters in light of the utility and limitations of hepcidin measurements as a stand-alone diagnostic and as a tool for therapeutic intervention.
Subject(s)
Antimicrobial Cationic Peptides/blood , High-Throughput Screening Assays , Matrix-Assisted Laser Desorption-Ionization Mass Spectrometry , Adult , Child , Female , Hemochromatosis/diagnosis , Hepcidins , Humans , Immunoassay , Male , Reference Standards
ABSTRACT
MOTIVATION: Mass spectrometry (MS) has become the method of choice for protein/peptide sequence and modification analysis. The technology employs a two-step approach: ionized peptide precursor masses are detected, selected for fragmentation, and the fragment mass spectra are collected for computational analysis. Current precursor selection schemes are based on data- or information-dependent acquisition (DDA/IDA), where fragmentation mass candidates are selected by intensity and are subsequently included in a dynamic exclusion list to avoid constant refragmentation of highly abundant species. DDA/IDA methods do not exploit valuable information that is contained in the fractional mass of high-accuracy precursor mass measurements delivered by current instrumentation. RESULTS: We extend previous contributions that suggest that fractional mass information allows targeted fragmentation of analytes of interest. We introduce a non-linear Random Forest classification and a discrete mapping approach, which can be trained to discriminate among arbitrary fractional mass patterns for an arbitrary number of classes of analytes. These methods can be used to increase fragmentation efficiency for specific subsets of analytes or to select suitable fragmentation technologies on-the-fly. We show that theoretical generalization error estimates transfer into practical application, and that their quality depends on the accuracy of prior distribution estimate of the analyte classes. The methods are applied to two real-world proteomics datasets. AVAILABILITY: All software used in this study is available from http://software.steenlab.org/fmf CONTACT: hanno.steen@childrens.harvard.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
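One way to picture the discrete mapping idea mentioned in this abstract is a lookup table over binned nominal and fractional mass, sketched below in Python. It is an illustrative guess, not the authors' implementation. The class name `DiscreteMassMap`, the bin widths, the majority-vote rule, and the training values are all assumptions; the Random Forest variant is not shown.

```python
import math
from collections import Counter, defaultdict


def fractional_mass(mass: float) -> float:
    """Fractional part of a precursor mass (the digits after the decimal point)."""
    return mass - math.floor(mass)


class DiscreteMassMap:
    """Lookup table from (mass bin, fractional-mass bin) to the majority class."""

    def __init__(self, mass_bin: float = 100.0, frac_bins: int = 10):
        self.mass_bin = mass_bin      # width of a nominal-mass bin (assumed value)
        self.frac_bins = frac_bins    # number of fractional-mass bins (assumed value)
        self.table = defaultdict(Counter)

    def _key(self, mass: float):
        frac_index = min(int(fractional_mass(mass) * self.frac_bins), self.frac_bins - 1)
        return (int(mass // self.mass_bin), frac_index)

    def fit(self, masses, labels):
        """Count how often each class falls into each cell of the table."""
        for mass, label in zip(masses, labels):
            self.table[self._key(mass)][label] += 1
        return self

    def predict(self, mass: float, default=None):
        """Return the most frequent training class in the cell, or `default` if empty."""
        counts = self.table.get(self._key(mass))
        if not counts:
            return default
        return counts.most_common(1)[0][0]


# Invented training masses and class labels, for illustration only.
train_masses = [1000.52, 1000.48, 1050.55, 1000.13, 1020.17]
train_labels = ["class_a", "class_a", "class_a", "class_b", "class_b"]

mass_map = DiscreteMassMap().fit(train_masses, train_labels)
print(mass_map.predict(1030.51), mass_map.predict(1090.15), mass_map.predict(2500.30))
```

During acquisition, a table of this kind could be consulted for each measured precursor mass to decide whether, or with which technology, to fragment it; cells never seen in training fall back to the default.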
Subject(s)
Mass Spectrometry/methods , Genetic Selection , Proteins/chemistry , Proteome/analysis , Proteomics/methods
ABSTRACT
MOTIVATION: The qualitative and quantitative characterization of protein abundance profiles over a series of time points or a set of environmental conditions is becoming increasingly important. Using isobaric mass tagging experiments, mass spectrometry-based quantitative proteomics deliver accurate peptide abundance profiles for relative quantitation. Associated data analysis workflows need to provide tailored statistical treatment that (i) takes the correlation structure of the normalized peptide abundance profiles into account and (ii) allows inference of protein-level similarity. We introduce a suitable distance measure for relative abundance profiles, derive a statistical test for equality and propose a protein-level representation of peptide-level measurements. This yields a workflow that delivers a similarity ranking of protein abundance profiles with respect to a defined reference. All procedures have in common that they operate based on the true correlation structure that underlies the measurements. This optimizes power and delivers more intuitive and efficient results than existing methods that do not take these circumstances into account. RESULTS: We use protein profile similarity screening to identify candidate proteins whose abundances are post-transcriptionally controlled by the Anaphase Promoting Complex/Cyclosome (APC/C), a specific E3 ubiquitin ligase that is a master regulator of the cell cycle. Results are compared with an established protein correlation profiling method. The proposed procedure yields a 50.9-fold enrichment of co-regulated protein candidates and a 2.5-fold improvement over the previous method. AVAILABILITY: A MATLAB toolbox is available from http://hci.iwr.uni-heidelberg.de/mip/proteomics.
Subject(s)
Algorithms , Gene Expression Profiling/methods , Mass Spectrometry/methods , Peptide Mapping/methods , Protein Sequence Analysis/methods , Amino Acid Sequence , Molecular Sequence Data
ABSTRACT
MOTIVATION: Time-resolved hydrogen exchange (HX) followed by mass spectrometry (MS) is a key technology for studying protein structure, dynamics and interactions. HX experiments deliver a time-dependent distribution of deuteration levels of peptide sequences of the protein of interest. The robust and complete estimation of this distribution for as many peptide fragments as possible is instrumental to understanding dynamic protein-level HX behavior. Currently, this data interpretation step still is a bottleneck in the overall HX/MS workflow. RESULTS: We propose HeXicon, a novel algorithmic workflow for automatic deuteration distribution estimation at increased sequence coverage. Based on an L(1)-regularized feature extraction routine, HeXicon extracts the full deuteration distribution, which allows insight into possible bimodal exchange behavior of proteins, rather than just an average deuteration for each time point. Further, it is capable of addressing ill-posed estimation problems, yielding sparse and physically reasonable results. HeXicon makes use of existing peptide sequence information, which is augmented by an inferred list of peptide candidates derived from a known protein sequence. In conjunction with a supervised classification procedure that balances sensitivity and specificity, HeXicon can deliver results with increased sequence coverage. AVAILABILITY: The entire HeXicon workflow has been implemented in C++ and includes a graphical user interface. It is available at http://hci.iwr.uni-heidelberg.de/software.php. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
Algorithms , Deuterium Exchange Measurement/methods , Mass Spectrometry/methods , Proteins/chemistry , Deuterium/chemistry , User-Computer Interface
ABSTRACT
The anaphase promoting complex (APC) controls the degradation of proteins during exit from mitosis and entry into S-phase. The activity of the APC is regulated by phosphorylation during mitosis. Because the phosphorylation pattern provides insights into the complexity of regulation of the APC, we studied in detail the phosphorylation patterns at a single mitotic state of arrest generated by various antimitotic drugs. We examined the phosphorylation patterns of the APC in HeLa S3 cells after they were arrested in prometaphase with taxol, nocodazole, vincristine, or monastrol. There were 71 phosphorylation sites on nine of the APC subunits. Despite the common state of arrest, the various antimitotic drug treatments resulted in differences in the phosphorylation patterns and phosphorylation stoichiometries. The relative phosphorylation stoichiometries were determined by using a method adapted from the isotope-free quantitation of the extent of modification (iQEM). We could show that during drug arrest the phosphorylation state of the APC changes, indicating that the mitotic arrest is not a static condition. We discuss these findings in terms of the variable efficacy of antimitotic drugs in cancer chemotherapy.
Subject(s)
Antimitotic Agents/pharmacology , Proteomics/methods , Ubiquitin-Protein Ligase Complexes/metabolism , Anaphase-Promoting Complex-Cyclosome , HeLa Cells , Humans , Mass Spectrometry , Nocodazole/pharmacology , Paclitaxel/pharmacology , Phosphorylation , Prometaphase/drug effects , Protein Subunits/metabolism , Pyrimidines/pharmacology , Spindle Apparatus/drug effects , Thiones/pharmacology , Vincristine/pharmacology
ABSTRACT
Despite the efforts of the mass spectrometry (MS) community to migrate data representation toward modern file formats, legacy text formats still play an important role in MS data processing workflows. We provide a formal grammar and a portable, efficient C++ implementation for a Mascot Generic Format (MGF) parser. Software and technical documentation are available from http://software.steenlab.org/mgfp/.
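For readers unfamiliar with MGF, the following Python sketch shows the general shape of the format and a deliberately small reader for it. It is not the mgfp parser (which is C++ and grammar-based) and covers only a subset of the format: `BEGIN IONS` / `END IONS` blocks, `KEY=value` parameters inside a block, whitespace-separated peak lines, and comment lines starting with `#`, `;`, `!`, or `/`. The function name `parse_mgf`, the dictionary layout, and the sample values are assumptions made for illustration.

```python
from typing import Dict, Iterable, List


def parse_mgf(lines: Iterable[str]) -> List[Dict]:
    """Parse spectra from MGF text lines into dicts with 'params' and 'peaks'.

    Parameters that appear outside BEGIN IONS / END IONS are ignored here.
    """
    spectra: List[Dict] = []
    current = None
    for raw in lines:
        line = raw.strip()
        if not line or line[0] in "#;!/":
            continue                      # skip blank lines and comments
        if line == "BEGIN IONS":
            current = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif current is not None:
            if "=" in line and not line[0].isdigit():
                # Parameter line such as TITLE=..., PEPMASS=..., CHARGE=...
                key, value = line.split("=", 1)
                current["params"][key.strip().upper()] = value.strip()
            else:
                # Peak line: m/z followed by an optional intensity.
                fields = line.split()
                mz = float(fields[0])
                intensity = float(fields[1]) if len(fields) > 1 else 0.0
                current["peaks"].append((mz, intensity))
    return spectra


# Toy input; all values are invented for illustration.
sample = """\
# toy file
BEGIN IONS
TITLE=example spectrum 1
PEPMASS=498.3
CHARGE=2+
112.1 340.0
226.2 1200.5
END IONS
"""
spectra = parse_mgf(sample.splitlines())
print(spectra[0]["params"], spectra[0]["peaks"])
```

A line-oriented reader like this performs no validation; a formal grammar, as provided in the publication, is what makes malformed input detectable in a principled way.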
Subject(s)
Computational Biology/methods , Factual Databases , Mass Spectrometry/methods , Programming Languages , Database Management Systems , Terminology as Topic
ABSTRACT
Using decoy databases to compute the confidence of peptide identifications has become the standard procedure for mass spectrometry driven proteomics. While decoy databases have numerous advantages, they double the run time and are not applicable to all peptide identification problems such as error-tolerant or de novo searches or the large-scale identification of cross-linked peptides. Instead, we propose a fast, simple and robust mixture modeling approach to estimate the confidence of peptide identifications without the need for decoy database searches, which automatically checks whether its underlying assumptions are fulfilled. This approach is then evaluated on 41 LC/MS data sets of varying complexity and origin. The results are very similar to those of the decoy database strategy at a negligible computational cost. Our approach is applicable not only to standard protein identification workflows, but also to proteomics problems for which meaningful decoy databases cannot be constructed.
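The general idea of decoy-free confidence estimation by mixture modeling can be sketched as follows. The abstract does not state which component distributions the authors use, so this Python example assumes a two-component Gaussian mixture over identification scores fitted by expectation-maximization, with the low-scoring component treated as incorrect identifications. The function names, initialization, and toy scores are hypothetical, and the automatic assumption check mentioned in the abstract is not implemented.

```python
import math


def normal_pdf(x: float, mean: float, sd: float) -> float:
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_gaussians(scores, iterations: int = 200):
    """EM for a 1-D two-component Gaussian mixture.

    Component 0 starts at the lowest score (assumed 'incorrect'),
    component 1 at the highest score (assumed 'correct').
    """
    low, high = min(scores), max(scores)
    means = [low, high]
    spread = (high - low) / 4.0 or 1.0
    sds = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each score.
        resp = []
        for x in scores:
            p = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weight, mean and standard deviation.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            weights[k] = nk / len(scores)
            means[k] = sum(r[k] * x for r, x in zip(resp, scores)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, scores)) / nk
            sds[k] = max(math.sqrt(var), 1e-3)   # floor avoids a collapsing component
    return weights, means, sds


def posterior_error_probability(x, weights, means, sds) -> float:
    """Posterior probability that a score x belongs to the 'incorrect' component."""
    p_wrong = weights[0] * normal_pdf(x, means[0], sds[0])
    p_right = weights[1] * normal_pdf(x, means[1], sds[1])
    total = p_wrong + p_right
    return p_wrong / total if total > 0 else 0.5


# Invented identification scores (higher is assumed to be better).
scores = [8, 9, 10, 11, 12, 10, 9, 11, 38, 40, 42, 41, 39, 40]
weights, means, sds = fit_two_gaussians(scores)
pep_low = posterior_error_probability(10, weights, means, sds)
pep_high = posterior_error_probability(40, weights, means, sds)
print(means, pep_low, pep_high)
```

Because the posterior is computed from the target search scores alone, no second search against a decoy database is required, which is the source of the run-time saving the abstract refers to.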
Subject(s)
Peptides/analysis , Proteomics/methods , Animals , Factual Databases , Humans , Mass Spectrometry , Mice , Reproducibility of Results , Time Factors
ABSTRACT
The effectiveness of database search algorithms, such as Mascot, Sequest and ProteinPilot is limited by the quality of the input spectra: spurious peaks in MS/MS spectra can jeopardize the correct identification of peptides or reduce their score significantly. Consequently, an efficient preprocessing of MS/MS spectra can increase the sensitivity of peptide identification at reduced file sizes and run time without compromising its specificity. We investigate the performance of 25 MS/MS preprocessing methods on various data sets and make software for improved preprocessing of mgf/dta-files freely available from http://hci.iwr.uni-heidelberg.de/mip/proteomics or http://www.childrenshospital.org/research/steenlab.
Subject(s)
Computational Biology/methods , Peptides/analysis , Proteomics/methods , Software Design , Tandem Mass Spectrometry/methods , Animals , Humans , Internet , Peptides/chemistry
ABSTRACT
BACKGROUND: The reliable extraction of features from mass spectra is a fundamental step in the automated analysis of proteomic mass spectrometry (MS) experiments. RESULTS: This contribution proposes a sparse template regression approach to peak picking called NITPICK. NITPICK is a Non-greedy, Iterative Template-based peak PICKer that deconvolves complex overlapping isotope distributions in multicomponent mass spectra. NITPICK is based on fractional averaging, a novel extension to Senko's well-known averaging model, and on a modified version of sparse, non-negative least angle regression, for which a suitable, statistically motivated early stopping criterion has been derived. The strength of NITPICK is the deconvolution of overlapping mixture mass spectra. CONCLUSION: Extensive comparative evaluation has been carried out and results are provided for simulated and real-world data sets. NITPICK outperforms pepex, to date the only alternate, publicly available, non-greedy feature extraction routine. NITPICK is available as software package for the R programming language and can be downloaded from (http://hci.iwr.uni-heidelberg.de/mip/proteomics/).
Subject(s)
Mass Spectrometry/methods , Protein Sequence Analysis/methods , Software , Algorithms , Automated Pattern Recognition , Proteomics
ABSTRACT
Imaging mass spectrometry (IMS) is a promising technology which allows for detailed analysis of spatial distributions of (bio)molecules in organic samples. In many current applications, IMS relies heavily on (semi)automated exploratory data analysis procedures to decompose the data into characteristic component spectra and corresponding abundance maps, visualizing spectral and spatial structure. The most commonly used techniques are principal component analysis (PCA) and independent component analysis (ICA). Both methods operate in an unsupervised manner. However, their decomposition estimates usually feature negative counts and are not amenable to direct physical interpretation. We propose probabilistic latent semantic analysis (pLSA) for non-negative decomposition and the elucidation of interpretable component spectra and abundance maps. We compare this algorithm to PCA, ICA, and non-negative PARAFAC (parallel factors analysis) and show on simulated and real-world data that pLSA and non-negative PARAFAC are superior to PCA or ICA in terms of complementarity of the resulting components and reconstruction accuracy. We further combine pLSA decomposition with a statistical complexity estimation scheme based on the Akaike information criterion (AIC) to automatically estimate the number of components present in a tissue sample data set and show that this results in sensible complexity estimates.
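The non-negativity that makes pLSA components directly interpretable follows from its EM updates, which only ever multiply and renormalize non-negative quantities. The Python sketch below shows the standard pLSA EM iteration on a small count matrix. It assumes the usual mapping for imaging data (rows are pixels, columns are m/z channels), which is a reading of the abstract rather than the authors' code. The function names, the random initialization, and the toy counts are assumptions, and the AIC-based selection of the number of components is not shown.

```python
import random


def normalize(values):
    """Scale non-negative values to sum to 1 (uniform if everything is zero)."""
    total = sum(values)
    if total <= 0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]


def plsa(counts, n_components: int, iterations: int = 100, seed: int = 0):
    """Fit pLSA by EM.

    counts[d][w] : non-negative count for pixel d and m/z channel w.
    Returns (p_w_z, p_z_d):
      p_w_z[z][w] : component spectrum z, a distribution over channels
      p_z_d[d][z] : abundance of component z in pixel d
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    p_w_z = [normalize([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_components)]
    p_z_d = [normalize([rng.random() + 0.1 for _ in range(n_components)])
             for _ in range(n_docs)]
    for _ in range(iterations):
        new_w_z = [[0.0] * n_words for _ in range(n_components)]
        new_z_d = [[0.0] * n_components for _ in range(n_docs)]
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: posterior over components for this (pixel, channel) cell.
                q = normalize([p_w_z[z][w] * p_z_d[d][z] for z in range(n_components)])
                # M-step accumulators: expected counts per component.
                for z in range(n_components):
                    new_w_z[z][w] += n * q[z]
                    new_z_d[d][z] += n * q[z]
        p_w_z = [normalize(row) for row in new_w_z]
        p_z_d = [normalize(row) for row in new_z_d]
    return p_w_z, p_z_d


# Invented counts: 4 pixels x 5 m/z channels.
counts = [
    [9, 8, 0, 0, 1],
    [8, 9, 1, 0, 0],
    [0, 1, 7, 9, 8],
    [1, 0, 8, 8, 9],
]
component_spectra, abundances = plsa(counts, n_components=2)
print(component_spectra)
print(abundances)
```

Reshaping each column of the abundance matrix back onto the pixel grid yields one abundance map per component, and each row of the component-spectra matrix can be read as a normalized spectrum.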
Subject(s)
Algorithms , Breast Neoplasms/pathology , Computer-Assisted Image Processing , Mass Spectrometry , Principal Component Analysis , Computer Simulation , Female , Humans , Computer-Assisted Signal Processing
ABSTRACT
Many cancers have been associated with the deregulation of kinases, and thus, kinases have become a prime target for the development of cancer treatments. This focus on kinases has resulted in the approval of several small-molecule kinase inhibitors for cancer treatments. Further, the use of these inhibitors as tools to study cancer has provided valuable information about biological mechanisms. However, to date, not much is known about the global effects of kinases on the proteome or phosphoproteome. In this protocol, we describe methodology to study the impact of kinase inhibitors on the proteome and phosphoproteome using mass spectrometry-based quantitative proteomics. More specifically, we focus on the effects of Aurora B kinase inhibitors on the proteome, cytoskeleton proteome, the phosphoproteome, and the cytoskeleton phosphoproteome during cell cycle. This methodology is easily extended to other biological studies whose aim is to study the global proteomic effects of a kinase inhibitor.
Subject(s)
Enzyme Inhibitors/pharmacology , Mass Spectrometry , Phosphotransferases/antagonists & inhibitors , Proteomics/methods , Cell Cycle Checkpoints/drug effects , Cell Fractionation , Tumor Cell Line , Cultured Cells , Liquid Chromatography , HeLa Cells , Humans , Phosphorylation/drug effects
ABSTRACT
A protein molecule exists as a heterogeneous population of posttranslationally modified forms, which are of potential interest to biologists. However, due to detection or methodology limitations, they remain uncharacterized. When a protein does become a prioritized interest in a laboratory, workflows aimed for its purification and characterization are implemented. Inherent in these workflows is the enrichment of the protein from the biological lysate, rendering it an ideal sample for mass spectrometry (MS), as detection of several peptides is greatly increased. In order to capitalize on this enhanced detection of the protein of interest, we have developed a full-length expressed protein quantification standard (FLEXIQuant standard) that is in vitro synthesized, devoid of posttranslational modifications (PTMs), and implemented into the purification workflow of the endogenous counterpart; as such, it serves as an internal MS standard. FLEXIQuantification allows for the unbiased identification of peptides undergoing PTM as a function of a particular biological state. The extent of PTM is also quantified, providing further insight into the regulation of the protein.