ABSTRACT
Glycoproteomics is a powerful yet analytically challenging research tool. Software packages aiding the interpretation of complex glycopeptide tandem mass spectra have appeared, but their relative performance remains untested. Conducted through the HUPO Human Glycoproteomics Initiative, this community study, comprising both developers and users of glycoproteomics software, evaluates solutions for system-wide glycopeptide analysis. The same mass spectrometry-based glycoproteomics datasets from human serum were shared with participants, and the relative team performance for N- and O-glycopeptide data analysis was comprehensively established by orthogonal performance tests. Although the results were variable, several high-performance glycoproteomics informatics strategies were identified. Deep analysis of the data revealed key performance-associated search parameters and led to recommendations for improved 'high-coverage' and 'high-accuracy' glycoproteomics search solutions. This study concludes that diverse software packages for comprehensive glycopeptide data analysis exist, points to several high-performance search strategies and specifies key variables that will guide future software developments and assist informatics decision-making in glycoproteomics.
Subject(s)
Glycopeptides/blood, Glycoproteins/blood, Informatics/methods, Proteome/analysis, Proteomics/methods, Research Personnel/statistics & numerical data, Software, Glycosylation, Humans, Proteome/metabolism, Tandem Mass Spectrometry
ABSTRACT
While N-glycopeptides are relatively easy to characterize, O-glycosylation analysis is more complex. In this article, we illustrate the multiple layers of O-glycopeptide characterization that make this task so challenging. We believe our carefully curated dataset represents perhaps the largest intact human glycopeptide mixture derived from individuals, not from cell lines. The samples were collected from healthy individuals, patients with superficial or advanced bladder cancer (three of each group), and a single bladder inflammation patient. The data were scrutinized manually and interpreted using three different search engines (Byonic, Protein Prospector, and O-Pair) and the tool MS-Filter. Despite all the recent advances, reliable automatic O-glycopeptide assignment has not been solved yet. Our data reveal a diversity of site-specific O-glycosylation that has not been presented before. In addition to the potential biological implications, this dataset should be a valuable resource for software developers in the same way as some of our previously released data has been used in the development of O-Pair and O-Glycoproteome Analyzer. Based on the manual evaluation of the performance of the existing tools with our data, we compiled a series of recommendations that, if implemented, could significantly improve the reliability of glycopeptide assignments.
Subject(s)
Search Engine, Software, Humans, Glycosylation, Reproducibility of Results, Glycopeptides/analysis, Proteome/chemistry
ABSTRACT
A relatively novel activation technique, electron-transfer/higher-energy collision dissociation (EThcD), was used in the LC-MS/MS analysis of tryptic glycopeptides enriched with wheat germ agglutinin from human urine samples. We focused on the characterization of mucin-type O-glycopeptides. EThcD provided information on both the modified peptide and the glycan it carried in a single spectrum. Unexpectedly, glycan oxonium ions indicated the presence of O-acetyl, and even O-diacetyl-sialic acids. B and Y fragment ions revealed that (i) in core 1 structures the Gal residue featured the O-acetyl-sialic acid, when there was only one in the glycan; (ii) several glycopeptides featured core 1 glycans with disialic acids, in certain instances O-acetylated; (iii) the disialic acid was linked to the GalNAc residue whatever the degree of O-acetylation; (iv) core 2 isomers with a single O-acetyl-sialic acid were chromatographically resolved. Glycan fragmentation also helped to decipher additional core 2 oligosaccharides: a LacdiNAc-like structure, glycans carrying sialyl LewisX/A at different stages of O-acetylation, and blood antigens. A sialo core 3 structure was also identified. We believe this is the first study in which such structures were characterized from a very complex mixture and were not only linked to a specific protein but also assigned to specific modification sites.
Subject(s)
Glycoproteins/urine, Polysaccharides/analysis, Proteomics/methods, Liquid Chromatography, Glycopeptides/analysis, Humans, N-Acetylneuraminic Acid/chemistry, Polysaccharides/chemistry, Tandem Mass Spectrometry/methods
ABSTRACT
A novel software tool, Pinnacle, was used to reassess the reproducibility of a 2-step lectin-based O-glycopeptide enrichment method. A publicly available dataset consisting of 12 data files representing 3 technical replicates of enriched glycopeptides from human serum was investigated. Previously, an attempt at reproducibility assessment was made utilizing an MS/MS scan (MS2)-based method. However, the stochastic nature of precursor ion selection strongly biased this approach, leading to an underestimated rate of reproducibility. To bypass this problem, our present method follows the general path to confidently identify O-glycopeptides (database search with MS/MS data), supplemented with full scan/survey scan (MS1)/extracted ion chromatogram (XIC) mining in all files using two software packages, Pinnacle and Skyline. Confident MS/MS identifications were delivered by Protein Prospector. With this input, Skyline indicated a 70% reproducibility for our workflow. However, Pinnacle performed better, indicating the presence of 90% of the confidently assigned glycopeptides in all three replicates. Pinnacle, just like Skyline, performs ion extraction using the high-accuracy, high-resolution mass measurement data, but it also utilizes all the available MS/MS spectra, even from different activation methods, within the same file to make mass spectrometric data evaluation for glycopeptides more reliable.
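The reproducibility figure quoted above can be read as the fraction of confidently assigned glycopeptides that are detected in every replicate. The following is a minimal sketch of that arithmetic only; the function name, the identifiers, and the toy data are hypothetical, and it does not reproduce how Pinnacle or Skyline decide whether a glycopeptide is present.

```python
from typing import Dict, Set


def reproducibility(detections: Dict[str, Set[str]], confident: Set[str]) -> float:
    """Fraction of confident glycopeptide IDs detected in every replicate.

    detections maps a replicate name to the set of IDs found in it
    (for example by MS1/XIC mining); confident is the reference set of
    MS/MS-based identifications. Both structures are illustrative.
    """
    if not confident:
        raise ValueError("no confident identifications supplied")
    in_all = set(confident)
    for found in detections.values():
        in_all &= found  # keep only IDs seen in this replicate too
    return len(in_all) / len(confident)


# Toy data: five confident IDs, three replicates (all values made up).
confident_ids = {"GP1", "GP2", "GP3", "GP4", "GP5"}
replicates = {
    "rep1": {"GP1", "GP2", "GP3", "GP4", "GP5"},
    "rep2": {"GP1", "GP2", "GP3", "GP4"},
    "rep3": {"GP1", "GP2", "GP3", "GP5"},
}
rate = reproducibility(replicates, confident_ids)
print(f"{rate:.0%} of confident glycopeptides found in all replicates")
```

In this toy case only GP1 to GP3 appear in all three replicates, so the rate is 3 out of 5.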
Subject(s)
Affinity Chromatography/methods, Glycopeptides/isolation & purification, Software, Glycomics, Glycopeptides/blood, Glycopeptides/chemistry, Glycosylation, Humans, Lectins/chemistry, Lectins/metabolism, Reproducibility of Results, Tandem Mass Spectrometry/methods
ABSTRACT
Growing evidence on the diverse biological roles of extracellular glycosylation as well as the need for quality control of protein pharmaceuticals make glycopeptide analysis both exciting and important again after a long hiatus. High-throughput O-glycosylation studies have to tackle the complexity of glycosylation as well as technical difficulties and, up to now, have yielded only limited results mostly from single enrichment experiments. In this study, we address the technical reproducibility of the characterization of the most prevalent O-glycosylation (mucin-type core 1 structures) in human serum, using a two-step lectin affinity-based workflow. Our results are based on automated glycopeptide identifications from higher-energy C-trap dissociation and electron transfer dissociation MS/MS data. Assignments meeting strict acceptance criteria served as the foundation for generating "spectral families" incorporating low-scoring MS/MS identifications, supported by accurate mass measurements and expected chromatographic retention times. We show that this approach helped to evaluate the reproducibility of the glycopeptide enrichment more reliably and also contributed to the expansion of the glycoform repertoire of already identified glycosylated sequences. The roadblocks hindering more in-depth investigations and quantitative analyses will also be discussed.
Subject(s)
Glycopeptides/chemistry, Glycopeptides/isolation & purification, Blood Proteins/chemistry, Glycosylation, Humans, Mucin-1/chemistry, Reproducibility of Results
ABSTRACT
Glycopeptides represent cross-linked structures between chemically and physically different biomolecules. Mass spectrometric analysis of O-glycopeptides may reveal the identity of the peptide, the composition of the glycan and even the connection between certain sugar units, but usually only the combination of different MS/MS techniques provides sufficient information for reliable assignment. Currently, HCD analysis followed by diagnostic sugar fragment-triggered ETD or EThcD experiments is the most promising data acquisition protocol. However, the information content of the different MS/MS data is handled separately by search engines. We are convinced that these data should be used in concert, as we demonstrate in the present study. First, glycopeptides bearing the most common glycans can be identified from EThcD and/or HCD data. Then, by searching for Y0 (the gas-phase deglycosylated peptide) in HCD spectra, the potential glycoforms of these glycopeptides can be lined up. Finally, these spectra and the corresponding EThcD data can be used to verify or discard the tentative assignments and to obtain further structural information about the glycans. We present 18 novel human urinary sialoglycan structures deciphered using this approach. To accomplish this in an automated fashion, further software development is necessary.
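The Y0 search step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the spectrum dictionary layout, the function names, the 10 ppm tolerance, and the example peptide mass are all hypothetical. For each HCD spectrum that contains a Y0 ion of a known peptide, the mass difference between precursor and peptide is reported as the candidate glycan mass.

```python
PROTON = 1.007276  # mass of a proton in Da


def mz(neutral_mass, charge):
    """m/z of a protonated species with the given neutral mass and charge."""
    return (neutral_mass + charge * PROTON) / charge


def has_peak(peaks, target_mz, tol_ppm=10.0):
    """True if any (m/z, intensity) peak lies within tol_ppm of target_mz."""
    tol = target_mz * tol_ppm / 1e6
    return any(abs(m - target_mz) <= tol for m, _ in peaks)


def candidate_glycoforms(spectra, peptide_mass, tol_ppm=10.0):
    """List (scan, glycan mass) for HCD spectra containing a Y0 ion.

    Y0 is looked for at every charge state up to the precursor charge.
    The glycan mass is the precursor neutral mass minus the peptide mass.
    """
    hits = []
    for s in spectra:
        precursor_mass = (s["precursor_mz"] - PROTON) * s["charge"]
        y0_found = any(
            has_peak(s["peaks"], mz(peptide_mass, z), tol_ppm)
            for z in range(1, s["charge"] + 1)
        )
        if y0_found:
            hits.append((s["scan"], round(precursor_mass - peptide_mass, 3)))
    return hits


# Toy input: a hypothetical peptide of 1000.5000 Da. Scan 101 carries a
# glycan of about 656.228 Da (HexNAc + Hex + NeuAc); scan 102 has no Y0.
spectra = [
    {"scan": 101, "precursor_mz": 829.3711, "charge": 2,
     "peaks": [(204.0867, 900.0), (1001.5070, 500.0)]},
    {"scan": 102, "precursor_mz": 700.3000, "charge": 2,
     "peaks": [(204.0867, 400.0), (650.1000, 100.0)]},
]
hits = candidate_glycoforms(spectra, peptide_mass=1000.5000)
print(hits)
```

Any hit found this way is only a tentative assignment; as the abstract notes, it still has to be verified or discarded against the HCD and EThcD fragmentation data.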
Subject(s)
Computational Biology/methods, Glycoproteins/chemistry, Glycoproteins/urine, Liquid Chromatography, Glycosylation, Humans, Search Engine, Tandem Mass Spectrometry
ABSTRACT
Intact glycopeptide analysis is becoming more common with developments in mass spectrometry instrumentation and fragmentation approaches. In particular, collision-based fragmentation approaches such as higher energy collisional dissociation (HCD) and radical-driven fragmentation approaches such as electron transfer dissociation (ETD) provide complementary information, but bioinformatic strategies to utilize this combined information are currently lacking. In this work we adapted a software tool, MS-Filter, to search HCD peak list files for predicted Y ions based on matched EThcD results to propose additional glycopeptide assignments. The strategy proved to be extremely powerful for O-glycopeptide data, and also of benefit for N-linked data, where it allowed rescue of low confidence results from database searching.
Subject(s)
Computational Biology/methods, Glycopeptides/urine, Protein Databases, Humans, Mass Spectrometry, Software
ABSTRACT
A very complex mixture of intact, human N- and O-glycopeptides, enriched from the tryptic digest of urinary proteins of three healthy donors using a two-step lectin affinity enrichment, was analyzed by LC-MS/MS, leading to approximately 45,000 glycopeptide EThcD spectra. Two search engines, Byonic and Protein Prospector, were used for the interpretation of the data, and N- and O-linked glycopeptides were assigned from separate searches. The identification rate was very low in all searches, even when results were combined. We therefore investigated the reasons for this, in order to help improve the identification success rate. Focusing on O-linked glycopeptides, we noticed that larger glycan oxonium ions survive activation better in EThcD than in HCD. These fragments, combined with reducing terminal Y ions, provide important information about the glycan(s) present, so we investigated whether filtering the peaklists for glycan oxonium ions indicating the presence of a tetra- or hexasaccharide structure would help to reveal all molecules containing such glycans. Our study showed that intact glycans frequently do not survive even mild supplemental activation, meaning one cannot rely on these oxonium ions exclusively. We found that ETD efficiency is still a limiting factor, and for highly glycosylated peptides, the only information revealed in EThcD was related to the glycan structures. The limited overlap of results delivered by the two search engines draws attention to the fact that automated data interpretation of O-linked glycopeptides is not even close to being solved.
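The peaklist filtering idea mentioned above can be sketched as follows. This is a hedged illustration only: the function name, the 20 ppm tolerance, and the 1% relative-intensity cutoff are assumptions, and the table lists a few commonly cited small oxonium ions rather than the tetra- or hexasaccharide fragments the study targeted. The m/z values should be checked against an authoritative source before any real use.

```python
# Commonly cited singly charged glycan oxonium ions (m/z); illustrative subset.
OXONIUM_IONS = {
    "HexNAc": 204.0867,
    "NeuAc-H2O": 274.0921,
    "NeuAc": 292.1027,
    "HexHexNAc": 366.1395,
}


def detected_oxonium_ions(peaks, tol_ppm=20.0, min_rel_intensity=0.01):
    """Names of oxonium ions present in a peak list of (m/z, intensity) pairs.

    A peak counts as a match if it is within tol_ppm of the expected m/z
    and reaches min_rel_intensity of the base peak. Both thresholds are
    arbitrary choices made for this sketch.
    """
    if not peaks:
        return []
    base = max(intensity for _, intensity in peaks)
    found = []
    for name, target in OXONIUM_IONS.items():
        tol = target * tol_ppm / 1e6
        if any(
            abs(m - target) <= tol and intensity >= min_rel_intensity * base
            for m, intensity in peaks
        ):
            found.append(name)
    return found


# Toy peak list: the 366.1500 peak is about 29 ppm off and is rejected.
peaks = [(204.0869, 1000.0), (292.1030, 300.0), (366.1500, 50.0), (500.2000, 2000.0)]
found = detected_oxonium_ions(peaks)
print(found)
```

A filter of this kind can only flag spectra in which the diagnostic ions survived; as reported above, intact glycans frequently do not, so the absence of an oxonium ion does not rule out the corresponding glycan.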
Subject(s)
Glycopeptides/urine, Tandem Mass Spectrometry/methods, Adult, Carbohydrate Sequence, Liquid Chromatography/methods, Female, Glycopeptides/analysis, Humans, Male, Middle Aged, Search Engine
ABSTRACT
New near-infrared rhodamine dyes with large Stokes shifts were developed and applied for sensitive detection of cellular pH changes and fluctuations by incorporating an additional amine group with fused rings into the rhodamine dyes to enhance the electron donating ability of amine groups and improve the spectroscopic properties of the dyes.