ABSTRACT
MOTIVATION: Including ion mobility separation (IMS) in mass spectrometry proteomics experiments improves coverage and throughput. Many IMS devices allow the experimentally derived mobility of an ion to be linked to its collisional cross-section (CCS), a highly reproducible physicochemical property that depends on the ion's mass, charge and gas-phase conformation. Known peptide ion mobilities can therefore be used to tailor acquisition methods or to refine database search results. The large space of potential peptide sequences, further enlarged by posttranslational modifications of amino acids, motivates an in silico predictor for peptide CCS. Recent studies explored the general performance of various machine-learning techniques, but treated workflow engineering as secondary. To be applicable in practice, such a tool should be generic, data driven, and easily adaptable to individual workflows for experimental design and data processing. RESULTS: We created ionmob, a Python-based framework for data preparation, training, and prediction of collisional cross-section values of peptides. It is easily customizable and includes a set of pretrained, ready-to-use models and preprocessing routines for training and inference. Using a set of ≈21 000 unique phosphorylated peptides and ≈17 000 MHC ligand sequence and charge state pairs, we expand the space of peptides that can be integrated into CCS prediction. Lastly, we investigate whether in silico predicted CCS can increase confidence in identified peptides through re-scoring, and demonstrate that predicted CCS values complement existing predictors for that task. AVAILABILITY AND IMPLEMENTATION: The Python package is available on GitHub: https://github.com/theGreatHerrLebert/ionmob.
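The link between measured mobility and CCS described in this abstract is commonly expressed through the Mason-Schamp relation. The sketch below is not ionmob code. It is a minimal, self-contained illustration that assumes nitrogen drift gas and a gas temperature of 305 K (neither is stated in the abstract); the function name `reduced_mobility_to_ccs` and the example ion are hypothetical.

```python
import math

# Physical constants in SI units
E_CHARGE = 1.602176634e-19    # elementary charge, C
K_BOLTZMANN = 1.380649e-23    # Boltzmann constant, J/K
DALTON = 1.66053906660e-27    # unified atomic mass unit, kg
LOSCHMIDT = 2.686780111e25    # gas number density at 273.15 K and 101.325 kPa, 1/m^3

# Assumed experimental conditions (not taken from the abstract)
GAS_MASS_N2 = 28.0134         # nitrogen drift gas, Da
TEMPERATURE = 305.0           # K


def reduced_mobility_to_ccs(k0_cm2_per_vs, ion_mass_da, charge,
                            gas_mass_da=GAS_MASS_N2, temperature=TEMPERATURE):
    """Mason-Schamp relation: reduced mobility K0 [cm^2/(V s)] -> CCS [A^2]."""
    # Reduced mass of the ion / drift-gas collision pair, in kg
    reduced_mass_kg = ion_mass_da * gas_mass_da / (ion_mass_da + gas_mass_da) * DALTON
    k0_si = k0_cm2_per_vs * 1e-4  # cm^2/(V s) -> m^2/(V s)
    thermal_term = math.sqrt(2.0 * math.pi / (reduced_mass_kg * K_BOLTZMANN * temperature))
    ccs_m2 = (3.0 / 16.0) * thermal_term * charge * E_CHARGE / (LOSCHMIDT * k0_si)
    return ccs_m2 * 1e20  # m^2 -> A^2


# Hypothetical doubly charged peptide ion of 1600 Da with K0 = 1.0 cm^2/(V s)
ccs = reduced_mobility_to_ccs(k0_cm2_per_vs=1.0, ion_mass_da=1600.0, charge=2)
print(f"CCS = {ccs:.1f} A^2")
```

At fixed mobility the computed CCS scales linearly with charge, which is one reason CCS predictors take the charge state as an input alongside the sequence.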
Subject(s)
Machine Learning , Peptides , Peptides/chemistry , Mass Spectrometry/methods , Amino Acid Sequence , Proteomics/methods , Ions
ABSTRACT
The Bruker timsTOF Pro is an instrument that couples trapped ion mobility spectrometry (TIMS) to high-resolution time-of-flight (TOF) mass spectrometry (MS). For proteomics, lipidomics, and metabolomics applications, the instrument is typically interfaced with a liquid chromatography (LC) system. The resulting LC-TIMS-MS data sets are, in general, several gigabytes in size and are stored in the proprietary Bruker Tims data format (TDF). The raw data can be accessed using proprietary binaries in C, C++, and Python on Windows and Linux operating systems. Here we introduce a suite of computer programs for data accession, including OpenTIMS, TimsR, and TimsPy. OpenTIMS is a C++ library capable of reading Bruker TDF files. It opens up Bruker's proprietary codebase. TimsPy and TimsR build on top of OpenTIMS, enabling swift and user-friendly data access to the raw data with Python and R. Both programs are available under a GPL3 license on all major platforms, extending the possibility to interact with timsTOF data to macOS. Additionally, OpenTIMS is capable of translating Bruker data into HDF5 files that can be easily analyzed from Python with the vaex module. OpenTIMS and TimsPy therefore provide easy and quick access to Bruker timsTOF raw data.
Subject(s)
Ion Mobility Spectrometry , Proteomics , Liquid Chromatography , Mass Spectrometry , Software
ABSTRACT
High-resolution mass spectrometry, which can resolve the fine isotopic structure of measured analytes, is becoming increasingly available. It allows for high-sensitivity spectral deconvolution, leading to fewer false-positive identifications. Analytes can be identified by comparing their theoretical isotopic signal with the observed peaks. The necessary calculations are, however, computationally demanding and lead to long processing times. For wheat (Triticum aestivum) alone, UniProt holds more than 142 000 candidate protein sequences. This number is doubled upon sequence reversal for identification FDR estimation and further multiplied by in silico digestion into peptides. The same peptide might originate from more than one protein, which reduces the overall number of sequences to be calculated, but that number remains huge. IsoSpec2 can perform these calculations fast. Compared to IsoSpec1, the algorithm is simpler, orders of magnitude faster, and offers more flexibility for the developers of algorithms for raw data analysis. It is freely available under a 2-clause BSD license, with bindings for the C++, C, R, and Python programming languages.
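The counting argument in this abstract (proteins doubled by reversal, multiplied by digestion, then reduced by shared peptides) can be sketched on toy data. This is not IsoSpec2 code. The two sequences are made up, and the digestion rule (cleave after K or R unless followed by P, no missed cleavages) is an assumption, since the abstract does not name the protease.

```python
import re


def tryptic_peptides(protein, min_len=4):
    """Assumed trypsin-like rule: cleave after K or R unless the next residue is P."""
    pieces = re.split(r"(?<=[KR])(?!P)", protein)
    return [piece for piece in pieces if len(piece) >= min_len]


def count_candidates(proteins, min_len=4):
    """Return (sequences incl. decoys, peptides before dedup, unique peptides)."""
    # Reversed sequences serve as decoys for FDR estimation, doubling the input
    targets_and_decoys = list(proteins) + [protein[::-1] for protein in proteins]
    peptides = [peptide
                for sequence in targets_and_decoys
                for peptide in tryptic_peptides(sequence, min_len)]
    return len(targets_and_decoys), len(peptides), len(set(peptides))


# Two made-up sequences that share the peptide TAYIAK
toy_proteins = ["MKTAYIAKQRSHFSRQLEER", "AGRTAYIAKSHFSRPEPK"]
n_sequences, n_peptides, n_unique = count_candidates(toy_proteins)
print(n_sequences, n_peptides, n_unique)
```

Only the unique peptides need an isotopic envelope, so deduplicating before the expensive calculation is the natural order of operations.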
ABSTRACT
Top-down mass spectrometry methods are becoming increasingly popular in the effort to describe the proteome. They rely on the fragmentation of intact protein ions inside the mass spectrometer. Among the existing fragmentation methods, electron transfer dissociation (ETD) is known for its precision and wide coverage of different cleavage sites. However, several side reactions can occur under ETD conditions, including nondissociative electron transfer and proton transfer reaction. Evaluating their extent can provide more insight into reaction kinetics as well as instrument operation. Furthermore, preferential formation of certain reaction products can reveal important structural information. To the best of our knowledge, there are currently no tools capable of tracing and analyzing the products of these reactions in a systematic way. In this Article, we present in detail masstodon: a computer program for assigning peaks and interpreting mass spectra. Besides being a general purpose tool, masstodon also offers the possibility to trace the products of reactions occurring under ETD conditions and provides insights into the parameters driving them. It is available free of charge under the GNU AGPL V3 public license.
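The two side reactions named in this abstract leave the precursor intact but shift its position in the spectrum, which is what makes them traceable. The sketch below is not masstodon code. It only illustrates the mass bookkeeping under the usual assumptions that nondissociative electron transfer adds one electron and proton transfer removes one proton, each lowering the charge by one. The function name `product_mz` and the neutral mass are hypothetical.

```python
PROTON_MASS = 1.007276467     # Da
ELECTRON_MASS = 0.000548580   # Da


def product_mz(neutral_mass, precursor_charge, n_etnod=0, n_ptr=0):
    """m/z of an unfragmented precursor after ETD side reactions.

    n_etnod: number of nondissociative electron transfers (adds an electron each)
    n_ptr:   number of proton transfer reactions (removes a proton each)
    """
    charge = precursor_charge - n_etnod - n_ptr
    if charge <= 0:
        raise ValueError("fully neutralized ions are not observed")
    mass = (neutral_mass
            + (precursor_charge - n_ptr) * PROTON_MASS
            + n_etnod * ELECTRON_MASS)
    return mass / charge


# Illustrative neutral mass (made up) for a precursor carrying 8 protons
neutral_mass = 8560.0
print("precursor:", product_mz(neutral_mass, 8))
print("one ETnoD:", product_mz(neutral_mass, 8, n_etnod=1))
print("one PTR:  ", product_mz(neutral_mass, 8, n_ptr=1))
```

Both products appear at the same reduced charge but differ by the mass of a hydrogen atom divided by that charge, so separating them requires resolving the isotopic envelope.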
Subject(s)
Apolipoprotein A-I/analysis , Mass Spectrometry/statistics & numerical data , Software , Substance P/analysis , Ubiquitin/analysis , Algorithms , Electrons
ABSTRACT
In this work, we studied the changes in high-light tolerance and photosynthetic activity in leaves of the Arabidopsis (Arabidopsis thaliana) rosette throughout the vegetative stage of growth. We implemented an image-analysis workflow to analyze the capacity of both the whole plant and individual leaves to cope with excess excitation energy by following the changes in absorbed light energy partitioning. The data show that leaf and plant age are both important factors influencing the fate of excitation energy. During the dark-to-light transition, the age of the plant affects mostly steady-state levels of photochemical and nonphotochemical quenching, leading to an increased photosynthetic performance of its leaves. The age of the leaf affects the induction kinetics of nonphotochemical quenching. These observations were confirmed using model selection procedures. We further investigated how different leaves on a rosette acclimate to high light and show that younger leaves are less prone to photoinhibition than older leaves. Our results stress that both plant and leaf age should be taken into consideration during the quantification of photosynthetic and photoprotective traits to produce repeatable and reliable results.
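The abstract does not say how absorbed light energy was partitioned. As a hedged illustration only, the sketch below uses one widely used chlorophyll-fluorescence scheme in which three quantum yields (photochemistry, regulated nonphotochemical quenching, and non-regulated losses) sum to one. The function name and the input values are hypothetical and may not match the authors' workflow.

```python
def energy_partitioning(fs, fm_prime, fm):
    """Split absorbed energy into three complementary quantum yields.

    fs:       steady-state fluorescence in the light
    fm_prime: maximal fluorescence in the light-adapted state
    fm:       maximal fluorescence in the dark-adapted state
    """
    phi_psii = 1.0 - fs / fm_prime     # photochemical yield of photosystem II
    phi_npq = fs / fm_prime - fs / fm  # regulated nonphotochemical quenching
    phi_no = fs / fm                   # non-regulated dissipation and fluorescence
    return phi_psii, phi_npq, phi_no


# Made-up fluorescence readings in arbitrary units
phi_psii, phi_npq, phi_no = energy_partitioning(fs=0.5, fm_prime=1.0, fm=2.0)
print(phi_psii, phi_npq, phi_no, phi_psii + phi_npq + phi_no)
```

Because the three terms always sum to one, a shift in one yield with plant or leaf age has to be compensated by the others, which is what makes the partitioning a convenient per-pixel summary in image analysis.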
Subject(s)
Arabidopsis/physiology , Light , Photosynthesis/physiology , Plant Leaves/physiology , Acclimatization , Arabidopsis/growth & development , Chlorophyll , Energy Metabolism , Biological Models , Time Factors
ABSTRACT
As high-resolution mass spectrometry (HRMS) becomes increasingly available, the need for software tools capable of handling more complex data is surging. The complexity of HRMS data stems partly from the presence of isotopes, which give rise to more peaks to interpret than in data from lower resolution instruments. A new generation of fine isotope calculators is on the rise. They calculate the smallest possible sets of isotopologues. However, none of these calculators lets the user specify the joint probability of the revealed envelope in advance. Instead, the user must provide a lower limit on the probability of the isotopologues of interest, that is, a minimal peak height. The choice of such a threshold is far from obvious. In particular, it is impossible to balance a priori the trade-off between the algorithm's speed and the portion of the theoretical spectrum that is revealed. We show that this leads to considerable inefficiencies. Here, we present IsoSpec: an algorithm for fast computation of isotopologues of chemical substances that can alternate between joint probability and peak height threshold. We prove that IsoSpec is optimal in terms of time complexity. Its implementation is freely available under a 2-clause BSD license, with bindings for C++, C, R, and Python.
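The difference between a peak height threshold and a joint probability target can be illustrated by brute force on a small molecule. The sketch below is not the IsoSpec algorithm: it enumerates every isotopologue, which is feasible only for tiny formulas. The isotope masses and abundances are approximate, and all function names are hypothetical.

```python
from itertools import product
from math import factorial, prod

# Approximate isotope masses (Da) and natural abundances
ISOTOPES = {
    "C": [(12.0, 0.9893), (13.0033548, 0.0107)],
    "H": [(1.00782503, 0.999885), (2.01410178, 0.000115)],
    "O": [(15.9949146, 0.99757), (16.9991317, 0.00038), (17.9991596, 0.00205)],
}


def compositions(n, k):
    """All ways to write n as an ordered sum of k non-negative integers."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest


def element_configs(symbol, n_atoms):
    """(mass, multinomial probability) for every isotope count vector of one element."""
    isotopes = ISOTOPES[symbol]
    configs = []
    for counts in compositions(n_atoms, len(isotopes)):
        coefficient = factorial(n_atoms)
        for count in counts:
            coefficient //= factorial(count)
        probability = coefficient * prod(p ** c for (_, p), c in zip(isotopes, counts))
        mass = sum(m * c for (m, _), c in zip(isotopes, counts))
        configs.append((mass, probability))
    return configs


def isotopologues(formula):
    """Brute-force list of (mass, probability), most probable first."""
    per_element = [element_configs(symbol, n) for symbol, n in formula.items()]
    peaks = [(sum(m for m, _ in combo), prod(p for _, p in combo))
             for combo in product(*per_element)]
    return sorted(peaks, key=lambda peak: peak[1], reverse=True)


def cover_joint_probability(peaks, coverage):
    """Smallest set of most probable peaks whose probabilities sum to >= coverage."""
    total, chosen = 0.0, []
    for peak in peaks:
        chosen.append(peak)
        total += peak[1]
        if total >= coverage:
            break
    return chosen


def above_peak_height(peaks, threshold):
    """All peaks whose individual probability is at least the threshold."""
    return [peak for peak in peaks if peak[1] >= threshold]


glucose = {"C": 6, "H": 12, "O": 6}
peaks = isotopologues(glucose)
covered = cover_joint_probability(peaks, 0.99)
print(len(peaks), "isotopologues in total")
print(len(covered), "peaks cover a joint probability of 0.99")
print(len(above_peak_height(peaks, 1e-4)), "peaks have probability >= 1e-4")
```

With a threshold, the covered share of the spectrum is known only after the computation; with a joint probability target, it is fixed up front, which is the usability gap the abstract describes.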