Results 1 - 20 of 59
1.
Nat Commun ; 15(1): 3956, 2024 May 10.
Article in English | MEDLINE | ID: mdl-38730277

ABSTRACT

Immunopeptidomics is crucial for immunotherapy and vaccine development. Because the generation of immunopeptides from their parent proteins does not adhere to clear-cut rules, rather than being able to use known digestion patterns, every possible protein subsequence within human leukocyte antigen (HLA) class-specific length restrictions needs to be considered during sequence database searching. This leads to an inflation of the search space and results in lower spectrum annotation rates. Peptide-spectrum match (PSM) rescoring is a powerful enhancement of standard searching that boosts the spectrum annotation performance. We analyze 302,105 unique synthesized non-tryptic peptides from the ProteomeTools project on a timsTOF-Pro to generate a ground-truth dataset containing 93,227 MS/MS spectra of 74,847 unique peptides, which is used to fine-tune the deep learning-based fragment ion intensity prediction model Prosit. We demonstrate up to 3-fold improvement in the identification of immunopeptides, as well as increased detection of immunopeptides from low input samples.


Subject(s)
Deep Learning , Peptides , Tandem Mass Spectrometry , Humans , Peptides/chemistry , Peptides/immunology , Tandem Mass Spectrometry/methods , Databases, Protein , Proteomics/methods , HLA Antigens/immunology , HLA Antigens/genetics , Software , Ions
2.
3.
Cell ; 187(7): 1801-1818.e20, 2024 Mar 28.
Article in English | MEDLINE | ID: mdl-38471500

ABSTRACT

The repertoire of modifications to bile acids and related steroidal lipids by host and microbial metabolism remains incompletely characterized. To address this knowledge gap, we created a reusable resource of tandem mass spectrometry (MS/MS) spectra by filtering 1.2 billion publicly available MS/MS spectra for bile-acid-selective ion patterns. Thousands of modifications are distributed throughout animal and human bodies as well as microbial cultures. We employed this MS/MS library to identify polyamine bile amidates, prevalent in carnivores. They are present in humans, and their levels alter with a diet change from a Mediterranean to a typical American diet. This work highlights the existence of many more bile acid modifications than previously recognized and the value of leveraging public large-scale untargeted metabolomics data to discover metabolites. The availability of a modification-centric bile acid MS/MS library will inform future studies investigating bile acid roles in health and disease.


Subject(s)
Bile Acids and Salts , Gastrointestinal Microbiome , Metabolomics , Tandem Mass Spectrometry , Animals , Humans , Bile Acids and Salts/chemistry , Metabolomics/methods , Polyamines , Tandem Mass Spectrometry/methods , Databases, Chemical
4.
bioRxiv ; 2024 Feb 08.
Article in English | MEDLINE | ID: mdl-38370723

ABSTRACT

Although untargeted mass spectrometry-based metabolomics is crucial for understanding life's molecular underpinnings, its effectiveness is hampered by low annotation rates of the generated tandem mass spectra. To address this issue, we introduce a novel data-driven approach, Biotransformation-based Annotation Method (BAM), that leverages molecular structural similarities inherent in biochemical reactions. BAM operates by applying biotransformation rules to known 'anchor' molecules, which exhibit high spectral similarity to unknown spectra, thereby hypothesizing and ranking potential structures for the corresponding 'suspect' molecule. BAM's effectiveness is demonstrated by its success in annotating suspect spectra in a global molecular network comprising hundreds of millions of spectra. BAM was able to assign correct molecular structures to 24.2% of examined anchor-suspect cases, thereby demonstrating remarkable advancement in metabolite annotation.

5.
Proteomics ; 24(8): e2300336, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38009585

ABSTRACT

Immunopeptidomics is a key technology in the discovery of targets for immunotherapy and vaccine development. However, identifying immunopeptides remains challenging due to their non-tryptic nature, which results in distinct spectral characteristics. Moreover, the absence of strict digestion rules leads to extensive search spaces, further amplified by the incorporation of somatic mutations, pathogen genomes, unannotated open reading frames, and post-translational modifications. This inflation in search space leads to an increase in random high-scoring matches, resulting in fewer identifications at a given false discovery rate. Peptide-spectrum match rescoring has emerged as a machine learning-based solution to address challenges in mass spectrometry-based immunopeptidomics data analysis. It involves post-processing unfiltered spectrum annotations to better distinguish between correct and incorrect peptide-spectrum matches. Recently, features based on predicted peptidoform properties, including fragment ion intensities, retention time, and collisional cross section, have been used to improve the accuracy and sensitivity of immunopeptide identification. In this review, we describe the diverse bioinformatics pipelines that are currently available for peptide-spectrum match rescoring and discuss how they can be used for the analysis of immunopeptidomics data. Finally, we provide insights into current and future machine learning solutions to boost immunopeptide identification.


Subject(s)
Peptides , Proteomics , Proteomics/methods , Peptides/chemistry , Mass Spectrometry/methods , Machine Learning , Protein Processing, Post-Translational
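
The abstract above refers to the number of identifications at a given false discovery rate. As a minimal sketch of how such a count is commonly obtained, the following assumes a target-decoy strategy, which the abstract does not name; the function names `q_values` and `count_accepted` and the toy scores are hypothetical and are not taken from any of the reviewed pipelines.

```python
from typing import List, Tuple


def q_values(psms: List[Tuple[float, bool]]) -> List[Tuple[float, bool, float]]:
    """Assign a q-value to each PSM given as (score, is_decoy).

    Assumption: higher scores are better and decoy matches model
    incorrect target matches (target-decoy strategy).
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    fdrs = []
    targets = decoys = 0
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR when accepting everything down to this score.
        fdrs.append(decoys / max(targets, 1))
    # q-value: the smallest FDR at this or any more permissive threshold.
    q = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        q.append(running_min)
    q.reverse()
    return [(s, d, qv) for (s, d), qv in zip(ranked, q)]


def count_accepted(psms: List[Tuple[float, bool]], threshold: float = 0.01) -> int:
    """Count target PSMs whose q-value is at or below the threshold."""
    return sum(
        1 for _, is_decoy, qv in q_values(psms) if not is_decoy and qv <= threshold
    )


# Toy example: five PSMs, one of them a decoy.
toy = [(10.0, False), (9.0, False), (8.0, True), (7.0, False), (6.0, False)]
accepted_strict = count_accepted(toy, threshold=0.01)
accepted_loose = count_accepted(toy, threshold=0.25)
```

In this sketch, rescoring would change only the `score` values; a score that separates targets from decoys better pushes decoys down the ranking, so more targets pass the same threshold.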
6.
J Chem Inf Model ; 64(7): 2515-2527, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-37870574

ABSTRACT

In the field of drug discovery, there is a substantial challenge in seeking out chemical structures that possess desirable pharmacological, toxicological, and pharmacokinetic properties. Complications arise when drugs interfere with the functioning of cardiac ion channels, leading to serious cardiovascular consequences. The discontinuation and removal of numerous approved drugs from the market or at late development stages in the pipeline due to such inhibitory effects further highlight the urgency of addressing this issue. Consequently, the early prediction of potential blockers targeting cardiac ion channels during the drug discovery process is of paramount importance. This study introduces a deep learning framework that computationally determines the cardiotoxicity associated with the voltage-gated potassium channel (hERG), the voltage-gated calcium channel (Cav1.2), and the voltage-gated sodium channel (Nav1.5) for drug candidates. The predictive capabilities of three feature representations (molecular fingerprints, descriptors, and graph-based numerical representations) are rigorously benchmarked. Additionally, a novel training and evaluation data set framework is presented, enabling predictive model training of drug off-target cardiotoxicity using a comprehensive and large curated data set covering these three cardiac ion channels. To facilitate these predictions, a robust and comprehensive small molecule cardiotoxicity prediction tool named CToxPred has been developed. It is made available as open source under the permissive MIT license at https://github.com/issararab/CToxPred.


Subject(s)
Cardiotoxicity , Ether-A-Go-Go Potassium Channels , Humans , Benchmarking , Ion Channels , Drug Discovery , Potassium Channel Blockers/pharmacology , Potassium Channel Blockers/chemistry
7.
Nat Commun ; 14(1): 8488, 2023 Dec 20.
Article in English | MEDLINE | ID: mdl-38123557

ABSTRACT

Despite the increasing availability of tandem mass spectrometry (MS/MS) community spectral libraries for untargeted metabolomics over the past decade, the majority of acquired MS/MS spectra remain uninterpreted. To further aid in interpreting unannotated spectra, we created a nearest neighbor suspect spectral library, consisting of 87,916 annotated MS/MS spectra derived from hundreds of millions of MS/MS spectra originating from published untargeted metabolomics experiments. Entries in this library, or "suspects," were derived from unannotated spectra that could be linked in a molecular network to an annotated spectrum. Annotations were propagated to unknowns based on structural relationships to reference molecules using MS/MS-based spectrum alignment. We demonstrate the broad relevance of the nearest neighbor suspect spectral library through representative examples of propagation-based annotation of acylcarnitines, bacterial and plant natural products, and drug metabolism. Our results also highlight how the library can help to better understand an Alzheimer's brain phenotype. The nearest neighbor suspect spectral library is openly available for download or for data analysis through the GNPS platform to help investigators hypothesize candidate structures for unknown MS/MS spectra in untargeted metabolomics data.


Subject(s)
Access to Information , Tandem Mass Spectrometry , Tandem Mass Spectrometry/methods , Metabolomics/methods , Gene Library , Cluster Analysis
9.
Bioinformatics ; 39(7), 2023 07 01.
Article in English | MEDLINE | ID: mdl-37369033

ABSTRACT

MOTIVATION: Driven by technological advances, the throughput and cost of mass spectrometry (MS) proteomics experiments have improved by orders of magnitude in recent decades. Spectral library searching is a common approach to annotating experimental mass spectra by matching them against large libraries of reference spectra corresponding to known peptides. An important disadvantage, however, is that only peptides included in the spectral library can be found, whereas novel peptides, such as those with unexpected post-translational modifications (PTMs), will remain unknown. Open modification searching (OMS) is an increasingly popular approach to annotate modified peptides based on partial matches against their unmodified counterparts. Unfortunately, this leads to very large search spaces and excessive runtimes, which is especially problematic considering the continuously increasing sizes of MS proteomics datasets. RESULTS: We propose an OMS algorithm, called HOMS-TC, that fully exploits parallelism in the entire pipeline of spectral library searching. We designed a new highly parallel encoding method based on the principle of hyperdimensional computing to encode mass spectral data to hypervectors while minimizing information loss. This process can be easily parallelized since each dimension is calculated independently. HOMS-TC processes the two stages of the existing cascade search in parallel and selects the most similar spectra while considering PTMs. We accelerate HOMS-TC on NVIDIA's tensor core units, which are emerging and readily available in recent graphics processing units (GPUs). Our evaluation shows that HOMS-TC is 31× faster on average than alternative search engines and provides comparable accuracy to competing search tools. AVAILABILITY AND IMPLEMENTATION: HOMS-TC is freely available under the Apache 2.0 license as an open-source software project at https://github.com/tycheyoung/homs-tc.


Subject(s)
Software , Tandem Mass Spectrometry , Tandem Mass Spectrometry/methods , Databases, Protein , Peptides/chemistry , Search Engine , Algorithms , Peptide Library
10.
J Proteome Res ; 22(6): 1639-1648, 2023 06 02.
Article in English | MEDLINE | ID: mdl-37166120

ABSTRACT

As current shotgun proteomics experiments can produce gigabytes of mass spectrometry data per hour, processing these massive data volumes has become progressively more challenging. Spectral clustering is an effective approach to speed up downstream data processing by merging highly similar spectra to minimize data redundancy. However, because state-of-the-art spectral clustering tools fail to achieve optimal runtimes, this simply moves the processing bottleneck. In this work, we present a fast spectral clustering tool, HyperSpec, based on hyperdimensional computing (HDC). HDC shows promising clustering capability while only requiring lightweight binary operations with high parallelism that can be optimized using low-level hardware architectures, making it possible to run HyperSpec on graphics processing units to achieve extremely efficient spectral clustering performance. Additionally, HyperSpec includes optimized data preprocessing modules to reduce the spectrum preprocessing time, which is a critical bottleneck during spectral clustering. Based on experiments using various mass spectrometry data sets, HyperSpec produces results with clustering quality comparable to that of state-of-the-art spectral clustering tools while achieving speedups by orders of magnitude, shortening the clustering runtime of over 21 million spectra from 4 h to only 24 min.


Subject(s)
Algorithms , Peptides , Peptides/analysis , Mass Spectrometry/methods , Proteomics/methods , Cluster Analysis
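
The two preceding abstracts describe encoding mass spectra as hypervectors using lightweight binary operations. The sketch below illustrates the general idea of hyperdimensional encoding only; it is not the HyperSpec or HOMS-TC implementation. The dimensionality, bin width, number of intensity levels, and all function names are assumptions chosen for illustration, and spectra are assumed to be non-empty lists of (m/z, intensity) pairs.

```python
import random

D = 2048  # hypervector dimensionality (assumed value for illustration)


def random_hv(rng):
    """Draw a random binary hypervector."""
    return [rng.getrandbits(1) for _ in range(D)]


def make_codebooks(num_bins, num_levels, seed=0):
    """Create ID hypervectors for m/z bins and level hypervectors for intensities."""
    rng = random.Random(seed)
    ids = [random_hv(rng) for _ in range(num_bins)]
    # Neighbouring intensity levels differ in only a few bits, so similar
    # intensities receive similar hypervectors.
    levels = [random_hv(rng)]
    order = list(range(D))
    rng.shuffle(order)
    step = D // (2 * max(num_levels - 1, 1))
    for k in range(1, num_levels):
        hv = levels[-1][:]
        for idx in order[(k - 1) * step : k * step]:
            hv[idx] ^= 1
        levels.append(hv)
    return ids, levels


def encode(peaks, ids, levels, min_mz=100.0, bin_width=1.0):
    """Bind each peak (ID XOR level) and bundle all peaks by majority vote."""
    max_int = max(i for _, i in peaks)
    counts = [0] * D
    n = 0
    for mz, intensity in peaks:
        b = int((mz - min_mz) / bin_width)
        if not 0 <= b < len(ids):
            continue  # peak outside the binned m/z range
        lvl = min(int(intensity / max_int * len(levels)), len(levels) - 1)
        for d in range(D):
            counts[d] += ids[b][d] ^ levels[lvl][d]
        n += 1
    # Each dimension is computed independently, which is what makes the
    # encoding easy to parallelize.
    return [1 if 2 * c > n else 0 for c in counts]


def hamming_similarity(u, v):
    """Fraction of dimensions in which two hypervectors agree."""
    return 1.0 - sum(x != y for x, y in zip(u, v)) / D


ids, levels = make_codebooks(num_bins=700, num_levels=8)
spec = [(150.2, 100.0), (300.7, 50.0), (455.1, 80.0)]
related = [(150.2, 100.0), (300.7, 50.0), (600.3, 80.0)]    # shares two peaks
unrelated = [(120.4, 100.0), (250.9, 50.0), (700.6, 80.0)]  # shares none

hv = encode(spec, ids, levels)
sim_same = hamming_similarity(hv, encode(spec, ids, levels))
sim_related = hamming_similarity(hv, encode(related, ids, levels))
sim_unrelated = hamming_similarity(hv, encode(unrelated, ids, levels))
```

Spectra that share peaks end up with hypervectors that agree in clearly more than half of their dimensions, while unrelated spectra agree in about half, so a Hamming-based similarity can drive clustering or library search.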
11.
J Proteome Res ; 22(2): 287-301, 2023 02 03.
Article in English | MEDLINE | ID: mdl-36626722

ABSTRACT

The Human Proteome Organization (HUPO) Proteomics Standards Initiative (PSI) has been successfully developing guidelines, data formats, and controlled vocabularies (CVs) for the proteomics community and other fields supported by mass spectrometry since its inception 20 years ago. Here we describe the general operation of the PSI, including its leadership, working groups, yearly workshops, and the document process by which proposals are thoroughly and publicly reviewed in order to be ratified as PSI standards. We briefly describe the current state of the many existing PSI standards, some of which remain the same as when originally developed, some of which have undergone subsequent revisions, and some of which have become obsolete. Then the set of proposals currently being developed is described, with an open call to the community for participation in the forging of the next generation of standards. Finally, we describe some synergies and collaborations with other organizations and look to the future in how the PSI will continue to promote the open sharing of data and thus accelerate the progress of the field of proteomics.


Subject(s)
Proteome , Proteomics , Humans , Reference Standards , Vocabulary, Controlled , Mass Spectrometry , Databases, Protein
12.
J Proteome Res ; 22(2): 625-631, 2023 02 03.
Article in English | MEDLINE | ID: mdl-36688502

ABSTRACT

spectrum_utils is a Python package for mass spectrometry data processing and visualization. Since its introduction, spectrum_utils has grown into a fundamental software solution that powers various applications in proteomics and metabolomics, ranging from spectrum preprocessing prior to spectrum identification and machine learning applications to spectrum plotting from online data repositories and assisting data analysis tasks for dozens of other projects. Here, we present updates to spectrum_utils, which include new functionality to integrate mass spectrometry community data standards, enhanced mass spectral data processing, and unified mass spectral data visualization in Python. spectrum_utils is freely available as open source at https://github.com/bittremieux/spectrum_utils.


Subject(s)
Proteomics , Software , Mass Spectrometry , Proteomics/methods , Metabolomics , Machine Learning
13.
J Proteome Res ; 22(2): 585-593, 2023 02 03.
Article in English | MEDLINE | ID: mdl-36688569

ABSTRACT

A key analysis task in mass spectrometry proteomics is matching the acquired tandem mass spectra to their originating peptides by sequence database searching or spectral library searching. Machine learning is an increasingly popular postprocessing approach to maximize the number of confident spectrum identifications that can be obtained at a given false discovery rate threshold. Here, we have integrated semisupervised machine learning in the ANN-SoLo tool, an efficient spectral library search engine that is optimized for open modification searching to identify peptides with any type of post-translational modification. We show that machine learning rescoring boosts the number of spectra that can be identified for both standard searching and open searching, and we provide insights into relevant spectrum characteristics harnessed by the machine learning model. The semisupervised machine learning functionality has now been fully integrated into ANN-SoLo, which is available as open source under the permissive Apache 2.0 license on GitHub at https://github.com/bittremieux/ANN-SoLo.


Subject(s)
Peptides , Software , Databases, Protein , Peptides/analysis , Tandem Mass Spectrometry/methods , Machine Learning , Algorithms , Peptide Library
14.
Nat Microbiol ; 7(12): 2128-2150, 2022 12.
Article in English | MEDLINE | ID: mdl-36443458

ABSTRACT

Despite advances in sequencing, lack of standardization makes comparisons across studies challenging and hampers insights into the structure and function of microbial communities across multiple habitats on a planetary scale. Here we present a multi-omics analysis of a diverse set of 880 microbial community samples collected for the Earth Microbiome Project. We include amplicon (16S, 18S, ITS) and shotgun metagenomic sequence data, and untargeted metabolomics data (liquid chromatography-tandem mass spectrometry and gas chromatography mass spectrometry). We used standardized protocols and analytical methods to characterize microbial communities, focusing on relationships and co-occurrences of microbially related metabolites and microbial taxa across environments, thus allowing us to explore diversity at extraordinary scale. In addition to a reference database for metagenomic and metabolomic data, we provide a framework for incorporating additional studies, enabling the expansion of existing knowledge in the form of an evolving community resource. We demonstrate the utility of this database by testing the hypothesis that every microbe and metabolite is everywhere but the environment selects. Our results show that metabolite diversity exhibits turnover and nestedness related to both microbial communities and the environment, whereas the relative abundances of microbially related metabolites vary and co-occur with specific microbial consortia in a habitat-specific manner. We additionally show the power of certain chemistry, in particular terpenoids, in distinguishing Earth's environments (for example, terrestrial plant surfaces and soils, freshwater and marine animal stool), as well as that of certain microbes including Conexibacter woesei (terrestrial soils), Haloquadratum walsbyi (marine deposits) and Pantoea dispersa (terrestrial plant detritus). 
This Resource provides insight into the taxa and metabolites within microbial communities from diverse habitats across Earth, informing both microbial and chemical ecology, and provides a foundation and methods for multi-omics microbiome studies of hosts and the environment.


Subject(s)
Microbiota , Animals , Microbiota/genetics , Metagenome , Metagenomics , Earth, Planet , Soil
15.
Metabolomics ; 18(12): 94, 2022 11 19.
Article in English | MEDLINE | ID: mdl-36409434

ABSTRACT

BACKGROUND: Spectral library searching is currently the most common approach for compound annotation in untargeted metabolomics. Spectral libraries applicable to liquid chromatography mass spectrometry have grown in size over the past decade to include hundreds of thousands to millions of mass spectra and tens of thousands of compounds, forming an essential knowledge base for the interpretation of metabolomics experiments. AIM OF REVIEW: We describe existing spectral library resources, highlight different strategies for compiling spectral libraries, and discuss quality considerations that should be taken into account when interpreting spectral library searching results. Finally, we describe how spectral libraries are empowering the next generation of machine learning tools in computational metabolomics, and discuss several opportunities for using increasingly accessible large spectral libraries. KEY SCIENTIFIC CONCEPTS OF REVIEW: This review focuses on the current state of spectral libraries for untargeted LC-MS/MS based metabolomics. We show how the number of entries in publicly accessible spectral libraries has increased more than 60-fold in the past eight years to aid molecular interpretation and we discuss how the role of spectral libraries in untargeted metabolomics will evolve in the near future.


Subject(s)
Metabolomics , Tandem Mass Spectrometry , Metabolomics/methods , Chromatography, Liquid/methods , Tandem Mass Spectrometry/methods
16.
Mol Cell Proteomics ; 21(12): 100425, 2022 12.
Article in English | MEDLINE | ID: mdl-36241021

ABSTRACT

The outbreak of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causative agent of coronavirus disease 2019, has led to an ongoing global pandemic since 2019. Mass spectrometry can be used to understand the molecular mechanisms of viral infection by SARS-CoV-2, for example, by determining virus-host protein-protein interactions through which SARS-CoV-2 hijacks its human hosts during infection, and to study the role of post-translational modifications. We have reanalyzed public affinity purification-mass spectrometry data using open modification searching to investigate the presence of post-translational modifications in the context of the SARS-CoV-2 virus-host protein-protein interaction network. Based on an over twofold increase in identified spectra, our detected protein interactions show a high overlap with independent mass spectrometry-based SARS-CoV-2 studies and virus-host interactions for alternative viruses, as well as previously unknown protein interactions. In addition, we identified several novel modification sites on SARS-CoV-2 proteins that we investigated in relation to their interactions with host proteins. A detailed analysis of relevant modifications, including phosphorylation, ubiquitination, and S-nitrosylation, provides important hypotheses about the functional role of these modifications during viral infection by SARS-CoV-2.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , Host Microbial Interactions , Protein Processing, Post-Translational , Protein Interaction Maps
17.
PLoS Pathog ; 18(9): e1010848, 2022 09.
Article in English | MEDLINE | ID: mdl-36149920

ABSTRACT

Aneuploidy causes system-wide disruptions in the stoichiometric balances of transcripts, proteins, and metabolites, often resulting in detrimental effects for the organism. The protozoan parasite Leishmania has an unusually high tolerance for aneuploidy, but the molecular and functional consequences for the pathogen remain poorly understood. Here, we addressed this question in vitro and present the first integrated analysis of the genome, transcriptome, proteome, and metabolome of highly aneuploid Leishmania donovani strains. Our analyses unambiguously establish that aneuploidy in Leishmania proportionally impacts the average transcript and protein abundance levels of affected chromosomes, ultimately correlating with the degree of metabolic differences between closely related aneuploid strains. This proportionality was present in both proliferative and non-proliferative in vitro promastigotes. However, as in other eukaryotes, we observed attenuation of dosage effects for protein complex subunits and, in addition, non-cytoplasmic proteins. Differentially expressed transcripts and proteins between aneuploid Leishmania strains also originated from non-aneuploid chromosomes. At the protein level, these were enriched for proteins involved in protein metabolism, such as chaperones and chaperonins, peptidases, and heat-shock proteins. In conclusion, our results further support the view that aneuploidy in Leishmania can be adaptive. Additionally, we believe that the high karyotype diversity in vitro and absence of classical transcriptional regulation make Leishmania an attractive model to study processes of protein homeostasis in the context of aneuploidy and beyond.


Subject(s)
Leishmania donovani , Proteome , Aneuploidy , Heat-Shock Proteins/genetics , Humans , Karyotype , Leishmania donovani/genetics , Peptide Hydrolases/genetics , Proteome/genetics
18.
J Am Soc Mass Spectrom ; 33(9): 1733-1744, 2022 Sep 07.
Article in English | MEDLINE | ID: mdl-35960544

ABSTRACT

Spectrum alignment of tandem mass spectrometry (MS/MS) data using the modified cosine similarity and subsequent visualization as molecular networks have been demonstrated to be a useful strategy to discover analogs of molecules from untargeted MS/MS-based metabolomics experiments. Recently, a neutral loss matching approach has been introduced as an alternative to MS/MS-based molecular networking with an implied performance advantage in finding analogs that cannot be discovered using existing MS/MS spectrum alignment strategies. To comprehensively evaluate the scoring properties of neutral loss matching, the cosine similarity, and the modified cosine similarity, similarity measures of 955,228 peptide MS/MS spectrum pairs and 10 million small molecule MS/MS spectrum pairs were compared. This comparative analysis revealed that the modified cosine similarity outperformed neutral loss matching and the cosine similarity in all cases. The data further indicated that the performance of MS/MS spectrum alignment depends on the location and type of the modification, as well as the chemical compound class of fragmented molecules.


Subject(s)
Metabolomics , Tandem Mass Spectrometry , Metabolomics/methods , Peptides , Tandem Mass Spectrometry/methods
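
The abstract above compares the cosine similarity with the modified cosine similarity. The sketch below shows the difference between the two under simplifying assumptions: spectra are non-empty lists of (m/z, intensity) pairs, precursors are singly charged, and peaks are paired with a simple greedy matching rather than an optimal assignment. The function name `spectrum_similarity`, the tolerance, and the toy spectra are hypothetical and do not reproduce the scoring code used in the study.

```python
import math


def _normalize(peaks):
    """Scale intensities so the spectrum has unit Euclidean norm."""
    norm = math.sqrt(sum(i * i for _, i in peaks))
    return [(mz, i / norm) for mz, i in peaks]


def spectrum_similarity(peaks_a, peaks_b, tol=0.02, shift=0.0):
    """Greedy cosine-style similarity between two peak lists.

    With shift == 0.0 this is the plain cosine similarity: peaks match only
    at the same m/z (within tol). With shift set to the precursor m/z
    difference (a minus b) it becomes a modified cosine: peaks may also
    match when they are offset by that difference.
    """
    a, b = _normalize(peaks_a), _normalize(peaks_b)
    candidates = []
    for ia, (mz_a, int_a) in enumerate(a):
        for ib, (mz_b, int_b) in enumerate(b):
            delta = mz_a - mz_b
            if abs(delta) <= tol or (shift != 0.0 and abs(delta - shift) <= tol):
                candidates.append((int_a * int_b, ia, ib))
    # Greedy pairing: take the highest-scoring pairs first and use each
    # peak at most once.
    candidates.sort(reverse=True)
    used_a, used_b, score = set(), set(), 0.0
    for product, ia, ib in candidates:
        if ia not in used_a and ib not in used_b:
            used_a.add(ia)
            used_b.add(ib)
            score += product
    return score


# Toy analog pair: the second molecule carries a +14 modification that
# shifts one fragment (200 -> 214) and the precursor (300 -> 314).
spec_a, precursor_a = [(100.0, 1.0), (200.0, 1.0)], 300.0
spec_b, precursor_b = [(100.0, 1.0), (214.0, 1.0)], 314.0

cosine = spectrum_similarity(spec_a, spec_b)
modified_cosine = spectrum_similarity(spec_a, spec_b, shift=precursor_a - precursor_b)
```

In this toy case the plain cosine matches only the unshifted fragment, whereas the modified cosine also pairs the fragment that carries the modification, which is why it can link analogs in a molecular network.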
19.
Nat Commun ; 13(1): 4619, 2022 08 08.
Article in English | MEDLINE | ID: mdl-35941113

ABSTRACT

The identity and biological activity of most metabolites still remain unknown. A bottleneck in the exploration of metabolite structures and pharmaceutical activities is the compound purification needed for bioactivity assignments and downstream structure elucidation. To enable bioactivity-focused compound identification from complex mixtures, we develop a scalable native metabolomics approach that integrates non-targeted liquid chromatography tandem mass spectrometry and detection of protein binding via native mass spectrometry. A native metabolomics screen for protease inhibitors from an environmental cyanobacteria community reveals 30 chymotrypsin-binding cyclodepsipeptides. Guided by the native metabolomics results, we select and purify five of these compounds for full structure elucidation via tandem mass spectrometry, chemical derivatization, and nuclear magnetic resonance spectroscopy as well as evaluation of their biological activities. These results identify rivulariapeptolides as a family of serine protease inhibitors with nanomolar potency, highlighting native metabolomics as a promising approach for drug discovery, chemical ecology, and chemical biology studies.


Subject(s)
Metabolomics , Protease Inhibitors , Chromatography, Liquid/methods , Magnetic Resonance Spectroscopy/methods , Metabolomics/methods , Protease Inhibitors/pharmacology , Tandem Mass Spectrometry/methods
20.
Nat Biotechnol ; 40(12): 1774-1779, 2022 12.
Article in English | MEDLINE | ID: mdl-35798960

ABSTRACT

Human untargeted metabolomics studies annotate only ~10% of molecular features. We introduce reference-data-driven analysis to match metabolomics tandem mass spectrometry (MS/MS) data against metadata-annotated source data as a pseudo-MS/MS reference library. Applying this approach to food source data, we show that it increases MS/MS spectral usage 5.1-fold over conventional structural MS/MS library matches and allows empirical assessment of dietary patterns from untargeted data.


Subject(s)
Metadata , Tandem Mass Spectrometry , Humans , Metabolomics/methods