Search | BVS Regional Portal

1.

AlphaPept: a modern and open framework for MS-based proteomics.

Strauss, Maximilian T; Bludau, Isabell; Zeng, Wen-Feng; Voytik, Eugenia; Ammar, Constantin; Schessner, Julia P; Ilango, Rajesh; Gill, Michelle; Meier, Florian; Willems, Sander; Mann, Matthias.

Nat Commun ; 15(1): 2168, 2024 Mar 09.

Article in English | MEDLINE | ID: mdl-38461149

ABSTRACT

In common with other omics technologies, mass spectrometry (MS)-based proteomics produces ever-increasing amounts of raw data, making efficient analysis a principal challenge. A plethora of different computational tools can process the MS data to derive peptide and protein identification and quantification. However, during the last years there has been dramatic progress in computer science, including collaboration tools that have transformed research and industry. To leverage these advances, we develop AlphaPept, a Python-based open-source framework for efficient processing of large high-resolution MS data sets. Numba for just-in-time compilation on CPU and GPU achieves hundred-fold speed improvements. AlphaPept uses the Python scientific stack of highly optimized packages, reducing the code base to domain-specific tasks while accessing the latest advances. We provide an easy on-ramp for community contributions through the concept of literate programming, implemented in Jupyter Notebooks. Large datasets can rapidly be processed as shown by the analysis of hundreds of proteomes in minutes per file, many-fold faster than acquisition. AlphaPept can be used to build automated processing pipelines with web-serving functionality and compatibility with downstream analysis tools. It provides easy access via one-click installation, a modular Python library for advanced users, and via an open GitHub repository for developers.

Subjects

Proteomics , Software , Proteomics/methods , Mass Spectrometry/methods , Proteome

2.

Full Mass Range ΦSDM Orbitrap Mass Spectrometry for DIA Proteome Analysis.

Steigerwald, Sophia; Sinha, Ankit; Fort, Kyle L; Zeng, Wen-Feng; Niu, Lili; Wichmann, Christoph; Kreutzmann, Arne; Mourad, Daniel; Aizikov, Konstantin; Grinfeld, Dmitry; Makarov, Alexander; Mann, Matthias; Meier, Florian.

Mol Cell Proteomics ; 23(2): 100713, 2024 Feb.

Article in English | MEDLINE | ID: mdl-38184013

ABSTRACT

Optimizing data-independent acquisition methods for proteomics applications often requires balancing spectral resolution and acquisition speed. Here, we describe a real-time full mass range implementation of the phase-constrained spectrum deconvolution method (ΦSDM) for Orbitrap mass spectrometry that increases mass resolving power without increasing scan time. Comparing its performance to the standard enhanced Fourier transformation signal processing revealed that the increased resolving power of ΦSDM is beneficial in areas of high peptide density and comes with a greater ability to resolve low-abundance signals. In a standard 2 h analysis of a 200 ng HeLa digest, this resulted in an increase of 16% in the number of quantified peptides. As the acquisition speed becomes even more important when using fast chromatographic gradients, we further applied ΦSDM methods to a range of shorter gradient lengths (21, 12, and 5 min). While ΦSDM improved identification rates and spectral quality in all tested gradients, it proved particularly advantageous for the 5 min gradient. Here, the number of identified protein groups and peptides increased by >15% in comparison to enhanced Fourier transformation processing. In conclusion, ΦSDM is an alternative signal processing algorithm for processing Orbitrap data that can improve spectral quality and benefit quantitative accuracy in typical proteomics experiments, especially when using short gradients.

Subjects

Proteome , Tandem Mass Spectrometry , Humans , Proteome/metabolism , Tandem Mass Spectrometry/methods , Peptides/analysis , HeLa Cells , Proteomics/methods

3.

IMBAS-MS Discovers Organ-Specific HLA Peptide Patterns in Plasma.

Wahle, Maria; Thielert, Marvin; Zwiebel, Maximilian; Skowronek, Patricia; Zeng, Wen-Feng; Mann, Matthias.

Mol Cell Proteomics ; 23(1): 100689, 2024 Jan.

Article in English | MEDLINE | ID: mdl-38043703

ABSTRACT

Distinction of non-self from self is the major task of the immune system. Immunopeptidomics studies the peptide repertoire presented by the human leukocyte antigen (HLA) protein, usually on tissues. However, HLA peptides are also bound to plasma soluble HLA (sHLA), but little is known about their origin and potential for biomarker discovery in this readily available biofluid. Currently, immunopeptidomics is hampered by complex workflows and limited sensitivity, typically requiring several mL of plasma. Here, we take advantage of recent improvements in the throughput and sensitivity of mass spectrometry (MS)-based proteomics to develop a highly sensitive, automated, and economical workflow for HLA peptide analysis, termed Immunopeptidomics by Biotinylated Antibodies and Streptavidin (IMBAS). IMBAS-MS quantifies more than 5000 HLA class I peptides from only 200 µl of plasma, in just 30 min. Our technology revealed that the plasma immunopeptidome of healthy donors is remarkably stable throughout the year and strongly correlated between individuals with overlapping HLA types. Immunopeptides originating from diverse tissues, including the brain, are proportionately represented. We conclude that sHLAs are a promising avenue for immunology and potentially for precision oncology.

Subjects

Neoplasms , Humans , Streptavidin , Precision Medicine , Histocompatibility Antigens Class I/metabolism , HLA Antigens , Histocompatibility Antigens Class II , Peptides/metabolism , Mass Spectrometry , Antibodies

4.

Robust dimethyl-based multiplex-DIA doubles single-cell proteome depth via a reference channel.

Thielert, Marvin; Itang, Ericka Cm; Ammar, Constantin; Rosenberger, Florian A; Bludau, Isabell; Schweizer, Lisa; Nordmann, Thierry M; Skowronek, Patricia; Wahle, Maria; Zeng, Wen-Feng; Zhou, Xie-Xuan; Brunner, Andreas-David; Richter, Sabrina; Levesque, Mitchell P; Theis, Fabian J; Steger, Martin; Mann, Matthias.

Mol Syst Biol ; 19(9): e11503, 2023 09 12.

Article in English | MEDLINE | ID: mdl-37602975

ABSTRACT

Single-cell proteomics aims to characterize biological function and heterogeneity at the level of proteins in an unbiased manner. It is currently limited in proteomic depth, throughput, and robustness, which we address here by a streamlined multiplexed workflow using data-independent acquisition (mDIA). We demonstrate automated and complete dimethyl labeling of bulk or single-cell samples, without losing proteomic depth. Lys-N digestion enables five-plex quantification at MS1 and MS2 level. Because the multiplexed channels are quantitatively isolated from each other, mDIA accommodates a reference channel that does not interfere with the target channels. Our algorithm RefQuant takes advantage of this and confidently quantifies twice as many proteins per single cell compared to our previous work (Brunner et al, PMID 35226415), while our workflow currently allows routine analysis of 80 single cells per day. Finally, we combined mDIA with spatial proteomics to increase the throughput of Deep Visual Proteomics seven-fold for microdissection and four-fold for MS analysis. Applying this to primary cutaneous melanoma, we discovered proteomic signatures of cells within distinct tumor microenvironments, showcasing its potential for precision oncology.

Subjects

Melanoma , Skin Neoplasms , Humans , Proteome , Proteomics , Precision Medicine , Tumor Microenvironment

5.

Quantitative multiorgan proteomics of fatal COVID-19 uncovers tissue-specific effects beyond inflammation.

Schweizer, Lisa; Schaller, Tina; Zwiebel, Maximilian; Karayel, Özge; Müller-Reif, Johannes Bruno; Zeng, Wen-Feng; Dintner, Sebastian; Nordmann, Thierry M; Hirschbühl, Klaus; Märkl, Bruno; Claus, Rainer; Mann, Matthias.

EMBO Mol Med ; 15(9): e17459, 2023 09 11.

Article in English | MEDLINE | ID: mdl-37519267

ABSTRACT

SARS-CoV-2 may directly and indirectly damage lung tissue and other host organs, but there are few system-wide, untargeted studies of these effects on the human body. Here, we developed a parallelized mass spectrometry (MS) proteomics workflow enabling the rapid, quantitative analysis of hundreds of virus-infected FFPE tissues. The first layer of response to SARS-CoV-2 in all tissues was dominated by circulating inflammatory molecules. Beyond systemic inflammation, we differentiated between systemic and true tissue-specific effects to reflect distinct COVID-19-associated damage patterns. Proteomic changes in the lungs resembled those of diffuse alveolar damage (DAD) in non-COVID-19 patients. Extensive organ-specific changes were also evident in the kidneys, liver, and lymphatic and vascular systems. Secondary inflammatory effects in the brain were related to rearrangements in neurotransmitter receptors and myelin degradation. These MS-proteomics-derived results contribute substantially to our understanding of COVID-19 pathomechanisms and suggest strategies for organ-specific therapeutic interventions.

Subjects

COVID-19 , Humans , SARS-CoV-2 , Proteomics , Inflammation , Lung

6.

pGlycoQuant with a deep residual network for quantitative glycoproteomics at intact glycopeptide level.

Kong, Siyuan; Gong, Pengyun; Zeng, Wen-Feng; Jiang, Biyun; Hou, Xinhang; Zhang, Yang; Zhao, Huanhuan; Liu, Mingqi; Yan, Guoquan; Zhou, Xinwen; Qiao, Xihua; Wu, Mengxi; Yang, Pengyuan; Liu, Chao; Cao, Weiqian.

Nat Commun ; 13(1): 7539, 2022 12 07.

Article in English | MEDLINE | ID: mdl-36477196

ABSTRACT

Large-scale intact glycopeptide identification has been advanced by software tools. However, tools for quantitative analysis still lag behind, which hinders the exploration of differential site-specific glycosylation. Here, we report pGlycoQuant, a generic tool for both primary and tandem mass spectrometry-based intact glycopeptide quantitation. pGlycoQuant advances glycopeptide matching by applying a deep learning model that reduces missing values by 19-89% compared with Byologic, MSFragger-Glyco, Skyline, and Proteome Discoverer, as well as a Match In Run algorithm for more glycopeptide coverage, greatly expanding the quantitative function of several widely used search engines, including pGlyco 2.0, pGlyco3, Byonic and MSFragger-Glyco. Further application of pGlycoQuant to the N-glycoproteomic study in three different metastatic HCC cell lines quantifies 6435 intact N-glycopeptides and, together with in vitro molecular biology experiments, illustrates site 979-core fucosylation of L1CAM as a potential regulator of HCC metastasis. We expect further applications of the freely available pGlycoQuant in glycoproteomic studies.

Subjects

Hepatocellular Carcinoma , Liver Neoplasms , Humans , Molecular Biology

7.

AlphaPeptDeep: a modular deep learning framework to predict peptide properties for proteomics.

Zeng, Wen-Feng; Zhou, Xie-Xuan; Willems, Sander; Ammar, Constantin; Wahle, Maria; Bludau, Isabell; Voytik, Eugenia; Strauss, Maximilian T; Mann, Matthias.

Nat Commun ; 13(1): 7238, 2022 11 24.

Article in English | MEDLINE | ID: mdl-36433986

ABSTRACT

Machine learning and in particular deep learning (DL) are increasingly important in mass spectrometry (MS)-based proteomics. Recent DL models can predict the retention time, ion mobility and fragment intensities of a peptide just from the amino acid sequence with good accuracy. However, DL is a very rapidly developing field with new neural network architectures frequently appearing, which are challenging to incorporate for proteomics researchers. Here we introduce AlphaPeptDeep, a modular Python framework built on the PyTorch DL library that learns and predicts the properties of peptides ( https://github.com/MannLabs/alphapeptdeep ). It features a model shop that enables non-specialists to create models in just a few lines of code. AlphaPeptDeep represents post-translational modifications in a generic manner, even if only the chemical composition is known. Extensive use of transfer learning obviates the need for large data sets to refine models for particular experimental conditions. The AlphaPeptDeep models for predicting retention time, collisional cross sections and fragment intensities are at least on par with existing tools. Additional sequence-based properties can also be predicted by AlphaPeptDeep, as demonstrated with an HLA peptide prediction model to improve HLA peptide identification for data-independent acquisition ( https://github.com/MannLabs/PeptDeep-HLA ).

Subjects

Deep Learning , Proteomics , Proteomics/methods , Peptides/chemistry , Amino Acid Sequence , Computer Neural Networks

8.

Accurate Proteoform Identification and Quantitation Using pTop 2.0.

Sun, Rui-Xiang; Wang, Rui-Min; Luo, Lan; Liu, Chao; Chi, Hao; Zeng, Wen-Feng; He, Si-Min.

Methods Mol Biol ; 2500: 105-129, 2022.

Article in English | MEDLINE | ID: mdl-35657590

ABSTRACT

The remarkable advancement of top-down proteomics in the past decade is driven by the technological development in separation, mass spectrometry (MS) instrumentation, novel fragmentation, and bioinformatics. However, the accurate identification and quantification of proteoforms, all clearly defined molecular forms of protein products from a single gene, remain a challenging computational task. This is in part due to the complicated mass spectra from intact proteoforms when compared to those from the digested peptides. Herein, pTop 2.0 is developed to fill in the gap between the large-scale complex top-down MS data and the shortage of high-accuracy bioinformatic tools. Compared with pTop 1.0, the first version, pTop 2.0 concentrates mainly on the identification of the proteoforms with unexpected modifications or a terminal truncation. The quantitation based on isotopic labeling is also a new function, which can be carried out by the convenient and user-friendly "one-key operation," integrated together with the qualitative identifications. The accuracy and running speed of pTop 2.0 are significantly improved on the test data sets. This chapter will introduce the main features, step-by-step running operations, and algorithmic developments of pTop 2.0 in order to push the identification and quantitation of intact proteoforms to a higher-accuracy level in top-down proteomics.

Subjects

Proteome , Proteomics , Mass Spectrometry , Proteome/metabolism , Proteomics/methods

9.

The structural context of posttranslational modifications at a proteome-wide scale.

Bludau, Isabell; Willems, Sander; Zeng, Wen-Feng; Strauss, Maximilian T; Hansen, Fynn M; Tanzer, Maria C; Karayel, Özge; Schulman, Brenda A; Mann, Matthias.

PLoS Biol ; 20(5): e3001636, 2022 05.

Article in English | MEDLINE | ID: mdl-35576205

ABSTRACT

The recent revolution in computational protein structure prediction provides folding models for entire proteomes, which can now be integrated with large-scale experimental data. Mass spectrometry (MS)-based proteomics has identified and quantified tens of thousands of posttranslational modifications (PTMs), most of them of uncertain functional relevance. In this study, we determine the structural context of these PTMs and investigate how this information can be leveraged to pinpoint potential regulatory sites. Our analysis uncovers global patterns of PTM occurrence across folded and intrinsically disordered regions. We found that this information can help to distinguish regulatory PTMs from those marking improperly folded proteins. Interestingly, the human proteome contains thousands of proteins that have large folded domains linked by short, disordered regions that are strongly enriched in regulatory phosphosites. These include well-known kinase activation loops that induce protein conformational changes upon phosphorylation. This regulatory mechanism appears to be widespread in kinases but also occurs in other protein families such as solute carriers. It is not limited to phosphorylation but includes ubiquitination and acetylation sites as well. Furthermore, we performed three-dimensional proximity analysis, which revealed examples of spatial coregulation of different PTM types and potential PTM crosstalk. To enable the community to build upon these first analyses, we provide tools for 3D visualization of proteomics data and PTMs as well as Python libraries for data accession and processing.

Subjects

Post-Translational Protein Processing , Proteome , Humans , Mass Spectrometry/methods , Phosphorylation , Proteomics/methods

10.

Inhibition of Connexin 36 attenuates HMGB1-mediated depressive-like behaviors induced by chronic unpredictable mild stress.

Jiang, Qian; Li, Chao-Ran; Zeng, Wen-Feng; Xu, Hui-Jing; Li, Jia-Mei; Zhang, Ting; Deng, Guang-Hui; Wang, Yun-Xia.

Brain Behav ; 12(2): e2470, 2022 02.

Article in English | MEDLINE | ID: mdl-35089644

ABSTRACT

BACKGROUND: High mobility group box 1 (HMGB1) released by neurons and microglia was demonstrated to be an important mediator in depressive-like behaviors induced by chronic unpredictable mild stress (CUMS), which could lead to the imbalance of two different metabolic approaches in kynurenine pathway (KP), thus enhancing glutamate transmission and exacerbating depressive-like behaviors. Evidence showed that HMGB1 signaling might be regulated by Connexin (Cx) 36 in inflammatory diseases of central nervous system (CNS). Our study aimed to further explore the role of Cx36 in depressive-like behaviors and its relationship with HMGB1. METHODS: After 4-week chronic stress, behavioral tests were conducted to evaluate depressive-like behaviors, including sucrose preference test (SPT), tail suspension test (TST), forced swimming test (FST), and open field test (OFT). Western blot analysis and immunofluorescence staining were used to observe the expression and location of Cx36. Enzyme-linked immunosorbent assay (ELISA) was adopted to detect the concentrations of inflammatory cytokines. The excitability and inward currents of hippocampal neurons were recorded by whole-cell patch clamping. RESULTS: The expression of Cx36 was significantly increased in hippocampal neurons of mice exposed to CUMS, while treatment with glycyrrhizinic acid (GZA) or quinine could both down-regulate Cx36 and alleviate depressive-like behaviors. The proinflammatory cytokines like HMGB1, tumor necrosis factor alpha (TNF-α), and interleukin-1β (IL-1β) were all elevated by CUMS, and application of GZA and quinine could decrease them. In addition, the enhanced excitability and inward currents of hippocampal neurons induced by lipopolysaccharide (LPS) could be reduced by either GZA or quinine. CONCLUSIONS: Inhibition of Cx36 in hippocampal neurons might attenuate HMGB1-mediated depressive-like behaviors induced by CUMS through down-regulation of the proinflammatory cytokines and reduction of the excitability and intracellular ion overload.

Subjects

HMGB1 Protein , Animals , Antidepressive Agents/pharmacology , Animal Behavior , Connexins/metabolism , Cytokines/metabolism , Depression/drug therapy , Depression/metabolism , Animal Disease Models , Hippocampus/metabolism , Mice , Quinine/metabolism , Psychological Stress/complications , Psychological Stress/metabolism , Gap Junction delta-2 Protein

11.

Precise, fast and comprehensive analysis of intact glycopeptides and modified glycans with pGlyco3.

Zeng, Wen-Feng; Cao, Wei-Qian; Liu, Ming-Qi; He, Si-Min; Yang, Peng-Yuan.

Nat Methods ; 18(12): 1515-1523, 2021 12.

Article in English | MEDLINE | ID: mdl-34824474

ABSTRACT

Great advances have been made in mass spectrometric data interpretation for intact glycopeptide analysis. However, accurate identification of intact glycopeptides and modified saccharide units at the site-specific level and with fast speed remains challenging. Here, we present a glycan-first glycopeptide search engine, pGlyco3, to comprehensively analyze intact N- and O-glycopeptides, including glycopeptides with modified saccharide units. A glycan ion-indexing algorithm developed for glycan-first search makes pGlyco3 5-40 times faster than other glycoproteomic search engines without decreasing accuracy or sensitivity. By combining electron-based dissociation spectra, pGlyco3 integrates a dynamic programming-based algorithm termed pGlycoSite for site-specific glycan localization. Our evaluation shows that the site-specific glycan localization probabilities estimated by pGlycoSite are suitable to localize site-specific glycans. With pGlyco3, we confidently identified N-glycopeptides and O-mannose glycopeptides that were extensively modified by ammonia adducts in yeast samples. The freely available pGlyco3 is an accurate and flexible tool that can be used to identify glycopeptides and modified saccharide units.

Subjects

Computational Biology/methods , Glycopeptides/chemistry , Proteome , Proteomics/methods , Algorithms , Animals , Fireflies , Glycosylation , HEK293 Cells , Humans , Mannose/chemistry , Polysaccharides/chemistry , Probability , Reproducibility of Results , Saccharomyces cerevisiae , Schizosaccharomyces , Software

12.

Deep-Learning-Derived Evaluation Metrics Enable Effective Benchmarking of Computational Tools for Phosphopeptide Identification.

Jiang, Wen; Wen, Bo; Li, Kai; Zeng, Wen-Feng; da Veiga Leprevost, Felipe; Moon, Jamie; Petyuk, Vladislav A; Edwards, Nathan J; Liu, Tao; Nesvizhskii, Alexey I; Zhang, Bing.

Mol Cell Proteomics ; 20: 100171, 2021.

Article in English | MEDLINE | ID: mdl-34737085

ABSTRACT

Tandem mass spectrometry (MS/MS)-based phosphoproteomics is a powerful technology for global phosphorylation analysis. However, applying four computational pipelines to a typical mass spectrometry (MS)-based phosphoproteomic dataset from a human cancer study, we observed a large discrepancy among the reported phosphopeptide identification and phosphosite localization results, underscoring a critical need for benchmarking. While efforts have been made to compare performance of computational pipelines using data from synthetic phosphopeptides, evaluations involving real application data have been largely limited to comparing the numbers of phosphopeptide identifications due to the lack of appropriate evaluation metrics. We investigated three deep-learning-derived features as potential evaluation metrics: phosphosite probability, Delta RT, and spectral similarity. Predicted phosphosite probability is computed by MusiteDeep, which provides high accuracy as previously reported; Delta RT is defined as the absolute retention time (RT) difference between RTs observed and predicted by AutoRT; and spectral similarity is defined as the Pearson's correlation coefficient between spectra observed and predicted by pDeep2. Using a synthetic peptide dataset, we found that both Delta RT and spectral similarity provided excellent discrimination between correct and incorrect peptide-spectrum matches (PSMs) both when incorrect PSMs involved wrong peptide sequences and even when incorrect PSMs were caused by only incorrect phosphosite localization. Based on these results, we used all three deep-learning-derived features as evaluation metrics to compare different computational pipelines on a diverse set of phosphoproteomic datasets and showed their utility in benchmarking performance of the pipelines. The benchmark metrics demonstrated in this study will enable users to select computational pipelines and parameters for routine analysis of phosphoproteomics data and will offer guidance for developers to improve computational methods.
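
The two metrics that the abstract defines explicitly, Delta RT and spectral similarity, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the assumption that observed and predicted fragment intensities are already aligned peak by peak, the choice to return 0.0 for a flat intensity vector, and all example values are assumptions made here for illustration. In the study the predicted values come from AutoRT and pDeep2; here they are plain numbers.

```python
import math


def delta_rt(observed_rt, predicted_rt):
    """Absolute difference between observed and predicted retention time."""
    return abs(observed_rt - predicted_rt)


def pearson_similarity(observed, predicted):
    """Pearson's correlation coefficient between two aligned intensity vectors."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two aligned vectors with at least two peaks")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        # Correlation is undefined for a flat vector; 0.0 is a convention assumed here.
        return 0.0
    return cov / math.sqrt(var_o * var_p)


# Hypothetical values, for illustration only.
observed_intensities = [0.10, 0.80, 0.30, 1.00]
predicted_intensities = [0.12, 0.75, 0.35, 0.95]

print(delta_rt(35.2, 33.9))
print(pearson_similarity(observed_intensities, predicted_intensities))
```

A small Delta RT and a spectral similarity close to 1 would both point toward a correct peptide-spectrum match under these definitions; any decision threshold would have to be chosen per dataset.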

Subjects

Deep Learning , Phosphopeptides/analysis , Animals , Benchmarking , Cell Line , Humans , Mice , Phosphorylation , Proteomics/methods

13.

Artificial intelligence for proteomics and biomarker discovery.

Mann, Matthias; Kumar, Chanchal; Zeng, Wen-Feng; Strauss, Maximilian T.

Cell Syst ; 12(8): 759-770, 2021 08 18.

Article in English | MEDLINE | ID: mdl-34411543

ABSTRACT

There is an avalanche of biomedical data generation and a parallel expansion in computational capabilities to analyze and make sense of these data. Starting with genome sequencing and widely employed deep sequencing technologies, these trends have now taken hold in all omics disciplines and increasingly call for multi-omics integration as well as data interpretation by artificial intelligence technologies. Here, we focus on mass spectrometry (MS)-based proteomics and describe how machine learning and, in particular, deep learning now predicts experimental peptide measurements from amino acid sequences alone. This will dramatically improve the quality and reliability of analytical workflows because experimental results should agree with predictions in a multi-dimensional data landscape. Machine learning has also become central to biomarker discovery from proteomics data, which now starts to outperform existing best-in-class assays. Finally, we discuss model transparency and explainability and data privacy that are required to deploy MS-based biomarkers in clinical settings.

Subjects

Artificial Intelligence , Proteomics , Biomarkers/analysis , Mass Spectrometry/methods , Proteomics/methods , Reproducibility of Results

14.

pDeepXL: MS/MS Spectrum Prediction for Cross-Linked Peptide Pairs by Deep Learning.

Chen, Zhen-Lin; Mao, Peng-Zhi; Zeng, Wen-Feng; Chi, Hao; He, Si-Min.

J Proteome Res ; 20(5): 2570-2582, 2021 05 07.

Article in English | MEDLINE | ID: mdl-33821641

ABSTRACT

In cross-linking mass spectrometry, the identification of cross-linked peptide pairs heavily relies on the ability of a database search engine to measure the similarities between experimental and theoretical MS/MS spectra. However, the lack of accurate ion intensities in theoretical spectra impairs the performance of search engines, in particular, on proteome scales. Here we introduce pDeepXL, a deep neural network to predict MS/MS spectra of cross-linked peptide pairs. To train pDeepXL, we used the transfer-learning technique because it facilitated the training with limited benchmark data of cross-linked peptide pairs. Test results on more than ten data sets showed that pDeepXL accurately predicted the spectra of both noncleavable DSS/BS3/Leiker cross-linked peptide pairs (>80% of predicted spectra have Pearson's r values higher than 0.9) and cleavable DSSO/DSBU cross-linked peptide pairs (>75% of predicted spectra have Pearson's r values higher than 0.9). pDeepXL also achieved accurate prediction on unseen data sets using an online fine-tuning technique. Lastly, integrating pDeepXL into a database search engine increased the number of identified cross-link spectra by 18% on average.

Subjects

Deep Learning , Tandem Mass Spectrometry , Algorithms , Computer Neural Networks , Peptides , Proteome

15.

pDeep3: Toward More Accurate Spectrum Prediction with Fast Few-Shot Learning.

Tarn, Ching; Zeng, Wen-Feng.

Anal Chem ; 93(14): 5815-5822, 2021 04 13.

Article in English | MEDLINE | ID: mdl-33797898

ABSTRACT

Spectrum prediction using deep learning has attracted a lot of attention in recent years. Although existing deep learning methods have dramatically increased the prediction accuracy, there is still considerable room for improvement, which is presently limited by differences in fragmentation types or instrument settings. In this work, we use the few-shot learning method to fit the data online to make up for this shortcoming. The method is evaluated using ten data sets, where the instruments include Velos, QE, Lumos, and Sciex, with collision energies being differently set. Experimental results show that few-shot learning can achieve higher prediction accuracy with almost negligible computing resources. For example, on the data set from an untrained instrument, Sciex-6600, within about 10 s, the prediction accuracy is increased from 69.7% to 86.4%; on the CID (collision-induced dissociation) data set, the prediction accuracy of the model trained on HCD (higher energy collision dissociation) spectra is increased from 48.0% to 83.9%. It is also shown that the method is not critical to data quality and is sufficiently efficient to fill the accuracy gap. The source code of pDeep3 is available at http://pfind.ict.ac.cn/software/pdeep3.

16.

N6-methyladenosine Regulator-Mediated Immune Genes Identify Breast Cancer Immune Subtypes and Predict Immunotherapy Efficacy.

Zhang, Meng-Meng; Lin, Yi-Lin; Zeng, Wen-Feng; Li, Yang; Yang, Yang; Liu, Miao; Ye, Ying-Jiang; Jiang, Ke-Wei; Wang, Shu; Wang, Shan.

Front Genet ; 12: 790888, 2021.

Article in English | MEDLINE | ID: mdl-34976022

ABSTRACT

Breast cancer (BRCA) is a heterogeneous malignancy closely related to the tumor microenvironment (TME) cell infiltration. N6-methyladenosine (m6A) modification of mRNA plays a crucial role in regulating the immune microenvironment of BRCA. Immunotherapy represents a paradigm shift in BRCA treatment; however, lack of an appropriate approach for treatment evaluation is a significant issue in this field. In this study, we attempted to establish a prognostic signature of BRCA based on m6A-related immune genes and to investigate the potential association between prognosis and immunotherapy. We comprehensively evaluated the m6A modification patterns of BRCA tissues and non-tumor tissues from The Cancer Genome Atlas and the modification patterns with TME cell-infiltrating characteristics. Overall, 1,977 TME-related genes were identified in the literature. Based on LASSO and Cox regression analyses, the m6A-related immune score (m6A-IS) was established to characterize the TME of BRCA and predict prognosis and efficacy associated with immunotherapy. We developed an m6A-IS to effectively predict immune infiltration and the prognosis of patients with BRCA. The prognostic score model showed robust predictive performance in both the training and validation cohorts. The low-m6A-IS group was characterized by enhanced antigen presentation and improved immune checkpoint expression, further indicating sensitivity to immunotherapy. Compared with the patients in the high-score group, the overall survival rate after treatment in the low-score group was significantly higher in the testing and validation cohorts. We constructed an m6A-IS system to examine the ability of the m6A signature to predict the infiltration of immune cells of the TME in BRCA, and the m6A-IS system acted as an independent prognostic biomarker that predicts the response of patients with BRCA to immunotherapy.

17.

Deep Learning in Proteomics.

Wen, Bo; Zeng, Wen-Feng; Liao, Yuxing; Shi, Zhiao; Savage, Sara R; Jiang, Wen; Zhang, Bing.

Proteomics ; 20(21-22): e1900335, 2020 11.

Article in English | MEDLINE | ID: mdl-32939979

ABSTRACT

Proteomics, the study of all the proteins in biological systems, is becoming a data-rich science. Protein sequences and structures are comprehensively catalogued in online databases. With recent advancements in tandem mass spectrometry (MS) technology, protein expression and post-translational modifications (PTMs) can be studied in a variety of biological systems at the global scale. Sophisticated computational algorithms are needed to translate the vast amount of data into novel biological insights. Deep learning automatically extracts data representations at high levels of abstraction from data, and it thrives in data-rich scientific research domains. Here, a comprehensive overview of deep learning applications in proteomics, including retention time prediction, MS/MS spectrum prediction, de novo peptide sequencing, PTM prediction, major histocompatibility complex-peptide binding prediction, and protein structure prediction, is provided. Limitations and the future directions of deep learning in proteomics are also discussed. This review will provide readers an overview of deep learning and how it can be used to analyze proteomics data.

Subjects

Deep Learning , Proteomics , Algorithms , Protein Processing, Post-Translational , Tandem Mass Spectrometry

18.

A Deep Learning-Based Tumor Classifier Directly Using MS Raw Data.

Dong, Hao; Liu, Yi; Zeng, Wen-Feng; Shu, Kunxian; Zhu, Yunping; Chang, Cheng.

Proteomics ; 20(21-22): e1900344, 2020 11.

Article in English | MEDLINE | ID: mdl-32643271

ABSTRACT

Since the launch of the Chinese Human Proteome Project (CNHPP) and the Clinical Proteomic Tumor Analysis Consortium (CPTAC), large-scale mass spectrometry (MS)-based proteomic profiling of different kinds of human tumor samples has provided a huge amount of valuable data for both basic and clinical researchers. Accurate prediction of tumor and non-tumor samples, as well as of tumor types, has become a key step in biological and medical research, such as biomarker discovery, diagnosis, and disease monitoring. The traditional MS-based classification strategy mainly depends on the identification and quantification results of MS data, which has some inherent limitations, such as the low identification rate of MS data. Here, a deep learning-based tumor classifier directly using MS raw data is proposed, which is independent of the identification and quantification results of MS data. First, potential precursors with their intensities and retention times are detected and extracted from the MS data as input. Then, a deep learning-based classifier is trained, which can accurately distinguish between tumor and non-tumor samples. Finally, it is demonstrated that the deep learning-based classifier performs well compared with other machine learning methods and may help researchers find potential biomarkers that are likely to be missed by the traditional strategy.

Subjects

Deep Learning , Neoplasms , Proteomics , Humans , Mass Spectrometry , Proteome

19.

[Progress of minimally invasive treatment about fragility fractures of pelvis].

Zeng, Wen-Feng; Li, Yi-Nan; Wang, Ce.

Zhongguo Gu Shang ; 32(9): 872-875, 2019 Sep 25.

Article in Chinese | MEDLINE | ID: mdl-31615189

ABSTRACT

With the serious aging of the population, the incidence of fragility fractures of the pelvis (FFPs) has gradually increased and has become a public health problem affecting the quality of life of the elderly. When surgical treatment is chosen, the procedure should be as minimally invasive as possible and should avoid surgical complications. In recent years, different techniques for percutaneous or less invasive fixation of the posterior pelvic ring have been developed. The advantages and limitations of these techniques are presented: sacroplasty, iliosacral screw osteosynthesis, cement augmentation, transiliac internal fixation, transsacral osteosynthesis, and lumbopelvic fixation. The purpose of this paper is to review the classification and minimally invasive treatment of FFPs.

Subjects

Fractures, Bone , Pelvic Bones , Aged , Bone Screws , Fracture Fixation, Internal , Fractures, Bone/surgery , Humans , Pelvis

20.

pNovo 3: precise de novo peptide sequencing using a learning-to-rank framework.

Yang, Hao; Chi, Hao; Zeng, Wen-Feng; Zhou, Wen-Jing; He, Si-Min.

Bioinformatics ; 35(14): i183-i190, 2019 07 15.

Article in English | MEDLINE | ID: mdl-31510687

ABSTRACT

MOTIVATION: De novo peptide sequencing based on tandem mass spectrometry data is the key technology of shotgun proteomics for identifying peptides without any database and assembling unknown proteins. However, owing to the low ion coverage in tandem mass spectra, the order of certain consecutive amino acids cannot be determined if all of their supporting fragment ions are missing, which results in the low precision of de novo sequencing. RESULTS: To solve this problem, we developed pNovo 3, which uses a learning-to-rank framework to distinguish similar peptide candidates for each spectrum. Three metrics for measuring the similarity between each experimental spectrum and its corresponding theoretical spectrum were used as important features, in which the theoretical spectra can be precisely predicted by the pDeep algorithm using deep learning. On seven benchmark datasets from six diverse species, pNovo 3 recalled 29-102% more correct spectra, and the precision was 11-89% higher than that of three other state-of-the-art de novo sequencing algorithms. Furthermore, compared with the newly developed DeepNovo, which also uses a deep learning approach, pNovo 3 still identified 21-50% more spectra on the nine datasets used in the study of DeepNovo. In summary, the deep learning and learning-to-rank techniques implemented in pNovo 3 significantly improve the precision of de novo sequencing, and such a machine learning framework is worth extending to other related research fields to distinguish similar sequences. AVAILABILITY AND IMPLEMENTATION: pNovo 3 can be freely downloaded from http://pfind.ict.ac.cn/software/pNovo/index.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
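
The abstract does not name the three similarity metrics, so the following is only a minimal sketch of one plausible spectrum-similarity feature, cosine similarity, used here to order candidate peptides for a single spectrum. It is not the pNovo 3 learning-to-rank model and not the pDeep predictor: the function names, the dictionary representation of a spectrum (fragment label mapped to intensity), and the example peptides are all hypothetical.

```python
import math


def cosine_similarity(experimental, theoretical):
    """Cosine similarity between two spectra given as {fragment_label: intensity}.

    Assumption: both spectra are already matched on fragment labels;
    real tools match peaks by m/z within a tolerance instead.
    """
    shared = set(experimental) & set(theoretical)
    dot = sum(experimental[k] * theoretical[k] for k in shared)
    norm_e = math.sqrt(sum(v * v for v in experimental.values()))
    norm_t = math.sqrt(sum(v * v for v in theoretical.values()))
    if norm_e == 0.0 or norm_t == 0.0:
        return 0.0  # an empty spectrum carries no evidence
    return dot / (norm_e * norm_t)


def rank_candidates(experimental, candidates):
    """Return (score, peptide) pairs sorted from most to least similar.

    `candidates` maps a hypothetical peptide string to its predicted
    (theoretical) spectrum. Ranking by a single score stands in for the
    learned ranking function, which would combine several features.
    """
    scored = [
        (cosine_similarity(experimental, spectrum), peptide)
        for peptide, spectrum in candidates.items()
    ]
    return sorted(scored, reverse=True)
```

In a learning-to-rank setting, a score like this would be one input feature among several, and the ranking model would learn how to weight them from labeled spectra.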

Subjects

Peptides , Proteomics , Sequence Analysis, Protein , Algorithms , Software , Tandem Mass Spectrometry
