1.
Immunity ; 53(5): 1108-1122.e5, 2020 11 17.
Article in English | MEDLINE | ID: mdl-33128875

ABSTRACT

The coronavirus disease 2019 (COVID-19) pandemic is a global public health crisis. However, little is known about the pathogenesis and biomarkers of COVID-19. Here, we profiled host responses to COVID-19 by performing plasma proteomics of a cohort of COVID-19 patients, including non-survivors and survivors recovered from mild or severe symptoms, and uncovered numerous COVID-19-associated alterations of plasma proteins. We developed a machine-learning-based pipeline to identify 11 proteins as biomarkers and a set of biomarker combinations, which were validated by an independent cohort and accurately distinguished and predicted COVID-19 outcomes. Some of the biomarkers were further validated by enzyme-linked immunosorbent assay (ELISA) using a larger cohort. These markedly altered proteins, including the biomarkers, mediate pathophysiological pathways, such as immune or inflammatory responses, platelet degranulation and coagulation, and metabolism, that likely contribute to the pathogenesis. Our findings provide valuable knowledge about COVID-19 biomarkers and shed light on the pathogenesis and potential therapeutic targets of COVID-19.


Subject(s)
Coronavirus Infections/blood , Coronavirus Infections/pathology , Plasma/metabolism , Pneumonia, Viral/blood , Pneumonia, Viral/pathology , Adult , Aged , Aged, 80 and over , Betacoronavirus , Biomarkers/blood , Blood Proteins/metabolism , COVID-19 , Coronavirus Infections/classification , Coronavirus Infections/metabolism , Female , Humans , Machine Learning , Male , Middle Aged , Pandemics/classification , Pneumonia, Viral/classification , Pneumonia, Viral/metabolism , Proteomics , Reproducibility of Results , SARS-CoV-2
2.
Nucleic Acids Res ; 51(W1): W243-W250, 2023 07 05.
Article in English | MEDLINE | ID: mdl-37158278

ABSTRACT

Protein phosphorylation, catalyzed by protein kinases (PKs), is one of the most important post-translational modifications (PTMs) and is involved in regulating almost all biological processes. Here, we report an updated server, Group-based Prediction System (GPS) 6.0, for prediction of PK-specific phosphorylation sites (p-sites) in eukaryotes. First, we pre-trained a general model using penalized logistic regression (PLR), deep neural network (DNN), and Light Gradient Boosting Machine (LightGBM) on 490 762 non-redundant p-sites in 71 407 proteins. Then, transfer learning was conducted to obtain 577 PK-specific predictors at the group, family and single PK levels, using a well-curated data set of 30 043 known site-specific kinase-substrate relations in 7041 proteins. Together with the evolutionary information, GPS 6.0 could hierarchically predict PK-specific p-sites for 44 046 PKs in 185 species. Besides the basic statistics, we also offered the knowledge from 22 public resources to annotate the prediction results, including the experimental evidence, physical interactions, sequence logos, and p-sites in sequences and 3D structures. The GPS 6.0 server is freely available at https://gps.biocuckoo.cn. We believe that GPS 6.0 could be a highly useful service for further analysis of phosphorylation.


Subject(s)
Computational Biology , Proteins , Software , Phosphorylation , Protein Kinases/chemistry , Protein Kinases/metabolism , Protein Processing, Post-Translational , Proteins/chemistry , Proteins/metabolism , Computational Biology/instrumentation , Computational Biology/methods , Internet
3.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35037020

ABSTRACT

As an important post-translational modification, lysine ubiquitination participates in numerous biological processes and is involved in human diseases, whereas the site specificity of ubiquitination is mainly decided by ubiquitin-protein ligases (E3s). Although numerous ubiquitination predictors have been developed, computational prediction of E3-specific ubiquitination sites is still a great challenge. Here, we carefully reviewed the existing tools for the prediction of general ubiquitination sites. Also, we developed a tool named GPS-Uber for the prediction of general and E3-specific ubiquitination sites. From the literature, we manually collected 1311 experimentally identified site-specific E3-substrate relations, which were classified into different clusters based on corresponding E3s at different levels. To predict general ubiquitination sites, we integrated 10 types of sequence and structure features, as well as three types of algorithms including penalized logistic regression, deep neural network and convolutional neural network. Compared with other existing tools, the general model in GPS-Uber exhibited a highly competitive accuracy, with an area under the curve (AUC) value of 0.7649. Then, transfer learning was adopted for each E3 cluster to construct E3-specific models, and in total 112 individual E3-specific predictors were implemented. Using GPS-Uber, we conducted a systematic prediction of human cancer-associated ubiquitination events, which could be helpful for further experimental consideration. GPS-Uber will be regularly updated, and its online service is free for academic research at http://gpsuber.biocuckoo.cn/.


Subject(s)
Lysine , Ubiquitin-Protein Ligases , Algorithms , Humans , Lysine/metabolism , Protein Processing, Post-Translational , Ubiquitin-Protein Ligases/chemistry , Ubiquitin-Protein Ligases/genetics , Ubiquitin-Protein Ligases/metabolism , Ubiquitination
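
The AUC of 0.7649 quoted for the general GPS-Uber model can be read as the probability that a randomly chosen true site receives a higher score than a randomly chosen non-site. The following is a minimal sketch of that rank-based definition, not the authors' evaluation code; the function name `auc` and the example scores are hypothetical.

```python
from itertools import product


def auc(pos_scores, neg_scores):
    """Rank-based AUC: share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p, n in product(pos_scores, neg_scores):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical predictor scores for three true sites and three non-sites.
positives = [0.9, 0.8, 0.4]
negatives = [0.7, 0.3, 0.2]
print(round(auc(positives, negatives), 4))  # 0.8889 (8 of 9 pairs ordered correctly)
```

The pairwise loop is quadratic in the number of sites, which is fine for illustration; a sort-based computation would be the usual choice for large benchmark sets.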
4.
Nucleic Acids Res ; 50(D1): D451-D459, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34581824

ABSTRACT

Here, we reported the compendium of protein lysine modifications (CPLM 4.0, http://cplm.biocuckoo.cn/), a data resource for various post-translational modifications (PTMs) that specifically occur at the side-chain amino group of lysine residues in proteins. From the literature and public databases, we collected 450 378 protein lysine modification (PLM) events, and combined them with the existing data of our previously developed protein lysine modification database (PLMD 3.0). In total, CPLM 4.0 contained 592 606 experimentally identified modification events on 463 156 unique lysine residues of 105 673 proteins for up to 29 types of PLMs across 219 species. Furthermore, we carefully annotated the data using the knowledge from 102 additional resources that covered 13 aspects, including variation and mutation, disease-associated information, protein-protein interaction, protein functional annotation, DNA & RNA element, protein structure, chemical-target relation, mRNA expression, protein expression/proteomics, subcellular localization, biological pathway annotation, functional domain annotation, and physicochemical property. Compared to PLMD 3.0 and other existing resources, CPLM 4.0 achieved a >2-fold increase in collection of PLM events, with a data volume of ∼45 GB. We anticipate that CPLM 4.0 can serve as a more useful database for further study of PLMs.


Subject(s)
Databases, Protein , Lysine/metabolism , Protein Processing, Post-Translational , Proteins/metabolism , Software , Acetylation , Animals , Bacteria/genetics , Bacteria/metabolism , Biotinylation , Humans , Hydroxylation , Internet , Lysine/chemistry , Methylation , Models, Molecular , Molecular Sequence Annotation , Mutation , Phosphorylation , Plants/genetics , Plants/metabolism , Protein Binding , Protein Conformation , Protein Interaction Mapping , Proteins/chemistry , Proteins/genetics , RNA, Messenger/genetics , RNA, Messenger/metabolism , Ubiquitination
5.
Nucleic Acids Res ; 50(W1): W405-W411, 2022 07 05.
Article in English | MEDLINE | ID: mdl-35670661

ABSTRACT

Recent high-throughput omics techniques have produced a large amount of biological data. Visualization of big omics data is essential to answer a wide range of biological problems. As a concise but comprehensive strategy, a heatmap can analyze and visualize high-dimensional and heterogeneous biomolecular expression data as an attractive graphic. In 2014, we developed a stand-alone software package, Heat map Illustrator (HemI 1.0), which implemented three clustering methods and seven distance metrics for heatmap illustration. Here, we significantly improved HemI 1.0 and released the online service of HemI 2.0, in which 7 clustering methods and 22 types of distance metrics were implemented. In HemI 2.0, the clustering results and publication-quality heatmaps can be exported directly. For an in-depth analysis of the data, we further added an option of enrichment analysis for 12 model organisms, with 15 types of functional annotations. The enrichment results can be visualized in five idioms, including bubble chart, bar graph, coxcomb chart, pie chart and word cloud. We anticipate that HemI 2.0 can be a helpful web server for visualization of biomolecular expression data, as well as the additional enrichment analysis. HemI 2.0 is freely available for all users at: https://hemi.biocuckoo.org/.


Subject(s)
Cluster Analysis , Data Analysis , Data Visualization , Internet , Software , Big Data , Animals , Models, Animal , Gene Expression Profiling/methods
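
Heatmap clustering of the kind HemI performs starts from a pairwise distance between expression profiles. The sketch below illustrates two widely used metrics, Euclidean distance and Pearson correlation distance. It assumes nothing about which 22 metrics HemI 2.0 actually offers or how it computes them; all function names and the toy profiles are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length expression profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson_distance(a, b):
    """1 - Pearson correlation: 0 for perfectly correlated profiles, 2 for anti-correlated."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return 1.0 - cov / math.sqrt(var_a * var_b)


def distance_matrix(profiles, metric):
    """Full pairwise distance matrix, the usual input to hierarchical clustering."""
    return [[metric(p, q) for q in profiles] for p in profiles]


# Toy expression profiles for three genes across three samples.
genes = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 2.0, 1.0]]
for row in distance_matrix(genes, pearson_distance):
    print([round(d, 3) for d in row])
```

The two metrics answer different questions: Euclidean distance separates genes by absolute expression level, while correlation distance groups genes whose profiles rise and fall together regardless of magnitude.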
6.
Brief Bioinform ; 22(2): 1836-1847, 2021 03 22.
Article in English | MEDLINE | ID: mdl-32248222

ABSTRACT

As an important reversible lipid modification, S-palmitoylation mainly occurs at specific cysteine residues in proteins, participates in regulating various biological processes and is associated with human diseases. Besides experimental assays, computational prediction of S-palmitoylation sites can efficiently generate helpful candidates for further experimental consideration. Here, we reviewed the current progress in the development of S-palmitoylation site predictors, as well as training data sets, informative features and algorithms used in these tools. Then, we compiled a benchmark data set containing 3098 known S-palmitoylation sites identified from small- or large-scale experiments, and developed a new method named data quality discrimination (DQD) to distinguish data quality weights (DQWs) between the two types of the sites. Besides DQD and our previous methods, we encoded sequence similarity values into images, constructed a deep learning framework of convolutional neural networks (CNNs) and developed a novel algorithm of graphic presentation system (GPS) 6.0. We further integrated nine additional types of sequence-based and structural features, implemented parallel CNNs (pCNNs) and designed a new predictor called GPS-Palm. Compared with other existing tools, GPS-Palm showed a >31.3% improvement of the area under the curve (AUC) value (0.855 versus 0.651) for general prediction of S-palmitoylation sites. We also produced two species-specific predictors, with corresponding AUC values of 0.900 and 0.897 for predicting human- and mouse-specific sites, respectively. GPS-Palm is free for academic research at http://gpspalm.biocuckoo.cn/.


Subject(s)
Computer Graphics , Deep Learning , Lipoylation , Proteins/chemistry , Algorithms , Animals , Computational Biology/methods , Humans , Mice , Software
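
The ">31.3% improvement" reported for GPS-Palm is a relative gain over the comparison AUC, not a difference in percentage points. A short sketch of that arithmetic, using only the two AUC values given in the abstract (the function name is hypothetical):

```python
def relative_improvement(new, baseline):
    """Relative gain of `new` over `baseline`, as a fraction of the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (new - baseline) / baseline


# AUC values quoted in the GPS-Palm abstract: 0.855 versus 0.651.
gain = relative_improvement(0.855, 0.651)
print(f"{gain:.1%}")  # 31.3%
```

The absolute difference between the two AUC values is 0.204, so the same comparison could also be stated as a gain of about 20 percentage points.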
7.
Nucleic Acids Res ; 48(D1): D288-D295, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31691822

ABSTRACT

Here, we presented an integrative database named DrLLPS (http://llps.biocuckoo.cn/) for proteins involved in liquid-liquid phase separation (LLPS), which is a ubiquitous and crucial mechanism for spatiotemporal organization of various biochemical reactions, by creating membraneless organelles (MLOs) in eukaryotic cells. From the literature, we manually collected 150 scaffold proteins that are drivers of LLPS, 987 regulators that contribute to modulating LLPS, and 8148 potential client proteins that might be dispensable for the formation of MLOs, which were then categorized into 40 biomolecular condensates. We searched potential orthologs of these known proteins, and in total DrLLPS contained 437 887 known and potential LLPS-associated proteins in 164 eukaryotes. Furthermore, we carefully annotated LLPS-associated proteins in eight model organisms, by using the knowledge integrated from 110 widely used resources that covered 16 aspects, including protein disordered regions, domain annotations, post-translational modifications (PTMs), genetic variations, cancer mutations, molecular interactions, disease-associated information, drug-target relations, physicochemical property, protein functional annotations, protein expressions/proteomics, protein 3D structures, subcellular localizations, mRNA expressions, DNA & RNA elements, and DNA methylations. We anticipate DrLLPS can serve as a helpful resource for further analysis of LLPS.


Subject(s)
Databases, Factual , Eukaryota , Proteins/chemistry , Proteins/metabolism , Genome , Intrinsically Disordered Proteins/chemistry , Intrinsically Disordered Proteins/metabolism , Organelles , Protein Processing, Post-Translational , User-Computer Interface
8.
Nucleic Acids Res ; 47(D1): D344-D350, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30380109

ABSTRACT

Here, we described the updated database iEKPD 2.0 (http://iekpd.biocuckoo.org) for eukaryotic protein kinases (PKs), protein phosphatases (PPs) and proteins containing phosphoprotein-binding domains (PPBDs), which are key molecules responsible for phosphorylation-dependent signalling networks and participate in the regulation of almost all biological processes and pathways. In total, iEKPD 2.0 contained 197 348 phosphorylation regulators, including 109 912 PKs, 23 294 PPs and 68 748 PPBD-containing proteins in 164 eukaryotic species. In particular, we provided rich annotations for the regulators of eight model organisms, especially humans, by compiling and integrating the knowledge from 100 widely used public databases that cover 13 aspects, including cancer mutations, genetic variations, disease-associated information, mRNA expression, DNA & RNA elements, DNA methylation, molecular interactions, drug-target relations, protein 3D structures, post-translational modifications, protein expressions/proteomics, subcellular localizations and protein functional annotations. Compared with our previously developed EKPD 1.0 (∼0.5 GB), iEKPD 2.0 contains ∼99.8 GB of data with an ∼200-fold increase in data volume. We anticipate that iEKPD 2.0 represents a more useful resource for further study of phosphorylation regulators.


Subject(s)
Databases, Protein , Eukaryota/genetics , Molecular Sequence Annotation , Phosphoprotein Phosphatases/genetics , Protein Kinases/genetics , Animals , Data Collection , Humans , Phosphoproteins/metabolism , Phosphorylation , Protein Domains/genetics , Protein Processing, Post-Translational , User-Computer Interface
9.
Comput Struct Biotechnol J ; 23: 2507-2515, 2024 Dec.
Article in English | MEDLINE | ID: mdl-38974887

ABSTRACT

The incidence of early-onset colorectal cancer (EOCRC) has increased significantly worldwide. Uncovering biomarkers that are unique to EOCRC is of great importance to facilitate the prevention and detection of this growing cancer subtype. Although efforts have been made to curate CRC data, there is no integrated platform that gives access to data specifically related to young CRC patients. Here, we constructed a user-friendly open integrated resource called CRCDB (URL: http://crcdb-hust.com) which contains multi-omics data of 785 EOCRC, 4898 late-onset CRC (LOCRC), and 1110 normal control samples from tissue, whole blood, platelets, and serum exosomes. CRCDB manages the differential analysis, survival analysis, co-expression analysis, and immune cell infiltration comparison analysis results in different CRC groups. Meta-analysis results were also provided for users for further data interpretation. Using the resource in CRCDB, we identified that genes associated with the metabolic process were less expressed in EOCRC patients, while upregulated genes most associated with the mitosis process might play an important role in the molecular pathogenesis of LOCRC. Survival-related genes were most enriched in oxidoreduction pathways in EOCRC but in immune-related pathways in LOCRC. With all the data gathered and processed, we anticipate that CRCDB could be a practical data mining platform to help explore potential applications of omics data and develop effective prevention and therapeutic strategies for the specific group of CRC patients.

10.
Heliyon ; 10(10): e31380, 2024 May 30.
Article in English | MEDLINE | ID: mdl-38803927

ABSTRACT

Objective: Our aim was to develop and validate a nomogram for predicting the in-hospital 14-day (14 d) and 28-day (28 d) survival rates of patients with coronavirus disease 2019 (COVID-19). Methods: Clinical data of patients with COVID-19 admitted to the Renmin Hospital of Wuhan University from December 2022 to February 2023 and the north campus of Shanghai Ninth People's Hospital from April 2022 to June 2022 were collected. A total of 408 patients from Renmin Hospital of Wuhan University were selected as the training cohort, and 151 patients from Shanghai Ninth People's Hospital were selected as the verification cohort. Independent variables were screened using Cox regression analysis, and a nomogram was constructed using R software. The prediction accuracy of the nomogram was evaluated using the receiver operating characteristic (ROC) curve, C-index, and calibration curve. Decision curve analysis was used to evaluate the clinical application value of the model. The nomogram was externally validated using a validation cohort. Results: In total, 559 patients with severe/critical COVID-19 were included in this study, of whom 179 (32.02 %) died. Multivariate Cox regression analysis showed that age >80 years [hazard ratio (HR) = 1.539, 95 % confidence interval (CI): 1.027-2.306, P = 0.037], history of diabetes (HR = 1.741, 95 % CI: 1.253-2.420, P = 0.001), high APACHE II score (HR = 1.083, 95 % CI: 1.042-1.126, P < 0.001), sepsis (HR = 2.387, 95 % CI: 1.707-3.338, P < 0.001), high neutrophil-to-lymphocyte ratio (NLR) (HR = 1.010, 95 % CI: 1.003-1.017, P = 0.007), and high D-dimer level (HR = 1.005, 95 % CI: 1.001-1.009, P = 0.028) were independent risk factors for 14 d and 28 d survival rates, whereas COVID-19 vaccination (HR = 0.625, 95 % CI: 0.440-0.886, P = 0.008) was a protective factor affecting prognosis.
ROC curve analysis showed that the area under the curve (AUC) of the 14 d and 28 d hospital survival rates in the training cohort was 0.765 (95 % CI: 0.641-0.923) and 0.814 (95 % CI: 0.702-0.938), respectively, and the AUC of the 14 d and 28 d hospital survival rates in the verification cohort was 0.898 (95 % CI: 0.765-0.962) and 0.875 (95 % CI: 0.741-0.945), respectively. The calibration curves of 14 d and 28 d hospital survival showed that the predicted probability of the model agreed well with the actual probability. Decision curve analysis (DCA) showed that the nomogram has high clinical application value. Conclusion: In-hospital survival rates of patients with COVID-19 were predicted using a nomogram, which will help clinicians make appropriate clinical decisions.
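
A nomogram built from a Cox model is a graphical rendering of the model's linear predictor: each factor contributes the logarithm of its hazard ratio, and the contributions add up. The sketch below shows that combination for the four binary factors whose hazard ratios appear in the abstract. It is an illustration of the arithmetic only, not the published nomogram. The continuous factors (APACHE II score, NLR, D-dimer) are left out because their units and reference values are not given, and the dictionary keys and function name are hypothetical.

```python
import math

# Hazard ratios for the binary factors, as quoted in the abstract.
# The key names are hypothetical labels chosen for this sketch.
HAZARD_RATIOS = {
    "age_over_80": 1.539,
    "diabetes": 1.741,
    "sepsis": 2.387,
    "vaccinated": 0.625,
}


def relative_hazard(patient):
    """Hazard relative to a patient with none of the listed factors.

    `patient` maps factor names to 0 or 1. The Cox linear predictor is the
    sum of ln(HR) over the factors present; exponentiating gives the ratio.
    """
    linear_predictor = sum(
        math.log(HAZARD_RATIOS[name]) * value for name, value in patient.items()
    )
    return math.exp(linear_predictor)


# A vaccinated patient with diabetes and sepsis: 1.741 * 2.387 * 0.625.
print(round(relative_hazard({"diabetes": 1, "sepsis": 1, "vaccinated": 1}), 3))  # 2.597
```

Converting this ratio into a 14 d or 28 d survival probability would also require the baseline survival function of the fitted model, which the abstract does not report.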

11.
Front Immunol ; 14: 1326018, 2023.
Article in English | MEDLINE | ID: mdl-38143770

ABSTRACT

Background: Ovarian cancer (OC) is a highly heterogeneous and malignant gynecological cancer, thereby leading to poor clinical outcomes. The study aims to identify and characterize clinically relevant subtypes in OC and develop a diagnostic model that can precisely stratify OC patients, providing more diagnostic clues for OC patients to access focused therapeutic and preventative strategies. Methods: Gene expression datasets of OC were retrieved from TCGA and GEO databases. To evaluate immune cell infiltration, the ESTIMATE algorithm was applied. A univariate Cox analysis and the two-sided log-rank test were used to screen OC risk factors. We adopted the ConsensusClusterPlus algorithm to determine OC subtypes. Enrichment analysis based on KEGG and GO was performed to determine enriched pathways of signature genes for each subtype. A machine learning algorithm, the support vector machine (SVM), was used to select the feature genes and develop a diagnostic model. A ROC curve was plotted to evaluate the model performance. Results: A total of 1,273 survival-related genes (SRGs) were first determined and used to classify OC samples into different subtypes based on their different molecular patterns. Using the SRGs, OC patients were successfully stratified into three robust subtypes, designated S-I (Immunoreactive and DNA Damage repair), S-II (Mixed), and S-III (Proliferative and Invasive). S-I had more favorable OS and DFS, whereas S-III had the worst prognosis and was enriched with OC patients at advanced stages.
Meanwhile, comprehensive functional analysis highlighted differences in biological pathways: genes associated with immune function and DNA damage repair including CXCL9, CXCL10, CXCL11, APEX, APEX2, and RBX1 were enriched in S-I; S-II combined multiple gene signatures including genes associated with metabolism and transcription; and the gene signature of S-III was extensively involved in pathways reflecting malignancies, including many core kinases and transcription factors involved in cancer such as CDK6, ERBB2, JAK1, DAPK1, FOXO1, and RXRA. The SVM model showed superior diagnostic performance with AUC values of 0.922 and 0.901, respectively. Furthermore, a new dataset of the independent cohort could be automatically analyzed by this innovative pipeline and yield similar results. Conclusion: This study exploited an innovative approach to construct previously unexplored robust subtypes significantly related to different clinical and molecular features for OC and a diagnostic model using SVM to aid in clinical diagnosis and treatment. This investigation also illustrated the importance of targeting innate immune suppression together with DNA damage in OC, offering novel insights for further experimental exploration and clinical trial.


Subject(s)
Genes, cdc , Ovarian Neoplasms , Humans , Female , Prognosis , Ovarian Neoplasms/diagnosis , Ovarian Neoplasms/genetics , Algorithms
12.
Nat Commun ; 14(1): 2813, 2023 05 17.
Article in English | MEDLINE | ID: mdl-37198164

ABSTRACT

Proteostasis is fundamental for maintaining organismal health. However, the mechanisms underlying its dynamic regulation and how its disruptions lead to diseases are largely unclear. Here, we conduct in-depth propionylomic profiling in Drosophila, and develop a small-sample learning framework to prioritize the propionylation at lysine 17 of H2B (H2BK17pr) to be functionally important. Mutating H2BK17 which eliminates propionylation leads to elevated total protein level in vivo. Further analyses reveal that H2BK17pr modulates the expression of 14.7-16.3% of genes in the proteostasis network, and determines global protein level by regulating the expression of genes involved in the ubiquitin-proteasome system. In addition, H2BK17pr exhibits daily oscillation, mediating the influences of feeding/fasting cycles to drive rhythmic expression of proteasomal genes. Our study not only reveals a role of lysine propionylation in regulating proteostasis, but also implements a generally applicable method which can be extended to other issues with little prior knowledge.


Subject(s)
Lysine , Proteostasis , Animals , Lysine/metabolism , Ubiquitin/metabolism , Drosophila/metabolism , Proteasome Endopeptidase Complex/metabolism
13.
ACS Chem Biol ; 17(1): 252-262, 2022 01 21.
Article in English | MEDLINE | ID: mdl-34989232

ABSTRACT

Although thermal proteome profiling (TPP) is a popular modification-free approach for drug target deconvolution, some key problems still limit screening sensitivity. In the prevailing TPP workflow, only the soluble fractions are analyzed after thermal treatment, while the precipitate fractions, which also contain abundant information on drug-induced stability shifts, are discarded; in addition, the sigmoid melting curve fitting strategy used for data processing discriminates against the part of the human proteome with multiple transitions. In this study, a precipitate-supported TPP (PSTPP) assay was presented for unbiased and comprehensive analysis of protein-drug interactions at the proteome level. In PSTPP, only those temperatures at which significant precipitation is observed were applied to induce protein denaturation, and the complementary information contained in both supernatant fractions and precipitate fractions was used to improve the screening specificity and sensitivity. In addition, a novel image recognition algorithm based on deep learning was developed to recognize the target proteins, which circumvented the problems that exist in the sigmoid curve fitting strategy. The PSTPP assay was validated by identifying the known targets of methotrexate, raltitrexed, and SNS-032 with good performance. Using a promiscuous kinase inhibitor, staurosporine, we delineated 99 kinase targets with a specificity up to 83% in K562 cell lysates, which represented a significant improvement over the existing thermal shift methods. Furthermore, the PSTPP strategy was successfully applied to analyze the binding targets of rapamycin, identifying the well-known target FKBP1A as well as revealing a few other potential targets.


Subject(s)
Chemical Precipitation , Deep Learning , Drug Delivery Systems , Proteins/drug effects , Proteome , Proteomics/methods , Algorithms , Hot Temperature , Humans , K562 Cells
14.
Diagnostics (Basel) ; 12(10)2022 Oct 21.
Article in English | MEDLINE | ID: mdl-36292251

ABSTRACT

Objective: A nomogram model of mortality risk for patients with coronavirus disease 2019 (COVID-19) was established and validated. Methods: We collected the clinical medical records of patients with severe/critical COVID-19 admitted to the eastern campus of Renmin Hospital of Wuhan University from January 2020 to May 2020 and to the north campus of Shanghai Ninth People's Hospital, Shanghai JiaoTong University School of Medicine, from April 2022 to June 2022. We assigned 254 patients to the former group, which served as the training set, and 113 patients were assigned to the latter group, which served as the validation set. The least absolute shrinkage and selection operator (LASSO) and multivariable logistic regression were used to select the variables and build the mortality risk prediction model. Results: The nomogram model was constructed with four risk factors for patient mortality following severe/critical COVID-19 (≥3 basic diseases, APACHE II score, urea nitrogen (Urea), and lactic acid (Lac)) and two protective factors (lymphocyte percentage (L%) and neutrophil-to-platelet ratio (NPR)). The area under the curve (AUC) of the training set was 0.880 (95% confidence interval (95%CI), 0.837~0.923) and the AUC of the validation set was 0.814 (95%CI, 0.705~0.923). The decision curve analysis (DCA) showed that the nomogram model had high clinical value. Conclusion: The nomogram model for predicting the death risk of patients with severe/critical COVID-19 showed good prediction performance, and may be helpful in making appropriate clinical decisions for high-risk patients.
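
LASSO selects variables because its L1 penalty can shrink a coefficient to exactly zero. For a single standardized predictor this reduces to the soft-thresholding operator sketched below. This is a textbook illustration of the mechanism under that simplifying assumption, not a reconstruction of the study's fitting procedure; the function name, penalty value, and example coefficients are hypothetical.

```python
def soft_threshold(z, lam):
    """Soft-thresholding: shrink z toward zero by lam, clipping to exactly 0 inside [-lam, lam]."""
    if lam < 0:
        raise ValueError("penalty must be non-negative")
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Hypothetical unpenalized coefficients for four candidate variables.
unpenalized = {"var_a": 3.0, "var_b": -1.5, "var_c": 0.4, "var_d": -0.2}
penalty = 0.5
shrunk = {name: soft_threshold(beta, penalty) for name, beta in unpenalized.items()}
selected = [name for name, beta in shrunk.items() if beta != 0.0]
print(shrunk)    # {'var_a': 2.5, 'var_b': -1.0, 'var_c': 0.0, 'var_d': 0.0}
print(selected)  # ['var_a', 'var_b']
```

Variables whose coefficients survive the penalty are the ones carried forward, which matches the two-stage design described in the abstract: LASSO to pick variables, then multivariable logistic regression to build the model.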

15.
Comput Struct Biotechnol J ; 19: 4497-4509, 2021.
Article in English | MEDLINE | ID: mdl-34471495

ABSTRACT

As a novel lactate-derived post-translational modification (PTM), lysine lactylation (Kla) is involved in diverse biological processes, and participates in human tumorigenesis. Identification of Kla substrates with their exact sites is crucial for revealing the molecular mechanisms of lactylation. In contrast with labor-intensive and time-consuming experimental approaches, computational prediction of Kla could provide convenience and increased speed, but is still lacking. In this work, although currently identified Kla sites are limited, we constructed the first Kla benchmark dataset and developed a few-shot learning-based architecture approach to leverage the power of small datasets and reduce the impact of imbalance and overfitting. A maximum 11.7% (0.745 versus 0.667) increase of area under the curve (AUC) value was achieved in contrast to conventional machine learning methods. We conducted a comprehensive survey of the performance by combining 8 sequence-based features and 3 structure-based features and tailored a multi-feature hybrid system for synergistic combination. This system achieved >16.2% improvement of the AUC value (0.889 versus 0.765) compared with single feature-based models for the prediction of Kla sites in silico. Taking few-shot learning and the hybrid system together, we present our newly designed predictor named FSL-Kla, which is not only a cutting-edge tool for Kla site profiling but also could generate candidates for further experimental approaches. The webserver of FSL-Kla is freely accessible for academic research at http://kla.zbiolab.cn/.

16.
Nat Commun ; 12(1): 3258, 2021 05 31.
Article in English | MEDLINE | ID: mdl-34059679

ABSTRACT

Autophagy can selectively target protein aggregates, pathogens, and dysfunctional organelles for lysosomal degradation. Aberrant regulation of autophagy promotes tumorigenesis, while it is far less clear whether and how tumor-specific alterations result in autophagic aberrance. To form a link between aberrant autophagy selectivity and human cancer, we establish a computational pipeline and prioritize 222 potential LIR (LC3-interacting region) motif-associated mutations (LAMs) in 148 proteins. We validate LAMs in multiple proteins including ATG4B, STBD1, EHMT2 and BRAF that impair their interactions with LC3 and autophagy activities. Using a combination of transcriptomic, metabolomic and additional experimental assays, we show that STBD1, a poorly characterized protein, inhibits tumor growth via modulating glycogen autophagy, while a patient-derived W203C mutation on LIR abolishes its cancer inhibitory function. This work suggests that altered autophagy selectivity is a frequently used mechanism by cancer cells to survive during various stresses, and provides a framework to discover additional autophagy-related pathways that influence carcinogenesis.


Subject(s)
Carcinogenesis/genetics , Macroautophagy/genetics , Membrane Proteins/genetics , Models, Genetic , Muscle Proteins/genetics , Neoplasms/genetics , Algorithms , Animals , Carcinogenesis/pathology , Cell Line, Tumor , Computer Simulation , DNA Mutational Analysis , Datasets as Topic , Gene Knockdown Techniques , Glycogen/metabolism , Humans , Kaplan-Meier Estimate , Membrane Proteins/metabolism , Mice , Microtubule-Associated Proteins/metabolism , Muscle Proteins/metabolism , Mutation , Neoplasms/mortality , Neoplasms/pathology , Pentose Phosphate Pathway/genetics , Protein Interaction Domains and Motifs/genetics , Proteome/genetics , RNA-Seq , Tissue Array Analysis , Warburg Effect, Oncologic , Xenograft Model Antitumor Assays
17.
Theranostics ; 11(16): 8008-8026, 2021.
Article in English | MEDLINE | ID: mdl-34335977

ABSTRACT

Rationale: Children usually develop less severe symptoms in response to Coronavirus Disease 2019 (COVID-19) than adults. However, little is known about the molecular alterations and pathogenesis of COVID-19 in children. Methods: We conducted plasma proteomic and metabolomic profiling of the blood samples of a cohort containing 18 COVID-19-children with mild symptoms and 12 healthy children, who were enrolled from hospital admissions and outpatients, respectively. Statistical analyses were performed to identify molecules specifically altered in COVID-19-children. We also developed a machine learning-based pipeline named inference of biomolecular combinations with minimal bias (iBM) to prioritize proteins and metabolites strongly altered in COVID-19-children, and experimentally validated the predictions. Results: By comparing to the multi-omic data in adults, we identified 44 proteins and 249 metabolites differentially altered in COVID-19-children against healthy children or COVID-19-adults. Further analyses demonstrated that both deteriorative immune response/inflammation processes and protective antioxidant or anti-inflammatory processes were markedly induced in COVID-19-children. Using iBM, we prioritized two combinations that contained 5 proteins and 5 metabolites, respectively, each exhibiting a total area under curve (AUC) value of 100% to accurately distinguish COVID-19-children from healthy children or COVID-19-adults. Further experiments validated that all the 5 proteins were up-regulated upon coronavirus infection. Interestingly, we found that the prioritized metabolites inhibited the expression of pro-inflammatory factors, and two of them, methylmalonic acid (MMA) and mannitol, also suppressed coronaviral replication, implying a protective role of these metabolites in COVID-19-children.
Conclusion: The finding of a strong antagonism between deteriorative and protective effects provides new insights into the mechanism and pathogenesis of COVID-19 in children, who mostly experienced mild symptoms. The identified metabolites strongly altered in COVID-19-children could serve as potential therapeutic agents for COVID-19.


Subject(s)
COVID-19/blood , COVID-19/virology , Adult , COVID-19/epidemiology , COVID-19/immunology , Child , Child, Preschool , China/epidemiology , Female , Hospitalization , Humans , Male , Metabolomics/methods , Middle Aged , Proteomics/methods , SARS-CoV-2/isolation & purification
18.
Genomics Proteomics Bioinformatics ; 18(2): 194-207, 2020 04.
Article in English | MEDLINE | ID: mdl-32861878

ABSTRACT

As an important protein acylation modification, lysine succinylation (Ksucc) is involved in diverse biological processes, and participates in human tumorigenesis. Here, we collected 26,243 non-redundant known Ksucc sites from 13 species as the benchmark data set, combined 10 types of informative features, and implemented a hybrid-learning architecture by integrating deep-learning and conventional machine-learning algorithms into a single framework. We constructed a new tool named HybridSucc, which achieved area under curve (AUC) values of 0.885 and 0.952 for general and human-specific prediction of Ksucc sites, respectively. In comparison, the accuracy of HybridSucc was 17.84%-50.62% better than that of other existing tools. Using HybridSucc, we conducted a proteome-wide prediction and prioritized 370 cancer mutations that change Ksucc states of 218 important proteins, including PKM2, SHMT2, and IDH2. We not only developed a high-profile tool for predicting Ksucc sites, but also generated useful candidates for further experimental consideration. The online service of HybridSucc can be freely accessed for academic research at http://hybridsucc.biocuckoo.org/.


Subject(s)
Algorithms , Machine Learning , Proteins/metabolism , Succinic Acid/metabolism , Acylation , Amino Acid Sequence , Area Under Curve , Humans , Lysine/metabolism , Neoplasms/metabolism , Proteome/metabolism , ROC Curve , Species Specificity
19.
Cells ; 9(5), 2020 05 20.
Article in English | MEDLINE | ID: mdl-32443803

ABSTRACT

Protein phosphorylation is essential for regulating cellular activities by modifying substrates at specific residues, which frequently interact with proteins containing phosphoprotein-binding domains (PPBDs) to propagate the phosphorylation signaling into downstream pathways. Although a massive number of phosphorylation sites (p-sites) have been reported, most of their interacting PPBDs are unknown. Here, we collected 4458 known PPBD-specific binding p-sites (PBSs), considerably improved our previously developed group-based prediction system (GPS) algorithm, and implemented a deep learning plus transfer learning strategy for model training. Then, we developed a new online service named GPS-PBS, which can hierarchically predict PBSs of 122 single PPBD clusters belonging to two groups and 16 families. In comparison with other existing tools, GPS-PBS achieved highly competitive accuracy. Using GPS-PBS, we predicted 371,018 mammalian p-sites that potentially interact with at least one PPBD, and revealed that various PPBD-containing proteins (PPCPs) and protein kinases (PKs) can simultaneously regulate the same p-sites to orchestrate important pathways, such as the PI3K-Akt signaling pathway. Taken together, we anticipate that GPS-PBS can be a great help for further dissecting phosphorylation signaling networks.


Subject(s)
Algorithms , Deep Learning , Phosphoproteins/chemistry , Phosphoproteins/metabolism , Animals , Binding Sites , Databases, Protein , Humans , Phosphorylation , Protein Binding , Protein Domains , Proteome/metabolism , Signal Transduction , Statistics as Topic
20.
Nat Biomed Eng ; 4(12): 1197-1207, 2020 12.
Article in English | MEDLINE | ID: mdl-33208927

ABSTRACT

Data from patients with coronavirus disease 2019 (COVID-19) are essential for guiding clinical decision making, for furthering the understanding of this viral disease, and for diagnostic modelling. Here, we describe an open resource containing data from 1,521 patients with pneumonia (including COVID-19 pneumonia) consisting of chest computed tomography (CT) images, 130 clinical features (from a range of biochemical and cellular analyses of blood and urine samples) and laboratory-confirmed severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) clinical status. We show the utility of the database for prediction of COVID-19 morbidity and mortality outcomes using a deep learning algorithm trained with data from 1,170 patients and 19,685 manually labelled CT slices. In an independent validation cohort of 351 patients, the algorithm discriminated between negative, mild and severe cases with areas under the receiver operating characteristic curve of 0.944, 0.860 and 0.884, respectively. The open database may have further uses in the diagnosis and management of patients with COVID-19.


Subject(s)
COVID-19/pathology , COVID-19/virology , Pneumonia, Viral/pathology , Pneumonia, Viral/virology , Algorithms , Deep Learning , Female , Humans , Male , Pandemics , ROC Curve , SARS-CoV-2/pathogenicity , Tomography, X-Ray Computed/methods