1.
Comput Struct Biotechnol J ; 23: 2507-2515, 2024 Dec.
Article in English | MEDLINE | ID: mdl-38974887

ABSTRACT

The incidence of early-onset colorectal cancer (EOCRC) has increased significantly worldwide. Uncovering biomarkers that are unique to EOCRC is of great importance to facilitate the prevention and detection of this growing cancer subtype. Although efforts have been made to curate CRC data, there is no integrated platform that gives access to data specifically related to young CRC patients. Here, we constructed a user-friendly open integrated resource called CRCDB (URL: http://crcdb-hust.com) which contains multi-omics data of 785 EOCRC, 4898 late-onset CRC (LOCRC), and 1110 normal control samples from tissue, whole blood, platelets, and serum exosomes. CRCDB hosts the results of differential analysis, survival analysis, co-expression analysis, and immune cell infiltration comparison analysis for the different CRC groups. Meta-analysis results are also provided to support further data interpretation. Using the resource in CRCDB, we identified that genes associated with metabolic processes were less expressed in EOCRC patients, while upregulated genes most associated with mitosis might play an important role in the molecular pathogenesis of LOCRC. Survival-related genes were most enriched in oxidoreduction pathways in EOCRC but in immune-related pathways in LOCRC. With all the data gathered and processed, we anticipate that CRCDB could be a practical data mining platform to help explore potential applications of omics data and develop effective prevention and therapeutic strategies for this specific group of CRC patients.

2.
Front Immunol ; 14: 1326018, 2023.
Article in English | MEDLINE | ID: mdl-38143770

ABSTRACT

Background: Ovarian cancer (OC) is a highly heterogeneous and malignant gynecological cancer, leading to poor clinical outcomes. The study aims to identify and characterize clinically relevant subtypes in OC and develop a diagnostic model that can precisely stratify OC patients, providing more diagnostic clues for OC patients to access focused therapeutic and preventative strategies. Methods: Gene expression datasets of OC were retrieved from the TCGA and GEO databases. To evaluate immune cell infiltration, the ESTIMATE algorithm was applied. A univariate Cox analysis and the two-sided log-rank test were used to screen OC risk factors. We adopted the ConsensusClusterPlus algorithm to determine OC subtypes. Enrichment analysis based on KEGG and GO was performed to determine enriched pathways of signature genes for each subtype. A machine learning algorithm, the support vector machine (SVM), was used to select feature genes and develop a diagnostic model. A ROC curve was plotted to evaluate the model performance. Results: A total of 1,273 survival-related genes (SRGs) were first determined and used to classify OC samples into different subtypes based on their different molecular patterns. The SRGs successfully stratified OC patients into three robust subtypes, designated S-I (Immunoreactive and DNA Damage repair), S-II (Mixed), and S-III (Proliferative and Invasive). S-I had more favorable OS and DFS, whereas S-III had the worst prognosis and was enriched with OC patients at advanced stages. Meanwhile, comprehensive functional analysis highlighted differences in biological pathways: genes associated with immune function and DNA damage repair including CXCL9, CXCL10, CXCL11, APEX, APEX2, and RBX1 were enriched in S-I; S-II combined multiple gene signatures including genes associated with metabolism and transcription; and the gene signature of S-III was extensively involved in pathways reflecting malignancies, including many core kinases and transcription factors involved in cancer such as CDK6, ERBB2, JAK1, DAPK1, FOXO1, and RXRA. The SVM model showed superior diagnostic performance with AUC values of 0.922 and 0.901, respectively. Furthermore, a new dataset from an independent cohort could be automatically analyzed by this pipeline and yielded similar results. Conclusion: This study exploited an innovative approach to construct previously unexplored robust subtypes significantly related to different clinical and molecular features for OC and a diagnostic model using SVM to aid in clinical diagnosis and treatment. This investigation also illustrated the importance of targeting innate immune suppression together with DNA damage in OC, offering novel insights for further experimental exploration and clinical trials.


Subjects
cdc Genes , Ovarian Neoplasms , Humans , Female , Prognosis , Ovarian Neoplasms/diagnosis , Ovarian Neoplasms/genetics , Algorithms
3.
Brief Bioinform ; 23(2), 2022 03 10.
Article in English | MEDLINE | ID: mdl-35037020

ABSTRACT

As an important post-translational modification, lysine ubiquitination participates in numerous biological processes and is involved in human diseases, whereas the site specificity of ubiquitination is mainly decided by ubiquitin-protein ligases (E3s). Although numerous ubiquitination predictors have been developed, computational prediction of E3-specific ubiquitination sites is still a great challenge. Here, we carefully reviewed the existing tools for the prediction of general ubiquitination sites. We also developed a tool named GPS-Uber for the prediction of general and E3-specific ubiquitination sites. From the literature, we manually collected 1311 experimentally identified site-specific E3-substrate relations, which were classified into different clusters based on corresponding E3s at different levels. To predict general ubiquitination sites, we integrated 10 types of sequence and structure features, as well as three types of algorithms including penalized logistic regression, deep neural network and convolutional neural network. Compared with other existing tools, the general model in GPS-Uber exhibited a highly competitive accuracy, with an area under the curve (AUC) value of 0.7649. Then, transfer learning was adopted for each E3 cluster to construct E3-specific models, and in total 112 individual E3-specific predictors were implemented. Using GPS-Uber, we conducted a systematic prediction of human cancer-associated ubiquitination events, which could be helpful for further experimental consideration. GPS-Uber will be regularly updated, and its online service is free for academic research at http://gpsuber.biocuckoo.cn/.


Subjects
Lysine , Ubiquitin-Protein Ligases , Algorithms , Humans , Lysine/metabolism , Post-Translational Protein Processing , Ubiquitin-Protein Ligases/chemistry , Ubiquitin-Protein Ligases/genetics , Ubiquitin-Protein Ligases/metabolism , Ubiquitination
4.
Comput Struct Biotechnol J ; 19: 4497-4509, 2021.
Article in English | MEDLINE | ID: mdl-34471495

ABSTRACT

As a novel lactate-derived post-translational modification (PTM), lysine lactylation (Kla) is involved in diverse biological processes, and participates in human tumorigenesis. Identification of Kla substrates with their exact sites is crucial for revealing the molecular mechanisms of lactylation. In contrast with labor-intensive and time-consuming experimental approaches, computational prediction of Kla could provide convenience and increased speed, but is still lacking. In this work, although currently identified Kla sites are limited, we constructed the first Kla benchmark dataset and developed a few-shot learning-based architecture to leverage the power of small datasets and reduce the impact of imbalance and overfitting. A maximum 11.7% (0.745 versus 0.667) increase of the area under the curve (AUC) value was achieved in contrast to conventional machine learning methods. We conducted a comprehensive survey of the performance by combining 8 sequence-based features and 3 structure-based features and tailored a multi-feature hybrid system for synergistic combination. This system achieved a >16.2% improvement of the AUC value (0.889 versus 0.765) compared with single feature-based models for the prediction of Kla sites in silico. Taking few-shot learning and the hybrid system together, we present our newly designed predictor named FSL-Kla, which is not only a cutting-edge tool for Kla site profiling but could also generate candidates for further experimental approaches. The webserver of FSL-Kla is freely accessible for academic research at http://kla.zbiolab.cn/.
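
The percentage gains quoted in this abstract are consistent with relative (not absolute) AUC improvement over the baseline. The following is a minimal sketch of that arithmetic, assuming the relative interpretation; the function name `relative_gain_percent` is illustrative and is not part of FSL-Kla.

```python
def relative_gain_percent(new_auc: float, baseline_auc: float) -> float:
    """Relative improvement of new_auc over baseline_auc, in percent."""
    if baseline_auc <= 0:
        raise ValueError("baseline AUC must be positive")
    return (new_auc - baseline_auc) / baseline_auc * 100.0


# AUC pairs quoted in the abstract: (new, baseline)
few_shot_gain = relative_gain_percent(0.745, 0.667)  # few-shot vs. conventional ML
hybrid_gain = relative_gain_percent(0.889, 0.765)    # hybrid vs. single-feature models

print(f"few-shot gain: {few_shot_gain:.1f}%")
print(f"hybrid gain:   {hybrid_gain:.1f}%")
```

Under this reading, the absolute AUC differences are 0.078 and 0.124 respectively; the percentages express those differences as a fraction of the baseline value.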

5.
Nat Commun ; 12(1): 3258, 2021 05 31.
Article in English | MEDLINE | ID: mdl-34059679

ABSTRACT

Autophagy can selectively target protein aggregates, pathogens, and dysfunctional organelles for lysosomal degradation. Aberrant regulation of autophagy promotes tumorigenesis, while it is far less clear whether and how tumor-specific alterations result in autophagic aberrance. To form a link between aberrant autophagy selectivity and human cancer, we establish a computational pipeline and prioritize 222 potential LIR (LC3-interacting region) motif-associated mutations (LAMs) in 148 proteins. We validate LAMs in multiple proteins including ATG4B, STBD1, EHMT2 and BRAF that impair their interactions with LC3 and autophagy activities. Using a combination of transcriptomic, metabolomic and additional experimental assays, we show that STBD1, a poorly characterized protein, inhibits tumor growth via modulating glycogen autophagy, while a patient-derived W203C mutation on LIR abolishes its cancer inhibitory function. This work suggests that altered autophagy selectivity is a frequently used mechanism by which cancer cells survive various stresses, and provides a framework to discover additional autophagy-related pathways that influence carcinogenesis.


Subjects
Carcinogenesis/genetics , Macroautophagy/genetics , Membrane Proteins/genetics , Genetic Models , Muscle Proteins/genetics , Neoplasms/genetics , Algorithms , Animals , Carcinogenesis/pathology , Tumor Cell Line , Computer Simulation , DNA Mutational Analysis , Datasets as Topic , Gene Knockdown Techniques , Glycogen/metabolism , Humans , Kaplan-Meier Estimate , Membrane Proteins/metabolism , Mice , Microtubule-Associated Proteins/metabolism , Muscle Proteins/metabolism , Mutation , Neoplasms/mortality , Neoplasms/pathology , Pentose Phosphate Pathway/genetics , Protein Interaction Domains and Motifs/genetics , Proteome/genetics , RNA-Seq , Tissue Array Analysis , Oncologic Warburg Effect , Xenograft Model Antitumor Assays
6.
Brief Bioinform ; 22(2): 1836-1847, 2021 03 22.
Article in English | MEDLINE | ID: mdl-32248222

ABSTRACT

As an important reversible lipid modification, S-palmitoylation mainly occurs at specific cysteine residues in proteins, participates in regulating various biological processes and is associated with human diseases. Besides experimental assays, computational prediction of S-palmitoylation sites can efficiently generate helpful candidates for further experimental consideration. Here, we reviewed the current progress in the development of S-palmitoylation site predictors, as well as the training data sets, informative features and algorithms used in these tools. Then, we compiled a benchmark data set containing 3098 known S-palmitoylation sites identified from small- or large-scale experiments, and developed a new method named data quality discrimination (DQD) to distinguish data quality weights (DQWs) between the two types of sites. Besides DQD and our previous methods, we encoded sequence similarity values into images, constructed a deep learning framework of convolutional neural networks (CNNs) and developed a novel algorithm of graphic presentation system (GPS) 6.0. We further integrated nine additional types of sequence-based and structural features, implemented parallel CNNs (pCNNs) and designed a new predictor called GPS-Palm. Compared with other existing tools, GPS-Palm showed a >31.3% improvement of the area under the curve (AUC) value (0.855 versus 0.651) for general prediction of S-palmitoylation sites. We also produced two species-specific predictors, with corresponding AUC values of 0.900 and 0.897 for predicting human- and mouse-specific sites, respectively. GPS-Palm is free for academic research at http://gpspalm.biocuckoo.cn/.


Subjects
Computer Graphics , Deep Learning , Lipoylation , Proteins/chemistry , Algorithms , Animals , Computational Biology/methods , Humans , Mice , Software
7.
Genomics Proteomics Bioinformatics ; 18(2): 194-207, 2020 04.
Article in English | MEDLINE | ID: mdl-32861878

ABSTRACT

As an important protein acylation modification, lysine succinylation (Ksucc) is involved in diverse biological processes, and participates in human tumorigenesis. Here, we collected 26,243 non-redundant known Ksucc sites from 13 species as the benchmark data set, combined 10 types of informative features, and implemented a hybrid-learning architecture by integrating deep-learning and conventional machine-learning algorithms into a single framework. We constructed a new tool named HybridSucc, which achieved area under the curve (AUC) values of 0.885 and 0.952 for general and human-specific prediction of Ksucc sites, respectively. In comparison, the accuracy of HybridSucc was 17.84%-50.62% better than that of other existing tools. Using HybridSucc, we conducted a proteome-wide prediction and prioritized 370 cancer mutations that change the Ksucc states of 218 important proteins, including PKM2, SHMT2, and IDH2. We not only developed a high-profile tool for predicting Ksucc sites, but also generated useful candidates for further experimental consideration. The online service of HybridSucc can be freely accessed for academic research at http://hybridsucc.biocuckoo.org/.


Subjects
Algorithms , Machine Learning , Proteins/metabolism , Succinic Acid/metabolism , Acylation , Amino Acid Sequence , Area Under Curve , Humans , Lysine/metabolism , Neoplasms/metabolism , Proteome/metabolism , ROC Curve , Species Specificity
8.
Nucleic Acids Res ; 48(D1): D288-D295, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31691822

ABSTRACT

Here, we presented an integrative database named DrLLPS (http://llps.biocuckoo.cn/) for proteins involved in liquid-liquid phase separation (LLPS), which is a ubiquitous and crucial mechanism for the spatiotemporal organization of various biochemical reactions through the creation of membraneless organelles (MLOs) in eukaryotic cells. From the literature, we manually collected 150 scaffold proteins that are drivers of LLPS, 987 regulators that contribute to modulating LLPS, and 8148 potential client proteins that might be dispensable for the formation of MLOs, which were then categorized into 40 biomolecular condensates. We searched for potential orthologs of these known proteins, and in total DrLLPS contained 437 887 known and potential LLPS-associated proteins in 164 eukaryotes. Furthermore, we carefully annotated LLPS-associated proteins in eight model organisms, by using the knowledge integrated from 110 widely used resources that covered 16 aspects, including protein disordered regions, domain annotations, post-translational modifications (PTMs), genetic variations, cancer mutations, molecular interactions, disease-associated information, drug-target relations, physicochemical properties, protein functional annotations, protein expressions/proteomics, protein 3D structures, subcellular localizations, mRNA expressions, DNA & RNA elements, and DNA methylations. We anticipate DrLLPS can serve as a helpful resource for further analysis of LLPS.


Subjects
Factual Databases , Eukaryota , Proteins/chemistry , Proteins/metabolism , Genome , Intrinsically Disordered Proteins/chemistry , Intrinsically Disordered Proteins/metabolism , Organelles , Post-Translational Protein Processing , User-Computer Interface
9.
Nucleic Acids Res ; 47(D1): D344-D350, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30380109

ABSTRACT

Here, we described the updated database iEKPD 2.0 (http://iekpd.biocuckoo.org) for eukaryotic protein kinases (PKs), protein phosphatases (PPs) and proteins containing phosphoprotein-binding domains (PPBDs), which are key molecules responsible for phosphorylation-dependent signalling networks and participate in the regulation of almost all biological processes and pathways. In total, iEKPD 2.0 contained 197 348 phosphorylation regulators, including 109 912 PKs, 23 294 PPs and 68 748 PPBD-containing proteins in 164 eukaryotic species. In particular, we provided rich annotations for the regulators of eight model organisms, especially humans, by compiling and integrating the knowledge from 100 widely used public databases that cover 13 aspects, including cancer mutations, genetic variations, disease-associated information, mRNA expression, DNA & RNA elements, DNA methylation, molecular interactions, drug-target relations, protein 3D structures, post-translational modifications, protein expressions/proteomics, subcellular localizations and protein functional annotations. Compared with our previously developed EKPD 1.0 (∼0.5 GB), iEKPD 2.0 contains ∼99.8 GB of data, a ∼200-fold increase in data volume. We anticipate that iEKPD 2.0 represents a more useful resource for further study of phosphorylation regulators.


Subjects
Protein Databases , Eukaryota/genetics , Molecular Sequence Annotation , Phosphoprotein Phosphatases/genetics , Protein Kinases/genetics , Animals , Data Collection , Humans , Phosphoproteins/metabolism , Phosphorylation , Protein Domains/genetics , Post-Translational Protein Processing , User-Computer Interface