Results 1 - 16 of 16
1.
Int J Biol Markers ; 38(3-4): 243-252, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37846061

ABSTRACT

BACKGROUND: Upstream stimulatory factors (USFs) are members of the basic helix-loop-helix leucine zipper transcription factor family, including USF1, USF2, and USF3. The first two members have been well studied compared to the third member, USF3, which has received scarce attention in cancer research to date. Despite a recently reported association of its alteration with thyroid carcinoma, its expression has not been previously analyzed. METHODS: We comprehensively analyzed differential levels of USFs expression, genomic alteration, DNA methylation, and their prognostic value across different cancer types and the possible correlation with tumor-infiltrating immune cells and drug response by using different bioinformatics tools. RESULTS: Our findings established that USFs play an important role in cancers related to the urinary system and justify the necessity for further investigation. We implemented and offer a useful ShinyApp to facilitate researchers' efforts to inquire about any other gene of interest and to perform the analysis of drug response in a user-friendly fashion at http://zzdlab.com:3838/Drugdiscovery/.


Subject(s)
DNA-Binding Proteins; Neoplasms; Humans; Upstream Stimulatory Factors/genetics; Upstream Stimulatory Factors/metabolism; DNA-Binding Proteins/metabolism; Neoplasms/genetics
2.
Front Oncol ; 13: 1146463, 2023.
Article in English | MEDLINE | ID: mdl-37007080

ABSTRACT

Background: Cytokines are involved in many inflammatory diseases and thus play an important role in tumor immune regulation. In recent years, researchers have found that breast cancer is related not only to genetic and environmental factors, but also to chronic inflammation and immunity. However, the correlation between serum cytokines and blood test indicators remains unclear. Methods: A total of 84 serum samples and clinicopathological data of breast cancer patients from Tianjin Cancer Institute & Hospital, Tianjin Medical University, Tianjin, P. R. China were collected. The expression levels of the 12 cytokines were detected by immunofluorescence method. Blood test results were obtained from medical records. By stepwise Cox regression analysis, a cytokine-related gene signature was generated. Univariate and multivariate Cox regression were used to analyze the influence on the prognosis of patients. A nomogram was constructed to illustrate the cytokine-related riskscore predicting 5-year OS, which was further evaluated and validated by C-index and ROC curve. The correlation between the expression of cytokines in serum and other blood indicators was studied by using Spearman's test. Results: The riskscore was calculated as IL-4×0.99069 + TNF-α×0.03683. Patients were divided into high- and low-risk groups according to the median riskscore, and the high-risk group had a shorter survival time by log-rank test (training set, P=0.017; validation set, P=0.013). Combined with the clinical characteristics, the riskscore was found to be an independent factor for predicting the OS of breast cancer patients in both the training cohort (HR=1.2, P<0.01) and the validation cohort (HR=1.6, P=0.023). The 5-year C-index and AUC of the nomogram were 0.78 and 0.68, respectively. IL-4 was further found to be negatively correlated with ALB. Conclusion: In summary, we have developed a nomogram based on two cytokines, IL-4 and TNF-α, to predict OS of breast cancer and investigated their correlation with blood test indicators.
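
To illustrate the riskscore reported in the abstract above, here is a minimal sketch that computes the linear score from the two published coefficients and splits a cohort at its median. The patient values and function names are hypothetical, and the abstract does not state measurement units, preprocessing, or how scores exactly equal to the median were assigned, so those choices are assumptions.

```python
from statistics import median

# Coefficients as reported in the abstract.
COEF_IL4 = 0.99069
COEF_TNF_ALPHA = 0.03683


def risk_score(il4: float, tnf_alpha: float) -> float:
    """Linear riskscore: IL-4 x 0.99069 + TNF-alpha x 0.03683."""
    return COEF_IL4 * il4 + COEF_TNF_ALPHA * tnf_alpha


def stratify_by_median(scores: list[float]) -> list[str]:
    """Label a score 'high' if it exceeds the cohort median, else 'low'.

    Assigning ties to 'low' is an assumption; the abstract does not say.
    """
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


# Hypothetical (IL-4, TNF-alpha) serum values, for illustration only.
patients = [(1.0, 10.0), (2.0, 5.0), (0.5, 20.0), (3.0, 1.0)]
scores = [risk_score(il4, tnf) for il4, tnf in patients]
groups = stratify_by_median(scores)
```

In the study the two groups were then compared with a log-rank test; that step is not reproduced here.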

3.
4.
Front Genet ; 13: 772090, 2022.
Article in English | MEDLINE | ID: mdl-35281837

ABSTRACT

Objective: To identify CT imaging biomarkers based on radiomic features for predicting brain metastases (BM) in patients with ALK-rearranged non-small cell lung cancer (NSCLC). Methods: NSCLC patients with pathologically confirmed ALK rearrangement from January 2014 to December 2020 in our hospital were enrolled retrospectively in this study. Finally, 77 patients were included according to the inclusion and exclusion criteria. Patients were divided into two groups: BM+ were those patients who were diagnosed with BM at baseline examination (n = 16) or within 1 year's follow-up (n = 14), and BM- were those without BM followed up for at least 1 year (n = 47). Radiomic features were extracted from the pretreatment thoracic CT images. Sequential univariate logistic regression, LASSO regression, and backward stepwise logistic regression were used to select radiomic features and develop a BM-predicting model. Results: Five robust radiomic features were found to be independent predictors of BM. AUC for radiomics model was 0.828 (95% CI: 0.736-0.921), and when combined with clinical features, the AUC was increased (p = 0.017) to 0.909 (95% CI: 0.845-0.972). The individualized BM-predicting model incorporated with clinical features was visualized by the nomogram. Conclusion: Radiomic features extracted from pretreatment thoracic CT images have the potential to predict BM within 1 year after detection of the primary tumor in patients with ALK-rearranged NSCLC. The radiomics model incorporated with clinical features shows improved risk stratification for such patients.
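
The AUC values quoted above summarize how well a model's scores separate BM+ from BM- patients. As a rough sketch of what that number means, the following computes AUC as the fraction of positive/negative pairs in which the positive case receives the higher score (the Mann-Whitney formulation). The labels and scores are made up for illustration and are not taken from the study.

```python
def roc_auc(labels: list[int], scores: list[float]) -> float:
    """AUC by pairwise comparison; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities (1 = BM+, 0 = BM-).
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)  # 10 of 12 pairs are ordered correctly
```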

5.
Technol Cancer Res Treat ; 20: 15330338211064434, 2021.
Article in English | MEDLINE | ID: mdl-34931914

ABSTRACT

Objective: This study aimed to investigate the tolerance and pharmacokinetic characteristics of recombinant human endostatin (rh-endostatin) administered as single-dose or multiple-dose infusions in patients with advanced solid tumors. Methods: This phase I trial was designed as a single-center, single-arm, nonrandomized, open-label, dose-escalation study. The trial consisted of 2 parts: a single-dose part and a multiple-dose part, each with 3 dose comparison groups. Rh-endostatin was administered as an intravenous injection only once at a dose of 5 mg/m², 7.5 mg/m², or 10 mg/m² in the single-dose part and as a daily intravenous injection for 14 days at the same doses in the multiple-dose part. The serum pharmacokinetics, toxicity and immunogenicity of rh-endostatin were evaluated. Results: Dose-limiting toxicity (DLT) was not observed in any group. A few patients developed cardiotoxicity, such as QT prolongation or narrow arrhythmia. Other adverse events were slight coagulation abnormalities and haematological abnormalities. For rh-endostatin doses of 5 mg/m², 7.5 mg/m², and 10 mg/m², the mean Cmax values in the single-dose part were 344 ± 38.7 ng/mL, 524 ± 157 ng/mL, and 800 ± 201 ng/mL, respectively, and the average AUC0-t values were 3290 ± 3790 ng·h/mL, 4940 ± 4380 ng·h/mL, and 5050 ± 3980 ng·h/mL, respectively. The Cmax ss values of the 3 doses in the multiple-dose part were 575 ± 270 ng/mL, 531 ± 106 ng/mL, and 864 ± 166 ng/mL, respectively, and the AUC0-τ values were 3610 ± 1040 ng·h/mL, 3290 ± 1090 ng·h/mL, and 5180 ± 1210 ng·h/mL, respectively. The Cmax of a single-dose regimen showed linear kinetic characteristics. The patients in the single-dose group were negative for serum antibodies against rh-endostatin, while one patient in the multiple-dose group was positive. Conclusions: Rh-endostatin as a daily intravenous injection for 14 days in patients with advanced solid tumors is safe and well tolerated, without DLT, at doses of 5 mg/m², 7.5 mg/m², and 10 mg/m². Serum antibodies against rh-endostatin were very low after multiple infusions. For phase II trials, the recommended rh-endostatin dose is 10 mg/m² as a daily intravenous injection for 14 days.
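
The Cmax and AUC0-t figures above are standard non-compartmental quantities. The sketch below shows one common way to derive them from a concentration-time profile, using the linear trapezoidal rule. The abstract does not say which integration method the authors used, so the rule is an assumption, and the sampling times and concentrations are invented for illustration.

```python
def cmax(concentrations: list[float]) -> float:
    """Peak observed concentration (ng/mL)."""
    return max(concentrations)


def auc_trapezoid(times: list[float], concentrations: list[float]) -> float:
    """AUC from the first to the last sample by the linear trapezoidal rule.

    times in hours, concentrations in ng/mL, result in ng*h/mL.
    """
    if len(times) != len(concentrations) or len(times) < 2:
        raise ValueError("need at least two paired samples")
    total = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            raise ValueError("sampling times must be strictly increasing")
        total += dt * (concentrations[i] + concentrations[i - 1]) / 2.0
    return total


# Hypothetical single-dose profile, not data from the trial.
times_h = [0.0, 1.0, 2.0, 4.0, 8.0]
conc_ng_ml = [0.0, 300.0, 200.0, 100.0, 20.0]

peak = cmax(conc_ng_ml)
auc_0_t = auc_trapezoid(times_h, conc_ng_ml)
```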


Subject(s)
Antineoplastic Agents/administration & dosage; Antineoplastic Agents/pharmacokinetics; Endostatins/administration & dosage; Endostatins/pharmacokinetics; Neoplasms/drug therapy; Recombinant Proteins; Adult; Aged; Angiogenesis Inhibitors/administration & dosage; Angiogenesis Inhibitors/adverse effects; Angiogenesis Inhibitors/pharmacokinetics; Antineoplastic Agents/adverse effects; Disease Management; Dose-Response Relationship, Drug; Drug Monitoring; Endostatins/adverse effects; Female; Humans; Male; Middle Aged; Neoplasms/diagnosis; Neoplasms/mortality; Prognosis; Treatment Outcome
6.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-34002774

ABSTRACT

Lysine crotonylation (Kcr) is a newly discovered type of protein post-translational modification and has been reported to be involved in various pathophysiological processes. High-resolution mass spectrometry is the primary approach for identification of Kcr sites. However, experimental approaches for identifying Kcr sites are often time-consuming and expensive when compared with computational approaches. To date, several predictors for Kcr site prediction have been developed, most of which are capable of predicting crotonylation sites on either histones alone or mixed histone and nonhistone proteins together. These methods exhibit high diversity in their algorithms, encoding schemes, feature selection techniques and performance assessment strategies. However, none of them was designed for predicting Kcr sites on nonhistone proteins. Therefore, it is desirable to develop an effective predictor for identifying Kcr sites from the large amount of nonhistone sequence data. For this purpose, we first provide a comprehensive review of six methods for predicting crotonylation sites. Second, we develop a novel deep learning-based computational framework termed CNNrgb for Kcr site prediction on nonhistone proteins by integrating different types of features. We benchmark its performance against multiple commonly used machine learning classifiers (including random forest, logitboost, naïve Bayes and logistic regression) by performing both 10-fold cross-validation and an independent test. The results show that the proposed CNNrgb framework achieves the best performance with high computational efficiency on large datasets. Moreover, to facilitate users' efforts to investigate Kcr sites on human nonhistone proteins, we implement an online server called nhKcr and compare it with other existing tools to illustrate the utility and robustness of our method. The nhKcr web server and all the datasets utilized in this study are freely accessible at http://nhKcr.erc.monash.edu/.


Subject(s)
Databases, Protein; Deep Learning; Histones; Protein Processing, Post-Translational; Sequence Analysis, Protein; Software; Computational Biology; Histones/genetics; Histones/metabolism; Humans
7.
Nucleic Acids Res ; 49(10): e60, 2021 06 04.
Article in English | MEDLINE | ID: mdl-33660783

ABSTRACT

Sequence-based analysis and prediction are fundamental bioinformatic tasks that facilitate understanding of the sequence(-structure)-function paradigm for DNAs, RNAs and proteins. Rapid accumulation of sequences requires equally pervasive development of new predictive models, which depends on the availability of effective tools that support these efforts. We introduce iLearnPlus, the first machine-learning platform with graphical- and web-based interfaces for the construction of machine-learning pipelines for analysis and predictions using nucleic acid and protein sequences. iLearnPlus provides a comprehensive set of algorithms and automates sequence-based feature extraction and analysis, construction and deployment of models, assessment of predictive performance, statistical analysis, and data visualization; all without programming. iLearnPlus includes a wide range of feature sets which encode information from the input sequences and over twenty machine-learning algorithms that cover several deep-learning approaches, outnumbering the current solutions by a wide margin. Our solution caters to experienced bioinformaticians, given the broad range of options, and biologists with no programming background, given the point-and-click interface and easy-to-follow design process. We showcase iLearnPlus with two case studies concerning prediction of long noncoding RNAs (lncRNAs) from RNA transcripts and prediction of crotonylation sites in protein chains. iLearnPlus is an open-source platform available at https://github.com/Superzchen/iLearnPlus/ with the webserver at http://ilearnplus.erc.monash.edu/.


Subject(s)
Computational Biology/methods; Machine Learning; Sequence Analysis/methods; Software; Amino Acid Sequence; Animals; Base Sequence; Humans
8.
Plant Mol Biol ; 105(6): 601-610, 2021 Apr.
Article in English | MEDLINE | ID: mdl-33527202

ABSTRACT

KEY MESSAGE: We developed two CNNs for predicting ubiquitination sites in Arabidopsis thaliana, demonstrated their competitive performance, analyzed amino acid physicochemical properties and the CNN structures, and predicted ubiquitination sites in Arabidopsis. As an important posttranslational protein modification, ubiquitination plays critical roles in plant physiology, including plant growth and development, biotic and abiotic stress, metabolism, and so on. Many ubiquitination site prediction models have been developed for human, mouse and yeast. However, there are few models to predict ubiquitination sites for the plant Arabidopsis thaliana. Based on this context, we proposed two convolutional neural network (CNN) based models for predicting ubiquitination sites in A. thaliana. The two models reach AUC (area under the ROC curve) values of 0.924 and 0.913 respectively in five-fold cross-validation, and 0.921 and 0.914 respectively in an independent test, outperforming other models and demonstrating their competitive edge. We analyze in depth the amino acid physicochemical properties in the neighboring sequence regions of the ubiquitination sites, and study the influence of the CNN structure on the prediction performance. Potential ubiquitination sites in the global Arabidopsis proteome are predicted using the two CNN models. To facilitate the community, the source code, training and test datasets, and predicted ubiquitination sites in the Arabidopsis proteome are available at GitHub ( http://github.com/nongdaxiaofeng/CNNAthUbi ) for interested users.


Subject(s)
Arabidopsis/metabolism; Computational Biology/methods; Neural Networks, Computer; Ubiquitination; Amino Acids/metabolism; Animals; Humans; Mice; Protein Processing, Post-Translational; Proteome/metabolism; Software; Yeasts
9.
Mol Ther ; 29(4): 1541-1556, 2021 04 07.
Article in English | MEDLINE | ID: mdl-33412308

ABSTRACT

HER2 breast cancer (BC) remains a significant problem in patients with locally advanced or metastatic BC. We investigated the relationship between T helper 1 (Th1) immune response and the proteasomal degradation pathway (PDP), in HER2-sensitive and -resistant cells. HER2 overexpression is partially maintained because E3 ubiquitin ligase Cullin5 (CUL5), which degrades HER2, is frequently mutated or underexpressed, while the client-protective co-chaperones cell division cycle 37 (Cdc37) and heat shock protein 90 (Hsp90) are increased translating to diminished survival. The Th1 cytokine interferon (IFN)-γ caused increased CUL5 expression and marked dissociation of both Cdc37 and Hsp90 from HER2, causing significant surface loss of HER2, diminished growth, and induction of tumor senescence. In HER2-resistant mammary carcinoma, either IFN-γ or Th1-polarizing anti-HER2 vaccination, when administered with anti-HER2 antibodies, demonstrated increased intratumor CUL5 expression, decreased surface HER2, and tumor senescence with significant therapeutic activity. IFN-γ synergized with multiple HER2-targeted agents to decrease surface HER2 expression, resulting in decreased tumor growth. These data suggest a novel function of IFN-γ that regulates HER2 through the PDP pathway and provides an opportunity to impact HER2 responses through anti-tumor immunity.


Subject(s)
Breast Neoplasms/drug therapy; Cullin Proteins/genetics; Interferon-gamma/genetics; Receptor, ErbB-2/immunology; Breast Neoplasms/genetics; Breast Neoplasms/immunology; Breast Neoplasms/pathology; Cell Cycle Proteins/genetics; Cell Line, Tumor; Cellular Senescence/genetics; Cellular Senescence/immunology; Chaperonins/genetics; Cullin Proteins/immunology; Cytokines/genetics; Female; Gene Expression Regulation, Neoplastic/genetics; Gene Expression Regulation, Neoplastic/immunology; Humans; Interferon-gamma/immunology; Proteolysis; Receptor, ErbB-2/antagonists & inhibitors; Receptor, ErbB-2/genetics; Th1 Cells/drug effects; Th1 Cells/metabolism; Vaccination
10.
Technol Cancer Res Treat ; 18: 1533033819892260, 2019.
Article in English | MEDLINE | ID: mdl-31808361

ABSTRACT

BACKGROUND: Breast cancer is one of the most common malignant tumor types in women worldwide. BARD1 could impact the function of BRCA1 as its interaction partner. In the current study, we aimed to investigate the prognostic role of BARD1 expression as well as its alterations in breast cancer using different online tools. METHODS: We performed a bioinformatics analysis for BARD1 in patients with breast cancer using several online databases, including Oncomine, bc-GenExMiner, PrognoScan, Search Tool for the Retrieval of Interacting Genes, Cytoscape, and cBioPortal. RESULTS: We found that BARD1 was highly expressed in basal-like, HER2-E, and luminal B compared with normal-like subtype. Forest plot showed that BARD1 overexpression was correlated with worse distant metastasis-free survival (hazard ratio: 2.72, 95% confidence interval: 1.02-2.21; P = .0448), disease-specific survival (hazard ratio: 2.65, 95% confidence interval: 1.37-5.12; P = .0037), and disease-free survival (hazard ratio: 1.98, 95% confidence interval: 1.22-3.24; P = .0062) but positively correlated with overall survival (hazard ratio: 0.66, 95% confidence interval: 0.50-0.85; P = .0017). Multivariate analysis indicated that BARD1 expression was significantly associated with distant metastasis-free survival (hazard ratio: 4.60, 95% confidence interval: 1.22-17.28; P = .0239), whereas it was only marginally significant for disease-free survival (hazard ratio: 1.00, 95% confidence interval: 1.00-1.01; P = .0630) and disease-specific survival (hazard ratio: 1.96, 95% confidence interval: 0.97-3.96; P = .0602). Meanwhile, alterations in the BARD1 interaction network, rather than BARD1 alteration alone, were associated with worse overall survival. CONCLUSIONS: Bioinformatics analysis revealed that BARD1 may be a predictive biomarker for prognosis of breast cancer. However, future research is required to validate our findings.


Subject(s)
Breast Neoplasms/genetics; Computational Biology; Gene Expression Regulation, Neoplastic; Genetic Variation; Tumor Suppressor Proteins/genetics; Ubiquitin-Protein Ligases/genetics; Biomarkers, Tumor; Breast Neoplasms/metabolism; Breast Neoplasms/mortality; Breast Neoplasms/pathology; Computational Biology/methods; Disease Susceptibility; Female; Gene Frequency; Gene Ontology; Humans; Prognosis; Survival Analysis; Tumor Suppressor Proteins/metabolism; Ubiquitin-Protein Ligases/metabolism
11.
Endocr Relat Cancer ; 25(6): 595-605, 2018 06.
Article in English | MEDLINE | ID: mdl-29599124

ABSTRACT

ER-negative breast cancer includes the most aggressive subtypes of breast cancer such as triple negative (TN) breast cancer. Because these patients are excluded from the hormonal and targeted therapies effectively used for other subtypes of breast cancer, standard chemotherapy is one of their primary treatment options. However, as ER- patients have shown highly heterogeneous responses to different chemotherapies, it has been difficult to select the most beneficial chemotherapy treatments for them. In this study, we have simultaneously developed single drug biomarker models for four standard chemotherapy agents: paclitaxel (T), 5-fluorouracil (F), doxorubicin (A) and cyclophosphamide (C) to predict responses and survival of ER- breast cancer patients treated with combination chemotherapies. We then flexibly combined these individual drug biomarkers for predicting patient outcomes of two independent cohorts of ER- breast cancer patients who were treated with different drug combinations of neoadjuvant chemotherapy. These individual and combined drug biomarker models significantly predicted chemotherapy response for 197 ER- patients in the Hatzis cohort (AUC = 0.637, P = 0.002) and 69 ER- patients in the Hess cohort (AUC = 0.635, P = 0.056). The prediction was also significant for the TN subgroup of both cohorts (AUC = 0.60, 0.72, P = 0.043, 0.009). In survival analysis, our predicted responder patients showed significantly improved survival with a >17 months longer median PFS than the predicted non-responder patients for both ER- and TN subgroups (log-rank test P-value = 0.018 and 0.044). This flexible prediction capability based on single drug biomarkers may even allow us to select the new drug combinations most beneficial to individual patients with ER- breast cancer.


Subject(s)
Antineoplastic Agents/therapeutic use; Biomarkers, Tumor/genetics; Breast Neoplasms/drug therapy; Models, Biological; Receptors, Estrogen; Adult; Aged; Aged, 80 and over; Antineoplastic Combined Chemotherapy Protocols/therapeutic use; Breast Neoplasms/genetics; Cell Line, Tumor; Cyclophosphamide/therapeutic use; Doxorubicin/therapeutic use; Female; Fluorouracil/therapeutic use; Gene Expression Regulation, Neoplastic; Humans; Middle Aged; Neoadjuvant Therapy; Paclitaxel/therapeutic use; Young Adult
12.
PLoS One ; 7(6): e39195, 2012.
Article in English | MEDLINE | ID: mdl-22720073

ABSTRACT

Sumoylation is one of the most essential mechanisms of reversible protein post-translational modifications and is a crucial biochemical process in the regulation of a variety of important biological functions. Sumoylation is also closely involved in various human diseases. The accurate computational identification of sumoylation sites in protein sequences aids in experimental design and mechanistic research in cellular biology. In this study, we introduced amino acid hydrophobicity as a parameter into a traditional binary encoding scheme and developed a novel sumoylation site prediction tool termed SUMOhydro. With the assistance of a support vector machine, the proposed method was trained and tested using a stringent non-redundant sumoylation dataset. In a leave-one-out cross-validation, the proposed method yielded an excellent performance with a correlation coefficient, specificity, sensitivity and accuracy equal to 0.690, 98.6%, 71.1% and 97.5%, respectively. In addition, SUMOhydro has been benchmarked against previously described predictors based on an independent dataset, thereby suggesting that the introduction of hydrophobicity as an additional parameter could assist in the prediction of sumoylation sites. Currently, SUMOhydro is freely accessible at http://protein.cau.edu.cn/others/SUMOhydro/.
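
The four performance figures quoted above all derive from a confusion matrix. The sketch below shows the usual definitions, assuming that the "correlation coefficient" in the abstract refers to the Matthews correlation coefficient (MCC), which the abstract does not state explicitly. The counts are hypothetical and are not the SUMOhydro results.

```python
import math


def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict[str, float]:
    """Sensitivity, specificity, accuracy and MCC from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # fraction of true sites recovered
    specificity = tn / (tn + fp)   # fraction of non-sites correctly rejected
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when a row or column of the matrix is empty.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts on an imbalanced dataset (few positives, many negatives).
metrics = classification_metrics(tp=8, tn=90, fp=2, fn=4)
```

On imbalanced data such as this, accuracy can stay high even when sensitivity is modest, which is why MCC is often reported alongside it.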


Subject(s)
Small Ubiquitin-Related Modifier Proteins/metabolism; Amino Acid Sequence; Binding Sites; Molecular Sequence Data; ROC Curve; Sumoylation; Support Vector Machine
13.
PLoS One ; 6(7): e22930, 2011.
Article in English | MEDLINE | ID: mdl-21829559

ABSTRACT

As one of the most important reversible protein post-translational modifications, ubiquitination has been reported to be involved in many biological processes and closely implicated in various diseases. To fully decipher the molecular mechanisms of ubiquitination-related biological processes, an initial but crucial step is the recognition of ubiquitylated substrates and the corresponding ubiquitination sites. Here, a new bioinformatics tool named CKSAAP_UbSite was developed to predict ubiquitination sites from protein sequences. With the assistance of Support Vector Machine (SVM), the highlight of CKSAAP_UbSite is to employ the composition of k-spaced amino acid pairs surrounding a query site (i.e. any lysine in a query sequence) as input. When trained and tested in the dataset of yeast ubiquitination sites (Radivojac et al, Proteins, 2010, 78: 365-380), a 100-fold cross-validation on a 1:1 ratio of positive and negative samples revealed that the accuracy and MCC of CKSAAP_UbSite reached 73.40% and 0.4694, respectively. The proposed CKSAAP_UbSite has also been intensively benchmarked and exhibits better performance than some existing predictors, suggesting that it can serve as a useful tool for the community. Currently, CKSAAP_UbSite is freely accessible at http://protein.cau.edu.cn/cksaap_ubsite/. Moreover, we also found that the sequence patterns around ubiquitination sites are not conserved across different species. To ensure a reasonable prediction performance, the application of the current CKSAAP_UbSite should be limited to the proteome of yeast.
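
The encoding named in the abstract above, the composition of k-spaced amino acid pairs (CKSAAP), can be sketched as follows: for each gap size k, count every ordered residue pair separated by k positions within the sequence fragment around the query lysine, then normalize by the number of pair positions. This is a simplified illustration, not the tool's actual implementation. The 20-letter alphabet, the maximum k, the handling of non-standard characters, and the example fragment are all assumptions.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order so feature indices are stable.
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def cksaap(fragment: str, k_max: int = 2) -> list[float]:
    """Return 400 * (k_max + 1) pair frequencies for gaps k = 0..k_max."""
    features: list[float] = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        n_positions = len(fragment) - k - 1  # pair positions available at gap k
        for i in range(max(n_positions, 0)):
            pair = fragment[i] + fragment[i + k + 1]
            if pair in counts:  # pairs with non-standard residues are skipped
                counts[pair] += 1
        total = max(n_positions, 1)  # avoid dividing by zero on short fragments
        features.extend(counts[p] / total for p in PAIRS)
    return features


# Hypothetical fragment around a query lysine, for illustration only.
fragment = "MKTAYIAKQR"
vector = cksaap(fragment, k_max=1)  # 800 features: k = 0 and k = 1
```

Vectors like this one would then be passed to a classifier; the abstract reports using an SVM.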


Subject(s)
Amino Acids/metabolism; Computational Biology; Position-Specific Scoring Matrices; Proteins/chemistry; Sequence Analysis, Protein/methods; Ubiquitination; Humans; Protein Processing, Post-Translational; Support Vector Machine
14.
BMC Bioinformatics ; 9: 101, 2008 Feb 18.
Article in English | MEDLINE | ID: mdl-18282281

ABSTRACT

BACKGROUND: As one of the most common protein post-translational modifications, glycosylation is involved in a variety of important biological processes. Computational identification of glycosylation sites in protein sequences becomes increasingly important in the post-genomic era. A new encoding scheme was employed to improve the prediction of mucin-type O-glycosylation sites in mammalian proteins. RESULTS: A new protein bioinformatics tool, CKSAAP_OGlySite, was developed to predict mucin-type O-glycosylation serine/threonine (S/T) sites in mammalian proteins. Using the composition of k-spaced amino acid pairs (CKSAAP) based encoding scheme, the proposed method was trained and tested in a new and stringent O-glycosylation dataset with the assistance of Support Vector Machine (SVM). When the ratio of O-glycosylation to non-glycosylation sites in training datasets was set as 1:1, 10-fold cross-validation tests showed that the proposed method yielded a high accuracy of 83.1% and 81.4% in predicting O-glycosylated S and T sites, respectively. Based on the same datasets, CKSAAP_OGlySite resulted in a higher accuracy than the conventional binary encoding based method (about +5.0%). When trained and tested in 1:5 datasets, the CKSAAP encoding showed a more significant improvement than the binary encoding. We also merged the training datasets of S and T sites and integrated the prediction of S and T sites into one single predictor (i.e. S+T predictor). In both the 1:1 and 1:5 datasets, the performance of this S+T predictor was always slightly better than that of the predictors where S and T sites were independently predicted, suggesting that the molecular recognition of O-glycosylated S/T sites seems to be similar and that the increase of the S+T predictor's accuracy may be a result of expanded training datasets. Moreover, CKSAAP_OGlySite was also shown to have better performance when benchmarked against two existing predictors. CONCLUSION: Because of the CKSAAP encoding's ability to reflect characteristics of the sequences surrounding mucin-type O-glycosylation sites, CKSAAP_OGlySite has proved more powerful than the conventional binary encoding based method. This suggests that it can be used as a competitive mucin-type O-glycosylation site predictor for the biological community. CKSAAP_OGlySite is now available at http://bioinformatics.cau.edu.cn/zzd_lab/CKSAAP_OGlySite/.


Subject(s)
Glycosylation; Mucins/chemistry; Protein Interaction Mapping/methods; Proteins/chemistry; Sequence Analysis, Protein/methods; Amino Acid Motifs; Amino Acid Sequence; Animals; Artificial Intelligence; Binding Sites; Humans; Mammals; Molecular Sequence Data; Pattern Recognition, Automated; Protein Binding
15.
Protein Eng Des Sel ; 21(5): 295-302, 2008 May.
Article in English | MEDLINE | ID: mdl-18287176

ABSTRACT

The protein databases contain a huge number of proteins of unknown function, including many proteins with newly determined 3D structures resulting from the Structural Genomics Projects. To accelerate experiment-based assignment of function, de novo prediction of protein functional sites, like active sites in enzymes, becomes increasingly important. Here, we attempted to improve the prediction of catalytic residues in enzyme structures by seeking and refining different encodings (i.e. residue properties) as well as employing new machine learning algorithms. In particular, considering that catalytic residues can often reveal specific network centrality when representing enzyme structure as a residue contact network, the corresponding measurement (i.e. closeness centrality) was used as one of the most important encodings in our new predictor. Meanwhile, a genetic algorithm integrated neural network (GANN) was also employed. Thanks to the above strategies, our GANN predictor demonstrated a high accuracy of 91.2% in the prediction of catalytic residues based on balanced datasets (i.e. the 1:1 ratio of catalytic to non-catalytic residues). When the GANN method was optimally applied to real enzyme structures, 73.9% of the tested structures had the active site correctly located. Compared with two existing methods, the proposed GANN method also demonstrated a better performance.
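
Closeness centrality, the network measure highlighted in the abstract above, scores a node by how short its paths to all other nodes are. The following minimal sketch computes it on an unweighted contact graph with breadth-first search. The tiny graph and residue labels are invented, and the abstract does not specify how a residue contact was defined (for example, the distance cutoff), so the graph construction is left out.

```python
from collections import deque


def closeness_centrality(graph: dict[str, set[str]]) -> dict[str, float]:
    """Closeness = (reachable nodes) / (sum of shortest-path lengths to them)."""
    result: dict[str, float] = {}
    for source in graph:
        # Breadth-first search gives shortest path lengths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total = sum(dist.values())
        reachable = len(dist) - 1
        result[source] = reachable / total if total > 0 else 0.0
    return result


# Hypothetical residue contact network (undirected, so edges appear both ways).
contacts = {
    "res1": {"res2", "res3", "res4"},
    "res2": {"res1"},
    "res3": {"res1", "res4"},
    "res4": {"res1", "res3"},
}
centrality = closeness_centrality(contacts)
```

Here "res1" touches every other residue and receives the highest score, which matches the intuition that centrally placed residues stand out in this encoding.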


Subject(s)
Enzymes/chemistry; Algorithms; Binding Sites; Catalysis; Computational Biology/methods; Databases, Protein; Humans; Hydrogen Bonding; Models, Genetic; Models, Molecular; Models, Statistical; Molecular Conformation; Neural Networks, Computer; Protein Conformation; Sequence Analysis, Protein
16.
Protein Eng Des Sel ; 20(8): 405-12, 2007 Aug.
Article in English | MEDLINE | ID: mdl-17652129

ABSTRACT

With the advance of modern molecular biology it has become increasingly clear that few cellular processes are unaffected by protein phosphorylation. Therefore, computational identification of phosphorylation sites is very helpful for accelerating the functional understanding of the huge number of available protein sequences obtained from genomic and proteomic studies. Using a genetic algorithm integrated neural network (GANN), a new bioinformatics method named GANNPhos has been developed to predict phosphorylation sites in proteins. Aided by a genetic algorithm to optimize the weight values within the network, GANNPhos has demonstrated a high accuracy of 81.1, 76.7 and 73.3% in predicting phosphorylated S, T and Y sites, respectively. When benchmarked against Back-Propagation neural network and Support Vector Machine algorithms, GANNPhos gives better performance, suggesting the GANN program can be used for other prediction tasks in the field of protein bioinformatics.
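
The abstract above describes using a genetic algorithm to optimize network weights. As a generic illustration of that idea only, the sketch below evolves the three weights of a single sigmoid neuron on a toy problem using truncation selection, uniform crossover, Gaussian mutation, and elitism. None of this is taken from GANNPhos: the network size, the operators, the hyperparameters, and the toy data (a logical OR) are all assumptions.

```python
import math
import random


def predict(weights: list[float], x: tuple[float, float]) -> float:
    """Single sigmoid neuron; weights = [w1, w2, bias]."""
    z = weights[0] * x[0] + weights[1] * x[1] + weights[2]
    z = max(-60.0, min(60.0, z))  # clamp to keep exp() well within range
    return 1.0 / (1.0 + math.exp(-z))


def loss(weights: list[float], data) -> float:
    """Mean squared error over the dataset (lower is fitter)."""
    return sum((predict(weights, x) - y) ** 2 for x, y in data) / len(data)


def evolve(data, pop_size=30, generations=50, sigma=0.5, seed=0):
    """Return the best weights found and the best loss at each generation."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda w: loss(w, data))
        history.append(loss(population[0], data))
        parents = population[: pop_size // 2]   # truncation selection
        children = [population[0][:]]           # elitism: keep the best unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(genes) for genes in zip(a, b)]  # uniform crossover
            child = [w + rng.gauss(0.0, sigma) for w in child]  # Gaussian mutation
            children.append(child)
        population = children
    best = min(population, key=lambda w: loss(w, data))
    return best, history


# Toy dataset: logical OR of two binary inputs.
data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 1.0)]
best, history = evolve(data)
```

Because the best individual is always carried over, the best loss never increases from one generation to the next. Unlike back-propagation, this search needs no gradients.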


Subject(s)
Algorithms; Computational Biology/methods; Neural Networks, Computer; Amino Acid Sequence; Benchmarking; Binding Sites; Databases, Protein; Molecular Sequence Data; Phosphorylation; Predictive Value of Tests; Protein Binding; Proteins/chemistry