1.
Bioinformatics ; 38(2): 410-418, 2022 01 03.
Article in English | MEDLINE | ID: mdl-34586380

ABSTRACT

MOTIVATION: Survival analysis using gene expression profiles plays a crucial role in the interpretation of clinical research and the assessment of disease therapy programs. Several prediction models have been developed to explore the relationship between patients' covariates and survival. However, high-dimensional genomic features limit the prediction performance of survival models. Thus, an accurate and reliable prediction model is necessary for survival analysis using high-dimensional genomic data.

RESULTS: In this study, we proposed an improved survival prediction model based on the XGBoost framework, called XGBLC, which used Lasso-Cox to enhance the ability to analyze high-dimensional genomic data. Novel first- and second-order gradient statistics of Lasso-Cox were defined to construct the loss function of XGBLC. We extensively tested our XGBLC algorithm on both simulated and real-world datasets, and estimated the performance of the models with 5-fold cross-validation. On 20 cancer datasets from The Cancer Genome Atlas (TCGA), XGBLC outperformed five state-of-the-art survival methods in terms of C-index, Brier score and AUC. Comparisons on simulated datasets of different scales showed that XGBLC maintains good accuracy and robustness. The developed prediction model would help physicians understand the effects of patients' genomic characteristics on survival and make personalized treatment decisions.

AVAILABILITY AND IMPLEMENTATION: An R implementation of the XGBLC algorithm is available at: https://github.com/lab319/XGBLC.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , Neoplasms , Humans , Genomics , Neoplasms/genetics , Genome , Survival Analysis
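
The abstract of record 1 refers to first- and second-order gradient statistics of a Cox-type loss used inside a gradient-boosting framework. The sketch below is only an illustration of that general idea, not the authors' formulas: it computes the gradient and the diagonal of the Hessian of the plain Cox negative log partial likelihood (Breslow form) with respect to each subject's predicted risk score. The Lasso penalty that XGBLC adds is not reproduced here, and the function names `cox_neg_log_partial_likelihood` and `cox_grad_hess` are hypothetical.

```python
import math


def cox_neg_log_partial_likelihood(scores, times, events):
    """Negative log partial likelihood (Breslow form) of the given risk scores."""
    n = len(scores)
    total = 0.0
    for i in range(n):
        if events[i]:
            # Risk set of subject i: everyone whose time is >= times[i].
            risk_sum = sum(math.exp(scores[j]) for j in range(n) if times[j] >= times[i])
            total -= scores[i] - math.log(risk_sum)
    return total


def cox_grad_hess(scores, times, events):
    """Per-subject gradient and diagonal Hessian of the loss above.

    Assumption: only the unpenalized Cox loss is differentiated; this is a
    sketch of the kind of statistics a boosting objective needs, not XGBLC itself.
    """
    n = len(scores)
    exp_s = [math.exp(s) for s in scores]
    # risk_sums[i] = sum of exp(score) over subjects still at risk at times[i].
    risk_sums = [sum(exp_s[j] for j in range(n) if times[j] >= times[i]) for i in range(n)]
    grad, hess = [], []
    for k in range(n):
        g = -float(events[k])
        h = 0.0
        for i in range(n):
            # Subject k contributes to every event i whose risk set contains k.
            if events[i] and times[i] <= times[k]:
                p = exp_s[k] / risk_sums[i]
                g += p
                h += p * (1.0 - p)
        grad.append(g)
        hess.append(h)
    return grad, hess
```

A boosting library that accepts a custom objective would typically be handed these two lists at each iteration. The quadratic-time loops are kept for readability; a production implementation would sort by time and use cumulative sums.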
2.
Math Biosci Eng ; 20(7): 11676-11687, 2023 05 08.
Article in English | MEDLINE | ID: mdl-37501415

ABSTRACT

Most kidney cancers are kidney renal clear cell carcinoma (KIRC), a major cause of cancer-related deaths. The polygenic risk score (PRS) is a weighted linear combination of phenotype-related alleles on the genome that can be used to assess KIRC risk. However, standalone SNP data as input to the PRS model may not provide satisfactory results. Therefore, transcriptional risk scores (TRS) based on multi-omics data and machine learning models were proposed to assess the risk of KIRC. First, we collected four types of multi-omics data (DNA methylation, miRNA, mRNA and lncRNA) of KIRC patients from the TCGA database. Subsequently, a novel TRS method utilizing multiple omics data and the XGBoost model was developed. Finally, we performed prevalence analysis and prognosis prediction to evaluate the utility of the TRS generated by our method. Our TRS method exhibited better predictive performance than the linear models and other machine learning models. Furthermore, the prediction accuracy of the combined TRS model was higher than that of the single-omics TRS models. The KM curves showed that TRS was a valid prognostic indicator for cancer staging. Our proposed method extended the current definition of TRS from standalone SNP data to multi-omics data and was superior to the linear models and other machine learning models, and it may provide a useful tool for diagnostic and prognostic prediction of KIRC.


Subject(s)
Renal Cell Carcinoma , Kidney Neoplasms , MicroRNAs , Humans , Renal Cell Carcinoma/diagnosis , Renal Cell Carcinoma/genetics , Renal Cell Carcinoma/pathology , Kidney Neoplasms/diagnosis , Kidney Neoplasms/genetics , Kidney Neoplasms/pathology , MicroRNAs/genetics , Risk Factors , Kidney/pathology
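
The abstract of record 2 describes a polygenic risk score as a weighted linear combination of alleles. As a minimal worked illustration of that definition only, the sketch below sums allele dosages (0, 1 or 2 copies) multiplied by per-variant weights. The function name `risk_score`, the dosages and the weights are all made up for the example; they are not taken from the study, and the XGBoost-based TRS it proposes is not reproduced.

```python
def risk_score(dosages, weights):
    """Weighted linear combination: sum over variants of weight * allele dosage."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(w * d for w, d in zip(weights, dosages))


# Hypothetical effect-size weights for three variants.
weights = [0.2, -0.1, 0.5]

# Hypothetical allele dosages for two individuals.
person_a = [0, 1, 2]
person_b = [2, 1, 0]

score_a = risk_score(person_a, weights)  # 0*0.2 + 1*(-0.1) + 2*0.5
score_b = risk_score(person_b, weights)  # 2*0.2 + 1*(-0.1) + 0*0.5
```

With these invented numbers, the first individual receives the higher score because they carry two copies of the most heavily weighted allele.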
3.
Sci Rep ; 12(1): 10646, 2022 06 23.
Article in English | MEDLINE | ID: mdl-35739223

ABSTRACT

The potential role of DNA methylation from paracancerous tissues in cancer diagnosis has not been explored until now. In this study, we built classification models using well-known machine learning methods based on DNA methylation profiles of paracancerous tissues. We evaluated our methods on nine cancer datasets collected from The Cancer Genome Atlas (TCGA) and used fivefold cross-validation to assess the performance of the models. Additionally, we performed gene ontology (GO) enrichment analysis on the significant CpG sites selected by the feature importance scores of the XGBoost model, aiming to identify biological pathways involved in cancer progression. We also used the XGBoost algorithm to classify cancer types using DNA methylation profiles of paracancerous tissues in external validation datasets. Comparative experiments suggested that XGBoost achieved better predictive performance than the other four machine learning methods in predicting cancer stage. GO enrichment analysis revealed key pathways involved, highlighting the importance of paracancerous tissues in cancer progression. Furthermore, the XGBoost model accurately classified nine different cancers from TCGA, and the feature sets selected by XGBoost also effectively predicted seven cancer types on independent GEO datasets. This study provided new insights into cancer diagnosis from an epigenetic perspective and may facilitate the development of personalized diagnosis and treatment strategies.


Subject(s)
DNA Methylation , Neoplasms , Epigenomics , Humans , Machine Learning , Neoplasm Staging , Neoplasms/diagnosis , Neoplasms/genetics
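
The abstract of record 3 reports model performance under fivefold cross-validation. The sketch below shows one common way to produce such splits: shuffle the sample indices once, deal them into five folds, and use each fold in turn as the test set. The abstract does not say whether the authors shuffled or stratified their folds, so both the shuffling and the absence of stratification are assumptions here, and `k_fold_indices` is a hypothetical helper name.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Shuffle once with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx


splits = list(k_fold_indices(23, k=5))
```

Each sample lands in exactly one test fold, so averaging a metric over the five test folds uses every sample once for evaluation. For imbalanced class labels, a stratified split is usually preferable.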
4.
Comput Biol Med ; 121: 103761, 2020 06.
Article in English | MEDLINE | ID: mdl-32339094

ABSTRACT

Accurate diagnostic classification of cancers can greatly help physicians choose surveillance and treatment strategies for patients. Following the explosive growth of biological data, the shift from traditional biostatistical methods to computer-aided means has made machine-learning methods an integral part of today's cancer prognosis prediction. In this work, we proposed a classification model that leverages extreme gradient boosting (XGBoost) and increasingly complex multi-omics data to separate early-stage and late-stage cancers. We applied the XGBoost model to four kinds of cancer data downloaded from TCGA and compared its performance with other popular machine-learning methods. The experimental results showed that our method obtained statistically significantly better or comparable predictive performance. The results of this study also revealed that DNA methylation outperforms other molecular data (mRNA expression and miRNA expression) in terms of accuracy and stability for discriminating between early-stage and late-stage groups. Furthermore, integration of multi-omics data by an autoencoder can enhance the classification accuracy of cancer stage. Finally, we conducted bioinformatics analyses to assess the medical utility of the significant genes ranked by their importance using the XGBoost algorithm. Extensive comparative experiments demonstrated that the XGBoost method performs remarkably well in predicting the stage of cancer patients with multi-omics data. Moreover, identification of novel candidate genes associated with cancer stages would help to further elucidate disease pathogenesis and to develop novel therapeutics.


Subject(s)
MicroRNAs , Neoplasms , Algorithms , DNA Methylation , Humans , Machine Learning , MicroRNAs/genetics , Neoplasms/diagnosis , Neoplasms/genetics