Pathway importance by graph convolutional network and Shapley additive explanations in gene expression phenotype of diffuse large B-cell lymphoma.
Hayakawa, Jin; Seki, Tomohisa; Kawazoe, Yoshimasa; Ohe, Kazuhiko.
Affiliation
  • Hayakawa J; Department of Biomedical Informatics, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan.
  • Seki T; Department of Healthcare Information Management, The University of Tokyo Hospital, Tokyo, Japan.
  • Kawazoe Y; Artificial Intelligence in Healthcare, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan.
  • Ohe K; Department of Biomedical Informatics, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan.
PLoS One ; 17(6): e0269570, 2022.
Article in English | MEDLINE | ID: mdl-35749395
ABSTRACT
Deep learning techniques have recently been applied to analyze associations between gene expression data and disease phenotypes. However, there are concerns regarding the black box problem: it is difficult to determine from the model parameters why a deep learning model produces a given prediction. New methods have been proposed for interpreting deep learning model predictions but have not been applied to genetics. In this study, we demonstrated that applying SHapley Additive exPlanations (SHAP) to a deep learning model using graph convolutions of genetic pathways can provide pathway-level feature importance for classification prediction of diffuse large B-cell lymphoma (DLBCL) gene expression subtypes. Using Kyoto Encyclopedia of Genes and Genomes pathways, a graph convolutional network (GCN) model was implemented to construct graphs with nodes and edges. DLBCL datasets, including microarray gene expression data and clinical information on subtypes (germinal center B-cell-like type and activated B-cell-like type), were retrieved from the Gene Expression Omnibus to evaluate the model. The GCN model showed an accuracy of 0.914, precision of 0.948, recall of 0.868, and F1 score of 0.906 in analysis of the classification performance for the test datasets. The pathways with high feature importance by SHAP included highly enriched pathways in the gene set enrichment analysis. Moreover, a logistic regression model with explanatory variables of genes in pathways with high feature importance showed good performance in predicting DLBCL subtypes. In conclusion, our GCN model for classifying DLBCL subtypes is useful for interpreting important regulatory pathways that contribute to the prediction.
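As a quick consistency check on the reported metrics, the F1 score can be recomputed from the precision and recall given in the abstract, since F1 is their harmonic mean. The sketch below is illustrative only: the function and variable names are assumptions made here and are not taken from the authors' code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # convention when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for the test datasets.
reported_precision = 0.948
reported_recall = 0.868
reported_f1 = 0.906

computed_f1 = f1_score(reported_precision, reported_recall)
print(round(computed_f1, 3))  # prints 0.906
```

The recomputed value agrees with the reported F1 score to three decimal places.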
Subject(s)

Full text: 1 | Collection: 01-international | Database: MEDLINE | Main subject: Lymphoma, Large B-Cell, Diffuse | Study type: Prognostic_studies | Limit: Humans | Language: En | Journal: PLoS One | Journal subject: SCIENCE / MEDICINE | Year: 2022 | Document type: Article | Affiliation country: Japan