Using test-time augmentation to investigate explainable AI: inconsistencies between method, model and human intuition.
Hartog, Peter B R; Krüger, Fabian; Genheden, Samuel; Tetko, Igor V.
Affiliation
  • Hartog PBR; Molecular AI, Discovery Sciences, R&D, AstraZeneca, 431 83, Mölndal, Sweden. peter.hartog@astrazeneca.com.
  • Krüger F; Institute of Structural Biology, Helmholtz Munich, Munich, 85764, Germany.
  • Genheden S; Institute of Structural Biology, Helmholtz Munich, Munich, 85764, Germany.
  • Tetko IV; Molecular AI, Discovery Sciences, R&D, AstraZeneca, 431 83, Mölndal, Sweden.
J Cheminform ; 16(1): 39, 2024 Apr 04.
Article in En | MEDLINE | ID: mdl-38576047
ABSTRACT
Stakeholders of machine learning models desire explainable artificial intelligence (XAI) to produce human-understandable and consistent interpretations. In computational toxicity, augmentation of text-based molecular representations has been used successfully for transfer learning on downstream tasks. Augmentations of molecular representations can also be used at inference to compare differences between multiple representations of the same ground truth. In this study, we investigate the robustness of eight XAI methods using test-time augmentation for a molecular-representation model in the field of computational toxicity prediction. We report significant differences between explanations for different representations of the same ground truth, and show that randomized models have similar variance. We hypothesize that text-based molecular representations in this and past research reflect tokenization more than learned parameters. Furthermore, we see a greater variance between in-domain predictions than out-of-domain predictions, indicating that XAI measures something other than learned parameters. Finally, we investigate the relative importance given to expert-derived structural alerts and find similar importance given regardless of applicability domain, randomization, and varying training procedures. We therefore caution future research against validating their methods solely by comparison to human intuition without further investigation.

SCIENTIFIC CONTRIBUTION
In this research we critically investigate XAI through test-time augmentation, contrasting previous assumptions about using expert validation and showing inconsistencies within models for identical representations. SMILES augmentation has been used to increase model accuracy, but was here adapted from the field of image test-time augmentation to be used as an independent indication of the consistency within SMILES-based molecular representation models.
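The core idea of comparing explanations across augmented representations can be sketched in a few lines. The snippet below is a minimal, hypothetical illustration (not the authors' implementation): it assumes per-atom attribution vectors have already been obtained from an XAI method for several randomized SMILES of the same molecule, aligned to a common atom ordering, and scores their inconsistency as the mean per-atom standard deviation. All numbers and names are illustrative.

```python
from statistics import pstdev

# Hypothetical per-atom attributions from an XAI method, one list per
# randomized SMILES string of the same molecule. Each position is the
# same atom under a shared canonical ordering; values are made up.
attributions = [
    [0.10, 0.55, 0.05, 0.30],  # canonical SMILES
    [0.20, 0.40, 0.10, 0.30],  # augmented SMILES #1
    [0.05, 0.60, 0.15, 0.20],  # augmented SMILES #2
]

def inconsistency_score(attr_sets):
    """Mean per-atom standard deviation across augmented explanations.

    Lower values mean the XAI method assigns each atom a similar
    importance no matter how the molecule happens to be written."""
    n_atoms = len(attr_sets[0])
    per_atom_std = [pstdev([a[i] for a in attr_sets]) for i in range(n_atoms)]
    return sum(per_atom_std) / n_atoms

score = inconsistency_score(attributions)
print(f"explanation inconsistency: {score:.3f}")
```

A perfectly consistent method would score 0.0 here; the paper's observation is that real explanations for augmented SMILES of the same molecule do not approach that ideal.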
Full text: 1 Databases: MEDLINE Language: En Journal: J Cheminform Publication year: 2024 Document type: Article Affiliation country: Sweden