Assessment of bias in scoring of AI-based radiotherapy segmentation and planning studies using modified TRIPOD and PROBAST guidelines as an example.
Hurkmans, Coen; Bibault, Jean-Emmanuel; Clementel, Enrico; Dhont, Jennifer; van Elmpt, Wouter; Kantidakis, Georgios; Andratschke, Nicolaus.
Affiliations
  • Hurkmans C; Dept. of Radiation Oncology, Catharina Hospital Eindhoven, the Netherlands; Dept. of Electrical Engineering, Technical University Eindhoven, the Netherlands. Electronic address: coen.hurkmans@cze.nl.
  • Bibault JE; Dept. of Radiation Oncology, Hôpital Européen Georges Pompidou, Université Paris Cité, Paris, France.
  • Clementel E; European Organisation for the Research and Treatment of Cancer (EORTC), Brussels, Belgium.
  • Dhont J; Université libre de Bruxelles (ULB), Hôpital Universitaire de Bruxelles (H.U.B), Institut Jules Bordet, Department of Medical Physics, Brussels, Belgium; Université Libre De Bruxelles (ULB), Radiophysics and MRI Physics Laboratory, Brussels, Belgium.
  • van Elmpt W; Department of Radiation Oncology (MAASTRO), GROW - School for Oncology and Reproduction, Maastricht University Medical Center+, Maastricht, the Netherlands.
  • Kantidakis G; European Organisation for the Research and Treatment of Cancer (EORTC), Brussels, Belgium.
  • Andratschke N; Dept. of Radiation Oncology, University Hospital of Zurich, The University of Zurich, Zurich, Switzerland.
Radiother Oncol ; 194: 110196, 2024 May.
Article in En | MEDLINE | ID: mdl-38432311
ABSTRACT
BACKGROUND AND PURPOSE:

Studies investigating the application of Artificial Intelligence (AI) in the field of radiotherapy exhibit substantial variations in quality. The goal of this study was to assess the transparency and bias involved in scoring articles, with a specific focus on AI-based segmentation and treatment planning, using modified PROBAST and TRIPOD checklists, in order to provide recommendations for future guideline developers and reviewers.

MATERIALS AND METHODS:

The TRIPOD and PROBAST checklist items were discussed and modified using a Delphi process. After consensus was reached, 2 groups of 3 co-authors scored 2 articles to evaluate usability and further optimize the adapted checklists. Finally, 10 articles were scored by all co-authors. Fleiss' kappa was calculated to assess the reliability of agreement between observers.

RESULTS:

Three of the 37 TRIPOD items and 5 of the 32 PROBAST items were deemed irrelevant. General terminology in the items (e.g., multivariable prediction model, predictors) was modified to align with AI-specific terms. After the first scoring round, further improvements of the items were formulated, e.g., by avoiding sub-questions and subjective words and by adding clarifications on how to score an item. Using the final consensus list to score the 10 articles, only 2 of the 61 items resulted in a statistically significant kappa of 0.4 or more, demonstrating substantial agreement. For 41 items no statistically significant kappa was obtained, indicating that the observed level of agreement among multiple observers was attributable to chance alone.
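The agreement statistic reported above can be reproduced in a few lines. The sketch below is illustrative only, assuming invented ratings (10 articles scored by 7 raters on a single checklist item, with hypothetical categories 0 = "no", 1 = "yes", 2 = "unclear"); it uses the `fleiss_kappa` and `aggregate_raters` helpers from statsmodels.

```python
# Minimal sketch of a per-item Fleiss' kappa computation.
# Ratings below are fabricated for illustration, not study data.
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Rows = articles, columns = raters; each cell is the category a rater chose.
ratings = np.array([
    [1, 1, 1, 0, 1, 1, 1],
    [0, 0, 1, 0, 0, 0, 1],
    [2, 2, 2, 2, 1, 2, 2],
    [1, 0, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 2, 1, 1, 1, 1],
    [0, 1, 0, 0, 0, 1, 0],
    [2, 2, 2, 2, 2, 2, 2],
    [1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0, 0],
])

# Convert rater-wise ratings into a subject-by-category count table,
# the input format fleiss_kappa expects.
table, _ = aggregate_raters(ratings)
kappa = fleiss_kappa(table)
print(f"Fleiss' kappa: {kappa:.3f}")
```

In the study this computation would be repeated once per checklist item, with a significance test on each kappa; values of 0.4 or more were taken to indicate meaningful agreement beyond chance.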

CONCLUSION:

Our study showed low reliability scores with the adapted TRIPOD and PROBAST checklists. Although such checklists have shown great value during development and reporting, this raises concerns about the applicability of such checklists to objectively score scientific articles for AI applications. When developing or revising guidelines, it is essential to consider their applicability to score articles without introducing bias.
Subjects
Keywords

Full text: 1 Database: MEDLINE Main subjects: Radiotherapy Planning, Computer-Assisted / Artificial Intelligence / Delphi Technique / Checklist Language: En Publication year: 2024 Document type: Article