Recommendations for the development and use of imaging test sets to investigate the test performance of artificial intelligence in health screening.
Chalkidou, Anastasia; Shokraneh, Farhad; Kijauskaite, Goda; Taylor-Phillips, Sian; Halligan, Steve; Wilkinson, Louise; Glocker, Ben; Garrett, Peter; Denniston, Alastair K; Mackie, Anne; Seedat, Farah.
Affiliations
  • Chalkidou A; King's Technology Evaluation Centre, King's College London, London, UK. Electronic address: anastasia.chalkidou@nice.org.uk.
  • Shokraneh F; King's Technology Evaluation Centre, King's College London, London, UK.
  • Kijauskaite G; UK National Screening Committee, Office for Health Improvement and Disparities, Department of Health and Social Care, London, UK.
  • Taylor-Phillips S; Warwick Medical School, University of Warwick, Coventry, UK.
  • Halligan S; Centre for Medical Imaging, Division of Medicine, University College London, London, UK.
  • Wilkinson L; Oxford Breast Imaging Centre, Oxford University, Oxford, UK.
  • Glocker B; Department of Computing, Imperial College London, London, UK.
  • Garrett P; Department of Chemical Engineering and Analytical Science, University of Manchester, Manchester, UK.
  • Denniston AK; Department of Ophthalmology, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK.
  • Mackie A; UK National Screening Committee, Office for Health Improvement and Disparities, Department of Health and Social Care, London, UK.
  • Seedat F; UK National Screening Committee, Office for Health Improvement and Disparities, Department of Health and Social Care, London, UK.
Lancet Digit Health; 4(12): e899-e905, 2022 Dec.
Article in En | MEDLINE | ID: mdl-36427951
ABSTRACT
Rigorous evaluation of artificial intelligence (AI) systems for image classification is essential before deployment into health-care settings, such as screening programmes, so that adoption is effective and safe. A key step in the evaluation process is the external validation of diagnostic performance using a test set of images. We conducted a rapid literature review on methods to develop test sets, published from 2012 to 2020, in English. Using thematic analysis, we mapped themes and coded the principles using the Population, Intervention, and Comparator or Reference standard, Outcome, and Study design framework. A group of screening and AI experts assessed the evidence-based principles for completeness and provided further considerations. Of the final 15 principles recommended here, five affect population, one intervention, two comparator, one reference standard, and one both reference standard and comparator. Finally, four are applicable to outcome and one to study design. Principles from the literature were useful to address biases from AI; however, they did not account for screening-specific biases, which we now incorporate. The principles set out here should be used to support the development and use of test sets for studies that assess the accuracy of AI within screening programmes, to ensure they are fit for purpose and minimise bias.

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Artificial Intelligence / Diagnostic Imaging Type of study: Diagnostic_studies / Screening_studies Language: En Journal: Lancet Digit Health Year of publication: 2022 Document type: Article