Search | BVS Integralidade em Saúde

1.

Reproducibility of radiomics quality score: an intra- and inter-rater reliability study.

Akinci D'Antonoli, Tugba; Cavallo, Armando Ugo; Vernuccio, Federica; Stanzione, Arnaldo; Klontzas, Michail E; Cannella, Roberto; Ugga, Lorenzo; Baran, Agah; Fanni, Salvatore Claudio; Petrash, Ekaterina; Ambrosini, Ilaria; Cappellini, Luca Alessandro; van Ooijen, Peter; Kotter, Elmar; Pinto Dos Santos, Daniel; Cuocolo, Renato.

Eur Radiol ; 34(4): 2791-2804, 2024 Apr.

Article in English | MEDLINE | ID: mdl-37733025

ABSTRACT

OBJECTIVES: To investigate the intra- and inter-rater reliability of the total radiomics quality score (RQS) and the reproducibility of individual RQS items' score in a large multireader study. METHODS: Nine raters with different backgrounds were randomly assigned to three groups based on their proficiency with RQS utilization: groups 1 and 2 represented the inter-rater reliability groups with or without prior training in RQS, respectively; group 3 represented the intra-rater reliability group. Thirty-three original research papers on radiomics were evaluated by raters of groups 1 and 2. Of the 33 papers, 17 were evaluated twice with an interval of 1 month by raters of group 3. The intraclass correlation coefficient (ICC) was used for continuous variables, and Fleiss' and Cohen's kappa (k) statistics for categorical variables. RESULTS: The inter-rater reliability was poor to moderate for total RQS (ICC 0.30-0.55, p < 0.001) and very low to good for item's reproducibility (k - 0.12 to 0.75) within groups 1 and 2 for both inexperienced and experienced raters. The intra-rater reliability for total RQS was moderate for the less experienced rater (ICC 0.522, p = 0.009), whereas experienced raters showed excellent intra-rater reliability (ICC 0.91-0.99, p < 0.001) between the first and second read. Intra-rater reliability on RQS items' score reproducibility was higher and most of the items had moderate to good intra-rater reliability (k - 0.40 to 1). CONCLUSIONS: Reproducibility of the total RQS and the score of individual RQS items is low. There is a need for a robust and reproducible assessment method to assess the quality of radiomics research. CLINICAL RELEVANCE STATEMENT: There is a need for reproducible scoring systems to improve quality of radiomics research and consecutively close the translational gap between research and clinical implementation.
KEY POINTS:
• Radiomics quality score has been widely used for the evaluation of radiomics studies.
• Although the intra-rater reliability was moderate to excellent, intra- and inter-rater reliability of total score and point-by-point scores were low with radiomics quality score.
• A robust, easy-to-use scoring system is needed for the evaluation of radiomics research.
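For readers unfamiliar with the agreement statistic named in this abstract, the following is a minimal sketch of Cohen's kappa for two sets of categorical ratings (for example, one rater's first and second read of the same items). The function name and the example ratings are hypothetical and are not taken from the study; the formula is the standard one, kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement and p_e is agreement expected by chance.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equally long sequences of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of items given the same category twice.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the marginal frequencies, summed per category.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    categories = count_a.keys() | count_b.keys()
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both sequences use a single identical category; agreement is perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical example: one binary RQS item scored for ten papers on two reads.
first_read = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
second_read = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
print(f"kappa = {cohen_kappa(first_read, second_read):.2f}")
```

With more than two raters, as in the inter-rater groups, Fleiss' kappa would be the corresponding statistic; it is not shown here.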

Subjects

Radiomics; Reading; Humans; Observer Variation; Reproducibility of Results

2.

Ovarian imaging radiomics quality score assessment: an EuSoMII radiomics auditing group initiative.

Ponsiglione, Andrea; Stanzione, Arnaldo; Spadarella, Gaia; Baran, Agah; Cappellini, Luca Alessandro; Lipman, Kevin Groot; Van Ooijen, Peter; Cuocolo, Renato.

Eur Radiol ; 33(3): 2239-2247, 2023 Mar.

Article in English | MEDLINE | ID: mdl-36303093

ABSTRACT

OBJECTIVE: To evaluate the methodological rigor of radiomics-based studies using noninvasive imaging in the ovarian setting. METHODS: Multiple medical literature archives (PubMed, Web of Science, and Scopus) were searched to retrieve original studies focused on computed tomography (CT), magnetic resonance imaging (MRI), ultrasound (US), or positron emission tomography (PET) radiomics for ovarian disorders' assessment. Two researchers in consensus evaluated each investigation using the radiomics quality score (RQS). Subgroup analyses were performed to assess whether the total RQS varied according to first author category, study aim and topic, imaging modality, and journal quartile. RESULTS: From a total of 531 items, 63 investigations were finally included in the analysis. The studies were greatly focused (94%) on the field of oncology, with CT representing the most used imaging technique (41%). Overall, the papers achieved a median total RQS of 6 (IQR, -0.5 to 11), corresponding to a percentage of 16.7% of the maximum score (IQR, 0-30.6%). The scoring was low especially due to the lack of prospective design and formal validation of the results. At subgroup analysis, the 4 studies not focused on oncological topics showed significantly lower quality scores than the others. CONCLUSIONS: The overall methodological rigor of radiomics studies in the ovarian field is still not ideal, limiting the reproducibility of results and potential translation to the clinical setting. More efforts towards a standardized methodology in the workflow are needed to allow radiomics to become a viable tool for clinical decision-making.
KEY POINTS:
• The 63 included studies using noninvasive imaging for ovarian applications were mostly focused on oncologic topics (94%).
• The included investigations achieved a median total RQS of 6 (IQR, -0.5 to 11), indicating poor methodological rigor.
• The RQS was low especially due to the lack of prospective design and formal validation of the results.
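The percentages reported alongside the RQS totals can be reproduced with a short calculation. The sketch below assumes the commonly cited RQS maximum of 36 points and assumes that negative totals are reported as 0%; neither assumption is stated in the abstract itself, although both are consistent with the figures it reports. The function name is hypothetical.

```python
RQS_MAX_POINTS = 36  # assumption: maximum attainable RQS total


def rqs_percentage(total_score, max_points=RQS_MAX_POINTS):
    """Express an RQS total as a percentage of the maximum score.

    Negative totals are floored at 0%, an assumption that matches the
    lower IQR bound quoted in the abstract.
    """
    return max(0.0, total_score) / max_points * 100.0


# Median and IQR bounds quoted in the abstract.
for score in (6, -0.5, 11):
    print(f"RQS {score}: {rqs_percentage(score):.1f}%")
```

Under these assumptions the three values map to 16.7%, 0.0%, and 30.6%, matching the percentages given in the abstract.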

Subjects

Magnetic Resonance Imaging; Tomography, X-Ray Computed; Humans; Reproducibility of Results; Tomography, X-Ray Computed/methods; Magnetic Resonance Imaging/methods; Positron-Emission Tomography; Ultrasonography

3.

Quantification of cauda equina nerve root dispersion through radiomics features in weight-bearing MRI in normal subjects and spinal canal stenosis patients.

Levi, Riccardo; Battaglia, Massimiliano; Garoli, Federico; Cappellini, Luca Alessandro; De Robertis, Mario; Anselmi, Leonardo; Savini, Giovanni; Riva, Marco; Fornari, Maurizio; Grimaldi, Marco; Politi, Letterio S.

Eur Radiol ; 2023 Dec 07.

Article in English | MEDLINE | ID: mdl-38057593

ABSTRACT

OBJECTIVE: To quantify the distribution of cauda equina nerve roots in supine and upright positions using manual measurements and radiomics features both in normal subjects and in lumbar spinal canal stenosis (LSCS) patients. METHODS: We retrospectively recruited patients who underwent weight-bearing MRI in supine and upright positions for back pain. 3D T2-weighted isotropic acquisition (3D-HYCE) sequences were used to develop a 3D convolutional neural network for identification and segmentation of lumbar vertebrae. Para-axial reformatted images perpendicular to the spinal canal and parallel to each vertebral endplate were automatically extracted. From each level, we computed the maximum antero-posterior (AP) and latero-lateral (LL) dispersion of nerve roots; further, radiomics features were extracted to quantify standardized metrics of nerve root distribution. RESULTS: We included 16 patients with LSCS and 20 normal subjects. In normal subjects, nerve root AP dispersion significantly increased from supine to upright position (p < 0.001, L2-L5 levels), and radiomics features showed an increase in non-uniformity. In LSCS subjects, in the upright position AP dispersion of nerve roots and entropy-related features increased caudally to the stenosis level (p < 0.001) and decreased cranially (p < 0.001). Moreover, entropy-related radiomics features negatively correlated with pre-operative Pain Numerical Rating Scale. Comparison between normal subjects and LSCS patients showed a difference in AP dispersion and increase of variance cranially to the stenosis level (p < 0.001) in the upright position. CONCLUSIONS: Nerve root distribution inside the dural sac changed between supine and upright positions, and radiomics features were able to quantify the differences between normal and LSCS subjects. 
CLINICAL RELEVANCE STATEMENT: The distribution of cauda equina nerve roots and the redundant nerve root sign significantly vary between supine and upright positions in normal subjects and spinal canal stenosis patients, respectively. Radiomics features quantify nerve root dispersion and correlate with pain severity.
KEY POINTS:
• Weight-bearing MRI depicts spatial distribution of the cauda equina in both supine and upright positions in normal subjects and spinal stenosis patients.
• Radiomics features can quantify the effects of spinal stenosis on the dispersion of the cauda equina in the dural sac.
• In the orthostatic position, dispersion of nerve roots is different in lumbar spinal stenosis patients compared to that in normal subjects; entropy-related features negatively correlated with pre-operative Pain Numerical Rating Scale.

4.

METhodological RadiomICs Score (METRICS): a quality scoring tool for radiomics research endorsed by EuSoMII.

Kocak, Burak; Akinci D'Antonoli, Tugba; Mercaldo, Nathaniel; Alberich-Bayarri, Angel; Baessler, Bettina; Ambrosini, Ilaria; Andreychenko, Anna E; Bakas, Spyridon; Beets-Tan, Regina G H; Bressem, Keno; Buvat, Irene; Cannella, Roberto; Cappellini, Luca Alessandro; Cavallo, Armando Ugo; Chepelev, Leonid L; Chu, Linda Chi Hang; Demircioglu, Aydin; deSouza, Nandita M; Dietzel, Matthias; Fanni, Salvatore Claudio; Fedorov, Andrey; Fournier, Laure S; Giannini, Valentina; Girometti, Rossano; Groot Lipman, Kevin B W; Kalarakis, Georgios; Kelly, Brendan S; Klontzas, Michail E; Koh, Dow-Mu; Kotter, Elmar; Lee, Ho Yun; Maas, Mario; Marti-Bonmati, Luis; Müller, Henning; Obuchowski, Nancy; Orlhac, Fanny; Papanikolaou, Nikolaos; Petrash, Ekaterina; Pfaehler, Elisabeth; Pinto Dos Santos, Daniel; Ponsiglione, Andrea; Sabater, Sebastià; Sardanelli, Francesco; Seeböck, Philipp; Sijtsema, Nanna M; Stanzione, Arnaldo; Traverso, Alberto; Ugga, Lorenzo; Vallières, Martin; van Dijk, Lisanne V.

Insights Imaging ; 15(1): 8, 2024 Jan 17.

Article in English | MEDLINE | ID: mdl-38228979

ABSTRACT

PURPOSE: To propose a new quality scoring tool, METhodological RadiomICs Score (METRICS), to assess and improve research quality of radiomics studies. METHODS: We conducted an online modified Delphi study with a group of international experts. It was performed in three consecutive stages: Stage#1, item preparation; Stage#2, panel discussion among EuSoMII Auditing Group members to identify the items to be voted; and Stage#3, four rounds of the modified Delphi exercise by panelists to determine the items eligible for the METRICS and their weights. The consensus threshold was 75%. Based on the median ranks derived from expert panel opinion and their rank-sum based conversion to importance scores, the category and item weights were calculated. RESULT: In total, 59 panelists from 19 countries participated in selection and ranking of the items and categories. Final METRICS tool included 30 items within 9 categories. According to their weights, the categories were in descending order of importance: study design, imaging data, image processing and feature extraction, metrics and comparison, testing, feature processing, preparation for modeling, segmentation, and open science. A web application and a repository were developed to streamline the calculation of the METRICS score and to collect feedback from the radiomics community. CONCLUSION: In this work, we developed a scoring tool for assessing the methodological quality of the radiomics research, with a large international panel and a modified Delphi protocol. With its conditional format to cover methodological variations, it provides a well-constructed framework for the key methodological concepts to assess the quality of radiomic research papers. 
CRITICAL RELEVANCE STATEMENT: A quality assessment tool, METhodological RadiomICs Score (METRICS), is made available by a large group of international domain experts, with transparent methodology, aiming at evaluating and improving research quality in radiomics and machine learning.
KEY POINTS:
• A methodological scoring tool, METRICS, was developed for assessing the quality of radiomics research, with a large international expert panel and a modified Delphi protocol.
• The proposed scoring tool presents expert opinion-based importance weights of categories and items with a transparent methodology for the first time.
• METRICS accounts for varying use cases, from handcrafted radiomics to entirely deep learning-based pipelines.
• A web application has been developed to help with the calculation of the METRICS score ( https://metricsscore.github.io/metrics/METRICS.html ) and a repository created to collect feedback from the radiomics community ( https://github.com/metricsscore/metrics ).
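The abstract mentions converting median expert ranks into importance weights through a rank-sum approach but does not give the formula. As an illustration only, the sketch below uses the textbook rank-sum weighting, w_i = (n - r_i + 1) / sum_j (n - r_j + 1), applied to median ranks; this may differ from the exact procedure used for METRICS. The function name, the three category labels chosen, and all rank values are hypothetical.

```python
from statistics import median


def rank_sum_weights(panel_ranks):
    """Convert per-panelist ranks (1 = most important) into normalized weights.

    panel_ranks maps each item to the list of ranks it received.
    Uses the textbook rank-sum rule on median ranks (an assumption here).
    """
    n = len(panel_ranks)
    median_ranks = {item: median(ranks) for item, ranks in panel_ranks.items()}
    # A better (lower) median rank yields a larger raw importance score.
    scores = {item: n - rank + 1 for item, rank in median_ranks.items()}
    total = sum(scores.values())
    return {item: score / total for item, score in scores.items()}


# Hypothetical ranks from three panelists for three categories.
example_ranks = {
    "study design": [1, 1, 2],
    "segmentation": [2, 3, 1],
    "open science": [3, 2, 3],
}
for item, weight in rank_sum_weights(example_ranks).items():
    print(f"{item}: {weight:.3f}")
```

The weights sum to 1 by construction, so they can be read directly as each category's share of the total score.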

5.

Assessing Trustworthy AI in Times of COVID-19: Deep Learning for Predicting a Multiregional Score Conveying the Degree of Lung Compromise in COVID-19 Patients.

Allahabadi, Himanshi; Amann, Julia; Balot, Isabelle; Beretta, Andrea; Binkley, Charles; Bozenhard, Jonas; Bruneault, Frederick; Brusseau, James; Candemir, Sema; Cappellini, Luca Alessandro; Chakraborty, Subrata; Cherciu, Nicoleta; Cociancig, Christina; Coffee, Megan; Ek, Irene; Espinosa-Leal, Leonardo; Farina, Davide; Fieux-Castagnet, Genevieve; Frauenfelder, Thomas; Gallucci, Alessio; Giuliani, Guya; Golda, Adam; van Halem, Irmhild; Hildt, Elisabeth; Holm, Sune; Kararigas, Georgios; Krier, Sebastien A; Kuhne, Ulrich; Lizzi, Francesca; Madai, Vince I; Markus, Aniek F; Masis, Serg; Mathez, Emilie Wiinblad; Mureddu, Francesco; Neri, Emanuele; Osika, Walter; Ozols, Matiss; Panigutti, Cecilia; Parent, Brendan; Pratesi, Francesca; Moreno-Sanchez, Pedro A; Sartor, Giovanni; Savardi, Mattia; Signoroni, Alberto; Sormunen, Hanna-Maria; Spezzatti, Andy; Srivastava, Adarsh; Stephansen, Annette F; Theng, Lau Bee; Tithi, Jesmin Jahan.

IEEE Trans Technol Soc ; 3(4): 272-289, 2022 Dec.

Article in English | MEDLINE | ID: mdl-36573115

ABSTRACT

This article's main contributions are twofold: 1) to demonstrate how to apply the general European Union's High-Level Expert Group's (EU HLEG) guidelines for trustworthy AI in practice in the domain of healthcare and 2) to investigate the research question of what "trustworthy AI" means at the time of the COVID-19 pandemic. To this end, we present the results of a post-hoc self-assessment to evaluate the trustworthiness of an AI system for predicting a multiregional score conveying the degree of lung compromise in COVID-19 patients, developed and verified by an interdisciplinary team with members from academia, public hospitals, and industry during the pandemic. The AI system aims to help radiologists estimate and communicate the severity of damage in a patient's lung from chest X-rays. It has been experimentally deployed in the radiology department of the ASST Spedali Civili clinic in Brescia, Italy, since December 2020, during the pandemic. The methodology we have applied for our post-hoc assessment, called Z-Inspection®, uses sociotechnical scenarios to identify ethical, technical, and domain-specific issues in the use of the AI system in the context of the pandemic.
