Results 1 - 20 of 143
1.
Insights Imaging ; 15(1): 248, 2024 Oct 14.
Article in English | MEDLINE | ID: mdl-39400639

ABSTRACT

Various healthcare domains have witnessed successful preliminary implementation of artificial intelligence (AI) solutions, including radiology, though limited generalizability hinders their widespread adoption. Currently, most research groups and industry have limited access to the data needed for external validation studies. The creation and accessibility of benchmark datasets to validate such solutions represents a critical step towards generalizability, for which an array of aspects ranging from preprocessing to regulatory issues and biostatistical principles come into play. In this article, the authors provide recommendations for the creation of benchmark datasets in radiology, explain current limitations in this realm, and explore potential new approaches. CLINICAL RELEVANCE STATEMENT: Benchmark datasets, facilitating validation of AI software performance, can contribute to the adoption of AI in clinical practice. KEY POINTS: Benchmark datasets are essential for the validation of AI software performance. Factors like image quality and representativeness of cases should be considered. Benchmark datasets can help adoption by increasing the trustworthiness and robustness of AI.

2.
Artif Intell Med ; 156: 102952, 2024 10.
Article in English | MEDLINE | ID: mdl-39180925

ABSTRACT

The advent of computer vision technology and increased usage of video cameras in clinical settings have facilitated advancements in movement disorder analysis. This review investigated these advancements in terms of providing practical, low-cost solutions for the diagnosis and analysis of movement disorders, such as Parkinson's disease (PD), ataxia, dyskinesia, and Tourette syndrome. Traditional diagnostic methods for movement disorders are typically reliant on the subjective assessment of motor symptoms, which poses inherent challenges. Furthermore, early symptoms are often overlooked, and overlapping symptoms across diseases can complicate early diagnosis. Consequently, deep learning has been used for the objective video-based analysis of movement disorders. This study systematically reviewed the latest advancements in automatic two-dimensional and three-dimensional video analysis using deep learning for movement disorders. We comprehensively analyzed the literature published until September 2023 by searching the Web of Science, PubMed, Scopus, and Embase databases. We identified 68 relevant studies and extracted information on their objectives, datasets, modalities, and methodologies. The study aimed to identify, catalogue, and present the most significant advancements, offering a consolidated knowledge base on the role of video analysis and deep learning in movement disorder analysis. First, the objectives, including specific PD symptom quantification, ataxia assessment, cerebral palsy assessment, gait disorder analysis, tremor assessment, tic detection (in the context of Tourette syndrome), dystonia assessment, and abnormal movement recognition were discussed. Thereafter, the datasets used in the studies were examined. Subsequently, video modalities and deep learning methodologies related to the topic were investigated. Finally, the challenges and opportunities in terms of datasets, interpretability, evaluation methods, and home/remote monitoring were discussed.


Subjects
Deep Learning , Movement Disorders , Video Recording , Humans , Imaging, Three-Dimensional/methods , Movement Disorders/diagnosis , Movement Disorders/physiopathology
3.
Article in English | MEDLINE | ID: mdl-39147208

ABSTRACT

PURPOSE: Conventional normal tissue complication probability (NTCP) models for patients with head and neck cancer are typically based on single-value variables, which, for radiation-induced xerostomia, are baseline xerostomia and mean salivary gland doses. This study aimed to improve the prediction of late xerostomia by using 3-dimensional information from radiation dose distributions, computed tomography imaging, organ-at-risk segmentations, and clinical variables with deep learning (DL). METHODS AND MATERIALS: An international cohort of 1208 patients with head and neck cancer from 2 institutes was used to train and twice validate DL models (deep convolutional neural network, EfficientNet-v2, and ResNet) with 3-dimensional dose distribution, computed tomography scan, organ-at-risk segmentations, baseline xerostomia score, sex, and age as input. The NTCP endpoint was moderate-to-severe xerostomia 12 months post-radiation therapy. The DL models' prediction performance was compared with a reference model: a recently published xerostomia NTCP model that used baseline xerostomia score and mean salivary gland doses as input. Attention maps were created to visualize the focus regions of the DL predictions. Transfer learning was conducted to improve the DL model performance on the external validation set. RESULTS: All DL-based NTCP models showed better performance (area under the receiver operating characteristic curve [AUC] 0.78-0.79) than the reference NTCP model (AUC 0.74) in the independent test. Attention maps showed that the DL model focused on the major salivary glands, particularly the stem cell-rich region of the parotid glands. DL models obtained lower external validation performance (AUC 0.63) than the reference model (AUC 0.66). After transfer learning on a small external subset, the DL model (AUC 0.66) performed better than the reference model (AUC 0.64).
CONCLUSION: DL-based NTCP models performed better than the reference model when validated in data from the same institute. Improved performance in the external data set was achieved with transfer learning, demonstrating the need for multicenter training data to realize generalizable DL-based NTCP models.
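The AUC comparison between a DL-based NTCP model and a reference model can be sketched in a few lines; the labels and scores below are synthetic placeholders, not study data:

```python
# Illustrative sketch (not the authors' code): comparing two NTCP models by
# area under the ROC curve (AUC), as in the study's independent test.
from sklearn.metrics import roc_auc_score

# 1 = moderate-to-severe xerostomia at 12 months, 0 = not (synthetic labels)
y_true = [0, 0, 0, 1, 1, 0, 1, 0, 1, 1]
dl_scores = [0.1, 0.2, 0.3, 0.8, 0.7, 0.65, 0.9, 0.2, 0.6, 0.8]   # hypothetical DL NTCP output
ref_scores = [0.2, 0.3, 0.5, 0.6, 0.4, 0.5, 0.8, 0.3, 0.5, 0.7]   # hypothetical reference output

auc_dl = roc_auc_score(y_true, dl_scores)
auc_ref = roc_auc_score(y_true, ref_scores)
print(f"AUC DL model:        {auc_dl:.2f}")
print(f"AUC reference model: {auc_ref:.2f}")
```

On real cohorts a confidence interval (e.g. by bootstrapping) would accompany each AUC before declaring one model better.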

7.
Radiother Oncol ; 197: 110368, 2024 08.
Article in English | MEDLINE | ID: mdl-38834153

ABSTRACT

BACKGROUND AND PURPOSE: To optimize our previously proposed TransRP, a model integrating CNN (convolutional neural network) and ViT (Vision Transformer) designed for recurrence-free survival prediction in oropharyngeal cancer, and to extend its application to the prediction of multiple clinical outcomes, including locoregional control (LRC), distant metastasis-free survival (DMFS) and overall survival (OS). MATERIALS AND METHODS: Data were collected from 400 patients (300 for training and 100 for testing) diagnosed with oropharyngeal squamous cell carcinoma (OPSCC) who underwent (chemo)radiotherapy at University Medical Center Groningen. Each patient's data comprised pre-treatment PET/CT scans, clinical parameters, and clinical outcome endpoints, namely LRC, DMFS and OS. The prediction performance of TransRP was compared with CNNs when inputting image data only. Additionally, three distinct methods (m1-3) of incorporating clinical predictors into TransRP training and one method (m4) that uses TransRP prediction as one parameter in a clinical Cox model were compared. RESULTS: TransRP achieved higher test C-index values of 0.61, 0.84 and 0.70 than CNNs for LRC, DMFS and OS, respectively. Furthermore, when incorporating TransRP's prediction into a clinical Cox model (m4), a higher C-index of 0.77 for OS was obtained. Compared with a clinical routine risk stratification model of OS, our model, using clinical variables, radiomics and TransRP prediction as predictors, achieved larger separations of survival curves between low, intermediate and high risk groups. CONCLUSION: TransRP outperformed CNN models for all endpoints. Combining clinical data and TransRP prediction in a Cox model achieved better OS prediction.
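The C-index reported above measures, over all comparable patient pairs, how often the model assigns the higher risk to the patient who fails first. A minimal pure-Python sketch (not the authors' implementation; times, events and risks are synthetic):

```python
# Illustrative sketch: Harrell's concordance index (C-index) for a survival
# endpoint such as OS, counting concordant pairs among comparable ones.
import itertools

def c_index(times, events, risk_scores):
    """Fraction of comparable pairs where the higher-risk patient fails first."""
    concordant, comparable = 0.0, 0
    for i, j in itertools.combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied times are skipped in this simple version
        first, later = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier time is censored: the pair is not comparable
        comparable += 1
        if risk_scores[first] > risk_scores[later]:
            concordant += 1.0
        elif risk_scores[first] == risk_scores[later]:
            concordant += 0.5
    return concordant / comparable

# Synthetic follow-up times (months), event indicators, and model risk scores
times = [10, 24, 36, 48, 60]
events = [1, 1, 0, 1, 0]
risks = [0.9, 0.5, 0.3, 0.6, 0.1]
print(f"C-index: {c_index(times, events, risks):.2f}")
```

Libraries such as lifelines provide an equivalent, tie-aware implementation for real analyses.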


Subjects
Oropharyngeal Neoplasms , Positron Emission Tomography Computed Tomography , Humans , Oropharyngeal Neoplasms/mortality , Oropharyngeal Neoplasms/diagnostic imaging , Oropharyngeal Neoplasms/pathology , Oropharyngeal Neoplasms/radiotherapy , Oropharyngeal Neoplasms/therapy , Positron Emission Tomography Computed Tomography/methods , Male , Female , Middle Aged , Aged , Neural Networks, Computer , Adult
8.
Eur Radiol Exp ; 8(1): 63, 2024 May 20.
Article in English | MEDLINE | ID: mdl-38764066

ABSTRACT

BACKGROUND: Emphysema influences the appearance of lung tissue in computed tomography (CT). We evaluated whether this affects lung nodule detection by artificial intelligence (AI) and human readers (HR). METHODS: Individuals were selected from the "Lifelines" cohort who had undergone low-dose chest CT. Nodules in individuals without emphysema were matched to similar-sized nodules in individuals with at least moderate emphysema. AI results for nodular findings of 30-100 mm3 and 101-300 mm3 were compared to those of HR; two expert radiologists blindly reviewed discrepancies. Sensitivity and false positives (FPs)/scan were compared for emphysema and non-emphysema groups. RESULTS: Thirty-nine participants with and 82 without emphysema were included (n = 121, aged 61 ± 8 years (mean ± standard deviation), 58/121 males (47.9%)). AI and HR detected 196 and 206 nodular findings, respectively, yielding 109 concordant nodules and 184 discrepancies, including 118 true nodules. For AI, sensitivity was 0.68 (95% confidence interval 0.57-0.77) in emphysema versus 0.71 (0.62-0.78) in non-emphysema, with FPs/scan 0.51 and 0.22, respectively (p = 0.028). For HR, sensitivity was 0.76 (0.65-0.84) and 0.80 (0.72-0.86), with FPs/scan of 0.15 and 0.27 (p = 0.230). Overall sensitivity was slightly higher for HR than for AI, but this difference disappeared after the exclusion of benign lymph nodes. FPs/scan were higher for AI in emphysema than in non-emphysema (p = 0.028), while FPs/scan for HR were higher than AI for 30-100 mm3 nodules in non-emphysema (p = 0.009). CONCLUSIONS: AI resulted in more FPs/scan in emphysema compared to non-emphysema, a difference not observed for HR. RELEVANCE STATEMENT: In the creation of a benchmark dataset to validate AI software for lung nodule detection, the inclusion of emphysema cases is important due to the additional number of FPs. KEY POINTS: • The sensitivity of nodule detection by AI was similar in emphysema and non-emphysema. 
• AI had more FPs/scan in emphysema compared to non-emphysema. • Sensitivity and FPs/scan by the human reader were comparable for emphysema and non-emphysema. • Emphysema and non-emphysema representation in benchmark dataset is important for validating AI.
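The two metrics compared between groups above can be computed directly from detection counts; the counts below are hypothetical, chosen only so the resulting values match the reported AI performance in the emphysema group (sensitivity 0.68, 0.51 FPs/scan):

```python
# Illustrative sketch (synthetic counts, not study data): sensitivity and
# false positives per scan for a nodule-detection reader.
def detection_metrics(true_positives, false_negatives, false_positives, n_scans):
    sensitivity = true_positives / (true_positives + false_negatives)
    fps_per_scan = false_positives / n_scans
    return sensitivity, fps_per_scan

# Hypothetical AI results in an emphysema group of 39 scans
sens, fps = detection_metrics(true_positives=34, false_negatives=16,
                              false_positives=20, n_scans=39)
print(f"Sensitivity: {sens:.2f}, FPs/scan: {fps:.2f}")
```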


Subjects
Artificial Intelligence , Pulmonary Emphysema , Tomography, X-Ray Computed , Humans , Male , Middle Aged , Female , Tomography, X-Ray Computed/methods , Pulmonary Emphysema/diagnostic imaging , Software , Sensitivity and Specificity , Lung Neoplasms/diagnostic imaging , Aged , Radiation Dosage , Solitary Pulmonary Nodule/diagnostic imaging , Radiographic Image Interpretation, Computer-Assisted/methods
9.
Comput Biol Med ; 177: 108675, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38820779

ABSTRACT

BACKGROUND: The different tumor appearance of head and neck cancer across imaging modalities, scanners, and acquisition parameters accounts for the highly subjective nature of the manual tumor segmentation task. The variability of the manual contours is one of the causes of the lack of generalizability and the suboptimal performance of deep learning (DL) based tumor auto-segmentation models. Therefore, a DL-based method was developed that outputs predicted tumor probabilities for each PET-CT voxel in the form of a probability map instead of one fixed contour. The aim of this study was to show that DL-generated probability maps for tumor segmentation are clinically relevant, intuitive, and a more suitable solution to assist radiation oncologists in gross tumor volume segmentation on PET-CT images of head and neck cancer patients. METHOD: A graphical user interface (GUI) was designed, and a prototype was developed to allow the user to interact with tumor probability maps. Furthermore, a user study was conducted where nine experts in tumor delineation interacted with the interface prototype and its functionality. The participants' experience was assessed qualitatively and quantitatively. RESULTS: The interviews with radiation oncologists revealed their preference for using a rainbow colormap to visualize tumor probability maps during contouring, which they found intuitive. They also appreciated the slider feature, which facilitated interaction by allowing the selection of threshold values to create single contours for editing and use as a starting point. Feedback on the prototype highlighted its excellent usability and positive integration into clinical workflows. CONCLUSIONS: This study shows that DL-generated tumor probability maps are explainable, transparent, intuitive and a better alternative to the single output of tumor segmentation models.
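The slider interaction described above amounts to thresholding the per-voxel probability map to obtain a single editable contour. A minimal sketch of that step (assumed workflow, not the authors' GUI code; the array is a synthetic stand-in for a PET-CT probability map):

```python
# Illustrative sketch: turning a DL tumor probability map into a binary mask
# at a user-chosen threshold, the starting point for manual contour editing.
import numpy as np

rng = np.random.default_rng(seed=0)
prob_map = rng.random((4, 4))        # stand-in for a per-voxel tumor probability map

threshold = 0.5                       # slider value chosen by the clinician
binary_mask = prob_map >= threshold   # single contour derived from the map

print(f"Voxels above threshold: {int(binary_mask.sum())} / {binary_mask.size}")
```

Lowering the threshold grows the contour toward lower-certainty voxels, which is exactly the interaction the rainbow-colormap display is meant to make intuitive.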


Subjects
Deep Learning , Head and Neck Neoplasms , Humans , Head and Neck Neoplasms/diagnostic imaging , User-Computer Interface , Positron Emission Tomography Computed Tomography/methods
10.
Insights Imaging ; 15(1): 54, 2024 Feb 27.
Article in English | MEDLINE | ID: mdl-38411750

ABSTRACT

OBJECTIVE: To systematically review radiomic feature reproducibility and model validation strategies in recent studies dealing with CT and MRI radiomics of bone and soft-tissue sarcomas, thus updating a previous version of this review which included studies published up to 2020. METHODS: A literature search was conducted on EMBASE and PubMed databases for papers published between January 2021 and March 2023. Data regarding radiomic feature reproducibility and model validation strategies were extracted and analyzed. RESULTS: Out of 201 identified papers, 55 were included. They dealt with radiomics of bone (n = 23) or soft-tissue (n = 32) tumors. Thirty-two (out of 54 employing manual or semiautomatic segmentation, 59%) studies included a feature reproducibility analysis. Reproducibility was assessed based on intra/interobserver segmentation variability in 30 (55%) and geometrical transformations of the region of interest in 2 (4%) studies. At least one machine learning validation technique was used for model development in 34 (62%) papers, and K-fold cross-validation was employed most frequently. A clinical validation of the model was reported in 38 (69%) papers. It was performed using a separate dataset from the primary institution (internal test) in 22 (40%), an independent dataset from another institution (external test) in 14 (25%), and both in 2 (4%) studies. CONCLUSIONS: Compared to papers published up to 2020, a clear improvement was noted, with almost twice as many publications reporting methodological aspects related to reproducibility and validation. Larger multicenter investigations including external clinical validation and the publication of databases in open-access repositories could further improve methodology and bring radiomics from a research area to the clinical stage.
CRITICAL RELEVANCE STATEMENT: An improvement in feature reproducibility and model validation strategies has been shown in this updated systematic review on radiomics of bone and soft-tissue sarcomas, highlighting efforts to enhance methodology and bring radiomics from a research area to the clinical stage. KEY POINTS: • 2021-2023 radiomic studies on CT and MRI of musculoskeletal sarcomas were reviewed. • Feature reproducibility was assessed in more than half (59%) of the studies. • Model clinical validation was performed in 69% of the studies. • Internal (44%) and/or external (29%) test datasets were employed for clinical validation.

11.
Insights Imaging ; 15(1): 15, 2024 Jan 17.
Article in English | MEDLINE | ID: mdl-38228800

ABSTRACT

OBJECTIVES: To present a framework to develop and implement a fast-track artificial intelligence (AI) curriculum into an existing radiology residency program, with the potential to prepare a new generation of AI conscious radiologists. METHODS: The AI-curriculum framework comprises five sequential steps: (1) forming a team of AI experts, (2) assessing the residents' knowledge level and needs, (3) defining learning objectives, (4) matching these objectives with effective teaching strategies, and finally (5) implementing and evaluating the pilot. Following these steps, a multidisciplinary team of AI engineers, radiologists, and radiology residents designed a 3-day program, including didactic lectures, hands-on laboratory sessions, and group discussions with experts to enhance AI understanding. Pre- and post-curriculum surveys were conducted to assess participants' expectations and progress and were analyzed using a Wilcoxon rank-sum test. RESULTS: There was a 100% response rate to the pre- and post-curriculum survey (17 and 12 respondents, respectively). Participants' confidence in their knowledge and understanding of AI in radiology significantly increased after completing the program (pre-curriculum mean 3.25 ± 1.48 (SD), post-curriculum mean 6.5 ± 0.90 (SD), p-value = 0.002). A total of 75% confirmed that the course addressed topics that were applicable to their work in radiology. Lectures on the fundamentals of AI and group discussions with experts were deemed most useful. CONCLUSION: Designing an AI curriculum for radiology residents and implementing it into a radiology residency program is feasible using the framework presented. The 3-day AI curriculum effectively increased participants' perception of knowledge and skills about AI in radiology and can serve as a starting point for further customization.
CRITICAL RELEVANCE STATEMENT: The framework provides guidance for developing and implementing an AI curriculum in radiology residency programs, educating residents on the application of AI in radiology and ultimately contributing to future high-quality, safe, and effective patient care. KEY POINTS: • AI education is necessary to prepare a new generation of AI-conscious radiologists. • The AI curriculum increased participants' perception of AI knowledge and skills in radiology. • This five-step framework can assist integrating AI education into radiology residency programs.
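A rank-sum test is appropriate here because the pre- and post-curriculum groups have different sizes (17 vs. 12 respondents). A minimal sketch with synthetic confidence ratings (not the survey data):

```python
# Illustrative sketch: comparing two independent groups of ordinal survey
# scores with a Wilcoxon rank-sum test, as in the curriculum evaluation.
from scipy.stats import ranksums

pre_scores = [2, 3, 3, 4, 2]   # hypothetical pre-curriculum confidence ratings
post_scores = [6, 7, 6, 8, 7]  # hypothetical post-curriculum ratings

stat, p_value = ranksums(pre_scores, post_scores)
print(f"statistic = {stat:.2f}, p = {p_value:.3f}")
```

A negative statistic indicates the first sample tends to rank lower, i.e. confidence increased after the curriculum.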

12.
IEEE Trans Med Imaging ; 43(1): 216-228, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37428657

ABSTRACT

Karyotyping is of importance for detecting chromosomal aberrations in human disease. However, chromosomes easily appear curved in microscopic images, which prevents cytogeneticists from analyzing chromosome types. To address this issue, we propose a framework for chromosome straightening, which comprises a preliminary processing algorithm and a generative model called masked conditional variational autoencoders (MC-VAE). The processing method utilizes patch rearrangement to address the difficulty in erasing low degrees of curvature, providing reasonable preliminary results for the MC-VAE. The MC-VAE further straightens the results by leveraging chromosome patches conditioned on their curvatures to learn the mapping between banding patterns and conditions. During model training, we apply a masking strategy with a high masking ratio to train the MC-VAE with eliminated redundancy. This yields a non-trivial reconstruction task, allowing the model to effectively preserve chromosome banding patterns and structure details in the reconstructed results. Extensive experiments on three public datasets with two stain styles show that our framework surpasses the performance of state-of-the-art methods in retaining banding patterns and structure details. Compared to using real-world bent chromosomes, the use of high-quality straightened chromosomes generated by our proposed method can improve the performance of various deep learning models for chromosome classification by a large margin. Such a straightening approach has the potential to be combined with other karyotyping systems to assist cytogeneticists in chromosome analysis.
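The high-ratio masking strategy described above can be sketched in a few lines; this is an assumed illustration of the mechanics, not the authors' MC-VAE code:

```python
# Illustrative sketch: randomly hiding a high fraction of image patches so
# that reconstruction becomes a non-trivial task, as in the masking strategy.
import numpy as np

rng = np.random.default_rng(seed=42)
num_patches, mask_ratio = 16, 0.75            # e.g. a 4x4 patch grid, 75% masked

num_masked = int(num_patches * mask_ratio)
masked_idx = rng.choice(num_patches, size=num_masked, replace=False)

mask = np.zeros(num_patches, dtype=bool)
mask[masked_idx] = True                        # True = patch hidden from the model
print(f"Masked {mask.sum()} of {num_patches} patches")
```

The model then only sees the unmasked patches (plus the curvature condition) and must reconstruct the banding pattern in the hidden regions.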


Subjects
Algorithms , Chromosomes , Humans , Karyotyping , Chromosome Banding
13.
Eur Radiol ; 34(3): 2084-2092, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37658141

ABSTRACT

OBJECTIVES: To develop a deep learning-based method for contrast-enhanced breast lesion detection in ultrafast screening MRI. MATERIALS AND METHODS: A total of 837 breast MRI exams of 488 consecutive patients were included. Lesion's location was independently annotated in the maximum intensity projection (MIP) image of the last time-resolved angiography with stochastic trajectories (TWIST) sequence for each individual breast, resulting in 265 lesions (190 benign, 75 malignant) in 163 breasts (133 women). YOLOv5 models were fine-tuned using training sets containing the same number of MIP images with and without lesions. A long short-term memory (LSTM) network was employed to help reduce false positive predictions. The integrated system was then evaluated on test sets containing enriched uninvolved breasts during cross-validation to mimic the performance in a screening scenario. RESULTS: In five-fold cross-validation, the YOLOv5x model showed a sensitivity of 0.95, 0.97, 0.98, and 0.99, with 0.125, 0.25, 0.5, and 1 false positive per breast, respectively. The LSTM network removed 15.5% of the false positive predictions of the YOLO model, and the positive predictive value increased from 0.22 to 0.25. CONCLUSIONS: A fine-tuned YOLOv5x model can detect breast lesions on ultrafast MRI with high sensitivity in a screening population, and the output of the model could be further refined by an LSTM network to reduce the number of false positive predictions. CLINICAL RELEVANCE STATEMENT: The proposed integrated system would make the ultrafast MRI screening process more effective by assisting radiologists in prioritizing suspicious examinations and supporting the diagnostic workup. KEY POINTS: • Deep convolutional neural networks could be utilized to automatically pinpoint breast lesions in screening MRI with high sensitivity. • False positive predictions significantly increased when the detection models were tested on highly unbalanced test sets with more normal scans.
• Dynamic enhancement patterns of breast lesions during contrast inflow learned by the long short-term memory networks helped to reduce false positive predictions.


Subjects
Breast Neoplasms , Contrast Media , Female , Humans , Contrast Media/pharmacology , Breast/pathology , Magnetic Resonance Imaging/methods , Neural Networks, Computer , Time , Breast Neoplasms/diagnostic imaging , Breast Neoplasms/pathology
14.
Eur Radiol ; 34(4): 2791-2804, 2024 Apr.
Article in English | MEDLINE | ID: mdl-37733025

ABSTRACT

OBJECTIVES: To investigate the intra- and inter-rater reliability of the total radiomics quality score (RQS) and the reproducibility of individual RQS items' scores in a large multireader study. METHODS: Nine raters with different backgrounds were randomly assigned to three groups based on their proficiency with RQS utilization: groups 1 and 2 represented the inter-rater reliability groups with or without prior training in RQS, respectively; group 3 represented the intra-rater reliability group. Thirty-three original research papers on radiomics were evaluated by raters of groups 1 and 2. Of the 33 papers, 17 were evaluated twice with an interval of 1 month by raters of group 3. The intraclass correlation coefficient (ICC) was used for continuous variables, and Fleiss' and Cohen's kappa (κ) statistics for categorical variables. RESULTS: The inter-rater reliability was poor to moderate for the total RQS (ICC 0.30-0.55, p < 0.001) and very low to good for the reproducibility of individual items' scores (κ -0.12 to 0.75) within groups 1 and 2, for both inexperienced and experienced raters. The intra-rater reliability for the total RQS was moderate for the less experienced rater (ICC 0.522, p = 0.009), whereas experienced raters showed excellent intra-rater reliability (ICC 0.91-0.99, p < 0.001) between the first and second read. Intra-rater reliability on the reproducibility of the RQS items' scores was higher, and most items had moderate to good intra-rater reliability (κ 0.40 to 1). CONCLUSIONS: Reproducibility of the total RQS and of the individual RQS items' scores is low. There is a need for a robust and reproducible assessment method to assess the quality of radiomics research. CLINICAL RELEVANCE STATEMENT: There is a need for reproducible scoring systems to improve the quality of radiomics research and consequently close the translational gap between research and clinical implementation. KEY POINTS: • The radiomics quality score has been widely used for the evaluation of radiomics studies.
• Although the intra-rater reliability was moderate to excellent, the intra- and inter-rater reliability of the total score and of point-by-point scores was low for the radiomics quality score. • A robust, easy-to-use scoring system is needed for the evaluation of radiomics research.
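For a categorical RQS item scored by two raters, Cohen's kappa corrects the observed agreement for agreement expected by chance. A minimal sketch with synthetic ratings (not the study data):

```python
# Illustrative sketch: Cohen's kappa for two raters scoring one binary RQS
# item across ten hypothetical papers.
from sklearn.metrics import cohen_kappa_score

rater_a = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
rater_b = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]

kappa = cohen_kappa_score(rater_a, rater_b)
print(f"Cohen's kappa: {kappa:.2f}")
```

Here the raters agree on 8 of 10 papers (0.80), but chance agreement is 0.52, so kappa lands at a much more modest 0.58.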


Subjects
Radiomics , Reading , Humans , Observer Variation , Reproducibility of Results
15.
Comput Methods Programs Biomed ; 244: 107939, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38008678

ABSTRACT

BACKGROUND AND OBJECTIVE: Recently, deep learning (DL) algorithms showed to be promising in predicting outcomes such as distant metastasis-free survival (DMFS) and overall survival (OS) using pre-treatment imaging in head and neck cancer. Gross Tumor Volume of the primary tumor (GTVp) segmentation is used as an additional channel in the input to DL algorithms to improve model performance. However, the binary segmentation mask of the GTVp directs the focus of the network to the defined tumor region only and uniformly. DL models trained for tumor segmentation have also been used to generate predicted tumor probability maps (TPM) where each pixel value corresponds to the degree of certainty of that pixel to be classified as tumor. The aim of this study was to explore the effect of using TPM as an extra input channel of CT- and PET-based DL prediction models for oropharyngeal cancer (OPC) patients in terms of local control (LC), regional control (RC), DMFS and OS. METHODS: We included 399 OPC patients from our institute that were treated with definitive (chemo)radiation. For each patient, CT and PET scans and GTVp contours, used for radiotherapy treatment planning, were collected. We first trained a previously developed 2.5D DL framework for tumor probability prediction with 5-fold cross-validation using 131 patients. Then, a 3D ResNet18 was trained for outcome prediction using the 3D TPM as one of the possible inputs. The endpoints were LC, RC, DMFS, and OS. We performed 3-fold cross-validation on 168 patients for each endpoint using different combinations of image modalities as input. The final prediction in the test set (100 patients) was obtained by averaging the predictions of the 3-fold models. The C-index was used to evaluate the discriminative performance of the models. RESULTS: The models trained replacing the GTVp contours with the TPM achieved the highest C-indexes for LC (0.74) and RC (0.60) prediction.
For OS, using the TPM or the GTVp as additional image modality resulted in comparable C-indexes (0.72 and 0.74). CONCLUSIONS: Adding predicted TPMs instead of GTVp contours as an additional input channel for DL-based outcome prediction models improved model performance for LC and RC.


Subjects
Deep Learning , Head and Neck Neoplasms , Oropharyngeal Neoplasms , Humans , Positron Emission Tomography Computed Tomography/methods , Oropharyngeal Neoplasms/diagnostic imaging , Prognosis
16.
Comput Biol Med ; 169: 107871, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38154157

ABSTRACT

BACKGROUND: During lung cancer screening, indeterminate pulmonary nodules (IPNs) are a frequent finding. We aim to predict whether IPNs are resolving or non-resolving to reduce follow-up examinations, using machine learning (ML) models. We incorporated dedicated techniques to enhance prediction explainability. METHODS: In total, 724 IPNs (size 50-500 mm3, 575 participants) from the Dutch-Belgian Randomized Lung Cancer Screening Trial were used. We implemented six ML models and 14 factors to predict nodule disappearance. Random search was applied to determine the optimal hyperparameters on the training set (579 nodules). ML models were trained using 5-fold cross-validation and tested on the test set (145 nodules). Model predictions were evaluated by utilizing the recall, precision, F1 score, and the area under the receiver operating characteristic curve (AUC). The best-performing model was used for three feature importance techniques: mean decrease in impurity (MDI), permutation feature importance (PFI), and SHapley Additive exPlanations (SHAP). RESULTS: The random forest model outperformed the other ML models with an AUC of 0.865. This model achieved a recall of 0.646, a precision of 0.816, and an F1 score of 0.721. The evaluation of feature importance achieved consistent ranking across all three methods for the most crucial factors. The MDI, PFI, and SHAP methods highlighted volume, maximum diameter, and minimum diameter as the top three factors. However, the remaining factors revealed discrepant ranking across methods. CONCLUSION: ML models effectively predict IPN disappearance using participant demographics and nodule characteristics. Explainable techniques can assist clinicians in developing understandable preliminary assessments.
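Two of the three feature-importance techniques named above (MDI and PFI) are built into scikit-learn; the sketch below uses synthetic nodules, not the trial data, and omits SHAP to stay dependency-free:

```python
# Illustrative sketch: a random forest predicting nodule disappearance, with
# mean-decrease-in-impurity (MDI) and permutation feature importance (PFI).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(seed=0)
X = rng.normal(size=(300, 4))          # synthetic stand-ins for volume, diameters, age
y = (X[:, 0] > 0).astype(int)          # outcome driven entirely by feature 0

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

mdi = clf.feature_importances_                                   # MDI, a by-product of training
pfi = permutation_importance(clf, X, y, n_repeats=5,
                             random_state=0).importances_mean    # PFI, model-agnostic

print("Top feature by MDI:", int(np.argmax(mdi)))
print("Top feature by PFI:", int(np.argmax(pfi)))
```

When the methods agree on the top-ranked features, as they did for volume and the diameters in the study, the explanation is more trustworthy; disagreement further down the ranking is common.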


Subjects
Lung Neoplasms , Humans , Early Detection of Cancer , Machine Learning , ROC Curve , Randomized Controlled Trials as Topic
17.
Phys Imaging Radiat Oncol ; 28: 100502, 2023 Oct.
Article in English | MEDLINE | ID: mdl-38026084

ABSTRACT

Background and purpose: To compare the prediction performance of image features of computed tomography (CT) images extracted by radiomics, self-supervised learning and end-to-end deep learning for local control (LC), regional control (RC), locoregional control (LRC), distant metastasis-free survival (DMFS), tumor-specific survival (TSS), overall survival (OS) and disease-free survival (DFS) of oropharyngeal squamous cell carcinoma (OPSCC) patients after (chemo)radiotherapy. Methods and materials: The OPC-Radiomics dataset was used for model development and independent internal testing, and the UMCG-OPC set for external testing. Image features were extracted from the Gross Tumor Volume contours of the primary tumor (GTVt) regions in CT scans when using radiomics or a self-supervised learning-based method (autoencoder). Clinical models (clinical features only) and combined models (clinical plus radiomics, autoencoder or end-to-end image features) were built using multivariable Cox proportional-hazards analysis for LC, RC, LRC, DMFS, TSS, OS and DFS prediction. Results: In the internal test set, combined autoencoder models performed better than clinical models and combined radiomics models for LC, RC, LRC, DMFS, TSS and DFS prediction (largest improvements in C-index: 0.91 vs. 0.76 for RC and 0.74 vs. 0.60 for DMFS). In the external test set, combined radiomics models performed better than clinical and combined autoencoder models for all endpoints (largest improvement for LC: 0.82 vs. 0.71). Furthermore, combined models performed better in risk stratification than clinical models and showed good calibration for most endpoints. Conclusions: Image features extracted using self-supervised learning showed the best internal prediction performance, while radiomics features had better external generalizability.

18.
BJR Open ; 5(1): 20230033, 2023.
Article in English | MEDLINE | ID: mdl-37953871

ABSTRACT

Artificial intelligence (AI) has transitioned from the lab to the bedside, and it is increasingly being used in healthcare. Radiology and radiography are on the frontline of AI implementation, because of the use of big data for medical imaging and diagnosis for different patient groups. Safe and effective AI implementation requires that responsible and ethical practices are upheld by all key stakeholders, that there is harmonious collaboration between different professional groups, and customised educational provisions for all involved. This paper outlines key principles of ethical and responsible AI, highlights recent educational initiatives for clinical practitioners and discusses the synergies between all medical imaging professionals as they prepare for the digital future in Europe. Responsible and ethical AI is vital to enhance a culture of safety and trust for healthcare professionals and patients alike. Educational and training provisions on AI for medical imaging professionals are central to the understanding of basic AI principles and applications, and many offerings are currently available in Europe. Education can facilitate the transparency of AI tools, but more formalised, university-led training is needed to ensure that academic scrutiny, appropriate pedagogy, multidisciplinarity and customisation to the learners' unique needs are adhered to. As radiographers and radiologists work together and with other professionals to understand and harness the benefits of AI in medical imaging, it becomes clear that they are faced with the same challenges and that they have the same needs. The digital future belongs to multidisciplinary teams that work seamlessly together, learn together, manage risk collectively and collaborate for the benefit of the patients they serve.

19.
J Magn Reson Imaging ; 2023 Oct 17.
Article in English | MEDLINE | ID: mdl-37846440

ABSTRACT

BACKGROUND: Accurate breast density evaluation allows for more precise risk estimation but suffers from high inter-observer variability. PURPOSE: To evaluate the feasibility of reducing the inter-observer variability of breast density assessment through artificial intelligence (AI)-assisted interpretation. STUDY TYPE: Retrospective. POPULATION: Six hundred and twenty-one patients without breast prostheses or reconstructions were randomly divided into training (N = 377), validation (N = 98), and independent test (N = 146) datasets. FIELD STRENGTH/SEQUENCE: 1.5 T and 3.0 T; T1-weighted spectral attenuated inversion recovery. ASSESSMENT: Five radiologists independently assessed each scan in the independent test set to establish the inter-observer variability baseline and to reach a reference standard. Deep learning and three radiomics models were developed for three classification tasks: (i) four Breast Imaging-Reporting and Data System (BI-RADS) breast composition categories (A-D), (ii) dense (categories C, D) vs. non-dense (categories A, B), and (iii) extremely dense (category D) vs. moderately dense (categories A-C). The models were tested against the reference standard on the independent test set. AI-assisted interpretation was performed by majority voting between the models and each radiologist's assessment. STATISTICAL TESTS: Inter-observer variability was assessed using linear-weighted kappa (κ) statistics. Kappa statistics, accuracy, and area under the receiver operating characteristic curve (AUC) were used to assess the models against the reference standard. RESULTS: In the independent test set, the five readers showed overall substantial agreement on tasks (i) and (ii), but only moderate agreement for task (iii). The best-performing model showed substantial agreement with the reference standard for tasks (i) and (ii), but moderate agreement for task (iii). With the assistance of the AI models, almost perfect inter-observer agreement was obtained for tasks (i) (mean κ = 0.86), (ii) (mean κ = 0.94), and (iii) (mean κ = 0.94). DATA CONCLUSION: Deep learning and radiomics models have the potential to help reduce the inter-observer variability of breast density assessment. LEVEL OF EVIDENCE: 3 TECHNICAL EFFICACY: Stage 1.
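The linear-weighted kappa statistic used throughout this abstract can be sketched in a few lines. This is an illustrative implementation, not the study's code; it assumes the ordinal BI-RADS composition categories A-D have been mapped to integers 0-3, and the example ratings are invented.

```python
def linear_weighted_kappa(ratings_a, ratings_b, n_categories):
    """Linear-weighted Cohen's kappa for two raters on an ordinal scale.
    Disagreements are penalised in proportion to their distance |i - j|."""
    n = len(ratings_a)
    k = n_categories
    # observed joint distribution of the two raters' categories
    obs = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        obs[a][b] += 1.0 / n
    # marginal distributions (chance agreement baseline)
    pa = [sum(obs[i][j] for j in range(k)) for i in range(k)]
    pb = [sum(obs[i][j] for i in range(k)) for j in range(k)]
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)  # linear disagreement weight
            num += w * obs[i][j]       # observed weighted disagreement
            den += w * pa[i] * pb[j]   # chance-expected weighted disagreement
    return 1.0 - num / den

# two raters assigning BI-RADS categories (A-D encoded as 0-3) to four scans;
# identical ratings yield perfect agreement
print(linear_weighted_kappa([0, 1, 2, 3], [0, 1, 2, 3], 4))  # 1.0
```

On this scale, the conventional reading is roughly 0.41-0.60 moderate, 0.61-0.80 substantial, and above 0.80 almost perfect, which matches how the κ values of 0.86 and 0.94 are described.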

20.
Heliyon ; 9(6): e17104, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37484314

ABSTRACT

BACKGROUND: Deep learning is an important means of realizing the automatic detection, segmentation, and classification of pulmonary nodules in computed tomography (CT) images. An entire CT scan cannot be used directly by deep learning models because of its image size, format, dimensionality, and other factors. Between the acquisition of the CT scan and feeding the data into the deep learning model, there are several steps, including data use permission, data access and download, data annotation, and data preprocessing. This paper aims to provide a complete and detailed guide for researchers who want to engage in interdisciplinary research on lung nodule CT images and Artificial Intelligence (AI) engineering. METHODS: The data preparation pipeline used the following four popular large-scale datasets: LIDC-IDRI (Lung Image Database Consortium image collection), LUNA16 (Lung Nodule Analysis 2016), NLST (National Lung Screening Trial), and NELSON (The Dutch-Belgian Randomized Lung Cancer Screening Trial). The dataset preparation is presented in chronological order. FINDINGS: The different data preparation steps required before deep learning were identified. These include both generic steps and steps dedicated to lung nodule research. For each step, the required process, its necessity, and example code or tools for actual implementation are provided. DISCUSSION AND CONCLUSION: Depending on the specific research question, researchers should be aware of the various preparation steps required and carefully select datasets, data annotation methods, and image preprocessing methods. Moreover, it is vital to acknowledge that each auxiliary tool or code snippet has its own scope of use and limitations. This paper proposes a standardized data preparation process while clearly demonstrating the principles and sequence of the different steps. A data preparation pipeline can be realized quickly by following the proposed steps and implementing the suggested example codes and tools.
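One of the generic preprocessing steps such a pipeline must include is intensity windowing and normalization of the raw Hounsfield-unit (HU) values before the scan is fed to a model. The following is a minimal sketch of that single step, not code from the paper; the lung-window bounds of -1000 to 400 HU are common illustrative defaults rather than values prescribed by the authors, and the tiny 2x2 "slice" is invented.

```python
def preprocess_hu(slice_hu, hu_min=-1000.0, hu_max=400.0):
    """Clip a CT slice given in Hounsfield units to a lung window and
    rescale it linearly to [0, 1] for input to a deep learning model.
    The window bounds are illustrative defaults, not study parameters."""
    span = hu_max - hu_min
    return [
        [(min(max(v, hu_min), hu_max) - hu_min) / span for v in row]
        for row in slice_hu
    ]

# toy 2x2 slice: air (-1000), water (0), soft tissue (40), bone (1000, clipped)
slice_hu = [[-1000.0, 0.0], [40.0, 1000.0]]
normalized = preprocess_hu(slice_hu)
print(normalized[0][0], normalized[1][1])  # 0.0 1.0
```

In a real pipeline this step would sit after DICOM rescaling (applying RescaleSlope and RescaleIntercept to obtain HU) and alongside spatial resampling to a uniform voxel spacing.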
