ABSTRACT
PURPOSE: To harmonize the use of color for MR relaxometry maps and therefore recommend the use of specific color-maps for representing T1, T2, and T2* maps and their inverses. METHODS: Perceptually linearized color-maps were chosen to have similar color settings as those proposed by Griswold et al. in 2018. A Delphi process, polling the opinion of a panel of 81 experts, was used to generate consensus on the suitability of these maps. RESULTS: Consensus was reached on the suitability of the logarithm-processed Lipari color-map for T1 and the logarithm-processed Navia color-map for T2 and T2*. There was consensus on color bars being mandatory and on the use of a specific value indicating "invalidity." There was no consensus on whether the ranges should be fixed per anatomy. CONCLUSION: The authors recommend the use of the logarithm-processed Lipari color-map for displaying quantitative T1 maps and R1 maps; likewise, the authors recommend the logarithm-processed Navia color-map for displaying T2, T2*, R2, and R2* maps. This work originated with the Quantitative MR Study Group of the International Society of Magnetic Resonance in Medicine (ISMRM); it has the approval of the Publication Committee and of the Board of the ISMRM.
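The abstract does not spell out how the logarithm processing is applied. As a minimal sketch of one plausible reading, the function below places a relaxation time on a logarithmic axis between a display minimum and maximum and returns a position in [0, 1] that could then index any color lookup table. The function name and the display range used in the example are assumptions for illustration and are not taken from the recommendation.

```python
import math


def log_colormap_position(value, vmin, vmax):
    """Return the position in [0, 1] of `value` on a log axis from vmin to vmax.

    Hypothetical helper: values outside the display range are clipped, and
    the range must be strictly positive because the logarithm is undefined
    at or below zero.
    """
    if vmin <= 0 or vmax <= vmin:
        raise ValueError("need 0 < vmin < vmax")
    clipped = min(max(value, vmin), vmax)
    return (math.log(clipped) - math.log(vmin)) / (math.log(vmax) - math.log(vmin))


# Example with an assumed display range of 10 to 1000 ms:
# 100 ms sits halfway along the log axis.
position = log_colormap_position(100.0, 10.0, 1000.0)
```

On a linear axis the same 100 ms value would sit at about 9% of the range, which is why a logarithmic mapping spreads short relaxation times over more of the color scale.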
ABSTRACT
Presenting quantitative data using non-standardized color maps potentially results in unrecognized misinterpretation of data. Clinically meaningful color maps should intuitively and inclusively represent data without misleading interpretation. Uniformity of the color gradient for color maps is critically important. Maximal color and lightness contrast, readability for color vision-impaired individuals, and recognizability of the color scheme are highly desirable features. This article describes the use of color maps in five key quantitative MRI techniques: relaxometry, diffusion-weighted imaging (DWI), dynamic contrast-enhanced (DCE)-MRI, MR elastography (MRE), and water-fat MRI. Current display practice of color maps is reviewed and shortcomings against desirable features are highlighted. EVIDENCE LEVEL: 5 TECHNICAL EFFICACY: Stage 2.
ABSTRACT
OBJECTIVES: To investigate the model-, code-, and data-sharing practices in the current radiomics research landscape and to introduce a radiomics research database. METHODS: A total of 1254 articles published between January 1, 2021, and December 31, 2022, in leading radiology journals (European Radiology, European Journal of Radiology, Radiology, Radiology: Artificial Intelligence, Radiology: Cardiothoracic Imaging, Radiology: Imaging Cancer) were retrospectively screened, and 257 original research articles were included in this study. The categorical variables were compared using Fisher's exact tests or chi-square test and numerical variables using Student's t test with relation to the year of publication. RESULTS: Half of the articles (128 of 257) shared the model by either including the final model formula or reporting the coefficients of selected radiomics features. A total of 73 (28%) models were validated on an external independent dataset. Only 16 (6%) articles shared the data or used publicly available open datasets. Similarly, only 20 (7%) of the articles shared the code. A total of 7 (3%) articles both shared code and data. All collected data in this study is presented in a radiomics research database (RadBase) and could be accessed at https://github.com/EuSoMII/RadBase . CONCLUSION: According to the results of this study, the majority of published radiomics models were not technically reproducible since they shared neither model nor code and data. There is still room for improvement in carrying out reproducible and open research in the field of radiomics. CLINICAL RELEVANCE STATEMENT: To date, the reproducibility of radiomics research and open science practices within the radiomics research community are still very low. Ensuring reproducible radiomics research with model-, code-, and data-sharing practices will facilitate faster clinical translation. 
KEY POINTS:
• There is a discrepancy between the number of published radiomics papers and the clinical implementation of these published radiomics models.
• The main obstacle to clinical implementation is the lack of model-, code-, and data-sharing practices.
• In order to translate radiomics research into clinical practice, the radiomics research community should adopt open science practices.
Subjects
Artificial Intelligence; Radiomics; Humans; Reproducibility of Results; Retrospective Studies; Radiography
ABSTRACT
OBJECTIVES: Structured reporting enhances comparability, readability, and content detail. Large language models (LLMs) could convert free text into structured data without disrupting radiologists' reporting workflow. This study evaluated an on-premise, privacy-preserving LLM for automatically structuring free-text radiology reports. MATERIALS AND METHODS: We developed an approach to controlling the LLM output, ensuring the validity and completeness of structured reports produced by a locally hosted Llama-2-70B-chat model. A dataset with de-identified narrative chest radiograph (CXR) reports was compiled retrospectively. It included 202 English reports from the publicly available MIMIC-CXR dataset and 197 German reports from our university hospital. A senior radiologist prepared a detailed, fully structured reporting template with 48 question-answer pairs. All reports were independently structured by the LLM and two human readers. Bayesian inference (Markov chain Monte Carlo sampling) was used to estimate the distributions of Matthews correlation coefficient (MCC), with [-0.05, 0.05] as the region of practical equivalence (ROPE). RESULTS: The LLM generated valid structured reports in all cases, achieving an average MCC of 0.75 (94% HDI: 0.70-0.80) and F1 score of 0.70 (0.70-0.80) for English, and 0.66 (0.62-0.70) and 0.68 (0.64-0.72) for German reports, respectively. The MCC differences between LLM and humans were within ROPE for both languages: 0.01 (-0.05 to 0.07), 0.01 (-0.05 to 0.07) for English, and -0.01 (-0.07 to 0.05), 0.00 (-0.06 to 0.06) for German, indicating approximately comparable performance. CONCLUSION: Locally hosted, open-source LLMs can automatically structure free-text radiology reports with approximately human accuracy. However, the understanding of semantics varied across languages and imaging findings.
KEY POINTS:
Question: Why has structured reporting not been widely adopted in radiology despite clear benefits and how can we improve this?
Findings: A locally hosted large language model successfully structured narrative reports, showing variation between languages and findings.
Clinical relevance: Structured reporting provides many benefits, but its integration into the clinical routine is limited. Automating the extraction of structured information from radiology reports enables the capture of structured data while allowing the radiologist to maintain their reporting workflow.
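For readers unfamiliar with the Matthews correlation coefficient reported above, the following sketch computes it from the four cells of a binary confusion matrix using the standard textbook formula. It is only an illustration of the metric: the function name is hypothetical, and the study's Bayesian estimation of MCC distributions is not reproduced here.

```python
import math


def matthews_corrcoef_from_counts(tp, fp, fn, tn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns a value in [-1, 1]; by the usual convention the result is 0.0
    when any marginal total is zero and the coefficient is undefined.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Perfect agreement, chance-level agreement, and complete disagreement.
perfect = matthews_corrcoef_from_counts(tp=50, fp=0, fn=0, tn=50)
chance = matthews_corrcoef_from_counts(tp=25, fp=25, fn=25, tn=25)
inverted = matthews_corrcoef_from_counts(tp=0, fp=50, fn=50, tn=0)
```

Unlike plain accuracy, the coefficient uses all four cells, so it stays informative when one answer category is much more frequent than the other.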
ABSTRACT
Background Automation bias (the propensity for humans to favor suggestions from automated decision-making systems) is a known source of error in human-machine interactions, but its implications regarding artificial intelligence (AI)-aided mammography reading are unknown. Purpose To determine how automation bias can affect inexperienced, moderately experienced, and very experienced radiologists when reading mammograms with the aid of an artificial intelligence (AI) system. Materials and Methods In this prospective experiment, 27 radiologists read 50 mammograms and provided their Breast Imaging Reporting and Data System (BI-RADS) assessment assisted by a purported AI system. Mammograms were obtained between January 2017 and December 2019 and were presented in two randomized sets. The first was a training set of 10 mammograms, with the correct BI-RADS category suggested by the AI system. The second was a set of 40 mammograms in which an incorrect BI-RADS category was suggested for 12 mammograms. Reader performance, degree of bias in BI-RADS scoring, perceived accuracy of the AI system, and reader confidence in their own BI-RADS ratings were assessed using analysis of variance (ANOVA) and repeated-measures ANOVA followed by post hoc tests and Kruskal-Wallis tests followed by the Dunn post hoc test. Results The percentage of correctly rated mammograms by inexperienced (mean, 79.7% ± 11.7 [SD] vs 19.8% ± 14.0; P < .001; r = 0.93), moderately experienced (mean, 81.3% ± 10.1 vs 24.8% ± 11.6; P < .001; r = 0.96), and very experienced (mean, 82.3% ± 4.2 vs 45.5% ± 9.1; P = .003; r = 0.97) radiologists was significantly impacted by the correctness of the AI prediction of BI-RADS category. 
Inexperienced radiologists were significantly more likely to follow the suggestions of the purported AI when it incorrectly suggested a higher BI-RADS category than the actual ground truth compared with both moderately (mean degree of bias, 4.0 ± 1.8 vs 2.4 ± 1.5; P = .044; r = 0.46) and very (mean degree of bias, 4.0 ± 1.8 vs 1.2 ± 0.8; P = .009; r = 0.65) experienced readers. Conclusion The results show that inexperienced, moderately experienced, and very experienced radiologists reading mammograms are prone to automation bias when being supported by an AI-based system. This and other effects of human and machine interaction must be considered to ensure safe deployment and accurate diagnostic performance when combining human readers and AI. © RSNA, 2023 Supplemental material is available for this article. See also the editorial by Baltzer in this issue.
Subjects
Artificial Intelligence; Breast Neoplasms; Humans; Female; Prospective Studies; Mammography; Automation; Breast Neoplasms/diagnostic imaging; Retrospective Studies
ABSTRACT
BACKGROUND: For time-consuming diffusion-weighted imaging (DWI) of the breast, deep learning-based imaging acceleration appears particularly promising. PURPOSE: To investigate a combined k-space-to-image reconstruction approach for scan time reduction and improved spatial resolution in breast DWI. STUDY TYPE: Retrospective. POPULATION: 133 women (age 49.7 ± 12.1 years) underwent multiparametric breast MRI. FIELD STRENGTH/SEQUENCE: 3.0T/T2 turbo spin echo, T1 3D gradient echo, DWI (800 and 1600 s/mm²). ASSESSMENT: DWI data were retrospectively processed using deep learning-based k-space-to-image reconstruction (DL-DWI) and an additional super-resolution algorithm (SRDL-DWI). In addition to signal-to-noise ratio and apparent diffusion coefficient (ADC) comparisons among standard, DL- and SRDL-DWI, a range of quantitative similarity (e.g., structural similarity index [SSIM]) and error metrics (e.g., normalized root mean square error [NRMSE], symmetric mean absolute percent error [SMAPE], log accuracy error [LOGAC]) was calculated to analyze structural variations. Subjective image evaluation was performed independently by three radiologists on a seven-point rating scale. STATISTICAL TESTS: Friedman's rank-based analysis of variance with Bonferroni-corrected pairwise post-hoc tests. P < 0.05 was considered significant. RESULTS: Both DL- and SRDL-DWI allowed for a 39% reduction in simulated scan time over standard DWI (5 vs. 3 minutes). The highest image quality ratings were assigned to SRDL-DWI with good interreader agreement (ICC 0.834; 95% confidence interval 0.818-0.848). Irrespective of b-value, both standard and DL-DWI produced superior SNR compared to SRDL-DWI. ADC values were slightly higher in SRDL-DWI (+0.5%) and DL-DWI (+3.4%) than in standard DWI. Structural similarity was excellent between DL-/SRDL-DWI and standard DWI for either b-value (SSIM ≥ 0.86).
Calculation of error metrics (NRMSE ≤ 0.05, SMAPE ≤ 0.02, and LOGAC ≤ 0.04) supported the assumption of low voxel-wise error. DATA CONCLUSION: Deep learning-based k-space-to-image reconstruction reduces simulated scan time of breast DWI by 39% without influencing structural similarity. Additionally, super-resolution interpolation allows for substantial improvement of subjective image quality. EVIDENCE LEVEL: 4 TECHNICAL EFFICACY: Stage 1.
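The abstract names NRMSE and SMAPE without giving their definitions, and several variants of both are in use. The sketch below shows one common form of each as an illustration only: NRMSE normalized by the range of the reference values, and SMAPE with the mean of the absolute values in the denominator. Both choices, and the function names, are assumptions and may differ from what the study used.

```python
import math


def nrmse(reference, test):
    """Root mean square error divided by the range of the reference values.

    Assumed normalization; other variants divide by the mean or the
    standard deviation of the reference instead.
    """
    squared_errors = [(r - t) ** 2 for r, t in zip(reference, test)]
    rmse = math.sqrt(sum(squared_errors) / len(squared_errors))
    return rmse / (max(reference) - min(reference))


def smape(reference, test):
    """Symmetric mean absolute percent error, expressed as a fraction.

    Each term is |t - r| divided by the mean of |r| and |t|; a pair where
    both values are zero contributes no error.
    """
    terms = []
    for r, t in zip(reference, test):
        scale = (abs(r) + abs(t)) / 2
        terms.append(0.0 if scale == 0 else abs(t - r) / scale)
    return sum(terms) / len(terms)


# Toy voxel values (hypothetical numbers, not data from the study).
standard = [0.0, 10.0]
accelerated = [1.0, 9.0]
error_nrmse = nrmse(standard, accelerated)
error_smape = smape([1.0, 3.0], [3.0, 1.0])
```

Because the normalization differs between variants, reported thresholds such as NRMSE ≤ 0.05 are only comparable across studies that use the same definition.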
ABSTRACT
OBJECTIVE: To conduct a comprehensive bibliometric analysis of artificial intelligence (AI) and its subfields as well as radiomics in Radiology, Nuclear Medicine, and Medical Imaging (RNMMI). METHODS: Web of Science was queried for relevant publications in RNMMI and medicine along with their associated data from 2000 to 2021. Bibliometric techniques utilised were co-occurrence, co-authorship, citation burst, and thematic evolution analyses. Growth rate and doubling time were also estimated using log-linear regression analyses. RESULTS: According to the number of publications, RNMMI (11,209; 19.8%) was the most prominent category in medicine (56,734). USA (44.6%) and China (23.1%) were the two most productive and collaborative countries. USA and Germany experienced the strongest citation bursts. Thematic evolution has recently exhibited a significant shift toward deep learning. In all analyses, the annual number of publications and citations demonstrated exponential growth, with deep learning-based publications exhibiting the most prominent growth pattern. Estimated continuous growth rate, annual growth rate, and doubling time of the AI and machine learning publications in RNMMI were 26.1% (95% confidence interval [CI], 12.0-40.2%), 29.8% (95% CI, 12.7-49.5%), and 2.7 years (95% CI, 1.7-5.8), respectively. In the sensitivity analysis using data from the last 5 and 10 years, these estimates ranged from 47.6 to 51.1%, 61.0 to 66.7%, and 1.4 to 1.5 years. CONCLUSION: This study provides an overview of AI and radiomics research conducted mainly in RNMMI. These results may assist researchers, practitioners, policymakers, and organisations in gaining a better understanding of both the evolution of these fields and the importance of supporting (e.g., financial) these research activities. 
KEY POINTS:
• In terms of the number of publications on AI and ML, Radiology, Nuclear Medicine, and Medical Imaging was the most prominent category compared to the other categories related to medicine (e.g., Health Policy & Services, Surgery).
• All evaluated analyses (i.e., AI, its subfields, and radiomics), based on the annual number of publications and citations, demonstrated exponential growth, with decreasing doubling time, which indicates increasing interest from researchers, journals, and, in turn, the medical imaging community.
• The most prominent growth pattern was observed in deep learning-based publications. However, the further thematic analysis demonstrated that deep learning has been underdeveloped but highly relevant to the medical imaging community.
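The three growth figures reported in the results are linked by standard identities for exponential growth: if log-linear regression of the annual count on the year gives a slope r (the continuous growth rate), then the annual growth rate is exp(r) - 1 and the doubling time is ln(2) / r. The sketch below illustrates this with an ordinary least-squares fit on the logarithm of the counts. The function names and the toy series are assumptions for illustration; only the 26.1% input is taken from the abstract.

```python
import math


def continuous_growth_rate(years, counts):
    """Ordinary least-squares slope of ln(count) regressed on year."""
    n = len(years)
    logs = [math.log(c) for c in counts]
    mean_year = sum(years) / n
    mean_log = sum(logs) / n
    sxy = sum((x - mean_year) * (y - mean_log) for x, y in zip(years, logs))
    sxx = sum((x - mean_year) ** 2 for x in years)
    return sxy / sxx


def annual_growth_rate(rate):
    """Convert a continuous growth rate into a year-over-year growth rate."""
    return math.exp(rate) - 1


def doubling_time(rate):
    """Years needed for the count to double at a continuous growth rate."""
    return math.log(2) / rate


# Toy series that doubles every year (hypothetical counts).
toy_rate = continuous_growth_rate([2000, 2001, 2002, 2003], [1, 2, 4, 8])

# Consistency check against the abstract: a continuous rate of 26.1%
# corresponds to about 29.8% annual growth and a 2.7-year doubling time.
annual = annual_growth_rate(0.261)
doubling = doubling_time(0.261)
```

The reported point estimates (26.1%, 29.8%, 2.7 years) are consistent with these identities.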
Subjects
Nuclear Medicine; Humans; Artificial Intelligence; Radiography; Radionuclide Imaging; Bibliometrics
ABSTRACT
On behalf of the International Society for Magnetic Resonance in Medicine (ISMRM) Quantitative MR Study Group, this article provides an overview of considerations for the development, validation, qualification, and dissemination of quantitative MR (qMR) methods. This process is framed in terms of two central technical performance properties, i.e., bias and precision. Although qMR is confounded by undesired effects, methods with low bias and high precision can be iteratively developed and validated. For illustration, two distinct qMR methods are discussed throughout the manuscript: quantification of liver proton-density fat fraction, and cardiac T1. These examples demonstrate the expansion of qMR methods from research centers toward widespread clinical dissemination. The overall goal of this article is to provide trainees, researchers, and clinicians with essential guidelines for the development and validation of qMR methods, as well as an understanding of necessary steps and potential pitfalls for the dissemination of quantitative MR in research and in the clinic.
Subjects
Magnetic Resonance Imaging; Proton Therapy; Bias; Magnetic Resonance Spectroscopy; Protons; Reproducibility of Results
ABSTRACT
OBJECTIVES: To investigate, in patients with metastatic prostate cancer, whether radiomics of computed tomography (CT) image data enables the differentiation of bone metastases not visible on CT from unaffected bone using 68Ga-PSMA PET imaging as reference standard. METHODS: In this IRB-approved retrospective study, 67 patients (mean age 71 ± 7 years; range: 55-84 years) with a total of 205 68Ga-PSMA-positive prostate cancer bone metastases in the thoraco-lumbar spine and pelvic bone that were invisible on CT were included. Metastases and 86 68Ga-PSMA-negative bone volumes in the same body region were segmented and further post-processed. Intra- and inter-reader reproducibility was assessed, with ICCs < 0.90 being considered non-reproducible. To account for imbalances in the dataset, data augmentation was performed to achieve improved class balance and to avoid model overfitting. The dataset was split into training, test, and validation sets. After a multi-step dimension reduction and feature selection process, the 11 most important and independent features were selected for statistical analyses. RESULTS: A gradient-boosted tree was trained on the selected 11 radiomic features in order to classify patients' bones into bone metastasis and normal bone using the training dataset. This trained model achieved a classification accuracy of 0.85 (95% confidence interval [CI]: 0.76-0.92, p < .001) with 78% sensitivity and 93% specificity. The tuned model was applied to the original, non-augmented dataset, resulting in a classification accuracy of 0.90 (95% CI: 0.82-0.98) with 91% sensitivity and 88% specificity. CONCLUSION: Our proof-of-concept study indicates that radiomics may accurately differentiate unaffected bone from metastatic bone that is invisible to the human eye on CT.
KEY POINTS:
• This proof-of-concept study showed that radiomics applied to CT images may accurately differentiate between bone metastases and metastasis-free bone in patients with prostate cancer.
• Future promising applications include automatic bone segmentation, followed by a radiomics classifier, allowing for a screening-like approach in the detection of bone metastases.
Subjects
Positron Emission Tomography Computed Tomography; Prostatic Neoplasms; Aged; Gallium Radioisotopes; Humans; Male; Middle Aged; Prostatic Neoplasms/diagnostic imaging; Reproducibility of Results; Retrospective Studies; Tomography, X-Ray Computed
ABSTRACT
OBJECTIVES: To investigate the robustness of radiomic features between three dual-energy CT (DECT) systems. METHODS: An anthropomorphic body phantom was scanned on three different DECT scanners: a dual-source (dsDECT), a rapid kV-switching (rsDECT), and a dual-layer detector DECT (dlDECT). Twenty-four patients who underwent abdominal DECT examinations on each of the scanner types during clinical follow-up were retrospectively included (n = 72 examinations). Radiomic features were extracted after standardized image processing, following ROI placement in phantom tissues and healthy appearing hepatic, splenic and muscular tissue of patients using virtual monoenergetic images at 65 keV (VMI65keV) and virtual unenhanced images (VUE). In total, 774 radiomic features were extracted, including 86 original features and 8 wavelet transformations of these. Concordance correlation coefficients (CCC) and analyses of variance (ANOVA) were calculated to determine inter-scanner robustness of radiomic features, with a CCC of ≥ 0.9 deeming a feature robust. RESULTS: None of the phantom-derived features attained the threshold for high feature robustness for any inter-scanner comparison. The proportion of robust features obtained from patients scanned on all three scanners was low both in VMI65keV (dsDECT vs. rsDECT: 16.1% (125/774), dlDECT vs. rsDECT: 2.5% (19/774), dsDECT vs. dlDECT: 2.6% (20/774)) and VUE (dsDECT vs. rsDECT: 11.1% (86/774), dlDECT vs. rsDECT: 2.8% (22/774), dsDECT vs. dlDECT: 2.7% (21/774)). The proportion of features without significant differences as per ANOVA was higher both in patients (51.4-71.1%) and in the phantom (60.6-73.4%). CONCLUSIONS: The robustness of radiomic features across different DECT scanners in patients was low and the few robust patient-derived features were not reflected in the phantom experiment. Future efforts should aim to improve the cross-platform generalizability of DECT-derived radiomics.
KEY POINTS:
• Inter-scanner robustness of dual-energy CT-derived radiomic features was on a low level in patients who underwent clinical examinations on three DECT platforms.
• The few robust patient-derived features were not confirmed in our phantom experiment.
• Limited inter-scanner robustness of dual-energy CT derived radiomic features may impact the generalizability of models built with features from one particular dual-energy CT scanner type.
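The robustness criterion above rests on the concordance correlation coefficient. As a minimal sketch, assuming Lin's definition with population (divide-by-n) moments, the code below computes the CCC for one feature measured on two scanners and applies the CCC ≥ 0.9 threshold stated in the methods. The function names and the toy feature values are hypothetical, and the study's exact implementation is not described in the abstract.

```python
def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient using population moments.

    CCC = 2 * cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y)) ** 2),
    so a constant offset between scanners lowers the value even when the
    two series are perfectly correlated.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


def is_robust(x, y, threshold=0.9):
    """Apply the robustness cut-off stated in the abstract (CCC >= 0.9)."""
    return concordance_correlation(x, y) >= threshold


# Toy feature values for three patients on two scanners (hypothetical).
scanner_a = [1.0, 2.0, 3.0]
scanner_b = [2.0, 3.0, 4.0]  # same trend, constant offset of 1.0
ccc_offset = concordance_correlation(scanner_a, scanner_b)
```

In this toy case the Pearson correlation is 1, yet the CCC is 4/7 (about 0.57) because of the offset, so the feature would not count as robust.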
Subjects
Radiography, Dual-Energy Scanned Projection; Humans; Image Processing, Computer-Assisted; Phantoms, Imaging; Retrospective Studies; Tomography, X-Ray Computed
ABSTRACT
OBJECTIVES: To compare the accuracy of lesion detection of trauma-related injuries using combined "all-in-one" fused (AIO) and conventionally reconstructed images (CR) in acute trauma CT. METHODS: In this retrospective study, trauma CT scans of 66 patients (median age 47 years, range 18-96 years; 20 female (30.3%)) were read using AIO and CR. Images were independently reviewed by 4 blinded radiologists (two residents and two consultants) for trauma-related injuries in 22 regions. Sub-analyses were performed to analyze the influence of experience (residents vs. consultants) and body region (chest, abdomen, skeletal structures) on lesion detection. A paired t-test was used to compare the accuracy of lesion detection. The effect size was calculated (Cohen's d). A linear mixed-effects model with patients as the fixed effect and random forest models were used to investigate the effect of experience, reconstruction/image processing, and body region on lesion detection. RESULTS: Reading time of residents was significantly shorter using AIO (AIO: 266 ± 72 s, CR: 318 ± 113 s; p < 0.001; d = 0.46) while no significant difference was observed in the accuracy of lesion detection (AIO: 93.5 ± 6.0%, CR: 94.6 ± 6.0%; p = 0.092; d = -0.21). Reading time of consultants showed no significant difference (AIO: 283 ± 82 s, CR: 274 ± 95 s; p = 0.067; d = 0.16). Accuracy was significantly higher using CR; however, the difference and effect size were very small (AIO: 95.1 ± 4.9%, CR: 97.3 ± 3.7%; p = 0.002; d = -0.39). The linear mixed-effects model showed only a minor effect of image processing/reconstruction on lesion detection. CONCLUSIONS: Residents at the emergency department might benefit from faster reading time without sacrificing lesion detection rate using AIO for trauma CT.
KEY POINTS:
• Image fusion techniques decrease the reading time of acute trauma CT without sacrificing diagnostic accuracy.
Subjects
Image Processing, Computer-Assisted; Tomography, X-Ray Computed; Abdomen; Adolescent; Adult; Aged; Aged, 80 and over; Female; Humans; Image Processing, Computer-Assisted/methods; Middle Aged; Radiographic Image Interpretation, Computer-Assisted/methods; Retrospective Studies; Thorax; Tomography, X-Ray Computed/methods; Young Adult
ABSTRACT
INTRODUCTION: The aim of this study was to assess the value of computed tomography (CT)-based radiomics of perinephric fat (PNF) for prediction of surgical complexity. METHODS: Fifty-six patients who underwent renal tumor surgery were included. Radiomic features were extracted from contrast-enhanced CT. Machine learning models using radiomic features, the Mayo Adhesive Probability (MAP) score, and/or clinical variables (age, sex, and body mass index) were compared for the prediction of adherent PNF (APF), the occurrence of postoperative complications (Clavien-Dindo Classification ≥2), and surgery duration. Discrimination performance was assessed by the area under the receiver operating characteristic curve (AUC). In addition, the root mean square error (RMSE) and R² (fraction of explained variance) were used as evaluation metrics. RESULTS: A single-feature logit model containing "Wavelet-LHH-transformed GLCM Correlation" achieved the best discrimination (AUC 0.90, 95% confidence interval [CI]: 0.75-1.00) and lowest error (RMSE 0.32, 95% CI: 0.20-0.42) at prediction of APF. This model was superior to all other models containing all radiomic features, clinical variables, and/or the MAP score. The performance of uninformative benchmark models for prediction of postoperative complications and surgery duration was not improved by machine learning models. CONCLUSION: Radiomic features derived from PNF may provide valuable information for preoperative risk stratification of patients undergoing renal tumor surgery.
Subjects
Kidney Neoplasms; Humans; Kidney/diagnostic imaging; Kidney/pathology; Kidney/surgery; Kidney Neoplasms/diagnostic imaging; Kidney Neoplasms/pathology; Kidney Neoplasms/surgery; Machine Learning; Postoperative Complications/diagnostic imaging; Postoperative Complications/etiology; Tomography, X-Ray Computed/methods
ABSTRACT
KEY POINTS:
• Although radiomics is potentially a promising approach to analyze medical image data, many pitfalls need to be considered to avoid a reproducibility crisis.
• There is a translation gap in radiomics research, with many studies being published but so far little to no translation into clinical practice.
• Going forward, more studies with higher levels of evidence are needed, ideally also focusing on prospective studies with relevant clinical impact.
Subjects
Reproducibility of Results; Humans; Prospective Studies
ABSTRACT
OBJECTIVES: The goal of the present study was to classify the most common types of plain radiographs using a neural network and to validate the network's performance on internal and external data. Such a network could help improve various radiological workflows. METHODS: All radiographs from the year 2017 (n = 71,274) acquired at our institution were retrieved from the PACS. The 30 largest categories (n = 58,219, 81.7% of all radiographs performed in 2017) were used to develop and validate a neural network (MobileNet v1.0) using transfer learning. Image categories were extracted from DICOM metadata (study and image description) and mapped to the WHO manual of diagnostic imaging. As an independent, external validation set, we used images from other institutions that had been stored in our PACS (n = 5324). RESULTS: In the internal validation, the overall accuracy of the model was 90.3% (95% CI: 89.2-91.3%), whereas, for the external validation set, the overall accuracy was 94.0% (95% CI: 93.3-94.6%). CONCLUSIONS: Using data from one single institution, we were able to classify the most common categories of radiographs with a neural network. The network showed good generalizability on the external validation set and could be used to automatically organize a PACS, preselect radiographs so that they can be routed to more specialized networks for abnormality detection or help with other parts of the radiological workflow (e.g., automated hanging protocols; check if ordered image and performed image are the same). The final AI algorithm is publicly available for evaluation and extension.
KEY POINTS:
• Data from one single institution can be used to train a neural network for the correct detection of the 30 most common categories of plain radiographs.
• The trained model achieved a high accuracy for the majority of categories and showed good generalizability to images from other institutions.
• The neural network is made publicly available and can be used to automatically organize a PACS or to preselect radiographs so that they can be routed to more specialized neural networks for abnormality detection.
Subjects
Deep Learning; Algorithms; Humans; Neural Networks, Computer; Radiography; Workflow
ABSTRACT
Machine learning offers great opportunities to streamline and improve clinical care from the perspective of cardiac imagers, patients, and the industry and is a very active scientific research field. In light of these advances, the European Society of Cardiovascular Radiology (ESCR), a non-profit medical society dedicated to advancing cardiovascular radiology, has assembled a position statement regarding the use of machine learning (ML) in cardiovascular imaging. The purpose of this statement is to provide guidance on requirements for successful development and implementation of ML applications in cardiovascular imaging. In particular, recommendations on how to adequately design ML studies and how to report and interpret their results are provided. Finally, we identify opportunities and challenges ahead. While the focus of this position statement is ML development in cardiovascular imaging, most considerations are relevant to ML in radiology in general.
KEY POINTS:
• Development and clinical implementation of machine learning in cardiovascular imaging is a multidisciplinary pursuit.
• Based on existing study quality standard frameworks such as SPIRIT and STARD, we propose a list of quality criteria for ML studies in radiology.
• The cardiovascular imaging research community should strive for the compilation of multicenter datasets for the development, evaluation, and benchmarking of ML algorithms.
Subjects
Machine Learning; Radiology; Algorithms; Humans; Radiography; Societies, Medical
ABSTRACT
BACKGROUND: In oncology, the correct determination of nodal metastatic disease is essential for patient management, as patient treatment and prognosis are closely linked to the stage of the disease. The aim of the study was to develop a tool for automatic 3D detection and segmentation of lymph nodes (LNs) in computed tomography (CT) scans of the thorax using a fully convolutional neural network based on 3D foveal patches. METHODS: The training dataset was collected from the Computed Tomography Lymph Nodes Collection of the Cancer Imaging Archive, containing 89 contrast-enhanced CT scans of the thorax. A total of 4275 LNs were segmented semi-automatically by a radiologist, assessing the entire 3D volume of the LNs. Using these data, a fully convolutional neural network based on 3D foveal patches was trained with fourfold cross-validation. Testing was performed on an unseen dataset containing 15 contrast-enhanced CT scans of patients who were referred upon suspicion or for staging of bronchial carcinoma. RESULTS: The algorithm achieved a good overall performance with a total detection rate of 76.9% for enlarged LNs during fourfold cross-validation in the training dataset with 10.3 false-positives per volume and of 69.9% in the unseen testing dataset. In the training dataset a better detection rate was observed for enlarged LNs compared to smaller LNs, the detection rate for LNs with a short-axis diameter (SAD) ≥ 20 mm and SAD 5-10 mm being 91.6% and 62.2% (p < 0.001), respectively. Best detection rates were obtained for LNs located in Level 4R (83.6%) and Level 7 (80.4%). CONCLUSIONS: The proposed 3D deep learning approach achieves an overall good performance in the automatic detection and segmentation of thoracic LNs and shows reasonable generalizability, yielding the potential to facilitate detection during routine clinical work and to enable radiomics research without observer bias.
Subjects
Carcinoma, Bronchogenic/diagnostic imaging; Deep Learning; Lung Neoplasms/diagnostic imaging; Lymph Nodes/diagnostic imaging; Neural Networks, Computer; Tomography, X-Ray Computed/methods; Adult; Aged; Axilla; Contrast Media/administration & dosage; Datasets as Topic; Female; Humans; Lymphatic Metastasis/diagnostic imaging; Male; Mediastinum; Middle Aged; Thorax
ABSTRACT
Current research, especially in oncology, increasingly focuses on the integration of quantitative, multiparametric and functional imaging data. In this fast-growing field of research, radiomics may allow for a more sophisticated analysis of imaging data, far beyond the qualitative evaluation of visible tissue changes. Through use of quantitative imaging data, more tailored and tumour-specific diagnostic work-up and individualized treatment concepts may be applied for oncologic patients in the future. This is of special importance in cross-sectional disciplines such as radiology and radiation oncology, with already high and still further increasing use of imaging data in daily clinical practice. Liver targets are generally treated with stereotactic body radiotherapy (SBRT), allowing for local dose escalation while preserving surrounding normal tissue. With the introduction of online target surveillance with implanted markers, 3D-ultrasound on conventional linacs and hybrid magnetic resonance imaging (MRI)-linear accelerators, individualized adaptive radiotherapy is heading towards realization. The use of big data such as radiomics and the integration of artificial intelligence techniques have the potential to further improve image-based treatment planning and structured follow-up, with outcome/toxicity prediction and immediate detection of (oligo)progression. Current research in this innovative field aims to identify and critically discuss possible applications of radiomics. This review therefore summarizes current knowledge about the interdisciplinary integration of radiomics in oncologic patients, with a focus on radiotherapy for liver cancer or oligometastases, including the incorporation of multiparametric, quantitative data into the (radio-)oncologic workflow from disease diagnosis through treatment planning and delivery to patient follow-up.
Subjects
Computational Biology , Image Processing, Computer-Assisted/methods , Liver Neoplasms/diagnostic imaging , Radiation Oncology/methods , Aftercare , Chemoembolization, Therapeutic , Combined Modality Therapy , Deep Learning , Humans , Liver Neoplasms/secondary , Liver Neoplasms/therapy , Organs at Risk , Prognosis , Radiosurgery , Radiotherapy Dosage , Radiotherapy Planning, Computer-Assisted/methods , Radiotherapy, Image-Guided , Surgery, Computer-Assisted
ABSTRACT
OBJECTIVES: To evaluate whether a computed tomography (CT) radiomics-based machine learning classifier can predict histopathology of lymph nodes (LNs) after post-chemotherapy retroperitoneal LN dissection (pcRPLND) in patients with metastatic non-seminomatous testicular germ cell tumors (NSTGCTs). METHODS: Eighty patients with retroperitoneal LN metastases and contrast-enhanced CT were included in this retrospective study. Resected LNs were histopathologically classified into "benign" (necrosis/fibrosis) or "malignant" (viable tumor/teratoma). On CT imaging, 204 corresponding LNs were segmented and 97 radiomic features per LN were extracted after standardized image processing. The dataset was split into training, test, and validation sets. After stepwise feature reduction based on reproducibility, variable importance, and correlation analyses, a gradient-boosted tree was trained and tuned on the selected most important features using the training and test datasets. Model validation was performed on the independent validation dataset. RESULTS: The trained machine learning classifier achieved a classification accuracy of 0.81 in the validation dataset with a misclassification of 8 of 36 benign LNs as malignant and 4 of 25 malignant LNs as benign (sensitivity 88%, specificity 72%, negative predictive value 88%). In contrast, a model containing only the LN volume resulted in a classification accuracy of 0.68 with 64% sensitivity and 68% specificity. CONCLUSIONS: CT radiomics represents an exciting new tool for improved prediction of the presence of malignant histopathology in retroperitoneal LN metastases from NSTGCTs, aiming at reducing overtreatment in this group of young patients. Thus, the presented approach should be combined with established clinical biomarkers and further validated in larger, prospective clinical trials.
KEY POINTS:
• Patients with metastatic non-seminomatous testicular germ cell tumors undergoing post-chemotherapy retroperitoneal lymph node dissection of residual lesions show overtreatment in up to 50%.
• We assessed whether a CT radiomics-based machine learning classifier can predict histopathology of lymph nodes after post-chemotherapy lymph node dissection.
• The trained machine learning classifier achieved a classification accuracy of 0.81 in the validation dataset with a sensitivity of 88% and a specificity of 78%, thus allowing for prediction of the presence of viable tumor or teratoma in retroperitoneal lymph node metastases.
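As a worked check of how such figures relate to misclassification counts, the following minimal Python sketch derives the standard diagnostic metrics from the validation counts stated in the abstract (8 of 36 benign and 4 of 25 malignant LNs misclassified). It assumes that "malignant" is the positive class, which the abstract does not state; the function and variable names are illustrative and are not taken from the study.

```python
def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> dict[str, float]:
    """Return standard diagnostic metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),            # true-positive rate
        "specificity": tn / (tn + fp),            # true-negative rate
        "npv": tn / (tn + fn),                    # negative predictive value
        "ppv": tp / (tp + fp),                    # positive predictive value
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Counts as stated in the abstract; ASSUMPTION: positive class = malignant.
benign_total, benign_as_malignant = 36, 8
malignant_total, malignant_as_benign = 25, 4

result = diagnostic_metrics(
    tp=malignant_total - malignant_as_benign,  # malignant LNs called malignant
    fn=malignant_as_benign,                    # malignant LNs called benign
    tn=benign_total - benign_as_malignant,     # benign LNs called benign
    fp=benign_as_malignant,                    # benign LNs called malignant
)

for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Under this assumption the counts give a sensitivity of 84%, a specificity of about 78%, a negative predictive value of about 88%, and an accuracy of about 0.80. The abstract does not say how its reported percentages were derived, so they need not coincide with these values.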
Subjects
Computational Biology , Lymph Nodes/diagnostic imaging , Machine Learning , Neoplasms, Germ Cell and Embryonal/diagnostic imaging , Testicular Neoplasms/diagnostic imaging , Adult , Humans , Lymph Node Excision , Lymph Nodes/pathology , Lymphatic Metastasis , Male , Middle Aged , Neoplasm Staging , Neoplasms, Germ Cell and Embryonal/pathology , Neoplasms, Germ Cell and Embryonal/therapy , Orchiectomy , Reproducibility of Results , Retroperitoneal Space , Retrospective Studies , Testicular Neoplasms/pathology , Testicular Neoplasms/therapy , Tomography, X-Ray Computed/methods , Young Adult
ABSTRACT
OBJECTIVES: Interventional radiology (IR) is a growing field but is underrepresented in most medical school curricula. We tested whether endovascular simulator training improves medical students' attitudes towards IR. MATERIALS AND METHODS: We conducted this prospective study at two university medical centers; overall, 305 fourth-year medical students completed a 90-min IR course. The class consisted of theoretical and practical parts involving endovascular simulators. Students completed questionnaires before the course, after the theoretical part, and after the practical part. On a 7-point Likert scale, they rated their interest in IR, knowledge of IR, attractiveness of IR, and the likelihood of choosing IR as a subspecialty. We used a crossover design to prevent position-effect bias. RESULTS: The seminar/simulator parts led to improvement in all items compared with baseline: interest in IR (pre-course 5.2 vs. post-seminar/post-simulator 5.5/5.7), knowledge of IR (pre-course 2.7 vs. post-seminar/post-simulator 5.1/5.4), attractiveness of IR (pre-course 4.6 vs. post-seminar/post-simulator 4.8/5.0), and the likelihood of choosing IR as a subspecialty (pre-course 3.3 vs. post-seminar/post-simulator 3.8/4.1). The effect was significantly stronger for simulator training than for the seminar for all items (p < 0.05). For simulator training, subgroup analysis of students with pre-existing positive attitude showed considerable improvement regarding "interest in IR" (× 1.4), "knowledge of IR" (× 23), "attractiveness of IR" (× 2), and "likelihood to choose IR" (× 3.2) compared with pretest. CONCLUSION: Endovascular simulator training significantly improves students' attitude towards IR regarding all items. Implementing such courses at a very early stage in the curriculum should be the first step to expose medical students to IR and push for IR.
KEY POINTS:
• Dedicated IR courses have a significant positive effect on students' attitudes towards IR.
• Simulator training is superior to a theoretical seminar in positively influencing students' attitudes towards IR.
• Implementing dedicated IR courses in medical school might ease recruitment problems in the field.
Subjects
Clinical Competence , Curriculum , Education, Medical, Undergraduate/methods , Radiology, Interventional/education , Simulation Training/methods , Students, Medical , Academic Medical Centers , Adult , Female , Humans , Male , Prospective Studies , Surveys and Questionnaires
ABSTRACT
PURPOSE OF REVIEW: The aim of this structured review is to summarize the current research applications and opportunities arising from artificial intelligence (AI) and texture analysis with regard to cardiac imaging. RECENT FINDINGS: Current research findings suggest tremendous potential for AI in cardiac imaging, especially with regard to objective image analyses, overcoming the limitations of an observer-dependent subjective image interpretation. Researchers have used this technique across multiple imaging modalities, for instance to detect myocardial scars in cardiac MR imaging, to predict contrast enhancement in non-contrast studies, and to improve image acquisition and reconstruction. AI in medical imaging has the potential to provide novel, much-needed applications for improving patient care pertaining to the cardiovascular system. While several shortcomings are still present in the current methodology, AI may serve as a resourceful assistant to radiologists and clinicians alike.