Results 1 - 20 of 26
1.
AJR Am J Roentgenol ; 2024 Apr 10.
Article in English | MEDLINE | ID: mdl-38598354

ABSTRACT

Large language models (LLMs) hold immense potential to revolutionize radiology. However, their integration into practice requires careful consideration. Artificial intelligence (AI) chatbots and general-purpose LLMs have potential pitfalls related to privacy, transparency, and accuracy, limiting their current clinical readiness. Thus, LLM-based tools must be optimized for radiology practice to overcome these limitations. While research and validation for radiology applications remain in their infancy, commercial products incorporating LLMs are becoming available alongside promises of transforming practice. To help radiologists navigate this landscape, this AJR Expert Panel Narrative Review provides a multidimensional perspective on LLMs, encompassing considerations from bench (development and optimization) to bedside (use in practice). At present, LLMs are not autonomous entities that can replace expert decision-making, and radiologists remain responsible for the content of their reports. Patient-facing tools, particularly medical AI chatbots, require additional guardrails to ensure safety and prevent misuse. Still, if responsibly implemented, LLMs are well-positioned to transform efficiency and quality in radiology. Radiologists must be well-informed and proactively involved in guiding the implementation of LLMs in practice to mitigate risks and maximize benefits to patient care.

2.
Radiol Artif Intell ; 6(3): e230227, 2024 May.
Article in English | MEDLINE | ID: mdl-38477659

ABSTRACT

The Radiological Society of North America (RSNA) has held artificial intelligence competitions to tackle real-world medical imaging problems at least annually since 2017. This article examines the challenges and processes involved in organizing these competitions, with a specific emphasis on the creation and curation of high-quality datasets. The collection of diverse and representative medical imaging data involves dealing with issues of patient privacy and data security. Furthermore, ensuring quality and consistency in the data, including expert labeling and accounting for varied patient and imaging characteristics, requires substantial planning and resources. Overcoming these obstacles demands meticulous project management and adherence to strict timelines. The article also highlights the potential of crowdsourced annotation to advance medical imaging research. The RSNA competitions have achieved broad global engagement, producing innovative solutions to complex medical imaging problems and thus potentially transforming health care by improving diagnostic accuracy and patient outcomes. Keywords: Use of AI in Education, Artificial Intelligence © RSNA, 2024.


Subject(s)
Artificial Intelligence , Radiology , Humans , Diagnostic Imaging/methods , Societies, Medical , North America
3.
Radiol Artif Intell ; 6(1): e230103, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38294325

ABSTRACT

This prospective exploratory study, conducted from January 2023 through May 2023, evaluated the ability of ChatGPT to answer questions from Brazilian radiology board examinations, exploring how different prompt strategies influence the performance of GPT-3.5 and GPT-4. Three multiple-choice board examinations that did not include image-based questions were evaluated: (a) radiology and diagnostic imaging, (b) mammography, and (c) neuroradiology. Five styles of zero-shot prompting were tested: (a) raw question, (b) brief instruction, (c) long instruction, (d) chain-of-thought, and (e) question-specific automatic prompt generation (QAPG). The QAPG and brief instruction strategies performed best on all examinations (P < .05), obtaining passing scores (≥60%) on the radiology and diagnostic imaging examination with both versions of ChatGPT. The QAPG style achieved a score of 60% on the mammography examination using GPT-3.5 and 76% using GPT-4. GPT-4 scored up to 65% on the neuroradiology examination. The long instruction style consistently underperformed, suggesting that excessive prompt detail can harm performance. GPT-4's scores were less sensitive to changes in prompt style. The QAPG style elicited a high proportion of "A" answers, though the difference was not statistically significant, suggesting a possible response bias. GPT-4 passed all three radiology board examinations, and GPT-3.5 passed two of the three when using an optimal prompt style. Keywords: ChatGPT, Artificial Intelligence, Board Examinations, Radiology and Diagnostic Imaging, Mammography, Neuroradiology © RSNA, 2023 See also the commentary by Trivedi and Gichoya in this issue.
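The five prompting styles are easiest to compare side by side. Below is a minimal sketch using the OpenAI Python client; the instruction wordings and the simple QAPG helper are illustrative assumptions, not the study's exact prompts.

```python
# Sketch of the five zero-shot prompt styles applied to one board question.
# All instruction texts are hypothetical examples, not the study's prompts.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

QUESTION = "Which finding is most specific for X?\nA) ...\nB) ...\nC) ...\nD) ..."

def ask(prompt: str, model: str = "gpt-4") -> str:
    """Send one zero-shot prompt and return the model's answer text."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic output makes styles comparable
    )
    return response.choices[0].message.content

def qapg(question: str) -> str:
    """Question-specific automatic prompt generation (sketch): have the
    model write an instruction tailored to the question, then prepend it."""
    instruction = ask("Write a short instruction for answering this "
                      f"multiple-choice question:\n{question}")
    return f"{instruction}\n\n{question}"

styles = {
    "raw_question": QUESTION,
    "brief_instruction": f"Answer with a single letter.\n\n{QUESTION}",
    "long_instruction": ("You are a board-certified radiologist taking an "
                         "examination. Read carefully, weigh every option, "
                         f"and reply with only one letter.\n\n{QUESTION}"),
    "chain_of_thought": f"{QUESTION}\n\nLet's think step by step.",
    "qapg": qapg(QUESTION),
}

for name, prompt in styles.items():
    print(name, "->", ask(prompt))
```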


Subject(s)
Artificial Intelligence , Radiology , Brazil , Prospective Studies , Radiography , Mammography
4.
Eur Radiol ; 34(3): 2024-2035, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37650967

ABSTRACT

OBJECTIVES: To evaluate the performance of a deep learning (DL)-based model for multiple sclerosis (MS) lesion segmentation and compare it with other DL-based and non-DL-based algorithms. METHODS: In this ambispective, multicenter study, models were tested on internal (n = 20) and external (n = 18) datasets from Latin America and on an external challenge dataset from Europe (n = 49). We also examined robustness by rescanning six patients from our MS clinical cohort, and we measured inter-annotator agreement among human experts to contextualize the findings. Performance and robustness were assessed using the intraclass correlation coefficient (ICC), Dice coefficient (DC), and coefficient of variation (CV). RESULTS: Inter-annotator ICC ranged from 0.89 to 0.95, and spatial agreement among annotators showed a median DC of 0.63. Using expert manual segmentations as the ground truth, our DL model achieved a median DC of 0.73 on the internal dataset, 0.66 on the external dataset, and 0.70 on the challenge dataset, exceeding the performance of the alternative algorithms on all datasets. In the robustness experiment, our DL model also achieved a higher DC (0.82 to 0.90) and lower CV (0.7% to 7.9%) than the alternative methods. CONCLUSION: Our DL-based model outperformed alternative methods for brain MS lesion segmentation, generalized well to unseen data, and delivered robust performance with low processing times on both real-world and challenge data. CLINICAL RELEVANCE STATEMENT: The model's superior accuracy, robustness, and efficiency in segmenting brain MS lesions indicate its potential for clinical application. KEY POINTS: • Automated lesion load quantification in MS patients is valuable; however, more accurate methods are still needed. • A novel deep learning model outperformed alternative MS lesion segmentation methods on multisite datasets. • Deep learning models are particularly suitable for MS lesion segmentation in clinical scenarios.
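The two agreement metrics reported above are simple to compute from binary lesion masks and repeated volume measurements. A minimal NumPy sketch, with illustrative toy arrays rather than study data:

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice coefficient between binary masks: 2|A ∩ B| / (|A| + |B|)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    denom = pred.sum() + truth.sum()
    return 2.0 * np.logical_and(pred, truth).sum() / denom if denom else 1.0

def coefficient_of_variation(volumes: np.ndarray) -> float:
    """CV (%) of lesion volumes across repeated scans of one patient."""
    return 100.0 * volumes.std(ddof=1) / volumes.mean()

pred = np.array([[0, 1, 1], [0, 1, 0]])
truth = np.array([[0, 1, 0], [1, 1, 0]])
print(f"Dice: {dice_coefficient(pred, truth):.2f}")  # 2*2/(3+3) = 0.67

rescan_volumes = np.array([12.1, 12.4, 11.9])  # mL, hypothetical scan-rescan
print(f"CV: {coefficient_of_variation(rescan_volumes):.1f}%")  # ≈ 2.1%
```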


Subject(s)
Magnetic Resonance Imaging , Multiple Sclerosis , Humans , Magnetic Resonance Imaging/methods , Multiple Sclerosis/diagnostic imaging , Multiple Sclerosis/pathology , Neural Networks, Computer , Algorithms , Brain/diagnostic imaging , Brain/pathology
5.
Radiology ; 309(1): e232372, 2023 10.
Article in English | MEDLINE | ID: mdl-37787677

Subject(s)
Radiology , Humans , Radiography
6.
Radiol Artif Intell ; 5(5): e230034, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37795143

ABSTRACT

This dataset is composed of cervical spine CT images with annotations related to fractures; it is available at https://www.kaggle.com/competitions/rsna-2022-cervical-spine-fracture-detection/.

7.
J Digit Imaging ; 36(5): 2306-2312, 2023 10.
Article in English | MEDLINE | ID: mdl-37407841

ABSTRACT

Since 2000, there have been more than 8000 publications on radiology artificial intelligence (AI). AI breakthroughs allow complex tasks to be automated and even performed beyond human capabilities. However, a lack of detail about methods and unavailable algorithm code undercut the scientific value of such work. Many scientific subfields have recently faced a reproducibility crisis, eroding trust in processes and results and contributing to a rise in retractions. For the same reasons, research in deep learning (DL) also requires reproducibility. Although several valuable manuscript checklists for AI in medical imaging exist, none focuses specifically on reproducibility. In this study, we conducted a systematic review of recently published DL papers to evaluate whether their described methodology would allow their findings to be reproduced. We focused on the Journal of Digital Imaging (JDI), a specialized journal that publishes papers on AI and medical imaging. Using the keyword "Deep Learning," we collected articles published between January 2020 and January 2022, screening all of them and including those that reported the development of a DL tool in medical imaging. From each included article, we extracted the reported details about the dataset, data handling steps, data splitting, model details, and performance metrics. The search yielded 148 articles, of which 80 reported developing a DL model for medical image analysis and were included. Five studies made their code publicly available, and 35 used publicly available datasets. We provide figures showing the proportion and absolute count of reported items across the included studies. According to our cross-sectional study, authors of JDI publications on DL in medical imaging infrequently report the key elements needed to make their work reproducible.
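The extraction sheet behind such a review can be modeled as one record per paper. A minimal sketch, assuming hypothetical field names that mirror the items listed above:

```python
from dataclasses import dataclass

@dataclass
class ReproducibilityRecord:
    """One extraction-sheet row per reviewed paper; field names are
    illustrative stand-ins for the items checked in the review."""
    title: str
    dataset_described: bool = False
    dataset_public: bool = False
    data_handling_described: bool = False
    data_split_reported: bool = False
    model_details_reported: bool = False
    metrics_reported: bool = False
    code_public: bool = False

records = [
    ReproducibilityRecord("Example CNN study", dataset_public=True,
                          data_split_reported=True),
]

# Aggregate the share of studies reporting each item.
n = len(records)
print(f"code public: {sum(r.code_public for r in records)}/{n}")
print(f"public data: {sum(r.dataset_public for r in records)}/{n}")
```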


Subject(s)
Artificial Intelligence , Diagnostic Imaging , Humans , Cross-Sectional Studies , Reproducibility of Results , Algorithms
9.
Semin Roentgenol ; 58(2): 203-207, 2023 Apr.
Article in English | MEDLINE | ID: mdl-37087141

ABSTRACT

The use of artificial intelligence (AI) in radiology is increasing. Although healthcare facilities are eager to adopt the technology, making an AI project succeed can be difficult. A myriad of AI solutions exists today, and comparing them is challenging. Moreover, implementation requires alignment across many different areas of the organization. At our institution, we have been testing, developing, deploying, and monitoring AI solutions for the past four years. This article shares our experience and highlights the points most important to a successful project, based on our work in a large private practice in Latin America.


Subject(s)
Artificial Intelligence , Radiology , Humans , Latin America , Radiography
11.
Sci Rep ; 13(1): 1383, 2023 01 25.
Article in English | MEDLINE | ID: mdl-36697450

ABSTRACT

Artificial intelligence (AI)-generated clinical advice is becoming more prevalent in healthcare. However, the impact of AI-generated advice on physicians' decision-making is underexplored. In this study, physicians received X-rays with correct diagnostic advice and were asked to make a diagnosis, rate the advice's quality, and judge their own confidence. We manipulated whether the advice came with or without a visual annotation on the X-rays, and whether it was labeled as coming from an AI or a human radiologist. Overall, receiving annotated advice from an AI resulted in the highest diagnostic accuracy. Physicians rated the quality of AI advice higher than human advice. We did not find a strong effect of either manipulation on participants' confidence. The magnitude of the effects varied between task experts and non-task experts, with the latter benefiting considerably from correct explainable AI advice. These findings raise important considerations for the deployment of diagnostic advice in healthcare.
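The experiment crosses two binary factors: advice source (AI vs. human) and visual annotation (present vs. absent). A minimal sketch of the per-condition accuracy comparison, with hypothetical trial records standing in for the study's data:

```python
from itertools import product

# Hypothetical trial records; the study's actual data are not reproduced here.
trials = [
    {"source": "AI", "annotated": True, "correct": True},
    {"source": "AI", "annotated": False, "correct": False},
    {"source": "human", "annotated": True, "correct": True},
    {"source": "human", "annotated": False, "correct": True},
]

# Mean diagnostic accuracy in each cell of the 2 x 2 design.
for source, annotated in product(("AI", "human"), (True, False)):
    cell = [t["correct"] for t in trials
            if t["source"] == source and t["annotated"] == annotated]
    if cell:
        acc = sum(cell) / len(cell)
        print(f"{source}, annotated={annotated}: accuracy {acc:.2f}")
```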


Subject(s)
Artificial Intelligence , Physicians , Humans , X-Rays , Radiography , Radiologists
13.
Medicine (Baltimore) ; 101(29): e29587, 2022 Jul 22.
Article in English | MEDLINE | ID: mdl-35866818

ABSTRACT

To tune and test the generalizability of a deep learning-based model for assessment of COVID-19 lung disease severity on chest radiographs (CXRs) from different patient populations. A published convolutional Siamese neural network-based model, previously trained on hospitalized patients with COVID-19, was tuned using 250 outpatient CXRs. The model produces a quantitative measure of COVID-19 lung disease severity, the pulmonary x-ray severity (PXS) score. It was evaluated on CXRs from four test sets: three from the United States (patients hospitalized at an academic medical center (N = 154), patients hospitalized at a community hospital (N = 113), and outpatients (N = 108)) and one from Brazil (patients at an academic medical center emergency department (N = 303)). Radiologists from both countries independently assigned reference standard CXR severity scores, which were correlated with the PXS scores as a measure of model performance (Pearson r). The Uniform Manifold Approximation and Projection (UMAP) technique was used to visualize the neural network results. Tuning the deep learning model with outpatient data yielded high performance on the two United States hospitalized-patient datasets (r = 0.88 and r = 0.90, compared with a baseline of r = 0.86). Performance was similar, though slightly lower, on the United States outpatient and Brazil emergency department datasets (r = 0.86 and r = 0.85, respectively). UMAP showed that the model learned disease severity information that generalized across test sets. A deep learning model that extracts a COVID-19 severity score from CXRs showed generalizable performance across multiple populations from two continents, including outpatients and hospitalized patients.
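Model performance here reduces to correlating two score vectors per test set. A minimal sketch with SciPy; the arrays are illustrative values, not study data:

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical scores for one test set: model PXS vs. radiologist reference.
pxs_scores = np.array([1.2, 3.5, 5.8, 2.1, 4.4, 6.3])
reference = np.array([1.0, 4.0, 6.0, 2.0, 5.0, 6.5])

r, p_value = pearsonr(pxs_scores, reference)
print(f"Pearson r = {r:.2f} (p = {p_value:.4f})")
```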


Subject(s)
COVID-19 , Deep Learning , COVID-19/diagnostic imaging , Humans , Lung , Radiography, Thoracic/methods , Radiologists
14.
Radiology ; 303(1): 52-53, 2022 04.
Article in English | MEDLINE | ID: mdl-35014902

ABSTRACT

Online supplemental material is available for this article.


Subject(s)
Artificial Intelligence , Humans
15.
Radiol Artif Intell ; 3(4): e200184, 2021 Jul.
Article in English | MEDLINE | ID: mdl-34350408

ABSTRACT

PURPOSE: To develop a deep learning model for detecting brain abnormalities on MR images. MATERIALS AND METHODS: In this retrospective study, a deep learning approach using T2-weighted fluid-attenuated inversion recovery images was developed to classify brain MRI findings as "likely normal" or "likely abnormal." A convolutional neural network model was trained on a large, heterogeneous dataset collected from two different continents and covering a broad panel of pathologic conditions, including neoplasms, hemorrhages, infarcts, and others. Three datasets were used. Dataset A consisted of 2839 patients, dataset B consisted of 6442 patients, and dataset C consisted of 1489 patients and was only used for testing. Datasets A and B were split into training, validation, and test sets. A total of three models were trained: model A (using only dataset A), model B (using only dataset B), and model A + B (using training datasets from A and B). All three models were tested on subsets from dataset A, dataset B, and dataset C separately. The evaluation was performed by using annotations based on the images, as well as labels based on the radiology reports. RESULTS: Model A trained on dataset A from one institution and tested on dataset C from another institution reached an F1 score of 0.72 (95% CI: 0.70, 0.74) and an area under the receiver operating characteristic curve of 0.78 (95% CI: 0.75, 0.80) when compared with findings from the radiology reports. CONCLUSION: The model shows relatively good performance for differentiating between likely normal and likely abnormal brain examination findings by using data from different institutions. Keywords: MR-Imaging, Head/Neck, Computer Applications-General (Informatics), Convolutional Neural Network (CNN), Deep Learning Algorithms, Machine Learning Algorithms © RSNA, 2021. Supplemental material is available for this article.
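The reported F1 score and area under the ROC curve can be computed from predicted probabilities and report-derived labels. A minimal scikit-learn sketch, using hypothetical values rather than the study's data:

```python
from sklearn.metrics import f1_score, roc_auc_score

# Hypothetical labels (1 = likely abnormal) and model output probabilities.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.7, 0.4, 0.3, 0.6, 0.8, 0.1]
y_pred = [int(p >= 0.5) for p in y_prob]  # threshold probabilities at 0.5

print(f"F1:  {f1_score(y_true, y_pred):.2f}")   # 0.75 for these toy values
print(f"AUC: {roc_auc_score(y_true, y_prob):.2f}")  # 0.94 for these toy values
```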

18.
Radiology ; 299(1): E204-E213, 2021 04.
Article in English | MEDLINE | ID: mdl-33399506

ABSTRACT

The coronavirus disease 2019 (COVID-19) pandemic is a global health care emergency. Although reverse-transcription polymerase chain reaction testing is the reference standard method to identify patients with COVID-19 infection, chest radiography and CT play a vital role in the detection and management of these patients. Prediction models for COVID-19 imaging are rapidly being developed to support medical decision making. However, inadequate availability of a diverse annotated data set has limited the performance and generalizability of existing models. To address this unmet need, the RSNA and the Society of Thoracic Radiology collaborated to develop the RSNA International COVID-19 Open Radiology Database (RICORD), the first multi-institutional, multinational, expert-annotated COVID-19 imaging data set. It is made freely available to the machine learning community as a research and educational resource for COVID-19 chest imaging. Thoracic radiology subspecialists performed pixel-level volumetric segmentation with clinical annotations for all COVID-19-positive thoracic CT scans. The labeling schema was coordinated with other international consensus panels and COVID-19 data annotation efforts, including the European Society of Medical Imaging Informatics, the American College of Radiology, and the American Association of Physicists in Medicine. Study-level COVID-19 classification labels for chest radiographs were annotated by three radiologists, with majority vote adjudication by board-certified radiologists. RICORD consists of 240 thoracic CT scans and 1000 chest radiographs contributed by four international sites. It is anticipated that RICORD will enable prediction models that demonstrate sustained performance across populations and health care systems.
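The study-level label adjudication described above amounts to a majority vote over three annotators, with disagreements escalated to expert review. A minimal sketch, assuming hypothetical label names rather than RSNA's exact schema:

```python
from collections import Counter

def study_level_label(votes: list[str]) -> str:
    """Majority vote across three annotators; a sketch of the adjudication
    step, not RSNA's exact pipeline. Ties go to expert review."""
    label, count = Counter(votes).most_common(1)[0]
    return label if count >= 2 else "expert review"

print(study_level_label(["typical", "typical", "indeterminate"]))  # typical
print(study_level_label(["typical", "negative", "atypical"]))  # expert review
```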


Subject(s)
COVID-19/diagnostic imaging , Databases, Factual/statistics & numerical data , Global Health/statistics & numerical data , Lung/diagnostic imaging , Tomography, X-Ray Computed/methods , Humans , Internationality , Radiography, Thoracic , Radiology , SARS-CoV-2 , Societies, Medical , Tomography, X-Ray Computed/statistics & numerical data
19.
medRxiv ; 2020 Sep 18.
Article in English | MEDLINE | ID: mdl-32995811

ABSTRACT

PURPOSE: To improve and test the generalizability of a deep learning-based model for assessment of COVID-19 lung disease severity on chest radiographs (CXRs) from different patient populations. MATERIALS AND METHODS: A published convolutional Siamese neural network-based model previously trained on hospitalized patients with COVID-19 was tuned using 250 outpatient CXRs. This model produces a quantitative measure of COVID-19 lung disease severity, the pulmonary x-ray severity (PXS) score. The model was evaluated on CXRs from four test sets: three from the United States (patients hospitalized at an academic medical center (N=154), patients hospitalized at a community hospital (N=113), and outpatients (N=108)) and one from Brazil (patients at an academic medical center emergency department (N=303)). Radiologists from both countries independently assigned reference standard CXR severity scores, which were correlated with the PXS scores as a measure of model performance (Pearson r). The Uniform Manifold Approximation and Projection (UMAP) technique was used to visualize the neural network results. RESULTS: Tuning the deep learning model with outpatient data improved model performance in two United States hospitalized patient datasets (r=0.88 and r=0.90, compared to baseline r=0.86). Model performance was similar, though slightly lower, when tested on the United States outpatient and Brazil emergency department datasets (r=0.86 and r=0.85, respectively). UMAP showed that the model learned disease severity information that generalized across test sets. CONCLUSIONS: Performance of a deep learning-based model that extracts a COVID-19 severity score on CXRs improved with training data from a different patient cohort (outpatient versus hospitalized) and generalized across multiple populations.
