1.
Radiology ; 311(3): e232653, 2024 06.
Article in English | MEDLINE | ID: mdl-38888474

ABSTRACT

The deployment of artificial intelligence (AI) solutions in radiology practice creates new demands on existing imaging workflow. Accommodating custom integrations creates a substantial operational and maintenance burden. These custom integrations also increase the likelihood of unanticipated problems. Standards-based interoperability facilitates AI integration with systems from different vendors into a single environment by enabling seamless exchange between information systems in the radiology workflow. Integrating the Healthcare Enterprise (IHE) is an initiative to improve how computer systems share information across health care domains, including radiology. IHE integrates existing standards, such as Digital Imaging and Communications in Medicine, Health Level Seven, and health care lexicons and ontologies (ie, LOINC, RadLex, SNOMED Clinical Terms), by mapping data elements from one standard to another. IHE Radiology manages profiles (standards-based implementation guides) for departmental workflow and information sharing across care sites, including profiles for scaling AI processing traffic and integrating AI results. This review focuses on the need for standards-based interoperability to scale AI integration in radiology, including a brief review of recent IHE profiles that provide a framework for AI integration. This review also discusses challenges and additional considerations for AI integration, including technical, clinical, and policy perspectives.


Subject(s)
Artificial Intelligence , Radiology Information Systems , Systems Integration , Workflow , Radiology/standards , Radiology Information Systems/standards
2.
Radiology ; 311(1): e232133, 2024 04.
Article in English | MEDLINE | ID: mdl-38687216

ABSTRACT

Background The performance of publicly available large language models (LLMs) remains unclear for complex clinical tasks. Purpose To evaluate the agreement between human readers and LLMs for Breast Imaging Reporting and Data System (BI-RADS) categories assigned based on breast imaging reports written in three languages and to assess the impact of discordant category assignments on clinical management. Materials and Methods This retrospective study included reports for women who underwent MRI, mammography, and/or US for breast cancer screening or diagnostic purposes at three referral centers. Reports with findings categorized as BI-RADS 1-5 and written in Italian, English, or Dutch were collected between January 2000 and October 2023. Board-certified breast radiologists and the LLMs GPT-3.5 and GPT-4 (OpenAI) and Bard, now called Gemini (Google), assigned BI-RADS categories using only the findings described by the original radiologists. Agreement between human readers and LLMs for BI-RADS categories was assessed using the Gwet agreement coefficient (AC1 value). Frequencies were calculated for changes in BI-RADS category assignments that would affect clinical management (ie, BI-RADS 0 vs BI-RADS 1 or 2 vs BI-RADS 3 vs BI-RADS 4 or 5) and compared using the McNemar test. Results Across 2400 reports, agreement between the original and reviewing radiologists was almost perfect (AC1 = 0.91), while agreement between the original radiologists and GPT-4, GPT-3.5, and Bard was moderate (AC1 = 0.52, 0.48, and 0.42, respectively). 
Across human readers and LLMs, differences were observed in the frequency of BI-RADS category upgrades or downgrades that would result in changed clinical management (118 of 2400 [4.9%] for human readers, 611 of 2400 [25.5%] for Bard, 573 of 2400 [23.9%] for GPT-3.5, and 435 of 2400 [18.1%] for GPT-4; P < .001) and that would negatively impact clinical management (37 of 2400 [1.5%] for human readers, 435 of 2400 [18.1%] for Bard, 344 of 2400 [14.3%] for GPT-3.5, and 255 of 2400 [10.6%] for GPT-4; P < .001). Conclusion LLMs achieved moderate agreement with human reader-assigned BI-RADS categories across reports written in three languages but also yielded a high percentage of discordant BI-RADS categories that would negatively impact clinical management. © RSNA, 2024 Supplemental material is available for this article.
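As an illustration of the agreement statistic named in this abstract, the following is a minimal sketch of Gwet's first-order agreement coefficient (AC1) for two raters and a fixed set of nominal categories. The function name `gwet_ac1` and the toy ratings are hypothetical and are not taken from the study; the sketch uses the standard AC1 definition, in which chance agreement is the sum of pi_k(1 - pi_k) over categories divided by (K - 1), with pi_k the mean of the two raters' marginal proportions for category k.

```python
from collections import Counter


def gwet_ac1(rater_a, rater_b, categories):
    """Gwet's AC1 for two raters over a fixed list of nominal categories."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    if len(categories) < 2:
        raise ValueError("at least two categories are required")

    n = len(rater_a)
    k = len(categories)

    # Observed agreement: fraction of items given the same category by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: based on the average marginal proportion per category.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    chance = 0.0
    for category in categories:
        pi = (counts_a[category] + counts_b[category]) / (2 * n)
        chance += pi * (1 - pi)
    chance /= k - 1

    return (observed - chance) / (1 - chance)


if __name__ == "__main__":
    # Toy BI-RADS-like ratings, purely illustrative.
    radiologist = [1, 2, 2, 3, 4, 5, 4, 2]
    model = [1, 2, 3, 3, 4, 4, 4, 2]
    print(round(gwet_ac1(radiologist, model, [1, 2, 3, 4, 5]), 3))
```

Unlike Cohen's kappa, AC1 stays stable when category prevalence is highly skewed, which is one common reason for choosing it when some categories are rare.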


Subject(s)
Breast Neoplasms , Adult , Aged , Female , Humans , Middle Aged , Breast/diagnostic imaging , Breast Neoplasms/diagnostic imaging , Language , Magnetic Resonance Imaging/methods , Mammography/methods , Radiology Information Systems/statistics & numerical data , Retrospective Studies , Ultrasonography, Mammary/methods
3.
Radiology ; 311(2): e232369, 2024 05.
Article in English | MEDLINE | ID: mdl-38805727

ABSTRACT

The American College of Radiology Liver Imaging Reporting and Data System (LI-RADS) standardizes the imaging technique, reporting lexicon, disease categorization, and management for patients with or at risk for hepatocellular carcinoma (HCC). LI-RADS encompasses HCC surveillance with US; HCC diagnosis with CT, MRI, or contrast-enhanced US (CEUS); and treatment response assessment (TRA) with CT or MRI. LI-RADS was recently expanded to include CEUS TRA after nonradiation locoregional therapy or surgical resection. This report provides an overview of LI-RADS CEUS Nonradiation TRA v2024, including a lexicon of imaging findings, techniques, and imaging criteria for posttreatment tumor viability assessment. LI-RADS CEUS Nonradiation TRA v2024 takes into consideration differences in the CEUS appearance of viable tumor and posttreatment changes within and in close proximity to a treated lesion. Due to the high sensitivity of CEUS to vascular flow, posttreatment reactive changes commonly manifest as areas of abnormal perilesional enhancement without washout, especially in the first 3 months after treatment. To improve the accuracy of CEUS for nonradiation TRA, different diagnostic criteria are used to evaluate tumor viability within and outside of the treated lesion margin. Broader criteria for intralesional enhancement increase sensitivity for tumor viability detection. Stricter criteria for perilesional enhancement limit miscategorization of posttreatment reactive changes as viable tumor. Finally, the TRA algorithm reconciles intralesional and perilesional tumor viability assessment and assigns a single LI-RADS treatment response (LR-TR) category: LR-TR nonviable, LR-TR equivocal, or LR-TR viable.


Subject(s)
Carcinoma, Hepatocellular , Contrast Media , Liver Neoplasms , Ultrasonography , Humans , Liver Neoplasms/diagnostic imaging , Liver Neoplasms/radiotherapy , Carcinoma, Hepatocellular/diagnostic imaging , Carcinoma, Hepatocellular/radiotherapy , Ultrasonography/methods , Radiology Information Systems , Liver/diagnostic imaging , Treatment Outcome
4.
Radiology ; 312(2): e233332, 2024 08.
Article in English | MEDLINE | ID: mdl-39162630

ABSTRACT

The Ovarian-Adnexal Reporting and Data System (O-RADS) is an evidence-based clinical support system for ovarian and adnexal lesion assessment in women of average risk. The system has both US and MRI components with separate but complementary lexicons and assessment categories to assign the risk of malignancy. US is an appropriate initial imaging modality, and O-RADS US can accurately help to characterize most adnexal lesions. MRI is a valuable adjunct imaging tool to US, and O-RADS MRI can help to both confirm a benign diagnosis and accurately stratify lesions that are at risk for malignancy. This article will review the O-RADS US and MRI systems, highlight their similarities and differences, and provide an overview of the interplay between the systems. When used together, the O-RADS US and MRI systems can help to accurately diagnose benign lesions, assess the risk of malignancy in lesions suspicious for malignancy, and triage patients for optimal management.


Subject(s)
Adnexal Diseases , Magnetic Resonance Imaging , Ovarian Neoplasms , Radiology Information Systems , Ultrasonography , Humans , Female , Magnetic Resonance Imaging/methods , Adnexal Diseases/diagnostic imaging , Ovarian Neoplasms/diagnostic imaging , Ultrasonography/methods
5.
Radiology ; 312(2): e232914, 2024 08.
Article in English | MEDLINE | ID: mdl-39189902

ABSTRACT

Background Current terms used to describe the MRI findings for musculoskeletal infections are nonspecific and inconsistent. Purpose To develop and validate an MRI-based musculoskeletal infection classification and scoring system. Materials and Methods In this retrospective cross-sectional internal validation study, a Musculoskeletal Infection Reporting and Data System (MSKI-RADS) was designed. Adult patients with radiographs and MRI scans of suspected extremity infections with a known reference standard obtained between June 2015 and May 2019 were included. The scoring categories were as follows: 0, incomplete imaging; I, negative for infection; II, superficial soft-tissue infection; III, deeper soft-tissue infection; IV, possible osteomyelitis (OM); V, highly suggestive of OM and/or septic arthritis; VI, known OM; and NOS (not otherwise specified), nonspecific bone lesions. Interreader agreement for 20 radiologists from 13 institutions (intraclass correlation coefficient [ICC]) and true-positive rates of MSKI-RADS were calculated and the accuracy of final diagnoses rendered by the readers was compared using generalized estimating equations for clustered data. Results Among paired radiographs and MRI scans from 208 patients (133 male, 75 female; mean age, 55 years ± 13 [SD]), 20 were category I; 34, II; 35, III; 30, IV; 35, V; 18, VI; and 36, NOS. Moderate interreader agreement was observed among the 20 readers (ICC, 0.70; 95% CI: 0.66, 0.75). There was no evidence of correlation between reader experience and overall accuracy (P = .94). The highest true-positive rate was for MSKI-RADS I and NOS at 88.7% (95% CI: 84.6, 91.7). The true-positive rate was 73% (95% CI: 63, 80) for MSKI-RADS V. Overall reader accuracy using MSKI-RADS across all patients was 65% ± 5, higher than final reader diagnoses at 55% ± 7 (P < .001). 
Conclusion MSKI-RADS is a valid system for standardized terminology and recommended management of imaging findings of peripheral extremity infections across various musculoskeletal-fellowship-trained reader experience levels. © RSNA, 2024 Supplemental material is available for this article. See also the editorial by Schweitzer in this issue.


Subject(s)
Magnetic Resonance Imaging , Humans , Magnetic Resonance Imaging/methods , Male , Female , Middle Aged , Retrospective Studies , Cross-Sectional Studies , Radiology Information Systems , Extremities/diagnostic imaging , Adult , Musculoskeletal Diseases/diagnostic imaging , Aged , Reproducibility of Results
6.
Radiology ; 310(2): e231501, 2024 02.
Article in English | MEDLINE | ID: mdl-38376399

ABSTRACT

Background The independent contribution of each Liver Imaging Reporting and Data System (LI-RADS) CT or MRI ancillary feature (AF) has not been established. Purpose To evaluate the association of LI-RADS AFs with hepatocellular carcinoma (HCC) and malignancy while adjusting for LI-RADS major features through an individual participant data (IPD) meta-analysis. Materials and Methods Medline, Embase, Cochrane Central Register of Controlled Trials, and Scopus were searched from January 2014 to January 2022 for studies evaluating the diagnostic accuracy of CT and MRI for HCC using LI-RADS version 2014, 2017, or 2018. Using a one-step approach, IPD across studies were pooled. Adjusted odds ratios (ORs) and 95% CIs were derived from multivariable logistic regression models of each AF combined with major features except threshold growth (excluded because of infrequent reporting). Liver observation clustering was addressed at the study and participant levels through random intercepts. Risk of bias was assessed using a composite reference standard and Quality Assessment of Diagnostic Accuracy Studies 2. Results Twenty studies comprising 3091 observations (2456 adult participants; mean age, 59 years ± 11 [SD]; 1849 [75.3%] men) were included. In total, 89% (eight of nine) of AFs favoring malignancy were associated with malignancy and/or HCC, 80% (four of five) of AFs favoring HCC were associated with HCC, and 57% (four of seven) of AFs favoring benignity were negatively associated with HCC and/or malignancy. Nonenhancing capsule (OR = 3.50 [95% CI: 1.53, 8.01]) had the strongest association with HCC. Diffusion restriction (OR = 14.45 [95% CI: 9.82, 21.27]) and mild-moderate T2 hyperintensity (OR = 10.18 [95% CI: 7.17, 14.44]) had the strongest association with malignancy. The strongest negative associations with HCC were parallels blood pool enhancement (OR = 0.07 [95% CI: 0.01, 0.49]) and marked T2 hyperintensity (OR = 0.18 [95% CI: 0.07, 0.45]). 
Seventeen studies (85%) had a high risk of bias. Conclusion Most LI-RADS AFs were independently associated with HCC, malignancy, or benignity as intended when adjusting for major features. © RSNA, 2024 Supplemental material is available for this article. See also the editorial by Crivellaro in this issue.


Subject(s)
Carcinoma, Hepatocellular , Liver Neoplasms , Magnetic Resonance Imaging , Tomography, X-Ray Computed , Humans , Carcinoma, Hepatocellular/diagnostic imaging , Liver Neoplasms/diagnostic imaging , Magnetic Resonance Imaging/methods , Tomography, X-Ray Computed/methods , Liver/diagnostic imaging , Radiology Information Systems , Middle Aged , Male
7.
Liver Int ; 44(7): 1578-1587, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38651924

ABSTRACT

BACKGROUND AND AIMS: The Liver Imaging Reporting and Data System (LI-RADS) offers a standardized approach for imaging hepatocellular carcinoma. However, the diverse styles and structures of radiology reports complicate automatic data extraction. Large language models hold the potential for structured data extraction from free-text reports. Our objective was to evaluate the performance of Generative Pre-trained Transformer (GPT)-4 in extracting LI-RADS features and categories from free-text liver magnetic resonance imaging (MRI) reports. METHODS: Three radiologists generated 160 fictitious free-text liver MRI reports written in Korean and English, simulating real-world practice. Of these, 20 were used for prompt engineering, and 140 formed the internal test cohort. Seventy-two genuine reports, authored by 17 radiologists were collected and de-identified for the external test cohort. LI-RADS features were extracted using GPT-4, with a Python script calculating categories. Accuracies in each test cohort were compared. RESULTS: On the external test, the accuracy for the extraction of major LI-RADS features, which encompass size, nonrim arterial phase hyperenhancement, nonperipheral 'washout', enhancing 'capsule' and threshold growth, ranged from .92 to .99. For the rest of the LI-RADS features, the accuracy ranged from .86 to .97. For the LI-RADS category, the model showed an accuracy of .85 (95% CI: .76, .93). CONCLUSIONS: GPT-4 shows promise in extracting LI-RADS features, yet further refinement of its prompting strategy and advancements in its neural network architecture are crucial for reliable use in processing complex real-world MRI reports.
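The category calculation step described in the methods can be sketched as follows. This is not the authors' script: it is a simplified, hypothetical mapping from the five major features to a category, assuming the LI-RADS v2018 CT/MRI diagnostic table. It deliberately ignores ancillary features, tie-breaking rules, and the LR-1, LR-2, LR-M, LR-TIV, and LR-NC categories, so it is an illustration only and not suitable for clinical use.

```python
def lirads_major_feature_category(size_mm, nonrim_aphe, washout, capsule, threshold_growth):
    """Simplified category from major features only (assumes the v2018 CT/MRI table).

    size_mm          -- largest diameter of the observation in millimetres
    nonrim_aphe      -- nonrim arterial phase hyperenhancement present
    washout          -- nonperipheral "washout" present
    capsule          -- enhancing "capsule" present
    threshold_growth -- threshold growth present
    """
    # Number of additional major features besides APHE and size.
    additional = sum([bool(washout), bool(capsule), bool(threshold_growth)])

    if not nonrim_aphe:
        if size_mm < 20:
            return "LR-4" if additional >= 2 else "LR-3"
        return "LR-4" if additional >= 1 else "LR-3"

    if size_mm < 10:
        return "LR-4" if additional >= 1 else "LR-3"

    if size_mm < 20:
        if additional == 0:
            return "LR-3"
        if additional == 1:
            # With a single additional feature, "capsule" alone yields LR-4.
            return "LR-5" if (washout or threshold_growth) else "LR-4"
        return "LR-5"

    return "LR-5" if additional >= 1 else "LR-4"


if __name__ == "__main__":
    # Hypothetical observation: 15 mm, nonrim APHE with washout.
    print(lirads_major_feature_category(15, True, True, False, False))
```

Separating feature extraction (done by the language model) from a deterministic rule like this keeps the category logic auditable, since errors can be traced to either a wrongly extracted feature or the rule table itself.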


Subject(s)
Liver Neoplasms , Magnetic Resonance Imaging , Humans , Liver Neoplasms/diagnostic imaging , Carcinoma, Hepatocellular/diagnostic imaging , Natural Language Processing , Radiology Information Systems , Republic of Korea , Data Mining , Liver/diagnostic imaging
8.
BJU Int ; 134(4): 510-518, 2024 Oct.
Article in English | MEDLINE | ID: mdl-38923789

ABSTRACT

OBJECTIVES: To explore the topic of Prostate Imaging-Reporting and Data System (PI-RADS) interobserver variability, including a discussion of major sources, mitigation approaches, and future directions. METHODS: A narrative review of PI-RADS interobserver variability. RESULTS: PI-RADS was developed in 2012 to set technical standards for prostate magnetic resonance imaging (MRI), reduce interobserver variability at interpretation, and improve diagnostic accuracy in the MRI-directed diagnostic pathway for detection of clinically significant prostate cancer. While PI-RADS has been validated in selected research cohorts with prostate cancer imaging experts, subsequent prospective studies in routine clinical practice demonstrate wide variability in diagnostic performance. Radiologist and biopsy operator experience are the most important contributing drivers of high-quality care among multiple interrelated factors including variability in MRI hardware and technique, image quality, and population and patient-specific factors such as prostate cancer disease prevalence. Iterative improvements in PI-RADS have helped flatten the curve for novice readers and reduce variability. Innovations in image quality reporting, administrative and organisational workflows, and artificial intelligence hold promise in reducing variability even further. CONCLUSION: Continued research into PI-RADS is needed to facilitate benchmark creation, reader certification, and independent accreditation, which are systems-level interventions needed to uphold and maintain high-quality prostate MRI across entire populations.


Subject(s)
Magnetic Resonance Imaging , Observer Variation , Prostatic Neoplasms , Male , Humans , Prostatic Neoplasms/diagnostic imaging , Prostate/pathology , Prostate/diagnostic imaging , Data Systems , Radiology Information Systems
9.
Eur Radiol ; 34(8): 5120-5130, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38206405

ABSTRACT

OBJECTIVES: To assess radiologists' current use of, and opinions on, structured reporting (SR) in oncologic imaging, and to provide recommendations for a structured report template. MATERIALS AND METHODS: An online survey with 28 questions was sent to European Society of Oncologic Imaging (ESOI) members. The questionnaire had four main parts: (1) participant information, e.g., country, workplace, experience, and current SR use; (2) SR design, e.g., numbers of sections and fields, and template use; (3) clinical impact of SR, e.g., on report quality and length, workload, and communication with clinicians; and (4) preferences for an oncology-focused structured CT report. Data analysis comprised descriptive statistics, chi-square tests, and Spearman correlation coefficients. RESULTS: A total of 200 radiologists from 51 countries completed the survey; 57.0% currently utilized SR, with a lower proportion within than outside of Europe (51.0% vs. 72.7%; p = 0.006). Among SR users, the majority observed markedly increased report quality (62.3%) and easier comparison to previous exams (53.5%), a slightly lower error rate (50.9%), and fewer calls/emails by clinicians (78.9%) due to SR. The perceived impact of SR on communication with clinicians (i.e., frequency of calls/emails) differed with radiologists' experience (p < 0.001), and experience also showed low but significant correlations with communication with clinicians (r = -0.27, p = 0.003), report quality (r = 0.19, p = 0.043), and error rate (r = -0.22, p = 0.016). Template use also affected the perceived impact of SR on report quality (p = 0.036). CONCLUSION: Radiologists regard SR in oncologic imaging favorably, with perceived positive effects on report quality, error rate, comparison of serial exams, and communication with clinicians. 
CLINICAL RELEVANCE STATEMENT: Radiologists believe that structured reporting in oncologic imaging improves report quality, decreases the error rate, and enables better communication with clinicians. Implementation of structured reporting in Europe is currently below the international level and needs society endorsement. KEY POINTS: • The majority of oncologic imaging specialists (57% overall; 51% in Europe) use structured reporting in clinical practice. • The vast majority of oncologic imaging specialists use templates (92.1%), which are typically cancer-specific (76.2%). • Structured reporting is perceived to markedly improve report quality, communication with clinicians, and comparison to prior scans.


Subject(s)
Attitude of Health Personnel , Neoplasms , Radiologists , Societies, Medical , Humans , Europe , Surveys and Questionnaires , Neoplasms/diagnostic imaging , Radiologists/statistics & numerical data , Radiology Information Systems/statistics & numerical data
10.
AJR Am J Roentgenol ; 222(6): e2330343, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38534191

ABSTRACT

BACKGROUND. To implement provisions of the 21st Century Cures Act that address information blocking, federal regulations mandated that health systems provide patients with immediate access to elements of their electronic health information, including imaging results. OBJECTIVE. The purpose of this study was to compare patient access of radiology reports before and after implementation of the information-blocking provisions of the 21st Century Cures Act. METHODS. This retrospective study included patients who underwent outpatient imaging examinations from January 1, 2021, through December 31, 2022, at three campuses within a large health system. The system implemented policies to comply with the Cures Act information-blocking provisions on January 1, 2022. Imaging results were released in patient portals after a 36-hour embargo period before implementation versus being released immediately after report finalization after implementation. Data regarding patient report access in the portal and report acknowledgment by the ordering provider in the EMR were extracted and compared between periods. RESULTS. The study included reports for 1,188,692 examinations in 388,921 patients (mean age, 58.5 ± 16.6 [SD] years; 209,589 women, 179,290 men, eight nonbinary individuals, and 34 individuals for whom sex information was missing). A total of 77.5% of reports were accessed by the patient before implementation versus 80.4% after implementation. The median time from report finalization to report release in the patient portal was 36.0 hours before implementation versus 0.4 hours after implementation. The median time from report release to first patient access of the report in the portal was 8.7 hours before implementation versus 3.0 hours after implementation. The median time from report finalization to first patient access was 45.0 hours before implementation versus 5.5 hours after implementation. 
Before implementation, a total of 18.5% of reports were first accessed by the patient before being accessed by the ordering provider versus 44.0% after implementation. After implementation, the median time from report release to first patient access was 1.8 hours for patients younger than 60 years versus 4.3 hours for patients 60 years or older. CONCLUSION. After implementation of institutional policies to comply with 21st Century Cures Act information-blocking provisions, the length of time until patients accessed imaging results decreased, and the proportion of patients who accessed their reports before the ordering provider increased. CLINICAL IMPACT. Radiologists should consider mechanisms to ensure timely and appropriate communication of important findings to ordering providers.


Subject(s)
Patient Access to Records , Humans , Male , Female , Retrospective Studies , Middle Aged , Adult , Patient Access to Records/legislation & jurisprudence , Aged , United States , Electronic Health Records/legislation & jurisprudence , Adolescent , Patient Portals/legislation & jurisprudence , Child , Radiology Information Systems/legislation & jurisprudence , Young Adult , Aged, 80 and over , Child, Preschool
11.
J Biomed Inform ; 157: 104718, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39209086

ABSTRACT

Radiology report generation automates diagnostic narrative synthesis from medical imaging data. Current report generation methods primarily employ knowledge graphs for image enhancement, neglecting the interpretability and guiding function of the knowledge graphs themselves. Additionally, few approaches leverage the stable modal alignment information from multimodal pre-trained models to facilitate the generation of radiology reports. We propose the Terms-Guided Radiology Report Generation (TGR), a simple and practical model for generating reports guided primarily by anatomical terms. Specifically, we utilize a dual-stream visual feature extraction module comprising a detail extraction module and a frozen multimodal pre-trained model to separately extract visual detail features and semantic features. Furthermore, a Visual Enhancement Module (VEM) is proposed to further enrich the visual features, thereby facilitating the generation of a list of anatomical terms. We integrate anatomical terms with image features and then apply contrastive learning with frozen text embeddings, utilizing the stable feature space from these embeddings to further boost modal alignment capabilities. Our model incorporates the capability for manual input, enabling it to generate a list of organs for specifically focused abnormal areas or to produce more accurate single-sentence descriptions based on selected anatomical terms. Comprehensive experiments demonstrate the effectiveness of our method in report generation tasks: our TGR-S model reduces training parameters by 38.9% while performing comparably to current state-of-the-art models, and our TGR-B model exceeds the best baseline models across multiple metrics.


Subject(s)
Natural Language Processing , Humans , Radiology/education , Radiology/methods , Algorithms , Machine Learning , Semantics , Radiology Information Systems , Diagnostic Imaging/methods
12.
BMC Med Imaging ; 24(1): 254, 2024 Sep 27.
Article in English | MEDLINE | ID: mdl-39333958

ABSTRACT

BACKGROUND: The impression section integrates key findings of a radiology report but can be subjective and variable. We sought to fine-tune and evaluate an open-source Large Language Model (LLM) in automatically generating impressions from the remainder of a radiology report across different imaging modalities and hospitals. METHODS: In this institutional review board-approved retrospective study, we collated a dataset of CT, US, and MRI radiology reports from the University of California San Francisco Medical Center (UCSFMC) (n = 372,716) and the Zuckerberg San Francisco General (ZSFG) Hospital and Trauma Center (n = 60,049), both under a single institution. The Recall-Oriented Understudy for Gisting Evaluation (ROUGE) score, an automatic natural language evaluation metric that measures word overlap, was used to evaluate the generated impressions. A reader study with five cardiothoracic radiologists was performed to more strictly evaluate the model's performance on a specific modality (CT chest exams) with a radiologist subspecialist baseline. We stratified the results of the reader performance study based on the diagnosis category and the original impression length to gauge case complexity. RESULTS: The LLM achieved ROUGE-L scores of 46.51, 44.2, and 50.96 on UCSFMC and, upon external validation, ROUGE-L scores of 40.74, 37.89, and 24.61 on ZSFG across the CT, US, and MRI modalities, respectively, implying a substantial degree of overlap between the model-generated impressions and impressions written by the subspecialist attending radiologists, but with a degree of degradation upon external validation. 
In our reader study, the model-generated impressions achieved overall mean scores of 3.56/4, 3.92/4, 3.37/4, 18.29 s, 12.32 words, and 84, while the original impression written by a subspecialist radiologist achieved overall mean scores of 3.75/4, 3.87/4, 3.54/4, 12.2 s, 5.74 words, and 89 for clinical accuracy, grammatical accuracy, stylistic quality, edit time, edit distance, and ROUGE-L score, respectively. The LLM achieved the highest clinical accuracy ratings for acute/emergent findings and on shorter impressions. CONCLUSIONS: An open-source fine-tuned LLM can generate impressions to a satisfactory level of clinical accuracy, grammatical accuracy, and stylistic quality. Our reader performance study demonstrates the potential of large language models in drafting radiology report impressions that can aid in streamlining radiologists' workflows.
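To make the word-overlap metric concrete, here is a minimal sketch of a ROUGE-L style score based on the longest common subsequence (LCS) between a generated impression and a reference impression. The function names, the lowercase whitespace tokenization, and the choice of the balanced F1 form are assumptions made for illustration; the study's exact ROUGE configuration is not stated in the abstract. The sketch returns a value between 0 and 1, whereas the abstract reports scores on a 0 to 100 scale.

```python
def lcs_length(tokens_a, tokens_b):
    """Length of the longest common subsequence of two token lists."""
    previous = [0] * (len(tokens_b) + 1)
    for token_a in tokens_a:
        current = [0]
        for j, token_b in enumerate(tokens_b, start=1):
            if token_a == token_b:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def rouge_l_f1(candidate, reference):
    """LCS-based F1 between a candidate text and a reference text (0 to 1)."""
    candidate_tokens = candidate.lower().split()
    reference_tokens = reference.lower().split()
    if not candidate_tokens or not reference_tokens:
        return 0.0

    lcs = lcs_length(candidate_tokens, reference_tokens)
    if lcs == 0:
        return 0.0

    precision = lcs / len(candidate_tokens)
    recall = lcs / len(reference_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical impressions, not taken from the study data.
    generated = "No acute abnormality"
    written = "No acute cardiopulmonary abnormality"
    print(round(100 * rouge_l_f1(generated, written), 2))
```

Because the score only rewards shared word sequences, two impressions with the same clinical meaning but different wording can score low, which is one reason a reader study complements it.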


Subject(s)
Natural Language Processing , Humans , Retrospective Studies , Magnetic Resonance Imaging/methods , Tomography, X-Ray Computed/methods , Observer Variation , Radiology Information Systems
13.
Pediatr Radiol ; 54(10): 1566-1578, 2024 09.
Article in English | MEDLINE | ID: mdl-39085531

ABSTRACT

Over the last decades, magnetic resonance imaging (MRI) has emerged as a valuable adjunct to prenatal ultrasound for evaluating fetal malformations. Several radiological societies advocate for standardised and structured reporting practices to enhance the uniformity of imaging language. Compared to narrative formats, standardised and structured reports offer enhanced content quality, minimise reader variability, have the potential to save reporting time, and streamline the communication between specialists by employing a shared lexicon. Structured reporting holds promise for mitigating medico-legal liability, while also facilitating rigorous scientific data analyses and the development of standardised databases. While structured reporting templates for fetal MRI are already in use in some centres, specific recommendations and/or guidelines from international societies are scarce in the literature. The purpose of this paper is to propose a standardised and structured reporting template for fetal MRI to assist radiologists, particularly those with less experience, in delivering systematic reports. Additionally, the paper aims to offer an overview of the anatomical structures that necessitate reporting and the prevalent normative values for fetal biometrics found in current literature.


Subject(s)
Magnetic Resonance Imaging , Prenatal Diagnosis , Humans , Magnetic Resonance Imaging/methods , Magnetic Resonance Imaging/standards , Europe , Prenatal Diagnosis/methods , Prenatal Diagnosis/standards , Practice Guidelines as Topic , Radiology/standards , Pediatrics/standards , Documentation/standards , Societies, Medical , Radiology Information Systems/standards , Female , Pregnancy
14.
Pediatr Radiol ; 54(9): 1476-1485, 2024 08.
Article in English | MEDLINE | ID: mdl-38981907

ABSTRACT

BACKGROUND: Thyroid nodules are unusual in children, but when present, they carry a higher risk of malignancy than in adults. Several guidelines have been created to address the risk stratification for malignancy of thyroid nodules in adults, but none has been completely validated in children. A few authors have proposed lowering the size threshold of the American College of Radiology Thyroid Imaging, Reporting and Data System (ACR TI-RADS™) management guidelines to decrease missed carcinomas at presentation in children; however, little is known about their accuracy. OBJECTIVE: To assess the performance of proposed modifications of the ACR TI-RADS™ size criteria to guide management decisions in pediatric thyroid nodules and to assess the associated increase in the number of fine needle aspiration (FNA) and follow-up exams. MATERIALS AND METHODS: This is a retrospective study of children under 18 years old who underwent ultrasound assessment of a thyroid nodule at a tertiary care pediatric institution between January 2006 and August 2021. The largest dimension, maximum ACR TI-RADS™ score, and final diagnosis of each thyroid nodule were documented. The course of action based on the adult ACR TI-RADS™ and after modifying the size threshold for management recommendations was documented and compared. Statistics included descriptive analysis, weighted Kappa statistics, sensitivity, specificity, accuracy, and positive/negative predictive values of the ACR TI-RADS™ presented with 95% confidence intervals (CI) using either Clopper-Pearson or standard logit methods. RESULTS: Of 116 nodules, 18 (15.5%) were malignant. Most malignant nodules (94.4%, n = 17) were in ACR TI-RADS™ categories 4 and 5. Based on the adult ACR TI-RADS™ criteria, 24 (24.5%) benign and 15 (83.3%) malignant nodules would have undergone FNA; 14 (14.3%) benign and 3 (16.7%) malignant nodules would have been followed up; and 60 (61.2%) benign and none of the malignant nodules would have been dismissed. Three (16.7%) malignant nodules would not have been recommended for FNA at presentation, delaying their diagnoses. By lowering the size-threshold criteria of the ACR TI-RADS™ guidelines, no malignancy would have been missed at presentation, but the number of FNAs would have increased from 24 (24.5%) to 36 (36.7%) and the number of follow-up ultrasound exams from 14 (14.3%) to 62 (63.3%). CONCLUSION: Modifying the ACR TI-RADS™ guideline by lowering the size threshold criteria used to guide management decisions for pediatric thyroid nodules can lead to early detection of malignant nodules in children, but at the cost of a significantly increased number of biopsies or ultrasound exams. Further tailoring of the guideline with larger multicenter studies is needed before it can be accepted for general use in the pediatric population.
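The percentages in this abstract can be reproduced from the reported counts. The following is a minimal sketch, assuming 98 benign nodules (116 total minus 18 malignant, a figure inferred rather than stated). The names `adult_criteria` and `pct` are illustrative and do not come from the article. Treating "FNA recommended" as a positive test, which is also an assumption about how the metric is defined, the sketch derives the sensitivity and specificity of the adult criteria.

```python
# Counts taken from the abstract; names are illustrative, not the authors'.
TOTAL_NODULES = 116
MALIGNANT = 18
BENIGN = TOTAL_NODULES - MALIGNANT  # 98 (inferred, not stated in the abstract)

# Recommended action under the adult ACR TI-RADS criteria: (benign, malignant)
adult_criteria = {
    "fna": (24, 15),
    "follow_up": (14, 3),
    "dismissed": (60, 0),
}


def pct(count: int, denominator: int) -> float:
    """Percentage rounded to one decimal place, as reported in the abstract."""
    return round(100 * count / denominator, 1)


for action, (benign, malignant) in adult_criteria.items():
    print(f"{action}: benign {pct(benign, BENIGN)}%, malignant {pct(malignant, MALIGNANT)}%")

# Assumption: "FNA recommended" counts as a positive test result.
fna_benign, fna_malignant = adult_criteria["fna"]
sensitivity = pct(fna_malignant, MALIGNANT)      # malignant nodules sent to FNA
specificity = pct(BENIGN - fna_benign, BENIGN)   # benign nodules spared FNA
print(f"sensitivity {sensitivity}%, specificity {specificity}%")
```

The three action groups sum to 98 benign and 18 malignant nodules, which is consistent with the inferred denominators.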


Subject(s)
Thyroid Nodule , Ultrasonography , Humans , Thyroid Nodule/diagnostic imaging , Thyroid Nodule/therapy , Thyroid Nodule/pathology , Child , Male , Adolescent , Female , Retrospective Studies , Ultrasonography/methods , Biopsy, Fine-Needle , United States , Societies, Medical , Radiology Information Systems , Practice Guidelines as Topic , Thyroid Gland/diagnostic imaging , Thyroid Gland/pathology , Child, Preschool
15.
Pediatr Radiol ; 54(7): 1128-1136, 2024 06.
Article in English | MEDLINE | ID: mdl-38771344

ABSTRACT

BACKGROUND: Identifying the associations between BRAFV600E mutation, the American College of Radiology Thyroid Imaging Reporting and Data System (TI-RADS), and clinicopathological characteristics could assist in devising appropriate treatment strategies for pediatric patients with papillary thyroid carcinoma. OBJECTIVE: To retrospectively assess the associations between BRAFV600E mutation, TI-RADS, and clinicopathological characteristics in pediatric patients with papillary thyroid carcinoma. MATERIALS AND METHODS: Between May 2013 and May 2023, pediatric patients with papillary thyroid carcinoma who underwent thyroidectomy were retrospectively evaluated. Univariate and multivariate logistic regression analyses were performed to determine the associations between BRAFV600E mutation, TI-RADS, and clinicopathological characteristics. The diagnostic performance of TI-RADS to predict BRAFV600E mutation was assessed. RESULTS: The BRAFV600E mutation was found in 59.1% (39/66) of pediatric patients with papillary thyroid carcinoma. Multivariate analyses showed that hypoechoic/very hypoechoic echogenicity [odds ratio (OR) = 8.48; 95% confidence interval (CI) = 1.48-48.74; P-value = 0.02] and punctate echogenic foci (OR = 24.3; 95% CI = 3.80-155.84; P-value = 0.001) were independent factors associated with BRAFV600E mutation. In addition, BRAFV600E mutation was significantly associated with TI-RADS 5 (OR = 12.61; 95% CI = 1.28-124.49; P-value = 0.03). There were no associations between BRAFV600E mutation and nodule size, composition, shape, margin, cervical lymph node metastasis, or Hashimoto's thyroiditis (P-value > 0.05). When hypoechoic/very hypoechoic echogenicity and punctate echogenic foci were combined, the sensitivity, specificity, positive predictive value, negative predictive value, and accuracy were 89.7%, 85.2%, 89.7%, 85.2%, and 87.9%, respectively. CONCLUSIONS: Hypoechoic/very hypoechoic echogenicity, punctate echogenic foci, and TI-RADS 5 are independently associated with BRAFV600E mutation in pediatric patients with papillary thyroid carcinoma.
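The five diagnostic metrics reported above are consistent with a single 2x2 table. The following is a minimal sketch, assuming 35 true positives, 4 false negatives, 4 false positives, and 23 true negatives. These cell counts are inferred from the reported 39/66 mutation prevalence and the percentages. They are not given in the abstract, and the function name `diagnostic_metrics` is illustrative.

```python
def diagnostic_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Standard 2x2 diagnostic metrics, as percentages rounded to one decimal."""
    def pct(numerator: int, denominator: int) -> float:
        return round(100 * numerator / denominator, 1)

    return {
        "sensitivity": pct(tp, tp + fn),  # mutated cases flagged by the features
        "specificity": pct(tn, tn + fp),  # wild-type cases not flagged
        "ppv": pct(tp, tp + fp),          # flagged cases that carry the mutation
        "npv": pct(tn, tn + fn),          # unflagged cases that are wild-type
        "accuracy": pct(tp + tn, tp + fn + fp + tn),
    }


# Assumed cell counts (inferred, not reported): 39 mutated and 27 wild-type patients.
metrics = diagnostic_metrics(tp=35, fn=4, fp=4, tn=23)
for name, value in metrics.items():
    print(f"{name}: {value}%")
```

Because the assumed false positives and false negatives are equal (4 each), sensitivity matches the positive predictive value and specificity matches the negative predictive value, which is the pattern seen in the reported figures.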


Subject(s)
Mutation , Proto-Oncogene Proteins B-raf , Thyroid Cancer, Papillary , Thyroid Neoplasms , Humans , Male , Female , Proto-Oncogene Proteins B-raf/genetics , Thyroid Cancer, Papillary/genetics , Thyroid Cancer, Papillary/diagnostic imaging , Thyroid Cancer, Papillary/pathology , Child , Thyroid Neoplasms/genetics , Thyroid Neoplasms/diagnostic imaging , Thyroid Neoplasms/pathology , Retrospective Studies , Adolescent , United States , Radiology Information Systems , Thyroidectomy , Child, Preschool
16.
Skeletal Radiol ; 53(8): 1621-1624, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38270616

ABSTRACT

OBJECTIVE: To assess the feasibility of using large language models (LLMs), specifically ChatGPT-4, to generate concise and accurate layperson summaries of musculoskeletal radiology reports. METHODS: Sixty radiology reports, comprising 20 MR shoulder, 20 MR knee, and 20 MR lumbar spine reports, were obtained via PACS. The reports were deidentified and then submitted to ChatGPT-4, with the prompt "Produce an organized and concise layperson summary of the findings of the following radiology report. Target a reading level of 8-9th grade and word count <300 words." Three independent readers (two primary and one added later for validation) evaluated the summaries for completeness and accuracy compared to the original reports. Summaries were rated on a scale of 1 to 3: 1) summaries that were incorrect or incomplete, potentially providing harmful or confusing information; 2) summaries that were mostly correct and complete, unlikely to cause confusion or harm; and 3) summaries that were entirely correct and complete. RESULTS: All 60 responses met the criteria for word count and readability. Mean ratings for accuracy were 2.58 for reader 1, 2.71 for reader 2, and 2.77 for reader 3. Mean ratings for completeness were 2.87 for reader 1, 2.73 for reader 2, and 2.87 for reader 3. For accuracy, reader 1 rated three summaries as a 1, reader 2 rated one as a 1, and reader 3 rated none as a 1. For the two primary readers, inter-reader agreement was low for accuracy (kappa 0.33) and completeness (kappa 0.29). There were no statistically significant changes in inter-reader agreement when the third reader's ratings were included in the analysis. CONCLUSION: Overall ratings for accuracy and completeness of the AI-generated layperson report summaries were high, with only a small minority likely to be confusing or inaccurate. This study illustrates the potential for leveraging generative AI, such as ChatGPT-4, to automate the production of patient-friendly summaries for musculoskeletal MR imaging.


Subject(s)
Radiology Information Systems , Humans , Musculoskeletal Diseases/diagnostic imaging , Feasibility Studies , Translating , Comprehension
17.
BMC Med Educ ; 24(1): 935, 2024 Aug 28.
Article in English | MEDLINE | ID: mdl-39198788

ABSTRACT

BACKGROUND: Traditional radiology education for medical students predominantly uses textbooks, PowerPoint files, and hard-copy radiographic images, which often lack student interaction. PACS (Picture Archiving and Communication System) is a crucial tool for radiologists in viewing and reporting images, but its use in medical student training remains limited. OBJECTIVE: This study investigates the effectiveness of using PACS for teaching radiology to undergraduate medical students compared to traditional methods. METHODS: Fifty-three medical students were divided into a control group (25 students) receiving traditional slide-based training and an intervention group (28 students) using PACS software to view complete patient images. Pre- and post-course tests and satisfaction surveys were conducted for both groups, along with self-evaluation by the intervention group. The validity and reliability of the assessment tools were confirmed through expert review and pilot testing. RESULTS: No significant difference was found between the control and intervention groups regarding gender, age, and GPA. Final multiple-choice test scores were similar (intervention: 10.89 ± 2.9; control: 10.76 ± 3.5; p = 0.883). However, the intervention group demonstrated significantly higher improvement in the short answer test for image interpretation (intervention: 8.8 ± 2.28; control: 5.35 ± 2.39; p = 0.001). Satisfaction with the learning method did not significantly differ between groups (intervention: 36.54 ± 5.87; control: 39.44 ± 7.76; p = 0.129). The intervention group reported high familiarity with PACS capabilities (75%), CT principles (71.4%), interpretation (64.3%), appropriate window selection (75%), and anatomical relationships (85.7%). CONCLUSION: PACS-based training enhances medical students' diagnostic and analytical skills in radiology. Further research with larger sample sizes and robust assessment methods is recommended to confirm and expand upon these results.


Subject(s)
Education, Medical, Undergraduate , Radiology Information Systems , Radiology , Students, Medical , Humans , Education, Medical, Undergraduate/methods , Male , Female , Radiology/education , Educational Measurement , Young Adult
18.
BMC Med Educ ; 24(1): 969, 2024 Sep 05.
Article in English | MEDLINE | ID: mdl-39237930

ABSTRACT

BACKGROUND: Diagnostic radiology residents in low- and middle-income countries (LMICs) may have to provide significant contributions to the clinical workload before the completion of their residency training. Because of time constraints inherent to the delivery of acute care, some of the most clinically impactful diagnostic radiology errors arise from the use of Computed Tomography (CT) in the management of acutely ill patients. As a result, it is paramount to ensure that radiology trainees reach adequate skill levels prior to assuming independent on-call responsibilities. We partnered with the radiology residency program at the Aga Khan University Hospital in Nairobi (AKUHN), Kenya, to evaluate a novel cloud-based testing method that provides an authentic radiology viewing and interpretation environment. It is based on Lifetrack, a unique Google Chrome-based Picture Archiving and Communication System that enables a complete viewing environment for any scan and provides a novel report generation tool based on Active Templates, a patented structured reporting method. We applied it to evaluate the skills of AKUHN trainees on entire CT scans representing the spectrum of acute non-trauma abdominal pathology encountered in a typical on-call setting. We aimed to demonstrate the feasibility of remotely testing the authentic practice of radiology and to show that important observations can be made from such a Lifetrack-based testing approach regarding the radiology skills of an individual practitioner or of a cohort of trainees. METHODS: A total of 13 anonymized trainees with experience ranging from 12 months to over 4 years took part in the study. Individually accessing the Lifetrack tool, they were tested on 37 abdominal CT scans (including one normal scan) over six 2-hour sessions on consecutive days. All cases carried the same clinical history of acute abdominal pain. During each session the trainees accessed the corresponding Lifetrack test set using clinical workstations, reviewed the CT scans, and formulated an opinion for the acute diagnosis, any secondary pathology, and incidental findings on the scan. Their scan interpretations were composed using the Lifetrack report generation system based on active templates, in which segments of text can be selected to assemble a detailed report. All reports generated by the trainees were scored on four different interpretive components: (a) acute diagnosis, (b) unrelated secondary diagnosis, (c) number of missed incidental findings, and (d) number of overcalls. A 3-score aggregate was defined from the first three interpretive elements. A cumulative score modified the 3-score aggregate for the negative effect of interpretive overcalls. RESULTS: A total of 436 scan interpretations and scores were available from 13 trainees tested on 37 cases. The acute diagnosis score ranged from 0 to 1 with a mean of 0.68 ± 0.36 and median of 0.78 (IQR: 0.5-1), and there were 436 scores. An unrelated secondary diagnosis was present in 11 cases, resulting in 130 secondary diagnosis scores. The unrelated secondary diagnosis score ranged from 0 to 1, with a mean of 0.48 ± 0.46 and median of 0.5 (IQR: 0-1). There were 32 cases with incidental findings, yielding 390 scores for incidental findings. The number of missed incidental findings ranged from 0 to 5 with a median of 1 (IQR: 1-2). The incidental findings score ranged from 0 to 1 with a mean of 0.4 ± 0.38 and median of 0.33 (IQR: 0-0.66). The number of overcalls ranged from 0 to 3 with a median of 0 (IQR: 0-1) and a mean of 0.36 ± 0.63. The 3-score aggregate ranged from 0 to 100 with a mean of 65.5 ± 32.5 and median of 77.3 (IQR: 45.0-92.5). The cumulative score ranged from -30 to 100 with a mean of 61.9 ± 35.5 and median of 71.4 (IQR: 37.4-92.0). The mean acute diagnosis scores and SD by training period were 0.62 ± 0.03, 0.80 ± 0.05, 0.71 ± 0.05, 0.58 ± 0.07, and 0.66 ± 0.05 for trainees with ≤ 12 months, 12-24 months, 24-36 months, 36-48 months, and > 48 months of training, respectively. The mean acute diagnosis score for trainees with 12-24 months of training was the only one significantly greater than that for trainees with ≤ 12 months by ANOVA with Tukey testing (p = 0.0002). We found a similar trend in the distribution of 3-score aggregates and cumulative scores. There were no significant associations when the training period was categorized as less than or more than 2 years. We looked at the distribution of the 3-score aggregate versus the number of overcalls by trainee, and we found that the 3-score aggregate was inversely related to the number of overcalls. Heatmaps and raincloud plots provided an illustrative means to visualize the relative performance of trainees across cases. CONCLUSION: We demonstrated the feasibility of remotely testing the authentic practice of radiology and showed that important observations can be made from our Lifetrack-based testing approach regarding the radiology skills of an individual or a cohort. Targeted teaching can be implemented in areas of observed weakness, and retesting could reveal its impact. This methodology can be customized to different LMIC environments and expanded to board certification examinations.


Subject(s)
Clinical Competence , Developing Countries , Internship and Residency , Radiology Information Systems , Radiology , Humans , Radiology/education , Kenya , Tomography, X-Ray Computed
19.
Sensors (Basel) ; 24(15), 2024 Aug 01.
Article in English | MEDLINE | ID: mdl-39124032

ABSTRACT

This article presents an ingestion procedure towards an interoperable repository called ALPACS (Anonymized Local Picture Archiving and Communication System). ALPACS provides services to clinical and hospital users, who can access the repository data through an Artificial Intelligence (AI) application called PROXIMITY. This article shows the automated procedure for data ingestion from the medical imaging provider to the ALPACS repository. The data ingestion procedure was successfully applied by the data provider (Hospital Clínico de la Universidad de Chile, HCUCH) using a pseudo-anonymization algorithm at the source, thereby ensuring that the privacy of patients' sensitive data is respected. Data transfer was carried out using international communication standards for health systems, which allows for replication of the procedure by other institutions that provide medical images. OBJECTIVES: This article aims to create a repository of 33,000 medical CT images and 33,000 diagnostic reports with international standards (HL7 HAPI FHIR, DICOM, SNOMED). This goal requires devising a data ingestion procedure that can be replicated by other provider institutions, guaranteeing data privacy by implementing a pseudo-anonymization algorithm at the source, and generating labels from annotations via NLP. METHODOLOGY: Our approach involves hybrid on-premise/cloud deployment of PACS and FHIR services, including transfer services for anonymized data to populate the repository through a structured ingestion procedure. We used NLP over the diagnostic reports to generate annotations, which were then used to train ML algorithms for content-based similar exam recovery. OUTCOMES: We successfully implemented ALPACS and PROXIMITY 2.0, ingesting almost 19,000 thorax CT exams to date along with their corresponding reports.


Subject(s)
Algorithms , Radiology Information Systems , Humans , Artificial Intelligence , Tomography, X-Ray Computed/methods , Diagnostic Imaging , Databases, Factual
20.
J Med Syst ; 48(1): 66, 2024 Jul 08.
Article in English | MEDLINE | ID: mdl-38976137

ABSTRACT

Three-dimensional (3D) printing has gained popularity across various domains but remains less integrated into medical surgery due to its complexity. Existing literature primarily discusses specific applications, with limited detailed guidance on the entire process. The methodological details of converting Computed Tomography (CT) images into 3D models are often found in amateur 3D printing forums rather than scientific literature. To address this gap, we present a comprehensive methodology for converting CT images of bone fractures into 3D-printed models. This involves transferring files in Digital Imaging and Communications in Medicine (DICOM) format to stereolithography format, processing the 3D model, and preparing it for printing. Our methodology outlines step-by-step guidelines, time estimates, and software recommendations, prioritizing free open-source tools. We also share our practical experience and outcomes, including the successful creation of 72 models for surgical planning, patient education, and teaching. Although there are challenges associated with utilizing 3D printing in surgery, such as the requirement for specialized expertise and equipment, the advantages in surgical planning, patient education, and improved outcomes are evident. Further studies are warranted to refine and standardize these methodologies for broader adoption in medical practice.


Subject(s)
Fractures, Bone , Printing, Three-Dimensional , Tomography, X-Ray Computed , Humans , Fractures, Bone/diagnostic imaging , Fractures, Bone/surgery , Tomography, X-Ray Computed/methods , Imaging, Three-Dimensional/methods , Traumatology , Radiology Information Systems/organization & administration , Models, Anatomic