Results 1-20 of 59
1.
J Am Coll Radiol ; 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38960083

ABSTRACT

PURPOSE: We compared the performance of generative AI (G-AI, ATARI) and natural language processing (NLP) tools for identifying laterality errors in radiology reports and images. METHODS: We used an NLP-based tool (mPower) to identify radiology reports flagged for laterality errors in its QA Dashboard. The NLP model detects and highlights laterality mismatches in radiology reports. From an initial pool of 1124 radiology reports flagged by the NLP tool for laterality errors, we selected and evaluated 898 reports spanning radiography, CT, MRI, and ultrasound to ensure comprehensive modality coverage. A radiologist reviewed each report to assess whether the flagged laterality error was present (reporting error - true positive) or absent (NLP error - false positive). Next, we applied ATARI to 237 radiology reports and images with consecutive NLP true positive (118 reports) and false positive (119 reports) laterality errors. We estimated the accuracy of the NLP and G-AI tools for identifying laterality errors overall and by modality. RESULTS: Among the 898 NLP-flagged laterality errors, 64% (574/898) were NLP errors and 36% (324/898) were reporting errors. The ATARI text query feature correctly identified the absence of laterality mismatch (NLP false positives) with 97.4% accuracy (115/118 reports; 95% CI = 96.5%-98.3%). Combined vision and text queries yielded 98.3% accuracy (116/118 reports/images; 95% CI = 97.6%-99.0%), and the vision query alone also had 98.3% accuracy (116/118 images; 95% CI = 97.6%-99.0%). CONCLUSION: The generative AI-empowered ATARI prototype outperformed the assessed NLP tool for determining true and false laterality errors in radiology reports while enabling image-based laterality determination. Underlying errors in ATARI text queries on complex radiology reports emphasize the need for further improvement of the technology.
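The report-versus-header laterality check described above can be sketched as a simple rule-based comparison. This is an illustrative toy under stated assumptions, not the actual mPower or ATARI logic; all function names are hypothetical:

```python
import re

# Minimal rule-based laterality check: compare the side mentioned in the exam
# header against the side mentioned in the report body. Illustrative only.
LATERALITY = {"left", "right"}

def laterality_terms(text: str) -> set:
    """Return the set of laterality words found in free text."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w in LATERALITY}

def flag_laterality_mismatch(exam_description: str, report_body: str) -> bool:
    """Flag a report when the exam header and report body disagree on side."""
    exam_side = laterality_terms(exam_description)
    body_side = laterality_terms(report_body)
    # Flag only when both mention a side and the sides do not overlap.
    return bool(exam_side) and bool(body_side) and not (exam_side & body_side)
```

A flag here corresponds to an NLP "hit" that a radiologist would then adjudicate as a true reporting error or a false positive.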

2.
Clin Imaging ; 112: 110207, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38838448

ABSTRACT

PURPOSE: We created the infrastructure for a no-code machine learning (NML) platform that lets non-programming physicians build machine learning models, and tested the platform by creating an NML model for classifying radiographs for the presence or absence of clavicle fractures. METHODS: Our IRB-approved retrospective study included 4135 clavicle radiographs from 2039 patients (mean age 52 ± 20 years, F:M 1022:1017) from 13 hospitals. Each patient had two-view clavicle radiographs with axial and anterior-posterior projections. The positive radiographs had either displaced or non-displaced clavicle fractures. We configured the NML platform to automatically retrieve the eligible exams, using each series' unique identifier, from the hospital virtual network archive via Web Access to DICOM Objects. The platform trained a model until the validation loss plateaued. Once testing was complete, the platform provided the receiver operating characteristic curve and confusion matrix for estimating sensitivity, specificity, and accuracy. RESULTS: The NML platform successfully retrieved 3917 radiographs (3917/4135, 94.7%) and parsed them to create an ML classifier, with 2151 radiographs in the training, 100 in the validation, and 1666 in the testing datasets (772 radiographs with clavicle fracture, 894 without). The network identified clavicle fracture with 90% sensitivity, 87% specificity, and 88% accuracy, with an AUC of 0.95 (confidence interval 0.94-0.96). CONCLUSION: An NML platform can help physicians create and test machine learning models on multicenter imaging datasets, such as the one in our study, for classifying radiographs by the presence of clavicle fracture.
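The sensitivity, specificity, and accuracy reported above follow directly from the 2x2 confusion matrix the platform produces; a minimal sketch (the counts in the usage example are illustrative, not the study's data):

```python
# Standard binary-classification metrics from confusion-matrix counts.
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Sensitivity, specificity, and accuracy from a 2x2 confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),                 # true positive rate
        "specificity": tn / (tn + fp),                 # true negative rate
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }
```

For example, `classification_metrics(tp=90, fp=13, tn=87, fn=10)` yields 90% sensitivity, 87% specificity, and 88.5% accuracy.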


Subjects
Clavicle, Bone Fractures, Machine Learning, Humans, Clavicle/injuries, Clavicle/diagnostic imaging, Bone Fractures/diagnostic imaging, Bone Fractures/classification, Female, Middle Aged, Male, Retrospective Studies, Sensitivity and Specificity, Adult, Radiography/methods
3.
Article in English | MEDLINE | ID: mdl-38806239

ABSTRACT

BACKGROUND AND PURPOSE: Mass effect and vasogenic edema are critical findings on head CT. This study compared the accuracy of an artificial intelligence model (Annalise Enterprise CTB) with consensus neuroradiologist interpretations in detecting mass effect and vasogenic edema. MATERIALS AND METHODS: A retrospective standalone performance assessment was conducted on datasets of non-contrast head CT cases acquired between 2016 and 2022 for each finding. The cases were obtained from patients aged 18 years or older at five hospitals in the United States. The positive cases were selected consecutively based on the original clinical reports using natural language processing and manual confirmation. The negative cases were selected by taking the next negative case acquired from the same CT scanner after each positive case. Each case was interpreted independently by up to three neuroradiologists to establish consensus interpretations, and then by the AI model for the presence of the relevant finding. The neuroradiologists were provided with the entire CT study; the AI model separately received thin (≤1.5 mm) and/or thick (>1.5 and ≤5 mm) axial series. RESULTS: The two cohorts included 818 cases for mass effect and 310 cases for vasogenic edema. The AI model identified mass effect with sensitivity 96.6% (95% CI, 94.9-98.2) and specificity 89.8% (95% CI, 84.7-94.2) for the thin series, and 95.3% (95% CI, 93.5-96.8) and 93.1% (95% CI, 89.1-96.6) for the thick series. It identified vasogenic edema with sensitivity 90.2% (95% CI, 82.0-96.7) and specificity 93.5% (95% CI, 88.9-97.2) for the thin series, and 90.0% (95% CI, 84.0-96.0) and 95.5% (95% CI, 92.5-98.0) for the thick series. The corresponding areas under the curve were at least 0.980. CONCLUSIONS: The assessed AI model accurately identified mass effect and vasogenic edema in this CT dataset.
It could assist the clinical workflow by prioritizing interpretation of abnormal cases, which could benefit patients through earlier identification and subsequent treatment. ABBREVIATIONS: AI = artificial intelligence; AUC = area under the curve; CADt = computer assisted triage devices; FDA = Food and Drug Administration; NPV = negative predictive value; PPV = positive predictive value; SD = standard deviation.
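Binomial proportions such as the sensitivities and specificities above are conventionally reported with 95% confidence intervals. The paper does not state which interval method was used, so the Wilson score interval below is one common choice, shown purely for illustration:

```python
import math

# Wilson score interval for a binomial proportion (default z = 1.96 for 95%).
# This is an assumed CI method, not necessarily the one used in the study.
def wilson_ci(successes: int, n: int, z: float = 1.96):
    """Return (lower, upper) Wilson score bounds for successes/n."""
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2))
    return center - half, center + half
```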

4.
J Med Syst ; 48(1): 41, 2024 Apr 18.
Article in English | MEDLINE | ID: mdl-38632172

ABSTRACT

Polypharmacy remains an important challenge for patients with extensive medical complexity. Given the primary care shortage and the aging population, effective polypharmacy management is crucial to handle the increasing burden of care. The capacity of large language model (LLM)-based artificial intelligence to aid in polypharmacy management has yet to be evaluated. Here, we evaluate ChatGPT's performance in polypharmacy management via its deprescribing decisions in standardized clinical vignettes. We inputted several clinical vignettes, originally from a study of general practitioners' deprescribing decisions, into ChatGPT 3.5, a publicly available LLM, and evaluated its capacity for yes/no binary deprescribing decisions as well as list-based prompts in which the model was asked to choose which of several medications to deprescribe. We recorded ChatGPT's responses to the binary deprescribing prompts and the number and types of medications deprescribed. In binary deprescribing decisions, ChatGPT universally recommended deprescribing medications regardless of activities of daily living (ADL) status in patients with no cardiovascular disease (CVD) history; in patients with CVD history, ChatGPT's answers varied by technical replicate. The total number of medications deprescribed ranged from 2.67 to 3.67 (out of 7) and did not vary with CVD status, but increased linearly with severity of ADL impairment. Among medication types, ChatGPT preferentially deprescribed pain medications. ChatGPT's deprescribing decisions vary along the axes of ADL status, CVD history, and medication type, indicating some concordance of internal logic between general practitioners and the model. These results indicate that specifically trained LLMs may provide useful clinical support in polypharmacy management for primary care physicians.


Subjects
Cardiovascular Diseases, Deprescriptions, General Practitioners, Humans, Aged, Polypharmacy, Artificial Intelligence
6.
J Am Coll Radiol ; 21(2): 225-226, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37659452
7.
J Am Coll Radiol ; 20(10): 990-997, 2023 10.
Article in English | MEDLINE | ID: mdl-37356806

ABSTRACT

OBJECTIVE: Despite rising popularity and performance, studies evaluating the use of large language models for clinical decision support are lacking. Here, we evaluate the capacity of ChatGPT-3.5 and ChatGPT-4 (OpenAI, San Francisco, California) for clinical decision support in radiology via the identification of appropriate imaging services for two important clinical presentations: breast cancer screening and breast pain. METHODS: We compared ChatGPT's responses to the ACR Appropriateness Criteria for breast pain and breast cancer screening. Our prompt formats included an open-ended (OE) and a select-all-that-apply (SATA) format. Scoring criteria evaluated whether proposed imaging modalities were in accordance with ACR guidelines. Three replicate entries were conducted for each prompt, and their average was used to determine final scores. RESULTS: Both ChatGPT-3.5 and ChatGPT-4 achieved an average OE score of 1.830 (out of 2) for breast cancer screening prompts. ChatGPT-3.5 achieved a SATA average percentage correct of 88.9%, compared with 98.4% for ChatGPT-4, on breast cancer screening prompts. For breast pain, ChatGPT-3.5 achieved an average OE score of 1.125 (out of 2) and a SATA average percentage correct of 58.3%, compared with an average OE score of 1.666 (out of 2) and a SATA average percentage correct of 77.7% for ChatGPT-4. DISCUSSION: Our results demonstrate the eventual feasibility of using large language models like ChatGPT for radiologic decision making, with the potential to improve clinical workflow and responsible use of radiology services. More use cases and greater accuracy are necessary to evaluate and implement such tools.


Subjects
Breast Neoplasms, Mastodynia, Radiology, Humans, Female, Breast Neoplasms/diagnostic imaging, Decision Making
8.
Radiology ; 307(5): e222044, 2023 06.
Article in English | MEDLINE | ID: mdl-37219444

ABSTRACT

Radiologic tests often contain rich imaging data not relevant to the clinical indication. Opportunistic screening refers to the practice of systematically leveraging these incidental imaging findings. Although opportunistic screening can apply to imaging modalities such as conventional radiography, US, and MRI, most attention to date has focused on body CT by using artificial intelligence (AI)-assisted methods. Body CT represents an ideal high-volume modality whereby a quantitative assessment of tissue composition (eg, bone, muscle, fat, and vascular calcium) can provide valuable risk stratification and help detect unsuspected presymptomatic disease. The emergence of "explainable" AI algorithms that fully automate these measurements could eventually lead to their routine clinical use. Potential barriers to widespread implementation of opportunistic CT screening include the need for buy-in from radiologists, referring providers, and patients. Standardization of acquiring and reporting measures is needed, in addition to expanded normative data according to age, sex, and race and ethnicity. Regulatory and reimbursement hurdles are not insurmountable but pose substantial challenges to commercialization and clinical use. Through demonstration of improved population health outcomes and cost-effectiveness, these opportunistic CT-based measures should be attractive to both payers and health care systems as value-based reimbursement models mature. If highly successful, opportunistic screening could eventually justify a practice of standalone "intended" CT screening.


Subjects
Artificial Intelligence, Radiology, Humans, Algorithms, Radiologists, Mass Screening/methods, Radiology/methods
9.
Acad Radiol ; 30(12): 2921-2930, 2023 12.
Article in English | MEDLINE | ID: mdl-37019698

ABSTRACT

RATIONALE AND OBJECTIVES: Suboptimal chest radiographs (CXR) can limit interpretation of critical findings. Radiologist-trained AI models were evaluated for differentiating suboptimal (sCXR) and optimal (oCXR) chest radiographs. MATERIALS AND METHODS: Our IRB-approved study included 3278 CXRs from adult patients (mean age 55 ± 20 years) identified from a retrospective search of radiology reports at 5 sites. A chest radiologist reviewed all CXRs for the cause of suboptimality. The de-identified CXRs were uploaded into an AI server application for training and testing 5 AI models. The training set consisted of 2202 CXRs (n = 807 oCXR; n = 1395 sCXR), while 1076 CXRs (n = 729 sCXR; n = 347 oCXR) were used for testing. Data were analyzed with the area under the curve (AUC) for the models' ability to classify oCXR and sCXR correctly. RESULTS: For the two-class classification into sCXR or oCXR across all sites, AI identified CXRs with missing anatomy with 78% sensitivity, 95% specificity, 91% accuracy, and an AUC of 0.87 (95% CI 0.82-0.92). AI identified obscured thoracic anatomy with 91% sensitivity, 97% specificity, 95% accuracy, and 0.94 AUC (95% CI 0.90-0.97), and inadequate exposure with 90% sensitivity, 93% specificity, 92% accuracy, and an AUC of 0.91 (95% CI 0.88-0.95). The presence of low lung volume was identified with 96% sensitivity, 92% specificity, 93% accuracy, and 0.94 AUC (95% CI 0.92-0.96). The sensitivity, specificity, accuracy, and AUC of AI in identifying patient rotation were 92%, 96%, 95%, and 0.94 (95% CI 0.91-0.98), respectively. CONCLUSION: Radiologist-trained AI models can accurately classify optimal and suboptimal CXRs. Such AI models at the front end of radiographic equipment could enable radiographers to repeat sCXRs when necessary.
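The AUC values above can be computed nonparametrically as the probability that a randomly chosen positive case (e.g., an sCXR) receives a higher model score than a randomly chosen negative case, the Mann-Whitney formulation. A small sketch of that idea, not the authors' analysis code:

```python
# Nonparametric AUC via the Mann-Whitney U statistic: the fraction of
# (positive, negative) score pairs in which the positive wins, with ties
# counted as half. Illustrative only.
def auc_from_scores(pos_scores, neg_scores):
    """Return the empirical AUC for two lists of classifier scores."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))
```

A perfectly separating model scores 1.0; a model no better than chance scores about 0.5.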


Subjects
Lung, Thoracic Radiography, Adult, Humans, Middle Aged, Aged, Lung/diagnostic imaging, Retrospective Studies, Radiography, Radiologists
10.
medRxiv ; 2023 Feb 26.
Article in English | MEDLINE | ID: mdl-36865204

ABSTRACT

IMPORTANCE: Large language model (LLM) artificial intelligence (AI) chatbots direct the power of large training datasets towards successive, related tasks, as opposed to single-ask tasks, for which AI already achieves impressive performance. The capacity of LLMs to assist in the full scope of iterative clinical reasoning via successive prompting, in effect acting as virtual physicians, has not yet been evaluated. OBJECTIVE: To evaluate ChatGPT's capacity for ongoing clinical decision support via its performance on standardized clinical vignettes. DESIGN: We inputted all 36 published clinical vignettes from the Merck Sharp & Dohme (MSD) Clinical Manual into ChatGPT and compared accuracy on differential diagnoses, diagnostic testing, final diagnosis, and management based on patient age, gender, and case acuity. SETTING: ChatGPT, a publicly available LLM. PARTICIPANTS: Clinical vignettes featured hypothetical patients with a variety of age and gender identities, and a range of Emergency Severity Indices (ESIs) based on initial clinical presentation. EXPOSURES: MSD Clinical Manual vignettes. MAIN OUTCOMES AND MEASURES: We measured the proportion of correct responses to the questions posed within the clinical vignettes tested. RESULTS: ChatGPT achieved 71.7% (95% CI, 69.3% to 74.1%) accuracy overall across all 36 clinical vignettes. The LLM demonstrated the highest performance in making a final diagnosis, with an accuracy of 76.9% (95% CI, 67.8% to 86.1%), and the lowest performance in generating an initial differential diagnosis, with an accuracy of 60.3% (95% CI, 54.2% to 66.6%). Compared to answering questions about general medical knowledge, ChatGPT demonstrated inferior performance on differential diagnosis (β = -15.8%, p < 0.001) and clinical management (β = -7.4%, p = 0.02) type questions. CONCLUSIONS AND RELEVANCE: ChatGPT achieves impressive accuracy in clinical decision making, with particular strengths emerging as it has more clinical information at its disposal.

11.
J Am Coll Radiol ; 20(3): 352-360, 2023 03.
Article in English | MEDLINE | ID: mdl-36922109

ABSTRACT

The multitude of artificial intelligence (AI)-based solutions, vendors, and platforms poses a challenging proposition to an already complex clinical radiology practice. Apart from assessing and ensuring acceptable local performance and workflow fit to improve imaging services, AI tools require multiple stakeholders, including clinical, technical, and financial, who collaborate to move potential deployable applications to full clinical deployment in a structured and efficient manner. Postdeployment monitoring and surveillance of such tools require an infrastructure that ensures proper and safe use. Herein, the authors describe their experience and framework for implementing and supporting the use of AI applications in radiology workflow.


Subjects
Artificial Intelligence, Radiology, Radiology/methods, Diagnostic Imaging, Workflow, Commerce
12.
Diagnostics (Basel) ; 13(4)2023 Feb 18.
Article in English | MEDLINE | ID: mdl-36832266

ABSTRACT

Purpose: Motion-impaired CT images can result in limited or suboptimal diagnostic interpretation (with missed or miscalled lesions) and patient recall. We trained and tested an artificial intelligence (AI) model for identifying substantial motion artifacts on CT pulmonary angiography (CTPA) that have a negative impact on diagnostic interpretation. Methods: With IRB approval and HIPAA compliance, we queried our multicenter radiology report database (mPower, Nuance) for CTPA reports between July 2015 and March 2022 for the following terms: "motion artifacts", "respiratory motion", "technically inadequate", and "suboptimal" or "limited exam". All CTPA reports were from two quaternary (Site A, n = 335; B, n = 259) and a community (C, n = 199) healthcare sites. A thoracic radiologist reviewed CT images of all positive hits for motion artifacts (present or absent) and their severity (no diagnostic effect or major diagnostic impairment). Coronal multiplanar images from 793 CTPA exams were de-identified and exported offline into an AI model building prototype (Cognex Vision Pro, Cognex Corporation) to train an AI model to perform two-class classification ("motion" or "no motion") with data from the three sites (70% training dataset, n = 554; 30% validation dataset, n = 239). Separately, data from Site A and Site C were used for training and validating; testing was performed on the Site B CTPA exams. A five-fold repeated cross-validation was performed to evaluate the model performance with accuracy and receiver operating characteristics analysis (ROC). Results: Among the CTPA images from 793 patients (mean age 63 ± 17 years; 391 males, 402 females), 372 had no motion artifacts, and 421 had substantial motion artifacts. The statistics for the average performance of the AI model after five-fold repeated cross-validation for the two-class classification included 94% sensitivity, 91% specificity, 93% accuracy, and 0.93 area under the ROC curve (AUC: 95% CI 0.89-0.97). 
Conclusion: The AI model used in this study can successfully identify CTPA exams with diagnostic interpretation limiting motion artifacts in multicenter training and test datasets. Clinical relevance: The AI model used in the study can help alert technologists about the presence of substantial motion artifacts on CTPA, where a repeat image acquisition can help salvage diagnostic information.
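The report-database query described in the Methods amounts to case-insensitive matching of reports against a fixed term list. The terms below come from the abstract; the code is a toy sketch, not the mPower (Nuance) implementation:

```python
# Flag CTPA reports whose text mentions any of the motion-related query terms.
# Illustrative only; a production system would handle negation, sections, etc.
MOTION_TERMS = [
    "motion artifacts",
    "respiratory motion",
    "technically inadequate",
    "suboptimal",
    "limited exam",
]

def flag_report(report_text: str, terms=MOTION_TERMS) -> bool:
    """Return True when any query term appears in the report text."""
    text = report_text.lower()
    return any(term in text for term in terms)
```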

13.
Diagnostics (Basel) ; 13(3)2023 Jan 23.
Article in English | MEDLINE | ID: mdl-36766516

ABSTRACT

Chest radiographs (CXR) are the most performed imaging tests and rank high among the radiographic exams with suboptimal quality and high rejection rates. Suboptimal CXRs can cause delays in patient care and pitfalls in radiographic interpretation, given their ubiquitous use in the diagnosis and management of acute and chronic ailments. Suboptimal CXRs can also compound and lead to high inter-radiologist variations in CXR interpretation. While advances in radiography with transitions to computerized and digital radiography have reduced the prevalence of suboptimal exams, the problem persists. Advances in machine learning and artificial intelligence (AI), particularly in the radiographic acquisition, triage, and interpretation of CXRs, could offer a plausible solution for suboptimal CXRs. We review the literature on suboptimal CXRs and the potential use of AI to help reduce the prevalence of suboptimal CXRs.

14.
Clin Imaging ; 95: 47-51, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36610270

ABSTRACT

PURPOSE: To assess the feasibility of automated segmentation and measurement of tracheal collapsibility for detecting tracheomalacia on inspiratory and expiratory chest CT images. METHODS: Our study included 123 patients (age 67 ± 11 years; female:male 69:54) who underwent clinically indicated chest CT examinations in both inspiration and expiration phases. A thoracic radiologist measured the anteroposterior diameter of the trachea on inspiration- and expiration-phase images at the level of maximum collapsibility or the aortic arch (in the absence of luminal change). Separately, another investigator processed the inspiratory and expiratory DICOM CT images with the Airway Segmentation component of a commercial COPD software (IntelliSpace Portal, Philips Healthcare). Upon segmentation, the software automatically estimated the average lumen diameter (mm) and lumen area (mm²), both along the entire length of the trachea and at the level of the aortic arch. Data were analyzed with independent t-tests and the area under the receiver operating characteristic curve (AUC). RESULTS: Of the 123 patients, 48 had tracheomalacia and 75 did not. Inspiration-to-expiration ratios of average lumen area and lumen diameter over the length of the trachea had the highest AUC, 0.93 (95% CI = 0.88-0.97), for differentiating the presence and absence of tracheomalacia. A decrease of ≥25% in average lumen diameter had 82% sensitivity and 87% specificity for detecting tracheomalacia; a decrease of ≥40% in average lumen area had 86% sensitivity and specificity. CONCLUSION: Automatic segmentation and measurement of tracheal dimensions over the entire tracheal length is more accurate than a single-level measurement for detecting tracheomalacia.
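The ≥25% diameter and ≥40% area criteria above reduce to a percent-decrease computation between the two phases. A minimal sketch with hypothetical function names, not the IntelliSpace Portal implementation:

```python
# Collapsibility as percent decrease of a tracheal measurement from the
# inspiration phase to the expiration phase. Illustrative helper names.
def collapsibility_pct(inspiration: float, expiration: float) -> float:
    """Percent decrease from the inspiratory to the expiratory measurement."""
    return 100.0 * (inspiration - expiration) / inspiration

def tracheomalacia_flag(insp_diameter_mm: float, exp_diameter_mm: float,
                        threshold_pct: float = 25.0) -> bool:
    """Flag tracheomalacia when the diameter decrease meets the threshold."""
    return collapsibility_pct(insp_diameter_mm, exp_diameter_mm) >= threshold_pct
```

The same helper applies to the area criterion by passing lumen areas with `threshold_pct=40.0`.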


Subjects
Tracheomalacia, Humans, Male, Female, Middle Aged, Aged, Tracheomalacia/diagnostic imaging, Trachea/diagnostic imaging, X-Ray Computed Tomography/methods, Sensitivity and Specificity, ROC Curve
15.
Sci Rep ; 13(1): 189, 2023 01 05.
Article in English | MEDLINE | ID: mdl-36604467

ABSTRACT

Non-contrast head CT (NCCT) is extremely insensitive for early (< 3-6 h) acute infarct identification. We developed a deep learning model that detects and delineates suspected early acute infarcts on NCCT, using diffusion MRI as ground truth (3566 NCCT/MRI training patient pairs). The model substantially outperformed 3 expert neuroradiologists on a test set of 150 CT scans of patients who were potential candidates for thrombectomy (60 stroke-negative, 90 stroke-positive middle cerebral artery territory only infarcts), with sensitivity 96% (specificity 72%) for the model versus 61-66% (specificity 90-92%) for the experts; model infarct volume estimates also strongly correlated with those of diffusion MRI (r2 > 0.98). When this 150 CT test set was expanded to include a total of 364 CT scans with a more heterogeneous distribution of infarct locations (94 stroke-negative, 270 stroke-positive mixed territory infarcts), model sensitivity was 97%, specificity 99%, for detection of infarcts larger than the 70 mL volume threshold used for patient selection in several major randomized controlled trials of thrombectomy treatment.


Subjects
Deep Learning, Stroke, Humans, X-Ray Computed Tomography, Stroke/diagnostic imaging, Magnetic Resonance Imaging, Middle Cerebral Artery Infarction
16.
Jpn J Radiol ; 41(2): 194-200, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36331701

ABSTRACT

PURPOSE: Knowledge of kidney stone composition can help in patient management; urine composition analysis and dual-energy CT are frequently used to assess stone type. We assessed whether threshold-based stone segmentation and radiomics can determine the composition of kidney stones from single-energy, non-contrast abdomen-pelvis CT. METHODS: With IRB approval, we identified 218 consecutive patients (mean age 64 ± 13 years; male:female 138:80) with kidney stones on non-contrast abdomen-pelvis CT and surgical or biochemical proof of stone composition. CT examinations were performed on one of seven multidetector-row scanners from four vendors (GE, Philips, Siemens, Toshiba). Deidentified CT images were processed with a radiomics prototype (Frontier, Siemens Healthineers) to segment the entire kidney volumes with an AI-based organ segmentation tool. We applied a threshold of 130 HU to isolate stones in the segmented kidneys and to estimate radiomics over the segmented stone volume. A coinvestigator verified the kidney stone segmentation and adjusted the volume of interest to include the entire stone volume when necessary. We applied multiple logistic regression with precision-recall plots to obtain the area under the curve (AUC) using a built-in R statistical program. RESULTS: The threshold-based stone segmentation successfully isolated kidney stones (uric acid: n = 102 patients; calcium oxalate/phosphate: n = 116 patients) in all patients. Radiomics differentiated between calcium and uric acid stones with an AUC of 0.78 (p < 0.01, 95% CI 0.73-0.83), 0.79 sensitivity, and 0.90 specificity regardless of CT vendor (GE CT: AUC = 0.82, p < 0.01, 95% CI 0.740-0.896; Siemens CT: AUC = 0.77, 95% CI 0.700-0.846, p < 0.01). CONCLUSION: Automated threshold-based stone segmentation and radiomics can differentiate between calcium oxalate/phosphate and urate stones on non-contrast, single-energy abdomen CT.
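The 130 HU thresholding step described above can be sketched in a few lines: keep attenuation values at or above the threshold as stone candidates and estimate a volume from the voxel count. A 1-D list stands in for the segmented 3-D kidney volume; this is illustrative, not the Frontier prototype:

```python
# Approximate stone volume from a segmented kidney's voxel HU values:
# count voxels at/above the attenuation threshold, scale by per-voxel volume.
def stone_volume_mm3(kidney_voxels_hu, voxel_volume_mm3, threshold_hu=130):
    """Return estimated stone volume (mm^3) within the segmented kidney."""
    n_stone_voxels = sum(1 for hu in kidney_voxels_hu if hu >= threshold_hu)
    return n_stone_voxels * voxel_volume_mm3
```

Radiomic features would then be computed over the retained (stone) voxels only.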


Subjects
Calcium Oxalate, Kidney Calculi, Humans, Male, Female, Middle Aged, Aged, Calcium Oxalate/analysis, Uric Acid/analysis, Kidney Calculi/diagnostic imaging, Kidney Calculi/chemistry, X-Ray Computed Tomography/methods, Oxalates, Phosphates
17.
JAMA Netw Open ; 5(12): e2247172, 2022 12 01.
Article in English | MEDLINE | ID: mdl-36520432

ABSTRACT

Importance: Early detection of pneumothorax, most often via chest radiography, can help determine need for emergent clinical intervention. The ability to accurately detect and rapidly triage pneumothorax with an artificial intelligence (AI) model could assist with earlier identification and improve care. Objective: To compare the accuracy of an AI model vs consensus thoracic radiologist interpretations in detecting any pneumothorax (incorporating both nontension and tension pneumothorax) and tension pneumothorax. Design, Setting, and Participants: This diagnostic study was a retrospective standalone performance assessment using a data set of 1000 chest radiographs captured between June 1, 2015, and May 31, 2021. The radiographs were obtained from patients aged at least 18 years at 4 hospitals in the Mass General Brigham hospital network in the United States. Included radiographs were selected using 2 strategies from all chest radiography performed at the hospitals, including inpatient and outpatient. The first strategy identified consecutive radiographs with pneumothorax through a manual review of radiology reports, and the second strategy identified consecutive radiographs with tension pneumothorax using natural language processing. For both strategies, negative radiographs were selected by taking the next negative radiograph acquired from the same radiography machine as each positive radiograph. The final data set was an amalgamation of these processes. Each radiograph was interpreted independently by up to 3 radiologists to establish consensus ground-truth interpretations. Each radiograph was then interpreted by the AI model for the presence of pneumothorax and tension pneumothorax. This study was conducted between July and October 2021, with the primary analysis performed between October and November 2021. 
Main Outcomes and Measures: The primary end points were the areas under the receiver operating characteristic curves (AUCs) for the detection of pneumothorax and tension pneumothorax. The secondary end points were the sensitivities and specificities for the detection of pneumothorax and tension pneumothorax. Results: The final analysis included radiographs from 985 patients (mean [SD] age, 60.8 [19.0] years; 436 [44.3%] female patients), including 307 patients with nontension pneumothorax, 128 patients with tension pneumothorax, and 550 patients without pneumothorax. The AI model detected any pneumothorax with an AUC of 0.979 (95% CI, 0.970-0.987), sensitivity of 94.3% (95% CI, 92.0%-96.3%), and specificity of 92.0% (95% CI, 89.6%-94.2%) and tension pneumothorax with an AUC of 0.987 (95% CI, 0.980-0.992), sensitivity of 94.5% (95% CI, 90.6%-97.7%), and specificity of 95.3% (95% CI, 93.9%-96.6%). Conclusions and Relevance: These findings suggest that the assessed AI model accurately detected pneumothorax and tension pneumothorax in this chest radiograph data set. The model's use in the clinical workflow could lead to earlier identification and improved care for patients with pneumothorax.


Subjects
Deep Learning, Pneumothorax, Humans, Female, Adolescent, Adult, Middle Aged, Male, Pneumothorax/diagnostic imaging, Thoracic Radiography, Artificial Intelligence, Retrospective Studies, Radiography
18.
Diagnostics (Basel) ; 12(10)2022 Sep 30.
Article in English | MEDLINE | ID: mdl-36292071

ABSTRACT

BACKGROUND: Missed findings in chest X-ray interpretation are common and can have serious consequences. METHODS: Our study included 2407 chest radiographs (CXRs) acquired at three Indian and five US sites. To identify CXRs reported as normal, we used a proprietary radiology report search engine based on natural language processing (mPower, Nuance). Two thoracic radiologists reviewed all CXRs and recorded the presence and clinical significance of abnormal findings on a 5-point scale (1-not important; 5-critical importance). All CXRs were processed with the AI model (Qure.ai) and outputs were recorded for the presence of findings. Data were analyzed to obtain area under the ROC curve (AUC). RESULTS: Of 410 CXRs (410/2407, 18.9%) with unreported/missed findings, 312 (312/410, 76.1%) findings were clinically important: pulmonary nodules (n = 157), consolidation (60), linear opacities (37), mediastinal widening (21), hilar enlargement (17), pleural effusions (11), rib fractures (6) and pneumothoraces (3). AI detected 69 missed findings (69/131, 53%) with an AUC of up to 0.935. The AI model was generalizable across different sites, geographic locations, patient genders and age groups. CONCLUSION: A substantial number of important CXR findings are missed; the AI model can help to identify and reduce the frequency of important missed findings in a generalizable manner.

20.
Diagnostics (Basel) ; 12(9)2022 Aug 28.
Article in English | MEDLINE | ID: mdl-36140488

ABSTRACT

Purpose: We assessed whether a CXR AI algorithm was able to detect missed or mislabeled chest radiograph (CXR) findings in radiology reports. Methods: We queried a multi-institutional radiology reports search database of 13 million reports to identify all CXR reports with addendums from 1999-2021. Of the 3469 CXR reports with an addendum, a thoracic radiologist excluded reports where addenda were created for typographic errors, wrong report template, missing sections, or uninterpreted signoffs. The remaining reports contained addenda (279 patients) with errors related to side-discrepancies or missed findings such as pulmonary nodules, consolidation, pleural effusions, pneumothorax, and rib fractures. All CXRs were processed with an AI algorithm. Descriptive statistics were performed to determine the sensitivity, specificity, and accuracy of the AI in detecting missed or mislabeled findings. Results: The AI had high sensitivity (96%), specificity (100%), and accuracy (96%) for detecting all missed and mislabeled CXR findings. The corresponding finding-specific statistics for the AI were nodules (96%, 100%, 96%), pneumothorax (84%, 100%, 85%), pleural effusion (100%, 17%, 67%), consolidation (98%, 100%, 98%), and rib fractures (87%, 100%, 94%). Conclusions: The CXR AI could accurately detect mislabeled and missed findings. Clinical Relevance: The CXR AI can reduce the frequency of errors in detection and side-labeling of radiographic findings.
