Search | VHL Regional Portal (Portal Regional de la BVS)

Artificial intelligence to identify fractures on pediatric and young adult upper extremity radiographs.

Zech, John R; Jaramillo, Diego; Altosaar, Jaan; Popkin, Charles A; Wong, Tony T.

Pediatr Radiol; 53(12): 2386-2397, 2023 Nov.

Article in English | MEDLINE | ID: mdl-37740031

ABSTRACT

BACKGROUND: Pediatric fractures are challenging to identify because the pediatric skeleton responds to injury differently from the adult skeleton, and most artificial intelligence (AI) fracture detection work has focused on adults.

OBJECTIVE: Develop and transparently share an AI model capable of detecting a range of pediatric upper extremity fractures.

MATERIALS AND METHODS: In total, 58,846 upper extremity radiographs (finger/hand, wrist/forearm, elbow, humerus, shoulder/clavicle) from 14,873 pediatric and young adult patients were divided into train (n = 12,232 patients), tune (n = 1,307), internal test (n = 819), and external test (n = 515) splits. Fracture was determined by manual inspection of all test radiographs and of the subset of train/tune radiographs whose reports were classified fracture-positive by a rule-based natural language processing (NLP) algorithm. We trained an object detection model (Faster Region-based Convolutional Neural Network [R-CNN]; "strongly-supervised") and an image classification model (EfficientNetV2-Small; "weakly-supervised") to detect fractures using train/tune data and evaluated them on test data. AI fracture detection accuracy was compared with the accuracy of on-call residents on cases they preliminarily interpreted overnight.

RESULTS: The strongly-supervised fracture detection AI model achieved an overall test area under the receiver operating characteristic curve (AUC) of 0.96 (95% CI 0.95-0.97), accuracy of 89.7% (95% CI 88.0-91.3%), sensitivity of 90.8% (95% CI 88.5-93.1%), and specificity of 88.7% (95% CI 86.4-91.0%), and outperformed the weakly-supervised model (AUC 0.93, 95% CI 0.92-0.94, P < 0.0001). AI accuracy on cases preliminarily interpreted overnight was higher than resident accuracy (AI 89.4%, 95% CI 87.3-91.5% vs. residents 85.1%, 95% CI 82.7-87.5%; P = 0.01).

CONCLUSION: An object detection AI model identified pediatric upper extremity fractures with high accuracy.

Subject(s)

Artificial Intelligence; Fractures, Bone; Humans; Child; Young Adult; Fractures, Bone/diagnostic imaging; Neural Networks, Computer; Radiography; Elbow; Retrospective Studies
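
The accuracy, sensitivity, and specificity reported in the abstract above are standard confusion-matrix quantities. The following is a minimal sketch of how such figures are computed from binary labels; it is not the authors' evaluation code, the function name `binary_metrics` is hypothetical, and the labels are toy values rather than study data.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, sensitivity, and specificity for 0/1 labels.

    Hypothetical helper for illustration; 1 means "fracture", 0 means "no fracture".
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")

    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # fraction of true fractures detected
        "specificity": tn / (tn + fp),  # fraction of normal studies cleared
    }


# Toy labels (assumed for illustration, not taken from the study).
toy_truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
toy_metrics = binary_metrics(toy_truth, toy_pred)
```

On the toy labels there are 3 true positives, 1 false negative, 5 true negatives, and 1 false positive, so sensitivity is 3/4 and specificity is 5/6. The confidence intervals in the abstract would need an additional step (for example, bootstrapping over patients), which the abstract does not specify.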

Auditing Learned Associations in Deep Learning Approaches to Extract Race and Ethnicity from Clinical Text.

Bear Don't Walk IV, Oliver J; Pichon, Adrienne; Nieva, Harry Reyes; Sun, Tony; Altosaar, Jaan; Natarajan, Karthik; Perotte, Adler; Tarczy-Hornoch, Peter; Demner-Fushman, Dina; Elhadad, Noémie.

AMIA Annu Symp Proc; 2023: 289-298, 2023.

Article in English | MEDLINE | ID: mdl-38222422

ABSTRACT

Complete and accurate race and ethnicity (RE) patient information is important for many areas of biomedical informatics research, such as defining and characterizing cohorts, performing quality assessments, and identifying health inequities. Patient-level RE data is often inaccurate or missing in structured sources, but can be supplemented through clinical notes and natural language processing (NLP). While NLP has made many improvements in recent years with large language models, bias remains an often-unaddressed concern, with research showing that harmful and negative language is more often used for certain racial/ethnic groups than others. We present an approach to audit the learned associations of models trained to identify RE information in clinical text by measuring the concordance between model-derived salient features and manually identified RE-related spans of text. We show that while models perform well on the surface, there exist concerning learned associations and potential for future harms from RE-identification models if left unaddressed.

Subject(s)

Deep Learning; Ethnicity; Humans; Language; Natural Language Processing
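
The abstract above describes auditing a model by measuring concordance between model-derived salient features and manually identified spans of text. One way to picture this is as a set overlap between the top-k most salient token positions and the annotated token positions. The sketch below assumes that framing; the abstract does not state the concordance measure actually used, and the names `top_k_indices` and `span_concordance`, the saliency scores, and the annotations are all hypothetical.

```python
from typing import Dict, Sequence, Set


def top_k_indices(scores: Sequence[float], k: int) -> Set[int]:
    """Return the positions of the k highest saliency scores."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])


def span_concordance(scores: Sequence[float], annotated: Set[int], k: int) -> Dict[str, float]:
    """Overlap between the top-k salient tokens and annotated token positions.

    Assumed measure for illustration only: precision, recall, and Jaccard index.
    """
    salient = top_k_indices(scores, k)
    overlap = salient & annotated
    union = salient | annotated
    return {
        "precision": len(overlap) / len(salient) if salient else 0.0,
        "recall": len(overlap) / len(annotated) if annotated else 0.0,
        "jaccard": len(overlap) / len(union) if union else 0.0,
    }


# Toy example: six tokens, annotators marked positions 1-3 as relevant.
toy_scores = [0.1, 0.9, 0.8, 0.05, 0.7, 0.2]
toy_annotated = {1, 2, 3}
toy_concordance = span_concordance(toy_scores, toy_annotated, k=3)
```

Here the three most salient positions are 1, 2, and 4, so two of the three overlap with the annotation. Low concordance under a measure like this would suggest the model relies on text that annotators did not consider relevant, which is the kind of learned association the abstract flags as concerning.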

Assessing Phenotype Definitions for Algorithmic Fairness.

Sun, Tony Y; Bhave, Shreyas A; Altosaar, Jaan; Elhadad, Noémie.

AMIA Annu Symp Proc; 2022: 1032-1041, 2022.

Article in English | MEDLINE | ID: mdl-37128361

ABSTRACT

Phenotyping is a core, routine activity in observational health research. Cohorts impact downstream analyses, such as how a condition is characterized, how patient risk is defined, and what treatments are studied. It is thus critical to ensure that cohorts are representative of all patients, independently of their demographics or social determinants of health. In this paper, we propose a set of best practices to assess the fairness of phenotype definitions. We leverage established fairness metrics commonly used in predictive models and relate them to commonly used epidemiological metrics. We describe an empirical study of Crohn's disease and type 2 diabetes, each with multiple phenotype definitions taken from the literature, evaluated across gender and race. We show that the different phenotype definitions exhibit widely varying and disparate performance according to the different fairness metrics and subgroups. We hope that the proposed best practices can help in constructing fair and inclusive phenotype definitions.

Subject(s)

Crohn Disease; Humans; Phenotype
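
The abstract above relates fairness metrics from predictive modeling to epidemiological metrics. One well-known correspondence is that equal opportunity compares true positive rates across subgroups, and the true positive rate is the same quantity as sensitivity. The sketch below illustrates that single correspondence; the abstract does not list which metrics the paper uses, and the function names, group labels, and data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, Sequence


def sensitivity_by_group(
    in_phenotype: Sequence[int],
    has_condition: Sequence[int],
    group: Sequence[Hashable],
) -> Dict[Hashable, float]:
    """Per-subgroup sensitivity of a phenotype definition (hypothetical helper).

    in_phenotype: 1 if the definition selects the patient, else 0.
    has_condition: 1 if the patient truly has the condition, else 0.
    """
    tp = defaultdict(int)
    fn = defaultdict(int)
    for selected, truth, g in zip(in_phenotype, has_condition, group):
        if truth == 1:  # sensitivity only looks at true cases
            if selected == 1:
                tp[g] += 1
            else:
                fn[g] += 1
    return {g: tp[g] / (tp[g] + fn[g]) for g in set(tp) | set(fn)}


def equal_opportunity_gap(sensitivity: Dict[Hashable, float]) -> float:
    """Largest difference in sensitivity between any two subgroups."""
    return max(sensitivity.values()) - min(sensitivity.values())


# Toy cohort (assumed for illustration, not study data).
toy_selected = [1, 1, 1, 0, 1, 0, 0, 0]
toy_truth = [1, 1, 1, 1, 1, 1, 0, 0]
toy_group = ["A", "A", "A", "A", "B", "B", "A", "B"]
toy_sens = sensitivity_by_group(toy_selected, toy_truth, toy_group)
toy_gap = equal_opportunity_gap(toy_sens)
```

In the toy cohort the definition captures 3 of 4 true cases in group A but only 1 of 2 in group B, a gap of 0.25. Repeating this for each candidate phenotype definition and each subgroup attribute gives the kind of side-by-side comparison the abstract describes.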
