1.
J Surg Res ; 301: 504-511, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39042979

ABSTRACT

INTRODUCTION: Large language models like Chat Generative Pre-Trained Transformer (ChatGPT) are increasingly used in academic writing. Faculty may consider use of artificial intelligence (AI)-generated responses a form of cheating. We sought to determine whether general surgery residency faculty could distinguish AI-generated from human-written responses to a text prompt. We hypothesized that faculty would not be able to reliably differentiate the two. METHODS: Ten essays were generated using a text prompt, "Tell us in 1-2 paragraphs why you are considering the University of Rochester for General Surgery residency" (Current trainees: n = 5, ChatGPT: n = 5). Ten blinded faculty reviewers rated essays (ten-point Likert scale) on the following criteria: desire to interview, relevance to the general surgery residency, overall impression, and whether the essay was AI- or human-generated. Scores and identification error rates were compared between the groups. RESULTS: There were no differences between groups for %total points (ChatGPT 66.0 ± 13.5%, human 70.0 ± 23.0%, P = 0.508) or identification error rates (ChatGPT 40.0 ± 35.0%, human 20.0 ± 30.0%, P = 0.175). Except for one, all essays were identified incorrectly by at least two reviewers. Essays identified as human-generated received higher overall impression scores (area under the curve: 0.82 ± 0.04, P < 0.01). CONCLUSIONS: Whether use of AI tools for academic purposes should constitute academic dishonesty is controversial. We demonstrate that human and AI-generated essays are similar in quality, but there is bias against presumed AI-generated essays. Faculty are not able to reliably differentiate human from AI-generated essays, thus bias may be misdirected. AI tools are becoming ubiquitous and their use is not easily detected. Faculty must expect these tools to play increasing roles in medical education.


Subjects
Artificial Intelligence , General Surgery , Internship and Residency , Internship and Residency/methods , Humans , General Surgery/education , Writing , Faculty, Medical/psychology
3.
J Pediatr Surg ; 59(1): 74-79, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37865573

ABSTRACT

BACKGROUND: The assignment of trauma team activation levels can be conceptualized as a classification task. Machine learning models can be used to optimize classification predictions. Our purpose was to demonstrate proof-of-concept for a machine learning tool for predicting trauma team activation levels in pediatric patients with traumatic injuries. METHODS: Following IRB approval, we retrospectively collected data from the institutional trauma registry and electronic medical record at our Pediatric Trauma Center for all patients (age <18 y) who triggered a trauma team activation (1/2014-12/2021), including: demographics, mechanisms of injury, comorbidities, pre-hospital interventions, numeric variables, and the six "Need for Trauma Intervention (NFTI)" criteria. Three machine learning models (Logistic Regression, Random Forest, Support Vector Machine) were tested 1000 times in separate trials using the union of the Cribari and NFTI metrics as ground-truth (Injury Severity Score >15 or positive for any of 6 NFTI criteria = full activation). Model performance was quantified and compared to emergency department (ED) staff. RESULTS: ED staff had 75% accuracy, an area under the curve (AUC) of 0.73 ± 0.04, and an F1 score of 0.49. The best performing of all machine learning models, the support vector machine, had 80% accuracy, AUC 0.81 ± 4.1e-5, F1 Score 0.80, with less variance compared to other models and ED staff. CONCLUSIONS: All machine learning models outperformed ED staff in all performance metrics. These results suggest that data-driven methods can optimize trauma team activations in the ED, with potential improvements in both patient safety and hospital resource utilization. TYPE OF STUDY: Economic/Decision Analysis or Modeling Studies. LEVEL OF EVIDENCE: II.


Subjects
Emergency Service, Hospital , Triage , Humans , Child , Retrospective Studies , Triage/methods , Trauma Centers , Machine Learning
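The abstract above reports accuracy and F1 score side by side; ED staff reached 75% accuracy but an F1 score of only 0.49. The following is a minimal sketch, not the authors' code, of how both metrics are derived from the same confusion-matrix counts and how they can diverge when full activations are the minority class. The function names and the toy labels are assumptions made for illustration; none of the numbers in the code come from the study.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 = full activation."""
    counts = Counter(zip(y_true, y_pred))  # keys are (true, predicted) pairs
    return counts[(1, 1)], counts[(0, 1)], counts[(0, 0)], counts[(1, 0)]


def accuracy(y_true, y_pred):
    """Fraction of all cases that were classified correctly."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical toy data: 4 true full activations among 20 patients.
toy_true = [1] * 4 + [0] * 16
# The classifier catches 2 of the 4 and raises 3 false alarms.
toy_pred = [1, 1, 0, 0] + [1, 1, 1] + [0] * 13

if __name__ == "__main__":
    print(f"accuracy = {accuracy(toy_true, toy_pred):.2f}")
    print(f"F1       = {f1_score(toy_true, toy_pred):.2f}")
```

In this toy case most patients are correctly left at the lower activation level, which keeps accuracy high, while the missed and spurious full activations pull F1 down. Whether the same mechanism explains the gap reported for ED staff cannot be determined from the abstract alone.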
4.
J Am Coll Surg ; 239(2): 134-144, 2024 Aug 01.
Article in English | MEDLINE | ID: mdl-38357984

ABSTRACT

BACKGROUND: Assigning trauma team activation (TTA) levels for trauma patients is a classification task that machine learning models can help optimize. However, performance is dependent on the "ground-truth" labels used for training. Our purpose was to investigate 2 ground truths, the Cribari matrix and the Need for Trauma Intervention (NFTI), for labeling training data. STUDY DESIGN: Data were retrospectively collected from the institutional trauma registry and electronic medical record, including all pediatric patients (age <18 years) who triggered a TTA (January 2014 to December 2021). Three ground truths were used to label training data: (1) Cribari (Injury Severity Score >15 = full activation), (2) NFTI (positive for any of 6 criteria = full activation), and (3) the union of Cribari+NFTI (either positive = full activation). RESULTS: Of 1,366 patients triaged by trained staff, 143 (10.47%) were considered undertriaged using Cribari, 210 (15.37%) using NFTI, and 273 (19.99%) using Cribari+NFTI. NFTI and Cribari+NFTI were more sensitive to undertriage in patients with penetrating mechanisms of injury (p = 0.006), specifically stab wounds (p = 0.014), compared with Cribari, but Cribari indicated overtriage in more patients who required prehospital airway management (p < 0.001), CPR (p = 0.017), and who had lower mean Glasgow Coma Scale scores on presentation (p < 0.001). The mortality rate was higher in the Cribari overtriage group (7.14%, n = 9) compared with NFTI and Cribari+NFTI (0.00%, n = 0, p = 0.005). CONCLUSIONS: To prioritize patient safety, Cribari+NFTI appears best for training a machine learning algorithm to predict the TTA level.


Subjects
Injury Severity Score , Triage , Wounds and Injuries , Humans , Child , Retrospective Studies , Wounds and Injuries/therapy , Wounds and Injuries/diagnosis , Wounds and Injuries/mortality , Female , Male , Child, Preschool , Adolescent , Triage/standards , Triage/methods , Machine Learning , Trauma Centers , Patient Care Team/organization & administration , Infant , Registries
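The three ground-truth labeling rules described in the abstract above can be sketched as follows. This is an illustrative sketch only, not the authors' implementation. The `Patient` record, its field names, and the toy cohort are hypothetical. The sketch also assumes that a patient counts as undertriaged when the ground truth calls for full activation but staff assigned a lower level, and that the rate is taken over all triaged patients, which is consistent with the reported percentages (for example, 143 of 1,366 is 10.47%).

```python
from dataclasses import dataclass
from typing import Callable, Sequence

ISS_THRESHOLD = 15  # Cribari rule from the abstract: Injury Severity Score > 15


@dataclass
class Patient:
    """Hypothetical record; field names are assumptions for illustration."""
    iss: int                         # Injury Severity Score
    nfti_flags: Sequence[bool]       # one flag per NFTI criterion (six in total)
    staff_full_activation: bool      # level actually assigned by triage staff


def cribari_full(p: Patient) -> bool:
    """Ground truth 1: full activation when ISS exceeds the threshold."""
    return p.iss > ISS_THRESHOLD


def nfti_full(p: Patient) -> bool:
    """Ground truth 2: full activation when any NFTI criterion is positive."""
    return any(p.nfti_flags)


def union_full(p: Patient) -> bool:
    """Ground truth 3: full activation when either rule is positive."""
    return cribari_full(p) or nfti_full(p)


def undertriage_rate(patients: Sequence[Patient],
                     truth: Callable[[Patient], bool]) -> float:
    """Share of all patients labeled full by `truth` but triaged lower by staff."""
    under = sum(1 for p in patients if truth(p) and not p.staff_full_activation)
    return under / len(patients)


NO_NFTI = (False,) * 6
# Hypothetical four-patient cohort, not study data.
cohort = [
    Patient(iss=20, nfti_flags=NO_NFTI, staff_full_activation=False),
    Patient(iss=9, nfti_flags=(True,) + (False,) * 5, staff_full_activation=False),
    Patient(iss=25, nfti_flags=(True,) * 6, staff_full_activation=True),
    Patient(iss=4, nfti_flags=NO_NFTI, staff_full_activation=False),
]

if __name__ == "__main__":
    for name, rule in [("Cribari", cribari_full), ("NFTI", nfti_full),
                       ("Cribari+NFTI", union_full)]:
        print(f"{name}: {undertriage_rate(cohort, rule):.2%}")
```

Because the union rule labels a patient as full activation whenever either component does, its undertriage count can never be lower than that of Cribari or NFTI alone, which matches the ordering of the three rates in the abstract.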