ABSTRACT
PURPOSE: Worldwide clinical knowledge is expanding rapidly, but physicians have limited time to review the scientific literature. Large language models (eg, Chat Generative Pretrained Transformer [ChatGPT]) might help summarize and prioritize research articles for review. However, large language models sometimes "hallucinate" incorrect information. METHODS: We evaluated ChatGPT's ability to summarize 140 peer-reviewed abstracts from 14 journals. Physicians rated the quality, accuracy, and bias of the ChatGPT summaries. We also compared physician ratings of relevance to various areas of medicine with ChatGPT's relevance ratings. RESULTS: ChatGPT produced summaries that were 70% shorter than the original abstracts (mean abstract length of 2,438 characters reduced to 739 characters). The summaries were nevertheless rated as high in quality (median score 90, interquartile range [IQR] 87.0-92.5; scale 0-100), high in accuracy (median 92.5, IQR 89.0-95.0), and low in bias (median 0, IQR 0-7.5). Serious inaccuracies and hallucinations were uncommon. ChatGPT's classification of the relevance of entire journals to various fields of medicine closely mirrored physician classifications (nonlinear standard error of the regression [SER] 8.6 on a scale of 0-100); however, its relevance classification for individual articles was much weaker (SER 22.3). CONCLUSIONS: Summaries generated by ChatGPT were 70% shorter than the mean abstract length and were characterized by high quality, high accuracy, and low bias. Conversely, ChatGPT had only modest ability to classify the relevance of articles to medical specialties. We suggest that ChatGPT can help family physicians accelerate review of the scientific literature, and we have developed software (pyJournalWatch) to support this application. Life-critical medical decisions should remain based on careful, critical, and thoughtful evaluation of the full text of research articles in the context of clinical guidelines.
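The summarization workflow described above, submitting an abstract to a large language model and requesting a condensed version, can be illustrated with a minimal sketch. This is not the authors' pyJournalWatch implementation; the OpenAI Python client, the model name, the character limit, and the prompt wording are all assumptions made for illustration only.

# Minimal sketch of LLM-based abstract summarization (hypothetical; not the
# authors' pyJournalWatch code). Assumes the openai Python package (v1+) and
# an OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()

def summarize_abstract(abstract_text: str, max_chars: int = 750) -> str:
    """Request a condensed summary of a single peer-reviewed abstract."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # hypothetical model choice, not specified by the study
        messages=[
            {"role": "system",
             "content": "You condense medical abstracts for busy physicians. "
                        "Do not add information that is absent from the text."},
            {"role": "user",
             "content": f"Summarize the following abstract in fewer than "
                        f"{max_chars} characters:\n\n{abstract_text}"},
        ],
    )
    return response.choices[0].message.content

# Example usage with a hypothetical abstract stored in abstract.txt:
# print(summarize_abstract(open("abstract.txt").read()))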
Subjects
Medicine, Humans, Family Physicians
ABSTRACT
Objective: To assess the accuracy of COVID-19 vaccination status within the electronic health record (EHR) for a panel of patients in a primary care practice when manual queries of the state immunization databases are required to access outside immunization records. Materials and Methods: This study evaluated the COVID-19 vaccination status of adult primary care patients within a university-based health system EHR by manually querying the Kansas and Missouri Immunization Information Systems. Results: A manual query of the local Immunization Information Systems for 4,114 adult patients with "unknown" vaccination status showed that 44% of these patients had previously been vaccinated. Attempts to assess the comprehensiveness of the Immunization Information Systems were hampered by incomplete documentation in the chart and poor response to patient outreach. Conclusions: When the interface between the patient chart and the local Immunization Information System depends on a manual query to transfer data, the COVID-19 vaccination status for a panel of patients is often inaccurate.