1.
J Am Med Inform Assoc ; 31(10): 2284-2293, 2024 Oct 01.
Article in English | MEDLINE | ID: mdl-39271171

ABSTRACT

OBJECTIVES: The aim of this study was to investigate GPT-3.5 in generating and coding medical documents with International Classification of Diseases (ICD)-10 codes for data augmentation on low-resource labels. MATERIALS AND METHODS: Using GPT-3.5, we generated and coded 9606 discharge summaries based on lists of ICD-10 code descriptions of patients with infrequent (or generation) codes within the MIMIC-IV dataset. Combined with the baseline training set, this formed an augmented training set. Neural coding models were trained on baseline and augmented data and evaluated on a MIMIC-IV test set. We report micro- and macro-F1 scores on the full codeset, generation codes, and their families. Weak Hierarchical Confusion Matrices determined within-family and outside-of-family coding errors in the latter codesets. The coding performance of GPT-3.5 was evaluated on prompt-guided self-generated data and real MIMIC-IV data. Clinicians evaluated the clinical acceptability of the generated documents. RESULTS: Data augmentation results in slightly lower overall model performance but improves performance for the generation candidate codes and their families, including one code absent from the baseline training data. Augmented models display lower out-of-family error rates. GPT-3.5 identifies ICD-10 codes by their prompted descriptions but underperforms on real data. Evaluators highlight the correctness of generated concepts, but the documents fall short in variety, supporting information, and narrative. DISCUSSION AND CONCLUSION: While GPT-3.5 alone, given our prompt setting, is unsuitable for ICD-10 coding, it supports data augmentation for training neural models. Augmentation positively affects generation code families but mainly benefits codes with existing examples. Augmentation reduces out-of-family errors. Documents generated by GPT-3.5 state prompted concepts correctly but lack variety and authenticity in narratives.
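
The abstract reports both micro- and macro-F1, which can diverge sharply when some codes are rare. The following is a minimal sketch of the two averages for multi-label coding, not the study's evaluation code: the function names, the toy ICD-10 codes, and the choice to average macro-F1 over every code that appears in the gold or predicted labels are all assumptions made for illustration.

```python
from collections import Counter


def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(gold, pred):
    """gold, pred: lists of sets of codes, one set per document (hypothetical layout)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        for code in g & p:   # predicted and correct
            tp[code] += 1
        for code in p - g:   # predicted but not in the gold labels
            fp[code] += 1
        for code in g - p:   # in the gold labels but missed
            fn[code] += 1
    codes = set(tp) | set(fp) | set(fn)
    # Micro: pool the counts over all codes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per code, then take the unweighted mean.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in codes) / len(codes) if codes else 0.0
    return micro, macro


# Toy data (invented): the frequent code I10 is always right, the rare code J45.909 is missed.
gold = [{"I10", "E11.9"}, {"I10"}, {"I10", "J45.909"}]
pred = [{"I10", "E11.9"}, {"I10"}, {"I10"}]
micro, macro = micro_macro_f1(gold, pred)
print(f"micro-F1 = {micro:.3f}, macro-F1 = {macro:.3f}")
```

In this toy case a single missed rare code lowers macro-F1 far more than micro-F1, which is why gains on low-resource codes can coexist with a nearly unchanged or slightly lower overall score.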


Subjects
Clinical Coding; International Classification of Diseases; Patient Discharge Summaries; Humans; Electronic Health Records; Patient Discharge; Neural Networks, Computer
2.
EClinicalMedicine ; 71: 102590, 2024 May.
Article in English | MEDLINE | ID: mdl-38623399

ABSTRACT

Background: Long COVID is a debilitating multisystem condition. The objective of this study was to estimate the prevalence of long COVID in the adult population of Scotland, and to identify risk factors associated with its development. Methods: In this national, retrospective, observational cohort study, we analysed electronic health records (EHRs) for all adults (≥18 years) registered with a general medical practice and resident in Scotland between March 1, 2020, and October 26, 2022 (98-99% of the population). We linked data from primary care, secondary care, laboratory testing and prescribing. Four outcome measures were used to identify long COVID: clinical codes, free text in primary care records, free text on sick notes, and a novel operational definition. The operational definition was developed using Poisson regression to identify clinical encounters indicative of long COVID from a sample of negative and positive COVID-19 cases matched on time-varying propensity to test positive for SARS-CoV-2. Possible risk factors for long COVID were identified by stratifying descriptive statistics by long COVID status. Findings: Of 4,676,390 participants, 81,219 (1.7%) were identified as having long COVID. Clinical codes identified the fewest cases (n = 1,092, 0.02%), followed by free text (n = 8,368, 0.2%), sick notes (n = 14,469, 0.3%), and the operational definition (n = 64,193, 1.4%). There was limited overlap in cases identified by the measures; however, temporal trends and patient characteristics were consistent across measures. 
Compared with the general population, a higher proportion of people with long COVID were female (65.1% versus 50.4%), aged 38-67 (63.7% versus 48.9%), overweight or obese (45.7% versus 29.4%), had one or more comorbidities (52.7% versus 36.0%), were immunosuppressed (6.9% versus 3.2%), shielding (7.9% versus 3.4%), or hospitalised within 28 days of testing positive (8.8% versus 3.3%), and had tested positive before Omicron became the dominant variant (44.9% versus 35.9%). The operational definition identified long COVID cases with combinations of clinical encounters (from four symptoms, six investigation types, and seven management strategies) recorded in EHRs within 4-26 weeks of a positive SARS-CoV-2 test. These combinations were significantly (p < 0.0001) more prevalent in positive COVID-19 patients than in matched negative controls. In a case-crossover analysis, 16.4% of those identified by the operational definition had similar healthcare patterns recorded before testing positive. Interpretation: The prevalence of long COVID presenting in general practice was estimated to be 0.02-1.7%, depending on the measure used. Due to challenges in diagnosing long COVID and inconsistent recording of information in EHRs, the true prevalence of long COVID is likely to be higher. The operational definition provided a novel approach but relied on a restricted set of symptoms and may misclassify individuals with pre-existing health conditions. Further research is needed to refine and validate this approach. Funding: Chief Scientist Office (Scotland), Medical Research Council, and BREATHE.
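
The prevalence figures in the Findings can be checked with simple arithmetic. The sketch below recomputes each percentage from the counts quoted in the abstract and compares the sum of the four per-measure counts with the overall total. It assumes that the 81,219 figure is the number of people identified by at least one of the four measures; the variable names are illustrative and this is not the study's analysis code.

```python
# Counts quoted in the abstract.
total_participants = 4_676_390
identified_by_any_measure = 81_219  # assumed to be the union of the four measures
by_measure = {
    "clinical codes": 1_092,
    "free text": 8_368,
    "sick notes": 14_469,
    "operational definition": 64_193,
}


def pct(n):
    """Share of all participants, in percent."""
    return 100 * n / total_participants


for name, n in by_measure.items():
    print(f"{name}: {n:,} ({pct(n):.2f}%)")
print(f"any measure: {identified_by_any_measure:,} ({pct(identified_by_any_measure):.2f}%)")

# A person found by k measures is counted k times in the per-measure sum,
# so the excess over the union is the total number of repeat identifications.
summed = sum(by_measure.values())
excess = summed - identified_by_any_measure
print(f"sum of per-measure counts: {summed:,}; excess over union: {excess:,}")
```

Under that assumption, the excess is an upper bound on the number of people flagged by more than one measure, which is small relative to the 81,219 identified overall and fits the abstract's statement that overlap between measures was limited.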
