1.
J Biomed Inform ; 142: 104346, 2023 06.
Article in English | MEDLINE | ID: mdl-37061012

ABSTRACT

Daily progress notes are a common note type in the electronic health record (EHR) where healthcare providers document the patient's daily progress and treatment plans. The EHR is designed to document all the care provided to patients, but it also enables note bloat with extraneous information that distracts from the diagnoses and treatment plans. The application of natural language processing (NLP) to the EHR is a growing field, with the majority of methods focused on information extraction. Few tasks use NLP methods for downstream diagnostic decision support. We introduced the 2022 National NLP Clinical Challenge (N2C2) Track 3: Progress Note Understanding - Assessment and Plan Reasoning as one step towards a new suite of tasks. The Assessment and Plan Reasoning task focuses on the most critical components of progress notes, the Assessment and Plan subsections, where health problems and diagnoses are contained. The goal of the task was to develop and evaluate NLP systems that automatically predict causal relations between the overall status of the patient contained in the Assessment section and each component of the Plan section, which contains the diagnoses and treatment plans. A further goal was to identify and prioritize diagnoses as a first step in diagnostic decision support, helping to find the most relevant information in long documents such as daily progress notes. We present the results of the 2022 N2C2 Track 3 and provide a description of the data, evaluation, participation and system performance.


Subjects
Electronic Health Records, Information Storage and Retrieval, Humans, Natural Language Processing, Health Personnel
2.
J Biomed Inform ; 138: 104286, 2023 02.
Article in English | MEDLINE | ID: mdl-36706848

ABSTRACT

The meaningful use of electronic health records (EHR) continues to progress in the digital era with clinical decision support systems augmented by artificial intelligence. A priority in improving provider experience is to overcome information overload and reduce the cognitive burden so fewer medical errors and cognitive biases are introduced during patient care. One major type of medical error is diagnostic error due to systematic or predictable errors in judgement that rely on heuristics. The potential for clinical natural language processing (cNLP) to model diagnostic reasoning in humans with forward reasoning from data to diagnosis and potentially reduce cognitive burden and medical error has not been investigated. Existing tasks to advance the science in cNLP have largely focused on information extraction and named entity recognition through classification tasks. We introduce a novel suite of tasks coined as Diagnostic Reasoning Benchmarks, DR.BENCH, as a new benchmark for developing and evaluating cNLP models with clinical diagnostic reasoning ability. The suite includes six tasks from ten publicly available datasets addressing clinical text understanding, medical knowledge reasoning, and diagnosis generation. DR.BENCH is the first clinical suite of tasks designed to be a natural language generation framework to evaluate pre-trained language models for diagnostic reasoning. The goal of DR.BENCH is to advance the science in cNLP to support downstream applications in computerized diagnostic decision support and improve the efficiency and accuracy of healthcare providers during patient care. We fine-tune and evaluate state-of-the-art generative models on DR.BENCH. Experiments show that with domain adaptation pre-training on medical knowledge, the model demonstrated opportunities for improvement when evaluated in DR.BENCH. We share DR.BENCH as a publicly available GitLab repository with a systematic approach to load and evaluate models for the cNLP community. We also discuss the carbon footprint produced during the experiments and encourage future work on DR.BENCH to report the carbon footprint.


Subjects
Artificial Intelligence, Natural Language Processing, Humans, Benchmarking, Problem Solving, Information Storage and Retrieval
3.
J Stroke Cerebrovasc Dis ; 31(3): 106268, 2022 Mar.
Article in English | MEDLINE | ID: mdl-34974241

ABSTRACT

OBJECTIVES: The pathogenesis of intracranial aneurysms is multifactorial and includes genetic, environmental, and anatomic influences. We aimed to identify image-based morphological parameters that were associated with middle cerebral artery (MCA) bifurcation aneurysms. MATERIALS AND METHODS: We evaluated three-dimensional morphological parameters obtained from CT angiography (CTA) or digital subtraction angiography (DSA) from 317 patients with unilateral MCA bifurcation aneurysms diagnosed at the Brigham and Women's Hospital and Massachusetts General Hospital between 1990 and 2016. We chose the contralateral unaffected MCA bifurcation as the control group, in order to control for genetic and environmental risk factors. Diameters and angles of surrounding parent and daughter vessels of 634 MCAs were examined. RESULTS: Univariable and multivariable statistical analyses were performed to determine statistical significance. Sensitivity analyses with smaller (≤ 3 mm) aneurysms only and with angles excluded were also performed. In a multivariable conditional logistic regression model, we showed that a smaller diameter size ratio (OR 0.0004, 95% CI 0.0001-0.15), larger daughter-daughter angles (OR 1.08, 95% CI 1.06-1.11), and larger parent-daughter angle ratios (OR 4.24, 95% CI 1.77-10.16) were significantly associated with MCA aneurysm presence after correcting for other variables. In order to account for possible changes to the vasculature by the aneurysm, a subgroup analysis of small aneurysms (≤ 3 mm) was performed and showed that the results were similar. CONCLUSIONS: Easily measurable morphological parameters of the surrounding vasculature of the MCA may provide objective metrics to assess MCA aneurysm formation risk in high-risk patients.


Subjects
Intracranial Aneurysm, Middle Cerebral Artery, Case-Control Studies, Computed Tomography Angiography, Female, Humans, Intracranial Aneurysm/diagnostic imaging, Middle Cerebral Artery/diagnostic imaging
4.
J Biomed Inform ; 113: 103626, 2021 01.
Article in English | MEDLINE | ID: mdl-33259943

ABSTRACT

Recent transformer-based pre-trained language models have become a de facto standard for many text classification tasks. Nevertheless, their utility in the clinical domain, where classification is often performed at encounter or patient level, is still uncertain due to the limitation on the maximum length of input. In this work, we introduce a self-supervised method for pre-training that relies on a masked token objective and is free from the limitation on the maximum input length. We compare the proposed method with supervised pre-training that uses billing codes as a source of supervision. We evaluate the proposed method on one publicly-available and three in-house datasets using the standard evaluation metrics such as the area under the ROC curve and F1 score. We find that, surprisingly, even though self-supervised pre-training performs slightly worse than supervised, it still preserves most of the gains from pre-training.


Subjects
Language, Natural Language Processing, Humans, ROC Curve
5.
Crit Care Med ; 48(9): e791-e798, 2020 09.
Article in English | MEDLINE | ID: mdl-32590389

ABSTRACT

OBJECTIVES: Acute respiratory distress syndrome is frequently underrecognized and associated with increased mortality. Previously, we developed a model that used machine learning and natural language processing of text from radiology reports to identify acute respiratory distress syndrome. The model showed improved performance in diagnosing acute respiratory distress syndrome when compared to a rule-based method. In this study, our objective was to externally validate the natural language processing model in patients from an independent hospital setting. DESIGN: Secondary analysis of data across five prospective clinical studies. SETTING: An urban, tertiary care, academic hospital. PATIENTS: Adult patients admitted to the medical ICU and at risk for acute respiratory distress syndrome. INTERVENTIONS: None. MEASUREMENTS AND MAIN RESULTS: The natural language processing model was previously derived and internally validated in burn, trauma, and medical patients at Loyola University Medical Center. Two machine learning models were examined with the following text features from qualifying radiology reports: 1) word representations (n-grams) and 2) standardized clinical named entity mentions mapped from the National Library of Medicine Unified Medical Language System. The models were externally validated in a cohort of 235 patients at the University of Chicago Medicine, among which 110 (47%) were diagnosed with acute respiratory distress syndrome by expert annotation. During external validation, the n-gram model demonstrated good discrimination between patients with and without acute respiratory distress syndrome (C-statistic, 0.78; 95% CI, 0.72-0.84). The n-gram model had a higher discrimination for acute respiratory distress syndrome when compared with the standardized named entity model, although the difference was not statistically significant (C-statistic 0.78 vs 0.72; p = 0.09). The most important features in the model had good face validity for acute respiratory distress syndrome characteristics, but differences in frequencies did occur between hospital settings. CONCLUSIONS: Our computable phenotype for acute respiratory distress syndrome had good discrimination in external validation and may be used by other health systems for case identification. Discrepancies in feature representation are likely due to differences in characteristics of the patient cohorts.


Subjects
Computer-Assisted Image Processing/methods, Intensive Care Units, Thoracic Radiography/methods, Respiratory Distress Syndrome/diagnostic imaging, Respiratory Distress Syndrome/mortality, Academic Medical Centers, Adult, Age Factors, Aged, Female, Hospital Mortality, Urban Hospitals, Humans, Machine Learning, Male, Middle Aged, Natural Language Processing, Prospective Studies, Reproducibility of Results, Sex Factors, Socioeconomic Factors
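
Note on the C-statistic reported in entry 5: the C-statistic is the probability that a randomly chosen case receives a higher model score than a randomly chosen non-case, with ties counted as one half. The following is a minimal sketch of that definition; the function name and the scores are hypothetical and are not taken from the study.

```python
from itertools import product


def c_statistic(scores_pos, scores_neg):
    """Fraction of (case, non-case) pairs in which the case scores higher.

    Ties contribute 0.5, which makes the result equal to the area under
    the ROC curve.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("need at least one score in each group")
    concordant = 0.0
    for pos, neg in product(scores_pos, scores_neg):
        if pos > neg:
            concordant += 1.0
        elif pos == neg:
            concordant += 0.5
    return concordant / (len(scores_pos) * len(scores_neg))


# Hypothetical model scores for illustration only.
cases = [0.9, 0.7, 0.6]
non_cases = [0.8, 0.4, 0.3]
print(round(c_statistic(cases, non_cases), 3))
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a cohort of a few hundred; rank-based formulas would be preferable for large datasets.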
6.
BMC Med Inform Decis Mak ; 20(1): 79, 2020 04 29.
Article in English | MEDLINE | ID: mdl-32349766

ABSTRACT

BACKGROUND: Automated de-identification methods for removing protected health information (PHI) from the source notes of the electronic health record (EHR) rely on building systems to recognize mentions of PHI in text, but they remain inadequate at ensuring perfect PHI removal. As an alternative to relying on de-identification systems, we propose the following solutions: (1) mapping the corpus of documents to standardized medical vocabulary (concept unique identifier [CUI] codes mapped from the Unified Medical Language System), thus eliminating PHI as inputs to a machine learning model; and (2) training character-based machine learning models that obviate the need for a dictionary containing input words/n-grams. We aim to test the performance of models with and without PHI in a use-case for an opioid misuse classifier. METHODS: An observational cohort was sampled from adult hospital inpatient encounters at a health system between 2007 and 2017. A case-control stratified sampling (n = 1000) was performed to build an annotated dataset for a reference standard of cases and non-cases of opioid misuse. Models for training and testing included CUI codes, character-based, and n-gram features. Models applied were machine learning with neural network and logistic regression as well as expert consensus with a rule-based model for opioid misuse. The areas under the receiver operating characteristic curve (AUROC) were compared between models for discrimination. The Hosmer-Lemeshow test and visual plots measured model fit and calibration. RESULTS: Machine learning models with CUI codes performed similarly to n-gram models with PHI. The top performing models with AUROCs > 0.90 included CUI codes as inputs to a convolutional neural network, max pooling network, and logistic regression model. The top calibrated models with the best model fit were the CUI-based convolutional neural network and max pooling network. The top-weighted CUI codes in logistic regression had the related terms 'Heroin' and 'Victim of abuse'. CONCLUSIONS: We demonstrate good test characteristics for an opioid misuse computable phenotype that is void of any PHI and performs similarly to models that use PHI. Herein we share a PHI-free, trained opioid misuse classifier for other researchers and health systems to use and benchmark to overcome privacy and security concerns.


Subjects
Machine Learning, Natural Language Processing, Opioid-Related Disorders/diagnosis, Adult, Electronic Health Records, Humans, Inpatients, Medical Records, Unified Medical Language System
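
Note on the character-based inputs mentioned in entry 6: the abstract does not describe how characters were encoded, so the following is only a sketch of one common way a character-level model can avoid a word or n-gram dictionary. It assumes UTF-8 byte values as the fixed vocabulary; `encode_chars`, `max_len`, and the padding scheme are hypothetical choices, not the authors' method.

```python
def encode_chars(text, max_len=16, pad=0):
    """Encode text as a fixed-length list of integer character codes.

    Each UTF-8 byte is shifted by 1 so that 0 stays reserved for padding.
    No word or n-gram dictionary is built from the corpus, so the encoder
    itself stores no vocabulary derived from patient notes.
    """
    codes = [byte + 1 for byte in text.encode("utf-8")[:max_len]]
    return codes + [pad] * (max_len - len(codes))


# Illustrative input only; not text from the study corpus.
print(encode_chars("opioid", max_len=8))
```

A fixed byte vocabulary keeps the input layer independent of any institution's notes, which is the property the abstract highlights; the sequence would then feed a neural network of the reader's choice.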
7.
Stroke ; 49(9): 2046-2052, 2018 09.
Article in English | MEDLINE | ID: mdl-30354989

ABSTRACT

BACKGROUND AND PURPOSE: The effects of anticoagulation therapy and elevated international normalized ratio (INR) values on the risk of aneurysmal subarachnoid hemorrhage are unknown. We aimed to investigate the association between anticoagulation therapy, elevated INR values, and rupture of intracranial aneurysms. METHODS: We conducted a case-control study of 4696 patients with 6403 intracranial aneurysms, including 1198 prospective patients, diagnosed at the Massachusetts General Hospital and the Brigham and Women's Hospital between 1990 and 2016 who were on no anticoagulant therapy or on warfarin for anticoagulation. Patients were divided into ruptured and nonruptured groups. Univariable and multivariable logistic regression analyses were performed to evaluate the association of anticoagulation therapy, INR values, and presentation with a ruptured intracranial aneurysm, taking into account the interaction between anticoagulant use and INR. Inverse probability weighting using propensity scores was used to minimize differences in baseline demographic characteristics. The marginal effects of anticoagulant use on rupture risk stratified by INR values were calculated. RESULTS: In unweighted and weighted multivariable analyses, elevated INR values were significantly associated with rupture status among patients who were not anticoagulated (unweighted odds ratio, 22.78; 95% CI, 10.85-47.81 and weighted odds ratio, 28.16; 95% CI, 12.44-63.77). In anticoagulated patients, warfarin use interacted significantly with INR when INR ≥1.2 by decreasing the effects of INR on rupture risk. CONCLUSIONS: INR elevation is associated with intracranial aneurysm rupture, but the effects may be moderated by warfarin. INR values should, therefore, be taken into consideration when counseling patients with intracranial aneurysms.


Subjects
Ruptured Aneurysm/epidemiology, Anticoagulants/therapeutic use, International Normalized Ratio, Intracranial Aneurysm, Subarachnoid Hemorrhage/epidemiology, Warfarin/therapeutic use, Adult, Aged, Ruptured Aneurysm/blood, Case-Control Studies, Female, Humans, Logistic Models, Male, Middle Aged, Multivariate Analysis, Odds Ratio, Propensity Score, Risk Factors, Spontaneous Rupture, Subarachnoid Hemorrhage/blood
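
Note on the inverse probability weighting mentioned in entry 7: in its basic form, each exposed patient is weighted by 1/p and each unexposed patient by 1/(1 - p), where p is the estimated propensity score. The sketch below shows only that weighting step. The abstract does not say whether stabilized or trimmed weights were used, so this is the textbook unstabilized form, with hypothetical names and values.

```python
def ipw_weights(exposed, propensity):
    """Return unstabilized inverse probability weights.

    exposed:    iterable of 0/1 flags (1 = received the exposure).
    propensity: estimated probability of exposure for each patient.
    """
    weights = []
    for flag, p in zip(exposed, propensity):
        if not 0.0 < p < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        weights.append(1.0 / p if flag else 1.0 / (1.0 - p))
    return weights


# Hypothetical patients: the first is exposed with p = 0.25, the second unexposed with p = 0.5.
print(ipw_weights([1, 0], [0.25, 0.5]))
```

Patients whose exposure status was unlikely given their covariates receive larger weights, which is how the weighted sample comes to resemble a population with balanced baseline characteristics.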
8.
Stroke ; 49(1): 34-39, 2018 01.
Article in English | MEDLINE | ID: mdl-29203688

ABSTRACT

BACKGROUND AND PURPOSE: Previous studies have suggested a protective effect of diabetes mellitus on aneurysmal subarachnoid hemorrhage risk. However, reports are inconsistent, and objective measures of hyperglycemia in these studies are lacking. Our aim was to investigate the association between aneurysmal subarachnoid hemorrhage and antihyperglycemic agent use and glycated hemoglobin levels. METHODS: The medical records of 4701 patients with 6411 intracranial aneurysms, including 1201 prospective patients, diagnosed at the Massachusetts General Hospital and Brigham and Women's Hospital between 1990 and 2016 were reviewed and analyzed. Patients were separated into ruptured and nonruptured groups. Univariate and multivariate logistic regression analyses were performed to determine the association between aneurysmal subarachnoid hemorrhage and antihyperglycemic agents and glycated hemoglobin levels. Propensity score weighting was used to account for selection bias. RESULTS: In both unweighted and weighted multivariate analysis, antihyperglycemic agent use was inversely and significantly associated with ruptured aneurysms (unweighted odds ratio, 0.58; 95% confidence interval, 0.39-0.87; weighted odds ratio, 0.57; 95% confidence interval, 0.34-0.96). In contrast, glycated hemoglobin levels were not significantly associated with rupture status. CONCLUSIONS: Antihyperglycemic agent use rather than hyperglycemia is associated with decreased risk of aneurysmal subarachnoid hemorrhage, suggesting a possible protective effect of glucose-lowering agents in the pathogenesis of aneurysm rupture.


Subjects
Ruptured Aneurysm, Glycated Hemoglobin/metabolism, Hypoglycemic Agents/administration & dosage, Intracranial Aneurysm, Subarachnoid Hemorrhage, Adult, Aged, Ruptured Aneurysm/blood, Ruptured Aneurysm/epidemiology, Ruptured Aneurysm/etiology, Ruptured Aneurysm/physiopathology, Female, Humans, Hypoglycemic Agents/adverse effects, Intracranial Aneurysm/blood, Intracranial Aneurysm/epidemiology, Intracranial Aneurysm/etiology, Intracranial Aneurysm/physiopathology, Male, Middle Aged, Risk Factors, Subarachnoid Hemorrhage/blood, Subarachnoid Hemorrhage/epidemiology, Subarachnoid Hemorrhage/etiology, Subarachnoid Hemorrhage/physiopathology
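
Note on the odds ratios and confidence intervals reported in entry 8: the study's estimates come from multivariable logistic regression, which is not reproduced here. As a simpler illustration of what an odds ratio with a 95% confidence interval means, the sketch below computes the unadjusted odds ratio from a 2x2 table with a Wald interval on the log scale. The function name and the counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a: exposed cases      b: exposed controls
    c: unexposed cases    d: unexposed controls
    All four counts must be positive.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only.
estimate, low, high = odds_ratio_ci(10, 20, 30, 40)
print(f"OR {estimate:.2f}, 95% CI {low:.2f}-{high:.2f}")
```

An interval that excludes 1, as in the antihyperglycemic agent results above, corresponds to a statistically significant association at the 5% level.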
9.
Stroke ; 49(7): 1747-1750, 2018 07.
Article in English | MEDLINE | ID: mdl-29844027

ABSTRACT

BACKGROUND AND PURPOSE: Both low serum calcium and magnesium levels have been associated with the extent of bleeding in patients with intracerebral hemorrhage, suggesting hypocalcemia- and hypomagnesemia-induced coagulopathy as a possible underlying mechanism. We hypothesized that serum albumin-corrected total calcium and magnesium levels are associated with ruptured intracranial aneurysms. METHODS: The medical records of 4701 patients, including 1201 prospective patients, diagnosed at the Brigham and Women's Hospital and Massachusetts General Hospital between 1990 and 2016 were reviewed and analyzed. One thousand two hundred seventy-five patients had available serum calcium, magnesium, and albumin values within 1 day of diagnosis. Individuals were divided into cases with ruptured aneurysms and controls with unruptured aneurysms. Univariable and multivariable logistic regression analyses were performed to determine the association between serum albumin-corrected total calcium and magnesium levels and ruptured aneurysms. RESULTS: In multivariable analysis, both albumin-corrected calcium (odds ratio, 0.33; 95% confidence interval, 0.27-0.40) and magnesium (odds ratio, 0.40; 95% confidence interval, 0.28-0.55) were significantly and inversely associated with ruptured intracranial aneurysms. CONCLUSIONS: In this large case-control study, hypocalcemia and hypomagnesemia at diagnosis were significantly associated with ruptured aneurysms. Impaired hemostasis caused by hypocalcemia and hypomagnesemia may explain this association.


Subjects
Ruptured Aneurysm/blood, Calcium/blood, Intracranial Aneurysm/blood, Magnesium/blood, Case-Control Studies, Female, Humans, Male, Prospective Studies
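
Note on the albumin-corrected calcium used in entry 9: the abstract does not state which correction formula was applied. A widely used convention adds 0.8 mg/dL of calcium for every 1 g/dL that albumin falls below 4.0 g/dL. The sketch below assumes that convention; the function name, the default constants, and the example values are assumptions for illustration, not details from the study.

```python
def albumin_corrected_calcium(total_calcium_mg_dl, albumin_g_dl,
                              reference_albumin_g_dl=4.0, factor=0.8):
    """Adjust total serum calcium for albumin using a common linear correction.

    corrected = measured + factor * (reference_albumin - measured_albumin)
    Units: calcium in mg/dL, albumin in g/dL.
    """
    return total_calcium_mg_dl + factor * (reference_albumin_g_dl - albumin_g_dl)


# Hypothetical laboratory values for illustration only.
print(round(albumin_corrected_calcium(8.0, 3.0), 2))
```

With a normal albumin the measured and corrected values coincide, so the correction only matters for patients with low or high albumin.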
10.
Stroke ; 49(5): 1148-1154, 2018 05.
Article in English | MEDLINE | ID: mdl-29622625

ABSTRACT

BACKGROUND AND PURPOSE: Growing evidence from experimental animal models and clinical studies suggests the protective effect of statin use against rupture of intracranial aneurysms; however, results from large studies detailing the relationship between intracranial aneurysm rupture and total cholesterol, HDL (high-density lipoprotein), LDL (low-density lipoprotein), and lipid-lowering agent use are lacking. METHODS: The medical records of 4701 patients with 6411 intracranial aneurysms diagnosed at the Massachusetts General Hospital and the Brigham and Women's Hospital between 1990 and 2016 were reviewed and analyzed. Patients were separated into ruptured and nonruptured groups. Univariable and multivariable logistic regression analyses were performed to determine the effects of lipids (total cholesterol, LDL, and HDL) and lipid-lowering medications on intracranial aneurysm rupture risk. Propensity score weighting was used to account for differences in baseline characteristics of the cohorts. RESULTS: Lipid-lowering agent use was significantly inversely associated with rupture status (odds ratio, 0.58; 95% confidence interval, 0.47-0.71). In a subgroup analysis of complete cases that includes both lipid-lowering agent use and lipid values, higher HDL levels (odds ratio, 0.95; 95% confidence interval, 0.93-0.98) and lipid-lowering agent use (odds ratio, 0.41; 95% confidence interval, 0.23-0.73) were both significantly and inversely associated with rupture status, whereas total cholesterol and LDL levels were not significant. A monotonic exposure-response curve between HDL levels and risk of aneurysmal rupture was obtained. CONCLUSIONS: Higher HDL values and the use of lipid-lowering agents are significantly inversely associated with ruptured intracranial aneurysms.


Subjects
Ruptured Aneurysm/epidemiology, HDL Cholesterol/blood, Hypolipidemic Agents/therapeutic use, Intracranial Aneurysm/epidemiology, Adult, Aged, Ruptured Aneurysm/blood, Benzimidazoles/therapeutic use, LDL Cholesterol/blood, Cholestyramine Resin/therapeutic use, Colestipol/therapeutic use, Ezetimibe/therapeutic use, Female, Fibric Acids/therapeutic use, Humans, Hydroxymethylglutaryl-CoA Reductase Inhibitors/therapeutic use, Intracranial Aneurysm/blood, Logistic Models, Male, Middle Aged, Multivariate Analysis, Odds Ratio, Oligonucleotides/therapeutic use, PCSK9 Inhibitors, Propensity Score, Protective Factors
11.
J Biomed Inform ; 69: 251-258, 2017 05.
Article in English | MEDLINE | ID: mdl-28438706

ABSTRACT

OBJECTIVE: This work investigates the problem of clinical coreference resolution in a model that explicitly tracks entities, and aims to measure the performance of that model in both traditional in-domain train/test splits and cross-domain experiments that measure the generalizability of learned models. METHODS: The two methods we compare are a baseline mention-pair coreference system that operates over pairs of mentions with best-first conflict resolution and a mention-synchronous system that incrementally builds coreference chains. We develop new features that incorporate distributional semantics, discourse features, and entity attributes. We use two new coreference datasets with similar annotation guidelines - the THYME colon cancer dataset and the DeepPhe breast cancer dataset. RESULTS: The mention-synchronous system performs similarly on in-domain data but performs much better on new data. Part of speech tag features prove superior in feature generalizability experiments over other word representations. Our methods show generalization improvement but there is still a performance gap when testing in new domains. DISCUSSION: Generalizability of clinical NLP systems is important and under-studied, so future work should attempt to perform cross-domain and cross-institution evaluations and explicitly develop features and training regimens that favor generalizability. A performance-optimized version of the mention-synchronous system will be included in the open source Apache cTAKES software.


Subjects
APACHE, Electronic Health Records, Natural Language Processing, Semantics, Software, Humans
12.
medRxiv ; 2024 Jun 29.
Article in English | MEDLINE | ID: mdl-38585973

ABSTRACT

OBJECTIVE: The application of Natural Language Processing (NLP) in the clinical domain is important due to the rich unstructured information in clinical documents, which often remains inaccessible in structured data. When applying NLP methods to a certain domain, the role of benchmark datasets is crucial as benchmark datasets not only guide the selection of best-performing models but also enable the assessment of the reliability of the generated outputs. Despite the recent availability of language models (LMs) capable of longer context, benchmark datasets targeting long clinical document classification tasks are absent. MATERIALS AND METHODS: To address this issue, we propose LCD benchmark, a benchmark for the task of predicting 30-day out-of-hospital mortality using discharge notes of MIMIC-IV and statewide death data. We evaluated this benchmark dataset using baseline models, from bag-of-words and CNN to instruction-tuned large language models. Additionally, we provide a comprehensive analysis of the model outputs, including manual review and visualization of model weights, to offer insights into their predictive capabilities and limitations. RESULTS AND DISCUSSION: In terms of F1 score, the best-performing supervised model reached 28.9% and GPT-4 reached 32.2%. Notes in our dataset have a median word count of 1687. Our analysis of the model outputs showed that our dataset is challenging for both models and human experts, but the models can find meaningful signals from the text. CONCLUSION: We expect our LCD benchmark to be a resource for the development of advanced supervised models, or prompting methods, tailored for clinical text. The benchmark dataset is available at https://github.com/Machine-Learning-for-Medical-Language/long-clinical-doc.

13.
medRxiv ; 2024 Apr 09.
Article in English | MEDLINE | ID: mdl-38562730

ABSTRACT

In the evolving landscape of clinical Natural Language Generation (NLG), assessing abstractive text quality remains challenging, as existing methods often overlook generative task complexities. This work aimed to examine the current state of automated evaluation metrics in NLG in healthcare. To have a robust and well-validated baseline with which to examine the alignment of these metrics, we created a comprehensive human evaluation framework. Employing ChatGPT-3.5-turbo generative output, we correlated human judgments with each metric. None of the metrics demonstrated high alignment; however, the SapBERT score, a Unified Medical Language System (UMLS)-based metric, showed the best results. This underscores the importance of incorporating domain-specific knowledge into evaluation efforts. Our work reveals the deficiency in quality evaluations for generated text and introduces our comprehensive human evaluation framework as a baseline. Future efforts should prioritize integrating medical knowledge databases to enhance the alignment of automated metrics, particularly focusing on refining the SapBERT score for improved assessments.

14.
J Clin Neurosci ; 126: 128-134, 2024 Jun 12.
Article in English | MEDLINE | ID: mdl-38870642

ABSTRACT

OBJECTIVE: Intracranial aneurysms (IA) and aortic aneurysms (AA) are both abnormal dilations of arteries with familial predisposition and have been proposed to share co-prevalence and pathophysiology. Associations of IA and non-aortic peripheral aneurysms are less well-studied. The goal of the study was to understand the patterns of aortic and peripheral (extracranial) aneurysms in patients with IA, and risk factors associated with the development of these aneurysms. METHODS: A total of 4701 patients were included in our retrospective analysis of all patients with intracranial aneurysms at our institution over the past 26 years. Patient demographics, comorbidities, and aneurysmal locations were analyzed. Univariate and multivariate analyses were performed to compare patients with and without extracranial aneurysms. RESULTS: A total of 3.4% of patients (161 of 4701) with IA had at least one extracranial aneurysm, and 2.8% had thoracic or abdominal aortic aneurysms. Age, male sex, hypertension, coronary artery disease, history of ischemic cerebral infarction, connective tissue disease, and family history of extracranial aneurysms in a first-degree relative were associated with the presence of extracranial aneurysms and a higher number of extracranial aneurysms. In addition, family history of extracranial aneurysms in a second-degree relative was associated with the presence of extracranial aneurysms, and atrial fibrillation was associated with a higher number of extracranial aneurysms. CONCLUSION: Significant comorbidities are associated with extracranial aneurysms in patients with IA. Family history of extracranial aneurysms has the strongest association and suggests that IA patients with a family history of extracranial aneurysms may benefit from screening.

15.
J Am Med Inform Assoc ; 31(6): 1291-1302, 2024 May 20.
Article in English | MEDLINE | ID: mdl-38587875

ABSTRACT

OBJECTIVE: The timely stratification of trauma injury severity can enhance the quality of trauma care but it requires intense manual annotation from certified trauma coders. The objective of this study is to develop machine learning models for the stratification of trauma injury severity across various body regions using clinical text and structured electronic health records (EHRs) data. MATERIALS AND METHODS: Our study utilized clinical documents and structured EHR variables linked with the trauma registry data to create 2 machine learning models with different approaches to representing text. The first one fuses concept unique identifiers (CUIs) extracted from free text with structured EHR variables, while the second one integrates free text with structured EHR variables. Temporal validation was undertaken to ensure the models' temporal generalizability. Additionally, analyses to assess the variable importance were conducted. RESULTS: Both models demonstrated impressive performance in categorizing leg injuries, achieving high accuracy with macro-F1 scores of over 0.8. Additionally, they showed considerable accuracy, with macro-F1 scores exceeding or near 0.7, in assessing injuries in the areas of the chest and head. We showed in our variable importance analysis that the most important features in the model have strong face validity in determining clinically relevant trauma injuries. DISCUSSION: The CUI-based model achieves comparable performance, if not higher, compared to the free-text-based model, with reduced complexity. Furthermore, integrating structured EHR data improves performance, particularly when the text modalities are insufficiently indicative. CONCLUSIONS: Our multi-modal, multiclass models can provide accurate stratification of trauma injury severity and clinically relevant interpretations.


Subjects
Electronic Health Records, Machine Learning, Wounds and Injuries, Humans, Wounds and Injuries/classification, Injury Severity Score, Registries, Trauma Severity Indices, Natural Language Processing
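
Note on the macro-F1 scores reported in entry 15: macro-F1 computes an F1 score separately for each class and then takes the unweighted mean, so rare severity levels count as much as common ones. The following is a minimal sketch of that calculation; the severity labels and predictions are hypothetical and are not data from the study.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denominator = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        per_class.append(2 * tp / denominator if denominator else 0.0)
    return sum(per_class) / len(per_class)


# Hypothetical severity labels for illustration only.
truth = ["minor", "minor", "severe", "severe"]
predicted = ["minor", "severe", "severe", "severe"]
print(round(macro_f1(truth, predicted), 3))
```

Because every class contributes equally, a model that ignores an uncommon injury category is penalized more under macro-F1 than under overall accuracy.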
16.
Proc Conf Assoc Comput Linguist Meet ; 2023: 313-319, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37780680

ABSTRACT

Understanding temporal relationships in text from electronic health records can be valuable for many important downstream clinical applications. Since Clinical TempEval 2017, there has been little work on end-to-end systems for temporal relation extraction, with most work focused on the setting where gold standard events and time expressions are given. In this work, we make use of a novel multi-headed attention mechanism on top of a pre-trained transformer encoder to allow the learning process to attend to multiple aspects of the contextualized embeddings. Our system achieves state-of-the-art results on the THYME corpus by a wide margin, in both the in-domain and cross-domain settings.

17.
Proc Conf Assoc Comput Linguist Meet ; 2023: 125-130, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37786810

ABSTRACT

Text in electronic health records is organized into sections, and classifying those sections into section categories is useful for downstream tasks. In this work, we attempt to improve the transferability of section classification models by combining the dataset-specific knowledge in supervised learning models with the world knowledge inside large language models (LLMs). Surprisingly, we find that zero-shot LLMs outperform supervised BERT-based models applied to out-of-domain data. We also find that their strengths are synergistic, so that a simple ensemble technique leads to additional performance gains.
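
The abstract does not say which ensemble technique was used. As one possible illustration, the sketch below averages per-label scores from a supervised classifier and an LLM and picks the highest combined score. The function `ensemble_predict`, the `weight` parameter, the section labels, and the assumption that both models expose per-label scores are all hypothetical.

```python
def ensemble_predict(supervised_probs, llm_probs, weight=0.5):
    """Weighted average of two per-label score dicts; returns the top label.

    `weight` is the share given to the supervised model (assumed, not from
    the paper). Labels missing from one model are scored as 0 for that model.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    labels = set(supervised_probs) | set(llm_probs)
    combined = {
        label: weight * supervised_probs.get(label, 0.0)
        + (1.0 - weight) * llm_probs.get(label, 0.0)
        for label in labels
    }
    # Sorting first makes tie-breaking deterministic (alphabetical order)
    return max(sorted(combined), key=combined.get)


# Hypothetical scores for one note section
supervised = {"history": 0.6, "medications": 0.3, "plan": 0.1}
llm = {"history": 0.2, "medications": 0.7, "plan": 0.1}
label = ensemble_predict(supervised, llm)
```

With equal weights the combined scores are 0.4 (history), 0.5 (medications), and 0.1 (plan), so the LLM's preference wins; shifting `weight` toward 1.0 restores the supervised model's choice.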

18.
Proc Conf Assoc Comput Linguist Meet ; 2023: 461-467, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37583489

ABSTRACT

The BioNLP Workshop 2023 launched a shared task on Problem List Summarization (ProbSum) in January 2023. The aim of this shared task is to attract future research efforts in building NLP models for real-world diagnostic decision support applications, where a system generating relevant and accurate diagnoses will augment the healthcare providers' decision-making process and improve the quality of care for patients. The goal for participants is to develop models that generate a list of diagnoses and problems using input from the daily care notes collected from the hospitalization of critically ill patients. Eight teams submitted their final systems to the shared task leaderboard. In this paper, we describe the tasks, datasets, evaluation metrics, and baseline systems. Additionally, we summarize the techniques used by the participating teams and the evaluation results of their different approaches.

19.
Proc Conf Assoc Comput Linguist Meet ; 2023(ClinicalNLP): 78-85, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37492270

ABSTRACT

Generative artificial intelligence (AI) is a promising direction for augmenting clinical diagnostic decision support and reducing diagnostic errors, a leading contributor to medical errors. To further the development of clinical AI systems, the Diagnostic Reasoning Benchmark (DR.BENCH) was introduced as a comprehensive generative AI framework, comprising six tasks representing key components in clinical reasoning. We present a comparative analysis of in-domain versus out-of-domain language models as well as multi-task versus single-task training with a focus on the problem summarization task in DR.BENCH (Gao et al., 2023). We demonstrate that a multi-task, clinically trained language model outperforms its general-domain counterpart by a large margin, establishing a new state-of-the-art performance, with a ROUGE-L score of 28.55. This research underscores the value of domain-specific training for optimizing clinical diagnostic reasoning tasks.
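
ROUGE-L, the metric cited above, scores a generated summary by the longest common subsequence (LCS) of tokens it shares with a reference. The sketch below computes an F1 variant with whitespace tokenization; both choices are assumptions, since the abstract does not describe the exact configuration, and the example strings are invented. It returns a value in [0, 1], whereas a figure such as 28.55 appears to be on a 0-100 scale.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """LCS-based F1 between a reference and a candidate summary."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)


# Invented problem lists, for illustration only
reference = "acute kidney injury sepsis respiratory failure"
candidate = "sepsis acute kidney injury"
score = rouge_l_f1(reference, candidate)
```

The LCS here is "acute kidney injury" (3 tokens), giving precision 3/4, recall 3/6, and an F1 of 0.6.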

20.
JAMIA Open ; 6(4): ooad109, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38144168

ABSTRACT

Objectives: To develop and externally validate machine learning models using structured and unstructured electronic health record data to predict postoperative acute kidney injury (AKI) across inpatient settings. Materials and Methods: Data for adult postoperative admissions to the Loyola University Medical Center (2009-2017) were used for model development and admissions to the University of Wisconsin-Madison (2009-2020) were used for validation. Structured features included demographics, vital signs, laboratory results, and nurse-documented scores. Unstructured text from clinical notes was converted into concept unique identifiers (CUIs) using the clinical Text Analysis and Knowledge Extraction System. The primary outcome was the development of Kidney Disease: Improving Global Outcomes stage 2 AKI within 7 days after leaving the operating room. We derived unimodal extreme gradient boosting machines (XGBoost) and elastic net logistic regression (GLMNET) models using structured-only data and multimodal models combining structured data with CUI features. Model comparison was performed using the area under the receiver operating characteristic curve (AUROC), with DeLong's test for statistical differences. Results: The study cohort included 138 389 adult patient admissions (mean [SD] age 58 [16] years; 11 506 [8%] African-American; and 70 826 [51%] female) across the 2 sites. Of those, 2959 (2.1%) developed stage 2 AKI or higher. Across all data types, XGBoost outperformed GLMNET (mean AUROC 0.81 [95% confidence interval (CI), 0.80-0.82] vs 0.78 [95% CI, 0.77-0.79]). The multimodal XGBoost model incorporating CUIs parameterized as term frequency-inverse document frequency (TF-IDF) showed the highest discrimination performance (AUROC 0.82 [95% CI, 0.81-0.83]) over unimodal models (AUROC 0.79 [95% CI, 0.78-0.80]). Discussion: A multimodality approach with structured data and TF-IDF weighting of CUIs increased model performance over structured data-only models. Conclusion: These findings highlight the predictive power of CUIs when merged with structured data for clinical prediction models, which may improve the detection of postoperative AKI.
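
To make the TF-IDF parameterization of CUIs mentioned above concrete, the sketch below weights each CUI by its relative frequency within one admission's notes times the log inverse of the fraction of admissions containing it. This is the textbook formulation, assumed here for illustration; the study's exact variant (smoothing, normalization) is not stated in the abstract, and the CUI strings and function name are hypothetical placeholders.

```python
import math
from collections import Counter


def tfidf(documents):
    """Return one {cui: weight} dict per document (a list of CUI strings).

    tf  = count of the CUI in the document / document length
    idf = ln(number of documents / number of documents containing the CUI)
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # count each CUI once per document
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        vectors.append(
            {
                cui: (count / total) * math.log(n_docs / doc_freq[cui])
                for cui, count in counts.items()
            }
        )
    return vectors


# Placeholder CUI strings; each inner list stands for one admission's notes
admissions = [
    ["C0000001", "C0000001", "C0000002"],
    ["C0000002"],
    ["C0000003", "C0000002"],
]
vectors = tfidf(admissions)
```

A CUI that appears in every admission (`C0000002` here) gets weight 0, while one concentrated in a single admission is weighted up. The resulting sparse vectors could then be joined with structured features before model fitting.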
