1.
medRxiv ; 2024 May 22.
Article in English | MEDLINE | ID: mdl-38826441

ABSTRACT

The consistent and persuasive evidence illustrating the influence of social determinants on health has prompted a growing realization throughout the health care sector that enhancing health and health equity will likely depend, at least to some extent, on addressing detrimental social determinants. However, detailed social determinants of health (SDoH) information is often buried within clinical narrative text in electronic health records (EHRs), necessitating natural language processing (NLP) methods to automatically extract these details. Most current NLP efforts for SDoH extraction have been limited: they investigate only a few types of SDoH elements, derive data from a single institution, or focus on specific patient cohorts or note types, with little attention to generalizability. This study aims to address these issues by creating cross-institutional corpora spanning different note types and healthcare systems, and developing and evaluating the generalizability of classification models, including novel large language models (LLMs), for detecting SDoH factors from diverse types of notes from four institutions: Harris County Psychiatric Center, University of Texas Physician Practice, Beth Israel Deaconess Medical Center, and Mayo Clinic. Four corpora of deidentified clinical notes were annotated with 21 SDoH factors at two levels: level 1 with SDoH factor types only and level 2 with SDoH factors along with associated values. Three traditional classification algorithms (XGBoost, TextCNN, Sentence BERT) and an instruction-tuned LLM-based approach (LLaMA) were developed to identify multiple SDoH factors. Substantial variation was noted in SDoH documentation practices and label distributions based on patient cohorts, note types, and hospitals. The LLM achieved top performance with micro-averaged F1 scores over 0.9 on level 1 annotated corpora and an F1 over 0.84 on level 2 annotated corpora. 
While models performed well when trained and tested on individual datasets, cross-dataset generalization highlighted remaining obstacles. To foster collaboration, access to partial annotated corpora and models trained by merging all annotated datasets will be made available on the PhysioNet repository.
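
For readers unfamiliar with the micro-averaged F1 reported in this abstract, the following is a minimal sketch of how that metric is commonly computed for multi-label document classification. It is not the authors' evaluation code; the function name `micro_f1` and the label names are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Set


def micro_f1(gold: Iterable[Set[str]], pred: Iterable[Set[str]]) -> float:
    """Micro-averaged F1: pool true/false positives and false negatives
    over all documents and labels, then compute a single F1."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # labels predicted and present in gold
        fp += len(p - g)   # labels predicted but absent from gold
        fn += len(g - p)   # gold labels the model missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical label sets for three notes (not the study's 21 SDoH factors).
gold = [{"housing", "employment"}, {"tobacco"}, set()]
pred = [{"housing"}, {"tobacco", "alcohol"}, set()]
score = micro_f1(gold, pred)
```

Because counts are pooled before the ratio is taken, frequent labels dominate the micro-average, which is one reason label distributions that differ across institutions can shift this score.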

2.
medRxiv ; 2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38712199

ABSTRACT

Background: Postoperative ileus (POI) after colorectal surgery leads to increased morbidity, costs, and hospital stays. Identifying POI risk for early intervention is important for improving surgical outcomes, especially given the increasing trend towards early discharge after surgery. While existing studies have assessed POI risk with regression models, the role of deep learning remains unexplored. Methods: We assessed the performance and transferability (brute-force/instance/parameter transfer) of Gated Recurrent Unit with Decay (GRU-D), a longitudinal deep learning architecture, for real-time risk assessment of POI among 7,349 colorectal surgeries performed across three hospital sites operated by Mayo Clinic with two electronic health records (EHR) systems. The results were compared with atemporal models on a panel of benchmark metrics. Results: GRU-D exhibits robust transferability across different EHR systems and hospital sites, showing enhanced performance by integrating new measurements, even amid the extreme sparsity of real-world longitudinal data. On average, for labs, vitals, and assisted living status, 72.2%, 26.9%, and 49.3% respectively lack measurements within 24 hours after surgery. Over the follow-up period with 4-hour intervals, 98.7%, 84%, and 95.8% of data points are missing, respectively. A maximum of 5% decrease in AUROC was observed in brute-force transfer between different EHR systems with non-overlapping surgery date frames. Multi-source instance transfer witnessed the best performance, with a maximum of 2.6% improvement in AUROC over local learning. The significant benefit, however, lies in the reduction of variance (a maximum of 86% decrease). The GRU-D model's performance mainly depends on the prediction task's difficulty, especially the case prevalence rate, whereas the impact of training data and transfer strategy is less crucial, underscoring the challenge of effectively leveraging transfer learning for rare outcomes. 
While atemporal Logit models show notably superior performance at certain pre-surgical points, their performance fluctuates significantly, and they generally underperform GRU-D in post-surgical hours. Conclusion: GRU-D demonstrated robust transferability across EHR systems and hospital sites with highly sparse real-world EHR data. Further research on built-in explainability for meaningful intervention would be highly valuable for its integration into clinical practice.

3.
J Am Med Inform Assoc ; 31(7): 1493-1502, 2024 Jun 20.
Article in English | MEDLINE | ID: mdl-38742455

ABSTRACT

BACKGROUND: Error analysis plays a crucial role in clinical concept extraction, a fundamental subtask within clinical natural language processing (NLP). The process typically involves a manual review of error types, such as contextual and linguistic factors contributing to their occurrence, and the identification of underlying causes to refine the NLP model and improve its performance. Conducting error analysis can be complex, requiring a combination of NLP expertise and domain-specific knowledge. Due to the high heterogeneity of electronic health record (EHR) settings across different institutions, challenges may arise when attempting to standardize and reproduce the error analysis process. OBJECTIVES: This study aims to facilitate a collaborative effort to establish common definitions and taxonomies for capturing diverse error types, fostering community consensus on error analysis for clinical concept extraction tasks. MATERIALS AND METHODS: We iteratively developed and evaluated an error taxonomy based on existing literature, standards, real-world data, multisite case evaluations, and community feedback. The finalized taxonomy was released in both .dtd and .owl formats at the Open Health Natural Language Processing Consortium. The taxonomy is compatible with several different open-source annotation tools, including MAE, Brat, and MedTator. RESULTS: The resulting error taxonomy comprises 43 distinct error classes, organized into 6 error dimensions and 4 properties, including model type (symbolic and statistical machine learning), evaluation subject (model and human), evaluation level (patient, document, sentence, and concept), and annotation examples. Internal and external evaluations revealed strong variations in error types across methodological approaches, tasks, and EHR settings. Key points emerged from community feedback, including the need to enhance the clarity, generalizability, and usability of the taxonomy, along with dissemination strategies. 
CONCLUSION: The proposed taxonomy can facilitate the acceleration and standardization of the error analysis process in multi-site settings, thus improving the provenance, interpretability, and portability of NLP models. Future researchers could explore the potential direction of developing automated or semi-automated methods to assist in the classification and standardization of error analysis.


Subject(s)
Electronic Health Records , Natural Language Processing , Electronic Health Records/classification , Humans , Classification/methods , Medical Errors/classification
4.
J Phys Ther Educ ; 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38640081

ABSTRACT

INTRODUCTION: Letters of recommendation (LOR) are an integral component of physical therapy residency applications. Identifying the influence of applicant and writer gender in LOR will help identify whether potential implicit gender bias exists in physical therapy residency application processes. REVIEW OF LITERATURE: Several medical and surgical residency education programs have reported positive, neutral, or negative LOR female gender bias among applicants and writers. Little research exists on gender differences in LOR to physical therapy education programs or physical therapy residency programs. SUBJECTS: Seven hundred sixty-eight LOR were analyzed from 256 applications to 3 physical therapy residency programs (neurologic, orthopaedic, sports) at one institution from 2014 to 2020. METHODS: Thematic categories were developed to identify themes in a sample of LOR. Associations between writer and applicant gender were analyzed using summary statistics, word counts, thematic and psycholinguistic extraction, and rule-based and deep learning Natural Language Processing. RESULTS: No significant difference in LOR word counts was found based on writer or applicant gender. Increased word counts were seen in sports residency LOR compared with the orthopaedic residency. Thematic analysis showed LOR gender differences, with male applicants receiving more positive generalized recommendations and female applicants receiving more comments regarding interpersonal relationship skills. No thematic or psycholinguistic gender differences were seen by LOR writer. Male applicants were 1.9 times more likely to select all male LOR writers, whereas female applicants were 2.1 times more likely to choose all female LOR writers. 
DISCUSSION AND CONCLUSION: Gender differences in LORs for physical therapy residencies were found using a comprehensive Natural Language Processing approach that identified both a positive recommendation male applicant gender bias and a positive interpersonal relationship skill female applicant gender bias. Applicants were neither harmed nor helped by selecting LOR writers of the opposite gender. Admissions committees and LOR writers should be mindful of potential implicit gender biases in LOR submitted to physical therapy residency programs.

5.
J Biomed Inform ; 152: 104623, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38458578

ABSTRACT

INTRODUCTION: Patients' functional status assesses their independence in performing activities of daily living (ADLs), including basic ADLs (bADL) and more complex instrumental activities (iADL). Existing studies have discovered that patients' functional status is a strong predictor of health outcomes, particularly in older adults. Despite its usefulness, much of the functional status information is stored in electronic health records (EHRs) in either semi-structured or free-text formats. This indicates the pressing need to leverage computational approaches such as natural language processing (NLP) to accelerate the curation of functional status information. In this study, we introduced FedFSA, a hybrid and federated NLP framework designed to extract functional status information from EHRs across multiple healthcare institutions. METHODS: FedFSA consists of four major components: 1) individual sites (clients) with their private local data, 2) a rule-based information extraction (IE) framework for ADL extraction, 3) a BERT model for functional status impairment classification, and 4) a concept normalizer. The framework was implemented using the OHNLP Backbone for rule-based IE and the open-source Flower and PyTorch libraries for federated BERT components. For gold standard data generation, we carried out corpus annotation to identify functional status-related expressions based on ICF definitions. Four healthcare institutions were included in the study. To assess FedFSA, we evaluated the performance of category- and institution-specific ADL extraction across different experimental designs. RESULTS: ADL extraction performance ranges from an F1-score of 0.907 to 0.986 for bADL and 0.825 to 0.951 for iADL across the four healthcare sites. The performance for ADL extraction with impairment ranges from an F1-score of 0.722 to 0.954 for bADL and 0.674 to 0.813 for iADL across four healthcare sites. 
For category-specific ADL extraction, laundry and transferring yielded relatively high performance, while dressing, medication, bathing, and continence achieved moderate-high performance. Conversely, food preparation and toileting showed low performance. CONCLUSION: NLP performance varied across ADL categories and healthcare sites. Federated learning using the FedFSA framework outperformed non-federated learning for impaired ADL extraction at all healthcare sites. Our study demonstrated the potential of the federated learning framework in functional status extraction and impairment classification in EHRs, exemplifying the importance of a large-scale, multi-institutional collaborative development effort.


Subject(s)
Activities of Daily Living , Functional Status , Humans , Aged , Learning , Information Storage and Retrieval , Natural Language Processing
6.
Aging Dis ; 2024 Feb 19.
Article in English | MEDLINE | ID: mdl-38421836

ABSTRACT

Covert cerebrovascular disease (CCD) is frequently reported on neuroimaging and associates with increased dementia and stroke risk. We aimed to determine how incidentally-discovered CCD during clinical neuroimaging in a large population associates with mortality. We screened CT and MRI reports of adults aged ≥50 in the Kaiser Permanente Southern California health system who underwent neuroimaging for a non-stroke clinical indication from 2009-2019. Natural language processing identified incidental covert brain infarcts (CBI) and/or white matter hyperintensities (WMH), grading WMH as mild/moderate/severe. Models adjusted for age, sex, ethnicity, multimorbidity, vascular risks, depression, exercise, and imaging modality. Of n=241,028, the mean age was 64.9 (SD=10.4); mean follow-up 4.46 years; 178,554 (74.1%) had CT; 62,474 (25.9%) had MRI; 11,328 (4.7%) had CBI; and 69,927 (29.0%) had WMH. The mortality rate per 1,000 person-years with CBI was 59.0 (95%CI 57.0-61.1); with WMH=46.5 (45.7-47.2); with neither=17.4 (17.1-17.7). In adjusted models, mortality risk associated with CBI was modified by age, e.g. HR 1.34 [1.21-1.48] at age 56.1 years vs HR 1.22 [1.17-1.28] at age 72 years. Mortality associated with WMH was modified by both age and imaging modality e.g., WMH on MRI at age 56.1 HR = 1.26 [1.18-1.35]; WMH on MRI at age 72 HR 1.15 [1.09-1.21]; WMH on CT at age 56.1 HR 1.41 [1.33-1.50]; WMH on CT at age 72 HR 1.28 [1.24-1.32], vs. patients without CBI or without WMH, respectively. Increasing WMH severity associated with higher mortality, e.g. mild WMH on MRI had adjusted HR=1.13 [1.06-1.20] while severe WMH on CT had HR=1.45 [1.33-1.59]. Incidentally-detected CBI and WMH on population-based clinical neuroimaging can predict higher mortality rates. We need treatments and healthcare planning for individuals with CCD.

7.
Stud Health Technol Inform ; 310: 850-854, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38269929

ABSTRACT

With an increasing number of people living with dementia, late diagnosis significantly impacts a person's quality of life, while early signs of dementia may provide useful insights to facilitate better treatment plans. With time, this progressive neurodegenerative syndrome could progress from mild cognitive impairment to dementia. A pattern of health conditions can be characterized in an unsupervised manner to help predict this progress. As a significant extension to our previous work with a streaming clustering model, we consider additional information for predicting dementia onset. Through empirical observations, we discover the importance of examining sex and age to predict dementia onset. To this end, we propose a sex-specific model with an age constraint for predicting dementia onset and validate the effectiveness of our models using data from the Mayo Clinic Study of Aging (MCSA). The proposed sex-specific models for older adult populations (>=65 years of age) outperformed the previous models, with F-scores of 77% and 78% for male-specific and female-specific models, respectively. Our experiments on sex-specific temporal clustering of features in older adults demonstrate the potential of more personalized models for early alerts of dementia.


Subject(s)
Cognitive Dysfunction , Dementia , Humans , Female , Male , Aged , Quality of Life , Aging , Cluster Analysis , Cognitive Dysfunction/diagnosis , Dementia/diagnosis
8.
World Neurosurg ; 183: e243-e249, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38103686

ABSTRACT

BACKGROUND: Many predictive models for estimating clinical outcomes after spine surgery have been reported in the literature. However, implementation of predictive scores in practice is limited by the time-intensive nature of manually abstracting relevant predictors. In this study, we designed natural language processing (NLP) algorithms to automate data abstraction for the thoracolumbar injury classification score (TLICS). METHODS: We retrieved the radiology reports of all Mayo Clinic patients with an International Classification of Diseases, 9th or 10th revision, code corresponding to a fracture of the thoracolumbar spine between January 2005 and October 2020. Annotated data were used to train an N-gram NLP model using machine learning methods, including random forest, stepwise linear discriminant analysis, k-nearest neighbors, and penalized logistic regression models. RESULTS: A total of 1085 spine radiology reports were included in our analysis. Our dataset included 483 compression, 401 burst, 103 translational/rotational, and 98 distraction fractures. A total of 103 reports had documented an injury of the posterior ligamentous complex. The overall accuracy of the random forest model for fracture morphology feature detection was 76.96% versus 65.90% in the stepwise linear discriminant analysis, 50.69% in the k-nearest neighbors, and 62.67% in the penalized logistic regression. The overall accuracy to detect posterior ligamentous complex integrity was highest in the random forest model at 83.41%. Our random forest model was implemented in the backend of a web application in which users can dictate reports and have TLICS features automatically extracted. CONCLUSIONS: We have developed a machine learning NLP model for extracting TLICS features from radiology reports, which we deployed in a web application that can be integrated into clinical practice.


Subject(s)
Fractures, Bone , Radiology , Humans , Natural Language Processing , Voice Recognition , Lumbar Vertebrae/diagnostic imaging , Lumbar Vertebrae/injuries , Thoracic Vertebrae/diagnostic imaging , Thoracic Vertebrae/injuries
9.
Cerebrovasc Dis ; 2023 Nov 07.
Article in English | MEDLINE | ID: mdl-37935160

ABSTRACT

BACKGROUND: Covert cerebrovascular disease (CCD) includes white matter disease (WMD) and covert brain infarction (CBI). Incidentally-discovered CCD is associated with increased risk of subsequent symptomatic stroke. However, it is unknown whether the severity of WMD or the location of CBI predicts risk. OBJECTIVES: To examine the association of incidentally-discovered WMD severity and CBI location with risk of subsequent symptomatic stroke. METHOD: This retrospective cohort study includes patients ≥50 years old in the Kaiser Permanente Southern California health system who received neuroimaging for a non-stroke indication between 2009 and 2019. Incidental CBI and WMD were identified via natural language processing of the neuroimaging report, and WMD severity was classified into grades. RESULTS: 261,960 patients received neuroimaging; 78,555 (30.0%) were identified to have incidental WMD, and 12,857 (4.9%) to have incidental CBI. Increasing WMD severity is associated with increased incidence rate of future stroke. However, the stroke incidence rate in CT-identified WMD is higher at each level of severity compared to rates in MRI-identified WMD. Patients with mild WMD via CT have a stroke incidence rate of 24.9 per 1,000 person-years, similar to that of patients with severe WMD via MRI. Among incidentally-discovered CBI patients with a determined CBI location, 97.9% are subcortical rather than cortical infarcts. CBI confers a similar risk of future stroke, whether cortical or subcortical, or whether MRI- or CT-detected. CONCLUSIONS: Increasing severity of incidental WMD is associated with an increased risk of future symptomatic stroke, dependent on the imaging modality. Subcortical and cortical CBI conferred similar risks.

10.
J Clin Transl Sci ; 7(1): e187, 2023.
Article in English | MEDLINE | ID: mdl-37745932

ABSTRACT

Introduction: We tested the ability of our natural language processing (NLP) algorithm to identify delirium episodes in a large-scale study using real-world clinical notes. Methods: We used the Rochester Epidemiology Project to identify persons ≥ 65 years who were hospitalized between 2011 and 2017. We identified all persons with an International Classification of Diseases code for delirium within ±14 days of a hospitalization. We independently applied our NLP algorithm to all clinical notes for this same population. We calculated rates using number of delirium episodes as the numerator and number of hospitalizations as the denominator. Rates were estimated overall, by demographic characteristics, and by year of episode, and differences were tested using Poisson regression. Results: In total, 14,255 persons had 37,554 hospitalizations between 2011 and 2017. The code-based delirium rate was 3.02 per 100 hospitalizations (95% CI: 2.85, 3.20). The NLP-based rate was 7.36 per 100 (95% CI: 7.09, 7.64). Rates increased with age (both p < 0.0001). Code-based rates were higher in men compared to women (p = 0.03), but NLP-based rates were similar by sex (p = 0.89). Code-based rates were similar by race and ethnicity, but NLP-based rates were higher in the White population compared to the Black and Asian populations (p = 0.001). Both types of rates increased significantly over time (both p values < 0.001). Conclusions: The NLP algorithm identified more delirium episodes compared to the ICD code method. However, NLP may still underestimate delirium cases because of limitations in real-world clinical notes, including incomplete documentation, practice changes over time, and missing clinical notes in some time periods.
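
The rate definition in this abstract (delirium episodes as the numerator, hospitalizations as the denominator) can be sketched as follows. This is only an illustration: the abstract reports rates and tests differences with Poisson regression, whereas the sketch uses a simple normal approximation to a Poisson count for the interval, which is an assumption and need not reproduce the published confidence limits. The episode count of 1,134 is also an assumption, back-calculated from the reported 3.02 per 100 and 37,554 hospitalizations; it is not stated in the abstract.

```python
import math
from typing import Tuple


def rate_per_100(events: int, denominator: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Return (rate, lower, upper) per 100 units of the denominator.

    The interval treats the event count as Poisson and uses a normal
    approximation: count +/- z * sqrt(count).
    """
    rate = 100 * events / denominator
    half_width = 100 * z * math.sqrt(events) / denominator
    return rate, rate - half_width, rate + half_width


# Assumed count (not given in the abstract); the denominator is from the abstract.
rate, low, high = rate_per_100(events=1134, denominator=37554)
```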

11.
J Alzheimers Dis ; 95(3): 931-940, 2023.
Article in English | MEDLINE | ID: mdl-37638438

ABSTRACT

BACKGROUND: Multiple algorithms with variable performance have been developed to identify dementia using combinations of billing codes and medication data that are widely available from electronic health records (EHR). If the characteristics of misclassified patients are clearly identified, modifying existing algorithms to improve performance may be possible. OBJECTIVE: To examine the performance of a code-based algorithm to identify dementia cases in the population-based Mayo Clinic Study of Aging (MCSA) where dementia diagnosis (i.e., reference standard) is actively assessed through routine follow-up and describe the characteristics of persons incorrectly categorized. METHODS: There were 5,316 participants (age at baseline (mean (SD)): 73.3 (9.68) years; 50.7% male) without dementia at baseline and available EHR data. ICD-9/10 codes and prescription medications for dementia were extracted between baseline and one year after an MCSA dementia diagnosis or last follow-up. Fisher's exact or Kruskal-Wallis tests were used to compare characteristics between groups. RESULTS: Algorithm sensitivity and specificity were 0.70 (95% CI: 0.67, 0.74) and 0.95 (95% CI: 0.95, 0.96). False positives (i.e., participants falsely diagnosed with dementia by the algorithm) were older, with higher Charlson comorbidity index, more likely to have mild cognitive impairment (MCI), and longer follow-up (versus true negatives). False negatives (versus true positives) were older, more likely to have MCI, or have more functional limitations. CONCLUSIONS: We observed a moderate-high performance of the code-based diagnosis method against the population-based MCSA reference standard dementia diagnosis. Older participants and those with MCI at baseline were more likely to be misclassified.
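
As a reminder of how the sensitivity and specificity quoted in this abstract relate to the four misclassification groups it describes (true/false positives and negatives), here is a minimal sketch. The counts are hypothetical values chosen only so that the ratios come out to 0.70 and 0.95; the abstract does not report the underlying confusion matrix.

```python
from typing import Tuple


def sensitivity_specificity(tp: int, fp: int, tn: int, fn: int) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts: 100 reference-standard dementia cases, 100 non-cases.
sens, spec = sensitivity_specificity(tp=70, fp=5, tn=95, fn=30)
```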


Subject(s)
Alzheimer Disease , Cognitive Aging , Cognitive Dysfunction , Dementia , Humans , Male , Female , Dementia/diagnosis , Dementia/epidemiology , Alzheimer Disease/diagnosis , Disease Progression , Cognitive Dysfunction/diagnosis , Cognitive Dysfunction/epidemiology
12.
J Am Med Inform Assoc ; 30(12): 2036-2040, 2023 Nov 17.
Article in English | MEDLINE | ID: mdl-37555837

ABSTRACT

Despite recent methodological advancements in clinical natural language processing (NLP), the adoption of clinical NLP models within the translational research community remains hindered by process heterogeneity and human factor variations. Concurrently, these factors also dramatically increase the difficulty of developing NLP models in multi-site settings, which is necessary for algorithm robustness and generalizability. Here, we report on our experience developing an NLP solution for Coronavirus Disease 2019 (COVID-19) sign and symptom extraction in an open NLP framework from a subset of sites participating in the National COVID Cohort Collaborative (N3C). We then empirically highlight the benefits of multi-site data for both symbolic and statistical methods, as well as the need for federated annotation and evaluation to resolve several pitfalls encountered in the course of these efforts.


Subject(s)
COVID-19 , Natural Language Processing , Humans , Electronic Health Records , Algorithms
13.
NPJ Digit Med ; 6(1): 132, 2023 Jul 21.
Article in English | MEDLINE | ID: mdl-37479735

ABSTRACT

Clinical phenotyping is often a foundational requirement for obtaining datasets necessary for the development of digital health applications. Traditionally done via manual abstraction, this task is often a bottleneck in development due to time and cost requirements, therefore raising significant interest in accomplishing this task via in-silico means. Nevertheless, current in-silico phenotyping development tends to be focused on a single phenotyping task resulting in a dearth of reusable tools supporting cross-task generalizable in-silico phenotyping. In addition, in-silico phenotyping remains largely inaccessible for a substantial portion of potentially interested users. Here, we highlight the barriers to the usage of in-silico phenotyping and potential solutions in the form of a framework of several desiderata as observed during our implementation of such tasks. In addition, we introduce an example implementation of said framework as a software application, with a focus on ease of adoption, cross-task reusability, and facilitating the clinical phenotyping algorithm development process.

14.
JMIR Med Inform ; 11: e48072, 2023 Jun 27.
Article in English | MEDLINE | ID: mdl-37368483

ABSTRACT

BACKGROUND: A patient's family history (FH) information significantly influences downstream clinical care. Despite this importance, there is no standardized method to capture FH information in electronic health records and a substantial portion of FH information is frequently embedded in clinical notes. This renders FH information difficult to use in downstream data analytics or clinical decision support applications. To address this issue, a natural language processing system capable of extracting and normalizing FH information can be used. OBJECTIVE: In this study, we aimed to construct an FH lexical resource for information extraction and normalization. METHODS: We exploited a transformer-based method to construct an FH lexical resource leveraging a corpus consisting of clinical notes generated as part of primary care. The usability of the lexicon was demonstrated through the development of a rule-based FH system that extracts FH entities and relations as specified in previous FH challenges. We also experimented with a deep learning-based FH system for FH information extraction. Previous FH challenge data sets were used for evaluation. RESULTS: The resulting lexicon contains 33,603 lexicon entries normalized to 6408 concept unique identifiers of the Unified Medical Language System and 15,126 codes of the Systematized Nomenclature of Medicine Clinical Terms, with an average number of 5.4 variants per concept. The performance evaluation demonstrated that the rule-based FH system achieved reasonable performance. The combination of the rule-based FH system with a state-of-the-art deep learning-based FH system can improve the recall of FH information evaluated using the BioCreative/N2C2 FH challenge data set, with the F1 score varied but comparable. CONCLUSIONS: The resulting lexicon and rule-based FH system are freely available through the Open Health Natural Language Processing GitHub.

15.
AMIA Jt Summits Transl Sci Proc ; 2023: 196-205, 2023.
Article in English | MEDLINE | ID: mdl-37350914

ABSTRACT

Gender stereotyping is the practice of assigning or ascribing specific characteristics, differences, or identities to a person solely based on their gender. Biased conceptions of gender can create barriers to equality and need to be proactively identified and addressed. In biomedical education, letters of recommendation (LOR) are considered an important source for evaluating candidates' past performance. Because LOR are subjective and have no standard formatting requirements for the writer, potential language bias can be introduced. Natural language processing (NLP) offers a promising solution to detect language bias in LOR through automatic extraction of sensitive language and identification of letters with strong biases. In our study, we developed, evaluated, and deployed four different NLP methods (sublanguage analysis, dictionary-based approach, rule-based approach, and deep learning approach) for the extraction of psycholinguistic and thematic characteristics in LORs from three different physical therapy residency programs (Neurologic, Orthopaedic, and Sport) at Mayo Clinic. The evaluation statistics suggest that both the MedTaggerIE model and the Bidirectional Encoder Representations from Transformers model achieved moderate-high performance across eight different thematic categories. Through the pilot demonstration study, we learned that male writers were more likely to use the words 'intelligence', 'exceptional', and 'pursue' and male applicants were more likely to have the words 'strength', 'interpersonal skills', 'conversations', and 'pursue' in their letters of recommendation. Thematic analysis suggested that male and female writers have significant differences in expressing doubt, motivation, and recommendation. Findings derived from the study need to be carefully interpreted based on the context of the study setting, residency programs, and data. A follow-up demonstration study is needed to further evaluate and interpret the findings.

16.
J Am Med Inform Assoc ; 30(9): 1465-1473, 2023 Aug 18.
Article in English | MEDLINE | ID: mdl-37301740

ABSTRACT

OBJECTIVE: Social determinants of health (SDoH) play critical roles in health outcomes and well-being. Understanding the interplay of SDoH and health outcomes is critical to reducing healthcare inequalities and transforming a "sick care" system into a "health-promoting" system. To address the SDoH terminology gap and better embed relevant elements in advanced biomedical informatics, we propose an SDoH ontology (SDoHO), which represents fundamental SDoH factors and their relationships in a standardized and measurable way. MATERIAL AND METHODS: Drawing on the content of existing ontologies relevant to certain aspects of SDoH, we used a top-down approach to formally model classes, relationships, and constraints based on multiple SDoH-related resources. Expert review and coverage evaluation, using a bottom-up approach employing clinical notes data and a national survey, were performed. RESULTS: We constructed the SDoHO with 708 classes, 106 object properties, and 20 data properties, with 1,561 logical axioms and 976 declaration axioms in the current version. Three experts achieved 0.967 agreement in the semantic evaluation of the ontology. A comparison between the coverage of the ontology and SDoH concepts in 2 sets of clinical notes and a national survey instrument also showed satisfactory results. DISCUSSION: SDoHO could potentially play an essential role in providing a foundation for a comprehensive understanding of the associations between SDoH and health outcomes and paving the way for health equity across populations. CONCLUSION: SDoHO has well-designed hierarchies, practical objective properties, and versatile functionalities, and the comprehensive semantic and coverage evaluation achieved promising performance compared to the existing ontologies relevant to SDoH.


Subject(s)
Health Equity , Social Determinants of Health , Humans , Semantics , Healthcare Disparities
17.
PLoS One ; 18(3): e0283800, 2023.
Article in English | MEDLINE | ID: mdl-37000801

ABSTRACT

BACKGROUND: The incorporation of information from clinical narratives is critical for computational phenotyping. The accurate interpretation of clinical terms highly depends on their associated context, especially the corresponding clinical section information. However, the heterogeneity across different Electronic Health Record (EHR) systems poses challenges in utilizing the section information. OBJECTIVES: Leveraging the eMERGE heart failure (HF) phenotyping algorithm, we assessed the heterogeneity quantitatively through the performance comparison of machine learning (ML) classifiers which map clinical sections containing HF-relevant terms across different EHR systems to standard sections in Health Level 7 (HL7) Clinical Document Architecture (CDA). METHODS: We experimented with both random forest models with sentence-embedding features and bidirectional encoder representations from transformers models. We trained the ML models using an automatically labeled corpus from an EHR system that adopted the HL7 CDA standard. We assessed the performance using a blind test set (n = 300) from the same EHR system and a gold standard (n = 900) manually annotated from three other EHR systems. RESULTS: The F-measure of those ML models varied widely (0.00-0.91%), indicating that ML models with one tuning parameter set were insufficient to capture sections across different EHR systems. The error analysis indicates that the section does not always comply with the corresponding standardized sections, leading to low performance. CONCLUSIONS: We presented the potential use of ML techniques to map the sections containing HF-relevant terms in multiple EHR systems to standard sections. However, the findings suggested that the quality and heterogeneity of section structure across different EHRs affect applications due to the poor adoption of documentation standards.


Subject(s)
Electronic Health Records , Heart Failure , Humans , Software , Algorithms , Machine Learning
18.
medRxiv ; 2023 Feb 01.
Article in English | MEDLINE | ID: mdl-36747787

ABSTRACT

Heart failure management is challenging due to the complex and heterogeneous nature of its pathophysiology, which makes conventional treatments based on the "one size fits all" ideology unsuitable. Coupling longitudinal medical data with novel deep learning and network-based analytics will enable identifying distinct patient phenotypic characteristics to help individualize the treatment regimen through accurate prediction of the physiological response. In this study, we develop a graph representation learning framework that integrates the heterogeneous clinical events in the electronic health records (EHR) as graph format data, in which the patient-specific patterns and features are naturally infused for personalized predictions of lab test response. The framework includes a novel Graph Transformer Network that is equipped with a self-attention mechanism to model the underlying spatial interdependencies among the clinical events characterizing the cardiac physiological interactions in the heart failure treatment, and a graph neural network (GNN) layer to incorporate the explicit temporality of each clinical event. The latter helps summarize the therapeutic effects induced on the physiological variables, and subsequently on the patient's health status as the heart failure condition progresses over time. We introduce a global attention mask that is computed based on event co-occurrences and is aggregated across all patient records to enhance the guidance of neighbor selection in graph representation learning. We test the feasibility of our model through detailed quantitative and qualitative evaluations on observational EHR data.

19.
Arch Bone Jt Surg ; 11(1): 1-11, 2023.
Article in English | MEDLINE | ID: mdl-36793660

ABSTRACT

Background: Knee osteoarthritis (OA) is a prevalent joint disease. Clinical prediction models consider a wide range of risk factors for knee OA. This review aimed to evaluate published prediction models for knee OA and identify opportunities for future model development. Methods: We searched Scopus, PubMed, and Google Scholar using the terms knee osteoarthritis, prediction model, deep learning, and machine learning. All the identified articles were reviewed by one of the researchers and we recorded information on methodological characteristics and findings. We only included articles that were published after 2000 and reported a knee OA incidence or progression prediction model. Results: We identified 26 models of which 16 employed traditional regression-based models and 10 machine learning (ML) models. Four traditional and five ML models relied on data from the Osteoarthritis Initiative. There was significant variation in the number and type of risk factors. The median sample size for traditional and ML models was 780 and 295, respectively. The reported Area Under the Curve (AUC) ranged between 0.6 and 1.0. Regarding external validation, 6 of the 16 traditional models and only 1 of the 10 ML models validated their results in an external data set. Conclusion: Diverse use of knee OA risk factors, small, non-representative cohorts, and use of magnetic resonance imaging which is not a routine evaluation tool of knee OA in daily clinical practice are some of the main limitations of current knee OA prediction models.

20.
Cerebrovasc Dis ; 52(1): 117-122, 2023.
Article in English | MEDLINE | ID: mdl-35760063

ABSTRACT

BACKGROUND: Covert cerebrovascular disease (CCD) includes white matter disease (WMD) and covert brain infarction (CBI). Incidentally discovered CCD is associated with increased risk of subsequent symptomatic stroke. However, it is unknown whether the severity of WMD or the location of CBI predicts risk. OBJECTIVES: The aim of this study was to examine the association of incidentally discovered WMD severity and CBI location with risk of subsequent symptomatic stroke. METHOD: This retrospective cohort study includes patients aged ≥50 years in the Kaiser Permanente Southern California health system who received neuroimaging for a nonstroke indication between 2009 and 2019. Incidental CBI and WMD were identified via natural language processing of the neuroimaging report, and WMD severity was classified into grades. RESULTS: A total of 261,960 patients received neuroimaging; 78,555 patients (30.0%) were identified to have incidental WMD and 12,857 patients (4.9%) to have incidental CBI. Increasing WMD severity is associated with an increased incidence rate of future stroke. However, the stroke incidence rate in CT-identified WMD is higher at each level of severity compared to rates in MRI-identified WMD. Patients with mild WMD via CT have a stroke incidence rate of 24.9 per 1,000 person-years, similar to that of patients with severe WMD via MRI. Among incidentally discovered CBI patients with a determined CBI location, 97.9% are subcortical rather than cortical infarcts. CBI confers a similar risk of future stroke, whether cortical or subcortical or whether MRI- or CT-detected. CONCLUSIONS: Increasing severity of incidental WMD is associated with an increased risk of future symptomatic stroke, dependent on the imaging modality. Subcortical and cortical CBI conferred similar risks.


Subject(s)
Cerebrovascular Disorders , Leukoencephalopathies , Stroke , White Matter , Humans , Middle Aged , Retrospective Studies , Brain Infarction , Stroke/diagnostic imaging , Stroke/epidemiology , Cerebrovascular Disorders/complications , Leukoencephalopathies/diagnostic imaging , Leukoencephalopathies/epidemiology , Leukoencephalopathies/complications , Magnetic Resonance Imaging/methods , White Matter/diagnostic imaging