1.
medRxiv ; 2024 May 22.
Article in English | MEDLINE | ID: mdl-38826441

ABSTRACT

The consistent and persuasive evidence illustrating the influence of social determinants on health has prompted a growing realization throughout the health care sector that enhancing health and health equity will likely depend, at least to some extent, on addressing detrimental social determinants. However, detailed social determinants of health (SDoH) information is often buried within clinical narrative text in electronic health records (EHRs), necessitating natural language processing (NLP) methods to automatically extract these details. Most current NLP efforts for SDoH extraction have been limited, investigating on limited types of SDoH elements, deriving data from a single institution, focusing on specific patient cohorts or note types, with reduced focus on generalizability. This study aims to address these issues by creating cross-institutional corpora spanning different note types and healthcare systems, and developing and evaluating the generalizability of classification models, including novel large language models (LLMs), for detecting SDoH factors from diverse types of notes from four institutions: Harris County Psychiatric Center, University of Texas Physician Practice, Beth Israel Deaconess Medical Center, and Mayo Clinic. Four corpora of deidentified clinical notes were annotated with 21 SDoH factors at two levels: level 1 with SDoH factor types only and level 2 with SDoH factors along with associated values. Three traditional classification algorithms (XGBoost, TextCNN, Sentence BERT) and an instruction tuned LLM-based approach (LLaMA) were developed to identify multiple SDoH factors. Substantial variation was noted in SDoH documentation practices and label distributions based on patient cohorts, note types, and hospitals. The LLM achieved top performance with micro-averaged F1 scores over 0.9 on level 1 annotated corpora and an F1 over 0.84 on level 2 annotated corpora. 
While models performed well when trained and tested on individual datasets, cross-dataset generalization highlighted remaining obstacles. To foster collaboration, access to partial annotated corpora and models trained by merging all annotated datasets will be made available on the PhysioNet repository.
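
The micro-averaged F1 reported above pools true positives, false positives, and false negatives across all notes and labels before computing precision and recall. A minimal sketch of that metric follows; the label names and example notes are hypothetical and are not drawn from the study's corpora.

```python
from typing import Iterable, Set


def micro_f1(gold: Iterable[Set[str]], pred: Iterable[Set[str]]) -> float:
    """Micro-averaged F1 over multi-label predictions.

    Counts are pooled across all notes and labels before computing
    precision and recall, so frequent labels weigh more than rare ones.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)  # labels predicted and present
        fp += len(p - g)  # labels predicted but absent
        fn += len(g - p)  # labels present but missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical SDoH labels for three notes (illustration only).
gold = [{"housing", "employment"}, {"alcohol"}, set()]
pred = [{"housing"}, {"alcohol", "tobacco"}, set()]
score = micro_f1(gold, pred)  # 2 TP, 1 FP, 1 FN
```

Because counts are pooled, a model that does well on common factors can score highly even when rare factors are missed, which is worth keeping in mind when label distributions differ across institutions.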

2.
Neural Netw ; 176: 106338, 2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38692190

ABSTRACT

Electroencephalography (EEG) based Brain Computer Interface (BCI) systems play a significant role in facilitating how individuals with neurological impairments effectively interact with their environment. In real world applications of BCI system for clinical assistance and rehabilitation training, the EEG classifier often needs to learn on sequentially arriving subjects in an online manner. As patterns of EEG signals can be significantly different for different subjects, the EEG classifier can easily erase knowledge of learnt subjects after learning on later ones as it performs decoding in online streaming scenario, namely catastrophic forgetting. In this work, we tackle this problem with a memory-based approach, which considers the following conditions: (1) subjects arrive sequentially in an online manner, with no large scale dataset available for joint training beforehand, (2) data volume from the different subjects could be imbalanced, (3) decoding difficulty of the sequential streaming signal vary, (4) continual classification for a long time is required. This online sequential EEG decoding problem is more challenging than classic cross subject EEG decoding as there is no large-scale training data from the different subjects available beforehand. The proposed model keeps a small balanced memory buffer during sequential learning, with memory data dynamically selected based on joint consideration of data volume and informativeness. Furthermore, for the more general scenarios where subject identity is unknown to the EEG decoder, aka. subject agnostic scenario, we propose a kernel based subject shift detection method that identifies underlying subject changes on the fly in a computationally efficient manner. We develop challenging benchmarks of streaming EEG data from sequentially arriving subjects with both balanced and imbalanced data volumes, and performed extensive experiments with a detailed ablation study on the proposed model. 
The results show the effectiveness of our proposed approach, enabling the decoder to maintain performance on all previously seen subjects over a long period of sequential decoding. The model demonstrates the potential for real-world applications.
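
The idea of a small, balanced memory buffer can be sketched as follows. This is not the authors' selection rule: it assumes an equal quota per subject and a caller-supplied `score` standing in for informativeness, both of which are hypothetical simplifications of the "joint consideration of data volume and informativeness" described in the abstract.

```python
import heapq
from collections import defaultdict
from typing import Dict, List, Tuple


class BalancedBuffer:
    """Fixed-capacity replay buffer that keeps an equal share per subject."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        # subject id -> min-heap of (score, insertion counter, sample)
        self.slots: Dict[str, List[Tuple[float, int, object]]] = defaultdict(list)
        self._counter = 0  # tie-breaker so samples are never compared directly

    def _quota(self) -> int:
        # Equal split of the capacity among the subjects seen so far.
        return max(1, self.capacity // max(1, len(self.slots)))

    def add(self, subject: str, sample: object, score: float) -> None:
        heap = self.slots[subject]  # registers the subject if it is new
        self._counter += 1
        heapq.heappush(heap, (score, self._counter, sample))
        quota = self._quota()
        # Trim every subject to the current quota, dropping lowest scores first.
        for h in self.slots.values():
            while len(h) > quota:
                heapq.heappop(h)

    def sizes(self) -> Dict[str, int]:
        return {s: len(h) for s, h in self.slots.items()}


buf = BalancedBuffer(capacity=4)
for i in range(10):  # a data-rich subject
    buf.add("subject_A", f"a{i}", score=float(i))
for i in range(3):   # a data-poor subject arriving later
    buf.add("subject_B", f"b{i}", score=float(i))
```

In this toy run both subjects end up with two stored samples each, even though the first subject contributed far more data, which is the balancing behaviour the abstract motivates.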

3.
JMIR Med Inform ; 12: e50428, 2024 May 23.
Article in English | MEDLINE | ID: mdl-38787295

ABSTRACT

Background: Individuals from minoritized racial and ethnic backgrounds experience pernicious and pervasive health disparities that have emerged, in part, from clinician bias. Objective: We used a natural language processing approach to examine whether linguistic markers in electronic health record (EHR) notes differ based on the race and ethnicity of the patient. To validate this methodological approach, we also assessed the extent to which clinicians perceive linguistic markers to be indicative of bias. Methods: In this cross-sectional study, we extracted EHR notes for patients who were aged 18 years or older; had more than 5 years of diabetes diagnosis codes; and received care between 2006 and 2014 from family physicians, general internists, or endocrinologists practicing in an urban, academic network of clinics. The race and ethnicity of patients were defined as White non-Hispanic, Black non-Hispanic, or Hispanic or Latino. We hypothesized that Sentiment Analysis and Social Cognition Engine (SEANCE) components (ie, negative adjectives, positive adjectives, joy words, fear and disgust words, politics words, respect words, trust verbs, and well-being words) and mean word count would be indicators of bias if racial differences emerged. We performed linear mixed effects analyses to examine the relationship between the outcomes of interest (the SEANCE components and word count) and patient race and ethnicity, controlling for patient age. To validate this approach, we asked clinicians to indicate the extent to which they thought variation in the use of SEANCE language domains for different racial and ethnic groups was reflective of bias in EHR notes. Results: We examined EHR notes (n=12,905) of Black non-Hispanic, White non-Hispanic, and Hispanic or Latino patients (n=1562), who were seen by 281 physicians. A total of 27 clinicians participated in the validation study. 
In terms of bias, participants rated negative adjectives as 8.63 (SD 2.06), fear and disgust words as 8.11 (SD 2.15), and positive adjectives as 7.93 (SD 2.46) on a scale of 1 to 10, with 10 being extremely indicative of bias. Notes for Black non-Hispanic patients contained significantly more negative adjectives (coefficient 0.07, SE 0.02) and significantly more fear and disgust words (coefficient 0.007, SE 0.002) than those for White non-Hispanic patients. The notes for Hispanic or Latino patients included significantly fewer positive adjectives (coefficient -0.02, SE 0.007), trust verbs (coefficient -0.009, SE 0.004), and joy words (coefficient -0.03, SE 0.01) than those for White non-Hispanic patients. Conclusions: This approach may enable physicians and researchers to identify and mitigate bias in medical interactions, with the goal of reducing health disparities stemming from bias.

4.
J Healthc Inform Res ; 8(2): 206-224, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38681754

ABSTRACT

Biomedical relation extraction (RE) is critical in constructing high-quality knowledge graphs and databases as well as supporting many downstream text mining applications. This paper explores prompt tuning on biomedical RE and its few-shot scenarios, aiming to propose a simple yet effective model for this specific task. Prompt tuning reformulates natural language processing (NLP) downstream tasks into masked language problems by embedding specific text prompts into the original input, facilitating the adaption of pre-trained language models (PLMs) to better address these tasks. This study presents a customized prompt tuning model designed explicitly for biomedical RE, including its applicability in few-shot learning contexts. The model's performance was rigorously assessed using the chemical-protein relation (CHEMPROT) dataset from BioCreative VI and the drug-drug interaction (DDI) dataset from SemEval-2013, showcasing its superior performance over conventional fine-tuned PLMs across both datasets, encompassing few-shot scenarios. This observation underscores the effectiveness of prompt tuning in enhancing the capabilities of conventional PLMs, though the extent of enhancement may vary by specific model. Additionally, the model demonstrated a harmonious balance between simplicity and efficiency, matching state-of-the-art performance without needing external knowledge or extra computational resources. The pivotal contribution of our study is the development of a suitably designed prompt tuning model, highlighting prompt tuning's effectiveness in biomedical RE. It offers a robust, efficient approach to the field's challenges and represents a significant advancement in extracting complex relations from biomedical texts. Supplementary Information: The online version contains supplementary material available at 10.1007/s41666-024-00162-9.
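
To make the reformulation of relation extraction as a masked language problem concrete, here is a minimal sketch. The prompt template, the label words, and the relation names are all hypothetical and are not taken from the paper; the masked-token scores would come from a pre-trained language model, which is replaced here by a hand-written dictionary.

```python
from typing import Dict

# Hypothetical verbalizer: label word placed at the mask -> relation label.
LABEL_WORDS: Dict[str, str] = {
    "inhibits": "INHIBITOR",
    "activates": "ACTIVATOR",
    "unrelated": "NO_RELATION",
}


def build_cloze_prompt(sentence: str, head: str, tail: str,
                       mask_token: str = "[MASK]") -> str:
    """Append a cloze-style prompt so the model fills in the relation word."""
    return f"{sentence} In this sentence, {head} {mask_token} {tail}."


def decode(mask_scores: Dict[str, float]) -> str:
    """Pick the best-scoring label word and map it to a relation label.

    Words outside the verbalizer are ignored, however likely they are.
    """
    candidates = {w: s for w, s in mask_scores.items() if w in LABEL_WORDS}
    best = max(candidates, key=candidates.get)
    return LABEL_WORDS[best]


prompt = build_cloze_prompt("Aspirin blocks COX-1 activity.", "aspirin", "COX-1")
# Stand-in for the model's scores at the mask position (illustration only).
relation = decode({"inhibits": 0.7, "activates": 0.2, "the": 0.9})
```

Restricting decoding to the verbalizer's words is what turns open-vocabulary mask filling into a classification task.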

5.
Online J Public Health Inform ; 16: e52845, 2024 Mar 13.
Article in English | MEDLINE | ID: mdl-38477963

ABSTRACT

BACKGROUND: Social determinants of health (SDoH) have been described by the World Health Organization as the conditions in which individuals are born, live, work, and age. These conditions can be grouped into 3 interrelated levels known as macrolevel (societal), mesolevel (community), and microlevel (individual) determinants. The scope of SDoH expands beyond the biomedical level, and there remains a need to connect other areas such as economics, public policy, and social factors. OBJECTIVE: Providing a computable artifact that can link health data to concepts involving the different levels of determinants may improve our understanding of the impact SDoH have on human populations. Modeling SDoH may help to reduce existing gaps in the literature through explicit links between the determinants and biological factors. This in turn can allow researchers and clinicians to make better sense of data and discover new knowledge through the use of semantic links. METHODS: An experimental ontology was developed to represent knowledge of the social and economic characteristics of SDoH. Information from 27 literature sources was analyzed to gather concepts and encoded using Web Ontology Language, version 2 (OWL2) and Protégé. Four evaluators independently reviewed the ontology axioms using natural language translation. The analyses from the evaluations and selected terminologies from the Basic Formal Ontology were used to create a revised ontology with a broad spectrum of knowledge concepts ranging from the macrolevel to the microlevel determinants. RESULTS: The literature search identified several topics of discussion for each determinant level. Publications for the macrolevel determinants centered around health policy, income inequality, welfare, and the environment. Articles relating to the mesolevel determinants discussed work, work conditions, psychosocial factors, socioeconomic position, outcomes, food, poverty, housing, and crime. 
Finally, sources found for the microlevel determinants examined gender, ethnicity, race, and behavior. Concepts were gathered from the literature and used to produce an ontology consisting of 383 classes, 109 object properties, and 748 logical axioms. A reasoning test revealed no inconsistent axioms. CONCLUSIONS: This ontology models heterogeneous social and economic concepts to represent aspects of SDoH. The scope of SDoH is expansive, and although the ontology is broad, it is still in its early stages. To our current understanding, this ontology represents the first attempt to concentrate on knowledge concepts that are currently not covered by existing ontologies. Future direction will include further expanding the ontology to link with other biomedical ontologies, including alignment for granular semantics.

6.
Article in English | MEDLINE | ID: mdl-38520725

ABSTRACT

OBJECTIVES: The rapid expansion of biomedical literature necessitates automated techniques to discern relationships between biomedical concepts from extensive free text. Such techniques facilitate the development of detailed knowledge bases and highlight research deficiencies. The LitCoin Natural Language Processing (NLP) challenge, organized by the National Center for Advancing Translational Sciences, aims to evaluate such potential and provides a manually annotated corpus for methodology development and benchmarking. MATERIALS AND METHODS: For the named entity recognition (NER) task, we utilized ensemble learning to merge predictions from three domain-specific models, namely BioBERT, PubMedBERT, and BioM-ELECTRA, devised a rule-driven detection method for cell line and taxonomy names, and annotated 70 more abstracts as an additional corpus. We further finetuned the T0pp model, with 11 billion parameters, to boost the performance on relation extraction and leveraged entities' location information (eg, title, background) to enhance novelty prediction performance in relation extraction (RE). RESULTS: Our pioneering NLP system designed for this challenge secured first place in Phase I-NER and second place in Phase II-relation extraction and novelty prediction, outpacing over 200 teams. We tested OpenAI ChatGPT 3.5 and ChatGPT 4 in a Zero-Shot setting using the same test set, revealing that our finetuned model considerably surpasses these broad-spectrum large language models. DISCUSSION AND CONCLUSION: Our outcomes depict a robust NLP system excelling in NER and RE across various biomedical entities, emphasizing that task-specific models remain superior to generic large ones. Such insights are valuable for endeavors like knowledge graph development and hypothesis formulation in biomedical research.

7.
PLoS One ; 19(3): e0300919, 2024.
Article in English | MEDLINE | ID: mdl-38512919

ABSTRACT

Though vaccines are instrumental in global health, mitigating infectious diseases and pandemic outbreaks, they can occasionally lead to adverse events (AEs). Recently, large language models (LLMs) have shown promise in effectively identifying and cataloging AEs within clinical reports. Utilizing data from the Vaccine Adverse Event Reporting System (VAERS) from 1990 to 2016, this study focuses on AEs to evaluate LLMs' capability for AE extraction. A variety of prevalent LLMs, including GPT-2, GPT-3 variants, GPT-4, and Llama2, were evaluated using the influenza vaccine as a use case. The fine-tuned GPT 3.5 model (AE-GPT) stood out with a 0.704 averaged micro F1 score for strict match and 0.816 for relaxed match. The encouraging performance of AE-GPT underscores LLMs' potential in processing medical data, indicating a significant stride towards advanced AE detection that is presumably generalizable to other AE extraction tasks.
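
The gap between strict and relaxed scores comes from how a predicted span is allowed to match a gold span. The sketch below assumes one common convention, which the abstract does not spell out: strict requires identical boundaries and label, relaxed accepts any character overlap with the same label. The spans are hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, label)


def overlaps(a: Span, b: Span) -> bool:
    """True when two spans share a label and at least one character."""
    return a[2] == b[2] and a[0] < b[1] and b[0] < a[1]


def span_f1(gold: List[Span], pred: List[Span], relaxed: bool = False) -> float:
    """Span-level F1 under strict (exact) or relaxed (overlap) matching."""
    match = overlaps if relaxed else (lambda a, b: a == b)
    matched_pred = sum(any(match(p, g) for g in gold) for p in pred)
    matched_gold = sum(any(match(g, p) for p in pred) for g in gold)
    precision = matched_pred / len(pred) if pred else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical adverse-event spans in one report (illustration only).
gold = [(0, 5, "AE"), (10, 20, "AE")]
pred = [(0, 5, "AE"), (12, 20, "AE"), (30, 35, "AE")]
strict = span_f1(gold, pred)                  # only the first span matches exactly
relaxed = span_f1(gold, pred, relaxed=True)   # the second span now counts as well
```

A prediction that finds the right event but trims a word from its boundary is penalized only under strict matching, so the relaxed score is never lower than the strict one under this convention.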


Subjects
Influenza Vaccines; Influenza, Human; Humans; Influenza Vaccines/adverse effects; Adverse Drug Reaction Reporting Systems; Influenza, Human/prevention & control; Alanine Transaminase; Disease Outbreaks

8.
J Biomed Inform ; 152: 104621, 2024 04.
Article in English | MEDLINE | ID: mdl-38447600

ABSTRACT

OBJECTIVE: The primary objective of this review is to investigate the effectiveness of machine learning and deep learning methodologies in the context of extracting adverse drug events (ADEs) from clinical benchmark datasets. We conduct an in-depth analysis, aiming to compare the merits and drawbacks of both machine learning and deep learning techniques, particularly within the framework of named-entity recognition (NER) and relation classification (RC) tasks related to ADE extraction. Additionally, our focus extends to the examination of specific features and their impact on the overall performance of these methodologies. In a broader perspective, our research extends to ADE extraction from various sources, including biomedical literature, social media data, and drug labels, removing the limitation to exclusively machine learning or deep learning methods. METHODS: We conducted an extensive literature review on PubMed using the query "(((machine learning [Medical Subject Headings (MeSH) Terms]) OR (deep learning [MeSH Terms])) AND (adverse drug event [MeSH Terms])) AND (extraction)", and supplemented this with a snowballing approach to review 275 references sourced from retrieved articles. RESULTS: In our analysis, we included twelve articles for review. For the NER task, deep learning models outperformed machine learning models. In the RC task, gradient Boosting, multilayer perceptron and random forest models excelled. The Bidirectional Encoder Representations from Transformers (BERT) model consistently achieved the best performance in the end-to-end task. Future efforts in the end-to-end task should prioritize improving NER accuracy, especially for 'ADE' and 'Reason'. CONCLUSION: These findings hold significant implications for advancing the field of ADE extraction and pharmacovigilance, ultimately contributing to improved drug safety monitoring and healthcare outcomes.


Subjects
Deep Learning; Drug-Related Side Effects and Adverse Reactions; Humans; Artificial Intelligence; Pharmacovigilance; Benchmarking; Natural Language Processing

9.
J Biomed Inform ; 152: 104623, 2024 04.
Article in English | MEDLINE | ID: mdl-38458578

ABSTRACT

INTRODUCTION: Patients' functional status assesses their independence in performing activities of daily living, including basic ADLs (bADL), and more complex instrumental activities (iADL). Existing studies have discovered that patients' functional status is a strong predictor of health outcomes, particularly in older adults. Despite its usefulness, much of the functional status information is stored in electronic health records (EHRs) in either semi-structured or free text formats. This indicates the pressing need to leverage computational approaches such as natural language processing (NLP) to accelerate the curation of functional status information. In this study, we introduced FedFSA, a hybrid and federated NLP framework designed to extract functional status information from EHRs across multiple healthcare institutions. METHODS: FedFSA consists of four major components: 1) individual sites (clients) with their private local data, 2) a rule-based information extraction (IE) framework for ADL extraction, 3) a BERT model for functional status impairment classification, and 4) a concept normalizer. The framework was implemented using the OHNLP Backbone for rule-based IE and the open-source Flower and PyTorch libraries for federated BERT components. For gold standard data generation, we carried out corpus annotation to identify functional status-related expressions based on ICF definitions. Four healthcare institutions were included in the study. To assess FedFSA, we evaluated the performance of category- and institution-specific ADL extraction across different experimental designs. RESULTS: ADL extraction performance ranges from an F1-score of 0.907 to 0.986 for bADL and 0.825 to 0.951 for iADL across the four healthcare sites. The performance for ADL extraction with impairment ranges from an F1-score of 0.722 to 0.954 for bADL and 0.674 to 0.813 for iADL across four healthcare sites. 
For category-specific ADL extraction, laundry and transferring yielded relatively high performance, while dressing, medication, bathing, and continence achieved moderate-high performance. Conversely, food preparation and toileting showed low performance. CONCLUSION: NLP performance varied across ADL categories and healthcare sites. Federated learning using a FedFSA framework performed higher than non-federated learning for impaired ADL extraction at all healthcare sites. Our study demonstrated the potential of the federated learning framework in functional status extraction and impairment classification in EHRs, exemplifying the importance of a large-scale, multi-institutional collaborative development effort.
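
In a federated setup such as the one described, each site trains on its private data and only model parameters are shared and combined. The sketch below assumes plain federated averaging (weighting each site by its number of training examples); the abstract does not state which aggregation strategy FedFSA uses, and the parameter names and site sizes here are hypothetical.

```python
from typing import Dict, List, Tuple

Weights = Dict[str, List[float]]  # parameter name -> flattened values


def fed_avg(updates: List[Tuple[Weights, int]]) -> Weights:
    """Average client parameters, weighted by each client's example count."""
    total = sum(n for _, n in updates)
    reference = updates[0][0]
    averaged: Weights = {}
    for name, values in reference.items():
        averaged[name] = [
            sum(w[name][i] * n for w, n in updates) / total
            for i in range(len(values))
        ]
    return averaged


# Two hypothetical sites; raw notes never leave the site, only weights do.
site_a = ({"classifier": [1.0, 2.0]}, 100)
site_b = ({"classifier": [3.0, 6.0]}, 300)
global_weights = fed_avg([site_a, site_b])
```

The larger site pulls the global parameters toward its own values, which is one reason performance can differ across sites with unequal data volumes.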


Subjects
Activities of Daily Living; Functional Status; Humans; Aged; Learning; Information Storage and Retrieval; Natural Language Processing

10.
BMC Med Inform Decis Mak ; 23(Suppl 4): 299, 2024 Feb 07.
Article in English | MEDLINE | ID: mdl-38326827

ABSTRACT

BACKGROUND: In this era of big data, data harmonization is an important step to ensure reproducible, scalable, and collaborative research. Thus, terminology mapping is a necessary step to harmonize heterogeneous data. Take the Medical Dictionary for Regulatory Activities (MedDRA) and International Classification of Diseases (ICD) for example, the mapping between them is essential for drug safety and pharmacovigilance research. Our main objective is to provide a quantitative and qualitative analysis of the mapping status between MedDRA and ICD. We focus on evaluating the current mapping status between MedDRA and ICD through the Unified Medical Language System (UMLS) and Observational Medical Outcomes Partnership Common Data Model (OMOP CDM). We summarized the current mapping statistics and evaluated the quality of the current MedDRA-ICD mapping; for unmapped terms, we used our self-developed algorithm to rank the best possible mapping candidates for additional mapping coverage. RESULTS: The identified MedDRA-ICD mapped pairs cover 27.23% of the overall MedDRA preferred terms (PT). The systematic quality analysis demonstrated that, among the mapped pairs provided by UMLS, only 51.44% are considered an exact match. For the 2400 sampled unmapped terms, 56 of the 2400 MedDRA Preferred Terms (PT) could have exact match terms from ICD. CONCLUSION: Some of the mapped pairs between MedDRA and ICD are not exact matches due to differences in granularity and focus. For 72% of the unmapped PT terms, the identified exact match pairs illustrate the possibility of identifying additional mapped pairs. Referring to its own mapping standard, some of the unmapped terms should qualify for the expansion of MedDRA to ICD mapping in UMLS.


Subjects
Adverse Drug Reaction Reporting Systems; International Classification of Diseases; Humans; Unified Medical Language System; Pharmacovigilance; Algorithms

11.
BMC Med Inform Decis Mak ; 23(Suppl 4): 298, 2024 01 05.
Article in English | MEDLINE | ID: mdl-38183034

ABSTRACT

BACKGROUND: The Vaccine Adverse Event Reporting System (VAERS) is a promising resource for tracking adverse events following immunization. The Medical Dictionary for Regulatory Activities (MedDRA) terminology used for coding adverse events in VAERS reports has several limitations. We focus on developing an automated system for semantic extraction of adverse events following vaccination and their temporal relationships for a better understanding of VAERS data and its integration into other applications. The aim of the present study is to summarize the lessons learned during the initial phase of this project in annotating adverse events following influenza vaccination and related to Guillain-Barré syndrome (GBS). We emphasize identifying the limitations of VAERS and MedDRA. RESULTS: We collected 282 VAERS reports documented between 1990 and 2016 and shortlisted those with at least 1,100 characters in the report. We used a subset of 50 reports for the preliminary investigation and annotated all adverse events following influenza vaccination by mapping to representative MedDRA terms. Associated time expressions were annotated when available. We used 16 System Organ Class (SOC) level MedDRA terms to map GBS-related adverse events and expanded some SOC terms to Lowest Level Terms (LLT) for granular representation. We annotated three broad categories of events: problems, clinical investigations, and treatments/procedures. The inter-annotator agreement of events achieved was 86%. Incomplete reports, typographical errors, lack of clarity and coherence, repeated texts, unavailability of associated temporal information, difficulty of interpretation due to incorrect grammar, use of generalized terms to describe adverse events/symptoms, uncommon abbreviations, difficulty annotating multiple events with a conjunction/common phrase, irrelevant historical events, and coexisting events were some of the challenges encountered. 
Some of the limitations we noted are in agreement with previous reports. CONCLUSIONS: We reported the challenges encountered and lessons learned during annotation of adverse events in VAERS reports following influenza vaccination and related to GBS. Though the challenges may be due to the inevitable limitations of public reporting systems and widely reported limitations of MedDRA, we emphasize the need to understand these limitations and extraction of other supportive information for a better understanding of adverse events following vaccination.


Subjects
Guillain-Barre Syndrome; Influenza, Human; Humans; Guillain-Barre Syndrome/etiology; Adverse Drug Reaction Reporting Systems; Influenza, Human/prevention & control; Vaccination/adverse effects; Linguistics

12.
JMIR Aging ; 7: e49415, 2024 Jan 23.
Article in English | MEDLINE | ID: mdl-38261365

ABSTRACT

BACKGROUND: Reminiscence, a therapy that uses stimulating materials such as old photos and videos to stimulate long-term memory, can improve the emotional well-being and life satisfaction of older adults, including those who are cognitively intact. However, providing personalized reminiscence therapy can be challenging for caregivers and family members. OBJECTIVE: This study aimed to achieve three objectives: (1) design and develop the GoodTimes app, an interactive multimodal photo album that uses artificial intelligence (AI) to engage users in personalized conversations and storytelling about their pictures, encompassing family, friends, and special moments; (2) examine the app's functionalities in various scenarios using use-case studies and assess the app's usability and user experience through the user study; and (3) investigate the app's potential as a supplementary tool for reminiscence therapy among cognitively intact older adults, aiming to enhance their psychological well-being by facilitating the recollection of past experiences. METHODS: We used state-of-the-art AI technologies, including image recognition, natural language processing, knowledge graph, logic, and machine learning, to develop GoodTimes. First, we constructed a comprehensive knowledge graph that models the information required for effective communication, including photos, people, locations, time, and stories related to the photos. Next, we developed a voice assistant that interacts with users by leveraging the knowledge graph and machine learning techniques. Then, we created various use cases to examine the functions of the system in different scenarios. Finally, to evaluate GoodTimes' usability, we conducted a study with older adults (N=13; age range 58-84, mean 65.8 years). The study period started from January to March 2023. RESULTS: The use-case tests demonstrated the performance of GoodTimes in handling a variety of scenarios, highlighting its versatility and adaptability. 
For the user study, the feedback from our participants was highly positive, with 92% (12/13) reporting a positive experience conversing with GoodTimes. All participants mentioned that the app invoked pleasant memories and aided in recollecting loved ones, resulting in a sense of happiness for the majority (11/13, 85%). Additionally, a significant majority found GoodTimes to be helpful (11/13, 85%) and user-friendly (12/13, 92%). Most participants (9/13, 69%) expressed a desire to use the app frequently, although some (4/13, 31%) indicated a need for technical support to navigate the system effectively. CONCLUSIONS: Our AI-based interactive photo album, GoodTimes, was able to engage users in browsing their photos and conversing about them. Preliminary evidence supports GoodTimes' usability and benefits cognitively intact older adults. Future work is needed to explore its potential positive effects among older adults with cognitive impairment.


Subjects
Artificial Intelligence; Mobile Applications; Humans; Aged; Middle Aged; Aged, 80 and over; Memory; Memory, Long-Term; Machine Learning

13.
J Am Heart Assoc ; 13(3): e029900, 2024 Feb 06.
Article in English | MEDLINE | ID: mdl-38293921

ABSTRACT

BACKGROUND: The rapid evolution of artificial intelligence (AI) in conjunction with recent updates in dual antiplatelet therapy (DAPT) management guidelines emphasizes the necessity for innovative models to predict ischemic or bleeding events after drug-eluting stent implantation. Leveraging AI for dynamic prediction has the potential to revolutionize risk stratification and provide personalized decision support for DAPT management. METHODS AND RESULTS: We developed and validated a new AI-based pipeline using retrospective data of drug-eluting stent-treated patients, sourced from the Cerner Health Facts data set (n=98 236) and Optum's de-identified Clinformatics Data Mart Database (n=9978). The 36 months following drug-eluting stent implantation were designated as our primary forecasting interval, further segmented into 6 sequential prediction windows. We evaluated 5 distinct AI algorithms for their precision in predicting ischemic and bleeding risks. Model discriminative accuracy was assessed using the area under the receiver operating characteristic curve, among other metrics. The weighted light gradient boosting machine stood out as the preeminent model, thus earning its place as our AI-DAPT model. The AI-DAPT demonstrated peak accuracy in the 30 to 36 months window, charting an area under the receiver operating characteristic curve of 90% [95% CI, 88%-92%] for ischemia and 84% [95% CI, 82%-87%] for bleeding predictions. CONCLUSIONS: Our AI-DAPT excels in formulating iterative, refined dynamic predictions by assimilating ongoing updates from patients' clinical profiles, holding value as a novel smart clinical tool to facilitate optimal DAPT duration management with high accuracy and adaptability.
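
The area under the receiver operating characteristic curve used above equals the probability that a randomly chosen patient with the event receives a higher risk score than a randomly chosen patient without it. A minimal sketch of that pairwise definition follows; the labels and scores are hypothetical and unrelated to the study data.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical event labels (1 = event occurred) and predicted risks.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.7, 0.1]
auc = auroc(labels, scores)  # 7 of 9 pairs are ordered correctly
```

The quadratic loop is fine for illustration; rank-based formulations give the same value more efficiently on large cohorts.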


Subjects
Coronary Artery Disease; Drug-Eluting Stents; Myocardial Infarction; Percutaneous Coronary Intervention; Humans; Platelet Aggregation Inhibitors/adverse effects; Myocardial Infarction/etiology; Coronary Artery Disease/diagnosis; Coronary Artery Disease/surgery; Drug-Eluting Stents/adverse effects; Artificial Intelligence; Retrospective Studies; Treatment Outcome; Risk Factors; Drug Therapy, Combination; Hemorrhage/chemically induced; Prognosis; Percutaneous Coronary Intervention/adverse effects

14.
J Eval Clin Pract ; 30(2): 251-259, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37933789

ABSTRACT

RATIONALE, AIMS, AND OBJECTIVE: Unwarranted clinical variation (UCV) is an undesirable aspect of a healthcare system, but analyzing for UCV can be difficult and time-consuming. No analytic feature guidelines currently exist to aid researchers. We performed a systematic review of UCV literature to identify and classify the features researchers have identified as necessary for the analysis of UCV. METHODS: The literature search followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses. We looked for articles with the terms 'medical practice variation' and 'unwarranted clinical variation' from four databases: Medline, Web of Science, EMBASE and CINAHL. The search was performed on 24 March 2023. The articles selected were original research articles in the English language reporting on UCV analysis in adult populations. Most of the studies were retrospective cohort analyses. We excluded studies reporting geographic variation based on the Atlas of Variation or small-area analysis methods. We used ASReview Lab software to assist in identifying articles for abstract review. We also conducted subsequent reference searches of the primary articles to retrieve additional articles. RESULTS: The search yielded 499 articles, and we reviewed 46. We identified 28 principal analytic features utilized to analyze for unwarranted variation, categorised under patient-related or local healthcare context factors. Within the patient-related factors, we identified three subcategories: patient sociodemographics, clinical characteristics, and preferences, and classified 17 features into seven subcategories. In the local context category, 11 features are classified under two subcategories. Examples are provided on the usage of each feature for analysis. CONCLUSION: Twenty-eight analytic features have been identified, and a categorisation has been established showing the relationships between features. 
Identifying and classifying features provides guidelines for known confounders during analysis and reduces the steps required when performing UCV analysis; there is no longer a need for a UCV researcher to engage in time-consuming feature engineering activities.


Subjects
Delivery of Health Care , Software , Adult , Humans , Retrospective Studies , Cohort Studies
15.
Expert Rev Vaccines ; 23(1): 53-59, 2024.
Article in English | MEDLINE | ID: mdl-38063069

ABSTRACT

INTRODUCTION: The rapid development of COVID-19 vaccines has provided crucial tools for pandemic control, but the occurrence of vaccine-related adverse events (AEs) underscores the need for comprehensive monitoring. METHODS: This study analyzed Vaccine Adverse Event Reporting System (VAERS) data from 2020-2022 using statistical methods such as zero-truncated Poisson regression and logistic regression to assess associations with age, gender groups, and vaccine manufacturers. RESULTS: Logistic regression identified 26 System Organ Classes (SOCs) significantly associated with age and gender. Females had notably higher odds in SOC 19 (Pregnancy, puerperium and perinatal conditions), while males had higher odds in SOC 25 (Surgical and medical procedures). Older adults (>65) were more prone to symptoms such as Cardiac disorders, whereas those aged 18-65 showed susceptibility to AEs such as Skin and subcutaneous tissue disorders. Moderna and Pfizer vaccines induced fewer SOC symptoms than Janssen and Novavax. The zero-truncated Poisson regression model estimated an average of 4.243 symptoms per individual. CONCLUSION: These findings offer vital insights into vaccine safety, guiding evidence-based vaccination strategies and monitoring programs for precise and effective outcomes.
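The abstract does not describe how the zero-truncated Poisson model was fitted, so the following is only a minimal sketch of the idea under a simplifying assumption: an intercept-only model with no covariates. Because every report contains at least one symptom, zero counts cannot occur, and the maximum-likelihood rate is the value whose truncated mean, lam / (1 - exp(-lam)), equals the sample mean. The function names and the example counts are hypothetical.

```python
import math


def ztp_mean(lam: float) -> float:
    """Mean of a zero-truncated Poisson with rate lam: lam / (1 - exp(-lam))."""
    # -expm1(-lam) equals 1 - exp(-lam) but stays accurate for small lam.
    return lam / -math.expm1(-lam)


def fit_ztp_rate(counts, tol: float = 1e-10) -> float:
    """Intercept-only maximum-likelihood rate for strictly positive counts.

    Solves ztp_mean(lam) == sample mean by bisection (an assumption made for
    illustration; the study itself used a regression model).
    """
    if not counts or any(c < 1 for c in counts):
        raise ValueError("counts must be a non-empty list of integers >= 1")
    target = sum(counts) / len(counts)
    if target <= 1.0:
        raise ValueError("all counts equal 1: the rate estimate degenerates to 0")
    # ztp_mean(lam) > lam for every lam > 0, so the root lies below the mean.
    lo, hi = 1e-9, target
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if ztp_mean(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    symptom_counts = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical per-report counts
    rate = fit_ztp_rate(symptom_counts)
    print(rate, ztp_mean(rate))
```

The fitted rate is always smaller than the observed mean, since truncating the zeros pushes the mean upward.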


Subjects
COVID-19 Vaccines , COVID-19 , Vaccines , Aged , Female , Humans , Male , Pregnancy , Adverse Drug Reaction Reporting Systems , COVID-19/epidemiology , COVID-19/prevention & control , COVID-19 Vaccines/adverse effects , United States , Vaccination/adverse effects , Vaccines/adverse effects
16.
Yearb Med Inform ; 32(1): 253-263, 2023 Aug.
Article in English | MEDLINE | ID: mdl-38147867

ABSTRACT

OBJECTIVE: To summarize the recent methods and applications that leverage real-world data such as electronic health records (EHRs) with social determinants of health (SDoH) for public and population health and health equity, and to identify successes, challenges, and possible solutions. METHODS: In this opinion review, grounded in a social-ecological-model-based conceptual framework, we surveyed data sources and recent informatics approaches that enable leveraging SDoH along with real-world data to support public health and clinical health applications, including helping design public health interventions, enhancing risk stratification, and enabling the prediction of unmet social needs. RESULTS: Besides summarizing data sources, we identified gaps in capturing SDoH data in existing EHR systems and opportunities to leverage informatics approaches to collect SDoH information either from structured and unstructured EHR data or through linking with public surveys and environmental data. We also surveyed recently developed ontologies for standardizing SDoH information and approaches that incorporate SDoH for disease risk stratification, public health crisis prediction, and development of tailored interventions. CONCLUSIONS: To enable effective public health and clinical applications using real-world data with SDoH, it is necessary to develop both non-technical solutions involving incentives, policies, and training and technical solutions such as novel social risk management tools that are integrated into clinical workflow. Ultimately, SDoH-powered social risk management, disease risk prediction, and development of SDoH-tailored interventions for disease prevention and management have the potential to improve population health, reduce disparities, and improve health equity.


Subjects
Health Equity , Population Health , Humans , Social Determinants of Health , Electronic Health Records , Outcome Assessment, Health Care
17.
Yearb Med Inform ; 32(1): 215-224, 2023 Aug.
Article in English | MEDLINE | ID: mdl-38147863

ABSTRACT

OBJECTIVES: Graph representation learning (GRL) has emerged as a pivotal field that has contributed significantly to breakthroughs in various fields, including biomedicine. The objective of this survey is to review the latest advancements in GRL methods and their applications in the biomedical field. We also highlight key challenges currently faced by GRL and outline potential directions for future research. METHODS: We conducted a comprehensive search of multiple databases, including PubMed, Web of Science, IEEE Xplore, and Google Scholar, to collect relevant publications from the past two years (2021-2022). Studies were selected for review based on their relevance to the topic and their publication quality. RESULTS: A total of 78 articles were included in our analysis. We identified three main categories of GRL methods and summarized their methodological foundations and notable models. In terms of GRL applications, we focused on two main topics: drug and disease. We analyzed the study frameworks and achievements of the prominent research. Based on the current state of the art, we discussed the challenges and future directions. CONCLUSIONS: GRL methods applied in the biomedical field demonstrated several key characteristics, including the use of attention mechanisms to prioritize relevant features, a growing emphasis on model interpretability, and the combination of various techniques to improve model performance. Challenges also remain to be addressed, including mitigating model bias, accommodating the heterogeneity of large-scale knowledge graphs, and improving the availability of high-quality graph data. To fully leverage the potential of GRL, future efforts should prioritize these areas of research.


Subjects
Learning , Medicine , Medicine/trends
18.
BMC Bioinformatics ; 24(Suppl 3): 477, 2023 Dec 15.
Article in English | MEDLINE | ID: mdl-38102593

ABSTRACT

BACKGROUND: As more clinical trials offer optional participation in the collection of bio-specimens for biobanking, the requirements of informed consent forms are becoming increasingly complex. The aim of this study is to develop an automatic natural language processing (NLP) tool to annotate informed consent documents to promote biorepository data regulation, sharing, and decision support. We collected informed consent documents from several publicly available sources, then manually annotated them, covering sentences containing permission information about the sharing of either bio-specimens or donor data, or about conducting genetic research or future research using bio-specimens or donor data. RESULTS: We evaluated a variety of machine learning algorithms, including random forest (RF) and support vector machine (SVM), for the automatic identification of these sentences. A total of 120 informed consent documents containing 29,204 sentences were annotated, of which 1,250 sentences (4.28%) provide answers to a permission question. An SVM model achieved an F1 score of 0.95 on classifying the sentences when using a gold standard, which is a prefiltered corpus containing all relevant sentences. CONCLUSIONS: This study demonstrates the feasibility of using machine learning tools to classify permission-related sentences in informed consent documents.
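To make the reported figures concrete, the sketch below shows how an F1 score is computed for binary sentence classification and checks the stated prevalence of permission-related sentences (1,250 of 29,204). It is an illustration only: the label lists are hypothetical, and the abstract does not specify the evaluation code the authors used.

```python
def f1_score(gold, predicted) -> float:
    """F1 for binary labels (truthy = permission-related sentence)."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)        # correctly flagged
    fp = sum(1 for g, p in pairs if not g and p)    # flagged but irrelevant
    fn = sum(1 for g, p in pairs if g and not p)    # relevant but missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def prevalence_percent(positives: int, total: int) -> float:
    """Share of positive sentences, as a percentage rounded to two decimals."""
    return round(100.0 * positives / total, 2)


if __name__ == "__main__":
    gold = [1, 1, 1, 0, 0]        # hypothetical annotations
    predicted = [1, 1, 0, 1, 0]   # hypothetical classifier output
    print(f1_score(gold, predicted))
    print(prevalence_percent(1250, 29204))
```

With positives making up only about 4% of sentences, accuracy alone would be uninformative, which is one reason F1 is a natural metric for this task.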


Subjects
Biological Specimen Banks , Consent Forms , Machine Learning , Algorithms , Natural Language Processing
19.
Res Sq ; 2023 Sep 27.
Article in English | MEDLINE | ID: mdl-37841880

ABSTRACT

Background: Vaccines have revolutionized public health by providing protection against infectious diseases. They stimulate the immune system and generate memory cells to defend against targeted diseases. Clinical trials evaluate vaccine performance, including dosage, administration routes, and potential side effects. ClinicalTrials.gov is a valuable repository of clinical trial information, but the vaccine data it contains lack standardization, leading to challenges in automatic concept mapping, vaccine-related knowledge development, evidence-based decision-making, and vaccine surveillance. Results: In this study, we developed a cascaded framework that capitalized on multiple domain knowledge sources, including clinical trials, the Unified Medical Language System (UMLS), and the Vaccine Ontology (VO), to enhance the performance of domain-specific language models for automated mapping of clinical trial vaccine data to VO. The VO is a community-based ontology that was developed to promote vaccine data standardization, integration, and computer-assisted reasoning. Our methodology involved extracting and annotating data from various sources. We then performed pre-training on the PubMedBERT model, leading to the development of CTPubMedBERT. Subsequently, we enhanced CTPubMedBERT by incorporating SAPBERT, which was pretrained using the UMLS, resulting in CTPubMedBERT + SAPBERT. Further refinement was accomplished through fine-tuning using the Vaccine Ontology corpus and vaccine data from clinical trials, yielding the CTPubMedBERT + SAPBERT + VO model. Finally, we utilized a collection of pre-trained models, along with a weighted rule-based ensemble approach, to normalize the vaccine corpus and improve the accuracy of the process. The ranking process in concept normalization involves prioritizing and ordering potential concepts to identify the most suitable match for a given context. We ranked the top 10 candidate concepts, and our experimental results demonstrate that our proposed cascaded framework consistently outperformed existing effective baselines on vaccine mapping, achieving 71.8% top-1 accuracy and 90.0% top-10 accuracy. Conclusion: This study provides detailed insight into a cascaded framework of fine-tuned domain-specific language models that improves the mapping of clinical trial vaccine data to VO. By effectively leveraging domain-specific information and applying weighted rule-based ensembles of different pre-trained BERT models, our framework can significantly enhance this mapping.
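The abstract does not spell out the rules or weights of the ensemble, so the following sketch rests on a plainly stated assumption: each model assigns a similarity score to candidate concepts, the ensemble sums those scores using per-model weights, and top-k accuracy counts how often the gold concept appears among the first k ranked candidates. The model names, concept identifiers, scores, and weights are all hypothetical.

```python
from collections import defaultdict


def ensemble_rank(model_scores, weights, k=10):
    """Rank candidate concepts by a weighted sum of per-model scores.

    model_scores: {model_name: {concept_id: score}}
    weights:      {model_name: weight}
    """
    combined = defaultdict(float)
    for model, scores in model_scores.items():
        for concept, score in scores.items():
            combined[concept] += weights[model] * score
    # Highest combined score first; ties broken by concept id for determinism.
    ranked = sorted(combined, key=lambda c: (-combined[c], c))
    return ranked[:k]


def top_k_accuracy(ranked_lists, gold_concepts, k):
    """Fraction of mentions whose gold concept is among the top k candidates."""
    hits = sum(1 for ranked, gold in zip(ranked_lists, gold_concepts)
               if gold in ranked[:k])
    return hits / len(gold_concepts)


if __name__ == "__main__":
    # Hypothetical scores from two models for one vaccine mention.
    scores = {
        "model_a": {"CONCEPT_A": 0.9, "CONCEPT_B": 0.5},
        "model_b": {"CONCEPT_B": 0.8, "CONCEPT_C": 0.7},
    }
    weights = {"model_a": 0.5, "model_b": 0.5}
    ranking = ensemble_rank(scores, weights)
    print(ranking)
    print(top_k_accuracy([ranking, ranking], ["CONCEPT_B", "CONCEPT_C"], k=1))
```

Reporting both top-1 and top-10 accuracy, as the abstract does, separates exact automatic mapping from the case where a curator picks the correct concept from a short candidate list.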

20.
BMC Med Inform Decis Mak ; 23(Suppl 1): 151, 2023 Aug 04.
Article in English | MEDLINE | ID: mdl-37542312

ABSTRACT

BACKGROUND: In the United States, the National Alzheimer's Coordinating Center (NACC) and the Alzheimer's Disease Neuroimaging Initiative (ADNI) are two major data sharing resources for Alzheimer's Disease (AD) research. NACC and ADNI strive to make their data more FAIR (findable, interoperable, accessible and reusable) for the broader research community. However, there is limited work harmonizing and supporting cross-cohort interoperability of the two resources. METHOD: In this paper, we leverage an ontology-based approach to harmonize data elements in the two resources and develop a web-based query system to search patient cohorts across the two resources. We first mapped data elements across NACC and ADNI, and performed value harmonization for the mapped data elements with inconsistent permissible values. Then we built an Alzheimer's Disease Data Element Ontology (ADEO) to model the mapped data elements in NACC and ADNI. We further developed a prototype cross-cohort query system to search patient cohorts across NACC and ADNI. RESULTS: After manual review, we found 172 mappings between NACC and ADNI. These 172 mappings were further used to construct common concepts in ADEO. Our data element mapping and harmonization resulted in five files storing common concepts, variables in NACC and ADNI, mappings between variables and common concepts, permissible values of categorical type data elements, and coding inconsistency harmonization, respectively. Our cross-cohort query system consists of three core architectural elements: a web-based interface, an advanced query engine, and a backend MongoDB database. CONCLUSIONS: In this work, ADEO has been specifically designed to facilitate data harmonization and cross-cohort query of NACC and ADNI data resources. 
Although our prototype cross-cohort query system was developed for exploring NACC and ADNI, its backend and frontend framework has been designed and implemented to be generally applicable to other domains for querying patient cohorts from multiple heterogeneous data sources.
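The mapping and value-harmonization steps described above can be pictured with a small sketch. Everything in it is an assumption made for illustration: the cohort labels, variable names, codes, and the in-memory query function are hypothetical and do not reproduce ADEO, the five mapping files, or the MongoDB-backed query engine.

```python
# Hypothetical mapping from (cohort, variable) to a shared common concept.
VARIABLE_TO_CONCEPT = {
    ("cohort_a", "sex"): "Sex",
    ("cohort_b", "gender"): "Sex",
}

# Hypothetical harmonization of inconsistent permissible values.
VALUE_MAP = {
    ("cohort_a", "sex"): {1: "male", 2: "female"},
    ("cohort_b", "gender"): {"M": "male", "F": "female"},
}


def harmonize(cohort, variable, raw_value):
    """Return (common_concept, harmonized_value), or None if unmapped."""
    key = (cohort, variable)
    concept = VARIABLE_TO_CONCEPT.get(key)
    if concept is None:
        return None
    return concept, VALUE_MAP[key].get(raw_value)


def query(records, concept, value):
    """Find record ids across cohorts that match a common concept and value."""
    return [
        r["id"] for r in records
        if harmonize(r["cohort"], r["variable"], r["value"]) == (concept, value)
    ]


if __name__ == "__main__":
    records = [
        {"id": "a-001", "cohort": "cohort_a", "variable": "sex", "value": 2},
        {"id": "b-001", "cohort": "cohort_b", "variable": "gender", "value": "F"},
        {"id": "b-002", "cohort": "cohort_b", "variable": "gender", "value": "M"},
    ]
    print(query(records, "Sex", "female"))
```

Keeping the variable mapping separate from the value mapping mirrors the abstract's distinction between mapping data elements and harmonizing their permissible values, so a single query phrased in common concepts can span both cohorts.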


Subjects
Alzheimer Disease , Humans , United States , Alzheimer Disease/diagnostic imaging , Neuroimaging