1.
AMIA Jt Summits Transl Sci Proc ; 2024: 384-390, 2024.
Article in English | MEDLINE | ID: mdl-38827064

ABSTRACT

This paper addresses the challenge of binary relation classification in biomedical Natural Language Processing (NLP), focusing on diverse domains including gene-disease associations, compound-protein interactions, and social determinants of health (SDOH). We evaluate different approaches, including fine-tuning Bidirectional Encoder Representations from Transformers (BERT) models and generative Large Language Models (LLMs), and examine their performance in zero- and few-shot settings. We also introduce a novel dataset of biomedical text annotated with social and clinical entities to facilitate research into relation classification. Our results underscore the continued complexity of this task for both humans and models. BERT-based models trained on domain-specific data excelled in certain domains and achieved comparable performance and generalization power to generative LLMs in others. Despite these encouraging results, these models are still far from achieving human-level performance. We also highlight the significance of high-quality training data and domain-specific fine-tuning on the performance of all the considered models.

5.
AMIA Annu Symp Proc ; 2023: 426-435, 2023.
Article in English | MEDLINE | ID: mdl-38222374

ABSTRACT

Chronic gastrointestinal (GI) conditions, such as inflammatory bowel diseases (IBD), offer a promising opportunity to create classification systems that can enhance the accuracy of predicting the most effective therapies and prognosis for each patient. Here, we present a novel methodology to explore disease subtypes using our open-sourced BiomedSciAI toolkit. Applying methods available in this toolkit on the UK Biobank, including subpopulation-based feature selection and multi-dimensional subset scanning, we aimed to discover unique subgroups from GI surgery cohorts. Of a 12,073-patient cohort, a subgroup of 440 IBD patients was discovered with an increased risk of a subsequent GI surgery (OR: 2.21, 95% CI [1.81-2.69]). We iteratively demonstrate the discovery process using an additional cohort (with a narrower definition of GI surgery). Our results show that the iterative process can refine the subgroup discovery process and generate novel hypotheses to investigate determinants of treatment response.


Subject(s)
Inflammatory Bowel Diseases , UK Biobank , Humans , Biological Specimen Banks , Inflammatory Bowel Diseases/surgery , Prognosis , Chronic Disease , Treatment Outcome
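
The abstract above reports an odds ratio with a 95% confidence interval (OR: 2.21, 95% CI [1.81-2.69]). As a reading aid, here is a minimal sketch of how such a figure can be derived from a 2x2 table using the standard Wald (Woolf) interval on the log odds ratio. The function name and the counts are hypothetical; the abstract does not give the underlying table or say which interval method the authors used.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald (Woolf) confidence interval for a 2x2 table.

    a: in subgroup, with event      b: in subgroup, without event
    c: outside subgroup, with event d: outside subgroup, without event
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only; NOT taken from the study.
or_value, ci_low, ci_high = odds_ratio_ci(a=30, b=70, c=20, d=80)
print(f"OR = {or_value:.2f}, 95% CI [{ci_low:.2f}-{ci_high:.2f}]")
```

An interval whose lower bound stays above 1, as in the reported result, indicates that the increased odds are unlikely to be explained by sampling variation alone.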
6.
Sci Rep ; 12(1): 12542, 2022 07 22.
Article in English | MEDLINE | ID: mdl-35869152

ABSTRACT

Prediction models are commonly used to estimate risk for cardiovascular diseases, to inform diagnosis and management. However, performance may vary substantially across relevant subgroups of the population. Here we investigated heterogeneity of accuracy and fairness metrics across a variety of subgroups for risk prediction of two common diseases: atrial fibrillation (AF) and atherosclerotic cardiovascular disease (ASCVD). We calculated the Cohorts for Heart and Aging in Genomic Epidemiology Atrial Fibrillation (CHARGE-AF) score for AF and the Pooled Cohort Equations (PCE) score for ASCVD in three large datasets: Explorys Life Sciences Dataset (Explorys, n = 21,809,334), Mass General Brigham (MGB, n = 520,868), and the UK Biobank (UKBB, n = 502,521). Our results demonstrate important performance heterogeneity across subpopulations defined by age, sex, and presence of preexisting disease, with fairly consistent patterns across both scores. For example, using CHARGE-AF, discrimination declined with increasing age, with a concordance index of 0.72 [95% CI 0.72-0.73] for the youngest (45-54 years) subgroup to 0.57 [0.56-0.58] for the oldest (85-90 years) subgroup in Explorys. Even though sex is not included in CHARGE-AF, the statistical parity difference (i.e., likelihood of being classified as high risk) was considerable between males and females within the 65-74 years subgroup with a value of -0.33 [95% CI -0.33 to -0.33]. We also observed weak discrimination (i.e., < 0.7) and suboptimal calibration (i.e., calibration slope outside of 0.7-1.3) in large subsets of the population; for example, all individuals aged 75 years or older in Explorys (17.4%). Our findings highlight the need to characterize and quantify the behavior of clinical risk models within specific subpopulations so they can be used appropriately to facilitate more accurate, consistent, and equitable assessment of disease risk.


Subject(s)
Atherosclerosis , Atrial Fibrillation , Cardiovascular Diseases , Atherosclerosis/epidemiology , Atrial Fibrillation/diagnosis , Atrial Fibrillation/epidemiology , Atrial Fibrillation/genetics , Cardiovascular Diseases/epidemiology , Female , Heart Disease Risk Factors , Humans , Male , Middle Aged , Risk Assessment/methods , Risk Factors
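
The statistical parity difference mentioned in the abstract above compares how often two groups are classified as high risk. A minimal sketch follows, assuming binary high-risk flags per individual. The function name, the direction of the subtraction (group A minus group B), and the sample data are assumptions for illustration; the abstract does not state which group is the reference.

```python
def statistical_parity_difference(flags_a, flags_b):
    """Return P(high risk | group A) - P(high risk | group B).

    flags_a, flags_b: sequences of 0/1 values, where 1 means the
    individual was classified as high risk by the score.
    A value of 0 means both groups are flagged at the same rate.
    """
    if not flags_a or not flags_b:
        raise ValueError("both groups must be non-empty")
    rate_a = sum(flags_a) / len(flags_a)
    rate_b = sum(flags_b) / len(flags_b)
    return rate_a - rate_b


# Hypothetical flags for illustration only; NOT taken from the study.
group_a = [1, 0, 0, 0]  # 25% flagged as high risk
group_b = [1, 1, 0, 0]  # 50% flagged as high risk
spd = statistical_parity_difference(group_a, group_b)
print(f"statistical parity difference = {spd:+.2f}")
```

Under this convention a negative value means group A is flagged as high risk less often than group B.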
7.
J Med Syst ; 46(6): 33, 2022 May 03.
Article in English | MEDLINE | ID: mdl-35505220

ABSTRACT

The article introduces a new type of authentication technique, denoted memory-memory (M2). A core component of M2 is its ability to collect and populate a voice profile database and use it to perform the verification process. The method relies on a database that includes voice profiles in the form of audio recordings of individuals; the profiles are interconnected based on known relationships between people, such that relationships can be used to determine which voice profiles to select to test a person's knowledge of the identity of the people in the recordings (e.g., their names, their relation to each other). Combining widely known concepts (e.g., humans are superior to computers in processing voices and computers are superior to humans in handling data) is expected to significantly enhance existing authentication methods (e.g., passwords, biometrics-based).


Subject(s)
Voice , Biometry , Databases, Factual , Humans , Knowledge
8.
Circulation ; 144(6): 410-422, 2021 08 10.
Article in English | MEDLINE | ID: mdl-34247495

ABSTRACT

BACKGROUND: Individuals of South Asian ancestry represent 23% of the global population, corresponding to 1.8 billion people, and have substantially higher risk of atherosclerotic cardiovascular disease compared with most other ethnicities. US practice guidelines now recognize South Asian ancestry as an important risk-enhancing factor. The magnitude of enhanced risk within the context of contemporary clinical care, the extent to which it is captured by existing risk estimators, and its potential mechanisms warrant additional study. METHODS: Within the UK Biobank prospective cohort study, 8124 middle-aged participants of South Asian ancestry and 449 349 participants of European ancestry who were free of atherosclerotic cardiovascular disease at the time of enrollment were examined. The relationship of ancestry to risk of incident atherosclerotic cardiovascular disease (defined as myocardial infarction, coronary revascularization, or ischemic stroke) was assessed with Cox proportional hazards regression, along with examination of a broad range of clinical, anthropometric, and lifestyle mediators. RESULTS: The mean age at study enrollment was 57 years, and 202 405 (44%) were male. Over a median follow-up of 11 years, 554 of 8124 (6.8%) individuals of South Asian ancestry experienced an atherosclerotic cardiovascular disease event compared with 19 756 of 449 349 (4.4%) individuals of European ancestry, corresponding to an adjusted hazard ratio of 2.03 (95% CI, 1.86-2.22; P<0.001). This higher relative risk was largely consistent across a range of age, sex, and clinical subgroups. Despite the >2-fold higher observed risk, the predicted 10-year risk of cardiovascular disease according to the American Heart Association/American College of Cardiology Pooled Cohort equations and QRISK3 equations was nearly identical for individuals of South Asian and European ancestry.
Adjustment for a broad range of clinical, anthropometric, and lifestyle risk factors led to only modest attenuation of the observed hazard ratio to 1.45 (95% CI, 1.28-1.65, P<0.001). Assessment of variance explained by 18 candidate risk factors suggested greater importance of hypertension, diabetes, and central adiposity in South Asian individuals. CONCLUSIONS: Within a large prospective study, South Asian individuals had substantially higher risk of atherosclerotic cardiovascular disease compared with individuals of European ancestry, and this risk was not captured by the Pooled Cohort Equations.


Subject(s)
Asian People , Atherosclerosis/epidemiology , Atherosclerosis/etiology , Adult , Aged , Biological Specimen Banks , Disease Susceptibility , Female , Follow-Up Studies , Heart Disease Risk Factors , Humans , Male , Middle Aged , Population Surveillance , Proportional Hazards Models , Risk Assessment , Risk Factors , United Kingdom/epidemiology , United Kingdom/ethnology
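
The event proportions quoted in the abstract above (6.8% and 4.4%) can be reproduced directly from the stated counts. The short sketch below does that and also computes the crude (unadjusted) ratio of the two proportions. The variable names are illustrative. The crude ratio is not the reported hazard ratio: the abstract's 2.03 comes from an adjusted Cox proportional hazards model, which this arithmetic does not attempt to replicate.

```python
# Counts as stated in the abstract.
south_asian_events, south_asian_total = 554, 8124
european_events, european_total = 19756, 449349

# Proportion of each group with an event during follow-up.
south_asian_risk = south_asian_events / south_asian_total
european_risk = european_events / european_total

# Crude ratio of proportions: no adjustment and no time-to-event modeling.
crude_risk_ratio = south_asian_risk / european_risk

print(f"South Asian ancestry: {100 * south_asian_risk:.1f}%")
print(f"European ancestry:    {100 * european_risk:.1f}%")
print(f"Crude risk ratio:     {crude_risk_ratio:.2f}")
```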
9.
J Med Syst ; 45(5): 57, 2021 Mar 30.
Article in English | MEDLINE | ID: mdl-33783646
11.
Sci Rep ; 11(1): 1139, 2021 01 13.
Article in English | MEDLINE | ID: mdl-33441956

ABSTRACT

To support point-of-care decision making by presenting outcomes of past treatment choices for cohorts of similar patients based on observational data from electronic health records (EHRs), a machine-learning precision cohort treatment option (PCTO) workflow consisting of (1) data extraction, (2) similarity model training, (3) precision cohort identification, and (4) treatment options analysis was developed. The similarity model is used to dynamically create a cohort of similar patients, to inform clinical decisions about an individual patient. The workflow was implemented using EHR data from a large health care provider for three different highly prevalent chronic diseases: hypertension (HTN), type 2 diabetes mellitus (T2DM), and hyperlipidemia (HL). A retrospective analysis demonstrated that treatment options with better outcomes were available for a majority of cases (75%, 74%, 85% for HTN, T2DM, HL, respectively). The models for HTN and T2DM were deployed in a pilot study with primary care physicians using it during clinic visits. A novel data-analytic workflow was developed to create patient-similarity models that dynamically generate personalized treatment insights at the point-of-care. By leveraging both knowledge-driven treatment guidelines and data-driven EHR data, physicians can incorporate real-world evidence in their medical decision-making process when considering treatment options for individual patients.


Subject(s)
Diabetes Mellitus, Type 2/therapy , Hyperlipidemias/therapy , Hypertension/therapy , Cohort Studies , Data Mining , Electronic Health Records , Humans , Machine Learning , Precision Medicine , Retrospective Studies , Workflow
13.
Circ Arrhythm Electrophysiol ; 14(1): e008997, 2021 01.
Article in English | MEDLINE | ID: mdl-33295794

ABSTRACT

BACKGROUND: Atrial fibrillation (AF) is associated with increased risks of stroke and heart failure. Electronic health record (EHR)-based AF risk prediction may facilitate efficient deployment of interventions to diagnose or prevent AF altogether. METHODS: We externally validated an electronic health record AF (EHR-AF) score in IBM Explorys Life Sciences, a multi-institutional dataset containing statistically deidentified EHR data for over 21 million individuals (Explorys Dataset). We included individuals with complete AF risk data, ≥2 office visits within 2 years, and no prevalent AF. We compared EHR-AF to existing scores including CHARGE-AF (Cohorts for Heart and Aging Research in Genomic Epidemiology Atrial Fibrillation), C2HEST (coronary artery disease or chronic obstructive pulmonary disease, hypertension, elderly, systolic heart failure, thyroid disease), and CHA2DS2-VASc. We assessed association between AF risk scores and 5-year incident AF, stroke, and heart failure using Cox proportional hazards modeling, 5-year AF discrimination using C indices, and calibration of predicted AF risk to observed AF incidence. RESULTS: Of 21 825 853 individuals in the Explorys Dataset, 4 508 180 comprised the analysis (age 62.5, 56.3% female). AF risk scores were strongly associated with 5-year incident AF (hazard ratio per SD increase 1.85 using CHA2DS2-VASc to 2.88 using EHR-AF), stroke (1.61 using C2HEST to 1.92 using CHARGE-AF), and heart failure (1.91 using CHA2DS2-VASc to 2.58 using EHR-AF). EHR-AF (C index, 0.808 [95% CI, 0.807-0.809]) demonstrated favorable AF discrimination compared to CHARGE-AF (0.806 [95% CI, 0.805-0.807]), C2HEST (0.683 [95% CI, 0.682-0.684]), and CHA2DS2-VASc (0.720 [95% CI, 0.719-0.722]). Of the scores, EHR-AF demonstrated the best calibration to incident AF (calibration slope, 1.002 [95% CI, 0.997-1.007]). 
In subgroup analyses, AF discrimination using EHR-AF was lower in individuals with stroke (C index, 0.696 [95% CI, 0.692-0.700]) and heart failure (0.621 [95% CI, 0.617-0.625]). CONCLUSIONS: EHR-AF demonstrates predictive accuracy for incident AF using readily ascertained EHR data. AF risk is associated with incident stroke and heart failure. Use of such risk scores may facilitate decision support and population health management efforts focused on minimizing AF-related morbidity.


Subject(s)
Atrial Fibrillation/diagnosis , Risk Assessment/methods , Stroke/epidemiology , Age Factors , Atrial Fibrillation/complications , Female , Global Health , Humans , Incidence , Male , Middle Aged , Stroke/etiology , Survival Rate/trends
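
The C index used in the abstract above measures discrimination: the probability that a person who develops the outcome was assigned a higher score than a person who does not. Below is a minimal sketch for a binary outcome, with ties in score counted as half-concordant. It is a simplification under stated assumptions: the study evaluated 5-year incident AF with survival methods, and handling of follow-up time and censoring is deliberately omitted here. The function name and sample data are hypothetical.

```python
def concordance_index(scores, events):
    """Pairwise concordance (C index) for a binary outcome.

    scores: predicted risk score per individual
    events: 1 if the individual had the outcome, else 0
    Every (event, non-event) pair is compared; the pair is concordant
    when the individual with the event has the higher score.
    """
    if len(scores) != len(events):
        raise ValueError("scores and events must have the same length")
    concordant = 0.0
    comparable = 0
    for score_i, event_i in zip(scores, events):
        if event_i != 1:
            continue
        for score_j, event_j in zip(scores, events):
            if event_j != 0:
                continue
            comparable += 1
            if score_i > score_j:
                concordant += 1.0
            elif score_i == score_j:
                concordant += 0.5  # tie in score counts as half
    if comparable == 0:
        raise ValueError("need at least one event and one non-event")
    return concordant / comparable


# Hypothetical scores and outcomes for illustration only.
example_scores = [0.90, 0.40, 0.35, 0.80]
example_events = [1, 0, 1, 0]
c_index = concordance_index(example_scores, example_events)
print(f"C index = {c_index:.2f}")
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking, which is why the abstract treats 0.808 as favorable and 0.621 as notably weaker.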
14.
J Am Med Inform Assoc ; 28(3): 588-595, 2021 03 01.
Article in English | MEDLINE | ID: mdl-33180897

ABSTRACT

OBJECTIVE: To present clinicians at the point-of-care with real-world data on the effectiveness of various treatment options in a precision cohort of patients closely matched to the index patient. MATERIALS AND METHODS: We developed disease-specific, machine-learning, patient-similarity models for hypertension (HTN), type II diabetes mellitus (T2DM), and hyperlipidemia (HL) using data on approximately 2.5 million patients in a large medical group practice. For each identified decision point, an encounter during which the patient's condition was not controlled, we compared the actual outcome of the treatment decision administered to that of the best-achieved outcome for similar patients in similar clinical situations. RESULTS: For the majority of decision points (66.8%, 59.0%, and 83.5% for HTN, T2DM, and HL, respectively), there were alternative treatment options administered to patients in the precision cohort that resulted in a significantly higher proportion of patients under control than the treatment option chosen for the index patient. The expected percentage of patients whose condition would have been controlled if the best-practice treatment option had been chosen would have been better than the actual percentage by: 36% (65.1% vs 48.0%, HTN), 68% (37.7% vs 22.5%, T2DM), and 138% (75.3% vs 31.7%, HL). CONCLUSION: Clinical guidelines are primarily based on the results of randomized controlled trials, which apply to a homogeneous subject population. Providing the effectiveness of various treatment options used in a precision cohort of patients similar to the index patient can provide complementary information to tailor guideline recommendations for individual patients and potentially improve outcomes.


Subject(s)
Decision Making, Computer-Assisted , Diabetes Mellitus, Type 2/therapy , Hyperlipidemias/therapy , Hypertension/therapy , Machine Learning , Patient Care Management/methods , Practice Guidelines as Topic , Electronic Health Records , Evidence-Based Medicine , Humans , Treatment Outcome
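
The improvement figures in the abstract above (36%, 68%, and 138%) appear to be relative improvements of the expected control rate over the actual control rate. The sketch below checks that reading against the percentages quoted in the abstract. The function name is illustrative, and the interpretation is an assumption that happens to match the reported arithmetic.

```python
def relative_improvement(expected_pct, actual_pct):
    """Relative improvement of expected over actual, in percent."""
    if actual_pct <= 0:
        raise ValueError("actual percentage must be positive")
    return (expected_pct - actual_pct) / actual_pct * 100


# (expected % controlled, actual % controlled) as quoted in the abstract.
reported_rates = {
    "HTN": (65.1, 48.0),
    "T2DM": (37.7, 22.5),
    "HL": (75.3, 31.7),
}

improvements = {
    condition: relative_improvement(expected, actual)
    for condition, (expected, actual) in reported_rates.items()
}

for condition, value in improvements.items():
    print(f"{condition}: {value:.0f}% relative improvement")
```

The absolute differences are much smaller (for example, 17.1 percentage points for HTN), so the two framings should not be confused when comparing conditions.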
15.
Transplantation ; 104(6): e182, 2020 06.
Article in English | MEDLINE | ID: mdl-32433233

Subject(s)
Liver , Sodium , Humans
16.
NPJ Digit Med ; 3: 8, 2020.
Article in English | MEDLINE | ID: mdl-31993506

ABSTRACT

The ability to identify patients who are likely to have an adverse outcome is an essential component of good clinical care. Therefore, predictive risk stratification models play an important role in clinical decision making. Determining whether a given predictive model is suitable for clinical use usually involves evaluating the model's performance on large patient datasets using standard statistical measures of success (e.g., accuracy, discriminatory ability). However, as these metrics correspond to averages over patients who have a range of different characteristics, it is difficult to discern whether an individual prediction on a given patient should be trusted using these measures alone. In this paper, we introduce a new method for identifying patient subgroups where a predictive model is expected to be poor, thereby highlighting when a given prediction is misleading and should not be trusted. The resulting "unreliability score" can be computed for any clinical risk model and is suitable in the setting of large class imbalance, a situation often encountered in healthcare settings. Using data from more than 40,000 patients in the Global Registry of Acute Coronary Events (GRACE), we demonstrate that patients with high unreliability scores form a subgroup in which the predictive model has both decreased accuracy and decreased discriminatory ability.

19.
Curr Med Res Opin ; 35(12): 2063-2070, 2019 12.
Article in English | MEDLINE | ID: mdl-31337263

ABSTRACT

Aims: To assess demographic and clinical characteristics associated with clinical inertia in a real-world cohort of type 2 diabetes mellitus patients not at hemoglobin A1c goal (<7%) on metformin monotherapy. Methods: Adult (≥18 years) type 2 diabetes mellitus patients who received care at Massachusetts General Hospital/Brigham and Women's Hospital and received a new metformin prescription between 1992 and 2010 were included in the analysis. Clinical inertia was defined as two consecutive hemoglobin A1c measures ≥7% ≥3 months apart while remaining on metformin monotherapy (i.e. without add-on therapy). The association between clinical inertia and demographic and clinical characteristics was examined via logistic regression. Results: Of 2848 eligible patients, 43% did not achieve a hemoglobin A1c goal of <7% 3 months after metformin monotherapy initiation. A sub-group of 1533 patients was included in the clinical inertia analysis, of which 36% experienced clinical inertia. Asian race was associated with an increased likelihood of clinical inertia (OR = 2.43; 95% CI = 1.48-3.96), while congestive heart failure had a decreased likelihood (OR = 0.58; 95% CI = 0.32-0.98). Chronic kidney disease and cardiovascular/cerebrovascular disease had weaker associations but were directionally similar to congestive heart failure. Conclusions: Asian patients were at an increased risk of clinical inertia, whereas patients with comorbidities appeared to have their treatment more appropriately intensified. A better understanding of these factors may inform efforts to decrease the likelihood for clinical inertia.


Subject(s)
Diabetes Mellitus, Type 2 , Glycated Hemoglobin/analysis , Medication Therapy Management/standards , Metformin/therapeutic use , Practice Patterns, Physicians'/standards , Diabetes Mellitus, Type 2/blood , Diabetes Mellitus, Type 2/drug therapy , Diabetes Mellitus, Type 2/epidemiology , Electronic Health Records/statistics & numerical data , Female , Health Services Needs and Demand , Humans , Hypoglycemic Agents/therapeutic use , Male , Middle Aged , Retrospective Studies , Risk Factors , United States/epidemiology
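
The definition of clinical inertia in the abstract above (two consecutive hemoglobin A1c measures ≥7%, at least 3 months apart, while remaining on metformin monotherapy) can be expressed as a simple rule over a patient's measurement history. The sketch below is one possible encoding. The record layout, the function name, and the approximation of 3 months as 90 days are assumptions; the abstract does not describe how the authors implemented the rule.

```python
from datetime import date


def has_clinical_inertia(measurements, threshold=7.0, min_gap_days=90):
    """Flag clinical inertia from a patient's HbA1c history.

    measurements: list of (measurement_date, hba1c_percent, on_monotherapy)
        tuples, where on_monotherapy is True if the patient was on
        metformin alone at that date (hypothetical record layout).
    Returns True if any two consecutive measurements are both at or
    above the threshold, at least min_gap_days apart, and both taken
    while on monotherapy.
    """
    ordered = sorted(measurements, key=lambda record: record[0])
    for first, second in zip(ordered, ordered[1:]):
        first_date, first_value, first_mono = first
        second_date, second_value, second_mono = second
        both_elevated = first_value >= threshold and second_value >= threshold
        far_enough = (second_date - first_date).days >= min_gap_days
        if both_elevated and far_enough and first_mono and second_mono:
            return True
    return False


# Hypothetical patient histories for illustration only.
inert_history = [
    (date(2005, 1, 10), 7.8, True),
    (date(2005, 5, 2), 7.4, True),
]
controlled_history = [
    (date(2005, 1, 10), 7.8, True),
    (date(2005, 5, 2), 6.5, True),
]
print(has_clinical_inertia(inert_history))
print(has_clinical_inertia(controlled_history))
```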