1.
Comput Biol Med ; 168: 107754, 2024 01.
Article in English | MEDLINE | ID: mdl-38016372

ABSTRACT

Hospital-acquired pressure injury is one of the most harmful events in clinical settings. Patients who do not receive early prevention and treatment can experience a significant financial burden and physical trauma. Several hospital-acquired pressure injury prediction algorithms have been developed to tackle this problem, but these models assume a consensus, gold-standard label (i.e., presence of pressure injury or not) is present for all training data. Existing definitions for identifying hospital-acquired pressure injuries are inconsistent due to the lack of high-quality documentation surrounding pressure injuries. To address this issue, we propose in this paper an ensemble-based algorithm that leverages truth inference methods to resolve label inconsistencies between various case definitions and the level of disagreements in annotations. Application of our method to MIMIC-III, a publicly available intensive care unit dataset, gives empirical results that illustrate the promise of learning a prediction model using truth inference-based labels and observed conflict among annotators.


Subject(s)
Pressure Ulcer , Humans , Pressure Ulcer/diagnosis , Algorithms , Intensive Care Units , Hospitals
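
To illustrate the idea of resolving conflicting case definitions described in the abstract above, here is a minimal sketch that uses majority voting, the simplest truth-inference baseline. It is not the ensemble method of the paper; the function name `infer_label`, the stay identifiers, and the votes are hypothetical, and the tie-breaking rule is an arbitrary assumption.

```python
from collections import Counter


def infer_label(votes):
    """Return (majority label, conflict score) for one hospital stay.

    votes: list of 0/1 labels, one per case definition (hypothetical input).
    The conflict score is the share of votes that disagree with the majority.
    Ties are resolved toward the positive label (an assumption of this sketch).
    """
    counts = Counter(votes)
    positives, negatives = counts[1], counts[0]
    label = 1 if positives >= negatives else 0
    conflict = min(positives, negatives) / len(votes)
    return label, conflict


# Hypothetical votes from four case definitions for three stays.
stay_votes = {
    "stay_a": [1, 1, 1, 1],  # full agreement
    "stay_b": [1, 0, 1, 0],  # maximal disagreement
    "stay_c": [0, 0, 1, 0],  # one dissenting definition
}
inferred = {stay: infer_label(votes) for stay, votes in stay_votes.items()}
```

The conflict score could then serve as a per-sample weight or feature when training a downstream classifier, which is one way to use "observed conflict among annotators".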
2.
Comput Inform Nurs ; 42(3): 184-192, 2024 Mar 01.
Article in English | MEDLINE | ID: mdl-37607706

ABSTRACT

Incidence of hospital-acquired pressure injury, a key indicator of nursing quality, is directly proportional to adverse outcomes, increased hospital stays, and economic burdens on patients, caregivers, and society. Thus, predicting hospital-acquired pressure injury is important. Prediction models use structured data more often than unstructured notes, although the latter often contain useful patient information. We hypothesize that unstructured notes, such as nursing notes, can predict hospital-acquired pressure injury. We evaluate the impact of using various natural language processing packages to identify salient patient information from unstructured text. We use named entity recognition to identify keywords, which comprise the feature space of our classifier for hospital-acquired pressure injury prediction. We compare scispaCy and Stanza, two different named entity recognition models, using unstructured notes in Medical Information Mart for Intensive Care III, a publicly available ICU data set. To assess the impact of vocabulary size reduction, we compare the use of all clinical notes with only nursing notes. Our results suggest that named entity recognition extraction using nursing notes can yield accurate models. Moreover, the extracted keywords play a significant role in the prediction of hospital-acquired pressure injury.


Subject(s)
Natural Language Processing , Pressure Ulcer , Humans , Pressure Ulcer/diagnosis , Critical Care , Hospitals
3.
Cancer Causes Control ; 35(5): 749-760, 2024 May.
Article in English | MEDLINE | ID: mdl-38145439

ABSTRACT

INTRODUCTION: The NIH All of Us Research Program has enrolled over 544,000 participants across the US with unprecedented racial/ethnic diversity, offering opportunities to investigate myriad exposures and diseases. This paper aims to investigate the association between PM2.5 exposure and cancer risks. MATERIALS AND METHODS: This work was performed on data from 409,876 All of Us Research Program participants using the All of Us Researcher Workbench. Cancer case ascertainment was performed using data from electronic health records and the self-reported Personal Medical History questionnaire. PM2.5 exposure was retrieved from NASA's Earth Observing System Data and Information Center and assigned using participants' 3-digit zip code prefixes. Multivariate logistic regression was used to estimate the odds ratio (OR) and 95% confidence interval (CI). Generalized additive models (GAMs) were used to investigate non-linear relationships. RESULTS: A total of 33,387 participants and 46,176 prevalent cancer cases were ascertained from participant EHR data, while 20,297 cases were ascertained from self-reported survey data from 18,133 participants; 9,502 cancer cases were captured in both the EHR and survey data. Average PM2.5 level from 2007 to 2016 was 8.90 µg/m3 (min 2.56, max 15.05). In the analysis of cancer cases from EHR, increased odds were observed for breast cancer (OR 1.17, 95% CI 1.09-1.25), endometrial cancer (OR 1.33, 95% CI 1.09-1.62) and ovarian cancer (OR 1.20, 95% CI 1.01-1.42) in the 4th quartile of exposure compared to the 1st. In GAM, higher PM2.5 concentration was associated with increased odds for blood cancer, bone cancer, brain cancer, breast cancer, colon and rectum cancer, endocrine system cancer, lung cancer, pancreatic cancer, prostate cancer, and thyroid cancer. CONCLUSIONS: We found evidence of an association of PM2.5 with breast, ovarian, and endometrial cancers. There is little to no prior evidence in the literature on the impact of PM2.5 on risk of these cancers, warranting further investigation.


Subject(s)
Neoplasms , Humans , Female , Male , Neoplasms/epidemiology , Neoplasms/etiology , United States/epidemiology , Middle Aged , Adult , Air Pollution/adverse effects , Air Pollution/analysis , Risk Factors , Aged , Particulate Matter/adverse effects , Particulate Matter/analysis , Environmental Exposure/adverse effects , Young Adult
4.
Article in English | MEDLINE | ID: mdl-37332899

ABSTRACT

Aims: Various cardiovascular risk prediction models have been developed for patients with type 2 diabetes mellitus. Yet few models have been validated externally. We perform a comprehensive validation of existing risk models on a heterogeneous population of patients with type 2 diabetes using secondary analysis of electronic health record data. Methods: Electronic health records of 47,988 patients with type 2 diabetes between 2013 and 2017 were used to validate 16 cardiovascular risk models, including 5 that had not been compared previously, to estimate the 1-year risk of various cardiovascular outcomes. Discrimination and calibration were assessed by the c-statistic and the Hosmer-Lemeshow goodness-of-fit statistic, respectively. Each model was also evaluated based on the missing measurement rate. Sub-analysis was performed to determine the impact of race on discrimination performance. Results: There was limited discrimination (c-statistics ranged from 0.51 to 0.67) across the cardiovascular risk models. Discrimination generally improved when the model was tailored towards the individual outcome. After recalibration of the models, the Hosmer-Lemeshow statistic yielded p-values above 0.05. However, several of the models with the best discrimination relied on measurements that were often imputed (up to 39% missing). Conclusion: No single prediction model achieved the best performance on a full range of cardiovascular endpoints. Moreover, several of the highest-scoring models relied on variables with high missingness frequencies such as HbA1c and cholesterol that necessitated data imputation and may not be as useful in practice. An open-source version of our developed Python package, cvdm, is available for comparisons using other data sources.
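
The c-statistic used above to assess discrimination can be computed directly from its definition: the probability that a randomly chosen patient with the outcome receives a higher predicted risk than a randomly chosen patient without it. The sketch below is a plain-Python illustration with made-up risks and outcomes; it is not taken from the cvdm package, and the function name `c_statistic` is an assumption.

```python
def c_statistic(risks, outcomes):
    """Concordance over all (event, non-event) pairs; ties count as 0.5."""
    concordant = 0.0
    pairs = 0
    for risk_event, y_event in zip(risks, outcomes):
        if y_event != 1:
            continue
        for risk_other, y_other in zip(risks, outcomes):
            if y_other != 0:
                continue
            pairs += 1
            if risk_event > risk_other:
                concordant += 1.0
            elif risk_event == risk_other:
                concordant += 0.5
    if pairs == 0:
        raise ValueError("need at least one event and one non-event")
    return concordant / pairs


# Hypothetical predicted 1-year risks and observed outcomes for four patients.
example_risks = [0.9, 0.4, 0.35, 0.2]
example_outcomes = [1, 0, 1, 0]
example_c = c_statistic(example_risks, example_outcomes)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is why the reported range of 0.51 to 0.67 is described as limited discrimination.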

5.
AMIA Jt Summits Transl Sci Proc ; 2023: 582-591, 2023.
Article in English | MEDLINE | ID: mdl-37350881

ABSTRACT

Electronic health records (EHR) data contain rich information about patients' health conditions, including diagnoses, procedures, and medications, which have been widely used to facilitate digital medicine. Despite its importance, it is often non-trivial to learn useful representations for patients' visits that support downstream clinical predictions, as each visit contains massive and diverse medical codes. As a result, the complex interactions among medical codes are often not captured, which leads to substandard predictions. To better model these complex relations, we leverage hypergraphs, which go beyond pairwise relations to jointly learn the representations for visits and medical codes. We also propose to use the self-attention mechanism to automatically identify the most relevant medical codes for each visit based on the downstream clinical predictions with better generalization power. Experiments on two EHR datasets show that our proposed method not only yields superior performance, but also provides reasonable insights towards the target tasks.

6.
Otolaryngol Head Neck Surg ; 168(6): 1279-1288, 2023 06.
Article in English | MEDLINE | ID: mdl-36939620

ABSTRACT

OBJECTIVE: In primary parotid gland malignancies, the incidence of level-specific cervical lymph node metastasis in clinically node-positive necks remains unclear. This study aimed to determine the incidence of level-specific cervical node metastasis in clinically node-negative (cN0) and node-positive (cN+) patients who presented with primary parotid malignancies. DATA SOURCES: Electronic databases (MEDLINE, EMBASE, PubMed, Cochrane). REVIEW METHODS: Random-effects meta-analysis was used to calculate pooled estimate incidence of level-specific nodal metastasis for parotid malignancies with 95% confidence intervals (CIs). Subgroup analyses of cN0 and cN+ were performed. RESULTS: Thirteen publications consisting of 818 patients were included. The overall incidence of cervical nodal involvement in all neck dissections was 47% (95% CI, 31%-63%). Among those who were cN+, the incidence of nodal positivity was 89% (95% CI, 75%-98%). Those who were cN0 had an incidence of 32% (95% CI, 14%-53%). In cN+ patients, the incidence of nodal metastasis was high at all levels (level I 33%, level II 73%, level III 48%, level IV 39%, and level V 37%). In cN0 patients, the incidence of nodal metastasis was highest at levels II (28%) and III (11%). CONCLUSION: For primary parotid malignancies, the incidence of occult metastases was 32% compared to 89% in a clinically positive neck. It is recommended that individuals with a primary parotid malignancy requiring elective treatment of the neck have a selective neck dissection which involves levels II to III, with the inclusion of level IV based on clinical judgment. Those undergoing a therapeutic neck dissection should undergo a comprehensive neck dissection (levels I-V).


Subject(s)
Carcinoma , Parotid Neoplasms , Humans , Parotid Neoplasms/pathology , Parotid Gland/surgery , Incidence , Retrospective Studies , Carcinoma/pathology , Neck Dissection , Lymph Nodes/pathology , Neoplasm Staging
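
The random-effects pooling named in the review methods above can be sketched with the DerSimonian-Laird estimator. This is an assumption about the general technique, not the authors' exact procedure: published meta-analyses of proportions often apply a variance-stabilizing transform first, which is omitted here. The study counts are invented for illustration and are not the 13 included publications.

```python
import math


def pooled_proportion(studies, z=1.96):
    """DerSimonian-Laird random-effects pooling of raw proportions.

    studies: list of (events, total) pairs with 0 < events < total.
    Returns (pooled estimate, CI lower bound, CI upper bound, tau squared).
    """
    effects = [events / total for events, total in studies]
    variances = [p * (1 - p) / total for p, (_, total) in zip(effects, studies)]
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    # Fixed-effect estimate, needed for Cochran's Q.
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = sum_w - sum(w * w for w in weights) / sum_w
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (len(studies) - 1)) / c) if c > 0 else 0.0
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    return pooled, pooled - z * se, pooled + z * se, tau2


# Hypothetical (node-positive patients, patients dissected) per study.
hypothetical_studies = [(5, 50), (25, 50), (40, 50)]
pooled, ci_low, ci_high, tau2 = pooled_proportion(hypothetical_studies)
```

When the studies disagree strongly, as in this invented input, the between-study variance is positive and the interval widens relative to a fixed-effect analysis.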
7.
Medicine (Baltimore) ; 102(10): e32859, 2023 Mar 10.
Article in English | MEDLINE | ID: mdl-36897716

ABSTRACT

This study aimed to determine the hepatitis C virus (HCV) care cascade among persons who were born during 1945 to 1965 and received outpatient care on or after January 2014 at a large academic healthcare system. Deidentified electronic health record data in an existing research database were analyzed for this study. Laboratory test results for HCV antibody and HCV ribonucleic acid (RNA) indicated seropositivity and confirmatory testing. HCV genotyping was used as a proxy for linkage to care. A direct-acting antiviral (DAA) prescription indicated treatment initiation, and an undetectable HCV RNA at least 20 weeks after initiation of antiviral treatment indicated a sustained virologic response. Of the 121,807 patients in the 1945 to 1965 birth cohort who received outpatient care between January 1, 2014 and June 30, 2017, 3399 (3%) patients were screened for HCV; 540 (16%) were seropositive. Among the seropositive, 442 (82%) had detectable HCV RNA, 68 (13%) had undetectable HCV RNA, and 30 (6%) lacked HCV RNA testing. Of the 442 viremic patients, 237 (54%) were linked to care, 65 (15%) initiated DAA treatment, and 32 (7%) achieved sustained virologic response. While only 3% were screened for HCV, the seroprevalence was high in the screened sample. Despite the established safety and efficacy of DAAs, only 15% initiated treatment during the study period. To achieve HCV elimination, improved HCV screening and linkage to HCV care and DAA treatment are needed.


Subject(s)
Hepatitis C, Chronic , Hepatitis C , Humans , Hepacivirus/genetics , Antiviral Agents/therapeutic use , Seroepidemiologic Studies , Hepatitis C, Chronic/drug therapy , Hepatitis C/drug therapy , Delivery of Health Care , Sustained Virologic Response , RNA, Viral
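
The care-cascade percentages in the abstract above follow from the reported counts once the denominator for each step is made explicit. The sketch below recomputes them; the counts are taken from the abstract, while the dictionary layout and the helper name `pct` are illustrative choices.

```python
def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


birth_cohort = 121_807
screened = 3_399
seropositive = 540
viremic = 442

cascade = {
    # Step: (count, denominator used in the abstract)
    "screened": (screened, birth_cohort),
    "seropositive": (seropositive, screened),
    "detectable RNA": (viremic, seropositive),
    "undetectable RNA": (68, seropositive),
    "no RNA test": (30, seropositive),
    "linked to care": (237, viremic),
    "initiated DAA": (65, viremic),
    "sustained virologic response": (32, viremic),
}
percentages = {step: pct(n, d) for step, (n, d) in cascade.items()}
```

Note that the later steps use the 442 viremic patients as the denominator, not the previous step, so 7% with sustained virologic response means 32 of 442 viremic patients rather than 32 of the 65 who started treatment.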
8.
JMIR Med Inform ; 11: e40672, 2023 Feb 23.
Article in English | MEDLINE | ID: mdl-36649481

ABSTRACT

BACKGROUND: Patients develop pressure injuries (PIs) in the hospital owing to low mobility, exposure to localized pressure, circulatory conditions, and other predisposing factors. Over 2.5 million Americans develop PIs annually. The Center for Medicare and Medicaid considers hospital-acquired PIs (HAPIs) as the most frequent preventable event, and they are the second most common claim in lawsuits. With the growing use of electronic health records (EHRs) in hospitals, an opportunity exists to build machine learning models to identify and predict HAPI rather than relying on occasional manual assessments by human experts. However, accurate computational models rely on high-quality HAPI data labels. Unfortunately, the different data sources within EHRs can provide conflicting information on HAPI occurrence in the same patient. Furthermore, the existing definitions of HAPI disagree with each other, even within the same patient population. The inconsistent criteria make it impossible to benchmark machine learning methods to predict HAPI. OBJECTIVE: The objective of this project was threefold. We aimed to identify discrepancies in HAPI sources within EHRs, to develop a comprehensive definition for HAPI classification using data from all EHR sources, and to illustrate the importance of an improved HAPI definition. METHODS: We assessed the congruence among HAPI occurrences documented in clinical notes, diagnosis codes, procedure codes, and chart events from the Medical Information Mart for Intensive Care III database. We analyzed the criteria used for the 3 existing HAPI definitions and their adherence to the regulatory guidelines. We proposed the Emory HAPI (EHAPI), which is an improved and more comprehensive HAPI definition. We then evaluated the importance of the labels in training a HAPI classification model using tree-based and sequential neural network classifiers. 
RESULTS: We illustrate the complexity of defining HAPI, with <13% of hospital stays having at least 3 PI indications documented across 4 data sources. Although chart events were the most common indicator, it was the only PI documentation for >49% of the stays. We demonstrate a lack of congruence across existing HAPI definitions and EHAPI, with only 219 stays having a consensus positive label. Our analysis highlights the importance of our improved HAPI definition, with classifiers trained using our labels outperforming others on a small manually labeled set from nurse annotators and a consensus set in which all definitions agreed on the label. CONCLUSIONS: Standardized HAPI definitions are important for accurately assessing the HAPI nursing quality metric and determining HAPI incidence for preventive measures. We demonstrate the complexity of defining an occurrence of HAPI, given the conflicting and incomplete EHR data. Our EHAPI definition has favorable properties, making it a suitable candidate for HAPI classification tasks.

9.
Br J Oral Maxillofac Surg ; 61(1): 101-106, 2023 01.
Article in English | MEDLINE | ID: mdl-36586735

ABSTRACT

The purpose of this study was to determine the relationship of early and delayed tracheostomy decannulation protocols on the length of stay, time to oral feeding and incidence of postoperative complications in patients undergoing microvascular reconstruction for oral cancer. A review of all patients who underwent surgical management of oral squamous cell carcinoma (OSCC) over the study period from 01/07/2017 to 31/06/2021 was performed. Patients who underwent elective tracheostomy as part of their microvascular reconstruction were included. Two cohorts were identified based on distinct postoperative tracheostomy decannulation protocols: early (within 7 days) and delayed (≥7 days). Time to oral feeding, length of stay and complication rates were determined for both groups for statistical analysis. A total of 103 patients with OSCC were included in the study. The overall complication rate was 35.9%, and complications were more likely in node-positive patients (53.7% vs 23.2%; p = 0.003) and in cases where the geniohyoid muscle complex was disrupted during tumour resection (66.7% vs 31.9%; p = 0.026). Early decannulation was significantly associated with shorter length of hospital stay (10 days vs 15 days) and earlier removal of nasogastric feeding tubes (7 vs 10 days). There was no difference in the overall complication rate between the two groups (33.3% vs 37.5%; p = 0.833). Early decannulation in appropriately selected patients is recommended as it significantly reduces the length of hospital stay and aids in early resumption of oral intake. Furthermore, this approach is not associated with increased rates of complications.


Subject(s)
Carcinoma, Squamous Cell , Mouth Neoplasms , Humans , Carcinoma, Squamous Cell/surgery , Length of Stay , Mouth Neoplasms/surgery , Postoperative Complications/epidemiology , Postoperative Complications/etiology , Postoperative Complications/surgery , Retrospective Studies , Tracheostomy/methods
10.
Proc AAAI Conf Artif Intell ; 37(9): 10611-10619, 2023 Jun 27.
Article in English | MEDLINE | ID: mdl-38333625

ABSTRACT

Training deep neural networks (DNNs) with limited supervision has been a popular research topic as it can significantly alleviate the annotation burden. Self-training has been successfully applied in semi-supervised learning tasks, but one drawback of self-training is that it is vulnerable to the label noise from incorrect pseudo labels. Inspired by the fact that samples with similar labels tend to share similar representations, we develop a neighborhood-based sample selection approach to tackle the issue of noisy pseudo labels. We further stabilize self-training via aggregating the predictions from different rounds during sample selection. Experiments on eight tasks show that our proposed method outperforms the strongest self-training baseline with 1.83% and 2.51% performance gain for text and graph datasets on average. Our further analysis demonstrates that our proposed data selection strategy reduces the noise of pseudo labels by 36.8% and saves 57.3% of the time when compared with the best baseline. Our code and appendices will be uploaded to https://github.com/ritaranx/NeST.

11.
Int ACM SIGIR Conf Res Dev Inf Retr ; 2023: 2501-2505, 2023 Jul.
Article in English | MEDLINE | ID: mdl-38352126

ABSTRACT

Scientific document classification is a critical task for a wide range of applications, but the cost of collecting human-labeled data can be prohibitive. We study scientific document classification using label names only. In scientific domains, label names often include domain-specific concepts that may not appear in the document corpus, making it difficult to match labels and documents precisely. To tackle this issue, we propose WanDeR, which leverages dense retrieval to perform matching in the embedding space to capture the semantics of label names. We further design the label name expansion module to enrich its representations. Lastly, a self-training step is used to refine the predictions. The experiments on three datasets show that WanDeR outperforms the best baseline by 11.9%. Our code will be published at https://github.com/ritaranx/wander.

12.
Proc ACM Int Conf Inf Knowl Manag ; 2022: 4470-4474, 2022 Oct.
Article in English | MEDLINE | ID: mdl-36382341

ABSTRACT

With the ever-increasing abundance of biomedical articles, improving the accuracy of keyword search results becomes crucial for ensuring reproducible research. However, keyword extraction for biomedical articles is hard due to the existence of obscure keywords and the lack of a comprehensive benchmark. PubMedAKE is an author-assigned keyword extraction dataset that contains the title, abstract, and keywords of over 843,269 articles from the PubMed open access subset database. This dataset, publicly available on Zenodo, is the largest keyword extraction benchmark with sufficient samples to train neural networks. Experimental results using state-of-the-art baseline methods illustrate the need for developing automatic keyword extraction methods for biomedical literature.

14.
PLoS One ; 17(9): e0272522, 2022.
Article in English | MEDLINE | ID: mdl-36048778

ABSTRACT

INTRODUCTION: The NIH All of Us Research Program will have the scale and scope to enable research for a wide range of diseases, including cancer. The program's focus on diversity and inclusion promises a better understanding of the unequal burden of cancer. Preliminary cancer ascertainment in the All of Us cohort from two data sources (self-reported versus electronic health records (EHR)) is considered. MATERIALS AND METHODS: This work was performed on data collected from the All of Us Research Program's 315,297 enrolled participants to date using the Researcher Workbench, where approved researchers can access and analyze All of Us data on cancer and other diseases. Cancer case ascertainment was performed using data from EHR and self-reported surveys across key factors. Distribution of cancer types and concordance of data sources by cancer site and demographics are analyzed. RESULTS AND DISCUSSION: Data collected from 315,297 participants resulted in 13,298 cancer cases detected in the survey (in 89,261 participants), 23,520 cancer cases detected in the EHR (in 203,813 participants), and 7,123 cancer cases detected across both sources (in 62,497 participants). Key differences in survey completion by race/ethnicity impacted the makeup of cohorts when compared to cancer in the EHR and national NCI SEER data. CONCLUSIONS: This study provides key insight into cancer detection in the All of Us Research Program and points to the existing strengths and limitations of All of Us as a platform for cancer research now and in the future.


Subject(s)
Neoplasms , Population Health , Cohort Studies , Electronic Health Records , Humans , Neoplasms/epidemiology , Surveys and Questionnaires
15.
Proc Mach Learn Res ; 193: 259-278, 2022 Nov.
Article in English | MEDLINE | ID: mdl-37255863

ABSTRACT

Electronic Health Record modeling is crucial for digital medicine. However, existing models ignore higher-order interactions among medical codes and their causal relations towards downstream clinical predictions. To address such limitations, we propose a novel framework CACHE, to provide effective and insightful clinical predictions based on hypergraph representation learning and counterfactual and factual reasoning techniques. Experiments on two real EHR datasets show the superior performance of CACHE. Case studies with a domain expert illustrate a preferred capability of CACHE in generating clinically meaningful interpretations towards the correct predictions.

17.
Infect Control Hosp Epidemiol ; 43(9): 1207-1215, 2022 09.
Article in English | MEDLINE | ID: mdl-34369331

ABSTRACT

OBJECTIVE: To determine the changes in severe acute respiratory coronavirus virus 2 (SARS-CoV-2) serologic status and SARS-CoV-2 infection rates in healthcare workers (HCWs) over 6 months of follow-up. DESIGN: Prospective cohort study. SETTING AND PARTICIPANTS: HCWs in the Chicago area. METHODS: Cohort participants were recruited in May and June 2020 for baseline serology testing (Abbott anti-nucleocapsid IgG) and were then invited for follow-up serology testing 6 months later. Participants completed monthly online surveys that assessed demographics, medical history, coronavirus disease 2019 (COVID-19), and exposures to SARS-CoV-2. The electronic medical record was used to identify SARS-CoV-2 polymerase chain reaction (PCR) positivity during follow-up. Serologic conversion and SARS-CoV-2 infection or possible reinfection rates (cases per 10,000 person days) by antibody status at baseline and follow-up were assessed. RESULTS: In total, 6,510 HCWs were followed for a total of 1,285,395 person days (median follow-up, 216 days). For participants who had baseline and follow-up serology checked, 285 (6.1%) of the 4,681 seronegative participants at baseline seroconverted to positive at follow-up; 138 (48%) of the 263 who were seropositive at baseline were seronegative at follow-up. When analyzed by baseline serostatus alone, 519 (8.4%) of 6,194 baseline seronegative participants had a positive PCR after baseline serology testing (4.25 per 10,000 person days). Of 316 participants who were seropositive at baseline, 8 (2.5%) met criteria for possible SARS-CoV-2 reinfection (ie, PCR positive >90 days after baseline serology) during follow-up, a rate of 1.27 per 10,000 days at risk. The adjusted rate ratio for possible reinfection in baseline seropositive compared to infection in baseline seronegative participants was 0.26 (95% confidence interval, 0.13-0.53). CONCLUSIONS: Seropositivity in HCWs is associated with moderate protection from future SARS-CoV-2 infection.


Subject(s)
COVID-19 , Pneumonia , COVID-19/diagnosis , COVID-19/epidemiology , COVID-19 Testing , Chicago/epidemiology , Cohort Studies , Follow-Up Studies , Health Personnel , Humans , Immunoglobulin G , Prospective Studies , Reinfection , SARS-CoV-2
18.
Infect Control Hosp Epidemiol ; 43(12): 1806-1812, 2022 12.
Article in English | MEDLINE | ID: mdl-34955103

ABSTRACT

OBJECTIVES: Healthcare workers (HCWs) are a high-priority group for coronavirus disease 2019 (COVID-19) vaccination and serve as sources for public information. In this analysis, we assessed vaccine intentions, factors associated with intentions, and change in uptake over time in HCWs. METHODS: A prospective cohort study of COVID-19 seroprevalence was conducted with HCWs in a large healthcare system in the Chicago area. Participants completed surveys from November 25, 2020, to January 9, 2021, and from April 24 to July 12, 2021, on COVID-19 exposures, diagnosis and symptoms, demographics, and vaccination status. RESULTS: Of 4,180 HCWs who responded to a survey, 77.1% indicated that they intended to get the vaccine. In this group, 23.2% had already received at least 1 dose of the vaccine, 17.4% were unsure, and 5.5% reported that they would not get the vaccine. Factors associated with intention or vaccination were being exposed to clinical procedures (vs no procedures: adjusted odds ratio [AOR], 1.39; 95% confidence interval [CI], 1.16-1.65) and having a negative serology test for COVID-19 (vs no test: AOR, 1.46; 95% CI, 1.24-1.73). Nurses (vs physicians: AOR, 0.24; 95% CI, 0.17-0.33), non-Hispanic Black (vs Asians: AOR, 0.35; 95% CI, 0.21-0.59), and women (vs men: AOR, 0.38; 95% CI, 0.30-0.50) had lower odds of intention to get vaccinated. By the 6-month follow-up, >90% of those who had previously been unsure were vaccinated, whereas 59.7% of those who previously reported no intention of getting vaccinated were vaccinated. CONCLUSIONS: COVID-19 vaccination in HCWs was high, but variability in vaccination intention exists. Targeted messaging coupled with vaccine mandates can support uptake.


Subject(s)
COVID-19 Vaccines , COVID-19 , Male , Female , Humans , Longitudinal Studies , Seroepidemiologic Studies , COVID-19 Testing , Prospective Studies , COVID-19/epidemiology , COVID-19/prevention & control , Health Personnel , Vaccination , Delivery of Health Care
19.
Diabetol Metab Syndr ; 13(1): 146, 2021 Dec 18.
Article in English | MEDLINE | ID: mdl-34922618

ABSTRACT

BACKGROUND: Diabetes and hypertension disparities are pronounced among South Asians. There is regional variation in the prevalence of diabetes and hypertension in the US, but it is unknown whether there is variation among South Asians living in the US. The objective of this study was to compare the burden of diabetes and hypertension between South Asian patients receiving care in the health systems of two US cities. METHODS: Cross-sectional analyses were performed using electronic health records (EHR) for 90,137 South Asians receiving care at New York University Langone in New York City (NYC) and 28,868 South Asians receiving care at Emory University (Atlanta). Diabetes was defined as having 2+ encounters with a diagnosis of diabetes, having a diabetes medication prescribed (excluding Acarbose/Metformin), or having 2+ abnormal A1C levels (≥6.5%) and 1+ encounter with a diagnosis of diabetes. Hypertension was defined as having 3+ BP readings of systolic BP ≥130 mmHg or diastolic BP ≥80 mmHg, 2+ encounters with a diagnosis of hypertension, or having an anti-hypertensive medication prescribed. RESULTS: Among South Asian patients at these two large, private health systems, age-adjusted diabetes burden was 10.7% in NYC compared to 6.7% in Atlanta. Age-adjusted hypertension burden was 20.9% in NYC compared to 24.7% in Atlanta. In Atlanta, 75.6% of those with diabetes had comorbid hypertension compared to 46.2% in NYC. CONCLUSIONS: These findings suggest differences by region and sex in diabetes and hypertension risk. Additionally, these results call for better characterization of race/ethnicity in EHRs to identify ethnic subgroup variation, as well as intervention studies to reduce lifestyle exposures that underlie the elevated risk for type 2 diabetes and hypertension development in South Asians.
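
The EHR case definitions in the methods above translate naturally into rule-based checks. The sketch below is one possible reading, not the study's code: it assumes the diabetes rule groups as (2+ diagnoses) OR (medication) OR (2+ abnormal A1C AND 1+ diagnosis), and the `PatientSummary` fields are hypothetical pre-aggregated values.

```python
from dataclasses import dataclass


@dataclass
class PatientSummary:
    """Hypothetical per-patient aggregates extracted from an EHR."""
    diabetes_dx_encounters: int = 0
    diabetes_med_prescribed: bool = False  # already excludes acarbose/metformin
    a1c_values: tuple = ()                 # percent
    bp_readings: tuple = ()                # (systolic, diastolic) in mmHg
    hypertension_dx_encounters: int = 0
    antihypertensive_prescribed: bool = False


def has_diabetes(p):
    abnormal_a1c = sum(1 for value in p.a1c_values if value >= 6.5)
    return (
        p.diabetes_dx_encounters >= 2
        or p.diabetes_med_prescribed
        or (abnormal_a1c >= 2 and p.diabetes_dx_encounters >= 1)
    )


def has_hypertension(p):
    elevated = sum(1 for sys_bp, dia_bp in p.bp_readings
                   if sys_bp >= 130 or dia_bp >= 80)
    return (
        elevated >= 3
        or p.hypertension_dx_encounters >= 2
        or p.antihypertensive_prescribed
    )


# Hypothetical patient: one diabetes diagnosis plus two abnormal A1C values,
# and only two elevated blood pressure readings.
example = PatientSummary(
    diabetes_dx_encounters=1,
    a1c_values=(6.8, 7.1, 6.0),
    bp_readings=((135, 78), (128, 82), (120, 70)),
)
```

Under this reading the example patient meets the diabetes definition through the A1C pathway but not the hypertension definition, since only two readings are elevated.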

20.
Adv Databases Inf Syst ; 1450: 50-60, 2021 Aug.
Article in English | MEDLINE | ID: mdl-34604867

ABSTRACT

Sequential pattern mining can be used to extract meaningful sequences from electronic health records. However, conventional sequential pattern mining algorithms that discover all frequent sequential patterns can incur a high computational cost and be susceptible to noise in the observations. Approximate sequential pattern mining techniques have been introduced to address these shortcomings, yet existing approximate methods fail to reflect the true frequent sequential patterns or only target single-item event sequences. Multi-item event sequences are prominent in healthcare as a patient can have multiple interventions for a single visit. To alleviate these issues, we propose GASP, a graph-based approximate sequential pattern mining method that discovers frequent patterns for multi-item event sequences. Our approach compresses the sequential information into a concise graph structure which has computational benefits. The empirical results on two healthcare datasets suggest that GASP outperforms existing approximate models by improving recoverability and extracts better predictive patterns.
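
To make the notion of a multi-item event sequence concrete, the sketch below counts exact support for a pattern whose elements are itemsets, where each patient is a time-ordered list of visits and each visit is a set of codes. This is the conventional exact computation that approximate methods aim to avoid, not GASP itself; the codes, patients, and function names are hypothetical.

```python
def contains(sequence, pattern):
    """True if the pattern's itemsets occur in order, each inside one visit."""
    position = 0
    for itemset in pattern:
        # Greedily advance to the earliest visit that covers this itemset.
        while position < len(sequence) and not itemset <= sequence[position]:
            position += 1
        if position == len(sequence):
            return False
        position += 1  # the next itemset must match a strictly later visit
    return True


def support(sequences, pattern):
    """Fraction of sequences that contain the pattern."""
    return sum(contains(seq, pattern) for seq in sequences) / len(sequences)


# Hypothetical patients; each visit may carry several intervention codes.
patients = [
    [{"A", "B"}, {"C"}, {"D"}],
    [{"A"}, {"C", "D"}],
    [{"B"}, {"A"}, {"C"}],
]
support_a_then_c = support(patients, [{"A"}, {"C"}])
support_ab_then_d = support(patients, [{"A", "B"}, {"D"}])
```

Enumerating and checking every candidate pattern this way grows quickly with the number of codes, which is the computational cost the abstract refers to.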
