Results 1 - 20 of 97
1.
PLoS Genet ; 18(6): e1010193, 2022 06.
Article in English | MEDLINE | ID: mdl-35653334

ABSTRACT

BACKGROUND: Height has been associated with many clinical traits but whether such associations are causal versus secondary to confounding remains unclear in many cases. To systematically examine this question, we performed a Mendelian Randomization-Phenome-wide association study (MR-PheWAS) using clinical and genetic data from a national healthcare system biobank. METHODS AND FINDINGS: Analyses were performed using data from the US Veterans Affairs (VA) Million Veteran Program in non-Hispanic White (EA, n = 222,300) and non-Hispanic Black (AA, n = 58,151) adults in the US. We estimated height genetic risk based on 3290 height-associated variants from a recent European-ancestry genome-wide meta-analysis. We compared associations of measured and genetically-predicted height with phenome-wide traits derived from the VA electronic health record, adjusting for age, sex, and genetic principal components. We found 345 clinical traits associated with measured height in EA and an additional 17 in AA. Of these, 127 were associated with genetically-predicted height at phenome-wide significance in EA and 2 in AA. These associations were largely independent from body mass index. We confirmed several previously described MR associations between height and cardiovascular disease traits such as hypertension, hyperlipidemia, coronary heart disease (CHD), and atrial fibrillation, and further uncovered MR associations with venous circulatory disorders and peripheral neuropathy in the presence and absence of diabetes. As a number of traits associated with genetically-predicted height frequently co-occur with CHD, we evaluated effect modification by CHD status of genetically-predicted height associations with risk factors for and complications of CHD. We found modification of effects of MR associations by CHD status for atrial fibrillation/flutter but not for hypertension, hyperlipidemia, or venous circulatory disorders. 
CONCLUSIONS: We conclude that height may be an unrecognized but biologically plausible risk factor for several common conditions in adults. However, more studies are needed to reliably exclude horizontal pleiotropy as a driving force behind at least some of the MR associations observed in this study.
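As a rough illustration of how a genetically-predicted height score can be built from height-associated variants, the sketch below computes a weighted score as the sum of effect-allele dosages multiplied by per-variant effect sizes. The weighting scheme, the function name `weighted_genetic_score`, and the toy numbers are assumptions made for illustration; the abstract does not state how the 3290 variants were combined.

```python
from typing import Sequence


def weighted_genetic_score(dosages: Sequence[float], betas: Sequence[float]) -> float:
    """Sum effect-allele dosages (0, 1, or 2) weighted by per-variant effect sizes."""
    if len(dosages) != len(betas):
        raise ValueError("dosages and betas must have the same length")
    return sum(d * b for d, b in zip(dosages, betas))


# Hypothetical toy data for three variants; these values are not from the study.
toy_betas = [0.5, 0.25, -0.125]
toy_dosages = [2, 1, 0]
score = weighted_genetic_score(toy_dosages, toy_betas)
print(score)
```

In a real analysis the effect sizes would come from an external genome-wide meta-analysis, and the resulting score would then be tested against each phenotype with covariate adjustment.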


Subject(s)
Atrial Fibrillation , Hypertension , Veterans , Adult , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Hypertension/epidemiology , Hypertension/genetics , Polymorphism, Single Nucleotide/genetics
2.
PLoS Genet ; 18(4): e1010113, 2022 04.
Article in English | MEDLINE | ID: mdl-35482673

ABSTRACT

The study aims to determine the shared genetic architecture between COVID-19 severity and existing medical conditions using electronic health record (EHR) data. We conducted a Phenome-Wide Association Study (PheWAS) of genetic variants associated with critical illness (n = 35) or hospitalization (n = 42) due to severe COVID-19 using genome-wide association summary data from the Host Genetics Initiative. PheWAS analysis was performed using genotype-phenotype data from the Veterans Affairs Million Veteran Program (MVP). Phenotypes were defined by International Classification of Diseases (ICD) codes mapped to clinically relevant groups using published PheWAS methods. Among 658,582 Veterans, variants associated with severe COVID-19 were tested for association across 1,559 phenotypes. Variants at the ABO locus (rs495828, rs505922) were associated with the largest number of phenotypes (n = 53 for rs495828 and n = 59 for rs505922); the strongest associations were with venous embolism (rs495828: OR 1.33, p = 1.32 × 10^-199) and thrombosis (rs505922: OR 1.33, p = 2.2 × 10^-265). Among 67 respiratory conditions tested, 11 had significant associations, including the MUC5B locus (rs35705950) with increased risk of idiopathic fibrosing alveolitis (OR 2.83, p = 4.12 × 10^-191) and CRHR1 (rs61667602) with reduced risk of pulmonary fibrosis (OR 0.84, p = 2.26 × 10^-12). The TYK2 locus (rs11085727) was associated with reduced risk for autoimmune conditions, e.g., psoriasis (OR 0.88, p = 6.48 × 10^-23) and lupus (OR 0.84, p = 3.97 × 10^-6). PheWAS stratified by ancestry demonstrated differences in genotype-phenotype associations. LMNA (rs581342) was associated with neutropenia (OR 1.29, p = 4.1 × 10^-13) among Veterans of African and Hispanic ancestry but not European ancestry. Overall, we observed a shared genetic architecture between COVID-19 severity and conditions related to underlying risk factors for severe and poor COVID-19 outcomes.
Differing genotype-phenotype associations across ancestries may inform the heterogeneous outcomes observed with COVID-19. Divergent associations between risk for severe COVID-19 and autoimmune inflammatory conditions, both respiratory and non-respiratory, highlight the shared pathways and fine balance between immune host response and autoimmunity, and the caution required when considering treatment targets.


Subject(s)
COVID-19 , Veterans , COVID-19/epidemiology , COVID-19/genetics , Genetic Association Studies , Genome-Wide Association Study/methods , Humans , Polymorphism, Single Nucleotide/genetics
3.
Bioinformatics ; 39(2)2023 02 03.
Article in English | MEDLINE | ID: mdl-36805623

ABSTRACT

MOTIVATION: Predicting molecule-disease indications and side effects is important for drug development and pharmacovigilance. Comprehensively mining molecule-molecule, molecule-disease and disease-disease semantic dependencies can potentially improve prediction performance. METHODS: We introduce a Multi-Modal REpresentation Mapping Approach to Predicting molecular-disease relations (M2REMAP) by incorporating clinical semantics learned from electronic health records (EHR) of 12.6 million patients. Specifically, M2REMAP first learns a multimodal molecule representation that synthesizes chemical property and clinical semantic information by mapping molecule chemicals via a deep neural network onto the clinical semantic embedding space shared by drugs, diseases and other common clinical concepts. To infer molecule-disease relations, M2REMAP combines multimodal molecule representation and disease semantic embedding to jointly infer indications and side effects. RESULTS: We extensively evaluate M2REMAP on molecule indications, side effects and interactions. Results show that incorporating EHR embeddings improves performance significantly, for example, attaining an improvement over the baseline models by 23.6% in PRC-AUC on indications and 23.9% on side effects. Further, M2REMAP overcomes the limitation of existing methods and effectively predicts drugs for novel diseases and emerging pathogens. AVAILABILITY AND IMPLEMENTATION: The code is available at https://github.com/celehs/M2REMAP, and prediction results are provided at https://shiny.parse-health.org/drugs-diseases-dev/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Drug-Related Side Effects and Adverse Reactions , Humans , Drug Development , Electronic Health Records , Neural Networks, Computer , Pharmacovigilance
4.
J Nutr ; 154(3): 886-895, 2024 03.
Article in English | MEDLINE | ID: mdl-38163586

ABSTRACT

BACKGROUND: Red meat consumption was associated with an increased risk of cardiovascular disease (CVD) in prospective cohort studies and a profile of biomarkers favoring high CVD risk in short-term controlled trials. However, several recent systematic reviews and meta-analyses concluded that there was no or only weak evidence for limiting red meat intake. OBJECTIVES: To prospectively examine the associations between red meat intake and incident CVD in an ongoing cohort study with diverse socioeconomic and racial or ethnic backgrounds. METHODS: Our study included 148,506 participants [17,804 female (12.0%)] who were free of cancer, diabetes, and CVD at baseline from the Million Veteran Program. A food frequency questionnaire measured red meat intakes at baseline. Nonfatal myocardial infarction and acute ischemic stroke were identified through a high-throughput phenotyping algorithm, and fatal CVD events were identified by searching the National Death Index. RESULTS: Comparing the extreme categories of intake, the multivariate-adjusted relative risk of CVD was 1.18 (95% CI: 1.01, 1.38; P-trend < 0.0001) for total red meat, 1.14 (95% CI: 0.96, 1.36; P-trend = 0.01) for unprocessed red meat, and 1.29 (95% CI: 1.04, 1.60; P-trend = 0.003) for processed red meat. We observed a more pronounced positive association between red meat intake and CVD in African American participants than in White participants (P-interaction = 0.01). Replacing 0.5 servings/d of red meat with 0.5 servings/d of nuts, whole grains, and skimmed milk was associated with 14% (RR: 0.86; 95% CI: 0.83, 0.90), 7% (RR: 0.93; 95% CI: 0.89, 0.96), and 4% (RR: 0.96; 95% CI: 0.94, 0.99) lower risks of CVD, respectively. CONCLUSIONS: Red meat consumption is associated with an increased risk of CVD. Our findings support lowering red meat intake and replacing red meat with plant-based protein sources or low-fat dairy foods as a key dietary recommendation for the prevention of CVD.
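The percentages quoted for the substitution analysis follow directly from the relative risks: a relative risk RR below 1 corresponds to a (1 - RR) × 100% lower risk. The short sketch below reproduces that arithmetic for the three reported values; the function name `percent_lower_risk` is a hypothetical helper, not something from the study.

```python
def percent_lower_risk(relative_risk: float) -> int:
    """Convert a relative risk below 1 into a rounded percent reduction in risk."""
    return round((1 - relative_risk) * 100)


# Relative risks reported in the abstract for replacing 0.5 servings/d of red meat.
reported = {"nuts": 0.86, "whole grains": 0.93, "skimmed milk": 0.96}
reductions = {food: percent_lower_risk(rr) for food, rr in reported.items()}
print(reductions)
```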


Subject(s)
Cardiovascular Diseases , Ischemic Stroke , Red Meat , Veterans , Humans , Cardiovascular Diseases/epidemiology , Cardiovascular Diseases/etiology , Prospective Studies , Cohort Studies , Ischemic Stroke/complications , Risk Factors , Diet , Meat/adverse effects , Red Meat/adverse effects
5.
Mol Psychiatry ; 28(3): 1293-1302, 2023 03.
Article in English | MEDLINE | ID: mdl-36543923

ABSTRACT

While genome-wide association studies (GWASs) of Alzheimer's Disease (AD) in European (EUR) ancestry cohorts have identified approximately 83 potentially independent AD risk loci, progress in non-European populations has lagged. In this study, data from the Million Veteran Program (MVP), a biobank which includes genetic data from more than 650,000 US Veteran participants, was used to examine dementia genetics in an African descent (AFR) cohort. A GWAS of Alzheimer's disease and related dementias (ADRD), an expanded AD phenotype including dementias such as vascular and non-specific dementia, was performed in AFR MVP participants and included 4012 cases and 18,435 controls aged 60+. A proxy dementia GWAS based on survey-reported parental AD or dementia (n = 4385 maternal cases, 2256 paternal cases, and 45,970 controls) was also performed. These two GWASs were meta-analyzed, and then subsequently compared and meta-analyzed with the results from a previous AFR AD GWAS from the Alzheimer's Disease Genetics Consortium (ADGC). A meta-analysis of common variants across the MVP ADRD and proxy GWASs yielded GWAS significant associations in the region of APOE (p = 2.48 × 10^-101), in ROBO1 (rs11919682, p = 1.63 × 10^-8), and RNA RP11-340A13.2 (rs148433063, p = 8.56 × 10^-9). The MVP/ADGC meta-analysis yielded additional significant SNPs near known AD risk genes TREM2 (rs73427293, p = 2.95 × 10^-9), CD2AP (rs7738720, p = 1.14 × 10^-9), and ABCA7 (rs73505251, p = 3.26 × 10^-10), although the peak variants observed in these genes differed from those previously reported in EUR and AFR cohorts. Of the genes in or near suggestive or genome-wide significant associated variants, nine (CDA, SH2D5, DCBLD1, EML6, GOPC, ABCA7, ROS1, TMCO4, and TREM2) were differentially expressed in the brains of AD cases and controls. This represents the largest AFR GWAS of AD and dementia, finding non-APOE GWAS-significant common SNPs associated with dementia.
Increasing representation of AFR participants is an important priority in genetic studies and may lead to increased insight into AD pathophysiology and reduce health disparities.


Subject(s)
Alzheimer Disease , Black or African American , Military Personnel , Aged , Humans , Middle Aged , Alzheimer Disease/epidemiology , Alzheimer Disease/ethnology , Alzheimer Disease/genetics , Black or African American/genetics , Black or African American/statistics & numerical data , Databases, Genetic/statistics & numerical data , Dementia/epidemiology , Dementia/ethnology , Dementia/genetics , Gene Expression Profiling , Genome-Wide Association Study , Genotype , Military Personnel/statistics & numerical data , Polymorphism, Genetic , United States/epidemiology , Genetic Predisposition to Disease/epidemiology , Genetic Predisposition to Disease/ethnology , Genetic Predisposition to Disease/genetics
6.
Occup Environ Med ; 81(10): 522-528, 2024 Oct 23.
Article in English | MEDLINE | ID: mdl-39327043

ABSTRACT

OBJECTIVE: We aimed to characterise self-reported military and occupational exposures including Agent Orange, chemical/biological warfare agents, solvents, fuels, pesticides, metals and burn pits among Veterans in the Department of Veterans Affairs Million Veteran Program (MVP). METHODS: MVP is an ongoing longitudinal cohort and mega-biobank of over one million US Veterans. Over 500 000 MVP participants reported military exposures on the baseline survey, and over 300 000 reported occupational exposures on the lifestyle survey. We determined frequencies of selected self-reported occupational exposures by service era, specific deployment operation (1990-1991 Gulf War, Operation Enduring Freedom/Operation Iraqi Freedom (OEF/OIF)), service in a combat zone and occupational categories. We also explored differences in self-reported exposures by sex and race. RESULTS: Agent Orange exposure was mainly reported by Vietnam-era Veterans. Gulf War and OEF/OIF Veterans deployed to a combat zone were more likely to report exposures to burn pits, chemical/biological weapons, anthrax vaccination and pyridostigmine bromide pill intake as compared with non-combat deployers and those not deployed. Occupational categories related to combat (infantry, combat engineer and helicopter pilot) often had the highest percentages of self-reported exposures, whereas those in healthcare-related occupations (dentists, physicians and occupational therapists) tended to report exposures much less often. Self-reported exposures also varied by race and sex. CONCLUSIONS: Our results demonstrate that the distribution of self-reported exposures varied by service era, demographics, deployment, combat experience and military occupation in MVP. Overall, the pattern of findings was consistent with previous population-based studies of US military Veterans.


Subject(s)
Occupational Exposure , Self Report , Veterans , Humans , Occupational Exposure/adverse effects , Occupational Exposure/statistics & numerical data , Male , Veterans/statistics & numerical data , Female , United States/epidemiology , Adult , Middle Aged , Pesticides , Agent Orange , Longitudinal Studies , Iraq War, 2003-2011 , Afghan Campaign 2001- , Chemical Warfare Agents , Gulf War , Military Personnel/statistics & numerical data , United States Department of Veterans Affairs/statistics & numerical data , Polychlorinated Dibenzodioxins
7.
Am J Respir Crit Care Med ; 206(10): 1220-1229, 2022 11 15.
Article in English | MEDLINE | ID: mdl-35771531

ABSTRACT

Rationale: A common MUC5B gene polymorphism, rs35705950-T, is associated with idiopathic pulmonary fibrosis (IPF), but its role in severe acute respiratory syndrome coronavirus 2 infection and disease severity is unclear. Objectives: To assess whether rs35705950-T confers differential risk for clinical outcomes associated with coronavirus disease (COVID-19) infection among participants in the Million Veteran Program (MVP). Methods: The MUC5B rs35705950-T allele was directly genotyped among MVP participants; clinical events and comorbidities were extracted from the electronic health records. Associations between the incidence or severity of COVID-19 and rs35705950-T were analyzed within each ancestry group in the MVP followed by transancestry meta-analysis. Replication and joint meta-analysis were conducted using summary statistics from the COVID-19 Host Genetics Initiative (HGI). Sensitivity analyses with adjustment for additional covariates (body mass index, Charlson comorbidity index, smoking, asbestosis, rheumatoid arthritis with interstitial lung disease, and IPF) and associations with post-COVID-19 pneumonia were performed in MVP subjects. Measurements and Main Results: The rs35705950-T allele was associated with fewer COVID-19 hospitalizations in transancestry meta-analyses within the MVP (N_cases = 4,325; N_controls = 507,640; OR, 0.89 [0.82-0.97]; P = 6.86 × 10^-3) and joint meta-analyses with the HGI (N_cases = 13,320; N_controls = 1,508,841; OR, 0.90 [0.86-0.95]; P = 8.99 × 10^-5). The rs35705950-T allele was not associated with reduced COVID-19 positivity in transancestry meta-analysis within the MVP (N_cases = 19,168; N_controls = 492,854; OR, 0.98 [0.95-1.01]; P = 0.06) but was nominally significant (P < 0.05) in the joint meta-analysis with the HGI (N_cases = 44,820; N_controls = 1,775,827; OR, 0.97 [0.95-1.00]; P = 0.03). Associations were not observed with severe outcomes or mortality.
Among individuals of European ancestry in the MVP, rs35705950-T was associated with fewer post-COVID-19 pneumonia events (OR, 0.82 [0.72-0.93]; P = 0.001). Conclusions: The MUC5B variant rs35705950-T may confer protection in COVID-19 hospitalizations.


Subject(s)
COVID-19 , Idiopathic Pulmonary Fibrosis , Humans , COVID-19/epidemiology , COVID-19/genetics , Mucin-5B/genetics , Polymorphism, Genetic , Idiopathic Pulmonary Fibrosis/genetics , Genotype , Hospitalization , Genetic Predisposition to Disease/genetics
8.
PLoS Genet ; 16(3): e1008684, 2020 03.
Article in English | MEDLINE | ID: mdl-32226016

ABSTRACT

Lipid levels are important markers for the development of cardio-metabolic diseases. Although hundreds of associated loci have been identified through genetic association studies, the contribution of genetic factors to variation in lipids is not fully understood, particularly in U.S. minority groups. We performed genome-wide association analyses for four lipid traits in over 45,000 ancestrally diverse participants from the Population Architecture using Genomics and Epidemiology (PAGE) Study, followed by a meta-analysis with several European ancestry studies. We identified nine novel lipid loci, five of which showed evidence of replication in independent studies. Furthermore, we discovered one novel gene in a PrediXcan analysis, minority-specific independent signals at eight previously reported loci, and potential functional variants at two known loci through fine-mapping. Systematic examination of known lipid loci revealed smaller effect estimates in African American and Hispanic ancestry populations than those in Europeans, and better performance of polygenic risk scores based on minority-specific effect estimates. Our findings provide new insight into the genetic architecture of lipid traits and highlight the importance of conducting genetic studies in diverse populations in the era of precision medicine.


Subject(s)
Lipids/blood , Lipids/genetics , Racial Groups/genetics , Databases, Genetic , Female , Genome-Wide Association Study/methods , Genotype , Humans , Lipids/analysis , Male , Metagenomics/methods , Minority Groups , Multifactorial Inheritance/genetics , Phenotype , Polymorphism, Single Nucleotide/genetics , United States/epidemiology
9.
J Infect Dis ; 226(12): 2113-2117, 2022 12 13.
Article in English | MEDLINE | ID: mdl-35512327

ABSTRACT

In this retrospective cohort study of 94 595 severe acute respiratory syndrome coronavirus 2-positive cases, we developed and validated an algorithm to assess the association between coronavirus disease 2019 (COVID-19) severity and long-term complications (stroke, myocardial infarction, pulmonary embolism/deep vein thrombosis, heart failure, and mortality). COVID-19 severity was associated with a greater risk of experiencing a long-term complication 31-120 days postinfection. Most incident events occurred 31-60 days postinfection and diminished after day 91, except heart failure for severe patients and death for moderate patients, which peaked on days 91-120. Understanding the differential impact of COVID-19 severity on long-term events provides insight into possible intervention modalities and critical prevention strategies.


Subject(s)
COVID-19 , Heart Failure , Veterans , Humans , United States/epidemiology , Retrospective Studies
10.
J Biomed Inform ; 132: 104109, 2022 08.
Article in English | MEDLINE | ID: mdl-35660521

ABSTRACT

OBJECTIVE: Accurately assigning phenotype information to individual patients via computational phenotyping using Electronic Health Records (EHRs) has been seen as the first step towards enabling EHRs for precision medicine research. Chart review labels annotated by clinical experts, also known as "gold standard" labels, are essential for the development and validation of computational phenotyping algorithms. However, given the complexity of EHR systems, the process of chart review is both labor intensive and time consuming. We propose a fully automated algorithm, referred to as pGUESS, to rank EHR notes according to their relevance to a given phenotype. By identifying the most relevant notes, pGUESS can greatly improve the efficiency and accuracy of chart reviews. METHOD: pGUESS uses prior guided semantic similarity to measure the informativeness of a clinical note to a given phenotype. We first select candidate clinical concepts from a pool of comprehensive medical concepts using public knowledge sources and then derive the semantic embedding vector (SEV) for a reference article (SEVref) and each note (SEVnote). The algorithm scores the relevance of a note as the cosine similarity between SEVnote and SEVref. RESULTS: The algorithm was validated against four sets of 200 notes that were manually annotated by clinical experts to assess their informativeness to one of three disease phenotypes. pGUESS algorithm substantially outperforms existing unsupervised approaches for classifying the relevance status with respect to both accuracy and scalability across phenotypes. Averaging over the three phenotypes, the rank correlation between the algorithm ranking and gold standard label was 0.64 for pGUESS, but only 0.47 and 0.35 for the next two best performing algorithms. pGUESS is also much more computationally scalable compared to existing algorithms. 
CONCLUSION: The pGUESS algorithm can substantially reduce the burden of chart review and holds potential for improving the efficiency and accuracy of human annotation.
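The scoring step described in the METHOD section, relevance as the cosine similarity between SEVnote and SEVref, can be sketched in a few lines. This is a minimal illustration only: the function names, the two-dimensional toy vectors, and the note identifiers are assumptions, and the sketch does not cover how pGUESS selects concepts or derives the embedding vectors.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_notes(sev_ref: Sequence[float], sev_notes: Dict[str, Sequence[float]]) -> List[str]:
    """Return note ids ordered from most to least similar to the reference vector."""
    return sorted(
        sev_notes,
        key=lambda note_id: cosine_similarity(sev_notes[note_id], sev_ref),
        reverse=True,
    )


# Hypothetical toy embeddings; real semantic embedding vectors would be much longer.
sev_ref = [1.0, 0.0]
sev_notes = {"note_a": [0.0, 1.0], "note_b": [2.0, 0.0], "note_c": [1.0, 1.0]}
ranking = rank_notes(sev_ref, sev_notes)
print(ranking)
```

Because cosine similarity ignores vector length, `note_b` ranks first even though its vector is twice as long as the reference.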


Subject(s)
Algorithms , Semantics , Electronic Health Records , Humans , Natural Language Processing , Phenotype , Precision Medicine
11.
J Biomed Inform ; 133: 104147, 2022 09.
Article in English | MEDLINE | ID: mdl-35872266

ABSTRACT

OBJECTIVE: The growing availability of electronic health records (EHR) data opens opportunities for integrative analysis of multi-institutional EHR to produce generalizable knowledge. A key barrier to such integrative analyses is the lack of semantic interoperability across different institutions due to coding differences. We propose a Multiview Incomplete Knowledge Graph Integration (MIKGI) algorithm to integrate information from multiple sources with partially overlapping EHR concept codes to enable translations between healthcare systems. METHODS: The MIKGI algorithm combines knowledge graph information from (i) embeddings trained from the co-occurrence patterns of medical codes within each EHR system and (ii) semantic embeddings of the textual strings of all medical codes obtained from the Self-Aligning Pretrained BERT (SAPBERT) algorithm. Due to the heterogeneity in the coding across healthcare systems, each EHR source provides partial coverage of the available codes. MIKGI synthesizes the incomplete knowledge graphs derived from these multi-source embeddings by minimizing a spherical loss function that combines the pairwise directional similarities of embeddings computed from all available sources. MIKGI outputs harmonized semantic embedding vectors for all EHR codes, which improves the quality of the embeddings and enables direct assessment of both similarity and relatedness between any pair of codes from multiple healthcare systems. RESULTS: With EHR co-occurrence data from Veteran Affairs (VA) healthcare and Mass General Brigham (MGB), MIKGI algorithm produces high quality embeddings for a variety of downstream tasks including detecting known similar or related entity pairs and mapping VA local codes to the relevant EHR codes used at MGB. Based on the cosine similarity of the MIKGI trained embeddings, the AUC was 0.918 for detecting similar entity pairs and 0.809 for detecting related pairs. 
For cross-institutional medical code mapping, the top 1 and top 5 accuracy were 91.0% and 97.5% when mapping medication codes at VA to RxNorm medication codes at MGB; 59.1% and 75.8% when mapping VA local laboratory codes to LOINC hierarchy. When trained with 500 labels, the lab code mapping attained top 1 and 5 accuracy at 77.7% and 87.9%. MIKGI also attained best performance in selecting VA local lab codes for desired laboratory tests and COVID-19 related features for COVID EHR studies. Compared to existing methods, MIKGI attained the most robust performance with accuracy the highest or near the highest across all tasks. CONCLUSIONS: The proposed MIKGI algorithm can effectively integrate incomplete summary data from biomedical text and EHR data to generate harmonized embeddings for EHR codes for knowledge graph modeling and cross-institutional translation of EHR codes.
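The top 1 and top 5 accuracies reported for code mapping can be read as follows: for each source code, rank all candidate target codes by cosine similarity of their embeddings and count a hit if the correct target appears among the first k. The sketch below illustrates that evaluation under stated assumptions; the function names, code identifiers, and toy vectors are hypothetical, and this is not the MIKGI training procedure itself.

```python
import math
from typing import Dict, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k_accuracy(
    source_emb: Dict[str, Sequence[float]],
    target_emb: Dict[str, Sequence[float]],
    gold: Dict[str, str],
    k: int,
) -> float:
    """Fraction of source codes whose gold target is among the k most similar targets."""
    hits = 0
    for src, true_target in gold.items():
        ranked = sorted(
            target_emb,
            key=lambda tgt: cosine(source_emb[src], target_emb[tgt]),
            reverse=True,
        )
        if true_target in ranked[:k]:
            hits += 1
    return hits / len(gold)


# Hypothetical toy embeddings and gold mappings, for illustration only.
source_emb = {"local_1": [1.0, 0.0], "local_2": [0.0, 1.0]}
target_emb = {"std_a": [1.0, 0.1], "std_b": [0.6, 0.8], "std_c": [-1.0, 0.0]}
gold = {"local_1": "std_a", "local_2": "std_a"}

top1 = top_k_accuracy(source_emb, target_emb, gold, k=1)
top2 = top_k_accuracy(source_emb, target_emb, gold, k=2)
print(top1, top2)
```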


Subject(s)
COVID-19 , Electronic Health Records , Algorithms , Humans , Logical Observation Identifiers Names and Codes , Pattern Recognition, Automated
12.
J Biomed Inform ; 134: 104176, 2022 10.
Article in English | MEDLINE | ID: mdl-36007785

ABSTRACT

OBJECTIVE: For multi-center heterogeneous Real-World Data (RWD) with time-to-event outcomes and high-dimensional features, we propose the SurvMaximin algorithm to estimate Cox model feature coefficients for a target population by borrowing summary information from a set of health care centers without sharing patient-level information. MATERIALS AND METHODS: For each of the centers from which we want to borrow information to improve the prediction performance for the target population, a penalized Cox model is fitted to estimate feature coefficients for the center. Using estimated feature coefficients and the covariance matrix of the target population, we then obtain a SurvMaximin estimated set of feature coefficients for the target population. The target population can be an entire cohort comprised of all centers, corresponding to federated learning, or a single center, corresponding to transfer learning. RESULTS: Simulation studies and a real-world international electronic health records application study, with 15 participating health care centers across three countries (France, Germany, and the U.S.), show that the proposed SurvMaximin algorithm achieves comparable or higher accuracy compared with the estimator using only the information of the target site and other existing methods. The SurvMaximin estimator is robust to variations in sample sizes and estimated feature coefficients between centers, which amounts to significantly improved estimates for target sites with fewer observations. CONCLUSIONS: The SurvMaximin method is well suited for both federated and transfer learning in the high-dimensional survival analysis setting. SurvMaximin only requires a one-time summary information exchange from participating centers. Estimated regression vectors can be very heterogeneous. SurvMaximin provides robust Cox feature coefficient estimates without outcome information in the target population and is privacy-preserving.


Subject(s)
Algorithms , Electronic Health Records , Humans , Privacy , Proportional Hazards Models , Survival Analysis
13.
Public Health Nutr ; : 1-38, 2022 Mar 21.
Article in English | MEDLINE | ID: mdl-35307047

ABSTRACT

OBJECTIVE: To examine the associations between adherence to plant-based diets and mortality. DESIGN: Prospective study. We calculated a plant-based diet index (PDI) by assigning positive scores to plant foods and reverse scores to animal foods. We also created a healthful PDI (hPDI) and an unhealthful PDI (uPDI) by further separating the healthy plant foods from less-healthy plant foods. SETTING: The VA Million Veteran Program. PARTICIPANTS: 315,919 men and women aged 19 to 104 years who completed a food frequency questionnaire at baseline. RESULTS: We documented 31,136 deaths during the follow-up. A higher PDI was significantly associated with lower total mortality [hazard ratio (HR) comparing extreme deciles = 0.75, 95% confidence interval (CI): 0.71 to 0.79, P-trend < 0.001]. We observed an inverse association between hPDI and total mortality (HR comparing extreme deciles = 0.64, 95% CI: 0.61 to 0.68, P-trend < 0.001), whereas uPDI was positively associated with total mortality (HR comparing extreme deciles = 1.41, 95% CI: 1.33 to 1.49, P-trend < 0.001). Similar significant associations of PDI, hPDI, and uPDI were also observed for CVD and cancer mortality. The associations between the plant-based diet indices and total mortality were consistent among African and European American participants, and among participants free from CVD and cancer and those who were diagnosed with major chronic disease at baseline. CONCLUSIONS: A greater adherence to a plant-based diet was associated with substantially lower total mortality in this large population of veterans. These findings support recommending plant-rich dietary patterns for the prevention of major chronic diseases.
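To make the scoring idea concrete, the sketch below builds the three indices from per-food-group intake quintiles: groups scored positively contribute their quintile, and groups scored in reverse contribute 6 minus their quintile. The use of quintiles scored 1 to 5, the food groupings, and all names are assumptions for illustration; the abstract states only that plant foods receive positive scores and animal foods reverse scores.

```python
from typing import Dict, Set

# Hypothetical food-group categories, chosen only to illustrate the scoring.
HEALTHY_PLANT: Set[str] = {"whole_grains", "vegetables"}
LESS_HEALTHY_PLANT: Set[str] = {"refined_grains", "sweets"}
ANIMAL: Set[str] = {"red_meat", "dairy"}


def diet_index(quintiles: Dict[str, int], positive: Set[str]) -> int:
    """Add the quintile for positively scored groups and (6 - quintile) for all others."""
    total = 0
    for group, q in quintiles.items():
        if not 1 <= q <= 5:
            raise ValueError(f"quintile for {group} must be between 1 and 5")
        total += q if group in positive else 6 - q
    return total


# Intake quintiles for one hypothetical participant (5 = highest intake).
person = {
    "whole_grains": 5, "vegetables": 4,
    "refined_grains": 2, "sweets": 1,
    "red_meat": 1, "dairy": 3,
}

pdi = diet_index(person, HEALTHY_PLANT | LESS_HEALTHY_PLANT)  # all plant foods positive
hpdi = diet_index(person, HEALTHY_PLANT)                      # only healthy plant foods positive
updi = diet_index(person, LESS_HEALTHY_PLANT)                 # only less-healthy plant foods positive
print(pdi, hpdi, updi)
```

Under this scheme the same participant can score high on hPDI and low on uPDI, which is consistent with the two indices showing opposite associations with mortality.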

14.
J Infect Dis ; 224(6): 967-975, 2021 09 17.
Article in English | MEDLINE | ID: mdl-34153099

ABSTRACT

BACKGROUND: Early convalescent plasma transfusion may reduce mortality in patients with nonsevere coronavirus disease 2019 (COVID-19). METHODS: This study emulates a (hypothetical) target trial using observational data from a cohort of US veterans admitted to a Department of Veterans Affairs (VA) facility between 1 May and 17 November 2020 with nonsevere COVID-19. The intervention was convalescent plasma initiated within 2 days of eligibility. Thirty-day mortality was compared using cumulative incidence curves, risk differences, and hazard ratios estimated from pooled logistic models with inverse probability weighting to adjust for confounding. RESULTS: Of 11 269 eligible person-trials contributed by 4755 patients, 402 trials were assigned to the convalescent plasma group. Forty and 671 deaths occurred within the plasma and nonplasma groups, respectively. The estimated 30-day mortality risk was 6.5% (95% confidence interval [CI], 4.0%-9.7%) in the plasma group and 6.2% (95% CI, 5.6%-7.0%) in the nonplasma group. The associated risk difference was 0.30% (95% CI, -2.30% to 3.60%) and the hazard ratio was 1.04 (95% CI, .64-1.62). CONCLUSIONS: Our target trial emulation estimated no meaningful differences in 30-day mortality between nonsevere COVID-19 patients treated and untreated with convalescent plasma. Clinical Trials Registration. NCT04545047.


Subject(s)
Blood Component Transfusion , COVID-19/mortality , COVID-19/therapy , Immunization, Passive , Plasma , Adult , Aged , Aged, 80 and over , Female , Hospitalization , Humans , Male , Middle Aged , Treatment Outcome , United States/epidemiology , Veterans , Young Adult , COVID-19 Serotherapy
15.
Am J Epidemiol ; 190(11): 2405-2419, 2021 11 02.
Article in English | MEDLINE | ID: mdl-34165150

ABSTRACT

Hydroxychloroquine (HCQ) was proposed as an early therapy for coronavirus disease 2019 (COVID-19) after in vitro studies indicated possible benefit. Previous in vivo observational studies have presented conflicting results, though recent randomized clinical trials have reported no benefit from HCQ among patients hospitalized with COVID-19. We examined the effects of HCQ alone and in combination with azithromycin in a hospitalized population of US veterans with COVID-19, using a propensity score-adjusted survival analysis with imputation of missing data. According to electronic health record data from the US Department of Veterans Affairs health care system, 64,055 US veterans were tested for the virus that causes COVID-19 between March 1, 2020 and April 30, 2020. Of the 7,193 veterans who tested positive, 2,809 were hospitalized, and 657 individuals were prescribed HCQ within the first 48 hours of hospitalization for the treatment of COVID-19. There was no apparent benefit associated with HCQ receipt, alone or in combination with azithromycin, and there was an increased risk of intubation when HCQ was used in combination with azithromycin (hazard ratio = 1.55; 95% confidence interval: 1.07, 2.24). In conclusion, we assessed the effectiveness of HCQ with or without azithromycin in treatment of patients hospitalized with COVID-19, using a national sample of the US veteran population. Using rigorous study design and analytic methods to reduce confounding and bias, we found no evidence of a survival benefit from the administration of HCQ.


Subject(s)
Anti-Bacterial Agents/therapeutic use , Azithromycin/therapeutic use , COVID-19 Drug Treatment , Hospitalization/statistics & numerical data , Hydroxychloroquine/therapeutic use , Veterans/statistics & numerical data , Aged , Aged, 80 and over , Anti-Bacterial Agents/adverse effects , Azithromycin/adverse effects , COVID-19/mortality , Drug Therapy, Combination , Female , Humans , Hydroxychloroquine/adverse effects , Intention to Treat Analysis , Machine Learning , Male , Middle Aged , Pharmacoepidemiology , Retrospective Studies , SARS-CoV-2 , Treatment Outcome , United States/epidemiology
17.
J Med Internet Res ; 23(10): e31400, 2021 10 11.
Article in English | MEDLINE | ID: mdl-34533459

ABSTRACT

BACKGROUND: Many countries have experienced 2 predominant waves of COVID-19-related hospitalizations. Comparing the clinical trajectories of patients hospitalized in separate waves of the pandemic enables further understanding of the evolving epidemiology, pathophysiology, and health care dynamics of the COVID-19 pandemic. OBJECTIVE: In this retrospective cohort study, we analyzed electronic health record (EHR) data from patients with SARS-CoV-2 infections hospitalized in participating health care systems representing 315 hospitals across 6 countries. We compared hospitalization rates, severe COVID-19 risk, and mean laboratory values between patients hospitalized during the first and second waves of the pandemic. METHODS: Using a federated approach, each participating health care system extracted patient-level clinical data on their first and second wave cohorts and submitted aggregated data to the central site. Data quality control steps were adopted at the central site to correct for implausible values and harmonize units. Statistical analyses were performed by computing individual health care system effect sizes and synthesizing these using random-effects meta-analyses to account for heterogeneity. We focused the laboratory analysis on C-reactive protein (CRP), ferritin, fibrinogen, procalcitonin, D-dimer, and creatinine based on their reported associations with severe COVID-19. RESULTS: Data were available for 79,613 patients, of whom 32,467 were hospitalized in the first wave and 47,146 in the second wave. The prevalence of male patients and patients aged 50 to 69 years decreased significantly between the first and second waves. Patients hospitalized in the second wave had a 9.9% reduction in the risk of severe COVID-19 compared to patients hospitalized in the first wave (95% CI 8.5%-11.3%).
Demographic subgroup analyses indicated that patients aged 26 to 49 years and 50 to 69 years; male and female patients; and black patients had significantly lower risk for severe disease in the second wave than in the first wave. At admission, the mean values of CRP were significantly lower in the second wave than in the first wave. On the seventh hospital day, the mean values of CRP, ferritin, fibrinogen, and procalcitonin were significantly lower in the second wave than in the first wave. In general, countries exhibited variable changes in laboratory testing rates from the first to the second wave. At admission, there was a significantly higher testing rate for D-dimer in France, Germany, and Spain. CONCLUSIONS: Patients hospitalized in the second wave were at significantly lower risk for severe COVID-19. This corresponded to mean laboratory values in the second wave that were more likely to be in typical physiological ranges on the seventh hospital day compared to the first wave. Our federated approach demonstrated the feasibility and power of harmonizing heterogeneous EHR data from multiple international health care systems to rapidly conduct large-scale studies to characterize how COVID-19 clinical trajectories evolve.
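The pooling step described in METHODS (per-system effect sizes combined by random-effects meta-analysis) can be illustrated with a minimal sketch. The abstract does not say which estimator was used; the DerSimonian-Laird estimator below is an assumption chosen because it is a common default, and the function name and inputs are hypothetical.

```python
import math

def random_effects_pool(effects, variances):
    """Pool per-site effect sizes with DerSimonian-Laird random effects.

    effects   : per-site effect estimates (hypothetical inputs)
    variances : per-site sampling variances of those estimates
    Returns (pooled_effect, standard_error, tau_squared).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]           # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q measures dispersion of site effects around the fixed-effect mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Between-site variance, truncated at zero; undefined (set to 0) for one site
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2

# Two hypothetical sites with very different effects and equal precision
pooled, se, tau2 = random_effects_pool([0.0, 4.0], [1.0, 1.0])
print(pooled, se, tau2)
```

Because only each site's effect estimate and variance are needed, a calculation of this kind fits a federated setup in which sites share aggregates rather than patient-level records.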


Subject(s)
COVID-19 , Pandemics , Adult , Aged , Female , Hospitalization , Hospitals , Humans , Male , Middle Aged , Retrospective Studies , SARS-CoV-2
18.
JAMA ; 324(1): 68-78, 2020 07 07.
Article in English | MEDLINE | ID: mdl-32633800

ABSTRACT

Importance: Data are limited regarding statin therapy for primary prevention of atherosclerotic cardiovascular disease (ASCVD) in adults 75 years and older. Objective: To evaluate the role of statin use for mortality and primary prevention of ASCVD in veterans 75 years and older. Design, Setting, and Participants: Retrospective cohort study that used Veterans Health Administration (VHA) data on adults 75 years and older, free of ASCVD, and with a clinical visit in 2002-2012. Follow-up continued through December 31, 2016. All data were linked to Medicare and Medicaid claims and pharmaceutical data. A new-user design was used, excluding those with any prior statin use. Cox proportional hazards models were fit to evaluate the association of statin use with outcomes. Analyses were conducted using propensity score overlap weighting to balance baseline characteristics. Exposures: Any new statin prescription. Main Outcomes and Measures: The primary outcomes were all-cause and cardiovascular mortality. Secondary outcomes included a composite of ASCVD events (myocardial infarction, ischemic stroke, and revascularization with coronary artery bypass graft surgery or percutaneous coronary intervention). Results: Of 326 981 eligible veterans (mean [SD] age, 81.1 [4.1] years; 97% men; 91% white), 57 178 (17.5%) newly initiated statins during the study period. During a mean follow-up of 6.8 (SD, 3.9) years, a total of 206 902 deaths occurred including 53 296 cardiovascular deaths, with 78.7 and 98.2 total deaths/1000 person-years among statin users and nonusers, respectively (weighted incidence rate difference [IRD]/1000 person-years, -19.5 [95% CI, -20.4 to -18.5]). There were 22.6 and 25.7 cardiovascular deaths per 1000 person-years among statin users and nonusers, respectively (weighted IRD/1000 person-years, -3.1 [95% CI, -3.6 to -2.6]).
For the composite ASCVD outcome there were 123 379 events, with 66.3 and 70.4 events/1000 person-years among statin users and nonusers, respectively (weighted IRD/1000 person-years, -4.1 [95% CI, -5.1 to -3.0]). After propensity score overlap weighting was applied, the hazard ratio was 0.75 (95% CI, 0.74-0.76) for all-cause mortality, 0.80 (95% CI, 0.78-0.81) for cardiovascular mortality, and 0.92 (95% CI, 0.91-0.94) for a composite of ASCVD events when comparing statin users with nonusers. Conclusions and Relevance: Among US veterans 75 years and older and free of ASCVD at baseline, new statin use was significantly associated with a lower risk of all-cause and cardiovascular mortality. Further research, including from randomized clinical trials, is needed to more definitively determine the role of statin therapy in older adults for primary prevention of ASCVD.
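As an illustration of the propensity score overlap weighting named in the design, the sketch below applies the standard definition: a treated subject is weighted by 1 minus the propensity score and an untreated subject by the propensity score itself. The function names and the toy data are hypothetical, and the propensity scores are taken as given rather than estimated; this is not the study's code.

```python
def overlap_weights(treated, propensity):
    """Overlap weights: 1 - e(x) for treated subjects, e(x) for untreated.

    treated    : iterable of 0/1 treatment indicators (hypothetical data)
    propensity : iterable of estimated P(treated | covariates), each in (0, 1)
    """
    return [(1.0 - e) if t else e for t, e in zip(treated, propensity)]

def weighted_mean(values, weights):
    """Weighted average of a covariate, used here to inspect balance."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Toy example: ages for four hypothetical subjects
treated = [1, 1, 0, 0]
propensity = [0.8, 0.5, 0.5, 0.2]
age = [78, 84, 80, 90]
w = overlap_weights(treated, propensity)

treated_age = weighted_mean([a for a, t in zip(age, treated) if t],
                            [wi for wi, t in zip(w, treated) if t])
control_age = weighted_mean([a for a, t in zip(age, treated) if not t],
                            [wi for wi, t in zip(w, treated) if not t])
print(w, treated_age, control_age)
```

Subjects whose propensity is near 0 or 1 receive small weights, so the comparison emphasizes patients who could plausibly have received either treatment.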


Subject(s)
Atherosclerosis/prevention & control , Cardiovascular Diseases/mortality , Hydroxymethylglutaryl-CoA Reductase Inhibitors/therapeutic use , Veterans , Aged , Aged, 80 and over , Cardiovascular Diseases/prevention & control , Cause of Death , Confounding Factors, Epidemiologic , Female , Humans , Male , Mortality , Propensity Score , Retrospective Studies , United States/epidemiology , Veterans Health Services
19.
J Biomed Inform ; 100: 103322, 2019 12.
Article in English | MEDLINE | ID: mdl-31672532

ABSTRACT

OBJECTIVE: With their increasingly widespread adoption, electronic health records (EHRs) have enabled phenotypic information extraction at an unprecedented granularity and scale. However, often a medical concept (e.g. diagnosis, prescription, symptom) is described in various synonyms across different EHR systems, hindering data integration for signal enhancement and complicating dimensionality reduction for knowledge discovery. Despite existing ontologies and hierarchies, tremendous human effort is needed for curation and maintenance - a process that is both unscalable and susceptible to subjective biases. This paper aims to develop a data-driven approach to automate grouping medical terms into clinically relevant concepts by combining multiple up-to-date data sources in an unbiased manner. METHODS: We present a novel data-driven grouping approach - multi-view banded spectral clustering (mvBSC) combining summary data from multiple healthcare systems. The proposed method consists of a banding step that leverages the prior knowledge from the existing coding hierarchy, and a combining step that performs spectral clustering on an optimally weighted matrix. RESULTS: We apply the proposed method to group ICD-9 and ICD-10-CM codes together by integrating data from two healthcare systems. We show grouping results and hierarchies for 13 representative disease categories. Individual grouping qualities were evaluated using normalized mutual information, adjusted Rand index, and F1-measure, and were found to consistently exhibit great similarity to the existing manual grouping counterpart. The resulting ICD groupings also enjoy comparable interpretability and are well aligned with the current ICD hierarchy. CONCLUSION: The proposed approach, by systematically leveraging multiple data sources, is able to overcome bias while maximizing consensus to achieve generalizability.
It has the advantage of being efficient, scalable, and adaptive to the evolving human knowledge reflected in the data, showing a significant step toward automating medical knowledge integration.


Subject(s)
Electronic Health Records , International Classification of Diseases , Algorithms , Automation , Cluster Analysis , Humans
20.
J Biomed Inform ; 78: 54-59, 2018 02.
Article in English | MEDLINE | ID: mdl-29305952

ABSTRACT

AIMS: Despite growing interest in using electronic health records (EHR) to create longitudinal cohort studies, the distribution and missingness of EHR data might introduce selection bias and information bias to such analyses. We aimed to examine the yield and potential for these healthcare process biases in defining a study baseline using EHR data, using the example of cholesterol and blood pressure (BP) measurements. METHODS: We created a virtual cohort study of cardiovascular disease (CVD) from patients with eligible cholesterol profiles in the New England (NE) and Southeast (SE) networks of the Veterans Health Administration in the United States. Using clinical data from the EHR, we plotted the yield of patients with BP measurements within an expanding timeframe around an index date of cholesterol testing. We compared three groups: (1) patients with BP from the exact index date; (2) patients with BP not on the index date but within the network-specific 90th percentile around the index date; and (3) patients with no BP within the network-specific 90th percentile. RESULTS: Among 589,361 total patients in the two networks, 146,636 (61.0%) of 240,479 patients from NE and 289,906 (83.1%) of 348,882 patients from SE had BP measurements on the index date. Ninety percent had BP measured within 11 days of the index date in NE and within 5 days of the index date in SE. Group 3 in both networks had fewer available race data, fewer comorbidities and CVD medications, and fewer health system encounters. CONCLUSIONS: Requiring same-day risk factor measurement in the creation of a virtual CVD cohort study from EHR data might exclude 40% of eligible patients, but including patients with infrequent visits might introduce bias. Data visualization can inform study-specific strategies to address these challenges for the research use of EHR data.
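The yield calculation described in METHODS (the share of patients with a BP measurement inside an expanding window around the index date) can be sketched as follows. This is a minimal illustration under stated assumptions: each patient is reduced to the absolute gap in days between the index date and the nearest BP measurement, `None` marks a patient with no BP at all, and the denominator is every eligible patient. The function names and the toy data are hypothetical.

```python
from bisect import bisect_right

def yield_by_window(nearest_gap_days, max_days):
    """Fraction of patients with a BP within d days of index, for d = 0..max_days."""
    n = len(nearest_gap_days)
    if n == 0:
        raise ValueError("no patients supplied")
    gaps = sorted(g for g in nearest_gap_days if g is not None)
    # bisect_right counts how many gaps are <= d
    return [bisect_right(gaps, d) / n for d in range(max_days + 1)]

def smallest_window_covering(nearest_gap_days, target=0.90):
    """Smallest window (days) capturing at least `target` of patients, else None."""
    n = len(nearest_gap_days)
    gaps = sorted(g for g in nearest_gap_days if g is not None)
    for count, gap in enumerate(gaps, start=1):
        if count / n >= target:
            return gap
    return None

# Ten hypothetical patients: six measured on the index date, one never measured
gaps = [0, 0, 0, 0, 0, 0, 3, 5, 11, None]
print(yield_by_window(gaps, 5))        # same-day yield is the first entry
print(smallest_window_covering(gaps))  # window needed to reach 90% coverage
```

Plotting the first function's output against the window width gives the kind of yield curve the abstract describes, and the second function corresponds to the network-specific 90th percentile used to define the comparison groups.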


Subject(s)
Bias , Cardiovascular Diseases/epidemiology , Electronic Health Records/statistics & numerical data , Epidemiologic Research Design , Medical Informatics/standards , Aged , Blood Pressure/physiology , Cholesterol/blood , Cohort Studies , Female , Humans , Male , Middle Aged , United States/epidemiology