1.
Article En | MEDLINE | ID: mdl-38742457

OBJECTIVES: To develop recommendations regarding the use of weights to reduce selection bias for commonly performed analyses using electronic health record (EHR)-linked biobank data. MATERIALS AND METHODS: We mapped diagnosis (ICD code) data to standardized phecodes from 3 EHR-linked biobanks with varying recruitment strategies: All of Us (AOU; n = 244 071), Michigan Genomics Initiative (MGI; n = 81 243), and UK Biobank (UKB; n = 401 167). Using 2019 National Health Interview Survey data, we constructed selection weights for AOU and MGI to make them more representative of the US adult population. We used weights previously developed for UKB to represent the UKB-eligible population. We conducted 4 common analyses comparing unweighted and weighted results. RESULTS: For AOU and MGI, estimated phecode prevalences decreased after weighting (weighted-unweighted median phecode prevalence ratio [MPR]: 0.82 and 0.61), while UKB estimates increased (MPR: 1.06). Weighting minimally impacted latent phenome dimensionality estimation. Comparing weighted versus unweighted phenome-wide association study for colorectal cancer, the strongest associations remained unaltered, with considerable overlap in significant hits. Weighting affected the estimated log-odds ratio for sex and colorectal cancer to align more closely with national registry-based estimates. DISCUSSION: Weighting had a limited impact on dimensionality estimation and large-scale hypothesis testing but impacted prevalence and association estimation. When interested in estimating effect size, specific signals from untargeted association analyses should be followed up by weighted analysis. CONCLUSION: EHR-linked biobanks should report recruitment and selection mechanisms and provide selection weights with defined target populations. Researchers should consider their intended estimands, specify source and target populations, and weight EHR-linked biobank analyses accordingly.
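
The weighted-to-unweighted median phecode prevalence ratio (MPR) reported above can be illustrated with a minimal sketch. The abstract does not give the exact computation, so this assumes a simple weighted mean per phecode and a median over phecodes with nonzero unweighted prevalence. The function names and the toy data are hypothetical and are not taken from the study.

```python
from statistics import median


def weighted_prevalence(has_code, weights):
    """Weighted share of participants who carry a given phecode."""
    return sum(w for h, w in zip(has_code, weights) if h) / sum(weights)


def median_prevalence_ratio(phecode_columns, weights):
    """Median over phecodes of (weighted prevalence / unweighted prevalence).

    Phecodes with zero unweighted prevalence are skipped (an assumption
    made here to avoid division by zero).
    """
    equal_weights = [1.0] * len(weights)
    ratios = []
    for column in phecode_columns:
        unweighted = weighted_prevalence(column, equal_weights)
        if unweighted > 0:
            ratios.append(weighted_prevalence(column, weights) / unweighted)
    return median(ratios)


# Hypothetical toy data: 4 participants, 2 phecodes (1 = has the code).
toy_phecodes = [[1, 0, 1, 0], [0, 1, 1, 0]]
# Hypothetical selection weights: the last participant represents an
# under-sampled, healthier part of the target population.
toy_weights = [1.0, 1.0, 1.0, 3.0]
print(median_prevalence_ratio(toy_phecodes, toy_weights))
```

In this toy case the MPR falls below 1 because the up-weighted participant has no diagnoses, which is the same direction of change the abstract reports for AOU and MGI.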

2.
medRxiv ; 2024 Feb 13.
Article En | MEDLINE | ID: mdl-38405832

Objective: To explore the role of selection bias adjustment by weighting electronic health record (EHR)-linked biobank data for commonly performed analyses. Materials and methods: We mapped diagnosis (ICD code) data to standardized phecodes from three EHR-linked biobanks with varying recruitment strategies: All of Us (AOU; n=244,071), Michigan Genomics Initiative (MGI; n=81,243), and UK Biobank (UKB; n=401,167). Using 2019 National Health Interview Survey data, we constructed selection weights for AOU and MGI to be more representative of the US adult population. We used weights previously developed for UKB to represent the UKB-eligible population. We conducted four common descriptive and analytic tasks comparing unweighted and weighted results. Results: For AOU and MGI, estimated phecode prevalences decreased after weighting (weighted-unweighted median phecode prevalence ratio [MPR]: 0.82 and 0.61), while UKB's estimates increased (MPR: 1.06). Weighting minimally impacted latent phenome dimensionality estimation. Comparing weighted versus unweighted PheWAS for colorectal cancer, the strongest associations remained unaltered and there was large overlap in significant hits. Weighting affected the estimated log-odds ratio for sex and colorectal cancer to align more closely with national registry-based estimates. Discussion: Weighting had limited impact on dimensionality estimation and large-scale hypothesis testing but impacted prevalence and association estimation more. Results from untargeted association analyses should be followed by weighted analysis when effect size estimation is of interest for specific signals. Conclusion: EHR-linked biobanks should report recruitment and selection mechanisms and provide selection weights with defined target populations. Researchers should consider their intended estimands, specify source and target populations, and weight EHR-linked biobank analyses accordingly.

3.
J Clin Med ; 12(23)2023 Nov 25.
Article En | MEDLINE | ID: mdl-38068365

BACKGROUND: Post-Acute Sequelae of COVID-19 (PASC) have emerged as a global public health and healthcare challenge. This study aimed to uncover predictive factors for PASC from multi-modal data to develop a predictive model for PASC diagnoses. METHODS: We analyzed electronic health records from 92,301 COVID-19 patients, covering medical phenotypes, medications, and lab results. We used a Super Learner-based prediction approach to identify predictive factors. We integrated the model outputs into individual and composite risk scores and evaluated their predictive performance. RESULTS: Our analysis identified several factors predictive of diagnoses of PASC, including being overweight/obese and the use of HMG CoA reductase inhibitors prior to COVID-19 infection, and respiratory system symptoms during COVID-19 infection. We developed a composite risk score with a moderate discriminatory ability for PASC (covariate-adjusted AUC (95% confidence interval): 0.66 (0.63, 0.69)) by combining the risk scores based on phenotype and medication records. The combined risk score could identify 10% of individuals with a 2.2-fold increased risk for PASC. CONCLUSIONS: We identified several factors predictive of diagnoses of PASC and integrated the information into a composite risk score for PASC prediction, which could contribute to the identification of individuals at higher risk for PASC and inform preventive efforts.

4.
PLoS Genet ; 19(12): e1010907, 2023 Dec.
Article En | MEDLINE | ID: mdl-38113267

OBJECTIVE: To overcome the limitations associated with the collection and curation of COVID-19 outcome data in biobanks, this study proposes the use of polygenic risk scores (PRS) as reliable proxies of COVID-19 severity across three large biobanks: the Michigan Genomics Initiative (MGI), UK Biobank (UKB), and NIH All of Us. The goal is to identify associations between pre-existing conditions and COVID-19 severity. METHODS: Drawing on a sample of more than 500,000 individuals from the three biobanks, we conducted a phenome-wide association study (PheWAS) to identify associations between a PRS for COVID-19 severity, derived from a genome-wide association study on COVID-19 hospitalization, and clinical pre-existing, pre-pandemic phenotypes. We performed cohort-specific PRS PheWAS and a subsequent fixed-effects meta-analysis. RESULTS: The current study uncovered 23 pre-existing conditions significantly associated with the COVID-19 severity PRS in cohort-specific analyses, of which 21 were observed in the UKB cohort and two in the MGI cohort. The meta-analysis yielded 27 significant phenotypes predominantly related to obesity, metabolic disorders, and cardiovascular conditions. After adjusting for body mass index, several clinical phenotypes, such as hypercholesterolemia and gastrointestinal disorders, remained associated with an increased risk of hospitalization following COVID-19 infection. CONCLUSION: By employing PRS as a proxy for COVID-19 severity, we corroborated known risk factors and identified novel associations between pre-existing clinical phenotypes and COVID-19 severity. Our study highlights the potential value of using PRS when actual outcome data may be limited or inadequate for robust analyses.


COVID-19 , Population Health , Humans , Genome-Wide Association Study , Genetic Risk Score , COVID-19/genetics , Biological Specimen Banks , Preexisting Condition Coverage , Risk Factors , Genetic Predisposition to Disease
5.
Sci Adv ; 9(51): eadj3747, 2023 Dec 22.
Article En | MEDLINE | ID: mdl-38117882

We investigated the design and analysis of observational booster vaccine effectiveness (VE) studies by performing a scoping review of booster VE literature with a focus on study design and analytic choices. We then applied 20 different approaches, including those found in the literature, to a single dataset from Michigan Medicine. We identified 80 studies in our review, including over 150 million observations in total. We found that while protection against infection is variable and dependent on several factors including the study population and time period, both monovalent boosters and particularly the bivalent booster offer strong protection against severe COVID-19. In addition, VE analyses with a severe disease outcome (hospitalization, intensive care unit admission, or death) appear to be more robust to design and analytic choices than an infection endpoint. In terms of design choices, we found that test-negative designs and their variants may offer advantages in statistical efficiency compared to cohort designs.


COVID-19 , Humans , COVID-19/epidemiology , COVID-19/prevention & control , Hospitalization , Intensive Care Units , Michigan/epidemiology , Observational Studies as Topic
6.
Epidemiol Health ; 45: e2023074, 2023.
Article En | MEDLINE | ID: mdl-37591787

The Epidemiologic Questionnaire (EPI-Q) was established to collect broad, uniform, self-reported health data to supplement electronic health record (EHR) and genotype information from participants in the University of Michigan (UM) Precision Health cohorts. Recruitment of EPI-Q participants, who were already enrolled in 1 of 3 ongoing UM Precision Health cohorts (the Michigan Genomics Initiative, Mental Health Biobank, and Metabolism, Endocrinology, and Diabetes cohorts), began in March 2020. Of 54,043 retrospective invitations, 5,577 individuals enrolled, representing a 10.3% response rate. Of these, 3,502 (63.7%) were female, and the average age was 56.1 years (standard deviation, 15.4). The baseline survey comprises 11 modules on topics including personal and family health history, lifestyle, and cancer screening and history. Additionally, 11 optional modules cover topics including financial toxicity, occupational exposure, and life meaning. The questions are based on standardized and validated instruments used in other cohorts, and we share resources to expedite development of similar surveys. Data are collected via the MyDataHelps platform, which enables current and future participants to share non-Michigan Medicine EHR data. Recruitment is ongoing. Cohort data are available to those with institutional review board approval; for details, contact the Data Office for Clinical and Translational Research (DataOffice@umich.edu).


Electronic Health Records , Mobile Applications , Humans , Female , Middle Aged , Male , Retrospective Studies , Genotype , Surveys and Questionnaires , Health Surveys
7.
medRxiv ; 2023 Jun 28.
Article En | MEDLINE | ID: mdl-37425863

Background: Observational vaccine effectiveness (VE) studies based on real-world data are a crucial supplement to initial randomized clinical trials of Coronavirus Disease 2019 (COVID-19) vaccines. However, there exists substantial heterogeneity in study designs and statistical methods for estimating VE. The impact of such heterogeneity on VE estimates is not clear. Methods: We conducted a two-step literature review of booster VE: a literature search for first or second monovalent boosters on January 1, 2023, and a rapid search for bivalent boosters on March 28, 2023. For each study identified, study design, methods, and VE estimates for infection, hospitalization, and/or death were extracted and summarized via forest plots. We then applied methods identified in the literature to a single dataset from Michigan Medicine (MM), providing a comparison of the impact of different statistical methodologies on the same dataset. Results: We identified 53 studies estimating VE of the first booster, 16 for the second booster. Of these studies, 2 were case-control, 17 were test-negative, and 50 were cohort studies. Together, they included nearly 130 million people worldwide. VE for all outcomes was very high (around 90%) in earlier studies (i.e., in 2021), but became attenuated and more heterogeneous over time (around 40%-50% for infection, 60%-90% for hospitalization, and 50%-90% for death). VE compared to the previous dose was lower for the second booster (10-30% for infection, 30-60% against hospitalization, and 50-90% against death). We also identified 11 bivalent booster studies including over 20 million people. 
Early studies of the bivalent booster showed increased effectiveness compared to the monovalent booster (VE around 50-80% for hospitalization and death). Our primary analysis with MM data using a cohort design included 186,495 individuals overall (including 153,811 boosted and 32,684 with only a primary series vaccination), and a secondary test-negative design included 65,992 individuals tested for SARS-CoV-2. When different statistical designs and methods were applied to MM data, VE estimates for hospitalization and death were robust to analytic choices, with test-negative designs leading to narrower confidence intervals. Adjusting either for the propensity of getting boosted or directly adjusting for covariates reduced the heterogeneity across VE estimates for the infection outcome. Conclusion: While the advantage of the second monovalent booster is not obvious from the literature review, the first monovalent booster and the bivalent booster appear to offer strong protection against severe COVID-19. Based on both the literature review and data analysis, VE analyses with a severe disease outcome (hospitalization, ICU admission, or death) appear to be more robust to design and analytic choices than an infection endpoint. Test-negative designs can extend to severe disease outcomes and may offer advantages in statistical efficiency when used properly.

8.
Cancer Epidemiol Biomarkers Prev ; 32(6): 748-759, 2023 06 01.
Article En | MEDLINE | ID: mdl-36626383

BACKGROUND: Studies have shown an increased risk of severe SARS-CoV-2-related (COVID-19) disease outcome and mortality for patients with cancer, but it is not well understood whether associations vary by cancer site, cancer treatment, and vaccination status. METHODS: Using electronic health record data from an academic medical center, we identified a retrospective cohort of 260,757 individuals tested for or diagnosed with COVID-19 from March 10, 2020, to August 1, 2022. Of these, 52,019 tested positive for COVID-19 of whom 13,752 had a cancer diagnosis. We conducted Firth-corrected logistic regression to assess the association between cancer status, site, treatment, vaccination, and four COVID-19 outcomes: hospitalization, intensive care unit admission, mortality, and a composite "severe COVID" outcome. RESULTS: Cancer diagnosis was significantly associated with higher rates of severe COVID, hospitalization, and mortality. These associations were driven by patients whose most recent initial cancer diagnosis was within the past 3 years. Chemotherapy receipt, colorectal cancer, hematologic malignancies, kidney cancer, and lung cancer were significantly associated with higher rates of worse COVID-19 outcomes. Vaccinations were significantly associated with lower rates of worse COVID-19 outcomes regardless of cancer status. CONCLUSIONS: Patients with colorectal cancer, hematologic malignancies, kidney cancer, or lung cancer or who receive chemotherapy for treatment should be cautious because of their increased risk of worse COVID-19 outcomes, even after vaccination. IMPACT: Additional COVID-19 precautions are warranted for people with certain cancer types and treatments. Significant benefit from vaccination is noted for both cancer and cancer-free patients.


COVID-19 , Colorectal Neoplasms , Hematologic Neoplasms , Kidney Neoplasms , Lung Neoplasms , Humans , COVID-19/epidemiology , SARS-CoV-2 , Retrospective Studies , Hospitalization , Vaccination
9.
PLoS One ; 17(7): e0269017, 2022.
Article En | MEDLINE | ID: mdl-35877617

Since the beginning of the Coronavirus Disease 2019 (COVID-19) pandemic, a focus of research has been to identify risk factors associated with COVID-19-related outcomes, such as testing and diagnosis, and use them to build prediction models. Existing studies have used data from digital surveys or electronic health records (EHRs), but very few have linked the two sources to build joint predictive models. In this study, we used survey data on 7,054 patients from the Michigan Genomics Initiative biorepository to evaluate how well self-reported data could be integrated with electronic records for the purpose of modeling COVID-19-related outcomes. We observed that among survey respondents, self-reported COVID-19 diagnosis captured a larger number of cases than the corresponding EHRs, suggesting that self-reported outcomes may be better than EHRs for distinguishing COVID-19 cases from controls. In the modeling context, we compared the utility of survey- and EHR-derived predictor variables in models of survey-reported COVID-19 testing and diagnosis. We found that survey-derived predictors produced uniformly stronger models than EHR-derived predictors, likely due to their specificity, temporal proximity, and breadth, and that combining predictors from both sources offered no consistent improvement compared to using survey-based predictors alone. Our results suggest that, even though general EHRs are useful in predictive models of COVID-19 outcomes, they may not be essential in those models when rich survey data are already available. The two data sources together may offer better prediction for COVID severity, but we did not have enough severe cases in the survey respondents to assess that hypothesis in our study.


COVID-19 , Electronic Health Records , COVID-19/diagnosis , COVID-19/epidemiology , COVID-19 Testing , Humans , Self Report , Surveys and Questionnaires
10.
Sci Adv ; 8(24): eabp8621, 2022 Jun 17.
Article En | MEDLINE | ID: mdl-35714183

India experienced a massive surge in SARS-CoV-2 infections and deaths during April to June 2021 despite having controlled the epidemic relatively well during 2020. Using counterfactual predictions from epidemiological disease transmission models, we produce evidence in support of how strengthening public health interventions early would have helped control transmission in the country and significantly reduced mortality during the second wave, even without harsh lockdowns. We argue that enhanced surveillance at district, state, and national levels and constant assessment of risk associated with increased transmission are critical for future pandemic responsiveness. Building on our retrospective analysis, we provide a tiered data-driven framework for timely escalation of future interventions as a tool for policy-makers.

11.
PLoS Genet ; 17(9): e1009670, 2021 09.
Article En | MEDLINE | ID: mdl-34529658

Polygenic risk scores (PRS) can provide useful information for personalized risk stratification and disease risk assessment, especially when combined with non-genetic risk factors. However, their construction depends on the availability of summary statistics from genome-wide association studies (GWAS) independent from the target sample. For best compatibility, it was reported that GWAS and the target sample should match in terms of ancestries. Yet, GWAS, especially in the field of cancer, often lack diversity and are predominated by European ancestry. This bias is a limiting factor in PRS research. By using electronic health records and genetic data from the UK Biobank, we contrast the utility of breast and prostate cancer PRS derived from external European-ancestry-based GWAS across African, East Asian, European, and South Asian ancestry groups. We highlight differences in the PRS distributions of these groups that are amplified when PRS methods condense hundreds of thousands of variants into a single score. While European-GWAS-derived PRS were not directly transferrable across ancestries on an absolute scale, we establish their predictive potential when considering them separately within each group. For example, the top 10% of the breast cancer PRS distributions within each ancestry group each revealed significant enrichments of breast cancer cases compared to the bottom 90% (odds ratio of 2.81 [95%CI: 2.69,2.93] in European, 2.88 [1.85, 4.48] in African, 2.60 [1.25, 5.40] in East Asian, and 2.33 [1.55, 3.51] in South Asian individuals). Our findings highlight a compromise solution for PRS research to compensate for the lack of diversity in well-powered European GWAS efforts while recruitment of diverse participants in the field catches up.
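
The top-10% versus bottom-90% comparison above is reported as an odds ratio with a 95% confidence interval. As a minimal sketch, assuming an unadjusted 2x2 table and a Wald interval on the log-odds scale (the study may well have used adjusted regression models instead), the calculation looks like this. The function name and the counts are hypothetical.

```python
import math


def odds_ratio_with_ci(cases_top, controls_top, cases_rest, controls_rest, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    'top' is the high-PRS group (e.g. top 10%); 'rest' is everyone else.
    All four counts must be positive.
    """
    odds_ratio = (cases_top * controls_rest) / (controls_top * cases_rest)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(
        1 / cases_top + 1 / controls_top + 1 / cases_rest + 1 / controls_rest
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the UK Biobank analysis.
print(odds_ratio_with_ci(cases_top=20, controls_top=80, cases_rest=10, controls_rest=90))
```

Smaller groups give larger reciprocal counts and therefore wider intervals, which matches the pattern in the abstract, where the non-European ancestry groups have the widest intervals.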


Breast Neoplasms/genetics , Genetic Predisposition to Disease , Multifactorial Inheritance , Female , Genome-Wide Association Study , Humans
14.
BMC Infect Dis ; 21(1): 533, 2021 Jun 07.
Article En | MEDLINE | ID: mdl-34098885

BACKGROUND: Many popular disease transmission models have helped nations respond to the COVID-19 pandemic by informing decisions about pandemic planning, resource allocation, implementation of social distancing measures, lockdowns, and other non-pharmaceutical interventions. We study how five epidemiological models forecast and assess the course of the pandemic in India: a baseline curve-fitting model, an extended SIR (eSIR) model, two extended SEIR (SAPHIRE and SEIR-fansy) models, and a semi-mechanistic Bayesian hierarchical model (ICM). METHODS: Using COVID-19 case-recovery-death count data reported in India from March 15 to October 15 to train the models, we generate predictions from each of the five models from October 16 to December 31. To compare prediction accuracy with respect to reported cumulative and active case counts and reported cumulative death counts, we compute the symmetric mean absolute prediction error (SMAPE) for each of the five models. For reported cumulative cases and deaths, we compute Pearson's and Lin's correlation coefficients to investigate how well the projected and observed reported counts agree. We also present underreporting factors when available, and comment on uncertainty of projections from each model. RESULTS: For active case counts, SMAPE values are 35.14% (SEIR-fansy) and 37.96% (eSIR). For cumulative case counts, SMAPE values are 6.89% (baseline), 6.59% (eSIR), 2.25% (SAPHIRE) and 2.29% (SEIR-fansy). For cumulative death counts, the SMAPE values are 4.74% (SEIR-fansy), 8.94% (eSIR) and 0.77% (ICM). Three models (SAPHIRE, SEIR-fansy and ICM) return total (sum of reported and unreported) cumulative case counts as well. We compute underreporting factors as of October 31 and note that for cumulative cases, the SEIR-fansy model yields an underreporting factor of 7.25 and ICM model yields 4.54 for the same quantity. 
For total (sum of reported and unreported) cumulative deaths, the SEIR-fansy model reports an underreporting factor of 2.97. On October 31, we observe 8.18 million cumulative reported cases, while the projections (in millions) from the baseline model are 8.71 (95% credible interval: 8.63-8.80); eSIR yields 8.35 (7.19-9.60), SAPHIRE returns 8.17 (7.90-8.52), and SEIR-fansy projects 8.51 (8.18-8.85) million cases. Cumulative case projections from the eSIR model have the highest uncertainty in terms of width of 95% credible intervals, followed by those from SAPHIRE, the baseline model and finally SEIR-fansy. CONCLUSIONS: In this comparative paper, we describe five different models used to study the transmission dynamics of the SARS-CoV-2 virus in India. While simulation studies are the only gold standard way to compare the accuracy of the models, here we were uniquely poised to compare the projected case-counts against observed data on a test period. The largest variability across models is observed in predicting the "total" number of infections including reported and unreported cases (on which we have no validation data). The degree of under-reporting has been a major concern in India and is characterized in this report. Overall, the SEIR-fansy model appeared to be a good choice, with a publicly available R package and the desired flexibility and accuracy.
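
The abstract compares models by symmetric mean absolute prediction error (SMAPE) but does not state the formula. The sketch below uses one common definition, in which each absolute error is divided by the mean of the absolute predicted and observed values and the result is expressed as a percentage. Whether the paper used exactly this variant is an assumption, and the function name and counts are hypothetical.

```python
def smape(predicted, observed):
    """Symmetric mean absolute prediction error, in percent.

    Uses |p - o| / ((|p| + |o|) / 2) per time point; a point where both
    values are zero contributes zero error.
    """
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("predicted and observed must be non-empty and equal in length")
    total = 0.0
    for p, o in zip(predicted, observed):
        denominator = (abs(p) + abs(o)) / 2
        if denominator > 0:
            total += abs(p - o) / denominator
    return 100 * total / len(predicted)


# Hypothetical daily cumulative counts (in millions) over a short test period.
projected = [8.10, 8.25, 8.40]
reported = [8.05, 8.12, 8.18]
print(smape(projected, reported))
```

Because the denominator averages the two series, over- and under-prediction of the same absolute size are penalized equally, which makes the metric convenient for comparing models that err in different directions.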


COVID-19/epidemiology , COVID-19/transmission , Pandemics , Bayes Theorem , Communicable Disease Control/methods , Computer Simulation , Forecasting , Humans , India/epidemiology , Models, Statistical
15.
Sci Rep ; 11(1): 9748, 2021 05 07.
Article En | MEDLINE | ID: mdl-33963259

Susceptible-Exposed-Infected-Removed (SEIR)-type epidemiologic models, modeling unascertained infections latently, can predict unreported cases and deaths assuming perfect testing. We apply a method we developed to account for the high false negative rates of diagnostic RT-PCR tests for detecting an active SARS-CoV-2 infection in a classic SEIR model. The number of unascertained cases and false negatives being unobservable in a real study, population-based serosurveys can help validate model projections. Applying our method to training data from Delhi, India, during March 15-June 30, 2020, we estimate the underreporting factor for cases at 34-53 (deaths: 8-13) on July 10, 2020, largely consistent with the findings of the first round of serosurveys for Delhi (done during June 27-July 10, 2020) with an estimated 22.86% IgG antibody prevalence, yielding estimated underreporting factors of 30-42 for cases. Together, these imply approximately 96-98% cases in Delhi remained unreported (July 10, 2020). Updated calculations using training data during March 15-December 31, 2020 yield estimated underreporting factor for cases at 13-22 (deaths: 3-7) on January 23, 2021, which are again consistent with the latest (fifth) round of serosurveys for Delhi (done during January 15-23, 2021) with an estimated 56.13% IgG antibody prevalence, yielding an estimated range for the underreporting factor for cases at 17-21. Together, these updated estimates imply approximately 92-96% cases in Delhi remained unreported (January 23, 2021). Such model-based estimates, updated with latest data, provide a viable alternative to repeated resource-intensive serosurveys for tracking unreported cases and deaths and gauging the true extent of the pandemic.
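
The step from an underreporting factor to a percentage of unreported cases can be made explicit. Assuming the underreporting factor is defined as total (reported plus unreported) cases divided by reported cases, which is how the term is usually used but is not spelled out in the abstract, the unreported share is 1 - 1/factor. The function name below is hypothetical.

```python
def unreported_fraction(underreporting_factor):
    """Share of all cases that went unreported.

    Assumes underreporting_factor = total cases / reported cases, so the
    reported share is 1 / factor and the rest is unreported.
    """
    if underreporting_factor < 1:
        raise ValueError("an underreporting factor below 1 is not meaningful here")
    return 1 - 1 / underreporting_factor


# Factors quoted in the abstract for cases in Delhi.
for factor in (34, 53, 13, 22):
    print(factor, round(100 * unreported_fraction(factor), 1))
```

Under this definition, factors of 34 and 53 correspond to roughly 97.1% and 98.1% unreported, and factors of 13 and 22 to roughly 92.3% and 95.5%, in line with the approximate ranges stated above.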


COVID-19/diagnosis , COVID-19/epidemiology , SARS-CoV-2/isolation & purification , Adolescent , Adult , Antibodies, Viral/immunology , COVID-19/immunology , COVID-19/transmission , COVID-19 Testing , Child , Child, Preschool , False Negative Reactions , Female , Humans , Immunoglobulin G/immunology , India/epidemiology , Male , SARS-CoV-2/immunology , Seroepidemiologic Studies , Young Adult
16.
J Clin Med ; 10(7)2021 Mar 25.
Article En | MEDLINE | ID: mdl-33805886

BACKGROUND: We performed a phenome-wide association study to identify pre-existing conditions related to Coronavirus disease 2019 (COVID-19) prognosis across the medical phenome and how they vary by race. METHODS: The study is comprised of 53,853 patients who were tested/diagnosed for COVID-19 between 10 March and 2 September 2020 at a large academic medical center. RESULTS: Pre-existing conditions strongly associated with hospitalization were renal failure, pulmonary heart disease, and respiratory failure. Hematopoietic conditions were associated with intensive care unit (ICU) admission/mortality and mental disorders were associated with mortality in non-Hispanic Whites. Circulatory system and genitourinary conditions were associated with ICU admission/mortality in non-Hispanic Blacks. CONCLUSIONS: Understanding pre-existing clinical diagnoses related to COVID-19 outcomes informs the need for targeted screening to support specific vulnerable populations to improve disease prevention and healthcare delivery.

17.
J Biomed Inform ; 113: 103652, 2021 01.
Article En | MEDLINE | ID: mdl-33279681

BACKGROUND: Traditional methods for disease risk prediction and assessment, such as diagnostic tests using serum, urine, blood, saliva or imaging biomarkers, have been important for identifying high-risk individuals for many diseases, leading to early detection and improved survival. For pancreatic cancer, traditional methods for screening have been largely unsuccessful in identifying high-risk individuals in advance of disease progression leading to high mortality and poor survival. Electronic health records (EHR) linked to genetic profiles provide an opportunity to integrate multiple sources of patient information for risk prediction and stratification. We leverage a constellation of temporally associated diagnoses available in the EHR to construct a summary risk score, called a phenotype risk score (PheRS), for identifying individuals at high-risk for having pancreatic cancer. The proposed PheRS approach incorporates the time with respect to disease onset into the prediction framework. We combine and contrast the PheRS with more well-known measures of inherited susceptibility, namely, the polygenic risk scores (PRS) for prediction of pancreatic cancer. METHODOLOGY: We first calculated pairwise, unadjusted associations between pancreatic cancer diagnosis and all possible other diagnoses across the medical phenome. We call these pairwise associations co-occurrences. After accounting for cross-phenotype correlations, the multivariable association estimates from a subset of relatively independent diagnoses were used to create a weighted sum PheRS. We constructed time-restricted risk scores using data from 38,359 participants in the Michigan Genomics Initiative (MGI) based on the diagnoses contained in the EHR at 0, 1, 2, and 5 years prior to the target pancreatic cancer diagnosis. The PheRS was assessed for predictability in the UK Biobank (UKB). 
We tested the relative contribution of PheRS when added to a model containing a summary measure of inherited genetic susceptibility (PRS) plus other covariates like age, sex, smoking status, drinking status, and body mass index (BMI). RESULTS: Our exploration of co-occurrence patterns identified expected associations while also revealing unexpected relationships that may warrant closer attention. Solely using the pancreatic cancer PheRS at 5 years before the target diagnoses yielded an AUC of 0.60 (95% CI = [0.58, 0.62]) in UKB. A larger predictive model including PheRS, PRS, and the covariates at the 5-year threshold achieved an AUC of 0.74 (95% CI = [0.72, 0.76]) in UKB. We note that PheRS does contribute independently in the joint model. Finally, scores at the top percentiles of the PheRS distribution demonstrated promise in terms of risk stratification. Scores in the top 2% were 10.20 (95% CI = [9.34, 12.99]) times more likely to identify cases than those in the bottom 98% in UKB at the 5-year threshold prior to pancreatic cancer diagnosis. CONCLUSIONS: We developed a framework for creating a time-restricted PheRS from EHR data for pancreatic cancer using the rich information content of a medical phenome. In addition to identifying hypothesis-generating associations for future research, this PheRS demonstrates a potentially important contribution in identifying high-risk individuals, even after adjusting for PRS for pancreatic cancer and other traditional epidemiologic covariates. The methods are generalizable to other phenotypic traits.


Electronic Health Records , Pancreatic Neoplasms , Biological Specimen Banks , Genome-Wide Association Study , Humans , Michigan , Pancreatic Neoplasms/genetics , Phenotype , Risk Factors
18.
medRxiv ; 2021 Feb 20.
Article En | MEDLINE | ID: mdl-32793923

BACKGROUND: We perform a phenome-wide scan to identify pre-existing conditions related to COVID-19 susceptibility and prognosis across the medical phenome and how they vary by race. METHODS: The study is comprised of 53,853 patients who were tested/positive for COVID-19 between March 10 and September 2, 2020 at a large academic medical center. RESULTS: Pre-existing conditions strongly associated with hospitalization were renal failure, pulmonary heart disease, and respiratory failure. Hematopoietic conditions were associated with ICU admission/mortality and mental disorders were associated with mortality in non-Hispanic Whites. Circulatory system and genitourinary conditions were associated with ICU admission/mortality in non-Hispanic Blacks. CONCLUSIONS: Understanding pre-existing clinical diagnoses related to COVID-19 outcomes informs the need for targeted screening to support specific vulnerable populations to improve disease prevention and healthcare delivery.

19.
BMJ Open ; 10(12): e041778, 2020 12 10.
Article En | MEDLINE | ID: mdl-33303462

OBJECTIVES: To evaluate the effect of the four-phase national lockdown from 25 March to 31 May in response to the COVID-19 pandemic in India and unmask the state-wise variations in terms of multiple public health metrics. DESIGN: Cohort study (daily time series of case counts). SETTING: Observational and population based. PARTICIPANTS: Confirmed COVID-19 cases nationally and across 20 states that accounted for >99% of the cumulative case counts in India until 31 May 2020. EXPOSURE: Lockdown (non-medical intervention). MAIN OUTCOMES AND MEASURES: We illustrate the masking of state-level trends and highlight the variations across states by presenting evaluative evidence on some aspects of the COVID-19 outbreak: case fatality rates, doubling times of cases, effective reproduction numbers and the scale of testing. RESULTS: The estimated effective reproduction number R for India was 3.36 (95% CI 3.03 to 3.71) on 24 March, whereas the average of estimates from 25 May to 31 May was 1.27 (95% CI 1.26 to 1.28). Similarly, the estimated doubling time across India was 3.56 days on 24 March, and the 7-day average ending on 31 May was 14.37 days. The average daily number of tests increased from 1717 (19-25 March) to 113 372 (25-31 May) while the test positivity rate increased from 2.1% to 4.2%, respectively. However, various states exhibit substantial departures from these national patterns. CONCLUSIONS: Patterns of change over lockdown periods indicate the lockdown has been partly effective in slowing the spread of the virus nationally. However, there are large state-level variations, and identifying them can help in both understanding the dynamics of the pandemic and formulating effective public health interventions. Our framework offers a holistic assessment of the pandemic across Indian states and union territories along with a set of interactive visualisation tools that are updated daily at covind19.org.
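To put the reported doubling times in more familiar terms, the sketch below converts a doubling time into the implied daily growth rate of cumulative cases, and back. It assumes constant exponential growth over the window, which is a simplification; the abstract does not state the estimator the authors used, and the function names here are hypothetical.

```python
import math


def implied_daily_growth(doubling_time_days):
    """Daily growth rate r such that counts double every `doubling_time_days`.

    Assumes constant exponential growth: (1 + r) ** T = 2.
    """
    return 2 ** (1 / doubling_time_days) - 1


def doubling_time(daily_growth_rate):
    """Inverse relation: T = ln 2 / ln(1 + r)."""
    return math.log(2) / math.log(1 + daily_growth_rate)


# Doubling times quoted in the abstract (days).
for label, t in [("24 March", 3.56), ("7-day average to 31 May", 14.37)]:
    r = implied_daily_growth(t)
    print(f"{label}: doubling time {t} days, implied daily growth {r:.1%}")
```

Under this assumption, the change from 3.56 to 14.37 days corresponds to daily growth in cumulative cases falling from roughly 21% to roughly 5%.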


COVID-19 Testing/statistics & numerical data, COVID-19/mortality, Public Health/trends, Quarantine/statistics & numerical data, COVID-19/prevention & control, Humans, India/epidemiology
20.
JAMA Netw Open ; 3(10): e2025197, 2020 10 01.
Article En | MEDLINE | ID: mdl-33084902

Importance: Black patients are overrepresented in the number of COVID-19 infections, hospitalizations, and deaths in the US. This disparity may be due to underlying comorbidities or sociodemographic factors that require further exploration. Objective: To systematically determine patient characteristics associated with racial/ethnic disparities in COVID-19 outcomes. Design, Setting, and Participants: This retrospective cohort study used comparative groups of patients tested or treated for COVID-19 at the University of Michigan from March 10, 2020, to April 22, 2020, with an outcome update through July 28, 2020. A group of randomly selected untested individuals was included for comparison. Examined factors included race/ethnicity, age, smoking, alcohol consumption, comorbidities, body mass index (BMI; calculated as weight in kilograms divided by height in meters squared), and residential-level socioeconomic characteristics. Exposure: In-house polymerase chain reaction (PCR) tests, commercial antibody tests, nasopharynx or oropharynx PCR deployed by the Michigan Department of Health and Human Services, and reverse transcription-PCR tests performed in external labs. Main Outcomes and Measures: The main outcomes were being tested for COVID-19, having test results positive for COVID-19 or being diagnosed with COVID-19, being hospitalized for COVID-19, requiring intensive care unit (ICU) admission for COVID-19, and COVID-19-related mortality (including inpatient and outpatient). Medical comorbidities were defined from the International Classification of Diseases, Ninth Revision, and International Classification of Diseases, Tenth Revision, codes and were aggregated into a comorbidity score. Associations with COVID-19 outcomes were examined using odds ratios (ORs).
Results: Of 5698 patients tested for COVID-19 (mean [SD] age, 47.4 [20.9] years; 2167 [38.0%] men; mean [SD] BMI, 30.0 [8.0]), most were non-Hispanic White (3740 patients [65.6%]) or non-Hispanic Black (1058 patients [18.6%]). The comparison group included 7168 individuals who were not tested (mean [SD] age, 43.1 [24.1] years; 3257 [45.4%] men; mean [SD] BMI, 28.5 [7.1]). Among 1139 patients diagnosed with COVID-19, 492 (43.2%) were White and 442 (38.8%) were Black; 523 (45.9%) were hospitalized, 283 (24.7%) were admitted to the ICU, and 88 (7.7%) died. Adjusting for age, sex, socioeconomic status, and comorbidity score, Black patients were more likely to be hospitalized compared with White patients (OR, 1.72 [95% CI, 1.15-2.58]; P = .009). In addition to older age, male sex, and obesity, living in densely populated areas was associated with increased risk of hospitalization (OR, 1.10 [95% CI, 1.01-1.19]; P = .02). In the overall population, higher risk of hospitalization was also observed in patients with preexisting type 2 diabetes (OR, 1.82 [95% CI, 1.25-2.64]; P = .02) and kidney disease (OR, 2.87 [95% CI, 1.87-4.42]; P < .001). Compared with White patients, obesity was associated with higher risk of having test results positive for COVID-19 among Black patients (White: OR, 1.37 [95% CI, 1.01-1.84]; P = .04; Black: OR, 3.11 [95% CI, 1.64-5.90]; P < .001; P for interaction = .02). Having any cancer was associated with higher risk of positive COVID-19 test results for Black patients (OR, 1.82 [95% CI, 1.19-2.78]; P = .005) but not White patients (OR, 1.08 [95% CI, 0.84-1.40]; P = .53; P for interaction = .04).
Overall comorbidity burden was associated with higher risk of hospitalization in White patients (OR, 1.30 [95% CI, 1.11-1.53]; P = .001) but not in Black patients (OR, 0.99 [95% CI, 0.83-1.17]; P = .88; P for interaction = .02), as was type 2 diabetes (White: OR, 2.59 [95% CI, 1.49-4.48]; P < .001; Black: OR, 1.17 [95% CI, 0.66-2.06]; P = .59; P for interaction = .046). No statistically significant racial differences were found in ICU admission and mortality based on adjusted analysis. Conclusions and Relevance: These findings suggest that preexisting type 2 diabetes or kidney diseases and living in high-population density areas were associated with higher risk for COVID-19 hospitalization. Associations of risk factors with COVID-19 outcomes differed by race.
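As a reading aid for the odds ratios above, the following sketch shows how a reported OR and its 95% CI can be related to a Wald z-statistic and two-sided P value on the log-odds scale. It assumes the interval is a symmetric Wald interval on the log scale, which the abstract does not state; the function name `wald_from_ci` is hypothetical. The example uses the adjusted hospitalization OR for Black vs White patients reported above (OR, 1.72 [95% CI, 1.15-2.58]).

```python
import math


def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Back out the log-scale standard error, z-statistic and two-sided P value.

    Assumes a symmetric Wald interval: log(OR) +/- z_crit * SE.
    """
    log_se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / log_se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return log_se, z, p_value


# Values quoted in the abstract for hospitalization, Black vs White patients.
log_se, z, p_value = wald_from_ci(1.72, 1.15, 2.58)
print(f"SE(log OR) = {log_se:.3f}, z = {z:.2f}, P = {p_value:.3f}")
```

Under that assumption, the back-calculated P value rounds to the reported P = .009. Because published ORs and interval limits are rounded, this kind of check is approximate.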


Black or African American, Coronavirus Infections/ethnology, Health Status Disparities, Hospitalization, Pneumonia, Viral/ethnology, White People, Adult, Aged, Betacoronavirus, COVID-19, Comorbidity, Coronavirus Infections/epidemiology, Coronavirus Infections/therapy, Coronavirus Infections/virology, Diabetes Mellitus, Type 2/epidemiology, Female, Humans, Intensive Care Units, Kidney Diseases/epidemiology, Male, Michigan/epidemiology, Middle Aged, Neoplasms/epidemiology, Obesity/epidemiology, Odds Ratio, Pandemics, Pneumonia, Viral/epidemiology, Pneumonia, Viral/therapy, Pneumonia, Viral/virology, Population Density, Retrospective Studies, Risk Factors, SARS-CoV-2
...