ABSTRACT
BACKGROUND: Although the COVID-19 pandemic has persisted for over 3 years, reinfections with SARS-CoV-2 are not well understood. We aim to characterize reinfection, understand the development of Long COVID after reinfection, and compare the severity of reinfection with that of initial infection. METHODS: We use an electronic health record study cohort of over 3 million patients from the National COVID Cohort Collaborative as part of the NIH Researching COVID to Enhance Recovery Initiative. We calculate summary statistics, effect sizes, and Kaplan-Meier curves to better understand COVID-19 reinfections. RESULTS: We validate previous findings on reinfection incidence (6.9%), the occurrence of most reinfections during the Omicron epoch, and evidence of multiple reinfections. We find that, for infections in the same epoch, the proportion of Long COVID diagnoses is higher following initial infection than following reinfection. We report lower albumin levels leading up to reinfection and a statistically significant association between the severity of initial infection and the severity of reinfection (chi-squared value: 25,697, p-value: <0.0001) with a medium effect size (Cramér's V: 0.20, DoF = 3). Individuals who experienced both a severe initial infection and a severe first reinfection were older and at higher mortality risk than those whose initial infection and reinfection were both mild. CONCLUSIONS: In a large patient cohort, we find that the severity of reinfection appears to be associated with the severity of initial infection and that Long COVID diagnoses appear to occur more often following initial infection than following reinfection in the same epoch. Future research may build on these findings to better understand COVID-19 reinfections.
More than three years after the start of the COVID-19 pandemic, individuals are frequently reporting multiple COVID-19 infections. However, these reinfections remain poorly understood. Here, we investigate COVID-19 reinfections in a large electronic health record cohort of over 3 million patients. We use data summary techniques and statistical tests to characterize reinfections and their relationships with disease severity, biomarkers, and Long COVID. We find that individuals with a severe initial infection are more likely to experience a severe reinfection, that levels of some proteins are lower in the period leading up to reinfection, and that a lower proportion of individuals are diagnosed with Long COVID following reinfection than following initial infection. Our work highlights the prevalence and impact of reinfections and suggests the need for further research.
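The Kaplan-Meier curves mentioned in the methods can be illustrated with a minimal sketch of the product-limit estimator. This is not the study's code: the function name `kaplan_meier` and the six follow-up records are hypothetical, and the data are synthetic rather than drawn from the cohort. At each time with at least one observed event, the running survival probability is multiplied by one minus the fraction of at-risk subjects who had the event; censored subjects leave the risk set without changing the estimate.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time for each subject (any consistent unit)
    events:    1 if the event was observed at that time, 0 if censored
    """
    event_counts = Counter(t for t, e in zip(durations, events) if e)
    exit_counts = Counter(durations)  # events and censorings together
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exit_counts):
        d = event_counts.get(t, 0)
        if d:
            # Product-limit step: scale by the share surviving past time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Everyone who exits at t (event or censored) leaves the risk set.
        at_risk -= exit_counts[t]
    return curve


# Synthetic example: time to an event in days, with two censored subjects.
durations = [5, 8, 8, 12, 15, 20]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(durations, events)
for t, s in curve:
    print(f"day {t}: S(t) = {s:.3f}")
```

In a real analysis the curve would typically be estimated per group (for example, per variant epoch) and compared across groups; a maintained survival-analysis library would also provide confidence intervals.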
ABSTRACT
Although the COVID-19 pandemic has persisted for over 2 years, reinfections with SARS-CoV-2 are not well understood. We use the electronic health record (EHR)-based study cohort from the National COVID Cohort Collaborative (N3C) as part of the NIH Researching COVID to Enhance Recovery (RECOVER) Initiative to characterize reinfection, understand development of Long COVID after reinfection, and compare severity of reinfection with initial infection. We validate previous findings of reinfection incidence (5.9%), the occurrence of most reinfections during the Omicron epoch, and evidence of multiple reinfections. We present novel findings that Long COVID diagnoses occur closer to the index date for infection or reinfection in the Omicron BA epoch. We report lower albumin levels leading up to reinfection and a statistically significant association of severity between first infection and reinfection (chi-squared value: 9446.2, p-value: 0) with a medium effect size (Cramer's V: 0.18, DoF = 4).
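For readers unfamiliar with the effect size reported above, Cramér's V rescales a chi-squared statistic to the range 0 to 1 via V = sqrt(chi² / (n · min(r − 1, c − 1))), where n is the total count and r and c are the numbers of rows and columns of the contingency table. The sketch below is illustrative only: the helper names and the 3×3 table of counts are hypothetical and are not the study's data. It assumes every expected cell count is greater than zero.

```python
import math


def chi_squared(table):
    """Pearson chi-squared statistic, degrees of freedom, and total count."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof, n


def cramers_v(table):
    """Cramér's V: chi-squared rescaled to lie between 0 and 1."""
    stat, _, n = chi_squared(table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(stat / (n * k))


# Synthetic 3x3 table: rows = severity of first infection,
# columns = severity of reinfection (made-up counts).
severity = [
    [50, 30, 20],
    [30, 40, 30],
    [20, 30, 50],
]
stat, dof, n = chi_squared(severity)
print(f"chi-squared = {stat:.2f}, DoF = {dof}, V = {cramers_v(severity):.3f}")
```

Because V divides by the sample size, a very large chi-squared value in a cohort of this scale can still correspond to a moderate association, which is why the effect size is reported alongside the p-value.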
ABSTRACT
OBJECTIVE: Clinical encounter data are heterogeneous and vary greatly from institution to institution. This variance limits the interpretability and usability of clinical encounter data for analysis, and the problem is magnified when multisite electronic health record (EHR) data are networked together. This article presents a novel, generalizable method for resolving encounter heterogeneity for analysis by combining related atomic encounters into composite "macrovisits." MATERIALS AND METHODS: Encounters were composed of data from 75 partner sites harmonized to a common data model as part of the NIH Researching COVID to Enhance Recovery Initiative, a project of the National COVID Cohort Collaborative. Summary statistics were computed for overall and site-level data to assess issues and identify modifications. Two algorithms were developed to refine atomic encounters into cleaner, analyzable longitudinal clinical visits. RESULTS: Atomic inpatient encounter data were found to be widely disparate between sites in terms of length of stay (LOS) and number of OMOP CDM measurements per encounter. After aggregating encounters to macrovisits, LOS and measurement variance decreased. A subsequent algorithm to identify hospitalized macrovisits further reduced data variability. DISCUSSION: Encounters are a complex and heterogeneous component of EHR data, and native data issues are not addressed by existing methods. Such complex and poorly studied issues contribute to the difficulty of deriving value from EHR data, and foundational, large-scale explorations and developments of this kind are necessary to realize the full potential of modern real-world data. CONCLUSION: This article presents methods for manipulating and resolving EHR encounter data issues in a generalizable way as a foundation for future research and analysis.
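The abstract does not describe the two algorithms in detail, so the following is only a hypothetical simplification of the general idea of combining related atomic encounters into a composite visit. It assumes that "related" means encounters for the same patient whose date ranges overlap or touch; the function name `merge_encounters`, the tuple layout, and the sample records are all invented for illustration and should not be read as the article's method.

```python
from datetime import date


def merge_encounters(encounters):
    """Merge overlapping encounters per patient into composite visits.

    encounters: iterable of (patient_id, start_date, end_date) tuples.
    Returns a list of merged (patient_id, start_date, end_date) tuples.
    Assumption: encounters are "related" when their date ranges overlap
    or share a boundary day for the same patient.
    """
    merged = []
    # Sorting groups records by patient, then by start date.
    for pid, start, end in sorted(encounters):
        if merged and merged[-1][0] == pid and start <= merged[-1][2]:
            last_pid, last_start, last_end = merged[-1]
            # Extend the current composite visit to cover this encounter.
            merged[-1] = (last_pid, last_start, max(last_end, end))
        else:
            merged.append((pid, start, end))
    return merged


# Synthetic atomic encounters for two hypothetical patients.
atomic = [
    ("p1", date(2021, 3, 1), date(2021, 3, 4)),
    ("p1", date(2021, 3, 2), date(2021, 3, 2)),   # nested lab visit
    ("p1", date(2021, 3, 4), date(2021, 3, 6)),   # touches previous range
    ("p1", date(2021, 5, 10), date(2021, 5, 10)),  # separate visit
    ("p2", date(2021, 3, 3), date(2021, 3, 5)),
]
macrovisits = merge_encounters(atomic)
for pid, start, end in macrovisits:
    print(pid, start, end, f"LOS = {(end - start).days} days")
```

Under this assumption, three fragmented records for the first patient collapse into a single six-day span, which illustrates how aggregation can reduce the spread of per-encounter length of stay.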