ABSTRACT
MOTIVATION: Data is increasingly used for improvement and research in public health, especially administrative data such as that collected in electronic health records. Patients enter and exit these typically open-cohort datasets non-uniformly; as a result, even simple questions about incidence and prevalence become time-consuming to answer, and results vary unnecessarily between analyses. We therefore developed methods to automate analysis of incidence and prevalence in open cohort datasets, to improve transparency, productivity and reproducibility of analyses. IMPLEMENTATION: We provide both a code-free set of rules for incidence and prevalence that can be applied to any open cohort, and a Python Command Line Interface implementation of these rules requiring Python 3.9 or later. GENERAL FEATURES: The Command Line Interface is used to calculate incidence and point prevalence time series from open cohort data. The ruleset can be used in developing other implementations or can be rearranged to form other analytical questions such as period prevalence. AVAILABILITY: The command line interface is freely available from https://github.com/THINKINGGroup/analogy_publication .
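To illustrate why entry and exit dates matter in an open cohort, the following is a minimal Python sketch of point prevalence and an incidence rate. It does not reproduce the published ruleset or the command line interface: the names `Patient`, `point_prevalence` and `incidence_rate` are hypothetical, and the conventions used (inclusive registration dates, person-years of 365.25 days, patients diagnosed before the window excluded from the population at risk) are assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


@dataclass
class Patient:
    """Hypothetical record for one patient in an open cohort."""
    entry: date                       # first day of registration
    exit: date                        # last day of registration
    diagnosis: Optional[date] = None  # first recorded diagnosis, if any


def point_prevalence(patients: Iterable[Patient], on: date) -> Optional[float]:
    """Proportion of patients registered on `on` who were diagnosed on or before it."""
    registered = [p for p in patients if p.entry <= on <= p.exit]
    if not registered:
        return None  # nobody under observation on that date
    cases = sum(1 for p in registered
                if p.diagnosis is not None and p.diagnosis <= on)
    return cases / len(registered)


def incidence_rate(patients: Iterable[Patient],
                   start: date, end: date) -> Optional[float]:
    """New diagnoses per person-year at risk within [start, end] (assumed rule)."""
    events = 0
    days_at_risk = 0
    for p in patients:
        # Time at risk is limited to the overlap of registration and the window.
        risk_start = max(p.entry, start)
        risk_end = min(p.exit, end)
        if p.diagnosis is not None:
            if p.diagnosis < risk_start:
                continue  # already a prevalent case, so not at risk
            # Time at risk stops at the diagnosis date.
            risk_end = min(risk_end, p.diagnosis)
        if risk_end < risk_start:
            continue  # no overlap between registration and the window
        days_at_risk += (risk_end - risk_start).days + 1
        if p.diagnosis is not None and p.diagnosis <= risk_end:
            events += 1
    if days_at_risk == 0:
        return None
    return events / (days_at_risk / 365.25)
```

Calling `point_prevalence` for the first day of each month, or `incidence_rate` for each calendar year, would give a simple time series; the denominator changes from period to period because patients join and leave the cohort.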
Subject(s)
Electronic Health Records, Humans, Prevalence, Incidence, Cohort Studies, Electronic Health Records/statistics & numerical data, Software, Reproducibility of Results
ABSTRACT
BACKGROUND: Female reproductive factors are gaining prominence as factors that enhance cardiovascular disease (CVD) risk; nonetheless, menstrual cycle characteristics are under-recognized as a factor associated with CVD. Additionally, there are limited data from the UK pertaining to menstrual cycle characteristics and CVD risk. METHODS: A UK retrospective cohort study (1995-2021) was conducted using data from a nationwide database (The Health Improvement Network). Women aged 18-40 years at index date were included. 252,325 women with a history of abnormal menstruation were matched with up to two controls. Two exposures were examined: regularity and frequency of menstrual cycles; participants were assigned accordingly to one of two separate cohorts. The primary outcome was composite CVD. Secondary outcomes were ischemic heart disease (IHD), cerebrovascular disease, heart failure (HF), hypertension, and type 2 diabetes mellitus (T2DM). Cox proportional hazards regression models were used to derive adjusted hazard ratios (aHRs) of cardiometabolic outcomes in women in the exposed groups compared with matched controls. RESULTS: During 26 years of follow-up, 20,605 cardiometabolic events occurred in 704,743 patients. Compared to women with regular menstrual cycles, the aHRs (95% CI) for cardiometabolic outcomes in women with irregular menstrual cycles were as follows: composite CVD 1.08 (1.00-1.19), IHD 1.18 (1.01-1.37), cerebrovascular disease 1.04 (0.92-1.17), HF 1.30 (1.02-1.65), hypertension 1.07 (1.03-1.11), T2DM 1.37 (1.29-1.45). The aHRs comparing frequent or infrequent menstrual cycles to menstrual cycles of normal frequency were as follows: composite CVD 1.24 (1.02-1.52), IHD 1.13 (0.81-1.57), cerebrovascular disease 1.43 (1.10-1.87), HF 0.99 (0.57-1.75), hypertension 1.31 (1.21-1.43), T2DM 1.74 (1.52-1.98).
CONCLUSIONS: A history of either menstrual cycle irregularity or frequent or infrequent cycles was associated with an increased risk of cardiometabolic outcomes in later life. Menstrual history may be a useful tool in identifying women eligible for periodic assessment of their cardiometabolic health.