Results 1 - 20 of 106

1.
Am J Epidemiol ; 2024 Apr 06.
Article in English | MEDLINE | ID: mdl-38583932

ABSTRACT

Administrative claims databases often do not capture date or fact of death, so studies using these data may inappropriately treat death as a censoring event (equivalent to other withdrawal reasons) rather than a competing event. We examined 1-, 3-, and 5-year inverse-probability-of-treatment-weighted cumulative risks of a composite cardiovascular outcome among 34,527 initiators of telmisartan (exposure) and ramipril (referent) ages ≥55 in Optum claims from 2003 to 2020. Differences in cumulative risks of the cardiovascular endpoint due to censoring of death (cause-specific), as compared to treating death as a competing event (sub-distribution), increased with greater follow-up time and older age, where event and mortality risks were higher. Among ramipril users (selected results), 5-year cause-specific and sub-distribution cumulative risk estimates per 100, respectively, were 16.4 (95% CI 15.3, 17.5) and 16.2 (95% CI 15.1, 17.3) among ages 55-64 (difference=0.2) and were 43.2 (95% CI 41.3, 45.2) and 39.7 (95% CI 37.9, 41.4) among ages ≥75 (difference=3.6). Plasmode simulation results showed that the differences in cause-specific versus sub-distribution cumulative risks increased with increasing mortality rate. We suggest researchers consider the cohort's baseline mortality risk when deciding whether real-world data with incomplete death information can be used without concern.
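
The contrast described in record 1 can be made concrete with a toy calculation. The sketch below is not the study's analysis: it uses made-up follow-up times, no treatment weights, and a hypothetical helper name (`cumulative_risks`). It computes the cumulative risk of the event of interest twice, once as 1 minus Kaplan-Meier with deaths censored and once with the Aalen-Johansen estimator, which treats death as a competing event.

```python
def cumulative_risks(records):
    """Return (risk with deaths censored, risk with death as a competing event).

    records: list of (time, status) with status 0 = censored,
    1 = event of interest, 2 = death. Toy illustration only.
    """
    km_surv = 1.0       # Kaplan-Meier survival for the event, deaths censored
    event_free = 1.0    # probability of being alive and event-free so far
    aj_risk = 0.0       # Aalen-Johansen cumulative incidence of the event
    for t in sorted({time for time, status in records if status != 0}):
        at_risk = sum(1 for time, _ in records if time >= t)
        events = sum(1 for time, s in records if time == t and s == 1)
        deaths = sum(1 for time, s in records if time == t and s == 2)
        aj_risk += event_free * events / at_risk
        event_free *= 1 - (events + deaths) / at_risk
        km_surv *= 1 - events / at_risk
    return 1 - km_surv, aj_risk


# Hypothetical follow-up for 10 people (not data from the study).
toy = [(1, 1), (2, 2), (3, 1), (4, 2), (5, 0),
       (6, 1), (7, 2), (8, 0), (9, 1), (10, 0)]
censor_death_risk, competing_risk = cumulative_risks(toy)
print(round(censor_death_risk, 3), round(competing_risk, 3))  # 0.685 0.5
```

In this toy cohort three of ten people die, so censoring deaths gives a noticeably larger risk estimate than the competing-event approach. With few deaths the two estimates would nearly coincide.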

2.
J Nutr ; 154(2): 680-690, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38122847

ABSTRACT

BACKGROUND: The periconceptional period is a critical window for the origins of adverse pregnancy and birth outcomes, yet little is known about the dietary patterns that promote perinatal health. OBJECTIVE: We used machine learning methods to determine the effect of periconceptional dietary patterns on risk of preeclampsia, gestational diabetes, preterm birth, small-for-gestational-age (SGA) birth, and a composite of these outcomes. METHODS: We used data from 8259 participants in the Nulliparous Pregnancy Outcomes Study: Monitoring Mothers-to-Be (8 US medical centers, 2010‒2013). Usual daily periconceptional intake of 82 food groups was estimated from a food frequency questionnaire. We used k-means clustering with a Euclidean distance metric to identify dietary patterns. We estimated the effect of dietary patterns on each perinatal outcome using targeted maximum likelihood estimation and an ensemble of machine learning algorithms, adjusting for confounders including health behaviors and psychological, neighborhood, and sociodemographic factors. RESULTS: The 4 dietary patterns that emerged from our data were identified as "Sandwiches and snacks" (34% of the sample); "High fat, sugar, and sodium" (29%); "Beverages, refined grains, and mixed dishes" (21%); and "High fruits, vegetables, whole grains, and plant proteins" (16%). One-quarter of pregnancies had preeclampsia (8% incidence), gestational diabetes (5%), preterm birth (8%), or SGA birth (8%). Compared with the "High fat, sugar, and sodium" pattern, there were 3.3 to 4.3 fewer cases of the composite adverse outcome per 100 pregnancies among participants following the "Beverages, refined grains and mixed dishes" pattern (risk difference -0.043; 95% confidence interval -0.078, -0.009), "High fruits, vegetables, whole grains and plant proteins" pattern (-0.041; 95% confidence interval -0.078, -0.004), and "Sandwiches and snacks" pattern (-0.033; 95% confidence interval -0.065, -0.002). CONCLUSIONS: Our results highlight that there are a variety of periconceptional dietary patterns that are associated with perinatal health and reinforce the negative health implications of diets high in fat, sugars, and sodium.


Subject(s)
Diabetes, Gestational , Pre-Eclampsia , Premature Birth , Pregnancy , Female , Infant, Newborn , Humans , Premature Birth/epidemiology , Diabetes, Gestational/epidemiology , Dietary Patterns , Pre-Eclampsia/epidemiology , Pregnancy Outcome , Diet/adverse effects , Vegetables , Fetal Growth Retardation , Sodium , Sugars , Plant Proteins
3.
Acta Haematol ; 2024 May 10.
Article in English | MEDLINE | ID: mdl-38735288

ABSTRACT

INTRODUCTION: Most multiple myeloma (MM) patients experience cytopenias, likely driven by both disease and treatment-related factors. Immunomodulatory agents (IMiDs), which form the backbone of most anti-myeloma regimens, are known to cause higher grade cytopenias. In this context, the impact of sequential IMiD treatments on cytopenia risk is unknown. METHODS: We evaluated the cumulative risks of severe cytopenias following second line of therapy (LOT) initiation in 5573 MM patients in the Flatiron Health database. Patients for whom both LOTs 1 and 2 contained IMiDs were considered "sequentially exposed"; those for whom neither contained IMiDs were "never exposed." RESULTS: For the neutropenia outcome, compared to the never exposed, the sequentially exposed had the highest 1-year risk (risk difference [RD] 12%), followed by those only recently exposed during LOT 2 (RD 8%), then by those with only past exposure during LOT 1 (RD 5%). A similar pattern was observed for leukopenia, but no meaningful differences were observed for anemia or thrombocytopenia. The associations between sequential exposure, versus never, with neutropenia and leukopenia were even stronger among those with a recent cytopenia history. CONCLUSION: Results suggest that sequential exposure to IMiDs is a risk factor for higher grade cytopenias. These findings have profound clinical implications in choosing newer LOTs with potential risks of cytopenia.

4.
Am J Epidemiol ; 192(1): 102-110, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36124667

ABSTRACT

Inverse probability weighting (IPW) and g-computation are commonly used in time-varying analyses. To inform decisions on which to use, we compared these methods using a plasmode simulation based on data from the Effects of Aspirin in Gestation and Reproduction (EAGeR) Trial (June 15, 2007-July 15, 2011). In our main analysis, we simulated a cohort study of 1,226 individuals followed for up to 10 weeks. The exposure was weekly exercise, and the outcome was time to pregnancy. We controlled for 6 confounding factors: 4 baseline confounders (race, ever smoking, age, and body mass index) and 2 time-varying confounders (compliance with assigned treatment and nausea). We sought to estimate the average causal risk difference by 10 weeks, using IPW and g-computation implemented using a Monte Carlo estimator and iterated conditional expectations (ICE). Across 500 simulations, we compared the bias, empirical standard error (ESE), average standard error, standard error ratio, and 95% confidence interval coverage of each approach. IPW (bias = 0.02; ESE = 0.04; coverage = 92.6%) and Monte Carlo g-computation (bias = -0.01; ESE = 0.03; coverage = 94.2%) performed similarly. ICE g-computation was the least biased but least precise estimator (bias = 0.01; ESE = 0.06; coverage = 93.4%). When choosing an estimator, one should consider factors like the research question, the prevalences of the exposure and outcome, and the number of time points being analyzed.


Subject(s)
Cohort Studies , Humans , Probability , Computer Simulation , Survival Analysis , Bias
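
The two estimator families compared in record 4 can be sketched in a much simpler setting than the paper's. The example below assumes a single time point, one binary confounder, and made-up counts. The function names are hypothetical, and the Monte Carlo and ICE variants for time-varying data are not shown. With saturated models like these, g-computation and IPW return the same risk difference. They can diverge once models are fitted parametrically over several time points.

```python
# Hypothetical counts: {(confounder, exposure): (n, outcomes)}.
counts = {(0, 1): (20, 4), (0, 0): (80, 8),
          (1, 1): (60, 30), (1, 0): (40, 12)}
rows = [(c, a, y)
        for (c, a), (n, d) in counts.items()
        for y in [1] * d + [0] * (n - d)]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def g_computation(rows):
    """Standardize stratum-specific risks to the confounder distribution."""
    def predicted_risk(a, c):
        return mean(y for ci, ai, y in rows if ci == c and ai == a)
    risk1 = mean(predicted_risk(1, c) for c, _, _ in rows)
    risk0 = mean(predicted_risk(0, c) for c, _, _ in rows)
    return risk1 - risk0


def ipw(rows):
    """Weight each person by the inverse probability of their own exposure."""
    def propensity(c):
        return mean(a for ci, a, _ in rows if ci == c)
    n = len(rows)
    risk1 = sum(a * y / propensity(c) for c, a, y in rows) / n
    risk0 = sum((1 - a) * y / (1 - propensity(c)) for c, a, y in rows) / n
    return risk1 - risk0


rd_g = g_computation(rows)
rd_ipw = ipw(rows)
print(round(rd_g, 3), round(rd_ipw, 3))  # 0.15 0.15
```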
5.
Epidemiology ; 34(1): 38-44, 2023 01 01.
Article in English | MEDLINE | ID: mdl-36455245

ABSTRACT

BACKGROUND: In many research settings, the intervention implied by the average causal effect of a time-varying exposure is impractical or unrealistic, and we might instead prefer a more realistic target estimand. Instead of requiring all individuals to be always exposed versus unexposed, incremental effects quantify the impact of merely shifting each individual's probability of being exposed. METHODS: We demonstrate the estimation of incremental effects in the time-varying setting, using data from the Effects of Aspirin in Gestation and Reproduction trial, which assessed the effect of preconception low-dose aspirin on pregnancy outcomes. Compliance to aspirin or placebo was summarized weekly and was affected by time-varying confounders such as bleeding or nausea. We sought to estimate what the incidence of pregnancy by 26 weeks postrandomization would have been if we shifted each participant's probability of taking aspirin or placebo each week by odds ratios (OR) between 0.30 and 3.00. RESULTS: Under no intervention (OR = 1), the incidence of pregnancy was 77% (95% CI: 74%, 80%). Decreasing women's probability of complying with aspirin had little estimated effect on pregnancy incidence. When we increased women's probability of taking aspirin, estimated incidence of pregnancy increased, from 83% (95% confidence interval [CI] = 79%, 87%) for OR = 2 to 89% (95% CI = 84%, 93%) for OR=3. We observed similar results when we shifted women's probability of complying with a placebo. CONCLUSIONS: These results estimated that realistic interventions to increase women's probability of taking aspirin would have yielded little to no impact on the incidence of pregnancy, relative to similar interventions on placebo.


Subject(s)
Aspirin , Nausea , Pregnancy , Humans , Female , Incidence , Odds Ratio , Aspirin/therapeutic use , Probability
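
One common way to formalize the shift described in record 5 is to multiply each person's odds of exposure by a fixed odds ratio. The sketch below shows only that mapping for a single probability. The function names are hypothetical, and the estimation across weeks used in the paper is not reproduced.

```python
def shift_propensity(pi, odds_ratio):
    """Propensity after multiplying the odds of exposure by odds_ratio."""
    return odds_ratio * pi / (odds_ratio * pi + 1 - pi)


def odds(p):
    return p / (1 - p)


# The endpoints of the range in the abstract (0.30 to 3.00), plus no shift.
for odds_ratio in (0.30, 1.0, 3.00):
    print(odds_ratio, round(shift_propensity(0.5, odds_ratio), 3))
```

A probability of exactly 0 or 1 is left unchanged by this mapping. The shifted intervention therefore never asks for an exposure level that could not occur for a given person.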
6.
J Nutr ; 153(8): 2369-2379, 2023 08.
Article in English | MEDLINE | ID: mdl-37271415

ABSTRACT

BACKGROUND: Racism is a key determinant of perinatal health disparities. Poor diet may contribute to this effect, but research on racism and dietary patterns is limited. OBJECTIVE: We aimed to describe the relation between experiences of racial discrimination and adherence to the 2015‒2020 Dietary Guidelines for Americans. METHODS: We used data from a prospective pregnancy cohort study conducted at 8 United States medical centers (2010‒2013). At 6‒13 weeks of gestation, 10,038 nulliparous people with singleton pregnancies were enrolled. Participants completed a Block food frequency questionnaire, assessing usual diet in the 3 mo around conception, and the Krieger Experiences of Discrimination Scale, assessing the number of situational domains (e.g., at school and on the street) in which participants ever experienced racial discrimination. Alignment of dietary intake with the 2015-2020 Dietary Guidelines for Americans was assessed using the Healthy Eating Index (HEI)-2015. RESULTS: The study showed that 49%, 44%, 35%, and 17% of the Asian, Black, Hispanic, and White participants reported experiences of racial discrimination in any domain. Most participants experienced discrimination in 1 or 2 situational domains. There were no meaningful differences in HEI-2015 total or component scores in any racial or ethnic group according to count of self-reported domains in which individuals experienced discrimination. For example, mean total scores were 57‒59 among Black, 61‒66 among White, 61‒63 among Hispanic, and 66‒69 among Asian participants across the count of racial discrimination domains. CONCLUSIONS: This null association stresses the importance of going beyond interpersonal racial discrimination to consider the institutions, systems, and practices affecting racialized people to eliminate persistent inequalities in diet and perinatal health.


Subject(s)
Racism , Female , Pregnancy , Humans , United States , Cohort Studies , Prospective Studies , Ethnicity , Diet
7.
Am J Epidemiol ; 191(11): 1962-1969, 2022 10 20.
Article in English | MEDLINE | ID: mdl-35896793

ABSTRACT

There are important challenges to the estimation and identification of average causal effects in longitudinal data with time-varying exposures. Here, we discuss the difficulty in meeting the positivity condition. Our motivating example is the per-protocol analysis of the Effects of Aspirin in Gestation and Reproduction (EAGeR) Trial. We estimated the average causal effect comparing the incidence of pregnancy by 26 weeks that would have occurred if all women had been assigned to aspirin and complied versus the incidence if all women had been assigned to placebo and complied. Using flexible targeted minimum loss-based estimation, we estimated a risk difference of 1.27% (95% CI: -9.83, 12.38). Using a less flexible inverse probability weighting approach, the risk difference was 5.77% (95% CI: -1.13, 13.05). However, the cumulative probability of compliance conditional on covariates approached 0 as follow-up accrued, indicating a practical violation of the positivity assumption, which limited our ability to make causal interpretations. The effects of nonpositivity were more apparent when using a more flexible estimator, as indicated by the greater imprecision. When faced with nonpositivity, one can use a flexible approach and be transparent about the uncertainty, use a parametric approach and smooth over gaps in the data, or target a different estimand that will be less vulnerable to positivity violations.


Subject(s)
Aspirin , Models, Statistical , Pregnancy , Female , Humans , Causality , Probability , Incidence
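
The positivity problem in record 7 comes from multiplying many conditional probabilities together. As a rough illustration only, the sketch below assumes a made-up 90% chance of complying in each of 26 weeks. These are not the trial's estimates.

```python
import math

weekly_compliance = [0.90] * 26   # hypothetical conditional probabilities
cumulative = math.prod(weekly_compliance)   # chance of complying every week
weight = 1 / cumulative                     # matching inverse probability weight
print(round(cumulative, 4), round(weight, 1))
```

Even with high weekly compliance, the cumulative probability falls to roughly 6% here, so a few always-compliant participants would carry large weights.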
8.
Am J Epidemiol ; 191(2): 341-348, 2022 01 24.
Article in English | MEDLINE | ID: mdl-34643230

ABSTRACT

The average causal effect compares counterfactual outcomes if everyone had been exposed versus if everyone had been unexposed, which can be an unrealistic contrast. Alternatively, we can target effects that compare counterfactual outcomes against the factual outcomes observed in the sample (i.e., we can compare against the natural course). Here, we demonstrate how the natural course can be estimated and used in causal analyses for model validation and effect estimation. Our example is an analysis assessing the impact of taking aspirin on pregnancy, 26 weeks after randomization, in the Effects of Aspirin in Gestation and Reproduction trial (United States, 2006-2012). To validate our models, we estimated the natural course using g-computation and then compared that against the observed incidence of pregnancy. We observed good agreement between the observed and model-based natural courses. We then estimated an effect that compared the natural course against the scenario in which participants assigned to aspirin always complied. If participants had always complied, there would have been 5.0 (95% confidence interval: 2.2, 7.8) more pregnancies per 100 women than was observed. It is good practice to estimate the natural course for model validation when using parametric models, but whether one should estimate a natural course contrast depends on the underlying research questions.


Subject(s)
Causality , Models, Theoretical , Pregnancy Complications/epidemiology , Adult , Aspirin/therapeutic use , Female , Humans , Pregnancy , Pregnancy Complications/prevention & control , Randomized Controlled Trials as Topic , Reproducibility of Results
9.
Am J Epidemiol ; 191(1): 126-136, 2022 01 01.
Article in English | MEDLINE | ID: mdl-34343230

ABSTRACT

Severe maternal morbidity (SMM) affects 50,000 women annually in the United States, but its consequences are not well understood. We aimed to estimate the association between SMM and risk of adverse cardiovascular events during the 2 years postpartum. We analyzed 137,140 deliveries covered by the Pennsylvania Medicaid program (2016-2018), weighted with inverse probability of censoring weights to account for nonrandom loss to follow-up. SMM was defined as any diagnosis on the Centers for Disease Control and Prevention list of SMM diagnoses and procedures and/or intensive care unit admission occurring at any point from conception through 42 days postdelivery. Outcomes included heart failure, ischemic heart disease, and stroke/transient ischemic attack up to 2 years postpartum. We used marginal standardization to estimate average treatment effects. We found that SMM was associated with increased risk of each adverse cardiovascular event across the follow-up period. Per 1,000 deliveries, relative to no SMM, SMM was associated with 12.1 (95% confidence interval (CI): 6.2, 18.0) excess cases of heart failure, 6.4 (95% CI: 1.7, 11.2) excess cases of ischemic heart disease, and 8.2 (95% CI: 3.2, 13.1) excess cases of stroke/transient ischemic attack at 26 months of follow-up. These results suggest that SMM identifies a group of women who are at high risk of adverse cardiovascular events after delivery. Women who survive SMM may benefit from more comprehensive postpartum care linked to well-woman care.


Subject(s)
Cardiovascular Diseases/epidemiology , Maternal Health/statistics & numerical data , Medicaid/statistics & numerical data , Pregnancy Complications/epidemiology , Adult , Female , Humans , Pennsylvania , Pregnancy , Retrospective Studies , Risk Factors , United States/epidemiology , Young Adult
10.
Am J Epidemiol ; 191(1): 198-207, 2022 01 01.
Article in English | MEDLINE | ID: mdl-34409985

ABSTRACT

Effect measure modification is often evaluated using parametric models. These models, although efficient when correctly specified, make strong parametric assumptions. While nonparametric models avoid important functional form assumptions, they often require larger samples to achieve a given accuracy. We conducted a simulation study to evaluate performance tradeoffs between correctly specified parametric and nonparametric models to detect effect modification of a binary exposure by both binary and continuous modifiers. We evaluated generalized linear models and doubly robust (DR) estimators, with and without sample splitting. Continuous modifiers were modeled with cubic splines, fractional polynomials, and nonparametric DR-learner. For binary modifiers, generalized linear models showed the greatest power to detect effect modification, ranging from 0.42 to 1.00 in the worst and best scenario, respectively. Augmented inverse probability weighting had the lowest power, with an increase of 23% when using sample splitting. For continuous modifiers, the DR-learner was comparable to flexible parametric models in capturing quadratic and nonlinear monotonic functions. However, for nonlinear, nonmonotonic functions, the DR-learner had lower integrated bias than splines and fractional polynomials, with values of 141.3, 251.7, and 209.0, respectively. Our findings suggest comparable performance between nonparametric and correctly specified parametric models in evaluating effect modification.


Subject(s)
Epidemiologic Methods , Models, Statistical , Computer Simulation , Data Interpretation, Statistical , Humans
11.
Am J Epidemiol ; 191(8): 1396-1406, 2022 07 23.
Article in English | MEDLINE | ID: mdl-35355047

ABSTRACT

The Dietary Guidelines for Americans rely on summaries of the effect of dietary pattern on disease risk, independent of other population characteristics. We explored the modifying effect of prepregnancy body mass index (BMI; weight (kg)/height (m)²) on the relationship between fruit and vegetable density (cup-equivalents/1,000 kcal) and preeclampsia using data from a pregnancy cohort study conducted at 8 US medical centers (n = 9,412; 2010-2013). Usual daily periconceptional intake of total fruits and total vegetables was estimated from a food frequency questionnaire. We quantified the effects of diets with a high density of fruits (≥1.2 cups/1,000 kcal/day vs. <1.2 cups/1,000 kcal/day) and vegetables (≥1.3 cups/1,000 kcal/day vs. <1.3 cups/1,000 kcal/day) on preeclampsia risk, conditional on BMI, using a doubly robust estimator implemented in 2 stages. We found that the protective association of higher fruit density increased approximately linearly from a BMI of 20 to a BMI of 32, by 0.25 cases per 100 women per BMI unit, and then flattened. The protective association of higher vegetable density strengthened in a linear fashion, by 0.3 cases per 100 women for every unit increase in BMI, up to a BMI of 30, where it plateaued. Dietary patterns with a high periconceptional density of fruits and vegetables appear more protective against preeclampsia for women with higher BMI than for leaner women.


Subject(s)
Fruit , Pre-Eclampsia , Body Mass Index , Cohort Studies , Diet , Female , Humans , Machine Learning , Pre-Eclampsia/epidemiology , Pregnancy , Vegetables
12.
Epidemiology ; 33(1): 95-104, 2022 01 01.
Article in English | MEDLINE | ID: mdl-34711736

ABSTRACT

BACKGROUND: Severe maternal morbidity (SMM) is an important maternal health indicator, but existing tools to identify SMM have substantial limitations. Our objective was to retrospectively identify true SMM status using ensemble machine learning in a hospital database and to compare machine learning algorithm performance with existing tools for SMM identification. METHODS: We screened all deliveries occurring at Magee-Womens Hospital, Pittsburgh, PA (2010-2011 and 2013-2017) using the Centers for Disease Control and Prevention list of diagnoses and procedures for SMM, intensive care unit admission, and/or prolonged postpartum length of stay. We performed a detailed medical record review to confirm case status. We trained ensemble machine learning (SuperLearner) algorithms, which "stack" predictions from multiple algorithms to obtain optimal predictions, on 171 SMM cases and 506 non-cases from 2010 to 2011, then evaluated the performance of these algorithms on 160 SMM cases and 337 non-cases from 2013 to 2017. RESULTS: Some SuperLearner algorithms performed better than existing screening criteria in terms of positive predictive value (0.77 vs. 0.64, respectively) and balanced accuracy (0.99 vs. 0.86, respectively). However, they did not perform as well as the screening criteria in terms of true-positive detection rate (0.008 vs. 0.32, respectively) and performed similarly in terms of negative predictive value. The most important predictor variables were intensive care unit admission and prolonged postpartum length of stay. CONCLUSIONS: Ensemble machine learning did not globally improve the ascertainment of true SMM cases. Our results suggest that accurate identification of SMM likely will remain a challenge in the absence of a universal definition of SMM or national obstetric surveillance systems.


Subject(s)
Maternal Health , Postpartum Period , Female , Humans , Machine Learning , Morbidity , Pregnancy , Retrospective Studies , Risk Factors
13.
J Nutr ; 152(8): 1886-1894, 2022 08 09.
Article in English | MEDLINE | ID: mdl-35641231

ABSTRACT

BACKGROUND: Adherence to the Dietary Guidelines for Americans is often assessed using the Healthy Eating Index (HEI). The HEI total score reflects overall diet quality, with all aspects equally important. Using the traditional weighting scheme for the HEI, all components are generally weighted equally in the total score. However, there is limited empirical basis for applying the traditional weighting for pregnancy specifically. OBJECTIVES: We aimed to assess associations between the 12 HEI-2010 component scores and select pregnancy outcomes. METHODS: The Nulliparous Pregnancy Outcomes Study: Monitoring Mothers-to-Be was a prospective pregnancy cohort (US multicenter, 2010-2013). Participants enrolled in the study between 6 and 13 weeks of gestation. An FFQ assessed usual dietary intake 3 months prior to pregnancy (n = 7880). Scores for the HEI-2010 components were assigned using prespecified standards based on densities (standard units per 1000 kcal) of relevant food groups for most components, a ratio (PUFAs and MUFAs to SFAs) for fatty acids, and the contribution to total energy for empty calories. Using binomial regression, we estimated risk differences between each component score and cases of small-for-gestational age (SGA) birth, preterm birth, preeclampsia, and gestational diabetes, controlling for total energy and scores for the other HEI-2010 components. RESULTS: Higher scores for greens and beans and total vegetables were associated with fewer cases of SGA birth, preterm birth, and preeclampsia. For instance, every 1-unit increase in the greens and beans score was associated with 1.2 fewer SGA infants (95% CI, 0.7-1.7), 0.7 fewer preterm births (95% CI, 0.3-1.1), and 0.7 fewer preeclampsia cases (95% CI, 0.2-1.1) per 100 deliveries. For gestational diabetes, the associations were null. CONCLUSIONS: Vegetable-rich diets were associated with fewer cases of SGA birth, preterm birth, and preeclampsia, controlling for overall diet quality. Examination of the equal weighting of the HEI components (and underlying guidance) is needed for pregnancy.


Subject(s)
Diabetes, Gestational , Pre-Eclampsia , Premature Birth , Diabetes, Gestational/epidemiology , Diet , Diet, Healthy , Female , Humans , Infant, Newborn , Pre-Eclampsia/epidemiology , Pre-Eclampsia/prevention & control , Pregnancy , Premature Birth/epidemiology , Prospective Studies , United States , Vegetables
14.
Ann Intern Med ; 174(5): 595-601, 2021 05.
Article in English | MEDLINE | ID: mdl-33493011

ABSTRACT

BACKGROUND: A previous large randomized trial indicated that preconception-initiated low-dose aspirin (LDA) therapy did not have a positive effect on pregnancy outcomes. However, this trial was subject to nonadherence, which was not taken into account by the intention-to-treat approach. OBJECTIVE: To estimate per protocol effects of preconception-initiated LDA on pregnancy loss and live birth. DESIGN: The EAGeR (Effects of Aspirin in Gestation and Reproduction) trial was used to construct a prospective cohort for a post hoc analysis. (ClinicalTrials.gov: NCT00467363). SETTING: 4 university medical centers in the United States. PARTICIPANTS: 1227 women between the ages of 18 and 40 years who had 1 or 2 previous pregnancy losses and were attempting pregnancy. MEASUREMENTS: Adherence to LDA or placebo, assessed by measuring pill bottle weights at regular intervals during follow-up. Primary outcomes were human chorionic gonadotropin (hCG)-detected pregnancies, pregnancy losses, and live births, determined by pregnancy tests and medical records. RESULTS: Relative to placebo, adhering to LDA for 5 of 7 days per week led to 8 more hCG-detected pregnancies (95% CI, 4.64 to 10.96 pregnancies), 15 more live births (CI, 7.65 to 21.15 births), and 6 fewer pregnancy losses (CI, -12.00 to -0.20 losses) for every 100 women in the trial. In addition, compared with placebo, postconception initiation of LDA therapy led to a reduction in the estimated effects. Furthermore, effects were obtained in a minimum of 4 of 7 days per week. LIMITATION: The EAGeR trial data for this study were analyzed as observational data, thus are subject to the limitations of prospective observational studies. CONCLUSION: Per protocol results suggest that preconception use of LDA at least 4 days per week may improve reproductive outcomes for women who have had 1 or 2 pregnancy losses. Increasing adherence to daily LDA seems to be key to improving effectiveness. PRIMARY FUNDING SOURCE: National Institutes of Health.


Subject(s)
Anti-Inflammatory Agents, Non-Steroidal/administration & dosage , Aspirin/administration & dosage , Chorionic Gonadotropin/blood , Preconception Care/methods , Pregnancy Outcome , Abortion, Spontaneous , Adult , Female , Humans , Live Birth , Medication Adherence , Pregnancy , United States
15.
Am J Epidemiol ; 2021 Jul 15.
Article in English | MEDLINE | ID: mdl-34268558

ABSTRACT

Unlike parametric regression, machine learning (ML) methods do not generally require precise knowledge of the true data generating mechanisms. As such, numerous authors have advocated for ML methods to estimate causal effects. Unfortunately, ML algorithms can perform worse than parametric regression. We demonstrate the performance of ML-based single- and double-robust estimators. We use 100 Monte Carlo samples with sample sizes of 200, 1200, and 5000 to investigate bias and confidence interval coverage under several scenarios. In a simple confounding scenario, confounders were related to the treatment and the outcome via parametric models. In a complex confounding scenario, the simple confounders were transformed to induce complicated nonlinear relationships. In the simple scenario, when ML algorithms were used, double-robust estimators were superior to single-robust estimators. In the complex scenario, single-robust estimators with ML algorithms were at least as biased as estimators using misspecified parametric models. Double-robust estimators were less biased, but coverage was well below nominal. The use of sample splitting, inclusion of confounder interactions, reliance on a richly specified ML algorithm, and use of doubly robust estimators was the only explored approach that yielded negligible bias and nominal coverage. Our results suggest that ML-based singly robust methods should be avoided.

16.
Am J Epidemiol ; 190(12): 2690-2699, 2021 12 01.
Article in English | MEDLINE | ID: mdl-34268567

ABSTRACT

An increasing number of recent studies have suggested that doubly robust estimators with cross-fitting should be used when estimating causal effects with machine learning methods. However, not all existing programs that implement doubly robust estimators support machine learning methods and cross-fitting, or provide estimates on multiplicative scales. To address these needs, we developed AIPW, a software package implementing augmented inverse probability weighting (AIPW) estimation of average causal effects in R (R Foundation for Statistical Computing, Vienna, Austria). Key features of the AIPW package include cross-fitting and flexible covariate adjustment for observational studies and randomized controlled trials (RCTs). In this paper, we use a simulated RCT to illustrate implementation of the AIPW estimator. We also perform a simulation study to evaluate the performance of the AIPW package compared with other doubly robust implementations, including CausalGAM, npcausal, tmle, and tmle3. Our simulation showed that the AIPW package yields performance comparable to that of other programs. Furthermore, we also found that cross-fitting substantively decreases the bias and improves the confidence interval coverage for doubly robust estimators fitted with machine learning algorithms. Our findings suggest that the AIPW package can be a useful tool for estimating average causal effects with machine learning methods in RCTs and observational studies.


Subject(s)
Causality , Data Interpretation, Statistical , Machine Learning , Software Design , Bias , Computer Simulation , Humans , Observational Studies as Topic , Randomized Controlled Trials as Topic
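
The estimator named in record 16 can be written out in a few lines for a binary point exposure. The sketch below is a plain Python illustration of the AIPW formula on made-up counts. It is not the interface of the AIPW R package, and it uses neither machine learning nor cross-fitting. All names are hypothetical. It also illustrates double robustness: here the estimate is unchanged when either the outcome model or the propensity model is deliberately wrong, as long as the other one is right.

```python
# Hypothetical counts: {(confounder, exposure): (n, outcomes)}.
counts = {(0, 1): (20, 4), (0, 0): (80, 8),
          (1, 1): (60, 30), (1, 0): (40, 12)}
rows = [(c, a, y)
        for (c, a), (n, d) in counts.items()
        for y in [1] * d + [0] * (n - d)]


def aipw_risk_difference(rows, propensity, outcome):
    """AIPW estimate of the risk difference for a binary point exposure.

    propensity(c) -> P(A=1 | C=c); outcome(a, c) -> E[Y | A=a, C=c].
    """
    total = 0.0
    for c, a, y in rows:
        pi = propensity(c)
        m1, m0 = outcome(1, c), outcome(0, c)
        psi1 = a * (y - m1) / pi + m1
        psi0 = (1 - a) * (y - m0) / (1 - pi) + m0
        total += psi1 - psi0
    return total / len(rows)


# Nuisance values read directly off the toy counts.
fitted_propensity = {0: 0.2, 1: 0.6}
fitted_outcome = {(1, 0): 0.2, (0, 0): 0.1, (1, 1): 0.5, (0, 1): 0.3}

both_right = aipw_risk_difference(
    rows, fitted_propensity.get, lambda a, c: fitted_outcome[(a, c)])
wrong_outcome = aipw_risk_difference(
    rows, fitted_propensity.get, lambda a, c: 0.25)
wrong_propensity = aipw_risk_difference(
    rows, lambda c: 0.4, lambda a, c: fitted_outcome[(a, c)])
print(round(both_right, 3), round(wrong_outcome, 3), round(wrong_propensity, 3))
```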
17.
Am J Epidemiol ; 190(5): 900-907, 2021 05 04.
Article in English | MEDLINE | ID: mdl-33083814

ABSTRACT

In aspiring to be discerning epidemiologists, we must learn to think critically about the fundamental concepts in our field and be able to understand and apply many of the novel methods being developed today. We must also find effective ways to teach both basic and advanced topics in epidemiology to graduate students, in a manner that goes beyond simple provision of knowledge. Here, we argue that simulation is one critical tool that can be used to help meet these goals, by providing examples of how simulation can be used to address 2 common misconceptions in epidemiology. First, we show how simulation can be used to explore nondifferential exposure misclassification. Second, we show how an instructor could use simulation to provide greater clarity on the correct definition of the P value. Through these 2 examples, we highlight how simulation can be used to both clearly and concretely demonstrate theoretical concepts, as well as to test and experiment with ideas, theories, and methods in a controlled environment. Simulation is therefore useful not only in the classroom but also as a skill for independent self-learning.


Subject(s)
Epidemiology/education , Simulation Training , Bias , Confounding Factors, Epidemiologic , Humans , Monte Carlo Method
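
A classroom-style simulation of the first example in record 17 might look like the sketch below. Every parameter value is an assumption chosen for illustration (30% exposed, true risk ratio of 2, sensitivity 0.8, specificity 0.9), and the code is not taken from the paper. Exposure is misclassified with the same error rates regardless of outcome, and the risk ratio is computed from both the true and the recorded exposure.

```python
import random


def risk_ratio(table):
    """table[exposure][outcome] holds counts; returns risk in exposed / unexposed."""
    risk1 = table[1][1] / sum(table[1])
    risk0 = table[0][1] / sum(table[0])
    return risk1 / risk0


def simulate(n=200_000, sensitivity=0.8, specificity=0.9, seed=1):
    rng = random.Random(seed)
    true_table = [[0, 0], [0, 0]]
    observed_table = [[0, 0], [0, 0]]
    for _ in range(n):
        exposed = int(rng.random() < 0.3)
        outcome = int(rng.random() < (0.20 if exposed else 0.10))
        # Measurement error depends on true exposure only, not on the outcome.
        if exposed:
            recorded = int(rng.random() < sensitivity)
        else:
            recorded = int(rng.random() >= specificity)
        true_table[exposed][outcome] += 1
        observed_table[recorded][outcome] += 1
    return risk_ratio(true_table), risk_ratio(observed_table)


true_rr, observed_rr = simulate()
print(round(true_rr, 2), round(observed_rr, 2))
```

In a sample this large the recorded-exposure estimate lands closer to 1 than the true-exposure estimate. Rerunning with a small `n` and different seeds lets students see how much individual samples can vary around that expectation.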
18.
Biostatistics ; 21(2): 339-344, 2020 04 01.
Article in English | MEDLINE | ID: mdl-31742353

ABSTRACT

In this commentary, we put forth the following argument: Anyone conducting machine learning in a health-related domain should educate themselves about structural racism. We argue that structural racism is a critical body of knowledge needed for generalizability in almost all domains of health research.


Subject(s)
Biomedical Research , Biostatistics , Health Services Research , Health Status Disparities , Healthcare Disparities , Machine Learning , Racism , Humans
19.
Epidemiology ; 32(2): 202-208, 2021 03 01.
Article in English | MEDLINE | ID: mdl-33470712

ABSTRACT

When causal inference is of primary interest, a range of target parameters can be chosen to define the causal effect, such as average treatment effects (ATEs). However, ATEs may not always align with the research question at hand. Furthermore, the assumptions needed to interpret estimates as ATEs, such as exchangeability, consistency, and positivity, are often not met. Here, we present the incremental propensity score (PS) approach to quantify the effect of shifting each person's exposure propensity by some predetermined amount. Compared with the ATE, incremental PS effects may better reflect the impact of certain policy interventions and do not require that positivity hold. Using the Nulliparous Pregnancy Outcomes Study: Monitoring Mothers-to-Be (nuMoM2b), we quantified the relationship between total vegetable intake and the risk of preeclampsia and compared it to average treatment effect estimates. The ATE estimates suggested a reduction of between two and three preeclampsia cases per 100 pregnancies for consuming at least half a cup of vegetables per 1,000 kcal. However, positivity violations obfuscate the interpretation of these results. In contrast, shifting each woman's exposure propensity by odds ratios ranging from 0.20 to 5.0 yielded no difference in the risk of preeclampsia. Our analyses show the utility of the incremental PS effects in addressing public health questions with fewer assumptions.


Subject(s)
Pregnancy Outcome , Causality , Female , Humans , Odds Ratio , Pregnancy , Propensity Score
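
For a single binary exposure, the incremental effect in record 19 can be estimated with a simple weighted average. The sketch below uses the inverse-probability-weighted form of the estimator on made-up counts, with propensities read directly off those counts. It is an illustration under these assumptions, not the analysis reported in the paper, and all names are hypothetical.

```python
# Hypothetical counts: {(confounder, exposure): (n, outcomes)}.
counts = {(0, 1): (20, 4), (0, 0): (80, 8),
          (1, 1): (60, 30), (1, 0): (40, 12)}
rows = [(c, a, y)
        for (c, a), (n, d) in counts.items()
        for y in [1] * d + [0] * (n - d)]
propensity = {0: 0.2, 1: 0.6}   # P(exposed | confounder) in the toy counts


def incremental_risk(rows, propensity, odds_ratio):
    """Risk if everyone's odds of exposure were multiplied by odds_ratio."""
    total = 0.0
    for c, a, y in rows:
        pi = propensity[c]
        weight = (odds_ratio * a + 1 - a) / (odds_ratio * pi + 1 - pi)
        total += weight * y
    return total / len(rows)


for odds_ratio in (0.2, 1.0, 2.0, 5.0):
    print(odds_ratio, round(incremental_risk(rows, propensity, odds_ratio), 3))
```

With an odds ratio of 1 every weight equals 1, so the estimate reduces to the observed risk in the sample.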
20.
Am J Epidemiol ; 189(11): 1408-1411, 2020 11 02.
Article in English | MEDLINE | ID: mdl-32412079

ABSTRACT

The Kaplan-Meier (KM) estimator of the survival function imputes event times for right-censored and left-truncated observations, but these imputations are hidden and therefore sometimes unrecognized by applied health scientists. Using a simple example data set and the redistribution algorithm, we illustrate how imputations are made by the KM estimator. We also discuss the assumptions necessary for valid analyses of survival data. Illustrating imputations hidden by the KM estimator helps to clarify these assumptions and therefore may reduce inappropriate inferences.


Subject(s)
Data Interpretation, Statistical , Kaplan-Meier Estimate , Survival Analysis , Algorithms , Humans
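
The redistribution idea in record 20 can be checked on a small made-up data set. The sketch below handles right censoring only (left truncation is not covered) and uses hypothetical function names. It computes the Kaplan-Meier curve with the usual product-limit formula, then again by giving each censored person's probability mass to everyone still under observation at later times.

```python
def kaplan_meier(data):
    """data: list of (time, event), event 1 = observed, 0 = right-censored."""
    surv, curve = 1.0, {}
    for t in sorted({time for time, event in data if event == 1}):
        at_risk = sum(1 for time, _ in data if time >= t)
        failures = sum(1 for time, event in data if time == t and event == 1)
        surv *= 1 - failures / at_risk
        curve[t] = surv
    return curve


def redistribute_to_the_right(data):
    """Same curve, built by moving each censored person's mass to later times."""
    ordered = sorted(data)
    n = len(ordered)
    mass = [1.0 / n] * n
    for i, (t, event) in enumerate(ordered):
        if event == 0:
            later = [j for j in range(n) if ordered[j][0] > t]
            if later:
                share = mass[i] / len(later)
                for j in later:
                    mass[j] += share
                mass[i] = 0.0
    surv, curve = 1.0, {}
    for (t, event), m in zip(ordered, mass):
        if event == 1:
            surv -= m
            curve[t] = surv
    return curve


toy = [(1, 1), (2, 0), (3, 1), (4, 1), (5, 0), (6, 1)]   # hypothetical data
km = kaplan_meier(toy)
rr = redistribute_to_the_right(toy)
print({t: round(s, 3) for t, s in km.items()})
print({t: round(s, 3) for t, s in rr.items()})
```

The person censored at time 2 is, in effect, assigned the event-time distribution of those observed beyond time 2. That hidden imputation is why the censoring mechanism has to be unrelated to the outcome for the curve to be valid.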