ABSTRACT
Targeted maximum likelihood estimation (TMLE) is increasingly used for doubly robust causal inference, but how missing data should be handled when using TMLE with data-adaptive approaches is unclear. Based on data (1992-1998) from the Victorian Adolescent Health Cohort Study, we conducted a simulation study to evaluate 8 missing-data methods in this context: complete-case analysis, extended TMLE incorporating an outcome-missingness model, the missing covariate missing indicator method, and 5 multiple imputation (MI) approaches using parametric or machine-learning models. We considered 6 scenarios that varied in terms of exposure/outcome generation models (presence of confounder-confounder interactions) and missingness mechanisms (whether outcome influenced missingness in other variables and presence of interaction/nonlinear terms in missingness models). Complete-case analysis and extended TMLE had small biases when outcome did not influence missingness in other variables. Parametric MI without interactions had large bias when exposure/outcome generation models included interactions. Parametric MI including interactions performed best in bias and variance reduction across all settings, except when missingness models included a nonlinear term. When choosing a method for handling missing data in the context of TMLE, researchers must consider the missingness mechanism and, for MI, compatibility with the analysis method. In many settings, a parametric MI approach that incorporates interactions and nonlinearities is expected to perform well.
Subjects
Causality , Humans , Likelihood Functions , Adolescent , Data Interpretation, Statistical , Bias , Models, Statistical , Computer Simulation
ABSTRACT
Longitudinal cohort studies, which follow a group of individuals over time, provide the opportunity to examine causal effects of complex exposures on long-term health outcomes. Utilizing data from multiple cohorts has the potential to add further benefit by improving precision of estimates through data pooling and by allowing examination of effect heterogeneity through replication of analyses across cohorts. However, the interpretation of findings can be complicated by biases that may be compounded when pooling data or contribute to discrepant findings when analyses are replicated. The "target trial" is a powerful tool for guiding causal inference in single-cohort studies. Here we extend this conceptual framework to address the specific challenges that can arise in the multi-cohort setting. By representing a clear definition of the target estimand, the target trial provides a central point of reference against which biases arising in each cohort and from data pooling can be systematically assessed. Consequently, analyses can be designed to reduce these biases and the resulting findings appropriately interpreted in light of potential remaining biases. We use a case study to demonstrate the framework and its potential to strengthen causal inference in multi-cohort studies through improved analysis design and clarity in the interpretation of findings.
ABSTRACT
PURPOSE: "Positive epidemiology" emphasizes strengths and assets that protect the health of populations. Positive mental health refers to a range of social and emotional capabilities that may support adaptation to challenging circumstances. We examine the role of positive mental health in promoting adolescent health during the crisis phase of the COVID-19 pandemic. METHODS: We used four long-running Australian and UK longitudinal cohorts: Childhood to Adolescence Transition Study (CATS; analyzed N=809; Australia); Longitudinal Study of Australian Children (LSAC) - Baby (analyzed N=1,534) and Kindergarten (analyzed N=1,300) cohorts; Millennium Cohort Study (MCS; analyzed N=2,490; UK). Measures included the pre-pandemic exposure, positive mental health (parent-reported, 13-15 years), comprising regulating emotions, interacting well with peers, and caring for others; and the pandemic outcomes: psychological distress, life satisfaction, and sleep and alcohol use outside of recommendations (16-21 years; 2020). We used two-stage meta-analysis to estimate associations between positive mental health and outcomes across cohorts, accounting for potential confounders. RESULTS: Estimates suggest meaningful effects of positive mental health on psychosocial outcomes during the pandemic, including lower risk of psychological distress (Risk Ratio [RR]=0.83, 95%CI=0.71, 0.97) and higher life satisfaction (RR=1.1, 95%CI=1.0, 1.2). The estimated effects for health behaviors were smaller in magnitude (sleep: RR=0.95, 95%CI=0.86, 1.1; alcohol use: RR=0.97, 95%CI=0.85, 1.1). CONCLUSIONS: Our results are consistent with the hypothesis that adolescents' positive mental health supports better psychosocial outcomes during challenges such as the COVID-19 pandemic, but relevance for health behaviors is less clear. These findings reinforce the value of extending evidence to include positive health states and assets.
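The second stage of a two-stage meta-analysis can be illustrated with a minimal sketch. The abstract does not say whether a fixed-effect or random-effects model was used, so the fixed-effect inverse-variance pooling below is an assumption, as are the function name `pool_log_risk_ratios` and the input numbers, which are hypothetical and not taken from the study.

```python
import math

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def pool_log_risk_ratios(estimates):
    """Pool cohort-specific risk ratios on the log scale.

    `estimates` is a list of (rr, ci_lower, ci_upper) tuples with 95% CIs.
    Fixed-effect inverse-variance weighting is assumed here for illustration.
    Returns (pooled_rr, pooled_ci_lower, pooled_ci_upper).
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for rr, lower, upper in estimates:
        # Recover the standard error of log(RR) from the width of the 95% CI.
        se = (math.log(upper) - math.log(lower)) / (2 * Z_975)
        weight = 1.0 / se ** 2
        total_weight += weight
        weighted_sum += weight * math.log(rr)
    pooled_log = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - Z_975 * pooled_se),
        math.exp(pooled_log + Z_975 * pooled_se),
    )


# Hypothetical stage-one results from two cohorts (illustrative numbers only).
stage_one = [(0.80, 0.60, 0.80 ** 2 / 0.60), (0.80, 0.60, 0.80 ** 2 / 0.60)]
pooled_rr, pooled_lower, pooled_upper = pool_log_risk_ratios(stage_one)
```

With two cohorts reporting the same estimate, the pooled risk ratio is unchanged and the interval narrows, which is the precision gain that pooling is meant to deliver.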
ABSTRACT
BACKGROUND: Missing data are common in observational studies and often occur in several of the variables required when estimating a causal effect, i.e. the exposure, outcome and/or variables used to control for confounding. Analyses involving multiple incomplete variables are not as straightforward as analyses with a single incomplete variable. For example, in the context of multivariable missingness, the standard missing data assumptions ("missing completely at random", "missing at random" [MAR], "missing not at random") are difficult to interpret and assess. It is not clear how the complexities that arise due to multivariable missingness are being addressed in practice. The aim of this study was to review how missing data are managed and reported in observational studies that use multiple imputation (MI) for causal effect estimation, with a particular focus on missing data summaries, missing data assumptions, primary and sensitivity analyses, and MI implementation. METHODS: We searched five top general epidemiology journals for observational studies that aimed to answer a causal research question and used MI, published between January 2019 and December 2021. Article screening and data extraction were performed systematically. RESULTS: Of the 130 studies included in this review, 108 (83%) derived an analysis sample by excluding individuals with missing data in specific variables (e.g., outcome) and 114 (88%) had multivariable missingness within the analysis sample. Forty-four (34%) studies provided a statement about missing data assumptions, 35 of which stated the MAR assumption, but only 11/44 (25%) studies provided a justification for these assumptions. The number of imputations, MI method and MI software were generally well-reported (71%, 75% and 88% of studies, respectively), while aspects of the imputation model specification were not clear for more than half of the studies. 
A secondary analysis that used a different approach to handle the missing data was conducted in 69/130 (53%) studies. Of these 69 studies, 68 (99%) lacked a clear justification for the secondary analysis. CONCLUSION: Effort is needed to clarify the rationale for and improve the reporting of MI for estimation of causal effects from observational data. We encourage greater transparency in making and reporting analytical decisions related to missing data.
Subjects
Observational Studies as Topic , Research Design , Causality , Data Interpretation, Statistical , Research Design/standards
ABSTRACT
Observational studies have a critical role in disability research, providing the opportunity to address a range of research questions. Over the past decades, there have been substantial shifts and developments in statistical methods for observational studies, most notably for causal inference. In this review, we provide an overview of modern design and analysis concepts critical for observational studies, drawing examples from the field of disability research and highlighting the challenges in this field, to inform the readership on important statistical considerations for their studies. WHAT THIS PAPER ADDS: Descriptive research questions have specific analytical complexities, so careful statistical design before analysis is critical. Prediction research aims to produce a model with good predictive ability and requires thorough statistical design prior to analysis. Causal research requires careful statistical analysis planning, facilitated by modern causal inference concepts and analytical methods. Adopting these approaches will strengthen the quality of observational studies addressing a range of research questions in the disability space.
Subjects
Observational Studies as Topic , Humans , Observational Studies as Topic/methods , Data Interpretation, Statistical , Persons with Disabilities , Research Design , Biomedical Research
ABSTRACT
In the context of missing data, the identifiability or "recoverability" of the average causal effect (ACE) depends not only on the usual causal assumptions but also on missingness assumptions that can be depicted by adding variable-specific missingness indicators to causal diagrams, creating missingness directed acyclic graphs (m-DAGs). Previous research described canonical m-DAGs, representing typical multivariable missingness mechanisms in epidemiological studies, and examined mathematically the recoverability of the ACE in each case. However, this work assumed no effect modification and did not investigate methods for estimation across such scenarios. Here, we extend this research by determining the recoverability of the ACE in settings with effect modification and conducting a simulation study to evaluate the performance of widely used missing data methods when estimating the ACE using correctly specified g-computation. Methods assessed were complete case analysis (CCA) and various implementations of multiple imputation (MI) with varying degrees of compatibility with the outcome model used in g-computation. Simulations were based on an example from the Victorian Adolescent Health Cohort Study (VAHCS), where interest was in estimating the ACE of adolescent cannabis use on mental health in young adulthood. We found that the ACE is recoverable when no incomplete variable (exposure, outcome, or confounder) causes its own missingness, and nonrecoverable otherwise, in simplified versions of 10 canonical m-DAGs that excluded unmeasured common causes of missingness indicators. Despite this nonrecoverability, simulations showed that MI approaches that are compatible with the outcome model in g-computation may enable approximately unbiased estimation across all canonical m-DAGs considered, except when the outcome causes its own missingness or causes the missingness of a variable that causes its own missingness. 
In the latter settings, researchers may need to consider sensitivity analysis methods incorporating external information (e.g., delta-adjustment methods). The VAHCS case study illustrates the practical implications of these findings.
Subjects
Cohort Studies , Humans , Young Adult , Adult , Adolescent , Data Interpretation, Statistical , Causality , Computer Simulation
ABSTRACT
Multiple imputation (MI) is a popular method for handling missing data. Auxiliary variables can be added to the imputation model(s) to improve MI estimates. However, the choice of which auxiliary variables to include is not always straightforward. Several data-driven auxiliary variable selection strategies have been proposed, but there has been limited evaluation of their performance. Using a simulation study we evaluated the performance of eight auxiliary variable selection strategies: (1, 2) two versions of selection based on correlations in the observed data; (3) selection using hypothesis tests of the "missing completely at random" assumption; (4) replacing auxiliary variables with their principal components; (5, 6) forward and forward stepwise selection; (7) forward selection based on the estimated fraction of missing information; and (8) selection via the least absolute shrinkage and selection operator (LASSO). A complete case analysis and an MI analysis using all auxiliary variables (the "full model") were included for comparison. We also applied all strategies to a motivating case study. The full model outperformed all auxiliary variable selection strategies in the simulation study, with the LASSO strategy the best performing auxiliary variable selection strategy overall. All MI analysis strategies that we were able to apply to the case study led to similar estimates, although computational time was substantially reduced when variable selection was employed. This study provides further support for adopting an inclusive auxiliary variable strategy where possible. Auxiliary variable selection using the LASSO may be a promising alternative when the full model fails or is too burdensome.
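Because the strategies above all end in the same pooling step, a minimal sketch of Rubin's rules may help. The function name `rubin_pool` and the example numbers are hypothetical. The quantity returned as `lam` is the proportion of total variance attributable to missing data, which approximates the fraction of missing information when the number of imputations is large; it is not claimed to be the exact criterion used in strategy (7).

```python
import math


def rubin_pool(estimates, variances):
    """Combine m imputed-data estimates using Rubin's rules.

    `estimates` holds the point estimate from each imputed dataset and
    `variances` holds the corresponding squared standard errors.
    Returns (pooled_estimate, pooled_standard_error, lam).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = sum(estimates) / m                      # pooled point estimate
    within = sum(variances) / m                     # within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    total = within + (1 + 1 / m) * between          # total variance
    lam = (1 + 1 / m) * between / total             # share due to missingness
    return q_bar, math.sqrt(total), lam


# Hypothetical results from m = 3 imputed datasets (illustrative numbers only).
pooled, se, lam = rubin_pool([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

Informative auxiliary variables tend to shrink the between-imputation variance, and with it `lam`, which is one way to see why an inclusive strategy can improve precision.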
Subjects
Computer Simulation
ABSTRACT
BACKGROUND: With continuous outcomes, the average causal effect is typically defined using a contrast of expected potential outcomes. However, in the presence of skewed outcome data, the expectation (population mean) may no longer be meaningful. In practice the typical approach is to continue defining the estimand this way or transform the outcome to obtain a more symmetric distribution, although neither approach may be entirely satisfactory. Alternatively the causal effect can be redefined as a contrast of median potential outcomes, yet discussion of confounding-adjustment methods to estimate the causal difference in medians is limited. In this study we described and compared confounding-adjustment methods to address this gap. METHODS: The methods considered were multivariable quantile regression, an inverse probability weighted (IPW) estimator, weighted quantile regression (another form of IPW) and two little-known implementations of g-computation for this problem. Methods were evaluated within a simulation study under varying degrees of skewness in the outcome and applied to an empirical study using data from the Longitudinal Study of Australian Children. RESULTS: Simulation results indicated the IPW estimator, weighted quantile regression and g-computation implementations minimised bias across all settings when the relevant models were correctly specified, with g-computation additionally minimising the variance. Multivariable quantile regression, which relies on a constant-effect assumption, consistently yielded biased results. Application to the empirical study illustrated the practical value of these methods. CONCLUSION: The presented methods provide appealing avenues for estimating the causal difference in medians.
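The IPW estimator of a causal difference in medians can be sketched as a contrast of weighted medians. This is an illustrative outline under stated assumptions and not the authors' implementation: the propensity scores are taken as already estimated, the lower weighted median is used without interpolation, and the names `weighted_median` and `ipw_median_difference` are hypothetical.

```python
def weighted_median(values, weights):
    """Return the smallest value whose cumulative weight reaches half the total."""
    pairs = sorted(zip(values, weights))
    half = 0.5 * sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= half:
            return value
    return pairs[-1][0]


def ipw_median_difference(outcome, exposed, propensity):
    """Estimate median(Y under exposure) - median(Y under no exposure).

    `propensity` holds P(exposed = 1 | confounders) for each individual and is
    assumed to come from a correctly specified model fitted beforehand.
    """
    y1, w1, y0, w0 = [], [], [], []
    for y, a, p in zip(outcome, exposed, propensity):
        if a == 1:
            y1.append(y)
            w1.append(1.0 / p)           # weight for exposed individuals
        else:
            y0.append(y)
            w0.append(1.0 / (1.0 - p))   # weight for unexposed individuals
    return weighted_median(y1, w1) - weighted_median(y0, w0)


# Hypothetical skewed outcomes with constant propensity 0.5 (illustration only).
diff = ipw_median_difference(
    outcome=[1, 2, 10, 3, 4, 5],
    exposed=[1, 1, 1, 0, 0, 0],
    propensity=[0.5] * 6,
)
```

The extreme value 10 in the exposed group does not move the estimate, which reflects the motivation for targeting medians when the outcome is skewed.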
Subjects
Models, Statistical , Child , Humans , Longitudinal Studies , Australia , Computer Simulation , Probability , Causality , Bias
ABSTRACT
BACKGROUND: Case-cohort studies are conducted within cohort studies, with the defining feature that collection of exposure data is limited to a subset of the cohort, leading to a large proportion of missing data by design. Standard analysis uses inverse probability weighting (IPW) to address this intended missing data, but little research has been conducted into how best to perform analysis when there is also unintended missingness. Multiple imputation (MI) has become a default standard for handling unintended missingness and is typically used in combination with IPW to handle the intended missingness due to the case-control sampling. Alternatively, MI could be used to handle both the intended and unintended missingness. While the performance of an MI-only approach has been investigated in the context of a case-cohort study with a time-to-event outcome, it is unclear how this approach performs with a binary outcome. METHODS: We conducted a simulation study to assess and compare the performance of approaches using only MI, only IPW, and a combination of MI and IPW, for handling intended and unintended missingness in the case-cohort setting. We also applied the approaches to a case study. RESULTS: Our results show that the combined approach is approximately unbiased for estimation of the exposure effect when the sample size is large, and was the least biased with small sample sizes, while MI-only and IPW-only exhibited larger biases in both sample size settings. CONCLUSIONS: These findings suggest that a combined MI/IPW approach should be preferred to handle intended and unintended missing data in case-cohort studies with binary outcomes.
Subjects
Cohort Studies , Humans , Data Interpretation, Statistical , Probability , Bias , Computer Simulation
ABSTRACT
BACKGROUND: Despite recent advances in causal inference methods, outcome regression remains the most widely used approach for estimating causal effects in epidemiological studies with a single-point exposure and outcome. Missing data are common in these studies, and complete-case analysis (CCA) and multiple imputation (MI) are two frequently used methods for handling them. In randomised controlled trials (RCTs), it has been shown that MI should be conducted separately by treatment group. In observational studies, causal inference is now understood as the task of emulating an RCT, which raises the question of whether MI should be conducted by exposure group in such studies. METHODS: We addressed this question by evaluating the performance of seven methods for handling missing data when estimating causal effects with outcome regression. We conducted an extensive simulation study based on an illustrative case study from the Victorian Adolescent Health Cohort Study, assessing a range of scenarios, including seven outcome generation models with exposure-confounder interactions of differing strength. RESULTS: The simulation results showed that MI by exposure group led to the least bias when the size of the smallest exposure group was relatively large, followed by MI approaches that included the exposure-confounder interactions. CONCLUSIONS: The findings from our simulation study, which was designed based on a real case study, suggest that current practice for the conduct of MI in causal inference may need to shift to stratifying by exposure group where feasible, or otherwise including exposure-confounder interactions in the imputation model.
Subjects
Computer Simulation , Humans , Adolescent , Bias
ABSTRACT
BACKGROUND: Academic difficulties are common in adolescents with mental health problems. Although earlier childhood emotional problems, characterised by heightened anxiety and depressive symptoms, are common forerunners to adolescent mental health problems, the degree to which mental health problems in childhood may contribute independently to academic difficulties has been little explored. METHODS: Data were drawn from a prospective cohort study of students in Melbourne, Australia (N = 1239). Data were linked with a standardised national assessment of academic performance at baseline (9 years) and wave three (11 years). Depressive and anxiety symptoms were assessed at baseline and wave two (10 years). Regression analyses estimated the association between emotional problems (9 and/or 10 years) and academic performance at 11 years, adjusting for baseline academic performance, sex, age and socioeconomic status, and hyperactivity/inattention symptoms. RESULTS: Students with depressive symptoms at 9 years of age had lost nearly 4 months of numeracy learning two years later after controlling for baseline academic performance and confounders. Results were similar for anxiety symptoms. Regardless of when depressive symptoms occurred, there were consistent associations with poorer numeracy performance at 11 years. The association of depressive symptoms with reading performance was weaker than for numeracy if they were present at wave two. Persistent anxiety symptoms across two waves led to a loss of nearly 4 months of numeracy learning at 11 years, but the difference was not meaningful for reading. Findings were similar when including hyperactivity/inattention symptoms. CONCLUSIONS: Childhood anxiety and depression are not only forerunners of later mental health problems but also predict academic achievement. Partnerships between education and health systems have the potential to not only improve childhood emotional problems but also improve learning.
Subjects
Anxiety , Emotions , Adolescent , Humans , Child , Infant , Prospective Studies , Anxiety/psychology , Students/psychology , Schools
ABSTRACT
BACKGROUND: The potential for early interventions to reduce the later prevalence of common mental disorders (CMD) first experienced in adolescence is unclear. AIMS: To examine the course of CMD and evaluate the extent to which the prevalence of CMD could be reduced by preventing adolescent CMD, or by intervening to change four young adult processes, between the ages of 20 and 29 years, that could be mediating the link between adolescent and adult disorder. METHOD: This was a prospective cohort study of 1923 Australian participants assessed repeatedly from adolescence (wave 1, mean age 14 years) to adulthood (wave 10, mean age 35 years). Causal mediation analysis was undertaken to evaluate the extent to which the prevalence of CMD at age 35 years in those with adolescent CMD could be reduced by either preventing adolescent CMD, or by intervening on four young adult mediating processes: the occurrence of young adult CMD, frequent cannabis use, parenting a child by age 24 years, and engagement in higher education and employment. RESULTS: At age 35, 19.2% of participants reported CMD; a quarter of these participants experienced CMD during both adolescence and young adulthood. In total, 49% of those with CMD during both adolescence and young adulthood went on to report CMD at age 35 years. Preventing adolescent CMD reduced the population prevalence at age 35 years by 3.9%. Intervening on all four young adult processes among those with adolescent CMD reduced this prevalence by 1.6%. CONCLUSIONS: In this Australian cohort, a large proportion of adolescent CMD resolved by adulthood, and by age 35 years, the largest proportion of CMD emerged among individuals without prior CMD. Time-limited, early intervention in those with earlier adolescent disorder is unlikely to substantially reduce the prevalence of CMD in midlife.
Subjects
Mental Disorders , Adolescent , Adult , Australia/epidemiology , Child , Cohort Studies , Humans , Mental Disorders/epidemiology , Prevalence , Prospective Studies , Young Adult
ABSTRACT
Three-level data arising from repeated measures on individuals clustered within higher-level units are common in medical research. A complexity arises when individuals change clusters over time, resulting in a cross-classified data structure. Missing values in these studies are commonly handled via multiple imputation (MI). If the three-level, cross-classified structure is modeled in the analysis, it also needs to be accommodated in the imputation model to ensure valid results. While incomplete three-level data can be handled using various approaches within MI, the performance of these in the cross-classified data setting remains unclear. We conducted simulations under a range of scenarios to compare these approaches in the context of an acute-effects cross-classified random effects substantive model, which models the time-varying cluster membership via simple additive random effects. The simulation study was based on a case study in a longitudinal cohort of students clustered within schools. We evaluated methods that ignore the time-varying cluster memberships by taking the first or most common cluster for each individual; pragmatic extensions of single- and two-level MI approaches within the joint modeling (JM) and the fully conditional specification (FCS) frameworks, using dummy indicators (DI) and/or imputing repeated measures in wide format to account for the cross-classified structure; and a three-level FCS MI approach developed specifically for cross-classified data. Results indicated that the FCS implementations performed well in terms of bias and precision while JM approaches performed poorly. Under both frameworks approaches using the DI extension should be used with caution in the presence of sparse data.
Subjects
Models, Statistical , Research Design , Bias , Computer Simulation , Data Interpretation, Statistical , Humans
ABSTRACT
BACKGROUND: In case-cohort studies a random subcohort is selected from the inception cohort and acts as the sample of controls for several outcome investigations. Analysis is conducted using only the cases and the subcohort, with inverse probability weighting (IPW) used to account for the unequal sampling probabilities resulting from the study design. Like all epidemiological studies, case-cohort studies are susceptible to missing data. Multiple imputation (MI) has become increasingly popular for addressing missing data in epidemiological studies. It is currently unclear how best to incorporate the weights from a case-cohort analysis in MI procedures used to address missing covariate data. METHOD: A simulation study was conducted with missingness in two covariates, motivated by a case study within the Barwon Infant Study. MI methods considered were: using the outcome, a proxy for weights in the simple case-cohort design considered, as a predictor in the imputation model, with and without exposure and covariate interactions; imputing separately within each weight category; and using a weighted imputation model. These methods were compared to a complete case analysis (CCA) within the context of a standard IPW analysis model estimating either the risk or odds ratio. The strength of associations, missing data mechanism, proportion of observations with incomplete covariate data, and subcohort selection probability varied across the simulation scenarios. Methods were also applied to the case study. RESULTS: There was similar performance in terms of relative bias and precision with all MI methods across the scenarios considered, with expected improvements compared with the CCA. Slight underestimation of the standard error was seen throughout but the nominal level of coverage (95%) was generally achieved. All MI methods showed a similar increase in precision as the subcohort selection probability increased, irrespective of the scenario. 
A similar pattern of results was seen in the case study. CONCLUSIONS: How weights were incorporated into the imputation model had minimal effect on the performance of MI; this may be due to case-cohort studies only having two weight categories. In this context, inclusion of the outcome in the imputation model was sufficient to account for the unequal sampling probabilities in the analysis model.
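The remark that a simple case-cohort design has only two weight categories can be made concrete with a small sketch. It assumes the simplest design, in which every case is retained (weight 1) and non-cases enter only through a subcohort drawn with a single known probability; the function names and numbers are hypothetical and do not reproduce the study's analysis.

```python
def case_cohort_weights(is_case, subcohort_prob):
    """Return inverse probability weights for a simple case-cohort sample.

    Cases are assumed to be sampled with probability 1; non-cases are assumed
    to be present only because they were selected into the subcohort.
    """
    if not 0.0 < subcohort_prob <= 1.0:
        raise ValueError("subcohort_prob must be in (0, 1]")
    return [1.0 if case else 1.0 / subcohort_prob for case in is_case]


def weighted_risk(is_case, weights):
    """Estimate the outcome risk in the full cohort from the weighted sample."""
    case_weight = sum(w for case, w in zip(is_case, weights) if case)
    return case_weight / sum(weights)


# Hypothetical analysis sample: 2 cases and 4 subcohort non-cases,
# with a subcohort sampling probability of 0.25 (illustration only).
sample_cases = [1, 1, 0, 0, 0, 0]
weights = case_cohort_weights(sample_cases, subcohort_prob=0.25)
risk = weighted_risk(sample_cases, weights)
```

Because the weight is a deterministic function of case status in this design, including the outcome as a predictor in the imputation model carries the same information as the weight, consistent with the conclusion above.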
Subjects
Research Design , Bias , Cohort Studies , Data Interpretation, Statistical , Humans , Probability
ABSTRACT
Importance: Randomized clinical trials showed that earlier peanut introduction can prevent peanut allergy in select high-risk populations. This led to changes in infant feeding guidelines in 2016 to recommend early peanut introduction for all infants to reduce the risk of peanut allergy. Objective: To measure the change in population prevalence of peanut allergy in infants after the introduction of these new guidelines and evaluate the association between early peanut introduction and peanut allergy. Design: Two population-based cross-sectional samples of infants aged 12 months were recruited 10 years apart using the same sampling frame and methods to allow comparison of changes over time. Infants were recruited from immunization centers around Melbourne, Australia. Infants attending their 12-month immunization visit were eligible to participate (eligible age range, 11-15 months), regardless of history of peanut exposure or allergy history. Exposures: Questionnaires collected data on demographics, food allergy risk factors, peanut introduction, and reactions. Main Outcome and Measures: All infants underwent skin prick tests to peanut and those with positive results underwent oral food challenges. Prevalence estimates were standardized to account for changes in population demographics over time. Results: This study included 7209 infants (1933 in 2018-2019 and 5276 in 2007-2011). Of the participants in the older vs more recent cohort, 51.8% vs 50.8% were male; median (IQR) ages were 12.5 (12.2-13.0) months vs 12.4 (12.2-12.9) months. There was an increase in infants of East Asian ancestry over time (16.5% in 2018-2019 vs 10.5% in 2007-2011), which is a food allergy risk factor. After standardizing for infant ancestry and other demographics changes, peanut allergy prevalence was 2.6% (95% CI, 1.8%-3.4%) in 2018-2019, compared with 3.1% in 2007-2011 (difference, -0.5% [95% CI, -1.4% to 0.4%]; P = .26). 
Earlier age of peanut introduction was significantly associated with a lower risk of peanut allergy among infants of Australian ancestry in 2018-2019 (age 12 months compared with age 6 months or younger: adjusted odds ratio, 0.08 [95% CI, 0.02-0.36]; age 12 months compared with 7 to less than 10 months: adjusted odds ratio, 0.09 [95% CI, 0.02-0.53]), but the association was not significant among infants of East Asian ancestry (P for interaction = .002). Conclusions and Relevance: In cross-sectional analyses, introduction of a guideline recommending early peanut introduction in Australia was not associated with a statistically significant lower or higher prevalence of peanut allergy across the population.
Subjects
Arachis , Feeding Behavior , Peanut Hypersensitivity , Arachis/adverse effects , Australia/epidemiology , Cross-Sectional Studies , Female , Humans , Infant , Male , Peanut Hypersensitivity/epidemiology , Peanut Hypersensitivity/etiology , Peanut Hypersensitivity/prevention & control , Prevalence , Risk Factors
ABSTRACT
Three-level data structures arising from repeated measures on individuals clustered within larger units are common in health research studies. Missing data are prominent in such studies and are often handled via multiple imputation (MI). Although several MI approaches can be used to account for the three-level structure, including adaptations to single- and two-level approaches, when the substantive analysis model includes interactions or quadratic effects, these too need to be accommodated in the imputation model. In such analyses, substantive model compatible (SMC) MI has shown great promise in the context of single-level data. Although there have been recent developments in multilevel SMC MI, to date only one approach that explicitly handles incomplete three-level data is available. Alternatively, researchers can use pragmatic adaptations to single- and two-level MI approaches, or two-level SMC-MI approaches. We describe the available approaches and evaluate them via simulations in the context of three three-level random effects analysis models involving an interaction between the incomplete time-varying exposure and time, an interaction between the time-varying exposure and an incomplete time-fixed confounder, or a quadratic effect of the exposure. Results showed that all approaches considered performed well in terms of bias and precision when the target analysis involved an interaction with time, but the three-level SMC MI approach performed best when the target analysis involved an interaction between the time-varying exposure and an incomplete time-fixed confounder, or a quadratic effect of the exposure. We illustrate the methods using data from the Childhood to Adolescence Transition Study.
Subjects
Research Design , Adolescent , Humans , Child , Bias , Computer Simulation
ABSTRACT
Randomized trials involving independent and paired observations occur in many areas of health research, for example in paediatrics, where studies can include infants from both single and twin births. Multiple imputation (MI) is often used to address missing outcome data in randomized trials, yet its performance in trials with independent and paired observations, where design effects can be less than or greater than one, remains to be explored. Using simulated data and through application to a trial dataset, we investigated the performance of different methods of MI for a continuous or binary outcome when followed by analysis using generalized estimating equations to account for clustering due to the pairs. We found that imputing data separately for independent and paired data, with paired data imputed in wide format, was the best performing MI method, producing unbiased point and standard error estimates for the treatment effect throughout. Ignoring clustering in the imputation model performed well in settings where the design effect due to the inclusion of paired data was close to one, but otherwise led to moderately biased variance estimates. Including a random cluster effect in the imputation model led to slightly biased point estimates for binary outcome data and variance estimates that were too small in some settings. Based on these results, we recommend researchers impute independent and paired data separately where feasible to do so. The exception is if the design effect due to the inclusion of paired data is close to one, where ignoring clustering may be appropriate.
Subjects
Data Interpretation, Statistical; Randomized Controlled Trials as Topic; Cluster Analysis; Computer Simulation; Humans
ABSTRACT
Semi-continuous variables are characterized by a point mass at one value and a continuous range of values for remaining observations. An example is alcohol consumption quantity, with a spike of zeros representing non-drinkers and positive values for drinkers. If multiple imputation is used to handle missing values for semi-continuous variables, it is unclear how this should be implemented within the standard approaches of fully conditional specification (FCS) and multivariate normal imputation (MVNI). This question is brought into focus by the use of categorized versions of semi-continuous exposure variables in analyses (eg, no drinking, drinking below binge level, binge drinking, heavy binge drinking), raising the question of how best to achieve congeniality between imputation and analysis models. We performed a simulation study comparing nine approaches for imputing semi-continuous exposures requiring categorization for analysis. Three methods imputed the categories directly: ordinal logistic regression, and imputation of binary indicator variables representing the categories using MVNI (with two variants). Six methods (predictive mean matching, zero-inflated binomial imputation, and two-part imputation methods with variants in FCS and MVNI) imputed the semi-continuous variable, with categories derived after imputation. The ordinal and zero-inflated binomial methods had good performance across most scenarios, while MVNI methods requiring rounding after imputation did not perform well. There were mixed results for predictive mean matching and the two-part methods, depending on whether the estimands were proportions or regression coefficients. The results highlight the need to consider the parameter of interest when selecting an imputation procedure.
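To make "categories derived after imputation" concrete, here is a minimal sketch of mapping an imputed semi-continuous quantity to the four analysis categories. The cut-points `binge_cut` and `heavy_cut` and the function name are hypothetical placeholders, not values from the study.

```python
from bisect import bisect_right

# Category labels follow the example in the abstract
LABELS = ("no drinking", "below binge level", "binge drinking", "heavy binge drinking")


def categorize_consumption(values, binge_cut=5.0, heavy_cut=10.0):
    """Map imputed consumption quantities to category codes 0-3.

    0 = point mass at zero (non-drinkers); positive values are split at the
    (hypothetical) cut-points, with each cut-point belonging to the upper category.
    """
    cuts = [binge_cut, heavy_cut]
    codes = []
    for v in values:
        if v < 0:
            raise ValueError("consumption cannot be negative")
        # bisect_right counts how many cut-points are <= v, giving 0, 1, or 2
        codes.append(0 if v == 0 else 1 + bisect_right(cuts, v))
    return codes


codes = categorize_consumption([0, 2, 5, 9.9, 10, 30])
print([LABELS[c] for c in codes])
```

Methods that impute the categories directly skip this step, whereas methods that impute the underlying quantity rely on it, so any distortion in the imputed distribution near the cut-points carries over into the category proportions.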
Subjects
Data Collection; Research Design; Computer Simulation; Data Collection/methods; Humans; Logistic Models
ABSTRACT
BACKGROUND: Use of social networking in later childhood and adolescence has risen quickly. The consequences of these changes for mental health are debated but require further empirical evaluation. METHODS: Using data from the Childhood to Adolescence Transition Study (n = 1,156), duration of social networking use was measured annually at four time points from 11.9 to 14.8 years of age (≥1 h/day indicating high use). Cross-sectional and prospective relationships between social networking use and depressive and anxiety symptoms were examined. RESULTS: In adjusted (age, socioeconomic status, prior mental health history) cross-sectional analyses, females with high social networking use had greater odds of depressive (odds ratio [OR]: 2.15; 95% confidence interval [CI]: 1.58-2.91) and anxiety symptoms (OR: 1.99; 95% CI: 1.32-3.00) than those who used social networking for a few minutes at most, while males with high social networking use had greater odds of reporting depressive symptoms (OR: 1.60; 95% CI: 1.09-2.35). For females, increased odds of depressive symptoms at age 14.8 were observed for high social networking use at one previous wave and at two or three previous waves, even after adjustment (OR: 1.76; 95% CI: 1.11-2.78 and OR: 2.06; 95% CI: 1.27-3.37, respectively), compared with no wave of high use. CONCLUSIONS: Our results suggest weakly to moderately increased odds of depression and anxiety in girls and boys with high social networking use versus low/normal use. These findings indicate that prevention programs for early mental health problems might benefit from targeting social networking use in early adolescence.
Subjects
Anxiety; Depression; Adolescent; Anxiety/epidemiology; Child; Cross-Sectional Studies; Depression/epidemiology; Female; Humans; Male; Prospective Studies; Social Networking
ABSTRACT
Early maternal-infant bonding problems are often forerunners of later emotional and behavioural difficulties. Interventions typically target the perinatal period but many risks may be established well before pregnancy. Here we examine the extent to which adolescent and young adult depression and anxiety symptoms predict perinatal maternal-infant bonding difficulties. The Victorian Intergenerational Health Cohort Study (VIHCS, est. 2006) is following offspring born to participants in the Victorian Adolescent Health Cohort Study (VAHCS; est. 1992). VAHCS participants were assessed for depression and anxiety symptoms nine times during adolescence and young adulthood (age 14-29 years), and then contacted bi-annually (from age 29-35 years) to identify pregnancies. The Postpartum Bonding Questionnaire (PBQ) was administered to mothers at 2 and 12 months postpartum. A total of 395 women (606 infants) completed the 2-month and/or 12-month postpartum interviews. For most infants (64%), mothers had experienced depression and/or anxiety before pregnancy. Preconception depression and anxiety symptoms that persisted from adolescence into young adulthood predicted maternal-infant bonding problems at 2 months (β = 0.30, 95% CI 0.04, 0.55) and 12 months postpartum (β = 0.40, 95% CI 0.16, 0.63). Depression and anxiety symptoms occurring in young adulthood only also predicted bonding problems at 12 months postpartum (β = 0.37, 95% CI 0.02, 0.71). Associations between preconception depression and anxiety symptoms and anxiety-related maternal-infant bonding problems at 12 months postpartum remained after adjustment for antenatal and concurrent postpartum depressive symptoms. This study puts forward a case for extending preconception health care beyond contraception and nutrition to a broader engagement in supporting the mental health of young women from adolescence.