Results 1 - 20 of 107
1.
Proc Natl Acad Sci U S A ; 121(23): e2322376121, 2024 Jun 04.
Article in English | MEDLINE | ID: mdl-38809705

ABSTRACT

In this article, we develop CausalEGM, a deep learning framework for nonlinear dimension reduction and generative modeling of the dependency among covariate features affecting treatment and response. CausalEGM can be used for estimating causal effects in both binary and continuous treatment settings. By learning a bidirectional transformation between the high-dimensional covariate space and a low-dimensional latent space and then modeling the dependencies of different subsets of the latent variables on the treatment and response, CausalEGM can extract the latent covariate features that affect both treatment and response. By conditioning on these features, one can mitigate the confounding effect of the high-dimensional covariate on the estimation of the causal relation between treatment and response. In a series of experiments, the proposed method is shown to achieve superior performance over existing methods in both binary and continuous treatment settings. The improvement is substantial when the sample size is large and the covariate is of high dimension. Finally, we establish excess risk bounds and consistency results for our method, and discuss how our approach is related to and improves upon other dimension reduction approaches in causal inference.

2.
Biostatistics ; 24(2): 309-326, 2023 04 14.
Article in English | MEDLINE | ID: mdl-34382066

ABSTRACT

Scientists frequently generalize population level causal quantities such as average treatment effect from a source population to a target population. When the causal effects are heterogeneous, differences in subject characteristics between the source and target populations may make such a generalization difficult and unreliable. Reweighting or regression can be used to adjust for such differences when generalizing. However, these methods typically suffer from large variance if there is limited covariate distribution overlap between the two populations. We propose a generalizability score to address this issue. The score can be used as a yardstick to select target subpopulations for generalization. A simplified version of the score avoids using any outcome information and thus can prevent deliberate biases associated with inadvertent access to such information. Both simulation studies and real data analysis demonstrate convincing results for such selection.


Subjects
Research Design , Humans , Propensity Score , Computer Simulation , Causality , Bias
3.
Biostatistics ; 24(2): 518-537, 2023 04 14.
Article in English | MEDLINE | ID: mdl-34676400

ABSTRACT

Instrumental variable (IV) methods allow us the opportunity to address unmeasured confounding in causal inference. However, most IV methods are only applicable to discrete or continuous outcomes with very few IV methods for censored survival outcomes. In this article, we propose nonparametric estimators for the local average treatment effect on survival probabilities under both covariate-dependent and outcome-dependent censoring. We provide an efficient influence function-based estimator and a simple estimation procedure when the IV is either binary or continuous. The proposed estimators possess double-robustness properties and can easily incorporate nonparametric estimation using machine learning tools. In simulation studies, we demonstrate the flexibility and double robustness of our proposed estimators under various plausible scenarios. We apply our method to the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial for estimating the causal effect of screening on survival probabilities and investigate the causal contrasts between the two interventions under different censoring assumptions.


Subjects
Computer Simulation , Humans , Causality , Probability
4.
Biostatistics ; 24(4): 985-999, 2023 10 18.
Article in English | MEDLINE | ID: mdl-35791753

ABSTRACT

When evaluating the effectiveness of a treatment, policy, or intervention, the desired measure of efficacy may be expensive to collect, not routinely available, or may take a long time to occur. In these cases, it is sometimes possible to identify a surrogate outcome that can more easily, quickly, or cheaply capture the effect of interest. Theory and methods for evaluating the strength of surrogate markers have been well studied in the context of a single surrogate marker measured in the course of a randomized clinical study. However, methods are lacking for quantifying the utility of surrogate markers when the dimension of the surrogate grows. We propose a robust and efficient method for evaluating a set of surrogate markers that may be high-dimensional. Our method does not require treatment to be randomized and may be used in observational studies. Our approach draws on a connection between quantifying the utility of a surrogate marker and the most fundamental tools of causal inference, namely methods for robust estimation of the average treatment effect. This connection facilitates the use of modern methods for estimating treatment effects, using machine learning to estimate nuisance functions and relaxing the dependence on model specification. We demonstrate that our proposed approach performs well, demonstrate connections between our approach and certain mediation effects, and illustrate it by evaluating whether gene expression can be used as a surrogate for immune activation in an Ebola study.


Subjects
Models, Statistical , Humans , Biomarkers , Causality , Computer Simulation
5.
Biometrics ; 80(2)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38640436

ABSTRACT

Several epidemiological studies have provided evidence that long-term exposure to fine particulate matter (PM2.5) increases mortality rate. Furthermore, some population characteristics (e.g., age, race, and socioeconomic status) might play a crucial role in understanding vulnerability to air pollution. To inform policy, it is necessary to identify groups of the population that are more or less vulnerable to air pollution. In the causal inference literature, the group average treatment effect (GATE) is a distinctive facet of the conditional average treatment effect. This widely employed metric serves to characterize the heterogeneity of a treatment effect based on some population characteristics. In this paper, we introduce a novel Confounder-Dependent Bayesian Mixture Model (CDBMM) to characterize causal effect heterogeneity. More specifically, our method leverages the flexibility of the dependent Dirichlet process to model the distribution of the potential outcomes conditionally on the covariates and the treatment levels, thus enabling us to: (i) identify heterogeneous and mutually exclusive population groups defined by similar GATEs in a data-driven way, and (ii) estimate and characterize the causal effects within each of the identified groups. Through simulations, we demonstrate the effectiveness of our method in uncovering key insights about treatment effect heterogeneity. We apply our method to claims data from Medicare enrollees in Texas. We found six mutually exclusive groups where the causal effects of PM2.5 on mortality rate are heterogeneous.


Subjects
Air Pollutants , Air Pollution , United States/epidemiology , Air Pollutants/adverse effects , Air Pollutants/analysis , Bayes Theorem , Medicare , Air Pollution/adverse effects , Air Pollution/analysis , Particulate Matter/adverse effects , Particulate Matter/analysis , Environmental Exposure/adverse effects
6.
Biometrics ; 80(3)2024 Jul 01.
Article in English | MEDLINE | ID: mdl-39011739

ABSTRACT

Electronic health records and other sources of observational data are increasingly used for drawing causal inferences. The estimation of a causal effect using these data not meant for research purposes is subject to confounding and irregularly-spaced covariate-driven observation times affecting the inference. A doubly-weighted estimator accounting for these features has previously been proposed that relies on the correct specification of two nuisance models used for the weights. In this work, we propose a novel consistent multiply robust estimator and demonstrate analytically and in comprehensive simulation studies that it is more flexible and more efficient than the only alternative estimator proposed for the same setting. It is further applied to data from the Add Health study in the United States to estimate the causal effect of therapy counseling on alcohol consumption in American adolescents.


Subjects
Computer Simulation , Models, Statistical , Observational Studies as Topic , Humans , Observational Studies as Topic/statistics & numerical data , Adolescent , Causality , United States , Data Interpretation, Statistical , Electronic Health Records/statistics & numerical data , Biometry/methods , Alcohol Drinking
7.
Stat Med ; 43(8): 1640-1659, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38351516

ABSTRACT

The regression discontinuity (RD) design is a widely utilized approach for assessing treatment effects. It involves assigning treatment based on the value of an observed covariate in relation to a fixed threshold. Although the RD design has been widely employed across various problems, its application to specific data types has received limited attention. For instance, there has been little research on utilizing the RD design when the outcome variable exhibits zero-inflation. This study introduces a novel RD estimator using local likelihood, which overcomes the limitations of the local linear regression model, a popular approach for estimating treatment effects in RD design, by considering the data type of the outcome variable. To determine the optimal bandwidth, we propose a modified Ludwig-Miller cross validation method. A set of simulations is carried out, involving binary, count, and zero-inflated outcome variables, to showcase the superior performance of the suggested method over local linear regression models. Subsequently, the proposed local likelihood model is employed on HIV care data, where antiretroviral therapy eligibility is determined by a CD4 count threshold. A comparison is made between the results obtained using the local likelihood model and those obtained using local linear regression.
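
For orientation, the local linear regression baseline that the abstract compares against can be sketched as follows. This is only a minimal illustration of a sharp RD estimate, not the local likelihood estimator proposed in the article. The function name `local_linear_rd`, the triangular kernel, and the fixed user-supplied bandwidth are assumptions made for illustration; no bandwidth selection is performed.

```python
import numpy as np

def local_linear_rd(x, y, cutoff, bandwidth):
    """Sharp RD estimate: difference of two local linear fits at the cutoff.

    Hypothetical helper for illustration; x is the running variable,
    y the outcome, and units with x >= cutoff are treated.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    def fit_at_cutoff(mask):
        xc = x[mask] - cutoff
        # Triangular kernel weights; zero outside the bandwidth.
        w = np.clip(1.0 - np.abs(xc) / bandwidth, 0.0, None)
        keep = w > 0
        design = np.column_stack([np.ones(keep.sum()), xc[keep]])
        sw = np.sqrt(w[keep])
        # Weighted least squares via rescaling rows by sqrt(weight).
        beta, *_ = np.linalg.lstsq(design * sw[:, None], y[mask][keep] * sw, rcond=None)
        return beta[0]  # intercept = fitted value at the cutoff

    return fit_at_cutoff(x >= cutoff) - fit_at_cutoff(x < cutoff)
```

A linear fit of this kind ignores the data type of the outcome, which is the limitation the abstract says the local likelihood approach is meant to address for binary, count, and zero-inflated outcomes.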


Subjects
Anti-HIV Agents , HIV Infections , Humans , South Africa , Anti-HIV Agents/therapeutic use , HIV Infections/drug therapy , Linear Models , Research Design
8.
BMC Med Res Methodol ; 24(1): 122, 2024 Jun 03.
Article in English | MEDLINE | ID: mdl-38831393

ABSTRACT

BACKGROUND: Two propensity score (PS) based balancing covariate methods, the overlap weighting method (OW) and the fine stratification method (FS), produce superb covariate balance. OW has been compared with various weighting methods while FS has been compared with the traditional stratification method and various matching methods. However, no study has yet compared OW and FS. In addition, OW has not yet been evaluated in large claims data with low prevalence exposure and with low frequency outcomes, a context in which optimal use of balancing methods is critical. In the study, we aimed to compare OW and FS using real-world data and simulations with low prevalence exposure and with low frequency outcomes. METHODS: We used the Texas State Medicaid claims data on adult beneficiaries with diabetes in 2012 as an empirical example (N = 42,628). Based on its real-world research question, we estimated an average treatment effect of health center vs. non-health center attendance in the total population. We also performed simulations to evaluate their relative performance. To preserve associations between covariates, we used the plasmode approach to simulate outcomes and/or exposures with N = 4,000. We simulated both homogeneous and heterogeneous treatment effects with various outcome risks (1-30% or observed: 27.75%) and/or exposure prevalence (2.5-30% or observed: 10.55%). We used a weighted generalized linear model to estimate the exposure effect and the cluster-robust standard error (SE) method to estimate its SE. RESULTS: In the empirical example, we found that OW had smaller standardized mean differences in all covariates (range: OW: 0.0-0.02 vs. FS: 0.22-3.26) and Mahalanobis balance distance (MB) (< 0.001 vs. > 0.049) than FS. In simulations, OW also achieved smaller MB (homogeneity: < 0.04 vs. > 0.04; heterogeneity: 0.0-0.11 vs. 0.07-0.29), relative bias (homogeneity: 4.04-56.20 vs. 20-61.63; heterogeneity: 7.85-57.6 vs. 15.0-60.4), square root of mean squared error (homogeneity: 0.332-1.308 vs. 0.385-1.365; heterogeneity: 0.263-0.526 vs. 0.313-0.620), and coverage probability (homogeneity: 0.0-80.4% vs. 0.0-69.8%; heterogeneity: 0.0-97.6% vs. 0.0-92.8%), than FS, in most cases. CONCLUSIONS: These findings suggest that OW can yield nearly perfect covariate balance and therefore enhance the accuracy of average treatment effect estimation in the total population.
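
As a rough illustration of the two quantities discussed above, the sketch below computes overlap weights from given propensity scores and a weighted standardized mean difference for one covariate. It assumes the standard definition of overlap weights (1 - PS for exposed units, PS for unexposed units). The function names and the pooled unweighted standard deviation in the denominator are illustrative choices, not taken from the article.

```python
import numpy as np

def overlap_weights(treated, ps):
    """Overlap weights: 1 - e(x) for treated units, e(x) for controls."""
    treated = np.asarray(treated, dtype=bool)
    ps = np.asarray(ps, dtype=float)
    return np.where(treated, 1.0 - ps, ps)

def weighted_smd(x, treated, weights):
    """Weighted standardized mean difference for a single covariate."""
    x = np.asarray(x, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    w = np.asarray(weights, dtype=float)
    m1 = np.average(x[treated], weights=w[treated])
    m0 = np.average(x[~treated], weights=w[~treated])
    # Pooled unweighted SD: one common choice of denominator (an assumption here).
    s = np.sqrt((x[treated].var(ddof=1) + x[~treated].var(ddof=1)) / 2.0)
    return (m1 - m0) / s
```

Units with propensity scores near 0.5 receive the most weight under this scheme, so the weighted sample emphasizes the region where exposed and unexposed units overlap.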


Subjects
Propensity Score , Humans , Male , Female , United States , Adult , Middle Aged , Texas/epidemiology , Diabetes Mellitus/epidemiology , Medicaid/statistics & numerical data , Computer Simulation , Insurance Claim Review/statistics & numerical data
9.
Biometrics ; 79(4): 3179-3190, 2023 12.
Article in English | MEDLINE | ID: mdl-36645231

ABSTRACT

In this paper, we focus on estimating the average treatment effect (ATE) of a target population when individual-level data from a source population and summary-level data (e.g., first or second moments of certain covariates) from the target population are available. In the presence of the heterogeneous treatment effect, the ATE of the target population can be different from that of the source population when distributions of treatment effect modifiers are dissimilar in these two populations, a phenomenon also known as covariate shift. Many methods have been developed to adjust for covariate shift, but most require individual covariates from a representative target sample. We develop a weighting approach based on the summary-level information from the target sample to adjust for possible covariate shift in effect modifiers. In particular, weights of the treated and control groups within a source sample are calibrated by the summary-level information of the target sample. Our approach also seeks additional covariate balance between the treated and control groups in the source sample. We study the asymptotic behavior of the corresponding weighted estimator for the target population ATE under a wide range of conditions. The theoretical implications are confirmed in simulation studies and a real-data application.


Subjects
Entropy , Computer Simulation , Causality , Propensity Score
10.
Stat Med ; 42(1): 33-51, 2023 01 15.
Article in English | MEDLINE | ID: mdl-36336460

ABSTRACT

In observational studies, causal inference relies on several key identifying assumptions. One identifiability condition is the positivity assumption, which requires the probability of treatment be bounded away from 0 and 1. That is, for every covariate combination, it should be possible to observe both treated and control subjects; the covariate distributions should overlap between treatment arms. If the positivity assumption is violated, population-level causal inference necessarily involves some extrapolation. Ideally, a greater amount of uncertainty about the causal effect estimate should be reflected in such situations. With that goal in mind, we construct a Gaussian process model for estimating treatment effects in the presence of practical violations of positivity. Advantages of our method include minimal distributional assumptions, a cohesive model for estimating treatment effects, and more uncertainty associated with areas in the covariate space where there is less overlap. We assess the performance of our approach with respect to bias and efficiency using simulation studies. The method is then applied to a study of critically ill female patients to examine the effect of undergoing right heart catheterization.


Subjects
Models, Statistical , Humans , Female , Probability , Computer Simulation , Bias
11.
Stat Med ; 42(10): 1542-1564, 2023 05 10.
Article in English | MEDLINE | ID: mdl-36815690

ABSTRACT

Linkage between drug claims data and clinical outcome allows a data-driven experimental approach to drug repurposing. We develop an estimation procedure based on generalized random forests for estimation of time-point specific average treatment effects in a time-to-event setting with competing risks. To handle right-censoring, we propose a two-step procedure for estimation, applying inverse probability weighting to construct time-point specific weighted outcomes as input for the generalized random forest. The generalized random forests adaptively handle covariate effects on the treatment assignment by applying a splitting rule that targets a causal parameter. Using simulated data we demonstrate that the method is effective for a causal search through a list of treatments to be ranked according to the magnitude of their effect on clinical outcome. We illustrate the method using the Danish national health registries where it is of interest to discover drugs with an unexpected protective effect against relapse of severe depression.


Subjects
Random Forest , Humans , Probability
12.
Stat Med ; 42(11): 1760-1778, 2023 05 20.
Article in English | MEDLINE | ID: mdl-36863006

ABSTRACT

Matching is a popular design for inferring causal effects with observational data. Unlike model-based approaches, it is a nonparametric method that groups treated and control subjects with similar characteristics together, thereby re-creating a randomization-like scenario. The application of matched design to real-world data may be limited by: (1) the causal estimand of interest; (2) the sample size of different treatment arms. We propose a flexible design of matching, based on the idea of template matching, to overcome these challenges. It first identifies the template group which is representative of the target population, then matches subjects from the original data to this template group and makes inferences. We provide theoretical justification of how it unbiasedly estimates the average treatment effect using matched pairs and the average treatment effect on the treated when the treatment group has a bigger sample size. We also propose using the triplet matching algorithm to improve matching quality and devise a practical strategy to select the template size. One major advantage of matched design is that it allows either randomization-based or model-based inference, with the former being more robust. For the commonly used binary outcome in medical research, we adopt a randomization inference framework of attributable effects in matched data, which allows heterogeneous effects and can incorporate sensitivity analysis for unmeasured confounding. We apply our design and analytical strategy to a trauma care evaluation study.


Subjects
Biomedical Research , Observational Studies as Topic , Humans , Algorithms , Causality , Research Design , Sample Size
13.
Stat Med ; 42(21): 3764-3785, 2023 09 20.
Article in English | MEDLINE | ID: mdl-37339777

ABSTRACT

Cluster randomized trials (CRTs) are studies where treatment is randomized at the cluster level but outcomes are typically collected at the individual level. When CRTs are employed in pragmatic settings, baseline population characteristics may moderate treatment effects, leading to what is known as heterogeneous treatment effects (HTEs). Pre-specified, hypothesis-driven HTE analyses in CRTs can enable an understanding of how interventions may impact subpopulation outcomes. While closed-form sample size formulas have recently been proposed, assuming known intracluster correlation coefficients (ICCs) for both the covariate and outcome, guidance on optimal cluster randomized designs to ensure maximum power with pre-specified HTE analyses has not yet been developed. We derive new design formulas to determine the cluster size and number of clusters to achieve the locally optimal design (LOD) that minimizes variance for estimating the HTE parameter given a budget constraint. Given the LODs are based on covariate and outcome-ICC values that are usually unknown, we further develop the maximin design for assessing HTE, identifying the combination of design resources that maximize the relative efficiency of the HTE analysis in the worst case scenario. In addition, given the analysis of the average treatment effect is often of primary interest, we also establish optimal designs to accommodate multiple objectives by combining considerations for studying both the average and heterogeneous treatment effects. We illustrate our methods using the context of the Kerala Diabetes Prevention Program CRT, and provide an R Shiny app to facilitate calculation of optimal designs under a wide range of design parameters.


Subjects
Research Design , Humans , Cluster Analysis , Sample Size , Randomized Controlled Trials as Topic
14.
Stat Med ; 42(15): 2637-2660, 2023 07 10.
Article in English | MEDLINE | ID: mdl-37012676

ABSTRACT

Most propensity score (PS) analysis methods rely on a correctly specified parametric PS model, which may result in biased estimation of the average treatment effect (ATE) when the model is misspecified. More flexible nonparametric models for treatment assignment alleviate this issue, but they do not always guarantee covariate balance. Methods that force balance in the means of covariates and their transformations between the treatment groups, termed global balance in this article, do not always lead to unbiased estimation of ATE. Their estimated propensity scores only ensure global balance but not the balancing property, which is defined as the conditional independence between treatment assignment and covariates given the propensity score. The balancing property implies not only global balance but also local balance, that is, the mean balance of covariates in propensity score stratified sub-populations. Local balance implies global balance, but the reverse is false. We propose the propensity score with local balance (PSLB) methodology, which incorporates nonparametric propensity score models and optimizes local balance. Extensive numerical studies showed that the proposed method can substantially outperform existing methods that estimate the propensity score by optimizing global balance, when the model is misspecified. The proposed method is implemented in the R package PSLB.


Subjects
Models, Statistical , Humans , Propensity Score , Computer Simulation
15.
BMC Med Res Methodol ; 23(1): 297, 2023 12 15.
Article in English | MEDLINE | ID: mdl-38102563

ABSTRACT

BACKGROUND: Across studies of average treatment effects, some population subgroups consistently have lower representation than others, which can lead to discrepancies in how well results generalize. METHODS: We develop a framework for quantifying inequity due to systemic disparities in sample representation and a method for mitigation during data analysis. Assuming subgroup treatment effects are exchangeable, an unbiased sample average treatment effect estimator will have lower mean-squared error, on average across studies, for subgroups with less representation when treatment effects vary. We present a method for estimating average treatment effects in representation-adjusted samples which enables subgroups to optimally leverage information from the full sample rather than only their own subgroup's data. Two approaches for specifying representation adjustment are offered: one minimizes average mean-squared error for each subgroup separately and the other balances minimization of mean-squared error and equal representation. We conduct simulation studies to compare the performance of the proposed estimators to several subgroup-specific estimators. RESULTS: We find that the proposed estimators generally provide lower mean squared error, particularly for smaller subgroups, relative to the other estimators. As a case study, we apply this method to a subgroup analysis from a published study. CONCLUSIONS: We recommend the use of the proposed estimators to mitigate the impact of disparities in representation, though structural change is ultimately needed.


Subjects
Models, Statistical , Humans , Computer Simulation
16.
BMC Med Res Methodol ; 23(1): 231, 2023 10 11.
Article in English | MEDLINE | ID: mdl-37821829

ABSTRACT

BACKGROUND: In observational studies, double robust or multiply robust (MR) approaches provide more protection from model misspecification than inverse probability weighting and g-computation for estimating the average treatment effect (ATE). However, the approaches are based on parametric models, leading to biased estimates when all models are incorrectly specified. Nonparametric methods, such as machine learning or nonparametric double robust approaches, are robust to model misspecification, but the efficiency of nonparametric methods is low. METHOD: In the study, we proposed an improved MR method combining parametric and nonparametric models based on the previous MR method (Han, JASA 109(507):1159-73, 2014) to improve the robustness to model misspecification and the efficiency. We performed comprehensive simulations to evaluate the performance of the proposed method. RESULTS: Our simulation study showed that the MR estimators with only outcome regression (OR) models, where one of the models was a nonparametric model, were the most recommended because of the robustness to model misspecification and the lowest root mean square error (RMSE) when including a correct parametric OR model. The performance of the recommended estimators was comparable even if all parametric models were misspecified. As an application, the proposed method was used to estimate the effect of social activity on depression levels in the China Health and Retirement Longitudinal Study dataset. CONCLUSIONS: The proposed estimator with nonparametric and parametric models is more robust to model misspecification.
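
To make the notion of a double robust estimator concrete, the sketch below implements the textbook augmented inverse probability weighting (AIPW) estimate of the ATE from already-fitted nuisance values. It is not the multiply robust method proposed in the article. The function name `aipw_ate` and its argument names are hypothetical, and fitting the propensity and outcome models is assumed to happen elsewhere.

```python
import numpy as np

def aipw_ate(y, a, ps, mu1, mu0):
    """Textbook AIPW (double robust) estimate of the ATE.

    y: observed outcomes; a: binary treatment indicator (0/1);
    ps: fitted propensity scores P(A=1 | X), strictly between 0 and 1;
    mu1, mu0: fitted outcome regressions E[Y | A=1, X] and E[Y | A=0, X].
    """
    y, a, ps, mu1, mu0 = (np.asarray(v, dtype=float) for v in (y, a, ps, mu1, mu0))
    # Outcome-model prediction plus an inverse-probability-weighted residual correction.
    term1 = mu1 + a * (y - mu1) / ps
    term0 = mu0 + (1.0 - a) * (y - mu0) / (1.0 - ps)
    return float(np.mean(term1 - term0))
```

The estimate is consistent if either the propensity model or the outcome model is correctly specified, which is the "more protection from model misspecification" referred to in the background.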


Subjects
Machine Learning , Models, Statistical , Humans , Longitudinal Studies , Computer Simulation , Probability
17.
Eur J Epidemiol ; 38(2): 123-133, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36626100

ABSTRACT

Most work on extending (generalizing or transporting) inferences from a randomized trial to a target population has focused on estimating average treatment effects (i.e., averaged over the target population's covariate distribution). Yet, in the presence of strong effect modification by baseline covariates, the average treatment effect in the target population may be less relevant for guiding treatment decisions. Instead, the conditional average treatment effect (CATE) as a function of key effect modifiers may be a more useful estimand. Recent work on estimating target population CATEs using baseline covariate, treatment, and outcome data from the trial and covariate data from the target population only allows for the examination of heterogeneity over distinct subgroups. We describe flexible pseudo-outcome regression modeling methods for estimating target population CATEs conditional on discrete or continuous baseline covariates when the trial is embedded in a sample from the target population (i.e., in nested trial designs). We construct pointwise confidence intervals for the CATE at a specific value of the effect modifiers and uniform confidence bands for the CATE function. Last, we illustrate the methods using data from the Coronary Artery Surgery Study (CASS) to estimate CATEs given history of myocardial infarction and baseline ejection fraction value in the target population of all trial-eligible patients with stable ischemic heart disease.


Subjects
Myocardial Infarction , Humans , Regression Analysis , Research Design
18.
J Biomed Inform ; 137: 104256, 2023 01.
Article in English | MEDLINE | ID: mdl-36455806

ABSTRACT

Big data and (deep) machine learning have been ambitious tools in digital medicine, but these tools focus mainly on association. Intervention in medicine is about the causal effects. The average treatment effect has long been studied as a measure of causal effect, assuming that all populations have the same effect size. However, no "one-size-fits-all" treatment seems to work in some complex diseases. Treatment effects may vary by patient. Estimating heterogeneous treatment effects (HTE) may have a high impact on developing personalized treatment. Lots of advanced machine learning models for estimating HTE have emerged in recent years, but there has been limited translational research into the real-world healthcare domain. To fill the gap, we reviewed and compared eleven recent HTE estimation methodologies, including meta-learner, representation learning models, and tree-based models. We performed a comprehensive benchmark experiment based on nationwide healthcare claim data with application to Alzheimer's disease drug repurposing. We provided some challenges and opportunities in HTE estimation analysis in the healthcare domain to close the gap between innovative HTE models and deployment to real-world healthcare problems.


Subjects
Benchmarking , Machine Learning , Humans , Randomized Controlled Trials as Topic , Causality
19.
Clin Trials ; 20(6): 661-669, 2023 12.
Article in English | MEDLINE | ID: mdl-37439089

ABSTRACT

BACKGROUND: Recent work has shown that cluster-randomised trials can estimate two distinct estimands: the participant-average and cluster-average treatment effects. These can differ when participant outcomes or the treatment effect depends on the cluster size (termed informative cluster size). In this case, estimators that target one estimand (such as the analysis of unweighted cluster-level summaries, which targets the cluster-average effect) may be biased for the other. Furthermore, commonly used estimators such as mixed-effects models or generalised estimating equations with an exchangeable correlation structure can be biased for both estimands. However, there has been little empirical research into whether informative cluster size is likely to occur in practice. METHOD: We re-analysed a cluster-randomised trial comparing two different thresholds for red blood cell transfusion in patients with acute upper gastrointestinal bleeding to explore whether estimates for the participant- and cluster-average effects differed, to provide empirical evidence for whether informative cluster size may be present. For each outcome, we first estimated a participant-average effect using independence estimating equations, which are unbiased under informative cluster size. We then compared this to two further methods: (1) a cluster-average effect estimated using either weighted independence estimating equations or unweighted cluster-level summaries, and (2) estimates from a mixed-effects model or generalised estimating equations with an exchangeable correlation structure. We then performed a small simulation study to evaluate whether observed differences between cluster- and participant-average estimates were likely to occur even if no informative cluster size was present. RESULTS: For most outcomes, treatment effect estimates from different methods were similar. However, differences of >10% occurred between participant- and cluster-average estimates for 5 of 17 outcomes (29%). 
We also observed several notable differences between estimates from mixed-effects models or generalised estimating equations with an exchangeable correlation structure and those based on independence estimating equations. For example, for the EQ-5D VAS score, the independence estimating equation estimate of the participant-average difference was 4.15 (95% confidence interval: -3.37 to 11.66), compared with 2.84 (95% confidence interval: -7.37 to 13.04) for the cluster-average independence estimating equation estimate, and 3.23 (95% confidence interval: -6.70 to 13.16) from a mixed-effects model. Similarly, for thromboembolic/ischaemic events, the independence estimating equation estimate for the participant-average odds ratio was 0.43 (95% confidence interval: 0.07 to 2.48), compared with 0.33 (95% confidence interval: 0.06 to 1.77) from the cluster-average estimator. CONCLUSION: In this re-analysis, we found that estimates from the various approaches could differ, which may be due to the presence of informative cluster size. Careful consideration of the estimand and the plausibility of assumptions underpinning each estimator can help ensure appropriate analysis methods are used. Independence estimating equations and the analysis of cluster-level summaries (with appropriate weighting for each to correspond to either the participant-average or cluster-average treatment effect) are a desirable choice when informative cluster size is deemed possible, due to their unbiasedness in this setting.
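
The distinction between the two estimands comes down to how outcomes are weighted. The sketch below computes both averages for the outcomes in a single trial arm. The function name and the toy data in the usage note are illustrative assumptions, and a treatment effect on either scale would be the difference of the corresponding averages between arms.

```python
import numpy as np

def participant_and_cluster_average(cluster_ids, outcomes):
    """Return (participant-average, cluster-average) of the outcomes."""
    cluster_ids = np.asarray(cluster_ids)
    outcomes = np.asarray(outcomes, dtype=float)
    # Participant-average: every participant carries equal weight.
    participant_avg = float(outcomes.mean())
    # Cluster-average: every cluster carries equal weight, whatever its size.
    cluster_means = [outcomes[cluster_ids == c].mean() for c in np.unique(cluster_ids)]
    cluster_avg = float(np.mean(cluster_means))
    return participant_avg, cluster_avg
```

With one cluster of four participants who all score 1 and one cluster containing a single participant who scores 6, the participant-average is 2.0 and the cluster-average is 3.5. The two differ here because the outcome is related to cluster size, which is what informative cluster size refers to.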


Subjects
Research Design , Humans , Cluster Analysis , Computer Simulation , Sample Size , Odds Ratio
20.
Multivariate Behav Res ; 58(3): 467-483, 2023.
Article in English | MEDLINE | ID: mdl-35617441

ABSTRACT

We adopt a causal inference perspective to shed light on which ANOVA type of sums of squares (SS) should be used for testing main effects and whether main effects should be considered at all in the presence of interactions. We consider balanced, proportional and nonorthogonal designs, and models with and without interactions. When the design is balanced, we show that the average treatment effect is estimated by the main effects obtained by type I, II, and III sums of squares. In proportional designs, we show that the average treatment effect is estimated by the type I and type II main effects, whereas type III SS yield biased estimates of the average treatment effect if there are interactions. When the design is nonorthogonal, ANOVA type I is always highly biased and ANOVA type II and III main effects are biased if there are interactions. We include a simulation study to illustrate the magnitude of the bias in estimating the average treatment effect across a variety of conditions, and provide recommendations for applied researchers.


Subjects
Bias , Computer Simulation , Causality , Analysis of Variance