External control arm analysis: an evaluation of propensity score approaches, G-computation, and doubly debiased machine learning.

Loiseau, Nicolas; Trichelair, Paul; He, Maxime; Andreux, Mathieu; Zaslavskiy, Mikhail; Wainrib, Gilles; Blum, Michael G B

Affiliation

Loiseau N; Owkin France, Paris, France. nicolas.loiseau@owkin.com.
Trichelair P; Owkin France, Paris, France.
He M; Owkin France, Paris, France.
Andreux M; Owkin France, Paris, France.
Zaslavskiy M; Owkin France, Paris, France.
Wainrib G; Owkin France, Paris, France.
Blum MGB; Owkin France, Paris, France.

BMC Med Res Methodol; 22(1): 335, 2022 Dec 28.

Article in English | MEDLINE | ID: mdl-36577946

ABSTRACT

BACKGROUND:

An external control arm is a cohort of control patients drawn from data external to a single-arm trial. To provide an unbiased estimate of efficacy, the clinical profiles of patients in the single arm and the external arm should be aligned, typically using propensity score approaches. Alternative approaches infer efficacy by comparing the outcomes of single-arm patients with machine-learning predictions of control patient outcomes. These methods include G-computation and Doubly Debiased Machine Learning (DDML), and they have not been sufficiently evaluated for External Control Arm (ECA) analysis.
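The outcome-prediction idea described above can be illustrated with a minimal G-computation sketch. It is not the authors' implementation: the data are simulated, the outcome model is a one-covariate linear regression rather than a machine-learning model, and all names (`fit_line`, `g_computation_att`, `TRUE_EFFECT`) are hypothetical.

```python
import random
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x with a single covariate."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b


def g_computation_att(x_trial, y_trial, x_ext, y_ext):
    """Effect in the single-arm population via outcome prediction.

    1. Fit an outcome model on the external control patients only.
    2. Predict the control outcome of every single-arm patient.
    3. Average observed-minus-predicted outcomes over the single arm.
    """
    a, b = fit_line(x_ext, y_ext)
    predicted_control = [a + b * xi for xi in x_trial]
    return mean(yt - yc for yt, yc in zip(y_trial, predicted_control))


# Simulated data (assumption): one baseline covariate drives the outcome,
# and single-arm patients have a higher covariate mean than external controls.
rng = random.Random(0)
TRUE_EFFECT = -1.0
n = 1000
x_trial = [rng.gauss(1.0, 1.0) for _ in range(n)]
x_ext = [rng.gauss(0.0, 1.0) for _ in range(n)]
y_trial = [2.0 * x + TRUE_EFFECT + rng.gauss(0.0, 1.0) for x in x_trial]
y_ext = [2.0 * x + rng.gauss(0.0, 1.0) for x in x_ext]

naive = mean(y_trial) - mean(y_ext)  # ignores the covariate shift
adjusted = g_computation_att(x_trial, y_trial, x_ext, y_ext)
print(f"naive: {naive:.2f}  G-computation: {adjusted:.2f}  truth: {TRUE_EFFECT}")
```

In this toy setting the naive difference in means is confounded by the covariate shift between the two cohorts, whereas the G-computation estimate should land close to the simulated effect, provided the outcome model is correctly specified.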

METHODS:

We use both numerical simulations and a trial replication procedure to evaluate four statistical approaches: propensity score matching, Inverse Probability of Treatment Weighting (IPTW), G-computation, and DDML. The replication study relies on five type 2 diabetes randomized clinical trials, access to which was granted by the Yale University Open Data Access (YODA) project. From this pool of five trials, observational experiments are artificially built by replacing the control arm of one trial with an arm that originates from another trial and contains similarly treated patients.
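As a rough illustration of the propensity score side of this comparison, the sketch below applies IPTW to simulated data. It is an assumption-laden toy, not the study's pipeline: the propensity model is a one-covariate logistic regression fitted by Newton-Raphson, external controls are weighted by the odds of trial membership so that they resemble the single-arm population, and all names (`fit_logistic`, `iptw_att`, `TRUE_EFFECT`) are hypothetical.

```python
import math
import random
from statistics import mean


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, t, n_iter=25):
    """Logistic regression P(t = 1 | x) = sigmoid(a + b * x) by Newton-Raphson."""
    a = b = 0.0
    for _ in range(n_iter):
        p = [sigmoid(a + b * xi) for xi in x]
        # Gradient of the log-likelihood.
        g0 = sum(ti - pi for ti, pi in zip(t, p))
        g1 = sum((ti - pi) * xi for ti, pi, xi in zip(t, p, x))
        # Entries of the 2x2 information matrix X' W X.
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def iptw_att(x_trial, y_trial, x_ext, y_ext):
    """Weight external controls by the odds e(x) / (1 - e(x)) of being in the trial."""
    x_all = x_trial + x_ext
    t_all = [1] * len(x_trial) + [0] * len(x_ext)
    a, b = fit_logistic(x_all, t_all)
    e = [sigmoid(a + b * xi) for xi in x_ext]
    odds = [ei / (1.0 - ei) for ei in e]
    weighted_control = sum(w * y for w, y in zip(odds, y_ext)) / sum(odds)
    return mean(y_trial) - weighted_control, b


# Simulated data (assumption): moderate covariate shift between the cohorts.
rng = random.Random(0)
TRUE_EFFECT = -1.0
n = 1000
x_trial = [rng.gauss(0.5, 1.0) for _ in range(n)]
x_ext = [rng.gauss(0.0, 1.0) for _ in range(n)]
y_trial = [2.0 * x + TRUE_EFFECT + rng.gauss(0.0, 1.0) for x in x_trial]
y_ext = [2.0 * x + rng.gauss(0.0, 1.0) for x in x_ext]

naive_estimate = mean(y_trial) - mean(y_ext)
iptw_estimate, slope = iptw_att(x_trial, y_trial, x_ext, y_ext)
print(f"naive: {naive_estimate:.2f}  IPTW: {iptw_estimate:.2f}  truth: {TRUE_EFFECT}")
```

Weighting by the odds targets the effect in the single-arm population; other weighting schemes target other estimands, and the abstract does not state which variant the study used.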

RESULTS:

Among the statistical approaches compared, numerical simulations show that DDML has the smallest bias, followed by G-computation. G-computation usually achieves the lowest mean squared error (MSE). The MSE of DDML relative to the other methods varies and improves with increasing sample size. For hypothesis testing, all methods control type I error, and DDML is the most conservative. G-computation has the highest statistical power; DDML has comparable power at [Formula see text] but lower power for smaller sample sizes. The replication procedure also indicates that G-computation minimizes MSE, whereas the performance of DDML is intermediate between that of G-computation and the propensity score approaches. The confidence intervals of G-computation are the narrowest, whereas those obtained with DDML are the widest for small sample sizes, which confirms its conservative nature.

CONCLUSIONS:

For external control arm analyses, methods based on outcome prediction models can reduce estimation error and increase statistical power compared to propensity score approaches.

Subjects

Diabetes Mellitus, Type 2; Humans; Bias; Computer Simulation; Diabetes Mellitus, Type 2/therapy; Machine Learning; Propensity Score; Research Design; Randomized Controlled Trials as Topic

Keywords

Average treatment effect; Confounding variables; Counterfactual; Doubly robust; Observational study; Propensity score; Replication study

Full text: 1 | Thematic axes: Clinical research | Database: MEDLINE | Main subject: Diabetes Mellitus, Type 2 | Type of study: Clinical trials / Prognostic studies | Limits: Humans | Language: English | Publication year: 2022 | Document type: Article
