The Counterfactual χ-GAN: Finding comparable cohorts in observational health data.

Averitt, Amelia J; Vanitchanant, Natnicha; Ranganath, Rajesh; Perotte, Adler J

Affiliation

Averitt AJ; Biomedical Informatics, Columbia University, New York, NY, United States. Electronic address: amelia.averitt@gmail.com.
Vanitchanant N; Biomedical Informatics, Columbia University, New York, NY, United States.
Ranganath R; Courant Institute, Center for Data Science, New York University, New York, NY, United States.
Perotte AJ; Biomedical Informatics, Columbia University, New York, NY, United States.

J Biomed Inform; 109: 103515, 2020 Sep.

Article in English | MEDLINE | ID: mdl-32771540

ABSTRACT

Causal inference often relies on the counterfactual framework, which requires that treatment assignment be independent of the outcome, a condition known as strong ignorability. Approaches to enforcing strong ignorability in causal analyses of observational data include weighting and matching methods. Effect estimates, such as the average treatment effect (ATE), are then computed as expectations under the re-weighted or matched distribution, P. The choice of P is important: it can affect both the interpretation and the variance of the effect estimate. In this work, instead of specifying P, we learn a distribution that simultaneously maximizes coverage and minimizes the variance of ATE estimates. To learn this distribution, we propose a generative adversarial network (GAN)-based model called the Counterfactual χ-GAN (cGAN), which also learns feature-balancing weights and supports unbiased causal estimation in the absence of unobserved confounding. Our model minimizes the Pearson χ²-divergence, which we show simultaneously maximizes coverage and minimizes the variance of importance sampling estimates. To our knowledge, this is the first such application of the Pearson χ²-divergence. We demonstrate the effectiveness of cGAN in achieving feature balance relative to established weighting methods in simulation and with real-world medical data.
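The link the abstract draws between the Pearson χ²-divergence and the variance of importance sampling estimates can be illustrated with a small sketch. It relies only on the standard identity that the χ²-divergence of a target from a sampling distribution equals the variance of the importance weights under the sampling distribution. The function names and the toy distributions below are hypothetical and chosen for illustration; this is not the authors' model or training objective, and the abstract does not state which direction of the divergence the paper uses.

```python
def chi2_divergence(p, q):
    """Pearson chi-square divergence chi2(p || q) for discrete distributions."""
    return sum((pi - qi) ** 2 / qi for pi, qi in zip(p, q))


def weight_variance(p, q):
    """Variance, under q, of the importance weights w = p / q."""
    weights = [pi / qi for pi, qi in zip(p, q)]
    # The mean weight under q is 1 whenever p sums to 1.
    mean_w = sum(qi * wi for qi, wi in zip(q, weights))
    return sum(qi * (wi - mean_w) ** 2 for qi, wi in zip(q, weights))


# Hypothetical toy distributions over four covariate strata (assumed values).
target = [0.25, 0.25, 0.25, 0.25]    # distribution we want expectations under
cohort_close = [0.3, 0.2, 0.3, 0.2]  # observed cohort similar to the target
cohort_far = [0.7, 0.1, 0.1, 0.1]    # observed cohort far from the target

for name, cohort in [("close", cohort_close), ("far", cohort_far)]:
    d = chi2_divergence(target, cohort)
    v = weight_variance(target, cohort)
    print(f"{name}: chi2 = {d:.4f}, weight variance = {v:.4f}")
```

In this toy setting the two quantities coincide for each cohort, and the cohort further from the target has the larger divergence and therefore the more variable weights, which is the intuition behind minimizing the divergence to stabilize weighted estimates.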

Subject(s)

Causality; Computer Simulation; Humans

Keywords

Causal inference; Deep learning; GANs; Health; Machine learning; Observational studies

Full text: 1 | Collection: 01-internacional | Database: MEDLINE | Main subject: Causality | Type of study: Diagnostic_studies / Prognostic_studies | Limits: Humans | Language: English | Journal: J Biomed Inform | Journal subject: Medical Informatics | Year: 2020 | Document type: Article