Methods for Dealing With Missing Covariate Data in Epigenome-Wide Association Studies.

Mills, Harriet L; Heron, Jon; Relton, Caroline; Suderman, Matt; Tilling, Kate

Am J Epidemiol; 188(11): 2021-2030, 2019 Nov 01.

Article in English | MEDLINE | ID: mdl-31504104

ABSTRACT

Multiple imputation (MI) is a well-established method for dealing with missing data. MI is computationally intensive when imputing missing covariates with high-dimensional outcome data (e.g., DNA methylation data in epigenome-wide association studies (EWAS)), because every outcome variable must be included in the imputation model to avoid biasing associations towards the null. Instead, EWAS analyses are reduced to only complete cases, limiting statistical power and potentially causing bias. We used simulations to compare 5 MI methods for high-dimensional data under 2 missingness mechanisms. All imputation methods had increased power over complete-case (C-C) analyses. Imputing missing values separately for each variable was computationally inefficient, but dividing sites at random into evenly sized bins improved efficiency and gave low bias. Methods imputing solely using subsets of sites identified by the C-C analysis suffered from bias towards the null. However, if these subsets were added into random bins of sites, this bias was reduced. The optimal methods were applied to an EWAS with missingness in covariates. All methods identified additional sites over the C-C analysis, and many of these sites had been replicated in other studies. These methods are also applicable to other high-dimensional data sets, including the rapidly expanding area of "-omics" studies.
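The random-bin idea described in the abstract can be illustrated with a minimal sketch. It covers only the partitioning step, not the imputation itself, and it is not the authors' implementation: the function names `random_bins` and `add_subset_to_bins`, the number of sites, the number of bins, and the example subset indices are all hypothetical choices made for illustration. Each resulting bin would then supply the outcome variables for one imputation model of the missing covariates.

```python
import numpy as np


def random_bins(n_sites, n_bins, seed=0):
    """Shuffle site indices and split them into n_bins near-equal bins.

    Hypothetical helper: n_sites and n_bins are illustrative parameters.
    """
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_sites)
    # array_split tolerates n_sites not being an exact multiple of n_bins
    return np.array_split(shuffled, n_bins)


def add_subset_to_bins(bins, subset):
    """Add a fixed subset of site indices (for example, sites flagged by a
    complete-case analysis) to every bin, dropping duplicates.

    Hypothetical helper: how the subset is chosen is not shown here.
    """
    subset = np.asarray(subset, dtype=int)
    # union1d returns the sorted, de-duplicated union of the two index sets
    return [np.union1d(b, subset) for b in bins]


# Illustrative sizes only: 1000 sites split into 10 bins of 100 sites each
bins = random_bins(n_sites=1000, n_bins=10)

# Illustrative subset of site indices, standing in for C-C-identified sites
augmented = add_subset_to_bins(bins, subset=[3, 42, 7])
```

Keeping the bins disjoint means each site enters exactly one imputation model, while adding the same subset to every bin corresponds to the variant in which the C-C-identified sites are added into the random bins.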

Subjects

Epidemiologic Studies; Epigenome; Genome-Wide Association Study; Humans

Keywords

Accessible Resource for Integrated Epigenomics Studies; Avon Longitudinal Study of Parents and Children; epigenetic data; imputation; missing data

Full text: 1 | Database: MEDLINE | Main subject: Epidemiologic Studies / Genome-Wide Association Study / Epigenome | Language: English | Publication year: 2019 | Document type: Article