Multiple imputation of semi-continuous exposure variables that are categorized for analysis.
Nguyen, Cattram D; Moreno-Betancur, Margarita; Rodwell, Laura; Romaniuk, Helena; Carlin, John B; Lee, Katherine J.
Affiliation
  • Nguyen CD; Clinical Epidemiology and Biostatistics Unit, Murdoch Children's Research Institute, Parkville, Victoria, Australia.
  • Moreno-Betancur M; Department of Paediatrics, Faculty of Medicine, Dentistry and Health Sciences, The University of Melbourne, Melbourne, Victoria, Australia.
  • Rodwell L; Clinical Epidemiology and Biostatistics Unit, Murdoch Children's Research Institute, Parkville, Victoria, Australia.
  • Romaniuk H; Department of Paediatrics, Faculty of Medicine, Dentistry and Health Sciences, The University of Melbourne, Melbourne, Victoria, Australia.
  • Carlin JB; Clinical Epidemiology and Biostatistics Unit, Murdoch Children's Research Institute, Parkville, Victoria, Australia.
  • Lee KJ; Department of Paediatrics, Faculty of Medicine, Dentistry and Health Sciences, The University of Melbourne, Melbourne, Victoria, Australia.
Stat Med; 40(27): 6093-6106, 2021 Nov 30.
Article in English | MEDLINE | ID: mdl-34423450
ABSTRACT
Semi-continuous variables are characterized by a point mass at one value and a continuous range of values for remaining observations. An example is alcohol consumption quantity, with a spike of zeros representing non-drinkers and positive values for drinkers. If multiple imputation is used to handle missing values for semi-continuous variables, it is unclear how this should be implemented within the standard approaches of fully conditional specification (FCS) and multivariate normal imputation (MVNI). This question is brought into focus by the use of categorized versions of semi-continuous exposure variables in analyses (eg, no drinking, drinking below binge level, binge drinking, heavy binge drinking), raising the question of how best to achieve congeniality between imputation and analysis models. We performed a simulation study comparing nine approaches for imputing semi-continuous exposures requiring categorization for analysis. Three methods imputed the categories directly: ordinal logistic regression, and imputation of binary indicator variables representing the categories using MVNI (with two variants). Six methods (predictive mean matching, zero-inflated binomial imputation, and two-part imputation methods with variants in FCS and MVNI) imputed the semi-continuous variable, with categories derived after imputation. The ordinal and zero-inflated binomial methods had good performance across most scenarios, while MVNI methods requiring rounding after imputation did not perform well. There were mixed results for predictive mean matching and the two-part methods, depending on whether the estimands were proportions or regression coefficients. The results highlight the need to consider the parameter of interest when selecting an imputation procedure.
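To make the "impute the semi-continuous variable, then derive categories" idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a deliberately simplified two-part draw with no covariates: a Bernoulli draw decides zero versus positive, and positive values are drawn from a log-normal distribution fitted to the observed positives. The cut-points `BINGE` and `HEAVY_BINGE`, the function names, and the example data are all hypothetical. The sketch also ignores parameter uncertainty, so it is not a proper multiple imputation procedure; in practice the two parts would be regression models conditional on other variables.

```python
import math
import random
import statistics

# Hypothetical cut-points for illustration only (not taken from the paper).
BINGE = 5.0
HEAVY_BINGE = 20.0


def categorize(x):
    """Map a non-negative semi-continuous value to an analysis category."""
    if x == 0:
        return "no drinking"
    if x < BINGE:
        return "below binge"
    if x < HEAVY_BINGE:
        return "binge"
    return "heavy binge"


def two_part_impute(values, rng):
    """Fill in None entries with a simplified two-part draw.

    Part 1: zero vs positive, using the observed proportion of positives.
    Part 2: positive amount, using a log-normal fitted to observed positives.
    Assumption: no covariates and no parameter uncertainty (illustration only).
    """
    observed = [v for v in values if v is not None]
    positives = [v for v in observed if v > 0]
    p_positive = len(positives) / len(observed)
    logs = [math.log(v) for v in positives]
    mu = statistics.mean(logs)
    sigma = statistics.stdev(logs)

    completed = []
    for v in values:
        if v is None:
            if rng.random() < p_positive:
                v = math.exp(rng.gauss(mu, sigma))
            else:
                v = 0.0
        completed.append(v)
    return completed


def impute_and_categorize(values, m=5, seed=0):
    """Create m completed datasets and categorize each after imputation."""
    rng = random.Random(seed)
    return [[categorize(v) for v in two_part_impute(values, rng)]
            for _ in range(m)]


if __name__ == "__main__":
    # Made-up data: None marks a missing value.
    data = [0, 0, 3.0, 8.0, 25.0, None, 1.5, None, 0, 12.0]
    for completed in impute_and_categorize(data, m=3, seed=1):
        print(completed)
```

Because categories are derived only after the continuous value is drawn, the imputed values never need rounding to a category, which is the step the abstract reports as problematic for some MVNI variants.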

Full text: 1 | Databases: MEDLINE | Main subject: Research Design / Data Collection | Study type: Prognostic_studies / Risk_factors_studies | Limit: Humans | Language: En | Journal: Stat Med | Year: 2021 | Document type: Article | Affiliation country: Australia