Ultra high-dimensional semiparametric longitudinal data analysis.
Biometrics; 77(3): 903-913, 2021 Sep.
Article in English | MEDLINE | ID: mdl-32750150
As ultra high-dimensional longitudinal data become increasingly common in fields such as public health and bioinformatics, developing flexible methods with a sparse model is of high interest. In this setting, the dimension of the covariates can potentially grow exponentially, as exp(n^{1/2}), with respect to the number of clusters n. We consider a flexible semiparametric approach, namely, partially linear single-index models, for ultra high-dimensional longitudinal data. Most importantly, we allow not only the partially linear covariates but also the single-index covariates within the unknown flexible function, which is estimated nonparametrically, to be ultra high dimensional. Using penalized generalized estimating equations, this approach can capture correlation within subjects, can perform simultaneous variable selection and estimation with a smoothly clipped absolute deviation penalty, and can capture nonlinearity and potentially some interactions among predictors. We establish asymptotic theory for the estimators, including the oracle property in ultra high dimension for both the partially linear and nonparametric components, and we present an efficient algorithm to handle the computational challenges. We show the effectiveness of our method and algorithm via a simulation study and a yeast cell cycle gene expression data set.
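For readers unfamiliar with the smoothly clipped absolute deviation (SCAD) penalty named in the abstract, the following is a minimal sketch of the standard SCAD penalty and its derivative for a single coefficient. It is not the authors' algorithm: the penalized estimating equations, the single-index estimation, and the working correlation are not shown. The function names `scad_penalty` and `scad_derivative` are hypothetical, and the default a = 3.7 is the value conventionally used in the SCAD literature, not one stated in the abstract.

```python
def scad_penalty(theta: float, lam: float, a: float = 3.7) -> float:
    """Standard SCAD penalty for one coefficient (illustrative sketch).

    lam is the tuning parameter (lam >= 0); a > 2 controls where the
    penalty flattens. a = 3.7 is an assumed conventional default.
    """
    if lam < 0 or a <= 2:
        raise ValueError("require lam >= 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        # Small coefficients: behaves like the lasso (L1) penalty.
        return lam * t
    if t <= a * lam:
        # Intermediate coefficients: quadratic transition region.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Large coefficients: constant penalty, so no further shrinkage.
    return lam * lam * (a + 1) / 2


def scad_derivative(theta: float, lam: float, a: float = 3.7) -> float:
    """Derivative of the SCAD penalty with respect to |theta|."""
    if lam < 0 or a <= 2:
        raise ValueError("require lam >= 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        return lam
    if t <= a * lam:
        return (a * lam - t) / (a - 1)
    return 0.0
```

The derivative is what typically enters a penalized estimating equation: it equals lam for small coefficients, which shrinks them toward zero, and it is zero for coefficients larger than a * lam, which leaves large signals essentially unpenalized. That behaviour is the usual intuition behind oracle-type results for SCAD.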
Full text: 1
Collections: 01-internacional
Database: MEDLINE
Main subject: Algorithms / Data Analysis
Study type: Prognostic_studies / Risk_factors_studies
Limits: Humans
Language: En
Journal: Biometrics
Publication year: 2021
Document type: Article