A PARTIALLY LINEAR FRAMEWORK FOR MASSIVE HETEROGENEOUS DATA.
Zhao, Tianqi; Cheng, Guang; Liu, Han.
Affiliation
  • Zhao T; Department of Operations Research and Financial Engineering, Princeton University, Princeton, New Jersey 08544, USA.
  • Cheng G; Department of Statistics, Purdue University, West Lafayette, IN 47906, USA.
  • Liu H; Department of Operations Research and Financial Engineering, Princeton University, Princeton, New Jersey 08544, USA.
Ann Stat; 44(4): 1400-1437, 2016 Aug.
Article in En | MEDLINE | ID: mdl-28428647
ABSTRACT
We consider a partially linear framework for modelling massive heterogeneous data. The major goal is to extract common features across all sub-populations while exploring the heterogeneity of each sub-population. In particular, we propose an aggregation-type estimator for the commonality parameter that possesses the (non-asymptotic) minimax optimal bound and asymptotic distribution as if there were no heterogeneity. This oracular result holds when the number of sub-populations does not grow too fast. A plug-in estimator for the heterogeneity parameter is further constructed, and shown to possess the asymptotic distribution as if the commonality information were available. We also test the heterogeneity among a large number of sub-populations. All of the above results require regularizing each sub-estimation as though it had the entire sample size. Our general theory applies to the divide-and-conquer approach that is often used to deal with massive homogeneous data. A technical by-product of this paper is statistical inference for general kernel ridge regression. Thorough numerical results are also provided to back up our theory.
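The aggregation and plug-in steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the usual partially linear form y = xᵀβ_j + f(z) + ε, where β_j is the heterogeneity parameter of sub-population j and f is the shared (commonality) function. The Gaussian kernel, its bandwidth, the simulated data, the penalty rate N^(-0.8), and all function names are illustrative assumptions; only the idea of choosing the penalty from the total sample size N rather than the sub-sample size comes from the abstract.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth=0.2):
    """Gaussian kernel matrix between 1-D point sets a and b (assumed kernel)."""
    d = a[:, None] - b[None, :]
    return np.exp(-d**2 / (2 * bandwidth**2))

def fit_subpopulation(X, z, y, lam):
    """Partially linear kernel ridge fit on one sub-population.

    Minimizes (1/n)||y - X beta - K alpha||^2 + lam * alpha' K alpha
    by profiling out alpha. Returns (beta, alpha).
    """
    n = len(y)
    K = gaussian_kernel(z, z)
    A = np.linalg.inv(K + n * lam * np.eye(n))   # (K + n*lam*I)^{-1}
    M = n * lam * A                              # I - S, with S the smoother matrix
    beta = np.linalg.solve(X.T @ M @ X, X.T @ M @ y)
    alpha = A @ (y - X @ beta)
    return beta, alpha

def aggregate_f(z_new, fits, z_list):
    """Aggregation step: average the sub-population estimates of f."""
    preds = [gaussian_kernel(z_new, z_j) @ alpha_j
             for (_, alpha_j), z_j in zip(fits, z_list)]
    return np.mean(preds, axis=0)

def plug_in_beta(X, z, y, fits, z_list):
    """Plug-in step: least squares for beta_j after subtracting the averaged f."""
    resid = y - aggregate_f(z, fits, z_list)
    return np.linalg.lstsq(X, resid, rcond=None)[0]

# Simulated example (all values below are made up for illustration).
rng = np.random.default_rng(0)
s, n = 4, 200                 # number of sub-populations, size of each
N = s * n                     # total sample size
lam = N ** -0.8               # penalty chosen from N, not n (illustrative rate)
true_betas = [np.array([1.0, -1.0]) + j for j in range(s)]

def f_true(z):
    return np.sin(2 * np.pi * z)

data = []
for b in true_betas:
    X = rng.normal(size=(n, 2))
    z = rng.uniform(size=n)
    y = X @ b + f_true(z) + 0.3 * rng.normal(size=n)
    data.append((X, z, y))

fits = [fit_subpopulation(X, z, y, lam) for X, z, y in data]
z_list = [z for _, z, _ in data]
beta_plug = [plug_in_beta(X, z, y, fits, z_list) for X, z, y in data]

grid = np.linspace(0.05, 0.95, 50)
f_bar = aggregate_f(grid, fits, z_list)
```

In this sketch each sub-population is fitted separately, so the cost of the kernel solves depends on the sub-sample size n rather than N; the estimates of f are then averaged, and each β_j is refitted by ordinary least squares against the averaged f.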
Full text: 1 Collection: 01-international Database: MEDLINE Language: En Journal: Ann Stat Year: 2016 Document type: Article Affiliation country: United States
