Scalable Bayesian Nonparametric Clustering and Classification.
Ni, Yang; Müller, Peter; Diesendruck, Maurice; Williamson, Sinead; Zhu, Yitan; Ji, Yuan.
Affiliations
  • Ni Y; Department of Statistics, Texas A&M University.
  • Müller P; Department of Statistics and Data Sciences, The University of Texas at Austin.
  • Diesendruck M; Department of Mathematics, The University of Texas at Austin.
  • Williamson S; Department of Statistics and Data Sciences, The University of Texas at Austin.
  • Zhu Y; Department of Information, Risk, and Operations Management, The University of Texas at Austin.
  • Ji Y; Program for Computational Genomics and Medicine, NorthShore University HealthSystem.
J Comput Graph Stat ; 29(1): 53-65, 2020.
Article in English | MEDLINE | ID: mdl-32982129
We develop a scalable multi-step Monte Carlo algorithm for inference under a large class of nonparametric Bayesian models for clustering and classification. Each step is "embarrassingly parallel" and can be implemented using the same Markov chain Monte Carlo sampler. The simplicity and generality of our approach make inference for a wide range of Bayesian nonparametric mixture models applicable to large datasets. Specifically, we apply the approach to inference under a product partition model with regression on covariates. We show results for inference with two motivating datasets: a large set of electronic health records (EHR) and a bank telemarketing dataset. We find interesting clusters and classification performance that is competitive with other widely used classifiers. Supplementary materials for this article are available online.
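The abstract does not spell out the steps, so the following is only a toy sketch of the general "cluster shards independently, then cluster the resulting summaries with the same routine" pattern that an embarrassingly parallel multi-step scheme suggests. It is not the authors' algorithm: a simple greedy gap-based merge stands in for the Markov chain Monte Carlo sampler, the data are one-dimensional, and the names `cluster_summaries`, `multi_step_cluster`, `n_shards`, and `gap` are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def cluster_summaries(items, gap=1.0):
    """Stand-in for the MCMC sampler (an assumption, not the paper's method).

    `items` is a list of (mean, count) summaries. Sorted summaries are merged
    greedily whenever the next mean lies within `gap` of the current cluster mean.
    """
    merged = []
    for mean, count in sorted(items):
        if merged and mean - merged[-1][0] <= gap:
            prev_mean, prev_count = merged[-1]
            total = prev_count + count
            # Count-weighted mean of the merged cluster.
            merged[-1] = ((prev_mean * prev_count + mean * count) / total, total)
        else:
            merged.append((mean, count))
    return merged


def multi_step_cluster(data, n_shards=4, gap=1.0):
    """Two-step sketch: cluster each shard, then cluster the pooled summaries."""
    # Step 1: split the data into shards and cluster each one independently.
    shards = [data[i::n_shards] for i in range(n_shards)]

    def local(shard):
        # Each raw observation starts as its own summary with count 1.
        return cluster_summaries([(x, 1) for x in shard], gap)

    # Shards do not interact, so this map could run on separate workers.
    with ThreadPoolExecutor() as pool:
        local_results = list(pool.map(local, shards))

    # Step 2: pool the shard-level summaries and reuse the same routine on them.
    pooled = [summary for result in local_results for summary in result]
    return cluster_summaries(pooled, gap)


if __name__ == "__main__":
    data = [0.0, 0.1, 0.2, 0.3, 10.0, 10.1, 10.2, 10.3]
    print(multi_step_cluster(data, n_shards=2, gap=1.0))
```

The point of the sketch is only that the second step operates on cluster summaries rather than raw observations, which is what keeps the cost of later steps small; a real implementation would replace `cluster_summaries` with a posterior sampler for the chosen mixture model.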
Full text: 1 Collections: 01-international Database: MEDLINE Language: En Journal: J Comput Graph Stat Publication year: 2020 Document type: Article