Interactive model building for Q-learning.
Laber, Eric B; Linn, Kristin A; Stefanski, Leonard A.
Affiliation
  • Laber EB; Department of Statistics, North Carolina State University, 2311 Stinson Drive, 5216 SAS Hall, Raleigh, North Carolina, 27695-8203, USA.
  • Linn KA; Department of Statistics, North Carolina State University, 2311 Stinson Drive, 5216 SAS Hall, Raleigh, North Carolina, 27695-8203, USA.
  • Stefanski LA; Department of Statistics, North Carolina State University, 2311 Stinson Drive, 5216 SAS Hall, Raleigh, North Carolina, 27695-8203, USA.
Biometrika ; 101(4): 831-847, 2014 Oct 20.
Article in English | MEDLINE | ID: mdl-25541562
ABSTRACT
Evidence-based rules for optimal treatment allocation are key components in the quest for efficient, effective health care delivery. Q-learning, an approximate dynamic programming algorithm, is a popular method for estimating optimal sequential decision rules from data. Q-learning requires the modeling of nonsmooth, nonmonotone transformations of the data, complicating the search for adequately expressive, yet parsimonious, statistical models. The default Q-learning working model is multiple linear regression, which is not only provably misspecified under most data-generating models, but also results in nonregular regression estimators, complicating inference. We propose an alternative strategy for estimating optimal sequential decision rules for which the requisite statistical modeling does not depend on nonsmooth, nonmonotone transformed data, does not result in nonregular regression estimators, is consistent under a broader array of data-generation models than Q-learning, results in estimated sequential decision rules that have better sampling properties, and is amenable to established statistical approaches for exploratory data analysis, model building, and validation. We derive the new method, IQ-learning, via an interchange in the order of certain steps in Q-learning. In simulated experiments IQ-learning improves on Q-learning in terms of integrated mean squared error and power. The method is illustrated using data from a study of major depressive disorder.
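To make the abstract's description of standard Q-learning concrete, the following is a minimal sketch of the two-stage version with a linear working model. It is an illustration under stated assumptions, not the authors' implementation, and it does not show IQ-learning. The assumptions are: two decision points, treatments coded as -1/+1, a scalar patient history at each stage, and a working model with an intercept, the history, the treatment, and their interaction. All function and variable names are hypothetical. The step that takes the maximum over second-stage treatments is the nonsmooth transformation the abstract refers to.

```python
import numpy as np


def design(h, a):
    """Design matrix with columns: intercept, history, treatment, interaction."""
    return np.column_stack([np.ones_like(h), h, a, a * h])


def fit_linear(X, y):
    """Ordinary least squares coefficients."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def q_learning_two_stage(h1, a1, h2, a2, y):
    """Two-stage Q-learning with linear working models (assumed form).

    Treatments a1, a2 are assumed to be coded as -1 or +1.
    Returns the first- and second-stage coefficient vectors.
    """
    # Stage 2: regress the outcome on second-stage history and treatment.
    beta2 = fit_linear(design(h2, a2), y)

    # Fitted Q2(h2, a) = main(h2) + a * contrast(h2) for a in {-1, +1}.
    main = beta2[0] + beta2[1] * h2
    contrast = beta2[2] + beta2[3] * h2

    # Maximizing over a in {-1, +1} gives main + |contrast|.
    # The absolute value is the nonsmooth, nonmonotone transformation.
    pseudo_outcome = main + np.abs(contrast)

    # Stage 1: regress the pseudo-outcome on first-stage history and treatment.
    beta1 = fit_linear(design(h1, a1), pseudo_outcome)
    return beta1, beta2


def decision_rule(beta, h):
    """Recommend +1 when the fitted treatment contrast is nonnegative, else -1."""
    return np.where(beta[2] + beta[3] * h >= 0, 1, -1)


if __name__ == "__main__":
    # Simulated data, purely for illustration.
    rng = np.random.default_rng(0)
    n = 500
    h1 = rng.normal(size=n)
    a1 = rng.choice([-1.0, 1.0], size=n)
    h2 = 0.5 * h1 + 0.3 * a1 + rng.normal(size=n)
    a2 = rng.choice([-1.0, 1.0], size=n)
    y = 1.0 + h2 + a2 * (0.5 - h2) + rng.normal(scale=0.5, size=n)

    beta1, beta2 = q_learning_two_stage(h1, a1, h2, a2, y)
    print("stage-2 coefficients:", beta2)
    print("stage-1 coefficients:", beta1)
    print("stage-2 rule at h2 = 0 and h2 = 1:", decision_rule(beta2, np.array([0.0, 1.0])))
```

Because the first-stage response is built from an absolute value of a fitted linear function, a linear first-stage model will generally be misspecified even when the second-stage model is correct, which is the difficulty the abstract attributes to the default approach.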

Full text: 1 Collections: 01-international Database: MEDLINE Study type: Prognostic_studies / Risk_factors_studies Language: En Journal: Biometrika Publication year: 2014 Document type: Article
