Supplementing claims data analysis using self-reported data to develop a probabilistic phenotype model for current smoking status.
Reps, Jenna M; Rijnbeek, Peter R; Ryan, Patrick B.
Affiliation
  • Reps JM; Janssen Research and Development, Titusville, NJ, USA. Electronic address: jreps@its.jnj.com.
  • Rijnbeek PR; Erasmus MC, Rotterdam, the Netherlands.
  • Ryan PB; Janssen Research and Development, Titusville, NJ, USA.
J Biomed Inform; 97: 103264, 2019 09.
Article in English | MEDLINE | ID: mdl-31386904
ABSTRACT

OBJECTIVES:

Smoking status is poorly recorded in US claims data. IBM MarketScan Commercial is a claims database that can be linked to an additional health risk assessment with self-reported smoking status for a subset of 1,966,174 patients. We investigate whether this subset can be used to learn a smoking status phenotype model, generalizable to all US claims data, that calculates the probability of being a current smoker.

METHODS:

Of the patients in this subset, 251,643 (12.8%) self-reported their smoking status as 'current smoker'. A regularized logistic regression model, the Current Risk of Smoking Status (CROSS), was trained using the subset of patients with self-reported smoking status. CROSS considered 53,027 candidate covariates, including demographics and conditions, drugs, measurements, procedures, and observations recorded in the prior 365 days. The CROSS phenotype model was validated across multiple other claims databases.
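As a rough illustration of the kind of model described here, the following is a minimal sketch of a regularized logistic regression fitted by gradient descent on binary covariates. It is not the CROSS implementation: the abstract does not state the penalty type, so an L2 penalty is assumed, and the function names (`fit_logistic_l2`, `predict_proba`), the hyperparameters, and the three toy covariates with their effect sizes are all hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic_l2(X, y, lam=0.01, lr=0.5, epochs=500):
    """Fit logistic regression with an (assumed) L2 penalty on the weights.

    X is a list of covariate vectors, y a list of 0/1 labels.
    The intercept b is left unpenalized.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi)))
            err = p - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        for j in range(d):
            # Mean log-loss gradient plus the L2 shrinkage term.
            w[j] -= lr * (grad_w[j] / n + lam * w[j])
    return w, b


def predict_proba(w, b, x):
    """Predicted probability of the positive class (here: current smoker)."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


# Toy data only: three hypothetical binary covariates standing in for
# claims-derived indicators; the true effects below are invented.
random.seed(0)
X, y = [], []
for _ in range(400):
    x = [float(random.random() < 0.3) for _ in range(3)]
    logit = -2.0 + 2.5 * x[0] - 1.0 * x[1]
    X.append(x)
    y.append(1 if random.random() < sigmoid(logit) else 0)

w, b = fit_logistic_l2(X, y)
```

The output of `predict_proba` is a probability rather than a hard label, which matches the probabilistic phenotype framing of the abstract; in practice a sparse penalty and a dedicated solver would likely be needed for tens of thousands of candidate covariates.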

RESULTS:

The internal validation showed that the CROSS model achieved an area under the receiver operating characteristic curve (AUC) of 0.76, and the calibration plots indicated it was well calibrated. The external validation across three US claims databases obtained AUCs ranging from 0.82 to 0.87, showing that the model appears to be transportable across claims data.
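For readers unfamiliar with the discrimination metric reported above, the AUC can be computed from predicted probabilities and observed labels using the rank-sum (Mann-Whitney) formulation. The sketch below is illustrative only; the function name `auc` and the example values are hypothetical and are not taken from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) statistic.

    labels: list of 0/1 outcomes; scores: predicted probabilities.
    Tied scores receive their average rank.
    """
    pairs = sorted(zip(scores, labels))
    ranks = [0.0] * len(pairs)
    i = 0
    while i < len(pairs):
        j = i
        # Extend j to the end of the run of tied scores.
        while j + 1 < len(pairs) and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(label for _, label in pairs)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")
    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


# Hypothetical example: four patients, two of them smokers.
example = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of smokers from non-smokers. AUC measures discrimination only, so calibration has to be assessed separately, as the abstract does with calibration plots.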

CONCLUSION:

CROSS predicts current smoking status based on the claims records in the prior year. CROSS can be readily applied to any US insurance claims data mapped to the OMOP common data model and will be a useful way to impute smoking status when conducting epidemiology studies where smoking is a known confounder but smoking status is not recorded. CROSS is available from https://github.com/OHDSI/StudyProtocolSandbox/tree/master/SmokingModel.

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Insurance Claim Review / Models, Statistical / Cigarette Smoking Type of study: Etiology_studies / Prognostic_studies / Risk_factors_studies Limits: Adult / Female / Humans / Male / Middle aged Country/Region as subject: North America Language: En Year of publication: 2019 Document type: Article