A robust approach for electronic health record-based case-control studies with contaminated case pools.
Dai, Guorong; Ma, Yanyuan; Hasler, Jill; Chen, Jinbo; Carroll, Raymond J.
Affiliation
  • Dai G; Department of Statistics and Data Science, School of Management, Fudan University, Shanghai, China.
  • Ma Y; Department of Statistics, Pennsylvania State University, University Park, Pennsylvania.
  • Hasler J; Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania.
  • Chen J; Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania.
  • Carroll RJ; Department of Statistics, Texas A&M University, College Station, Texas.
Biometrics; 79(3): 2023-2035, 2023 Sep.
Article in En | MEDLINE | ID: mdl-35841231
ABSTRACT
We consider analyses of case-control studies assembled from electronic health records (EHRs) where the pool of cases is contaminated by patients who are ineligible for the study. These ineligible patients, referred to as "false cases," should be excluded from the analyses if known. However, the true outcome status of a patient in the case pool is unknown except in a subset whose size may be arbitrarily small compared to the entire pool. To effectively remove the influence of the false cases on estimating odds ratio parameters defined by a working association model of the logistic form, we propose a general strategy to adaptively impute the unknown case status without requiring a correct phenotyping model to help discern the true and false case statuses. Our method estimates the target parameters as the solution to a set of unbiased estimating equations constructed using all available data. It outperforms existing methods by achieving robustness to mismodeling the relationship between the outcome status and covariates of interest, as well as improved estimation efficiency. We further show that our estimator is root-n-consistent and asymptotically normal. Through extensive simulation studies and analysis of real EHR data, we demonstrate that our method has desirable robustness to possible misspecification of both the association and phenotyping models, along with statistical efficiency superior to the competitors.
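The contamination problem described in the abstract can be illustrated with a small simulation. This is only a sketch of the problem setting, not the authors' estimator: it compares an "oracle" fit that uses only true cases, a naive fit that treats the whole contaminated case pool as cases, and a fit that keeps only the validated true cases. Every name and setting here (`fit_logistic`, `run_demo`, the sample sizes, a true log odds ratio of 1, false cases drawn from the control covariate distribution, a validated subset chosen at random) is an assumption made for illustration.

```python
import numpy as np


def fit_logistic(x, y, n_iter=25):
    """Fit logit P(y=1|x) = a + b*x by Newton-Raphson; returns array [a, b]."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (y - p)                         # score vector
        hess = (X * (p * (1.0 - p))[:, None]).T @ X  # information matrix
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def run_demo(seed=0, log_or=1.0, n_controls=5000, n_true=3000,
             n_false=2000, n_validated=200):
    """Simulate a contaminated case pool and return three slope estimates."""
    rng = np.random.default_rng(seed)

    # With x ~ N(0, 1) in controls, a logistic model with slope log_or
    # implies x ~ N(log_or, 1) in true cases.
    x_ctrl = rng.normal(0.0, 1.0, n_controls)
    x_true = rng.normal(log_or, 1.0, n_true)
    # Assumption: false cases have the same covariate distribution as controls.
    x_false = rng.normal(0.0, 1.0, n_false)

    x_pool = np.concatenate([x_true, x_false])
    is_true = np.concatenate([np.ones(n_true, bool), np.zeros(n_false, bool)])

    def slope(x_cases):
        x = np.concatenate([x_cases, x_ctrl])
        y = np.concatenate([np.ones(len(x_cases)), np.zeros(n_controls)])
        return fit_logistic(x, y)[1]

    # Assumption: a small random subset of the pool has its status validated.
    validated = rng.choice(len(x_pool), size=n_validated, replace=False)
    x_validated_true = x_pool[validated][is_true[validated]]

    return {
        "oracle": slope(x_true),                  # true cases only (unavailable in practice)
        "naive": slope(x_pool),                   # whole contaminated pool as cases
        "validated_only": slope(x_validated_true),  # discards unvalidated patients
    }


if __name__ == "__main__":
    for name, value in run_demo().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the naive estimate is pulled toward zero, while the validated-only estimate targets the right quantity but uses only a small fraction of the case pool. That is the gap between bias and inefficiency that the abstract says the proposed method aims to close.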

Full text: 1 Database: MEDLINE Main subject: Models, Statistical / Electronic Health Records Language: En Publication year: 2023 Document type: Article
