Simulating complex patient populations with hierarchical learning effects to support methods development for post-market surveillance.
Davis, Sharon E; Ssemaganda, Henry; Koola, Jejo D; Mao, Jialin; Westerman, Dax; Speroff, Theodore; Govindarajulu, Usha S; Ramsay, Craig R; Sedrakyan, Art; Ohno-Machado, Lucila; Resnic, Frederic S; Matheny, Michael E.
Affiliation
  • Davis SE; Department of Biomedical Informatics, Vanderbilt University Medical Center, 2525 West End Ave, Suite 1475, Nashville, TN, 37203, USA. sharon.e.davis.1@vumc.org.
  • Ssemaganda H; Comparative Effectiveness Research Institute, Lahey Hospital and Medical Center, 41 Mall Road, Burlington, MA, 01803, USA.
  • Koola JD; UC Health Department of Biomedical Informatics, University of California San Diego, 9500 Gilman Dr. MC 0728, La Jolla, San Diego, CA, 92093-0728, USA.
  • Mao J; Department of Population Health Sciences, Weill Cornell Medicine, 1300 York Avenue, New York, NY, 10065, USA.
  • Westerman D; Department of Biomedical Informatics, Vanderbilt University Medical Center, 2525 West End Ave, Suite 1475, Nashville, TN, 37203, USA.
  • Speroff T; Departments of Medicine and Biostatistics, Vanderbilt University Medical Center, 1313 21St Avenue South, Oxford House, Room 209, Nashville, TN, 37232, USA.
  • Govindarajulu US; Center for Biostatistics, Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1077, New York, NY, 10029, USA.
  • Ramsay CR; Health Services Research Unit, University of Aberdeen, Health Sciences Building, Foresterhill, 3rd Floor, Aberdeen, AB25 2ZD, UK.
  • Sedrakyan A; Department of Population Health Sciences, Weill Cornell Medicine, 1300 York Avenue, New York, NY, 10065, USA.
  • Ohno-Machado L; Biomedical Informatics and Data Science, Yale School of Medicine, 100 College Street, New Haven, CT, 06510, USA.
  • Resnic FS; Division of Cardiovascular Medicine and Comparative Effectiveness Research Institute, Lahey Hospital and Medical Center, Tufts University School of Medicine, 41 Burlington Mall Road, Burlington, MA, 01805, USA.
  • Matheny ME; Departments of Biomedical Informatics, Biostatistics, and Medicine, Vanderbilt University Medical Center, 2525 West End Ave, Suite 1475, Nashville, TN, 37203, USA.
BMC Med Res Methodol; 23(1): 89, 2023-04-11.
Article in English | MEDLINE | ID: mdl-37041457
ABSTRACT

BACKGROUND:

Validating new algorithms, such as methods to disentangle intrinsic treatment risk from risk associated with experiential learning of novel treatments, often requires knowing the ground truth for data characteristics under investigation. Since the ground truth is inaccessible in real world data, simulation studies using synthetic datasets that mimic complex clinical environments are essential. We describe and evaluate a generalizable framework for injecting hierarchical learning effects within a robust data generation process that incorporates the magnitude of intrinsic risk and accounts for known critical elements in clinical data relationships.

METHODS:

We present a multi-step data generating process with customizable options and flexible modules to support a variety of simulation requirements. Synthetic patients with nonlinear and correlated features are assigned to provider and institution case series. The probabilities of treatment and outcome assignment are associated with patient features according to user definitions. Risk due to experiential learning by providers and/or institutions when novel treatments are introduced is injected at various speeds and magnitudes. To further reflect real-world complexity, users can request missing values and omitted variables. We illustrate an implementation of our method in a case study using MIMIC-III data for reference patient feature distributions.
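
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the exponential-decay learning curve, every coefficient, and all names (`simulate`, `learning_penalty`, `provider_case`, and so on) are hypothetical choices made for this example. Two correlated features are drawn, patients are placed into provider and institution case series, treatment and outcome are assigned from feature-dependent probabilities (with a squared term standing in for a nonlinear relationship), and learning-related risk is added on the log-odds scale for the novel treatment only.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def learning_penalty(case_number, magnitude, rate):
    # Assumed learning curve: extra log-odds of an adverse outcome at the
    # k-th novel-treatment case (k = 1, 2, ...), decaying exponentially.
    return magnitude * math.exp(-rate * (case_number - 1))


def simulate(n_patients=5000, n_institutions=5, providers_per_institution=4,
             intrinsic_log_odds=0.3, provider_magnitude=1.0, provider_rate=0.05,
             institution_magnitude=0.5, institution_rate=0.01,
             missing_rate=0.1, seed=0):
    rng = random.Random(seed)
    n_providers = n_institutions * providers_per_institution
    provider_cases = [0] * n_providers        # novel-treatment cases so far
    institution_cases = [0] * n_institutions
    rows = []
    for _ in range(n_patients):
        # Two correlated patient features (illustrative coefficients).
        age = rng.gauss(0.0, 1.0)
        severity = 0.6 * age + 0.8 * rng.gauss(0.0, 1.0)

        # Hierarchy: each provider belongs to exactly one institution.
        provider = rng.randrange(n_providers)
        institution = provider // providers_per_institution

        # Treatment assignment depends on patient features.
        novel = rng.random() < sigmoid(-0.2 + 0.5 * severity)

        # Baseline outcome risk, nonlinear in severity.
        log_odds = -2.0 + 0.4 * age + 0.3 * severity ** 2
        provider_case = institution_case = None
        if novel:
            provider_cases[provider] += 1
            institution_cases[institution] += 1
            provider_case = provider_cases[provider]
            institution_case = institution_cases[institution]
            # Intrinsic treatment risk plus provider- and institution-level learning.
            log_odds += (intrinsic_log_odds
                         + learning_penalty(provider_case, provider_magnitude, provider_rate)
                         + learning_penalty(institution_case, institution_magnitude, institution_rate))
        adverse = rng.random() < sigmoid(log_odds)

        # Missing values: blank out severity completely at random.
        observed_severity = None if rng.random() < missing_rate else severity
        rows.append({"age": age, "severity": observed_severity,
                     "provider": provider, "institution": institution,
                     "novel": novel, "provider_case": provider_case,
                     "institution_case": institution_case, "adverse": adverse})
    return rows


rows = simulate(n_patients=2000, seed=1)
print(len(rows), sum(r["novel"] for r in rows))
```

In this sketch an omitted variable could be represented by dropping a key such as `age` from every row before analysis, while it still contributes to the generated outcome.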

RESULTS:

Realized data characteristics in the simulated data reflected specified values. Apparent deviations in treatment effects and feature distributions, though not statistically significant, were most common in small datasets (n < 3000) and attributable to random noise and variability in estimating realized values in small samples. When learning effects were specified, synthetic datasets exhibited changes in the probability of an adverse outcome as cases accrued for the treatment group impacted by learning, and stable probabilities as cases accrued for the treatment group not affected by learning.
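
One way to check for the pattern described above is to bin cases by their position in each provider's case series and compare adverse outcome rates across bins. The sketch below is illustrative only: the exponential learning curve, the baseline log-odds, the parameter values, and the function name `outcome_rate_by_case_bin` are assumptions for this example and are not taken from the study.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def outcome_rate_by_case_bin(n_providers, cases_per_provider, magnitude, rate,
                             bin_width, base_log_odds=-2.0, seed=0):
    """Adverse outcome rate per bin of case-series position, pooled over providers."""
    rng = random.Random(seed)
    n_bins = cases_per_provider // bin_width
    events = [0] * n_bins
    totals = [0] * n_bins
    for _ in range(n_providers):
        for k in range(1, cases_per_provider + 1):
            b = (k - 1) // bin_width
            if b >= n_bins:
                continue  # ignore a trailing partial bin
            # Assumed learning curve: penalty decays exponentially with case count.
            log_odds = base_log_odds + magnitude * math.exp(-rate * (k - 1))
            events[b] += rng.random() < sigmoid(log_odds)
            totals[b] += 1
    return [e / t for e, t in zip(events, totals)]


# Group affected by learning versus a group with no learning effect.
learning = outcome_rate_by_case_bin(200, 200, magnitude=1.5, rate=0.03,
                                    bin_width=20, seed=1)
stable = outcome_rate_by_case_bin(200, 200, magnitude=0.0, rate=0.03,
                                  bin_width=20, seed=2)
for i, (a, b) in enumerate(zip(learning, stable)):
    print(f"cases {i * 20 + 1:3d}-{(i + 1) * 20:3d}: learning={a:.3f} stable={b:.3f}")
```

With these assumed settings, the rate in the learning group should fall from the earliest bin to the latest, while the rate in the unaffected group should stay roughly flat apart from sampling noise.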

CONCLUSIONS:

Our framework extends clinical data simulation techniques beyond generation of patient features to incorporate hierarchical learning effects. This enables the complex simulation studies required to develop and rigorously test algorithms developed to disentangle treatment safety signals from the effects of experiential learning. By supporting such efforts, this work can help identify training opportunities, avoid unwarranted restriction of access to medical advances, and hasten treatment improvements.

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Deep Learning Study type: Prognostic_studies / Screening_studies Limit: Humans Language: En Journal: BMC Med Res Methodol Journal subject: MEDICINE Year: 2023 Document type: Article Affiliation country: United States
