ABSTRACT
We develop a Bayesian semiparametric model for the impact of dynamic treatment rules on survival among patients diagnosed with pediatric acute myeloid leukemia (AML). The data consist of a subset of patients enrolled in a phase III clinical trial in which patients move through a sequence of four treatment courses. At each course, they undergo treatment that may or may not include anthracyclines (ACT). While ACT is known to be effective at treating AML, it is also cardiotoxic and can lead to early death for some patients. Our task is to estimate the potential survival probability under hypothetical dynamic ACT treatment strategies, but there are several impediments. First, since ACT is not randomized, its effect on survival is confounded over time. Second, subjects initiate the next course depending on when they recover from the previous course, making timing potentially informative of subsequent treatment and survival. Third, patients may die or drop out before ever completing the full treatment sequence. We develop a generative Bayesian semiparametric model based on Gamma Process priors to address these complexities. At each treatment course, the model captures subjects' transition to subsequent treatment or death in continuous time. G-computation is used to compute a posterior over potential survival probability that is adjusted for time-varying confounding. Using our approach, we estimate the efficacy of hypothetical treatment rules that dynamically modify ACT based on evolving cardiac function.
Subjects
Bayes Theorem, Acute Myeloid Leukemia, Statistical Models, Humans, Acute Myeloid Leukemia/drug therapy, Anthracyclines/therapeutic use, Child, Phase III Clinical Trials as Topic
ABSTRACT
BACKGROUND: Cytomegalovirus (CMV) commonly reactivates after allogeneic hematopoietic cell transplant (HCT), potentially leading to CMV disease and significant morbidity and mortality. To reduce morbidity and mortality, many centers conduct weekly CMV blood polymerase chain reaction (PCR) surveillance testing with subsequent initiation of antiviral therapy upon CMV DNAemia detection. However, the impact of CMV DNAemia on subsequent hospitalization risk has not been assessed using models accounting for the time-varying nature of the exposure, outcome, and confounders. METHODS: All allogeneic HCTs at the Children's Hospital of Philadelphia from January 2004 to April 2017 were considered for inclusion. Patients were monitored with CMV surveillance via PCR testing for up to 105 days after HCT receipt. We estimated the association between CMV DNAemia and rate of hospitalization using marginal structural models (MSM). RESULTS: There were 343 allogeneic HCT episodes in 330 patients with CMV surveillance; median age was 9.0 years (range: 0.1-26.2) and 46.5% were female. Overall, 24.1% of HCT patients had at least one positive CMV blood PCR during the follow-up period. Median time to CMV DNAemia detection was 19 days (range: 4-97). The MSM estimated the incidence rate ratio for the association of CMV DNAemia with hospitalization to be 1.24 (95% confidence interval: 1.04-1.47). CONCLUSIONS: CMV DNAemia was associated with an increased rate of hospitalization in the post-HCT period. The MSM accounted for the time-varying nature of the outcome, exposure, and confounders. The findings support prevention of CMV DNAemia in this population. We recommend further investigation into the effectiveness and safety of prophylaxis versus pre-emptive CMV prevention approaches.
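The abstract does not say how the MSM weights were constructed. One common construction, sketched here purely as an assumption, is the stabilized inverse probability weight: at each follow-up interval, the probability of the exposure actually received under a baseline-only model is divided by the same probability under a model that also conditions on time-varying confounders, and the ratios are multiplied over time. The function name and both input arrays are hypothetical; in practice the probabilities would come from fitted exposure models.

```python
import numpy as np

def stabilized_weights(p_num, p_denom):
    """Cumulative stabilized inverse probability weights (illustrative sketch).

    p_num, p_denom: arrays of shape (n_subjects, n_intervals) holding the
    predicted probability of the exposure each subject actually received in
    each interval, from the numerator (baseline covariates only) and
    denominator (baseline plus time-varying confounders) models.
    Both inputs are hypothetical stand-ins for fitted model output.
    """
    p_num = np.asarray(p_num, dtype=float)
    p_denom = np.asarray(p_denom, dtype=float)
    # Per-interval ratio, then running product across intervals.
    return np.cumprod(p_num / p_denom, axis=1)

# One subject, two intervals: the weight grows when the observed exposure
# was less likely given the time-varying confounders.
example = stabilized_weights([[0.5, 0.5]], [[0.5, 0.25]])
```

The resulting weights would then be used in a weighted rate model for hospitalization; that outcome model is not shown here.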
Subjects
Cytomegalovirus Infections, Hematopoietic Stem Cell Transplantation, Child, Humans, Female, Male, Hematopoietic Stem Cell Transplantation/adverse effects, Homologous Transplantation/adverse effects, Viral DNA, Cytomegalovirus Infections/diagnosis, Cytomegalovirus, Antiviral Agents/therapeutic use, Retrospective Studies
ABSTRACT
A major focus of causal inference is the estimation of heterogeneous average treatment effects (HTE) - average treatment effects within strata of another variable of interest such as levels of a biomarker, education, or age strata. Inference involves estimating a stratum-specific regression and integrating it over the distribution of confounders in that stratum - which itself must be estimated. Standard practice involves estimating these stratum-specific confounder distributions independently (e.g. via the empirical distribution or Rubin's Bayesian bootstrap), which becomes problematic for sparsely populated strata with few observed confounder vectors. In this paper, we develop a nonparametric hierarchical Bayesian bootstrap (HBB) prior over the stratum-specific confounder distributions for HTE estimation. The HBB partially pools the stratum-specific distributions, thereby allowing principled borrowing of confounder information across strata when sparsity is a concern. We show that posterior inference under the HBB can yield efficiency gains over standard marginalization approaches while avoiding strong parametric assumptions about the confounder distribution. We use our approach to estimate the adverse event risk of proton versus photon chemoradiotherapy across various cancer types.
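As a point of reference for the "standard practice" the abstract contrasts against, the following is a minimal sketch of Rubin's Bayesian bootstrap applied within a single stratum. It is not the HBB proposed in the paper. The function name and the input `mu_hat` (fitted regression values at each observed confounder vector in the stratum) are assumptions made for illustration.

```python
import numpy as np

def bayesian_bootstrap_stratum_mean(mu_hat, n_draws=2000, seed=0):
    """Posterior draws of a stratum-specific mean via the Bayesian bootstrap.

    mu_hat: hypothetical fitted regression values, one per observed
    confounder vector in the stratum.
    Each draw places Dirichlet(1, ..., 1) weights on the observed confounder
    vectors and averages mu_hat under those weights.
    """
    rng = np.random.default_rng(seed)
    mu_hat = np.asarray(mu_hat, dtype=float)
    weights = rng.dirichlet(np.ones(mu_hat.size), size=n_draws)
    return weights @ mu_hat

# A sparse stratum with only three observed confounder vectors.
draws = bayesian_bootstrap_stratum_mean([0.1, 0.2, 0.4], n_draws=500)
```

With only three observations the weights can only ever land on three support points, which illustrates the sparsity problem that motivates pooling information across strata.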
ABSTRACT
BACKGROUND: With rising cost pressures on health care systems, machine-learning (ML)-based algorithms are increasingly used to predict health care costs. Despite their potential advantages, the successful implementation of these methods could be undermined by biases introduced in the design, conduct, or analysis of studies seeking to develop and/or validate ML models. The utility of such models may also be negatively affected by poor reporting of these studies. In this systematic review, we aim to evaluate the reporting quality, methodological characteristics, and risk of bias of ML-based prediction models for individual-level health care spending. METHODS: We will systematically search PubMed and Embase to identify studies developing, updating, or validating ML-based models to predict an individual's health care spending for any medical condition, over any time period, and in any setting. We will exclude prediction models of aggregate-level health care spending, models used to infer causality, models using radiomics or speech parameters, models of non-clinically validated predictors (e.g., genomics), and cost-effectiveness analyses without predicting individual-level health care spending. We will extract data based on the Checklist for Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modeling Studies (CHARMS), previously published research, and relevant recommendations. We will assess the adherence of ML-based studies to the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement and examine the inclusion of transparency and reproducibility indicators (e.g. statements on data sharing). To assess the risk of bias, we will apply the Prediction model Risk Of Bias Assessment Tool (PROBAST). Findings will be stratified by study design, ML methods used, population characteristics, and medical field. 
DISCUSSION: Our systematic review will appraise the quality, reporting, and risk of bias of ML-based models for individualized health care cost prediction. This review will provide an overview of the available models and give insights into the strengths and limitations of using ML methods for the prediction of health spending.
ABSTRACT
BACKGROUND: Ventilator-associated lower respiratory tract infection (VA-LRTI) is common among critically ill patients and has been associated with increased morbidity and mortality. In acute critical illness, respiratory microbiome disruption indices (MDIs) have been shown to predict risk for VA-LRTI, but their utility beyond the first days of critical illness is unknown. We sought to characterize how MDIs previously shown to predict VA-LRTI at initiation of mechanical ventilation change with prolonged mechanical ventilation, and if they remain associated with VA-LRTI risk. METHODS: We developed a cohort of 83 subjects admitted to a long-term acute care hospital due to their prolonged dependence on mechanical ventilation; performed dense, longitudinal sampling of the lower respiratory tract, collecting 1066 specimens; and characterized the lower respiratory microbiome by 16S rRNA sequencing as well as total bacterial abundance by 16S rRNA quantitative polymerase chain reaction. RESULTS: Cross-sectional MDIs, including low Shannon diversity and high total bacterial abundance, were associated with risk for VA-LRTI, but associations had wide posterior credible intervals. Persistent lower respiratory microbiome disruption showed a more robust association with VA-LRTI risk, with each day of (base e) Shannon diversity <2.0 associated with a VA-LRTI odds ratio of 1.36 (95% credible interval, 1.10-1.72). The observed association was consistent across multiple clinical definitions of VA-LRTI. CONCLUSIONS: Cross-sectional MDIs have limited ability to discriminate VA-LRTI risk during prolonged mechanical ventilation, but persistent lower respiratory tract microbiome disruption, best characterized by consecutive days with low Shannon diversity, may identify a population at high risk for infection and may help target infection-prevention interventions.
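To make the persistence measure concrete, here is a small sketch of base-e Shannon diversity computed from taxon counts, together with one possible way to summarize consecutive days below the 2.0 threshold. The function names and the choice of "longest run" as the summary are assumptions for illustration; the abstract does not describe the exact computation used.

```python
import math

def shannon_diversity(counts):
    """Shannon diversity (natural log) from a list of taxon read counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)

def longest_low_diversity_run(daily_values, threshold=2.0):
    """Length of the longest run of consecutive days with diversity < threshold."""
    longest = current = 0
    for value in daily_values:
        current = current + 1 if value < threshold else 0
        longest = max(longest, current)
    return longest

# Four equally abundant taxa give a diversity of ln(4), which is below 2.0.
even_community = shannon_diversity([10, 10, 10, 10])
# Hypothetical daily diversity values for one subject.
run_length = longest_low_diversity_run([2.5, 1.8, 1.9, 2.1, 1.0])
```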
Subjects
Microbiota, Ventilator-Associated Pneumonia, Respiratory Tract Infections, Critical Illness, Cross-Sectional Studies, Humans, Microbiota/genetics, Ventilator-Associated Pneumonia/microbiology, 16S Ribosomal RNA/genetics, Respiratory System, Respiratory Tract Infections/microbiology, Mechanical Ventilators
ABSTRACT
Researchers are often interested in predicting outcomes, detecting distinct subgroups of their data, or estimating causal treatment effects. Pathological data distributions that exhibit skewness and zero-inflation complicate these tasks, requiring highly flexible, data-adaptive modeling. In this paper, we present a multipurpose Bayesian nonparametric model for continuous, zero-inflated outcomes that simultaneously predicts structural zeros, captures skewness, and clusters patients with similar joint data distributions. The flexibility of our approach yields predictions that capture the joint data distribution better than commonly used zero-inflated methods. Moreover, we demonstrate that our model can be coherently incorporated into a standardization procedure for computing causal effect estimates that are robust to such data pathologies. Uncertainty at all levels of this model flows through to the causal effect estimates of interest, allowing easy point estimation, interval estimation, and posterior predictive checks verifying positivity, a required causal identification assumption. Our simulation results show point estimates to have low bias and interval estimates to have close to nominal coverage under complicated data settings. Under simpler settings, these results hold while incurring lower efficiency loss than comparator methods. We use our proposed method to analyze zero-inflated inpatient medical costs among endometrial cancer patients receiving either chemotherapy or radiation therapy in the SEER-Medicare database.
Subjects
Medicare, Statistical Models, Aged, Bayes Theorem, Causality, Cluster Analysis, Humans, United States
ABSTRACT
Substantial advances in Bayesian methods for causal inference have been made in recent years. We provide an introduction to Bayesian inference for causal effects for practicing statisticians who have some familiarity with Bayesian models and would like an overview of what it can add to causal estimation in practical settings. In the paper, we demonstrate how priors can induce shrinkage and sparsity in parametric models and be used to perform probabilistic sensitivity analyses around causal assumptions. We provide an overview of nonparametric Bayesian estimation and survey its applications in the causal inference literature. Inference in both the point-treatment and time-varying treatment settings is considered. For the latter, we explore both static and dynamic treatment regimes. Throughout, we illustrate implementation using off-the-shelf open source software. We hope to leave the reader with implementation-level knowledge of Bayesian causal inference using both parametric and nonparametric models. All synthetic examples and code used in the paper are publicly available on a companion GitHub repository.
Subjects
Software, Bayes Theorem, Causality, Humans
ABSTRACT
OBJECTIVES: To evaluate the impact of the Community-Based Care Management (CBCM) program on total costs of care and utilization among adult high-need, high-cost patients enrolled in a Medicaid managed care organization (MCO). CBCM was a Medicaid insurer-led care coordination and disease management program staffed by nurse care managers paired with community health workers. STUDY DESIGN: Retrospective cohort analysis. METHODS: We obtained deidentified health plan claims data, enrollment information, and the MCO's monthly registry of the top 10% of costliest patients. The analysis included 896 patients enrolled in CBCM over the course of 2 years (January 2016 to December 2017) and a propensity score-matched cohort of high-cost patients (n = 2152) who received primary care at sites that did not participate in CBCM during the same time period. The primary outcomes were total costs of care and utilization in the 12-month period after enrollment. Secondary outcomes included utilization by care setting: outpatient, inpatient, emergency department, pharmacy, postacute care, and all other remaining sites. We used zero-inflated gamma and Poisson regression models to estimate average differences in postperiod costs and utilization between CBCM enrollees versus non-CBCM enrollees. RESULTS: We did not observe meaningful differences in total costs or visit frequency among CBCM enrollees relative to non-CBCM enrollees. CONCLUSIONS: Although our study found no association between the CBCM program and subsequent cost or utilization outcomes, understanding why these outcomes were not achieved will inform how future Medicaid programs are designed to achieve better patient outcomes and lower costs.
Subjects
Insurance Carriers, Managed Care Programs/organization & administration, Medicaid/organization & administration, Patient Acceptance of Health Care/statistics & numerical data, Patient Care Management/organization & administration, Adult, Age Factors, Community Health Workers/organization & administration, Female, Humans, Male, Managed Care Programs/economics, Medicaid/economics, Middle Aged, Patient Care Management/economics, Patient Care Team/organization & administration, Retrospective Studies, Sex Factors, Socioeconomic Factors, United States
ABSTRACT
Importance: The effect of the Patient Protection and Affordable Care Act's Medicaid expansion on cancer care delivery and outcomes is unknown. Patients with cancer are a high-risk group for whom treatment delays are particularly detrimental. Objective: To examine the association between Medicaid expansion and changes in insurance status, stage at diagnosis, and timely treatment among patients with incident breast, colon, and non-small cell lung cancer. Design, Setting, and Participants: This quasi-experimental, difference-in-differences (DID) cross-sectional study included nonelderly adults (aged 40-64 years) with a new diagnosis of invasive breast, colon, or non-small cell lung cancer from January 1, 2011, to December 31, 2016, in the National Cancer Database, a hospital-based registry capturing more than 70% of incident cancer diagnoses in the United States. Data were analyzed from March 8 to August 15, 2019. Exposures: Residence in a state that expanded Medicaid on January 1, 2014. Main Outcomes and Measures: The primary outcomes were insurance status, cancer stage, and timely treatment within 30 and 90 days of diagnosis. Results: A total of 925 543 patients (78.6% women; mean [SD] age, 55.0 [6.5] years; 14.2% black; and 5.7% Hispanic) had a new diagnosis of invasive breast (58.9%), colon (14.6%), or non-small cell lung (26.5%) cancer; 48.3% resided in Medicaid expansion states and 51.7% resided in nonexpansion states. Compared with nonexpansion states, the percentage of uninsured patients decreased more in expansion states (adjusted DID, -0.7 [95% CI, -1.2 to -0.3] percentage points), and the percentage of early-stage cancer diagnoses rose more in expansion states (adjusted DID, 0.8 [95% CI, 0.3 to 1.2] percentage points).
Among the 848 329 patients who underwent cancer-directed therapy within 365 days of diagnosis, the percentage treated within 30 days declined from 52.7% before to 48.0% after expansion in expansion states (difference, -4.7 [95% CI, -5.1 to -4.5] percentage points). In nonexpansion states, this percentage declined from 56.9% to 51.5% (difference, -5.4 [95% CI, -5.6 to -5.1] percentage points), yielding no statistically significant DID in timely treatment associated with Medicaid expansion (adjusted DID, 0.6 [95% CI, -0.2 to 1.4] percentage points). Conclusions and Relevance: This study found that, among patients with incident breast, colon, and lung cancer, Medicaid expansion was associated with a decreased rate of uninsured patients and increased rate of early-stage cancer diagnosis; no evidence of improvement or decrement in the rate of timely treatment was found. Further research is warranted to understand Medicaid expansion's effect on the treatment patterns and health outcomes of patients with cancer.
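The difference-in-differences arithmetic behind the timely-treatment result can be checked from the reported percentages. The sketch below computes only the unadjusted DID; the abstract reports an adjusted DID of 0.6, which comes from a covariate-adjusted model not reproduced here. The function name is illustrative.

```python
def difference_in_differences(pre_treated, post_treated, pre_control, post_control):
    """Unadjusted DID: change in the treated group minus change in the control group."""
    return (post_treated - pre_treated) - (post_control - pre_control)

# Percentage treated within 30 days, as reported in the abstract.
unadjusted_did = difference_in_differences(
    pre_treated=52.7, post_treated=48.0,   # expansion states
    pre_control=56.9, post_control=51.5,   # nonexpansion states
)
print(round(unadjusted_did, 1))  # prints 0.7
```

The unadjusted value of 0.7 percentage points is close to the adjusted estimate of 0.6, and the reported confidence interval (-0.2 to 1.4) includes zero.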
Subjects
Insurance Coverage/statistics & numerical data, Medicaid, Neoplasms/epidemiology, Patient Protection and Affordable Care Act, Time-to-Treatment/statistics & numerical data, Aged, Cross-Sectional Studies, Female, Humans, Male, Middle Aged, Neoplasms/economics, Neoplasms/therapy, United States
ABSTRACT
Objective: Describe the development of a claims-based classifier utilizing machine learning to identify patients with probable Lennox-Gastaut syndrome (LGS) from six state Medicaid programs. Methods: Patients were included if they had ≥2 medical claims ≥30 days apart for specified or unspecified epilepsy, excluding those with ≥1 claim for petit mal status. The LGS classifier utilized a random forest algorithm, a compilation of thousands of binary decision trees in which machine-generated predictor variables split the data set into branches that predict the presence or absence of LGS. To construct the splitting rules, the importance of each candidate variable was determined by calculating the mean decrease in Gini impurity. Training and testing were performed on two data sets (30% and 70%) using a "true" LGS and non-LGS patient population. Performance was compared with logistic regression and single tree methodology. Results: Using a 60% probability threshold, which yielded the highest sensitivity (97.3%) and specificity (95.6%), the classifier identified approximately 4% of patients with epilepsy as probable LGS. The most important input variables included number of distinct antiepileptic drugs received, epilepsy-related outpatient/inpatient visits, electroencephalogram procedures and claims for delayed development. The random forest methodology outperformed logistic regression and single tree methodology. Most of the important LGS predictor characteristics identified by the classifier were statistically significantly associated with LGS status (p < .05). Conclusions: The claims-based LGS classifier showed high sensitivity and specificity, outperformed single tree and logistic regression methodologies and identified a prevalence of probable LGS that was similar to previously published estimates.
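The Gini impurity criterion mentioned in the Methods can be illustrated with a short sketch. This shows the impurity of a node and the decrease produced by a single binary split; a random forest averages such decreases over all splits that use a given variable. The function names and toy labels are assumptions for illustration and do not reproduce the classifier described above.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity of a node: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def gini_decrease(parent, left, right):
    """Impurity decrease from splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(parent) - weighted_children

# Toy node: 1 = LGS, 0 = non-LGS. A split that separates the classes
# perfectly removes all of the parent's impurity.
parent_node = [1, 1, 0, 0]
perfect_split = gini_decrease(parent_node, [1, 1], [0, 0])
```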
Subjects
Lennox-Gastaut Syndrome/diagnosis, Medicaid, Statistical Models, Healthcare Administrative Claims, Factual Databases, Decision Trees, Humans, Status Epilepticus, United States
ABSTRACT
Previous studies indicate racial/ethnic differences in health care utilization for pediatric atopic dermatitis (AD), but do not account for disease severity impact. We sought to examine the relationship between race/ethnicity and health care utilization, both overall and by specific visit type, while accounting for AD control. A longitudinal cohort study of children with AD in the United States was performed to evaluate the association between race/ethnicity and health care utilization for AD. AD control and health care utilization were assessed biannually. Our study included 7,522 children (34.2% white, 54.2% black, and 11.5% Hispanic) who were followed for a median of 4 years (interquartile range 0.9-8.4 years). After adjusting for sociodemographic and other factors, black and Hispanic children were up to nearly threefold more likely than white children to receive medical care for AD across almost all levels of AD control. Black and Hispanic children had higher odds of primary care and emergency visits compared to whites. Black children with poorly controlled AD were significantly less likely to see a dermatologist than white children with similarly poorly controlled AD (odds ratio = 0.74, 95% confidence interval = 0.64-0.85 for limited control; odds ratio = 0.59, 95% confidence interval = 0.47-0.76 for uncontrolled AD). Together, these findings suggest the presence of racial/ethnic disparities in health care utilization for AD.
Subjects
Ambulatory Care/statistics & numerical data, Atopic Dermatitis/therapy, Emergency Medical Services/statistics & numerical data, Healthcare Disparities/ethnology, Patient Acceptance of Health Care/statistics & numerical data, Black or African American/statistics & numerical data, Child, Preschool Child, Atopic Dermatitis/ethnology, Female, Healthcare Disparities/statistics & numerical data, Hispanic or Latino/statistics & numerical data, Humans, Longitudinal Studies, Male, United States, White People/statistics & numerical data
Subjects
Drug Industry/economics, Medicare Part D/economics, Physicians' Practice Patterns/economics, Tumor Necrosis Factor Inhibitors/economics, Drug Industry/trends, Female, Humans, Male, Medicare Part D/trends, Physicians' Practice Patterns/trends, Tumor Necrosis Factor Inhibitors/therapeutic use, United States
ABSTRACT
Phenotyping, i.e., identification of patients possessing a characteristic of interest, is a fundamental task for research conducted using electronic health records. However, challenges to this task include imperfect sensitivity and specificity of clinical codes and inconsistent availability of more detailed data such as laboratory test results. Despite these challenges, most existing electronic health records-derived phenotypes are rule-based, consisting of a series of Boolean arguments informed by expert knowledge of the disease of interest and its coding. The objective of this paper is to introduce a Bayesian latent phenotyping approach that accounts for imperfect data elements and missing-not-at-random missingness patterns and that can be used when no gold-standard data are available. We conducted simulation studies to compare alternative phenotyping methods under different patterns of missingness and applied these approaches to a cohort of 68 265 children at elevated risk for type 2 diabetes mellitus (T2DM). In simulation studies, the latent class approach had similar sensitivity to a rule-based approach (95.9% vs 91.9%) while substantially improving specificity (99.7% vs 90.8%). In the PEDSnet cohort, we found that biomarkers and clinical codes were strongly associated with latent T2DM status. The latent T2DM class was also strongly predictive of missingness in biomarkers. Glucose was missing in 83.4% of patients (odds ratio for latent T2DM status = 0.52) while hemoglobin A1c was missing in 91.2% (odds ratio for latent T2DM status = 0.03), suggesting missing-not-at-random missingness. The latent phenotype approach may substantially improve on rule-based phenotyping.