ABSTRACT
The CDK4/6 inhibitor palbociclib blocks cell cycle progression in estrogen receptor-positive, human epidermal growth factor receptor 2-negative (ER+/HER2-) breast tumor cells. Despite the drug's success in improving patient outcomes, a small percentage of tumor cells continues to divide in the presence of palbociclib, a phenomenon we refer to as fractional resistance. It is critical to understand the cellular mechanisms underlying fractional resistance because the precise percentage of resistant cells in patient tissue is a strong predictor of clinical outcomes. Here, we hypothesize that fractional resistance arises from cell-to-cell differences in core cell cycle regulators that allow a subset of cells to escape CDK4/6 inhibitor therapy. We used multiplex, single-cell imaging to identify fractionally resistant cells in both cultured and primary breast tumor samples resected from patients. Resistant cells showed premature accumulation of multiple G1 regulators including E2F1, retinoblastoma protein, and CDK2, as well as enhanced sensitivity to pharmacological inhibition of CDK2 activity. Using trajectory inference approaches, we show how plasticity among cell cycle regulators gives rise to alternate cell cycle "paths" that allow individual tumor cells to escape palbociclib treatment. Understanding drivers of cell cycle plasticity, and how to eliminate resistant cell cycle paths, could lead to cancer therapies that target fractionally resistant cells and improve patient outcomes.
Subject(s)
Breast Neoplasms, Piperazines, Pyridines, Humans, Female, Cell Cycle, Cell Division, Piperazines/pharmacology, Piperazines/therapeutic use, Breast Neoplasms/drug therapy, Cyclin-Dependent Kinase 4/metabolism, Cyclin-Dependent Kinase 6/metabolism, Protein Kinase Inhibitors/pharmacology
ABSTRACT
We recently developed a machine-learning subgrouping algorithm, iterative causal forest (iCF), to identify subgroups with heterogeneous treatment effects (HTEs) using predefined covariates. However, such predefined covariates may miss or poorly define important features leading to inaccurate subgrouping. To address such limitations, we developed a new semi-automatic subgrouping algorithm, hdiCF, which adapts methodology from high-dimensional propensity score for feature recognition in claims data. The hdiCF algorithm has 3 steps: 1) high-dimensional feature identification by International Classification of Diseases, Current Procedural Terminology, and Anatomical Therapeutic Chemical codes (in/outpatient diagnoses, procedures, prescriptions) and creation of ordinal variables by frequency of occurrence; 2) propensity score trimming and high-dimensional feature preparation; 3) iCF implementation to identify subgroups. We applied hdiCF in a 20% random sample of fee-for-service Medicare beneficiaries who initiated sodium-glucose cotransporter-2 inhibitors (SGLT2i) or glucagon-like peptide-1 receptor agonists to identify subgroups with HTEs for incidence of hospitalized heart failure. HdiCF findings were consistent with studies suggesting SGLT2i to be more beneficial for patients with pre-existing heart failure or chronic kidney disease. HdiCF is not dependent on prior hypotheses about HTEs and identifies subgroups with markers for potential HTEs in real-world evidence studies where active-comparator, new-user study designs limit the potential for unmeasured confounding.
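The first hdiCF step, turning code frequencies into ordinal variables, might be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the toy codes, and the cut points (lower median and a nearest-rank 75th percentile among patients who have the code, with strict exceedance) are assumptions loosely modeled on the high-dimensional propensity score convention.

```python
from collections import Counter


def ordinal_code_features(patient_codes):
    """Map each patient's code counts to an ordinal level (0-3) per code.

    Assumed levels (hypothetical cut points, not taken from the paper):
      0 = code never recorded
      1 = recorded at least once
      2 = count above the median among patients with the code
      3 = count also above the 75th percentile among those patients
    """
    counts = {pid: Counter(codes) for pid, codes in patient_codes.items()}
    all_codes = sorted({code for cnt in counts.values() for code in cnt})
    features = {pid: {} for pid in patient_codes}
    for code in all_codes:
        # Counts among patients who have the code at least once.
        observed = sorted(cnt[code] for cnt in counts.values() if cnt[code] > 0)
        median = observed[(len(observed) - 1) // 2]
        q75 = observed[(3 * (len(observed) - 1)) // 4]
        for pid, cnt in counts.items():
            n = cnt[code]
            features[pid][code] = int(n >= 1) + int(n > median) + int(n > q75)
    return features


# Toy claims data: patient id -> list of diagnosis codes seen at baseline.
toy = {
    "p1": ["E11", "E11", "E11", "I50"],
    "p2": ["E11"],
    "p3": ["E11", "E11"],
    "p4": [],
}
features = ordinal_code_features(toy)
```

The resulting ordinal columns would then pass through propensity score trimming (step 2) before being handed to iCF (step 3); neither of those steps is shown here.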
ABSTRACT
CpG site methylation patterns have potential to improve differentiation of high-grade screening-detected cervical abnormalities. We assessed CpG differential methylation (DM) and differential variability (DV) in high-grade (CIN2+) vs. low-grade (≤CIN1) lesions. In ≤CIN1 (n=117) and CIN2+ (n=31) samples, cervical sample DNA underwent testing with Illumina HumanMethylation arrays. We assessed DM and DV of CpG methylation M values among nine cervical cancer-associated genes. We fit CpG-specific linear models and estimated empirical Bayes standard errors and false discovery rates (FDR). An exploratory epigenome-wide association study (EWAS) aimed to detect novel DM and DV CpGs (FDR<0.05) and Gene Ontology (GO) term enrichment. Compared to ≤CIN1, CIN2+ exhibited greater methylation at CCNA1 Cluster 1 (M value difference 0.24; 95% CI 0.04, 0.43) and RARB Cluster 2 (0.16; 95% CI 0.05, 0.28), and lower methylation at CDH1 Cluster 1 (-0.15; 95% CI -0.26, -0.04). CIN2+ exhibited lower variability at CDH1 Cluster 2 (variation difference -0.24; 95% CI -0.41, -0.05) and FHIT Cluster 1 (-0.30; 95% CI -0.50, -0.09). EWAS detected 3,534 DM and 270 DV CpGs. Forty-four GO terms were enriched with DM CpGs related to transcriptional, structural, developmental, and neuronal processes. Methylation patterns may help triage screening-detected cervical abnormalities and inform US screening algorithms.
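For readers unfamiliar with the M values analyzed above: an M value is conventionally the base-2 logit of the methylation beta value (the proportion methylated). The sketch below shows that conversion only; the function names are illustrative, and array-processing pipelines often add small offsets to the raw intensities that are not modeled here.

```python
import math


def beta_to_m(beta):
    """Convert a methylation beta value (0 < beta < 1) to an M value."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    return math.log2(beta / (1.0 - beta))


def m_to_beta(m):
    """Invert the transform: recover the beta value from an M value."""
    return 2.0 ** m / (2.0 ** m + 1.0)


# A half-methylated site has M = 0; M grows without bound as beta approaches 1.
half_methylated = beta_to_m(0.5)
```

Because the transform is nonlinear, an M value difference such as 0.24 corresponds to a different change in percent methylation depending on the baseline level.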
ABSTRACT
Precision medicine is a promising framework for generating evidence to improve health and health care. Yet, a gap persists between the ever-growing number of statistical precision medicine strategies for evidence generation and implementation in real-world clinical settings, and the strategies for closing this gap will likely be context-dependent. In this paper, we consider the specific context of partial compliance to wound management among patients with peripheral artery disease. Using a Gaussian process surrogate for the value function, we show the feasibility of using Bayesian optimization to learn optimal individualized treatment rules. Further, we expand beyond the common precision medicine task of learning an optimal individualized treatment rule to the characterization of classes of individualized treatment rules and show how those findings can be translated into clinical contexts.
Subject(s)
Precision Medicine, Humans, Bayes Theorem
ABSTRACT
Machine learning (ML) has seen impressive growth in health science research due to its capacity for handling complex data to perform a range of tasks, including unsupervised learning, supervised learning, and reinforcement learning. To aid health science researchers in understanding the strengths and limitations of ML and to facilitate its integration into their studies, we present here a guideline for integrating ML into an analysis through a structured framework, covering steps from framing a research question to study design and analysis techniques for specialized data types.
Subject(s)
Machine Learning, Reinforcement (Psychology), Humans, Research Design, Research Personnel
ABSTRACT
Multilevel interventions (MLIs) hold promise for reducing health inequities by intervening at multiple types of social determinants of health consistent with the socioecological model of health. In spite of their potential, methodological challenges related to study design compounded by a lack of tools for sample size calculation inhibit their development. We help address this gap by proposing the Multilevel Intervention Stepped Wedge Design (MLI-SWD), a hybrid experimental design which combines cluster-level (CL) randomization using a Stepped Wedge design (SWD) with independent individual-level (IL) randomization. The MLI-SWD is suitable for MLIs where the IL intervention has a low risk of interference between individuals in the same cluster, and it enables estimation of the component IL and CL treatment effects, their interaction, and the combined intervention effect. The MLI-SWD accommodates cross-sectional and cohort designs as well as both incomplete (clusters are not observed in every study period) and complete observation patterns. We adapt recent work using generalized estimating equations for SWD sample size calculation to the multilevel setting and provide an R package for power and sample size calculation. Furthermore, motivated by our experiences with the ongoing NC Works 4 Health study, we consider how to apply the MLI-SWD when individuals join clusters over the course of the study. This situation arises when unemployment MLIs include IL interventions that are delivered while the individual is unemployed. This extension requires carefully considering whether the study interventions will satisfy additional causal assumptions but could permit randomization in new settings.
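The two randomization layers described above can be illustrated with a small sketch. It assumes a complete design in which every cluster starts in control, crosses over once, and is treated by the final period, plus an independent 1:1 individual-level assignment. The function names and the even spreading of clusters over steps are assumptions made for illustration, and the sketch is unrelated to the R package mentioned in the abstract.

```python
import random


def stepped_wedge_schedule(n_clusters, n_periods, seed=0):
    """Return {cluster: [0/1 cluster-level treatment indicator per period]}.

    Clusters are put in random order and spread as evenly as possible over
    the crossover steps 1..n_periods-1; period 0 is all-control.
    """
    if n_periods < 2:
        raise ValueError("a stepped wedge needs at least two periods")
    rng = random.Random(seed)
    order = list(range(n_clusters))
    rng.shuffle(order)
    n_steps = n_periods - 1
    schedule = {}
    for rank, cluster in enumerate(order):
        start = 1 + (rank * n_steps) // n_clusters  # first treated period
        schedule[cluster] = [int(t >= start) for t in range(n_periods)]
    return schedule


def randomize_individuals(ids, seed=0):
    """Independent 1:1 individual-level assignment (0 = control, 1 = IL arm)."""
    rng = random.Random(seed)
    return {i: rng.randint(0, 1) for i in ids}


schedule = stepped_wedge_schedule(n_clusters=6, n_periods=4, seed=1)
arms = randomize_individuals(["id1", "id2", "id3", "id4"], seed=1)
```

Crossing the cluster-level indicator with the individual-level arm yields the four exposure combinations needed to estimate the component effects, their interaction, and the combined effect.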
Subject(s)
Research Design, Humans, Sample Size, Randomized Controlled Trials as Topic, Cross-Sectional Studies
ABSTRACT
BACKGROUND: Studies on the efficacy of peanut sublingual immunotherapy (SLIT) are limited. The durability of desensitization after SLIT has not been well described. OBJECTIVE: We sought to evaluate the efficacy and safety of 4-mg peanut SLIT and persistence of desensitization after SLIT discontinuation. METHODS: Challenge-proven peanut-allergic 1- to 11-year-old children were treated with open-label 4-mg peanut SLIT for 48 months. Desensitization after peanut SLIT was assessed by a 5000-mg double-blind, placebo-controlled food challenge (DBPCFC). A novel randomly assigned avoidance period of 1 to 17 weeks was followed by the DBPCFC. Skin prick test results, immunoglobulin levels, basophil activation test results, and TH1, TH2, and IL-10 cytokines were measured longitudinally. Safety was assessed through patient-reported home diaries. RESULTS: Fifty-four participants were enrolled and 47 (87%) completed peanut SLIT and the 48-month DBPCFC per protocol. The mean successfully consumed dose (SCD) during the DBPCFC increased from 48 to 2723 mg of peanut protein after SLIT (P < .0001), with 70% achieving clinically significant desensitization (SCD > 800 mg) and 36% achieving full desensitization (SCD = 5000 mg). Modeled median time to loss of clinically significant desensitization was 22 weeks. Peanut skin prick test; peanut-specific IgE, IgG4, and IgG4/IgE ratio; and peanut-stimulated basophil activation test, IL-4, IL-5, IL-13, IFN-γ, and IL-10 changed significantly compared with baseline, with changes seen as early as 6 months. Median rate of reaction per dose was 0.5%, with transient oropharyngeal itching being the most common, and there were no dosing symptoms requiring epinephrine. CONCLUSIONS: In this open-label, prospective study, peanut SLIT was safe and induced clinically significant desensitization in most of the children, lasting more than 17 weeks after discontinuation of therapy.
Subject(s)
Peanut Hypersensitivity, Sublingual Immunotherapy, Humans, Child, Infant, Preschool Child, Sublingual Immunotherapy/adverse effects, Sublingual Immunotherapy/methods, Arachis, Immunologic Desensitization/adverse effects, Immunologic Desensitization/methods, Interleukin-10, Prospective Studies, Peanut Hypersensitivity/therapy, Peanut Hypersensitivity/diagnosis, Immunoglobulin E, Allergens, Immunoglobulin G, Oral Administration
ABSTRACT
Importance: Accurate assessment of gestational age (GA) is essential to good pregnancy care but often requires ultrasonography, which may not be available in low-resource settings. This study developed a deep learning artificial intelligence (AI) model to estimate GA from blind ultrasonography sweeps and incorporated it into the software of a low-cost, battery-powered device. Objective: To evaluate GA estimation accuracy of an AI-enabled ultrasonography tool when used by novice users with no prior training in sonography. Design, Setting, and Participants: This prospective diagnostic accuracy study enrolled 400 individuals with viable, single, nonanomalous, first-trimester pregnancies in Lusaka, Zambia, and Chapel Hill, North Carolina. Credentialed sonographers established the "ground truth" GA via transvaginal crown-rump length measurement. At random follow-up visits throughout gestation, including a primary evaluation window from 14 0/7 weeks' to 27 6/7 weeks' gestation, novice users obtained blind sweeps of the maternal abdomen using the AI-enabled device (index test) and credentialed sonographers performed fetal biometry with a high-specification machine (study standard). Main Outcomes and Measures: The primary outcome was the mean absolute error (MAE) of the index test and study standard, which was calculated by comparing each method's estimate to the previously established GA and considered equivalent if the difference fell within a prespecified margin of ±2 days. Results: In the primary evaluation window, the AI-enabled device met criteria for equivalence to the study standard, with an MAE (SE) of 3.2 (0.1) days vs 3.0 (0.1) days (difference, 0.2 days [95% CI, -0.1 to 0.5]). Additionally, the percentage of assessments within 7 days of the ground truth GA was comparable (90.7% for the index test vs 92.5% for the study standard). 
Performance was consistent in prespecified subgroups, including the Zambia and North Carolina cohorts and those with high body mass index. Conclusions and Relevance: Between 14 and 27 weeks' gestation, novice users with no prior training in ultrasonography estimated GA as accurately with the low-cost, point-of-care AI tool as credentialed sonographers performing standard biometry on high-specification machines. These findings have immediate implications for obstetrical care in low-resource settings, advancing the World Health Organization goal of ultrasonography estimation of GA for all pregnant people. Trial Registration: ClinicalTrials.gov Identifier: NCT05433519.
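The primary outcome comparison can be made concrete with a short sketch. It computes a mean absolute error against the ground truth and checks a difference against the ±2-day margin. The decision rule used here, that the whole 95% CI for the difference must lie inside the margin, is an assumption, since the abstract does not spell out the rule; the function names and the toy estimates are illustrative.

```python
def mean_absolute_error(estimates, truth):
    """Mean absolute difference, in days, between estimates and ground truth."""
    if len(estimates) != len(truth) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(e - t) for e, t in zip(estimates, truth)) / len(truth)


def within_equivalence_margin(ci_low, ci_high, margin):
    """Assumed rule: equivalence holds if the CI lies strictly inside +/- margin."""
    return -margin < ci_low and ci_high < margin


# Toy gestational ages in days for three assessments (not study data).
toy_mae = mean_absolute_error([100, 95, 103], [98, 98, 98])

# Difference and 95% CI reported in the abstract: 0.2 days (-0.1 to 0.5).
reported_equivalent = within_equivalence_margin(-0.1, 0.5, margin=2.0)
```

Under this assumed rule, the reported interval of -0.1 to 0.5 days sits well inside the ±2-day margin, which matches the abstract's conclusion of equivalence.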
Subject(s)
Artificial Intelligence, Gestational Age, Prenatal Ultrasonography, Adult, Female, Humans, Pregnancy, Biometry/methods, Crown-Rump Length, Point-of-Care Systems/economics, First Pregnancy Trimester, Prospective Studies, Software, Prenatal Ultrasonography/economics, Prenatal Ultrasonography/instrumentation, Prenatal Ultrasonography/methods, Zambia
ABSTRACT
Precisely and efficiently identifying subgroups with heterogeneous treatment effects (HTEs) in real-world evidence studies remains a challenge. Based on the causal forest (CF) method, we developed an iterative CF (iCF) algorithm to identify HTEs in subgroups defined by important variables. Our method iteratively grows different depths of the CF with important effect modifiers, performs plurality votes to obtain decision trees (subgroup decisions) for a family of CFs with different depths, then finds the cross-validated subgroup decision that best predicts the treatment effect as a final subgroup decision. We simulated 12 different scenarios and showed that the iCF outperformed other machine learning methods for interaction/subgroup identification in the majority of scenarios assessed. Using a 20% random sample of fee-for-service Medicare beneficiaries initiating sodium-glucose cotransporter-2 inhibitors (SGLT2i) or glucagon-like peptide-1 receptor agonists (GLP1RA), we implemented the iCF to identify subgroups with HTEs for hospitalized heart failure. Consistent with previous studies suggesting patients with heart failure benefit more from SGLT2i, iCF successfully identified such a subpopulation with HTEs and additive interactions. The iCF is a promising method for identifying subgroups with HTEs in real-world data where the potential for unmeasured confounding can be limited by study design.
ABSTRACT
BACKGROUND: Diet, a key component of type 1 diabetes (T1D) management, modulates the intestinal microbiota and its metabolically active byproducts, including SCFA, through fermentation of dietary carbohydrates such as fiber. However, the diet-microbiome relationship remains largely unexplored in longstanding T1D. OBJECTIVES: We evaluated whether increased carbohydrate intake, including fiber, is associated with increased SCFA-producing gut microbes, SCFA, and intestinal microbial diversity among young adults with longstanding T1D and overweight or obesity. METHODS: Young adult men and women with T1D for ≥1 y, aged 19-30 y, and BMI of 27.0-39.9 kg/m2 at baseline provided stool samples at baseline and 3, 6, and 9 mo of a randomized dietary weight loss trial. Diet was assessed by 1-2 24-h recalls. The abundance of SCFA-producing microbes was measured using 16S rRNA gene sequencing. GC-MS measured fecal SCFA (acetate, butyrate, propionate, and total) concentrations. Adjusted and Bonferroni-corrected generalized estimating equations modeled associations of dietary fiber (total, soluble, and pectins) and carbohydrate (available carbohydrate and fructose) with microbiome-related outcomes. Primary analyses were restricted to data collected before COVID-19 interruptions. RESULTS: Fiber (total and soluble) and carbohydrates (available and fructose) were positively associated with total SCFA and acetate concentrations (n = 40 participants, 52 visits). Each 10 g/d of total and soluble fiber intake was associated with an additional 8.8 µmol/g (95% CI: 4.5, 12.8 µmol/g; P = 0.006) and 24.0 µmol/g (95% CI: 12.9, 35.1 µmol/g; P = 0.003) of fecal acetate, respectively. Available carbohydrate intake was positively associated with SCFA producers Roseburia and Ruminococcus gnavus. All diet variables except pectin were inversely associated with normalized abundance of Bacteroides and Alistipes. Fructose was inversely associated with Akkermansia abundance.
CONCLUSIONS: In young adults with longstanding T1D, fiber and carbohydrate intake were associated positively with fecal SCFA but had variable associations with SCFA-producing gut microbes. Controlled feeding studies should determine whether gut microbes and SCFA can be directly manipulated in T1D.
Subject(s)
COVID-19, Type 1 Diabetes Mellitus, Gastrointestinal Microbiome, Female, Humans, Male, Young Adult, Acetates, Dietary Fiber/analysis, Eating, Volatile Fatty Acids/analysis, Feces/chemistry, Fructose, Obesity, Overweight, 16S Ribosomal RNA/genetics
ABSTRACT
BACKGROUND: Methylation levels may be associated with and serve as markers to predict risk of progression of precancerous cervical lesions. We conducted an epigenome-wide association study (EWAS) of CpG methylation and progression to high-grade cervical intraepithelial neoplasia (CIN2+) following an abnormal screening test. METHODS: A prospective US cohort of 289 colposcopy patients with normal or CIN1 enrollment histology was assessed. Baseline cervical sample DNA was analyzed using Illumina HumanMethylation 450K (n = 76) or EPIC 850K (n = 213) arrays. Participants returned at provider-recommended intervals and were followed up to 5 years via medical records. We assessed continuous CpG M values for 9 cervical cancer-associated genes and time-to-progression to CIN2+. We estimated CpG-specific time-to-event ratios (TTER) and hazard ratios using adjusted, interval-censored Weibull accelerated failure time models. We also conducted an exploratory EWAS to identify novel CpGs with false discovery rate (FDR) < 0.05. RESULTS: At enrollment, median age was 29.2 years; 64.0% were high-risk HPV-positive, and 54.3% were non-white. During follow-up (median 24.4 months), 15 participants progressed to CIN2+. Greater methylation levels were associated with a shorter time-to-CIN2+ for CADM1 cg03505501 (TTER = 0.28; 95% CI 0.12, 0.63; FDR = 0.03) and RARB Cluster 1 (TTER = 0.46; 95% CI 0.29, 0.71; FDR = 0.01). There was evidence of similar trends for DAPK1 cg14286732, PAX1 cg07213060, and PAX1 Cluster 1. The EWAS detected 336 novel progression-associated CpGs, including those located in CpG islands associated with genes FGF22, TOX, COL18A1, GPM6A, XAB2, TIMP2, GSPT1, NR4A2, and APBB1IP. CONCLUSIONS: Using prospective time-to-event data, we detected associations between CADM1-, DAPK1-, PAX1-, and RARB-related CpGs and cervical disease progression, and we identified novel progression-associated CpGs.
IMPACT: Methylation levels at novel CpG sites may help identify individuals with ≤CIN1 histology at higher risk of progression to CIN2+ and inform risk-based cervical cancer screening guidelines.
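As a reading aid for the time-to-event ratios above, the standard Weibull accelerated failure time parameterization can be written as follows. The symbols ($M_i$ for a CpG M value, $\beta_1$ for its coefficient, $\sigma$ for the scale) are notation assumed here, not taken from the paper, and covariate adjustment is omitted.

```latex
\log T_i = \beta_0 + \beta_1 M_i + \sigma\,\varepsilon_i ,
\qquad
\mathrm{TTER} = e^{\beta_1},
\qquad
\mathrm{HR} = e^{-\beta_1/\sigma}
```

Under this parameterization a TTER below 1 means that higher methylation multiplies the expected time to CIN2+ by a factor smaller than 1. For example, a TTER of 0.28 corresponds to a time to progression about 72% shorter per unit of the methylation measure (the unit of scaling is not stated in the abstract). The same coefficient implies a hazard ratio above 1, because the two measures move in opposite directions.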
Subject(s)
Papillomavirus Infections, Uterine Cervical Dysplasia, Uterine Cervical Neoplasms, Female, Humans, United States, Adult, Uterine Cervical Neoplasms/pathology, Prospective Studies, Epigenome, Early Detection of Cancer, DNA Methylation, Uterine Cervical Dysplasia/diagnosis, Papillomavirus Infections/complications, Papillomaviridae/genetics, Cell Adhesion Molecule-1/genetics
ABSTRACT
Personalized intervention strategies, in particular those that modify treatment based on a participant's own response, are a core component of precision medicine approaches. Sequential multiple assignment randomized trials (SMARTs) are growing in popularity and are specifically designed to facilitate the evaluation of sequential adaptive strategies, in particular those embedded within the SMART. Advances in efficient estimation approaches that are able to incorporate machine learning while retaining valid inference can allow for more precise estimates of the effectiveness of these embedded regimes. However, to the best of our knowledge, such approaches have not yet been applied as the primary analysis in SMART trials. In this paper, we present a robust and efficient approach using targeted maximum likelihood estimation (TMLE) for estimating and contrasting expected outcomes under the dynamic regimes embedded in a SMART, together with generating simultaneous confidence intervals for the resulting estimates. We contrast this method with two alternatives (G-computation and inverse probability weighting estimators). The precision gains and robust inference achievable through the use of TMLE to evaluate the effects of embedded regimes are illustrated using both outcome-blind simulations and a real-data analysis from the Adaptive Strategies for Preventing and Treating Lapses of Retention in Human Immunodeficiency Virus (HIV) Care (ADAPT-R) trial (NCT02338739), a SMART with a primary aim of identifying strategies to improve retention in HIV care among people living with HIV in sub-Saharan Africa.
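Of the two comparator estimators named above, inverse probability weighting is the simplest to sketch. The example below estimates the mean outcome under one embedded regime in a hypothetical two-stage SMART in which only non-responders are re-randomized. The record fields, the 0.5 randomization probabilities, and the toy data are all assumptions; this does not reproduce the ADAPT-R design or the TMLE approach the paper proposes.

```python
def ipw_regime_mean(records, regime, p_stage1=0.5, p_stage2=0.5):
    """Normalized IPW estimate of the mean outcome under an embedded regime.

    records: dicts with keys "a1" (first-stage arm), "responder" (bool),
             "a2" (second-stage arm, None if not re-randomized), "y" (outcome).
    regime:  (first_stage_arm, second_stage_arm_for_non_responders).
    """
    first, rescue = regime
    weighted_sum = 0.0
    weight_total = 0.0
    for rec in records:
        if rec["a1"] != first:
            continue  # inconsistent with the regime at stage 1
        if rec["responder"]:
            weight = 1.0 / p_stage1  # responders were randomized only once
        elif rec["a2"] == rescue:
            weight = 1.0 / (p_stage1 * p_stage2)  # randomized twice
        else:
            continue  # non-responder who received the other stage-2 arm
        weighted_sum += weight * rec["y"]
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no participants are consistent with this regime")
    return weighted_sum / weight_total


# Hypothetical trial records (arms "A"/"B" at stage 1, "C"/"D" at stage 2).
toy = [
    {"a1": "A", "responder": True, "a2": None, "y": 1.0},
    {"a1": "A", "responder": False, "a2": "C", "y": 0.0},
    {"a1": "A", "responder": False, "a2": "D", "y": 1.0},
    {"a1": "B", "responder": True, "a2": None, "y": 1.0},
]
value_a_then_c = ipw_regime_mean(toy, ("A", "C"))
```

Non-responders carry twice the weight of responders here because each stands in for the non-responders who were randomized to the other second-stage arm.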
Subject(s)
HIV Infections, Humans, Randomized Controlled Trials as Topic, Probability, HIV Infections/drug therapy
ABSTRACT
AIMS: Co-management of weight and glycaemia is critical yet challenging in type 1 diabetes (T1D). We evaluated the effect of a hypocaloric low carbohydrate, hypocaloric moderate low fat, and Mediterranean diet without calorie restriction on weight and glycaemia in young adults with T1D and overweight or obesity. MATERIALS AND METHODS: We implemented a 9-month Sequential, Multiple Assignment, Randomized Trial pilot among adults aged 19-30 years with T1D for ≥1 year and body mass index 27-39.9 kg/m2. Re-randomization occurred at 3 and 6 months if the assigned diet was not acceptable or not effective. We report results from the initial 3-month diet period and re-randomization statistics before shutdowns due to COVID-19 for primary [weight, haemoglobin A1c (HbA1c), percentage of time below range <70 mg/dl] and secondary outcomes [body fat percentage, percentage of time in range (70-180 mg/dl), and percentage of time below range <54 mg/dl]. Models adjusted for design, demographic and clinical covariates tested changes in outcomes and diet differences. RESULTS: Adjusted weight and HbA1c (n = 38) changed by -2.7 kg (95% CI -3.8, -1.5, P < .0001) and -0.91 percentage points (95% CI -1.5, -0.30, P = .005), respectively, while adjusted body fat percentage remained stable, on average (P = .21). Hypoglycaemia indices remained unchanged following adjustment (n = 28, P > .05). Variability in all outcomes, including weight change, was considerable (57.9% were re-randomized primarily due to loss of <2% body weight). No outcomes varied by diet. CONCLUSIONS: Three months of a diet, irrespective of macronutrient distribution or caloric restriction, resulted in weight loss while improving or maintaining HbA1c levels without increasing hypoglycaemia in adults with T1D.
Subject(s)
Type 1 Diabetes Mellitus, Hypoglycemia, Obesity, Overweight, Weight Loss, Humans, Young Adult, Type 1 Diabetes Mellitus/therapy, Type 1 Diabetes Mellitus/complications, Glycated Hemoglobin, Hypoglycemia/complications, Obesity/complications, Obesity/therapy, Overweight/complications, Overweight/therapy
ABSTRACT
BACKGROUND AND AIMS: Disordered eating (DE) in type 1 diabetes (T1D) includes insulin restriction for weight loss with serious complications. Gut microbiota-derived short chain fatty acids (SCFA) may benefit host metabolism but are reduced in T1D. We evaluated the hypothesis that DE and insulin restriction were associated with reduced SCFA-producing gut microbes, SCFA, and intestinal microbial diversity in adults with T1D. METHODS AND RESULTS: We collected stool samples at four timepoints in a hypothesis-generating gut microbiome pilot study ancillary to a weight management pilot in young adults with T1D. 16S ribosomal RNA gene sequencing measured the normalized abundance of SCFA-producing intestinal microbes. Gas-chromatography mass-spectrometry measured SCFA (total, acetate, butyrate, and propionate). The Diabetes Eating Problem Survey-Revised (DEPS-R) assessed DE and insulin restriction. Covariate-adjusted and Bonferroni-corrected generalized estimating equations modeled the associations. COVID-19 interrupted data collection, so models were repeated restricted to pre-COVID-19 data. Data were available for 45 participants at 109 visits, which included 42 participants at 65 visits pre-COVID-19. Participants reported restricting insulin "At least sometimes" at 53.3% of visits. Pre-COVID-19, each 5-point DEPS-R increase was associated with a difference of -0.34 (95% CI -0.56, -0.13; p = 0.07) in the normalized abundance of genus Anaerostipes, and the normalized abundance of genus Lachnospira differed by -0.94 (95% CI -1.5, -0.42; p = 0.02) when insulin restriction was reported "At least sometimes" compared with "Rarely or Never". CONCLUSION: DE and insulin restriction were associated with a reduced abundance of SCFA-producing gut microbes pre-COVID-19. Additional studies are needed to confirm these associations to inform microbiota-based therapies in T1D.
Subject(s)
COVID-19, Type 1 Diabetes Mellitus, Feeding and Eating Disorders, Gastrointestinal Microbiome, Humans, Young Adult, Type 1 Diabetes Mellitus/diagnosis, Pilot Projects, Volatile Fatty Acids/metabolism, Insulin, Feces
ABSTRACT
In recent years, the field of precision medicine has seen many advancements. Significant focus has been placed on creating algorithms to estimate individualized treatment rules (ITRs), which map from patient covariates to the space of available treatments with the goal of maximizing patient outcome. Direct learning (D-Learning) is a recent one-step method which estimates the ITR by directly modeling the treatment-covariate interaction. However, when the variance of the outcome is heterogeneous with respect to treatment and covariates, D-Learning does not leverage this structure. Stabilized direct learning (SD-Learning), proposed in this paper, utilizes potential heteroscedasticity in the error term through a residual reweighting which models the residual variance via flexible machine learning algorithms such as XGBoost and random forests. We also develop an internal cross-validation scheme which determines the best residual model among competing models. SD-Learning improves the efficiency of D-Learning estimates in binary and multi-arm treatment scenarios. The method is simple to implement and an easy way to improve existing algorithms within the D-Learning family, including original D-Learning, Angle-based D-Learning (AD-Learning), and Robust D-Learning (RD-Learning). We provide theoretical properties and justification of the optimality of SD-Learning. Head-to-head performance comparisons with D-Learning methods are provided through simulations, which demonstrate improvement in terms of average prediction error (APE), misclassification rate, and empirical value, along with a data analysis of an acquired immunodeficiency syndrome (AIDS) randomized clinical trial.
ABSTRACT
BACKGROUND: Precision medicine is an emerging field that involves the selection of treatments based on patients' individual prognostic data. It is formalized through the identification of individualized treatment rules (ITRs) that maximize a clinical outcome. When the type of outcome is time-to-event, the correct handling of censoring is crucial for estimating reliable optimal ITRs. METHODS: We propose a jackknife estimator of the value function to allow for right-censored data for a binary treatment. The jackknife estimator, or leave-one-out cross-validation approach, can be used to estimate the value function and select optimal ITRs using existing machine learning methods. We address the issue of censoring in survival data by introducing an inverse probability of censoring weighted (IPCW) adjustment in the expression of the jackknife estimator of the value function. In this paper, we estimate the optimal ITR by using random survival forest (RSF) and Cox proportional hazards model (COX). We use a Z-test to compare the optimal ITRs learned by RSF and COX with the zero-order model (or one-size-fits-all). Through simulation studies, we investigate the asymptotic properties and the performance of our proposed estimator under different censoring rates. We illustrate our proposed method on data from a phase III clinical trial in non-small cell lung cancer. RESULTS: Our simulations show that COX outperforms RSF for small sample sizes. As sample sizes increase, the performance of RSF improves, in particular when the expected log failure time is not linear in the covariates. The estimator is approximately normally distributed across different combinations of simulation scenarios and censoring rates. When applied to a non-small cell lung cancer data set, our method determines the zero-order model (ZOM) as the best performing model. This finding highlights the possibility that tailoring may not be needed for this cancer data set.
CONCLUSION: The jackknife approach for estimating the value function in the presence of right-censored data shows satisfactory performance when there is small to moderate censoring. Winsorizing the upper and lower percentiles of the estimated survival weights for computing the IPCWs stabilizes the estimator.
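The IPCW adjustment and the winsorizing step can be illustrated with a small sketch. It estimates the censoring survival function by Kaplan-Meier (treating censoring as the event), evaluates it just before each subject's observed time, and forms weights equal to the event indicator divided by that probability. The function names, the left-limit convention, and the nearest-rank percentile rule in `winsorize` are assumptions made for illustration, not the authors' implementation.

```python
import math


def censoring_survival_left_limit(times, events):
    """Kaplan-Meier estimate of the censoring survival G(t-) at each time.

    times:  observed follow-up times
    events: 1 if the event was observed, 0 if the subject was censored
    """
    g_before = {}
    g = 1.0
    for u in sorted(set(times)):
        g_before[u] = g  # value just before time u
        at_risk = sum(1 for t in times if t >= u)
        censored = sum(1 for t, e in zip(times, events) if t == u and e == 0)
        g *= 1.0 - censored / at_risk
    return [g_before[t] for t in times]


def winsorize(values, p=0.05):
    """Clip values to their empirical p and 1-p percentiles (nearest rank)."""
    ordered = sorted(values)
    n = len(ordered)
    low = ordered[math.floor(p * (n - 1))]
    high = ordered[math.ceil((1.0 - p) * (n - 1))]
    return [min(max(v, low), high) for v in values]


def ipcw_weights(times, events, p=None):
    """IPCW weights: event indicator divided by the censoring survival."""
    g = censoring_survival_left_limit(times, events)
    if p is not None:
        g = winsorize(g, p)  # stabilize extreme survival probabilities
    return [e / gi if e == 1 else 0.0 for e, gi in zip(events, g)]


# Toy data: the subject censored at time 2 gets weight 0, and the later
# events are up-weighted to compensate.
weights = ipcw_weights([1, 2, 3, 4], [1, 0, 1, 1])
```

Censored subjects receive zero weight, and subjects whose events occur late, when few remain uncensored, receive large weights. Those large weights are what winsorizing is meant to stabilize.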
Subject(s)
Non-Small-Cell Lung Carcinoma, Lung Neoplasms, Humans, Non-Small-Cell Lung Carcinoma/therapy, Lung Neoplasms/therapy, Proportional Hazards Models, Probability, Prognosis, Computer Simulation, Survival Analysis
ABSTRACT
Severe asthma accounts for almost half the cost associated with asthma. Severe asthma is driven by heterogeneous molecular mechanisms. Conventional clinical trial design often lacks the power and efficiency to target subgroups with specific pathobiological mechanisms. Furthermore, the validation and approval of new asthma therapies is a lengthy process. A large proportion of that time is taken by clinical trials to validate asthma interventions. The National Institutes of Health Precision Medicine in Severe and/or Exacerbation Prone Asthma (PrecISE) program was established with the goal of designing and executing a trial that uses adaptive design techniques to rapidly evaluate novel interventions in biomarker-defined subgroups of severe asthma, while seeking to refine these biomarker subgroups, and to identify early markers of response to therapy. The novel trial design is an adaptive platform trial conducted under a single master protocol that incorporates precision medicine components. Furthermore, it includes innovative applications of futility analysis, cross-over design with use of shared placebo groups, and early futility analysis to permit more rapid identification of effective interventions. The development and rationale behind the study design are described. The interventions chosen for the initial investigation and the criteria used to identify these interventions are enumerated. The biomarker-based adaptive design and analytic scheme are detailed as well as special considerations involved in the final trial design.
Subject(s)
Asthma , Biomarkers , Precision Medicine , Randomized Controlled Trials as Topic , Humans , Research Design
ABSTRACT
Many problems that appear in biomedical decision-making, such as diagnosing disease and predicting response to treatment, can be expressed as binary classification problems. The support vector machine (SVM) is a popular classification technique that is robust to model misspecification and effectively handles high-dimensional data. The relative costs of false positives and false negatives can vary across application domains. The receiver operating characteristic (ROC) curve provides a visual representation of the trade-off between these two types of errors. Because the SVM does not produce a predicted probability, an ROC curve cannot be constructed in the traditional way of thresholding a predicted probability. However, a sequence of weighted SVMs can be used to construct an ROC curve. Although ROC curves constructed using weighted SVMs have great potential for allowing ROC curve analyses that cannot be done by thresholding predicted probabilities, their theoretical properties have so far been underdeveloped. We propose a method for constructing confidence bands for the SVM ROC curve and provide the theoretical justification for the SVM ROC curve by showing that the risk function of the estimated decision rule is uniformly consistent across the weight parameter. We demonstrate the proposed confidence band method using simulation studies. We present a predictive model for treatment response in breast cancer as an illustrative example.
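The idea of tracing an ROC curve with a sequence of weighted SVMs can be sketched as below. This is an illustrative sketch only: it assumes a linear SVM fitted by plain subgradient descent on a cost-weighted hinge loss, and the function names, learning rate, penalty, and weight grid are hypothetical. The abstract does not specify the solver or kernel, and the confidence band construction is not reproduced here.

```python
import numpy as np

def fit_weighted_linear_svm(X, y, w_pos, lam=0.01, lr=0.1, n_iter=500):
    """Subgradient descent on mean(c_i * hinge_i) + (lam / 2) * ||beta||^2.

    y takes values in {-1, +1}; the cost c_i is w_pos for positives and
    1 - w_pos for negatives (an assumed weighting convention).
    """
    n, p = X.shape
    beta, b = np.zeros(p), 0.0
    cost = np.where(y == 1, w_pos, 1.0 - w_pos)
    for _ in range(n_iter):
        margin = y * (X @ beta + b)
        # Only points inside the margin contribute to the hinge subgradient
        active = (margin < 1).astype(float) * cost
        grad_beta = lam * beta - (active * y) @ X / n
        grad_b = -(active * y).sum() / n
        beta -= lr * grad_beta
        b -= lr * grad_b
    return beta, b

def svm_roc_points(X, y, weights):
    """Fit one weighted SVM per weight and record its (FPR, TPR) operating point."""
    points = []
    for w in weights:
        beta, b = fit_weighted_linear_svm(X, y, w)
        pred = np.where(X @ beta + b >= 0, 1, -1)
        tpr = float(np.mean(pred[y == 1] == 1))
        fpr = float(np.mean(pred[y == -1] == 1))
        points.append((fpr, tpr))
    return points

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(1.0, 1.0, (100, 2)), rng.normal(-1.0, 1.0, (100, 2))])
    y = np.array([1] * 100 + [-1] * 100)
    for fpr, tpr in svm_roc_points(X, y, np.linspace(0.05, 0.95, 10)):
        print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Each weight yields a separate decision rule and therefore a single point in ROC space, in contrast to thresholding one fitted probability model.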
Subject(s)
Breast Neoplasms , Support Vector Machine , Breast Neoplasms/diagnosis , Computer Simulation , Female , Humans , Probability , ROC Curve
ABSTRACT
We propose a multithreshold change plane regression model which naturally partitions the observed subjects into subgroups with different covariate effects. The underlying grouping variable is a linear function of observed covariates and thus multiple thresholds produce change planes in the covariate space. We contribute a novel two-stage estimation approach to determine the number of subgroups, the location of thresholds, and all other regression parameters. In the first stage we adopt a group selection principle to consistently identify the number of subgroups, while in the second stage change point locations and model parameter estimates are refined by a penalized induced smoothing technique. Our procedure allows sparse solutions for relatively moderate- or high-dimensional covariates. We further establish the asymptotic properties of our proposed estimators under appropriate technical conditions. We evaluate the performance of the proposed methods by simulation studies and provide illustrations using two medical data examples. Our proposal for subgroup identification may lead to an immediate application in personalized medicine.
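How thresholds on a linear grouping variable partition subjects can be illustrated with a small sketch. It assumes the direction vector and the thresholds are already known and fits ordinary least squares within each subgroup; the two-stage estimation with group selection and penalized induced smoothing described in the abstract is not reproduced, and all names are hypothetical.

```python
import numpy as np

def assign_subgroups(Z, gamma, thresholds):
    """Label each subject by where its linear index Z @ gamma falls among the thresholds.

    With K thresholds there are K + 1 subgroups, labelled 0..K; each threshold
    defines a change plane {z : z @ gamma = threshold} in covariate space.
    """
    index = np.asarray(Z, dtype=float) @ np.asarray(gamma, dtype=float)
    return np.digitize(index, np.sort(np.asarray(thresholds, dtype=float)))

def fit_by_subgroup(X, y, groups):
    """Ordinary least squares fitted separately within each subgroup."""
    coefs = {}
    for g in np.unique(groups):
        mask = groups == g
        beta, *_ = np.linalg.lstsq(X[mask], y[mask], rcond=None)
        coefs[int(g)] = beta
    return coefs

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Z = rng.normal(size=(300, 2))
    groups = assign_subgroups(Z, gamma=[1.0, 0.5], thresholds=[-0.5, 0.5])
    X = np.column_stack([np.ones(300), rng.normal(size=300)])
    slopes = np.array([0.0, 1.0, 3.0])[groups]  # subgroup-specific covariate effect
    y = 1.0 + slopes * X[:, 1] + rng.normal(scale=0.1, size=300)
    print(fit_by_subgroup(X, y, groups))
```

In the proposed model the number of thresholds, their locations, and the direction of the grouping variable are all estimated from the data; fixing them here only serves to show the partition that results.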
Subject(s)
Precision Medicine , Computer Simulation
ABSTRACT
BACKGROUND: Common and complex traits are the consequence of the interaction and regulation of multiple genes simultaneously; characterizing the interconnectivity of genes is therefore essential to unravel the underlying biological networks. However, many studies focus on the differential expression of individual genes or on co-expression analysis. METHODS: Going beyond analysis of one gene at a time, we systematically integrated transcriptomics, genotypes and Hi-C data to identify interconnectivities among individual genes as a causal network. We used different machine learning techniques to extract information from the network and identify differential regulatory patterns between cases and controls. We used data from the Allen Brain Atlas for replication. RESULTS: Applying the integrative systems approach to data from the CommonMind Consortium showed that gene transcription is controlled by genetic variants proximal to the gene (cis-regulatory factors) and by transcribed distal genes (trans-regulatory factors). We identified differential gene regulatory patterns in schizophrenia (SCZ) cases versus controls, as well as novel SCZ-associated genes that may play roles in the disorder, since some of them are primarily expressed in the human brain. In addition, we observed that genes known to be associated with SCZ are not likely (OR = 0.59) to have high impact (degree > 3) on the network. CONCLUSIONS: Causal networks can reveal underlying patterns and the role of genes individually and as a group. Establishing principles that govern relationships between genes provides a mechanistic understanding of the dysregulated gene transcription patterns in SCZ and enables more efficient experimental designs for further studies. This information cannot be obtained by studying a single gene at a time.