ABSTRACT
Opportunities to decrease the toxicity and cost of approved treatment regimens with lower dose, less frequent, or shorter duration alternative regimens have been limited by the perception that alternatives must be non-inferior to approved regimens. Non-inferiority trials are large and expensive to do, because they must show statistically that the alternative and approved therapies differ in a single outcome, by a margin far smaller than that required to demonstrate superiority. Non-inferiority's flaws are manifest: it ignores variability expected to occur with repeated evaluation of the approved therapy, fails to recognise that a trial of similar design will be labelled as superiority or non-inferiority depending on whether it is done prior to or after initial registration of the approved treatment, and relegates endpoints such as toxicity and cost. For example, while a less toxic and less costly regimen of 3 months duration would typically be required to demonstrate efficacy that is non-inferior to that of a standard regimen of 6 months to displace it, the longer duration therapy has no such obligation to prove its superiority. This situation is the tyranny of the non-inferiority trial: its statistics perpetuate less cost-effective regimens, which are not patient-centred, even when less intensive therapies confer survival benefits nearly identical to those of the standard, by placing a disproportionately large burden of proof on the alternative. This approach is illogical. We propose that the designation of trials as superiority or non-inferiority be abandoned, and that randomised, controlled trials should henceforth be described simply as "comparative".
Subject(s)
Equivalence Trials as Topic , Humans , Research Design , Cost-Benefit Analysis , Neoplasms/drug therapy , Clinical Trials as Topic
ABSTRACT
BACKGROUND: Evidence suggests that ctDNA may be a reliable biomarker to monitor metastatic colorectal cancer (CRC) evolution. Nevertheless, evidence on the potential of liquid biopsy in this setting is still of low quality, mostly consisting of retrospective studies. METHODS: COPERNIC is an international, multicenter clinical trial. The pilot study aims to confirm the predictive potential of early on-treatment ctDNA dynamics, and inform the design of a larger ctDNA-driven trial. Advanced CRC patients who are candidates for ≥3rd lines of systemic therapy undergo longitudinal blood sample collection during treatment (day 1, 15 and 29 for 2- or 4-weekly treatment regimens; day 1, 22 and 43 for 3-weekly treatment regimens) and at each imaging assessment. ctDNA analyses are carried out with the FoundationOne Liquid CDx and FoundationOneMonitor assays, and ctDNA changes during treatment are correlated with radiologic response (as assessed every 8-12 weeks by RECIST v1.1). The primary objective is to select the optimal timepoint and cut-off value for early ctDNA changes (at day 15/22) to predict progressive disease as best radiological response with a high positive predictive value. The cut-off value for ctDNA will be defined based on nonparametric ROC-curves with bootstrapping. Based on the expected rate of progressive disease and statistical assumptions, 109 patients need to be screened to obtain 87 assessable patients. COPERNIC is sponsored by the Institut Jules Bordet, and supported by Roche and Foundation Medicine. Recruitment is open in 13 centres across Belgium and France. The study is registered with clinicaltrials.gov (NCT05487248).
ABSTRACT
We observed lack of clarity and consistency in end point definitions of large randomized clinical trials in diffuse large B-cell lymphoma. These inconsistencies are such that trials might, in fact, address different clinical questions. They complicate interpretation of results, including comparisons across studies. Problems arise from different ways to account for events occurring after randomization including absence of improvement in disease status, treatment discontinuation or the initiation of new therapy. We call for more dialogue between stakeholders to define with clarity the questions of interest and corresponding end points. We illustrate that assessing different end point rules across a range of plausible patient journeys can be a powerful tool to facilitate such a discussion and contribute to better understanding of patient-relevant end points.
What is this article about? This article talks about the lack of clarity and consistency in the definitions of outcomes used in clinical trials that investigate new treatments for diffuse large B-cell lymphoma. This is mainly due to how these different outcome definitions handle events such as absence of improvement in disease status, treatment discontinuation or initiation of new treatment. The authors discuss how these inconsistencies make it hard to interpret the results of individual clinical trials and to compare results across clinical trials.
Why is it important? Defining the above events and consequently defining outcomes affects what we can learn from the trials and can lead to different results. Some approaches may not reflect good and bad outcomes for patients appropriately. This makes it challenging for patients, physicians, health authorities and payors to understand the true benefit of treatments under investigation and which one is better.
What are the key take-aways? This article serves as a call-to-action for more dialogue among all stakeholders involved in drug development and the decision-making process related to drug evaluations. There is an urgent need for clinical trials to be designed with more clarity and consistency on what is being measured so that relevant questions for patients and prescribing physicians are addressed. Understanding patient journeys will be key to successfully understand what truly matters to patients and how to measure the benefit of new treatments. Such discussions will contribute toward more clarity and consistency in the evaluation of new treatments.
Subject(s)
Lymphoma, Large B-Cell, Diffuse , Lymphoma, Large B-Cell, Diffuse/therapy , Lymphoma, Large B-Cell, Diffuse/drug therapy , Lymphoma, Large B-Cell, Diffuse/mortality , Humans , Randomized Controlled Trials as Topic , Endpoint Determination , Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Clinical Trials as Topic , Treatment Outcome , Research Design
ABSTRACT
Unlocking the full potential of clinical trials through comprehensive CSR and IPD sharing can revolutionize cancer care, enhance safety evaluations, and reduce bias in systematic reviews. It is time for all stakeholders to embrace transparency and advance patient-centered outcomes.
ABSTRACT
Defining meaningful endpoints for research of early-stage high-risk prostate cancer is challenging, with established measures such as overall survival and metastasis-free survival facing limitations related to feasibility and adequate reflection of patient relevance. Developing endpoints must cater to diverse perspectives across scientific, clinical, regulatory, and patient viewpoints. Endpoints such as pathological complete response, no evidence of disease, and prevention of prostate-specific antigen relapse may reflect patient benefit by accounting for diagnostic and treatment burdens.
Subject(s)
Prostatic Neoplasms , Humans , Male , Prostatic Neoplasms/therapy , Prostatic Neoplasms/pathology , Endpoint Determination , Prostate-Specific Antigen/blood
ABSTRACT
BACKGROUND AND OBJECTIVE: Radiotherapy (RT) and long-term androgen deprivation therapy (ltADT; 18-36 mo) is a standard of care in the treatment of high-risk localized/locoregional prostate cancer (HRLPC). We evaluated the outcomes in patients treated with RT + ltADT to identify which patients have poorer prognosis with standard therapy. METHODS: We analyzed individual patient data from patients with HRLPC (defined by any of the following three risk factors [RFs] in the context of cN0 disease: Gleason score ≥8, cT3-4, and prostate-specific antigen [PSA] >20 ng/ml; or by cN1 disease) who were treated with RT and ltADT in randomized controlled trials collated by the Intermediate Clinical Endpoints in Cancer of the Prostate group. The outcome measures of interest were metastasis-free survival (MFS), overall survival (OS), time to metastasis, and prostate cancer-specific mortality. Multivariable Cox and Fine-Gray regression estimated hazard ratios (HRs) for the three RFs and cN1 disease. KEY FINDINGS AND LIMITATIONS: A total of 3604 patients from ten trials were evaluated, with a median PSA value of 24 ng/ml. Gleason score ≥8 (MFS HR = 1.45; OS HR = 1.42), cN1 disease (MFS HR = 1.86; OS HR = 1.77), cT3-4 disease (MFS HR = 1.28; OS HR = 1.22), and PSA >20 ng/ml (MFS HR = 1.30; OS HR = 1.21) were associated with poorer outcomes. Adjusted 5-yr MFS rates were 83% and 78%, and 10-yr MFS rates were 63% and 53% for patients with one and two to three RFs, respectively; corresponding 10-yr adjusted OS rates were 67% and 60%, respectively. In cN1 patients, adjusted 5- and 10-yr MFS rates were 67% and 36%, respectively, and 10-yr OS was 47%. CONCLUSIONS AND CLINICAL IMPLICATIONS: HRLPC patients with two to three RFs (and cN0) or cN1 disease had the poorest outcomes on RT and ltADT. This will help in counseling patients treated in routine practice and in guiding adjuvant trials in HRLPC.
PATIENT SUMMARY: Radiotherapy and long-term hormone therapy are standard treatments for high-risk and locoregional prostate cancer. In this report, we defined prognostic groups within high-risk/locoregional prostate cancer and showed that outcomes to standard therapy are poorest in those with two or more "high-risk" factors or evidence of lymph node involvement. Such patients may therefore be the best candidates for intensification of treatment.
ABSTRACT
OBJECTIVES: The restricted Net Treatment Benefit (rNTB) is a clinically meaningful and tractable estimand of the overall treatment effect assessed in randomized trials when at least one survival endpoint with time restriction is used. Its interpretation does not rely on parametric assumptions such as proportional hazards; it can be estimated without bias even in the presence of independent right-censoring, and it can include a prespecified threshold of minimal clinically relevant difference. We aimed to demonstrate that the rNTB, corresponding to the NTB during a predefined time interval, is a meaningful and adaptable measure of treatment effect in clinical trials. METHODS: In this simulation study, we tested the impact on the rNTB value, estimation, and power of several factors including the presence of a delayed treatment effect, minimal clinically relevant difference threshold value, restriction time value, and the inclusion of both efficacy and toxicity in the rNTB definition. The impact of right censoring on rNTB was assessed in terms of bias. rNTB-derived statistical tests and log rank (LR) tests were compared in terms of power. RESULTS: rNTB estimates are unbiased even in case of right-censoring. rNTB may be used to estimate the benefit/risk ratio of a new treatment, for example, taking into account both survival and toxicity, and may include several prioritized outcomes. The estimated rNTB is much easier to interpret in this context compared to NTB in the presence of censoring since the latter is intrinsically dependent on the follow-up duration. Including toxicity increases the test power when the experimental treatment is less toxic. rNTB-derived test power increases when the experimental treatment is associated with longer survival and lower toxicity and might increase in the presence of a cure rate or a delayed treatment effect. Case applications on the PRODIGE, Checkmate-066, and Checkmate-067 trials are provided.
CONCLUSIONS: rNTB is an interesting alternative to describe and test the treatment's effect in a clear and understandable way in case of restriction, particularly in scenarios with nonproportional hazards or when trying to balance benefit and safety. It can be tuned to take into consideration short- or long-term survival differences and one or more prioritized outcomes.
Subject(s)
Neoplasms , Randomized Controlled Trials as Topic , Humans , Neoplasms/therapy , Neoplasms/mortality , Computer Simulation , Treatment Outcome , Medical Oncology/methods , Survival Analysis , Minimal Clinically Important Difference , Bias
ABSTRACT
BACKGROUND: Central monitoring aims at improving the quality of clinical research by pro-actively identifying risks and remediating emerging issues in the conduct of a clinical trial that may have an adverse impact on patient safety and/or the reliability of trial results. This paper, focusing on statistical data monitoring (SDM), is the second of a series that attempts to quantify the impact of central monitoring in clinical trials. MATERIAL AND METHODS: Quality improvement was assessed in studies using SDM from a single large central monitoring platform. The analysis focused on a total of 1111 sites that were identified as at-risk by the SDM tests and for which the study teams conducted a follow-up investigation. These sites were taken from 159 studies conducted by 23 different clinical development organizations (including both sponsor companies and contract research organizations). Two quality improvement metrics were assessed for each selected site, one based on a site data inconsistency score (DIS, overall -log10 P-value of the site compared with all other sites) and the other based on the observed metric value associated with each risk signal. RESULTS: The SDM quality metrics showed improvement in 83% (95% CI, 80-85%) of the sites across therapeutic areas and study phases (primarily phases 2 and 3). In contrast, only 56% (95% CI, 41-70%) of sites showed improvement in 2 historical studies that did not use SDM during study conduct. CONCLUSION: The results of this analysis provide clear quantitative evidence supporting the hypothesis that the use of SDM in central monitoring is leading to improved quality in clinical trial conduct and associated data across participating sites.
Subject(s)
Clinical Trials as Topic , Data Accuracy , Quality Improvement , Humans , Clinical Trials Data Monitoring Committees , Reproducibility of Results , Patient Safety
ABSTRACT
PURPOSE: Despite major increases in the longevity of men with metastatic hormone-sensitive prostate cancer (mHSPC), most men still die of prostate cancer. Phase III trials assessing new therapies in mHSPC with overall survival (OS) as the primary end point will take approximately a decade to complete. We investigated whether radiographic progression-free survival (rPFS) and clinical PFS (cPFS) are valid surrogates for OS in men with mHSPC and could potentially be used to expedite future phase III clinical trials. METHODS: We obtained individual patient data (IPD) from 9 eligible randomized trials comparing treatment regimens (different androgen deprivation therapy [ADT] strategies or ADT plus docetaxel in the control or research arms) in mHSPC. rPFS was defined as the time from random assignment to radiographic progression or death from any cause whichever occurred first; cPFS was defined as the time from random assignment to the date of radiographic progression, symptoms, initiation of new treatment, or death, whichever occurred first. We implemented a two-stage meta-analytic validation model where conditions of patient-level and trial-level surrogacy had to be met. We then computed the surrogate threshold effect (STE). RESULTS: IPD from 6,390 patients randomly assigned from 1994 to 2012 from 13 units were pooled for a stratified analysis. The median OS, rPFS, and cPFS were 4.3 (95% CI, 4.2 to 4.5), 2.4 (95% CI, 2.3 to 2.5), and 2.3 years (95% CI, 2.2 to 2.4), respectively. The STEs were 0.80 and 0.81 for rPFS and cPFS end points, respectively. CONCLUSION: Both rPFS and cPFS appear to be promising surrogate end points for OS. The STE of 0.80 or higher makes it viable for either rPFS or cPFS to be used as the primary end point that is surrogate for OS in phase III mHSPC trials with testosterone suppression alone as the backbone therapy and would expedite trial conduct.
Subject(s)
Prostatic Neoplasms , Male , Humans , Prostatic Neoplasms/diagnostic imaging , Prostatic Neoplasms/drug therapy , Progression-Free Survival , Androgen Antagonists , Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Hormones/therapeutic use , Disease-Free Survival
ABSTRACT
BACKGROUND/AIMS: Showing "similar efficacy" of a less intensive treatment typically requires a non-inferiority trial. Yet such trials may be challenging to design and conduct. In acute promyelocytic leukemia, great progress has been achieved with the introduction of targeted therapies, but toxicity remains a major clinical issue. There is a pressing need to show the favorable benefit/risk of less intensive treatment regimens. METHODS: We designed a clinical trial that uses generalized pairwise comparisons of five prioritized outcomes (alive and event-free at 2 years, grade 3/4 documented infections, differentiation syndrome, hepatotoxicity, and neuropathy) to confirm a favorable benefit/risk of a less intensive treatment regimen. We conducted simulations based on historical data and assumptions about the differences expected between the standard of care and the less intensive treatment regimen to calculate the sample size required to have high power to show a positive Net Treatment Benefit in favor of the less intensive treatment regimen. RESULTS: Across 10,000 simulations, average sample sizes of 260 to 300 patients are required for a trial using generalized pairwise comparisons to detect typical Net Treatment Benefits of 0.19 (interquartile range 0.14-0.23 for a sample size of 280). The Net Treatment Benefit is interpreted as the probability of doing better on the less intensive treatment regimen than on the standard of care, minus the probability of the opposite situation. A Net Treatment Benefit of 0.19 translates to a number needed to treat of about 5.3 patients (1/0.19 ≃ 5.3). CONCLUSION: Generalized pairwise comparisons allow for simultaneous assessment of efficacy and safety, with priority given to the former.
The sample size required would be of the order of 300 patients, as compared with more than 700 patients for a non-inferiority trial using a margin of 4% against the less intensive treatment regimen for the absolute difference in event-free survival at 2 years, as considered here.
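As an illustration of the Net Treatment Benefit described in this abstract, the following is a minimal Python sketch of generalized pairwise comparisons over prioritized outcomes. It is not the authors' implementation: the function names, the tuple layout for a patient, and the per-outcome thresholds are all assumptions made for the example. Each treated patient is compared with each control patient on the highest-priority outcome first, and lower-priority outcomes are used only to break ties.

```python
from typing import Sequence


def compare_pair(treated: Sequence[float], control: Sequence[float],
                 higher_is_better: Sequence[bool],
                 thresholds: Sequence[float]) -> int:
    """Return +1 (treated wins), -1 (treated loses) or 0 (tie) for one pair.

    Outcomes are listed in priority order; a later outcome is examined
    only when every earlier outcome is tied within its threshold.
    """
    for t, c, hib, thr in zip(treated, control, higher_is_better, thresholds):
        diff = (t - c) if hib else (c - t)  # positive means treated did better
        if diff > thr:
            return 1
        if diff < -thr:
            return -1
    return 0


def net_treatment_benefit(treated_arm, control_arm,
                          higher_is_better, thresholds) -> float:
    """P(treated does better) minus P(control does better) over all pairs."""
    wins = losses = 0
    for t in treated_arm:
        for c in control_arm:
            result = compare_pair(t, c, higher_is_better, thresholds)
            if result == 1:
                wins += 1
            elif result == -1:
                losses += 1
    return (wins - losses) / (len(treated_arm) * len(control_arm))


def number_needed_to_treat(ntb: float) -> float:
    """NNT as the reciprocal of the Net Treatment Benefit (as in the abstract)."""
    return 1.0 / ntb
```

With hypothetical patients coded as (event-free at 2 years, grade 3/4 infection), one would call `net_treatment_benefit(treated, control, [True, False], [0, 0])`. This sketch assumes fully observed outcomes; censored time-to-event outcomes need additional handling that is not shown here.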
Subject(s)
Probability , Humans
ABSTRACT
Trial-level surrogacy is critical before early response endpoints are used to approve new therapies.
Subject(s)
Neoadjuvant Therapy , Rectal Neoplasms , Humans , Treatment Outcome , Rectal Neoplasms/therapy
ABSTRACT
PURPOSE: To evaluate the addition of ofranergene obadenovec (ofra-vec, VB-111), a novel gene-based anticancer targeted therapy, to once a week paclitaxel in patients with recurrent platinum-resistant ovarian cancer (PROC). METHODS: This placebo-controlled, double-blind, phase III trial (ClinicalTrials.gov identifier: NCT03398655) randomly assigned patients with PROC 1:1 to receive intravenous ofra-vec every 8 weeks with once a week IV paclitaxel or placebo with paclitaxel until disease progression. The dual primary end points were overall survival (OS) and progression-free survival (PFS) as assessed by Blinded Independent Central Review. RESULTS: Between December 2017 and March 2022, 409 patients were randomly assigned. The median PFS was 5.29 months in the ofra-vec arm and 5.36 months in the control arm, hazard ratio (HR) 1.03 (CI, 0.83 to 1.29; P = .7823). The median OS with ofra-vec was 13.37 months versus 13.14 months, HR 0.97 (CI, 0.75 to 1.27; P = .8440). Objective response rates (ORRs) per RECIST 1.1 were similar in both arms: 28.9% with ofra-vec versus 29.6% with control. In both treatment arms, response to CA-125 was a substantial prognostic factor for both PFS and OS. In the ofra-vec arm, the HR in CA-125 responders compared with that in nonresponders for PFS was 0.2428 (CI, 0.1642 to 0.3588), and for OS, the HR was 0.3343 (CI, 0.2134 to 0.5238). Safety profile was characterized by common transient flu-like symptoms such as fever and chills. CONCLUSION: The addition of ofra-vec to paclitaxel did not improve PFS or OS. The PFS and ORR in the control arm exceeded the results that were anticipated on the basis of the AURELIA chemotherapy control arm. CA-125 response was a substantial prognostic biomarker for PFS and OS in patients with PROC treated with paclitaxel.
Subject(s)
Ovarian Neoplasms , Paclitaxel , Humans , Female , Ovarian Neoplasms/drug therapy , Ovarian Neoplasms/genetics , Neoplasm Recurrence, Local/drug therapy , Carcinoma, Ovarian Epithelial/drug therapy , Progression-Free Survival , Angiogenesis Inhibitors/therapeutic use , Antineoplastic Combined Chemotherapy Protocols/therapeutic use
ABSTRACT
BACKGROUND: Generalized pairwise comparisons (GPC) can be used to assess the net benefit of new treatments for rare diseases. We show the potential of GPC through simulations based on data from a natural history study in mucopolysaccharidosis type IIIA (MPS IIIA). METHODS: Using data from a historical series of untreated children with MPS IIIA aged 2 to 9 years at the time of enrolment and followed for 2 years, we performed simulations to assess the operating characteristics of GPC to detect potential (simulated) treatment effects on a multi-domain symptom assessment. Two approaches were used for GPC: one in which the various domains were prioritized, the other with all domains weighted equally. The net benefit was used as a measure of treatment effect. We used increasing thresholds of clinical relevance to reflect the magnitude of the desired treatment effects, relative to the standard deviation of the measurements in each domain. RESULTS: GPC were shown to have adequate statistical power (80% or more), even with small sample sizes, to detect treatment effects considered to be clinically worthwhile on a symptom assessment covering five domains (expressive language, daily living skills, and gross-motor, sleep and pain). The prioritized approach generally led to higher power as compared with the non-prioritized approach. CONCLUSIONS: GPC of prioritized outcomes is a statistically powerful as well as a patient-centric approach for the analysis of multi-domain scores in MPS IIIA and could be applied to other heterogeneous rare diseases.
Subject(s)
Mucopolysaccharidosis III , Child , Humans , Rare Diseases , Data Collection , Patient-Centered Care
ABSTRACT
In randomized trials, comparability of the treatment groups is ensured through allocation of treatments using a mechanism that involves some random element, thus controlling for confounding of the treatment effect. Completely random allocation ensures comparability between the treatment groups for all known and unknown prognostic factors. For a specific trial, however, imbalances in prognostic factors among the treatment groups may occur. Although accidental bias can be avoided in the presence of such imbalances by stratifying the analysis, most trialists, regulatory agencies, and other stakeholders prefer a balanced distribution of prognostic factors across the treatment groups. Some procedures attempt to achieve balance in baseline covariates, by stratifying the allocation for these covariates, or by dynamically adapting the allocation using covariate information during the trial (covariate-adaptive procedures). In this Tutorial, the performance of minimization, a popular covariate-adaptive procedure, is compared with two other commonly used procedures, completely random allocation and stratified blocked designs. Using individual patient data of 2 clinical trials (in advanced ovarian cancer and age-related macular degeneration), the procedures are compared in terms of operating characteristics (using asymptotic and randomization tests), predictability of treatment allocation, and achieved balance. Fifty actual trials of various sizes that applied minimization for treatment allocation are used to investigate the achieved balance. Implementation issues of minimization are described. Minimization procedures are useful in all trials but especially when (1) many major prognostic factors are known, (2) many centers of different sizes accrue patients, or (3) the trial sample size is moderate.
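To make the idea of minimization concrete, here is a minimal Python sketch of a covariate-adaptive allocation in the spirit of the procedure discussed in this Tutorial. It is an illustration under stated assumptions, not the Tutorial's own algorithm: the class name `Minimizer`, the range-based imbalance measure, the equal weighting of factors, and the probability `p_best` of following the imbalance-minimizing arm are all choices made for the example.

```python
import random
from collections import defaultdict


class Minimizer:
    """Hypothetical minimization allocator with a random element."""

    def __init__(self, arms=("A", "B"), p_best=0.8, seed=None):
        self.arms = tuple(arms)
        self.p_best = p_best  # probability of choosing a minimizing arm
        self.rng = random.Random(seed)
        # counts[(factor, level)][arm] = patients with that level in that arm
        self.counts = defaultdict(lambda: {a: 0 for a in self.arms})

    def imbalance_if(self, patient: dict, arm: str) -> int:
        """Total imbalance (sum of ranges) if `patient` were assigned to `arm`."""
        total = 0
        for factor, level in patient.items():
            hypothetical = dict(self.counts[(factor, level)])
            hypothetical[arm] += 1
            total += max(hypothetical.values()) - min(hypothetical.values())
        return total

    def assign(self, patient: dict) -> str:
        """Allocate one patient and update the running marginal counts."""
        scores = {a: self.imbalance_if(patient, a) for a in self.arms}
        best = min(scores.values())
        preferred = [a for a in self.arms if scores[a] == best]
        others = [a for a in self.arms if a not in preferred]
        if not others or self.rng.random() < self.p_best:
            arm = self.rng.choice(preferred)
        else:
            arm = self.rng.choice(others)
        for factor, level in patient.items():
            self.counts[(factor, level)][arm] += 1
        return arm
```

A call such as `Minimizer(seed=1).assign({"center": "X", "stage": "III"})` returns the allocated arm. Setting `p_best` below 1 keeps a random element in every allocation, which reduces predictability at some cost in balance; the value 0.8 is only a placeholder.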
Subject(s)
Research Design , Humans , Bias , Randomized Controlled Trials as Topic , Sample Size
ABSTRACT
A time-to-first-event composite endpoint analysis has well-known shortcomings in evaluating a treatment effect in cardiovascular clinical trials. It does not fully describe the clinical benefit of therapy because the severity of the events, events repeated over time, and clinically relevant nonsurvival outcomes cannot be considered. The generalized pairwise comparisons (GPC) method adds flexibility in defining the primary endpoint by including any number and type of outcomes that best capture the clinical benefit of a therapy as compared with standard of care. Clinically important outcomes, including bleeding severity, number of interventions, and quality of life, can easily be integrated in a single analysis. The treatment effect in GPC can be expressed by the net treatment benefit, the success odds, or the win ratio. This review provides guidance on the use of GPC and the choice of treatment effect measures for the analysis and reporting of cardiovascular trials.
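The three treatment effect measures named in this review can all be derived from the same tally of pairwise wins, losses, and ties. The Python sketch below uses the commonly cited definitions (net treatment benefit as the difference of win and loss proportions, win ratio as wins over losses, success odds with ties split evenly); these formulas are stated here as assumptions, since the abstract does not spell them out, and the function name `gpc_measures` is hypothetical.

```python
def gpc_measures(wins: int, losses: int, ties: int) -> dict:
    """Summarize a generalized pairwise comparison from pair counts.

    wins   : pairs in which the treated patient did better
    losses : pairs in which the control patient did better
    ties   : pairs that could not be separated on any outcome
    """
    total = wins + losses + ties
    p_win, p_loss, p_tie = wins / total, losses / total, ties / total
    return {
        # Difference in probabilities; ranges from -1 to 1, 0 means no effect.
        "net_treatment_benefit": p_win - p_loss,
        # Ignores ties entirely; 1 means no effect.
        "win_ratio": wins / losses,
        # Splits ties evenly between the arms; 1 means no effect.
        "success_odds": (p_win + 0.5 * p_tie) / (p_loss + 0.5 * p_tie),
    }
```

Because the win ratio discards ties, it can look large when most pairs are tied, whereas the net treatment benefit and the success odds shrink toward their null values in that case. Under these definitions the success odds equal (1 + NTB) / (1 - NTB).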
Subject(s)
Cardiovascular Diseases , Outcome and Process Assessment, Health Care , Humans , Quality of Life , Endpoint Determination , Cardiovascular Diseases/therapy
ABSTRACT
Immunotherapy with checkpoint inhibitors (CPIs) and cell-based products has revolutionized the treatment of various solid tumors and hematologic malignancies. These agents have shown unprecedented response rates and long-term benefits in various settings. These clinical advances have also pointed to the need for new or adapted approaches to trial design and assessment of efficacy and safety, both in the early and late phases of drug development. Some of the conventional statistical methods and endpoints used in other areas of oncology appear to be less appropriate in immuno-oncology. Conversely, other methods and endpoints have emerged as alternatives. In this article, we discuss issues related to trial design in the early and late phases of drug development in immuno-oncology, with a focus on CPIs. For early trials, we review the most salient issues related to dose escalation, use and limitations of tumor response and progression criteria for immunotherapy, the role of duration of response as an endpoint in and of itself, and the need to conduct randomized trials as early as possible in the development of new therapies. For late phases, we discuss the choice of primary endpoints for randomized trials, review the current status of surrogate endpoints, and discuss specific statistical issues related to immunotherapy, including non-proportional hazards in the assessment of time-to-event endpoints, alternatives to the Cox model in these settings, and the method of generalized pairwise comparisons, which can provide a patient-centric assessment of clinical benefit and be used to design randomized trials.
ABSTRACT
BACKGROUND: Overall survival is the optimal marker of treatment efficacy in randomized clinical trials (RCTs) but can take considerable time to mature. Progression-free survival (PFS) has served as an early surrogate of overall survival but is imperfect. Time to deterioration in quality of life (QOL) measures could be a surrogate for overall survival. METHODS: Phase 3 RCTs in solid malignancies that reported overall survival, PFS, and time to deterioration in QOL or physical function published between January 1, 2010, and June 30, 2022, were evaluated. Weighted regression analysis was used to assess the relationship between PFS, time to deterioration in QOL, and time to deterioration in physical function with overall survival. The coefficient of determination (R²) was used to quantify surrogacy. RESULTS: In total, 138 phase 3 RCTs were included. Of these, 47 trials evaluated immune checkpoint inhibitors and 91 investigated non-immune checkpoint inhibitor agents. Time to deterioration in QOL (137 RCTs) and time to deterioration in physical function (75 RCTs) performed similarly to PFS as surrogates for overall survival (R² = 0.18 vs R² = 0.19 and R² = 0.10 vs R² = 0.09, respectively). For immune checkpoint inhibitor studies, time to deterioration in physical function had a higher association with overall survival than PFS did (R² = 0.38 vs R² = 0.19), and PFS and time to deterioration in physical function did not correlate with each other (R² = 0). When time to deterioration in physical function and PFS were used together, the coefficient of determination increased (R² = 0.57). CONCLUSIONS: Time to deterioration in physical function appears to be an overall survival surrogate measure of particular importance for immune checkpoint inhibitor treatment efficacy.
The combination of time to deterioration in physical function with PFS may enable better prediction of overall survival treatment benefit in RCTs of immune checkpoint inhibitors than either PFS or time to deterioration in physical function alone.
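The coefficient of determination from a weighted regression, as used in this abstract to quantify surrogacy, can be sketched in a few lines of Python. The abstract does not state which effect scale or weights were used, so the example assumes one point per trial (for instance, a treatment effect on the candidate surrogate as `x` and on overall survival as `y`) and trial-level weights such as sample size; the function name `weighted_r2` is hypothetical.

```python
def weighted_r2(x, y, w):
    """Weighted least-squares line y = intercept + slope * x and its R².

    x, y : per-trial treatment effects (surrogate and overall survival)
    w    : non-negative per-trial weights (assumed here: trial size)
    Returns (slope, intercept, r_squared).
    """
    total_w = sum(w)
    x_mean = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    y_mean = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    sxx = sum(wi * (xi - x_mean) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_mean) * (yi - y_mean)
              for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Weighted residual and total sums of squares around the weighted mean.
    ss_res = sum(wi * (yi - (intercept + slope * xi)) ** 2
                 for wi, xi, yi in zip(w, x, y))
    ss_tot = sum(wi * (yi - y_mean) ** 2 for wi, yi in zip(w, y))
    return slope, intercept, 1.0 - ss_res / ss_tot
```

An R² near 1 means that trial-level effects on the surrogate explain most of the weighted variation in effects on overall survival, while values such as the 0.09 to 0.19 reported above indicate weak trial-level surrogacy. Combining two surrogates, as in the final result of the abstract, would need a multiple regression, which this single-predictor sketch does not cover.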