ABSTRACT
We propose a multi-metric flexible Bayesian framework to support efficient interim decision-making in multi-arm multi-stage phase II clinical trials. Multi-arm multi-stage phase II studies increase the efficiency of drug development, but early decisions regarding the futility or desirability of a given arm carry considerable risk since sample sizes are often low and follow-up periods may be short. Further, since intermediate outcomes based on biomarkers of treatment response are rarely perfect surrogates for the primary outcome and different trial stakeholders may have different levels of risk tolerance, a single hypothesis test is insufficient for comprehensively summarizing the state of the collected evidence. We present a Bayesian framework comprising multiple metrics based on point estimates, uncertainty, and evidence towards desired thresholds (a Target Product Profile) for (1) ranking of arms and (2) comparison of each arm against an internal control. Using a large public-private partnership targeting novel TB arms as a motivating example, we find via simulation study that our multi-metric framework provides sufficient confidence for decision-making with sample sizes as low as 30 patients per arm, even when intermediate outcomes have only moderate correlation with the primary outcome. Our reframing of trial design and the decision-making procedure has been well-received by research partners and is a practical approach to more efficient assessment of novel therapeutics.
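The evidence-toward-threshold metric can be illustrated with a minimal sketch (an assumed conjugate setup, not the authors' implementation): for a binary response outcome, the posterior probability that an arm's response rate exceeds a Target Product Profile threshold under a beta prior.

```python
from scipy.stats import beta

def prob_exceeds_tpp(successes, n, threshold, a=1.0, b=1.0):
    """Posterior P(response rate > threshold) under a Beta(a, b) prior,
    after observing `successes` responders out of n patients."""
    return beta.sf(threshold, a + successes, b + n - successes)

# Hypothetical arm: 18 responders out of 30 patients, TPP threshold 0.40
p = prob_exceeds_tpp(18, 30, 0.40)
```

Different stakeholders can then apply different Go cutoffs to the same posterior quantity (for example, 0.90 for a conservative decision and 0.70 for an optimistic one), which is one way a single analysis can serve multiple risk tolerances.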
Subjects
Research Design, Humans, Bayes Theorem, Sample Size, Uncertainty, Computer Simulation
ABSTRACT
Adaptive randomized clinical trials are of major interest when dealing with a time-to-event outcome in a prolonged observation window. No consensus exists either to define stopping boundaries or to combine p values or test statistics in the terminal analysis in the case of a frequentist design and sample size adaptation. In a one-sided setting, we compared three frequentist approaches using stopping boundaries relying on α-spending functions and a Bayesian monitoring setting with boundaries based on the posterior distribution of the log-hazard ratio. All designs comprised a single interim analysis with an efficacy stopping rule and the possibility of sample size adaptation at this interim step. Three frequentist approaches were defined based on the terminal analysis: combination of stagewise statistics (Wassmer) or of p values (Desseaux), or patientwise splitting (Jörgens), and we compared the results with those of the Bayesian monitoring approach (Freedman). These different approaches were evaluated in a simulation study and then illustrated on a real dataset from a randomized clinical trial conducted in elderly patients with chronic lymphocytic leukemia. All approaches controlled the type I error rate, except for the Bayesian monitoring approach, and yielded satisfactory power. The frequentist approaches appear to be best in underpowered trials. The power of all the approaches was affected by violation of the proportional hazards (PH) assumption. For adaptive designs with a survival endpoint and a one-sided alternative hypothesis, the Wassmer and Jörgens approaches after sample size adaptation should be preferred, unless violation of PH is suspected.
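As a concrete sketch of the α-spending machinery compared here (a generic illustration, not any of the specific designs above), the Lan-DeMets O'Brien-Fleming-type spending function gives the one-sided efficacy boundary at a single interim look:

```python
from scipy.stats import norm

def obf_spent_alpha(t, alpha=0.025):
    """Lan-DeMets O'Brien-Fleming-type alpha-spending function (one-sided):
    cumulative type I error spent at information fraction t."""
    q = norm.ppf(1 - alpha / 2)
    return 2.0 * (1.0 - norm.cdf(q / t ** 0.5))

def first_look_boundary(t, alpha=0.025):
    """Efficacy z-boundary at the first interim look; at the first look
    the spent alpha is simply a marginal normal tail probability."""
    return norm.ppf(1 - obf_spent_alpha(t, alpha))

b = first_look_boundary(0.5)  # interim after half the planned events
```

With alpha = 0.025 this reproduces the familiar steep early boundary (z close to 3 at the halfway look), leaving almost all of the alpha for the terminal analysis.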
Subjects
Bayes Theorem, Computer Simulation, Randomized Controlled Trials as Topic, Humans, Randomized Controlled Trials as Topic/statistics & numerical data, Sample Size, Research Design, Endpoint Determination, B-Cell Chronic Lymphocytic Leukemia/drug therapy, Statistical Models
ABSTRACT
This study presents a hybrid (Bayesian-frequentist) approach to sample size re-estimation (SSRE) for cluster randomised trials with continuous outcome data, allowing for uncertainty in the intra-cluster correlation (ICC). In the hybrid framework, pre-trial knowledge about the ICC is captured by placing a truncated normal prior on it, which is then updated at an interim analysis using the study data, and used in expected power control. On average, both the hybrid and frequentist approaches mitigate the implications of misspecifying the ICC at the trial's design stage. In addition, both frameworks lead to SSRE designs with approximate control of the type I error rate at the desired level. It is clearly demonstrated how the hybrid approach is able to reduce the high variability in the re-estimated sample size observed within the frequentist framework, depending on the informativeness of the prior. However, misspecification of a highly informative prior can cause significant power loss. In conclusion, a hybrid approach could offer advantages to cluster randomised trials using SSRE. Specifically, when there are available data or expert opinion to help guide the choice of prior for the ICC, the hybrid approach can reduce the variance of the re-estimated required sample size compared to a frequentist approach. As SSRE is unlikely to be employed when substantial amounts of such data are available (i.e., when a constructed prior is highly informative), the greatest utility of a hybrid approach to SSRE likely lies where there is only low-quality evidence available to guide the choice of prior.
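The hybrid idea can be sketched as follows (assumed details: a normal approximation to the interim ICC estimate and a grid posterior; the paper's exact expected-power calculation is not reproduced). The truncated normal prior on the ICC is updated with interim data, and the posterior mean design effect inflates the standard individually randomised sample size.

```python
import numpy as np
from scipy.stats import truncnorm, norm

def hybrid_reestimate_n(icc_hat, se_icc, prior_mean, prior_sd,
                        cluster_size, delta, sigma2=1.0,
                        alpha=0.05, power=0.8):
    """Update a truncated normal prior on the ICC with a normal
    approximation to the interim estimate, then re-estimate the
    required sample size per arm via the expected design effect."""
    rho = np.linspace(1e-4, 0.5, 2000)                    # ICC grid
    a, b = (0.0 - prior_mean) / prior_sd, (1.0 - prior_mean) / prior_sd
    prior = truncnorm.pdf(rho, a, b, loc=prior_mean, scale=prior_sd)
    like = norm.pdf(icc_hat, loc=rho, scale=se_icc)
    post = prior * like
    post /= post.sum()                                    # grid posterior
    design_effect = np.sum((1 + (cluster_size - 1) * rho) * post)
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    n_individual = 2 * sigma2 * (z / delta) ** 2          # per arm, no clustering
    return int(np.ceil(n_individual * design_effect))
```

An informative prior (small `prior_sd`) shrinks the posterior toward the pre-trial ICC value, which is exactly the mechanism that stabilises the re-estimated sample size relative to a purely frequentist plug-in.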
Subjects
Bayes Theorem, Randomized Controlled Trials as Topic, Sample Size, Randomized Controlled Trials as Topic/methods, Humans, Cluster Analysis, Statistical Models, Computer Simulation
ABSTRACT
In phase 2 clinical trials, we expect to make the right Go or No-Go decision during the interim analysis (IA), and to make this decision at the right time. The optimal time for the IA is usually determined based on a utility function. In most previous research, utility functions aim to minimize the expected sample size or total cost in confirmatory trials. However, the selected time can vary depending on different alternative hypotheses. This paper proposes a new utility function for Bayesian phase 2 exploratory clinical trials. It evaluates the predictability and robustness of the Go and No-Go decision made during the IA. Based on this function, we can make a robust time selection for the IA regardless of the treatment effect assumptions.
Subjects
Research Design, Humans, Bayes Theorem, Sample Size
ABSTRACT
Random coefficient (RC) models are commonly used in clinical trials to estimate the rate of change over time in longitudinal data. Using a surrogate endpoint for accelerated approval together with a confirmatory longitudinal endpoint to show clinical benefit is a strategy implemented across various therapeutic areas, including immunoglobulin A nephropathy. Understanding conditional power (CP) and information fraction calculations for RC models may help in the design of clinical trials as well as provide support for the confirmatory endpoint at the time of accelerated approval. This paper provides calculation methods, with practical examples, for determining CP at an interim analysis for an RC model with longitudinal data, such as estimated glomerular filtration rate (eGFR) assessments to measure the rate of change in eGFR slope.
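The paper's calculations are specific to the RC model, but the generic group-sequential form of conditional power can be sketched with the standard B-value representation (an assumed drift for, e.g., an eGFR slope difference is folded into `theta`; the numbers below are hypothetical):

```python
from scipy.stats import norm

def conditional_power(z_t, t, theta, i_max, alpha=0.025):
    """Conditional power at information fraction t given interim z-score z_t,
    assuming true standardized effect theta (so E[Z at full information] =
    theta * sqrt(i_max)). Uses the Brownian-motion (B-value) representation."""
    drift_remaining = theta * i_max ** 0.5 * (1 - t)
    num = norm.ppf(1 - alpha) - z_t * t ** 0.5 - drift_remaining
    return 1 - norm.cdf(num / (1 - t) ** 0.5)

# Promising interim: z = 2.0 at half the information, assumed effect 0.3
cp = conditional_power(z_t=2.0, t=0.5, theta=0.3, i_max=100)
```

For an RC model the information fraction itself depends on the accrued follow-up per subject, not just the number of subjects, which is why the information fraction calculation is treated as a separate problem in the paper.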
Subjects
Biomarkers, Humans
ABSTRACT
In covariate-adaptive or response-adaptive randomization, the treatment assignment and outcome can be correlated. In this situation, the re-randomization test is a straightforward and attractive method for providing valid statistical inferences. In this paper, we investigate the number of repetitions required in re-randomization tests. This is motivated by group sequential designs in clinical trials, where the nominal significance bound can be very small at an interim analysis. Accordingly, re-randomization tests require a very large number of repetitions, which may be computationally intractable. To reduce the number of repetitions, we propose an adaptive procedure and compare it with multiple approaches under predefined criteria. Monte Carlo simulations are conducted to show the performance of the different approaches at limited sample sizes. We also suggest strategies to reduce total computation time and provide practical guidance on preparation, execution, and reporting before and after data are unblinded at an interim analysis, so that the computation can be completed within a reasonable time frame.
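A bare-bones Monte Carlo re-randomization test (a generic sketch; the adaptive repetition procedure proposed in the paper is not reproduced) regenerates assignments under the trial's own randomization procedure:

```python
import numpy as np

def rerandomization_pvalue(y, assign, randomize, n_rep=2000, seed=0):
    """One-sided Monte Carlo re-randomization p-value for a
    difference-in-means statistic. `randomize(rng)` must regenerate an
    assignment vector under the trial's actual randomization procedure."""
    rng = np.random.default_rng(seed)
    obs = y[assign == 1].mean() - y[assign == 0].mean()
    hits = 0
    for _ in range(n_rep):
        a = randomize(rng)
        hits += (y[a == 1].mean() - y[a == 0].mean()) >= obs
    return (hits + 1) / (n_rep + 1)   # add-one to avoid a zero p-value

# Hypothetical data with a strong treatment effect; complete randomization
y = np.array([0., 1., 0., 1., 0., 5., 6., 5., 6., 5.])
assign = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
p = rerandomization_pvalue(y, assign, lambda rng: rng.permutation(assign))
```

The add-one estimator makes the computational problem the abstract describes visible: the p-value can never fall below 1/(n_rep + 1), so comparing against an interim significance bound of, say, 1e-4 forces n_rep into the tens of thousands or beyond.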
ABSTRACT
In randomized clinical trials that use a long-term efficacy endpoint, the follow-up time necessary to observe the endpoint may be substantial. In such trials, an attractive option is to consider an interim analysis based solely on an early outcome that could be used to expedite the evaluation of the treatment's efficacy. Garcia Barrado et al. (Pharm Stat. 2022; 21: 209-219) developed a methodology that allows introducing such an early interim analysis when both the early outcome and the long-term endpoint are normally distributed, continuous variables. We extend the methodology to any combination of early-outcome and long-term-endpoint types. As an example, we consider the case of a binary outcome and a time-to-event endpoint. We further evaluate the potential gain in operating characteristics (power, expected trial duration, and expected sample size) of a trial with such an interim analysis as a function of the properties of the early outcome as a surrogate for the long-term endpoint.
ABSTRACT
In clinical trials with time-to-event data, the evaluation of treatment efficacy can be a long and complex process, especially when considering long-term primary endpoints. Using surrogate endpoints that correlate with the primary endpoint has become common practice to accelerate decision-making. Moreover, the ethical need to minimize sample size and the practical need to optimize available resources have encouraged the scientific community to develop methodologies that leverage historical data. Relying on the general theory of group sequential design and using a Bayesian framework, the methodology described in this paper exploits a documented historical relationship between a clinical "final" endpoint and a surrogate endpoint to build an informative prior for the primary endpoint, using surrogate data from an early interim analysis of the clinical trial. The predictive probability of success of the trial is then used to define a futility-stopping rule. The methodology demonstrates substantial enhancements in trial operating characteristics when there is good agreement between current and historical data. Furthermore, incorporating a robust approach that combines the surrogate prior with a vague component mitigates the impact of minor prior-data conflicts while maintaining acceptable performance even in the presence of significant prior-data conflicts. The proposed methodology was applied to design a Phase III clinical trial in metastatic colorectal cancer, with overall survival as the primary endpoint and progression-free survival as the surrogate endpoint.
ABSTRACT
Real world healthcare data are commonly used in post-market safety monitoring studies to address potential safety issues related to newly approved medical products. Such studies typically involve repeated evaluations of accumulating safety data with respect to pre-defined hypotheses, for which the group sequential design provides a rigorous and flexible statistical framework. A major challenge in designing a group sequential safety monitoring study is the uncertainty associated with product uptake, which makes it difficult to specify the final sample size or maximum duration of the study. To deal with this challenge, we propose an information-based group sequential design which specifies a target amount of information that would produce adequate power for detecting a clinically significant effect size. At each interim analysis, the variance estimate for the treatment effect of interest is used to compute the current information time, and a pre-specified alpha spending function is used to determine the stopping boundary. The proposed design can be applied to regression models that adjust for potential confounders and/or heterogeneous treatment exposure. Simulation results demonstrate that the proposed design performs reasonably well in realistic settings.
ABSTRACT
Correctly characterising the dose-response relationship and taking the correct dose forward for further study is a critical part of the drug development process. We use optimal design theory to compare different designs and show that using longitudinal data from all available timepoints in a continuous-time dose-response model can substantially increase the efficiency of estimation of the dose-response compared to a single-timepoint model. We give theoretical results to calculate the efficiency gains for a large class of these models. For example, a linearly growing Emax dose-response in a population with a between/within-patient variance ratio ranging from 0.1 to 1, measured at six visits, can be estimated with a relative efficiency gain of between 1.43 and 2.22, or equivalently, with a 30% to 55% reduced sample size, compared to a model of the final timepoint alone. Fractional polynomials are a flexible way to incorporate data from repeated measurements, increasing precision without imposing strong constraints. Longitudinal dose-response models using two fractional polynomial terms are robust to mis-specification of the true longitudinal process while maintaining, often large, efficiency gains. These models have applications for characterising the dose-response at interim or final analyses.
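The source of the efficiency gain can be illustrated with a simplified analogue (a linear-growth random-intercept model under compound symmetry; this is an assumed stand-in, not the paper's Emax/fractional-polynomial computation): the variance of the final-timepoint-only estimator relative to the GLS estimator that uses all visits.

```python
import numpy as np

def relative_efficiency(k_visits, ratio):
    """Var(final-visit-only estimator) / Var(GLS estimator using all visits)
    for y_ij = beta * t_j + b_i + e_ij with equally spaced visits t_j in
    (0, 1] and random-intercept variance ratio = var(b) / var(e)."""
    t = np.arange(1, k_visits + 1) / k_visits           # final visit at t = 1
    v = ratio * np.ones((k_visits, k_visits)) + np.eye(k_visits)
    var_gls = 1.0 / (t @ np.linalg.solve(v, t))         # per-subject GLS variance
    var_single = ratio + 1.0                            # final visit alone
    return var_single / var_gls
```

For six visits this simplified model already yields gains of roughly 1.5 to 2 across between/within ratios of 0.1 to 1, the same order as the 1.43 to 2.22 reported for the Emax setting (an analogy, not a reproduction of the paper's numbers).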
ABSTRACT
OBJECTIVES: An interim analysis of post-marketing surveillance data to assess the safety and effectiveness of sarilumab in Japanese patients with rheumatoid arthritis refractory to previous treatment. METHODS: The interim analysis included patients who initiated sarilumab therapy between June 2018 and January 2021. The primary objective of this surveillance was safety. RESULTS: In total, 1036 patients were enrolled and registered by 12 January 2021 (interim cut-off date). Of these, 678 were included in the safety analysis [75.4% female; mean age (± standard deviation) 65.8 ± 13.0 years]. Adverse drug reactions, defined as adverse events classified as possibly or probably related to sarilumab, were reported in 170 patients (incidence: 25.1%), with white blood cell count decreased (4.4%) and neutrophil count decreased (1.6%) most frequently reported. Serious haematologic disorders (3.4%) and serious infections (including tuberculosis) (2.5%) were the most frequently reported priority surveillance items. No malignant tumour was reported. An absolute neutrophil count (ANC) below the minimum standard value did not increase the incidence of serious infections. CONCLUSIONS: Sarilumab was well tolerated, and no new safety signals were noted in this analysis. There was no difference in the frequency of serious infections between patients with an ANC below or above normal.
Subjects
Humanized Monoclonal Antibodies, Antirheumatic Agents, Rheumatoid Arthritis, Humans, Female, Middle Aged, Aged, Male, Antirheumatic Agents/adverse effects, Japan, Treatment Outcome, Rheumatoid Arthritis/drug therapy, Rheumatoid Arthritis/pathology, Postmarketing Product Surveillance
ABSTRACT
For randomized clinical trials where a single, primary, binary endpoint would require unfeasibly large sample sizes, composite endpoints (CEs) are widely chosen as the primary endpoint. Despite being commonly used, CEs entail challenges in designing trials and interpreting results. Given that the components may be of different relevance and have different effect sizes, the choice of components must be made carefully. In particular, sample size calculations for composite binary endpoints depend not only on the anticipated effect sizes and event probabilities of the composite components but also on the correlation between them. However, information on the correlation between endpoints is usually not reported in the literature, which can be an obstacle to designing sound future trials. We consider two-arm randomized controlled trials with a primary composite binary endpoint and an endpoint that consists only of the clinically more important component of the CE. We propose a trial design that allows an adaptive modification of the primary endpoint based on blinded information obtained at an interim analysis. Specifically, we consider a decision rule to select between a CE and its most relevant component as the primary endpoint. The decision rule chooses the endpoint with the lower estimated required sample size. Additionally, the sample size is reassessed using the estimated event probabilities and correlation, and the expected effect sizes of the composite components. We investigate the statistical power and significance level under the proposed design through simulations. We show that the adaptive design is equally or more powerful than designs without adaptive modification of the primary endpoint. Moreover, the targeted power is achieved even if the correlation is misspecified at the planning stage, while maintaining the type I error rate. All the computations are implemented in R and illustrated by means of a peritoneal dialysis trial.
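The decision rule can be sketched as follows (a simplified version with an assumed two-proportion sample size formula and hypothetical event probabilities; the blinded estimation step at the interim is omitted):

```python
from scipy.stats import norm

def n_two_proportions(p_c, p_t, alpha=0.05, power=0.8):
    """Approximate required n per arm for a two-sided two-proportion test."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return z ** 2 * (p_c * (1 - p_c) + p_t * (1 - p_t)) / (p_c - p_t) ** 2

def composite_prob(p1, p2, rho):
    """P(component 1 or component 2) for two correlated binary events."""
    p_both = p1 * p2 + rho * (p1 * (1 - p1) * p2 * (1 - p2)) ** 0.5
    return p1 + p2 - p_both

def select_endpoint(pc1, pt1, pc2, pt2, rho):
    """Pick 'composite' or 'component' by the smaller required n per arm
    (endpoint 1 is the clinically more important component)."""
    n_comp = n_two_proportions(pc1, pt1)
    n_ce = n_two_proportions(composite_prob(pc1, pc2, rho),
                             composite_prob(pt1, pt2, rho))
    return ('component', n_comp) if n_comp <= n_ce else ('composite', n_ce)

# Hypothetical probabilities: weak effect on the key component alone
choice, n_arm = select_endpoint(0.30, 0.25, 0.30, 0.15, rho=0.1)
```

Note how `rho` enters through the probability of observing both components: higher correlation lowers the composite event probability, which is why misspecifying it at the planning stage distorts the sample size.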
ABSTRACT
The challenges and potential benefits of incorporating biomarkers into clinical trial designs have been increasingly discussed, in particular for developing new agents for immuno-oncology or targeted cancer therapies. To more accurately identify a sensitive subpopulation of patients, in many cases a larger sample size (and consequently higher development costs and a longer study period) might be required. This article discusses a biomarker-based Bayesian (BM-Bay) randomized clinical trial design that incorporates a predictive biomarker measured on a continuous scale with pre-determined cutoff points or on a graded scale to define multiple patient subpopulations. We consider designing interim analyses with suitable decision criteria to achieve correct and efficient identification of a target patient population for developing a new treatment. The proposed decision criteria allow not only the inclusion of sensitive subpopulations but also the ruling out of insensitive ones on the basis of the efficacy evaluation of a time-to-event outcome. Extensive simulation studies are conducted to evaluate the operating characteristics of the proposed method, including the probability of correct identification of the desired subpopulation and the expected number of patients, under a wide range of clinical scenarios. For illustration purposes, we apply the proposed method to design a randomized phase II immuno-oncology clinical trial.
Subjects
Research Design, Humans, Bayes Theorem, Clinical Trials as Topic, Biomarkers, Sample Size, Computer Simulation
ABSTRACT
BACKGROUND: Trial design plays a key role in clinical trials. Traditional group sequential designs have been used in cardiovascular clinical trials for decades, as such trials can potentially be stopped early, thereby reducing the pre-planned sample size and trial resources. In contrast, trials with adaptive designs provide greater flexibility and are more efficient owing to the ability to modify the trial design according to interim analysis results. In this systematic review, we aim to explore the characteristics of adaptive and traditional group sequential trials in practice and to gain an understanding of how these trial designs are currently being reported in cardiology. METHODS: PubMed, Embase and the Cochrane Central Register of Controlled Trials database were searched from January 1980 to June 2022. Randomised controlled phase 2/3 trials with either an adaptive or a traditional group sequential design in patients with cardiovascular disease were included. Descriptive statistics were used to present the collected data. RESULTS: Of 456 articles found in the initial search, 56 were identified, including 43 (76.8%) trials with a traditional group sequential design and 13 (23.2%) with an adaptive design. Most trials were large, multicentre, led by the USA (50%) and Europe (28.6%), and were funded by companies (78.6%). For trials with a group sequential design, the frequency of interim analyses was determined mainly by the number of events (47%). 67% of the trials stopped early, of which 14 (32.6%) stopped for efficacy and 5 (11.6%) for futility. The most commonly used stopping rule to terminate trials was the O'Brien-Fleming-type alpha spending function (10 (23.3%)). For trials with adaptive designs, 54% of the trials stopped early, of which 4 (30.8%) stopped for futility and 2 (15.4%) for efficacy. Sample size re-estimation was commonly used (8 (61.5%)). In 69% of the trials, simulation, including Bayesian approaches, was used to define the statistical stopping rules.
Adaptive designs have been used increasingly over time (from 0% before 1999 to 38.6% after 2015 amongst the adaptive trials identified). 25% of the trials reported "adaptive" in the abstract or title. CONCLUSIONS: The application of adaptive designs is increasingly popular in cardiovascular clinical trials. The reporting of adaptive designs needs improvement.
Subjects
Cardiovascular Diseases, Humans, Bayes Theorem, Cardiovascular Diseases/therapy, Computer Simulation, Data Collection, Death, Phase II Clinical Trials as Topic, Randomized Controlled Trials as Topic
ABSTRACT
BACKGROUND: Adaptive clinical trials are growing in popularity as they are more flexible, efficient and ethical than traditional fixed designs. However, notwithstanding their increased use in assessing treatments for COVID-19, their use in critical care trials remains limited. A better understanding of the relative benefits of various adaptive designs may increase their use and interpretation. METHODS: Using two large critical care trials (ADRENAL, ClinicalTrials.gov number NCT01448109, updated 12-12-2017; NICE-SUGAR, ClinicalTrials.gov number NCT00220987, updated 01-29-2009), we assessed the performance of three frequentist and two Bayesian adaptive approaches. We retrospectively re-analysed the trials with one, two, four, and nine equally spaced interims. Using the original hypotheses, we conducted 10,000 simulations to derive error rates, probabilities of making an early correct or incorrect decision, expected sample size and treatment effect estimates under the null scenario (no treatment effect) and the alternative scenario (a positive treatment effect). We used a logistic regression model with 90-day mortality as the outcome and treatment arm as the covariate. The null hypothesis was tested using a two-sided significance level (α) of 0.05. RESULTS: Across all approaches, increasing the number of interims led to a decreased expected sample size. Under the null scenario, group sequential approaches provided good control of the type I error rate; however, type I error inflation was an issue for the Bayesian approaches. The Bayesian predictive probability and O'Brien-Fleming approaches showed the highest probability of correctly stopping the trials (around 95%). Under the alternative scenario, the Bayesian approaches showed the highest overall probability of correctly stopping the ADRENAL trial for efficacy (around 91%), whereas the Haybittle-Peto approach achieved the greatest power for the NICE-SUGAR trial.
Treatment effect estimates became increasingly underestimated as the number of interims increased. CONCLUSIONS: This study confirms that the right adaptive design can reach the same conclusion as a fixed design with a much-reduced sample size. The efficiency gain associated with an increased number of interims is highly relevant to late-phase critical care trials with large sample sizes and short follow-up times. Systematically exploring adaptive methods at the trial design stage will aid the choice of the most appropriate method.
Subjects
COVID-19, Humans, Bayes Theorem, Critical Care/methods, Research Design, Retrospective Studies, Sample Size, Clinical Trials as Topic
ABSTRACT
This manuscript consists of two topics. Firstly, we explore the utility of the internal pilot study (IPS) approach for re-estimating the sample size at an interim stage when a reliable estimate of the nuisance shape parameter of the Weibull distribution for modeling survival data is unavailable during the planning phase of a study. Although the IPS approach can help rescue the study power, it is noted that the adjusted sample size can be as much as twice the initially planned sample size, which may place substantial practical constraints on continuing the study. Secondly, we discuss Bayesian predictive probability for conducting interim analyses to obtain preliminary evidence of efficacy or futility of an experimental treatment warranting early termination of a clinical trial. In the context of single-arm clinical trials with time-to-event endpoints following a Weibull distribution, we present the calculation of the Bayesian predictive probability when the shape parameter of the Weibull distribution is unknown. Based on the data accumulated at the interim, we propose two approaches that rely on the posterior mode or the entire posterior distribution of the shape parameter. To account for uncertainty in the shape parameter, it is recommended to incorporate its entire posterior distribution in the calculation.
ABSTRACT
As part of the drug development process, interim analysis is frequently used to design efficient phase II clinical trials. A stochastic curtailment framework is often deployed wherein a decision to continue or curtail the trial is taken at each interim look based on the likelihood of observing a positive or negative treatment effect if the trial were to continue to its anticipated end. Thus, curtailment can take place due to evidence of early efficacy or futility. Traditionally, in the case of time-to-event endpoints, interim monitoring is conducted in a two-arm clinical trial using the log-rank test, often under the assumption of proportional hazards. However, when this assumption is violated, the log-rank test may not be appropriate, resulting in a loss of power and, subsequently, inaccurate sample sizes. In this paper, we propose stochastic curtailment methods for two-arm phase II trials with the flexibility to allow non-proportional hazards. The proposed methods are built on the concept of relative time, assuming that the survival times in the two treatment arms follow two different Weibull distributions. Three methods (conditional power, predictive power, and Bayesian predictive probability) are discussed, along with the corresponding sample size calculations. The monitoring strategy is illustrated with a real-life example.
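The relative-time construct these methods are built on has a closed form for two Weibull arms (a sketch of the concept only, not the full monitoring machinery): the time at which the experimental arm reaches the survival probability that the control arm has at time t1.

```python
def relative_time(t1, scale1, shape1, scale2, shape2):
    """Solve S2(t2) = S1(t1) for two Weibull arms with survival
    S(t) = exp(-(t / scale) ** shape); returns t2."""
    return scale2 * (t1 / scale1) ** (shape1 / shape2)

# Equal shapes (proportional hazards): t2 / t1 is a constant
r = relative_time(5.0, 10.0, 1.5, 14.0, 1.5)   # = 7.0
```

With unequal shape parameters the ratio t2/t1 varies with t1, which is exactly the non-proportional-hazards flexibility that the log-rank-based approach lacks.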
ABSTRACT
A robust Bayesian design is presented for a single-arm phase II trial with an early stopping rule to monitor a time-to-event endpoint. The assumed model is a piecewise exponential distribution with non-informative gamma priors on the hazard parameters in subintervals of a fixed follow-up interval. As an additional comparator, we also define and evaluate a version of the design based on an assumed Weibull distribution. Except for the assumed models, the piecewise exponential and Weibull model-based designs are identical to an established design that assumes an exponential event time distribution with an inverse-gamma prior on the mean event time. The three designs are compared by simulation under several log-logistic and Weibull distributions having different shape parameters, and for different monitoring schedules. The simulations show that, compared to the exponential inverse-gamma model-based design, the piecewise exponential design has substantially better performance, with much higher probabilities of correctly stopping the trial early, and shorter and less variable trial duration, when the assumed median event time is unacceptably low. Compared to the Weibull model-based design, the piecewise exponential design does a much better job of maintaining small incorrect stopping probabilities in cases where the true median survival time is desirably large.
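The conjugate structure of the piecewise exponential model makes the monitoring computation direct (a sketch with hypothetical interim data; the design's exact stopping criterion is not reproduced): each interval hazard gets an independent gamma posterior, and the posterior probability that the median event time is unacceptably low follows by Monte Carlo.

```python
import numpy as np

def median_from_hazards(h, cuts):
    """Median of a piecewise exponential distribution: hazards h on
    intervals split at `cuts`, the last interval extending to infinity."""
    bounds = [0.0] + list(cuts)
    target, acc = np.log(2.0), 0.0
    for j, hj in enumerate(h):
        width = (cuts[j] - bounds[j]) if j < len(cuts) else np.inf
        if acc + hj * width >= target:
            return bounds[j] + (target - acc) / hj
        acc += hj * width
    return np.inf

def prob_median_below(events, exposure, cuts, med_0,
                      a=0.01, b=0.01, n_draws=4000, seed=1):
    """P(median < med_0 | data): the hazard in interval j has a conjugate
    Gamma(a + d_j, rate b + E_j) posterior given d_j events and E_j
    person-time; the median is evaluated over posterior draws."""
    rng = np.random.default_rng(seed)
    below = 0
    for _ in range(n_draws):
        h = [rng.gamma(a + d, 1.0 / (b + e)) for d, e in zip(events, exposure)]
        below += median_from_hazards(h, cuts) < med_0
    return below / n_draws

# Hypothetical interim: 30 events per interval over 100 person-months each
p_stop = prob_median_below([30, 30], [100.0, 100.0], cuts=[6.0], med_0=4.0)
```

Stopping early when this probability exceeds a high cutoff mirrors the design's intent: react quickly when the data indicate an unacceptably low median, without committing to a single parametric shape across the whole follow-up window.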
Subjects
Research Design, Humans, Bayes Theorem, Computer Simulation, Probability
ABSTRACT
In the precision medicine era, (prespecified) subgroup analyses are an integral part of clinical trials. By incorporating multiple populations and hypotheses in the design and analysis plan, adaptive designs promise flexibility and efficiency in such trials. Adaptations include (unblinded) interim analyses (IAs) or blinded sample size reviews. An IA offers the possibility to select promising subgroups and reallocate sample size in further stages. Trials with these features are known as adaptive enrichment designs. Such complex designs involve many nuisance parameters, such as the prevalences of the subgroups and the variances of the outcomes in the subgroups. Additionally, a number of design options, including the timepoint of the sample size review and the timepoint of the IA, have to be selected. Here, for normally distributed endpoints, we propose a strategy combining blinded sample size recalculation and adaptive enrichment at an IA; that is, at an early timepoint nuisance parameters are re-estimated and the sample size is adjusted, while subgroup selection and enrichment are performed later. We discuss the implications of different scenarios concerning the variances as well as the timepoints of the blinded review and the IA, and investigate the design characteristics in simulations. The proposed method maintains the desired power if planning assumptions were inaccurate, and reduces the sample size and the variability of the final sample size when an enrichment is performed. Having two separate timepoints for the blinded sample size review and the IA improves the timing of the latter and increases the probability of correctly enriching a subgroup.
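The blinded sample size review step can be sketched for a normal endpoint (a simple lumped-variance version with hypothetical data; the subgroup selection and enrichment machinery is not reproduced):

```python
import numpy as np
from scipy.stats import norm

def blinded_ssr(y_interim, delta, alpha=0.05, power=0.9):
    """Blinded sample size recalculation for a two-arm normal endpoint:
    re-estimate sigma^2 from the pooled (treatment-blind) interim outcomes,
    then apply the standard per-arm formula for target difference delta."""
    s2 = np.var(y_interim, ddof=1)            # blinded 'lumped' variance
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return int(np.ceil(2 * s2 * (z / delta) ** 2))

# Hypothetical blinded interim data from 200 patients
rng = np.random.default_rng(42)
n_new = blinded_ssr(rng.normal(0.0, 1.0, 200), delta=0.5)
```

The lumped variance is biased upward by the (blinded) treatment effect; corrections subtracting delta squared over four are common and are omitted here for brevity. Because the recalculation uses no unblinded comparison, it can be scheduled earlier than the IA, which is the separation of timepoints the paper exploits.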
Subjects
Precision Medicine, Research Design, Sample Size, Probability
ABSTRACT
BACKGROUND: Stepped-wedge cluster randomized trial (SW-CRT) designs are often used when there is a desire to provide an intervention to all enrolled clusters, because of a belief that it will be effective. However, given that there should be equipoise at trial commencement, there has been discussion around whether a pre-trial decision to provide the intervention to all clusters is appropriate. In pharmaceutical drug development, a solution to a similar desire to provide more patients with an effective treatment is to use a response-adaptive (RA) design. METHODS: We introduce a way in which an RA design could be incorporated into an SW-CRT, permitting modification of the intervention allocation during the trial. The proposed framework explicitly permits a balance to be sought between power and patient benefit considerations. A simulation study evaluates the methodology. RESULTS: In one scenario, for one particular RA design, the proportion of cluster-periods spent in the intervention condition was observed to increase from 32.2% to 67.9% as the intervention effect was increased. A cost of this was a 6.2% power drop compared to a design that maximized power by fixing the proportion of time in the intervention condition at 45.0%, regardless of the intervention effect. CONCLUSIONS: An RA approach may be most applicable to settings in which the intervention has substantial individual or societal benefit considerations, potentially in combination with notable safety concerns. In such a setting, the proposed methodology may routinely provide the desired adaptability of the roll-out speed, with only a small cost to the study's power.