ABSTRACT
Binary endpoints are among the most commonly used data types, and methods for testing or designing trials with binary endpoints from two independent populations are still being developed. However, comparisons of power and minimum required sample size between different tests may not be valid if their type I errors are not controlled at the same level. In this article, we unify all related testing procedures, both frequentist and Bayesian, into a single decision framework. We derive sufficient conditions under which the type I error is attained at the boundary of the hypotheses; these conditions reduce the amount of exact calculation required and lay a foundation for computational algorithms that correctly determine the actual type I error. Efficient algorithms are then proposed to calculate the cutoff value in a deterministic decision rule, such that the actual type I error is below but closest to the intended level, and the probability value in a randomized decision rule, such that the actual type I error equals the intended level. The algorithms may also be used to calculate the sample size needed to achieve a prespecified type I error and power. The usefulness of the proposed methodology is further demonstrated in power calculations for designing superiority and noninferiority trials.
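The idea of evaluating the actual type I error exactly on the boundary of the hypotheses can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a one-sided pooled z-test of H0: p1 <= p2, assumes the maximum type I error lies on the boundary p1 = p2, and approximates that maximum on a grid. All function names are hypothetical.

```python
from math import comb, sqrt

def binom_pmf(n, p):
    """Binomial(n, p) probabilities for k = 0..n."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def pooled_z(x1, n1, x2, n2):
    """Pooled z statistic for p1 - p2 (assumed test; 0 when undefined)."""
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return 0.0 if se == 0 else (x1 / n1 - x2 / n2) / se

def rejection_region(n1, n2, cutoff):
    """All outcomes (x1, x2) for which the deterministic rule rejects H0."""
    return [(x1, x2)
            for x1 in range(n1 + 1)
            for x2 in range(n2 + 1)
            if pooled_z(x1, n1, x2, n2) > cutoff]

def type1_error_at(p, n1, n2, region):
    """Exact rejection probability when p1 = p2 = p."""
    f1, f2 = binom_pmf(n1, p), binom_pmf(n2, p)
    return sum(f1[x1] * f2[x2] for x1, x2 in region)

def exact_size(n1, n2, cutoff, grid=200):
    """Grid approximation of the maximum type I error on the boundary."""
    region = rejection_region(n1, n2, cutoff)
    return max(type1_error_at(i / grid, n1, n2, region)
               for i in range(grid + 1))
```

Comparing `exact_size(n1, n2, cutoff)` with the nominal level shows whether a given cutoff is conservative or anti-conservative for small samples; a finer grid or a proper optimizer would be needed for a rigorous bound.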
Subject(s)
Algorithms, Research Design, Humans, Bayes Theorem, Sample Size, Probability
ABSTRACT
We describe the command artbin, which offers various new facilities for the calculation of sample size for binary outcome variables that are not otherwise available in Stata. While artbin has been available since 2004, it has not been previously described in the Stata Journal. artbin has been recently updated to include new options for different statistical tests, methods and study designs, improved syntax, and better handling of noninferiority trials. In this article, we describe the updated version of artbin and detail the various formulas used within artbin in different settings.
ABSTRACT
We describe a new command, artcat, that calculates sample size or power for a randomized controlled trial or similar experiment with an ordered categorical outcome, where analysis is by the proportional-odds model. artcat implements the method of Whitehead (1993, Statistics in Medicine 12: 2257-2271). We also propose and implement a new method that 1) allows the user to specify a treatment effect that does not obey the proportional-odds assumption, 2) offers greater accuracy for large treatment effects, and 3) allows for noninferiority trials. We illustrate the command and explore the value of an ordered categorical outcome over a binary outcome in various settings. We show by simulation that the methods perform well and that the new method is more accurate than Whitehead's method.
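For orientation, Whitehead's sample size formula for the proportional-odds model can be sketched as follows. This is an illustration of the formula as commonly stated for 1:1 allocation, N = 12 (z_{1-α/2} + z_{1-β})² / (θ² (1 − Σ p̄_i³)), where θ is the log odds ratio and p̄_i is the mean probability of category i across the two arms. It is not the artcat implementation, and the function names and defaults are assumptions.

```python
from math import log
from statistics import NormalDist

def shift_by_odds_ratio(control_probs, odds_ratio):
    """Treatment-arm category probabilities under proportional odds.

    Each cumulative odds in the control arm is multiplied by odds_ratio.
    """
    cum = 0.0
    treated_cum = []
    for p in control_probs[:-1]:
        cum += p
        treated_cum.append(odds_ratio * cum / (1 - cum + odds_ratio * cum))
    treated_cum.append(1.0)
    probs, previous = [], 0.0
    for c in treated_cum:
        probs.append(c - previous)
        previous = c
    return probs

def whitehead_total_n(control_probs, odds_ratio, alpha=0.05, power=0.9):
    """Total sample size (both arms, 1:1 allocation), not rounded."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    treated = shift_by_odds_ratio(control_probs, odds_ratio)
    mean_probs = [(c + t) / 2 for c, t in zip(control_probs, treated)]
    theta = log(odds_ratio)
    return 12 * z**2 / (theta**2 * (1 - sum(p**3 for p in mean_probs)))
```

With two categories the term 1 − Σ p̄_i³ reduces to 3 p̄ (1 − p̄), so the expression collapses to the familiar log-odds-ratio formula for a binary outcome, which is a convenient sanity check.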
ABSTRACT
BACKGROUND: Adaptive designs offer added flexibility in the execution of clinical trials, including the possibilities of allocating more patients to the treatments that turned out more successful, and early stopping due to either declared success or futility. Commonly applied adaptive designs, such as group sequential methods, are based on the frequentist paradigm and on ideas from statistical significance testing. Interim checks during the trial will have the effect of inflating the Type 1 error rate, or, if this rate is controlled and kept fixed, lowering the power. RESULTS: The purpose of the paper is to demonstrate the usefulness of the Bayesian approach in the design and in the actual running of randomized clinical trials during phase II and III. This approach is based on comparing the performance of the different treatment arms in terms of the respective joint posterior probabilities evaluated sequentially from the accruing outcome data, and then taking a control action if such posterior probabilities fall below a pre-specified critical threshold value. Two types of actions are considered: treatment allocation, putting on hold at least temporarily further accrual of patients to a treatment arm, and treatment selection, removing an arm from the trial permanently. The main development in the paper is in terms of binary outcomes, but extensions for handling time-to-event data, including data from vaccine trials, are also discussed. The performance of the proposed methodology is tested in extensive simulation experiments, with numerical results and graphical illustrations documented in a Supplement to the main text. As a companion to this paper, an implementation of the methods is provided in the form of a freely available R package 'barts'. CONCLUSION: The proposed methods for trial design provide an attractive alternative to their frequentist counterparts.
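The kind of posterior-probability rule described above can be sketched for binary outcomes. This is only an illustration under stated assumptions: independent uniform Beta(1, 1) priors, the posterior probability that each arm has the highest success rate as the monitored quantity, and an arbitrary threshold. It does not reproduce the 'barts' package or its interface, and all names are hypothetical.

```python
import random

def prob_each_arm_is_best(successes, failures, draws=20000, seed=0):
    """Monte Carlo estimate of P(arm has the highest success rate | data).

    Assumes independent Beta(1, 1) priors, so each posterior is
    Beta(1 + successes, 1 + failures).
    """
    rng = random.Random(seed)
    wins = [0] * len(successes)
    for _ in range(draws):
        samples = [rng.betavariate(1 + s, 1 + f)
                   for s, f in zip(successes, failures)]
        wins[samples.index(max(samples))] += 1
    return [w / draws for w in wins]

def arms_to_pause(successes, failures, threshold=0.05, draws=20000, seed=0):
    """Indices of arms whose posterior probability falls below threshold."""
    probs = prob_each_arm_is_best(successes, failures, draws, seed)
    return [i for i, p in enumerate(probs) if p < threshold]
```

In a sequential setting the same check would be repeated at each interim look as outcome data accrue; whether a flagged arm is put on hold temporarily or dropped permanently is a separate design decision.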
Subject(s)
Clinical Trials as Topic, Research Design, Bayes Theorem, Computer Simulation, Humans, Medical Futility, Probability
ABSTRACT
BACKGROUND/AIMS: Noninferiority clinical trials are susceptible to false confirmation of noninferiority when the intention-to-treat principle is applied in the setting of incomplete trial protocol adherence. The risk increases as protocol adherence rates decrease. The objective of this study was to compare protocol adherence and hypothesis confirmation between superiority and noninferiority randomized clinical trials published in three high impact medical journals. We hypothesized that noninferiority trials have lower protocol adherence and greater hypothesis confirmation. METHODS: We conducted an observational study using published clinical trial data. We searched PubMed for active control, two-arm parallel group randomized clinical trials published in JAMA: The Journal of the American Medical Association, The New England Journal of Medicine, and The Lancet between 2007 and 2017. The primary exposure was trial type, superiority versus noninferiority, as determined by the hypothesis testing framework of the primary trial outcome. The primary outcome was trial protocol adherence rate, defined as the number of randomized subjects receiving the allocated intervention as described by the trial protocol and followed to primary outcome ascertainment (numerator), over the total number of subjects randomized (denominator). Hypothesis confirmation was defined as affirmation of noninferiority or the alternative hypothesis for noninferiority and superiority trials, respectively. RESULTS: Among 120 superiority and 120 noninferiority trials, median and interquartile protocol adherence rates were 91.5 [81.4-96.7] and 89.8 [83.6-95.2], respectively; P = 0.47. Hypothesis confirmation was observed in 107/120 (89.2%) of noninferiority and 64/120 (53.3%) of superiority trials, risk difference (95% confidence interval): 35.8 (25.3-46.3), P < 0.001. 
CONCLUSION: Protocol adherence rates are similar between superiority and noninferiority trials published in three high impact medical journals. Despite this, we observed greater hypothesis confirmation among noninferiority trials. We speculate that publication bias, lenient noninferiority margins and other sources of bias may contribute to this finding. Further study is needed to identify the reasons for this observed difference.
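The reported risk difference in hypothesis confirmation can be checked from the counts given in the RESULTS (107/120 versus 64/120). The abstract does not state how the confidence interval was computed; the sketch below assumes a simple Wald interval, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def wald_risk_difference(x1, n1, x2, n2, level=0.95):
    """Risk difference p1 - p2 with a Wald confidence interval (assumed method)."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return diff, diff - z * se, diff + z * se

# Noninferiority trials (107/120) versus superiority trials (64/120)
diff, lower, upper = wald_risk_difference(107, 120, 64, 120)
print(f"{100 * diff:.1f} ({100 * lower:.1f}-{100 * upper:.1f})")
```

Under this assumption the result agrees with the published 35.8 (25.3-46.3) to one decimal place.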
Subject(s)
Equivalence Trials as Topic, Guideline Adherence/statistics & numerical data, Periodicals as Topic, Randomized Controlled Trials as Topic/methods, Humans, Intention to Treat Analysis, Journal Impact Factor, Randomized Controlled Trials as Topic/statistics & numerical data, Research Design, Sample Size
ABSTRACT
Many studies to date have conducted a meta-analysis on a mix of effectiveness and superiority studies. This methodological flaw will lead to difficulties in interpreting the results. We addressed this issue in this article, illustrated our point with a simulated experiment, and re-analyzed a recent meta-analysis study based on the effectiveness-superiority dichotomy to provide a real-world correlate of our point of view.
Subject(s)
Meta-Analysis as Topic, Rehabilitation, Research Design, Humans
ABSTRACT
BACKGROUND: There is currently no guidance for selecting a specific difference to be detected in a superiority trial. We explored 3 factors that in our opinion should influence the difference to be detected (type of outcome, patient age group, and presence of treatment side-effects), and 3 that should not (baseline level of risk, logistical difficulties, and cost of treatment). METHODS: We conducted an experimental survey using a factorial design among 380 corresponding authors of randomized controlled trials indexed in Medline. Two hypothetical vignettes were submitted to participants: one described a trial of a new analgesic in mild trauma injuries, the other described a trial of a new chemotherapy among cancer patients. The first vignette tested the baseline level of risk, patient age-group, patient recruitment difficulties, and treatment side-effects. The second tested the baseline level of risk, patient age-group, type of outcome, and cost of treatment. The respondents were asked to select the smallest gain of effectiveness that should be detected by the trial. RESULTS: In vignette 1, respondents selected a median difference to be detected corresponding to an improvement of 7.0 % in pain control with the new treatment. In vignette 2, they selected a median difference to be detected corresponding to a reduction of 5.0 % in mortality or cancer recurrence with the new chemotherapy. In both vignettes, the difference to be detected decreased significantly with the baseline risk. The other factor influencing difference to be detected was the age group, but the impact of this factor was smaller. Cost, side-effects, outcome severity, or mention of logistical difficulties did not significantly impact the difference to be detected selected by participants. 
CONCLUSIONS: Three of the anticipated effects conformed to our expectations (the effect of patient age, and absence of effect of the cost of treatment and of patient recruitment difficulties) and the other three did not. These findings can guide future research in determining differences to be detected in trials that can translate to meaningful clinical decision-making.
Subject(s)
Treatment Outcome, Age Distribution, Clinical Decision-Making, Drug Costs, Drug-Related Side Effects and Adverse Reactions/economics, Humans, Randomized Controlled Trials as Topic/standards, Research Personnel
ABSTRACT
Clinical trials are essential for establishing the benefits and harms of various treatments. Among the various trial designs, superiority trials aim to establish the superiority of one treatment over another, while noninferiority trials demonstrate that a new treatment is not inferior to an established one while minimizing harms or patient burdens. In recent years, noninferiority trials have gained prominence. This mini-review explores noninferiority trials, focusing on challenges in their interpretation. Ultimately, we argue that the focus should be on the results from trials rather than their design, as clinicians and other stakeholders primarily seek evidence that helps patients and clinicians in trade-offs of the benefits and harms and burdens of treatment options. PATIENT SUMMARY: Our mini-review shows that looking at the overall treatment benefits and harms in noninferiority trials is better than focusing on the trial design. This approach would help patients and clinicians to better understand trial results and their implications.
Subject(s)
Equivalence Trials as Topic, Research Design, Humans
ABSTRACT
BACKGROUND: Medial knee osteoarthritis (OA) is a common health problem resulting in knee pain and limiting patients' physical activity. After failed conservative treatment, unicompartmental knee arthroplasty (UKA) and high tibial osteotomy (HTO) are possible surgical treatment options for this condition. There is a paucity of high-quality evidence in the literature comparing objective and subjective outcomes of these procedures. Also, there is no common agreement on whether these procedures provide comparable results in late-stage medial knee OA patients. METHODS: We will perform a prospective randomized controlled trial comparing HTO and UKA in patients with late-stage medial knee OA. One hundred patients aged 45-65 years with isolated medial knee OA (KL III-IV) will be assigned to either UKA (n = 50) or HTO (n = 50). Our primary outcome will be KOOS5 at one year postoperatively. Secondary outcomes include OARSI physical assessment, length of stay, activity measured with a wearable activity watch, radiographs (OA progression according to Kellgren-Lawrence classification), patient-reported outcomes (KOOS subscales, pain visual analog scale [VAS], Lysholm, and Oxford knee scores), and adverse events (conversion to total knee arthroplasty, surgery-related complications, need for revision surgery). Our hypothesis is that neither of the interventions is superior as measured with KOOS5 at 12 months. ETHICS AND DISSEMINATION: The institutional review board of the Helsinki and Uusimaa Hospital District has approved the protocol. We will disseminate the findings through peer-reviewed publications. TRIAL REGISTRATION: ClinicalTrials.gov/TooloH NCT05442242. Registered on 7/1/2022.
Subject(s)
Knee Replacement Arthroplasty, Knee Osteoarthritis, Humans, Knee Replacement Arthroplasty/adverse effects, Knee Replacement Arthroplasty/methods, Knee Osteoarthritis/diagnostic imaging, Knee Osteoarthritis/surgery, Prospective Studies, Treatment Outcome, Knee Joint/diagnostic imaging, Knee Joint/surgery, Osteotomy/adverse effects, Osteotomy/methods, Pain/etiology, Retrospective Studies, Randomized Controlled Trials as Topic
ABSTRACT
OBJECTIVE: To determine and compare the effects of an unsupervised behavioral and pelvic floor muscle training (B-PFMT) program delivered in two formats on nocturia, urinary urgency, and urinary frequency in postmenopausal women. STUDY DESIGN: A secondary analysis used data collected from women enrolled in the TULIP study. Women aged 55 years or more with no urinary incontinence were provided the B-PFMT program. Each woman was randomly assigned to a face-to-face class that took about 2 h (2-hrClass) or to a DVD showing essentially the same information as a 20-minute video (20-minVideo). All women were instructed to independently continue the program following their education session. Three urinary outcomes were assessed at baseline, 3, 12, and 24 months. MAIN OUTCOME MEASURES: Nocturia and urinary urgency were examined with one item each from the questionnaire-based voiding diary, and urinary frequency was assessed with patients' self-documenting 3-day bladder diary. RESULTS: Women in the 2-hrClass group experienced significantly fewer nocturia episodes and longer average inter-void interval at each follow-up and fewer urinary urgency episodes at 12 months. Women in the 20-minVideo group experienced significantly fewer episodes of nocturia and urinary urgency and longer average inter-void interval at each follow-up time point. No significant between-group differences were found for any outcome, except for nocturia at 24 months, when effectiveness favored women in the 20-minVideo group. CONCLUSIONS: Unsupervised B-PFMT programs are effective for improving postmenopausal women's urinary outcomes regardless of the format. The optimal format to deliver B-PFMT programs in terms of effectiveness should be explored in future studies.
Subject(s)
Exercise Therapy, Nocturia/rehabilitation, Pelvic Floor, Urinary Incontinence/rehabilitation, Aged, Female, Humans, Middle Aged, Patient Education as Topic, Postmenopause, Treatment Outcome
ABSTRACT
Appropriate sample size estimation is very important in the design of clinical trials. However, insufficient or inappropriate sample size estimation remains a prominent problem in currently published acupuncture and moxibustion clinical trials. At present, the superiority test, non-inferiority test and equivalence test are widely used in acupuncture and moxibustion clinical trials. This article focuses on the application and calculation methods of these three types of hypothesis test and on the use of PASS 11 software for them. In view of the problems in sample size estimation in acupuncture and moxibustion clinical trials, the particular features of sample size estimation in this field are summarized in terms of parameter setting, the ratio of the intervention group to the control group, and multi-group comparison, in order to guide acupuncture clinical researchers in estimating sample size correctly when conducting clinical trials.
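As a rough illustration of how the three test types differ in their sample size requirements, the sketch below uses the standard normal-approximation formulas for a continuous outcome with equal group sizes and a common standard deviation. It is not taken from the article or from PASS; the function names, defaults, and the restriction of the equivalence case to a true difference of zero are assumptions.

```python
from math import ceil
from statistics import NormalDist

def _z(p):
    """Standard normal quantile."""
    return NormalDist().inv_cdf(p)

def n_superiority(sd, diff, alpha=0.05, power=0.8):
    """Per-group n for a two-sided test of a true mean difference diff."""
    return ceil(2 * sd**2 * (_z(1 - alpha / 2) + _z(power))**2 / diff**2)

def n_noninferiority(sd, margin, diff=0.0, alpha=0.025, power=0.8):
    """Per-group n for a one-sided non-inferiority test.

    margin > 0 is the non-inferiority margin; diff is the assumed true
    difference (new minus control, higher values being better).
    """
    return ceil(2 * sd**2 * (_z(1 - alpha) + _z(power))**2
                / (diff + margin)**2)

def n_equivalence(sd, margin, alpha=0.025, power=0.8):
    """Per-group n for equivalence (two one-sided tests), true difference 0."""
    beta = 1 - power
    return ceil(2 * sd**2 * (_z(1 - alpha) + _z(1 - beta / 2))**2
                / margin**2)
```

For the same margin, standard deviation and power, the equivalence design requires more participants than the non-inferiority design, because power must be achieved against both bounds at once.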
Subject(s)
Acupuncture Therapy, Acupuncture, Moxibustion, Clinical Trials as Topic, Sample Size
ABSTRACT
The recent success of established treatments has raised concerns about the ethics of using placebo-controlled trials in psychiatry. Active-controlled (superiority or non-inferiority) trials do not include a placebo arm and thus avoid the associated ethical concerns, but have disadvantages in other respects. The aim of this paper is to review the available literature and critically discuss the evidence regarding the use of placebo-controlled versus active-controlled trials. A MEDLINE/PubMed and Google Scholar search was performed. Included studies focused on the debate over placebo-controlled versus active-controlled trials. Twenty-six studies were included. The most cited benefits of placebo-controlled trials were greater scientific reliability of the results and no average impact on patients' health. Disadvantages were mainly related to withholding effective treatment and limited generalizability. The most frequent argument in favor of active-controlled trials was the lower chance of receiving ineffective medication during the trial. Downsides included larger sample sizes, higher costs and lower scientific reliability of results. Most authors agree that all trial designs are relevant to psychiatric research depending on study goals. In any case, the data do not support forgoing placebo-controlled trials. Expert consensus is needed before conclusions can be drawn in the debate on the relevance of placebo-controlled trials.