Results 1 - 20 of 354
1.
Genet Epidemiol; 47(2): 152-166, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36571162

ABSTRACT

Two-step tests for gene-environment ($G \times E$) interactions exploit marginal single-nucleotide polymorphism (SNP) effects to improve the power of a genome-wide interaction scan. A screening step based on marginal effects is used to "bin" SNPs for weighted hypothesis testing in the second step, delivering greater power than single-step tests while preserving the genome-wide Type I error. However, the presence of many SNPs with detectable marginal effects on the trait of interest can reduce power by "displacing" true interactions with weaker marginal effects and by adding to the number of tests that must be corrected for multiple testing. We introduce a new significance-based allocation into bins for Step-2 $G \times E$ testing that overcomes the displacement issue, and we propose a computationally efficient approach to account for multiple testing within bins. Simulation results demonstrate that these simple improvements can provide substantially greater power than current methods under several scenarios. An application to a multistudy collaboration for understanding colorectal cancer reveals a G × Sex interaction located near the SMAD7 gene.
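To make the two-step idea concrete, here is a minimal sketch of weighted hypothesis testing after binning SNPs by their step-1 (marginal) p-values. It uses the common scheme of geometrically growing bins with a geometrically decaying alpha split, not the paper's new significance-based allocation; all names and sizes are illustrative.

```python
import numpy as np

def two_step_gxe(p_marginal, p_interaction, alpha=0.05, bin0=5):
    """Weighted hypothesis testing for a two-step G x E scan (sketch).

    SNPs are ranked by step-1 (marginal) p-value and placed in bins of
    geometrically increasing size (bin0, 2*bin0, ...). Bin b receives
    alpha * 2^-(b+1) of the overall alpha, split evenly (Bonferroni)
    among its SNPs, so sum_b size_b * threshold_b <= alpha and the
    genome-wide Type I error is preserved.
    """
    order = np.argsort(p_marginal)            # most promising SNPs first
    significant = np.zeros(len(order), dtype=bool)
    start, b = 0, 0
    while start < len(order):
        idx = order[start:start + bin0 * 2 ** b]   # bins double in size
        threshold = alpha * 2.0 ** -(b + 1) / len(idx)
        significant[idx] = p_interaction[idx] < threshold
        start += len(idx)
        b += 1
    return significant

# toy data: 10,000 null SNPs plus one true interaction whose strong
# marginal effect lands it in the first (most generous) bin
rng = np.random.default_rng(1)
p1, p2 = rng.uniform(size=10_000), rng.uniform(size=10_000)
p1[0], p2[0] = 1e-6, 1e-7
print(two_step_gxe(p1, p2).nonzero()[0])    # typically just [0]
```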


Subjects
Gene-Environment Interaction, Genome-Wide Association Study, Humans, Genetic Models, Phenotype, Computer Simulation, Single Nucleotide Polymorphism
2.
Stat Med; 43(26): 5023-5042, 2024 Nov 20.
Article in English | MEDLINE | ID: mdl-38573319

ABSTRACT

The two-trials rule for drug approval requires "at least two adequate and well-controlled studies, each convincing on its own, to establish effectiveness." This is usually implemented by requiring two significant pivotal trials and is the standard regulatory requirement for providing evidence of a new drug's efficacy. However, there is a need to develop suitable alternatives to this rule, notably because data may be available from more than two trials. I consider the case of up to three studies and stress the importance of controlling the partial Type-I error rate, where only some studies have a true null effect, while maintaining the overall Type-I error rate of the two-trials rule, where all studies have a null effect. Some lesser-known $P$-value combination methods are useful to achieve this: Pearson's method, Edgington's method, and the recently proposed harmonic mean $\chi^2$-test. I study their properties and discuss how they can be extended to a sequential assessment of success while still ensuring overall Type-I error control. I compare the different methods in terms of partial Type-I error rate, project power, and the expected number of studies required. Edgington's method is ultimately recommended: it is easy to implement and communicate, and it exhibits only moderate partial Type-I error rate inflation while substantially increasing project power.
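For intuition, Edgington's method combines studies by summing their p-values; the combined p-value is the Irwin-Hall CDF (the distribution of a sum of independent uniforms) evaluated at that sum. A small sketch with hypothetical p-values:

```python
from math import comb, factorial, floor

def edgington(p_values):
    """Edgington's combined p-value: P(U_1 + ... + U_n <= S) for iid
    uniform U_i, i.e. the Irwin-Hall CDF at S = sum of the p-values."""
    n, s = len(p_values), sum(p_values)
    return sum((-1) ** k * comb(n, k) * (s - k) ** n
               for k in range(floor(s) + 1)) / factorial(n)

# two individually borderline trials yield strong combined evidence:
print(edgington([0.04, 0.04]))   # 0.08**2 / 2 = 0.0032
```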


Subjects
Drug Approval, Humans, Clinical Trials as Topic/economics, Statistical Models, Research Design
3.
Stat Med; 43(9): 1688-1707, 2024 Apr 30.
Article in English | MEDLINE | ID: mdl-38373827

ABSTRACT

Binary endpoints are among the most commonly used data types, and methods for testing or designing a trial with binary endpoints from two independent populations are still being developed. However, comparisons of power and minimum required sample size between different tests may not be valid if their type I errors are not controlled at the same level. In this article, we unify all related testing procedures into a decision framework covering both frequentist and Bayesian methods. Sufficient conditions for the type I error to be attained at the boundary of the hypotheses are derived; these reduce the magnitude of the exact calculations and lay a foundation for computational algorithms that correctly specify the actual type I error. Efficient algorithms are then proposed to calculate the cutoff value in a deterministic decision rule and the probability value in a randomized decision rule, such that the actual type I error is below but closest to, or equal to, the intended level, respectively. The algorithms may also be used to calculate the sample size needed to achieve the prespecified type I error and power. The usefulness of the proposed methodology is further demonstrated in power calculations for designing superiority and noninferiority trials.
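The flavor of the exact calculation can be sketched as follows: for a deterministic decision rule (here the familiar one-sided pooled Z-test, standing in for the paper's framework), the actual type I error is the binomial probability of the rejection region, maximized over the null boundary p1 = p2 = p. Sample sizes and the grid are illustrative, not the paper's algorithm.

```python
import numpy as np
from scipy.stats import binom, norm

def exact_size(n1, n2, alpha=0.025, grid=np.linspace(0.01, 0.99, 99)):
    """Exact type I error of the one-sided pooled Z-test of
    H0: p1 <= p2, maximized over the boundary p1 = p2 = p."""
    x1 = np.arange(n1 + 1)[:, None]            # responses in arm 1
    x2 = np.arange(n2 + 1)[None, :]            # responses in arm 2
    pooled = (x1 + x2) / (n1 + n2)
    se = np.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    with np.errstate(divide="ignore", invalid="ignore"):
        z = np.where(se > 0, (x1 / n1 - x2 / n2) / se, 0.0)
    reject = z > norm.ppf(1 - alpha)           # deterministic rule
    return max((binom.pmf(x1.ravel(), n1, p)[:, None]
                * binom.pmf(x2.ravel(), n2, p)[None, :])[reject].sum()
               for p in grid)

print(exact_size(30, 30))   # may exceed 0.025, motivating cutoff adjustment
```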


Subjects
Algorithms, Research Design, Humans, Bayes Theorem, Sample Size, Probability
4.
Stat Med; 43(24): 4752-4767, 2024 Oct 30.
Article in English | MEDLINE | ID: mdl-39193779

ABSTRACT

BACKGROUND: Outcome measures that are count variables with excessive zeros are common in health behaviors research. Examples include the number of standard drinks consumed or alcohol-related problems experienced over time. There is a lack of empirical data about the relative performance of prevailing statistical models for assessing the efficacy of interventions when outcomes are zero-inflated, particularly compared with recently developed marginalized count regression approaches for such data. METHODS: The current simulation study examined five commonly used approaches for analyzing count outcomes, including two linear models (with outcomes on the raw and log-transformed scales, respectively) and three prevailing count-distribution-based models (i.e., Poisson, negative binomial, and zero-inflated Poisson (ZIP) models). We also considered the marginalized zero-inflated Poisson (MZIP) model, a novel alternative that estimates the overall effects on the population mean while adjusting for zero-inflation. Motivated by alcohol misuse prevention trials, extensive simulations were conducted to evaluate and compare the statistical power and Type I error rate of these approaches across data conditions that varied in sample size ($N = 100$ to $500$), zero rate (0.2 to 0.8), and intervention effect size. RESULTS: Under zero-inflation, the Poisson model failed to control the Type I error rate, resulting in higher than expected false positive results. When the intervention effects on the zero (vs. non-zero) and count parts were in the same direction, the MZIP model had the highest statistical power, followed by the linear model with outcomes on the raw scale, the negative binomial model, and the ZIP model. The performance of the linear model with a log-transformed outcome variable was unsatisfactory. CONCLUSIONS: The MZIP model demonstrated better statistical properties for detecting true intervention effects and controlling false positive results for zero-inflated count outcomes. It may serve as an appealing analytical approach for evaluating overall intervention effects in studies with count outcomes marked by excessive zeros.
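The Poisson model's failure under zero-inflation is easy to reproduce by simulation. The sketch below generates a null two-arm trial with zero-inflated Poisson outcomes and records how often a Poisson GLM Wald test rejects; statsmodels is assumed, the MZIP model itself is not implemented here, and all parameter values are illustrative.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)

def zip_sample(n, mean=2.0, zero_prob=0.5):
    """Zero-inflated Poisson draws: a structural zero with probability
    zero_prob, otherwise a Poisson(mean) count."""
    return np.where(rng.random(n) < zero_prob, 0, rng.poisson(mean, n))

n_sims, n, rejections = 1000, 200, 0
for _ in range(n_sims):
    y = zip_sample(n)                              # H0: no arm effect
    x = sm.add_constant(rng.integers(0, 2, n))     # random arm indicator
    fit = sm.GLM(y, x, family=sm.families.Poisson()).fit()
    rejections += fit.pvalues[1] < 0.05
print(rejections / n_sims)   # well above 0.05: overdispersion is ignored
```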


Subjects
Computer Simulation, Statistical Models, Humans, Poisson Distribution, Linear Models, Sample Size, Outcome Assessment (Health Care)/statistics & numerical data, Statistical Data Interpretation, Alcoholism, Alcohol Drinking/epidemiology, Binomial Distribution
5.
BMC Med Res Methodol; 24(1): 197, 2024 Sep 09.
Article in English | MEDLINE | ID: mdl-39251907

ABSTRACT

PURPOSE: In clinical research there is an increasing need for study designs that incorporate already available data. With historical controls, existing information can be used to support a new study design, but its inclusion also carries a risk of bias in the study results. METHODS: To combine historical and randomized controls, we investigate the Fill-it-up design, which first checks the comparability of the historical and randomized controls with an equivalence pre-test. If equivalence is confirmed, the historical control data are included in the new RCT; if not, the historical controls are discarded and the randomization of the original study is extended. We investigate the performance of this study design in terms of type I error rate and power. RESULTS: We demonstrate how many patients must be recruited in each of the two steps of the Fill-it-up design and show that the familywise error rate of the design is kept at 5%. The maximum sample size of the Fill-it-up design is larger than that of the single-stage design without historical controls and increases as the heterogeneity between the historical and concurrent controls increases. CONCLUSION: The two-stage Fill-it-up design is a frequentist method for including historical control data in various study designs. As its maximum sample size is larger, a robust prior belief is essential for its use. The design should therefore be seen as a way out in exceptional situations where a hybrid design is considered necessary.
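A sketch of the design's first step, assuming a normal endpoint: a two one-sided tests (TOST) equivalence pre-test on the control means decides whether the historical controls are pooled with the concurrent ones. The margin, level, and data are illustrative, and the "extend randomization" branch is only indicated by returning the current controls alone.

```python
import numpy as np
from scipy import stats

def fill_it_up_step1(current, historical, margin, alpha_eq=0.05):
    """Equivalence pre-test (TOST) of historical vs. concurrent control
    means; pools the samples if equivalence is shown. In the full
    design, the else-branch would extend the original randomization."""
    diff = np.mean(current) - np.mean(historical)
    se = np.sqrt(np.var(current, ddof=1) / len(current)
                 + np.var(historical, ddof=1) / len(historical))
    df = len(current) + len(historical) - 2
    p_lo = stats.t.sf((diff + margin) / se, df)    # H0: diff <= -margin
    p_hi = stats.t.cdf((diff - margin) / se, df)   # H0: diff >= +margin
    if max(p_lo, p_hi) < alpha_eq:       # both one-sided nulls rejected
        return np.concatenate([current, historical])
    return current

rng = np.random.default_rng(3)
pooled = fill_it_up_step1(rng.normal(0, 1, 50), rng.normal(0.1, 1, 100), 0.5)
print(len(pooled))   # 150 if equivalence was shown, 50 otherwise
```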


Subjects
Randomized Controlled Trials as Topic, Research Design, Humans, Randomized Controlled Trials as Topic/methods, Randomized Controlled Trials as Topic/statistics & numerical data, Sample Size, Historically Controlled Study, Control Groups
6.
BMC Med Res Methodol; 24(1): 124, 2024 Jun 03.
Article in English | MEDLINE | ID: mdl-38831421

ABSTRACT

BACKGROUND: Multi-arm multi-stage (MAMS) randomised trial designs have been proposed to evaluate multiple research questions in the confirmatory setting. In designs with several interventions, such as the 8-arm 3-stage ROSSINI-2 trial for preventing surgical wound infection, there are likely to be strict limits on the number of individuals that can be recruited or the funds available to support the protocol. These limitations may mean that not all research treatments can continue to accrue the required sample size for the definitive analysis of the primary outcome measure at the final stage. In these cases, an additional treatment selection rule can be applied at the early stages of the trial to restrict the maximum number of research arms that can progress to the subsequent stage(s). This article provides guidelines on how to implement treatment selection within the MAMS framework. It explores the impact of treatment selection rules, interim lack-of-benefit stopping boundaries and the timing of treatment selection on the operating characteristics of the MAMS selection design. METHODS: We outline the steps to design a MAMS selection trial. Extensive simulation studies are used to explore the maximum/expected sample sizes, familywise type I error rate (FWER), and overall power of the design under both binding and non-binding interim stopping boundaries for lack-of-benefit. RESULTS: Pre-specification of a treatment selection rule reduces the maximum sample size by approximately 25% in our simulations. The familywise type I error rate of a MAMS selection design is smaller than that of the standard MAMS design with similar design specifications without the additional treatment selection rule. In designs with strict selection rules - for example, when only one research arm is selected from 7 arms - the final stage significance levels can be relaxed for the primary analyses to ensure that the overall type I error for the trial is not underspent. When conducting treatment selection from several treatment arms, it is important to select a large enough subset of research arms (that is, more than one research arm) at early stages to maintain the overall power at the pre-specified level. CONCLUSIONS: Multi-arm multi-stage selection designs gain efficiency over the standard MAMS design by reducing the overall sample size. Diligent pre-specification of the treatment selection rule, final stage significance level and interim stopping boundaries for lack-of-benefit are key to controlling the operating characteristics of a MAMS selection design. We provide guidance on these design features to ensure control of the operating characteristics.
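The qualitative FWER comparison is simple to reproduce in a stripped-down two-stage setting. The sketch below treats the arm-versus-control z-statistics as independent (ignoring the correlation induced by the shared control group) and applies no lack-of-benefit boundaries; under the global null, testing only the interim-best arm gives a smaller familywise error rate than carrying all arms to the final stage.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(11)
n_arms, n_sims = 7, 50_000
crit = norm.ppf(1 - 0.025)                 # final-stage critical value

# global null: stage-1 z-scores and stage-2 increments for every arm
z1 = rng.normal(size=(n_sims, n_arms))
z2 = rng.normal(size=(n_sims, n_arms))
z_final = (z1 + z2) / np.sqrt(2)           # combined two-stage z per arm

# standard MAMS: every arm reaches the final analysis
fwer_all = (z_final > crit).any(axis=1).mean()

# selection design: only the interim-best arm is tested at the end
best = z1.argmax(axis=1)
fwer_sel = (z_final[np.arange(n_sims), best] > crit).mean()

print(fwer_all, fwer_sel)   # selection yields the smaller FWER
```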


Subjects
Randomized Controlled Trials as Topic, Research Design, Humans, Randomized Controlled Trials as Topic/methods, Sample Size, Patient Selection
7.
BMC Med Res Methodol; 24(1): 223, 2024 Sep 30.
Article in English | MEDLINE | ID: mdl-39350102

ABSTRACT

BACKGROUND: Considering multiple endpoints in clinical trials provides a more comprehensive understanding of treatment effects and may increase power or reduce the sample size, which can be beneficial in rare diseases. Besides small sample sizes, allocation bias is an issue that affects the validity of these trials. We investigate the impact of allocation bias on testing decisions in clinical trials with multiple endpoints and offer a tool for selecting an appropriate randomization procedure (RP). METHODS: We derive a model for quantifying the effect of allocation bias, depending on the RP, in two-arm parallel group trials with multiple continuous endpoints. We focus on two approaches to analyzing multiple endpoints: the Sidák procedure, to show efficacy in at least one endpoint, and the all-or-none procedure, to show efficacy in all endpoints. RESULTS: To evaluate the impact of allocation bias on the test decision, we propose a biasing policy for multiple endpoints. The impact is measured by the family-wise error rate of the Sidák procedure and the type I error rate of the all-or-none procedure, and we derive formulas to calculate these error rates under the biasing policy. In simulations we show that allocation bias inflates the mean family-wise error rate of the Sidák procedure and the mean type I error rate of the all-or-none procedure, and that the strength of this inflation is affected by the choice of the RP. CONCLUSION: Allocation bias should be considered during the design phase of a trial to increase validity. The developed methodology is useful for selecting an appropriate RP for a clinical trial with multiple endpoints so as to minimize the effects of allocation bias.
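For reference, the Sidák procedure tests each of m endpoints at a reduced level chosen so that, for independent endpoints, the familywise error rate is exactly alpha; the all-or-none procedure instead tests each endpoint at the full alpha but demands significance on all of them. A one-line sketch of the Sidák level:

```python
def sidak_level(alpha, m):
    """Per-endpoint level keeping the familywise error rate at alpha
    across m independent endpoints: 1 - (1 - alpha)^(1/m)."""
    return 1 - (1 - alpha) ** (1 / m)

# 'efficacy in at least one of 3 endpoints' at familywise 5%:
print(sidak_level(0.05, 3))   # about 0.0170 per endpoint
```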


Subjects
Bias, Humans, Endpoint Determination/methods, Endpoint Determination/statistics & numerical data, Clinical Trials as Topic/methods, Clinical Trials as Topic/statistics & numerical data, Research Design, Sample Size, Randomized Controlled Trials as Topic/methods, Randomized Controlled Trials as Topic/statistics & numerical data, Statistical Models, Computer Simulation, Algorithms
8.
Oecologia; 205(2): 257-269, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38806949

ABSTRACT

Community weighted means (CWMs) are widely used to study the relationship between community-level functional traits and environment. For certain null hypotheses, CWM-environment relationships assessed by linear regression or ANOVA and tested by standard parametric tests are prone to inflated Type I error rates. Previous research has shown that this problem can be solved by permutation tests (i.e., the max test). A recent extension of the CWM approach allows the inclusion of intraspecific trait variation (ITV) through the separate calculation of fixed, site-specific, and intraspecific CWMs. The question is whether the same Type I error rate inflation exists for the relationship between environment and site-specific or intraspecific CWMs. Using simulated and real-world community datasets, we show that site-specific CWM-environment relationships also have an inflated Type I error rate, and that this rate is negatively related to the relative magnitude of ITV. In contrast, for intraspecific CWM-environment relationships, standard parametric tests have the correct Type I error rate, although somewhat reduced statistical power. We introduce an ITV-extended version of the max test, which solves the inflation problem for site-specific CWM-environment relationships and, when ITV is not considered, becomes equivalent to the "original" max test used for the CWM approach. We show that this new ITV-extended max test works well across the full possible magnitude of ITV on both simulated and real-world data. Most real datasets probably do not have intraspecific trait variation large enough to alleviate the problem of inflated Type I error rates, so published studies may report overly optimistic significance results.
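A sketch of the "original" max test for the fixed-CWM case may help; the ITV-extended version adds analogous permutations for the site-specific and intraspecific components. The CWM-environment correlation is tested against both site-level permutations (shuffling the environment vector) and species-level permutations (shuffling the trait vector), and the larger of the two p-values is reported. All data here are simulated nulls.

```python
import numpy as np

rng = np.random.default_rng(5)

def cwm_max_test(comp, traits, env, n_perm=999):
    """Max test for a CWM-environment correlation (sketch).

    comp   : sites x species abundance matrix
    traits : one (fixed) trait value per species
    env    : one environmental value per site
    """
    weights = comp / comp.sum(axis=1, keepdims=True)
    def stat(t, e):
        return abs(np.corrcoef(weights @ t, e)[0, 1])   # |r(CWM, env)|
    obs = stat(traits, env)
    p_site = (1 + sum(stat(traits, rng.permutation(env)) >= obs
                      for _ in range(n_perm))) / (n_perm + 1)
    p_spec = (1 + sum(stat(rng.permutation(traits), env) >= obs
                      for _ in range(n_perm))) / (n_perm + 1)
    return max(p_site, p_spec)    # significant only if both tests agree

comp = rng.poisson(5, size=(30, 50))          # 30 sites, 50 species
print(cwm_max_test(comp, rng.normal(size=50), rng.normal(size=30)))
```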


Subjects
Ecosystem
9.
Clin Trials; 21(2): 171-179, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38311901

ABSTRACT

BACKGROUND: Pivotal evidence of efficacy of a new drug is typically generated by (at least) two clinical trials that independently provide statistically significant and mutually corroborating evidence based on a primary endpoint. In this situation, showing drug effects on clinically important secondary objectives can be demanding in terms of sample size requirements, and statistically efficient methods to power for such endpoints while controlling the Type I error are needed. METHODS: We review existing strategies for establishing claims on important but sample size-intense secondary endpoints. We present new strategies based on combined data from two independent, identically designed, and concurrent trials, controlling the Type I error at the submission level. We explain the methodology and provide three case studies. RESULTS: Different strategies have been used for establishing secondary claims. One new strategy, involving a protocol-planned analysis of combined data across trials and controlling the Type I error at the submission level, is particularly efficient and has already been used successfully in support of label claims. Regulatory views on this strategy differ. CONCLUSIONS: Inference on combined data across trials is a useful approach for generating pivotal evidence of efficacy for important but sample size-intense secondary endpoints. It requires careful preparation and regulatory discussion.


Subjects
Research Design, Humans, Sample Size
10.
J Biopharm Stat; 1-14, 2024 Mar 21.
Article in English | MEDLINE | ID: mdl-38515269

ABSTRACT

In recent years, clinical trials utilizing a two-stage seamless adaptive trial design have become very popular in drug development. A typical example is a phase 2/3 adaptive design consisting of two stages, where stage 1 is a phase 2 dose-finding study and stage 2 is a phase 3 efficacy confirmation study. Depending on whether the target patient population, study objectives, and study endpoints are the same at the different stages, Chow (2020) classified two-stage seamless adaptive designs into eight categories. In practice, standard statistical methods for a group sequential design with one planned interim analysis are often, incorrectly, applied directly to the data analysis. In this article, following ideas proposed by Chow and Lin (2015) and Chow (2020), a statistical method is discussed for the analysis of a two-stage seamless adaptive trial design with different study endpoints and a shifted target patient population, under the fundamental assumption that the study endpoints have a known relationship. The proposed method should be useful both in clinical trials with protocol amendments and in clinical trials with disease progression that utilize a two-stage seamless adaptive trial design.

11.
BMC Public Health; 24(1): 901, 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38539086

ABSTRACT

BACKGROUND: Count time series (e.g., daily deaths) are a very common type of data in environmental health research. Such series are generally autocorrelated, while the widely used generalized linear model assumes independent outcomes. None of the existing methods for modelling parameter-driven count time series can obtain consistent and reliable standard errors of parameter estimates, causing potential inflation of the type I error rate. METHODS: We propose a new maximum significant ρ correction (MSRC) method that utilizes information from significant autocorrelation coefficient (ρ) estimates within 5 orders, obtained by moment estimation. A Monte Carlo simulation was conducted to evaluate and compare the finite-sample performance of the MSRC and the classical unbiased correction (UB-corrected) method. We demonstrate a real-data analysis assessing the effect of drunk-driving regulations on the incidence of road traffic injuries (RTIs) using MSRC in Shenzhen, China; no previous paper has assessed a time-varying intervention effect while accounting for autocorrelation based on daily RTI data. RESULTS: Both methods had a small bias in the regression coefficients. The autocorrelation coefficient estimated by the UB-corrected method is slightly underestimated at high autocorrelation (≥ 0.6), leading to inflation of the type I error rate. The new method controlled the type I error rate well once the sample size reached 340. Moreover, the power of MSRC increased with increasing sample size and effect size and with decreasing nuisance parameters; it approached the UB-corrected method when ρ was small (≤ 0.4) and became more reliable as autocorrelation increased further. The daily RTI data exhibited significant autocorrelation after controlling for potential confounding, so the MSRC was preferable to the UB-corrected method. The intervention contributed to a decrease in the incidence of RTIs of 8.34% (95% CI, -5.69% to 20.51%), 45.07% (95% CI, 25.86% to 59.30%), and 42.94% (95% CI, 9.56% to 64.00%) at 1, 3, and 5 years after its implementation, respectively. CONCLUSIONS: The proposed MSRC method provides a reliable and consistent approach for modelling parameter-driven time series with autocorrelated count data, offering improved estimation compared with existing methods. Strict drunk-driving regulations can reduce the risk of RTIs.


Subjects
Time Factors, Humans, Linear Models, Computer Simulation, Bias, China
12.
Pharm Stat; 23(1): 4-19, 2024.
Article in English | MEDLINE | ID: mdl-37632266

ABSTRACT

Borrowing information from historical or external data to inform inference in a current trial is an expanding field in the era of precision medicine, where trials are often performed in small patient cohorts for practical or ethical reasons. Even though methods proposed for borrowing from external data are mainly based on Bayesian approaches that incorporate external information into the prior for the current analysis, the frequentist operating characteristics of the analysis strategy are often of interest, in particular the type I error rate and the power at a prespecified point alternative. We propose a procedure to investigate and report the frequentist operating characteristics in this context. The approach evaluates the type I error rate of the test with borrowing from external data and calibrates the test without borrowing to this type I error rate; on this basis, a fair comparison of power between the tests with and without borrowing is achieved. We show that no power gains are possible in one-sided one-arm and two-arm hybrid control trials with a normal endpoint, a finding previously proven in general. We prove that in one-arm fixed-borrowing situations, unconditional power (i.e., when the external data are random) is reduced. The empirical Bayes power prior approach, which dynamically borrows information according to the similarity of the current and external data, avoids the exorbitant type I error inflation that occurs with fixed borrowing. In the two-arm hybrid control trial we observe power reductions compared to the test calibrated to borrowing, and these increase when unconditional power is considered.
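In the one-arm, fixed-borrowing, normal-endpoint case the calibration argument can be reproduced in closed form. The sketch below (all parameter values illustrative) computes the type I error of a test of H0: mu <= 0 that borrows through a conjugate prior N(mu0, sigma^2/n0), then the power of a no-borrowing z-test calibrated to that same type I error.

```python
import numpy as np
from scipy.stats import norm

def borrowing_oc(n=50, n0=20, mu0=0.3, sigma=1.0, mu1=0.4, level=0.975):
    """Type I error and power of a one-arm fixed-borrowing test versus
    a no-borrowing z-test calibrated to the same type I error."""
    se_post = sigma / np.sqrt(n + n0)            # posterior sd of mu
    # reject iff posterior P(mu > 0) > level, i.e. posterior mean large:
    ybar_crit = (norm.ppf(level) * se_post * (n + n0) - n0 * mu0) / n
    se_ybar = sigma / np.sqrt(n)
    t1e = norm.sf(ybar_crit / se_ybar)                    # at mu = 0
    power_borrow = norm.sf((ybar_crit - mu1) / se_ybar)   # at mu = mu1
    power_calib = norm.sf(norm.ppf(1 - t1e) - mu1 / se_ybar)
    return t1e, power_borrow, power_calib

print(borrowing_oc())   # e.g. (0.071, 0.913, 0.913)
```

The two powers coincide exactly: with fixed borrowing the Bayesian rule is just a shifted z-test, so calibration absorbs the entire apparent benefit, illustrating the "no power gains" finding in the normal case.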


Subjects
Statistical Models, Research Design, Humans, Bayes Theorem, Computer Simulation, Clinical Trials as Topic
13.
Biom J; 66(1): e2200102, 2024 Jan.
Article in English | MEDLINE | ID: mdl-36642800

ABSTRACT

When comparing the performance of two or more competing tests, simulation studies commonly focus on statistical power. However, if the sizes of the tests being compared differ from one another or from the nominal size, comparing tests on power alone may be misleading. By analogy with diagnostic accuracy studies, we introduce relative positive and negative likelihood ratios to factor both power and size into the comparison of multiple tests, and we derive sample size formulas for a comparative simulation study. As an example, we compared the performance of six statistical tests for small-study effects in meta-analyses of randomized controlled trials: Begg's rank correlation, Egger's regression, Schwarzer's method for sparse data, the trim-and-fill method, the arcsine-Thompson test, and Lin and Chu's combined test. We illustrate that comparing power alone, or power adjusted or penalized for size, can be misleading, and how the proposed likelihood ratio approach enables accurate comparison of the trade-off between power and size among competing tests.
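The proposed ratios mirror diagnostic likelihood ratios, with power in the role of sensitivity and size in the role of one minus specificity. A toy comparison with hypothetical numbers shows how a test can gain power yet lose on the positive likelihood ratio once its inflated size is taken into account:

```python
def test_lrs(power, size):
    """Likelihood ratios for a statistical test, by analogy with
    diagnostic accuracy: LR+ = power/size, LR- = (1-power)/(1-size)."""
    return power / size, (1 - power) / (1 - size)

print(test_lrs(0.80, 0.05))   # (16.0, 0.21)  nominally sized test
print(test_lrs(0.85, 0.10))   # (8.5, 0.17)   more power, weaker LR+
```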


Subjects
Publication Bias, Computer Simulation, Sample Size
14.
Biom J; 66(1): e2200322, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38063813

ABSTRACT

Bayesian clinical trials can benefit from available historical information through the specification of informative prior distributions. Concerns are, however, often raised about the potential for prior-data conflict and the impact of Bayesian test decisions on frequentist operating characteristics, with particular attention paid to inflation of the type I error (TIE) rate. This motivates the development of principled borrowing mechanisms that strike a balance between frequentist and Bayesian decisions. Ideally, the trust assigned to historical information defines the degree of robustness to prior-data conflict one is willing to sacrifice, but such a relationship is often not directly available when inflation of the TIE rate is considered explicitly. We build on the available literature relating frequentist and Bayesian test decisions and investigate a rationale for TIE rate inflation that explicitly and linearly relates the amount of borrowing to the amount of TIE rate inflation in one-arm studies. A novel dynamic borrowing mechanism tailored to hypothesis testing is additionally proposed. We show that, while dynamic borrowing precludes a simple closed-form TIE rate computation, an explicit upper bound can still be enforced. Connections with the robust mixture prior approach, particularly in relation to the choice of the mixture weight and robust component, are made. Simulations are performed to show the properties of the approach for normal and binomial outcomes, and an exemplary application is demonstrated in a case study.


Subjects
Statistical Models, Research Design, Bayes Theorem, Computer Simulation
15.
Biom J; 66(1): e2200312, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38285403

ABSTRACT

To accelerate a randomized controlled trial, historical control data may be used after ensuring that there is little heterogeneity between the historical and current trials. The test-then-pool approach is a simple frequentist borrowing method that assesses the similarity between historical and current control data using a two-sided test. A limitation of the conventional test-then-pool method is its inability to control the type I error rate and power for the primary hypothesis separately and flexibly under heterogeneity between trials, because the two-sided test focuses on the absolute value of the mean difference between the historical and current controls. In this paper, we propose a new test-then-pool method that splits the two-sided hypothesis of the conventional method into two one-sided hypotheses. Testing each one-sided hypothesis at a different significance level allows the type I error rate and power to be controlled separately in the presence of heterogeneity between trials. We also propose a significance-level selection approach based on the maximum type I error rate and the minimum power. The proposed method prevented a loss of power even under heterogeneity between trials while controlling the type I error at a maximum tolerable rate larger than the targeted type I error rate. Applications to depression trial data and hypothetical trial data further support the usefulness of the proposed method.
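A sketch of the split rule under a normal endpoint: the conventional two-sided similarity test is replaced by two one-sided t-tests run at different levels, and the historical controls are pooled only when neither one-sided null is rejected. The asymmetric levels below are illustrative placeholders, not the paper's selected values.

```python
import numpy as np
from scipy import stats

def split_test_then_pool(current, historical, a_lower=0.10, a_upper=0.40):
    """Pool historical controls only if neither one-sided heterogeneity
    test rejects; unequal levels let the two directions of heterogeneity
    be guarded against with different stringency."""
    t, _ = stats.ttest_ind(historical, current)       # pooled-variance t
    df = len(historical) + len(current) - 2
    p_lower = stats.t.cdf(t, df)   # H0: hist mean >= current mean
    p_upper = stats.t.sf(t, df)    # H0: hist mean <= current mean
    if p_lower >= a_lower and p_upper >= a_upper:     # nothing flagged
        return np.concatenate([current, historical])
    return current

rng = np.random.default_rng(2)
out = split_test_then_pool(rng.normal(0, 1, 60), rng.normal(0.2, 1, 120))
print(len(out))   # 180 if pooled, 60 if heterogeneity was flagged
```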

16.
Biom J; 66(8): e202300232, 2024 Dec.
Article in English | MEDLINE | ID: mdl-39473139

ABSTRACT

Replication studies are increasingly conducted to assess the credibility of scientific findings. Most of these replication attempts target studies with a superiority design, but there is a lack of methodology regarding the analysis of replication studies with alternative types of designs, such as equivalence. In order to fill this gap, we propose two approaches, the two-trials rule and the sceptical two one-sided tests (TOST) procedure, adapted from methods used in superiority settings. Both methods have the same overall Type-I error rate, but the sceptical TOST procedure allows replication success even for nonsignificant original or replication studies. This leads to a larger project power and other differences in relevant operating characteristics. Both methods can be used for sample size calculation of the replication study, based on the results from the original one. The two methods are applied to data from the Reproducibility Project: Cancer Biology.


Subjects
Biometry, Biometry/methods, Humans, Reproducibility of Results, Sample Size, Equivalence Trials as Topic
17.
Behav Res Methods; 56(7): 7391-7409, 2024 Oct.
Article in English | MEDLINE | ID: mdl-38886305

ABSTRACT

Recently, Asparouhov and Muthén (2021a, 2021b; Structural Equation Modeling: A Multidisciplinary Journal, 28, 1-14) proposed a variant of the Wald test that uses Markov chain Monte Carlo (MCMC) machinery to generate a chi-square test statistic for frequentist inference. Because the test's composition does not rely on analytic expressions for sampling variation and covariation, it potentially provides a way to obtain honest significance tests in cases where the likelihood-based test statistic's assumptions break down (e.g., in small samples). The goal of this study is to use simulation to compare the new MCMC Wald test to its maximum likelihood counterparts with respect to both type I error rate and power. Our simulation examined the test statistics across different levels of sample size, effect size, and degrees of freedom (test complexity). An additional goal was to assess the robustness of the MCMC Wald test to nonnormal data. The simulation results uniformly demonstrated that the MCMC Wald test was superior to the maximum likelihood test statistic, especially with small samples (e.g., sample sizes below 150) and complex models (e.g., models with five or more predictors). This conclusion held for nonnormal data as well. Lastly, we provide a brief application to a real data example.


Subjects
Markov Chains, Monte Carlo Method, Humans, Likelihood Functions, Linear Models, Computer Simulation, Statistical Models, Statistical Data Interpretation, Sample Size
18.
Biostatistics; 23(1): 328-344, 2022 Jan 13.
Article in English | MEDLINE | ID: mdl-32735010

ABSTRACT

Bayesian clinical trials allow taking advantage of relevant external information through the elicitation of prior distributions, which influence Bayesian posterior parameter estimates and test decisions. However, incorporation of historical information can have harmful consequences on the trial's frequentist (conditional) operating characteristics in case of inconsistency between the prior information and the newly collected data. A compromise between meaningful incorporation of historical information and strict control of frequentist error rates is therefore often sought. Our aim is thus to review and investigate the rationale and consequences of different approaches to relaxing strict frequentist control of error rates from a Bayesian decision-theoretic viewpoint. In particular, we define an integrated risk which incorporates losses arising from testing, estimation, and sampling. A weighted combination of the integrated-risk addends arising from testing and estimation allows moving smoothly between these two targets. Furthermore, we explore different possible elicitations of the test error costs, leading to test decisions based either on posterior probabilities or solely on Bayes factors. Sensitivity analyses are performed following the convention that distinguishes between the prior of the data-generating process and the analysis prior adopted to fit the data. Simulations in the case of normal and binomial outcomes, and an application to a one-arm proof-of-concept trial, exemplify how such an analysis can be conducted to explore the sensitivity of the integrated risk, the operating characteristics, and the optimal sample size to prior-data conflict. Robust analysis prior specifications, which gradually discount potentially conflicting prior information, are also included for comparison. Guidance with respect to cost elicitation, particularly in the context of a Phase II proof-of-concept trial, is provided.


Subjects
Statistical Models, Research Design, Bayes Theorem, Clinical Trials as Topic, Humans, Sample Size
19.
Stat Sci; 38(2): 185-208, 2023 May.
Article in English | MEDLINE | ID: mdl-37324576

ABSTRACT

Response-Adaptive Randomization (RAR) is part of a wider class of data-dependent sampling algorithms, for which clinical trials are typically used as a motivating application. In that context, patient allocation to treatments is determined by randomization probabilities that change based on the accrued response data in order to achieve experimental goals. RAR has received abundant theoretical attention from the biostatistical literature since the 1930s and has been the subject of numerous debates. In the last decade it has received renewed consideration from the applied and methodological communities, driven by well-known practical examples and its widespread use in machine learning. Papers on the subject present different views on its usefulness that are not easy to reconcile; this work aims to address that gap by providing a unified, broad, and fresh review of the methodological and practical issues to consider when debating the use of RAR in clinical trials.
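As a concrete instance of the concept under debate (generic, not drawn from any trial in the review), the sketch below implements a simple RAR scheme: Thompson-sampling allocation for binary responses, in which each arm's randomization probability tracks its Beta posterior. Response rates and the burn-in length are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)

def rar_trial(p_true=(0.3, 0.5), n_patients=200, burn_in=20):
    """Response-adaptive randomization via Thompson sampling: after an
    equal-allocation burn-in, each patient goes to the arm whose Beta
    posterior draw of the response rate is larger."""
    succ, fail = np.zeros(2), np.zeros(2)
    for i in range(n_patients):
        if i < burn_in:
            arm = i % 2                            # equal randomization
        else:
            arm = int(np.argmax(rng.beta(1 + succ, 1 + fail)))
        response = rng.random() < p_true[arm]      # observe the outcome
        succ[arm] += response
        fail[arm] += 1 - response
    return (succ + fail).astype(int)               # patients per arm

print(rar_trial())   # allocation typically skews toward the better arm
```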

20.
Stat Sci; 38(4): 557-575, 2023 Nov 01.
Article in English | MEDLINE | ID: mdl-38223302

ABSTRACT

Modern data analysis frequently involves large-scale hypothesis testing, which naturally gives rise to the problem of maintaining control of a suitable type I error rate, such as the false discovery rate (FDR). In many biomedical and technological applications, an additional complexity is that hypotheses are tested in an online manner, one by one over time. However, traditional procedures that control the FDR, such as the Benjamini-Hochberg procedure, assume that all p-values are available to be tested at a single time point. To address these challenges, a new field of methodology has developed over the past 15 years showing how to control error rates for online multiple hypothesis testing. In this framework, hypotheses arrive in a stream, and at each time point the analyst decides whether to reject the current hypothesis based both on the evidence against it and on the previous rejection decisions. In this paper, we present a comprehensive exposition of the literature on online error rate control, with a review of key theory as well as a focus on applied examples. We also provide simulation results comparing different online testing algorithms and an up-to-date overview of the many methodological extensions that have been proposed.
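As a minimal concrete example of the online setting, the sketch below implements plain online alpha-spending: hypothesis t is tested at level alpha * gamma_t with sum(gamma_t) <= 1, which controls the FWER (and hence the FDR) at every stopping time. The more powerful procedures reviewed in the paper, such as alpha-investing and LORD, improve on this by earning back testing budget after each rejection; the gamma sequence and data here are illustrative.

```python
import numpy as np

def online_alpha_spending(p_stream, alpha=0.05):
    """Test the t-th hypothesis at level alpha * gamma_t, where
    gamma_t is proportional to 1/t^2 and sums to (at most) one."""
    norm_const = sum(1 / t ** 2 for t in range(1, 10_000))  # ~ pi^2/6
    return [p <= alpha / (t ** 2 * norm_const)
            for t, p in enumerate(p_stream, start=1)]

rng = np.random.default_rng(9)
stream = rng.uniform(size=100)      # a stream of null p-values...
stream[3] = 1e-6                    # ...with one real signal
hits = [i for i, d in enumerate(online_alpha_spending(stream)) if d]
print(hits)                         # usually just the signal at index 3
```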
