ABSTRACT
Quantifying an individual's risk for common diseases is an important goal of precision health. The polygenic risk score (PRS), which aggregates multiple risk alleles of candidate diseases, has emerged as a standard approach for identifying high-risk individuals. Although several studies have benchmarked PRS calculation tools and assessed their potential to guide future clinical applications, several issues remain to be investigated, including the lack of (i) simulated data covering a range of genetic effects; (ii) evaluation of machine learning models; and (iii) evaluation across multiple-ancestry studies. In this study, we systematically validated and compared 13 statistical methods, 5 machine learning models and 2 ensemble models using simulated data with additive and genetic interaction models, 22 common diseases with internal training sets, 4 common diseases with external summary statistics and 3 common diseases for trans-ancestry studies in UK Biobank. The statistical methods performed better on data simulated under additive models, whereas the machine learning models had an edge on data that included genetic interactions. Ensemble models, which integrate several statistical methods, were generally the best choice. LDpred2 outperformed the other standalone tools, whereas PRS-CS, lassosum and DBSLMM showed comparable performance. We also found that disease heritability strongly affected the predictive performance of all methods. Both the number and the effect sizes of risk SNPs were important, and sample size strongly influenced the performance of all methods. For the trans-ancestry studies, the performance of most methods deteriorated when the training and testing sets came from different populations.
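The benchmarked tools implement far more sophisticated models, but the quantity they all estimate is simple. Below is a minimal sketch, using invented toy data and illustrative names (dosages, betas), of an additive PRS as a weighted sum of risk-allele dosages with GWAS-style effect-size weights; it is not the code of any tool named above.

```python
# Minimal additive PRS sketch with toy data (not any benchmarked tool's code).
import numpy as np

rng = np.random.default_rng(0)
n_individuals, n_snps = 5, 8
dosages = rng.integers(0, 3, size=(n_individuals, n_snps))  # 0/1/2 risk alleles per SNP
betas = rng.normal(0, 0.1, size=n_snps)                     # per-allele effect sizes

prs = dosages @ betas   # additive PRS: weighted sum of dosages for each individual
print(prs)
```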
Subject(s)
Machine Learning , Multifactorial Inheritance , Humans , Risk Factors , Genomics , Genetic Predisposition to Disease , Genome-Wide Association Study/methods
ABSTRACT
Rigorous evidence generation with randomized controlled trials has lagged for aneurysmal subarachnoid hemorrhage (SAH) compared with other forms of acute stroke. Besides its lower incidence, SAH also differs from other stroke subtypes in patient presentation and outcome. This must be considered and adjusted for in designing pivotal randomized controlled trials of patients with SAH. Here, we show the effect of the unique expected distribution of SAH severity at presentation (World Federation of Neurological Surgeons grade) on the outcome most used in pivotal stroke randomized controlled trials (the modified Rankin Scale) and, consequently, on the sample size. Furthermore, we discuss the advantages and disadvantages of different options for analyzing the outcome and controlling for the expected distribution of World Federation of Neurological Surgeons grades, and we show their effects on the sample size. Finally, we offer methods that investigators can adapt to understand more precisely how common modified Rankin Scale analysis methods and trial eligibility criteria pertaining to the World Federation of Neurological Surgeons grade affect the design of their large-scale SAH randomized controlled trials.
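As a hedged illustration of why the expected severity mix matters for sample size (not the calculation in the article): with a dichotomized modified Rankin Scale endpoint, the per-arm sample size from the standard two-proportion formula changes with the baseline proportion of favorable outcomes, which in turn depends on the WFNS grade mix enrolled. The baseline rates and the 10-percentage-point effect below are assumptions for illustration only.

```python
# Two-proportion sample-size formula applied to illustrative favorable-outcome rates.
from scipy.stats import norm

def n_per_arm(p_control, p_treatment, alpha=0.05, power=0.80):
    z_a = norm.ppf(1 - alpha / 2)
    z_b = norm.ppf(power)
    var = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    return (z_a + z_b) ** 2 * var / (p_control - p_treatment) ** 2

# Same absolute 10-point treatment effect, different baseline rates (illustrative only)
for p0 in (0.30, 0.50, 0.70):
    print(p0, round(float(n_per_arm(p0, p0 + 0.10))))
```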
Subject(s)
Stroke , Subarachnoid Hemorrhage , Humans , Subarachnoid Hemorrhage/therapy , Subarachnoid Hemorrhage/surgery , Treatment Outcome , Neurosurgical Procedures , Neurosurgeons , Stroke/surgery
ABSTRACT
BACKGROUND: A recent review of randomization methods used in large multicenter clinical trials within the National Institutes of Health Stroke Trials Network identified preservation of treatment allocation randomness, achievement of the desired group size balance between treatment groups, achievement of baseline covariate balance, and ease of implementation in practice as critical properties required for optimal randomization designs. Common-scale minimal sufficient balance (CS-MSB) adaptive randomization effectively controls for covariate imbalance between treatment groups while preserving allocation randomness but does not balance group sizes. This study extends the CS-MSB adaptive randomization method to achieve both group size and covariate balance while preserving allocation randomness in hyperacute stroke trials. METHODS: A full factorial in silico simulation study evaluated the performance of the proposed new CSSize-MSB adaptive randomization method in achieving group size balance, covariate balance, and allocation randomness compared with the original CS-MSB method. Data from 4 existing hyperacute stroke trials were used to investigate the performance of CSSize-MSB for a range of sample sizes and covariate numbers and types. A discrete-event simulation model created with AnyLogic was used to dynamically visualize the decision logic of the CSSize-MSB randomization process for communication with clinicians. RESULTS: The proposed new CSSize-MSB algorithm uniformly outperformed the CS-MSB algorithm in controlling for group size imbalance while maintaining comparable levels of covariate balance and allocation randomness in hyperacute stroke trials. This improvement was consistent across a distribution of simulated trials with varying levels of imbalance but was increasingly pronounced for trials with extreme cases of imbalance. The results were consistent across a range of trial data sets of different sizes and covariate numbers and types. CONCLUSIONS: The proposed adaptive CSSize-MSB algorithm successfully controls for group size imbalance in hyperacute stroke trials under various settings, and its logic can be readily explained to clinicians using dynamic visualization.
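For intuition only, here is a minimal sketch of a classical group-size-balancing rule, Efron's biased coin, which tilts the allocation probability toward the under-enrolled arm while keeping every assignment random. It is not the CSSize-MSB algorithm, which additionally balances baseline covariates on a common scale.

```python
# Efron's biased coin: a simple randomness-preserving rule for group-size balance.
import random

def efron_biased_coin(n_patients, p_bias=2/3, seed=1):
    random.seed(seed)
    counts = {"A": 0, "B": 0}
    for _ in range(n_patients):
        if counts["A"] == counts["B"]:
            p_a = 0.5                 # balanced so far: fair coin
        elif counts["A"] < counts["B"]:
            p_a = p_bias              # favor the under-enrolled arm
        else:
            p_a = 1 - p_bias
        arm = "A" if random.random() < p_a else "B"
        counts[arm] += 1
    return counts

print(efron_biased_coin(101))
```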
Subject(s)
Stroke , Humans , Sample Size , Randomized Controlled Trials as Topic/methods , Computer Simulation , Random Allocation , Research Design
ABSTRACT
While the majority of stroke researchers use frequentist statistics to analyze and present their data, Bayesian statistics are becoming increasingly prevalent in stroke research. Whereas frequentist approaches are based on the probability that data equal specific values given underlying unknown parameters, Bayesian approaches are based on the probability that parameters equal specific values given the observed data and prior beliefs. The Bayesian paradigm allows researchers to update their beliefs with observed data to provide probabilistic interpretations of key parameters, for example, the probability that a treatment is effective. In this review, we outline the basic concepts of Bayesian statistics as they apply to stroke trials, compare them to the frequentist approach using exemplary data from a randomized trial, and explain how a Bayesian analysis is conducted and interpreted.
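A minimal sketch of the kind of Bayesian summary described, using made-up counts rather than the trial data referenced in the review: with Beta(1, 1) priors on each arm's success probability, posterior draws give the probability that the treatment is better than control.

```python
# Beta-Binomial posterior comparison of two arms with hypothetical counts.
import numpy as np

rng = np.random.default_rng(42)
succ_t, n_t = 60, 100   # hypothetical successes / patients, treatment arm
succ_c, n_c = 45, 100   # hypothetical successes / patients, control arm

# Beta(1, 1) priors updated with the observed data, sampled by Monte Carlo
post_t = rng.beta(1 + succ_t, 1 + n_t - succ_t, size=100_000)
post_c = rng.beta(1 + succ_c, 1 + n_c - succ_c, size=100_000)

print("P(treatment better than control | data) =", np.mean(post_t > post_c))
```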
Subject(s)
Bayes Theorem , Research Design , Stroke , Humans , Stroke/therapy , Stroke/epidemiology , Randomized Controlled Trials as Topic/methods , Data Interpretation, Statistical
ABSTRACT
Despite its widespread use, resting-state functional magnetic resonance imaging (rsfMRI) has been criticized for low test-retest reliability. To improve reliability, researchers have recommended using extended scanning durations, increased sample size, and advanced brain connectivity techniques. However, longer scanning runs and larger sample sizes may come with practical challenges and burdens, especially in rare populations. Here we tested if an advanced brain connectivity technique, dynamic causal modeling (DCM), can improve reliability of fMRI effective connectivity (EC) metrics to acceptable levels without extremely long run durations or extremely large samples. Specifically, we employed DCM for EC analysis on rsfMRI data from the Human Connectome Project. To avoid bias, we assessed four distinct DCMs and gradually increased sample sizes in a randomized manner across ten permutations. We employed pseudo true positive and pseudo false positive rates to assess the efficacy of shorter run durations (3.6, 7.2, 10.8, 14.4 min) in replicating the outcomes of the longest scanning duration (28.8 min) when the sample size was fixed at the largest (n = 160 subjects). Similarly, we assessed the efficacy of smaller sample sizes (n = 10, 20, , 150 subjects) in replicating the outcomes of the largest sample (n = 160 subjects) when the scanning duration was fixed at the longest (28.8 min). Our results revealed that the pseudo false positive rate was below 0.05 for all the analyses. After the scanning duration reached 10.8 min, which yielded a pseudo true positive rate of 92%, further extensions in run time showed no improvements in pseudo true positive rate. Expanding the sample size led to enhanced pseudo true positive rate outcomes, with a plateau at n = 70 subjects for the targeted top one-half of the largest ECs in the reference sample, regardless of whether the longest run duration (28.8 min) or the viable run duration (10.8 min) was employed. Encouragingly, smaller sample sizes exhibited pseudo true positive rates of approximately 80% for n = 20, and 90% for n = 40 subjects. These data suggest that advanced DCM analysis may be a viable option to attain reliable metrics of EC when larger sample sizes or run times are not feasible.
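The following is a sketch of how pseudo true and false positive rates can be computed, based on a reading of the abstract rather than the authors' code: connections detected in the reference analysis (longest run, largest sample) are treated as pseudo ground truth, and a shorter-run or smaller-sample analysis is scored against them.

```python
# Pseudo TPR/FPR of a test analysis relative to a reference analysis (toy example).
import numpy as np

def pseudo_rates(reference_detected, test_detected):
    reference_detected = np.asarray(reference_detected, dtype=bool)
    test_detected = np.asarray(test_detected, dtype=bool)
    tpr = test_detected[reference_detected].mean()    # reference positives recovered
    fpr = test_detected[~reference_detected].mean()   # spurious detections
    return tpr, fpr

# Toy example with 12 effective-connectivity parameters (1 = detected)
reference = [1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0]
shorter_run = [1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0]
print(pseudo_rates(reference, shorter_run))
```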
Subject(s)
Brain , Connectome , Magnetic Resonance Imaging , Humans , Magnetic Resonance Imaging/methods , Magnetic Resonance Imaging/standards , Sample Size , Connectome/methods , Connectome/standards , Reproducibility of Results , Brain/diagnostic imaging , Brain/physiology , Adult , Female , Male , Rest/physiology , Time Factors
ABSTRACT
The quality of the inferences we make from pathogen sequence data is determined by the number and composition of pathogen sequences that make up the sample used to drive that inference. However, there remains limited guidance on how to best structure and power studies when the end goal is phylogenetic inference. One question that we can attempt to answer with molecular data is whether some people are more likely to transmit a pathogen than others. Here we present an estimator to quantify differential transmission, as measured by the ratio of reproductive numbers between people with different characteristics, using transmission pairs linked by molecular data, along with a sample size calculation for this estimator. We also provide extensions to our method to correct for imperfect identification of transmission linked pairs, overdispersion in the transmission process, and group imbalance. We validate this method via simulation and provide tools to implement it in an R package, phylosamp.
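As a loose, hedged sketch of the quantity being estimated (not necessarily the estimator implemented in phylosamp): if transmission pairs identified from molecular data are labeled by the source's group, the ratio of average onward transmissions per person between groups approximates the ratio of reproductive numbers. Group sizes and pair counts below are made up.

```python
# Naive ratio-of-reproductive-numbers sketch from labeled transmission pairs.
from collections import Counter

def naive_rr_ratio(pair_source_groups, n_group_a, n_group_b):
    """pair_source_groups: list of source-group labels ('A' or 'B'), one per linked pair."""
    counts = Counter(pair_source_groups)
    r_a = counts["A"] / n_group_a   # mean onward transmissions per group-A member
    r_b = counts["B"] / n_group_b
    return r_a / r_b

# Toy data: 30 linked pairs among 100 group-A and 150 group-B individuals
pairs = ["A"] * 18 + ["B"] * 12
print(naive_rr_ratio(pairs, n_group_a=100, n_group_b=150))
```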
ABSTRACT
There is a dearth of safety data on maternal outcomes after perinatal medication exposure. Data mining for unexpected adverse event occurrence in existing datasets is a potentially useful approach. One method, the Poisson tree-based scan statistic (TBSS), assumes that the expected outcome counts, based on the incidence of outcomes in the control group, are estimated without error. This assumption may be difficult to satisfy with a small control group. Our simulation study evaluated the effect of imprecise incidence proportions from the control group on the ability of TBSS to identify maternal outcomes in pregnancy research. We simulated base case analyses with "true" expected incidence proportions and compared these to imprecise incidence proportions derived from sparse control samples. We varied parameters affecting Type I error and statistical power (exposure group size, the outcome's incidence proportion, and effect size). We found that imprecise incidence proportions generated by a small control group resulted in inaccurate alerting, inflation of Type I error, and removal of very rare outcomes from TBSS analysis due to "zero" background counts. Ideally, the control group should be at least several times larger than the exposure group to limit the number of false positive alerts and retain statistical power for true alerts.
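A hedged simulation sketch of the abstract's central point, not the TBSS implementation itself: if the incidence proportion estimated from a small control group is treated as known, a simple Poisson comparison alerts too often under the null, and a zero estimated background removes the outcome from analysis altogether. All parameters below are illustrative.

```python
# False-alert rate of a Poisson comparison when the background rate is estimated
# from control groups of different sizes (illustrative parameters only).
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(7)
p_true, n_exposed, n_sims, alpha = 0.01, 2000, 5000, 0.05

def false_alert_rate(n_control):
    alerts = 0
    for _ in range(n_sims):
        p_hat = rng.binomial(n_control, p_true) / n_control  # imprecise control estimate
        observed = rng.binomial(n_exposed, p_true)           # outcome counts under the null
        expected = n_exposed * p_hat
        # a zero background count removes the outcome from analysis entirely
        if expected > 0 and poisson.sf(observed - 1, expected) < alpha:
            alerts += 1
    return alerts / n_sims

for n_control in (200, 2000, 20000):
    print(n_control, false_alert_rate(n_control))
```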
ABSTRACT
BACKGROUND: Clinical studies are often limited by the resources available, which results in constraints on sample size. We use simulated data to illustrate the implications for a study when the sample size is too small. METHODS AND RESULTS: Using 2 theoretical populations, each with N = 1000, we randomly sample 10 from each population and conduct a statistical comparison to help reach a conclusion about whether the 2 populations are different. This exercise is repeated for a total of 4 studies: 2 concluded that the 2 populations are statistically significantly different, while 2 showed no statistically significant difference. CONCLUSIONS: Our simulated examples demonstrate that sample size plays an important role in clinical research. The results and conclusions, in terms of estimates of means, medians, Pearson correlations, chi-square tests, and P values, are unreliable with small samples.
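A hedged re-creation of the kind of exercise described (the distributions and effect size are assumptions, not the article's data): two populations of N = 1000 with a genuine difference, repeatedly compared using samples of only 10 per group, give conclusions that flip between "significantly different" and "no significant difference".

```python
# Repeated small-sample comparisons of two populations that truly differ.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(3)
pop_a = rng.normal(loc=0.0, scale=1.0, size=1000)
pop_b = rng.normal(loc=0.8, scale=1.0, size=1000)   # modest true difference

for study in range(1, 5):
    sample_a = rng.choice(pop_a, size=10, replace=False)
    sample_b = rng.choice(pop_b, size=10, replace=False)
    p = ttest_ind(sample_a, sample_b).pvalue
    print(f"study {study}: p = {p:.3f} ->", "different" if p < 0.05 else "no difference")
```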
Subject(s)
Research Design , Sample Size , Humans , Research Design/standards
ABSTRACT
Basket trials are increasingly used for the simultaneous evaluation of a new treatment in various patient subgroups under one overarching protocol. We propose a Bayesian approach to sample size determination in basket trials that permits borrowing of information between commensurate subsets. Specifically, we consider a randomized basket trial design in which patients are randomly assigned to the new treatment or control within each trial subset ("subtrial" for short). Closed-form sample size formulae are derived to ensure that each subtrial has a specified chance of correctly deciding whether the new treatment is superior to, or not better than, the control by some clinically relevant difference. Given prespecified levels of pairwise (in)commensurability, the subtrial sample sizes are solved for simultaneously. The proposed Bayesian approach resembles the frequentist formulation of the problem in yielding comparable sample sizes under no borrowing. When borrowing is enabled between commensurate subtrials, a considerably smaller trial sample size is required compared with the widely implemented approach of no borrowing. We illustrate the use of our sample size formulae with two examples based on real basket trials. A comprehensive simulation study further shows that the proposed methodology can maintain the true positive and false positive rates at desired levels.
Subject(s)
Research Design , Humans , Sample Size , Bayes Theorem , Computer Simulation
ABSTRACT
Early in the SARS-CoV-2 pandemic, in this journal, Hou et al. (BMC Med 18:216, 2020) interpreted public genotype data, run through functional prediction tools, as suggesting that members of particular human populations carry potentially COVID-risk-increasing variants in the genes ACE2 and TMPRSS2 far more often than do members of other populations. Beyond resting on predictions rather than clinical outcomes, and focusing on variants too rare to typify population members even jointly, their claim mistook a well-known artifact (that large samples reveal more of a population's variants than do small samples) for real and congruent population differences at the two genes, when it in fact reflected lopsided population sampling in their shared source data. We explain that artifact and contrast it with empirical findings, now ample, that other loci shape personal COVID risks far more significantly than do ACE2 and TMPRSS2, and that variation in ACE2 and TMPRSS2 per se is unlikely to exacerbate any net population disparity in the effects of such more risk-informative loci.
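A toy, hedged illustration of the sampling artifact being described (all parameters invented): drawing larger samples from one and the same population reveals more of its rare variants, without any real population difference.

```python
# Number of rare variants observed at least once grows with sample size alone.
import numpy as np

rng = np.random.default_rng(11)
n_sites, freq = 500, 0.005   # 500 rare variant sites, each at 0.5% allele frequency

def variants_observed(sample_size):
    # count sites with at least one alternate allele among 2*sample_size chromosomes
    alt_counts = rng.binomial(2 * sample_size, freq, size=n_sites)
    return int((alt_counts > 0).sum())

for n in (100, 500, 2500):
    print(f"sample of {n}: {variants_observed(n)} of {n_sites} rare variants seen")
```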
Subject(s)
Angiotensin-Converting Enzyme 2 , COVID-19 , SARS-CoV-2 , Serine Endopeptidases , Humans , Angiotensin-Converting Enzyme 2/genetics , Angiotensin-Converting Enzyme 2/metabolism , COVID-19/genetics , COVID-19/epidemiology , Genetic Predisposition to Disease , SARS-CoV-2/genetics , Serine Endopeptidases/genetics
ABSTRACT
Missing values are common in high-throughput mass spectrometry data. Two strategies are available to address them: (i) eliminate or impute the missing values and apply statistical methods that require complete data, or (ii) use statistical methods that specifically account for missing values without imputation (imputation-free methods). This study reviews the effect of sample size and the percentage of missing values on statistical inference for multiple methods under these two strategies. With increasing missingness, the ability of both imputation and imputation-free methods to identify differentially and non-differentially regulated compounds in a two-group comparison study declined. Random forest and k-nearest neighbor imputation combined with a Wilcoxon test performed well in statistical testing for up to 50% missingness, with little bias in estimating the effect size. Quantile regression imputation accompanied by a Wilcoxon test also had good statistical testing outcomes but substantially distorted the difference in means between groups. None of the imputation-free methods performed consistently better for statistical testing than the imputation methods.
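A hedged sketch of one of the evaluated pipelines, simplified from the abstract (the data simulation, missingness mechanism, and parameters are assumptions): k-nearest-neighbor imputation of missing intensities followed by a Wilcoxon rank-sum (Mann-Whitney) test for one compound in a two-group comparison.

```python
# kNN imputation of missing intensities, then a rank-sum test for one compound.
import numpy as np
from sklearn.impute import KNNImputer
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(5)
X = rng.lognormal(mean=10, sigma=1, size=(20, 50))   # 20 samples x 50 compounds
X[rng.random(X.shape) < 0.3] = np.nan                # ~30% of values missing
groups = np.array([0] * 10 + [1] * 10)

X_imputed = KNNImputer(n_neighbors=5).fit_transform(X)
stat, p = mannwhitneyu(X_imputed[groups == 0, 0], X_imputed[groups == 1, 0])
print(f"compound 1: Mann-Whitney p = {p:.3f}")
```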
Subject(s)
Research Design , Bias , Cluster Analysis , Mass Spectrometry/methods
ABSTRACT
Quantitative trait locus (QTL) analyses of multiomic molecular traits, such as gene transcription (eQTL), DNA methylation (mQTL) and histone modification (haQTL), have been widely used to infer the functional effects of genome variants. However, QTL discovery is largely restricted by limited study sample sizes, which demand a higher minor allele frequency threshold and therefore leave many molecular trait-variant associations missing. This is especially prominent in single-cell-level molecular QTL studies because of sample availability and cost. A method that solves this problem is needed to enhance the discoveries of current molecular QTL studies with small sample sizes. In this study, we present an efficient computational framework called xQTLImp to impute missing molecular QTL associations. In local-region imputation, xQTLImp uses a multivariate Gaussian model to impute missing associations by leveraging known association statistics of variants and the surrounding linkage disequilibrium (LD). In genome-wide imputation, novel procedures are implemented to improve efficiency, including dynamically constructing a reusable LD buffer, adopting multiple heuristic strategies, and parallel computing. Experiments on various multiomic bulk and single-cell sequencing-based QTL datasets have demonstrated the high imputation accuracy and novel QTL discovery ability of xQTLImp. Finally, a C++ software package is freely available at https://github.com/stormlovetao/QTLIMP.
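A hedged sketch of the multivariate Gaussian idea behind this kind of imputation, based on a reading of the abstract (xQTLImp's exact procedure, LD buffering, and heuristics are not reproduced): z-scores of untyped variants are imputed as their conditional expectation given observed z-scores and the local LD (correlation) matrix, with a small ridge term for numerical stability.

```python
# Conditional-mean imputation of a missing association z-score from local LD:
# z_u | z_o ~ N( S_uo S_oo^{-1} z_o ,  S_uu - S_uo S_oo^{-1} S_ou )
import numpy as np

def impute_z(ld, observed_idx, missing_idx, z_obs, ridge=0.1):
    S_oo = ld[np.ix_(observed_idx, observed_idx)] + ridge * np.eye(len(observed_idx))
    S_uo = ld[np.ix_(missing_idx, observed_idx)]
    return S_uo @ np.linalg.solve(S_oo, z_obs)

# Toy 3-variant region: variant 2 is untyped, variants 0 and 1 have observed z-scores
ld = np.array([[1.0, 0.8, 0.6],
               [0.8, 1.0, 0.7],
               [0.6, 0.7, 1.0]])
print(impute_z(ld, observed_idx=[0, 1], missing_idx=[2], z_obs=np.array([4.1, 3.5])))
```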
Subject(s)
Genome-Wide Association Study , Quantitative Trait Loci , Genome-Wide Association Study/methods , Genotype , Linkage Disequilibrium , Phenotype , Polymorphism, Single Nucleotide , Sample Size
ABSTRACT
BACKGROUND: Clinical trial scenarios can be modeled using data from observational studies, providing critical information for the design of real-world trials. The Huntington's Disease Integrated Staging System (HD-ISS) characterizes disease progression over an individual's lifespan and allows for flexibility in the design of trials whose goal is delaying progression. Enrichment methods can be applied to the HD-ISS to identify subgroups requiring smaller estimated sample sizes. OBJECTIVE: To investigate time to the event of functional decline (HD-ISS Stage 3) as an endpoint for trials in HD and present sample size estimates after enrichment. METHODS: We classified individuals from observational studies according to the HD-ISS. We assessed the ability of the prognostic index normed (PIN) and its components to predict time to HD-ISS Stage 3. For enrichment, we formed groups from deciles of the baseline PIN distribution for HD-ISS Stage 2 participants. We selected enrichment subgroups closer to the Stage 3 transition and estimated sample sizes, using delay in the transition time as the effect size. RESULTS: In predicting time to HD-ISS Stage 3, PIN outperforms its components. Survival curves for each PIN decile show that groups with PIN from 1.48 to 2.74 have a median time to Stage 3 of approximately 2 years; these were combined to create the enrichment subgroups. Sample size estimates are presented by enrichment subgroup. CONCLUSIONS: PIN is predictive of functional decline. A delay of 9 months or more in the transition to Stage 3 for an enriched sample yields feasible sample size estimates, demonstrating that this approach can aid in planning future trials.
Subject(s)
Disease Progression , Huntington Disease , Huntington Disease/physiopathology , Humans , Sample Size , Female , Male , Middle Aged , Clinical Trials as Topic/methods , Adult , Prognosis , Time Factors
ABSTRACT
Divergence time estimation is crucial for providing temporal signals to date biologically important events, from species divergences to viral transmissions in space and time. With the advent of high-throughput sequencing, recent Bayesian phylogenetic studies have analyzed hundreds to thousands of sequences. Such large-scale analyses challenge divergence time reconstruction by requiring inference on highly correlated internal node heights, which often becomes computationally infeasible. To overcome this limitation, we explore a ratio transformation that maps the original $N-1$ internal node heights into a space of one height parameter and $N-2$ ratio parameters. To make the analyses scalable, we develop a collection of linear-time algorithms to compute the gradient and Jacobian-associated terms of the log-likelihood with respect to these ratios. We then apply Hamiltonian Monte Carlo sampling with the ratio transform in a Bayesian framework to learn the divergence times of 4 pathogenic viruses (West Nile virus, rabies virus, Lassa virus, and Ebola virus) and the coralline red algae. Our method both resolves a mixing issue in the West Nile virus example and improves inference efficiency by at least 5-fold for the Lassa and rabies virus examples as well as for the algae example. Our method also makes it computationally feasible to incorporate mixed-effects molecular clock models for the Ebola virus example, confirms the findings of the original study, and reveals clearer multimodal distributions of the divergence times of some clades of interest.
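A hedged sketch of a node-height ratio transform of the kind described, in the simple case of contemporaneous tips (a plain reading of the abstract; the paper's parameterization and Jacobian computations are not shown): each non-root internal node's height is expressed as a ratio of its parent's height, and heights are recovered by multiplying ratios down the tree.

```python
# Recover internal node heights from a root height and per-node ratios in (0, 1).
def heights_from_ratios(root_height, ratios, parent):
    """parent[i] is the parent internal node of node i; the root has parent None."""
    heights = {}
    def height(node):
        if node not in heights:
            if parent[node] is None:
                heights[node] = root_height
            else:
                heights[node] = ratios[node] * height(parent[node])
        return heights[node]
    for node in parent:
        height(node)
    return heights

# Toy tree of internal nodes: root -> a, root -> b, a -> c
parent = {"root": None, "a": "root", "b": "root", "c": "a"}
ratios = {"a": 0.6, "b": 0.8, "c": 0.5}
print(heights_from_ratios(10.0, ratios, parent))
```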
Subject(s)
Algorithms , Phylogeny , Bayes Theorem , Time Factors , Monte Carlo Method
ABSTRACT
In clinical studies of chronic diseases, the effectiveness of an intervention is often assessed using "high cost" outcomes that require long-term patient follow-up and/or are invasive to obtain. While much progress has been made in the development of statistical methods to identify surrogate markers, that is, measurements that could replace such costly outcomes, these methods are generally not applicable to studies with a small sample size: they either rely on nonparametric smoothing, which requires a relatively large sample size, or rely on strict model assumptions that are unlikely to hold in practice and are empirically difficult to verify with a small sample size. In this paper, we develop a novel rank-based nonparametric approach to evaluate a surrogate marker in a small-sample-size setting. The method is motivated by a small study of children with nonalcoholic fatty liver disease (NAFLD), a diagnosis covering a range of liver conditions in individuals without a significant history of alcohol intake. Specifically, we examine whether change in alanine aminotransferase (ALT; measured in blood) is a surrogate marker for change in the NAFLD activity score (obtained by biopsy) in a trial that compared Vitamin E ($n=50$) with placebo ($n=46$) among children with NAFLD.
Subject(s)
Non-alcoholic Fatty Liver Disease , Child , Humans , Non-alcoholic Fatty Liver Disease/diagnosis , Biomarkers , Biopsy , Sample Size
ABSTRACT
The expected value of the standard power function of a test, computed with respect to a design prior distribution, is often used to evaluate the probability of success of an experiment. However, looking only at the expected value can be reductive. Instead, the whole probability distribution of the power function induced by the design prior can be exploited. In this article we consider one-sided testing for the scale parameter of exponential families and derive general unifying expressions for the cumulative distribution and density functions of the random power. Sample size determination criteria based on alternative summaries of these functions are discussed. The study sheds light on the relevance of the choice of design prior in constructing a successful experiment.
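A hedged numerical illustration of the "random power" idea, using a one-sided test for a normal mean rather than the exponential-family scale setting of the article (the prior and design values are invented): drawing the parameter from a design prior induces a whole distribution of power values, of which the usual probability of success is only the mean.

```python
# Distribution of power induced by a design prior for a one-sided z-test.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
n, sigma, alpha = 50, 1.0, 0.025
theta = rng.normal(loc=0.3, scale=0.15, size=100_000)   # draws from the design prior
power = norm.cdf(np.sqrt(n) * theta / sigma - norm.ppf(1 - alpha))

print("expected power (probability of success):", power.mean().round(3))
print("P(power >= 0.8):", (power >= 0.8).mean().round(3))
print("5th/95th percentiles of power:", np.percentile(power, [5, 95]).round(3))
```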
Subject(s)
Bayes Theorem , Humans , Probability , Sample Size
ABSTRACT
In clinical settings with no commonly accepted standard-of-care, multiple treatment regimens are potentially useful, but some treatments may not be appropriate for some patients. A personalized randomized controlled trial (PRACTical) design has been proposed for this setting. For a network of treatments, each patient is randomized only among treatments which are appropriate for them. The aim is to produce treatment rankings that can inform clinical decisions about treatment choices for individual patients. Here we propose methods for determining sample size in a PRACTical design, since standard power-based methods are not applicable. We derive a sample size by evaluating information gained from trials of varying sizes. For a binary outcome, we quantify how many adverse outcomes would be prevented by choosing the top-ranked treatment for each patient based on trial results rather than choosing a random treatment from the appropriate personalized randomization list. In simulations, we evaluate three performance measures: mean reduction in adverse outcomes using sample information, proportion of simulated patients for whom the top-ranked treatment performed as well or almost as well as the best appropriate treatment, and proportion of simulated trials in which the top-ranked treatment performed better than a randomly chosen treatment. We apply the methods to a trial evaluating eight different combination antibiotic regimens for neonatal sepsis (NeoSep1), in which a PRACTical design addresses varying patterns of antibiotic choice based on disease characteristics and resistance. Our proposed approach produces results that are more relevant to complex decision making by clinicians and policy makers.
Subject(s)
Precision Medicine , Randomized Controlled Trials as Topic , Humans , Randomized Controlled Trials as Topic/methods , Sample Size , Precision Medicine/methods , Computer Simulation , Infant, Newborn , Sepsis/drug therapy , Models, Statistical
ABSTRACT
Sample size formulas have been proposed for comparing two sensitivities (or specificities) in the presence of verification bias under a paired design. However, the existing sample size formulas involve lengthy calculations of derivatives and are too complicated to implement. In this paper, we propose alternative sample size formulas for each of three existing tests: two Wald tests and one weighted McNemar's test. The proposed sample size formulas are more intuitive and simpler to implement than their existing counterparts. Furthermore, by comparing the sample sizes calculated from the three tests, we show that the three tests have similar sample sizes even though the weighted McNemar's test only uses the data from discordant pairs, whereas the two Wald tests also use the additional data from concordant pairs.
Subject(s)
Sensitivity and Specificity , Sample Size , Humans , Models, Statistical , Bias , Computer Simulation
ABSTRACT
This article is concerned with sample size determination methodology for prediction models. We propose to combine the individual calculations via learning-type curves. We suggest two distinct ways of doing so: a deterministic skeleton of a learning curve and a Gaussian process centered upon its deterministic counterpart. We employ several learning algorithms for modeling the primary endpoint and distinct measures of trial efficacy. We find that performance may vary with the sample size, but borrowing information across sample sizes universally improves the performance of such calculations. The Gaussian process-based learning curve appears more robust and statistically efficient, while computational efficiency is comparable. We suggest that anchoring against historical evidence when extrapolating sample sizes should be adopted when such data are available. The methods are illustrated on binary and survival endpoints.
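A hedged sketch of the deterministic "skeleton" idea (the Gaussian-process version and the article's specific algorithms are not shown; the pilot numbers are invented): fit an inverse-power-law learning curve to performance estimates at a few pilot sample sizes, then extrapolate to find the smallest n predicted to reach a target.

```python
# Fit a learning curve to pilot performance estimates and extrapolate the sample size.
import numpy as np
from scipy.optimize import curve_fit

def inverse_power(n, a, b, c):
    return a - b * n ** (-c)   # performance approaches the asymptote a as n grows

# Hypothetical pilot AUC estimates at several sample sizes
n_pilot = np.array([50, 100, 200, 400, 800])
auc_pilot = np.array([0.62, 0.66, 0.70, 0.72, 0.735])

params, _ = curve_fit(inverse_power, n_pilot, auc_pilot, p0=[0.8, 1.0, 0.5], maxfev=10000)
grid = np.arange(100, 5001, 50)
target = 0.75
reaching = grid[inverse_power(grid, *params) >= target]
print("smallest n predicted to reach AUC 0.75:",
      int(reaching[0]) if reaching.size else "beyond grid")
```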
Subject(s)
Algorithms , Models, Statistical , Humans , Sample Size , Learning Curve , Normal Distribution , Computer Simulation , Survival Analysis
ABSTRACT
The US FDA's Project Optimus initiative, which emphasizes dose optimization prior to marketing approval, represents a pivotal shift in oncology drug development. It has had a ripple effect, prompting a rethinking of what changes may be made to conventional pivotal trial designs to incorporate a dose optimization component. Aligned with this initiative, we propose a novel seamless phase II/III design with dose optimization (the SDDO framework). The proposed design starts with dose optimization in a randomized setting, leading to an interim analysis focused on optimal dose selection, trial continuation decisions, and sample size re-estimation (SSR). Based on the decision at the interim analysis, patient enrollment continues for both the selected dose arm and the control arm, and the significance of treatment effects is determined at the final analysis. The SDDO framework offers increased flexibility and cost-efficiency through sample size adjustment, while stringently controlling the Type I error. The proposed design also facilitates both accelerated approval (AA) and regular approval in a "one-trial" approach. Extensive simulation studies confirm that our design reliably identifies the optimal dosage and makes preferable decisions with a reduced sample size while retaining statistical power.