Search | VHL Regional Portal

1.

Upstrapping to determine futility: predicting future outcomes nonparametrically from past data.

Wild, Jessica L; Ginde, Adit A; Lindsell, Christopher J; Kaizer, Alexander M.

Trials ; 25(1): 312, 2024 May 09.

Article in English | MEDLINE | ID: mdl-38725072

ABSTRACT

BACKGROUND: Clinical trials often involve some form of interim monitoring to determine futility before planned trial completion. While many options for interim monitoring exist (e.g., alpha-spending, conditional power), nonparametric based interim monitoring methods are also needed to account for more complex trial designs and analyses. The upstrap is one recently proposed nonparametric method that may be applied for interim monitoring. METHODS: Upstrapping is motivated by the case resampling bootstrap and involves repeatedly sampling with replacement from the interim data to simulate thousands of fully enrolled trials. The p-value is calculated for each upstrapped trial and the proportion of upstrapped trials for which the p-value criteria are met is compared with a pre-specified decision threshold. To evaluate the potential utility for upstrapping as a form of interim futility monitoring, we conducted a simulation study considering different sample sizes with several different proposed calibration strategies for the upstrap. We first compared trial rejection rates across a selection of threshold combinations to validate the upstrapping method. Then, we applied upstrapping methods to simulated clinical trial data, directly comparing their performance with more traditional alpha-spending and conditional power interim monitoring methods for futility. RESULTS: The method validation demonstrated that upstrapping is much more likely to find evidence of futility in the null scenario than the alternative across a variety of simulations settings. Our three proposed approaches for calibration of the upstrap had different strengths depending on the stopping rules used. Compared to O'Brien-Fleming group sequential methods, upstrapped approaches had type I error rates that differed by at most 1.7% and expected sample size was 2-22% lower in the null scenario, while in the alternative scenario power fluctuated between 15.7% lower and 0.2% higher and expected sample size was 0-15% lower. CONCLUSIONS: In this proof-of-concept simulation study, we evaluated the potential for upstrapping as a resampling-based method for futility monitoring in clinical trials. The trade-offs in expected sample size, power, and type I error rate control indicate that the upstrap can be calibrated to implement futility monitoring with varying degrees of aggressiveness and that performance similarities can be identified relative to considered alpha-spending and conditional power futility monitoring methods.

Subject(s)

Clinical Trials as Topic , Computer Simulation , Medical Futility , Research Design , Humans , Clinical Trials as Topic/methods , Sample Size , Data Interpretation, Statistical , Models, Statistical , Treatment Outcome

2.

Using Bayesian statistics in confirmatory clinical trials in the regulatory setting: a tutorial review.

Lee, Se Yoon.

BMC Med Res Methodol ; 24(1): 110, 2024 May 07.

Article in English | MEDLINE | ID: mdl-38714936

ABSTRACT

Bayesian statistics plays a pivotal role in advancing medical science by enabling healthcare companies, regulators, and stakeholders to assess the safety and efficacy of new treatments, interventions, and medical procedures. The Bayesian framework offers a unique advantage over the classical framework, especially when incorporating prior information into a new trial with quality external data, such as historical data or another source of co-data. In recent years, there has been a significant increase in regulatory submissions using Bayesian statistics due to its flexibility and ability to provide valuable insights for decision-making, addressing the modern complexity of clinical trials where frequentist trials are inadequate. For regulatory submissions, companies often need to consider the frequentist operating characteristics of the Bayesian analysis strategy, regardless of the design complexity. In particular, the focus is on the frequentist type I error rate and power for all realistic alternatives. This tutorial review aims to provide a comprehensive overview of the use of Bayesian statistics in sample size determination, control of type I error rate, multiplicity adjustments, external data borrowing, etc., in the regulatory environment of clinical trials. Fundamental concepts of Bayesian sample size determination and illustrative examples are provided to serve as a valuable resource for researchers, clinicians, and statisticians seeking to develop more complex and innovative designs.

Subject(s)

Bayes Theorem , Clinical Trials as Topic , Humans , Clinical Trials as Topic/methods , Clinical Trials as Topic/statistics & numerical data , Research Design/standards , Sample Size , Data Interpretation, Statistical , Models, Statistical

3.

Replication of null results: Absence of evidence or evidence of absence?

Pawel, Samuel; Heyard, Rachel; Micheloud, Charlotte; Held, Leonhard.

Elife ; 122024 May 13.

Article in English | MEDLINE | ID: mdl-38739437

ABSTRACT

In several large-scale replication projects, statistically non-significant results in both the original and the replication study have been interpreted as a 'replication success.' Here, we discuss the logical problems with this approach: Non-significance in both studies does not ensure that the studies provide evidence for the absence of an effect and 'replication success' can virtually always be achieved if the sample sizes are small enough. In addition, the relevant error rates are not controlled. We show how methods, such as equivalence testing and Bayes factors, can be used to adequately quantify the evidence for the absence of an effect and how they can be applied in the replication setting. Using data from the Reproducibility Project: Cancer Biology, the Experimental Philosophy Replicability Project, and the Reproducibility Project: Psychology we illustrate that many original and replication studies with 'null results' are in fact inconclusive. We conclude that it is important to also replicate studies with statistically non-significant results, but that they should be designed, analyzed, and interpreted appropriately.

Subject(s)

Bayes Theorem , Reproducibility of Results , Humans , Research Design , Sample Size , Data Interpretation, Statistical

4.

Innovative statistical approaches: the use of neural networks reduces the sample size in the splenectomy-MCAO mouse model.

Romic, Dominik; Berecki, Monika; Srakocic, Sanja; Josic Dominovic, Paula; Justic, Helena; Hamer, Dominik; Petrinec, Daniela; Radmilovic, Marina; Hackenberger, Branimir; Gajovic, Srecko; Glasnovic, Anton.

Croat Med J ; 65(2): 122-137, 2024 Apr 30.

Article in English | MEDLINE | ID: mdl-38706238

ABSTRACT

AIM: To compare the effectiveness of artificial neural network (ANN) and traditional statistical analysis on identical data sets within the splenectomy-middle carotid artery occlusion (MCAO) mouse model. METHODS: Mice were divided into the splenectomized (SPLX) and sham-operated (SPLX-sham) group. A splenectomy was conducted 14 days before middle carotid artery occlusion (MCAO). Magnetic resonance imaging (MRI), bioluminescent imaging, neurological scoring (NS), and histological analysis, were conducted at two, four, seven, and 28 days after MCAO. Frequentist statistical analyses and ANN analysis employing a multi-layer perceptron architecture were performed to assess the probability of discriminating between SPLX and SPLX-sham mice. RESULTS: Repeated measures ANOVA showed no significant differences in body weight (F (5, 45)=0.696, P=0.629), NS (F (2.024, 18.218)=1.032, P=0.377) and brain infarct size on MRI between the SPLX and SPLX-sham groups post-MCAO (F (2, 24)=0.267, P=0.768). ANN analysis was employed to predict SPLX and SPL-sham classes. The highest accuracy in predicting SPLX class was observed when the model was trained on a data set containing all variables (0.7736±0.0234). For SPL-sham class, the highest accuracy was achieved when it was trained on a data set excluding the variable combination MR contralateral/animal mass/NS (0.9284±0.0366). CONCLUSION: This study validated the neuroprotective impact of splenectomy in an MCAO model using ANN for data analysis with a reduced animal sample size, demonstrating the potential for leveraging advanced statistical methods to minimize sample sizes in experimental biomedical research.

Subject(s)

Disease Models, Animal , Infarction, Middle Cerebral Artery , Magnetic Resonance Imaging , Neural Networks, Computer , Splenectomy , Animals , Mice , Splenectomy/methods , Infarction, Middle Cerebral Artery/surgery , Infarction, Middle Cerebral Artery/diagnostic imaging , Sample Size , Male

5.

Operational complexities in international clinical trials: a systematic review of challenges and proposed solutions.

Gumber, Leher; Agbeleye, Opeyemi; Inskip, Alex; Fairbairn, Ross; Still, Madeleine; Ouma, Luke; Lozano-Kuehne, Jingky; Bardgett, Michelle; Isaacs, John D; Wason, James Ms; Craig, Dawn; Pratt, Arthur G.

BMJ Open ; 14(4): e077132, 2024 Apr 15.

Article in English | MEDLINE | ID: mdl-38626966

ABSTRACT

OBJECTIVE: International trials can be challenging to operationalise due to incompatibilities between country-specific policies and infrastructures. The aim of this systematic review was to identify the operational complexities of conducting international trials and identify potential solutions for overcoming them. DESIGN: Systematic review. DATA SOURCES: Medline, Embase and Health Management Information Consortium were searched from 2006 to 30 January 2023. ELIGIBILITY CRITERIA: All studies reporting operational challenges (eg, site selection, trial management, intervention management, data management) of conducting international trials were included. DATA EXTRACTION AND SYNTHESIS: Search results were independently screened by at least two reviewers and data were extracted into a proforma. RESULTS: 38 studies (35 RCTs, 2 reports and 1 qualitative study) fulfilled the inclusion criteria. The median sample size was 1202 (IQR 332-4056) and median number of sites was 40 (IQR 13-78). 88.6% of studies had an academic sponsor and 80% were funded through government sources. Operational complexities were particularly reported during trial set-up due to lack of harmonisation in regulatory approvals and in relation to sponsorship structure, with associated budgetary impacts. Additional challenges included site selection, staff training, lengthy contract negotiations, site monitoring, communication, trial oversight, recruitment, data management, drug procurement and distribution, pharmacy involvement and biospecimen processing and transport. CONCLUSIONS: International collaborative trials are valuable in cases where recruitment may be difficult, diversifying participation and applicability. However, multiple operational and regulatory challenges are encountered when implementing a trial in multiple countries. Careful planning and communication between trials units and investigators, with an emphasis on establishing adequately resourced cross-border sponsorship structures and regulatory approvals, may help to overcome these barriers and realise the benefits of the approach. OPEN SCIENCE FRAMEWORK REGISTRATION NUMBER: osf-registrations-yvtjb-v1.

Subject(s)

Pharmacy , Humans , Sample Size , Budgets

6.

Beta spending function based on conditional power in group sequential design.

Ni, Senmiao; Zhong, Zihang; Jiang, Zhiwei; Zhao, Yang; Wu, Jingwei; Yu, Hao; Bai, Jianling.

Biom J ; 66(3): e2300094, 2024 Apr.

Article in English | MEDLINE | ID: mdl-38581099

ABSTRACT

Conditional power (CP) serves as a widely utilized approach for futility monitoring in group sequential designs. However, adopting the CP methods may lead to inadequate control of the type II error rate at the desired level. In this study, we introduce a flexible beta spending function tailored to regulate the type II error rate while employing CP based on a predetermined standardized effect size for futility monitoring (a so-called CP-beta spending function). This function delineates the expenditure of type II error rate across the entirety of the trial. Unlike other existing beta spending functions, the CP-beta spending function seamlessly incorporates beta spending concept into the CP framework, facilitating precise stagewise control of the type II error rate during futility monitoring. In addition, the stopping boundaries derived from the CP-beta spending function can be calculated via integration akin to other traditional beta spending function methods. Furthermore, the proposed CP-beta spending function accommodates various thresholds on the CP-scale at different stages of the trial, ensuring its adaptability across different information time scenarios. These attributes render the CP-beta spending function competitive among other forms of beta spending functions, making it applicable to any trials in group sequential designs with straightforward implementation. Both simulation study and example from an acute ischemic stroke trial demonstrate that the proposed method accurately captures expected power, even when the initially determined sample size does not consider futility stopping, and exhibits a good performance in maintaining overall type I error rates for evident futility.

Subject(s)

Ischemic Stroke , Research Design , Humans , Sample Size , Computer Simulation , Medical Futility

7.

Factors Associated With Premature Termination of Hyperacute Stroke Trials: A Review.

Aziz, Yasmin N; Sucharew, Heidi; Reeves, Mathew J; Broderick, Joseph P.

J Am Heart Assoc ; 13(8): e034115, 2024 Apr 16.

Article in English | MEDLINE | ID: mdl-38606770

ABSTRACT

BACKGROUND: We performed a review of acute stroke trials to determine features associated with premature termination of trial enrollment, defined by the authors as not meeting preplanned sample size. METHODS AND RESULTS: MEDLINE was searched for randomized clinical stroke trials published in 9 major clinical journals between 2013 and 2022. We included randomized clinical trials that were phase 2 or 3 with a preplanned sample size ≥100 and a time-to-treatment within 24 hours of onset for transient ischemic attack, ischemic stroke, or intracerebral hemorrhage. Data were abstracted on trial features including trial design, inclusion criteria, imaging, location and number of sites, masking, treatment complexity, control group (standard therapy, placebo), industry involvement, and preplanned stopping rules (futility and efficacy). Least absolute shrinkage and selection operator regression was used to select the most important factors associated with premature termination; then, a multivariable logistic regression was fit including only the least absolute shrinkage and selection operator selected variables. Of 1475 studies assessed, 98 trials met eligibility criteria. Forty-five (46%) trials were prematurely terminated, of which 27% were stopped for benefit/efficacy, 20% for lack of money/slow enrollment, 18% for futility, 16% for newly available evidence, 17% for other reasons, and 4% due to harm. Complex trials (adjusted odds ratio [aOR], 2.76 [95% CI, 1.13-7.49]), presence of a futility rule (aOR, 4.43 [95% CI, 1.62-17.91]), and exclusion of prestroke dependency (none/slight disability only; aOR, 2.19 [95% CI, 0.84-6.72] versus dependency allowed) were identified as the strongest predictors. CONCLUSIONS: Nearly half of acute stroke trials were terminated prematurely. Broadening inclusion criteria and simplifying trial design may decrease the likelihood of unplanned termination, whereas planned futility analyses may appropriately terminate trials early, saving money and resources.

Subject(s)

Ischemic Attack, Transient , Ischemic Stroke , Stroke , Humans , Stroke/therapy , Stroke/drug therapy , Cerebral Hemorrhage , Sample Size

8.

A response to "Realism and robustness require increased sample size when studying both sexes".

Phillips, Benjamin; Haschler, Timo N; Karp, Natasha A.

PLoS Biol ; 22(4): e3002578, 2024 Apr.

Article in English | MEDLINE | ID: mdl-38607970

Subject(s)

Sample Size , Female , Male , Humans

9.

Assessing the performance of physician's prescribing preference as an instrumental variable in comparative effectiveness research with moderate and small sample sizes: a simulation study.

Zhang, Lisong; Lewsey, Jim; McAllister, David A.

J Comp Eff Res ; 13(5): e230044, 2024 05.

Article in English | MEDLINE | ID: mdl-38567966

ABSTRACT

Aim: This simulation study is to assess the utility of physician's prescribing preference (PPP) as an instrumental variable for moderate and smaller sample sizes. Materials & methods: We designed a simulation study to imitate a comparative effectiveness research under different sample sizes. We compare the performance of instrumental variable (IV) and non-IV approaches using two-stage least squares (2SLS) and ordinary least squares (OLS) methods, respectively. Further, we test the performance of different forms of proxies for PPP as an IV. Results: The percent bias of 2SLS is around approximately 20%, while the percent bias of OLS is close to 60%. The sample size is not associated with the level of bias for the PPP IV approach. Conclusion: Irrespective of sample size, the PPP IV approach leads to less biased estimates of treatment effectiveness than OLS adjusting for known confounding only. Particularly for smaller sample sizes, we recommend constructing PPP from long prescribing histories to improve statistical power.

Subject(s)

Comparative Effectiveness Research , Computer Simulation , Practice Patterns, Physicians' , Humans , Comparative Effectiveness Research/methods , Sample Size , Practice Patterns, Physicians'/statistics & numerical data , Least-Squares Analysis , Bias

10.

Root surface roughness evaluation following application of different periodontal instruments and Er:YAG laser: A profilometry and SEM study.

Ince Kuka, Gizem; Gürsoy, Hare.

Lasers Med Sci ; 39(1): 98, 2024 Apr 07.

Article in English | MEDLINE | ID: mdl-38583109

ABSTRACT

AIM: The aim of the present study was to evaluate the efficacy of 30°-angled Er:YAG laser tip and different periodontal instruments on root surface roughness and morphology in vitro. METHODS: Eighteen bovine teeth root without carious lesion were decoronated from the cementoenamel junction and seperated longitidunally. A total of 36 obtained blocks were mounted in resin blocks and polished with silicon carbide papers under water irrigation. These blocks were randomly assigned into 3 treatment groups. In Group 1, 30°-angled Er:YAG laser (2.94 µm) tip was applied onto the blocks with a 20 Hz, 120 mJ energy output under water irrigation for 20 s. In Groups 2 and 3, the same treatment was applied to the blocks with new generation ultrasonic tip and conventional curette apico-coronally for 20 s with a sweeping motion. Surface roughness and morphology were evaluated before and after instrumentation with a profilometer and SEM, respectively. RESULTS: After instrumentation, profilometric analysis revealed significantly higher roughness values compared to baseline in all treatment groups(p < 0.05). Laser group revealed the roughest surface morphology followed by conventional curette and new generation ultrasonic tip treatment groups (p < 0.05). In SEM analysis, irregular surfaces and crater defects were seen more frequently in the laser group. CONCLUSION: Results of the study showed that the use of new generation ultrasonic tip was associated with smoother surface morphology compared to 30°-angled Er-YAG laser tip and conventional curette. Further in vitro and in vivo studies with an increased sample size are necessary to support the present study findings.

Subject(s)

Lasers, Solid-State , Animals , Cattle , Lasers, Solid-State/therapeutic use , Research Design , Sample Size , Tooth Cervix , Water

11.

Effect of scanning duration and sample size on reliability in resting state fMRI dynamic causal modeling analysis.

Ma, Liangsuo; Braun, Sarah E; Steinberg, Joel L; Bjork, James M; Martin, Caitlin E; Keen Ii, Larry D; Moeller, F Gerard.

Neuroimage ; 292: 120604, 2024 Apr 15.

Article in English | MEDLINE | ID: mdl-38604537

ABSTRACT

Despite its widespread use, resting-state functional magnetic resonance imaging (rsfMRI) has been criticized for low test-retest reliability. To improve reliability, researchers have recommended using extended scanning durations, increased sample size, and advanced brain connectivity techniques. However, longer scanning runs and larger sample sizes may come with practical challenges and burdens, especially in rare populations. Here we tested if an advanced brain connectivity technique, dynamic causal modeling (DCM), can improve reliability of fMRI effective connectivity (EC) metrics to acceptable levels without extremely long run durations or extremely large samples. Specifically, we employed DCM for EC analysis on rsfMRI data from the Human Connectome Project. To avoid bias, we assessed four distinct DCMs and gradually increased sample sizes in a randomized manner across ten permutations. We employed pseudo true positive and pseudo false positive rates to assess the efficacy of shorter run durations (3.6, 7.2, 10.8, 14.4 min) in replicating the outcomes of the longest scanning duration (28.8 min) when the sample size was fixed at the largest (n = 160 subjects). Similarly, we assessed the efficacy of smaller sample sizes (n = 10, 20, , 150 subjects) in replicating the outcomes of the largest sample (n = 160 subjects) when the scanning duration was fixed at the longest (28.8 min). Our results revealed that the pseudo false positive rate was below 0.05 for all the analyses. After the scanning duration reached 10.8 min, which yielded a pseudo true positive rate of 92%, further extensions in run time showed no improvements in pseudo true positive rate. Expanding the sample size led to enhanced pseudo true positive rate outcomes, with a plateau at n = 70 subjects for the targeted top one-half of the largest ECs in the reference sample, regardless of whether the longest run duration (28.8 min) or the viable run duration (10.8 min) was employed. Encouragingly, smaller sample sizes exhibited pseudo true positive rates of approximately 80% for n = 20, and 90% for n = 40 subjects. These data suggest that advanced DCM analysis may be a viable option to attain reliable metrics of EC when larger sample sizes or run times are not feasible.

Subject(s)

Brain , Connectome , Magnetic Resonance Imaging , Humans , Magnetic Resonance Imaging/methods , Magnetic Resonance Imaging/standards , Sample Size , Connectome/methods , Connectome/standards , Reproducibility of Results , Brain/diagnostic imaging , Brain/physiology , Adult , Female , Male , Rest/physiology , Time Factors

12.

Bioequivalence trials for the approval of generic drugs in Saudi Arabia: a descriptive analysis of design aspects.

Althunian, Turki A; Alzenaidy, Bader R; Alroba, Raseel A; Almadani, Ohoud A; Alqahtani, Fahad A; Binajlan, Albatool A; Almousa, Amal I; Alamr, Deema K; Al-Mofada, Malak S; Alsaqer, Nora Y; Alarfaj, Hessa A; Bahlewa, Abdulmohsen A; Alharbi, Mohammed A; Alhomaidan, Ali M; Alsuwyeh, Abdulaziz A; Alsaleh, Abdulmohsen A.

BMC Med Res Methodol ; 24(1): 82, 2024 Apr 05.

Article in English | MEDLINE | ID: mdl-38580928

ABSTRACT

BACKGROUND: This retrospective analysis aimed to comprehensively review the design and regulatory aspects of bioequivalence trials submitted to the Saudi Food and Drug Authority (SFDA) since 2017. METHODS: This was a retrospective, comprehensive analysis study. The Data extracted from the SFDA bioequivalence assessment reports were analyzed for reviewing the overall design and regulatory aspects of the successful bioequivalence trials, exploring the impact of the coefficient of variation of within-subject variability (CVw) on some design aspects, and providing an in-depth assessment of bioequivalence trial submissions that were deemed insufficient in demonstrating bioequivalence. RESULTS: A total of 590 bioequivalence trials were included of which 521 demonstrated bioequivalence (440 single active pharmaceutical ingredients [APIs] and 81 fixed combinations). Most of the successful trials were for cardiovascular drugs (84 out of 521 [16.1%]), and the 2 × 2 crossover design was used in 455 (87.3%) trials. The sample size tended to increase with the increase in the CVw in trials of single APIs. Biopharmaceutics Classification System Class II and IV drugs accounted for the majority of highly variable drugs (58 out of 82 [70.7%]) in the study. Most of the 51 rejected trials were rejected due to concerns related to the study center (n = 21 [41.2%]). CONCLUSION: This comprehensive analysis provides valuable insights into the regulatory and design aspects of bioequivalence trials and can inform future research and assist in identifying opportunities for improvement in conducting bioequivalence trials in Saudi Arabia.

Subject(s)

Drugs, Generic , Humans , Therapeutic Equivalency , Drugs, Generic/therapeutic use , Saudi Arabia , Retrospective Studies , Sample Size

13.

Realism and robustness require increased sample size when studying both sexes.

Drobniak, Szymon M; Lagisz, Malgorzata; Yang, Yefeng; Nakagawa, Shinichi.

PLoS Biol ; 22(4): e3002456, 2024 Apr.

Article in English | MEDLINE | ID: mdl-38603525

ABSTRACT

A recent article claimed that researchers need not increase the overall sample size for a study that includes both sexes. This Formal Comment points out that that study assumed two sexes to have the same variance, and explains why this is a unrealistic assumption.

Subject(s)

Research Design , Male , Female , Humans , Sample Size

14.

phylaGAN: data augmentation through conditional GANs and autoencoders for improving disease prediction accuracy using microbiome data.

Sharma, Divya; Lou, Wendy; Xu, Wei.

Bioinformatics ; 40(4)2024 Mar 29.

Article in English | MEDLINE | ID: mdl-38569898

ABSTRACT

MOTIVATION: Research is improving our understanding of how the microbiome interacts with the human body and its impact on human health. Existing machine learning methods have shown great potential in discriminating healthy from diseased microbiome states. However, Machine Learning based prediction using microbiome data has challenges such as, small sample size, imbalance between cases and controls and high cost of collecting large number of samples. To address these challenges, we propose a deep learning framework phylaGAN to augment the existing datasets with generated microbiome data using a combination of conditional generative adversarial network (C-GAN) and autoencoder. Conditional generative adversarial networks train two models against each other to compute larger simulated datasets that are representative of the original dataset. Autoencoder maps the original and the generated samples onto a common subspace to make the prediction more accurate. RESULTS: Extensive evaluation and predictive analysis was conducted on two datasets, T2D study and Cirrhosis study showing an improvement in mean AUC using data augmentation by 11% and 5% respectively. External validation on a cohort classifying between obese and lean subjects, with a smaller sample size provided an improvement in mean AUC close to 32% when augmented through phylaGAN as compared to using the original cohort. Our findings not only indicate that the generative adversarial networks can create samples that mimic the original data across various diversity metrics, but also highlight the potential of enhancing disease prediction through machine learning models trained on synthetic data. AVAILABILITY AND IMPLEMENTATION: https://github.com/divya031090/phylaGAN.

Subject(s)

Benchmarking , Microbiota , Humans , Machine Learning , Sample Size

15.

Sample size planning for rank-based multiple contrast tests.

Pöhlmann, Anna; Brunner, Edgar; Konietschke, Frank.

Biom J ; 66(3): e2300240, 2024 Apr.

Article in English | MEDLINE | ID: mdl-38637304

ABSTRACT

Rank methods are well-established tools for comparing two or multiple (independent) groups. Statistical planning methods for the computing the required sample size(s) to detect a specific alternative with predefined power are lacking. In the present paper, we develop numerical algorithms for sample size planning of pseudo-rank-based multiple contrast tests. We discuss the treatment effects and different ways to approximate variance parameters within the estimation scheme. We further compare pairwise with global rank methods in detail. Extensive simulation studies show that the sample size estimators are accurate. A real data example illustrates the application of the methods.

Subject(s)

Algorithms , Models, Statistical , Sample Size , Computer Simulation

16.

On repeated diagnostic testing in screening for a medical condition: How often should the diagnostic test be repeated?

Sangnawakij, Patarawan; Böhning, Dankmar.

Biom J ; 66(3): e2300175, 2024 Apr.

Article in English | MEDLINE | ID: mdl-38637326

ABSTRACT

In screening large populations a diagnostic test is frequently used repeatedly. An example is screening for bowel cancer using the fecal occult blood test (FOBT) on several occasions such as at 3 or 6 days. The question that is addressed here is how often should we repeat a diagnostic test when screening for a specific medical condition. Sensitivity is often used as a performance measure of a diagnostic test and is considered here for the individual application of the diagnostic test as well as for the overall screening procedure. The latter can involve an increasingly large number of repeated applications, but how many are sufficient? We demonstrate the issues involved in answering this question using real data on bowel cancer at St Vincents Hospital in Sydney. As data are only available for those testing positive at least once, an appropriate modeling technique is developed on the basis of the zero-truncated binomial distribution which allows for population heterogeneity. The latter is modeled using discrete nonparametric maximum likelihood. If we wish to achieve an overall sensitivity of 90%, the FOBT should be repeated for 2 weeks instead of the 1 week that was used at the time of the survey. A simulation study also shows consistency in the sense that bias and standard deviation for the estimated sensitivity decrease with an increasing number of repeated occasions as well as with increasing sample size.

Subject(s)

Colorectal Neoplasms , Humans , Colorectal Neoplasms/diagnosis , Occult Blood , Sample Size , Diagnostic Tests, Routine , Mass Screening/methods

17.

Penalized weighted smoothed quantile regression for high-dimensional longitudinal data.

Song, Yanan; Han, Haohui; Fu, Liya; Wang, Ting.

Stat Med ; 43(10): 2007-2042, 2024 May 10.

Article in English | MEDLINE | ID: mdl-38634309

ABSTRACT

Quantile regression, known as a robust alternative to linear regression, has been widely used in statistical modeling and inference. In this paper, we propose a penalized weighted convolution-type smoothed method for variable selection and robust parameter estimation of the quantile regression with high dimensional longitudinal data. The proposed method utilizes a twice-differentiable and smoothed loss function instead of the check function in quantile regression without penalty, and can select the important covariates consistently using the efficient gradient-based iterative algorithms when the dimension of covariates is larger than the sample size. Moreover, the proposed method can circumvent the influence of outliers in the response variable and/or the covariates. To incorporate the correlation within each subject and enhance the accuracy of the parameter estimation, a two-step weighted estimation method is also established. Furthermore, we prove the oracle properties of the proposed method under some regularity conditions. Finally, the performance of the proposed method is demonstrated by simulation studies and two real examples.

Subject(s)

Algorithms , Models, Statistical , Humans , Computer Simulation , Linear Models , Sample Size

18.

On the distribution of the power function for the scale parameter of exponential families.

De Santis, Fulvio; Gubbiotti, Stefania.

Stat Med ; 43(10): 1973-1992, 2024 May 10.

Article in English | MEDLINE | ID: mdl-38634314

ABSTRACT

The expected value of the standard power function of a test, computed with respect to a design prior distribution, is often used to evaluate the probability of success of an experiment. However, looking only at the expected value might be reductive. Instead, the whole probability distribution of the power function induced by the design prior can be exploited. In this article we consider one-sided testing for the scale parameter of exponential families and we derive general unifying expressions for cumulative distribution and density functions of the random power. Sample size determination criteria based on alternative summaries of these functions are discussed. The study sheds light on the relevance of the choice of the design prior in order to construct a successful experiment.

Subject(s)

Bayes Theorem , Humans , Probability , Sample Size

19.

GRADE guidance 37: rating imprecision in a body of evidence on test accuracy.

Mustafa, Reem A; El Mikati, Ibrahim K; Murad, M Hassan; Hultcrantz, Monica; Steingart, Karen R; Yang, Bada; Leeflang, Mariska M G; Akl, Elie A; Dahm, Philipp; Schünemann, Holger J.

J Clin Epidemiol ; 165: 111189, 2024 Jan.

Article in English | MEDLINE | ID: mdl-38613246

ABSTRACT

OBJECTIVES: To provide guidance on rating imprecision in a body of evidence assessing the accuracy of a single test. This guide will clarify when Grading of Recommendations Assessment, Development and Evaluation (GRADE) users should consider rating down the certainty of evidence by one or more levels for imprecision in test accuracy. STUDY DESIGN AND SETTING: A project group within the GRADE working group conducted iterative discussions and presentations at GRADE working group meetings to produce this guidance. RESULTS: Before rating the certainty of evidence, GRADE users should define the target of their certainty rating. GRADE recommends setting judgment thresholds defining what they consider a very accurate, accurate, inaccurate, and very inaccurate test. These thresholds should be set after considering consequences of testing and effects on people-important outcomes. GRADE's primary criterion for judging imprecision in test accuracy evidence is considering confidence intervals (i.e., CI approach) of absolute test accuracy results (true and false, positive, and negative results in a cohort of people). Based on the CI approach, when a CI appreciably crosses the predefined judgment threshold(s), one should consider rating down certainty of evidence by one or more levels, depending on the number of thresholds crossed. When the CI does not cross judgment threshold(s), GRADE suggests considering the sample size for an adequately powered test accuracy review (optimal or review information size [optimal information size (OIS)/review information size (RIS)]) in rating imprecision. If the combined sample size of the included studies in the review is smaller than the required OIS/RIS, one should consider rating down by one or more levels for imprecision. CONCLUSION: This paper extends previous GRADE guidance for rating imprecision in single test accuracy systematic reviews and guidelines, with a focus on the circumstances in which one should consider rating down one or more levels for imprecision.

Subject(s)

GRADE Approach , Group Processes , Humans , Judgment , Sample Size

20.

Craniocervical instability in patients with EhlersDanlos syndromes: outcomes analysis following occipitocervical fusion of published cases.

Mughal, Zaib Un Nisa; Malik, Abdul.

Neurosurg Rev ; 47(1): 158, 2024 Apr 16.

Article in English | MEDLINE | ID: mdl-38625445

ABSTRACT

This critique provides a critical analysis of the outcomes following occipito-cervical fusion in patients with Ehlers-Danlos syndromes (EDS) and craniocervical instability. The study examines the efficacy of the surgical intervention and evaluates its impact on patient outcomes. While the article offers valuable insights into the management of EDS-related craniocervical instability, several limitations and areas for improvement are identified, including sample size constraints, the absence of a control group, and the need for long-term follow-up data. Future research efforts should focus on addressing these concerns to optimize treatment outcomes for individuals with EDS.

Subject(s)

Publications , Spinal Fusion , Humans , Sample Size

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

SEND TO:

SELECTION OF CITATIONS

SEARCH DETAIL