Pesquisa | Portal de Pesquisa da BVS Enfermagem

1.

Identification of hidden associations among eukaryotic genes through statistical analysis of coevolutionary transitions.

Dembech, Elena; Malatesta, Marco; De Rito, Carlo; Mori, Giulia; Cavazzini, Davide; Secchi, Andrea; Morandin, Francesco; Percudani, Riccardo.

Proc Natl Acad Sci U S A ; 120(16): e2218329120, 2023 04 18.

Artigo em Inglês | MEDLINE | ID: mdl-37043529

RESUMO

Coevolution at the gene level, as reflected by correlated events of gene loss or gain, can be revealed by phylogenetic profile analysis. The optimal method and metric for comparing phylogenetic profiles, especially in eukaryotic genomes, are not yet established. Here, we describe a procedure suitable for large-scale analysis, which can reveal coevolution based on the assessment of the statistical significance of correlated presence/absence transitions between gene pairs. This metric can identify coevolution in profiles with low overall similarities and is not affected by similarities lacking coevolutionary information. We applied the procedure to a large collection of 60,912 orthologous gene groups (orthogroups) in 1,264 eukaryotic genomes extracted from OrthoDB. We found significant cotransition scores for 7,825 orthogroups associated in 2,401 coevolving modules linking known and unknown genes in protein complexes and biological pathways. To demonstrate the ability of the method to predict hidden gene associations, we validated through experiments the involvement of vertebrate malate synthase-like genes in the conversion of (S)-ureidoglycolate into glyoxylate and urea, the last step of purine catabolism. This identification explains the presence of glyoxylate cycle genes in metazoa and suggests an anaplerotic role of purine degradation in early eukaryotes.

Assuntos

Eucariotos , Evolução Molecular , Eucariotos/genética , Filogenia , Células Eucarióticas

2.

Identifying local associations in biological time series: algorithms, statistical significance, and applications.

Ai, Dongmei; Chen, Lulu; Xie, Jiemin; Cheng, Longwei; Zhang, Fang; Luan, Yihui; Li, Yang; Hou, Shengwei; Sun, Fengzhu; Xia, Li Charlie.

Brief Bioinform ; 24(6)2023 09 22.

Artigo em Inglês | MEDLINE | ID: mdl-37930023

RESUMO

Local associations refer to spatial-temporal correlations that emerge from the biological realm, such as time-dependent gene co-expression or seasonal interactions between microbes. One can reveal the intricate dynamics and inherent interactions of biological systems by examining the biological time series data for these associations. To accomplish this goal, local similarity analysis algorithms and statistical methods that facilitate the local alignment of time series and assess the significance of the resulting alignments have been developed. Although these algorithms were initially devised for gene expression analysis from microarrays, they have been adapted and accelerated for multi-omics next generation sequencing datasets, achieving high scientific impact. In this review, we present an overview of the historical developments and recent advances for local similarity analysis algorithms, their statistical properties, and real applications in analyzing biological time series data. The benchmark data and analysis scripts used in this review are freely available at http://github.com/labxscut/lsareview.

Assuntos

Algoritmos , Perfilação da Expressão Gênica , Fatores de Tempo , Perfilação da Expressão Gênica/métodos , Sequenciamento de Nucleotídeos em Larga Escala , Benchmarking

3.

Fragility indices for only sufficiently likely modifications.

Baer, Benjamin R; Gaudino, Mario; Charlson, Mary; Fremes, Stephen E; Wells, Martin T.

Proc Natl Acad Sci U S A ; 118(49)2021 12 07.

Artigo em Inglês | MEDLINE | ID: mdl-34848537

RESUMO

The fragility index is a clinically meaningful metric based on modifying patient outcomes that is increasingly used to interpret the robustness of clinical trial results. The fragility index relies on a concept that explores alternative realizations of the same clinical trial by modifying patient measurements. In this article, we propose to generalize the fragility index to a family of fragility indices called the incidence fragility indices that permit only outcome modifications that are sufficiently likely and provide an exact algorithm to calculate the incidence fragility indices. Additionally, we introduce a far-reaching generalization of the fragility index to any data type and explain how to permit only sufficiently likely modifications for nondichotomous outcomes. All of the proposed methodologies follow the fragility index concept.

Assuntos

Interpretação Estatística de Dados , Algoritmos , Humanos , Projetos de Pesquisa , Tamanho da Amostra

4.

Investigating the Performance of Frequentist and Bayesian Techniques in Genomic Evaluation.

Sahebalam, Hamid; Gholizadeh, Mohsen; Hafezian, Hasan.

Biochem Genet ; 2024 Jul 01.

Artigo em Inglês | MEDLINE | ID: mdl-38951354

RESUMO

The genomic evaluation process relies on the assumption of linkage disequilibrium between dense single-nucleotide polymorphism (SNP) markers at the genome level and quantitative trait loci (QTL). The present study was conducted with the aim of evaluating four frequentist methods including Ridge Regression, Least Absolute Shrinkage and Selection Operator (LASSO), Elastic Net, and Genomic Best Linear Unbiased Prediction (GBLUP) and five Bayesian methods including Bayes Ridge Regression (BRR), Bayes A, Bayesian LASSO, Bayes C, and Bayes B, in genomic selection using simulation data. The difference between prediction accuracy was assessed in pairs based on statistical significance (p-value) (i.e., t test and Mann-Whitney U test) and practical significance (Cohen's d effect size) For this purpose, the data were simulated based on two scenarios in different marker densities (4000 and 8000, in the whole genome). The simulated data included a genome with four chromosomes, 1 Morgan each, on which 100 randomly distributed QTL and two different densities of evenly distributed SNPs (1000 and 2000), at the heritability level of 0.4, was considered. For the frequentist methods except for GBLUP, the regularization parameter λ was calculated using a five-fold cross-validation approach. For both scenarios, among the frequentist methods, the highest prediction accuracy was observed by Ridge Regression and GBLUP. The lowest and the highest bias were shown by Ridge Regression and GBLUP, respectively. Also, among the Bayesian methods, Bayes B and BRR showed the highest and lowest prediction accuracy, respectively. The lowest bias in both scenarios was registered by Bayesian LASSO and the highest bias in the first and the second scenario were shown by BRR and Bayes B, respectively. Across all the studied methods in both scenarios, the highest and the lowest accuracy were shown by Bayes B and LASSO and Elastic Net, respectively. As expected, the greatest similarity in performance was observed between GBLUP and BRR ( d = 0.007 , in the first scenario and d = 0.003 , in the second scenario). The results obtained from parametric t and non-parametric Mann-Whitney U tests were similar. In the first and second scenario, out of 36 t test between the performance of the studied methods in each scenario, 14 ( P < . 001 ) and 2 ( P < . 05 ) comparisons were significant, respectively, which indicates that with the increase in the number of predictors, the difference in the performance of different methods decreases. This was proven based on the Cohen's d effect size, so that with the increase in the complexity of the model, the effect size was not seen as very large. The regularization parameters in frequentist methods should be optimized by cross-validation approach before using these methods in genomic evaluation.

5.

Interpreting the Current Literature on Outcomes of Robotic-Assisted Versus Conventional Total Knee Arthroplasty Using Fragility Analysis: A Systematic Review and Cross-Sectional Study of Randomized Controlled Trials.

Zabat, Michelle A; Giakas, Alec M; Hohmann, Alexandra L; Lonner, Jess H.

J Arthroplasty ; 39(7): 1882-1887, 2024 Jul.

Artigo em Inglês | MEDLINE | ID: mdl-38309638

RESUMO

BACKGROUND: Fragility analysis is a method of further characterizing outcomes in terms of the stability of statistical findings. This study assesses the statistical fragility of recent randomized controlled trials (RCTs) evaluating robotic-assisted versus conventional total knee arthroplasty (RA-TKA versus C-TKA). METHODS: We queried PubMed for RCTs comparing alignment, function, and outcomes between RA-TKA and C-TKA. Fragility index (FI) and reverse fragility index (RFI) (collectively, "FI") were calculated for dichotomous outcomes as the number of outcome reversals needed to change statistical significance. Fragility quotient (FQ) was calculated by dividing the FI by the sample size for that outcome event. Median FI and FQ were calculated for all outcomes collectively as well as for each individual outcome. Subanalyses were performed to assess FI and FQ based on outcome event type and statistical significance, as well as study loss to follow-up and year of publication. RESULTS: The overall median FI was 3.0 (interquartile range, [IQR] 1.0 to 6.3) and the median reverse fragility index was 3.0 (IQR 2.0 to 4.0). The overall median FQ was 0.027 (IQR 0.012 to 0.050). Loss to follow-up was greater than FI for 23 of the 38 outcomes assessed. CONCLUSIONS: A small number of alternative outcomes is often enough to reverse the statistical significance of findings in RCTs evaluating dichotomous outcomes in RA-TKA versus C-TKA. We recommend reporting FI and FQ alongside P values to improve the interpretability of RCT results.

Assuntos

Artroplastia do Joelho , Ensaios Clínicos Controlados Aleatórios como Assunto , Procedimentos Cirúrgicos Robóticos , Artroplastia do Joelho/métodos , Humanos , Procedimentos Cirúrgicos Robóticos/métodos , Resultado do Tratamento , Estudos Transversais , Articulação do Joelho/cirurgia

6.

Statistical significance, clinical importance and effect sizes: Enhancing understanding of a study's results.

Glaros, Alan G.

J Oral Rehabil ; 2024 Jul 02.

Artigo em Inglês | MEDLINE | ID: mdl-38956893

RESUMO

BACKGROUND: The proper interpretation of a study's results requires both excellent understanding of good methodological practices and deep knowledge of prior results, aided by the availability of effect sizes. METHODS: This review takes the form of an expository essay exploring the complex and nuanced relationships among statistical significance, clinical importance, and effect sizes. RESULTS: Careful attention to study design and methodology will increase the likelihood of obtaining statistical significance and may enhance the ability of investigators/readers to accurately interpret results. Measures of effect size show how well the variables used in a study account for/explain the variability in the data. Studies reporting strong effects may have greater practical value/utility than studies reporting weak effects. Effect sizes need to be interpreted in context. Verbal summary characterizations of effect sizes (e.g., "weak", "strong") are fundamentally flawed and can lead to inappropriate characterization of results. Common language effect size (CLES) indicators are a relatively new approach to effect sizes that may offer a more accessible interpretation of results that can benefit providers, patients, and the public at large. CONCLUSIONS: It is important to convey research findings in ways that are clear to both the research community and to the public. At a minimum, this requires inclusion of standard effect size data in research reports. Proper selection of measures and careful design of studies are foundational to the interpretation of a study's results. The ability to draw useful conclusions from a study is increased when investigators enhance the methodological quality of their work.

7.

Statistical considerations and database limitations in NMR-based metabolic profiling studies.

Ross, Imani L; Beardslee, Julie A; Steil, Maria M; Chihanga, Tafadzwa; Kennedy, Michael A.

Metabolomics ; 19(7): 64, 2023 06 28.

Artigo em Inglês | MEDLINE | ID: mdl-37378680

RESUMO

INTRODUCTION: Interpretation and analysis of NMR-based metabolic profiling studies is limited by substantially incomplete commercial and academic databases. Statistical significance tests, including p-values, VIP scores, AUC values and FC values, can be largely inconsistent. Data normalization prior to statistical analysis can cause erroneous outcomes. OBJECTIVES: The objectives were (1) to quantitatively assess consistency among p-values, VIP scores, AUC values and FC values in representative NMR-based metabolic profiling datasets, (2) to assess how data normalization can impact statistical significance outcomes, (3) to determine resonance peak assignment completion potential using commonly used databases and (4) to analyze intersection and uniqueness of metabolite space in these databases. METHODS: P-values, VIP scores, AUC values and FC values, and their dependence on data normalization, were determined in orthotopic mouse model of pancreatic cancer and two human pancreatic cancer cell lines. Completeness of resonance assignments were evaluated using Chenomx, the human metabolite database (HMDB) and the COLMAR database. The intersection and uniqueness of the databases was quantified. RESULTS: P-values and AUC values were strongly correlated compared to VIP or FC values. Distributions of statistically significant bins depended strongly on whether or not datasets were normalized. 40-45% of peaks had either no or ambiguous database matches. 9-22% of metabolites were unique to each database. CONCLUSIONS: Lack of consistency in statistical analyses of metabolomics data can lead to misleading or inconsistent interpretation. Data normalization can have large effects on statistical analysis and should be justified. About 40% of peak assignments remain ambiguous or impossible with current databases. 1D and 2D databases should be made consistent to maximize metabolite assignment confidence and validation.

Assuntos

Imageamento por Ressonância Magnética , Metabolômica , Animais , Camundongos , Humanos , Espectroscopia de Ressonância Magnética , Bases de Dados Factuais , Linhagem Celular

8.

Assessing the robustness of results from clinical trials and meta-analyses with the fragility index.

Lin, Lifeng; Xing, Aiwen; Chu, Haitao; Murad, M Hassan; Xu, Chang; Baer, Benjamin R; Wells, Martin T; Sanchez-Ramos, Luis.

Am J Obstet Gynecol ; 228(3): 276-282, 2023 03.

Artigo em Inglês | MEDLINE | ID: mdl-36084702

RESUMO

The fragility index has been increasingly used to assess the robustness of the results of clinical trials since 2014. It aims at finding the smallest number of event changes that could alter originally statistically significant results. Despite its popularity, some researchers have expressed several concerns about the validity and usefulness of the fragility index. It offers a comprehensive review of the fragility index's rationale, calculation, software, and interpretation, with emphasis on application to studies in obstetrics and gynecology. This article presents the fragility index in the settings of individual clinical trials, standard pairwise meta-analyses, and network meta-analyses. Moreover, this article provides worked examples to demonstrate how the fragility index can be appropriately calculated and interpreted. In addition, the limitations of the traditional fragility index and some solutions proposed in the literature to address these limitations were reviewed. In summary, the fragility index is recommended to be used as a supplemental measure in the reporting of clinical trials and a tool to communicate the robustness of trial results to clinicians. Other considerations that can aid in the fragility index's interpretation include the loss to follow-up and the likelihood of data modifications that achieve the loss of statistical significance.

Assuntos

Probabilidade , Humanos , Metanálise em Rede , Metanálise como Assunto , Ensaios Clínicos como Assunto

9.

The trends for the "trend toward significance" in the pediatric literature.

Rallis, Dimitrios; Baltogianni, Maria; Balomenou, Foteini; Dermitzaki, Niki; Kosmeri, Chrisoula; Giannakopoulos, Spyridon; Giapros, Vasileios.

Eur J Pediatr ; 182(2): 937-940, 2023 Feb.

Artigo em Inglês | MEDLINE | ID: mdl-36459228

RESUMO

Purpose This study is to examine whether the term "trend toward statistical significance" is used to describe statistically nonsignificant results in biomedical literature. We examined articles published in five high-impact pediatric journals, including The Lancet Child & Adolescent Health, The Journal of Pediatrics, Early Human Development, Frontiers in Pediatrics, and BMC Pediatrics to identify manuscripts where a "trend" was used to describe a statistically nonsignificant result, from January 2020 to December 2021, and, furthermore, for The Journal of Pediatrics, Early Human Development, and BMC Pediatrics from January 2010 to December 2011. We detected that a "trend toward significance" was used to describe a statistically nonsignificant result at least once in 146 articles (2.7%) during the period between 2020 and 2021 and in 97 articles (4.0%) during the period between 2010 and 2011. We found no significant difference in the proportion of published articles with inappropriate use of "trend" across journals belonging to the first quartile of impact compared to the second quartile or across journals publishing under the subscription model or open access policy compared to journals publishing solely under the open access policy, in any period. The overall proportion of the inappropriate use of "trend" declined significantly between 2010 and 2011 to 2020 and 2021 (p = 0.002, RR 0.66 95% CI 0.51-0.86). CONCLUSION: "Trend" statements were sporadically used to describe statistically nonsignificant results across pediatric literature. The inappropriate use of "trend" to describe almost significant differences could be misleading, and "trend" should be reserved only when a specific statistical test for trend has been performed, or in relation to appropriate scientific definitions. WHAT IS KNOWN: â¢Previously, researchers have reported inappropriate use of "trend" in articles across anaesthesia or major oncology journals. â¢In many cases, hypothesized results that are close but not lower than the statistical significance threshold are emphasized as "almost" significant. WHAT IS NEW: â¢"Trend" statements were sporadically used to describe statistically nonsignificant results across pediatric literature. â¢Inappropriate use of "trend" was similar in journals with a subscription model compared to those having an open access policy and decreased within a 10-year period.

Assuntos

Publicações Periódicas como Assunto , Editoração , Adolescente , Criança , Humanos , Editoração/tendências , Estatística como Assunto

10.

Learning to deep learning: statistics and a paradigm test in selecting a UNet architecture to enhance MRI.

Sharma, Rishabh; Tsiamyrtzis, Panagiotis; Webb, Andrew G; Leiss, Ernst L; Tsekos, Nikolaos V.

MAGMA ; 2023 Nov 21.

Artigo em Inglês | MEDLINE | ID: mdl-37989921

RESUMO

OBJECTIVE: This study aims to assess the statistical significance of training parameters in 240 dense UNets (DUNets) used for enhancing low Signal-to-Noise Ratio (SNR) and undersampled MRI in various acquisition protocols. The objective is to determine the validity of differences between different DUNet configurations and their impact on image quality metrics. MATERIALS AND METHODS: To achieve this, we trained all DUNets using the same learning rate and number of epochs, with variations in 5 acquisition protocols, 24 loss function weightings, and 2 ground truths. We calculated evaluation metrics for two metric regions of interest (ROI). We employed both Analysis of Variance (ANOVA) and Mixed Effects Model (MEM) to assess the statistical significance of the independent parameters, aiming to compare their efficacy in revealing differences and interactions among fixed parameters. RESULTS: ANOVA analysis showed that, except for the acquisition protocol, fixed variables were statistically insignificant. In contrast, MEM analysis revealed that all fixed parameters and their interactions held statistical significance. This emphasizes the need for advanced statistical analysis in comparative studies, where MEM can uncover finer distinctions often overlooked by ANOVA. DISCUSSION: These findings highlight the importance of utilizing appropriate statistical analysis when comparing different deep learning models. Additionally, the surprising effectiveness of the UNet architecture in enhancing various acquisition protocols underscores the potential for developing improved methods for characterizing and training deep learning models. This study serves as a stepping stone toward enhancing the transparency and comparability of deep learning techniques for medical imaging applications.

11.

Lowering the statistical significance threshold of randomized controlled trials in three major general anesthesiology journals.

Waters, Philo; Rucker, Brayden; Love, Mitchell; Vassar, Matt.

Can J Anaesth ; 70(9): 1441-1448, 2023 09.

Artigo em Inglês | MEDLINE | ID: mdl-37561351

RESUMO

PURPOSE: The primary objective of our study was to determine how lowering a P value threshold from 0.05 to 0.005 would affect the statistical significance of previously published randomized controlled trials (RCTs) in major anesthesiology journals. METHODS: We searched the PubMed database for studies electronically published in 2020 within three major general anesthesiology journals as indexed by both Google Metrics and Scimago Journal & Country Rank. Studies included were RCTs published in 2020 in Anesthesiology, Anesthesia & Analgesia, and the British Journal of Anaesthesia; had a primary endpoint, and used a P value threshold to determine the effect of the intervention. We performed screening and data extraction in a masked duplicate fashion. RESULTS: Ninety-one RCTs met inclusion criteria. The most frequently studied type of intervention was drugs (44/91, 48%). From the 91 trials, 99 primary endpoints, and thus P values, were obtained. Fifty-eight (59%) endpoints had a P value < 0.05 and 41 (41%) had a P value ≥ 0.05. Of the 58 primary endpoints previously considered statistically significant, 21 (36%) P values would maintain statistical significance at P < 0.005, and 37 (64%) would be reclassified as "suggestive." CONCLUSIONS: Lowering a P value threshold of 0.05 to 0.005 would have altered one third of significance interpretations of RCTs in the surveyed anesthesiology literature. Thus, it is important for readers to consider post hoc probabilities when evaluating clinical trial results. Although the present study focused on the anesthesiology literature, we suggest that our results warrant further research within other fields of medicine to help avoid clinical misinterpretation of RCT findings and improve quality of care.

RéSUMé: OBJECTIF: L'objectif principal de notre étude était de déterminer comment l'abaissement d'un seuil de valeur P de 0,05 à 0,005 affecterait la signification statistique des études randomisées contrôlées (ERC) précédemment publiées dans certaines des principales revues d'anesthésiologie. MéTHODE: Nous avons réalisé des recherches dans la base de données PubMed pour trouver des études publiées électroniquement en 2020 dans trois des principales revues d'anesthésiologie générale et indexées par Google Metrics et Scimago Journal & Country Rank. Les études incluses étaient des ERC publiées en 2020 dans les revues Anesthesiology, Anesthesia & Analgesia, et le British Journal of Anaesthesia, qui avaient un critère d'évaluation principal et utilisaient un seuil de valeur P pour déterminer l'effet de l'intervention. Nous avons effectué la sélection et l'extraction des données de manière dupliquée masquée. RéSULTATS: Quatre-vingt-onze ERC remplissaient les critères d'inclusion. Le type d'intervention le plus fréquemment étudié était de nature médicamenteuse (44/91, 48 %). Sur les 91 études, 99 critères d'évaluation principaux, et donc valeurs P, ont été obtenus. Cinquante-huit (59 %) critères d'évaluation avaient une valeur P < 0,05 et 41 (41 %) avaient une valeur P ≥ 0,05. Sur les 58 critères d'évaluation principaux précédemment considérés comme statistiquement significatifs, 21 (36 %) valeurs P maintiendraient leur signification statistique à P < 0,005, et 37 (64 %) seraient reclassées comme étant « suggestives ¼. CONCLUSION: Le fait d'abaisser le seuil de valeur P de 0,05 à 0,005 aurait modifié un tiers des interprétations de signification des ERC dans la littérature anesthésiologique étudiée. Il est donc important que les lectrices et lecteurs tiennent compte des probabilités post hoc lors de l'évaluation des résultats d'études cliniques. Bien que la présente étude se soit concentrée sur la littérature en anesthésiologie, nous suggérons que nos résultats justifient des recherches supplémentaires dans d'autres domaines de la médecine afin d'éviter une mauvaise interprétation clinique des résultats des ERC et d'améliorer la qualité des soins.

Assuntos

Anestesia , Anestesiologia , Publicações Periódicas como Assunto , Humanos , Anestesiologia/métodos , Ensaios Clínicos Controlados Aleatórios como Assunto

12.

Practical notes on popular statistical tests in renal physiology.

Mamenko, Mykola; Lysikova, Daria V; Spires, Denisha R; Tarima, Sergey S; Ilatovskaya, Daria V.

Am J Physiol Renal Physiol ; 323(4): F389-F400, 2022 10 01.

Artigo em Inglês | MEDLINE | ID: mdl-35834273

RESUMO

Competent statistical analysis is essential to maintain rigor and reproducibility in physiological research. Unfortunately, the benefits offered by statistics are often negated by misuse or inadequate reporting of statistical methods. To address the need for improved quality of statistical analysis in papers, the American Physiological Society released guidelines for reporting statistics in journals published by the society. The guidelines reinforce high standards for the presentation of statistical data in physiology but focus on the conceptual challenges and, thus, may be of limited use to an unprepared reader. Experimental scientists working in the renal field may benefit from putting the existing guidelines in a practical context. This paper discusses the application of widespread hypothesis tests in a confirmatory study. We simulated pharmacological experiments assessing intracellular calcium in cultured renal cells and kidney function at the systemic level to review best practices for data analysis, graphical presentation, and reporting. Such experiments are ubiquitously used in renal physiology and could be easily translated to other practical applications to fit the reader's specific needs. We provide step-by-step guidelines for using the most common types of t tests and ANOVA and discuss typical mistakes associated with them. We also briefly consider normality tests, exclusion criteria, and identification of technical and experimental replicates. This review is supposed to help the reader analyze, illustrate, and report the findings correctly and will hopefully serve as a gauge for a level of design complexity when it might be time to consult a biostatistician.

Assuntos

Projetos de Pesquisa , Reprodutibilidade dos Testes , Estados Unidos

13.

A logical analysis of null hypothesis significance testing using popular terminology.

McNulty, Richard.

BMC Med Res Methodol ; 22(1): 244, 2022 Sep 19.

Artigo em Inglês | MEDLINE | ID: mdl-36123631

RESUMO

BACKGROUND: Null Hypothesis Significance Testing (NHST) has been well criticised over the years yet remains a pillar of statistical inference. Although NHST is well described in terms of statistical models, most textbooks for non-statisticians present the null and alternative hypotheses (H0 and HA, respectively) in terms of differences between groups such as (µ1 = µ2) and (µ1 ≠ µ2) and HA is often stated to be the research hypothesis. Here we use propositional calculus to analyse the internal logic of NHST when couched in this popular terminology. The testable H0 is determined by analysing the scope and limits of the P-value and the test statistic's probability distribution curve. RESULTS: We propose a minimum axiom set NHST in which it is taken as axiomatic that H0 is rejected if P-value< α. Using the common scenario of the comparison of the means of two sample groups as an example, the testable H0 is {(µ1 = µ2) and [([Formula: see text] 1 ≠ [Formula: see text] 2) due to chance alone]}. The H0 and HA pair should be exhaustive to avoid false dichotomies. This entails that HA is ¬{(µ1 = µ2) and [([Formula: see text] 1 ≠ [Formula: see text] 2) due to chance alone]}, rather than the research hypothesis (HT). To see the relationship between HA and HT, HA can be rewritten as the disjunction HA: ({(µ1 = µ2) â§ [([Formula: see text] 1 ≠ [Formula: see text] 2) not due to chance alone]} â¨ {(µ1 ≠ µ2) â§ [[Formula: see text] 1 ≠ [Formula: see text] 2) not due to (µ1 ≠ µ2) alone]} â¨ {(µ1 ≠ µ2) â§ [([Formula: see text] 1 ≠ [Formula: see text] 2) due to (µ1 ≠ µ2) alone]}). This reveals that HT (the last disjunct in bold) is just one possibility within HA. It is only by adding premises to NHST that HT or other conclusions can be reached. CONCLUSIONS: Using this popular terminology for NHST, analysis shows that the definitions of H0 and HA differ from those found in textbooks. In this framework, achieving a statistically significant result only justifies the broad conclusion that the results are not due to chance alone, not that the research hypothesis is true. More transparency is needed concerning the premises added to NHST to rig particular conclusions such as HT. There are also ramifications for the interpretation of Type I and II errors, as well as power, which do not specifically refer to HT as claimed by texts.

14.

Current controversies: Null hypothesis significance testing.

Sedgwick, Philip M; Hammer, Anne; Kesmodel, Ulrik Schiøler; Pedersen, Lars Henning.

Acta Obstet Gynecol Scand ; 101(6): 624-627, 2022 06.

Artigo em Inglês | MEDLINE | ID: mdl-35451497

RESUMO

Traditional null hypothesis significance testing (NHST) incorporating the critical level of significance of 0.05 has become the cornerstone of decision-making in health care, and nowhere less so than in obstetric and gynecological research. However, such practice is controversial. In particular, it was never intended for clinical significance to be inferred from statistical significance. The inference of clinical importance based on statistical significance (p < 0.05), and lack of clinical significance otherwise (p ≥ 0.05) represents misunderstanding of the original purpose of NHST. Furthermore, the limitations of NHST-sensitivity to sample size, plus type I and II errors-are frequently ignored. Therefore, decision-making based on NHST has the potential for recurrent false claims about the effectiveness of interventions or importance of exposure to risk factors, or dismissal of important ones. This commentary presents the history behind NHST along with the limitations that modern-day NHST presents, and suggests that a statistics reform regarding NHST be considered.

Assuntos

Projetos de Pesquisa , Humanos , Tamanho da Amostra

15.

Using Bayesian Methods to Augment the Interpretation of Critical Care Trials. An Overview of Theory and Example Reanalysis of the Alveolar Recruitment for Acute Respiratory Distress Syndrome Trial.

Zampieri, Fernando G; Casey, Jonathan D; Shankar-Hari, Manu; Harrell, Frank E; Harhay, Michael O.

Am J Respir Crit Care Med ; 203(5): 543-552, 2021 03 01.

Artigo em Inglês | MEDLINE | ID: mdl-33270526

RESUMO

Most randomized trials are designed and analyzed using frequentist statistical approaches such as null hypothesis testing and P values. Conceptually, P values are cumbersome to understand, as they provide evidence of data incompatibility with a null hypothesis (e.g., no clinical benefit) and not direct evidence of the alternative hypothesis (e.g., clinical benefit). This counterintuitive framework may contribute to the misinterpretation that the absence of evidence is equal to evidence of absence and may cause the discounting of potentially informative data. Bayesian methods provide an alternative, probabilistic interpretation of data. The reanalysis of completed trials using Bayesian methods is becoming increasingly common, particularly for trials with effect estimates that appear clinically significant despite P values above the traditional threshold of 0.05. Statistical inference using Bayesian methods produces a distribution of effect sizes that would be compatible with observed trial data, interpreted in the context of prior assumptions about an intervention (called "priors"). These priors are chosen by investigators to reflect existing beliefs and past empirical evidence regarding the effect of an intervention. By calculating the likelihood of clinical benefit, a Bayesian reanalysis can augment the interpretation of a trial. However, if priors are not defined a priori, there is a legitimate concern that priors could be constructed in a manner that produces biased results. Therefore, some standardization of priors for Bayesian reanalysis of clinical trials may be desirable for the critical care community. In this Critical Care Perspective, we discuss both frequentist and Bayesian approaches to clinical trial analysis, introduce a framework that researchers can use to select priors for a Bayesian reanalysis, and demonstrate how to apply our proposal by conducting a novel Bayesian trial reanalysis.

Assuntos

Teorema de Bayes , Interpretação Estatística de Dados , Ensaios Clínicos Controlados Aleatórios como Assunto , Respiração Artificial/métodos , Síndrome do Desconforto Respiratório/terapia , Humanos , Mortalidade , Respiração com Pressão Positiva/métodos , Modelos de Riscos Proporcionais

16.

One-Sided Multidimensional Statistical Significance Testing: A New Method of Calculating the Statistical Significance of Spectra Used to Demonstrate Magnetic Nanoparticle Sensitivity.

Weaver, John B; Weaver, Claire V; Ness, Dylan B; Gordon-Wylie, Scott W; Demidenko, Eugene.

J Phys D Appl Phys ; 55(32)2022 Aug 11.

Artigo em Inglês | MEDLINE | ID: mdl-35726230

RESUMO

Estimating statistical significance of the difference between two spectra or series is a fundamental statistical problem. Multivariate significance tests exist but the limitations preclude their use in many common cases; e.g., one-sided testing, unequal variance and when few repetitions are acquired all of which are required in magnetic spectroscopy of nanoparticle Brownian motion (MSB). We introduce a test, termed the T-S test, that is powerful and exact (exact type I error). It is flexible enough to be one- or two-sided and the one-sided version can specify arbitrary regions where each spectrum should be larger. The T-S test takes the-one or two-sided p-value at each frequency and combines them using Stouffer's method. We evaluated it using simulated spectra and measured MSB spectra. For the single-sided version, mean of the spectrum, A-T, was used as a reference; the T-S test is as powerful when the variance at each frequency is uniform and outperforms when the noise power is not uniform. For the two-sided version, the Hotelling T2 two-sided multivariate test was used as a reference; the two-sided T-S test is only slightly less powerful for large numbers of repetitions and outperforms rather dramatically for small numbers of repetitions. The T-S test was used to estimate the sensitivity of our current MSB spectrometer showing 1 nanogram sensitivity. Using eight repetitions the T-S test allowed 15 pM concentrations of mouse IL-6 to be identified while the mean of the spectra only identified 76 pM.

17.

How Reliable Are Ultra-Short-Term HRV Measurements during Cognitively Demanding Tasks?

Bernardes, André; Couceiro, Ricardo; Medeiros, Júlio; Henriques, Jorge; Teixeira, César; Simões, Marco; Durães, João; Barbosa, Raul; Madeira, Henrique; Carvalho, Paulo.

Sensors (Basel) ; 22(17)2022 Aug 30.

Artigo em Inglês | MEDLINE | ID: mdl-36080987

RESUMO

Ultra-short-term HRV features assess minor autonomous nervous system variations such as variations resulting from cognitive stress peaks during demanding tasks. Several studies compare ultra-short-term and short-term HRV measurements to investigate their reliability. However, existing experiments are conducted in low cognitively demanding environments. In this paper, we propose to evaluate these measurements' reliability under cognitively demanding tasks using a near real-life setting. For this purpose, we selected 31 HRV features, extracted from data collected from 21 programmers performing code comprehension, and compared them across 18 different time frames, ranging from 3 min to 10 s. Statistical significance and correlation tests were performed between the features extracted using the larger window (3 min) and the same features extracted with the other 17 time frames. We paired these analyses with Bland-Altman plots to inspect how the extraction window size affects the HRV features. The main results show 13 features that presented at least 50% correlation when using 60-second windows. The HF and mNN features achieved around 50% correlation using a 30-second window. The 30-second window was the smallest time frame considered to have reliable measurements. Furthermore, the mNN feature proved to be quite robust to the shortening of the time resolution.

Assuntos

Eletrocardiografia , Eletrocardiografia/métodos , Frequência Cardíaca/fisiologia , Reprodutibilidade dos Testes

18.

Statistical significance and its critics: practicing damaging science, or damaging scientific practice?

Mayo, Deborah G; Hand, David.

Synthese ; 200(3): 220, 2022.

Artigo em Inglês | MEDLINE | ID: mdl-35578622

RESUMO

While the common procedure of statistical significance testing and its accompanying concept of p-values have long been surrounded by controversy, renewed concern has been triggered by the replication crisis in science. Many blame statistical significance tests themselves, and some regard them as sufficiently damaging to scientific practice as to warrant being abandoned. We take a contrary position, arguing that the central criticisms arise from misunderstanding and misusing the statistical tools, and that in fact the purported remedies themselves risk damaging science. We argue that banning the use of p-value thresholds in interpreting data does not diminish but rather exacerbates data-dredging and biasing selection effects. If an account cannot specify outcomes that will not be allowed to count as evidence for a claim-if all thresholds are abandoned-then there is no test of that claim. The contributions of this paper are: To explain the rival statistical philosophies underlying the ongoing controversy; To elucidate and reinterpret statistical significance tests, and explain how this reinterpretation ameliorates common misuses and misinterpretations; To argue why recent recommendations to replace, abandon, or retire statistical significance undermine a central function of statistics in science: to test whether observed patterns in the data are genuine or due to background variability.

19.

Influence of optional measurement parameters in the Eclipse treatment planning system on the quality of the dosimetric model of the biomedical accelerator using the Acuros XB algorithm.

Gadek, Adam; Wendykier, Jacek; Grzadziel, Aleksandra; Bekman, Barbara; Smolinska, Barbara; Bekman, Adam; Niewiadomska, Beata; Prazmowska, Joanna; Wozniak, Bozena; Slosarek, Krzysztof.

Rep Pract Oncol Radiother ; 27(2): 241-249, 2022.

Artigo em Inglês | MEDLINE | ID: mdl-36299384

RESUMO

Background: To properly configure a treatment planning system, a measurement data set is needed, which consists of the values required for its configuration. The aim is to obtain a dosimetric model of the beam that is as compatible as possible with the measured values. The set of required data can be supplemented with optional values. The aim of the study was to assess the influence of optional measurement data on the compliance of the calculations with the measurements. Materials and methods: Dosimetric measurements, model configuration and dose distribution calculations were performed for the photon radiation beams generated by the VMS TrueBeam® linear accelerator. Beams were configured on an Eclipse™ v. 15.6 system using the Acuros v. 15.6 algorithm. The measured and calculated data were entered into the Alfard™ software for comparison with the calculated dose distributions. In the last stage, the absolute dose values at the designated points were also compared. The obtained data were statistically analysed with Statistica™ v. 13.3. Results: The work showed that the differences in the shape of the beam profile, depth dose and the dose value in points were not related to the use of optional data. Differences in dose distributions are within the tolerance. It cannot be determined under which conditions the use of optional data has a more favourable effect on the reflection of the actual dose values. Conclusions: The use of optional data in modelling photon radiation beams does not significantly improve the compliance of the calculated and measured dose values.

20.

On clinical trial fragility due to patients lost to follow up.

Baer, Benjamin R; Fremes, Stephen E; Gaudino, Mario; Charlson, Mary; Wells, Martin T.

BMC Med Res Methodol ; 21(1): 254, 2021 11 20.

Artigo em Inglês | MEDLINE | ID: mdl-34800976

RESUMO

BACKGROUND: Clinical trials routinely have patients lost to follow up. We propose a methodology to understand their possible effect on the results of statistical tests by altering the concept of the fragility index to treat the outcomes of observed patients as fixed but incorporate the potential outcomes of patients lost to follow up as random and subject to modification. METHODS: We reanalyse the statistical results of three clinical trials on coronary artery bypass grafting (CABG) to study the possible effect of patients lost to follow up on the treatment effect statistical significance. To do so, we introduce the LTFU-aware fragility indices as a measure of the robustness of a clinical trial's statistical results with respect to patients lost to follow up. RESULTS: The analyses illustrate that clinical trials can either be completely robust to the outcomes of patients lost to follow up, extremely sensitive to the outcomes of patients lost to follow up, or in an intermediate state. When a clinical trial is in an intermediate state, the LTFU-aware fragility indices provide an interpretable measure to quantify the degree of fragility or robustness. CONCLUSIONS: The LTFU-aware fragility indices allow researchers to rigorously explore the outcomes of patients who are lost to follow up, when their data is the appropriate kind. The LTFU-aware fragility indices are sensitivity measures in a way that the original fragility index is not.

Assuntos

Perda de Seguimento , Humanos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA