ABSTRACT
The growing scale and declining cost of single-cell RNA-sequencing (RNA-seq) now permit sampling cells at a depth that increases the power to detect rare cell states, reconstruct developmental trajectories, and measure phenotype in new terms such as cellular variance. The characterization of anatomy and developmental dynamics has seen no equivalent breakthrough since the groundbreaking advances in live fluorescence microscopy. The resolution afforded by single-cell RNA-seq is a boon to genetics because this novel description of phenotype offers the opportunity to refine gene function and dissect pleiotropy. In addition, the recent pairing of high-throughput genetic perturbation with single-cell RNA-seq has made practical a scale of genetic screening not previously possible.
Subjects
Microscopy, Fluorescence/methods; RNA/genetics; Sequence Analysis, RNA/methods; Single-Cell Analysis/methods; Gene Expression Profiling; Gene Expression Regulation, Developmental/genetics; Humans
ABSTRACT
Children with attention-deficit/hyperactivity disorder (ADHD) show deficits in processing speed, as well as aberrant neural oscillations, including both periodic (oscillatory) and aperiodic (1/f-like) activity, which together reflect the pattern of power across frequencies. Both components have been suggested as underlying neural mechanisms of cognitive dysfunction in ADHD. Here, we examined differences in processing speed and resting-state EEG neural oscillations, and their associations, in 6- to 12-year-old children with (n = 33) and without (n = 33) ADHD. Spectral analysis of the resting-state EEG signal using the fast Fourier transform (FFT) revealed increased power in fronto-central theta and beta oscillations in the ADHD group, but no differences in the theta/beta ratio. Using spectral parameterization, we found a higher aperiodic exponent, suggested to reflect a lower neuronal excitation-to-inhibition balance, in the ADHD group. While FFT-based theta power correlated with clinical symptoms in the ADHD group only, the aperiodic exponent was negatively correlated with processing speed across the entire sample. Finally, the aperiodic exponent was correlated with FFT-based beta power. These results highlight the distinct and complementary contributions of the periodic and aperiodic components of the neural spectrum as metrics for evaluating processing speed in ADHD. Future studies should further clarify the roles of periodic and aperiodic components in additional cognitive functions and in relation to clinical status.
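As a concrete illustration of the two spectral measures discussed, the sketch below computes FFT-based band power and a crude aperiodic exponent in R. The simulated signal, smoothing span, and log-log linear fit are simplifying assumptions, not the study's full spectral parameterization pipeline.

```r
# Minimal sketch: FFT band power and a crude aperiodic (1/f) exponent.
# Assumes a single-channel EEG vector sampled at `fs` Hz; the log-log
# linear fit is a simplification of full spectral parameterization.
set.seed(1)
fs  <- 250
eeg <- as.numeric(arima.sim(list(ar = 0.9), n = fs * 60))  # stand-in signal

spec  <- spec.pgram(ts(eeg, frequency = fs), spans = 25, plot = FALSE)
freq  <- spec$freq   # in Hz because ts() carries the sampling rate
power <- spec$spec

band_power <- function(lo, hi) mean(power[freq >= lo & freq <= hi])
theta <- band_power(4, 7)
beta  <- band_power(13, 30)

# Aperiodic exponent: negative slope of log10(power) ~ log10(freq)
sel <- freq >= 1 & freq <= 40
fit <- lm(log10(power[sel]) ~ log10(freq[sel]))
aperiodic_exponent <- -coef(fit)[2]
c(theta = theta, beta = beta, tbr = theta / beta, exponent = aperiodic_exponent)
```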
Subjects
Attention Deficit Disorder with Hyperactivity; Brain; Cognition; Electroencephalography; Humans; Child; Attention Deficit Disorder with Hyperactivity/physiopathology; Male; Female; Brain/physiopathology; Cognition/physiology; Fourier Analysis; Brain Waves/physiology; Theta Rhythm/physiology; Beta Rhythm/physiology
ABSTRACT
Single-cell RNA sequencing (scRNA-seq) offers new possibilities to address biological and medical questions. However, systematic comparisons of the performance of diverse scRNA-seq protocols are lacking. We generated data from 583 mouse embryonic stem cells to evaluate six prominent scRNA-seq methods: CEL-seq2, Drop-seq, MARS-seq, SCRB-seq, Smart-seq, and Smart-seq2. While Smart-seq2 detected the most genes per cell and across cells, CEL-seq2, Drop-seq, MARS-seq, and SCRB-seq quantified mRNA levels with less amplification noise due to the use of unique molecular identifiers (UMIs). Power simulations at different sequencing depths showed that Drop-seq is more cost-efficient for transcriptome quantification of large numbers of cells, while MARS-seq, SCRB-seq, and Smart-seq2 are more efficient when analyzing fewer cells. Our quantitative comparison offers the basis for an informed choice among six prominent scRNA-seq methods, and it provides a framework for benchmarking further improvements of scRNA-seq protocols.
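The kind of power simulation described can be sketched as follows; the negative binomial counts, the Wilcoxon test, and all parameter values are illustrative assumptions, not the authors' pipeline.

```r
# Hedged sketch of an scRNA-seq power simulation: for a given number of
# cells per group, how often is a 2-fold expression change detected?
set.seed(42)
power_at <- function(n_cells, mu = 5, size = 0.5, fc = 2, reps = 500) {
  hits <- replicate(reps, {
    a <- rnbinom(n_cells, mu = mu,      size = size)  # control counts
    b <- rnbinom(n_cells, mu = mu * fc, size = size)  # shifted counts
    suppressWarnings(wilcox.test(a, b)$p.value) < 0.05
  })
  mean(hits)
}
sapply(c(32, 64, 128, 256), power_at)  # power rises with cell number
```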
Subjects
Embryonic Stem Cells/chemistry; High-Throughput Nucleotide Sequencing; RNA/genetics; Sequence Analysis, RNA/methods; Single-Cell Analysis/methods; Animals; Base Sequence; Cell Line; Computer Simulation; Cost-Benefit Analysis; High-Throughput Nucleotide Sequencing/economics; Mice; Models, Economic; RNA/isolation & purification; Sequence Analysis, RNA/economics; Single-Cell Analysis/economics
ABSTRACT
A large enough sample size of patients is required to statistically show that one treatment is better than another. However, too large a sample size is expensive and can also result in findings that are statistically significant but not clinically relevant. How sample sizes should be chosen is a well-studied problem in classical statistics, and analytical expressions can be derived from the appropriate test statistic. However, these expressions require information regarding the efficacy of the treatment, which may not be available, particularly for newly developed drugs. Tumor growth inhibition (TGI) models are frequently used to quantify the efficacy of newly developed anticancer drugs. In these models, the tumor growth dynamics are commonly described by a set of ordinary differential equations containing parameters that must be estimated from experimental data. One widely used endpoint in clinical trials is the proportion of patients in different response categories determined using the Response Evaluation Criteria In Solid Tumors (RECIST) framework. From the TGI model, we derive analytical expressions for the probability of patient response to combination therapy. The probabilistic expressions are used together with classical statistics to derive a parametric model for the sample size required to achieve a certain significance level and test power when comparing two treatments. Furthermore, the probabilistic expressions are used to generalize the Tumor Static Exposure concept to be more suitable for predicting clinical response. The derivatives of the probabilistic expressions are used to derive two additional expressions characterizing the exposure and its sensitivity. Finally, our results are illustrated using parameters obtained by calibrating the model to preclinical data.
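To make the classical ingredient concrete, the sketch below computes a standard two-proportion sample size in R; the response probabilities are hypothetical placeholders for the TGI-model-derived quantities in the paper.

```r
# Hedged illustration: classical sample size for comparing two response
# rates (e.g., RECIST objective-response proportions). The response
# probabilities would come from the model-derived expressions in the
# paper; the values below are placeholders.
p_control <- 0.30   # assumed response probability, control arm
p_combo   <- 0.50   # assumed response probability, combination therapy
power.prop.test(p1 = p_control, p2 = p_combo,
                sig.level = 0.05, power = 0.80)  # n is per group
```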
ABSTRACT
ACADEMIC ABSTRACT: In the wake of the replication crisis, social and personality psychologists have increased attention to power analysis and the adequacy of sample sizes. In this article, we analyze current controversies in this area, including choosing effect sizes, why and whether power analyses should be conducted on already-collected data, how to mitigate the negative effects of sample size criteria on specific kinds of research, and which power criterion to use. For novel research questions, we advocate that researchers base sample sizes on effects that are likely to be cost-effective for other people to implement (in applied settings) or to study (in basic research settings), given the limitations of interest-based minimums or field-wide effect sizes. We discuss two alternatives to power analysis, precision analysis and sequential analysis, and end with recommendations for improving the practices of researchers, reviewers, and journal editors in social-personality psychology. PUBLIC ABSTRACT: Recently, social-personality psychology has been criticized for basing some of its conclusions on studies with low numbers of participants. As a result, power analysis, a mathematical way to ensure that a study has enough participants to reliably "detect" a given size of psychological effect, has become popular. This article describes power analysis and discusses some controversies about it, including how researchers should derive assumptions about effect size, and how the requirements of power analysis can be applied without harming research on hard-to-reach and marginalized communities. For novel research questions, we advocate that researchers base sample sizes on effects that are likely to be cost-effective for other people to implement (in applied settings) or to study (in basic research settings). We discuss two alternatives to power analysis, precision analysis and sequential analysis, and end with recommendations for improving the practices of researchers, reviewers, and journal editors in social-personality psychology.
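A minimal illustration of the trade-offs discussed, using base R's power.t.test; the effect sizes are conventional benchmarks, not values endorsed by the authors.

```r
# How sample-size requirements scale with the assumed effect size and
# the chosen power criterion (two of the controversies discussed).
for (d in c(0.2, 0.5, 0.8)) {
  n80 <- power.t.test(delta = d, sd = 1, power = 0.80)$n
  n95 <- power.t.test(delta = d, sd = 1, power = 0.95)$n
  cat(sprintf("d = %.1f: n/group = %4.0f (80%% power), %4.0f (95%% power)\n",
              d, ceiling(n80), ceiling(n95)))
}
```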
Subjects
Research Design; Humans; Sample Size; Psychology, Social
ABSTRACT
BACKGROUND: The Bayesian group sequential design has been widely applied in clinical studies, especially in Phase II and III trials, as it allows early termination based on accumulating interim data. However, to date, its application to stepped-wedge cluster randomized trials, which are gaining popularity in pragmatic trials conducted by clinical and health care delivery researchers, remains undeveloped. METHODS: We propose a Bayesian adaptive design approach for stepped-wedge cluster randomized trials that makes adaptive decisions based on the predictive probability of declaring the intervention effective at the end of the study given interim data. The Bayesian models and the algorithms for posterior inference and trial conduct are presented. RESULTS: We show how to determine design parameters through extensive simulations to achieve desired operating characteristics. We further evaluate how various design factors, such as the number of steps, cluster size, random variability in cluster size, and correlation structures, affect trial properties, including power, type I error, and the probability of early stopping. An application example is presented. CONCLUSION: This study incorporates Bayesian adaptive strategies into the design of stepped-wedge cluster randomized trials. The proposed approach allows the trial to stop early when substantial evidence of efficacy or futility is observed, improving the flexibility and efficiency of stepped-wedge cluster randomized trials.
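The predictive-probability rule can be sketched with a beta-binomial toy model, an intentional simplification of the trial's cluster-level analysis; all numbers are hypothetical.

```r
# Minimal sketch of a predictive-probability interim rule: given interim
# successes, simulate the posterior predictive distribution of the final
# data and estimate the probability the end-of-study analysis declares
# efficacy. Beta-binomial is an illustrative simplification.
set.seed(7)
pred_prob <- function(x_int, n_int, n_final, p0 = 0.3, nsim = 5000) {
  theta  <- rbeta(nsim, 1 + x_int, 1 + n_int - x_int)  # posterior draws
  x_rest <- rbinom(nsim, n_final - n_int, theta)       # future data
  x_all  <- x_int + x_rest
  # "Effective" at study end: posterior P(theta > p0) > 0.95
  win <- pbeta(p0, 1 + x_all, 1 + n_final - x_all, lower.tail = FALSE) > 0.95
  mean(win)
}
pred_prob(x_int = 18, n_int = 40, n_final = 100)  # e.g., stop if below 0.10
```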
Subjects
Algorithms; Bayes Theorem; Randomized Controlled Trials as Topic; Research Design; Humans; Randomized Controlled Trials as Topic/methods; Cluster Analysis; Computer Simulation; Models, Statistical; Sample Size
ABSTRACT
BACKGROUND: The use of corporate power to undermine public health policy processes is increasingly well understood; however, relatively little scholarship examines how advocates can leverage power to promote the successful adoption of public health policies. The objective of this paper is to explore how advocates leveraged three forms of power (structural, instrumental, and discursive) to promote the passage of the Promotion of Healthy Eating Law (Ley 27,642) in Argentina, one of the most comprehensive policies adopted to date to introduce mandatory front-of-package (FOP) warning labels and regulate the marketing and sales of ultra-processed foods (UPFs). METHODS: We conducted seventeen semi-structured interviews with advocates from different sectors, including civil society, international agencies, and government. Both data collection and analysis were guided by Milsom's conceptual framework for analyzing power in public health policymaking, and the data were analyzed using hybrid deductive and inductive thematic analysis. RESULTS: Advocates harnessed structural power by leveraging revolving doors, informal alliances, and formal coalitions, enabling them to convene discussion spaces with decision-makers, make strategic use of limited resources, and cultivate the diverse expertise (e.g., research, nutrition science, advocacy, law, political science, activism, and communications) needed to support the law through different phases of the policy process. Advocates wielded instrumental power by amassing an armada of localized evidence to promote robust policy design, building technical literacy among themselves and decision-makers, and exposing conflicts of interest to harness public pressure. Advocates exercised discursive power by adopting a rights-based discourse, including the rights of children and adolescents and the right of consumers to transparent information, which enabled them to foster a favorable perception of the law among both decision-makers and the public. Key contextual enablers included a political window of opportunity, the COVID-19 pandemic, and the ability to learn from the regional precedent of similar policies. CONCLUSIONS: Public health policymaking, particularly when it encroaches upon corporate interests, is characterized by stark imbalances of power that hinder policy decisions. The strategies identified in the case of Argentina provide important insights into how advocates might harness and exercise structural, instrumental, and discursive power to counter corporate influence and promote the successful adoption of comprehensive UPF regulation.
Subjects
Fast Foods; Argentina; Humans; Consumer Advocacy; Health Policy; Policy Making; Food, Processed
ABSTRACT
OBJECTIVE: To investigate the accuracy, precision, and trending ability of the noninvasive bioreactance-based Starling SV and the minimally invasive pulse-power device LiDCOrapid, compared with thermodilution cardiac output (TDCO) measured by pulmonary artery catheter, when assessing cardiac index (CI) during elective open abdominal aortic (AA) surgery. DESIGN: A prospective method-comparison study. SETTING: Oulu University Hospital, Finland. PARTICIPANTS: Forty patients undergoing elective open abdominal aortic surgery. INTERVENTIONS: Intraoperative CI measurements were obtained simultaneously with TDCO and the study monitors, resulting in 627 measurement pairs for Starling SV and 497 for LiDCOrapid. MEASUREMENTS AND MAIN RESULTS: The Bland-Altman method was used to investigate agreement between the devices, and four-quadrant plots with error grids were used to assess trending ability. Agreement between TDCO and Starling SV showed a bias of 0.18 L/min/m2 (95% confidence interval, 0.13 to 0.23), wide limits of agreement (LOA, -1.12 to 1.47 L/min/m2), and a percentage error (PE) of 63.7% (95% confidence interval, 52.4 to 71.0). Agreement between TDCO and LiDCOrapid showed a bias of -0.15 L/min/m2 (95% confidence interval, -0.21 to -0.09), wide LOA (-1.56 to 1.37 L/min/m2), and a PE of 68.7% (95% confidence interval, 54.9 to 79.6). The trending ability of neither device was sufficient. CONCLUSION: CI measurements obtained with Starling SV and LiDCOrapid were not interchangeable with TDCO, and their ability to track changes in CI was poor. These results do not support the use of either study device for monitoring CI during open AA surgery.
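For readers unfamiliar with the reported metrics, the sketch below computes Bland-Altman bias, limits of agreement, and percentage error on simulated paired measurements; the data are hypothetical, not the study's.

```r
# Hedged sketch of the Bland-Altman agreement metrics reported: bias,
# 95% limits of agreement, and percentage error. `ref` and `test` are
# hypothetical paired cardiac-index measurements.
set.seed(3)
ref  <- rnorm(200, mean = 2.5, sd = 0.5)          # thermodilution CI
test <- ref + rnorm(200, mean = 0.18, sd = 0.66)  # study-monitor CI

d    <- test - ref
bias <- mean(d)
loa  <- bias + c(-1.96, 1.96) * sd(d)
pe   <- 100 * 1.96 * sd(d) / mean((ref + test) / 2)  # percentage error
round(c(bias = bias, loa_low = loa[1], loa_high = loa[2], pct_error = pe), 2)
```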
Subjects
Aorta, Abdominal; Cardiac Output; Monitoring, Intraoperative; Thermodilution; Humans; Male; Female; Prospective Studies; Cardiac Output/physiology; Aged; Aorta, Abdominal/surgery; Reproducibility of Results; Monitoring, Intraoperative/methods; Monitoring, Intraoperative/standards; Middle Aged; Thermodilution/methods; Vascular Surgical Procedures/methods
ABSTRACT
Single-case experimental designs are an important research design in behavioral and medical research. Although the What Works Clearinghouse prescribes design standards for single-case experimental designs, these standards do not include statistically derived power computations. Recently, we derived the equations for computing power for (AB)k designs. However, these computations and the accompanying R code may not be accessible to the applied researchers most likely to want to compute power for their studies. We have therefore developed an (AB)k power calculator Shiny app (https://abkpowercalculator.shinyapps.io/ABkpowercalculator/) that researchers can use with no software training. The power computations assume that the researcher is interested in fitting multilevel models with autocorrelations or conducting similar analyses. The purpose of this software contribution is to briefly explain how power is derived for balanced (AB)k designs and to elaborate on how to use the Shiny app. The app works well not only on computers but also on mobile phones, with no need to install R. We believe it can be a valuable tool for practitioners and applied researchers who want to plan single-case studies with sufficient power to detect appropriate effect sizes.
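A minimal sketch of what such a power computation involves, assuming AR(1) errors and a simple regression test rather than the app's full multilevel formulation:

```r
# Hedged sketch of simulation-based power for a balanced (AB)^k
# single-case design: AR(1) errors within the series, a fixed treatment
# effect, and a simple regression test. Illustrative only.
set.seed(11)
power_abk <- function(k = 3, len = 5, d = 1, rho = 0.3, reps = 1000) {
  phase <- rep(rep(c(0, 1), k), each = len)   # alternating A/B phases
  mean(replicate(reps, {
    e <- as.numeric(arima.sim(list(ar = rho), n = length(phase)))
    y <- d * phase + e
    summary(lm(y ~ phase))$coefficients["phase", "Pr(>|t|)"] < 0.05
  }))
}
power_abk()  # power for k = 3 phase pairs of length 5, effect d = 1
```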
Subjects
Mobile Applications; Research Design; Multilevel Analysis
ABSTRACT
Linear mixed-effects models have been increasingly used to analyze dependent data in psychological research. Despite their many advantages over ANOVA, critical issues in their analysis remain. As random effects and model complexity increase, estimation becomes computationally demanding and convergence becomes challenging, and applied users need help choosing appropriate methods to estimate random effects. The present Monte Carlo simulation study investigated the impact of misspecifying the random-effects structure under restricted maximum likelihood (REML) and Bayesian estimation. We also compared the performance of the Akaike information criterion (AIC) and the deviance information criterion (DIC) in model selection. Results showed that models neglecting existing random effects had inflated Type I errors, unacceptable coverage, and inaccurate R-squared measures of fixed- and random-effects variation, whereas models with redundant random effects had convergence problems, lower statistical power, and, under Bayesian estimation, inaccurate R-squared measures. The convergence problem was more severe for REML, while reduced power and inaccurate R-squared measures were more severe for Bayesian estimation. Notably, DIC was better than AIC at identifying the true models (especially models including a person random intercept only), improving convergence rates, and providing more accurate effect size estimates, although AIC had higher power than DIC with 10 items and the most complicated true model.
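The misspecification under study can be reproduced in miniature with lme4 (an assumption; the study may have used other software):

```r
# Hedged sketch of the misspecification examined: data with a true
# random slope, analyzed with and without that random effect.
library(lme4)
set.seed(5)
n_subj <- 40; n_obs <- 10
subj <- factor(rep(1:n_subj, each = n_obs))
x    <- rep(seq(0, 1, length.out = n_obs), n_subj)
u0   <- rnorm(n_subj, 0, 0.8)[subj]   # random intercepts
u1   <- rnorm(n_subj, 0, 0.8)[subj]   # random slopes (the neglected effect)
y    <- 1 + 0.5 * x + u0 + u1 * x + rnorm(n_subj * n_obs)

m_full    <- lmer(y ~ x + (1 + x | subj))  # correctly specified
m_neglect <- lmer(y ~ x + (1 | subj))      # neglects the random slope
anova(m_neglect, m_full)  # neglecting the slope understates uncertainty in x
```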
Subjects
Bayes Theorem; Computer Simulation; Monte Carlo Method; Humans; Linear Models; Likelihood Functions; Computer Simulation/statistics & numerical data; Data Interpretation, Statistical; Models, Statistical
ABSTRACT
Side-channel analysis is a type of cryptanalysis that exploits the physical leakage of a cryptographic device: an adversary uses the relationship between a physical leakage and the secret intermediate values of an encryption algorithm. Masking was proposed to prevent such attacks. Several masking methods for SEED, the ISO/IEC 18033-3 standard encryption algorithm, have been proposed, as the Korean financial IC (integrated circuit) card standard (CFIP.ST.FINIC-01-2021) mandates an implementation of SEED that is robust against side-channel analyses. However, vulnerabilities have been reported in all but one of these masking methods. This study demonstrates a first-order vulnerability in that remaining method; that is, an adversary can perform side-channel analysis with the same complexity as against an unprotected implementation. To fix this vulnerability, we revise the masking method with negligible additional overhead. Its vulnerability and security are theoretically verified and experimentally demonstrated: the round key of the existing masking method is revealed with only 210 power consumption traces, while that of the proposed masking method is not disclosed even with 10,000 traces.
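For orientation, the sketch below illustrates generic first-order Boolean masking, the countermeasure class at issue; it is a toy byte-level demo, not the SEED-specific scheme analyzed in the paper.

```r
# Hedged illustration of first-order Boolean masking: an intermediate
# value v is never handled directly, only the pair (v XOR m, m) for a
# fresh random mask m. Toy demo, not the SEED scheme.
set.seed(9)
v <- 0xA7                           # secret intermediate byte
m <- sample(0:255, 1)               # fresh random mask
masked <- bitwXor(v, m)             # what the device computes on
stopifnot(bitwXor(masked, m) == v)  # unmasking recovers v
# A first-order leak appears whenever masked value and mask combine in
# one operation, e.g., a table lookup indexed by bitwXor(masked, m),
# which momentarily re-exposes v to the power side channel.
```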
ABSTRACT
When designing a study for causal mediation analysis, it is crucial to conduct a power analysis to determine the sample size required to detect causal mediation effects with sufficient power. However, the development of power analysis methods for causal mediation analysis has lagged far behind. To fill this gap, I propose a simulation-based method and an easy-to-use web application ( https://xuqin.shinyapps.io/CausalMediationPowerAnalysis/ ) for power and sample size calculations for regression-based causal mediation analysis. By repeatedly drawing samples of a given size from a population predefined with hypothesized models and parameter values, the method calculates power as the proportion of replications with a significant test result. The Monte Carlo confidence interval method is used for testing, which allows the sampling distributions of causal effect estimates to be asymmetric and makes the power analysis run faster than bootstrapping would. It also guarantees that the tool is compatible with the widely used R package for causal mediation analysis, mediation, which is built on the same estimation and inference methods. In addition, users can determine the sample size required to achieve sufficient power from power values calculated over a range of sample sizes. The method accommodates randomized or nonrandomized treatments and mediators and outcomes that are either binary or continuous. I also provide sample size suggestions under various scenarios and a detailed guide to implementing the app to facilitate study design.
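The core of the proposed method can be sketched in a few lines of R; the model, coefficients, and sample size below are hypothetical, and the full app handles many more cases.

```r
# Hedged sketch of simulation-based power for a simple mediation model
# (a*b indirect effect) using a Monte Carlo confidence interval.
set.seed(21)
med_power <- function(n, a = 0.3, b = 0.3, reps = 500, draws = 2000) {
  mean(replicate(reps, {
    x <- rbinom(n, 1, 0.5)            # randomized treatment
    m <- a * x + rnorm(n)             # mediator model
    y <- b * m + 0.2 * x + rnorm(n)   # outcome model
    fm <- lm(m ~ x); fy <- lm(y ~ m + x)
    a_mc <- rnorm(draws, coef(fm)["x"], summary(fm)$coefficients["x", 2])
    b_mc <- rnorm(draws, coef(fy)["m"], summary(fy)$coefficients["m", 2])
    ci <- quantile(a_mc * b_mc, c(0.025, 0.975))  # Monte Carlo CI
    ci[1] > 0 | ci[2] < 0             # significant indirect effect?
  }))
}
med_power(n = 200)
```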
Subjects
Mobile Applications; Humans; Sample Size; Computer Simulation; Causality; Negotiating
ABSTRACT
Conditional process models, including moderated mediation models and mediated moderation models, are widely used in behavioral science research. However, few studies have examined how to conduct statistical power analysis for such models, and software packages providing such functionality are lacking. In this paper, we introduce new simulation-based methods for power analysis of conditional process models, with a focus on moderated mediation models. These methods provide intuitive ways to plan sample sizes based on the regression coefficients in a moderated mediation model as well as selected variance and covariance components. We demonstrate how the methods can be applied to five commonly used moderated mediation models in a simulation study and assess their performance across the five models. We implement our approaches in the WebPower R package and in Web apps to ease their application.
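A hedged sketch for one such model (a moderator on the mediator-outcome path), using a normal-theory test of the index of moderated mediation rather than the package's exact machinery:

```r
# Hedged sketch: treatment X, mediator M, moderator W of the M -> Y
# path. Power targets the index of moderated mediation (a * b3).
# All coefficient values are hypothetical.
set.seed(13)
modmed_power <- function(n, a = 0.4, b1 = 0.3, b3 = 0.25, reps = 500) {
  mean(replicate(reps, {
    x <- rbinom(n, 1, 0.5); w <- rnorm(n)
    m <- a * x + rnorm(n)
    y <- b1 * m + 0.2 * x + 0.2 * w + b3 * m * w + rnorm(n)
    fm <- lm(m ~ x)
    fy <- lm(y ~ m * w + x)
    # Normal-theory (Sobel-style) test of the product a_hat * b3_hat
    ab <- coef(fm)["x"] * coef(fy)["m:w"]
    se <- sqrt(coef(fm)["x"]^2 * summary(fy)$coefficients["m:w", 2]^2 +
               coef(fy)["m:w"]^2 * summary(fm)$coefficients["x", 2]^2)
    abs(ab / se) > 1.96
  }))
}
modmed_power(n = 300)
```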
Subjects
Models, Statistical; Sample Size; Humans; Computer Simulation; Software; Data Interpretation, Statistical; Mediation Analysis; Behavioral Research/methods
ABSTRACT
Researchers increasingly study short-term dynamic processes that evolve within single individuals using N = 1 studies. The processes of interest are typically captured by fitting a VAR(1) model to the resulting data. A crucial question is how to perform sample-size planning, that is, how to decide on the number of measurement occasions needed. The most popular approach is to perform a power analysis, which focuses on detecting the effects of interest. We argue that performing sample-size planning based on out-of-sample predictive accuracy yields additional important information regarding potential overfitting of the model: predictive accuracy quantifies how well the estimated VAR(1) model will predict unseen data from the same individual. We propose a new simulation-based sample-size planning method called predictive accuracy analysis (PAA), and an associated Shiny app. The approach uses a novel predictive accuracy metric that accounts for the multivariate nature of the prediction problem. We showcase how the values of the different VAR(1) model parameters affect power-based and predictive accuracy-based sample-size recommendations using simulated data sets and real data applications. The range of recommended sample sizes is smaller for predictive accuracy analysis than for power analysis.
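The predictive-accuracy idea can be sketched as follows; the transition matrix, the per-equation OLS estimator, and the plain multivariate MSE metric are illustrative assumptions, simpler than the paper's proposed metric.

```r
# Hedged sketch: fit a bivariate VAR(1) by OLS on the first T points,
# then measure out-of-sample prediction error on held-out data from the
# same "person". More training occasions -> less overfitting.
set.seed(8)
Phi <- matrix(c(0.4, 0.1, 0.1, 0.3), 2, 2)  # hypothetical transition matrix
simulate_var1 <- function(n, Phi) {
  y <- matrix(0, n, 2)
  for (t in 2:n) y[t, ] <- Phi %*% y[t - 1, ] + rnorm(2)
  y
}
oos_error <- function(T_train, T_test = 100) {
  y <- simulate_var1(T_train + T_test + 1, Phi)
  X <- y[1:T_train, ];  Y <- y[2:(T_train + 1), ]
  B <- coef(lm(Y ~ X - 1))                    # 2x2 estimate (transpose of Phi)
  Xte <- y[(T_train + 1):(T_train + T_test), ]
  Yte <- y[(T_train + 2):(T_train + T_test + 1), ]
  mean((Yte - Xte %*% B)^2)                   # multivariate out-of-sample MSE
}
sapply(c(25, 50, 100, 200), function(T) mean(replicate(200, oos_error(T))))
```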
Subjects
Models, Statistical; Humans; Sample Size; Computer Simulation; Research Design; Data Interpretation, Statistical
ABSTRACT
Ordinal data, such as Likert items, ratings, or generic ordered variables, are widespread in psychology. These variables are usually analyzed with metric models (e.g., standard linear regression), with important drawbacks for statistical inference (reduced power and increased Type I error) and prediction. One possible reason ordinal regression models are not used is the difficulty of understanding their parameters or conducting a power analysis. This tutorial presents ordinal regression models using a simulation-based approach. First, we introduce the general model, highlighting its crucial components and assumptions. We then explain how to interpret the parameters of logit and probit models, and propose two ways of simulating data as a function of predictors, illustrated with a 2 × 2 interaction between categorical predictors and with an interaction between a numeric and a categorical predictor. Finally, we show an example of power analysis using simulations that can easily be extended to complex models with multiple predictors. The tutorial is supported by a collection of custom R functions developed for simulating and understanding ordinal regression models. The code to reproduce the proposed simulations, the custom R functions, and additional examples of ordinal regression models can be found in the online Open Science Framework repository ( https://osf.io/93h5j ).
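In the spirit of the tutorial, the sketch below simulates ordinal responses from a latent-variable probit model and fits an ordinal regression with MASS::polr; the thresholds and effect sizes are made up, and the tutorial's own custom functions are richer.

```r
# Hedged sketch: simulate ordinal responses from a latent-variable
# probit model with a 2 x 2 design, then fit an ordinal regression.
library(MASS)
set.seed(4)
n  <- 200
g1 <- rep(0:1, each = n)                 # factor 1 (0/1)
g2 <- rep(rep(0:1, each = n / 2), 2)     # factor 2 (0/1)
latent <- 0.5 * g1 + 0.3 * g2 + 0.4 * g1 * g2 + rnorm(2 * n)
cuts   <- c(-1, 0, 1)                    # 3 thresholds -> 4 categories
y      <- factor(findInterval(latent, cuts) + 1, ordered = TRUE)
fit    <- polr(y ~ g1 * g2, method = "probit", Hess = TRUE)
summary(fit)  # coefficients are on the latent (probit) scale
```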
ABSTRACT
Our current understanding of litter variability in neurodevelopmental studies using mice may limit the translation of neuroscientific findings. Higher variance of measures across litters than within litters, often termed intra-litter likeness, may be attributable to both the prenatal and postnatal environment. This study aimed to assess the litter effect in behavioral assessments (2 timepoints) and in anatomy, using T1-weighted magnetic resonance images of 72 brain region volumes (4 timepoints), in 36 C57BL/6J inbred mice (7 litters; 19 female, 17 male). Between-litter comparisons of brain and behavioral measures, and of their associations, were evaluated using univariate and multivariate techniques. A power analysis using simulation methods was then performed on modeled neurodevelopment to evaluate trade-offs among the number of litters, the number of mice per litter, and the total sample size. Our results show litter-specific developmental effects from adolescence to adulthood for brain structure volumes and behaviors, and for their associations in adulthood. Our power simulations suggest increasing the number of litters in experimental designs to achieve the smallest total sample size necessary for detecting different rates of change in specific brain regions. Our results demonstrate how litter-specific effects may influence development and show that increasing the ratio of litters to total sample size should be strongly considered when designing neurodevelopmental studies.
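The litters-versus-litter-size trade-off can be sketched with a simulation like the following; the ICC, effect size, and litter-level group assignment are hypothetical assumptions, and the study's models of developmental change are richer.

```r
# Hedged sketch: with litter as a random effect, compare power for a
# group difference when the same total N is split into many small
# litters versus few large ones.
library(lme4)
set.seed(6)
litter_power <- function(n_litters, per_litter, d = 0.5, icc = 0.3,
                         reps = 300) {
  mean(replicate(reps, {
    litter <- factor(rep(1:n_litters, each = per_litter))
    group  <- rep(rep(0:1, length.out = n_litters), each = per_litter)
    u <- rnorm(n_litters, 0, sqrt(icc))[litter]        # litter effect
    y <- d * group + u + rnorm(n_litters * per_litter, 0, sqrt(1 - icc))
    m <- lmer(y ~ group + (1 | litter), REML = FALSE)
    abs(coef(summary(m))["group", "t value"]) > 1.96
  }))
}
c(many_small = litter_power(12, 3), few_large = litter_power(4, 9))
```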
Subjects
Litter Size; Pregnancy; Female; Animals; Mice; Computer Simulation; Mice, Inbred C57BL
ABSTRACT
The power of genotype-phenotype association mapping studies increases greatly when contributions from multiple variants in a focal region are meaningfully aggregated. Currently, there are two popular categories of variant aggregation methods. Transcriptome-wide association studies (TWAS) represent a set of emerging methods that select variants based on their effect on gene expression, providing pretrained linear combinations of variants for downstream association mapping. In contrast, kernel methods such as the sequence kernel association test (SKAT) model genotypic and phenotypic variance using kernel functions that capture genetic similarity between subjects, allowing nonlinear effects to be included. From the perspective of machine learning, these two categories cover two complementary aspects of feature engineering: feature selection/pruning and feature aggregation. Thus far, no thorough comparison has been made between them, and no method incorporates the advantages of both TWAS- and kernel-based approaches. In this work, we developed a novel method called kernel-based TWAS (kTWAS) that applies TWAS-like feature selection to a SKAT-like kernel association test, combining the strengths of both. Through extensive simulations, we demonstrate that kTWAS has higher power than TWAS and multiple SKAT-based protocols, and we identify novel disease-associated genes in Wellcome Trust Case Control Consortium genotyping array data and MSSNG (autism) sequence data. The source code for kTWAS and our simulations is available in our GitHub repository (https://github.com/theLongLab/kTWAS).
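The two ingredients being combined can be sketched jointly; the weights, kernel, and permutation test below are illustrative stand-ins for kTWAS's actual pretrained expression models and inference.

```r
# Hedged sketch of the two ingredients: (1) TWAS-style weighting of
# variants by (pretrained) expression effects, and (2) a SKAT-like
# variance-component score statistic Q = r' K r with a weighted linear
# kernel. P-value by permutation for simplicity.
set.seed(12)
n <- 300; p <- 20
G <- matrix(rbinom(n * p, 2, 0.3), n, p)   # genotypes coded 0/1/2
w <- rnorm(p)                              # stand-in expression weights
y <- G %*% w * 0.05 + rnorm(n)             # phenotype with weak signal

K <- G %*% diag(w^2) %*% t(G)              # weighted linear kernel
r <- y - mean(y)                           # centered phenotype residuals
Q <- drop(t(r) %*% K %*% r)                # SKAT-type score statistic
Q_null <- replicate(999, { rp <- sample(r); drop(t(rp) %*% K %*% rp) })
mean(c(Q_null, Q) >= Q)                    # permutation p-value
```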
Subjects
Computer Simulation; Genetic Association Studies; Genetic Variation; Models, Genetic; Software; Transcriptome; Genome-Wide Association Study; Genotype; Humans
ABSTRACT
The restricted mean time in favor (RMT-IF) of treatment has recently been added to the analytic toolbox for composite endpoints of recurrent events and death. To help practitioners design new trials based on this method, we develop tools to calculate the sample size and power. Specifically, we formulate the outcomes as a multistate Markov process with a sequence of transient states for recurrent events and an absorbing state for death. The transition intensities, in this case the instantaneous risks of another nonfatal event or death, are assumed to be time-homogeneous but are nonetheless allowed to depend on the number of past events. Using the properties of Coxian distributions, we derive the RMT-IF effect size under the alternative hypothesis as a function of the treatment-to-control intensity ratios along with the baseline intensities, the latter of which can easily be estimated from historical data. We also reduce the variance of the nonparametric RMT-IF estimator to calculable terms under a standard set-up for censoring. Simulation studies show that the resulting formulas provide accurate approximations to the sample size and power in realistic settings. For illustration, a past cardiovascular trial with recurrent-hospitalization and mortality outcomes is analyzed to generate the parameters needed to design a future trial. The procedures are incorporated into the rmt package, along with the original methodology, on the Comprehensive R Archive Network (CRAN).
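The assumed outcome model can be sketched as a simple simulator; the intensities and follow-up time below are hypothetical, and the rmt package's own routines should be used for actual design work.

```r
# Hedged sketch of the assumed outcome model: a time-homogeneous
# multistate process where a patient with k past nonfatal events has
# another event with intensity lambda[k+1] or dies with intensity
# mu[k+1] (last element recycled for higher k).
set.seed(15)
sim_patient <- function(lambda = c(0.5, 0.7, 0.9),
                        mu     = c(0.1, 0.15, 0.2),
                        tau    = 4) {            # follow-up horizon
  t <- 0; k <- 0; events <- numeric(0); death <- NA
  while (t < tau) {
    i <- min(k + 1, length(lambda))
    t <- t + rexp(1, lambda[i] + mu[i])          # time to next transition
    if (t >= tau) break
    if (runif(1) < mu[i] / (lambda[i] + mu[i])) { death <- t; break }
    k <- k + 1; events <- c(events, t)           # another nonfatal event
  }
  list(events = events, death = death)
}
sim_patient()
```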
Subjects
Hospitalization; Research Design; Humans; Sample Size; Computer Simulation; Time Factors
ABSTRACT
The stepped wedge cluster randomized trial (SW-CRT) is an increasingly popular design for evaluating health service delivery or policy interventions. An essential consideration of this design is the need to account for both within-period and between-period correlations in sample size calculations. Especially when embedded in health care delivery systems, many SW-CRTs may have subclusters nested in clusters, within which outcomes are collected longitudinally. However, existing sample size methods that account for between-period correlations have not allowed for multiple levels of clustering. We present computationally efficient sample size procedures that properly differentiate within-period and between-period intracluster correlation coefficients in SW-CRTs in the presence of subclusters. We introduce an extended block exchangeable correlation matrix to characterize the complex dependencies of outcomes within clusters. For Gaussian outcomes, we derive a closed-form sample size expression that depends on the correlation structure only through two eigenvalues of the extended block exchangeable correlation structure. For non-Gaussian outcomes, we present a generic sample size algorithm based on linearization and elucidate simplifications under canonical link functions. For example, we show that the approximate sample size formula under a logistic linear mixed model depends on three eigenvalues of the extended block exchangeable correlation matrix. We provide an extension to accommodate unequal cluster sizes and validate the proposed methods via simulations. Finally, we illustrate our methods in two real SW-CRTs with subclusters.
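The correlation structure at the heart of the method can be constructed directly; the sketch below builds a small extended block exchangeable matrix and lists its distinct eigenvalues, with all correlation values hypothetical.

```r
# Hedged sketch of an extended block exchangeable correlation matrix:
# subclusters nested in a cluster, measured over periods. The Gaussian
# sample-size formula in the paper depends on this matrix only through
# a couple of its distinct eigenvalues.
block_exch <- function(n_sub = 4, n_per = 3, m = 5,
                       a_ww = 0.05,  # same subcluster, same period
                       a_wb = 0.03,  # same subcluster, different period
                       a_bw = 0.02,  # different subcluster, same period
                       a_bb = 0.01)  # different subcluster, diff. period
{
  N   <- n_sub * n_per * m
  sub <- rep(1:n_sub, each = n_per * m)
  per <- rep(rep(1:n_per, each = m), n_sub)
  R   <- matrix(0, N, N)
  for (i in 1:N) for (j in 1:N) {
    R[i, j] <- if (i == j) 1
      else if (sub[i] == sub[j] && per[i] == per[j]) a_ww
      else if (sub[i] == sub[j])                     a_wb
      else if (per[i] == per[j])                     a_bw
      else                                           a_bb
  }
  R
}
R <- block_exch()
round(sort(unique(round(eigen(R)$values, 8)), decreasing = TRUE), 4)
```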
Subjects
Algorithms; Research Design; Sample Size; Cluster Analysis
ABSTRACT
BACKGROUND: When evaluating the impact of environmental exposures on human health, study designs often include a series of repeated measurements. The goal is to determine whether populations have different trajectories of the environmental exposure over time. Power analyses for longitudinal mixed models require multiple inputs, including clinically significant differences, standard deviations, and correlations of measurements. Further, methods for power analyses of longitudinal mixed models are complex and often challenging for the non-statistician. We discuss methods for extracting clinically relevant inputs from the literature, and explain how to conduct a power analysis that appropriately accounts for longitudinal repeated measures. Finally, we provide careful recommendations for describing complex power analyses in a concise and clear manner. METHODS: For longitudinal studies of health outcomes from environmental exposures, we show how to (1) conduct a power analysis that aligns with the planned mixed-model data analysis, (2) gather the inputs required for the power analysis, and (3) conduct a repeated measures power analysis with a highly cited, validated, free, point-and-click, web-based, open-source software platform developed specifically for scientists. RESULTS: As an example, we describe the power analysis for a proposed study of repeated measures of per- and polyfluoroalkyl substances (PFAS) in human blood. We show how to align the data analysis and power analysis plan to account for within-participant correlation across repeated measures. We illustrate how to perform a literature review to find inputs for the power analysis. We emphasize the need to examine the sensitivity of the power values by considering standard deviations and differences in means that are smaller and larger than the speculated, literature-based values. Finally, we provide an example power calculation and a summary checklist for describing power and sample size analysis. CONCLUSIONS: This paper provides a detailed roadmap for conducting and describing power analyses for longitudinal studies of environmental exposures. It provides a template and checklist for those seeking to write power analyses for grant applications.
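Steps (1) and (3) can be sketched in R as a simulation-based alternative to the platform's closed-form calculations; the means, variance components, and sample size below are placeholders for the literature-derived values the paper recommends extracting.

```r
# Hedged sketch: power for a group-by-time interaction in a linear
# mixed model of repeated exposure measurements (e.g., PFAS), by
# simulation. All inputs are placeholders for literature-based values.
library(lme4)
set.seed(2)
rm_power <- function(n_per_group = 50, times = 0:3, slope_diff = 0.15,
                     sd_b = 0.5, sd_e = 0.5, reps = 300) {
  n   <- 2 * n_per_group
  tt  <- rep(times, n)                                  # measurement times
  id  <- factor(rep(1:n, each = length(times)))
  grp <- rep(rep(0:1, each = n_per_group), each = length(times))
  mean(replicate(reps, {
    b0 <- rnorm(n, 0, sd_b)[id]                         # random intercepts
    y  <- 1 + 0.1 * tt + slope_diff * grp * tt + b0 +
          rnorm(length(tt), 0, sd_e)
    m  <- lmer(y ~ grp * tt + (1 | id), REML = FALSE)
    abs(coef(summary(m))["grp:tt", "t value"]) > 1.96   # Wald test
  }))
}
rm_power()  # rerun with smaller/larger slope_diff for sensitivity checks
```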