Sample size calculations for the differential expression analysis of RNA-seq data using a negative binomial regression model.
Li, Xiaohong; Wu, Dongfeng; Cooper, Nigel G F; Rai, Shesh N.
Affiliation
  • Li X; Department of Bioinformatics and Biostatistics, School of Public Health and Information Sciences, University of Louisville, Louisville, KY 40202, USA.
  • Wu D; Department of Anatomical Sciences and Neurobiology, School of Medicine, University of Louisville, Louisville, KY, USA.
  • Cooper NGF; Department of Bioinformatics and Biostatistics, School of Public Health and Information Sciences, University of Louisville, Louisville, KY 40202, USA.
  • Rai SN; Department of Anatomical Sciences and Neurobiology, School of Medicine, University of Louisville, Louisville, KY, USA.
Stat Appl Genet Mol Biol; 18(1), 2019 Jan 22.
Article in En | MEDLINE | ID: mdl-30667368
High-throughput RNA sequencing (RNA-seq) technology is increasingly used in disease-related biomarker studies. The negative binomial distribution has become a popular choice for modeling gene read counts in RNA-seq data because these counts are over-dispersed. In this study, we propose two explicit sample size calculation methods for RNA-seq data using a negative binomial regression model. To derive these new sample size formulas, the common dispersion parameter and the size factor, entered as an offset via a natural logarithm link function, are incorporated. A two-sided Wald test statistic derived from the coefficient parameter is used for testing a single gene at a nominal significance level of 0.05 and multiple genes at a false discovery rate of 0.05. The variance for the Wald test is computed from the variance-covariance matrix, with parameters set to their maximum likelihood estimates under the unrestricted and constrained scenarios. We evaluate the performance of our new formulas in simulation studies, comparing them side by side with three existing methods based on a Wald test, a likelihood ratio test, or an exact test. Because the other methods are much more computationally intensive, we recommend our M1 method for quick and direct estimation of sample sizes in an experimental design. Finally, we illustrate sample size estimation using an existing breast cancer RNA-seq data set.
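The abstract does not reproduce the M1 or M2 formulas, so the sketch below is not the authors' method. It shows a generic Wald-type approximation for a two-group comparison under a negative binomial model with a log link and a common dispersion, where the variance of the estimated log fold change per sample pair is taken as 1/mu0 + 1/mu1 + 2*phi. The function name `nb_wald_sample_size`, its parameters, and the use of a single size factor shared by all samples are assumptions made for illustration only.

```python
import math
from statistics import NormalDist


def nb_wald_sample_size(mu_control, fold_change, dispersion,
                        alpha=0.05, power=0.8, size_factor=1.0):
    """Approximate per-group sample size for a two-sided Wald test of the
    log fold change between two groups under a negative binomial model.

    Illustrative approximation only (NOT the paper's M1/M2 formulas).
    Assumes equal group sizes, a common dispersion, and one size factor
    shared by every sample (hypothetical simplification).
    """
    if mu_control <= 0 or fold_change <= 0 or fold_change == 1:
        raise ValueError("mu_control must be > 0 and fold_change must be > 0 and != 1")
    if dispersion < 0 or not (0 < alpha < 1) or not (0 < power < 1):
        raise ValueError("invalid dispersion, alpha, or power")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for target power

    mu0 = size_factor * mu_control  # expected count in the control group
    mu1 = mu0 * fold_change         # expected count in the treatment group

    # Approximate variance of the log fold change estimate, times n:
    # each group contributes 1/mu + dispersion.
    unit_variance = 1 / mu0 + 1 / mu1 + 2 * dispersion

    n = (z_alpha + z_beta) ** 2 * unit_variance / math.log(fold_change) ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # Hypothetical inputs: mean count 10, 2-fold change, dispersion 0.1.
    print(nb_wald_sample_size(mu_control=10, fold_change=2, dispersion=0.1))
```

For a multiple-gene design, one common simplification is to pass a smaller per-gene `alpha` chosen to control the false discovery rate; how that level is chosen is outside this sketch.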

Full text: 1 Database: MEDLINE Main subject: RNA / Gene Expression Profiling / High-Throughput Nucleotide Sequencing / RNA-Seq Type of study: Prognostic_studies / Risk_factors_studies Limits: Humans Language: En Year of publication: 2019 Document type: Article
