Sequence count data are poorly fit by the negative binomial distribution.
Hawinkel, Stijn; Rayner, J C W; Bijnens, Luc; Thas, Olivier.
Affiliation
  • Hawinkel S; Department of Data Analysis and Mathematical Modelling, Ghent University, Ghent, Belgium.
  • Rayner JCW; Centre for Computer-Assisted Research in Mathematics and its Applications, School of Mathematical and Physical Sciences, University of Newcastle, Newcastle, Australia.
  • Bijnens L; National Institute for Applied Statistics Research Australia (NIASRA), University of Wollongong, Wollongong, Australia.
  • Thas O; Quantitative Sciences, Janssen Pharmaceutical companies of Johnson and Johnson, Ghent, Belgium.
PLoS One ; 15(4): e0224909, 2020.
Article in English | MEDLINE | ID: mdl-32352970
ABSTRACT
Sequence count data are commonly modelled using the negative binomial (NB) distribution. Several empirical studies, however, have demonstrated that methods based on the NB-assumption do not always succeed in controlling the false discovery rate (FDR) at its nominal level. In this paper, we propose a dedicated statistical goodness of fit test for the NB distribution in regression models and demonstrate that the NB-assumption is violated in many publicly available RNA-Seq and 16S rRNA microbiome datasets. The zero-inflated NB distribution was not found to give a substantially better fit. We also show that the NB-based tests perform worse on the features for which the NB-assumption was violated than on the features for which no significant deviation was detected. This gives an explanation for the poor behaviour of NB-based tests in many published evaluation studies. We conclude that nonparametric tests should be preferred over parametric methods.
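As an informal illustration of what checking the NB-assumption can look like for a single feature, the sketch below fits an NB distribution by the method of moments and compares the observed fraction of zero counts with the fraction the fitted NB predicts. This is not the goodness of fit test proposed in the paper, which targets regression models; it is a crude descriptive check, and the function names (`nb_pmf`, `fit_nb_moments`, `zero_fraction_check`) and the mean/size parameterisation (variance = mu + mu^2/size) are assumptions made here for illustration.

```python
import math


def nb_pmf(k, mu, size):
    """NB probability of count k, with mean mu > 0 and dispersion
    parameter size > 0, so that the variance is mu + mu**2 / size."""
    log_p = (
        math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
        + size * math.log(size / (size + mu))
        + k * math.log(mu / (size + mu))
    )
    return math.exp(log_p)


def fit_nb_moments(counts):
    """Method-of-moments estimates (mu, size) from a list of counts.

    Illustrative only: moment estimates are simple but can be unstable
    for small samples, and they ignore covariates and library sizes.
    """
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    if var <= mean:
        raise ValueError("no overdispersion: moment estimate of size is undefined")
    size = mean ** 2 / (var - mean)
    return mean, size


def zero_fraction_check(counts):
    """Return (observed, expected) fraction of zeros under the fitted NB."""
    mu, size = fit_nb_moments(counts)
    observed = sum(1 for c in counts if c == 0) / len(counts)
    expected = nb_pmf(0, mu, size)
    return observed, expected


if __name__ == "__main__":
    # Hypothetical counts for one feature across eight samples.
    counts = [0, 0, 1, 2, 10, 0, 3, 7]
    observed, expected = zero_fraction_check(counts)
    print(f"observed zero fraction: {observed:.3f}")
    print(f"NB-expected zero fraction: {expected:.3f}")
```

A large gap between the two fractions hints at a poor fit for that feature, but it is not a calibrated test and yields no p-value.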
Subjects

Full text: 1 | Collections: 01-international | Database: MEDLINE | Main subject: Binomial Distribution / RNA-Seq | Type of study: Diagnostic_studies | Language: En | Journal: PLoS One | Publication year: 2020 | Document type: Article
