A heavy-tailed model for analyzing miRNA-seq raw read counts.
Stat Appl Genet Mol Biol; 23(1), 2024 Jan 01.
Article in En | MEDLINE | ID: mdl-38810893
ABSTRACT
This article addresses the limitations of existing statistical models in analyzing and interpreting highly skewed miRNA-seq raw read count data that can range from zero to millions. A heavy-tailed model using discrete stable distributions is proposed as a novel approach to better capture the heterogeneity and extreme values commonly observed in miRNA-seq data. Additionally, the parameters of the discrete stable distribution are proposed as an alternative target for differential expression analysis. An R package for computing and estimating the discrete stable distribution is provided. The proposed model is applied to miRNA-seq raw counts from the Norwegian Women and Cancer Study (NOWAC) and the Cancer Genome Atlas (TCGA) databases. The goodness-of-fit is compared with the popular Poisson and negative binomial distributions, and the discrete stable distributions are found to give a better fit for both datasets. In conclusion, the use of discrete stable distributions is shown to potentially lead to more accurate modeling of the underlying biological processes.
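As a rough illustration of the distribution family named in the abstract, the sketch below evaluates a discrete stable probability mass function in Python. It assumes the standard parametrization with probability generating function G(z) = exp(-lam * (1 - z)^alpha), with lam > 0 and 0 < alpha <= 1, which may differ from the parametrization used in the article or its R package. The function name `discrete_stable_pmf` is hypothetical and is not taken from that package. The recursion follows from differentiating G(z), which expresses the distribution as a compound Poisson sum of Sibuya-distributed jumps.

```python
import math


def discrete_stable_pmf(lam, alpha, n_max):
    """Return [P(X = 0), ..., P(X = n_max)] for a discrete stable variable.

    Assumed parametrization (not confirmed by the abstract):
    G(z) = exp(-lam * (1 - z) ** alpha), lam > 0, 0 < alpha <= 1.
    """
    if lam <= 0 or not (0 < alpha <= 1):
        raise ValueError("require lam > 0 and 0 < alpha <= 1")

    # Sibuya jump probabilities q[k], k >= 1, with PGF 1 - (1 - z) ** alpha:
    # q[1] = alpha, q[k + 1] = q[k] * (k - alpha) / (k + 1).
    q = [0.0] * (n_max + 1)
    if n_max >= 1:
        q[1] = alpha
        for k in range(1, n_max):
            q[k + 1] = q[k] * (k - alpha) / (k + 1)

    # Panjer-type recursion for a compound Poisson sum:
    # p[0] = exp(-lam), n * p[n] = lam * sum_{k=1..n} k * q[k] * p[n - k].
    p = [0.0] * (n_max + 1)
    p[0] = math.exp(-lam)
    for n in range(1, n_max + 1):
        total = sum(k * q[k] * p[n - k] for k in range(1, n + 1))
        p[n] = lam * total / n
    return p


# alpha = 1 reduces to the Poisson(lam) distribution; alpha < 1 gives a heavier tail.
poisson_like = discrete_stable_pmf(lam=2.0, alpha=1.0, n_max=10)
heavy_tailed = discrete_stable_pmf(lam=2.0, alpha=0.5, n_max=50)
```

The recursion costs O(n_max^2) operations, so it is suitable only for moderate counts; read counts in the millions, as described in the abstract, would need a different numerical approach.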
Collection: 01-internacional
Database: MEDLINE
Main subject: Models, Statistical / MicroRNAs
Limits: Female / Humans
Language: En
Journal: Stat Appl Genet Mol Biol
Year: 2024
Document type: Article