Data-based RNA-seq simulations by binomial thinning.
Gerard, David.
Affiliation
  • Gerard D; Department of Mathematics and Statistics, American University, Massachusetts Ave NW, Washington, DC, 20016, USA. dgerard@american.edu.
BMC Bioinformatics ; 21(1): 206, 2020 May 24.
Article in En | MEDLINE | ID: mdl-32448189
ABSTRACT

BACKGROUND:

With the explosion in the number of methods designed to analyze bulk and single-cell RNA-seq data, there is a growing need for approaches that assess and compare these methods. The usual technique is to compare methods on data simulated according to some theoretical model. However, as real data often violate the assumptions of theoretical models, this can result in unsubstantiated claims about a method's performance.

RESULTS:

Rather than generate data from a theoretical model, in this paper we develop methods to add signal to real RNA-seq datasets. Since the resulting simulated data are not generated from an unrealistic theoretical model, they exhibit realistic (annoying) attributes of real data. This lets RNA-seq methods developers assess their procedures in non-ideal (model-violating) scenarios. Our procedures may be applied to both single-cell and bulk RNA-seq. We show that our simulation method results in more realistic datasets and can alter the conclusions of a differential expression analysis study. We also demonstrate our approach by comparing various factor analysis techniques on RNA-seq datasets.
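
The abstract does not spell out the mechanics, so the following is only a minimal sketch of the general idea behind binomial thinning, not the seqgendiff implementation. It assumes a single gene, a two-group design, and a chosen log2 fold change; the function name `thin_counts` and its parameters are hypothetical. Each read in the group to be down-weighted is kept independently with probability 2^(-|log2 fold change|), which lowers that group's expected count by the same factor while leaving the other group's real counts untouched.

```python
import random


def thin_counts(counts, group, log2_fold_change, seed=0):
    """Binomially thin one gene's counts to induce a two-group effect.

    counts: observed read counts, one per sample (real data).
    group: 0/1 group label per sample (hypothetical design).
    log2_fold_change: target log2 ratio of group 1 over group 0.
    """
    rng = random.Random(seed)
    # Thinning can only remove reads, so a positive effect is created by
    # thinning group 0 and a negative effect by thinning group 1.
    keep_prob = 2.0 ** (-abs(log2_fold_change))
    thinned_label = 0 if log2_fold_change > 0 else 1
    thinned = []
    for y, g in zip(counts, group):
        if g == thinned_label:
            # Binomial(y, keep_prob) draw as a sum of y Bernoulli trials.
            y = sum(rng.random() < keep_prob for _ in range(y))
        thinned.append(y)
    return thinned


counts = [100, 120, 90, 110]
group = [0, 0, 1, 1]
simulated = thin_counts(counts, group, log2_fold_change=1.0)
```

In this usage example, the two group-0 samples are thinned with keep probability 0.5 and the two group-1 samples keep their original counts, so group 1 ends up with roughly twice the expected expression of group 0. Because the untouched variation comes from the real dataset, any model violations in the original counts carry over to the simulated data.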

CONCLUSIONS:

Using data simulated from a theoretical model can substantially impact the results of a study. We developed more realistic simulation techniques for RNA-seq data. Our tools are available in the seqgendiff R package on the Comprehensive R Archive Network, https://cran.r-project.org/package=seqgendiff.

Full text: 1 | Database: MEDLINE | Main subject: Computer Simulation / Databases, Genetic / RNA-Seq | Study type: Prognostic_studies | Limit: Humans | Language: En | Publication year: 2020 | Document type: Article
