Error modelled gene expression analysis (EMOGEA) provides a superior overview of time course RNA-seq measurements and low count gene expression.
Barra, Jasmine; Taverna, Federico; Bong, Fabian; Ahmed, Ibrahim; Karakach, Tobias K.
Affiliation
  • Barra J; Laboratory of Integrative Multi-Omics Research, Department of Pharmacology, Dalhousie University, 5850 College Street, Halifax, NS, B3H 4R2, Canada.
  • Taverna F; Beatrice Hunter Cancer Research Institute, 5743 University Avenue, Suite 98, Halifax, NS, B3H 0A2, Canada.
  • Bong F; Department of Microbiology & Immunology, Dalhousie University, 5850 College Street, Halifax, NS, B3H 4R2, Canada.
  • Ahmed I; Laboratory of Integrative Multi-Omics Research, Department of Pharmacology, Dalhousie University, 5850 College Street, Halifax, NS, B3H 4R2, Canada.
  • Karakach TK; Beatrice Hunter Cancer Research Institute, 5743 University Avenue, Suite 98, Halifax, NS, B3H 0A2, Canada.
Brief Bioinform; 25(3), 2024 Mar 27.
Article in En | MEDLINE | ID: mdl-38770716
ABSTRACT
Temporal RNA-sequencing (RNA-seq) studies of bulk samples provide an opportunity for improved understanding of gene regulation during dynamic phenomena such as development, tumor progression or response to an incremental dose of a pharmacotherapeutic. Moreover, single-cell RNA-seq (scRNA-seq) data implicitly exhibit temporal characteristics because gene expression values recapitulate dynamic processes such as cellular transitions. Unfortunately, temporal RNA-seq data continue to be analyzed by methods that ignore this ordinal structure and yield results that are often difficult to interpret. Here, we present Error Modelled Gene Expression Analysis (EMOGEA), a framework for analyzing RNA-seq data that incorporates measurement uncertainty, while introducing a special formulation for data acquired to monitor dynamic phenomena. This method is specifically suited for RNA-seq studies in which low-count transcripts with small-fold changes lead to significant biological effects. Such transcripts include genes involved in signaling and non-coding RNAs that inherently exhibit low levels of expression. Using simulation studies, we show that this framework down-weights samples that exhibit extreme responses such as batch effects, allowing them to be modeled with the rest of the samples and maintaining the degrees of freedom originally envisioned for a study. Using temporal experimental data, we demonstrate the framework by extracting a cascade of gene expression waves from a well-designed RNA-seq study of zebrafish embryogenesis and an scRNA-seq study of mouse pre-implantation, and we provide unique biological insights into the regulation of genes in each wave. For non-ordinal measurements, we show that EMOGEA has a much higher rate of true positive calls and a vanishingly small rate of false negative discoveries compared to common approaches. Finally, we provide two packages in Python and R that are self-contained and easy to use, including test data.

Full text: 1 | Collection: 01-internacional | Database: MEDLINE | Main subject: Zebrafish / RNA-Seq | Limits: Animals | Language: En | Journal: Brief Bioinform | Year: 2024 | Document type: Article
