An information-theoretic approach to single cell sequencing analysis.
Casey, Michael J; Fliege, Jörg; Sánchez-García, Rubén J; MacArthur, Ben D.
Affiliation
  • Casey MJ; Mathematical Sciences, University of Southampton, Southampton, UK.
  • Fliege J; Institute for Life Sciences, University of Southampton, Southampton, UK.
  • Sánchez-García RJ; Mathematical Sciences, University of Southampton, Southampton, UK.
  • MacArthur BD; Mathematical Sciences, University of Southampton, Southampton, UK. R.Sanchez-Garcia@soton.ac.uk.
BMC Bioinformatics ; 24(1): 311, 2023 Aug 12.
Article in English | MEDLINE | ID: mdl-37573291
ABSTRACT

BACKGROUND:

Single-cell sequencing (sc-Seq) experiments are producing increasingly large data sets. However, large data sets do not necessarily contain large amounts of information.

RESULTS:

Here, we formally quantify the information obtained from a sc-Seq experiment and show that it corresponds to an intuitive notion of gene expression heterogeneity. We demonstrate a natural relation between our notion of heterogeneity and that of cell type, decomposing heterogeneity into that component attributable to differential expression between cell types (inter-cluster heterogeneity) and that remaining (intra-cluster heterogeneity). We test our definition of heterogeneity as the objective function of a clustering algorithm, and show that it is a useful descriptor for gene expression patterns associated with different cell types.

CONCLUSIONS:

Thus, our definition of gene heterogeneity leads to a biologically meaningful notion of cell type, as groups of cells that are statistically equivalent with respect to their patterns of gene expression. Our measure of heterogeneity, and its decomposition into inter- and intra-cluster, is non-parametric, intrinsic, unbiased, and requires no additional assumptions about expression patterns. Based on this theory, we develop an efficient method for the automatic unsupervised clustering of cells from sc-Seq data, and provide an R package implementation.
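The abstract does not state the formulas, so the following is only a minimal sketch of how such a decomposition can work, under one assumption: that a gene's heterogeneity is measured as the Kullback-Leibler divergence of its per-cell transcript shares from the uniform distribution over cells. Under that assumption, the chain rule for relative entropy splits the total exactly into an inter-cluster and an intra-cluster term. The function names `heterogeneity` and `decompose` are hypothetical and are not the API of the authors' R package.

```python
import numpy as np

def heterogeneity(counts):
    """KL divergence (in nats) of one gene's per-cell transcript shares
    from the uniform distribution over cells (assumed definition)."""
    counts = np.asarray(counts, dtype=float)
    n = counts.size
    x = counts / counts.sum()          # share of the gene's transcripts in each cell
    nz = x > 0                         # 0 * log(0) is taken as 0
    return float(np.sum(x[nz] * np.log(n * x[nz])))

def decompose(counts, labels):
    """Split heterogeneity into (inter-cluster, intra-cluster) parts
    for a given assignment of cells to clusters."""
    counts = np.asarray(counts, dtype=float)
    labels = np.asarray(labels)
    n = counts.size
    x = counts / counts.sum()
    inter, intra = 0.0, 0.0
    for k in np.unique(labels):
        members = labels == k
        n_k = members.sum()            # number of cells in cluster k
        y_k = x[members].sum()         # share of transcripts in cluster k
        if y_k == 0:
            continue                   # empty clusters contribute nothing
        # Between clusters: cluster shares versus cluster sizes.
        inter += y_k * np.log(y_k / (n_k / n))
        # Within a cluster: heterogeneity of its cells, weighted by its share.
        intra += y_k * heterogeneity(counts[members])
    return float(inter), float(intra)

# Hypothetical toy data: one gene measured in four cells, two clusters.
counts = [4, 0, 2, 2]
labels = [0, 0, 1, 1]
total = heterogeneity(counts)
inter, intra = decompose(counts, labels)
print(total, inter, intra)
```

In this toy example the two clusters hold equal shares of the transcripts, so the inter-cluster term vanishes and all of the heterogeneity is intra-cluster. A clustering that concentrates differential expression between clusters would shift the balance toward the inter-cluster term.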
Subject(s)

Full text: 1 | Collection: 01-international | Database: MEDLINE | Main subject: Algorithms / Gene Expression Profiling | Language: En | Journal: BMC Bioinformatics | Journal subject: MEDICAL INFORMATICS | Year: 2023 | Document type: Article | Country of affiliation: United Kingdom
