Universal prediction of cell-cycle position using transfer learning.
Genome Biol
; 23(1): 41, 2022 01 31.
Article
em En
| MEDLINE
| ID: mdl-35101061
BACKGROUND: The cell cycle is a highly conserved, continuous process which controls faithful replication and division of cells. Single-cell technologies have enabled increasingly precise measurements of the cell cycle both as a biological process of interest and as a possible confounding factor. Despite its importance and conservation, there is no universally applicable approach to infer position in the cell cycle with high-resolution from single-cell RNA-seq data. RESULTS: Here, we present tricycle, an R/Bioconductor package, to address this challenge by leveraging key features of the biology of the cell cycle, the mathematical properties of principal component analysis of periodic functions, and the use of transfer learning. We estimate a cell-cycle embedding using a fixed reference dataset and project new data into this reference embedding, an approach that overcomes key limitations of learning a dataset-dependent embedding. Tricycle then predicts a cell-specific position in the cell cycle based on the data projection. The accuracy of tricycle compares favorably to gold-standard experimental assays, which generally require specialized measurements in specifically constructed in vitro systems. Using internal controls which are available for any dataset, we show that tricycle predictions generalize to datasets with multiple cell types, across tissues, species, and even sequencing assays. CONCLUSIONS: Tricycle generalizes across datasets and is highly scalable and applicable to atlas-level single-cell RNA-seq data.
Palavras-chave
Texto completo:
1
Coleções:
01-internacional
Base de dados:
MEDLINE
Assunto principal:
Análise de Célula Única
/
Aprendizado de Máquina
Tipo de estudo:
Prognostic_studies
/
Risk_factors_studies
Idioma:
En
Ano de publicação:
2022
Tipo de documento:
Article