Minimal gene set discovery in single-cell mRNA-seq datasets with ActiveSVM.
Nat Comput Sci; 2(6): 387-398, 2022 Jun.
Article in English | MEDLINE | ID: mdl-38177588
ABSTRACT
Sequencing costs currently prohibit the application of single-cell mRNA-seq to many biological and clinical analyses. Targeted single-cell mRNA-sequencing reduces sequencing costs by profiling reduced gene sets that capture biological information with a minimal number of genes. Here we introduce an active learning method that identifies minimal but highly informative gene sets that enable the identification of cell types, physiological states and genetic perturbations in single-cell data using a small number of genes. Our active feature selection procedure generates minimal gene sets from single-cell data by employing an active support vector machine (ActiveSVM) classifier. We demonstrate that ActiveSVM feature selection identifies gene sets that enable ~90% cell-type classification accuracy across, for example, cell atlas and disease-characterization datasets. The discovery of small but highly informative gene sets should enable reductions in the number of measurements necessary for application of single-cell mRNA-seq to clinical tests, therapeutic discovery and genetic screens.
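The greedy idea behind active feature selection can be illustrated with a small sketch. This is not the authors' ActiveSVM implementation: it is a simplified, binary-label toy in pure Python, and the function names (train_linear_svm, select_genes), the scoring rule, and the toy data are all assumptions made for illustration. At each round it scores every unselected gene using only the cells that the current classifier still handles poorly (margin below 1), adds the best-scoring gene, and retrains a linear SVM on the selected genes.

```python
def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Fit a linear SVM by subgradient descent on the L2-regularised hinge loss.

    X is a list of feature rows, y a list of labels in {+1, -1}.
    """
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            violated = margin < 1
            for j in range(n_features):
                grad = lam * w[j] - (yi * xi[j] if violated else 0.0)
                w[j] -= lr * grad
            if violated:
                b += lr * yi
    return w, b


def select_genes(X, y, n_genes):
    """Greedily pick n_genes column indices of X (hypothetical helper).

    Assumed scoring rule: |sum_i y_i * x_ij| over the margin-violating cells,
    i.e. the magnitude of the hinge-loss subgradient with respect to the
    weight a new gene j would receive (starting from zero).
    """
    n_cells, n_total = len(X), len(X[0])
    selected = []
    margins = [0.0] * n_cells  # no classifier yet, so every cell counts as "hard"
    for _ in range(n_genes):
        hard = [i for i in range(n_cells) if margins[i] < 1]
        if not hard:  # everything is already well separated; fall back to all cells
            hard = list(range(n_cells))
        candidates = [j for j in range(n_total) if j not in selected]
        best = max(candidates,
                   key=lambda j: abs(sum(y[i] * X[i][j] for i in hard)))
        selected.append(best)
        # Retrain on the selected genes only and recompute per-cell margins.
        Xs = [[row[j] for j in selected] for row in X]
        w, b = train_linear_svm(Xs, y)
        margins = [y[i] * (sum(wj * xj for wj, xj in zip(w, Xs[i])) + b)
                   for i in range(n_cells)]
    return selected


# Toy expression matrix: 6 cells x 4 genes; gene index 2 separates the classes.
cells = [
    [1, 1, 5, 0],
    [1, 2, 5, 1],
    [1, 1, 5, 0],
    [1, 2, 0, 0],
    [1, 1, 0, 1],
    [1, 2, 0, 0],
]
labels = [1, 1, 1, -1, -1, -1]

panel = select_genes(cells, labels, 2)
print(panel)  # gene index 2 is chosen first
```

The point of restricting the score to poorly classified cells is that each new gene is chosen for what the current gene set still gets wrong, which is one plausible way to keep the final panel small. A real analysis would need multi-class labels, normalised counts, and a tested SVM library.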
Full text: 1
Collection: 01-internacional
Database: MEDLINE
Language: English
Journal: Nat Comput Sci
Year: 2022
Document type: Article
Country of affiliation: United States of America
Country of publication: United States of America