ABSTRACT
We introduce poly-adenine CRISPR gRNA-based single-cell RNA-sequencing (pAC-Seq), a method that enables the direct observation of guide RNAs (gRNAs) in scRNA-seq. We use pAC-Seq to assess the phenotypic consequences of CRISPR/Cas9-based alterations of gene cis-regulatory regions. We show that pAC-Seq is able to detect cis-regulatory-induced alteration of target gene expression even when biallelic loss of target gene expression occurs in only ~5% of cells. This low rate of biallelic loss significantly increases the number of cells required to detect the consequences of changes to the regulatory genome, but this requirement can be ameliorated by transcript-targeted sequencing. Using our experimental results, we model the power to detect regulatory-genome-induced transcriptomic effects as a function of the rate of mono/biallelic loss, baseline gene expression, and the number of cells per target gRNA.
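The dependence of detection power on the loss rate, baseline expression, and cell number can be illustrated with a small calculation. The sketch below is not the authors' model. It assumes Poisson-distributed counts, assumes that monoallelic loss halves expression and biallelic loss abolishes it, and uses a one-sided normal-approximation test with equal numbers of targeted and control cells. The function name `detection_power` and all parameters are hypothetical.

```python
from statistics import NormalDist


def detection_power(baseline, p_mono, p_bi, n_cells, alpha=0.05):
    """Approximate power to detect reduced mean expression in targeted cells.

    Assumptions (illustrative only, not taken from the paper):
    - control cells have Poisson counts with mean `baseline`;
    - targeted cells are a mixture: unedited (mean = baseline),
      monoallelic loss (mean = baseline / 2), biallelic loss (mean = 0);
    - `n_cells` targeted cells are compared with `n_cells` control cells
      using a one-sided z-test on the difference in means.
    """
    p_wt = 1.0 - p_mono - p_bi
    classes = [(p_wt, baseline), (p_mono, baseline / 2.0), (p_bi, 0.0)]

    # Mean of the targeted mixture.
    mean_t = sum(p * m for p, m in classes)
    # Variance = expected Poisson variance + variance of the class means.
    var_t = mean_t + sum(p * m * m for p, m in classes) - mean_t ** 2
    var_c = baseline  # Poisson variance of the control population

    std_err = ((var_t + var_c) / n_cells) ** 0.5
    z_crit = NormalDist().inv_cdf(1.0 - alpha)
    return NormalDist().cdf((baseline - mean_t) / std_err - z_crit)


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        print(n, round(detection_power(1.0, 0.0, 0.05, n), 3))
```

Under these assumptions, a 5% biallelic loss rate at low baseline expression yields little power with a hundred cells per gRNA and needs thousands of cells to approach reliable detection. Raising the baseline count per cell, which is what transcript-targeted sequencing effectively does, increases power at a fixed cell number.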
Subject(s)
CRISPR-Cas Systems/genetics , Regulatory Elements, Transcriptional/genetics , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Transcriptome/genetics , Algorithms , Animals , Clustered Regularly Interspaced Short Palindromic Repeats/genetics , Computational Biology , Databases, Factual , Humans , Mice , RNA, Guide, Kinetoplastida/genetics
ABSTRACT
The study of cell-population heterogeneity in a range of biological systems, from viruses to bacterial isolates to tumor samples, has been transformed by recent advances in sequencing throughput. While the high coverage afforded can, in principle, be used to identify very rare variants in a population, existing ad hoc approaches frequently fail to distinguish true variants from sequencing errors. We report a method (LoFreq) that models sequencing run-specific error rates to accurately call variants occurring in <0.05% of a population. Using simulated and real datasets (viral, bacterial and human), we show that LoFreq has near-perfect specificity and significantly improved sensitivity compared with existing methods, and that it can efficiently analyze deep Illumina sequencing datasets without resorting to approximations or heuristics. We also present experimental validation for LoFreq on two different platforms (Fluidigm and Sequenom) and its application to call rare somatic variants from exome sequencing datasets for gastric cancer. Source code and executables for LoFreq are freely available at http://sourceforge.net/projects/lofreq/.
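One way to make "modeling run-specific error rates" concrete is to treat each base at a position as an independent Bernoulli trial whose error probability comes from its Phred quality score, and to ask how likely the observed number of mismatches is under errors alone. The sketch below computes that tail probability exactly by dynamic programming. It is an illustration under these stated assumptions, not LoFreq's implementation, and the function names are hypothetical.

```python
def phred_to_error_prob(quality):
    """Convert a Phred quality score Q to an error probability 10^(-Q/10)."""
    return 10 ** (-quality / 10)


def mismatch_tail_probability(error_probs, observed_mismatches):
    """Return P(X >= observed_mismatches) when X is the number of errors
    among independent bases with per-base error probabilities `error_probs`
    (a Poisson-binomial distribution).
    """
    # dist[j] = probability of exactly j errors among the bases seen so far.
    dist = [1.0]
    for p in error_probs:
        updated = [0.0] * (len(dist) + 1)
        for j, mass in enumerate(dist):
            updated[j] += mass * (1.0 - p)  # this base is read correctly
            updated[j + 1] += mass * p      # this base is an error
        dist = updated
    return sum(dist[observed_mismatches:])


if __name__ == "__main__":
    # Hypothetical pileup: 1000 reads, all at Phred quality 30.
    probs = [phred_to_error_prob(30)] * 1000
    print(mismatch_tail_probability(probs, 5))
```

A small tail probability indicates that the mismatches are unlikely to be sequencing errors alone, so the position is a candidate variant. In practice a multiple-testing correction across positions would also be needed, which this sketch omits.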
Subject(s)
Genetic Variation , High-Throughput Nucleotide Sequencing/methods , Computer Simulation , Dengue Virus/genetics , Escherichia coli/genetics , Genomics/methods , High-Throughput Nucleotide Sequencing/standards , Humans , Mutation , Sensitivity and Specificity , Stomach Neoplasms/genetics , Viral Proteins/chemistry , Viral Proteins/genetics
ABSTRACT
Existing computational methods that use single-cell RNA-sequencing (scRNA-seq) for cell fate prediction do not model how cells evolve stochastically and in physical time, nor can they predict how differentiation trajectories are altered by proposed interventions. We introduce PRESCIENT (Potential eneRgy undErlying Single Cell gradIENTs), a generative modeling framework that learns an underlying differentiation landscape from time-series scRNA-seq data. We validate PRESCIENT on an experimental lineage tracing dataset, where we show that PRESCIENT is able to predict the fate biases of progenitor cells in hematopoiesis when accounting for cell proliferation, improving upon the best-performing existing method. We demonstrate how PRESCIENT can simulate trajectories for perturbed cells, recovering the expected effects of known modulators of cell fate in hematopoiesis and pancreatic β cell differentiation. PRESCIENT is able to accommodate complex perturbations of multiple genes, at different time points and from different starting cell populations, and is available at https://github.com/gifford-lab/prescient.
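The idea of cells evolving stochastically over a differentiation landscape can be illustrated with a toy simulation. The sketch below is not PRESCIENT and does not use its code. It assumes a hand-written one-dimensional double-well potential U(x) = (x^2 - 1)^2 in place of a learned landscape, and integrates the stochastic dynamics dx = -U'(x) dt + sigma dW with the Euler-Maruyama scheme. All names and parameter values are hypothetical.

```python
import math
import random


def grad_potential(x):
    """Gradient of the toy potential U(x) = (x^2 - 1)^2, with wells at x = -1 and x = +1."""
    return 4.0 * x * (x * x - 1.0)


def simulate_cell(x0, rng, n_steps=2000, dt=0.01, sigma=0.5):
    """Euler-Maruyama integration of dx = -U'(x) dt + sigma dW from state x0."""
    x = x0
    for _ in range(n_steps):
        drift = -grad_potential(x) * dt
        diffusion = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        x += drift + diffusion
    return x


def fate_bias(x0, n_cells=200, seed=0):
    """Fraction of simulated cells that end in the right-hand well (x > 0)."""
    rng = random.Random(seed)
    in_right_well = sum(simulate_cell(x0, rng) > 0.0 for _ in range(n_cells))
    return in_right_well / n_cells


if __name__ == "__main__":
    # Shifting the starting state stands in for a perturbation of the cell.
    for start in (-0.5, 0.0, 0.5):
        print(start, fate_bias(start))
```

In this toy setting, each well plays the role of a terminal fate, and the fate bias of a starting state is estimated by simulating many noisy trajectories from it. A perturbation is mimicked here by moving the starting state, which changes the fraction of trajectories reaching each well.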
Subject(s)
Cell Differentiation/genetics , Models, Genetic , RNA-Seq , Single-Cell Analysis/methods , Animals , Cell Proliferation/genetics , Cells, Cultured , Computer Simulation , Datasets as Topic , Deep Learning , Hematopoiesis/genetics , Humans , Insulin-Secreting Cells/physiology , Mice , Software , Stem Cells/physiology , Stochastic Processes
ABSTRACT
Empirical optimization of stem cell differentiation protocols is time-consuming and labor-intensive, and typically does not comprehensively interrogate all relevant signaling pathways. Here we describe barcodelet single-cell RNA sequencing (barRNA-seq), which enables systematic exploration of cellular perturbations by tagging individual cells with RNA "barcodelets" to identify them on the basis of the treatments they receive. We apply barRNA-seq to simultaneously manipulate up to seven developmental pathways and study effects on embryonic stem cell (ESC) germ layer specification and mesodermal specification, uncovering combinatorial effects of signaling pathway activation on gene expression. We further develop a data-driven framework for identifying combinatorial signaling perturbations that drive cells toward specific fates, including several annotated in an existing scRNA-seq gastrulation atlas, and use this approach to guide ESC differentiation into a notochord-like population. We expect that barRNA-seq will have broad utility for investigating and understanding how cooperative signaling pathways drive cell fate acquisition.