ABSTRACT
Here we present advancements in single-cell combinatorial indexed Assay for Transposase Accessible Chromatin (sciATAC) to measure chromatin accessibility that leverage nanowell chips to achieve atlas-scale cell throughput (>10^5 cells) at low cost. The platform leverages the core of the sciATAC workflow where multiple indexed tagmentation reactions are performed, followed by pooling and distribution to a second set of reaction wells for polymerase chain reaction (PCR)-based indexing. In this work, we instead leverage a chip containing 5184 nanowells at the PCR stage of indexing, enabling a 52-fold improvement in scale and reduction in per-cell preparation costs. We detail three variants that balance cell throughput and depth of coverage, and apply these methods to banked mouse brain tissue, producing maps of cell types as well as neuronal subtypes that include integration with existing single-cell Assay for Transposase Accessible Chromatin (scATAC) and scRNA-seq data sets. Our optimized workflow achieves a high fraction of reads that fall within called peaks (>80%) and low cell doublet rates. The high cell coverage technique produces high unique reads per cell, while retaining high enrichment for open chromatin regions, enabling the assessment of >70,000 unique accessible loci on average for each cell profiled. When compared to current methods in the field, our technique provides similar or superior per-cell information with very low levels of cell-to-cell cross talk, and achieves this at a cost point much lower than existing assays.
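The two-stage indexing described above (indexed tagmentation, then pooling and redistribution to PCR wells) can be illustrated with a minimal Python sketch. It is not the authors' analysis code. It assumes that tagmentation indexes are assigned uniformly and independently, and the value of 96 tagmentation indexes is a hypothetical choice for illustration; only the 5184 nanowells come from the abstract.

```python
def barcode_space(n_tagmentation_indexes: int, n_pcr_wells: int) -> int:
    """Number of distinct two-level barcode combinations."""
    return n_tagmentation_indexes * n_pcr_wells


def collision_probability(n_tagmentation_indexes: int, nuclei_per_well: int) -> float:
    """Probability that a given nucleus shares its tagmentation index with
    at least one other nucleus in the same PCR well.

    Assumes each nucleus carries one of the tagmentation indexes,
    chosen uniformly and independently (an idealization).
    """
    if n_tagmentation_indexes < 1 or nuclei_per_well < 1:
        raise ValueError("both arguments must be positive integers")
    p_no_match = (1.0 - 1.0 / n_tagmentation_indexes) ** (nuclei_per_well - 1)
    return 1.0 - p_no_match


if __name__ == "__main__":
    n_tn5 = 96      # hypothetical number of tagmentation indexes
    n_wells = 5184  # nanowells at the PCR stage (from the abstract)
    print("barcode combinations:", barcode_space(n_tn5, n_wells))
    for nuclei in (5, 10, 20):  # hypothetical nuclei loaded per nanowell
        rate = collision_probability(n_tn5, nuclei)
        print(f"{nuclei} nuclei per well: collision probability {rate:.3f}")
```

Under these assumptions, fewer nuclei per well lowers the expected collision rate but also lowers the number of cells per run, so the loading density is a trade-off.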
Subject(s)
Chromatin; Transposases; Mice; Animals; Transposases/metabolism; Neurons/metabolism; Epigenomics/methods; Single-Cell Analysis/methods

ABSTRACT
DNA methylation is a key epigenetic property that drives gene regulatory programs in development and disease. Current single-cell methods that produce high quality methylomes are expensive and low throughput without the aid of extensive automation. We previously described a proof-of-principle technique that enabled high cell throughput; however, it produced only low-coverage profiles and was a difficult protocol that required custom sequencing primers and recipes and frequently produced libraries with excessive adapter contamination. Here, we describe a greatly improved version that generates high-coverage profiles (~15-fold increase) using a robust protocol that does not require custom sequencing capabilities, includes multiple stopping points, and exhibits minimal adapter contamination. We demonstrate two versions of sciMETv2 on primary human cortex, a high coverage and rapid version, identifying distinct cell types using CH methylation patterns. These datasets are able to be directly integrated with one another as well as with existing snmC-seq2 datasets with little discernible bias. Finally, we demonstrate the ability to determine cell types using CG methylation alone, which is the dominant context for DNA methylation in most cell types other than neurons and the most applicable analysis outside of brain tissue.
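The CG and CH contexts mentioned above differ only in the base that follows the cytosine: G for CG, and A, C, or T (written H) for CH. The following is a minimal Python sketch of that classification together with a per-context methylation fraction. The function names and the input format (forward-strand positions paired with a methylated flag) are assumptions made for illustration and are not taken from the sciMETv2 software.

```python
def cytosine_context(seq: str, pos: int) -> str:
    """Classify the forward-strand cytosine at seq[pos] as 'CG', 'CH', or 'NA'.

    'NA' is returned when the next base is missing or ambiguous (e.g. N).
    """
    seq = seq.upper()
    if seq[pos] != "C":
        raise ValueError(f"position {pos} is not a cytosine")
    if pos + 1 >= len(seq):
        return "NA"
    next_base = seq[pos + 1]
    if next_base == "G":
        return "CG"
    if next_base in "ACT":
        return "CH"
    return "NA"


def methylation_by_context(seq, calls):
    """Fraction of methylated calls per context.

    calls: iterable of (position, is_methylated) pairs for cytosines in seq.
    Returns a dict mapping 'CG' and 'CH' to a fraction, or None if no calls.
    """
    counts = {"CG": [0, 0], "CH": [0, 0]}  # [methylated, total]
    for pos, is_methylated in calls:
        context = cytosine_context(seq, pos)
        if context in counts:
            counts[context][0] += int(is_methylated)
            counts[context][1] += 1
    return {ctx: (m / t if t else None) for ctx, (m, t) in counts.items()}


if __name__ == "__main__":
    reference = "ACGTCATCC"  # toy sequence, not real data
    toy_calls = [(1, True), (4, False), (7, True)]
    print(methylation_by_context(reference, toy_calls))
```

A real pipeline would also handle the reverse strand and read-level coverage, which this sketch leaves out.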
Subject(s)
DNA Methylation; High-Throughput Nucleotide Sequencing; Humans; DNA Methylation/genetics; Sequence Analysis, DNA; Epigenomics/methods; Software

ABSTRACT
Single-cell combinatorial indexing (sci) with transposase-based library construction increases the throughput of single-cell genomics assays but produces sparse coverage in terms of usable reads per cell. We develop symmetrical strand sci ('s3'), a uracil-based adapter switching approach that improves the rate of conversion of source DNA into viable sequencing library fragments following tagmentation. We apply this chemistry to assay chromatin accessibility (s3-assay for transposase-accessible chromatin, s3-ATAC) in human cortical and mouse whole-brain tissues, with mouse datasets demonstrating a six- to 13-fold improvement in usable reads per cell compared with other available methods. Application of s3 to single-cell whole-genome sequencing (s3-WGS) and to whole-genome plus chromatin conformation (s3-GCC) yields 148- and 14.8-fold improvements, respectively, in usable reads per cell compared with sci-DNA-sequencing and sci-HiC. We show that s3-WGS and s3-GCC resolve subclonal genomic alterations in patient-derived pancreatic cancer cell lines. We expect that the s3 platform will be compatible with other transposase-based techniques, including sci-MET or CUT&Tag.
Subject(s)
Chromatin; Transposases; Animals; Chromatin/genetics; DNA/genetics; Genome; High-Throughput Nucleotide Sequencing/methods; Humans; Mice; Sequence Analysis, DNA; Single-Cell Analysis/methods; Transposases/genetics; Transposases/metabolism

ABSTRACT
High-throughput single-cell epigenomic assays can resolve cell type heterogeneity in complex tissues; however, spatial orientation is lost. Here, we present single-cell combinatorial indexing on Microbiopsies Assigned to Positions for the Assay for Transposase Accessible Chromatin, or sciMAP-ATAC, as a method for highly scalable, spatially resolved, single-cell profiling of chromatin states. sciMAP-ATAC produces data of equivalent quality to non-spatial sci-ATAC and retains the positional information of each cell within a 214 micron cubic region, with up to hundreds of tracked positions in a single experiment. We apply sciMAP-ATAC to assess cortical lamination in the adult mouse primary somatosensory cortex and in the human primary visual cortex, where we produce spatial trajectories and integrate our data with non-spatial single-nucleus RNA and other chromatin accessibility single-cell datasets. Finally, we characterize the spatially progressive nature of cerebral ischemic infarction in the mouse brain using a model of transient middle cerebral artery occlusion.
Subject(s)
Brain/metabolism; Chromatin/metabolism; Animals; Brain Ischemia/metabolism; Cell Nucleus/metabolism; Female; Immunohistochemistry; Infarction, Middle Cerebral Artery/metabolism; Mice

ABSTRACT
The chromatin landscape underlying the specification of human cell types is of fundamental interest. We generated human cell atlases of chromatin accessibility and gene expression in fetal tissues. For chromatin accessibility, we devised a three-level combinatorial indexing assay and applied it to 53 samples representing 15 organs, profiling ~800,000 single cells. We leveraged cell types defined by gene expression to annotate these data and cataloged hundreds of thousands of candidate regulatory elements that exhibit cell type-specific chromatin accessibility. We investigated the properties of lineage-specific transcription factors (such as POU2F1 in neurons), organ-specific specializations of broadly distributed cell types (such as blood and endothelial), and cell type-specific enrichments of complex trait heritability. These data represent a rich resource for the exploration of in vivo human gene regulation in diverse tissues and cell types.
Subject(s)
Chromatin/metabolism; Fetus/cytology; Fetus/metabolism; Gene Expression Profiling; Gene Expression Regulation, Developmental; Single-Cell Analysis; Atlases as Topic; Humans; Neurons/metabolism; Transcription Factors/metabolism

ABSTRACT
The gene expression program underlying the specification of human cell types is of fundamental interest. We generated human cell atlases of gene expression and chromatin accessibility in fetal tissues. For gene expression, we applied three-level combinatorial indexing to >110 samples representing 15 organs, ultimately profiling ~4 million single cells. We leveraged the literature and other atlases to identify and annotate hundreds of cell types and subtypes, both within and across tissues. Our analyses focused on organ-specific specializations of broadly distributed cell types (such as blood, endothelial, and epithelial), sites of fetal erythropoiesis (which notably included the adrenal gland), and integration with mouse developmental atlases (such as conserved specification of blood cells). These data represent a rich resource for the exploration of in vivo human gene expression in diverse tissues and cell types.
Subject(s)
Chromatin/metabolism; Fetus/cytology; Fetus/metabolism; Gene Expression Profiling; Gene Expression Regulation, Developmental; Single-Cell Analysis; Atlases as Topic; Humans; Neurons/metabolism; Transcription Factors/metabolism

ABSTRACT
von Economo neurons (VENs) are bipolar, spindle-shaped neurons restricted to layer 5 of human frontoinsula and anterior cingulate cortex that appear to be selectively vulnerable to neuropsychiatric and neurodegenerative diseases, although little is known about other VEN cellular phenotypes. Single nucleus RNA-sequencing of frontoinsula layer 5 identifies a transcriptomically-defined cell cluster that contained VENs, but also fork cells and a subset of pyramidal neurons. Cross-species alignment of this cell cluster with a well-annotated mouse classification shows strong homology to extratelencephalic (ET) excitatory neurons that project to subcerebral targets. This cluster also shows strong homology to a putative ET cluster in human temporal cortex, but with a strikingly specific regional signature. Together these results suggest that VENs are a regionally distinctive type of ET neuron. Additionally, we describe the first patch clamp recordings of VENs from neurosurgically-resected tissue that show distinctive intrinsic membrane properties relative to neighboring pyramidal neurons.
Subject(s)
Neurons/physiology; Temporal Lobe/cytology; Transcriptome; Animals; Brain/cytology; Brain/physiology; Electrophysiology/methods; Gene Expression Profiling; Humans; In Situ Hybridization, Fluorescence; Mice; Neurons/cytology; Pyramidal Cells/physiology; Telencephalon/cytology; Temporal Lobe/physiology

ABSTRACT
Conventional methods for single-cell genome sequencing are limited with respect to uniformity and throughput. Here, we describe sci-L3, a single-cell sequencing method that combines combinatorial indexing (sci-) and linear (L) amplification. The sci-L3 method adopts a 3-level (3) indexing scheme that minimizes amplification biases while enabling exponential gains in throughput. We demonstrate the generalizability of sci-L3 with proof-of-concept demonstrations of single-cell whole-genome sequencing (sci-L3-WGS), targeted sequencing (sci-L3-target-seq), and a co-assay of the genome and transcriptome (sci-L3-RNA/DNA). We apply sci-L3-WGS to profile the genomes of >10,000 sperm and sperm precursors from F1 hybrid mice, mapping 86,786 crossovers and characterizing rare chromosome mis-segregation events in meiosis, including instances of whole-genome equational chromosome segregation. We anticipate that sci-L3 assays can be applied to fully characterize recombination landscapes, to couple CRISPR perturbations and measurements of genome stability, and to other goals requiring high-throughput, high-coverage single-cell sequencing.
Subject(s)
Gene Expression Profiling; High-Throughput Nucleotide Sequencing; Nucleic Acid Amplification Techniques; Sequence Analysis, DNA; Sequence Analysis, RNA; Single-Cell Analysis/methods; Whole Genome Sequencing; Animals; Chromosome Segregation; Male; Meiosis/genetics; Mice; Proof of Concept Study; Spermatozoa/physiology; Transcriptome; Workflow

ABSTRACT
Recent technical advancements have facilitated the mapping of epigenomes at single-cell resolution; however, the throughput and quality of these methods have limited their widespread adoption. Here we describe a high-quality (10^5 nuclear fragments per cell) droplet-microfluidics-based method for single-cell profiling of chromatin accessibility. We use this approach, named 'droplet single-cell assay for transposase-accessible chromatin using sequencing' (dscATAC-seq), to assay 46,653 cells for the unbiased discovery of cell types and regulatory elements in adult mouse brain. We further increase the throughput of this platform by combining it with combinatorial indexing (dsciATAC-seq), enabling single-cell studies at a massive scale. We demonstrate the utility of this approach by measuring chromatin accessibility across 136,463 resting and stimulated human bone marrow-derived cells to reveal changes in the cis- and trans-regulatory landscape across cell types and under stimulatory conditions at single-cell resolution. Altogether, we describe a total of 510,123 single-cell profiles, demonstrating the scalability and flexibility of this droplet-based platform.
Subject(s)
Chromatin/chemistry; Epigenomics/methods; Microfluidics/methods; Single-Cell Analysis/methods; Animals; Brain/cytology; Cell Line; Cell Survival; Chromatin/metabolism; Combinatorial Chemistry Techniques; Deoxyribonucleases/pharmacology; Epigenesis, Genetic/drug effects; Gene Expression Regulation/drug effects; High-Throughput Screening Assays; Humans; Leukocytes, Mononuclear/metabolism; Macrophages/metabolism; Mice

ABSTRACT
Here we present a comprehensive map of the accessible chromatin landscape of the mouse hippocampus at single-cell resolution. Substantial advances of this work include the optimization of a single-cell combinatorial indexing assay for transposase accessible chromatin (sci-ATAC-seq); a software suite, scitools, for the rapid processing and visualization of single-cell combinatorial indexing data sets; and a valuable resource of hippocampal regulatory networks at single-cell resolution. We used sci-ATAC-seq to produce 2346 high-quality single-cell chromatin accessibility maps with a mean unique read count per cell of 29,201 from both fresh and frozen hippocampi, observing little difference in accessibility patterns between the preparations. By using this data set, we identified eight distinct major clusters of cells representing both neuronal and nonneuronal cell types and characterized the driving regulatory factors and differentially accessible loci that define each cluster. Within pyramidal neurons, we identified four major clusters, including CA1 and CA3 neurons, and three additional subclusters. We then applied a recently described coaccessibility framework, Cicero, which identified 146,818 links between promoters and putative distal regulatory DNA. Identified coaccessibility networks showed cell-type specificity, shedding light on key dynamic loci that reconfigure to specify hippocampal cell lineages. Lastly, we performed an additional sci-ATAC-seq preparation from cultured hippocampal neurons (899 high-quality cells, 43,532 mean unique reads) that revealed substantial alterations in their epigenetic landscape compared with nuclei from hippocampal tissue. This data set and accompanying analysis tools provide a new resource that can guide subsequent studies of the hippocampus.
Subject(s)
Chromatin/genetics; Hippocampus/metabolism; Pyramidal Cells/metabolism; Animals; Cell Lineage/genetics; Cell Nucleus/genetics; Cell Nucleus/metabolism; Cells, Cultured; Chromatin/metabolism; Epigenomics/methods; Mice; Neuronal Plasticity/genetics; Pyramidal Cells/cytology; Sequence Analysis, DNA; Single-Cell Analysis/methods; Transposases/genetics; Transposases/metabolism

ABSTRACT
Mammalian organogenesis is a remarkable process. Within a short timeframe, the cells of the three germ layers transform into an embryo that includes most of the major internal and external organs. Here we investigate the transcriptional dynamics of mouse organogenesis at single-cell resolution. Using single-cell combinatorial indexing, we profiled the transcriptomes of around 2 million cells derived from 61 embryos staged between 9.5 and 13.5 days of gestation, in a single experiment. The resulting 'mouse organogenesis cell atlas' (MOCA) provides a global view of developmental processes during this critical window. We use Monocle 3 to identify hundreds of cell types and 56 trajectories, many of which are detected only because of the depth of cellular coverage, and collectively define thousands of corresponding marker genes. We explore the dynamics of gene expression within cell types and trajectories over time, including focused analyses of the apical ectodermal ridge, limb mesenchyme and skeletal muscle.
Subject(s)
Embryo, Mammalian/cytology; Embryo, Mammalian/embryology; Gene Expression Regulation, Developmental/genetics; Organogenesis/genetics; Single-Cell Analysis/methods; Transcriptome; Animals; Ectoderm/cytology; Ectoderm/embryology; Ectoderm/metabolism; Embryo, Mammalian/metabolism; Female; Genetic Markers; Male; Mesoderm/cytology; Mesoderm/embryology; Mesoderm/metabolism; Mice; Muscle Development/genetics; Muscle, Skeletal/cytology; Muscle, Skeletal/embryology; Muscle, Skeletal/metabolism; Organ Specificity/genetics; Sequence Analysis, RNA; Time Factors

ABSTRACT
Although we can increasingly measure transcription, chromatin, methylation, and other aspects of molecular biology at single-cell resolution, most assays survey only one aspect of cellular biology. Here we describe sci-CAR, a combinatorial indexing-based coassay that jointly profiles chromatin accessibility and mRNA (CAR) in each of thousands of single cells. As a proof of concept, we apply sci-CAR to 4825 cells, including a time series of dexamethasone treatment, as well as to 11,296 cells from the adult mouse kidney. With the resulting data, we compare the pseudotemporal dynamics of chromatin accessibility and gene expression, reconstruct the chromatin accessibility profiles of cell types defined by RNA profiles, and link cis-regulatory sites to their target genes on the basis of the covariance of chromatin accessibility and transcription across large numbers of single cells.
Subject(s)
Chromatin/metabolism; Gene Expression Profiling/methods; Gene Expression Regulation; Genomics/methods; Single-Cell Analysis/methods; A549 Cells; Animals; Dexamethasone/pharmacology; Gene Expression Regulation/drug effects; HEK293 Cells; Humans; Kidney/cytology; Kidney/drug effects; Mice; NIH 3T3 Cells; Regulatory Elements, Transcriptional/drug effects; Transcription, Genetic/drug effects

ABSTRACT
We applied a combinatorial indexing assay, sci-ATAC-seq, to profile genome-wide chromatin accessibility in ~100,000 single cells from 13 adult mouse tissues. We identify 85 distinct patterns of chromatin accessibility, most of which can be assigned to cell types, and ~400,000 differentially accessible elements. We use these data to link regulatory elements to their target genes, to define the transcription factor grammar specifying each cell type, and to discover in vivo correlates of heterogeneity in accessibility within cell types. We develop a technique for mapping single-cell gene expression data to single-cell chromatin accessibility data, facilitating the comparison of atlases. By intersecting mouse chromatin accessibility with human genome-wide association summary statistics, we identify cell-type-specific enrichments of the heritability signal for hundreds of complex traits. These data define the in vivo landscape of the regulatory genome for common mammalian cell types at single-cell resolution.
Subject(s)
Chromatin/chemistry; Single-Cell Analysis/methods; Animals; Cluster Analysis; Epigenesis, Genetic; Epigenomics; Gene Expression Regulation; Genome, Human; Genome-Wide Association Study; Humans; Male; Mammals; Mice; Mice, Inbred C57BL; Transcription Factors

ABSTRACT
Linking regulatory DNA elements to their target genes, which may be located hundreds of kilobases away, remains challenging. Here, we introduce Cicero, an algorithm that identifies co-accessible pairs of DNA elements using single-cell chromatin accessibility data and so connects regulatory elements to their putative target genes. We apply Cicero to investigate how dynamically accessible elements orchestrate gene regulation in differentiating myoblasts. Groups of Cicero-linked regulatory elements meet criteria of "chromatin hubs": they are enriched for physical proximity, interact with a common set of transcription factors, and undergo coordinated changes in histone marks that are predictive of changes in gene expression. Pseudotemporal analysis revealed that most DNA elements remain in chromatin hubs throughout differentiation. A subset of elements bound by MYOD1 in myoblasts exhibit early opening in a PBX1- and MEIS1-dependent manner. Our strategy can be applied to dissect the architecture, sequence determinants, and mechanisms of cis-regulation on a genome-wide scale.
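The idea of scoring co-accessible pairs of elements can be illustrated with a deliberately simplified Python sketch. It is not the Cicero algorithm, whose published method is more involved. Plain Pearson correlation across cells is used here as a stand-in score, and the 500 kb distance window and the 0.5 score threshold are hypothetical parameters chosen only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no co-variation signal
    return cov / sqrt(var_x * var_y)


def coaccessible_pairs(peaks, accessibility, max_distance=500_000, min_score=0.5):
    """Return (peak_a, peak_b, score) for nearby, correlated peak pairs.

    peaks: list of (name, midpoint) tuples on a single chromosome.
    accessibility: dict mapping peak name to per-cell accessibility values.
    max_distance and min_score are hypothetical illustration parameters.
    """
    links = []
    for i in range(len(peaks)):
        for j in range(i + 1, len(peaks)):
            (name_a, pos_a), (name_b, pos_b) = peaks[i], peaks[j]
            if abs(pos_a - pos_b) > max_distance:
                continue  # only consider pairs within the distance window
            score = pearson(accessibility[name_a], accessibility[name_b])
            if score >= min_score:
                links.append((name_a, name_b, score))
    return links


if __name__ == "__main__":
    toy_peaks = [("promoter", 1_000), ("distal", 50_000), ("far", 2_000_000)]
    toy_cells = {
        "promoter": [1, 0, 1, 0],
        "distal": [1, 0, 1, 0],
        "far": [1, 0, 1, 0],
    }
    for link in coaccessible_pairs(toy_peaks, toy_cells):
        print(link)
```

Raw single-cell accessibility is sparse and mostly binary, so a naive correlation like this is noisy on real data; the sketch only conveys the shape of the computation.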
Subject(s)
Chromatin Assembly and Disassembly/genetics; Chromatin/genetics; DNA/genetics; Enhancer Elements, Genetic/genetics; Gene Expression Regulation/genetics; Adolescent; Cell Differentiation/genetics; Female; Genes, Homeobox/genetics; Histones/genetics; Humans; Myoblasts/physiology; Transcription Factors/genetics

ABSTRACT
We describe convergent evidence from transcriptomics, morphology, and physiology for a specialized GABAergic neuron subtype in human cortex. Using unbiased single-nucleus RNA sequencing, we identify ten GABAergic interneuron subtypes with combinatorial gene signatures in human cortical layer 1 and characterize a group of human interneurons with anatomical features never described in rodents, having large 'rosehip'-like axonal boutons and compact arborization. These rosehip cells show an immunohistochemical profile (GAD1+CCK+, CNR1-SST-CALB2-PVALB-) matching a single transcriptomically defined cell type whose specific molecular marker signature is not seen in mouse cortex. Rosehip cells in layer 1 make homotypic gap junctions, predominantly target apical dendritic shafts of layer 3 pyramidal neurons, and inhibit backpropagating pyramidal action potentials in microdomains of the dendritic tuft. These cells are therefore positioned for potent local control of distal dendritic computation in cortical pyramidal neurons.
Subject(s)
Cerebral Cortex/metabolism; Cerebral Cortex/ultrastructure; GABAergic Neurons/metabolism; GABAergic Neurons/ultrastructure; Transcriptome; Adult; Aged; Axons/ultrastructure; Dendritic Spines/metabolism; Dendritic Spines/ultrastructure; Gap Junctions/metabolism; Gap Junctions/ultrastructure; Gene Library; Humans; Male; Polymerase Chain Reaction; Presynaptic Terminals/metabolism; Presynaptic Terminals/ultrastructure; Pyramidal Cells/metabolism; Pyramidal Cells/ultrastructure; RNA/analysis; RNA/genetics; Sequence Analysis, RNA

ABSTRACT
We present a highly scalable assay for whole-genome methylation profiling of single cells. We use our approach, single-cell combinatorial indexing for methylation analysis (sci-MET), to produce 3,282 single-cell bisulfite sequencing libraries and achieve read alignment rates of 68 ± 8%. We apply sci-MET to discriminate the cellular identity of a mixture of three human cell lines and to identify excitatory and inhibitory neuronal populations from mouse cortical tissue.
Subject(s)
DNA Methylation/genetics; Sequence Alignment/methods; Single-Cell Analysis/methods; Animals; Humans; Mice; Sequence Analysis, DNA/methods

ABSTRACT
Understanding how gene regulatory networks control the progressive restriction of cell fates is a long-standing challenge. Recent advances in measuring gene expression in single cells are providing new insights into lineage commitment. However, the regulatory events underlying these changes remain unclear. Here we investigate the dynamics of chromatin regulatory landscapes during embryogenesis at single-cell resolution. Using single-cell combinatorial indexing assay for transposase accessible chromatin with sequencing (sci-ATAC-seq), we profiled chromatin accessibility in over 20,000 single nuclei from fixed Drosophila melanogaster embryos spanning three landmark embryonic stages: 2-4 h after egg laying (predominantly stage 5 blastoderm nuclei), when each embryo comprises around 6,000 multipotent cells; 6-8 h after egg laying (predominantly stage 10-11), to capture a midpoint in embryonic development when major lineages in the mesoderm and ectoderm are specified; and 10-12 h after egg laying (predominantly stage 13), when each of the embryo's more than 20,000 cells are undergoing terminal differentiation. Our results show that there is spatial heterogeneity in the accessibility of the regulatory genome before gastrulation, a feature that aligns with future cell fate, and that nuclei can be temporally ordered along developmental trajectories. During mid-embryogenesis, tissue granularity emerges such that individual cell types can be inferred by their chromatin accessibility while maintaining a signature of their germ layer of origin. Analysis of the data reveals overlapping usage of regulatory elements between cells of the endoderm and non-myogenic mesoderm, suggesting a common developmental program that is reminiscent of the mesendoderm lineage in other species. We identify 30,075 distal regulatory elements that exhibit tissue-specific accessibility. 
We validated the germ-layer specificity of a subset of these predicted enhancers in transgenic embryos, achieving an accuracy of 90%. Overall, our results demonstrate the power of shotgun single-cell profiling of embryos to resolve dynamic changes in the chromatin landscape during development, and to uncover the cis-regulatory programs of metazoan germ layers and cell types.
Subject(s)
Drosophila melanogaster/cytology; Drosophila melanogaster/embryology; Embryonic Development/genetics; Gene Expression Regulation, Developmental; Single-Cell Analysis; Animals; Cell Differentiation/genetics; Cell Lineage/genetics; Chromatin/genetics; Chromatin/metabolism; Drosophila melanogaster/genetics; Endoderm/cytology; Endoderm/metabolism; Enhancer Elements, Genetic/genetics; Female; Gastrulation/genetics; Genome, Insect/genetics; Male; Mesoderm/cytology; Mesoderm/metabolism; Organ Specificity/genetics; Organisms, Genetically Modified/cytology; Organisms, Genetically Modified/genetics; Reproducibility of Results

ABSTRACT
To resolve cellular heterogeneity, we developed a combinatorial indexing strategy to profile the transcriptomes of single cells or nuclei, termed sci-RNA-seq (single-cell combinatorial indexing RNA sequencing). We applied sci-RNA-seq to profile nearly 50,000 cells from the nematode Caenorhabditis elegans at the L2 larval stage, which provided >50-fold "shotgun" cellular coverage of its somatic cell composition. From these data, we defined consensus expression profiles for 27 cell types and recovered rare neuronal cell types corresponding to as few as one or two cells in the L2 worm. We integrated these profiles with whole-animal chromatin immunoprecipitation sequencing data to deconvolve the cell type-specific effects of transcription factors. The data generated by sci-RNA-seq constitute a powerful resource for nematode biology and foreshadow similar atlases for other organisms.
Subject(s)
Caenorhabditis elegans/cytology; Caenorhabditis elegans/genetics; Cell Nucleus/genetics; Single-Cell Analysis/methods; Transcriptome; Animals; Caenorhabditis elegans/growth & development; Chromatin Immunoprecipitation; HEK293 Cells; Humans; Larva/genetics; Mice; NIH 3T3 Cells; Neurons/metabolism; RNA/genetics; Sequence Analysis, RNA; Transcription Factors/genetics

ABSTRACT
Haplotype-resolved genome sequencing promises to unlock a wealth of information in population and medical genetics. However, for the vast majority of genomes sequenced to date, haplotypes have not been determined because of cumbersome haplotyping workflows that require fractions of the genome to be sequenced in a large number of compartments. Here we demonstrate barcode partitioning of long DNA molecules in a single compartment using "on-bead" barcoded tagmentation. The key to the method that we call "contiguity preserving transposition" sequencing on beads (CPTv2-seq) is transposon-mediated transfer of homogenous populations of barcodes from beads to individual long DNA molecules that get fragmented at the same time (tagmentation). These are then processed to sequencing libraries wherein all sequencing reads originating from each long DNA molecule share a common barcode. Single-tube, bulk processing of long DNA molecules with ~150,000 different barcoded bead types provides a barcode-linked read structure that reveals long-range molecular contiguity. This technology provides a simple, rapid, plate-scalable and automatable route to accurate, haplotype-resolved sequencing, and phasing of structural variants of the genome.
Subject(s)
DNA Barcoding, Taxonomic/methods; Genome, Human/genetics; Genomics/methods; Haplotypes/genetics; High-Throughput Nucleotide Sequencing/methods; Humans

ABSTRACT
Most genomes to date have been sequenced without taking into account the diploid nature of the genome. However, the distribution of variants on each individual chromosome can (1) significantly impact gene regulation and protein function, (2) have important implications for analyses of population history and medical genetics, and (3) be of great value for accurate interpretation of medically relevant genetic variation. Here, we describe a comprehensive and detailed protocol for an ultrafast (<3 h library preparation), cost-effective, and scalable haplotyping method, named Contiguity Preserving Transposition sequencing or CPT-seq (Amini et al., Nat Genet 46(12):1343-1349, 2014). CPT-seq accurately phases >95% of the whole human genome in Mb-scale phasing blocks. Additionally, the same workflow can be used to aid de novo assembly (Adey et al., Genome Res 24(12):2041-2049, 2014), detect structural variants, and perform single-cell ATAC-seq analysis (Cusanovich et al., Science 348(6237):910-914, 2015).