ABSTRACT
Non-coding genetic variation is a major driver of phenotypic diversity and allows the investigation of mechanisms that control gene expression. Here, we systematically investigated the effects of >50 million variations from five strains of mice on mRNA, nascent transcription, transcription start sites, and transcription factor binding in resting and activated macrophages. We observed substantial differences associated with distinct molecular pathways. Evaluating genetic variation provided evidence for roles of ~100 TFs in shaping lineage-determining factor binding. Unexpectedly, a substantial fraction of strain-specific factor binding could not be explained by local mutations. Integration of genomic features with chromatin interaction data provided evidence for hundreds of connected cis-regulatory domains associated with differences in transcription factor binding and gene expression. This system and the >250 datasets establish a substantial new resource for investigation of how genetic variation affects cellular phenotypes.
Subject(s)
Genetic Variation , Macrophages/metabolism , Transcription Factors/metabolism , Animals , Binding Sites , Bone Marrow Cells/cytology , CCAAT-Enhancer-Binding Protein-beta/genetics , CCAAT-Enhancer-Binding Protein-beta/metabolism , Cluster Analysis , Enhancer Elements, Genetic/genetics , Female , Gene Expression Regulation/drug effects , Lipopolysaccharides/pharmacology , Macrophages/cytology , Macrophages/drug effects , Male , Mice , Mice, Inbred BALB C , Mice, Inbred C57BL , Mice, Inbred NOD , Promoter Regions, Genetic , Protein Binding , Proto-Oncogene Proteins/genetics , Proto-Oncogene Proteins/metabolism , Trans-Activators/genetics , Trans-Activators/metabolism , Transcription Factors/genetics
ABSTRACT
Focal chromosomal amplification contributes to the initiation of cancer by mediating overexpression of oncogenes1-3, and to the development of cancer therapy resistance by increasing the expression of genes whose action diminishes the efficacy of anti-cancer drugs. Here we used whole-genome sequencing of clonal cell isolates that developed chemotherapeutic resistance to show that chromothripsis is a major driver of circular extrachromosomal DNA (ecDNA) amplification (also known as double minutes) through mechanisms that depend on poly(ADP-ribose) polymerases (PARP) and the catalytic subunit of DNA-dependent protein kinase (DNA-PKcs). Longitudinal analyses revealed that a further increase in drug tolerance is achieved by structural evolution of ecDNAs through additional rounds of chromothripsis. In situ Hi-C sequencing showed that ecDNAs preferentially tether near chromosome ends, where they re-integrate when DNA damage is present. Intrachromosomal amplifications that formed initially under low-level drug selection underwent continuing breakage-fusion-bridge cycles, generating amplicons more than 100 megabases in length that became trapped within interphase bridges and then shattered, thereby producing micronuclei whose encapsulated ecDNAs are substrates for chromothripsis. We identified similar genome rearrangement profiles linked to localized gene amplification in human cancers with acquired drug resistance or oncogene amplifications. We propose that chromothripsis is a primary mechanism that accelerates genomic DNA rearrangement and amplification into ecDNA and enables rapid acquisition of tolerance to altered growth conditions.
Subject(s)
Chromothripsis , Evolution, Molecular , Gene Amplification/genetics , Neoplasms/genetics , Oncogenes/genetics , DNA Damage , DNA End-Joining Repair , DNA, Circular/chemistry , DNA, Circular/metabolism , DNA, Neoplasm/chemistry , DNA, Neoplasm/metabolism , DNA-Activated Protein Kinase , Drug Resistance, Neoplasm , HEK293 Cells , HeLa Cells , Humans , Micronuclei, Chromosome-Defective , Neoplasms/drug therapy , Neoplasms/enzymology , Neoplasms/pathology , Poly(ADP-ribose) Polymerases/metabolism , Selection, Genetic , Whole Genome Sequencing
ABSTRACT
The mammalian cerebrum performs high-level sensory perception, motor control and cognitive functions through highly specialized cortical and subcortical structures1. Recent surveys of mouse and human brains with single-cell transcriptomics2-6 and high-throughput imaging technologies7,8 have uncovered hundreds of neural cell types distributed in different brain regions, but the transcriptional regulatory programs that are responsible for the unique identity and function of each cell type remain unknown. Here we probe the accessible chromatin in more than 800,000 individual nuclei from 45 regions that span the adult mouse isocortex, olfactory bulb, hippocampus and cerebral nuclei, and use the resulting data to map the state of 491,818 candidate cis-regulatory DNA elements in 160 distinct cell types. We find high specificity of spatial distribution for not only excitatory neurons, but also most classes of inhibitory neurons and a subset of glial cell types. We characterize the gene regulatory sequences associated with the regional specificity within these cell types. We further link a considerable fraction of the cis-regulatory elements to putative target genes expressed in diverse cerebral cell types and predict transcriptional regulators that are involved in a broad spectrum of molecular and cellular pathways in different neuronal and glial cell populations. Our results provide a foundation for comprehensive analysis of gene regulatory programs of the mammalian brain and assist in the interpretation of noncoding risk variants associated with various neurological diseases and traits in humans.
Subject(s)
Cerebrum/cytology , Cerebrum/metabolism , Regulatory Sequences, Nucleic Acid/genetics , Animals , Atlases as Topic , Chromatin/chemistry , Chromatin/genetics , Chromatin/metabolism , Chromatin Assembly and Disassembly , Gene Expression Regulation , Genetic Predisposition to Disease/genetics , Humans , Male , Mice , Mice, Inbred C57BL , Nervous System Diseases/genetics , Neuroglia/classification , Neuroglia/metabolism , Neurons/classification , Neurons/metabolism , Sequence Analysis, DNA , Single-Cell Analysis
ABSTRACT
The primary motor cortex (M1) is essential for voluntary fine-motor control and is functionally conserved across mammals1. Here, using high-throughput transcriptomic and epigenomic profiling of more than 450,000 single nuclei in humans, marmoset monkeys and mice, we demonstrate a broadly conserved cellular makeup of this region, with similarities that mirror evolutionary distance and are consistent between the transcriptome and epigenome. The core conserved molecular identities of neuronal and non-neuronal cell types allow us to generate a cross-species consensus classification of cell types, and to infer conserved properties of cell types across species. Despite the overall conservation, however, many species-dependent specializations are apparent, including differences in cell-type proportions, gene expression, DNA methylation and chromatin state. Few cell-type marker genes are conserved across species, revealing a short list of candidate genes and regulatory mechanisms that are responsible for conserved features of homologous cell types, such as the GABAergic chandelier cells. This consensus transcriptomic classification allows us to use patch-seq (a combination of whole-cell patch-clamp recordings, RNA sequencing and morphological characterization) to identify corticospinal Betz cells from layer 5 in non-human primates and humans, and to characterize their highly specialized physiology and anatomy. These findings highlight the robust molecular underpinnings of cell-type diversity in M1 across mammals, and point to the genes and regulatory pathways responsible for the functional identity of cell types and their species-specific adaptations.
Subject(s)
Motor Cortex/cytology , Neurons/classification , Single-Cell Analysis , Animals , Atlases as Topic , Callithrix/genetics , Epigenesis, Genetic , Epigenomics , Female , GABAergic Neurons/cytology , GABAergic Neurons/metabolism , Gene Expression Profiling , Glutamates/metabolism , Humans , In Situ Hybridization, Fluorescence , Male , Mice , Middle Aged , Motor Cortex/anatomy & histology , Neurons/cytology , Neurons/metabolism , Organ Specificity , Phylogeny , Species Specificity , Transcriptome
ABSTRACT
Single-cell transcriptomics can provide quantitative molecular signatures for large, unbiased samples of the diverse cell types in the brain1-3. With the proliferation of multi-omics datasets, a major challenge is to validate and integrate results into a biological understanding of cell-type organization. Here we generated transcriptomes and epigenomes from more than 500,000 individual cells in the mouse primary motor cortex, a structure that has an evolutionarily conserved role in locomotion. We developed computational and statistical methods to integrate multimodal data and quantitatively validate cell-type reproducibility. The resulting reference atlas, containing over 56 neuronal cell types that are highly replicable across analysis methods, sequencing technologies and modalities, is a comprehensive molecular and genomic account of the diverse neuronal and non-neuronal cell types in the mouse primary motor cortex. The atlas includes a population of excitatory neurons that resemble pyramidal cells in layer 4 in other cortical regions4. We further discovered thousands of concordant marker genes and gene regulatory elements for these cell types. Our results highlight the complex molecular regulation of cell types in the brain and will directly enable the design of reagents to target specific cell types in the mouse primary motor cortex for functional analysis.
Subject(s)
Epigenomics , Gene Expression Profiling , Motor Cortex/cytology , Neurons/classification , Single-Cell Analysis , Transcriptome , Animals , Atlases as Topic , Datasets as Topic , Epigenesis, Genetic , Female , Male , Mice , Motor Cortex/anatomy & histology , Neurons/cytology , Neurons/metabolism , Organ Specificity , Reproducibility of Results
ABSTRACT
Cytosine DNA methylation is essential for mammalian development but understanding of its spatiotemporal distribution in the developing embryo remains limited1,2. Here, as part of the mouse Encyclopedia of DNA Elements (ENCODE) project, we profiled 168 methylomes from 12 mouse tissues or organs at 9 developmental stages from embryogenesis to adulthood. We identified 1,808,810 genomic regions that showed variations in CG methylation by comparing the methylomes of different tissues or organs from different developmental stages. These DNA elements predominantly lose CG methylation during fetal development, whereas the trend is reversed after birth. During late stages of fetal development, non-CG methylation accumulated within the bodies of key developmental transcription factor genes, coinciding with their transcriptional repression. Integration of genome-wide DNA methylation, histone modification and chromatin accessibility data enabled us to predict 461,141 putative developmental tissue-specific enhancers, the human orthologues of which were enriched for disease-associated genetic variants. These spatiotemporal epigenome maps provide a resource for studies of gene regulation during tissue or organ progression, and a starting point for investigating regulatory elements that are involved in human developmental disorders.
Subject(s)
DNA Methylation , Epigenome , Fetus/embryology , Fetus/metabolism , Animals , Animals, Newborn , Chromatin/genetics , Chromatin/metabolism , Disease/genetics , Down-Regulation , Enhancer Elements, Genetic/genetics , Epigenetic Repression , Female , Gene Silencing , Humans , Mice , Mice, Inbred C57BL , Models, Animal , Molecular Sequence Annotation , Polymorphism, Single Nucleotide , Spatio-Temporal Analysis
ABSTRACT
In 2021, the World Health Organization reclassified glioblastoma, the most common form of adult brain cancer, into isocitrate dehydrogenase (IDH)-wild-type glioblastomas and grade IV IDH mutant (G4 IDHm) astrocytomas. For both tumor types, intratumoral heterogeneity is a key contributor to therapeutic failure. To better define this heterogeneity, genome-wide chromatin accessibility and transcription profiles of clinical samples of glioblastomas and G4 IDHm astrocytomas were analyzed at single-cell resolution. These profiles afforded resolution of intratumoral genetic heterogeneity, including delineation of cell-to-cell variations in distinct cell states, focal gene amplifications, and extrachromosomal circular DNAs. Despite differences in IDH mutation status and significant intratumoral heterogeneity, the profiled tumor cells shared a common chromatin structure defined by open regions enriched for nuclear factor 1 transcription factors (NFIA and NFIB). Silencing of NFIA or NFIB suppressed the in vitro and in vivo growth of patient-derived glioblastoma and G4 IDHm astrocytoma models. These findings suggest that despite distinct genotypes and cell states, glioblastoma/G4 astrocytoma cells share a dependency on core transcriptional programs, yielding an attractive platform for addressing therapeutic challenges associated with intratumoral heterogeneity.
Subject(s)
Astrocytoma , Brain Neoplasms , Glioblastoma , Adult , Humans , Glioblastoma/genetics , Glioblastoma/pathology , Chromatin/genetics , Transcriptome , Astrocytoma/genetics , Astrocytoma/pathology , Brain Neoplasms/genetics , Brain Neoplasms/pathology , Mutation , Isocitrate Dehydrogenase/genetics , Isocitrate Dehydrogenase/metabolism
ABSTRACT
Single-cell Hi-C (scHi-C) analysis has been increasingly used to map chromatin architecture in diverse tissue contexts, but computational tools to define chromatin loops at high resolution from scHi-C data are still lacking. Here, we describe Single-Nucleus Analysis Pipeline for Hi-C (SnapHiC), a method that can identify chromatin loops at high resolution and accuracy from scHi-C data. Using scHi-C data from 742 mouse embryonic stem cells, we benchmark SnapHiC against a number of computational tools developed for mapping chromatin loops and interactions from bulk Hi-C. We further demonstrate its use by analyzing single-nucleus methyl-3C-seq data from 2,869 human prefrontal cortical cells, which uncovers cell type-specific chromatin loops and predicts putative target genes for noncoding sequence variants associated with neuropsychiatric disorders. Our results indicate that SnapHiC could facilitate the analysis of cell type-specific chromatin architecture and gene regulatory programs in complex tissues.
Subject(s)
Chromatin/chemistry , Computational Biology/methods , Single-Cell Analysis/methods , Algorithms , Animals , Chromatin/genetics , Chromatin Immunoprecipitation Sequencing , Data Visualization , Databases, Factual , Gene Expression , Humans , Mental Disorders/genetics , Mice , Mouse Embryonic Stem Cells/cytology , Mouse Embryonic Stem Cells/physiology , Polymorphism, Single Nucleotide , Prefrontal Cortex/cytology , Reproducibility of Results , Sequence Analysis, DNA/methods
ABSTRACT
We report a molecular assay, Methyl-HiC, that can simultaneously capture the chromosome conformation and DNA methylome in a cell. Methyl-HiC reveals coordinated DNA methylation status between distal genomic segments that are in spatial proximity in the nucleus, and delineates heterogeneity of both the chromatin architecture and DNA methylome in a mixed population. It enables simultaneous characterization of cell-type-specific chromatin organization and epigenome in complex tissues.
Subject(s)
Chromatin/metabolism , DNA Methylation , Single-Cell Analysis/methods , Animals , CpG Islands , Datasets as Topic , Humans , Mice , Mouse Embryonic Stem Cells/cytology , Mouse Embryonic Stem Cells/metabolism
ABSTRACT
Millions of cis-regulatory elements are predicted to be present in the human genome, but direct evidence for their biological function is scarce. Here we report a high-throughput method, cis-regulatory element scan by tiling-deletion and sequencing (CREST-seq), for the unbiased discovery and functional assessment of cis-regulatory sequences in the genome. We used it to interrogate the 2-Mb POU5F1 locus in human embryonic stem cells, and identified 45 cis-regulatory elements. A majority of these elements have active chromatin marks, DNase hypersensitivity, and occupancy by multiple transcription factors, which confirms the utility of chromatin signatures in cis-element mapping. Notably, 17 of them are previously annotated promoters of functionally unrelated genes, and like typical enhancers, they form extensive spatial contacts with the POU5F1 promoter. These results point to the commonality of enhancer-like promoters in the human genome.
Subject(s)
Chromosome Mapping/methods , Genetic Testing/methods , Regulatory Sequences, Nucleic Acid/genetics , Algorithms , Cells, Cultured , Embryonic Stem Cells/physiology , Gene Expression Regulation/genetics , High-Throughput Nucleotide Sequencing , Humans , Sequence Analysis, DNA , Single-Cell Analysis
ABSTRACT
Hi-C and chromatin immunoprecipitation (ChIP) have been combined to identify long-range chromatin interactions genome-wide at reduced cost and enhanced resolution, but extracting information from the resulting datasets has been challenging. Here we describe a computational method, MAPS (Model-based Analysis of PLAC-seq and HiChIP), to process the data from such experiments and identify long-range chromatin interactions. MAPS adopts a zero-truncated Poisson regression framework to explicitly remove systematic biases in the PLAC-seq and HiChIP datasets, and then uses the normalized chromatin contact frequencies to identify significant chromatin interactions anchored at genomic regions bound by the protein of interest. MAPS shows superior performance over existing software tools in the analysis of chromatin interactions from multiple PLAC-seq and HiChIP datasets centered on different transcription factors and histone marks. MAPS is freely available at https://github.com/ijuric/MAPS.
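To illustrate what "zero-truncated Poisson" means in the abstract above, the following is a minimal sketch and not the MAPS implementation. It assumes an intercept-only model with a single rate `lam`; a full regression would additionally model the rate as a function of covariates. The function names (`ztp_logpmf`, `ztp_mean`, `fit_ztp_rate`) are hypothetical and chosen only for this example.

```python
import math


def ztp_logpmf(k, lam):
    """Log-probability of a count k >= 1 under a zero-truncated Poisson with rate lam."""
    if k < 1:
        raise ValueError("a zero-truncated Poisson is only defined for k >= 1")
    # Ordinary Poisson log-pmf minus log P(Y > 0) = log(1 - exp(-lam)).
    return k * math.log(lam) - lam - math.lgamma(k + 1) - math.log(-math.expm1(-lam))


def ztp_mean(lam):
    """Expected value of the zero-truncated Poisson: lam / (1 - exp(-lam))."""
    return lam / -math.expm1(-lam)


def fit_ztp_rate(counts, tol=1e-10):
    """Maximum-likelihood rate for an intercept-only model.

    The score equation reduces to ztp_mean(lam) == sample mean, which is
    solved here by bisection (assumption: all counts are >= 1).
    """
    mean = sum(counts) / len(counts)
    if mean <= 1.0:
        raise ValueError("the sample mean must exceed 1 for a positive rate estimate")
    lo, hi = 1e-12, mean  # ztp_mean(lam) > lam, so the root lies below the sample mean
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ztp_mean(mid) < mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Because zero counts are excluded by construction, the fitted rate is always smaller than the raw sample mean of the observed (non-zero) counts.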
Subject(s)
Chromatin Assembly and Disassembly/physiology , Chromosome Mapping/methods , Computational Biology/methods , Chromatin/metabolism , Chromatin/physiology , Chromatin Immunoprecipitation/methods , Computer Simulation , Genome , Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Histone Code , Humans , Sequence Analysis, DNA/methods , Software
ABSTRACT
With sparse and uneven site distribution, Global Positioning System (GPS) data can only barely resolve the low-degree coefficients of the surface mass field. The unresolved higher-degree coefficients introduce aliasing errors into the estimates of the low-degree coefficients. To reduce these aliasing errors, an optimal truncation degree should be employed. Using surface displacements simulated from loading models, we theoretically prove that the optimal truncation degree is 6-7 for a GPS-only inversion and 20 for a combination of GPS and Ocean Bottom Pressure (OBP) data with no additional regularization. The optimal truncation degree should be decreased to 4-5 for real GPS data. Additionally, we prove that a Scaled Sensitivity Matrix (SSM) approach can be used to quantify the aliasing errors due to any one or any combination of unresolved higher degrees, which helps to identify the major error source among all the unresolved higher degrees. Results show that the unresolved degrees below degree 20 are the major error source for global inversion. We also theoretically prove that the SSM approach can be used to mitigate the aliasing errors in a GPS inversion if the neglected higher degrees are well known from other sources.
ABSTRACT
To satisfy the requirements of high-rate, high-precision applications, 1 Hz BeiDou Navigation Satellite System (BDS) satellite clock corrections are generated based on precise orbit products, and the quality of the generated clock products is assessed by comparison with those from other analysis centers. The comparisons show that the root mean square (RMS) of clock errors for geostationary Earth orbit (GEO) satellites is about 0.63 ns, whereas those for inclined geosynchronous orbit (IGSO) and medium Earth orbit (MEO) satellites are about 0.2-0.3 ns and 0.1 ns, respectively. The 1 Hz clock products are then used for BDS precise point positioning (PPP) to retrieve seismic displacements of the 2015 Mw 7.8 Gorkha, Nepal, earthquake. The seismic displacements derived from BDS PPP are consistent with those from Global Positioning System (GPS) PPP, with RMS of 0.29, 0.38, and 1.08 cm in the east, north, and vertical components, respectively. In addition, BDS PPP solutions with clock intervals of 1 s, 5 s, 30 s, and 300 s are processed and compared with each other. The results demonstrate that PPP with a 300 s clock interval performs worst and PPP with a 1 s clock interval performs best. With a 5 s clock interval, the precision of the PPP solutions is almost the same as that of the 1 s results. Considering the time required for clock estimation, we suggest that a 5 s clock interval is adequate for high-rate BDS solutions.
ABSTRACT
BACKGROUND: The CCCTC-binding factor (CTCF) has diverse regulatory functions. However, the definitive characteristics of the CTCF binding motif required for its functional diversity remain elusive. RESULTS: Here, we describe a new motif discovery workflow by which we have identified three CTCF binding motif variations with highly divergent functionalities. Supported by transcriptomic, epigenomic and chromatin-interactomic data, we show that the functional diversity of the CTCF binding motifs is strongly associated with their GC content, CpG dinucleotide coverage and relative DNA methylation level at the 12th position of the motifs. Further analysis suggested that the co-localization of cohesin, the key factor in cohesion of sister chromatids, is negatively correlated with the CpG coverage and the relative DNA methylation level at the 12th position. Finally, we present evidence for a hypothetical model in which chromatin interactions between promoters and distal regulatory regions are likely mediated by CTCF binding to sequences with high CpG content. CONCLUSION: These results demonstrate the existence of definitive CTCF binding motifs corresponding to CTCF's diverse functions, and that the functional diversity of the motifs is strongly associated with genetic and epigenetic features at the 12th position of the motifs.
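The two sequence features named in the abstract above, GC content and CpG dinucleotide coverage, can be computed for a single sequence roughly as follows. This is a minimal sketch under stated assumptions: "CpG coverage" is taken here to mean the fraction of bases that fall inside a CG dinucleotide, which may differ from the definition used in the study, and the function names are hypothetical.

```python
def gc_content(seq):
    """Fraction of bases in seq that are G or C."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def cpg_coverage(seq):
    """Fraction of bases that lie within a CG dinucleotide (assumed definition)."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    covered = set()
    for i in range(len(seq) - 1):
        if seq[i:i + 2] == "CG":
            # Mark both positions of the dinucleotide as covered.
            covered.update((i, i + 1))
    return len(covered) / len(seq)
```

For an arbitrary example string such as `"ACGT"`, both functions return 0.5: two of the four bases are G or C, and the same two bases form the only CG dinucleotide.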
Subject(s)
Nucleotide Motifs/genetics , Repressor Proteins/genetics , Base Sequence , CCCTC-Binding Factor , CpG Islands/genetics , DNA Methylation/genetics , Gene Expression Regulation , Genetic Variation , Genome, Human , HeLa Cells , Humans , K562 Cells , Molecular Sequence Data , Protein Binding , Regulatory Sequences, Nucleic Acid/genetics
ABSTRACT
The human cerebral cortex has tremendous cellular diversity. How different cell types are organized in the human cortex and how cellular organization varies across species remain unclear. In this study, we performed spatially resolved single-cell profiling of 4000 genes using multiplexed error-robust fluorescence in situ hybridization (MERFISH), identified more than 100 transcriptionally distinct cell populations, and generated a molecularly defined and spatially resolved cell atlas of the human middle and superior temporal gyrus. We further explored cell-cell interactions arising from soma contact or proximity in a cell type-specific manner. Comparison of the human and mouse cortices showed conservation in the laminar organization of cells and differences in somatic interactions across species. Our data revealed human-specific cell-cell proximity patterns and a markedly increased enrichment for interactions between neurons and non-neuronal cells in the human cortex.
Subject(s)
Cerebral Cortex , Gene Expression Profiling , Neurons , Single-Cell Analysis , Animals , Cell Communication , Cerebral Cortex/cytology , Cerebral Cortex/metabolism , Humans , In Situ Hybridization, Fluorescence/methods , Mice , Neurons/cytology , Neurons/metabolism , Single-Cell Analysis/methods
ABSTRACT
Single-cell technologies measure unique cellular signatures but are typically limited to a single modality. Computational approaches allow the fusion of diverse single-cell data types, but their efficacy is difficult to validate in the absence of authentic multi-omic measurements. To comprehensively assess the molecular phenotypes of single cells, we devised single-nucleus methylcytosine, chromatin accessibility, and transcriptome sequencing (snmCAT-seq) and applied it to postmortem human frontal cortex tissue. We developed a cross-validation approach using multi-modal information to validate fine-grained cell types and assessed the effectiveness of computational data fusion methods. Correlation analysis in individual cells revealed distinct relations between methylation and gene expression. Our integrative approach enabled joint analyses of the methylome, transcriptome, chromatin accessibility, and conformation for 63 human cortical cell types. We reconstructed regulatory lineages for cortical cell populations and found specific enrichment of genetic risk for neuropsychiatric traits, enabling the prediction of cell types that are associated with diseases.
ABSTRACT
Single-nucleus assay for transposase-accessible chromatin using sequencing (snATAC-seq) creates new opportunities to dissect cell type-specific mechanisms of complex diseases. Since pancreatic islets are central to type 2 diabetes (T2D), we profiled 15,298 islet cells by using combinatorial barcoding snATAC-seq and identified 12 clusters, including multiple alpha, beta and delta cell states. We cataloged 228,873 accessible chromatin sites and identified transcription factors underlying lineage- and state-specific regulation. We observed state-specific enrichment of fasting glucose and T2D genome-wide association studies for beta cells and enrichment for other endocrine cell types. At T2D signals localized to islet-accessible chromatin, we prioritized variants with predicted regulatory function and co-accessibility with target genes. A causal T2D variant, rs231361, at the KCNQ1 locus had predicted effects on a beta cell enhancer co-accessible with INS, and genome editing in embryonic stem cell-derived beta cells affected INS levels. Together, our findings demonstrate the power of single-cell epigenomics for interpreting complex disease genetics.
Subject(s)
Chromatin/chemistry , Diabetes Mellitus, Type 2/genetics , Glucagon-Secreting Cells/metabolism , Insulin-Secreting Cells/metabolism , KCNQ1 Potassium Channel/genetics , Pancreatic Polypeptide-Secreting Cells/metabolism , Somatostatin-Secreting Cells/metabolism , Blood Glucose/metabolism , Cell Differentiation , Chromatin/metabolism , Diabetes Mellitus, Type 2/metabolism , Diabetes Mellitus, Type 2/pathology , Epigenomics , Fasting , Gene Expression Profiling , Genome-Wide Association Study , Glucagon-Secreting Cells/pathology , High-Throughput Nucleotide Sequencing , Human Embryonic Stem Cells/cytology , Humans , Insulin-Secreting Cells/pathology , KCNQ1 Potassium Channel/metabolism , Multigene Family , Pancreatic Polypeptide-Secreting Cells/pathology , Polymorphism, Genetic , Single-Cell Analysis , Somatostatin-Secreting Cells/pathology , Transcription Factors/classification , Transcription Factors/genetics , Transcription Factors/metabolism
ABSTRACT
Identification of the cis-regulatory elements controlling cell-type specific gene expression patterns is essential for understanding the origin of cellular diversity. Conventional assays that map regulatory elements via open chromatin analysis of primary tissues are hindered by sample heterogeneity. Single cell analysis of accessible chromatin (scATAC-seq) can overcome this limitation. However, the high level of noise in each single cell profile and the large volume of data pose unique computational challenges. Here, we introduce SnapATAC, a software package for analyzing scATAC-seq datasets. SnapATAC dissects cellular heterogeneity in an unbiased manner and maps the trajectories of cellular states. Using the Nyström method, SnapATAC can process data from up to a million cells. Furthermore, SnapATAC incorporates existing tools into a comprehensive package for analyzing single cell ATAC-seq datasets. As a demonstration of its utility, SnapATAC is applied to 55,592 single-nucleus ATAC-seq profiles from the mouse secondary motor cortex. The analysis reveals ~370,000 candidate regulatory elements in 31 distinct cell populations in this brain region and infers candidate cell-type specific transcriptional regulators.
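The Nyström method mentioned above scales spectral embeddings to large datasets by eigendecomposing a similarity matrix over a small set of landmark points and then extending the result to all remaining points. The following is a generic sketch of that idea, not SnapATAC's implementation. The Gaussian kernel, the `gamma` parameter, and the function names are assumptions made for illustration, and NumPy is assumed to be installed.

```python
import numpy as np


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian similarity between rows of a and rows of b (illustrative kernel choice)."""
    sq_dist = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * (a @ b.T)
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def nystrom_embedding(x, landmark_idx, n_components, gamma=1.0, eps=1e-10):
    """Low-dimensional embedding of all rows of x from a landmark subset.

    Only the landmark-by-landmark kernel is eigendecomposed; every point is
    then projected through its similarities to the landmarks.
    """
    landmarks = x[landmark_idx]
    k_mm = rbf_kernel(landmarks, landmarks, gamma)  # landmarks vs landmarks
    k_nm = rbf_kernel(x, landmarks, gamma)          # all points vs landmarks
    evals, evecs = np.linalg.eigh(k_mm)             # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:n_components]  # keep the largest components
    evals, evecs = evals[order], evecs[:, order]
    keep = evals > eps                              # drop numerically null directions
    # Rows z satisfy z @ z.T ~= k_nm @ inv(k_mm) @ k_nm.T, the Nystrom approximation.
    return k_nm @ evecs[:, keep] / np.sqrt(evals[keep])
```

The cost of the eigendecomposition depends only on the number of landmarks, so the number of cells affects only the final matrix product. When every point is used as a landmark and all components are kept, the approximation reproduces the full kernel matrix.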