RESUMO
To identify cancer-associated gene regulatory changes, we generated single-cell chromatin accessibility landscapes across eight tumor types as part of The Cancer Genome Atlas. Tumor chromatin accessibility is strongly influenced by copy number alterations that can be used to identify subclones, yet underlying cis-regulatory landscapes retain cancer type-specific features. Using organ-matched healthy tissues, we identified the "nearest healthy" cell types in diverse cancers, demonstrating that the chromatin signature of basal-like-subtype breast cancer is most similar to secretory-type luminal epithelial cells. Neural network models trained to learn regulatory programs in cancer revealed enrichment of model-prioritized somatic noncoding mutations near cancer-associated genes, suggesting that dispersed, nonrecurrent, noncoding mutations in cancer are functional. Overall, these data and interpretable gene regulatory models for cancer and healthy tissue provide a framework for understanding cancer-specific gene regulation.
Assuntos
Cromatina , Regulação Neoplásica da Expressão Gênica , Neoplasias , Análise de Célula Única , Humanos , Cromatina/metabolismo , Cromatina/genética , Neoplasias/genética , Redes Neurais de Computação , Mutação , Variações do Número de Cópias de DNA , Neoplasias da Mama/genética , Neoplasias da Mama/patologiaRESUMO
BACKGROUND: Archaea, together with Bacteria, represent the two main divisions of life on Earth, with many of the defining characteristics of the more complex eukaryotes tracing their origin to evolutionary innovations first made in their archaeal ancestors. One of the most notable such features is nucleosomal chromatin, although archaeal histones and chromatin differ significantly from those of eukaryotes, not all archaea possess histones and it is not clear if histones are a main packaging component for all that do. Despite increased interest in archaeal chromatin in recent years, its properties have been little studied using genomic tools. RESULTS: Here, we adapt the ATAC-seq assay to archaea and use it to map the accessible landscape of the genome of the euryarchaeote Haloferax volcanii. We integrate the resulting datasets with genome-wide maps of active transcription and single-stranded DNA (ssDNA) and find that while H. volcanii promoters exist in a preferentially accessible state, unlike most eukaryotes, modulation of transcriptional activity is not associated with changes in promoter accessibility. Applying orthogonal single-molecule footprinting methods, we quantify the absolute levels of physical protection of H. volcanii and find that Haloferax chromatin is similarly or only slightly more accessible, in aggregate, than that of eukaryotes. We also evaluate the degree of coordination of transcription within archaeal operons and make the unexpected observation that some CRISPR arrays are associated with highly prevalent ssDNA structures. CONCLUSIONS: Our results provide the first comprehensive maps of chromatin accessibility and active transcription in Haloferax across conditions and thus a foundation for future functional studies of archaeal chromatin.
Assuntos
Proteínas Arqueais , Haloferax volcanii , Cromatina , Histonas/genética , Haloferax volcanii/genética , Haloferax volcanii/metabolismo , Nucleossomos , Evolução Biológica , Eucariotos/genética , Proteínas Arqueais/genéticaRESUMO
Single-cell assay for transposase-accessible chromatin by sequencing (scATAC-seq) has emerged as a powerful tool for dissecting regulatory landscapes and cellular heterogeneity. However, an exploration of systemic biases among scATAC-seq technologies has remained absent. In this study, we benchmark the performance of eight scATAC-seq methods across 47 experiments using human peripheral blood mononuclear cells (PBMCs) as a reference sample and develop PUMATAC, a universal preprocessing pipeline, to handle the various sequencing data formats. Our analyses reveal significant differences in sequencing library complexity and tagmentation specificity, which impact cell-type annotation, genotype demultiplexing, peak calling, differential region accessibility and transcription factor motif enrichment. Our findings underscore the importance of sample extraction, method selection, data processing and total cost of experiments, offering valuable guidance for future research. Finally, our data and analysis pipeline encompasses 169,000 PBMC scATAC-seq profiles and a best practices code repository for scATAC-seq data analysis, which are freely available to extend this benchmarking effort to future protocols.
RESUMO
Detecting and mitigating off-target activity is critical to the practical application of CRISPR-mediated genome and epigenome editing. While numerous methods have been developed to map Cas9 binding specificity genome-wide, they are generally time-consuming and/or expensive, and not applicable to catalytically dead CRISPR enzymes. We have developed CasKAS, a rapid, inexpensive, and facile assay for identifying off-target CRISPR enzyme binding and cleavage by chemically mapping the unwound single-stranded DNA structures formed upon binding of a sgRNA-loaded Cas9 protein. We demonstrate this method in both in vitro and in vivo contexts.
Assuntos
Sistemas CRISPR-Cas , DNA de Cadeia Simples , DNA de Cadeia Simples/genética , Genoma , Proteína 9 Associada à CRISPR/genética , Epigenoma , Edição de Genes/métodosRESUMO
The ability to analyze the transcriptomic and epigenomic states of individual single cells has in recent years transformed our ability to measure and understand biological processes. Recent advancements have focused on increasing sensitivity and throughput to provide richer and deeper biological insights at the cellular level. The next frontier is the development of multiomic methods capable of analyzing multiple features from the same cell, such as the simultaneous measurement of the transcriptome and the chromatin accessibility of candidate regulatory elements. In this chapter, we discuss and describe SHARE-seq (Simultaneous high-throughput ATAC, and RNA expression with sequencing) for carrying out simultaneous chromatin accessibility and transcriptome measurements in single cells, together with the experimental and analytical considerations for achieving optimal results.
Assuntos
Cromatina , Transcriptoma , Sequenciamento de Nucleotídeos em Larga Escala/métodos , Análise de Sequência de DNA/métodos , Sequências Reguladoras de Ácido Nucleico , Análise de Célula Única/métodosRESUMO
The advent of single-cell chromatin accessibility profiling has accelerated the ability to map gene regulatory landscapes but has outpaced the development of scalable software to rapidly extract biological meaning from these data. Here we present a software suite for single-cell analysis of regulatory chromatin in R (ArchR; https://www.archrproject.com/ ) that enables fast and comprehensive analysis of single-cell chromatin accessibility data. ArchR provides an intuitive, user-focused interface for complex single-cell analyses, including doublet removal, single-cell clustering and cell type identification, unified peak set generation, cellular trajectory identification, DNA element-to-gene linkage, transcription factor footprinting, mRNA expression level prediction from chromatin accessibility and multi-omic integration with single-cell RNA sequencing (scRNA-seq). Enabling the analysis of over 1.2 million single cells within 8 h on a standard Unix laptop, ArchR is a comprehensive software suite for end-to-end analysis of single-cell chromatin accessibility that will accelerate the understanding of gene regulation at the resolution of individual cells.
Assuntos
Cromatina , Análise de Célula Única/métodos , Software , Animais , Cromatina/genética , Cromatina/metabolismo , Análise por Conglomerados , Regulação da Expressão Gênica , Genoma , Humanos , Camundongos , Análise de Sequência de RNA/métodos , Fatores de Transcrição/genética , Fatores de Transcrição/metabolismo , Interface Usuário-Computador , NavegadorRESUMO
Genome-wide association studies of neurological diseases have identified thousands of variants associated with disease phenotypes. However, most of these variants do not alter coding sequences, making it difficult to assign their function. Here, we present a multi-omic epigenetic atlas of the adult human brain through profiling of single-cell chromatin accessibility landscapes and three-dimensional chromatin interactions of diverse adult brain regions across a cohort of cognitively healthy individuals. We developed a machine-learning classifier to integrate this multi-omic framework and predict dozens of functional SNPs for Alzheimer's and Parkinson's diseases, nominating target genes and cell types for previously orphaned loci from genome-wide association studies. Moreover, we dissected the complex inverted haplotype of the MAPT (encoding tau) Parkinson's disease risk locus, identifying putative ectopic regulatory interactions in neurons that may mediate this disease association. This work expands understanding of inherited variation and provides a roadmap for the epigenomic dissection of causal regulatory variation in disease.