ABSTRACT
Amyotrophic lateral sclerosis (ALS) and frontotemporal lobar degeneration (FTLD) share many clinical, pathological, and genetic features, but a detailed understanding of their associated transcriptional alterations across vulnerable cortical cell types is lacking. Here, we report a high-resolution, comparative single-cell molecular atlas of the human primary motor and dorsolateral prefrontal cortices and their transcriptional alterations in sporadic and familial ALS and FTLD. By integrating transcriptional and genetic information, we identify known and previously unidentified vulnerable populations in cortical layer 5 and show that ALS- and FTLD-implicated motor and spindle neurons possess a virtually indistinguishable molecular identity. We implicate potential disease mechanisms affecting these cell types as well as non-neuronal drivers of pathogenesis. Finally, we show that neuron loss in cortical layer 5 tracks more closely with transcriptional identity than with cellular morphology and extends beyond previously reported vulnerable cell types.
Subject(s)
Amyotrophic Lateral Sclerosis , Frontotemporal Lobar Degeneration , Prefrontal Cortex , Animals , Humans , Mice , Amyotrophic Lateral Sclerosis/genetics , Amyotrophic Lateral Sclerosis/metabolism , Amyotrophic Lateral Sclerosis/pathology , Frontotemporal Dementia/genetics , Frontotemporal Lobar Degeneration/genetics , Frontotemporal Lobar Degeneration/metabolism , Frontotemporal Lobar Degeneration/pathology , Gene Expression Profiling , Neurons/metabolism , Prefrontal Cortex/metabolism , Prefrontal Cortex/pathology , Single-Cell Gene Expression Analysis
ABSTRACT
Persistent DNA double-strand breaks (DSBs) in neurons are an early pathological hallmark of neurodegenerative diseases including Alzheimer's disease (AD), with the potential to disrupt genome integrity. We used single-nucleus RNA-seq in human postmortem prefrontal cortex samples and found that excitatory neurons in AD were enriched for somatic mosaic gene fusions. Gene fusions were particularly enriched in excitatory neurons with DNA damage repair and senescence gene signatures. In addition, somatic genome structural variations and gene fusions were enriched in neurons burdened with DSBs in the CK-p25 mouse model of neurodegeneration. Neurons enriched for DSBs also had elevated levels of cohesin along with progressive multiscale disruption of the 3D genome organization aligned with transcriptional changes in synaptic, neuronal development, and histone genes. Overall, this study demonstrates the disruption of genome stability and the 3D genome organization by DSBs in neurons as pathological steps in the progression of neurodegenerative diseases.
Subject(s)
DNA Breaks, Double-Stranded , Neurodegenerative Diseases , Animals , Humans , Mice , Alzheimer Disease/genetics , DNA , DNA Repair/genetics , Neurodegenerative Diseases/genetics , Neurons/physiology , Single-Cell Analysis , Sequence Analysis, RNA , Genomic Instability
ABSTRACT
Altered microglial states affect neuroinflammation, neurodegeneration, and disease but remain poorly understood. Here, we report 194,000 single-nucleus microglial transcriptomes and epigenomes across 443 human subjects and diverse Alzheimer's disease (AD) pathological phenotypes. We annotate 12 microglial transcriptional states, including AD-dysregulated homeostatic, inflammatory, and lipid-processing states. We identify 1,542 AD-differentially-expressed genes, including both microglia-state-specific and disease-stage-specific alterations. By integrating epigenomic, transcriptomic, and motif information, we infer upstream regulators of microglial cell states, gene-regulatory networks, enhancer-gene links, and transcription-factor-driven microglial state transitions. We demonstrate that ectopic expression of our predicted homeostatic-state activators induces homeostatic features in human iPSC-derived microglia-like cells, while inhibiting activators of inflammation can block inflammatory progression. Lastly, we pinpoint the expression of AD-risk genes in microglial states and differential expression of AD-risk genes and their regulators during AD progression. Overall, we provide insights underlying microglial states, including state-specific and AD-stage-specific microglial alterations at unprecedented resolution.
Subject(s)
Alzheimer Disease , Microglia , Humans , Alzheimer Disease/genetics , Alzheimer Disease/pathology , Gene Expression Regulation , Inflammation/pathology , Microglia/metabolism , Transcription Factors/metabolism , Transcriptome , Epigenome
ABSTRACT
Recent work has identified dozens of non-coding loci for Alzheimer's disease (AD) risk, but their mechanisms and AD transcriptional regulatory circuitry are poorly understood. Here, we profile epigenomic and transcriptomic landscapes of 850,000 nuclei from prefrontal cortexes of 92 individuals with and without AD to build a map of the brain regulome, including epigenomic profiles, transcriptional regulators, co-accessibility modules, and peak-to-gene links in a cell-type-specific manner. We develop methods for multimodal integration and detecting regulatory modules using peak-to-gene linking. We show AD risk loci are enriched in microglial enhancers and for specific TFs including SPI1, ELF2, and RUNX1. We detect 9,628 cell-type-specific ATAC-QTL loci, which we integrate alongside peak-to-gene links to prioritize AD variant regulatory circuits. We report differential accessibility of regulatory modules in late AD in glia and in early AD in neurons. Strikingly, late-stage AD brains show global epigenome dysregulation indicative of epigenome erosion and cell identity loss.
Subject(s)
Alzheimer Disease , Brain , Gene Expression Regulation , Humans , Alzheimer Disease/genetics , Alzheimer Disease/pathology , Brain/pathology , Epigenome , Epigenomics , Genome-Wide Association Study
ABSTRACT
Alzheimer's disease (AD) is the most common cause of dementia worldwide, but the molecular and cellular mechanisms underlying cognitive impairment remain poorly understood. To address this, we generated a single-cell transcriptomic atlas of the aged human prefrontal cortex covering 2.3 million cells from postmortem human brain samples of 427 individuals with varying degrees of AD pathology and cognitive impairment. Our analyses identified AD-pathology-associated alterations shared between excitatory neuron subtypes, revealed a coordinated increase of the cohesin complex and DNA damage response factors in excitatory neurons and in oligodendrocytes, and uncovered genes and pathways associated with high cognitive function, dementia, and resilience to AD pathology. Furthermore, we identified selectively vulnerable somatostatin inhibitory neuron subtypes depleted in AD, discovered two distinct groups of inhibitory neurons that were more abundant in individuals with preserved high cognitive function late in life, and uncovered a link between inhibitory neurons and resilience to AD pathology.
Subject(s)
Alzheimer Disease , Brain , Aged , Humans , Alzheimer Disease/metabolism , Alzheimer Disease/pathology , Brain/metabolism , Brain/pathology , Cognition , Cognitive Dysfunction/metabolism , Neurons/metabolism
ABSTRACT
Metabolic programming controls immune cell lineages and functions, but little is known about γδ T cell metabolism. Here, we found that γδ T cell subsets making either interferon-γ (IFN-γ) or interleukin (IL)-17 have intrinsically distinct metabolic requirements. Whereas IFN-γ+ γδ T cells were almost exclusively dependent on glycolysis, IL-17+ γδ T cells strongly engaged oxidative metabolism, with increased mitochondrial mass and activity. These distinct metabolic signatures were surprisingly imprinted early during thymic development and were stably maintained in the periphery and within tumors. Moreover, pro-tumoral IL-17+ γδ T cells selectively showed high lipid uptake and intracellular lipid storage and were expanded in obesity and in tumors of obese mice. Conversely, glucose supplementation enhanced the antitumor functions of IFN-γ+ γδ T cells and reduced tumor growth upon adoptive transfer. These findings have important implications for the differentiation of effector γδ T cells and their manipulation in cancer immunotherapy.
Subject(s)
Breast Neoplasms/metabolism , Colonic Neoplasms/metabolism , Energy Metabolism , Lymphocytes, Tumor-Infiltrating/metabolism , Melanoma, Experimental/metabolism , Receptors, Antigen, T-Cell, gamma-delta/metabolism , T-Lymphocyte Subsets/metabolism , Thymus Gland/metabolism , Tumor Microenvironment , Animals , Breast Neoplasms/immunology , Breast Neoplasms/pathology , Breast Neoplasms/therapy , Cell Line, Tumor , Cell Lineage , Colonic Neoplasms/immunology , Colonic Neoplasms/pathology , Colonic Neoplasms/therapy , Female , Glucose/metabolism , Glycolysis , Humans , Immunotherapy, Adoptive , Interferon-gamma/metabolism , Interleukin-17/metabolism , Lipid Metabolism , Lymphocytes, Tumor-Infiltrating/immunology , Lymphocytes, Tumor-Infiltrating/transplantation , Melanoma, Experimental/immunology , Melanoma, Experimental/pathology , Melanoma, Experimental/therapy , Mice, Inbred C57BL , Mice, Transgenic , Mitochondria/metabolism , Obesity/immunology , Obesity/metabolism , Organ Culture Techniques , Phenotype , Signal Transduction , T-Lymphocyte Subsets/immunology , T-Lymphocyte Subsets/transplantation , Thymus Gland/immunology , Tumor Burden
ABSTRACT
Neuronal activity causes the rapid expression of immediate early genes that are crucial for experience-driven changes to synapses, learning, and memory. Here, using both molecular and genome-wide next-generation sequencing methods, we report that neuronal activity stimulation triggers the formation of DNA double-strand breaks (DSBs) in the promoters of a subset of early-response genes, including Fos, Npas4, and Egr1. Generation of targeted DNA DSBs within Fos and Npas4 promoters is sufficient to induce their expression even in the absence of an external stimulus. Activity-dependent DSB formation is likely mediated by the type II topoisomerase, Topoisomerase IIβ (Topo IIβ), and knockdown of Topo IIβ attenuates both DSB formation and early-response gene expression following neuronal stimulation. Our results suggest that DSB formation is a physiological event that rapidly resolves topological constraints to early-response gene expression in neurons.
Subject(s)
DNA Breaks, Double-Stranded , Neurons/metabolism , Animals , Basic Helix-Loop-Helix Transcription Factors/genetics , CCCTC-Binding Factor , DNA Topoisomerases, Type II/analysis , DNA Topoisomerases, Type II/metabolism , DNA-Binding Proteins/analysis , DNA-Binding Proteins/metabolism , Early Growth Response Protein 1/genetics , Etoposide/pharmacology , Gene Expression Regulation , Genes, fos , Genome-Wide Association Study , Mice , Repressor Proteins/metabolism , Transcriptome/drug effects
ABSTRACT
Alzheimer's disease is the leading cause of dementia worldwide, but the cellular pathways that underlie its pathological progression across brain regions remain poorly understood. Here we report a single-cell transcriptomic atlas of six different brain regions in the aged human brain, covering 1.3 million cells from 283 post-mortem human brain samples across 48 individuals with and without Alzheimer's disease. We identify 76 cell types, including region-specific subtypes of astrocytes and excitatory neurons and an inhibitory interneuron population unique to the thalamus and distinct from canonical inhibitory subclasses. We identify vulnerable populations of excitatory and inhibitory neurons that are depleted in specific brain regions in Alzheimer's disease, and provide evidence that the Reelin signalling pathway is involved in modulating the vulnerability of these neurons. We develop a scalable method for discovering gene modules, which we use to identify cell-type-specific and region-specific modules that are altered in Alzheimer's disease and to annotate transcriptomic differences associated with diverse pathological variables. We identify an astrocyte program that is associated with cognitive resilience to Alzheimer's disease pathology, tying choline metabolism and polyamine biosynthesis in astrocytes to preserved cognitive function late in life. Together, our study develops a regional atlas of the ageing human brain and provides insights into cellular vulnerability, response and resilience to Alzheimer's disease pathology.
Subject(s)
Alzheimer Disease , Brain , Gene Expression Profiling , Single-Cell Analysis , Aged, 80 and over , Animals , Female , Humans , Male , Mice , Aging/metabolism , Aging/pathology , Alzheimer Disease/pathology , Alzheimer Disease/genetics , Alzheimer Disease/metabolism , Astrocytes/classification , Astrocytes/cytology , Astrocytes/metabolism , Astrocytes/pathology , Autopsy , Brain/anatomy & histology , Brain/cytology , Brain/metabolism , Brain/pathology , Case-Control Studies , Choline/metabolism , Cognition/physiology , Gene Regulatory Networks , Interneurons/classification , Interneurons/cytology , Interneurons/metabolism , Interneurons/pathology , Nerve Tissue Proteins/metabolism , Nerve Tissue Proteins/genetics , Neural Inhibition , Neurons/classification , Neurons/cytology , Neurons/metabolism , Neurons/pathology , Polyamines/metabolism , Reelin Protein , Signal Transduction , Thalamus/cytology , Thalamus/metabolism , Thalamus/pathology , Transcriptome
ABSTRACT
The glymphatic movement of fluid through the brain removes metabolic waste. Noninvasive 40 Hz stimulation promotes 40 Hz neural activity in multiple brain regions and attenuates pathology in mouse models of Alzheimer's disease. Here we show that multisensory gamma stimulation promotes the influx of cerebrospinal fluid and the efflux of interstitial fluid in the cortex of the 5XFAD mouse model of Alzheimer's disease. Influx of cerebrospinal fluid was associated with increased aquaporin-4 polarization along astrocytic endfeet and dilated meningeal lymphatic vessels. Inhibiting glymphatic clearance abolished the removal of amyloid by multisensory 40 Hz stimulation. Using chemogenetic manipulation and a genetically encoded sensor for neuropeptide signalling, we found that vasoactive intestinal peptide interneurons facilitate glymphatic clearance by regulating arterial pulsatility. Our findings establish novel mechanisms that recruit the glymphatic system to remove brain amyloid.
Subject(s)
Alzheimer Disease , Amyloid , Brain , Cerebrospinal Fluid , Extracellular Fluid , Gamma Rhythm , Glymphatic System , Animals , Mice , Alzheimer Disease/metabolism , Alzheimer Disease/pathology , Alzheimer Disease/prevention & control , Amyloid/metabolism , Aquaporin 4/metabolism , Astrocytes/metabolism , Brain/cytology , Brain/metabolism , Brain/pathology , Cerebrospinal Fluid/metabolism , Disease Models, Animal , Extracellular Fluid/metabolism , Glymphatic System/physiology , Interneurons/metabolism , Vasoactive Intestinal Peptide/metabolism , Cerebral Cortex/cytology , Cerebral Cortex/metabolism , Cerebral Cortex/pathology , Electric Stimulation
ABSTRACT
Despite the importance of the cerebrovasculature in maintaining normal brain physiology and in understanding neurodegeneration and drug delivery to the central nervous system, human cerebrovascular cells remain poorly characterized owing to their sparsity and dispersion. Here we perform single-cell characterization of the human cerebrovasculature using both ex vivo fresh tissue experimental enrichment and post mortem in silico sorting of human cortical tissue samples. We capture 16,681 cerebrovascular nuclei across 11 subtypes, including endothelial cells, mural cells and three distinct subtypes of perivascular fibroblast along the vasculature. We uncover human-specific expression patterns along the arteriovenous axis and determine previously uncharacterized cell-type-specific markers. We use these human-specific signatures to study changes in 3,945 cerebrovascular cells from patients with Huntington's disease, which reveal activation of innate immune signalling in vascular and glial cell types and a concomitant reduction in the levels of proteins critical for maintenance of blood-brain barrier integrity. Finally, our study provides a comprehensive molecular atlas of the human cerebrovasculature to guide future biological and therapeutic studies.
Subject(s)
Endothelial Cells , Huntington Disease , Blood-Brain Barrier/metabolism , Brain/metabolism , Endothelial Cells/metabolism , Humans , Huntington Disease/metabolism , Immune System , Neuroglia , Proteins/metabolism
ABSTRACT
APOE4 is the strongest genetic risk factor for Alzheimer's disease. However, the effects of APOE4 on the human brain are not fully understood, limiting opportunities to develop targeted therapeutics for individuals carrying APOE4 and other risk factors for Alzheimer's disease. Here, to gain more comprehensive insights into the impact of APOE4 on the human brain, we performed single-cell transcriptomics profiling of post-mortem human brains from APOE4 carriers compared with non-carriers. This revealed that APOE4 is associated with widespread gene expression changes across all cell types of the human brain. Consistent with the biological function of APOE, APOE4 significantly altered signalling pathways associated with cholesterol homeostasis and transport. Confirming these findings with histological and lipidomic analysis of the post-mortem human brain, induced pluripotent stem-cell-derived cells and targeted-replacement mice, we show that cholesterol is aberrantly deposited in oligodendrocytes, the myelinating cells that are responsible for insulating and promoting the electrical activity of neurons. We show that altered cholesterol localization in the APOE4 brain coincides with reduced myelination. Pharmacologically facilitating cholesterol transport increases axonal myelination and improves learning and memory in APOE4 mice. We provide a single-cell atlas describing the transcriptional effects of APOE4 on the aging human brain and establish a functional link between APOE4, cholesterol, myelination and memory, offering therapeutic opportunities for Alzheimer's disease.
Subject(s)
Apolipoprotein E4 , Brain , Cholesterol , Nerve Fibers, Myelinated , Oligodendroglia , Animals , Humans , Mice , Alzheimer Disease/genetics , Alzheimer Disease/metabolism , Alzheimer Disease/pathology , Apolipoprotein E4/genetics , Apolipoprotein E4/metabolism , Brain/metabolism , Brain/pathology , Cholesterol/metabolism , Oligodendroglia/metabolism , Oligodendroglia/pathology , Nerve Fibers, Myelinated/metabolism , Nerve Fibers, Myelinated/pathology , Autopsy , Induced Pluripotent Stem Cells , Neurons/metabolism , Neurons/pathology , Heterozygote , Biological Transport , Homeostasis , Single-Cell Analysis , Memory , Aging/genetics , Gene Expression Profiling , Myelin Sheath/metabolism , Myelin Sheath/pathology
ABSTRACT
The emergence of SARS-CoV-2 variants of concern suggests viral adaptation to enhance human-to-human transmission. Although much effort has focused on the characterization of changes in the spike protein in variants of concern, mutations outside of spike are likely to contribute to adaptation. Here, using unbiased abundance proteomics, phosphoproteomics, RNA sequencing and viral replication assays, we show that isolates of the Alpha (B.1.1.7) variant suppress innate immune responses in airway epithelial cells more effectively than first-wave isolates. We found that the Alpha variant has markedly increased subgenomic RNA and protein levels of the nucleocapsid protein (N), Orf9b and Orf6, all known innate immune antagonists. Expression of Orf9b alone suppressed the innate immune response through interaction with TOM70, a mitochondrial protein that is required for activation of the RNA-sensing adaptor MAVS. Moreover, the activity of Orf9b and its association with TOM70 was regulated by phosphorylation. We propose that more effective innate immune suppression, through enhanced expression of specific viral antagonist proteins, increases the likelihood of successful transmission of the Alpha variant, and may increase in vivo replication and duration of infection. The importance of mutations outside the spike coding region in the adaptation of SARS-CoV-2 to humans is underscored by the observation that similar mutations exist in the N and Orf9b regulatory regions of the Delta and Omicron variants.
Subject(s)
COVID-19/immunology , COVID-19/virology , Evolution, Molecular , Immune Evasion , Immunity, Innate/immunology , SARS-CoV-2/genetics , SARS-CoV-2/immunology , COVID-19/transmission , Coronavirus Nucleocapsid Proteins/chemistry , Coronavirus Nucleocapsid Proteins/metabolism , Humans , Immunity, Innate/genetics , Interferons/immunology , Mitochondrial Precursor Protein Import Complex Proteins/metabolism , Phosphoproteins/chemistry , Phosphoproteins/metabolism , Phosphorylation , Proteomics , RNA, Viral/genetics , RNA-Seq , SARS-CoV-2/classification , SARS-CoV-2/growth & development
ABSTRACT
We report that the SARS-CoV-2 nucleocapsid protein (N-protein) undergoes liquid-liquid phase separation (LLPS) with viral RNA. N-protein condenses with specific RNA genomic elements under physiological buffer conditions and condensation is enhanced at human body temperatures (33°C and 37°C) and reduced at room temperature (22°C). RNA sequence and structure in specific genomic regions regulate N-protein condensation while other genomic regions promote condensate dissolution, potentially preventing aggregation of the large genome. At low concentrations, N-protein preferentially crosslinks to specific regions characterized by single-stranded RNA flanked by structured elements and these features specify the location, number, and strength of N-protein binding sites (valency). Liquid-like N-protein condensates form in mammalian cells in a concentration-dependent manner and can be altered by small molecules. Condensation of N-protein is RNA sequence and structure specific, sensitive to human body temperature, and manipulatable with small molecules, and therefore presents a screenable process for identifying antiviral compounds effective against SARS-CoV-2.
Subject(s)
COVID-19/metabolism , Coronavirus Nucleocapsid Proteins/metabolism , Genome, Viral , Nucleocapsid/metabolism , RNA, Viral/metabolism , SARS-CoV-2/metabolism , Animals , Antiviral Agents/pharmacology , COVID-19/genetics , Chlorocebus aethiops , Coronavirus Nucleocapsid Proteins/genetics , Drug Evaluation, Preclinical , HEK293 Cells , Humans , Nucleocapsid/genetics , Phosphoproteins/genetics , Phosphoproteins/metabolism , SARS-CoV-2/genetics , Vero Cells , COVID-19 Drug Treatment
ABSTRACT
Balancing the tradeoff between quantity and quality of phenotypic data is critical in omics studies. Measurements below the limit of quantification (BLQ) are often tagged in quality control fields, but these flags are currently underutilized in human genetics studies. Extreme phenotype sampling is advantageous for mapping rare variant effects. We hypothesize that genetic drivers, along with environmental and technical factors, contribute to the presence of BLQ flags. Here, we introduce "hypometric genetics" (hMG) analysis and uncover a genetic basis for BLQ flags, indicating an additional source of genetic signal for genetic discovery, especially from phenotypic extremes. Applying our hMG approach to n = 227,469 UK Biobank individuals with metabolomic profiles, we reveal more than 5% heritability for BLQ flags and report biologically relevant associations, for example, at APOC3, APOA5, and PDE3B loci. For common variants, polygenic scores trained only for BLQ flags predict the corresponding quantitative traits with 91% accuracy, validating the genetic basis. For rare coding variant associations, we find an asymmetric 65.4% higher enrichment of metabolite-lowering associations for BLQ flags, highlighting the impact of putative loss-of-function variants with large effects on phenotypic extremes. Joint analysis of binarized BLQ flags and the corresponding quantitative metabolite measurements improves power in Bayesian rare variant aggregation tests, resulting in an average of 181% more prioritized genes. Our approach is broadly applicable to omics profiling. Overall, our results underscore the benefit of integrating quality control flags and quantitative measurements and highlight the advantage of joint analysis of population-based samples and phenotypic extremes in human genetics studies.
ABSTRACT
Annotating the molecular basis of human disease remains an unsolved challenge, as 93% of disease loci are non-coding and gene-regulatory annotations are highly incomplete. Here we present EpiMap, a compendium comprising 10,000 epigenomic maps across 800 samples, which we used to define chromatin states, high-resolution enhancers, enhancer modules, upstream regulators and downstream target genes. We used this resource to annotate 30,000 genetic loci that were associated with 540 traits, predicting trait-relevant tissues, putative causal nucleotide variants in enriched tissue enhancers and candidate tissue-specific target genes for each. We partitioned multifactorial traits into tissue-specific contributing factors with distinct functional enrichments and disease comorbidity patterns, and revealed both single-factor monotropic and multifactor pleiotropic loci. Top-scoring loci frequently had multiple predicted driver variants, converging through multiple enhancers with a common target gene, multiple genes in common tissues, or multiple genes and multiple tissues, indicating extensive pleiotropy. Our results demonstrate the importance of dense, rich, high-resolution epigenomic annotations for the investigation of complex traits.
Subject(s)
Disease/genetics , Epigenesis, Genetic/genetics , Epigenomics , Gene Regulatory Networks/genetics , Genetic Loci/genetics , Chromatin/genetics , Enhancer Elements, Genetic/genetics , Female , Genome-Wide Association Study , Humans , Male , Multifactorial Inheritance/genetics , Organ Specificity/genetics , Reproducibility of Results
ABSTRACT
Admixed individuals offer unique opportunities for addressing limited transferability in polygenic scores (PGSs), given the substantial trans-ancestry genetic correlation in many complex traits. However, they are rarely considered in PGS training, given the challenges in representing ancestry-matched linkage-disequilibrium reference panels for admixed individuals. Here we present inclusive PGS (iPGS), which captures ancestry-shared genetic effects by finding the exact solution for penalized regression on individual-level data and is thus naturally applicable to admixed individuals. We validate our approach in a simulation study across 33 configurations with varying heritability, polygenicity, and ancestry composition in the training set. When iPGS is applied to n = 237,055 ancestry-diverse individuals in the UK Biobank, it shows the greatest improvements in Africans by 48.9% on average across 60 quantitative traits and up to 50-fold improvements for some traits (neutrophil count, R² = 0.058) over the baseline model trained on the same number of European individuals. When we allowed iPGS to use n = 284,661 individuals, we observed an average improvement of 60.8% for African, 11.6% for South Asian, 7.3% for non-British White, 4.8% for White British, and 17.8% for the other individuals. We further developed iPGS+refit to jointly model the ancestry-shared and -dependent genetic effects when heterogeneous genetic associations were present. For neutrophil count, for example, iPGS+refit showed the highest predictive performance in the African group (R² = 0.115), which exceeds the best predictive performance for the White British group (R² = 0.090 in the iPGS model), even though only 1.49% of individuals used in the iPGS training are of African ancestry. Our results indicate the power of including diverse individuals for developing more equitable PGS models.
Subject(s)
Multifactorial Inheritance , White People , Humans , Multifactorial Inheritance/genetics , White People/genetics , Phenotype , Black People/genetics , Asian People/genetics , Genome-Wide Association Study/methods
ABSTRACT
Single-cell RNA sequencing (scRNA-seq) enables the exploration of cellular heterogeneity by analyzing gene expression profiles in complex tissues. However, scRNA-seq data often suffer from technical noise, dropout events and sparsity, hindering downstream analyses. Although existing works attempt to mitigate these issues by utilizing graph structures for data denoising, they risk propagating noise and fall short of fully leveraging the inherent data relationships, relying mainly on either cell-cell or gene-gene associations and on graphs constructed from the initial noisy data. To this end, this study presents single-cell bilevel feature propagation (scBFP), a two-step graph-based feature propagation method. It first imputes zero values using the non-zero values, ensuring that imputing the zeros caused by dropout does not alter the non-zero values. Subsequently, it denoises the entire dataset by leveraging gene-gene and cell-cell relationships in the respective steps. Extensive experimental results on scRNA-seq data demonstrate the effectiveness of scBFP in various downstream tasks, uncovering valuable biological insights.
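The two-step idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the cosine-similarity kNN graphs, the function names (`knn_graph`, `impute_zeros`, `denoise`), and the parameters `k`, `n_iter`, and `alpha` are all assumptions chosen for illustration. Step 1 propagates values over a gene-gene graph while resetting observed (non-zero) entries after every iteration, so only the zeros are filled in; step 2 smooths the imputed matrix over a cell-cell graph.

```python
import numpy as np

def knn_graph(features, k):
    """Row-normalized, symmetrized kNN adjacency from cosine similarity of rows.
    Assumes the number of rows is greater than k (illustrative helper)."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)              # never pick a node as its own neighbor
    n = sim.shape[0]
    neighbors = np.argsort(-sim, axis=1)[:, :k]
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), neighbors.ravel()] = 1.0
    adj = np.maximum(adj, adj.T)                # make the graph undirected
    return adj / adj.sum(axis=1, keepdims=True)

def impute_zeros(x, k=5, n_iter=20):
    """Step 1 (assumed form): fill zeros via a gene-gene graph, keeping non-zeros fixed."""
    x = np.asarray(x, dtype=float)              # cells x genes
    observed = x > 0
    a_gene = knn_graph(x.T, k)                  # graph over genes
    g = x.T.copy()                              # genes x cells
    for _ in range(n_iter):
        g = a_gene @ g                          # average over neighboring genes
        g[observed.T] = x.T[observed.T]         # reset observed entries
    return g.T

def denoise(x, k=5, n_iter=20, alpha=0.5):
    """Step 2 (assumed form): soft smoothing of all entries over a cell-cell graph."""
    x = np.asarray(x, dtype=float)
    a_cell = knn_graph(x, k)                    # graph over cells
    out = x.copy()
    for _ in range(n_iter):
        out = alpha * (a_cell @ out) + (1 - alpha) * x
    return out
```

A typical call on a cells-by-genes count matrix would be `denoise(impute_zeros(counts))`. Resetting the observed entries in step 1 is what keeps measured expression values untouched during imputation, while the `alpha` blend in step 2 limits how far the smoothed values drift from the imputed input.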
Subject(s)
Sequence Analysis, RNA , Single-Cell Analysis , Single-Cell Analysis/methods , Sequence Analysis, RNA/methods , Humans , Algorithms , Gene Expression Profiling/methods , Computational Biology/methods , RNA-Seq/methods
ABSTRACT
Hundreds of chromatin regulators (CRs) control chromatin structure and function by catalyzing and binding histone modifications, yet the rules governing these key processes remain obscure. Here, we present a systematic approach to infer CR function. We developed ChIP-string, a meso-scale assay that combines chromatin immunoprecipitation with a signature readout of 487 representative loci. We applied ChIP-string to screen 145 antibodies, thereby identifying effective reagents, which we used to map the genome-wide binding of 29 CRs in two cell types. We found that specific combinations of CRs colocalize in characteristic patterns at distinct chromatin environments, at genes of coherent functions, and at distal regulatory elements. When comparing between cell types, CRs redistribute to different loci but maintain their modular and combinatorial associations. Our work provides a multiplex method that substantially enhances the ability to monitor CR binding, presents a large resource of CR maps, and reveals common principles for combinatorial CR function.
Subject(s)
Chromatin Immunoprecipitation/methods , Chromatin/metabolism , Genomics/methods , Histone Code , Chromatin/chemistry , Chromatin Assembly and Disassembly , Embryonic Stem Cells , Genome , Humans , K562 Cells
ABSTRACT
Constitutive heterochromatin is traditionally viewed as the static form of heterochromatin that silences pericentromeric and telomeric repeats in a cell cycle- and differentiation-independent manner. Here, we show that, in the mouse olfactory epithelium, olfactory receptor (OR) genes are marked in a highly dynamic fashion with the molecular hallmarks of constitutive heterochromatin, H3K9me3 and H4K20me3. The cell type and developmentally dependent deposition of these marks along the OR clusters are, most likely, reversed during the process of OR choice to allow for monogenic and monoallelic OR expression. In contrast to the current view of OR choice, our data suggest that OR silencing takes place before OR expression, indicating that it is not the product of an OR-elicited feedback signal. Our findings suggest that chromatin-mediated silencing lays a molecular foundation upon which singular and stochastic selection for gene expression can be applied.
Subject(s)
Chromatin Assembly and Disassembly , Gene Silencing , Olfactory Mucosa/metabolism , Receptors, Odorant/genetics , Animals , Chromatin Immunoprecipitation , Gene Expression , Heterochromatin , Histone Code , Mice , Mice, Inbred C57BL , Mice, Transgenic , Oligonucleotide Array Sequence Analysis
ABSTRACT
The human and mouse genomes contain instructions that specify RNAs and proteins and govern the timing, magnitude, and cellular context of their production. To better delineate these elements, phase III of the Encyclopedia of DNA Elements (ENCODE) Project has expanded analysis of the cell and tissue repertoires of RNA transcription, chromatin structure and modification, DNA methylation, chromatin looping, and occupancy by transcription factors and RNA-binding proteins. Here we summarize these efforts, which have produced 5,992 new experimental datasets, including systematic determinations across mouse fetal development. All data are available through the ENCODE data portal (https://www.encodeproject.org), including phase II ENCODE and Roadmap Epigenomics data. We have developed a registry of 926,535 human and 339,815 mouse candidate cis-regulatory elements, covering 7.9 and 3.4% of their respective genomes, by integrating selected datatypes associated with gene regulation, and constructed a web-based server (SCREEN; http://screen.encodeproject.org) to provide flexible, user-defined access to this resource. Collectively, the ENCODE data and registry provide an expansive resource for the scientific community to build a better understanding of the organization and function of the human and mouse genomes.