Results 1 - 20 of 24
1.
Nature ; 624(7992): 621-629, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38049589

ABSTRACT

Type 2 diabetes mellitus (T2D), a major cause of worldwide morbidity and mortality, is characterized by dysfunction of insulin-producing pancreatic islet β cells [1,2]. T2D genome-wide association studies (GWAS) have identified hundreds of signals in non-coding and β cell regulatory genomic regions, but deciphering their biological mechanisms remains challenging [3-5]. Here, to identify early disease-driving events, we performed traditional and multiplexed pancreatic tissue imaging, sorted-islet cell transcriptomics and islet functional analysis of early-stage T2D and control donors. By integrating diverse modalities, we show that early-stage T2D is characterized by β cell-intrinsic defects that can be proportioned into gene regulatory modules with enrichment in signals of genetic risk. After identifying the β cell hub gene and transcription factor RFX6 within one such module, we demonstrated multiple layers of genetic risk that converge on an RFX6-mediated network to reduce insulin secretion by β cells. RFX6 perturbation in primary human islet cells alters β cell chromatin architecture at regions enriched for T2D GWAS signals, and population-scale genetic analyses causally link genetically predicted reduced RFX6 expression with increased T2D risk. Understanding the molecular mechanisms of complex, systemic diseases necessitates integration of signals from multiple molecules, cells, organs and individuals, and thus we anticipate that this approach will be a useful template to identify and validate key regulatory networks and master hub genes for other diseases or traits using GWAS data.


Subject(s)
Diabetes Mellitus, Type 2 , Gene Expression Profiling , Gene Regulatory Networks , Genetic Predisposition to Disease , Islets of Langerhans , Humans , Case-Control Studies , Cell Separation , Chromatin/metabolism , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/metabolism , Diabetes Mellitus, Type 2/pathology , Diabetes Mellitus, Type 2/physiopathology , Gene Regulatory Networks/genetics , Genome-Wide Association Study , Insulin Secretion , Islets of Langerhans/metabolism , Islets of Langerhans/pathology , Reproducibility of Results
2.
Genome Res ; 2024 Sep 10.
Article in English | MEDLINE | ID: mdl-39255977

ABSTRACT

Pleiotropy, measured as expression breadth across tissues, is one of the best predictors for protein sequence and expression conservation. In this study, we investigated its effect on the evolution of cis-regulatory elements (CREs). To this end, we carefully reanalyzed the Epigenomics Roadmap data for nine fetal tissues, assigning a measure of pleiotropic degree to nearly half a million CREs. To assess the functional conservation of CREs, we generated ATAC-seq and RNA-seq data from humans and macaques. We found that more pleiotropic CREs exhibit greater conservation in accessibility, and the mRNA expression levels of the associated genes are more conserved. This trend of higher conservation for higher degrees of pleiotropy persists when analyzing the transcription factor binding repertoire. In contrast, simple DNA sequence conservation of orthologous sites between species tends to be even lower for pleiotropic CREs than for species-specific CREs. Combining various lines of evidence, we propose that the lack of sequence conservation in functionally conserved pleiotropic CREs is due to within-element compensatory evolution. In summary, our findings suggest that pleiotropy is also a good predictor for the functional conservation of CREs, even though this is not reflected in the sequence conservation of pleiotropic CREs.

3.
Am J Hum Genet ; 108(7): 1169-1189, 2021 07 01.
Article in English | MEDLINE | ID: mdl-34038741

ABSTRACT

Identifying the molecular mechanisms by which genome-wide association study (GWAS) loci influence traits remains challenging. Chromatin accessibility quantitative trait loci (caQTLs) help identify GWAS loci that may alter GWAS traits by modulating chromatin structure, but caQTLs have been identified in a limited set of human tissues. Here we mapped caQTLs in human liver tissue in 20 liver samples and identified 3,123 caQTLs. The caQTL variants are enriched in liver tissue promoter and enhancer states and frequently disrupt binding motifs of transcription factors expressed in liver. We predicted target genes for 861 caQTL peaks using proximity, chromatin interactions, correlation with promoter accessibility or gene expression, and colocalization with expression QTLs. Using GWAS signals for 19 liver function and/or cardiometabolic traits, we identified 110 colocalized caQTLs and GWAS signals, 56 of which contained a predicted caPeak target gene. At the LITAF LDL-cholesterol GWAS locus, we validated that a caQTL variant showed allelic differences in protein binding and transcriptional activity. These caQTLs contribute to the epigenomic characterization of human liver and help identify molecular mechanisms and genes at GWAS loci.


Subject(s)
Chromatin/metabolism , Liver/metabolism , Quantitative Trait Loci , Amino Acid Motifs , Binding Sites , Chromatin Assembly and Disassembly , Enhancer Elements, Genetic , Genetic Variation , Genome-Wide Association Study , Humans , Promoter Regions, Genetic , Protein Binding , Transcription Factors/chemistry , Transcription Factors/metabolism , Transcriptome
4.
Genome Res ; 31(12): 2258-2275, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34815310

ABSTRACT

Skeletal muscle accounts for the largest proportion of human body mass, on average, and is a key tissue in complex diseases and mobility. It is composed of several different cell and muscle fiber types. Here, we optimize single-nucleus ATAC-seq (snATAC-seq) to map skeletal muscle cell-specific chromatin accessibility landscapes in frozen human and rat samples, and single-nucleus RNA-seq (snRNA-seq) to map cell-specific transcriptomes in human. We additionally perform multi-omics profiling (gene expression and chromatin accessibility) on human and rat muscle samples. We capture type I and type II muscle fiber signatures, which are generally missed by existing single-cell RNA-seq methods. We perform cross-modality and cross-species integrative analyses on 33,862 nuclei and identify seven cell types ranging in abundance from 59.6% to 1.0% of all nuclei. We introduce a regression-based approach to infer cell types by comparing transcription start site-distal ATAC-seq peaks to reference enhancer maps and show consistency with RNA-based marker gene cell type assignments. We find heterogeneity in enrichment of genetic variants linked to complex phenotypes from the UK Biobank and diabetes genome-wide association studies in cell-specific ATAC-seq peaks, with the most striking enrichment patterns in muscle mesenchymal stem cells (∼3.5% of nuclei). Finally, we overlay these chromatin accessibility maps on GWAS data to nominate causal cell types, SNPs, transcription factor motifs, and target genes for type 2 diabetes signals. These chromatin accessibility profiles for human and rat skeletal muscle cell types are a useful resource for nominating causal GWAS SNPs and cell types.

5.
Hum Mol Genet ; 28(5): 736-750, 2019 03 01.
Article in English | MEDLINE | ID: mdl-30380057

ABSTRACT

Danforth's short tail (Sd) mice provide an excellent model for investigating the underlying etiology of human caudal birth defects, which affect 1 in 10 000 live births. Sd animals exhibit aberrant axial skeleton, urogenital and gastrointestinal development similar to human caudal malformation syndromes including urorectal septum malformation, caudal regression, vertebral-anal-cardiac-tracheo-esophageal fistula-renal-limb (VACTERL) association and persistent cloaca. Previous studies have shown that the Sd mutation results from an endogenous retroviral (ERV) insertion upstream of the Ptf1a gene resulting in its ectopic expression at E9.5. Though the genetic lesion has been determined, the resulting epigenomic and transcriptomic changes driving the phenotype have not been investigated. Here, we performed ATAC-seq experiments on isolated E9.5 tailbud tissue, which revealed minimal changes in chromatin accessibility in Sd/Sd mutant embryos. Interestingly, chromatin changes were localized to a small interval adjacent to the Sd ERV insertion overlapping a known Ptf1a enhancer region, which is conserved in mice and humans. Furthermore, mRNA-seq experiments revealed increased transcription of Ptf1a target genes and, importantly, downregulation of hedgehog pathway genes. Reduced sonic hedgehog (SHH) signaling was confirmed by in situ hybridization and immunofluorescence suggesting that the Sd phenotype results, in part, from downregulated SHH signaling. Taken together, these data demonstrate substantial transcriptome changes in the Sd mouse, and indicate that the effect of the ERV insertion on Ptf1a expression may be mediated by increased chromatin accessibility at a conserved Ptf1a enhancer. We propose that human caudal dysgenesis disorders may result from dysregulation of hedgehog signaling pathways.


Subject(s)
Chromatin Assembly and Disassembly , Chromatin/genetics , Chromatin/metabolism , Epigenome , Hedgehog Proteins/metabolism , Signal Transduction , Transcriptome , Animals , Biomarkers , Computational Biology/methods , Enhancer Elements, Genetic , Fluorescent Antibody Technique , Gene Expression Profiling , Gene Expression Regulation , Gene Ontology , Mice , Mutation , Organogenesis/genetics , Phenotype , Promoter Regions, Genetic
6.
Proc Natl Acad Sci U S A ; 114(9): 2301-2306, 2017 02 28.
Article in English | MEDLINE | ID: mdl-28193859

ABSTRACT

Genome-wide association studies (GWAS) have identified >100 independent SNPs that modulate the risk of type 2 diabetes (T2D) and related traits. However, the pathogenic mechanisms of most of these SNPs remain elusive. Here, we examined genomic, epigenomic, and transcriptomic profiles in human pancreatic islets to understand the links between genetic variation, chromatin landscape, and gene expression in the context of T2D. We first integrated genome and transcriptome variation across 112 islet samples to produce dense cis-expression quantitative trait loci (cis-eQTL) maps. Additional integration with chromatin-state maps for islets and other diverse tissue types revealed that cis-eQTLs for islet-specific genes are specifically and significantly enriched in islet stretch enhancers. High-resolution chromatin accessibility profiling using assay for transposase-accessible chromatin sequencing (ATAC-seq) in two islet samples enabled us to identify specific transcription factor (TF) footprints embedded in active regulatory elements, which are highly enriched for islet cis-eQTL. Aggregate allelic bias signatures in TF footprints enabled us de novo to reconstruct TF binding affinities genetically, which support the high-quality nature of the TF footprint predictions. Interestingly, we found that T2D GWAS loci were strikingly and specifically enriched in islet Regulatory Factor X (RFX) footprints. Remarkably, within and across independent loci, T2D risk alleles that overlap with RFX footprints uniformly disrupt the RFX motifs at high-information content positions. Together, these results suggest that common regulatory variations have shaped islet TF footprints and the transcriptome and that a confluent RFX regulatory grammar plays a significant role in the genetic component of T2D predisposition.


Subject(s)
Diabetes Mellitus, Type 2/genetics , Genetic Predisposition to Disease , Genome, Human , Islets of Langerhans/metabolism , Quantitative Trait Loci , Transcriptome , Alleles , Base Sequence , Binding Sites , Chromatin/chemistry , Chromatin/metabolism , Diabetes Mellitus, Type 2/metabolism , Diabetes Mellitus, Type 2/pathology , Epigenesis, Genetic , Gene Expression Profiling , Genetic Variation , Genome-Wide Association Study , Genomic Imprinting , Humans , Islets of Langerhans/pathology , Polymorphism, Single Nucleotide , Protein Binding , Protein Isoforms/genetics , Protein Isoforms/metabolism , Regulatory Factor X Transcription Factors/genetics , Regulatory Factor X Transcription Factors/metabolism
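
The abstract above refers to risk alleles disrupting RFX motifs at "high-information content positions". As a rough illustration of that term only (not the authors' pipeline), the sketch below computes the standard per-column information content of a position probability matrix, log2(4) minus the Shannon entropy of the column. The three-position motif and the 1-bit threshold are hypothetical values chosen for the example, and the calculation assumes a uniform background with no small-sample correction.

```python
import math

def column_information(probs, alphabet_size=4):
    """Information content (bits) of one motif column: log2(N) minus Shannon entropy."""
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    return math.log2(alphabet_size) - entropy

def high_information_positions(pwm, threshold=1.0):
    """Indices of columns whose information content is at least `threshold` bits."""
    return [i for i, col in enumerate(pwm) if column_information(col) >= threshold]

# Hypothetical 3-position motif; each row holds P(A), P(C), P(G), P(T).
motif = [
    [0.97, 0.01, 0.01, 0.01],  # strongly constrained position
    [0.25, 0.25, 0.25, 0.25],  # unconstrained position (0 bits)
    [0.05, 0.05, 0.85, 0.05],  # moderately constrained position
]

if __name__ == "__main__":
    for i, col in enumerate(motif):
        print(i, round(column_information(col), 3))
    print(high_information_positions(motif))
```

A variant that changes the preferred base at a column with high information content is expected to alter predicted binding more than one at a near-uniform column, which is the intuition behind the abstract's statement.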
7.
Pharmacogenomics J ; 18(4): 528-538, 2018 07.
Article in English | MEDLINE | ID: mdl-29795407

ABSTRACT

Methotrexate (MTX) monotherapy is a common first treatment for rheumatoid arthritis (RA), but many patients do not respond adequately. In order to identify genetic predictors of response, we have combined data from two consortia to carry out a genome-wide study of response to MTX in 1424 early RA patients of European ancestry. Clinical endpoints were change from baseline to 6 months after starting treatment in swollen 28-joint count, tender 28-joint count, C-reactive protein and the overall 3-component disease activity score (DAS28). No single nucleotide polymorphism (SNP) reached genome-wide statistical significance for any outcome measure. The strongest evidence for association was with rs168201 in NRG3 (p = 10⁻⁷ for change in DAS28). Some support was also seen for association with ZMIZ1, previously highlighted in a study of response to MTX in juvenile idiopathic arthritis. Follow-up in two smaller cohorts of 429 and 177 RA patients did not support these findings, although these cohorts were more heterogeneous.


Subject(s)
Antirheumatic Agents/therapeutic use , Arthritis, Rheumatoid/drug therapy , Genome-Wide Association Study , Methotrexate/therapeutic use , Antirheumatic Agents/adverse effects , Arthritis, Rheumatoid/genetics , Arthritis, Rheumatoid/physiopathology , C-Reactive Protein/genetics , Humans , Methotrexate/adverse effects , Neuregulins/genetics , Severity of Illness Index , Transcription Factors/genetics
8.
J Med Internet Res ; 20(9): e263, 2018 09 21.
Article in English | MEDLINE | ID: mdl-30249589

ABSTRACT

BACKGROUND: Telemonitoring of symptoms and physiological signs has been suggested as a means of early detection of chronic obstructive pulmonary disease (COPD) exacerbations, with a view to instituting timely treatment. However, algorithms to identify exacerbations result in frequent false-positive results and increased workload. Machine learning, when applied to predictive modelling, can determine patterns of risk factors useful for improving prediction quality. OBJECTIVE: Our objectives were to (1) establish whether machine learning techniques applied to telemonitoring datasets improve prediction of hospital admissions and decisions to start corticosteroids, and (2) determine whether the addition of weather data further improves such predictions. METHODS: We used daily symptoms, physiological measures, and medication data, with baseline demography, COPD severity, quality of life, and hospital admissions from a pilot and large randomized controlled trial of telemonitoring in COPD. We linked weather data from the United Kingdom meteorological service. We used feature selection and extraction techniques for time series to construct up to 153 predictive patterns (features) from symptom, medication, and physiological measurements. We used the resulting variables to construct predictive models fitted to training sets of patients and compared them with common symptom-counting algorithms. RESULTS: We had a mean 363 days of telemonitoring data from 135 patients. The two most practical traditional score-counting algorithms, restricted to cases with complete data, resulted in area under the receiver operating characteristic curve (AUC) estimates of 0.60 (95% CI 0.51-0.69) and 0.58 (95% CI 0.50-0.67) for predicting admissions based on a single day's readings. 
However, in a real-world scenario allowing for missing data, with greater numbers of patient daily data and hospitalizations (N=57,150, N+=55, respectively), the performance of all the traditional algorithms fell, including those based on 2 days' data. One of the most frequently used algorithms performed no better than chance. All considered machine learning models demonstrated significant improvements; the best machine learning algorithm based on 57,150 episodes resulted in an aggregated AUC of 0.74 (95% CI 0.67-0.80). Adding weather data measurements did not improve the predictive performance of the best model (AUC 0.74, 95% CI 0.69-0.79). To achieve an 80% true-positive rate (sensitivity), the traditional algorithms were associated with an 80% false-positive rate: our algorithm halved this rate to approximately 40% (specificity approximately 60%). The machine learning algorithm was moderately superior to the best symptom-counting algorithm (AUC 0.77, 95% CI 0.74-0.79 vs AUC 0.66, 95% CI 0.63-0.68) at predicting the need for corticosteroids. CONCLUSIONS: Early detection and management of COPD remains an important goal given its huge personal and economic costs. Machine learning approaches, which can be tailored to an individual's baseline profile and can learn from experience of the individual patient, are superior to existing predictive algorithms and show promise in achieving this goal. TRIAL REGISTRATION: International Standard Randomized Controlled Trial Number ISRCTN96634935; http://www.isrctn.com/ISRCTN96634935 (Archived by WebCite at http://www.webcitation.org/722YkuhAz).


Subject(s)
Hospitalization/trends , Machine Learning/trends , Pulmonary Disease, Chronic Obstructive/therapy , Quality of Life/psychology , Algorithms , Female , Humans , Male
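
The performance figures in the abstract above rest on two standard quantities: the area under the ROC curve (AUC) and the relationship specificity = 1 - false-positive rate. The sketch below is a minimal illustration of both, assuming the rank-based (Mann-Whitney) definition of AUC. The function names and the toy scores are hypothetical and are not taken from the study.

```python
def auc_from_scores(pos_scores, neg_scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case; ties count as half a win."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

def specificity_from_fpr(false_positive_rate):
    """Specificity is the complement of the false-positive rate."""
    return 1.0 - false_positive_rate

# Toy risk scores (hypothetical): days followed by an admission vs. days without one.
admission_days = [0.9, 0.8, 0.4]
other_days = [0.7, 0.3, 0.2, 0.1]

if __name__ == "__main__":
    print(auc_from_scores(admission_days, other_days))
    # False-positive rates quoted in the abstract at 80% sensitivity.
    print(specificity_from_fpr(0.80))  # traditional algorithms
    print(specificity_from_fpr(0.40))  # machine learning model
```

Halving the false-positive rate from 80% to about 40% at a fixed 80% sensitivity corresponds to raising specificity from about 20% to about 60%, which matches the figure given in the abstract.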
9.
Mol Microbiol ; 100(4): 607-20, 2016 05.
Article in English | MEDLINE | ID: mdl-26815905

ABSTRACT

Protection against antimicrobial peptides (AMPs) often involves the parallel production of multiple, well-characterized resistance determinants. So far, little is known about how these resistance modules interact and how they jointly protect the cell. Here, we studied the interdependence between different layers of the envelope stress response of Bacillus subtilis when challenged with the lipid II cycle-inhibiting AMP bacitracin. The underlying regulatory network orchestrates the production of the ABC transporter BceAB, the UPP phosphatase BcrC and the phage-shock proteins LiaIH. Our systems-level analysis reveals a clear hierarchy, allowing us to discriminate between primary (BceAB) and secondary (BcrC and LiaIH) layers of bacitracin resistance. Deleting the primary layer provokes an enhanced induction of the secondary layer to partially compensate for this loss. This study reveals a direct role of LiaIH in bacitracin resistance, provides novel insights into the feedback regulation of the Lia system, and demonstrates a pivotal role of BcrC in maintaining cell wall homeostasis. The compensatory regulation within the bacitracin network can also explain how gene expression noise propagates between resistance layers. We suggest that this active redundancy in the bacitracin resistance network of B. subtilis is a general principle to be found in many bacterial antibiotic resistance networks.


Subject(s)
Anti-Bacterial Agents/pharmacology , Bacillus subtilis/drug effects , Bacillus subtilis/genetics , Bacitracin/pharmacology , Bacterial Proteins/genetics , Drug Resistance, Bacterial , Cell Wall/metabolism , Drug Resistance, Bacterial/genetics , Gene Expression Regulation, Bacterial/drug effects , Signal Transduction/drug effects
10.
Australas Psychiatry ; 22(2): 165-9, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24452322

ABSTRACT

OBJECTIVE: The purpose of CanTeen's E-Mental Health Service for Young People Living With Cancer (YPLWC) is to meet the unique psychosocial needs of young people (12-24 years) in Australia impacted by cancer (either as a patient or family member of someone with cancer). CONCLUSIONS: This online platform will provide the primary site where all YPLWC can find information, connect with others going through a similar experience, express their feelings, utilise tools for support and access professional psychosocial support services that will meet their individual needs. The overall outcome of the service will be to ensure that the YPLWC visiting the site experience optimal psychological wellbeing. Ultimately, the service's value will be in improving the lives of young people who engage with it and the follow-on effect that this will have on their families and communities in the long-term.


Subject(s)
Hotlines , Internet , Mental Health Services/organization & administration , Neoplasms/psychology , Adolescent , Australia , Child , Family Health , Female , Humans , Male , Neoplasms/complications , Young Adult
11.
medRxiv ; 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38699360

ABSTRACT

Mosaic loss of Y (mLOY) is the most common somatic chromosomal alteration detected in human blood. The presence of mLOY is associated with altered blood cell counts and increased risk of Alzheimer's disease, solid tumors, and other age-related diseases. We sought to gain a better understanding of genetic drivers and associated phenotypes of mLOY through analyses of whole genome sequencing of a large set of genetically diverse males from the Trans-Omics for Precision Medicine (TOPMed) program. This approach enabled us to identify differences in mLOY frequencies across populations defined by genetic similarity, revealing a higher frequency of mLOY in the European American (EA) ancestry group compared to those of Hispanic American (HA), African American (AA), and East Asian (EAS) ancestry. Further, we identified two genes (CFHR1 and LRP6) that harbor multiple rare, putatively deleterious variants associated with mLOY susceptibility, and we show that subsets of human hematopoietic stem cells are enriched for activity of mLOY susceptibility variants and that certain alleles on chromosome Y are more likely to be lost than others.

12.
Genome Biol ; 24(1): 31, 2023 02 21.
Article in English | MEDLINE | ID: mdl-36810122

ABSTRACT

The current version of the human reference genome, GRCh38, contains a number of errors including 1.2 Mbp of falsely duplicated and 8.04 Mbp of collapsed regions. These errors impact the variant calling of 33 protein-coding genes, including 12 with medical relevance. Here, we present FixItFelix, an efficient remapping approach, together with a modified version of the GRCh38 reference genome that improves the subsequent analysis across these genes within minutes for an existing alignment file while maintaining the same coordinates. We showcase these improvements over multi-ethnic control samples, demonstrating improvements for population variant calling as well as eQTL studies.


Subject(s)
Genome, Human , Genomics , Humans , High-Throughput Nucleotide Sequencing , Sequence Analysis, DNA
13.
Res Sq ; 2023 Oct 18.
Article in English | MEDLINE | ID: mdl-37886586

ABSTRACT

Genome-wide association studies (GWAS) have identified over 100 signals associated with type 1 diabetes (T1D). However, translating any given T1D GWAS signal into mechanistic insights, including putative causal variants and the context (cell type and cell state) in which they function, has been limited. Here, we present a comprehensive multi-omic integrative analysis of single-cell/nucleus resolution profiles of gene expression and chromatin accessibility in healthy and autoantibody+ (AAB+) human islets, as well as islets under multiple T1D stimulatory conditions. We broadly nominate effector cell types for all T1D GWAS signals. We further nominate higher-resolution contexts, including effector cell types, regulatory elements, and genes for three independent T1D risk variants acting through islet cells within the pancreas at the DLK1/MEG3, RASGRP1, and TOX loci. Subsequently, we created isogenic gene knockouts DLK1-/-, RASGRP1-/-, and TOX-/-, and the corresponding regulatory region knockouts, RASGRP1Δ and DLK1Δ, in hESCs. Loss of RASGRP1 or DLK1, as well as knockout of the regulatory region of RASGRP1 or DLK1, increased β cell apoptosis. Additionally, pancreatic β cells derived from isogenic hESCs carrying the risk allele of rs3783355A/A exhibited increased β cell death. Finally, RNA-seq and ATAC-seq identified five genes upregulated in both RASGRP1-/- and DLK1-/- β-like cells, four of which are associated with T1D. Together, this work reports an integrative approach for combining single cell multi-omics, GWAS, and isogenic hESC-derived β-like cells to prioritize the T1D associated signals and their underlying context-specific cell types, genes, SNPs, and regulatory elements, to illuminate biological functions and molecular mechanisms.

14.
bioRxiv ; 2023 Dec 15.
Article in English | MEDLINE | ID: mdl-38168419

ABSTRACT

Skeletal muscle, the largest human organ by weight, is relevant to several polygenic metabolic traits and diseases including type 2 diabetes (T2D). Identifying genetic mechanisms underlying these traits requires pinpointing the relevant cell types, regulatory elements, target genes, and causal variants. Here, we used genetic multiplexing to generate population-scale single nucleus (sn) chromatin accessibility (snATAC-seq) and transcriptome (snRNA-seq) maps across 287 frozen human skeletal muscle biopsies representing 456,880 nuclei. We identified 13 cell types that collectively represented 983,155 ATAC summits. We integrated genetic variation to discover 6,866 expression quantitative trait loci (eQTL) and 100,928 chromatin accessibility QTL (caQTL) (5% FDR) across the five most abundant cell types, cataloging caQTL peaks that atlas-level snATAC maps often miss. We identified 1,973 eGenes colocalized with caQTL and used mediation analyses to construct causal directional maps for chromatin accessibility and gene expression. 3,378 genome-wide association study (GWAS) signals across 43 relevant traits colocalized with sn-e/caQTL, 52% in a cell-specific manner. 77% of GWAS signals colocalized with caQTL and not eQTL, highlighting the critical importance of population-scale chromatin profiling for GWAS functional studies. GWAS-caQTL colocalization showed distinct cell-specific regulatory paradigms. For example, a C2CD4A/B T2D GWAS signal colocalized with caQTL in muscle fibers and multiple chromatin loop models nominated VPS13C, a glucose uptake gene. Sequence of the caQTL peak overlapping caSNP rs7163757 showed allelic regulatory activity differences in a human myocyte cell line massively parallel reporter assay. These results illuminate the genetic regulatory architecture of human skeletal muscle at high-resolution epigenomic, transcriptomic, and cell state scales and serve as a template for population-scale multi-omic mapping in complex tissues and traits.

15.
Genome Biol ; 23(1): 105, 2022 04 26.
Article in English | MEDLINE | ID: mdl-35473573

ABSTRACT

BACKGROUND: Revealing the gene targets of distal regulatory elements is challenging yet critical for interpreting regulome data. Experiment-derived enhancer-gene links are restricted to a small set of enhancers and/or cell types, while the accuracy of genome-wide approaches remains elusive due to the lack of a systematic evaluation. We combined multiple spatial and in silico approaches for defining enhancer locations and linking them to their target genes aggregated across >500 cell types, generating 1860 human genome-wide distal enhancer-to-target gene definitions (EnTDefs). To evaluate performance, we used gene set enrichment (GSE) testing on 87 independent ENCODE ChIP-seq datasets of 34 transcription factors (TFs) and assessed concordance of results with known TF Gene Ontology annotations, and other benchmarks. RESULTS: The top ranked 741 (40%) EnTDefs significantly outperform the common, naïve approach of linking distal regions to the nearest genes, and the top 10 EnTDefs perform well when applied to ChIP-seq data of other cell types. The GSE-based ranking of EnTDefs is highly concordant with ranking based on overlap with curated benchmarks of enhancer-gene interactions. Both our top general EnTDef and cell-type-specific EnTDefs significantly outperform seven independent computational and experiment-based enhancer-gene pair datasets. We show that using our top EnTDefs for GSE with either genome-wide DNA methylation or ATAC-seq data is able to better recapitulate the biological processes changed in gene expression data performed in parallel for the same experiment than our lower-ranked EnTDefs. CONCLUSIONS: Our findings illustrate the power of our approach to provide genome-wide interpretation regardless of cell type.


Subject(s)
Chromatin Immunoprecipitation Sequencing , Regulatory Sequences, Nucleic Acid , DNA , Genome, Human , Humans , Molecular Sequence Annotation
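
The abstract above benchmarks its enhancer-to-target gene definitions against "the common, naïve approach of linking distal regions to the nearest genes". As a minimal sketch of that baseline only (not of the EnTDefs themselves), the code below assigns a region to the gene with the closest transcription start site on a single chromosome. The gene names and coordinates are hypothetical, and the sketch assumes the TSS list is sorted in ascending order.

```python
from bisect import bisect_left

def nearest_gene(region_mid, tss_positions, gene_names):
    """Return the gene whose TSS is closest to `region_mid`.

    `tss_positions` must be sorted ascending and aligned with `gene_names`.
    On an exact tie, the gene with the lower coordinate is returned.
    """
    i = bisect_left(tss_positions, region_mid)
    # Only the TSS just below and just at/above the region can be the nearest one.
    candidates = [j for j in (i - 1, i) if 0 <= j < len(tss_positions)]
    best = min(candidates, key=lambda j: abs(tss_positions[j] - region_mid))
    return gene_names[best]

# Hypothetical TSS coordinates on one chromosome.
tss = [1000, 5000, 20000]
genes = ["GENE_A", "GENE_B", "GENE_C"]

if __name__ == "__main__":
    for midpoint in (500, 4200, 13000, 30000):
        print(midpoint, nearest_gene(midpoint, tss, genes))
```

The baseline ignores chromatin contacts and cell type entirely, which is why a distal enhancer can be assigned to a gene it does not regulate; the study's comparison quantifies how much the alternative definitions improve on this.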
16.
Nat Commun ; 12(1): 1307, 2021 02 26.
Article in English | MEDLINE | ID: mdl-33637709

ABSTRACT

Interactions between transcription factors and chromatin are fundamental to genome organization and regulation and, ultimately, cell state. Here, we use information theory to measure signatures of organized chromatin resulting from transcription factor-chromatin interactions encoded in the patterns of the accessible genome, which we term chromatin information enrichment (CIE). We calculate CIE for hundreds of transcription factor motifs across human samples and identify two classes: low and high CIE. The 10-20% of common and tissue-specific high CIE transcription factor motifs associate with higher protein-DNA residence time, including different binding site subclasses of the same transcription factor, increased nucleosome phasing, specific protein domains, and the genetic control of both chromatin accessibility and gene expression. These results show that variations in the information encoded in chromatin architecture reflect functional biological variation, with implications for cell state dynamics and memory.


Subject(s)
Chromatin/metabolism , DNA/metabolism , Transcription Factors/metabolism , Transcription, Genetic/physiology , Binding Sites , Cell Line , DNA-Binding Proteins , Gene Expression Regulation , Hep G2 Cells , Humans , Nucleosomes
17.
Diabetes ; 70(7): 1581-1591, 2021 07.
Article in English | MEDLINE | ID: mdl-33849996

ABSTRACT

Identifying the tissue-specific molecular signatures of active regulatory elements is critical to understand gene regulatory mechanisms. Here, we identify transcription start sites (TSS) using cap analysis of gene expression (CAGE) across 57 human pancreatic islet samples. We identify 9,954 reproducible CAGE tag clusters (TCs), ∼20% of which are islet specific and occur mostly distal to known gene TSS. We integrated islet CAGE data with histone modification and chromatin accessibility profiles to identify epigenomic signatures of transcription initiation. Using a massively parallel reporter assay, we validated the transcriptional enhancer activity for 2,279 of 3,378 (∼68%) tested islet CAGE elements (5% false discovery rate). TCs within accessible enhancers show higher enrichment to overlap type 2 diabetes genome-wide association study (GWAS) signals than existing islet annotations, which emphasizes the utility of mapping CAGE profiles in disease-relevant tissue. This work provides a high-resolution map of transcriptional initiation in human pancreatic islets with utility for dissecting active enhancers at GWAS loci.


Subject(s)
Islets of Langerhans/physiology , Transcription Initiation Site , Enhancer Elements, Genetic , Genome-Wide Association Study , Humans , Polymorphism, Single Nucleotide , Quantitative Trait Loci
18.
Cell Syst ; 10(3): 298-306.e4, 2020 03 25.
Article in English | MEDLINE | ID: mdl-32213349

ABSTRACT

The assay for transposase-accessible chromatin using sequencing (ATAC-seq) has become the preferred method for mapping chromatin accessibility due to its time and input material efficiency. However, it can be difficult to evaluate data quality and identify sources of technical bias across samples. Here, we present ataqv, a computational toolkit for efficiently measuring, visualizing, and comparing quality control (QC) results across samples and experiments. We use ataqv to analyze 2,009 public ATAC-seq datasets; their QC metrics display a 10-fold range. Tn5 dosage experiments and statistical modeling show that technical variation in the ratio of Tn5 transposase to nuclei and sequencing flowcell density induces systematic bias in ATAC-seq data by changing the enrichment of reads across functional genomic annotations including promoters, enhancers, and transcription-factor-bound regions, with the notable exception of CTCF. ataqv can be integrated into existing computational pipelines and is freely available at https://github.com/ParkerLab/ataqv/.


Subject(s)
Chromatin Immunoprecipitation Sequencing/methods , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Animals , Bias , Chromatin/genetics , Computational Biology/methods , Humans , Promoter Regions, Genetic/genetics , Quality Control , Regulatory Sequences, Nucleic Acid/genetics , Software , Transcription Factors/genetics , Transposases/genetics , Transposases/metabolism
19.
Nat Commun ; 11(1): 2379, 2020 05 13.
Article in English | MEDLINE | ID: mdl-32404872

ABSTRACT

Brown and beige fat share a remarkably similar transcriptional program that supports fuel oxidation and thermogenesis. The chromatin-remodeling machinery that governs genome accessibility and renders adipocytes poised for thermogenic activation remains elusive. Here we show that BAF60a, a subunit of the SWI/SNF chromatin-remodeling complexes, serves an indispensable role in cold-induced thermogenesis in brown fat. BAF60a maintains chromatin accessibility at PPARγ and EBF2 binding sites for key thermogenic genes. Surprisingly, fat-specific BAF60a inactivation triggers more pronounced cold-induced browning of inguinal white adipose tissue that is linked to induction of MC2R, a receptor for the pituitary hormone ACTH. Elevated MC2R expression sensitizes adipocytes and BAF60a-deficient adipose tissue to thermogenic activation in response to ACTH stimulation. These observations reveal an unexpected dichotomous role of BAF60a-mediated chromatin remodeling in transcriptional control of brown and beige gene programs and illustrate a pituitary-adipose signaling axis in the control of thermogenesis.


Subject(s)
Adipose Tissue, Brown/metabolism , Adipose Tissue, White/metabolism , Chromatin/metabolism , Chromosomal Proteins, Non-Histone/deficiency , Cold Temperature , Adipocytes, Brown/drug effects , Adipocytes, Brown/metabolism , Adipocytes, Brown/ultrastructure , Adipose Tissue, Beige/metabolism , Adipose Tissue, Brown/drug effects , Adipose Tissue, White/drug effects , Adrenocorticotropic Hormone/pharmacology , Animals , Basic Helix-Loop-Helix Transcription Factors/metabolism , Binding Sites/genetics , Cells, Cultured , Chromatin/genetics , Chromosomal Proteins, Non-Histone/genetics , Gene Expression/drug effects , Membrane Proteins/genetics , Membrane Proteins/metabolism , Mice, Inbred C57BL , Mice, Knockout , Mice, Transgenic , Peroxisome Proliferator-Activated Receptor Gamma Coactivator 1-alpha/metabolism , Thermogenesis/drug effects , Thermogenesis/genetics
20.
Nat Commun ; 11(1): 4912, 2020 09 30.
Article in English | MEDLINE | ID: mdl-32999275

ABSTRACT

Most signals detected by genome-wide association studies map to non-coding sequence and their tissue-specific effects influence transcriptional regulation. However, key tissues and cell-types required for functional inference are absent from large-scale resources. Here we explore the relationship between genetic variants influencing predisposition to type 2 diabetes (T2D) and related glycemic traits, and human pancreatic islet transcription using data from 420 donors. We find: (a) 7741 cis-eQTLs in islets with a replication rate across 44 GTEx tissues between 40% and 73%; (b) marked overlap between islet cis-eQTL signals and active regulatory sequences in islets, with reduced eQTL effect size observed in the stretch enhancers most strongly implicated in GWAS signal location; (c) enrichment of islet cis-eQTL signals with T2D risk variants identified in genome-wide association studies; and (d) colocalization between 47 islet cis-eQTLs and variants influencing T2D or glycemic traits, including DGKB and TCF7L2. Our findings illustrate the advantages of performing functional and regulatory studies in disease relevant tissues.


Subject(s)
Blood Glucose/genetics , Diabetes Mellitus, Type 2/genetics , Genetic Predisposition to Disease , Islets of Langerhans/metabolism , Quantitative Trait Loci , Adolescent , Adult , Aged , Aged, 80 and over , Animals , Blood Glucose/metabolism , Cell Line, Tumor , Cohort Studies , Diabetes Mellitus, Type 2/blood , Diacylglycerol Kinase/genetics , Diacylglycerol Kinase/metabolism , Enhancer Elements, Genetic , Female , Gene Expression Regulation , Genome-Wide Association Study , Humans , Male , Mice , Middle Aged , Polymorphism, Single Nucleotide , RNA-Seq , Sequence Analysis, DNA , Transcription Factor 7-Like 2 Protein/genetics , Transcription Factor 7-Like 2 Protein/metabolism , Young Adult