Results 1 - 20 of 24
1.
Genome Res ; 2024 Sep 10.
Article in English | MEDLINE | ID: mdl-39255977

ABSTRACT

Pleiotropy, measured as expression breadth across tissues, is one of the best predictors for protein sequence and expression conservation. In this study, we investigated its effect on the evolution of cis-regulatory elements (CREs). To this end, we carefully reanalyzed the Epigenomics Roadmap data for nine fetal tissues, assigning a measure of pleiotropic degree to nearly half a million CREs. To assess the functional conservation of CREs, we generated ATAC-seq and RNA-seq data from humans and macaques. We found that more pleiotropic CREs exhibit greater conservation in accessibility, and the mRNA expression levels of the associated genes are more conserved. This trend of higher conservation for higher degrees of pleiotropy persists when analyzing the transcription factor binding repertoire. In contrast, simple DNA sequence conservation of orthologous sites between species tends to be even lower for pleiotropic CREs than for species-specific CREs. Combining various lines of evidence, we propose that the lack of sequence conservation in functionally conserved pleiotropic CREs is due to within-element compensatory evolution. In summary, our findings suggest that pleiotropy is also a good predictor for the functional conservation of CREs, even though this is not reflected in the sequence conservation of pleiotropic CREs.
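
As an illustration of "pleiotropy measured as breadth across tissues", the following minimal sketch counts, for each CRE, the number of tissues in which it is called active. The input layout, the CRE identifiers, and the tissue names are hypothetical; the abstract does not describe how the study actually scored activity or assigned the pleiotropic degree, so this should be read as one plausible simplification only.

```python
# Hypothetical input: CRE id -> set of tissues in which the element is called active.
# Identifiers and tissue names are invented for illustration.
activity = {
    "cre_A": {"brain", "heart", "liver", "kidney"},
    "cre_B": {"brain"},
    "cre_C": {"heart", "liver"},
}


def pleiotropic_degree(activity_by_cre):
    """Return CRE id -> number of tissues with activity (a simple breadth measure)."""
    return {cre: len(tissues) for cre, tissues in activity_by_cre.items()}


degrees = pleiotropic_degree(activity)

# Order CREs from most to least pleiotropic under this simplified measure.
ranked = sorted(degrees, key=degrees.get, reverse=True)
print(degrees)
print(ranked)
```

With nine tissues, as in the abstract, such a degree would range from 1 (tissue-specific) to 9 (active in every tissue).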

2.
medRxiv ; 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38699360

ABSTRACT

Mosaic loss of Y (mLOY) is the most common somatic chromosomal alteration detected in human blood. The presence of mLOY is associated with altered blood cell counts and increased risk of Alzheimer's disease, solid tumors, and other age-related diseases. We sought to gain a better understanding of genetic drivers and associated phenotypes of mLOY through analyses of whole genome sequencing of a large set of genetically diverse males from the Trans-Omics for Precision Medicine (TOPMed) program. This approach enabled us to identify differences in mLOY frequencies across populations defined by genetic similarity, revealing a higher frequency of mLOY in the European American (EA) ancestry group compared to those of Hispanic American (HA), African American (AA), and East Asian (EAS) ancestry. Further, we identified two genes (CFHR1 and LRP6) that harbor multiple rare, putatively deleterious variants associated with mLOY susceptibility, showed that subsets of human hematopoietic stem cells are enriched for activity of mLOY susceptibility variants, and found that certain alleles on chromosome Y are more likely to be lost than others.

3.
Nature ; 624(7992): 621-629, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38049589

ABSTRACT

Type 2 diabetes mellitus (T2D), a major cause of worldwide morbidity and mortality, is characterized by dysfunction of insulin-producing pancreatic islet β cells [1,2]. T2D genome-wide association studies (GWAS) have identified hundreds of signals in non-coding and β cell regulatory genomic regions, but deciphering their biological mechanisms remains challenging [3-5]. Here, to identify early disease-driving events, we performed traditional and multiplexed pancreatic tissue imaging, sorted-islet cell transcriptomics and islet functional analysis of early-stage T2D and control donors. By integrating diverse modalities, we show that early-stage T2D is characterized by β cell-intrinsic defects that can be proportioned into gene regulatory modules with enrichment in signals of genetic risk. After identifying the β cell hub gene and transcription factor RFX6 within one such module, we demonstrated multiple layers of genetic risk that converge on an RFX6-mediated network to reduce insulin secretion by β cells. RFX6 perturbation in primary human islet cells alters β cell chromatin architecture at regions enriched for T2D GWAS signals, and population-scale genetic analyses causally link genetically predicted reduced RFX6 expression with increased T2D risk. Understanding the molecular mechanisms of complex, systemic diseases necessitates integration of signals from multiple molecules, cells, organs and individuals, and thus we anticipate that this approach will be a useful template to identify and validate key regulatory networks and master hub genes for other diseases or traits using GWAS data.


Subjects
Diabetes Mellitus, Type 2; Gene Expression Profiling; Gene Regulatory Networks; Genetic Predisposition to Disease; Islets of Langerhans; Humans; Case-Control Studies; Cell Separation; Chromatin/metabolism; Diabetes Mellitus, Type 2/genetics; Diabetes Mellitus, Type 2/metabolism; Diabetes Mellitus, Type 2/pathology; Diabetes Mellitus, Type 2/physiopathology; Gene Regulatory Networks/genetics; Genome-Wide Association Study; Insulin Secretion; Islets of Langerhans/metabolism; Islets of Langerhans/pathology; Reproducibility of Results
4.
Res Sq ; 2023 Oct 18.
Article in English | MEDLINE | ID: mdl-37886586

ABSTRACT

Genome-wide association studies (GWAS) have identified over 100 signals associated with type 1 diabetes (T1D). However, translating any given T1D GWAS signal into mechanistic insights, including putative causal variants and the context (cell type and cell state) in which they function, has been limited. Here, we present a comprehensive multi-omic integrative analysis of single-cell/nucleus resolution profiles of gene expression and chromatin accessibility in healthy and autoantibody+ (AAB+) human islets, as well as islets under multiple T1D stimulatory conditions. We broadly nominate effector cell types for all T1D GWAS signals. We further nominated higher-resolution contexts, including effector cell types, regulatory elements, and genes for three independent T1D risk variants acting through islet cells within the pancreas at the DLK1/MEG3, RASGRP1, and TOX loci. Subsequently, we created isogenic gene knockouts DLK1-/-, RASGRP1-/-, and TOX-/-, and the corresponding regulatory region knockout, RASGRP1Δ, and DLK1Δ hESCs. Loss of RASGRP1 or DLK1, as well as knockout of the regulatory region of RASGRP1 or DLK1, increased β cell apoptosis. Additionally, pancreatic β cells derived from isogenic hESCs carrying the risk allele of rs3783355A/A exhibited increased β cell death. Finally, RNA-seq and ATAC-seq identified five genes upregulated in both RASGRP1-/- and DLK1-/- β-like cells, four of which are associated with T1D. Together, this work reports an integrative approach for combining single cell multi-omics, GWAS, and isogenic hESC-derived β-like cells to prioritize the T1D associated signals and their underlying context-specific cell types, genes, SNPs, and regulatory elements, to illuminate biological functions and molecular mechanisms.

5.
Genome Biol ; 24(1): 31, 2023 02 21.
Article in English | MEDLINE | ID: mdl-36810122

ABSTRACT

The current version of the human reference genome, GRCh38, contains a number of errors including 1.2 Mbp of falsely duplicated and 8.04 Mbp of collapsed regions. These errors impact the variant calling of 33 protein-coding genes, including 12 with medical relevance. Here, we present FixItFelix, an efficient remapping approach, together with a modified version of the GRCh38 reference genome that improves the subsequent analysis across these genes within minutes for an existing alignment file while maintaining the same coordinates. We showcase these improvements over multi-ethnic control samples, demonstrating improvements for population variant calling as well as eQTL studies.


Subjects
Genome, Human; Genomics; Humans; High-Throughput Nucleotide Sequencing; Sequence Analysis, DNA
6.
bioRxiv ; 2023 Dec 15.
Article in English | MEDLINE | ID: mdl-38168419

ABSTRACT

Skeletal muscle, the largest human organ by weight, is relevant to several polygenic metabolic traits and diseases including type 2 diabetes (T2D). Identifying genetic mechanisms underlying these traits requires pinpointing the relevant cell types, regulatory elements, target genes, and causal variants. Here, we used genetic multiplexing to generate population-scale single nucleus (sn) chromatin accessibility (snATAC-seq) and transcriptome (snRNA-seq) maps across 287 frozen human skeletal muscle biopsies representing 456,880 nuclei. We identified 13 cell types that collectively represented 983,155 ATAC summits. We integrated genetic variation to discover 6,866 expression quantitative trait loci (eQTL) and 100,928 chromatin accessibility QTL (caQTL) (5% FDR) across the five most abundant cell types, cataloging caQTL peaks that atlas-level snATAC maps often miss. We identified 1,973 eGenes colocalized with caQTL and used mediation analyses to construct causal directional maps for chromatin accessibility and gene expression. 3,378 genome-wide association study (GWAS) signals across 43 relevant traits colocalized with sn-e/caQTL, 52% in a cell-specific manner. 77% of GWAS signals colocalized with caQTL and not eQTL, highlighting the critical importance of population-scale chromatin profiling for GWAS functional studies. GWAS-caQTL colocalization showed distinct cell-specific regulatory paradigms. For example, a C2CD4A/B T2D GWAS signal colocalized with caQTL in muscle fibers and multiple chromatin loop models nominated VPS13C, a glucose uptake gene. Sequence of the caQTL peak overlapping caSNP rs7163757 showed allelic regulatory activity differences in a human myocyte cell line massively parallel reporter assay. These results illuminate the genetic regulatory architecture of human skeletal muscle at high-resolution epigenomic, transcriptomic, and cell state scales and serve as a template for population-scale multi-omic mapping in complex tissues and traits.

7.
Genome Biol ; 23(1): 105, 2022 04 26.
Article in English | MEDLINE | ID: mdl-35473573

ABSTRACT

BACKGROUND: Revealing the gene targets of distal regulatory elements is challenging yet critical for interpreting regulome data. Experiment-derived enhancer-gene links are restricted to a small set of enhancers and/or cell types, while the accuracy of genome-wide approaches remains elusive due to the lack of a systematic evaluation. We combined multiple spatial and in silico approaches for defining enhancer locations and linking them to their target genes aggregated across >500 cell types, generating 1860 human genome-wide distal enhancer-to-target gene definitions (EnTDefs). To evaluate performance, we used gene set enrichment (GSE) testing on 87 independent ENCODE ChIP-seq datasets of 34 transcription factors (TFs) and assessed concordance of results with known TF Gene Ontology annotations, and other benchmarks. RESULTS: The top ranked 741 (40%) EnTDefs significantly outperform the common, naïve approach of linking distal regions to the nearest genes, and the top 10 EnTDefs perform well when applied to ChIP-seq data of other cell types. The GSE-based ranking of EnTDefs is highly concordant with ranking based on overlap with curated benchmarks of enhancer-gene interactions. Both our top general EnTDef and cell-type-specific EnTDefs significantly outperform seven independent computational and experiment-based enhancer-gene pair datasets. We show that using our top EnTDefs for GSE with either genome-wide DNA methylation or ATAC-seq data is able to better recapitulate the biological processes changed in gene expression data performed in parallel for the same experiment than our lower-ranked EnTDefs. CONCLUSIONS: Our findings illustrate the power of our approach to provide genome-wide interpretation regardless of cell type.


Subjects
Chromatin Immunoprecipitation Sequencing; Regulatory Sequences, Nucleic Acid; DNA; Genome, Human; Humans; Molecular Sequence Annotation
8.
Genome Res ; 31(12): 2258-2275, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34815310

ABSTRACT

Skeletal muscle accounts for the largest proportion of human body mass, on average, and is a key tissue in complex diseases and mobility. It is composed of several different cell and muscle fiber types. Here, we optimize single-nucleus ATAC-seq (snATAC-seq) to map skeletal muscle cell-specific chromatin accessibility landscapes in frozen human and rat samples, and single-nucleus RNA-seq (snRNA-seq) to map cell-specific transcriptomes in human. We additionally perform multi-omics profiling (gene expression and chromatin accessibility) on human and rat muscle samples. We capture type I and type II muscle fiber signatures, which are generally missed by existing single-cell RNA-seq methods. We perform cross-modality and cross-species integrative analyses on 33,862 nuclei and identify seven cell types ranging in abundance from 59.6% to 1.0% of all nuclei. We introduce a regression-based approach to infer cell types by comparing transcription start site-distal ATAC-seq peaks to reference enhancer maps and show consistency with RNA-based marker gene cell type assignments. We find heterogeneity in enrichment of genetic variants linked to complex phenotypes from the UK Biobank and diabetes genome-wide association studies in cell-specific ATAC-seq peaks, with the most striking enrichment patterns in muscle mesenchymal stem cells (∼3.5% of nuclei). Finally, we overlay these chromatin accessibility maps on GWAS data to nominate causal cell types, SNPs, transcription factor motifs, and target genes for type 2 diabetes signals. These chromatin accessibility profiles for human and rat skeletal muscle cell types are a useful resource for nominating causal GWAS SNPs and cell types.

9.
Am J Hum Genet ; 108(7): 1169-1189, 2021 07 01.
Article in English | MEDLINE | ID: mdl-34038741

ABSTRACT

Identifying the molecular mechanisms by which genome-wide association study (GWAS) loci influence traits remains challenging. Chromatin accessibility quantitative trait loci (caQTLs) help identify GWAS loci that may alter GWAS traits by modulating chromatin structure, but caQTLs have been identified in a limited set of human tissues. Here we mapped caQTLs in human liver tissue in 20 liver samples and identified 3,123 caQTLs. The caQTL variants are enriched in liver tissue promoter and enhancer states and frequently disrupt binding motifs of transcription factors expressed in liver. We predicted target genes for 861 caQTL peaks using proximity, chromatin interactions, correlation with promoter accessibility or gene expression, and colocalization with expression QTLs. Using GWAS signals for 19 liver function and/or cardiometabolic traits, we identified 110 colocalized caQTLs and GWAS signals, 56 of which contained a predicted caPeak target gene. At the LITAF LDL-cholesterol GWAS locus, we validated that a caQTL variant showed allelic differences in protein binding and transcriptional activity. These caQTLs contribute to the epigenomic characterization of human liver and help identify molecular mechanisms and genes at GWAS loci.


Subjects
Chromatin/metabolism; Liver/metabolism; Quantitative Trait Loci; Amino Acid Motifs; Binding Sites; Chromatin Assembly and Disassembly; Enhancer Elements, Genetic; Genetic Variation; Genome-Wide Association Study; Humans; Promoter Regions, Genetic; Protein Binding; Transcription Factors/chemistry; Transcription Factors/metabolism; Transcriptome
10.
Diabetes ; 70(7): 1581-1591, 2021 07.
Article in English | MEDLINE | ID: mdl-33849996

ABSTRACT

Identifying the tissue-specific molecular signatures of active regulatory elements is critical to understand gene regulatory mechanisms. Here, we identify transcription start sites (TSS) using cap analysis of gene expression (CAGE) across 57 human pancreatic islet samples. We identify 9,954 reproducible CAGE tag clusters (TCs), ∼20% of which are islet specific and occur mostly distal to known gene TSS. We integrated islet CAGE data with histone modification and chromatin accessibility profiles to identify epigenomic signatures of transcription initiation. Using a massively parallel reporter assay, we validated the transcriptional enhancer activity for 2,279 of 3,378 (∼68%) tested islet CAGE elements (5% false discovery rate). TCs within accessible enhancers show higher enrichment to overlap type 2 diabetes genome-wide association study (GWAS) signals than existing islet annotations, which emphasizes the utility of mapping CAGE profiles in disease-relevant tissue. This work provides a high-resolution map of transcriptional initiation in human pancreatic islets with utility for dissecting active enhancers at GWAS loci.


Subjects
Islets of Langerhans/physiology; Transcription Initiation Site; Enhancer Elements, Genetic; Genome-Wide Association Study; Humans; Polymorphism, Single Nucleotide; Quantitative Trait Loci
11.
Nat Commun ; 12(1): 1307, 2021 02 26.
Article in English | MEDLINE | ID: mdl-33637709

ABSTRACT

Interactions between transcription factors and chromatin are fundamental to genome organization and regulation and, ultimately, cell state. Here, we use information theory to measure signatures of organized chromatin resulting from transcription factor-chromatin interactions encoded in the patterns of the accessible genome, which we term chromatin information enrichment (CIE). We calculate CIE for hundreds of transcription factor motifs across human samples and identify two classes: low and high CIE. The 10-20% of common and tissue-specific high CIE transcription factor motifs associate with higher protein-DNA residence time, including different binding site subclasses of the same transcription factor, increased nucleosome phasing, specific protein domains, and the genetic control of both chromatin accessibility and gene expression. These results show that variations in the information encoded in chromatin architecture reflect functional biological variation, with implications for cell state dynamics and memory.


Subjects
Chromatin/metabolism; DNA/metabolism; Transcription Factors/metabolism; Transcription, Genetic/physiology; Binding Sites; Cell Line; DNA-Binding Proteins; Gene Expression Regulation; Hep G2 Cells; Humans; Nucleosomes
12.
Nat Commun ; 11(1): 4912, 2020 09 30.
Article in English | MEDLINE | ID: mdl-32999275

ABSTRACT

Most signals detected by genome-wide association studies map to non-coding sequence and their tissue-specific effects influence transcriptional regulation. However, key tissues and cell-types required for functional inference are absent from large-scale resources. Here we explore the relationship between genetic variants influencing predisposition to type 2 diabetes (T2D) and related glycemic traits, and human pancreatic islet transcription using data from 420 donors. We find: (a) 7741 cis-eQTLs in islets with a replication rate across 44 GTEx tissues between 40% and 73%; (b) marked overlap between islet cis-eQTL signals and active regulatory sequences in islets, with reduced eQTL effect size observed in the stretch enhancers most strongly implicated in GWAS signal location; (c) enrichment of islet cis-eQTL signals with T2D risk variants identified in genome-wide association studies; and (d) colocalization between 47 islet cis-eQTLs and variants influencing T2D or glycemic traits, including DGKB and TCF7L2. Our findings illustrate the advantages of performing functional and regulatory studies in disease relevant tissues.


Subjects
Blood Glucose/genetics; Diabetes Mellitus, Type 2/genetics; Genetic Predisposition to Disease; Islets of Langerhans/metabolism; Quantitative Trait Loci; Adolescent; Adult; Aged; Aged, 80 and over; Animals; Blood Glucose/metabolism; Cell Line, Tumor; Cohort Studies; Diabetes Mellitus, Type 2/blood; Diacylglycerol Kinase/genetics; Diacylglycerol Kinase/metabolism; Enhancer Elements, Genetic; Female; Gene Expression Regulation; Genome-Wide Association Study; Humans; Male; Mice; Middle Aged; Polymorphism, Single Nucleotide; RNA-Seq; Sequence Analysis, DNA; Transcription Factor 7-Like 2 Protein/genetics; Transcription Factor 7-Like 2 Protein/metabolism; Young Adult
13.
Nat Commun ; 11(1): 2379, 2020 05 13.
Article in English | MEDLINE | ID: mdl-32404872

ABSTRACT

Brown and beige fat share a remarkably similar transcriptional program that supports fuel oxidation and thermogenesis. The chromatin-remodeling machinery that governs genome accessibility and renders adipocytes poised for thermogenic activation remains elusive. Here we show that BAF60a, a subunit of the SWI/SNF chromatin-remodeling complexes, serves an indispensable role in cold-induced thermogenesis in brown fat. BAF60a maintains chromatin accessibility at PPARγ and EBF2 binding sites for key thermogenic genes. Surprisingly, fat-specific BAF60a inactivation triggers more pronounced cold-induced browning of inguinal white adipose tissue that is linked to induction of MC2R, a receptor for the pituitary hormone ACTH. Elevated MC2R expression sensitizes adipocytes and BAF60a-deficient adipose tissue to thermogenic activation in response to ACTH stimulation. These observations reveal an unexpected dichotomous role of BAF60a-mediated chromatin remodeling in transcriptional control of brown and beige gene programs and illustrate a pituitary-adipose signaling axis in the control of thermogenesis.


Subjects
Adipose Tissue, Brown/metabolism; Adipose Tissue, White/metabolism; Chromatin/metabolism; Chromosomal Proteins, Non-Histone/deficiency; Cold Temperature; Adipocytes, Brown/drug effects; Adipocytes, Brown/metabolism; Adipocytes, Brown/ultrastructure; Adipose Tissue, Beige/metabolism; Adipose Tissue, Brown/drug effects; Adipose Tissue, White/drug effects; Adrenocorticotropic Hormone/pharmacology; Animals; Basic Helix-Loop-Helix Transcription Factors/metabolism; Binding Sites/genetics; Cells, Cultured; Chromatin/genetics; Chromosomal Proteins, Non-Histone/genetics; Gene Expression/drug effects; Membrane Proteins/genetics; Membrane Proteins/metabolism; Mice, Inbred C57BL; Mice, Knockout; Mice, Transgenic; Peroxisome Proliferator-Activated Receptor Gamma Coactivator 1-alpha/metabolism; Thermogenesis/drug effects; Thermogenesis/genetics
14.
Cell Syst ; 10(3): 298-306.e4, 2020 03 25.
Article in English | MEDLINE | ID: mdl-32213349

ABSTRACT

The assay for transposase-accessible chromatin using sequencing (ATAC-seq) has become the preferred method for mapping chromatin accessibility due to its time and input material efficiency. However, it can be difficult to evaluate data quality and identify sources of technical bias across samples. Here, we present ataqv, a computational toolkit for efficiently measuring, visualizing, and comparing quality control (QC) results across samples and experiments. We use ataqv to analyze 2,009 public ATAC-seq datasets; their QC metrics display a 10-fold range. Tn5 dosage experiments and statistical modeling show that technical variation in the ratio of Tn5 transposase to nuclei and sequencing flowcell density induces systematic bias in ATAC-seq data by changing the enrichment of reads across functional genomic annotations including promoters, enhancers, and transcription-factor-bound regions, with the notable exception of CTCF. ataqv can be integrated into existing computational pipelines and is freely available at https://github.com/ParkerLab/ataqv/.


Subjects
Chromatin Immunoprecipitation Sequencing/methods; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Animals; Bias; Chromatin/genetics; Computational Biology/methods; Humans; Promoter Regions, Genetic/genetics; Quality Control; Regulatory Sequences, Nucleic Acid/genetics; Software; Transcription Factors/genetics; Transposases/genetics; Transposases/metabolism
15.
Cell Rep ; 26(3): 788-801.e6, 2019 01 15.
Article in English | MEDLINE | ID: mdl-30650367

ABSTRACT

EndoC-βH1 is emerging as a critical human β cell model to study the genetic and environmental etiologies of β cell (dys)function and diabetes. Comprehensive knowledge of its molecular landscape is lacking, yet required, for effective use of this model. Here, we report chromosomal (spectral karyotyping), genetic (genotyping), epigenomic (ChIP-seq and ATAC-seq), chromatin interaction (Hi-C and Pol2 ChIA-PET), and transcriptomic (RNA-seq and miRNA-seq) maps of EndoC-βH1. Analyses of these maps define known (e.g., PDX1 and ISL1) and putative (e.g., PCSK1 and mir-375) β cell-specific transcriptional cis-regulatory networks and identify allelic effects on cis-regulatory element use. Importantly, comparison with maps generated in primary human islets and/or β cells indicates preservation of chromatin looping but also highlights chromosomal aberrations and fetal genomic signatures in EndoC-βH1. Together, these maps, and a web application we created for their exploration, provide important tools for the design of experiments to probe and manipulate the genetic programs governing β cell identity and (dys)function in diabetes.


Subjects
Gene Regulatory Networks/genetics; Insulin-Secreting Cells/metabolism; Cell Line; Humans
16.
Genetics ; 211(2): 549-562, 2019 02.
Article in English | MEDLINE | ID: mdl-30593493

ABSTRACT

Epigenomic signatures from histone marks and transcription factor (TF)-binding sites have been used to annotate putative gene regulatory regions. However, a direct comparison of these diverse annotations is missing, and it is unclear how genetic variation within these annotations affects gene expression. Here, we compare five widely used annotations of active regulatory elements that represent high densities of one or more relevant epigenomic marks, namely "super" and "typical" (nonsuper) enhancers, stretch enhancers, high-occupancy target (HOT) regions, and broad domains, across the four matched human cell types for which they are available. We observe that stretch and super enhancers cover cell type-specific enhancer "chromatin states," whereas HOT regions and broad domains comprise more ubiquitous promoter states. Expression quantitative trait loci (eQTL) in stretch enhancers have significantly smaller effect sizes compared to those in HOT regions. Strikingly, chromatin accessibility QTL in stretch enhancers have significantly larger effect sizes compared to those in HOT regions. These observations suggest that stretch enhancers could harbor genetically primed chromatin to enable changes in TF binding, possibly to drive cell type-specific responses to environmental stimuli. Our results suggest that current eQTL studies are relatively underpowered or could lack the appropriate environmental context to detect genetic effects in the most cell type-specific "regulatory annotations," which likely contributes to infrequent colocalization of eQTL with genome-wide association study signals.


Subjects
Enhancer Elements, Genetic; Quantitative Trait Loci; Transcriptional Activation; Chromatin/genetics; Chromatin/metabolism; Embryonic Stem Cells/metabolism; Hep G2 Cells; Humans; Organ Specificity; Transcription Factors/metabolism
17.
Hum Mol Genet ; 28(5): 736-750, 2019 03 01.
Article in English | MEDLINE | ID: mdl-30380057

ABSTRACT

Danforth's short tail (Sd) mice provide an excellent model for investigating the underlying etiology of human caudal birth defects, which affect 1 in 10 000 live births. Sd animals exhibit aberrant axial skeleton, urogenital and gastrointestinal development similar to human caudal malformation syndromes including urorectal septum malformation, caudal regression, vertebral-anal-cardiac-tracheo-esophageal fistula-renal-limb (VACTERL) association and persistent cloaca. Previous studies have shown that the Sd mutation results from an endogenous retroviral (ERV) insertion upstream of the Ptf1a gene resulting in its ectopic expression at E9.5. Though the genetic lesion has been determined, the resulting epigenomic and transcriptomic changes driving the phenotype have not been investigated. Here, we performed ATAC-seq experiments on isolated E9.5 tailbud tissue, which revealed minimal changes in chromatin accessibility in Sd/Sd mutant embryos. Interestingly, chromatin changes were localized to a small interval adjacent to the Sd ERV insertion overlapping a known Ptf1a enhancer region, which is conserved in mice and humans. Furthermore, mRNA-seq experiments revealed increased transcription of Ptf1a target genes and, importantly, downregulation of hedgehog pathway genes. Reduced sonic hedgehog (SHH) signaling was confirmed by in situ hybridization and immunofluorescence suggesting that the Sd phenotype results, in part, from downregulated SHH signaling. Taken together, these data demonstrate substantial transcriptome changes in the Sd mouse, and indicate that the effect of the ERV insertion on Ptf1a expression may be mediated by increased chromatin accessibility at a conserved Ptf1a enhancer. We propose that human caudal dysgenesis disorders may result from dysregulation of hedgehog signaling pathways.


Subjects
Chromatin Assembly and Disassembly; Chromatin/genetics; Chromatin/metabolism; Epigenome; Hedgehog Proteins/metabolism; Signal Transduction; Transcriptome; Animals; Biomarkers; Computational Biology/methods; Enhancer Elements, Genetic; Fluorescent Antibody Technique; Gene Expression Profiling; Gene Expression Regulation; Gene Ontology; Mice; Mutation; Organogenesis/genetics; Phenotype; Promoter Regions, Genetic
18.
J Med Internet Res ; 20(9): e263, 2018 09 21.
Article in English | MEDLINE | ID: mdl-30249589

ABSTRACT

BACKGROUND: Telemonitoring of symptoms and physiological signs has been suggested as a means of early detection of chronic obstructive pulmonary disease (COPD) exacerbations, with a view to instituting timely treatment. However, algorithms to identify exacerbations result in frequent false-positive results and increased workload. Machine learning, when applied to predictive modelling, can determine patterns of risk factors useful for improving prediction quality. OBJECTIVE: Our objectives were to (1) establish whether machine learning techniques applied to telemonitoring datasets improve prediction of hospital admissions and decisions to start corticosteroids, and (2) determine whether the addition of weather data further improves such predictions. METHODS: We used daily symptoms, physiological measures, and medication data, with baseline demography, COPD severity, quality of life, and hospital admissions from a pilot and large randomized controlled trial of telemonitoring in COPD. We linked weather data from the United Kingdom meteorological service. We used feature selection and extraction techniques for time series to construct up to 153 predictive patterns (features) from symptom, medication, and physiological measurements. We used the resulting variables to construct predictive models fitted to training sets of patients and compared them with common symptom-counting algorithms. RESULTS: We had a mean 363 days of telemonitoring data from 135 patients. The two most practical traditional score-counting algorithms, restricted to cases with complete data, resulted in area under the receiver operating characteristic curve (AUC) estimates of 0.60 (95% CI 0.51-0.69) and 0.58 (95% CI 0.50-0.67) for predicting admissions based on a single day's readings. However, in a real-world scenario allowing for missing data, with greater numbers of patient daily data and hospitalizations (N=57,150, N+=55, respectively), the performance of all the traditional algorithms fell, including those based on 2 days' data. One of the most frequently used algorithms performed no better than chance. All considered machine learning models demonstrated significant improvements; the best machine learning algorithm based on 57,150 episodes resulted in an aggregated AUC of 0.74 (95% CI 0.67-0.80). Adding weather data measurements did not improve the predictive performance of the best model (AUC 0.74, 95% CI 0.69-0.79). To achieve an 80% true-positive rate (sensitivity), the traditional algorithms were associated with an 80% false-positive rate: our algorithm halved this rate to approximately 40% (specificity approximately 60%). The machine learning algorithm was moderately superior to the best symptom-counting algorithm (AUC 0.77, 95% CI 0.74-0.79 vs AUC 0.66, 95% CI 0.63-0.68) at predicting the need for corticosteroids. CONCLUSIONS: Early detection and management of COPD remains an important goal given its huge personal and economic costs. Machine learning approaches, which can be tailored to an individual's baseline profile and can learn from experience of the individual patient, are superior to existing predictive algorithms and show promise in achieving this goal. TRIAL REGISTRATION: International Standard Randomized Controlled Trial Number ISRCTN96634935; http://www.isrctn.com/ISRCTN96634935 (Archived by WebCite at http://www.webcitation.org/722YkuhAz).
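
The performance figures in this abstract (AUC, and the false-positive rate paid to reach 80% sensitivity) can be made concrete with a small sketch. The scores and labels below are invented toy values, not trial data, and the function names are hypothetical; the AUC is computed with the standard rank-based (Mann-Whitney) definition, which may differ from the aggregation the authors used.

```python
def auc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def fpr_at_sensitivity(scores, labels, target_tpr):
    """Lowest false-positive rate among thresholds reaching at least target_tpr."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best = 1.0
    for t in set(scores):
        tpr = sum(p >= t for p in pos) / len(pos)
        fpr = sum(n >= t for n in neg) / len(neg)
        if tpr >= target_tpr - 1e-12:  # small tolerance for float rounding
            best = min(best, fpr)
    return best


# Toy risk scores (higher = more likely admission) and true outcomes.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]

model_auc = auc(scores, labels)
fpr = fpr_at_sensitivity(scores, labels, 0.80)
specificity = 1.0 - fpr
print(model_auc, fpr, specificity)
```

Specificity is simply one minus the false-positive rate, which is how a false-positive rate of about 40% corresponds to the specificity of about 60% quoted above.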


Subjects
Hospitalization/trends , Machine Learning/trends , Chronic Obstructive Pulmonary Disease/therapy , Quality of Life/psychology , Algorithms , Female , Humans , Male
19.
Pharmacogenomics J ; 18(4): 528-538, 2018 Jul.
Article in English | MEDLINE | ID: mdl-29795407

ABSTRACT

Methotrexate (MTX) monotherapy is a common first treatment for rheumatoid arthritis (RA), but many patients do not respond adequately. To identify genetic predictors of response, we combined data from two consortia to carry out a genome-wide study of response to MTX in 1424 early RA patients of European ancestry. Clinical endpoints were change from baseline to 6 months after starting treatment in swollen 28-joint count, tender 28-joint count, C-reactive protein, and the overall 3-component disease activity score (DAS28). No single nucleotide polymorphism (SNP) reached genome-wide statistical significance for any outcome measure. The strongest evidence for association was with rs168201 in NRG3 (p = 10⁻⁷ for change in DAS28). Some support was also seen for association with ZMIZ1, previously highlighted in a study of response to MTX in juvenile idiopathic arthritis. Follow-up in two smaller cohorts of 429 and 177 RA patients did not support these findings, although these cohorts were more heterogeneous.


Subjects
Antirheumatic Agents/therapeutic use , Rheumatoid Arthritis/drug therapy , Genome-Wide Association Study , Methotrexate/therapeutic use , Antirheumatic Agents/adverse effects , Rheumatoid Arthritis/genetics , Rheumatoid Arthritis/physiopathology , C-Reactive Protein/genetics , Humans , Methotrexate/adverse effects , Neuregulins/genetics , Severity of Illness Index , Transcription Factors/genetics
20.
Genetics ; 206(1): 91-104, 2017 May.
Article in English | MEDLINE | ID: mdl-28348060

ABSTRACT

We address the task of genotype imputation to a dense reference panel given genotype likelihoods computed from ultralow coverage sequencing as inputs. In this setting, the data have a high level of missingness or uncertainty, and are thus more amenable to a probabilistic representation. Most existing imputation algorithms are not well suited for this situation, as they rely on prephasing for computational efficiency, and, without definite genotype calls, the prephasing task becomes computationally expensive. We describe GeneImp, a program for genotype imputation that does not require prephasing and is computationally tractable for whole-genome imputation. GeneImp does not explicitly model recombination; instead, it capitalizes on the existence of large reference panels, comprising thousands of reference haplotypes, and assumes that the reference haplotypes can adequately represent the target haplotypes over short regions unaltered. We validate GeneImp based on data from ultralow coverage sequencing (0.5×), and compare its performance to the most recent version of BEAGLE that can perform this task. We show that GeneImp achieves imputation quality very close to that of BEAGLE, using one to two orders of magnitude less time, without an increase in memory complexity. Therefore, GeneImp is the first practical choice for whole-genome imputation to a dense reference panel when prephasing cannot be applied, for instance, in datasets produced via ultralow coverage sequencing. A related future application for GeneImp is whole-genome imputation based on the off-target reads from deep whole-exome sequencing.
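The abstract takes genotype likelihoods, not hard genotype calls, as input. As an illustration of that probabilistic representation only, the sketch below turns the three likelihoods at one biallelic site into posterior genotype probabilities and an expected allele dosage. This is not GeneImp's algorithm. The Hardy-Weinberg prior built from a reference-panel allele frequency is an assumption made here for simplicity, and the function names are hypothetical.

```python
def genotype_posterior(likelihoods, alt_freq):
    """Posterior over genotypes carrying 0, 1 or 2 alternate alleles.

    likelihoods: [P(reads | g=0), P(reads | g=1), P(reads | g=2)]
    alt_freq:    alternate-allele frequency, used to build a
                 Hardy-Weinberg prior (an assumption of this sketch).
    """
    q = alt_freq
    prior = [(1 - q) ** 2, 2 * q * (1 - q), q ** 2]
    joint = [lik * p for lik, p in zip(likelihoods, prior)]
    total = sum(joint)
    if total == 0:
        raise ValueError("likelihoods and prior have no common support")
    return [j / total for j in joint]


def expected_dosage(posterior):
    """Expected number of alternate alleles under the posterior."""
    return posterior[1] + 2 * posterior[2]


# With no reads at a site the likelihoods are flat, so the posterior
# falls back to the prior.
flat = genotype_posterior([1.0, 1.0, 1.0], alt_freq=0.5)
print(flat, expected_dosage(flat))
```

At ultralow coverage many sites look like the flat case, which is why a hard genotype call would discard most of the available information.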


Subjects
Computational Biology , Human Genome , Genotype , Software , Algorithms , Gene Frequency , Haplotypes/genetics , High-Throughput Nucleotide Sequencing , Humans , Single Nucleotide Polymorphism/genetics , DNA Sequence Analysis