Results 1 - 20 of 59
1.
Nature ; 624(7992): 621-629, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38049589

ABSTRACT

Type 2 diabetes mellitus (T2D), a major cause of worldwide morbidity and mortality, is characterized by dysfunction of insulin-producing pancreatic islet β cells [1,2]. T2D genome-wide association studies (GWAS) have identified hundreds of signals in non-coding and β cell regulatory genomic regions, but deciphering their biological mechanisms remains challenging [3-5]. Here, to identify early disease-driving events, we performed traditional and multiplexed pancreatic tissue imaging, sorted-islet cell transcriptomics and islet functional analysis of early-stage T2D and control donors. By integrating diverse modalities, we show that early-stage T2D is characterized by β cell-intrinsic defects that can be proportioned into gene regulatory modules with enrichment in signals of genetic risk. After identifying the β cell hub gene and transcription factor RFX6 within one such module, we demonstrated multiple layers of genetic risk that converge on an RFX6-mediated network to reduce insulin secretion by β cells. RFX6 perturbation in primary human islet cells alters β cell chromatin architecture at regions enriched for T2D GWAS signals, and population-scale genetic analyses causally link genetically predicted reduced RFX6 expression with increased T2D risk. Understanding the molecular mechanisms of complex, systemic diseases necessitates integration of signals from multiple molecules, cells, organs and individuals, and thus we anticipate that this approach will be a useful template to identify and validate key regulatory networks and master hub genes for other diseases or traits using GWAS data.


Subject(s)
Diabetes Mellitus, Type 2 , Gene Expression Profiling , Gene Regulatory Networks , Genetic Predisposition to Disease , Islets of Langerhans , Humans , Case-Control Studies , Cell Separation , Chromatin/metabolism , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/metabolism , Diabetes Mellitus, Type 2/pathology , Diabetes Mellitus, Type 2/physiopathology , Gene Regulatory Networks/genetics , Genome-Wide Association Study , Insulin Secretion , Islets of Langerhans/metabolism , Islets of Langerhans/pathology , Reproducibility of Results
2.
Am J Hum Genet ; 108(7): 1169-1189, 2021 07 01.
Article in English | MEDLINE | ID: mdl-34038741

ABSTRACT

Identifying the molecular mechanisms by which genome-wide association study (GWAS) loci influence traits remains challenging. Chromatin accessibility quantitative trait loci (caQTLs) help identify GWAS loci that may alter GWAS traits by modulating chromatin structure, but caQTLs have been identified in a limited set of human tissues. Here we mapped caQTLs in human liver tissue in 20 liver samples and identified 3,123 caQTLs. The caQTL variants are enriched in liver tissue promoter and enhancer states and frequently disrupt binding motifs of transcription factors expressed in liver. We predicted target genes for 861 caQTL peaks using proximity, chromatin interactions, correlation with promoter accessibility or gene expression, and colocalization with expression QTLs. Using GWAS signals for 19 liver function and/or cardiometabolic traits, we identified 110 colocalized caQTLs and GWAS signals, 56 of which contained a predicted caPeak target gene. At the LITAF LDL-cholesterol GWAS locus, we validated that a caQTL variant showed allelic differences in protein binding and transcriptional activity. These caQTLs contribute to the epigenomic characterization of human liver and help identify molecular mechanisms and genes at GWAS loci.


Subject(s)
Chromatin/metabolism , Liver/metabolism , Quantitative Trait Loci , Amino Acid Motifs , Binding Sites , Chromatin Assembly and Disassembly , Enhancer Elements, Genetic , Genetic Variation , Genome-Wide Association Study , Humans , Promoter Regions, Genetic , Protein Binding , Transcription Factors/chemistry , Transcription Factors/metabolism , Transcriptome
3.
Genome Res ; 31(12): 2258-2275, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34815310

ABSTRACT

Skeletal muscle accounts for the largest proportion of human body mass, on average, and is a key tissue in complex diseases and mobility. It is composed of several different cell and muscle fiber types. Here, we optimize single-nucleus ATAC-seq (snATAC-seq) to map skeletal muscle cell-specific chromatin accessibility landscapes in frozen human and rat samples, and single-nucleus RNA-seq (snRNA-seq) to map cell-specific transcriptomes in human. We additionally perform multi-omics profiling (gene expression and chromatin accessibility) on human and rat muscle samples. We capture type I and type II muscle fiber signatures, which are generally missed by existing single-cell RNA-seq methods. We perform cross-modality and cross-species integrative analyses on 33,862 nuclei and identify seven cell types ranging in abundance from 59.6% to 1.0% of all nuclei. We introduce a regression-based approach to infer cell types by comparing transcription start site-distal ATAC-seq peaks to reference enhancer maps and show consistency with RNA-based marker gene cell type assignments. We find heterogeneity in enrichment of genetic variants linked to complex phenotypes from the UK Biobank and diabetes genome-wide association studies in cell-specific ATAC-seq peaks, with the most striking enrichment patterns in muscle mesenchymal stem cells (∼3.5% of nuclei). Finally, we overlay these chromatin accessibility maps on GWAS data to nominate causal cell types, SNPs, transcription factor motifs, and target genes for type 2 diabetes signals. These chromatin accessibility profiles for human and rat skeletal muscle cell types are a useful resource for nominating causal GWAS SNPs and cell types.

4.
Nature ; 536(7614): 41-47, 2016 08 04.
Article in English | MEDLINE | ID: mdl-27398621

ABSTRACT

The genetic architecture of common traits, including the number, frequency, and effect sizes of inherited variants that contribute to individual risk, has been long debated. Genome-wide association studies have identified scores of common variants associated with type 2 diabetes, but in aggregate, these explain only a fraction of the heritability of this disease. Here, to test the hypothesis that lower-frequency variants explain much of the remainder, the GoT2D and T2D-GENES consortia performed whole-genome sequencing in 2,657 European individuals with and without diabetes, and exome sequencing in 12,940 individuals from five ancestry groups. To increase statistical power, we expanded the sample size via genotyping and imputation in a further 111,548 subjects. Variants associated with type 2 diabetes after sequencing were overwhelmingly common and most fell within regions previously identified by genome-wide association studies. Comprehensive enumeration of sequence variation is necessary to identify functional alleles that provide important clues to disease pathophysiology, but large-scale sequencing does not support the idea that lower-frequency variants have a major role in predisposition to type 2 diabetes.


Subject(s)
Diabetes Mellitus, Type 2/genetics , Genetic Predisposition to Disease/genetics , Genetic Variation/genetics , Alleles , DNA Mutational Analysis , Europe/ethnology , Exome , Genome-Wide Association Study , Genotyping Techniques , Humans , Sample Size
5.
Proc Natl Acad Sci U S A ; 116(22): 10883-10888, 2019 05 28.
Article in English | MEDLINE | ID: mdl-31076557

ABSTRACT

We integrate comeasured gene expression and DNA methylation (DNAme) in 265 human skeletal muscle biopsies from the FUSION study with >7 million genetic variants and eight physiological traits: height, waist, weight, waist-hip ratio, body mass index, fasting serum insulin, fasting plasma glucose, and type 2 diabetes. We find hundreds of genes and DNAme sites associated with fasting insulin, waist, and body mass index, as well as thousands of DNAme sites associated with gene expression (eQTM). We find that controlling for heterogeneity in tissue/muscle fiber type reduces the number of physiological trait associations, and that long-range eQTMs (>1 Mb) are reduced when controlling for tissue/muscle fiber type or latent factors. We map genetic regulators (quantitative trait loci; QTLs) of expression (eQTLs) and DNAme (mQTLs). Using Mendelian randomization (MR) and mediation techniques, we leverage these genetic maps to predict 213 causal relationships between expression and DNAme, approximately two-thirds of which predict methylation to causally influence expression. We use MR to integrate FUSION mQTLs, FUSION eQTLs, and GTEx eQTLs for 48 tissues with genetic associations for 534 diseases and quantitative traits. We identify hundreds of genes and thousands of DNAme sites that may drive the reported disease/quantitative trait genetic associations. We identify 300 gene expression MR associations that are present in both FUSION and GTEx skeletal muscle and that show stronger evidence of MR association in skeletal muscle than other tissues, which may partially reflect differences in power across tissues. As one example, we find that increased RXRA muscle expression may decrease lean tissue mass.


Subject(s)
DNA Methylation/genetics , Gene Expression/genetics , Muscle, Skeletal , Blood Glucose/analysis , Body Weights and Measures , Diabetes Mellitus, Type 2 , Genome-Wide Association Study/methods , Genomics/methods , Humans , Insulin/analysis , Muscle, Skeletal/chemistry , Muscle, Skeletal/physiology , Quantitative Trait Loci/genetics
6.
BMC Biol ; 19(1): 76, 2021 04 15.
Article in English | MEDLINE | ID: mdl-33858413

ABSTRACT

BACKGROUND: The pituitary gland is a neuroendocrine organ containing diverse cell types specialized in secreting hormones that regulate physiology. Pituitary thyrotropes produce thyroid-stimulating hormone (TSH), a critical factor for growth and maintenance of metabolism. The transcription factors POU1F1 and GATA2 have been implicated in thyrotrope fate, but the transcriptomic and epigenomic landscapes of these neuroendocrine cells have not been characterized. The goal of this work was to discover transcriptional regulatory elements that drive thyrotrope fate. RESULTS: We identified the transcription factors and epigenomic changes in chromatin that are associated with differentiation of POU1F1-expressing progenitors into thyrotropes using cell lines that represent an undifferentiated Pou1f1 lineage progenitor (GHF-T1) and a committed thyrotrope line that produces TSH (TαT1). We compared RNA-seq, ATAC-seq, histone modification (H3K27Ac, H3K4Me1, and H3K27Me3), and POU1F1 binding in these cell lines. POU1F1 binding sites are commonly associated with bZIP transcription factor consensus binding sites in GHF-T1 cells and Helix-Turn-Helix (HTH) or basic Helix-Loop-Helix (bHLH) factors in TαT1 cells, suggesting that these classes of transcription factors may recruit or cooperate with POU1F1 binding at unique sites. We validated enhancer function of novel elements we mapped near Cga, Pitx1, Gata2, and Tshb by transfection in TαT1 cells. Finally, we confirmed that an enhancer element near Tshb can drive expression in thyrotropes of transgenic mice, and we demonstrate that GATA2 enhances Tshb expression through this element. CONCLUSION: These results extend the ENCODE multi-omic profiling approach to the pituitary gland, which should be valuable for understanding pituitary development and disease pathogenesis.


Subject(s)
Pituitary Gland , Animals , Mice , Pituitary Gland/metabolism , Regulatory Sequences, Nucleic Acid , Thyrotropin/genetics , Transcription Factors/genetics , Transcription Factors/metabolism , Transfection
7.
Hum Mol Genet ; 28(5): 736-750, 2019 03 01.
Article in English | MEDLINE | ID: mdl-30380057

ABSTRACT

Danforth's short tail (Sd) mice provide an excellent model for investigating the underlying etiology of human caudal birth defects, which affect 1 in 10 000 live births. Sd animals exhibit aberrant axial skeleton, urogenital and gastrointestinal development similar to human caudal malformation syndromes including urorectal septum malformation, caudal regression, vertebral-anal-cardiac-tracheo-esophageal fistula-renal-limb (VACTERL) association and persistent cloaca. Previous studies have shown that the Sd mutation results from an endogenous retroviral (ERV) insertion upstream of the Ptf1a gene resulting in its ectopic expression at E9.5. Though the genetic lesion has been determined, the resulting epigenomic and transcriptomic changes driving the phenotype have not been investigated. Here, we performed ATAC-seq experiments on isolated E9.5 tailbud tissue, which revealed minimal changes in chromatin accessibility in Sd/Sd mutant embryos. Interestingly, chromatin changes were localized to a small interval adjacent to the Sd ERV insertion overlapping a known Ptf1a enhancer region, which is conserved in mice and humans. Furthermore, mRNA-seq experiments revealed increased transcription of Ptf1a target genes and, importantly, downregulation of hedgehog pathway genes. Reduced sonic hedgehog (SHH) signaling was confirmed by in situ hybridization and immunofluorescence suggesting that the Sd phenotype results, in part, from downregulated SHH signaling. Taken together, these data demonstrate substantial transcriptome changes in the Sd mouse, and indicate that the effect of the ERV insertion on Ptf1a expression may be mediated by increased chromatin accessibility at a conserved Ptf1a enhancer. We propose that human caudal dysgenesis disorders may result from dysregulation of hedgehog signaling pathways.


Subject(s)
Chromatin Assembly and Disassembly , Chromatin/genetics , Chromatin/metabolism , Epigenome , Hedgehog Proteins/metabolism , Signal Transduction , Transcriptome , Animals , Biomarkers , Computational Biology/methods , Enhancer Elements, Genetic , Fluorescent Antibody Technique , Gene Expression Profiling , Gene Expression Regulation , Gene Ontology , Mice , Mutation , Organogenesis/genetics , Phenotype , Promoter Regions, Genetic
8.
Am J Hum Genet ; 102(4): 620-635, 2018 04 05.
Article in English | MEDLINE | ID: mdl-29625024

ABSTRACT

Genome-wide association studies (GWASs) and functional genomics approaches implicate enhancer disruption in islet dysfunction and type 2 diabetes (T2D) risk. We applied genetic fine-mapping and functional (epi)genomic approaches to a T2D- and proinsulin-associated 15q22.2 locus to identify a most likely causal variant, determine its direction of effect, and elucidate plausible target genes. Fine-mapping and conditional analyses of proinsulin levels of 8,635 non-diabetic individuals from the METSIM study support a single association signal represented by a cluster of 16 strongly associated (p < 10⁻¹⁷) variants in high linkage disequilibrium (r² > 0.8) with the GWAS index SNP rs7172432. These variants reside in an evolutionarily and functionally conserved islet and β cell stretch or super enhancer; the most strongly associated variant (rs7163757, p = 3 × 10⁻¹⁹) overlaps a conserved islet open chromatin site. DNA sequence containing the rs7163757 risk allele displayed 2-fold higher enhancer activity than the non-risk allele in reporter assays (p < 0.01) and was differentially bound by β cell nuclear extract proteins. Transcription factor NFAT specifically potentiated risk-allele enhancer activity and altered patterns of nuclear protein binding to the risk allele in vitro, suggesting that it could be a factor mediating risk-allele effects. Finally, the rs7163757 proinsulin-raising and T2D risk allele (C) was associated with increased expression of C2CD4B, and possibly C2CD4A, both of which were induced by inflammatory cytokines, in human islets. Together, these data suggest that rs7163757 contributes to genetic risk of islet dysfunction and T2D by increasing NFAT-mediated islet enhancer activity and modulating C2CD4B, and possibly C2CD4A, expression in (patho)physiologic states.


Subject(s)
Calcium-Binding Proteins/genetics , Conserved Sequence , Enhancer Elements, Genetic/genetics , Evolution, Molecular , Islets of Langerhans/pathology , Mutation/genetics , Nuclear Proteins/genetics , Transcription Factors/genetics , Aged , Alleles , Animals , Base Sequence , Calcium-Binding Proteins/metabolism , Cell Line , Chromatin/metabolism , Chromosomes, Human, Pair 15/genetics , Cytokines/metabolism , DNA, Intergenic/genetics , Humans , Inflammation Mediators/metabolism , Mice , Middle Aged , NFATC Transcription Factors/metabolism , Physical Chromosome Mapping , Polymorphism, Single Nucleotide/genetics , Proinsulin/metabolism , Rats , Risk Factors
9.
Nature ; 520(7548): 558-62, 2015 Apr 23.
Article in English | MEDLINE | ID: mdl-25686607

ABSTRACT

Enhancers regulate spatiotemporal gene expression and impart cell-specific transcriptional outputs that drive cell identity. Super-enhancers (SEs), also known as stretch-enhancers, are a subset of enhancers especially important for genes associated with cell identity and genetic risk of disease. CD4(+) T cells are critical for host defence and autoimmunity. Here we analysed maps of mouse T-cell SEs as a non-biased means of identifying key regulatory nodes involved in cell specification. We found that cytokines and cytokine receptors were the dominant class of genes exhibiting SE architecture in T cells. Nonetheless, the locus encoding Bach2, a key negative regulator of effector differentiation, emerged as the most prominent T-cell SE, revealing a network in which SE-associated genes critical for T-cell biology are repressed by BACH2. Disease-associated single-nucleotide polymorphisms for immune-mediated disorders, including rheumatoid arthritis, were highly enriched for T-cell SEs versus typical enhancers or SEs in other cell lineages. Intriguingly, treatment of T cells with the Janus kinase (JAK) inhibitor tofacitinib disproportionately altered the expression of rheumatoid arthritis risk genes with SE structures. Together, these results indicate that genes with SE architecture in T cells encompass a variety of cytokines and cytokine receptors but are controlled by a 'guardian' transcription factor, itself endowed with an SE. Thus, enumeration of SEs allows the unbiased determination of key regulatory nodes in T cells, which are preferentially modulated by pharmacological intervention.


Subject(s)
Arthritis, Rheumatoid/genetics , Enhancer Elements, Genetic/genetics , T-Lymphocytes, Helper-Inducer/metabolism , T-Lymphocytes, Helper-Inducer/pathology , Animals , Arthritis, Rheumatoid/immunology , Arthritis, Rheumatoid/pathology , Basic-Leucine Zipper Transcription Factors/metabolism , Cell Differentiation/genetics , Cell Lineage/genetics , Gene Expression Regulation/genetics , Genetic Predisposition to Disease/genetics , Janus Kinase 3/antagonists & inhibitors , Mice , Mice, Inbred C57BL , Piperidines/pharmacology , Pyrimidines/pharmacology , Pyrroles/pharmacology , RNA, Untranslated/genetics , T-Lymphocytes, Helper-Inducer/immunology , Transcription, Genetic/genetics , p300-CBP Transcription Factors/metabolism
10.
Nature ; 512(7515): 449-52, 2014 Aug 28.
Article in English | MEDLINE | ID: mdl-25164756

ABSTRACT

Genome function is dynamically regulated in part by chromatin, which consists of the histones, non-histone proteins and RNA molecules that package DNA. Studies in Caenorhabditis elegans and Drosophila melanogaster have contributed substantially to our understanding of molecular mechanisms of genome function in humans, and have revealed conservation of chromatin components and mechanisms. Nevertheless, the three organisms have markedly different genome sizes, chromosome architecture and gene organization. On human and fly chromosomes, for example, pericentric heterochromatin flanks single centromeres, whereas worm chromosomes have dispersed heterochromatin-like regions enriched in the distal chromosomal 'arms', and centromeres distributed along their lengths. To systematically investigate chromatin organization and associated gene regulation across species, we generated and analysed a large collection of genome-wide chromatin data sets from cell lines and developmental stages in worm, fly and human. Here we present over 800 new data sets from our ENCODE and modENCODE consortia, bringing the total to over 1,400. Comparison of combinatorial patterns of histone modifications, nuclear lamina-associated domains, organization of large-scale topological domains, chromatin environment at promoters and enhancers, nucleosome positioning, and DNA replication patterns reveals many conserved features of chromatin organization among the three organisms. We also find notable differences in the composition and locations of repressive chromatin. These data sets and analyses provide a rich resource for comparative and species-specific investigations of chromatin composition, organization and function.


Subject(s)
Caenorhabditis elegans/cytology , Caenorhabditis elegans/genetics , Chromatin/genetics , Chromatin/metabolism , Drosophila melanogaster/cytology , Drosophila melanogaster/genetics , Animals , Cell Line , Centromere/genetics , Centromere/metabolism , Chromatin/chemistry , Chromatin Assembly and Disassembly/genetics , DNA Replication/genetics , Enhancer Elements, Genetic/genetics , Epigenesis, Genetic , Heterochromatin/chemistry , Heterochromatin/genetics , Heterochromatin/metabolism , Histones/chemistry , Histones/metabolism , Humans , Molecular Sequence Annotation , Nuclear Lamina/metabolism , Nucleosomes/chemistry , Nucleosomes/genetics , Nucleosomes/metabolism , Promoter Regions, Genetic/genetics , Species Specificity
11.
Proc Natl Acad Sci U S A ; 114(9): 2301-2306, 2017 02 28.
Article in English | MEDLINE | ID: mdl-28193859

ABSTRACT

Genome-wide association studies (GWAS) have identified >100 independent SNPs that modulate the risk of type 2 diabetes (T2D) and related traits. However, the pathogenic mechanisms of most of these SNPs remain elusive. Here, we examined genomic, epigenomic, and transcriptomic profiles in human pancreatic islets to understand the links between genetic variation, chromatin landscape, and gene expression in the context of T2D. We first integrated genome and transcriptome variation across 112 islet samples to produce dense cis-expression quantitative trait loci (cis-eQTL) maps. Additional integration with chromatin-state maps for islets and other diverse tissue types revealed that cis-eQTLs for islet-specific genes are specifically and significantly enriched in islet stretch enhancers. High-resolution chromatin accessibility profiling using assay for transposase-accessible chromatin sequencing (ATAC-seq) in two islet samples enabled us to identify specific transcription factor (TF) footprints embedded in active regulatory elements, which are highly enriched for islet cis-eQTL. Aggregate allelic bias signatures in TF footprints enabled us de novo to reconstruct TF binding affinities genetically, which support the high-quality nature of the TF footprint predictions. Interestingly, we found that T2D GWAS loci were strikingly and specifically enriched in islet Regulatory Factor X (RFX) footprints. Remarkably, within and across independent loci, T2D risk alleles that overlap with RFX footprints uniformly disrupt the RFX motifs at high-information content positions. Together, these results suggest that common regulatory variations have shaped islet TF footprints and the transcriptome and that a confluent RFX regulatory grammar plays a significant role in the genetic component of T2D predisposition.


Subject(s)
Diabetes Mellitus, Type 2/genetics , Genetic Predisposition to Disease , Genome, Human , Islets of Langerhans/metabolism , Quantitative Trait Loci , Transcriptome , Alleles , Base Sequence , Binding Sites , Chromatin/chemistry , Chromatin/metabolism , Diabetes Mellitus, Type 2/metabolism , Diabetes Mellitus, Type 2/pathology , Epigenesis, Genetic , Gene Expression Profiling , Genetic Variation , Genome-Wide Association Study , Genomic Imprinting , Humans , Islets of Langerhans/pathology , Polymorphism, Single Nucleotide , Protein Binding , Protein Isoforms/genetics , Protein Isoforms/metabolism , Regulatory Factor X Transcription Factors/genetics , Regulatory Factor X Transcription Factors/metabolism
12.
Diabetologia ; 62(5): 735-743, 2019 05.
Article in English | MEDLINE | ID: mdl-30756131

ABSTRACT

Variation in non-coding DNA, encompassing gene regulatory regions such as enhancers and promoters, contributes to risk for complex disorders, including type 2 diabetes. While genome-wide association studies have successfully identified hundreds of type 2 diabetes loci throughout the genome, the vast majority of these reside in non-coding DNA, which complicates the process of determining their functional significance and level of priority for further study. Here we review the methods used to experimentally annotate these non-coding variants, to nominate causal variants and to link them to diabetes pathophysiology. In recent years, chromatin profiling, massively parallel sequencing, high-throughput reporter assays and CRISPR gene editing technologies have rapidly become indispensable tools. Rather than treating individual variants in isolation, we discuss the importance of accounting for context, both genetic (such as flanking DNA sequence) and environmental (such as cellular state or environmental exposure). Incorporating these features shows promise in terms of revealing biologically convergent molecular signatures across distant and seemingly unrelated loci. Studying regulatory elements in the proper context will be crucial for interpreting the functional significance of disease-associated variants and applying the resulting knowledge to improve patient care.


Subject(s)
Diabetes Mellitus, Type 2/diagnosis , Diabetes Mellitus, Type 2/genetics , Genome-Wide Association Study , High-Throughput Nucleotide Sequencing/methods , Chromatin/chemistry , Genetic Predisposition to Disease , Genetic Variation , Genome, Human , Genomics , Histones/chemistry , Humans , Promoter Regions, Genetic , Regulatory Sequences, Nucleic Acid
13.
Bioinformatics ; 34(20): 3578-3580, 2018 10 15.
Article in English | MEDLINE | ID: mdl-29790915

ABSTRACT

Motivation: Motif discovery in large biopolymer sequence datasets can be computationally demanding, presenting significant challenges for discovery in omics research. MEME, arguably one of the most popular motif discovery software, takes quadratic time with respect to dataset size, leading to excessively long runtimes for large datasets. Therefore, there is a demand for fast programs that can generate results of the same quality as MEME. Results: Here we describe YAMDA, a highly scalable motif discovery software package. It is built on Pytorch, a tensor computation deep learning library with strong GPU acceleration that is highly optimized for tensor operations that are also useful for motifs. YAMDA takes linear time to find motifs as accurately as MEME, completing in seconds or minutes, which translates to speedups over a thousandfold. Availability and implementation: YAMDA is freely available on Github (https://github.com/daquang/YAMDA). Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
Deep Learning , Software , Algorithms , Computer Graphics , Time Factors
14.
BMC Genomics ; 19(1): 390, 2018 May 23.
Article in English | MEDLINE | ID: mdl-29792182

ABSTRACT

BACKGROUND: Bisulfite sequencing is widely employed to study the role of DNA methylation in disease; however, the data suffer from biases due to coverage depth variability. Imputation of methylation values at low-coverage sites may mitigate these biases while also identifying important genomic features associated with predictive power. RESULTS: Here we describe BoostMe, a method for imputing low-quality DNA methylation estimates within whole-genome bisulfite sequencing (WGBS) data. BoostMe uses a gradient boosting algorithm, XGBoost, and leverages information from multiple samples for prediction. We find that BoostMe outperforms existing algorithms in speed and accuracy when applied to WGBS of human tissues. Furthermore, we show that imputation improves concordance between WGBS and the MethylationEPIC array at low WGBS depth, suggesting improved WGBS accuracy after imputation. CONCLUSIONS: Our findings support the use of BoostMe as a preprocessing step for WGBS analysis.


Subject(s)
Computational Biology/methods , DNA Methylation/drug effects , Sulfites/pharmacology , Whole Genome Sequencing , Algorithms , High-Throughput Nucleotide Sequencing , Humans
15.
Nucleic Acids Res ; 43(Database issue): D103-9, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25326329

ABSTRACT

Many regulatory mechanisms require a high degree of specificity in protein-DNA binding. Nucleotide sequence does not provide an answer to the question of why a protein binds only to a small subset of the many putative binding sites in the genome that share the same core motif. Whereas higher-order effects, such as chromatin accessibility, cooperativity and cofactors, have been described, DNA shape recently gained attention as another feature that fine-tunes the DNA binding specificities of some transcription factor families. Our Genome Browser for DNA shape annotations (GBshape; freely available at http://rohslab.cmb.usc.edu/GBshape/) provides minor groove width, propeller twist, roll, helix twist and hydroxyl radical cleavage predictions for the entire genomes of 94 organisms. Additional genomes can easily be added using the GBshape framework. GBshape can be used to visualize DNA shape annotations qualitatively in a genome browser track format, and to download quantitative values of DNA shape features as a function of genomic position at nucleotide resolution. As biological applications, we illustrate the periodicity of DNA shape features that are present in nucleosome-occupied sequences from human, fly and worm, and we demonstrate structural similarities between transcription start sites in the genomes of four Drosophila species.


Subject(s)
DNA/chemistry , Databases, Nucleic Acid , Genome , Molecular Sequence Annotation , Web Browser , Animals , Binding Sites , Humans , Nucleic Acid Conformation , Nucleosomes/metabolism , Transcription Initiation Site
16.
Proc Natl Acad Sci U S A ; 110(44): 17921-6, 2013 Oct 29.
Article in English | MEDLINE | ID: mdl-24127591

ABSTRACT

Chromatin-based functional genomic analyses and genomewide association studies (GWASs) together implicate enhancers as critical elements influencing gene expression and risk for common diseases. Here, we performed systematic chromatin and transcriptome profiling in human pancreatic islets. Integrated analysis of islet data with those from nine cell types identified specific and significant enrichment of type 2 diabetes and related quantitative trait GWAS variants in islet enhancers. Our integrated chromatin maps reveal that most enhancers are short (median = 0.8 kb). Each cell type also contains a substantial number of more extended (≥ 3 kb) enhancers. Interestingly, these stretch enhancers are often tissue-specific and overlap locus control regions, suggesting that they are important chromatin regulatory beacons. Indeed, we show that (i) tissue specificity of enhancers and nearby gene expression increase with enhancer length; (ii) neighborhoods containing stretch enhancers are enriched for important cell type-specific genes; and (iii) GWAS variants associated with traits relevant to a particular cell type are more enriched in stretch enhancers compared with short enhancers. Reporter constructs containing stretch enhancer sequences exhibited tissue-specific activity in cell culture experiments and in transgenic mice. These results suggest that stretch enhancers are critical chromatin elements for coordinating cell type-specific regulatory programs and that sequence variation in stretch enhancers affects risk of major common human diseases.


Subject(s)
Cell Differentiation/physiology , Chromatin/physiology , Diabetes Mellitus, Type 2/physiopathology , Enhancer Elements, Genetic/genetics , Epigenomics/methods , Gene Expression Regulation/physiology , Insulin-Secreting Cells/metabolism , Animals , Chromatin Immunoprecipitation , Diabetes Mellitus, Type 2/genetics , Enhancer Elements, Genetic/physiology , Gene Expression Profiling , Gene Expression Regulation/genetics , Genome-Wide Association Study , High-Throughput Nucleotide Sequencing , Humans , Insulin-Secreting Cells/physiology , Luciferases , Mice , Mice, Transgenic
17.
Proc Natl Acad Sci U S A ; 110(33): 13481-6, 2013 Aug 13.
Article in English | MEDLINE | ID: mdl-23901115

ABSTRACT

Synonymous mutations, which do not alter the protein sequence, have been shown to affect protein function [Sauna ZE, Kimchi-Sarfaty C (2011) Nat Rev Genet 12(10):683-691]. However, synonymous mutations are rarely investigated in the cancer genomics field. We used whole-genome and -exome sequencing to identify somatic mutations in 29 melanoma samples. Validation of one synonymous somatic mutation in BCL2L12 in 285 samples identified 12 cases that harbored the recurrent F17F mutation. This mutation led to increased BCL2L12 mRNA and protein levels because of differential targeting of WT and mutant BCL2L12 by hsa-miR-671-5p. Protein made from mutant BCL2L12 transcript bound p53, inhibited UV-induced apoptosis more efficiently than WT BCL2L12, and reduced endogenous p53 target gene transcription. This report shows selection of a recurrent somatic synonymous mutation in cancer. Our data indicate that silent alterations have a role to play in human cancer, emphasizing the importance of their investigation in future cancer genome studies.


Subject(s)
Apoptosis/genetics , Gene Expression Regulation/genetics , Genome, Human/genetics , Melanoma/genetics , Muscle Proteins/genetics , Proto-Oncogene Proteins c-bcl-2/genetics , Base Sequence , Blotting, Western , DNA Primers/genetics , Exome/genetics , Genetic Vectors/genetics , HEK293 Cells , Humans , Immunoprecipitation , Lentivirus , MicroRNAs/genetics , Molecular Sequence Data , Muscle Proteins/metabolism , Mutation/genetics , Polymorphism, Single Nucleotide/genetics , Proto-Oncogene Proteins c-bcl-2/metabolism , RNA, Small Interfering/genetics , Real-Time Polymerase Chain Reaction , Sequence Analysis, DNA , Tumor Suppressor Protein p53/metabolism
18.
PLoS Genet ; 8(8): e1002871, 2012.
Article in English | MEDLINE | ID: mdl-22912592

ABSTRACT

Much emphasis has been placed on the identification, functional characterization, and therapeutic potential of somatic variants in tumor genomes. However, the majority of somatic variants lie outside coding regions and their role in cancer progression remains to be determined. In order to establish a system to test the functional importance of non-coding somatic variants in cancer, we created a low-passage cell culture of a metastatic melanoma tumor sample. As a foundation for interpreting functional assays, we performed whole-genome sequencing and analysis of this cell culture, the metastatic tumor from which it was derived, and the patient-matched normal genomes. When comparing somatic mutations identified in the cell culture and tissue genomes, we observe concordance at the majority of single nucleotide variants, whereas copy number changes are more variable. To understand the functional impact of non-coding somatic variation, we leveraged functional data generated by the ENCODE Project Consortium. We analyzed regulatory regions derived from multiple different cell types and found that melanocyte-specific regions are among the most depleted for somatic mutation accumulation. Significant depletion in other cell types suggests the metastatic melanoma cells de-differentiated to a more basal regulatory state. Experimental identification of genome-wide regulatory sites in two different melanoma samples supports this observation. Together, these results show that mutation accumulation in metastatic melanoma is nonrandom across the genome and that a de-differentiated regulatory architecture is common among different samples. Our findings enable identification of the underlying genetic components of melanoma and define the differences between a tissue-derived tumor sample and the cell culture created from it. Such information helps establish a broader mechanistic understanding of the linkage between non-coding genomic variations and the cellular evolution of cancer.


Subject(s)
Cell Dedifferentiation/genetics , DNA, Intergenic , Melanoma/genetics , Neoplasm Metastasis , Polymorphism, Single Nucleotide , Adult , DNA Copy Number Variations , Genome, Human , Genome-Wide Association Study , Humans , Male , Melanocytes/metabolism , Melanocytes/pathology , Primary Cell Culture , Regulatory Sequences, Nucleic Acid , Tumor Cells, Cultured
19.
PLoS Genet ; 8(6): e1002789, 2012 Jun.
Article in English | MEDLINE | ID: mdl-22761590

ABSTRACT

Understanding the molecular basis for phenotypic differences between humans and other primates remains an outstanding challenge. Mutations in non-coding regulatory DNA that alter gene expression have been hypothesized as a key driver of these phenotypic differences. This has been supported by differential gene expression analyses in general, but not by the identification of specific regulatory elements responsible for changes in transcription and phenotype. To identify the genetic source of regulatory differences, we mapped DNaseI hypersensitive (DHS) sites, which mark all types of active gene regulatory elements, genome-wide in the same cell type isolated from human, chimpanzee, and macaque. Most DHS sites were conserved among all three species, as expected based on their central role in regulating transcription. However, we found evidence that several hundred DHS sites were gained or lost on the lineages leading to modern human and chimpanzee. Species-specific DHS site gains are enriched near differentially expressed genes, are positively correlated with increased transcription, show evidence of branch-specific positive selection, and overlap with active chromatin marks. Species-specific sequence differences in transcription factor motifs found within these DHS sites are linked with species-specific changes in chromatin accessibility. Together, these indicate that the regulatory elements identified here are genetic contributors to transcriptional and phenotypic differences among primate species.


Subject(s)
Deoxyribonuclease I/genetics , Evolution, Molecular , Primates/genetics , Regulatory Sequences, Nucleic Acid/genetics , Transcription, Genetic , Animals , Binding Sites/genetics , Cell Line , Chromatin/genetics , Gene Expression Regulation , Genome, Human , Humans , Mutation , Nucleotide Motifs , Phenotype , Selection, Genetic , Species Specificity , Transcription Factors/genetics
20.
Genome Res ; 21(9): 1498-505, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21771779

ABSTRACT

As whole-genome sequencing becomes commoditized and we begin to sequence and analyze personal genomes for clinical and diagnostic purposes, it is necessary to understand what constitutes a complete sequencing experiment for determining genotypes and detecting single-nucleotide variants. Here, we show that the current recommendation of ∼30× coverage is not adequate to produce genotype calls across a large fraction of the genome with acceptably low error rates. Our results are based on analyses of a clinical sample sequenced on two related Illumina platforms, GAII(x) and HiSeq 2000, to a very high depth (126×). We used these data to establish genotype-calling filters that dramatically increase accuracy. We also empirically determined how the callable portion of the genome varies as a function of the amount of sequence data used. These results help provide a "sequencing guide" for future whole-genome sequencing decisions and metrics by which coverage statistics should be reported.
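For orientation, the depth figures quoted in this abstract (∼30× and 126×) are average coverage values, which are conventionally computed as total sequenced bases divided by genome size. The sketch below shows that arithmetic only. The genome size and read length used are illustrative assumptions, not values from the study, and average depth says nothing by itself about which positions are callable, which is the point the abstract makes.

```python
import math


def mean_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Average depth = total sequenced bases / genome size."""
    return num_reads * read_length / genome_size


def reads_needed(target_coverage: float, read_length: int, genome_size: int) -> int:
    """Smallest number of reads whose total bases reach the target average depth."""
    return math.ceil(target_coverage * genome_size / read_length)


if __name__ == "__main__":
    GENOME_SIZE = 3_100_000_000  # assumed approximate human genome size in bp
    READ_LENGTH = 100            # assumed read length in bp
    print(mean_coverage(930_000_000, READ_LENGTH, GENOME_SIZE))
    print(reads_needed(126, READ_LENGTH, GENOME_SIZE))
```

Real coverage is uneven across the genome, so a per-position depth distribution is more informative than this single average.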


Subject(s)
Genome, Human , Sequence Analysis, DNA , Genomics , Genotype , High-Throughput Nucleotide Sequencing , Humans , Polymorphism, Single Nucleotide , Reproducibility of Results