ABSTRACT
Genetic risk variants that have been identified in genome-wide association studies of complex diseases are primarily non-coding [1]. Translating these risk variants into mechanistic insights requires detailed maps of gene regulation in disease-relevant cell types [2]. Here we combined two approaches: a genome-wide association study of type 1 diabetes (T1D) using 520,580 samples, and the identification of candidate cis-regulatory elements (cCREs) in pancreas and peripheral blood mononuclear cells using single-nucleus assay for transposase-accessible chromatin with sequencing (snATAC-seq) of 131,554 nuclei. Risk variants for T1D were enriched in cCREs that were active in T cells and other cell types, including acinar and ductal cells of the exocrine pancreas. Risk variants at multiple T1D signals overlapped with exocrine-specific cCREs that were linked to genes with exocrine-specific expression. At the CFTR locus, the T1D risk variant rs7795896 mapped to a ductal-specific cCRE that regulated CFTR; the risk allele reduced transcription factor binding, enhancer activity and CFTR expression in ductal cells. These findings support a role for the exocrine pancreas in the pathogenesis of T1D and highlight the power of large-scale genome-wide association studies and single-cell epigenomics for understanding the cellular origins of complex disease.
Subject(s)
Diabetes Mellitus, Type 1/genetics , Epigenomics , Genetic Predisposition to Disease , Single-Cell Analysis , Chromatin/genetics , Cystic Fibrosis Transmembrane Conductance Regulator/genetics , Female , Gene Expression Regulation , Genome-Wide Association Study , Humans , Immunity/genetics , Male , Pancreatic Ducts/metabolism , Pancreatic Ducts/pathology
ABSTRACT
Many sequence variants have been linked to complex human traits and diseases [1], but deciphering their biological functions remains challenging, as most of them reside in noncoding DNA. Here we have systematically assessed the binding of 270 human transcription factors to 95,886 noncoding variants in the human genome using an ultra-high-throughput multiplex protein-DNA binding assay, termed single-nucleotide polymorphism evaluation by systematic evolution of ligands by exponential enrichment (SNP-SELEX). The resulting 828 million measurements of transcription factor-DNA interactions enable estimation of the relative affinity of these transcription factors to each variant in vitro and evaluation of the current methods to predict the effects of noncoding variants on transcription factor binding. We show that the position weight matrices of most transcription factors lack sufficient predictive power, whereas the support vector machine combined with the gapped k-mer representation shows much improved performance when assessed on results from independent SNP-SELEX experiments involving a new set of 61,020 sequence variants. We report highly predictive models for 94 human transcription factors and demonstrate their utility in genome-wide association studies and understanding of the molecular pathways involved in diverse human traits and diseases.
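To make the gapped k-mer representation mentioned in this abstract concrete, the following minimal Python sketch counts gapped k-mers in a short sequence and reports which features differ between two alleles. It covers only the featurization step, not the support vector machine or the authors' pipeline; the function names and the window parameters `l` and `k` are hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import combinations


def gapped_kmer_counts(seq: str, l: int = 4, k: int = 3) -> Counter:
    """Count gapped k-mers in seq.

    Each window of length l contributes one feature per choice of k
    informative positions; the remaining l - k positions are masked as '-'.
    """
    counts = Counter()
    for start in range(len(seq) - l + 1):
        window = seq[start:start + l]
        for keep in combinations(range(l), k):
            pattern = "".join(
                window[i] if i in keep else "-" for i in range(l)
            )
            counts[pattern] += 1
    return counts


def allele_feature_delta(ref_seq: str, alt_seq: str, l: int = 4, k: int = 3) -> dict:
    """Return the features whose counts differ between two alleles (alt minus ref)."""
    ref = gapped_kmer_counts(ref_seq, l, k)
    alt = gapped_kmer_counts(alt_seq, l, k)
    delta = {p: alt[p] - ref[p] for p in set(ref) | set(alt)}
    return {p: d for p, d in delta.items() if d != 0}


if __name__ == "__main__":
    # Toy sequences differing at a single position (illustrative only).
    print(allele_feature_delta("ACGT", "ACTT"))
```

Features that mask the variant position are shared by both alleles and cancel out, so only patterns that read the changed base appear in the difference. A classifier trained on such count vectors would see the variant effect through exactly those features.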
Subject(s)
Polymorphism, Single Nucleotide/genetics , SELEX Aptamer Technique , Support Vector Machine , Transcription Factors/metabolism , Binding Sites/genetics , Disease/genetics , Genome, Human/genetics , Humans , Ligands , Protein Binding
ABSTRACT
Gene regulation is highly cell type-specific, and understanding the function of non-coding genetic variants associated with complex traits requires molecular phenotyping at cell type resolution. In this study we performed single nucleus ATAC-seq (snATAC-seq) and genotyping in peripheral blood mononuclear cells from 13 individuals. Clustering chromatin accessibility profiles of 96,002 total nuclei identified 17 immune cell types and sub-types. We mapped chromatin accessibility QTLs (caQTLs) in each immune cell type and sub-type using individuals of European ancestry, which identified 6,901 caQTLs at FDR < 0.10 and 4,220 caQTLs at FDR < 0.05, including caQTLs that are obscured in assays of bulk tissue, such as those with divergent effects on different cell types. For 3,941 caQTLs we further annotated putative target genes of variant activity using single cell co-accessibility, and caQTL variants were significantly correlated with the accessibility level of linked gene promoters. We fine-mapped loci associated with 16 complex immune traits and identified immune cell caQTLs at 622 candidate causal variants, including those with cell type-specific effects. At the 6q15 locus associated with type 1 diabetes, in line with previous reports, variant rs72928038 was a naïve CD4+ T cell caQTL linked to BACH2, and we validated the allelic effects of this variant on regulatory activity in Jurkat T cells. These results highlight the utility of snATAC-seq for mapping genetic effects on accessible chromatin in specific cell types.
Subject(s)
Chromatin Immunoprecipitation Sequencing , Chromatin , Humans , Chromatin/genetics , Multifactorial Inheritance , Leukocytes, Mononuclear , Quantitative Trait Loci/genetics
ABSTRACT
Glucocorticoids are key regulators of glucose homeostasis and pancreatic islet function, but the gene regulatory programs driving responses to glucocorticoid signaling in islets and the contribution of these programs to diabetes risk are unknown. In this study we used ATAC-seq and RNA-seq to map chromatin accessibility and gene expression from eleven primary human islet samples cultured in vitro with the glucocorticoid dexamethasone at multiple doses and durations. We identified thousands of accessible chromatin sites and genes with significant changes in activity in response to glucocorticoids. Chromatin sites up-regulated in glucocorticoid signaling were prominently enriched for glucocorticoid receptor binding sites and up-regulated genes were enriched for ion transport and lipid metabolism, whereas down-regulated chromatin sites and genes were enriched for inflammatory, stress response and proliferative processes. Genetic variants associated with glucose levels and T2D risk were enriched in glucocorticoid-responsive chromatin sites, including fine-mapped variants at 51 known signals. Among fine-mapped variants in glucocorticoid-responsive chromatin, a likely causal variant at the 2p21 locus had glucocorticoid-dependent allelic effects on beta cell enhancer activity and affected SIX2 and SIX3 expression. Our results provide a comprehensive map of islet regulatory programs in response to glucocorticoids, through which we uncover a role for islet glucocorticoid signaling in mediating genetic risk of T2D.
Subject(s)
Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/metabolism , Gene Regulatory Networks , Genetic Predisposition to Disease , Glucocorticoids/metabolism , Islets of Langerhans/metabolism , Signal Transduction , Animals , Blood Glucose/metabolism , Cell Line , Chromatin/genetics , Chromatin/metabolism , Diabetes Mellitus, Type 2/blood , Humans , Mice
ABSTRACT
BACKGROUND: Adult granulosa cell tumor (aGCT) is a rare type of stromal cell malignant cancer of the ovary characterized by elevated estrogen levels. aGCTs ubiquitously harbor a somatic mutation in the FOXL2 gene, Cys134Trp (c.402C>G); however, the general molecular effect of this mutation and its putative pathogenic role in aGCT tumorigenesis are not completely understood. We previously studied the role of FOXL2C134W, its partner SMAD3 and its antagonist FOXO1 in cellular models of aGCT. METHODS: In this work, seeking more comprehensive profiling of FOXL2C134W transcriptomic effects, we performed an RNA-seq analysis comparing the effect of FOXL2WT/SMAD3 and FOXL2C134W/SMAD3 overexpression in an established human GC line (HGrC1), which is not luteinized and bears normal alleles of FOXL2. RESULTS: Our data show that FOXL2C134W/SMAD3 overexpression alters the expression of 717 genes. These genes include known and novel FOXL2 targets (TGFB2, SMARCA4, HSPG2, MKI67, NFKBIA) and are enriched for neoplastic pathways (Proteoglycans in Cancer, Chromatin remodeling, Apoptosis, Tissue Morphogenesis, Tyrosine Kinase Receptors). We additionally expressed the FOXL2 antagonistic Forkhead protein, FOXO1. Surprisingly, overexpression of FOXO1 mitigated 40% of the altered genome-wide effects specifically related to FOXL2C134W, suggesting it can be a new target for aGCT treatment. CONCLUSIONS: Our transcriptomic data provide novel insights into potential genes (FOXO1 regulated) that could be used as biomarkers of efficacy in aGCT patients.
Subject(s)
Granulosa Cell Tumor , Ovarian Neoplasms , Adult , Cell Line, Tumor , DNA Helicases , Female , Forkhead Box Protein L2 , Forkhead Box Protein O1/genetics , Forkhead Transcription Factors/genetics , Forkhead Transcription Factors/metabolism , Granulosa Cell Tumor/genetics , Humans , Mutation , Nuclear Proteins , Ovarian Neoplasms/genetics , Smad3 Protein/genetics , Transcription Factors , Transcriptome/genetics
ABSTRACT
BACKGROUND: Genomic interaction studies use next-generation sequencing (NGS) to examine the interactions between two loci on the genome, with subsequent bioinformatics analyses typically including annotation, intersection, and merging of data from multiple experiments. While many file types and analysis tools exist for storing and manipulating single locus NGS data, there is currently no file standard or analysis tool suite for manipulating and storing paired-genomic-loci: the data type resulting from "genomic interaction" studies. As genomic interaction sequencing data are becoming prevalent, a standard file format and tools for working with these data conveniently and efficiently are needed. RESULTS: This article details a file standard and novel software tool suite for working with paired-genomic-loci data. We present the paired-genomic-loci (PGL) file standard for genomic-interactions data, and the accompanying analysis tool suite "pgltools": a cross-platform, PyPy-compatible Python package available both as an easy-to-use UNIX package and as a Python module, for integration into pipelines of paired-genomic-loci analyses. CONCLUSIONS: Pgltools is a freely available, open source tool suite for manipulating paired-genomic-loci data. Source code, an in-depth manual, and a tutorial are available publicly at www.github.com/billgreenwald/pgltools, and a Python module of the operations can be installed from PyPI via the PyGLtools module.
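As a rough illustration of what intersecting paired-genomic-loci data involves, the sketch below represents each interaction as two anchors and reports pairs of interactions whose anchors overlap on both sides. This is not the pgltools API and does not reproduce the PGL column layout, which the abstract does not describe; the types `Locus` and `PairedLoci`, the function names, and the half-open coordinate convention are all assumptions made for this example.

```python
from typing import List, NamedTuple, Tuple


class Locus(NamedTuple):
    chrom: str
    start: int  # assumed 0-based, inclusive
    end: int    # assumed exclusive


class PairedLoci(NamedTuple):
    a: Locus
    b: Locus


def loci_overlap(x: Locus, y: Locus) -> bool:
    """Two loci overlap when they share a chromosome and at least one base."""
    return x.chrom == y.chrom and x.start < y.end and y.start < x.end


def pairs_overlap(p: PairedLoci, q: PairedLoci) -> bool:
    """True when both anchors of p overlap both anchors of q, in either order."""
    direct = loci_overlap(p.a, q.a) and loci_overlap(p.b, q.b)
    swapped = loci_overlap(p.a, q.b) and loci_overlap(p.b, q.a)
    return direct or swapped


def intersect(
    set1: List[PairedLoci], set2: List[PairedLoci]
) -> List[Tuple[PairedLoci, PairedLoci]]:
    """Naive all-against-all intersection of two lists of paired loci."""
    return [(p, q) for p in set1 for q in set2 if pairs_overlap(p, q)]


if __name__ == "__main__":
    p = PairedLoci(Locus("chr1", 100, 200), Locus("chr1", 5000, 5100))
    q = PairedLoci(Locus("chr1", 5050, 5200), Locus("chr1", 150, 250))
    print(intersect([p], [q]))
```

The key difference from single-locus intersection is that a match requires agreement at both anchors, and the anchors may be listed in either order. The quadratic loop is kept for clarity; a practical tool would sort or index the entries first.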
Subject(s)
Chromatin/metabolism , Genomics/methods , Software , Chromatin/genetics , Chromatin Immunoprecipitation , Genetic Loci , High-Throughput Nucleotide Sequencing
ABSTRACT
We performed whole genome sequencing in 16 unrelated patients with autosomal recessive retinitis pigmentosa (ARRP), a disease characterized by progressive retinal degeneration and caused by mutations in over 50 genes, in search of pathogenic DNA variants. Eight patients were from North America, whereas eight were Japanese, a population for which ARRP seems to have different genetic drivers. Using a specific workflow, we assessed both the coding and noncoding regions of the human genome, including the evaluation of highly polymorphic SNPs, structural and copy number variations, as well as 69 control genomes sequenced by the same procedures. We detected homozygous or compound heterozygous mutations in 7 genes associated with ARRP (USH2A, RDH12, CNGB1, EYS, PDE6B, DFNB31, and CERKL) in eight patients, three Japanese and five Americans. Fourteen of the 16 mutant alleles identified were previously unknown. Among these, there was a 2.3-kb deletion in USH2A and an inverted duplication of ~446 kb in EYS, which would have likely escaped conventional screening techniques or exome sequencing. Moreover, in another Japanese patient, we identified a homozygous frameshift (p.L206fs), absent in more than 2,500 chromosomes from ethnically matched controls, in the ciliary gene NEK2, encoding a serine/threonine-protein kinase. Inactivation of this gene in zebrafish induced retinal photoreceptor defects that were rescued by human NEK2 mRNA. In addition to identifying a previously undescribed ARRP gene, our study highlights the importance of rare structural DNA variations in Mendelian diseases and advocates the need for screening approaches that transcend the analysis of the coding sequences of the human genome.
Subject(s)
Gene Rearrangement/genetics , Genome, Human/genetics , Protein Serine-Threonine Kinases/genetics , Retinitis Pigmentosa/genetics , Animals , Base Sequence , Frameshift Mutation/genetics , Genetics, Medical , Genome-Wide Association Study , Humans , Japan , Molecular Sequence Data , NIMA-Related Kinases , Sequence Analysis, DNA , United States , Zebrafish
ABSTRACT
Pseudomonas knackmussii B13, isolated in 1974, was the first strain found to degrade chlorinated aromatic hydrocarbons. This discovery was the prologue for subsequent characterization of numerous bacterial metabolic pathways and for genetic and biochemical studies, and it spurred ideas for pollutant bioremediation. In this study, we determined the complete genome sequence of B13 using next generation sequencing technologies and optical mapping. Genome annotation indicated that B13 has a variety of metabolic pathways for degrading monoaromatic hydrocarbons including chlorobenzoate, aminophenol, anthranilate and hydroxyquinol, but not polyaromatic compounds. Comparative genome analysis revealed that B13 is closest to Pseudomonas denitrificans and Pseudomonas aeruginosa. The B13 genome contains at least eight genomic islands [prophages and integrative conjugative elements (ICEs)], which were absent in closely related pseudomonads. We confirm that two ICEs are identical copies of the 103 kb self-transmissible element ICEclc that carries the genes for chlorocatechol metabolism. Comparison of ICEclc showed that it is composed of a variable region and a 'core' region, which is highly conserved among proteobacterial genomes, suggesting a widely distributed family of so far uncharacterized ICEs. Resequencing of two spontaneous B13 mutants revealed a number of single nucleotide substitutions, as well as excision of a large 220 kb region and a prophage that drastically change the host metabolic capacity and survivability.
Subject(s)
Genome, Bacterial , Pseudomonas/genetics , Chlorobenzoates/metabolism , Chromosomes, Bacterial , Genomic Islands , Genomics , Hydrocarbons, Aromatic/metabolism , Metabolic Networks and Pathways , Prophages/genetics , Pseudomonas/classification , Pseudomonas/metabolism , Pseudomonas aeruginosa/genetics
ABSTRACT
PURPOSE: Mutations in genes encoding proteins from the tri-snRNP complex of the spliceosome account for more than 12% of cases of autosomal dominant retinitis pigmentosa (adRP). Although the exact mechanism by which splicing factor defects trigger photoreceptor death is not completely clear, their role in retinitis pigmentosa has been demonstrated by several genetic and functional studies. To test for possible novel associations between splicing factors and adRP, we screened four tri-snRNP splicing factor genes (EFTUD2, PRPF4, NHP2L1, and AAR2) as candidate disease genes. METHODS: We screened up to 303 patients with adRP from Europe and North America who did not carry known RP mutations. Exon-PCR and Sanger methods were used to sequence the NHP2L1 and AAR2 genes, while the sequences of EFTUD2 and PRPF4 were obtained by using long-range PCRs spanning coding and non-coding regions followed by next-generation sequencing. RESULTS: We detected novel missense changes in individual patients in the sequence of the genes PRPF4 and EFTUD2, but the role of these changes in relationship to disease could not be verified. In one other patient we identified a novel nucleotide substitution in the 5' untranslated region (UTR) of NHP2L1, which did not segregate with the disease in the family. CONCLUSIONS: The absence of clearly pathogenic mutations in the candidate genes screened in our cohort suggests that EFTUD2, PRPF4, NHP2L1, and AAR2 are either not involved in adRP or are associated with the disease in rare instances, at least as observed in this study in patients of European and North American origin.
Subject(s)
DNA Mutational Analysis/methods , Genes, Dominant , Genetic Testing , RNA Splicing/genetics , Retinitis Pigmentosa/genetics , High-Throughput Nucleotide Sequencing , Humans , Open Reading Frames/genetics , Peptide Elongation Factors/genetics , Ribonucleoprotein, U4-U6 Small Nuclear/genetics , Ribonucleoprotein, U5 Small Nuclear , Ribonucleoproteins, Small Nuclear/genetics
ABSTRACT
Physiological variability in pancreatic cell type gene regulation and its impact on diabetes risk are poorly understood. In this study we mapped gene regulation in pancreatic cell types using single cell multiomic (joint RNA-seq and ATAC-seq) profiling in 28 non-diabetic donors in combination with single cell data from 35 non-diabetic donors in the Human Pancreas Analysis Program. We identified widespread associations with age, sex, BMI, and HbA1c, where gene regulatory responses were highly cell type- and phenotype-specific. In beta cells, donor age associated with hypoxia, apoptosis, unfolded protein response, and external signal-dependent transcriptional regulators, while HbA1c associated with inflammatory responses and gender with chromatin organization. We identified 10.8K loci where genetic variants were QTLs for cis regulatory element (cRE) accessibility, including 20% with lineage- or cell type-specific effects which disrupted distinct transcription factor motifs. Type 2 diabetes and glycemic trait associated variants were enriched in both phenotype- and QTL-associated beta cell cREs, whereas type 1 diabetes showed limited enrichment. Variants at 226 diabetes and glycemic trait loci were QTLs in beta and other cell types, including 40 that were statistically colocalized, and annotating target genes of colocalized QTLs revealed genes with putatively novel roles in disease. Our findings reveal diverse responses of pancreatic cell types to phenotype and genotype in physiology, and identify pathways, networks, and genes through which physiology impacts diabetes risk.
ABSTRACT
Pancreatic islets consist of multiple cell types that produce hormones required for glucose homeostasis, and islet dysfunction is a major factor in type 1 and type 2 diabetes. Numerous studies have assessed transcription across individual cell types using single-cell assays; however, there is no canonical reference of gene expression in islet cell types that is also easily accessible for researchers to query and use in bioinformatics pipelines. Here we present an integrated map of islet cell type-specific gene expression from 192,203 cells from single-cell RNA sequencing of 65 donors without diabetes, donors who were type 1 diabetes autoantibody positive, donors with type 1 diabetes, and donors with type 2 diabetes from the Human Pancreas Analysis Program. We identified 10 distinct cell types, annotated subpopulations of several cell types, and defined cell type-specific marker genes. We tested differential expression within each cell type across disease states and identified 1,701 genes with significant changes in expression, with most changes observed in β-cells from donors with type 1 diabetes. To facilitate user interaction, we provide several single-cell visualization and reference mapping tools, as well as the open-access analytical pipelines used to create this reference. The results will serve as a valuable resource to investigators studying islet biology.
Subject(s)
Diabetes Mellitus, Type 1 , Diabetes Mellitus, Type 2 , Islets of Langerhans , Humans , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/metabolism , Diabetes Mellitus, Type 1/genetics , Diabetes Mellitus, Type 1/metabolism , Islets of Langerhans/metabolism , Pancreas/metabolism , Gene Expression
ABSTRACT
Pancreatic islets are composed of multiple endocrine cell types that produce hormones required for glucose homeostasis, and islet dysfunction is a major factor in the development of type 1 and type 2 diabetes (T1D, T2D). Numerous studies have generated gene expression profiles in individual islet cell types using single cell assays. However, there is no canonical reference of gene expression in islet cell types in both health and disease that researchers can also easily access, query, and use in bioinformatics pipelines. Here we present an integrated reference map of islet cell type-specific gene expression from 192,203 cells derived from single cell RNA-seq assays of 65 non-diabetic, T1D autoantibody positive (Aab+), T1D, and T2D donors from the Human Pancreas Analysis Program. We identified 10 endocrine and non-endocrine cell types as well as sub-populations of several cell types, and defined sets of marker genes for each cell type and sub-population. We tested for differential expression within each cell type in T1D Aab+, T1D, and T2D states, and identified 1,701 genes with significant changes in expression in any cell type. Most changes were observed in beta cells in T1D, and, by comparison, there were almost no genes with changes in T1D Aab+. To facilitate user interaction with this reference, we provide the data using several single cell visualization and reference mapping tools, as well as the open-access analytical pipelines used to create this reference. The results will serve as a valuable resource to investigators studying islet biology and diabetes.
ABSTRACT
Over 10% of type 1 diabetes (T1D) cases do not carry the high-risk HLA-DR3 or DR4 haplotypes, and these cases have distinct clinical features such as later onset and reduced insulin dependence. To identify genetic drivers of T1D in the absence of DR3/DR4, we performed association and fine-mapping analyses in 12,316 non-DR3/DR4 samples. Risk variants at the MHC and other loci genome-wide had heterogeneity in effects on T1D dependent on DR3/DR4, and non-DR3/DR4 T1D had evidence for a greater polygenic burden. T1D-associated variants in non-DR3/DR4 were more enriched for loci, regulatory elements, and pathways for antigen presentation, innate immunity, and beta cells, and depleted in T cells, compared to DR3/DR4. Non-DR3/DR4 T1D cases were poorly classified based on an existing genetic risk score, GRS2, and we created a new GRS which highly discriminated non-DR3/DR4 T1D from both non-diabetes and T2D. In total, we identified heterogeneity in T1D genetic risk and disease mechanisms dependent on high-risk HLA haplotype, which enabled accurate classification of T1D across HLA backgrounds.
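For readers unfamiliar with genetic risk scores, the sketch below shows the generic additive form: a weighted sum of risk-allele dosages. It is not GRS2 and not the new score described in this abstract, neither of which is specified here; the variant identifiers, the weights, and the choice to treat missing genotypes as a dosage of zero are all hypothetical simplifications.

```python
from typing import Mapping


def genetic_risk_score(
    dosages: Mapping[str, float], weights: Mapping[str, float]
) -> float:
    """Additive score: sum over variants of weight * risk-allele dosage.

    Dosages are expected in the range 0 to 2 (copies of the risk allele).
    Variants missing from `dosages` contribute zero, which is a
    simplification made for this example only.
    """
    return sum(w * dosages.get(variant, 0.0) for variant, w in weights.items())


if __name__ == "__main__":
    # Placeholder variant names and effect sizes, not real T1D estimates.
    weights = {"variant_A": 0.5, "variant_B": -0.2}
    person = {"variant_A": 2, "variant_B": 1}
    print(genetic_risk_score(person, weights))
```

In practice the weights would come from association effect sizes such as log odds ratios, and classification performance would be assessed on held-out samples.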
ABSTRACT
We combined functional genomics and human genetics to investigate processes that affect type 1 diabetes (T1D) risk by mediating beta cell survival in response to proinflammatory cytokines. We mapped 38,931 cytokine-responsive candidate cis-regulatory elements (cCREs) in beta cells using ATAC-seq and snATAC-seq and linked them to target genes using co-accessibility and HiChIP. Using a genome-wide CRISPR screen in EndoC-βH1 cells, we identified 867 genes affecting cytokine-induced survival, and genes promoting survival and up-regulated in cytokines were enriched at T1D risk loci. Using SNP-SELEX, we identified 2,229 variants in cytokine-responsive cCREs altering transcription factor (TF) binding, and variants altering binding of TFs regulating stress, inflammation, and apoptosis were enriched for T1D risk. At the 16p13 locus, a fine-mapped T1D variant altering TF binding in a cytokine-induced cCRE interacted with SOCS1, which promoted survival in cytokine exposure. Our findings reveal processes and genes acting in beta cells during inflammation that modulate T1D risk.
ABSTRACT
The gene SNRNP200 is composed of 45 exons and encodes a protein essential for pre-mRNA splicing, the 200 kDa helicase hBrr2. Two mutations in SNRNP200 have recently been associated with autosomal dominant retinitis pigmentosa (adRP), a retinal degenerative disease, in two families from China. In this work we analyzed the entire 35-kb SNRNP200 genomic region in a cohort of 96 unrelated North American patients with adRP. To complete this large-scale sequencing project, we performed ultra-high-throughput sequencing of pooled, untagged PCR products. We then validated the detected DNA changes by Sanger sequencing of individual samples from this cohort and from an additional cohort of 95 patients. One of the two previously known mutations (p.S1087L) was identified in 3 patients, while 4 new missense changes (p.R681C, p.R681H, p.V683L, p.Y689C) affecting highly conserved codons were identified in 6 unrelated individuals, indicating that the prevalence of SNRNP200-associated adRP is relatively high. We also took advantage of this research to evaluate the pool-and-sequence method, especially with respect to the generation of false positive and false negative results. We conclude that, although this strategy can be adopted for rapid discovery of new disease-associated variants, it still requires extensive validation to be used in routine DNA screenings.
Subject(s)
Retinitis Pigmentosa/genetics , Ribonucleoproteins, Small Nuclear/genetics , Amino Acid Sequence , China , Codon , Exons , Genes, Dominant , Humans , Molecular Sequence Data , Mutation, Missense/genetics , Pedigree , Sequence Analysis, DNA/methods
ABSTRACT
While genetic variation at chromatin loops is relevant for human disease, the relationships between contact propensity (the probability that loci at loops physically interact), genetics, and gene regulation are unclear. We quantitatively interrogate these relationships by comparing Hi-C and molecular phenotype data across cell types and haplotypes. While chromatin loops consistently form across different cell types, they have subtle quantitative differences in contact frequency that are associated with larger changes in gene expression and H3K27ac. For the vast majority of loci with quantitative differences in contact frequency across haplotypes, the changes in magnitude are smaller than those across cell types; however, the proportional relationships between contact propensity, gene expression, and H3K27ac are consistent. These findings suggest that subtle changes in contact propensity have a biologically meaningful role in gene regulation and could be a mechanism by which regulatory genetic variants in loop anchors mediate effects on expression.
Subject(s)
Chromatin/genetics , DNA/genetics , Gene Expression Regulation , Histones/genetics , Quantitative Trait Loci/genetics , Adolescent , Adult , Aged , Cell Line , Chromatin/metabolism , DNA/metabolism , Female , Histones/metabolism , Humans , Induced Pluripotent Stem Cells , Male , Middle Aged , Myocytes, Cardiac , Nucleic Acid Conformation , Polymorphism, Single Nucleotide , Whole Genome Sequencing , Young Adult
ABSTRACT
We evaluate whether human induced pluripotent stem cell-derived retinal pigment epithelium (iPSC-RPE) cells can be used to prioritize and functionally characterize causal variants at age-related macular degeneration (AMD) risk loci. We generated iPSC-RPE from six subjects and show that they have morphological and molecular characteristics similar to those of native RPE. We generated RNA-seq, ATAC-seq, and H3K27ac ChIP-seq data and observed high similarity in gene expression and enriched transcription factor motif profiles between iPSC-RPE and human fetal RPE. We performed fine mapping of AMD risk loci by integrating molecular data from the iPSC-RPE, adult retina, and adult RPE, which identified rs943080 as the probable causal variant at VEGFA. We show that rs943080 is associated with altered chromatin accessibility of a distal ATAC-seq peak, decreased overall gene expression of VEGFA, and allele-specific expression of a non-coding transcript. Our study thus provides a potential mechanism underlying the association of the VEGFA locus with AMD.
Subject(s)
Genetic Loci , Induced Pluripotent Stem Cells/metabolism , Macular Degeneration , Retinal Pigment Epithelium/metabolism , Vascular Endothelial Growth Factor A , Female , Humans , Induced Pluripotent Stem Cells/pathology , Macular Degeneration/genetics , Macular Degeneration/metabolism , Macular Degeneration/pathology , Retinal Pigment Epithelium/pathology , Sequence Analysis, RNA , Vascular Endothelial Growth Factor A/biosynthesis , Vascular Endothelial Growth Factor A/genetics
ABSTRACT
The cardiac transcription factor (TF) gene NKX2-5 has been associated with electrocardiographic (EKG) traits through genome-wide association studies (GWASs), but the extent to which differential binding of NKX2-5 at common regulatory variants contributes to these traits has not yet been studied. We analyzed transcriptomic and epigenomic data from induced pluripotent stem cell-derived cardiomyocytes from seven related individuals, and identified ~2,000 single-nucleotide variants associated with allele-specific effects (ASE-SNVs) on NKX2-5 binding. NKX2-5 ASE-SNVs were enriched for altered TF motifs, for heart-specific expression quantitative trait loci and for EKG GWAS signals. Using fine-mapping combined with epigenomic data from induced pluripotent stem cell-derived cardiomyocytes, we prioritized candidate causal variants for EKG traits, many of which were NKX2-5 ASE-SNVs. Experimentally characterizing two NKX2-5 ASE-SNVs (rs3807989 and rs590041) showed that they modulate the expression of target genes via differential protein binding in cardiac cells, indicating that they are functional variants underlying EKG GWAS signals. Our results show that differential NKX2-5 binding at numerous regulatory variants across the genome contributes to EKG phenotypes.
Subject(s)
Atrial Fibrillation/genetics , Atrial Fibrillation/pathology , Homeobox Protein Nkx-2.5/genetics , Homeobox Protein Nkx-2.5/metabolism , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Regulatory Elements, Transcriptional , Adolescent , Adult , Aged , Aged, 80 and over , Alleles , Child , Electrocardiography , Epigenomics , Female , Genetic Predisposition to Disease , Genome, Human , Genome-Wide Association Study , Humans , Induced Pluripotent Stem Cells/metabolism , Induced Pluripotent Stem Cells/pathology , Male , Middle Aged , Myocytes, Cardiac/metabolism , Myocytes, Cardiac/pathology , Phenotype , Protein Binding , Transcriptome , Young Adult
ABSTRACT
To understand the mutational burden of human induced pluripotent stem cells (iPSCs), we sequenced genomes of 18 fibroblast-derived iPSC lines and identified different classes of somatic mutations based on structure, origin, and frequency. Copy-number alterations affected 295 kb in each sample and strongly impacted gene expression. UV-damage mutations were present in ~45% of the iPSCs and accounted for most of the observed heterogeneity in mutation rates across lines. Subclonal mutations (not present in all iPSCs within a line) composed 10% of point mutations and, compared with clonal variants, showed an enrichment in active promoters and increased association with altered gene expression. Our study shows that, by combining WGS, transcriptome, and epigenome data, we can understand the mutational burden of each iPSC line on an individual basis and suggests that this information could be used to prioritize iPSC lines for models of specific human diseases and/or transplantation therapy.
Subject(s)
Induced Pluripotent Stem Cells/cytology , Induced Pluripotent Stem Cells/physiology , Cell Differentiation/physiology , Cells, Cultured , Cellular Reprogramming/genetics , Humans , Mutation , Mutation Rate
ABSTRACT
In this study, we used whole-genome sequencing and gene expression profiling of 215 human induced pluripotent stem cell (iPSC) lines from different donors to identify genetic variants associated with RNA expression for 5,746 genes. We were able to predict causal variants for these expression quantitative trait loci (eQTLs) that disrupt transcription factor binding and validated a subset of them experimentally. We also identified copy-number variant (CNV) eQTLs, including some that appear to affect gene expression by altering the copy number of intergenic regulatory regions. In addition, we were able to identify effects on gene expression of rare genic CNVs and regulatory single-nucleotide variants and found that reactivation of gene expression on the X chromosome depends on gene chromosomal position. Our work highlights the value of iPSCs for genetic association analyses and provides a unique resource for investigating the genetic regulation of gene expression in pluripotent cells.