Results 1 - 20 of 25
1.
Nature ; 594(7863): 398-402, 2021 06.
Article in English | MEDLINE | ID: mdl-34012112

ABSTRACT

Genetic risk variants that have been identified in genome-wide association studies of complex diseases are primarily non-coding. Translating these risk variants into mechanistic insights requires detailed maps of gene regulation in disease-relevant cell types. Here we combined two approaches: a genome-wide association study of type 1 diabetes (T1D) using 520,580 samples, and the identification of candidate cis-regulatory elements (cCREs) in pancreas and peripheral blood mononuclear cells using single-nucleus assay for transposase-accessible chromatin with sequencing (snATAC-seq) of 131,554 nuclei. Risk variants for T1D were enriched in cCREs that were active in T cells and other cell types, including acinar and ductal cells of the exocrine pancreas. Risk variants at multiple T1D signals overlapped with exocrine-specific cCREs that were linked to genes with exocrine-specific expression. At the CFTR locus, the T1D risk variant rs7795896 mapped to a ductal-specific cCRE that regulated CFTR; the risk allele reduced transcription factor binding, enhancer activity and CFTR expression in ductal cells. These findings support a role for the exocrine pancreas in the pathogenesis of T1D and highlight the power of large-scale genome-wide association studies and single-cell epigenomics for understanding the cellular origins of complex disease.


Subject(s)
Diabetes Mellitus, Type 1/genetics , Epigenomics , Genetic Predisposition to Disease , Single-Cell Analysis , Chromatin/genetics , Cystic Fibrosis Transmembrane Conductance Regulator/genetics , Female , Gene Expression Regulation , Genome-Wide Association Study , Humans , Immunity/genetics , Male , Pancreatic Ducts/metabolism , Pancreatic Ducts/pathology
2.
Nature ; 591(7848): 147-151, 2021 03.
Article in English | MEDLINE | ID: mdl-33505025

ABSTRACT

Many sequence variants have been linked to complex human traits and diseases, but deciphering their biological functions remains challenging, as most of them reside in noncoding DNA. Here we have systematically assessed the binding of 270 human transcription factors to 95,886 noncoding variants in the human genome using an ultra-high-throughput multiplex protein-DNA binding assay, termed single-nucleotide polymorphism evaluation by systematic evolution of ligands by exponential enrichment (SNP-SELEX). The resulting 828 million measurements of transcription factor-DNA interactions enable estimation of the relative affinity of these transcription factors to each variant in vitro and evaluation of the current methods to predict the effects of noncoding variants on transcription factor binding. We show that the position weight matrices of most transcription factors lack sufficient predictive power, whereas the support vector machine combined with the gapped k-mer representation shows much improved performance, when assessed on results from independent SNP-SELEX experiments involving a new set of 61,020 sequence variants. We report highly predictive models for 94 human transcription factors and demonstrate their utility in genome-wide association studies and understanding of the molecular pathways involved in diverse human traits and diseases.


Subject(s)
Polymorphism, Single Nucleotide/genetics , SELEX Aptamer Technique , Support Vector Machine , Transcription Factors/metabolism , Binding Sites/genetics , Disease/genetics , Genome, Human/genetics , Humans , Ligands , Protein Binding
3.
PLoS Genet ; 19(6): e1010759, 2023 06.
Article in English | MEDLINE | ID: mdl-37289818

ABSTRACT

Gene regulation is highly cell type-specific and understanding the function of non-coding genetic variants associated with complex traits requires molecular phenotyping at cell type resolution. In this study we performed single nucleus ATAC-seq (snATAC-seq) and genotyping in peripheral blood mononuclear cells from 13 individuals. Clustering chromatin accessibility profiles of 96,002 total nuclei identified 17 immune cell types and sub-types. We mapped chromatin accessibility QTLs (caQTLs) in each immune cell type and sub-type using individuals of European ancestry which identified 6,901 caQTLs at FDR < .10 and 4,220 caQTLs at FDR < .05, including those obscured from assays of bulk tissue such as with divergent effects on different cell types. For 3,941 caQTLs we further annotated putative target genes of variant activity using single cell co-accessibility, and caQTL variants were significantly correlated with the accessibility level of linked gene promoters. We fine-mapped loci associated with 16 complex immune traits and identified immune cell caQTLs at 622 candidate causal variants, including those with cell type-specific effects. At the 6q15 locus associated with type 1 diabetes, in line with previous reports, variant rs72928038 was a naïve CD4+ T cell caQTL linked to BACH2 and we validated the allelic effects of this variant on regulatory activity in Jurkat T cells. These results highlight the utility of snATAC-seq for mapping genetic effects on accessible chromatin in specific cell types.


Subject(s)
Chromatin Immunoprecipitation Sequencing , Chromatin , Humans , Chromatin/genetics , Multifactorial Inheritance , Leukocytes, Mononuclear , Quantitative Trait Loci/genetics
4.
PLoS Genet ; 17(5): e1009531, 2021 05.
Article in English | MEDLINE | ID: mdl-33983929

ABSTRACT

Glucocorticoids are key regulators of glucose homeostasis and pancreatic islet function, but the gene regulatory programs driving responses to glucocorticoid signaling in islets and the contribution of these programs to diabetes risk are unknown. In this study we used ATAC-seq and RNA-seq to map chromatin accessibility and gene expression from eleven primary human islet samples cultured in vitro with the glucocorticoid dexamethasone at multiple doses and durations. We identified thousands of accessible chromatin sites and genes with significant changes in activity in response to glucocorticoids. Chromatin sites up-regulated in glucocorticoid signaling were prominently enriched for glucocorticoid receptor binding sites and up-regulated genes were enriched for ion transport and lipid metabolism, whereas down-regulated chromatin sites and genes were enriched for inflammatory, stress response and proliferative processes. Genetic variants associated with glucose levels and T2D risk were enriched in glucocorticoid-responsive chromatin sites, including fine-mapped variants at 51 known signals. Among fine-mapped variants in glucocorticoid-responsive chromatin, a likely causal variant at the 2p21 locus had glucocorticoid-dependent allelic effects on beta cell enhancer activity and affected SIX2 and SIX3 expression. Our results provide a comprehensive map of islet regulatory programs in response to glucocorticoids through which we uncover a role for islet glucocorticoid signaling in mediating genetic risk of T2D.


Subject(s)
Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/metabolism , Gene Regulatory Networks , Genetic Predisposition to Disease , Glucocorticoids/metabolism , Islets of Langerhans/metabolism , Signal Transduction , Animals , Blood Glucose/metabolism , Cell Line , Chromatin/genetics , Chromatin/metabolism , Diabetes Mellitus, Type 2/blood , Humans , Mice
5.
J Transl Med ; 19(1): 90, 2021 02 27.
Article in English | MEDLINE | ID: mdl-33639972

ABSTRACT

BACKGROUND: Adult granulosa cell tumor (aGCT) is a rare type of stromal cell malignant cancer of the ovary characterized by elevated estrogen levels. aGCTs ubiquitously harbor a somatic mutation in the FOXL2 gene, Cys134Trp (c.402C>G); however, the general molecular effect of this mutation and its putative pathogenic role in aGCT tumorigenesis are not completely understood. We previously studied the role of FOXL2C134W, its partner SMAD3 and its antagonist FOXO1 in cellular models of aGCT. METHODS: In this work, seeking more comprehensive profiling of FOXL2C134W transcriptomic effects, we performed an RNA-seq analysis comparing the effect of FOXL2WT/SMAD3 and FOXL2C134W/SMAD3 overexpression in an established human GC line (HGrC1), which is not luteinized, and bears normal alleles of FOXL2. RESULTS: Our data show that FOXL2C134W/SMAD3 overexpression alters the expression of 717 genes. These genes include known and novel FOXL2 targets (TGFB2, SMARCA4, HSPG2, MKI67, NFKBIA) and are enriched for neoplastic pathways (Proteoglycans in Cancer, Chromatin remodeling, Apoptosis, Tissue Morphogenesis, Tyrosine Kinase Receptors). We additionally expressed the FOXL2 antagonistic Forkhead protein, FOXO1. Surprisingly, overexpression of FOXO1 mitigated 40% of the altered genome-wide effects specifically related to FOXL2C134W, suggesting it can be a new target for aGCT treatment. CONCLUSIONS: Our transcriptomic data provide novel insights into potential genes (FOXO1 regulated) that could be used as biomarkers of efficacy in aGCT patients.


Subject(s)
Granulosa Cell Tumor , Ovarian Neoplasms , Adult , Cell Line, Tumor , DNA Helicases , Female , Forkhead Box Protein L2 , Forkhead Box Protein O1/genetics , Forkhead Transcription Factors/genetics , Forkhead Transcription Factors/metabolism , Granulosa Cell Tumor/genetics , Humans , Mutation , Nuclear Proteins , Ovarian Neoplasms/genetics , Smad3 Protein/genetics , Transcription Factors , Transcriptome/genetics
6.
BMC Bioinformatics ; 18(1): 207, 2017 Apr 07.
Article in English | MEDLINE | ID: mdl-28388874

ABSTRACT

BACKGROUND: Genomic interaction studies use next-generation sequencing (NGS) to examine the interactions between two loci on the genome, with subsequent bioinformatics analyses typically including annotation, intersection, and merging of data from multiple experiments. While many file types and analysis tools exist for storing and manipulating single locus NGS data, there is currently no file standard or analysis tool suite for manipulating and storing paired-genomic-loci: the data type resulting from "genomic interaction" studies. As genomic interaction sequencing data are becoming prevalent, a standard file format and tools for working with these data conveniently and efficiently are needed. RESULTS: This article details a file standard and novel software tool suite for working with paired-genomic-loci data. We present the paired-genomic-loci (PGL) file standard for genomic-interactions data, and the accompanying analysis tool suite "pgltools": a cross-platform, PyPy-compatible Python package available both as an easy-to-use UNIX package, and as a Python module, for integration into pipelines of paired-genomic-loci analyses. CONCLUSIONS: Pgltools is a freely available, open source tool suite for manipulating paired-genomic-loci data. Source code, an in-depth manual, and a tutorial are available publicly at www.github.com/billgreenwald/pgltools, and a Python module of the operations can be installed from PyPI via the PyGLtools module.


Subject(s)
Chromatin/metabolism , Genomics/methods , Software , Chromatin/genetics , Chromatin Immunoprecipitation , Genetic Loci , High-Throughput Nucleotide Sequencing
7.
Proc Natl Acad Sci U S A ; 110(40): 16139-44, 2013 Oct 01.
Article in English | MEDLINE | ID: mdl-24043777

ABSTRACT

We performed whole genome sequencing in 16 unrelated patients with autosomal recessive retinitis pigmentosa (ARRP), a disease characterized by progressive retinal degeneration and caused by mutations in over 50 genes, in search of pathogenic DNA variants. Eight patients were from North America, whereas eight were Japanese, a population for which ARRP seems to have different genetic drivers. Using a specific workflow, we assessed both the coding and noncoding regions of the human genome, including the evaluation of highly polymorphic SNPs, structural and copy number variations, as well as 69 control genomes sequenced by the same procedures. We detected homozygous or compound heterozygous mutations in 7 genes associated with ARRP (USH2A, RDH12, CNGB1, EYS, PDE6B, DFNB31, and CERKL) in eight patients, three Japanese and five Americans. Fourteen of the 16 mutant alleles identified were previously unknown. Among these, there was a 2.3-kb deletion in USH2A and an inverted duplication of ~446 kb in EYS, which would have likely escaped conventional screening techniques or exome sequencing. Moreover, in another Japanese patient, we identified a homozygous frameshift (p.L206fs), absent in more than 2,500 chromosomes from ethnically matched controls, in the ciliary gene NEK2, encoding a serine/threonine-protein kinase. Inactivation of this gene in zebrafish induced retinal photoreceptor defects that were rescued by human NEK2 mRNA. In addition to identifying a previously undescribed ARRP gene, our study highlights the importance of rare structural DNA variations in Mendelian diseases and advocates the need for screening approaches that transcend the analysis of the coding sequences of the human genome.


Subject(s)
Gene Rearrangement/genetics , Genome, Human/genetics , Protein Serine-Threonine Kinases/genetics , Retinitis Pigmentosa/genetics , Animals , Base Sequence , Frameshift Mutation/genetics , Genetics, Medical , Genome-Wide Association Study , Humans , Japan , Molecular Sequence Data , NIMA-Related Kinases , Sequence Analysis, DNA , United States , Zebrafish
8.
Environ Microbiol ; 17(1): 91-104, 2015 Jan.
Article in English | MEDLINE | ID: mdl-24803113

ABSTRACT

Pseudomonas knackmussii B13, isolated in 1974, was the first strain found to degrade chlorinated aromatic hydrocarbons. This discovery was the prologue for subsequent characterization of numerous bacterial metabolic pathways and for genetic and biochemical studies, and it spurred ideas for pollutant bioremediation. In this study, we determined the complete genome sequence of B13 using next generation sequencing technologies and optical mapping. Genome annotation indicated that B13 has a variety of metabolic pathways for degrading monoaromatic hydrocarbons including chlorobenzoate, aminophenol, anthranilate and hydroxyquinol, but not polyaromatic compounds. Comparative genome analysis revealed that B13 is closest to Pseudomonas denitrificans and Pseudomonas aeruginosa. The B13 genome contains at least eight genomic islands [prophages and integrative conjugative elements (ICEs)], which were absent in closely related pseudomonads. We confirm that two ICEs are identical copies of the 103 kb self-transmissible element ICEclc that carries the genes for chlorocatechol metabolism. Comparison of ICEclc showed that it is composed of a variable and a 'core' region, which is very conserved among proteobacterial genomes, suggesting a widely distributed family of so far uncharacterized ICE. Resequencing of two spontaneous B13 mutants revealed a number of single nucleotide substitutions, as well as excision of a large 220 kb region and a prophage that drastically change the host metabolic capacity and survivability.


Subject(s)
Genome, Bacterial , Pseudomonas/genetics , Chlorobenzoates/metabolism , Chromosomes, Bacterial , Genomic Islands , Genomics , Hydrocarbons, Aromatic/metabolism , Metabolic Networks and Pathways , Prophages/genetics , Pseudomonas/classification , Pseudomonas/metabolism , Pseudomonas aeruginosa/genetics
9.
Mol Vis ; 20: 843-51, 2014.
Article in English | MEDLINE | ID: mdl-24959063

ABSTRACT

PURPOSE: Mutations in genes encoding proteins from the tri-snRNP complex of the spliceosome account for more than 12% of cases of autosomal dominant retinitis pigmentosa (adRP). Although the exact mechanism by which splicing factor defects trigger photoreceptor death is not completely clear, their role in retinitis pigmentosa has been demonstrated by several genetic and functional studies. To test for possible novel associations between splicing factors and adRP, we screened four tri-snRNP splicing factor genes (EFTUD2, PRPF4, NHP2L1, and AAR2) as candidate disease genes. METHODS: We screened up to 303 patients with adRP from Europe and North America who did not carry known RP mutations. Exon-PCR and Sanger methods were used to sequence the NHP2L1 and AAR2 genes, while the sequences of EFTUD2 and PRPF4 were obtained by using long-range PCRs spanning coding and non-coding regions followed by next-generation sequencing. RESULTS: We detected novel missense changes in individual patients in the sequence of the genes PRPF4 and EFTUD2, but the role of these changes in relationship to disease could not be verified. In one other patient we identified a novel nucleotide substitution in the 5' untranslated region (UTR) of NHP2L1, which did not segregate with the disease in the family. CONCLUSIONS: The absence of clearly pathogenic mutations in the candidate genes screened in our cohort suggests that EFTUD2, PRPF4, NHP2L1, and AAR2 are either not involved in adRP or are associated with the disease in rare instances, at least as observed in this study in patients of European and North American origin.


Subject(s)
DNA Mutational Analysis/methods , Genes, Dominant , Genetic Testing , RNA Splicing/genetics , Retinitis Pigmentosa/genetics , High-Throughput Nucleotide Sequencing , Humans , Open Reading Frames/genetics , Peptide Elongation Factors/genetics , Ribonucleoprotein, U4-U6 Small Nuclear/genetics , Ribonucleoprotein, U5 Small Nuclear , Ribonucleoproteins, Small Nuclear/genetics
10.
bioRxiv ; 2023 Feb 04.
Article in English | MEDLINE | ID: mdl-36778506

ABSTRACT

Pancreatic islets are comprised of multiple endocrine cell types that produce hormones required for glucose homeostasis, and islet dysfunction is a major factor in the development of type 1 and type 2 diabetes (T1D, T2D). Numerous studies have generated gene expression profiles in individual islet cell types using single cell assays. However, there is no canonical reference of gene expression in islet cell types in both health and disease that is also easily accessible for researchers to access, query, and use in bioinformatics pipelines. Here we present an integrated reference map of islet cell type-specific gene expression from 192,203 cells derived from single cell RNA-seq assays of 65 non-diabetic, T1D autoantibody positive (Aab+), T1D, and T2D donors from the Human Pancreas Analysis Program. We identified 10 endocrine and non-endocrine cell types as well as sub-populations of several cell types, and defined sets of marker genes for each cell type and sub-population. We tested for differential expression within each cell type in T1D Aab+, T1D, and T2D states, and identified 1,701 genes with significant changes in expression in any cell type. Most changes were observed in beta cells in T1D, and, by comparison, there were almost no genes with changes in T1D Aab+. To facilitate user interaction with this reference, we provide the data using several single cell visualization and reference mapping tools as well as open-access analytical pipelines used to create this reference. The results will serve as a valuable resource to investigators studying islet biology and diabetes.

11.
medRxiv ; 2023 Dec 04.
Article in English | MEDLINE | ID: mdl-37986756

ABSTRACT

Over 10% of type 1 diabetes (T1D) cases do not have high-risk HLA-DR3 or DR4 haplotypes with distinct clinical features such as later onset and reduced insulin dependence. To identify genetic drivers of T1D in the absence of DR3/DR4, we performed association and fine-mapping analyses in 12,316 non-DR3/DR4 samples. Risk variants at the MHC and other loci genome-wide had heterogeneity in effects on T1D dependent on DR3/DR4, and non-DR3/DR4 T1D had evidence for a greater polygenic burden. T1D-associated variants in non-DR3/DR4 were more enriched for loci, regulatory elements, and pathways for antigen presentation, innate immunity, and beta cells, and depleted in T cells, compared to DR3/DR4. Non-DR3/DR4 T1D cases were poorly classified based on an existing genetic risk score GRS2, and we created a new GRS which highly discriminated non-DR3/DR4 T1D from both non-diabetes and T2D. In total we identified heterogeneity in T1D genetic risk and disease mechanisms dependent on high-risk HLA haplotype and which enabled accurate classification of T1D across HLA background.

12.
Diabetes ; 72(11): 1719-1728, 2023 Nov 01.
Article in English | MEDLINE | ID: mdl-37582230

ABSTRACT

Pancreatic islets consist of multiple cell types that produce hormones required for glucose homeostasis, and islet dysfunction is a major factor in type 1 and type 2 diabetes. Numerous studies have assessed transcription across individual cell types using single-cell assays; however, there is no canonical reference of gene expression in islet cell types that is also easily accessible for researchers to query and use in bioinformatics pipelines. Here we present an integrated map of islet cell type-specific gene expression from 192,203 cells from single-cell RNA sequencing of 65 donors without diabetes, donors who were type 1 diabetes autoantibody positive, donors with type 1 diabetes, and donors with type 2 diabetes from the Human Pancreas Analysis Program. We identified 10 distinct cell types, annotated subpopulations of several cell types, and defined cell type-specific marker genes. We tested differential expression within each cell type across disease states and identified 1,701 genes with significant changes in expression, with most changes observed in β-cells from donors with type 1 diabetes. To facilitate user interaction, we provide several single-cell visualization and reference mapping tools, as well as the open-access analytical pipelines used to create this reference. The results will serve as a valuable resource to investigators studying islet biology.


Subject(s)
Diabetes Mellitus, Type 1 , Diabetes Mellitus, Type 2 , Islets of Langerhans , Humans , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/metabolism , Diabetes Mellitus, Type 1/genetics , Diabetes Mellitus, Type 1/metabolism , Islets of Langerhans/metabolism , Pancreas/metabolism , Gene Expression
13.
Cell Genom ; 2(12): 100214, 2022 Dec 14.
Article in English | MEDLINE | ID: mdl-36778047

ABSTRACT

We combined functional genomics and human genetics to investigate processes that affect type 1 diabetes (T1D) risk by mediating beta cell survival in response to proinflammatory cytokines. We mapped 38,931 cytokine-responsive candidate cis-regulatory elements (cCREs) in beta cells using ATAC-seq and snATAC-seq and linked them to target genes using co-accessibility and HiChIP. Using a genome-wide CRISPR screen in EndoC-βH1 cells, we identified 867 genes affecting cytokine-induced survival, and genes promoting survival and up-regulated in cytokines were enriched at T1D risk loci. Using SNP-SELEX, we identified 2,229 variants in cytokine-responsive cCREs altering transcription factor (TF) binding, and variants altering binding of TFs regulating stress, inflammation, and apoptosis were enriched for T1D risk. At the 16p13 locus, a fine-mapped T1D variant altering TF binding in a cytokine-induced cCRE interacted with SOCS1, which promoted survival in cytokine exposure. Our findings reveal processes and genes acting in beta cells during inflammation that modulate T1D risk.

14.
Hum Mutat ; 32(6): E2246-58, 2011 Jun.
Article in English | MEDLINE | ID: mdl-21618346

ABSTRACT

The gene SNRNP200 is composed of 45 exons and encodes a protein essential for pre-mRNA splicing, the 200 kDa helicase hBrr2. Two mutations in SNRNP200 have recently been associated with autosomal dominant retinitis pigmentosa (adRP), a retinal degenerative disease, in two families from China. In this work we analyzed the entire 35-Kb SNRNP200 genomic region in a cohort of 96 unrelated North American patients with adRP. To complete this large-scale sequencing project, we performed ultra high-throughput sequencing of pooled, untagged PCR products. We then validated the detected DNA changes by Sanger sequencing of individual samples from this cohort and from an additional one of 95 patients. One of the two previously known mutations (p.S1087L) was identified in 3 patients, while 4 new missense changes (p.R681C, p.R681H, p.V683L, p.Y689C) affecting highly conserved codons were identified in 6 unrelated individuals, indicating that the prevalence of SNRNP200-associated adRP is relatively high. We also took advantage of this research to evaluate the pool-and-sequence method, especially with respect to the generation of false positive and negative results. We conclude that, although this strategy can be adopted for rapid discovery of new disease-associated variants, it still requires extensive validation to be used in routine DNA screenings.


Subject(s)
Retinitis Pigmentosa/genetics , Ribonucleoproteins, Small Nuclear/genetics , Amino Acid Sequence , China , Codon , Exons , Genes, Dominant , Humans , Molecular Sequence Data , Mutation, Missense/genetics , Pedigree , Sequence Analysis, DNA/methods
15.
Nat Commun ; 10(1): 1054, 2019 03 05.
Article in English | MEDLINE | ID: mdl-30837461

ABSTRACT

While genetic variation at chromatin loops is relevant for human disease, the relationships between contact propensity (the probability that loci at loops physically interact), genetics, and gene regulation are unclear. We quantitatively interrogate these relationships by comparing Hi-C and molecular phenotype data across cell types and haplotypes. While chromatin loops consistently form across different cell types, they have subtle quantitative differences in contact frequency that are associated with larger changes in gene expression and H3K27ac. For the vast majority of loci with quantitative differences in contact frequency across haplotypes, the changes in magnitude are smaller than those across cell types; however, the proportional relationships between contact propensity, gene expression, and H3K27ac are consistent. These findings suggest that subtle changes in contact propensity have a biologically meaningful role in gene regulation and could be a mechanism by which regulatory genetic variants in loop anchors mediate effects on expression.


Subject(s)
Chromatin/genetics , DNA/genetics , Gene Expression Regulation , Histones/genetics , Quantitative Trait Loci/genetics , Adolescent , Adult , Aged , Cell Line , Chromatin/metabolism , DNA/metabolism , Female , Histones/metabolism , Humans , Induced Pluripotent Stem Cells , Male , Middle Aged , Myocytes, Cardiac , Nucleic Acid Conformation , Polymorphism, Single Nucleotide , Whole Genome Sequencing , Young Adult
16.
Stem Cell Reports ; 12(6): 1342-1353, 2019 06 11.
Article in English | MEDLINE | ID: mdl-31080113

ABSTRACT

We evaluate whether human induced pluripotent stem cell-derived retinal pigment epithelium (iPSC-RPE) cells can be used to prioritize and functionally characterize causal variants at age-related macular degeneration (AMD) risk loci. We generated iPSC-RPE from six subjects and show that they have morphological and molecular characteristics similar to those of native RPE. We generated RNA-seq, ATAC-seq, and H3K27ac ChIP-seq data and observed high similarity in gene expression and enriched transcription factor motif profiles between iPSC-RPE and human fetal RPE. We performed fine mapping of AMD risk loci by integrating molecular data from the iPSC-RPE, adult retina, and adult RPE, which identified rs943080 as the probable causal variant at VEGFA. We show that rs943080 is associated with altered chromatin accessibility of a distal ATAC-seq peak, decreased overall gene expression of VEGFA, and allele-specific expression of a non-coding transcript. Our study thus provides a potential mechanism underlying the association of the VEGFA locus with AMD.


Subject(s)
Genetic Loci , Induced Pluripotent Stem Cells/metabolism , Macular Degeneration , Retinal Pigment Epithelium/metabolism , Vascular Endothelial Growth Factor A , Female , Humans , Induced Pluripotent Stem Cells/pathology , Macular Degeneration/genetics , Macular Degeneration/metabolism , Macular Degeneration/pathology , Retinal Pigment Epithelium/pathology , Sequence Analysis, RNA , Vascular Endothelial Growth Factor A/biosynthesis , Vascular Endothelial Growth Factor A/genetics
17.
Nat Genet ; 51(10): 1506-1517, 2019 10.
Article in English | MEDLINE | ID: mdl-31570892

ABSTRACT

The cardiac transcription factor (TF) gene NKX2-5 has been associated with electrocardiographic (EKG) traits through genome-wide association studies (GWASs), but the extent to which differential binding of NKX2-5 at common regulatory variants contributes to these traits has not yet been studied. We analyzed transcriptomic and epigenomic data from induced pluripotent stem cell-derived cardiomyocytes from seven related individuals, and identified ~2,000 single-nucleotide variants associated with allele-specific effects (ASE-SNVs) on NKX2-5 binding. NKX2-5 ASE-SNVs were enriched for altered TF motifs, for heart-specific expression quantitative trait loci and for EKG GWAS signals. Using fine-mapping combined with epigenomic data from induced pluripotent stem cell-derived cardiomyocytes, we prioritized candidate causal variants for EKG traits, many of which were NKX2-5 ASE-SNVs. Experimentally characterizing two NKX2-5 ASE-SNVs (rs3807989 and rs590041) showed that they modulate the expression of target genes via differential protein binding in cardiac cells, indicating that they are functional variants underlying EKG GWAS signals. Our results show that differential NKX2-5 binding at numerous regulatory variants across the genome contributes to EKG phenotypes.


Subject(s)
Atrial Fibrillation/genetics , Atrial Fibrillation/pathology , Homeobox Protein Nkx-2.5/genetics , Homeobox Protein Nkx-2.5/metabolism , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Regulatory Elements, Transcriptional , Adolescent , Adult , Aged , Aged, 80 and over , Alleles , Child , Electrocardiography , Epigenomics , Female , Genetic Predisposition to Disease , Genome, Human , Genome-Wide Association Study , Humans , Induced Pluripotent Stem Cells/metabolism , Induced Pluripotent Stem Cells/pathology , Male , Middle Aged , Myocytes, Cardiac/metabolism , Myocytes, Cardiac/pathology , Phenotype , Protein Binding , Transcriptome , Young Adult
18.
Cell Rep ; 24(4): 883-894, 2018 07 24.
Article in English | MEDLINE | ID: mdl-30044985

ABSTRACT

To understand the mutational burden of human induced pluripotent stem cells (iPSCs), we sequenced genomes of 18 fibroblast-derived iPSC lines and identified different classes of somatic mutations based on structure, origin, and frequency. Copy-number alterations affected 295 kb in each sample and strongly impacted gene expression. UV-damage mutations were present in ∼45% of the iPSCs and accounted for most of the observed heterogeneity in mutation rates across lines. Subclonal mutations (not present in all iPSCs within a line) composed 10% of point mutations and, compared with clonal variants, showed an enrichment in active promoters and increased association with altered gene expression. Our study shows that, by combining WGS, transcriptome, and epigenome data, we can understand the mutational burden of each iPSC line on an individual basis and suggests that this information could be used to prioritize iPSC lines for models of specific human diseases and/or transplantation therapy.


Subject(s)
Induced Pluripotent Stem Cells/cytology , Induced Pluripotent Stem Cells/physiology , Cell Differentiation/physiology , Cells, Cultured , Cellular Reprogramming/genetics , Humans , Mutation , Mutation Rate
19.
Cell Stem Cell ; 20(4): 533-546.e7, 2017 04 06.
Article in English | MEDLINE | ID: mdl-28388430

ABSTRACT

In this study, we used whole-genome sequencing and gene expression profiling of 215 human induced pluripotent stem cell (iPSC) lines from different donors to identify genetic variants associated with RNA expression for 5,746 genes. We were able to predict causal variants for these expression quantitative trait loci (eQTLs) that disrupt transcription factor binding and validated a subset of them experimentally. We also identified copy-number variant (CNV) eQTLs, including some that appear to affect gene expression by altering the copy number of intergenic regulatory regions. In addition, we were able to identify effects on gene expression of rare genic CNVs and regulatory single-nucleotide variants and found that reactivation of gene expression on the X chromosome depends on gene chromosomal position. Our work highlights the value of iPSCs for genetic association analyses and provides a unique resource for investigating the genetic regulation of gene expression in pluripotent cells.


Subject(s)
Gene Expression Profiling/methods , Gene Expression Regulation , Genetic Variation , Induced Pluripotent Stem Cells/metabolism , Binding Sites/genetics , Cellular Reprogramming/genetics , Chromosomes, Human, X/genetics , DNA Copy Number Variations/genetics , Genetic Heterogeneity , Humans , Molecular Sequence Annotation , Quantitative Trait Loci/genetics , Regulatory Sequences, Nucleic Acid/genetics , Transcription Factors/metabolism
20.
Stem Cell Reports ; 8(4): 1086-1100, 2017 04 11.
Article in English | MEDLINE | ID: mdl-28410642

ABSTRACT

Large-scale collections of induced pluripotent stem cells (iPSCs) could serve as powerful model systems for examining how genetic variation affects biology and disease. Here we describe the iPSCORE resource: a collection of systematically derived and characterized iPSC lines from 222 ethnically diverse individuals that allows for both familial and association-based genetic studies. iPSCORE lines are pluripotent with high genomic integrity (no or low numbers of somatic copy-number variants) as determined using high-throughput RNA-sequencing and genotyping arrays, respectively. Using iPSCs from a family of individuals, we show that iPSC-derived cardiomyocytes demonstrate gene expression patterns that cluster by genetic background, and can be used to examine variants associated with physiological and disease phenotypes. The iPSCORE collection contains representative individuals for risk and non-risk alleles for 95% of SNPs associated with human phenotypes through genome-wide association studies. Our study demonstrates the utility of iPSCORE for examining how genetic variants influence molecular and physiological traits in iPSCs and derived cell lines.


Subject(s)
Arrhythmias, Cardiac/genetics , Databases, Factual , Genetic Association Studies , Genetic Variation , Induced Pluripotent Stem Cells/metabolism , Myocytes, Cardiac/metabolism , Arrhythmias, Cardiac/ethnology , Arrhythmias, Cardiac/metabolism , Arrhythmias, Cardiac/physiopathology , Cell Differentiation , Cell Line , Cellular Reprogramming/genetics , Genotype , High-Throughput Nucleotide Sequencing , Humans , Induced Pluripotent Stem Cells/cytology , Multigene Family , Myocytes, Cardiac/cytology , Oligonucleotide Array Sequence Analysis , Phenotype , Polymorphism, Single Nucleotide , Racial Groups