ABSTRACT
Induced pluripotent stem cells (iPSCs) show variable methylation patterns between lines, some of which reflect aberrant differences relative to embryonic stem cells (ESCs). To examine whether this aberrant methylation results from genetic variation or non-genetic mechanisms, we generated human iPSCs from monozygotic twins to investigate how genetic background, clone, and passage number contribute to this variation. We found that aberrantly methylated CpGs are enriched in regulatory regions associated with MYC protein motifs and affect gene expression. We classified differentially methylated CpGs as being associated with genetic and/or non-genetic factors (clone and passage), and we found that aberrant methylation preferentially occurs at CpGs associated with clone-specific effects. We further found that clone-specific effects play a strong role in recurrent aberrant methylation at specific CpG sites across different studies. Our results argue that a non-genetic biological mechanism underlies aberrant methylation in iPSCs and that it is likely based on a probabilistic process involving MYC that takes place during or shortly after reprogramming.
Subjects
DNA Methylation/genetics, Induced Pluripotent Stem Cells/metabolism, Nucleotide Motifs/genetics, Proto-Oncogene Proteins c-myc/metabolism, Clone Cells, CpG Islands/genetics, Fibroblasts/metabolism, Gene Expression Regulation, Genetic Variation, Genome-Wide Association Study, Humans, Sequence Analysis, RNA, Transcription Factors/metabolism, Twins, Monozygotic/genetics
ABSTRACT
Large-scale collections of induced pluripotent stem cells (iPSCs) could serve as powerful model systems for examining how genetic variation affects biology and disease. Here we describe the iPSCORE resource: a collection of systematically derived and characterized iPSC lines from 222 ethnically diverse individuals that allows for both familial and association-based genetic studies. iPSCORE lines are pluripotent with high genomic integrity (no or low numbers of somatic copy-number variants) as determined using high-throughput RNA-sequencing and genotyping arrays, respectively. Using iPSCs from a family of individuals, we show that iPSC-derived cardiomyocytes demonstrate gene expression patterns that cluster by genetic background, and can be used to examine variants associated with physiological and disease phenotypes. The iPSCORE collection contains representative individuals for risk and non-risk alleles for 95% of SNPs associated with human phenotypes through genome-wide association studies. Our study demonstrates the utility of iPSCORE for examining how genetic variants influence molecular and physiological traits in iPSCs and derived cell lines.