Abstract
Emerging evidence implicates common genetic variation - aggregated into polygenic scores (PGS) - in the onset and phenotypic presentation of rare diseases. Here, we comprehensively map individual polygenic liability for 1102 open-source PGS in a cohort of 3059 probands enrolled in the Genomic Answers for Kids (GA4K) rare disease study, revealing widespread associations between rare disease phenotypes and PGSs for common complex diseases and traits, blood protein levels, and brain and other organ morphological measurements. Using this resource, we demonstrate increased polygenic liability in probands with an inherited candidate disease variant (VUS) compared to unaffected carrier parents. Further, we show an enrichment for large-effect rare variants in putative core PGS genes for associated complex traits. Overall, our study supports and expands on previous findings of complex trait associations in rare diseases, implicates polygenic liability as a potential mechanism underlying variable penetrance of candidate causal variants, and provides a framework for identifying novel candidate rare disease genes.
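A polygenic score of the kind aggregated above is conventionally computed as a weighted sum of effect-allele dosages. The following is a minimal Python sketch under that assumption. The variant IDs, weights, and dosages are invented for illustration and are not taken from the GA4K study or from any published PGS.

```python
from typing import Dict


def polygenic_score(weights: Dict[str, float], dosages: Dict[str, float]) -> float:
    """Return the weighted sum of effect-allele dosages (0, 1 or 2 per variant).

    Variants absent from `dosages` are skipped. This is one simple convention,
    assumed here for illustration; real pipelines may impute or mean-fill.
    """
    return sum(w * dosages[v] for v, w in weights.items() if v in dosages)


# Hypothetical per-variant effect weights (not from any real PGS).
example_weights = {"var_a": 0.5, "var_b": -0.25, "var_c": 1.0}
# Hypothetical genotype dosages for one individual.
example_dosages = {"var_a": 2, "var_b": 1, "var_c": 0}

score = polygenic_score(example_weights, example_dosages)
print(score)  # 0.5*2 + (-0.25)*1 + 1.0*0 = 0.75
```

In practice, raw scores are usually standardized against a reference distribution before individuals are compared. That step is omitted here.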
Subjects
Genetic Predisposition to Disease , Multifactorial Inheritance , Phenotype , Rare Diseases , Humans , Multifactorial Inheritance/genetics , Rare Diseases/genetics , Genetic Variation , Male , Female , Genome-Wide Association Study , Penetrance , Child , Cohort Studies
Abstract
The factors driving or preventing pathological expansion of tandem repeats remain largely unknown. Here, we assessed the FGF14 (GAA)·(TTC) repeat locus in 2,530 individuals by long-read and Sanger sequencing and identified a common 5'-flanking variant in 70.34% of alleles analyzed (3,463/4,923) that represents the phylogenetically ancestral allele and is present on all major haplotypes. This common sequence variation is present nearly exclusively on nonpathogenic alleles with fewer than 30 GAA-pure triplets and is associated with enhanced stability of the repeat locus upon intergenerational transmission and increased Fiber-seq chromatin accessibility.
Subjects
Alleles , Fibroblast Growth Factors , Fibroblast Growth Factors/genetics , Fibroblast Growth Factors/metabolism , Humans , Haplotypes , Genetic Variation , Genetic Loci
Abstract
Recent studies have revealed the pervasive landscape of rare structural variants (rSVs) present in human genomes. rSVs can have extreme effects on the expression of proximal genes and, in a rare disease context, have been implicated in patient cases where no diagnostic single nucleotide variant (SNV) was found. To date, efforts to integrate rSVs have focused on targeted analyses of known Mendelian rare disease genes. This approach is intractable for rare diseases with many causal loci or for patients with complex, multi-phenotype syndromes. We hypothesized that integrating trait-relevant polygenic scores (PGS) would provide a substantial reduction in the number of candidate disease genes in which to assess rSV effects. We further implemented a method for ranking PGS genes to define a set of core/key genes where an rSV has the potential to exert relatively larger effects on disease risk. Among a subset of patients enrolled in the Genomic Answers for Kids (GA4K) rare disease program (N=497), we used PacBio HiFi long-read whole genome sequencing (lrWGS) to identify rSVs intersecting genes in trait-relevant PGSs. Illustrating our approach in autism (N=54 cases), we identified 22,019 deletions, 2,041 duplications, 87,826 insertions, and 214 inversions overlapping putative core/key PGS genes. Additionally, by integrating genomic constraint annotations from gnomAD, we observed that rare duplications overlapping putative core/key PGS genes were frequently in higher constraint regions compared to controls (P = 1×10⁻³). This difference was not observed in the lowest-ranked gene set (P = 0.15). Overall, our study provides a framework for the annotation of long-read rSVs from lrWGS data and prioritization of disease-linked genomic regions for downstream functional validation of rSV impacts. To enable reuse by other researchers, we have made SV allele frequencies and gene associations freely available.
Abstract
Tandem repeat (TR) variation is associated with gene expression changes and numerous rare monogenic diseases. Although long-read sequencing provides accurate full-length sequences and methylation of TRs, there is still a need for computational methods to profile TRs across the genome. Here we introduce the Tandem Repeat Genotyping Tool (TRGT) and an accompanying TR database. TRGT determines the consensus sequences and methylation levels of specified TRs from PacBio HiFi sequencing data. It also reports reads that support each repeat allele. These reads can be subsequently visualized with a companion TR visualization tool. Assessing 937,122 TRs, TRGT showed a Mendelian concordance of 98.38%, allowing a single repeat unit difference. In six samples with known repeat expansions, TRGT detected all expansions while also identifying methylation signals and mosaicism and providing finer repeat length resolution than existing methods. Additionally, we released a database with allele sequences and methylation levels for 937,122 TRs across 100 genomes.
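One way to read "Mendelian concordance, allowing a single repeat unit difference" is that a trio is concordant at a locus when the child's two alleles can be assigned one to each parent, each within one repeat unit of an allele that parent carries. The Python sketch below assumes this reading. It is not TRGT's own implementation, and the function names and repeat counts are hypothetical.

```python
from typing import Iterable, Tuple

Genotype = Tuple[int, int]  # two allele lengths, in repeat units


def _matches(child_allele: int, parent: Genotype, tol: int) -> bool:
    """True if the child allele is within `tol` units of either parental allele."""
    return any(abs(child_allele - p) <= tol for p in parent)


def is_concordant(child: Genotype, mother: Genotype, father: Genotype,
                  tol: int = 1) -> bool:
    """Check both ways of assigning the child's alleles to the two parents."""
    c1, c2 = child
    return ((_matches(c1, mother, tol) and _matches(c2, father, tol)) or
            (_matches(c2, mother, tol) and _matches(c1, father, tol)))


def concordance_rate(trios: Iterable[Tuple[Genotype, Genotype, Genotype]],
                     tol: int = 1) -> float:
    """Fraction of (child, mother, father) trios that are concordant."""
    trios = list(trios)
    if not trios:
        raise ValueError("no trios supplied")
    return sum(is_concordant(c, m, f, tol) for c, m, f in trios) / len(trios)


# Hypothetical repeat counts for three trios at one locus.
example_trios = [
    ((10, 12), (10, 15), (12, 20)),  # exact transmission
    ((11, 13), (10, 10), (12, 12)),  # each allele off by one unit
    ((11, 30), (10, 10), (12, 12)),  # 30-unit allele has no parental source
]
print(concordance_rate(example_trios))
```

Setting `tol=0` gives strict concordance, under which the second example trio would be counted as discordant.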
Subjects
DNA Methylation , Tandem Repeat Sequences , Tandem Repeat Sequences/genetics , Humans , DNA Methylation/genetics , Genome, Human/genetics , Alleles , Sequence Analysis, DNA/methods , Software , Databases, Genetic
Abstract
Rare DNA alterations that cause heritable diseases are only partially resolvable by clinical next-generation sequencing due to the difficulty of detecting structural variation (SV) in all genomic contexts. Long-read, high fidelity genome sequencing (HiFi-GS) detects SVs with increased sensitivity and enables assembling personal and graph genomes. We leverage standard reference genomes, public assemblies (n = 94) and a large collection of HiFi-GS data from a rare disease program (Genomic Answers for Kids, GA4K, n = 574 assemblies) to build a graph genome representing a unified SV callset in GA4K, identify common variation and prioritize SVs that are more likely to cause genetic disease (MAF < 0.01). Using graphs, we obtain a higher level of reproducibility than the standard reference approach. We observe over 200,000 SV alleles unique to GA4K, including nearly 1000 rare variants that impact coding sequence. With improved specificity for rare SVs, we isolate 30 candidate SVs in phenotypically prioritized genes, including known disease SVs. We isolate a novel diagnostic SV in KMT2E, demonstrating use of personal assemblies coupled with pangenome graphs for rare disease genomics. The community may interrogate our pangenome with additional assemblies to discover new SVs within the allele frequency spectrum relevant to genetic diseases.
Subjects
Genomics , Rare Diseases , Humans , Rare Diseases/genetics , Reproducibility of Results , Chromosome Mapping , Alleles
Abstract
The extravillous trophoblast cell lineage is a key feature of placentation and successful pregnancy. Knowledge of transcriptional regulation driving extravillous trophoblast cell development is limited. Here, we map the transcriptome and epigenome landscape as well as chromatin interactions of human trophoblast stem cells and their transition into extravillous trophoblast cells. We show that integrating chromatin accessibility, long-range chromatin interactions, transcriptomics, and transcription factor binding motif enrichment enables identification of transcription factors and regulatory mechanisms critical for extravillous trophoblast cell development. We elucidate functional roles for TFAP2C, SNAI1, and EPAS1 in the regulation of extravillous trophoblast cell development. EPAS1 is identified as an upstream regulator of key extravillous trophoblast cell transcription factors, including ASCL2 and SNAI1, and, together with its target genes, is linked to pregnancy loss and birth weight. Collectively, we reveal activation of a dynamic regulatory network and provide a framework for understanding extravillous trophoblast cell specification in trophoblast cell lineage development and human placentation.
Subjects
Chromatin , Trophoblasts , Pregnancy , Female , Humans , Trophoblasts/metabolism , Chromatin/genetics , Chromatin/metabolism , Placentation/genetics , Cell Differentiation/genetics , Transcription Factors/genetics , Transcription Factors/metabolism , Cell Lineage/genetics , Placenta/metabolism , Basic Helix-Loop-Helix Transcription Factors/genetics , Basic Helix-Loop-Helix Transcription Factors/metabolism
Abstract
Exposure to adverse early-life environments (AME) increases the incidence of adult-onset non-alcoholic fatty liver disease (NAFLD). DNA methylation has been postulated to link AME and late-onset diseases. This study aimed to investigate whether and to what extent the hepatic DNA methylome was perturbed prior to the development of NAFLD in mouse offspring exposed to AME. AME constituted maternal Western diet and late-gestational stress. Male offspring livers at birth (d0) and weaning (d21) were used for evaluating the DNA methylome and transcriptome using reduced representation bisulfite sequencing and RNA-seq, respectively. We found that AME caused 5879 differentially methylated regions (DMRs) and zero differentially expressed genes (DEGs) at d0, and 2970 DMRs and 123 DEGs at d21. The majority of the DMRs were distal to gene transcription start sites and did not correlate with DEGs. The DEGs at d21 were significantly enriched in GO biological processes characteristic of liver metabolic functions. In conclusion, AME drove changes in the hepatic DNA methylome, which preceded perturbations in the hepatic metabolic transcriptome, which in turn preceded the onset of NAFLD. We speculate that subtle impacts on dynamic enhancers lead to long-range regulatory changes that manifest over time as gene network alterations and increase the incidence of NAFLD later in life.
Subjects
Non-alcoholic Fatty Liver Disease , Male , Animals , Mice , Pregnancy , Female , Non-alcoholic Fatty Liver Disease/genetics , Epigenome , Transcriptome , DNA Methylation
Abstract
Long-read HiFi genome sequencing allows for accurate detection and direct phasing of single nucleotide variants, indels, and structural variants. Recent algorithmic development enables simultaneous detection of CpG methylation for analysis of regulatory element activity directly in HiFi reads. We present a comprehensive haplotype-resolved 5-base HiFi genome sequencing dataset from a rare disease cohort of 276 samples in 152 families to identify rare (~0.5%) hypermethylation events. We find that 80% of these events are allele-specific and predicted to cause loss of regulatory element activity. We demonstrate heritability of extreme hypermethylation, including rare cis variants associated with short (~200 bp) and large (>1 kb) hypermethylation events, respectively. We identify repeat expansions in proximal promoters predicting allelic gene silencing via hypermethylation and demonstrate allelic transcriptional events downstream. On average, 30-40 rare hypermethylation tiles overlap rare disease genes per patient, providing indications for variation prioritization, including a previously undiagnosed pathogenic allele in DIP2B causing global developmental delay. We propose that use of HiFi genome sequencing in unsolved rare disease cases will allow detection of unconventional disease alleles due to loss of regulatory element activity.
Subjects
DNA Methylation , Rare Diseases , Humans , Haplotypes , Rare Diseases/genetics , DNA Methylation/genetics , Sequence Analysis, DNA , Base Sequence , High-Throughput Nucleotide Sequencing , Nerve Tissue Proteins/genetics
Abstract
PURPOSE: This study aimed to assess the amount and types of clinical genetic testing denied by insurance and the rate of diagnostic and candidate genetic findings identified through research in patients who faced insurance denials. METHODS: Analysis consisted of review of insurance denials in 801 patients enrolled in a pediatric genomic research repository with either no previous genetic testing or previous negative genetic testing result identified through cross-referencing with insurance prior-authorizations in patient medical records. Patients and denials were also categorized by type of insurance coverage. Diagnostic findings and candidate genetic findings in these groups were determined through review of our internal variant database and patient charts. RESULTS: Of the 801 patients analyzed, 147 had insurance prior-authorization denials on record (18.3%). Exome sequencing and microarray were the most frequently denied genetic tests. Private insurance was significantly more likely to deny testing than public insurance (odds ratio = 2.03 [95% CI = 1.38-2.99] P = .0003). Of the 147 patients with insurance denials, 53.7% had at least 1 diagnostic or candidate finding and 10.9% specifically had a clinically diagnostic finding. Fifty percent of patients with clinically diagnostic results had immediate medical management changes (5.4% of all patients experiencing denials). CONCLUSION: Many patients face a major barrier to genetic testing in the form of lack of insurance coverage. A number of these patients have clinically diagnostic findings with medical management implications that would not have been identified without access to research testing. These findings support re-evaluation of insurance carriers' coverage policies.
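For readers unfamiliar with the statistic reported above, an odds ratio and an approximate 95% confidence interval can be derived from a 2×2 table. The Python sketch below uses the standard Wald interval on the log odds ratio. The abstract does not state which method the authors used, and the counts in the example are purely hypothetical, not the study's data.

```python
import math
from typing import Tuple


def odds_ratio_wald(a: int, b: int, c: int, d: int,
                    z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: exposed with event      b: exposed without event
    c: unexposed with event    d: unexposed without event
    z = 1.96 corresponds to an approximate 95% interval.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: denials vs. no denials under two insurance types.
or_est, ci_low, ci_high = odds_ratio_wald(a=20, b=80, c=10, d=90)
print(f"OR = {or_est:.2f}, 95% CI = {ci_low:.2f}-{ci_high:.2f}")
```

An interval that excludes 1, as the reported 1.38-2.99 does, indicates a difference that is statistically significant at roughly the 5% level.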
Subjects
Genomics , Insurance Coverage , Child , Humans
Abstract
PURPOSE: This study aimed to provide comprehensive diagnostic and candidate analyses in a pediatric rare disease cohort through the Genomic Answers for Kids program. METHODS: Extensive analyses of 960 families with suspected genetic disorders included short-read exome sequencing and short-read genome sequencing (srGS); PacBio HiFi long-read genome sequencing (HiFi-GS); variant calling for single nucleotide variants (SNVs), structural variants (SVs), and repeat variants; and machine-learning variant prioritization. Structured phenotypes, prioritized variants, and pedigrees were stored in a PhenoTips database, with data sharing through controlled access in the database of Genotypes and Phenotypes. RESULTS: Diagnostic rates ranged from 11% in patients with prior negative genetic testing to 34.5% in naive patients. Incorporating SVs from genome sequencing added up to 13% of new diagnoses in previously unsolved cases. HiFi-GS yielded an increased discovery rate, with >4-fold more rare coding SVs compared with srGS. Variants and genes of unknown significance remain the most common finding (58% of nondiagnostic cases). CONCLUSION: Computational prioritization is efficient for diagnostic SNVs. Thorough identification of non-SNVs remains challenging and is partly mitigated using HiFi-GS. Importantly, community research is supported by sharing real-time data to accelerate gene validation and by providing HiFi variant (SNV/SV) resources from >1000 human alleles to facilitate implementation of new sequencing platforms for rare disease diagnoses.
Subjects
Genomics , Rare Diseases , Child , Genome , High-Throughput Nucleotide Sequencing , Humans , Pedigree , Rare Diseases/diagnosis , Rare Diseases/genetics , Sequence Analysis, DNA
Abstract
Previous studies focusing on the age disparity in COVID-19 severity have suggested that younger individuals mount a more robust innate immune response in the nasal mucosa after infection with SARS-CoV-2. However, it is unclear whether this reflects increased immune activation or increased immune residence in the nasal mucosa. We hypothesized that immune residency in the nasal mucosa of healthy individuals may differ across the age range. We applied single-cell RNA-sequencing and measured the cellular composition and transcriptional profile of the nasal mucosa in 35 SARS-CoV-2 negative children and adults, ranging in age from 4 months to 65 years. We analyzed a total of ~30,000 immune and epithelial cells and found that age and immune cell proportion in the nasal mucosa are inversely correlated, with little evidence for structural changes in the transcriptional state of a given cell type across the age range. Orthogonal validation by epigenome sequencing indicates that it is especially cells of the innate immune system that underlie the age association. Additionally, we characterize the predominant immune cell type in the nasal mucosa: a resident T cell-like population with potent antiviral properties. These results demonstrate fundamental changes in the immune cell makeup of the uninfected nasal mucosa over the lifespan. The resource we generate here is an asset for future studies focusing on respiratory infection and immunization strategies.
Subjects
COVID-19/immunology , Nasal Mucosa/immunology , SARS-CoV-2/immunology , Adolescent , Adult , COVID-19/genetics , Child , Child, Preschool , Female , Humans , Immunity, Cellular , Immunity, Innate , Infant , Male , Middle Aged , Nasal Mucosa/cytology , Nasal Mucosa/metabolism , Severity of Illness Index , T-Lymphocytes/immunology , T-Lymphocytes/metabolism , Transcriptome , Young Adult
Subjects
Asthma/genetics , DNA Methylation/genetics , Genetic Predisposition to Disease , Neoplasm Proteins/genetics , Asthma/immunology , Asthma/pathology , CD4-Positive T-Lymphocytes/immunology , Chromosomes, Human, Pair 17/genetics , Chromosomes, Human, Pair 17/immunology , Female , Genetic Association Studies , Humans , Male , T-Lymphocytes/immunology , T-Lymphocytes/pathology
Abstract
The complex relationship between metabolic disease risk and body fat distribution in humans involves cellular characteristics which are specific to body fat compartments. Here we show depot-specific differences in the stromal vascular fraction of visceral and subcutaneous adipose tissue by performing single-cell RNA sequencing of tissue specimens from obese individuals. We characterize multiple immune cells, endothelial cells, fibroblasts, adipose and hematopoietic stem cell progenitors. Subpopulations of adipose-resident immune cells are metabolically active and associated with metabolic disease status; these include a population of potentially dysfunctional CD8+ T cells expressing metallothioneins. We identify multiple types of adipocyte progenitors that are common across depots, including a subtype enriched in individuals with type 2 diabetes. Depot-specific analysis reveals a class of adipocyte progenitors unique to visceral adipose tissue, which shares common features with beige preadipocytes. Our human single-cell transcriptome atlas across fat depots provides a resource to dissect functional genomics of metabolic disease.
Subjects
Adipose Tissue/metabolism , Metabolic Diseases/metabolism , Single-Cell Analysis/methods , Adipocytes/metabolism , Adipose Tissue/cytology , Adult , Body Fat Distribution , Female , Gene Expression Profiling , High-Throughput Nucleotide Sequencing , Humans , Male , Metabolic Diseases/pathology , Middle Aged , Obesity/metabolism
Abstract
Following publication of the original article [1], the authors reported an error in Additional file 1.
Abstract
Lys-27-Met mutations in histone 3 genes (H3K27M) characterize a subgroup of deadly gliomas and decrease genome-wide H3K27 trimethylation. Here we use primary H3K27M tumor lines and isogenic CRISPR-edited controls to assess H3K27M effects in vitro and in vivo. We find that whereas H3K27me3 and H3K27me2 are normally deposited by PRC2 across broad regions, their deposition is severely reduced in H3.3K27M cells. H3K27me3 is unable to spread from large unmethylated CpG islands, while H3K27me2 can be deposited outside these PRC2 high-affinity sites but to levels corresponding to H3K27me3 deposition in wild-type cells. Our findings indicate that PRC2 recruitment and propagation on chromatin are seemingly unaffected by K27M, which mostly impairs spread of the repressive marks it catalyzes, especially H3K27me3. Genome-wide loss of H3K27me3 and me2 deposition has limited transcriptomic consequences, preferentially affecting lowly expressed genes regulating neurogenesis. Removal of H3K27M restores H3K27me2/me3 spread, impairs cell proliferation, and completely abolishes the cells' capacity to form tumors in mice.
Subjects
Brain Neoplasms/genetics , Chromatin/metabolism , Glioblastoma/genetics , Histones/genetics , Polycomb Repressive Complex 2/metabolism , Adolescent , Aged , Animals , Brain Neoplasms/pathology , CRISPR-Cas Systems , Carcinogenesis/genetics , Cell Line, Tumor , Cell Proliferation/genetics , Child , CpG Islands/genetics , DNA Methylation/genetics , Epigenesis, Genetic , Female , Gene Editing/methods , Gene Expression Regulation, Neoplastic , Glioblastoma/pathology , HEK293 Cells , Histone Code/genetics , Histones/metabolism , Humans , Lysine/genetics , Male , Methionine/genetics , Mice , Mice, Inbred NOD , Mice, SCID , Mutation , Neurogenesis/genetics , Xenograft Model Antitumor Assays
Abstract
Sparse profiling of CpG methylation in blood by microarrays has identified epigenetic links to common diseases. Here we apply methylC-capture sequencing (MCC-Seq) in a clinical population of ~200 adipose tissue and matched blood samples (total N ~400), providing high-resolution methylation profiling (>1.3 M CpGs) at regulatory elements. We link methylation to cardiometabolic risk through associations to circulating plasma lipid levels and identify lipid-associated CpGs with unique localization patterns in regulatory elements. We show distinct features of tissue-specific versus tissue-independent lipid-linked regulatory regions by contrasting with parallel assessments in ~800 independent adipose tissue and blood samples from the general population. We follow up on adipose-specific regulatory regions under (1) genetic and (2) epigenetic (environmental) regulation via integrational studies. Overall, the comprehensive sequencing of regulatory element methylomes reveals a rich landscape of functional variants linked genetically as well as epigenetically to plasma lipid traits.
Subjects
Cardiovascular Diseases/genetics , CpG Islands/genetics , Epigenesis, Genetic , Metabolic Diseases/genetics , Regulatory Sequences, Nucleic Acid/genetics , Adipose Tissue/metabolism , Adult , Aged , Cardiovascular Diseases/blood , Cardiovascular Diseases/metabolism , DNA Methylation , Epigenomics/methods , Female , Gene Expression Profiling , Genome, Human , Genome-Wide Association Study , High-Throughput Nucleotide Sequencing/methods , Humans , Lipids/blood , Male , Metabolic Diseases/blood , Metabolic Diseases/metabolism , Middle Aged , Polymorphism, Single Nucleotide , Sequence Analysis, DNA/methods
Abstract
BACKGROUND: The functional impact of genetic variation has been extensively surveyed, revealing that genetic changes correlated to phenotypes lie mostly in non-coding genomic regions. Studies have linked allele-specific genetic changes to gene expression, DNA methylation, and histone marks but these investigations have only been carried out in a limited set of samples. RESULTS: We describe a large-scale coordinated study of allelic and non-allelic effects on DNA methylation, histone mark deposition, and gene expression, detecting the interrelations between epigenetic and functional features at unprecedented resolution. We use information from whole genome and targeted bisulfite sequencing from 910 samples to perform genotype-dependent analyses of allele-specific methylation (ASM) and non-allelic methylation (mQTL). In addition, we introduce a novel genotype-independent test to detect methylation imbalance between chromosomes. Of the ~2.2 million CpGs tested for ASM, mQTL, and genotype-independent effects, we identify ~32% as being genetically regulated (ASM or mQTL) and ~14% as being putatively epigenetically regulated. We also show that epigenetically driven effects are strongly enriched in repressed regions and near transcription start sites, whereas the genetically regulated CpGs are enriched in enhancers. Known imprinted regions are enriched among epigenetically regulated loci, but we also observe several novel genomic regions (e.g., HOX genes) as being epigenetically regulated. Finally, we use our ASM datasets for functional interpretation of disease-associated loci and show the advantage of utilizing naïve T cells for understanding autoimmune diseases. CONCLUSIONS: Our rich catalogue of haploid methylomes across multiple tissues will allow validation of epigenome association studies and exploration of new biological models for allelic exclusion in the human genome.
Subjects
Alleles , DNA Methylation , Epigenesis, Genetic , Epigenomics , Genetic Variation , Genome, Human , Chromosomal Position Effects , CpG Islands , Enhancer Elements, Genetic , Epigenomics/methods , Gene Expression Profiling , Gene Expression Regulation , Genotype , High-Throughput Nucleotide Sequencing , Humans , Organ Specificity/genetics , Phenotype , Polymorphism, Single Nucleotide , Quantitative Trait Loci
Abstract
The incidence of type 1 diabetes (T1D) has substantially increased over the past decade, suggesting a role for non-genetic factors such as epigenetic mechanisms in disease development. Here we present an epigenome-wide association study across 406,365 CpGs in 52 monozygotic twin pairs discordant for T1D in three immune effector cell types. We observe a substantial enrichment of differentially variable CpG positions (DVPs) in T1D twins when compared with their healthy co-twins and when compared with healthy, unrelated individuals. These T1D-associated DVPs are found to be temporally stable and enriched at gene regulatory elements. Integration with cell type-specific gene regulatory circuits highlights pathways involved in immune cell metabolism and the cell cycle, including mTOR signalling. Evidence from cord blood of newborns who progress to overt T1D suggests that the DVPs likely emerge after birth. Our findings, based on 772 methylomes, implicate epigenetic changes that could contribute to disease pathogenesis in T1D.
Subjects
DNA Methylation/genetics , Diabetes Mellitus, Type 1/genetics , Diabetes Mellitus, Type 1/immunology , CpG Islands/genetics , Fetal Blood/metabolism , Humans , Molecular Sequence Annotation , Time Factors , Twins, Monozygotic/genetics
Abstract
BACKGROUND: CpG methylation variation is involved in human trait formation and disease susceptibility. Analyses within populations have been biased towards CpG-dense regions through the application of targeted arrays. We generate whole-genome bisulfite sequencing data for approximately 30 adipose and blood samples from monozygotic and dizygotic twins for the characterization of non-genetic and genetic effects at single-site resolution. RESULTS: Purely invariable CpGs display a bimodal distribution with enrichment of unmethylated CpGs and depletion of fully methylated CpGs in promoter and enhancer regions. Population-variable CpGs account for approximately 15-20% of total CpGs per tissue, are enriched in enhancer-associated regions and depleted in promoters, and single nucleotide polymorphisms at CpGs are a frequent confounder of extreme methylation variation. Differential methylation is primarily non-genetic in origin, with non-shared environment accounting for most of the variance. These non-genetic effects are mainly tissue-specific. Tobacco smoking is associated with differential methylation in blood, with no evidence of this exposure impacting cell counts. In contrast to non-genetic effects, genetic effects on CpG methylation are shared across tissues and thus limit inter-tissue epigenetic drift. CpH methylation is rare and shows variation patterns similar to those of CpGs. CONCLUSIONS: Our study highlights the utility of low-pass whole-genome bisulfite sequencing in identifying methylome variation beyond promoter regions, and suggests that targeting the population dynamic methylome of tissues requires assessment of understudied intergenic CpGs distal to gene promoters to reveal the full extent of inter-individual variation.
Subjects
DNA Methylation , Gene-Environment Interaction , Genetic Variation , Genome, Human , Adipose Tissue/metabolism , Blood/metabolism , CpG Islands , Female , Humans , Smoking/genetics , Twins, Dizygotic , Twins, Monozygotic
Abstract
Large-scale epigenome mapping by the NIH Roadmap Epigenomics Project, the ENCODE Consortium and the International Human Epigenome Consortium (IHEC) produces genome-wide DNA methylation data at one base-pair resolution. We examine how such data can be made open-access while balancing appropriate interpretation and genomic privacy. We propose guidelines for data release that both reduce ambiguity in the interpretation of open-access data and limit immediate access to genetic variation data that are made available through controlled access.