ABSTRACT
The genetic architecture of common traits, including the number, frequency, and effect sizes of inherited variants that contribute to individual risk, has been long debated. Genome-wide association studies have identified scores of common variants associated with type 2 diabetes, but in aggregate, these explain only a fraction of the heritability of this disease. Here, to test the hypothesis that lower-frequency variants explain much of the remainder, the GoT2D and T2D-GENES consortia performed whole-genome sequencing in 2,657 European individuals with and without diabetes, and exome sequencing in 12,940 individuals from five ancestry groups. To increase statistical power, we expanded the sample size via genotyping and imputation in a further 111,548 subjects. Variants associated with type 2 diabetes after sequencing were overwhelmingly common and most fell within regions previously identified by genome-wide association studies. Comprehensive enumeration of sequence variation is necessary to identify functional alleles that provide important clues to disease pathophysiology, but large-scale sequencing does not support the idea that lower-frequency variants have a major role in predisposition to type 2 diabetes.
Subjects
Diabetes Mellitus, Type 2/genetics , Genetic Predisposition to Disease/genetics , Genetic Variation/genetics , Alleles , DNA Mutational Analysis , Europe/ethnology , Exome , Genome-Wide Association Study , Genotyping Techniques , Humans , Sample Size
ABSTRACT
We integrate comeasured gene expression and DNA methylation (DNAme) in 265 human skeletal muscle biopsies from the FUSION study with >7 million genetic variants and eight physiological traits: height, waist, weight, waist-hip ratio, body mass index, fasting serum insulin, fasting plasma glucose, and type 2 diabetes. We find hundreds of genes and DNAme sites associated with fasting insulin, waist, and body mass index, as well as thousands of DNAme sites associated with gene expression (eQTM). We find that controlling for heterogeneity in tissue/muscle fiber type reduces the number of physiological trait associations, and that long-range eQTMs (>1 Mb) are reduced when controlling for tissue/muscle fiber type or latent factors. We map genetic regulators (quantitative trait loci; QTLs) of expression (eQTLs) and DNAme (mQTLs). Using Mendelian randomization (MR) and mediation techniques, we leverage these genetic maps to predict 213 causal relationships between expression and DNAme, approximately two-thirds of which predict methylation to causally influence expression. We use MR to integrate FUSION mQTLs, FUSION eQTLs, and GTEx eQTLs for 48 tissues with genetic associations for 534 diseases and quantitative traits. We identify hundreds of genes and thousands of DNAme sites that may drive the reported disease/quantitative trait genetic associations. We identify 300 gene expression MR associations that are present in both FUSION and GTEx skeletal muscle and that show stronger evidence of MR association in skeletal muscle than other tissues, which may partially reflect differences in power across tissues. As one example, we find that increased RXRA muscle expression may decrease lean tissue mass.
Subjects
DNA Methylation/genetics , Gene Expression/genetics , Muscle, Skeletal , Blood Glucose/analysis , Body Weights and Measures , Diabetes Mellitus, Type 2 , Genome-Wide Association Study/methods , Genomics/methods , Humans , Insulin/analysis , Muscle, Skeletal/chemistry , Muscle, Skeletal/physiology , Quantitative Trait Loci/genetics
ABSTRACT
Patients with classic hydroa vacciniforme-like lymphoproliferative disorder (HVLPD) typically have high levels of Epstein-Barr virus (EBV) DNA in T cells and/or natural killer (NK) cells in blood and skin lesions induced by sun exposure that are infiltrated with EBV-infected lymphocytes. HVLPD is very rare in the United States and Europe but more common in Asia and South America. The disease can progress to a systemic form that may result in fatal lymphoma. We report our 11-year experience with 16 HVLPD patients from the United States and England and found that whites were less likely to develop systemic EBV disease (1/10) than nonwhites (5/6). All (10/10) of the white patients were generally in good health at last follow-up, while two-thirds (4/6) of the nonwhite patients required hematopoietic stem cell transplantation. Nonwhite patients had later age of onset of HVLPD than white patients (median age, 8 vs 5 years) and higher levels of EBV DNA (median, 1 515 000 vs 250 000 copies/ml) and more often had low numbers of NK cells (83% vs 50% of patients) and T-cell clones in the blood (83% vs 30% of patients). RNA-sequencing analysis of an HVLPD skin lesion in a white patient compared with his normal skin showed increased expression of interferon-γ and chemokines that attract T cells and NK cells. Thus, white patients with HVLPD were less likely to have systemic disease with EBV and had a much better prognosis than nonwhite patients. This trial was registered at www.clinicaltrials.gov as #NCT00369421 and #NCT00032513.
Subjects
Epstein-Barr Virus Infections/pathology , Hydroa Vacciniforme/virology , Lymphoproliferative Disorders/pathology , Lymphoproliferative Disorders/virology , Child , Child, Preschool , Epstein-Barr Virus Infections/ethnology , Epstein-Barr Virus Infections/immunology , Female , Humans , Lymphoproliferative Disorders/ethnology , Male , White People
ABSTRACT
A major challenge in evaluating the contribution of rare variants to complex disease is identifying enough copies of the rare alleles to permit informative statistical analysis. To investigate the contribution of rare variants to the risk of type 2 diabetes (T2D) and related traits, we performed deep whole-genome analysis of 1,034 members of 20 large Mexican-American families with high prevalence of T2D. If rare variants of large effect accounted for much of the diabetes risk in these families, our experiment was powered to detect association. Using gene expression data on 21,677 transcripts for 643 pedigree members, we identified evidence for large-effect rare-variant cis-expression quantitative trait loci that could not be detected in population studies, validating our approach. However, we did not identify any rare variants of large effect associated with T2D, or the related traits of fasting glucose and insulin, suggesting that large-effect rare variants account for only a modest fraction of the genetic risk of these traits in this sample of families. Reliable identification of large-effect rare variants will require larger samples of extended pedigrees or different study designs that further enrich for such variants.
Subjects
Diabetes Mellitus, Type 2/genetics , Genetic Predisposition to Disease/genetics , Genetic Variation , Mexican Americans/genetics , Diabetes Mellitus, Type 2/ethnology , Diabetes Mellitus, Type 2/pathology , Family Health , Female , Gene Frequency , Genetic Predisposition to Disease/ethnology , Genome-Wide Association Study/methods , Genotype , Humans , Male , Pedigree , Phenotype , Quantitative Trait Loci/genetics , Whole Genome Sequencing/methods
ABSTRACT
Comprehensive metabolite profiling captures many highly heritable traits, including amino acid levels, which are potentially sensitive biomarkers for disease pathogenesis. To better understand the contribution of genetic variation to amino acid levels, we performed single variant and gene-based tests of association between nine serum amino acids (alanine, glutamine, glycine, histidine, isoleucine, leucine, phenylalanine, tyrosine, and valine) and 16.6 million genotyped and imputed variants in 8545 non-diabetic Finnish men from the METabolic Syndrome In Men (METSIM) study with replication in the Northern Finland Birth Cohort (NFBC1966). We identified five novel loci associated with amino acid levels (P < 5×10⁻⁸): LOC157273/PPP1R3B with glycine (rs9987289, P = 2.3×10⁻²⁶); ZFHX3 (chr16:73326579, minor allele frequency (MAF) = 0.42%, P = 3.6×10⁻⁹), LIPC (rs10468017, P = 1.5×10⁻⁸), and WWOX (rs9937914, P = 3.8×10⁻⁸) with alanine; and TRIB1 with tyrosine (rs28601761, P = 8×10⁻⁹). Gene-based tests identified two novel genes harboring missense variants of MAF < 1% that show aggregate association with amino acid levels: PYCR1 with glycine (P_gene = 1.5×10⁻⁶) and BCAT2 with valine (P_gene = 7.4×10⁻⁷); neither gene was implicated by single variant association tests. These findings are among the first applications of gene-based tests to identify new loci for amino acid levels. In addition to the seven novel gene associations, we identified five independent signals at established amino acid loci, including two rare variant signals at GLDC (rs138640017, MAF = 0.95%, P_conditional = 5.8×10⁻⁴⁰) with glycine levels and HAL (rs141635447, MAF = 0.46%, P_conditional = 9.4×10⁻¹¹) with histidine levels. Examination of all single variant association results in our data revealed a strong inverse relationship between effect size and MAF (P_trend < 0.001). These novel signals provide further insight into the molecular mechanisms of amino acid metabolism and, potentially, their perturbations in disease.
Subjects
Amino Acids/metabolism , Genome-Wide Association Study/methods , Finland , Gene Frequency/genetics , Genotype , Humans , Male , Middle Aged
ABSTRACT
Subcutaneous adipose tissue stores excess lipids and maintains energy balance. We performed expression quantitative trait locus (eQTL) analyses by using abdominal subcutaneous adipose tissue of 770 extensively phenotyped participants of the METSIM study. We identified cis-eQTLs for 12,400 genes at a 1% false-discovery rate. Among approximately 680 known genome-wide association study (GWAS) loci for cardio-metabolic traits, we identified 140 coincident cis-eQTLs at 109 GWAS loci, including 93 eQTLs not previously described. At 49 of these 140 eQTLs, gene expression was nominally associated (p < 0.05) with levels of the GWAS trait. The size of our dataset enabled identification of five loci associated (p < 5 × 10⁻⁸) with at least five genes located >5 Mb away. These trans-eQTL signals confirmed and extended the previously reported KLF14-mediated network to 55 target genes, validated the CIITA regulation of class II MHC genes, and identified ZNF800 as a candidate master regulator. Finally, we observed similar expression-clinical trait correlations of genes associated with GWAS loci in both humans and a panel of genetically diverse mice. These results provide candidate genes for further investigation of their potential roles in adipose biology and in regulating cardio-metabolic traits.
Subjects
Cardiovascular Diseases/genetics , Gene Expression Regulation , Metabolic Syndrome/genetics , Quantitative Trait Loci , Subcutaneous Fat/metabolism , Aged , Animals , Databases, Genetic , Gene Expression Profiling , Genome-Wide Association Study , Genotyping Techniques , Humans , Male , Mice , Middle Aged , Nuclear Proteins/genetics , Nuclear Proteins/metabolism , Phenotype , Reproducibility of Results , Trans-Activators/genetics , Trans-Activators/metabolism
ABSTRACT
Genome-wide association studies (GWAS) have identified >100 independent SNPs that modulate the risk of type 2 diabetes (T2D) and related traits. However, the pathogenic mechanisms of most of these SNPs remain elusive. Here, we examined genomic, epigenomic, and transcriptomic profiles in human pancreatic islets to understand the links between genetic variation, chromatin landscape, and gene expression in the context of T2D. We first integrated genome and transcriptome variation across 112 islet samples to produce dense cis-expression quantitative trait loci (cis-eQTL) maps. Additional integration with chromatin-state maps for islets and other diverse tissue types revealed that cis-eQTLs for islet-specific genes are specifically and significantly enriched in islet stretch enhancers. High-resolution chromatin accessibility profiling using assay for transposase-accessible chromatin sequencing (ATAC-seq) in two islet samples enabled us to identify specific transcription factor (TF) footprints embedded in active regulatory elements, which are highly enriched for islet cis-eQTLs. Aggregate allelic bias signatures in TF footprints enabled us to reconstruct TF binding affinities genetically de novo, which supports the high quality of the TF footprint predictions. Interestingly, we found that T2D GWAS loci were strikingly and specifically enriched in islet Regulatory Factor X (RFX) footprints. Remarkably, within and across independent loci, T2D risk alleles that overlap with RFX footprints uniformly disrupt the RFX motifs at high-information content positions. Together, these results suggest that common regulatory variations have shaped islet TF footprints and the transcriptome and that a confluent RFX regulatory grammar plays a significant role in the genetic component of T2D predisposition.
Subjects
Diabetes Mellitus, Type 2/genetics , Genetic Predisposition to Disease , Genome, Human , Islets of Langerhans/metabolism , Quantitative Trait Loci , Transcriptome , Alleles , Base Sequence , Binding Sites , Chromatin/chemistry , Chromatin/metabolism , Diabetes Mellitus, Type 2/metabolism , Diabetes Mellitus, Type 2/pathology , Epigenesis, Genetic , Gene Expression Profiling , Genetic Variation , Genome-Wide Association Study , Genomic Imprinting , Humans , Islets of Langerhans/pathology , Polymorphism, Single Nucleotide , Protein Binding , Protein Isoforms/genetics , Protein Isoforms/metabolism , Regulatory Factor X Transcription Factors/genetics , Regulatory Factor X Transcription Factors/metabolism
ABSTRACT
Lipid and lipoprotein subclasses are associated with metabolic and cardiovascular diseases, yet the genetic contributions to variability in subclass traits are not fully understood. We conducted single-variant and gene-based association tests between 15.1M variants from genome-wide and exome array and imputed genotypes and 72 lipid and lipoprotein traits in 8,372 Finns. After accounting for 885 variants at 157 previously identified lipid loci, we identified five novel signals near established loci at HIF3A, ADAMTS3, PLTP, LCAT, and LIPG. Four of the signals were identified with a low-frequency (0.005
Subjects
Gene Frequency/genetics , Lipid Metabolism/genetics , Lipids/genetics , Lipoproteins/genetics , Polymorphism, Single Nucleotide/genetics , Triglycerides/genetics , White People/genetics , Cholesterol, HDL/genetics , Exome/genetics , Finland , Genome-Wide Association Study/methods , Genotype , Humans , Male , Middle Aged , Principal Component Analysis/methods
ABSTRACT
BACKGROUND: Bisulfite sequencing is widely employed to study the role of DNA methylation in disease; however, the data suffer from biases due to coverage depth variability. Imputation of methylation values at low-coverage sites may mitigate these biases while also identifying important genomic features associated with predictive power. RESULTS: Here we describe BoostMe, a method for imputing low-quality DNA methylation estimates within whole-genome bisulfite sequencing (WGBS) data. BoostMe uses a gradient boosting algorithm, XGBoost, and leverages information from multiple samples for prediction. We find that BoostMe outperforms existing algorithms in speed and accuracy when applied to WGBS of human tissues. Furthermore, we show that imputation improves concordance between WGBS and the MethylationEPIC array at low WGBS depth, suggesting improved WGBS accuracy after imputation. CONCLUSIONS: Our findings support the use of BoostMe as a preprocessing step for WGBS analysis.
Subjects
Computational Biology/methods , DNA Methylation/drug effects , Sulfites/pharmacology , Whole Genome Sequencing , Algorithms , High-Throughput Nucleotide Sequencing , Humans
ABSTRACT
Hutchinson-Gilford progeria syndrome (HGPS) is a premature aging disease that is frequently caused by a de novo point mutation at position 1824 in LMNA. This mutation activates a cryptic splice donor site in exon 11, and leads to an in-frame deletion within the prelamin A mRNA and the production of a dominant-negative lamin A protein, known as progerin. Here we show that primary HGPS skin fibroblasts experience genome-wide correlated alterations in patterns of H3K27me3 deposition, DNA-lamin A/C associations, and, at late passages, genome-wide loss of spatial compartmentalization of active and inactive chromatin domains. We further demonstrate that the H3K27me3 changes associate with gene expression alterations in HGPS cells. Our results support a model that the accumulation of progerin in the nuclear lamina leads to altered H3K27me3 marks in heterochromatin, possibly through the down-regulation of EZH2, and disrupts heterochromatin-lamina interactions. These changes may result in transcriptional misregulation and eventually trigger the global loss of spatial chromatin compartmentalization in late passage HGPS fibroblasts.
Subjects
Genome, Human , Histones/metabolism , Lamins/metabolism , Progeria/genetics , Progeria/metabolism , Cell Line , Chromatin Immunoprecipitation , Fibroblasts/metabolism , Gene Expression Regulation , Heterochromatin/metabolism , Humans , Methylation , Protein Binding , Sequence Analysis, DNA
ABSTRACT
Chromatin-based functional genomic analyses and genomewide association studies (GWASs) together implicate enhancers as critical elements influencing gene expression and risk for common diseases. Here, we performed systematic chromatin and transcriptome profiling in human pancreatic islets. Integrated analysis of islet data with those from nine cell types identified specific and significant enrichment of type 2 diabetes and related quantitative trait GWAS variants in islet enhancers. Our integrated chromatin maps reveal that most enhancers are short (median = 0.8 kb). Each cell type also contains a substantial number of more extended (≥ 3 kb) enhancers. Interestingly, these stretch enhancers are often tissue-specific and overlap locus control regions, suggesting that they are important chromatin regulatory beacons. Indeed, we show that (i) tissue specificity of enhancers and nearby gene expression increase with enhancer length; (ii) neighborhoods containing stretch enhancers are enriched for important cell type-specific genes; and (iii) GWAS variants associated with traits relevant to a particular cell type are more enriched in stretch enhancers compared with short enhancers. Reporter constructs containing stretch enhancer sequences exhibited tissue-specific activity in cell culture experiments and in transgenic mice. These results suggest that stretch enhancers are critical chromatin elements for coordinating cell type-specific regulatory programs and that sequence variation in stretch enhancers affects risk of major common human diseases.
Subjects
Cell Differentiation/physiology , Chromatin/physiology , Diabetes Mellitus, Type 2/physiopathology , Enhancer Elements, Genetic/genetics , Epigenomics/methods , Gene Expression Regulation/physiology , Insulin-Secreting Cells/metabolism , Animals , Chromatin Immunoprecipitation , Diabetes Mellitus, Type 2/genetics , Enhancer Elements, Genetic/physiology , Gene Expression Profiling , Gene Expression Regulation/genetics , Genome-Wide Association Study , High-Throughput Nucleotide Sequencing , Humans , Insulin-Secreting Cells/physiology , Luciferases , Mice , Mice, Transgenic
ABSTRACT
Transgenic animals are extensively used to model human disease. Typically, the transgene copy number is estimated, but the exact integration site and configuration of the foreign DNA remains uncharacterized. When transgenes have been closely examined, some unexpected configurations have been found. Here, we describe a method to recover transgene insertion sites and assess structural rearrangements of host and transgene DNA using microarray hybridization and targeted sequence capture. We used information about the transgene insertion site to develop a polymerase chain reaction genotyping assay to distinguish heterozygous from homozygous transgenic animals. Although we worked with a bacterial artificial chromosome transgenic mouse line, this method can be used to analyse the integration site and configuration of any foreign DNA in a sequenced genome.
Subjects
Genotyping Techniques , High-Throughput Nucleotide Sequencing , Oligonucleotide Array Sequence Analysis/methods , Sequence Analysis, DNA , Transgenes , Animals , Chromosomes, Artificial, Bacterial , Mice , Mice, Transgenic , Polymerase Chain Reaction
ABSTRACT
Genome-wide association studies have identified hundreds of loci for type 2 diabetes, coronary artery disease and myocardial infarction, as well as for related traits such as body mass index, glucose and insulin levels, lipid levels, and blood pressure. These studies also have pointed to thousands of loci with promising but not yet compelling association evidence. To establish association at additional loci and to characterize the genome-wide significant loci by fine-mapping, we designed the "Metabochip," a custom genotyping array that assays nearly 200,000 SNP markers. Here, we describe the Metabochip and its component SNP sets, evaluate its performance in capturing variation across the allele-frequency spectrum, describe solutions to methodological challenges commonly encountered in its analysis, and evaluate its performance as a platform for genotype imputation. The Metabochip achieves dramatic cost efficiencies compared to designing single-trait follow-up reagents, and provides the opportunity to compare results across a range of related traits. The Metabochip and similar custom genotyping arrays offer a powerful and cost-effective approach to follow up large-scale genotyping and sequencing studies and advance our understanding of the genetic basis of complex human diseases and traits.
Subjects
Anthropometry/instrumentation , Metabolomics/instrumentation , Oligonucleotide Array Sequence Analysis/instrumentation , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Alleles , Anthropometry/methods , Cardiovascular Diseases/diagnosis , Cardiovascular Diseases/genetics , Cardiovascular Diseases/metabolism , Diabetes Mellitus, Type 2/diagnosis , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/metabolism , Gene Frequency , Genome, Human , Genome-Wide Association Study , Genotype , Genotyping Techniques , Humans , Metabolomics/methods , Oligonucleotide Array Sequence Analysis/methods , Phenotype
ABSTRACT
Massively parallel DNA sequencing technologies have greatly increased our ability to generate large amounts of sequencing data at a rapid pace. Several methods have been developed to enrich for genomic regions of interest for targeted sequencing. We have compared three of these methods: Molecular Inversion Probes (MIP), Solution Hybrid Selection (SHS), and Microarray-based Genomic Selection (MGS). Using HapMap DNA samples, we compared each of these methods with respect to their ability to capture an identical set of exons and evolutionarily conserved regions associated with 528 genes (2.61 Mb). For sequence analysis, we developed and used a novel Bayesian genotype-assigning algorithm, Most Probable Genotype (MPG). All three capture methods were effective, but sensitivities (percentage of targeted bases associated with high-quality genotypes) varied for an equivalent amount of pass-filtered sequence: for example, 70% (MIP), 84% (SHS), and 91% (MGS) for 400 Mb. In contrast, all methods yielded similar accuracies of >99.84% when compared to Infinium 1M SNP BeadChip-derived genotypes and >99.998% when compared to 30-fold coverage whole-genome shotgun sequencing data. We also observed a low false-positive rate with all three methods; of the heterozygous positions identified by each of the capture methods, >99.57% agreed with 1M SNP BeadChip, and >98.840% agreed with the whole-genome shotgun data. In addition, we successfully piloted the genomic enrichment of a set of 12 pooled samples via the MGS method using molecular bar codes. We find that these three genomic enrichment methods are highly accurate and practical, with sensitivities comparable to that of 30-fold coverage whole-genome shotgun data.
Subjects
Diabetes Mellitus, Type 2/genetics , Genome, Human , Oligonucleotide Array Sequence Analysis/methods , Sequence Analysis, DNA/methods , Algorithms , Bayes Theorem , DNA/genetics , DNA Probes/genetics , Exons , Genotype , Humans , Reproducibility of Results , Sensitivity and Specificity
ABSTRACT
Hereditary congenital facial paresis type 1 (HCFP1) is an autosomal dominant disorder of absent or limited facial movement that maps to chromosome 3q21-q22 and is hypothesized to result from facial branchial motor neuron (FBMN) maldevelopment. In the present study, we report that HCFP1 results from heterozygous duplications within a neuron-specific GATA2 regulatory region that includes two enhancers and one silencer, and from noncoding single-nucleotide variants (SNVs) within the silencer. Some SNVs impair binding of NR2F1 to the silencer in vitro and in vivo and attenuate in vivo enhancer reporter expression in FBMNs. Gata2 and its effector Gata3 are essential for inner-ear efferent neuron (IEE) but not FBMN development. A humanized HCFP1 mouse model extends Gata2 expression, favors the formation of IEEs over FBMNs and is rescued by conditional loss of Gata3. These findings highlight the importance of temporal gene regulation in development and of noncoding variation in rare mendelian disease.
Subjects
Facial Paralysis , Animals , Mice , Facial Paralysis/genetics , Facial Paralysis/congenital , Facial Paralysis/metabolism , GATA2 Transcription Factor/genetics , GATA2 Transcription Factor/metabolism , Motor Neurons/metabolism , Neurogenesis , Neurons, Efferent
ABSTRACT
ClinSeq is a pilot project to investigate the use of whole-genome sequencing as a tool for clinical research. By piloting the acquisition of large amounts of DNA sequence data from individual human subjects, we are fostering the development of hypothesis-generating approaches for performing research in genomic medicine, including the exploration of issues related to the genetic architecture of disease, implementation of genomic technology, informed consent, disclosure of genetic information, and archiving, analyzing, and displaying sequence data. In the initial phase of ClinSeq, we are enrolling roughly 1000 participants; the evaluation of each includes obtaining a detailed family and medical history, as well as a clinical evaluation. The participants are being consented broadly for research on many traits and for whole-genome sequencing. Initially, Sanger-based sequencing of 300-400 genes thought to be relevant to atherosclerosis is being performed, with the resulting data analyzed for rare, high-penetrance variants associated with specific clinical traits. The participants are also being consented to allow the contact of family members for additional studies of sequence variants to explore their potential association with specific phenotypes. Here, we present the general considerations in designing ClinSeq, preliminary results based on the generation of an initial 826 Mb of sequence data, the findings for several genes that serve as positive controls for the project, and our views about the potential implications of ClinSeq. The early experiences with ClinSeq illustrate how large-scale medical sequencing can be a practical, productive, and critical component of research in genomic medicine.
Subjects
Atherosclerosis/genetics , Biomedical Research , Cardiovascular Diseases/genetics , Genome, Human , Genomics , Pilot Projects , Sequence Analysis, DNA/methods , Aged , Cohort Studies , Female , Humans , Male , Pedigree , Phenotype
ABSTRACT
Genome-wide association studies (GWAS) have revealed hundreds of loci associated with common human genetic diseases and traits. We have developed a web-based plotting tool that provides fast visual display of GWAS results in a publication-ready format. LocusZoom visually displays regional information such as the strength and extent of the association signal relative to genomic position, local linkage disequilibrium (LD) and recombination patterns and the positions of genes in the region. AVAILABILITY: LocusZoom can be accessed from a web interface at http://csg.sph.umich.edu/locuszoom. Users may generate a single plot using a web form, or many plots using batch mode. The software utilizes LD information from HapMap Phase II (CEU, YRI and JPT+CHB) or 1000 Genomes (CEU) and gene information from the UCSC browser, and will accept SNP identifiers in dbSNP or 1000 Genomes format. Single plots are generated in approximately 20 s. Source code and associated databases are available for download and local installation, and full documentation is available online.
Subjects
Genome-Wide Association Study , Software , Computer Graphics , Humans , Internet
ABSTRACT
EndoC-βH1 is emerging as a critical human β cell model to study the genetic and environmental etiologies of β cell (dys)function and diabetes. Comprehensive knowledge of its molecular landscape is lacking, yet required, for effective use of this model. Here, we report chromosomal (spectral karyotyping), genetic (genotyping), epigenomic (ChIP-seq and ATAC-seq), chromatin interaction (Hi-C and Pol2 ChIA-PET), and transcriptomic (RNA-seq and miRNA-seq) maps of EndoC-βH1. Analyses of these maps define known (e.g., PDX1 and ISL1) and putative (e.g., PCSK1 and mir-375) β cell-specific transcriptional cis-regulatory networks and identify allelic effects on cis-regulatory element use. Importantly, comparison with maps generated in primary human islets and/or β cells indicates preservation of chromatin looping but also highlights chromosomal aberrations and fetal genomic signatures in EndoC-βH1. Together, these maps, and a web application we created for their exploration, provide important tools for the design of experiments to probe and manipulate the genetic programs governing β cell identity and (dys)function in diabetes.
Subjects
Gene Regulatory Networks/genetics , Insulin-Secreting Cells/metabolism , Cell Line , Humans
ABSTRACT
More than 120 published reports have described associations between single nucleotide polymorphisms (SNPs) and type 2 diabetes. However, multiple studies of the same variant have often been discordant. From a literature search, we identified previously reported type 2 diabetes-associated SNPs. We initially genotyped 134 SNPs on 786 index case subjects from type 2 diabetes families and 617 control subjects with normal glucose tolerance from Finland and excluded from analysis 20 SNPs in strong linkage disequilibrium (r² > 0.8) with another typed SNP. Of the 114 SNPs examined, we followed up the 20 most significant SNPs (P < 0.10) on an additional 384 case subjects and 366 control subjects from a population-based study in Finland. In the combined data, we replicated association (P < 0.05) for 12 SNPs: PPARG Pro12Ala and His447, KCNJ11 Glu23Lys and rs5210, TNF -857, SLC2A2 Ile110Thr, HNF1A/TCF1 rs2701175 and GE117881_360, PCK1 -232, NEUROD1 Thr45Ala, IL6 -598, and ENPP1 Lys121Gln. The replication of 12 SNPs of 114 tested was significantly greater than expected by chance under the null hypothesis of no association (P = 0.012). We observed that SNPs from genes that had three or more previous reports of association were significantly more likely to be replicated in our sample (P = 0.03), although we also replicated 4 of 58 SNPs from genes that had only one previous report of association.
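The claim that 12 replications out of 114 tested SNPs exceeds chance can be illustrated with a simple binomial tail calculation. The sketch below is only an approximation under stated assumptions: it treats the 114 tests as independent, each with a 0.05 false-positive probability under the null. The abstract does not say how the authors computed their P value, so the function name `binomial_upper_tail` and the independence assumption are illustrative, not the study's method.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Counts taken from the abstract: 12 of 114 SNPs replicated at P < 0.05.
n_tested = 114
n_replicated = 12
alpha = 0.05  # assumed per-SNP false-positive rate under the null (independent tests)

# Number of replications expected by chance alone under these assumptions.
expected_by_chance = n_tested * alpha

# Probability of seeing at least 12 replications if no SNP were truly associated.
p_excess = binomial_upper_tail(n_replicated, n_tested, alpha)

print(f"expected by chance: {expected_by_chance:.1f}")
print(f"P(at least {n_replicated} replications | null): {p_excess:.4f}")
```

The printed tail probability can be compared with the reported P = 0.012; any difference would reflect how far the independence and per-test alpha assumptions depart from the authors' actual procedure.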
Subjects
Chromosome Mapping , Diabetes Mellitus, Type 2/genetics , Genetic Testing , Polymorphism, Single Nucleotide , Aged , Blood Glucose/analysis , Body Mass Index , Fasting , Female , Humans , Infant , Insulin/blood , Male , Middle Aged
ABSTRACT
From whole organisms to individual cells, responses to environmental conditions are influenced by genetic makeup, where the effect of genetic variation on a trait depends on the environmental context. RNA-sequencing quantifies gene expression as a molecular trait, and is capable of capturing both genetic and environmental effects. In this study, we explore opportunities of using allele-specific expression (ASE) to discover cis-acting genotype-environment interactions (GxE), that is, genetic effects on gene expression that depend on an environmental condition. Treating 17 common, clinical traits as approximations of the cellular environment of 267 skeletal muscle biopsies, we identify 10 candidate environmental response expression quantitative trait loci (reQTLs) across 6 traits (12 unique gene-environment trait pairs; 10% FDR per trait) including sex, systolic blood pressure, and low-density lipoprotein cholesterol. Although using ASE is in principle a promising approach to detect GxE effects, replication of such signals can be challenging as validation requires harmonization of environmental traits across cohorts and a sufficient sampling of heterozygotes for a transcribed SNP. Comprehensive discovery and replication will require large human transcriptome datasets, or the integration of multiple transcribed SNPs, coupled with standardized clinical phenotyping.