1.
Nature ; 536(7614): 41-47, 2016 08 04.
Article in English | MEDLINE | ID: mdl-27398621

ABSTRACT

The genetic architecture of common traits, including the number, frequency, and effect sizes of inherited variants that contribute to individual risk, has been long debated. Genome-wide association studies have identified scores of common variants associated with type 2 diabetes, but in aggregate, these explain only a fraction of the heritability of this disease. Here, to test the hypothesis that lower-frequency variants explain much of the remainder, the GoT2D and T2D-GENES consortia performed whole-genome sequencing in 2,657 European individuals with and without diabetes, and exome sequencing in 12,940 individuals from five ancestry groups. To increase statistical power, we expanded the sample size via genotyping and imputation in a further 111,548 subjects. Variants associated with type 2 diabetes after sequencing were overwhelmingly common and most fell within regions previously identified by genome-wide association studies. Comprehensive enumeration of sequence variation is necessary to identify functional alleles that provide important clues to disease pathophysiology, but large-scale sequencing does not support the idea that lower-frequency variants have a major role in predisposition to type 2 diabetes.


Subject(s)
Diabetes Mellitus, Type 2/genetics , Genetic Predisposition to Disease/genetics , Genetic Variation/genetics , Alleles , DNA Mutational Analysis , Europe/ethnology , Exome , Genome-Wide Association Study , Genotyping Techniques , Humans , Sample Size
2.
Proc Natl Acad Sci U S A ; 116(22): 10883-10888, 2019 05 28.
Article in English | MEDLINE | ID: mdl-31076557

ABSTRACT

We integrate comeasured gene expression and DNA methylation (DNAme) in 265 human skeletal muscle biopsies from the FUSION study with >7 million genetic variants and eight physiological traits: height, waist, weight, waist-hip ratio, body mass index, fasting serum insulin, fasting plasma glucose, and type 2 diabetes. We find hundreds of genes and DNAme sites associated with fasting insulin, waist, and body mass index, as well as thousands of DNAme sites associated with gene expression (eQTM). We find that controlling for heterogeneity in tissue/muscle fiber type reduces the number of physiological trait associations, and that long-range eQTMs (>1 Mb) are reduced when controlling for tissue/muscle fiber type or latent factors. We map genetic regulators (quantitative trait loci; QTLs) of expression (eQTLs) and DNAme (mQTLs). Using Mendelian randomization (MR) and mediation techniques, we leverage these genetic maps to predict 213 causal relationships between expression and DNAme, approximately two-thirds of which predict methylation to causally influence expression. We use MR to integrate FUSION mQTLs, FUSION eQTLs, and GTEx eQTLs for 48 tissues with genetic associations for 534 diseases and quantitative traits. We identify hundreds of genes and thousands of DNAme sites that may drive the reported disease/quantitative trait genetic associations. We identify 300 gene expression MR associations that are present in both FUSION and GTEx skeletal muscle and that show stronger evidence of MR association in skeletal muscle than other tissues, which may partially reflect differences in power across tissues. As one example, we find that increased RXRA muscle expression may decrease lean tissue mass.


Subject(s)
DNA Methylation/genetics , Gene Expression/genetics , Muscle, Skeletal , Blood Glucose/analysis , Body Weights and Measures , Diabetes Mellitus, Type 2 , Genome-Wide Association Study/methods , Genomics/methods , Humans , Insulin/analysis , Muscle, Skeletal/chemistry , Muscle, Skeletal/physiology , Quantitative Trait Loci/genetics
3.
Blood ; 133(26): 2753-2764, 2019 06 27.
Article in English | MEDLINE | ID: mdl-31064750

ABSTRACT

Patients with classic hydroa vacciniforme-like lymphoproliferative disorder (HVLPD) typically have high levels of Epstein-Barr virus (EBV) DNA in T cells and/or natural killer (NK) cells in blood and skin lesions induced by sun exposure that are infiltrated with EBV-infected lymphocytes. HVLPD is very rare in the United States and Europe but more common in Asia and South America. The disease can progress to a systemic form that may result in fatal lymphoma. We report our 11-year experience with 16 HVLPD patients from the United States and England and found that whites were less likely to develop systemic EBV disease (1/10) than nonwhites (5/6). All (10/10) of the white patients were generally in good health at last follow-up, while two-thirds (4/6) of the nonwhite patients required hematopoietic stem cell transplantation. Nonwhite patients had later age of onset of HVLPD than white patients (median age, 8 vs 5 years) and higher levels of EBV DNA (median, 1 515 000 vs 250 000 copies/ml) and more often had low numbers of NK cells (83% vs 50% of patients) and T-cell clones in the blood (83% vs 30% of patients). RNA-sequencing analysis of an HVLPD skin lesion in a white patient compared with his normal skin showed increased expression of interferon-γ and chemokines that attract T cells and NK cells. Thus, white patients with HVLPD were less likely to have systemic disease with EBV and had a much better prognosis than nonwhite patients. This trial was registered at www.clinicaltrials.gov as #NCT00369421 and #NCT00032513.


Subject(s)
Epstein-Barr Virus Infections/pathology , Hydroa Vacciniforme/virology , Lymphoproliferative Disorders/pathology , Lymphoproliferative Disorders/virology , Child , Child, Preschool , Epstein-Barr Virus Infections/ethnology , Epstein-Barr Virus Infections/immunology , Female , Humans , Lymphoproliferative Disorders/ethnology , Male , White People
4.
Nature ; 526(7571): 75-81, 2015 Oct 01.
Article in English | MEDLINE | ID: mdl-26432246

ABSTRACT

Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association.


Subject(s)
Genetic Variation/genetics , Genome, Human/genetics , Physical Chromosome Mapping , Amino Acid Sequence , Genetic Predisposition to Disease , Genetics, Medical , Genetics, Population , Genome-Wide Association Study , Genomics , Genotype , Haplotypes/genetics , Homozygote , Humans , Molecular Sequence Data , Mutation Rate , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , Sequence Analysis, DNA , Sequence Deletion/genetics
5.
Proc Natl Acad Sci U S A ; 115(2): 379-384, 2018 01 09.
Article in English | MEDLINE | ID: mdl-29279374

ABSTRACT

A major challenge in evaluating the contribution of rare variants to complex disease is identifying enough copies of the rare alleles to permit informative statistical analysis. To investigate the contribution of rare variants to the risk of type 2 diabetes (T2D) and related traits, we performed deep whole-genome analysis of 1,034 members of 20 large Mexican-American families with high prevalence of T2D. If rare variants of large effect accounted for much of the diabetes risk in these families, our experiment was powered to detect association. Using gene expression data on 21,677 transcripts for 643 pedigree members, we identified evidence for large-effect rare-variant cis-expression quantitative trait loci that could not be detected in population studies, validating our approach. However, we did not identify any rare variants of large effect associated with T2D, or the related traits of fasting glucose and insulin, suggesting that large-effect rare variants account for only a modest fraction of the genetic risk of these traits in this sample of families. Reliable identification of large-effect rare variants will require larger samples of extended pedigrees or different study designs that further enrich for such variants.


Subject(s)
Diabetes Mellitus, Type 2/genetics , Genetic Predisposition to Disease/genetics , Genetic Variation , Mexican Americans/genetics , Diabetes Mellitus, Type 2/ethnology , Diabetes Mellitus, Type 2/pathology , Family Health , Female , Gene Frequency , Genetic Predisposition to Disease/ethnology , Genome-Wide Association Study/methods , Genotype , Humans , Male , Pedigree , Phenotype , Quantitative Trait Loci/genetics , Whole Genome Sequencing/methods
6.
Hum Mol Genet ; 27(9): 1664-1674, 2018 05 01.
Article in English | MEDLINE | ID: mdl-29481666

ABSTRACT

Comprehensive metabolite profiling captures many highly heritable traits, including amino acid levels, which are potentially sensitive biomarkers for disease pathogenesis. To better understand the contribution of genetic variation to amino acid levels, we performed single variant and gene-based tests of association between nine serum amino acids (alanine, glutamine, glycine, histidine, isoleucine, leucine, phenylalanine, tyrosine, and valine) and 16.6 million genotyped and imputed variants in 8545 non-diabetic Finnish men from the METabolic Syndrome In Men (METSIM) study with replication in Northern Finland Birth Cohort (NFBC1966). We identified five novel loci associated with amino acid levels (P < 5×10⁻⁸): LOC157273/PPP1R3B with glycine (rs9987289, P = 2.3×10⁻²⁶); ZFHX3 (chr16:73326579, minor allele frequency (MAF) = 0.42%, P = 3.6×10⁻⁹), LIPC (rs10468017, P = 1.5×10⁻⁸), and WWOX (rs9937914, P = 3.8×10⁻⁸) with alanine; and TRIB1 with tyrosine (rs28601761, P = 8×10⁻⁹). Gene-based tests identified two novel genes harboring missense variants of MAF <1% that show aggregate association with amino acid levels: PYCR1 with glycine (Pgene = 1.5×10⁻⁶) and BCAT2 with valine (Pgene = 7.4×10⁻⁷); neither gene was implicated by single variant association tests. These findings are among the first applications of gene-based tests to identify new loci for amino acid levels. In addition to the seven novel gene associations, we identified five independent signals at established amino acid loci, including two rare variant signals at GLDC (rs138640017, MAF = 0.95%, Pconditional = 5.8×10⁻⁴⁰) with glycine levels and HAL (rs141635447, MAF = 0.46%, Pconditional = 9.4×10⁻¹¹) with histidine levels. Examination of all single variant association results in our data revealed a strong inverse relationship between effect size and MAF (Ptrend < 0.001). These novel signals provide further insight into the molecular mechanisms of amino acid metabolism and potentially, their perturbations in disease.


Subject(s)
Amino Acids/metabolism , Genome-Wide Association Study/methods , Finland , Gene Frequency/genetics , Genotype , Humans , Male , Middle Aged
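As an illustration of the genome-wide significance threshold used in entry 6, the sketch below checks the five single-variant P values quoted in that abstract against 5×10⁻⁸. It is a minimal example, not the authors' analysis code; the names `GENOME_WIDE_ALPHA`, `reported`, and `passes_threshold` are hypothetical. The gene-based results (PYCR1, BCAT2) are left out because the abstract does not state the threshold applied to them.

```python
# Minimal sketch: compare the single-variant P values quoted in the abstract
# with the conventional genome-wide significance threshold.
# All names here are hypothetical and chosen for illustration only.

GENOME_WIDE_ALPHA = 5e-8  # threshold stated in the abstract

# (locus, amino acid, reported P value), copied from the abstract
reported = [
    ("LOC157273/PPP1R3B", "glycine", 2.3e-26),
    ("ZFHX3", "alanine", 3.6e-9),
    ("LIPC", "alanine", 1.5e-8),
    ("WWOX", "alanine", 3.8e-8),
    ("TRIB1", "tyrosine", 8e-9),
]


def passes_threshold(p_value, alpha=GENOME_WIDE_ALPHA):
    """Return True when a P value is strictly below the threshold."""
    return p_value < alpha


# Loci whose reported P value clears the threshold
significant = [locus for locus, _trait, p in reported if passes_threshold(p)]

if __name__ == "__main__":
    for locus, trait, p in reported:
        status = "significant" if passes_threshold(p) else "not significant"
        print(f"{locus} ({trait}): P = {p:.1e} -> {status}")
```

The comparison is strict (`<`), matching the "P < 5×10⁻⁸" wording of the abstract.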
7.
Am J Hum Genet ; 100(3): 428-443, 2017 Mar 02.
Article in English | MEDLINE | ID: mdl-28257690

ABSTRACT

Subcutaneous adipose tissue stores excess lipids and maintains energy balance. We performed expression quantitative trait locus (eQTL) analyses by using abdominal subcutaneous adipose tissue of 770 extensively phenotyped participants of the METSIM study. We identified cis-eQTLs for 12,400 genes at a 1% false-discovery rate. Among approximately 680 known genome-wide association study (GWAS) loci for cardio-metabolic traits, we identified 140 coincident cis-eQTLs at 109 GWAS loci, including 93 eQTLs not previously described. At 49 of these 140 eQTLs, gene expression was nominally associated (p < 0.05) with levels of the GWAS trait. The size of our dataset enabled identification of five loci associated (p < 5 × 10⁻⁸) with at least five genes located >5 Mb away. These trans-eQTL signals confirmed and extended the previously reported KLF14-mediated network to 55 target genes, validated the CIITA regulation of class II MHC genes, and identified ZNF800 as a candidate master regulator. Finally, we observed similar expression-clinical trait correlations of genes associated with GWAS loci in both humans and a panel of genetically diverse mice. These results provide candidate genes for further investigation of their potential roles in adipose biology and in regulating cardio-metabolic traits.


Subject(s)
Cardiovascular Diseases/genetics , Gene Expression Regulation , Metabolic Syndrome/genetics , Quantitative Trait Loci , Subcutaneous Fat/metabolism , Aged , Animals , Databases, Genetic , Gene Expression Profiling , Genome-Wide Association Study , Genotyping Techniques , Humans , Male , Mice , Middle Aged , Nuclear Proteins/genetics , Nuclear Proteins/metabolism , Phenotype , Reproducibility of Results , Trans-Activators/genetics , Trans-Activators/metabolism
8.
Proc Natl Acad Sci U S A ; 114(9): 2301-2306, 2017 02 28.
Article in English | MEDLINE | ID: mdl-28193859

ABSTRACT

Genome-wide association studies (GWAS) have identified >100 independent SNPs that modulate the risk of type 2 diabetes (T2D) and related traits. However, the pathogenic mechanisms of most of these SNPs remain elusive. Here, we examined genomic, epigenomic, and transcriptomic profiles in human pancreatic islets to understand the links between genetic variation, chromatin landscape, and gene expression in the context of T2D. We first integrated genome and transcriptome variation across 112 islet samples to produce dense cis-expression quantitative trait loci (cis-eQTL) maps. Additional integration with chromatin-state maps for islets and other diverse tissue types revealed that cis-eQTLs for islet-specific genes are specifically and significantly enriched in islet stretch enhancers. High-resolution chromatin accessibility profiling using assay for transposase-accessible chromatin sequencing (ATAC-seq) in two islet samples enabled us to identify specific transcription factor (TF) footprints embedded in active regulatory elements, which are highly enriched for islet cis-eQTL. Aggregate allelic bias signatures in TF footprints enabled us de novo to reconstruct TF binding affinities genetically, which support the high-quality nature of the TF footprint predictions. Interestingly, we found that T2D GWAS loci were strikingly and specifically enriched in islet Regulatory Factor X (RFX) footprints. Remarkably, within and across independent loci, T2D risk alleles that overlap with RFX footprints uniformly disrupt the RFX motifs at high-information content positions. Together, these results suggest that common regulatory variations have shaped islet TF footprints and the transcriptome and that a confluent RFX regulatory grammar plays a significant role in the genetic component of T2D predisposition.


Subject(s)
Diabetes Mellitus, Type 2/genetics , Genetic Predisposition to Disease , Genome, Human , Islets of Langerhans/metabolism , Quantitative Trait Loci , Transcriptome , Alleles , Base Sequence , Binding Sites , Chromatin/chemistry , Chromatin/metabolism , Diabetes Mellitus, Type 2/metabolism , Diabetes Mellitus, Type 2/pathology , Epigenesis, Genetic , Gene Expression Profiling , Genetic Variation , Genome-Wide Association Study , Genomic Imprinting , Humans , Islets of Langerhans/pathology , Polymorphism, Single Nucleotide , Protein Binding , Protein Isoforms/genetics , Protein Isoforms/metabolism , Regulatory Factor X Transcription Factors/genetics , Regulatory Factor X Transcription Factors/metabolism
9.
PLoS Genet ; 13(10): e1007079, 2017 Oct.
Article in English | MEDLINE | ID: mdl-29084231

ABSTRACT

Lipid and lipoprotein subclasses are associated with metabolic and cardiovascular diseases, yet the genetic contributions to variability in subclass traits are not fully understood. We conducted single-variant and gene-based association tests between 15.1M variants from genome-wide and exome array and imputed genotypes and 72 lipid and lipoprotein traits in 8,372 Finns. After accounting for 885 variants at 157 previously identified lipid loci, we identified five novel signals near established loci at HIF3A, ADAMTS3, PLTP, LCAT, and LIPG. Four of the signals were identified with a low-frequency (0.005

Subject(s)
Gene Frequency/genetics , Lipid Metabolism/genetics , Lipids/genetics , Lipoproteins/genetics , Polymorphism, Single Nucleotide/genetics , Triglycerides/genetics , White People/genetics , Cholesterol, HDL/genetics , Exome/genetics , Finland , Genome-Wide Association Study/methods , Genotype , Humans , Male , Middle Aged , Principal Component Analysis/methods
10.
BMC Genomics ; 19(1): 390, 2018 May 23.
Article in English | MEDLINE | ID: mdl-29792182

ABSTRACT

BACKGROUND: Bisulfite sequencing is widely employed to study the role of DNA methylation in disease; however, the data suffer from biases due to coverage depth variability. Imputation of methylation values at low-coverage sites may mitigate these biases while also identifying important genomic features associated with predictive power. RESULTS: Here we describe BoostMe, a method for imputing low-quality DNA methylation estimates within whole-genome bisulfite sequencing (WGBS) data. BoostMe uses a gradient boosting algorithm, XGBoost, and leverages information from multiple samples for prediction. We find that BoostMe outperforms existing algorithms in speed and accuracy when applied to WGBS of human tissues. Furthermore, we show that imputation improves concordance between WGBS and the MethylationEPIC array at low WGBS depth, suggesting improved WGBS accuracy after imputation. CONCLUSIONS: Our findings support the use of BoostMe as a preprocessing step for WGBS analysis.


Subject(s)
Computational Biology/methods , DNA Methylation/drug effects , Sulfites/pharmacology , Whole Genome Sequencing , Algorithms , High-Throughput Nucleotide Sequencing , Humans
11.
J Med Genet ; 54(3): 212-216, 2017 03.
Article in English | MEDLINE | ID: mdl-27920058

ABSTRACT

BACKGROUND: Hutchinson-Gilford progeria syndrome (HGPS) is a fatal sporadic autosomal dominant premature ageing disease caused by single base mutations that optimise a cryptic splice site within exon 11 of the LMNA gene. The resultant disease-causing protein, progerin, acts as a dominant negative. Disease severity relies partly on progerin levels. METHODS AND RESULTS: We report a novel form of somatic mosaicism, where a child possessed two cell populations with different HGPS disease-producing mutations of the same nucleotide, one producing severe HGPS and one mild HGPS. The proband possessed an intermediate phenotype. The mosaicism was initially discovered when Sanger sequencing showed a c.1968+2T>A mutation in blood DNA and a c.1968+2T>C in DNA from cultured fibroblasts. Deep sequencing of DNA from the proband's blood revealed 4.7% c.1968+2T>C mutation, and 41.3% c.1968+2T>A mutation. CONCLUSIONS: We hypothesise that the germline mutation was c.1968+2T>A, but a rescue event occurred during early development, where the somatic mutation from A to C at 1968+2 provided a selective advantage. This type of mosaicism where a partial phenotypic rescue event results from a second but milder disease-causing mutation in the same nucleotide has not been previously characterised for any disease.


Subject(s)
Cell Nucleus/genetics , Lamin Type A/genetics , Progeria/genetics , Adolescent , Cell Nucleus/pathology , Cells, Cultured , Child , Child, Preschool , Exons/genetics , Female , Fibroblasts/pathology , Genetic Predisposition to Disease , Germ-Line Mutation , High-Throughput Nucleotide Sequencing , Humans , Infant , Male , Mosaicism , Progeria/pathology
12.
PLoS Genet ; 10(11): e1004809, 2014 Nov.
Article in English | MEDLINE | ID: mdl-25411967

ABSTRACT

Although prostate cancer typically runs an indolent course, a subset of men develop aggressive, fatal forms of this disease. We hypothesize that germline variation modulates susceptibility to aggressive prostate cancer. The goal of this work is to identify susceptibility genes using the C57BL/6-Tg(TRAMP)8247Ng/J (TRAMP) mouse model of neuroendocrine prostate cancer. Quantitative trait locus (QTL) mapping was performed in transgene-positive (TRAMPxNOD/ShiLtJ) F2 intercross males (n = 228), which facilitated identification of 11 loci associated with aggressive disease development. Microarray data derived from 126 (TRAMPxNOD/ShiLtJ) F2 primary tumors were used to prioritize candidate genes within QTLs, with candidate genes deemed as being high priority when possessing both high levels of expression-trait correlation and a proximal expression QTL. This process enabled the identification of 35 aggressive prostate tumorigenesis candidate genes. The role of these genes in aggressive forms of human prostate cancer was investigated using two concurrent approaches. First, logistic regression analysis in two human prostate gene expression datasets revealed that expression levels of five genes (CXCL14, ITGAX, LPCAT2, RNASEH2A, and ZNF322) were positively correlated with aggressive prostate cancer and two genes (CCL19 and HIST1H1A) were protective for aggressive prostate cancer. Higher than average levels of expression of the five genes that were positively correlated with aggressive disease were consistently associated with patient outcome in both human prostate cancer tumor gene expression datasets. Second, three of these five genes (CXCL14, ITGAX, and LPCAT2) harbored polymorphisms associated with aggressive disease development in a human GWAS cohort consisting of 1,172 prostate cancer patients. This study is the first example of using a systems genetics approach to successfully identify novel susceptibility genes for aggressive prostate cancer. Such approaches will facilitate the identification of novel germline factors driving aggressive disease susceptibility and allow for new insights into these deadly forms of prostate cancer.


Subject(s)
1-Acylglycerophosphocholine O-Acyltransferase/genetics , CD11c Antigen/genetics , Chemokines, CXC/genetics , Prostatic Neoplasms/genetics , Animals , Cell Transformation, Neoplastic/genetics , Disease Models, Animal , Gene Expression Regulation, Neoplastic , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Male , Mice , Prostatic Neoplasms/pathology , Quantitative Trait Loci/genetics , Ribonuclease H/genetics
13.
Hum Mutat ; 37(1): 52-64, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26411452

ABSTRACT

Genome-wide association studies have identified genomic loci whose single-nucleotide polymorphisms (SNPs) predispose to prostate cancer (PCa). However, the mechanisms of most of these variants are largely unknown. We integrated chromatin-immunoprecipitation-coupled sequencing and microarray expression profiling in TMPRSS2-ERG gene rearrangement positive DUCaP cells with the GWAS PCa risk SNPs catalog to identify disease susceptibility SNPs localized within functional androgen receptor-binding sites (ARBSs). Among the 48 GWAS index risk SNPs and 3,917 linked SNPs, 80 were located in ARBSs. Of these, rs11891426:T>G in an intron of the melanophilin gene (MLPH) was within a novel putative auxiliary AR-binding motif, which is enriched in the neighborhood of canonical androgen-responsive elements. T→G exchange attenuated the transcriptional activity of the ARBS in an AR reporter gene assay. The expression of MLPH in primary prostate tumors was significantly lower in those with the G compared with the T allele and correlated significantly with AR protein. Higher melanophilin level in prostate tissue of patients with a favorable PCa risk profile points to a tumor-suppressive effect. These results unravel a hidden link between AR and a functional putative PCa risk SNP, whose allele alteration affects androgen regulation of its host gene MLPH.


Subject(s)
Adaptor Proteins, Signal Transducing/genetics , Binding Sites , Polymorphism, Single Nucleotide , Prostatic Neoplasms/genetics , Prostatic Neoplasms/metabolism , Receptors, Androgen/metabolism , Response Elements , Adult , Aged , Alleles , Base Sequence , Cell Line, Tumor , Chromatin Immunoprecipitation , Gene Expression Regulation, Neoplastic , Genotype , High-Throughput Nucleotide Sequencing , Humans , Male , Middle Aged , Neoplasm Grading , Neoplasm Staging , Nucleotide Motifs , Position-Specific Scoring Matrices , Prostatic Neoplasms/pathology , Protein Binding , Tumor Burden
14.
Genome Res ; 23(2): 260-9, 2013 Feb.
Article in English | MEDLINE | ID: mdl-23152449

ABSTRACT

Hutchinson-Gilford progeria syndrome (HGPS) is a premature aging disease that is frequently caused by a de novo point mutation at position 1824 in LMNA. This mutation activates a cryptic splice donor site in exon 11, and leads to an in-frame deletion within the prelamin A mRNA and the production of a dominant-negative lamin A protein, known as progerin. Here we show that primary HGPS skin fibroblasts experience genome-wide correlated alterations in patterns of H3K27me3 deposition, DNA-lamin A/C associations, and, at late passages, genome-wide loss of spatial compartmentalization of active and inactive chromatin domains. We further demonstrate that the H3K27me3 changes associate with gene expression alterations in HGPS cells. Our results support a model that the accumulation of progerin in the nuclear lamina leads to altered H3K27me3 marks in heterochromatin, possibly through the down-regulation of EZH2, and disrupts heterochromatin-lamina interactions. These changes may result in transcriptional misregulation and eventually trigger the global loss of spatial chromatin compartmentalization in late passage HGPS fibroblasts.


Subject(s)
Genome, Human , Histones/metabolism , Lamins/metabolism , Progeria/genetics , Progeria/metabolism , Cell Line , Chromatin Immunoprecipitation , Fibroblasts/metabolism , Gene Expression Regulation , Heterochromatin/metabolism , Humans , Methylation , Protein Binding , Sequence Analysis, DNA
15.
Proc Natl Acad Sci U S A ; 110(44): 17921-6, 2013 Oct 29.
Article in English | MEDLINE | ID: mdl-24127591

ABSTRACT

Chromatin-based functional genomic analyses and genomewide association studies (GWASs) together implicate enhancers as critical elements influencing gene expression and risk for common diseases. Here, we performed systematic chromatin and transcriptome profiling in human pancreatic islets. Integrated analysis of islet data with those from nine cell types identified specific and significant enrichment of type 2 diabetes and related quantitative trait GWAS variants in islet enhancers. Our integrated chromatin maps reveal that most enhancers are short (median = 0.8 kb). Each cell type also contains a substantial number of more extended (≥ 3 kb) enhancers. Interestingly, these stretch enhancers are often tissue-specific and overlap locus control regions, suggesting that they are important chromatin regulatory beacons. Indeed, we show that (i) tissue specificity of enhancers and nearby gene expression increase with enhancer length; (ii) neighborhoods containing stretch enhancers are enriched for important cell type-specific genes; and (iii) GWAS variants associated with traits relevant to a particular cell type are more enriched in stretch enhancers compared with short enhancers. Reporter constructs containing stretch enhancer sequences exhibited tissue-specific activity in cell culture experiments and in transgenic mice. These results suggest that stretch enhancers are critical chromatin elements for coordinating cell type-specific regulatory programs and that sequence variation in stretch enhancers affects risk of major common human diseases.


Subject(s)
Cell Differentiation/physiology , Chromatin/physiology , Diabetes Mellitus, Type 2/physiopathology , Enhancer Elements, Genetic/genetics , Epigenomics/methods , Gene Expression Regulation/physiology , Insulin-Secreting Cells/metabolism , Animals , Chromatin Immunoprecipitation , Diabetes Mellitus, Type 2/genetics , Enhancer Elements, Genetic/physiology , Gene Expression Profiling , Gene Expression Regulation/genetics , Genome-Wide Association Study , High-Throughput Nucleotide Sequencing , Humans , Insulin-Secreting Cells/physiology , Luciferases , Mice , Mice, Transgenic
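The length-based distinction drawn in entry 15 between short enhancers and stretch enhancers (≥ 3 kb) can be sketched as a simple classifier. This is an illustrative example only, not the authors' pipeline; the function name `classify_enhancer`, the constant `STRETCH_MIN_BP`, the half-open coordinate convention, and the example coordinates are all assumptions.

```python
# Minimal sketch: label an enhancer segment by length, using the >= 3 kb
# cutoff for "stretch" enhancers given in the abstract.
# Names, coordinate convention, and example values are hypothetical.

STRETCH_MIN_BP = 3000  # ">= 3 kb" from the abstract


def classify_enhancer(start, end, stretch_min_bp=STRETCH_MIN_BP):
    """Classify a segment as 'stretch' or 'short'.

    Coordinates are assumed to be half-open [start, end) in base pairs.
    """
    length = end - start
    if length <= 0:
        raise ValueError("end must be greater than start")
    return "stretch" if length >= stretch_min_bp else "short"


if __name__ == "__main__":
    # Hypothetical segments: (chromosome, start, end)
    segments = [("chr1", 1000, 1800), ("chr1", 5000, 8000), ("chr2", 200, 4700)]
    for chrom, start, end in segments:
        print(chrom, start, end, classify_enhancer(start, end))
```

An 800 bp segment, matching the median enhancer length reported in the abstract, falls in the short class under this cutoff.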
16.
Nucleic Acids Res ; 41(6): e70, 2013 Apr 01.
Article in English | MEDLINE | ID: mdl-23314155

ABSTRACT

Transgenic animals are extensively used to model human disease. Typically, the transgene copy number is estimated, but the exact integration site and configuration of the foreign DNA remains uncharacterized. When transgenes have been closely examined, some unexpected configurations have been found. Here, we describe a method to recover transgene insertion sites and assess structural rearrangements of host and transgene DNA using microarray hybridization and targeted sequence capture. We used information about the transgene insertion site to develop a polymerase chain reaction genotyping assay to distinguish heterozygous from homozygous transgenic animals. Although we worked with a bacterial artificial chromosome transgenic mouse line, this method can be used to analyse the integration site and configuration of any foreign DNA in a sequenced genome.


Subject(s)
Genotyping Techniques , High-Throughput Nucleotide Sequencing , Oligonucleotide Array Sequence Analysis/methods , Sequence Analysis, DNA , Transgenes , Animals , Chromosomes, Artificial, Bacterial , Mice , Mice, Transgenic , Polymerase Chain Reaction
17.
PLoS Genet ; 8(8): e1002793, 2012.
Article in English | MEDLINE | ID: mdl-22876189

ABSTRACT

Genome-wide association studies have identified hundreds of loci for type 2 diabetes, coronary artery disease and myocardial infarction, as well as for related traits such as body mass index, glucose and insulin levels, lipid levels, and blood pressure. These studies also have pointed to thousands of loci with promising but not yet compelling association evidence. To establish association at additional loci and to characterize the genome-wide significant loci by fine-mapping, we designed the "Metabochip," a custom genotyping array that assays nearly 200,000 SNP markers. Here, we describe the Metabochip and its component SNP sets, evaluate its performance in capturing variation across the allele-frequency spectrum, describe solutions to methodological challenges commonly encountered in its analysis, and evaluate its performance as a platform for genotype imputation. The Metabochip achieves dramatic cost efficiencies compared to designing single-trait follow-up reagents, and provides the opportunity to compare results across a range of related traits. The Metabochip and similar custom genotyping arrays offer a powerful and cost-effective approach to follow up large-scale genotyping and sequencing studies and advance our understanding of the genetic basis of complex human diseases and traits.


Subject(s)
Anthropometry/instrumentation , Metabolomics/instrumentation , Oligonucleotide Array Sequence Analysis/instrumentation , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Alleles , Anthropometry/methods , Cardiovascular Diseases/diagnosis , Cardiovascular Diseases/genetics , Cardiovascular Diseases/metabolism , Diabetes Mellitus, Type 2/diagnosis , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/metabolism , Gene Frequency , Genome, Human , Genome-Wide Association Study , Genotype , Genotyping Techniques , Humans , Metabolomics/methods , Oligonucleotide Array Sequence Analysis/methods , Phenotype
18.
Am J Respir Cell Mol Biol ; 51(3): 436-45, 2014 Sep.
Article in English | MEDLINE | ID: mdl-24693920

ABSTRACT

Airway allergen exposure induces inflammation among individuals with atopy that is characterized by altered airway gene expression, elevated levels of T helper type 2 cytokines, mucus hypersecretion, and airflow obstruction. To identify the genetic determinants of the airway allergen response, we employed a systems genetics approach. We applied a house dust mite mouse model of allergic airway disease to 151 incipient lines of the Collaborative Cross, a new mouse genetic reference population, and measured serum IgE, airway eosinophilia, and gene expression in the lung. Allergen-induced serum IgE and airway eosinophilia were not correlated. We detected quantitative trait loci (QTL) for airway eosinophilia on chromosome (Chr) 11 (71.802-87.098 megabases [Mb]) and allergen-induced IgE on Chr 4 (13.950-31.660 Mb). More than 4,500 genes expressed in the lung had gene expression QTL (eQTL), the majority of which were located near the gene itself. However, we also detected approximately 1,700 trans-eQTL, and many of these trans-eQTL clustered into two regions on Chr 2. We show that one of these loci (at 147.6 Mb) is associated with the expression of more than 100 genes, and, using bioinformatics resources, fine-map this locus to a 53 kb-long interval. We also use the gene expression and eQTL data to identify a candidate gene, Tlcd2, for the eosinophil QTL. Our results demonstrate that hallmark allergic airway disease phenotypes are associated with distinct genetic loci on Chrs 4 and 11, and that gene expression in the allergically inflamed lung is controlled by both cis and trans regulatory factors.


Subject(s)
Bronchial Hyperreactivity/immunology , Hypersensitivity/metabolism , Lung/immunology , Animals , Antigens, Dermatophagoides/immunology , Dermatophagoides pteronyssinus/metabolism , Disease Models, Animal , Gene Expression Regulation , Genetics , Hypersensitivity/immunology , Immunoglobulin E/blood , Inflammation , Lung/metabolism , Male , Mice , Phenotype , Quantitative Trait Loci , Respiratory Hypersensitivity/immunology
19.
Genome Res ; 20(10): 1420-31, 2010 Oct.
Article in English | MEDLINE | ID: mdl-20810667

ABSTRACT

Massively parallel DNA sequencing technologies have greatly increased our ability to generate large amounts of sequencing data at a rapid pace. Several methods have been developed to enrich for genomic regions of interest for targeted sequencing. We have compared three of these methods: Molecular Inversion Probes (MIP), Solution Hybrid Selection (SHS), and Microarray-based Genomic Selection (MGS). Using HapMap DNA samples, we compared each of these methods with respect to their ability to capture an identical set of exons and evolutionarily conserved regions associated with 528 genes (2.61 Mb). For sequence analysis, we developed and used a novel Bayesian genotype-assigning algorithm, Most Probable Genotype (MPG). All three capture methods were effective, but sensitivities (percentage of targeted bases associated with high-quality genotypes) varied for an equivalent amount of pass-filtered sequence: for example, 70% (MIP), 84% (SHS), and 91% (MGS) for 400 Mb. In contrast, all methods yielded similar accuracies of >99.84% when compared to Infinium 1M SNP BeadChip-derived genotypes and >99.998% when compared to 30-fold coverage whole-genome shotgun sequencing data. We also observed a low false-positive rate with all three methods; of the heterozygous positions identified by each of the capture methods, >99.57% agreed with 1M SNP BeadChip, and >98.840% agreed with the whole-genome shotgun data. In addition, we successfully piloted the genomic enrichment of a set of 12 pooled samples via the MGS method using molecular bar codes. We find that these three genomic enrichment methods are highly accurate and practical, with sensitivities comparable to that of 30-fold coverage whole-genome shotgun data.


Subject(s)
Diabetes Mellitus, Type 2/genetics , Genome, Human , Oligonucleotide Array Sequence Analysis/methods , Sequence Analysis, DNA/methods , Algorithms , Bayes Theorem , DNA/genetics , DNA Probes/genetics , Exons , Genotype , Humans , Reproducibility of Results , Sensitivity and Specificity
20.
Nat Genet ; 55(7): 1149-1163, 2023 07.
Article in English | MEDLINE | ID: mdl-37386251

ABSTRACT

Hereditary congenital facial paresis type 1 (HCFP1) is an autosomal dominant disorder of absent or limited facial movement that maps to chromosome 3q21-q22 and is hypothesized to result from facial branchial motor neuron (FBMN) maldevelopment. In the present study, we report that HCFP1 results from heterozygous duplications within a neuron-specific GATA2 regulatory region that includes two enhancers and one silencer, and from noncoding single-nucleotide variants (SNVs) within the silencer. Some SNVs impair binding of NR2F1 to the silencer in vitro and in vivo and attenuate in vivo enhancer reporter expression in FBMNs. Gata2 and its effector Gata3 are essential for inner-ear efferent neuron (IEE) but not FBMN development. A humanized HCFP1 mouse model extends Gata2 expression, favors the formation of IEEs over FBMNs and is rescued by conditional loss of Gata3. These findings highlight the importance of temporal gene regulation in development and of noncoding variation in rare mendelian disease.


Subject(s)
Facial Paralysis , Animals , Mice , Facial Paralysis/genetics , Facial Paralysis/congenital , Facial Paralysis/metabolism , GATA2 Transcription Factor/genetics , GATA2 Transcription Factor/metabolism , Motor Neurons/metabolism , Neurogenesis , Neurons, Efferent