1.
Nat Immunol ; 24(9): 1540-1551, 2023 09.
Article in English | MEDLINE | ID: mdl-37563310

ABSTRACT

Circulating proteins have important functions in inflammation and a broad range of diseases. To identify genetic influences on inflammation-related proteins, we conducted a genome-wide protein quantitative trait locus (pQTL) study of 91 plasma proteins measured using the Olink Target platform in 14,824 participants. We identified 180 pQTLs (59 cis, 121 trans). Integration of pQTL data with eQTL and disease genome-wide association studies provided insight into pathogenesis, implicating lymphotoxin-α in multiple sclerosis. Using Mendelian randomization (MR) to assess causality in disease etiology, we identified both shared and distinct effects of specific proteins across immune-mediated diseases, including directionally discordant effects of CD40 on risk of rheumatoid arthritis versus multiple sclerosis and inflammatory bowel disease. MR implicated CXCL5 in the etiology of ulcerative colitis (UC) and we show elevated gut CXCL5 transcript expression in patients with UC. These results identify targets of existing drugs and provide a powerful resource to facilitate future drug target prioritization.
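The abstract does not say which MR estimator was used. As a minimal sketch of the general idea only, the single-instrument Wald ratio divides a variant's effect on disease by its effect on the protein. The function name and all numbers below are hypothetical illustrations, not values from the study.

```python
import math

def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Wald ratio estimate and first-order standard error for one instrument.

    beta_exposure: variant effect on the protein level (pQTL effect)
    beta_outcome:  variant effect on disease risk (log odds ratio)
    se_outcome:    standard error of beta_outcome
    """
    if beta_exposure == 0:
        raise ValueError("instrument has no effect on the exposure")
    estimate = beta_outcome / beta_exposure
    # First-order delta method: ignores uncertainty in beta_exposure.
    se = abs(se_outcome / beta_exposure)
    return estimate, se

# Hypothetical numbers, not taken from the study.
estimate, se = wald_ratio(beta_exposure=0.5, beta_outcome=0.10, se_outcome=0.02)
odds_ratio = math.exp(estimate)  # odds ratio per unit increase in protein level
print(f"log-OR per unit protein = {estimate:.2f} (SE {se:.2f}), OR = {odds_ratio:.2f}")
```

A negative estimate for one disease and a positive estimate for another, from the same instrument, would correspond to the kind of directionally discordant effect the abstract reports for CD40.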


Subject(s)
Colitis, Ulcerative , Inflammatory Bowel Diseases , Multiple Sclerosis , Humans , Genome-Wide Association Study , Inflammatory Bowel Diseases/genetics , Quantitative Trait Loci , Colitis, Ulcerative/drug therapy , Colitis, Ulcerative/genetics , Inflammation/genetics , Multiple Sclerosis/genetics , Polymorphism, Single Nucleotide
3.
Nature ; 622(7982): 339-347, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37794183

ABSTRACT

Integrating human genomics and proteomics can help elucidate disease mechanisms, identify clinical biomarkers and discover drug targets¹⁻⁴. Because previous proteogenomic studies have focused on common variation via genome-wide association studies, the contribution of rare variants to the plasma proteome remains largely unknown. Here we identify associations between rare protein-coding variants and 2,923 plasma protein abundances measured in 49,736 UK Biobank individuals. Our variant-level exome-wide association study identified 5,433 rare genotype-protein associations, of which 81% were undetected in a previous genome-wide association study of the same cohort⁵. We then looked at aggregate signals using gene-level collapsing analysis, which revealed 1,962 gene-protein associations. Of the 691 gene-level signals from protein-truncating variants, 99.4% were associated with decreased protein levels. STAB1 and STAB2, encoding scavenger receptors involved in plasma protein clearance, emerged as pleiotropic loci, with 77 and 41 protein associations, respectively. We demonstrate the utility of our publicly accessible resource through several applications. These include detailing an allelic series in NLRC4, identifying potential biomarkers for a fatty liver disease-associated variant in HSD17B13 and bolstering phenome-wide association studies by integrating protein quantitative trait loci with protein-truncating variants in collapsing analyses. Finally, we uncover distinct proteomic consequences of clonal haematopoiesis (CH), including an association between TET2-CH and increased FLT3 levels. Our results highlight a considerable role for rare variation in plasma protein abundance and the value of proteogenomics in therapeutic discovery.


Subject(s)
Biological Specimen Banks , Blood Proteins , Genetic Association Studies , Genomics , Proteomics , Humans , Alleles , Biomarkers/blood , Blood Proteins/analysis , Blood Proteins/genetics , Databases, Factual , Exome/genetics , Hematopoiesis , Mutation , Plasma/chemistry , United Kingdom
4.
Nature ; 616(7955): 123-131, 2023 04.
Article in English | MEDLINE | ID: mdl-36991119

ABSTRACT

The use of omic modalities to dissect the molecular underpinnings of common diseases and traits is becoming increasingly common. But multi-omic traits can be genetically predicted, which enables highly cost-effective and powerful analyses for studies that do not have multi-omics¹. Here we examine a large cohort (the INTERVAL study²; n = 50,000 participants) with extensive multi-omic data for plasma proteomics (SomaScan, n = 3,175; Olink, n = 4,822), plasma metabolomics (Metabolon HD4, n = 8,153), serum metabolomics (Nightingale, n = 37,359) and whole-blood Illumina RNA sequencing (n = 4,136), and use machine learning to train genetic scores for 17,227 molecular traits, including 10,521 that reach Bonferroni-adjusted significance. We evaluate the performance of genetic scores through external validation across cohorts of individuals of European, Asian and African American ancestries. In addition, we show the utility of these multi-omic genetic scores by quantifying the genetic control of biological pathways and by generating a synthetic multi-omic dataset of the UK Biobank³ to identify disease associations using a phenome-wide scan. We highlight a series of biological insights with regard to genetic mechanisms in metabolism and canonical pathway associations with disease; for example, JAK-STAT signalling and coronary atherosclerosis. Finally, we develop a portal (https://www.omicspred.org/) to facilitate public access to all genetic scores and validation results, as well as to serve as a platform for future extensions and enhancements of multi-omic genetic scores.


Subject(s)
Coronary Artery Disease , Multiomics , Humans , Coronary Artery Disease/genetics , Coronary Artery Disease/metabolism , Metabolomics/methods , Phenotype , Proteomics/methods , Machine Learning , Black or African American/genetics , Asian/genetics , European People/genetics , United Kingdom , Datasets as Topic , Internet , Reproducibility of Results , Cohort Studies , Proteome/analysis , Proteome/metabolism , Metabolome , Plasma/metabolism , Databases, Factual
5.
Am J Hum Genet ; 111(8): 1524-1543, 2024 Aug 08.
Article in English | MEDLINE | ID: mdl-39053458

ABSTRACT

Gene misexpression is the aberrant transcription of a gene in a context where it is usually inactive. Despite its known pathological consequences in specific rare diseases, we have a limited understanding of its wider prevalence and mechanisms in humans. To address this, we analyzed gene misexpression in 4,568 whole-blood bulk RNA sequencing samples from INTERVAL study blood donors. We found that while individual misexpression events occur rarely, in aggregate they were found in almost all samples and a third of inactive protein-coding genes. Using 2,821 paired whole-genome and RNA sequencing samples, we identified that misexpression events are enriched in cis for rare structural variants. We established putative mechanisms through which a subset of SVs lead to gene misexpression, including transcriptional readthrough, transcript fusions, and gene inversion. Overall, we develop misexpression as a type of transcriptomic outlier analysis and extend our understanding of the variety of mechanisms by which genetic variants can influence gene expression.


Subject(s)
Gene Expression Regulation , Humans , Sequence Analysis, RNA , Genetic Variation , Genomic Structural Variation/genetics , Transcriptome/genetics , Blood Donors
6.
Am J Hum Genet ; 110(3): 487-498, 2023 03 02.
Article in English | MEDLINE | ID: mdl-36809768

ABSTRACT

Genome-wide association studies (GWASs) have established the contribution of common and low-frequency variants to metabolic blood measurements in the UK Biobank (UKB). To complement existing GWAS findings, we assessed the contribution of rare protein-coding variants in relation to 355 metabolic blood measurements, including 325 predominantly lipid-related nuclear magnetic resonance (NMR)-derived blood metabolite measurements (Nightingale Health Plc) and 30 clinical blood biomarkers, using 412,393 exome sequences from four genetically diverse ancestries in the UKB. Gene-level collapsing analyses were conducted to evaluate a diverse range of rare-variant architectures for the metabolic blood measurements. Altogether, we identified significant associations (p < 1 × 10⁻⁸) for 205 distinct genes that involved 1,968 significant relationships for the Nightingale blood metabolite measurements and 331 for the clinical blood biomarkers. These include associations for rare non-synonymous variants in PLIN1 and CREB3L3 with lipid metabolite measurements and SYT7 with creatinine, among others, which may not only provide insights into novel biology but also deepen our understanding of established disease mechanisms. Of the study-wide significant clinical biomarker associations, 40% were not previously detected on analyzing coding variants in a GWAS in the same cohort, reinforcing the importance of studying rare variation to fully understand the genetic architecture of metabolic blood measurements.


Subject(s)
Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Biological Specimen Banks , Biomarkers , Lipids , United Kingdom , Polymorphism, Single Nucleotide
7.
Nature ; 558(7708): 73-79, 2018 06.
Article in English | MEDLINE | ID: mdl-29875488

ABSTRACT

Although plasma proteins have important roles in biological processes and are the direct targets of many drugs, the genetic factors that control inter-individual variation in plasma protein levels are not well understood. Here we characterize the genetic architecture of the human plasma proteome in healthy blood donors from the INTERVAL study. We identify 1,927 genetic associations with 1,478 proteins, a fourfold increase on existing knowledge, including trans associations for 1,104 proteins. To understand the consequences of perturbations in plasma protein levels, we apply an integrated approach that links genetic variation with biological pathway, disease, and drug databases. We show that protein quantitative trait loci overlap with gene expression quantitative trait loci, as well as with disease-associated loci, and find evidence that protein biomarkers have causal roles in disease using Mendelian randomization analysis. By linking genetic factors to diseases via specific proteins, our analyses highlight potential therapeutic targets, opportunities for matching existing drugs with new disease indications, and potential safety concerns for drugs under development.


Subject(s)
Blood Proteins/genetics , Genomics , Proteome/genetics , Female , Hepatocyte Growth Factor/genetics , Humans , Inflammatory Bowel Diseases/genetics , Male , Mutation, Missense/genetics , Myeloblastin/genetics , Positive Regulatory Domain I-Binding Factor 1/genetics , Proto-Oncogene Proteins/genetics , Quantitative Trait Loci/genetics , Vasculitis/genetics , alpha 1-Antitrypsin/genetics
8.
Genome Res ; 28(12): 1779-1790, 2018 12.
Article in English | MEDLINE | ID: mdl-30355600

ABSTRACT

Mosaic mutations present in the germline have important implications for reproductive risk and disease transmission. We previously demonstrated a phenomenon occurring in the male germline, whereby specific mutations arising spontaneously in stem cells (spermatogonia) lead to clonal expansion, resulting in elevated mutation levels in sperm over time. This process, termed "selfish spermatogonial selection," explains the high spontaneous birth prevalence and strong paternal age-effect of disorders such as achondroplasia and Apert, Noonan and Costello syndromes, with direct experimental evidence currently available for specific positions of six genes (FGFR2, FGFR3, RET, PTPN11, HRAS, and KRAS). We present a discovery screen to identify novel mutations and genes showing evidence of positive selection in the male germline, by performing massively parallel simplex PCR using RainDance technology to interrogate mutational hotspots in 67 genes (51.5 kb in total) in 276 biopsies of testes from five men (median age, 83 yr). Following ultradeep sequencing (about 16,000×), development of a low-frequency variant prioritization strategy, and targeted validation, we identified 61 distinct variants present at frequencies as low as 0.06%, including 54 variants not previously directly associated with selfish selection. The majority (80%) of variants identified have previously been implicated in developmental disorders and/or oncogenesis and include mutations in six newly associated genes (BRAF, CBL, MAP2K1, MAP2K2, RAF1, and SOS1), all of which encode components of the RAS-MAPK pathway and activate signaling. Our findings extend the link between mutations dysregulating the RAS-MAPK pathway and selfish selection, and show that the aging male germline is a repository for such deleterious mutations.


Subject(s)
Mitogen-Activated Protein Kinases/metabolism , Mutation , Signal Transduction , Testis/metabolism , ras Proteins/metabolism , Aged , Aged, 80 and over , Genetic Variation , Humans , Male , Middle Aged
9.
BMC Med ; 19(1): 232, 2021 09 10.
Article in English | MEDLINE | ID: mdl-34503513

ABSTRACT

BACKGROUND: Genetic, lifestyle, and environmental factors can lead to perturbations in circulating lipid levels and increase the risk of cardiovascular and metabolic diseases. However, how changes in individual lipid species contribute to disease risk is often unclear. Moreover, little is known about the role of lipids on cardiovascular disease in Pakistan, a population historically underrepresented in cardiovascular studies. METHODS: We characterised the genetic architecture of the human blood lipidome in 5662 hospital controls from the Pakistan Risk of Myocardial Infarction Study (PROMIS) and 13,814 healthy British blood donors from the INTERVAL study. We applied a candidate causal gene prioritisation tool to link the genetic variants associated with each lipid to the most likely causal genes, and Gaussian Graphical Modelling network analysis to identify and illustrate relationships between lipids and genetic loci. RESULTS: We identified 253 genetic associations with 181 lipids measured using direct infusion high-resolution mass spectrometry in PROMIS, and 502 genetic associations with 244 lipids in INTERVAL. Our analyses revealed new biological insights at genetic loci associated with cardiometabolic diseases, including novel lipid associations at the LPL, MBOAT7, LIPC, APOE-C1-C2-C4, SGPP1, and SPTLC3 loci. CONCLUSIONS: Our findings, generated using a distinctive lipidomics platform in an understudied South Asian population, strengthen and expand the knowledge base of the genetic determinants of lipids and their association with cardiometabolic disease-related loci.


Subject(s)
Genome-Wide Association Study , Myocardial Infarction , Asian People/genetics , Genetic Predisposition to Disease , Humans , Lipids , Polymorphism, Single Nucleotide , White People
10.
Nucleic Acids Res ; 47(1): e3, 2019 01 10.
Article in English | MEDLINE | ID: mdl-30239796

ABSTRACT

Quantitative trait locus (QTL) mapping of molecular phenotypes such as metabolites, lipids and proteins through genome-wide association studies represents a powerful means of highlighting molecular mechanisms relevant to human diseases. However, a major challenge of this approach is to identify the causal gene(s) at the observed QTLs. Here, we present a framework for the 'Prioritization of candidate causal Genes at Molecular QTLs' (ProGeM), which incorporates biological domain-specific annotation data alongside genome annotation data from multiple repositories. We assessed the performance of ProGeM using a reference set of 227 previously reported and extensively curated metabolite QTLs. For 98% of these loci, the expert-curated gene was one of the candidate causal genes prioritized by ProGeM. Benchmarking analyses revealed that 69% of the causal candidates were nearest to the sentinel variant at the investigated molecular QTLs, indicating that genomic proximity is the most reliable indicator of 'true positive' causal genes. In contrast, cis-gene expression QTL data led to three false positive candidate causal gene assignments for every one true positive assignment. We provide evidence that these conclusions also apply to other molecular phenotypes, suggesting that ProGeM is a powerful and versatile tool for annotating molecular QTLs. ProGeM is freely available via GitHub.
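The ratio quoted for cis-eQTL data (three false positives for every true positive) can be restated as a precision figure. The short sketch below only redoes that arithmetic. The helper name is illustrative and is not part of ProGeM.

```python
def precision(true_positives, false_positives):
    """Fraction of positive assignments that are correct."""
    total = true_positives + false_positives
    if total == 0:
        raise ValueError("no positive assignments")
    return true_positives / total

# Ratio stated in the abstract for cis-eQTL-based assignments: 3 false : 1 true.
eqtl_precision = precision(true_positives=1, false_positives=3)
print(f"cis-eQTL precision = {eqtl_precision:.0%}")
```

Read this way, roughly one in four cis-eQTL-based candidate assignments in the benchmark matched the expert-curated gene.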


Subject(s)
Genetic Association Studies , Genome-Wide Association Study/methods , Molecular Sequence Annotation/methods , Quantitative Trait Loci/genetics , Chromosome Mapping/methods , Humans , Lipids/genetics , Phenotype , Proteins/genetics
11.
Bioessays ; 40(2), 2018 02.
Article in English | MEDLINE | ID: mdl-29251357

ABSTRACT

Epigenetic and transcriptional variability contribute to the vast diversity of cellular and organismal phenotypes and are key in human health and disease. In this review, we describe different types, sources, and determinants of epigenetic and transcriptional variability, enabling cells and organisms to adapt and evolve to a changing environment. We highlight the latest research and hypotheses on how chromatin structure and the epigenome influence gene expression variability. Further, we provide an overview of challenges in the analysis of biological variability. An improved understanding of the molecular mechanisms underlying epigenetic and transcriptional variability, at both the intra- and inter-individual level, provides great opportunity for disease prevention, better therapeutic approaches, and personalized medicine.


Subject(s)
Adaptation, Physiological/genetics , Biological Variation, Population/genetics , Epigenesis, Genetic , Genetic Variation , Transcription, Genetic , Biological Variation, Individual , Chromatin/genetics , Humans , Precision Medicine
12.
J Proteome Res ; 18(6): 2397-2410, 2019 06 07.
Article in English | MEDLINE | ID: mdl-30887811

ABSTRACT

Direct infusion high-resolution mass spectrometry (DIHRMS) is a novel, high-throughput approach to rapidly and accurately profile hundreds of lipids in human serum without prior chromatography, facilitating in-depth lipid phenotyping for large epidemiological studies to reveal the detailed associations of individual lipids with coronary heart disease (CHD) risk factors. Intact lipid profiling by DIHRMS was performed on 5662 serum samples from healthy participants in the Pakistan Risk of Myocardial Infarction Study (PROMIS). We developed a novel semi-targeted peak-picking algorithm to detect mass-to-charge ratios in positive and negative ionization modes. We analyzed lipid partial correlations, assessed the association of lipid principal components with established CHD risk factors and genetic variants, and examined differences between lipids for a common genetic polymorphism. The DIHRMS method provided information on 360 lipids (including fatty acyls, glycerolipids, glycerophospholipids, sphingolipids, and sterol lipids), with a median coefficient of variation of 11.6% (range: 5.4-51.9). The lipids were highly correlated and exhibited a range of associations with clinical chemistry biomarkers and lifestyle factors. This platform can provide many novel insights into the effects of physiology and lifestyle on lipid metabolism, genetic determinants of lipids, and the relationship between individual lipids and CHD risk factors.


Subject(s)
Biomarkers/blood , Coronary Disease/genetics , Lipids/genetics , Coronary Disease/blood , Coronary Disease/pathology , Female , Genetic Variation , Glycerophospholipids/blood , Humans , Lipid Metabolism/genetics , Lipids/blood , Male , Middle Aged , Risk Factors , Sphingolipids/blood , Sphingolipids/genetics , Sterols/blood
13.
Hum Mol Genet ; 26(8): 1584-1596, 2017 Apr 15.
Article in English | MEDLINE | ID: mdl-28334838

ABSTRACT

The Asp358Ala variant in the interleukin-6 receptor (IL-6R) gene has been implicated in asthma, autoimmune and cardiovascular disorders, but its role in other respiratory conditions such as chronic obstructive pulmonary disease (COPD) has not been investigated. The aims of this study were to evaluate whether there is an association between Asp358Ala and COPD or asthma risk, and to explore the role of the Asp358Ala variant in sIL-6R shedding from neutrophils and its pro-inflammatory effects in the lung. We undertook logistic regression using data from the UK Biobank and the ECLIPSE COPD cohort. Results were meta-analyzed with summary data from a further three COPD cohorts (7,519 total cases and 35,653 total controls), showing no association between Asp358Ala and COPD (OR = 1.02 [95% CI: 0.96, 1.07]). Data from the UK Biobank showed a positive association between the Asp358Ala variant and atopic asthma (OR = 1.07 [1.01, 1.13]). In a series of in vitro studies using blood samples from 37 participants, we found that shedding of sIL-6R from neutrophils was greater in carriers of the Asp358Ala minor allele than in non-carriers. Human pulmonary artery endothelial cells cultured with serum from homozygous carriers showed an increase in MCP-1 release in carriers of the minor allele, with the difference eliminated upon addition of tocilizumab. In conclusion, there is evidence that neutrophils may be an important source of sIL-6R in the lungs, and the Asp358Ala variant may have pro-inflammatory effects in lung cells. However, we were unable to identify evidence for an association between Asp358Ala and COPD.
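To see how the two reported odds ratios relate to conventional significance, one can back out an approximate standard error from each 95% confidence interval. This is a rough sketch that assumes a symmetric Wald interval on the log-odds scale. Because the published limits are rounded, the resulting p values are approximations and not the authors' figures.

```python
import math

Z_95 = 1.959964  # two-sided 95% normal quantile

def p_from_or_ci(odds_ratio, lower, upper):
    """Approximate two-sided p value from an odds ratio and its 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))

p_copd = p_from_or_ci(1.02, 0.96, 1.07)    # COPD estimate from the abstract
p_asthma = p_from_or_ci(1.07, 1.01, 1.13)  # atopic asthma estimate from the abstract
print(f"COPD p ~ {p_copd:.2f}, atopic asthma p ~ {p_asthma:.3f}")
```

The COPD interval spans 1 and the atopic asthma interval does not, which is consistent with the abstract's conclusion of no detectable COPD association alongside a modest positive asthma association.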


Subject(s)
Asthma/genetics , Genetic Association Studies , Pulmonary Disease, Chronic Obstructive/genetics , Receptors, Interleukin-6/genetics , Asthma/blood , Asthma/pathology , Female , Humans , Lung/metabolism , Lung/pathology , Male , Neutrophils/metabolism , Neutrophils/pathology , Pulmonary Disease, Chronic Obstructive/blood , Pulmonary Disease, Chronic Obstructive/pathology
14.
Lancet ; 391(10129): 1513-1523, 2018 04 14.
Article in English | MEDLINE | ID: mdl-29676281

ABSTRACT

BACKGROUND: Low-risk limits recommended for alcohol consumption vary substantially across different national guidelines. To define thresholds associated with lowest risk for all-cause mortality and cardiovascular disease, we studied individual-participant data from 599 912 current drinkers without previous cardiovascular disease. METHODS: We did a combined analysis of individual-participant data from three large-scale data sources in 19 high-income countries (the Emerging Risk Factors Collaboration, EPIC-CVD, and the UK Biobank). We characterised dose-response associations and calculated hazard ratios (HRs) per 100 g per week of alcohol (12·5 units per week) across 83 prospective studies, adjusting at least for study or centre, age, sex, smoking, and diabetes. To be eligible for the analysis, participants had to have information recorded about their alcohol consumption amount and status (ie, non-drinker vs current drinker), plus age, sex, history of diabetes and smoking status, at least 1 year of follow-up after baseline, and no baseline history of cardiovascular disease. The main analyses focused on current drinkers, whose baseline alcohol consumption was categorised into eight predefined groups according to the amount in grams consumed per week. We assessed alcohol consumption in relation to all-cause mortality, total cardiovascular disease, and several cardiovascular disease subtypes. We corrected HRs for estimated long-term variability in alcohol consumption using 152 640 serial alcohol assessments obtained some years apart (median interval 5·6 years [5th-95th percentile 1·04-13·5]) from 71 011 participants from 37 studies. FINDINGS: In the 599 912 current drinkers included in the analysis, we recorded 40 310 deaths and 39 018 incident cardiovascular disease events during 5·4 million person-years of follow-up. For all-cause mortality, we recorded a positive and curvilinear association with the level of alcohol consumption, with the minimum mortality risk around or below 100 g per week. Alcohol consumption was roughly linearly associated with a higher risk of stroke (HR per 100 g per week higher consumption 1·14, 95% CI, 1·10-1·17), coronary disease excluding myocardial infarction (1·06, 1·00-1·11), heart failure (1·09, 1·03-1·15), fatal hypertensive disease (1·24, 1·15-1·33), and fatal aortic aneurysm (1·15, 1·03-1·28). By contrast, increased alcohol consumption was log-linearly associated with a lower risk of myocardial infarction (HR 0·94, 0·91-0·97). In comparison to those who reported drinking >0-≤100 g per week, those who reported drinking >100-≤200 g per week, >200-≤350 g per week, or >350 g per week had lower life expectancy at age 40 years of approximately 6 months, 1-2 years, or 4-5 years, respectively. INTERPRETATION: In current drinkers of alcohol in high-income countries, the threshold for lowest risk of all-cause mortality was about 100 g/week. For cardiovascular disease subtypes other than myocardial infarction, there were no clear risk thresholds below which lower alcohol consumption stopped being associated with lower disease risk. These data support limits for alcohol consumption that are lower than those recommended in most current guidelines. FUNDING: UK Medical Research Council, British Heart Foundation, National Institute for Health Research, European Union Framework 7, and European Research Council.
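The unit conversion and the per-100 g hazard ratios quoted above can be checked with a few lines of arithmetic. The sketch below uses the UK definition of a unit (8 g of pure alcohol) and rescales a hazard ratio to a different intake difference. The rescaling assumes the association is log-linear over that range, which is a simplifying assumption, and the function names are illustrative.

```python
GRAMS_PER_UK_UNIT = 8.0  # one UK unit is 8 g of pure alcohol

def grams_to_units(grams_per_week):
    """Convert weekly grams of pure alcohol to UK units."""
    return grams_per_week / GRAMS_PER_UK_UNIT

def rescale_hr(hr_per_100g, grams_difference):
    """Hazard ratio for another intake difference, assuming log-linearity."""
    return hr_per_100g ** (grams_difference / 100.0)

units = grams_to_units(100)            # 12.5, matching the abstract
stroke_hr_200 = rescale_hr(1.14, 200)  # stroke HR for a 200 g/week difference
print(f"{units} units/week; stroke HR per 200 g/week = {stroke_hr_200:.2f}")
```

Because hazard ratios multiply, not add, a 200 g per week difference corresponds to the per-100 g ratio squared under this assumption.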


Subject(s)
Alcohol Drinking/adverse effects , Alcohol Drinking/mortality , Cardiovascular Diseases/etiology , Female , Humans , Male , Middle Aged , Prospective Studies
15.
Nature ; 492(7429): 369-75, 2012 Dec 20.
Article in English | MEDLINE | ID: mdl-23222517

ABSTRACT

Anaemia is a chief determinant of global ill health, contributing to cognitive impairment, growth retardation and impaired physical capacity. To understand further the genetic factors influencing red blood cells, we carried out a genome-wide association study of haemoglobin concentration and related parameters in up to 135,367 individuals. Here we identify 75 independent genetic loci associated with one or more red blood cell phenotypes at P < 10⁻⁸, which together explain 4-9% of the phenotypic variance per trait. Using expression quantitative trait loci and bioinformatic strategies, we identify 121 candidate genes enriched in functions relevant to red blood cell biology. The candidate genes are expressed preferentially in red blood cell precursors, and 43 have haematopoietic phenotypes in Mus musculus or Drosophila melanogaster. Through open-chromatin and coding-variant analyses we identify potential causal genetic variants at 41 loci. Our findings provide extensive new insights into genetic mechanisms and biological pathways controlling red blood cell formation and function.


Subject(s)
Erythrocytes/metabolism , Genetic Loci , Genome-Wide Association Study , Phenotype , Animals , Cell Cycle/genetics , Cytokines/metabolism , Drosophila melanogaster/genetics , Erythrocytes/cytology , Female , Gene Expression Regulation/genetics , Hematopoiesis/genetics , Hemoglobins/genetics , Humans , Male , Mice , Organ Specificity , Polymorphism, Single Nucleotide/genetics , RNA Interference , Signal Transduction/genetics
16.
Bioinformatics ; 32(20): 3207-3209, 2016 10 15.
Article in English | MEDLINE | ID: mdl-27318201

ABSTRACT

PhenoScanner is a curated database of publicly available results from large-scale genetic association studies. This tool aims to facilitate 'phenome scans', the cross-referencing of genetic variants with many phenotypes, to help aid understanding of disease pathways and biology. The database currently contains over 350 million association results and over 10 million unique genetic variants, mostly single nucleotide polymorphisms. It is accompanied by a web-based tool that queries the database for associations with user-specified variants, providing results according to the same effect and non-effect alleles for each input variant. The tool provides the option of searching for trait associations with proxies of the input variants, calculated using the European samples from 1000 Genomes and HapMap. AVAILABILITY AND IMPLEMENTATION: PhenoScanner is available at www.phenoscanner.medschl.cam.ac.uk CONTACT: jrs95@medschl.cam.ac.uk SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Databases, Factual , Genetic Association Studies , Genetic Variation , Genotype , Humans , Phenotype , Polymorphism, Single Nucleotide , Software
17.
Genome Res ; 23(7): 1130-41, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23570689

ABSTRACT

Nearly three-quarters of the 143 genetic signals associated with platelet and erythrocyte phenotypes identified by meta-analyses of genome-wide association (GWA) studies are located at non-protein-coding regions. Here, we assessed the role of candidate regulatory variants associated with cell type-restricted, closely related hematological quantitative traits in biologically relevant hematopoietic cell types. We used formaldehyde-assisted isolation of regulatory elements followed by next-generation sequencing (FAIRE-seq) to map regions of open chromatin in three primary human blood cells of the myeloid lineage. In the precursors of platelets and erythrocytes, as well as in monocytes, we found that open chromatin signatures reflect the corresponding hematopoietic lineages of the studied cell types and associate with the cell type-specific gene expression patterns. Dependent on their signal strength, open chromatin regions showed correlation with promoter and enhancer histone marks, distance to the transcription start site, and ontology classes of nearby genes. Cell type-restricted regions of open chromatin were enriched in sequence variants associated with hematological indices. The majority (63.6%) of such candidate functional variants at platelet quantitative trait loci (QTLs) coincided with binding sites of five transcription factors key in regulating megakaryopoiesis. We experimentally tested 13 candidate regulatory variants at 10 platelet QTLs and found that 10 (76.9%) affected protein binding, suggesting that this is a frequent mechanism by which regulatory variants influence quantitative trait levels. Our findings demonstrate that combining large-scale GWA data with open chromatin profiles of relevant cell types can be a powerful means of dissecting the genetic architecture of closely related quantitative traits.


Subject(s)
Chromatin Assembly and Disassembly , Chromatin/metabolism , Genetic Variation , Quantitative Trait Loci , Quantitative Trait, Heritable , Regulatory Sequences, Nucleic Acid , Blood Platelets/metabolism , Cell Lineage/genetics , Chromosome Mapping , Cluster Analysis , Erythrocytes/metabolism , Gene Expression Regulation , Genome-Wide Association Study , Histones/metabolism , Humans , Myeloid Cells/metabolism , Nucleosomes/metabolism , Organ Specificity/genetics , Phenotype , Polymorphism, Single Nucleotide
18.
Bioessays ; 36(2): 191-9, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24311363

ABSTRACT

Understanding the functional mechanisms underlying genetic signals associated with complex traits and common diseases, such as cancer, diabetes and Alzheimer's disease, is a formidable challenge. Many genetic signals discovered through genome-wide association studies map to non-protein coding sequences, where their molecular consequences are difficult to evaluate. This article summarizes concepts for the systematic interpretation of non-coding genetic signals using genome annotation data sets in different cellular systems. We outline strategies for the global analysis of multiple association intervals and the in-depth molecular investigation of individual intervals. We highlight experimental techniques to validate candidate (potential causal) regulatory variants, with a focus on novel genome-editing techniques including CRISPR/Cas9. These approaches are also applicable to low-frequency and rare variants, which have become increasingly important in genomic studies of complex traits and diseases. There is a pressing need to translate genetic signals into biological mechanisms, leading to prognostic, diagnostic and therapeutic advances.


Subject(s)
Genetic Variation/genetics , Computational Biology , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans
19.
Blood ; 120(24): 4859-68, 2012 Dec 06.
Article in English | MEDLINE | ID: mdl-22972982

ABSTRACT

We recently identified 68 genomic loci where common sequence variants are associated with platelet count and volume. Platelets are formed in the bone marrow by megakaryocytes, which are derived from hematopoietic stem cells by a process mainly controlled by transcription factors. The homeobox transcription factor MEIS1 is uniquely transcribed in megakaryocytes and not in the other lineage-committed blood cells. By ChIP-seq, we show that 5 of the 68 loci pinpoint a MEIS1 binding event within a group of 252 MK-overexpressed genes. In one such locus in DNM3, regulating platelet volume, the MEIS1 binding site falls within a region acting as an alternative promoter that is solely used in megakaryocytes, where allelic variation dictates different levels of a shorter transcript. The importance of dynamin activity to the latter stages of thrombopoiesis was confirmed by the observation that the inhibitor Dynasore reduced murine proplatelet formation in vitro.


Subject(s)
Blood Platelets/metabolism , Dynamin III/genetics , Genome, Human/genetics , Homeodomain Proteins/genetics , Megakaryocytes/metabolism , Neoplasm Proteins/genetics , Promoter Regions, Genetic/genetics , Animals , Binding Sites/genetics , Blood Platelets/drug effects , Cell Line, Tumor , Cell Lineage/genetics , Cells, Cultured , Chromatin Immunoprecipitation , Gene Expression , Genetic Variation , Homeodomain Proteins/metabolism , Humans , Hydrazones/pharmacology , Mice , Myeloid Ecotropic Viral Integration Site 1 Protein , Neoplasm Proteins/metabolism , Platelet Count , Polymorphism, Single Nucleotide , Reverse Transcriptase Polymerase Chain Reaction , Sequence Analysis, DNA , Transcription Initiation Site , Transcription, Genetic
20.
PLoS Genet ; 7(6): e1002139, 2011 Jun.
Article in English | MEDLINE | ID: mdl-21738486

ABSTRACT

Turning genetic discoveries identified in genome-wide association (GWA) studies into biological mechanisms is an important challenge in human genetics. Many GWA signals map outside exons, suggesting that the associated variants may lie within regulatory regions. We applied the formaldehyde-assisted isolation of regulatory elements (FAIRE) method in a megakaryocytic and an erythroblastoid cell line to map active regulatory elements at known loci associated with hematological quantitative traits, coronary artery disease, and myocardial infarction. We showed that the two cell types exhibit distinct patterns of open chromatin and that cell-specific open chromatin can guide the finding of functional variants. We identified an open chromatin region at chromosome 7q22.3 in megakaryocytes but not erythroblasts, which harbors the common non-coding sequence variant rs342293 known to be associated with platelet volume and function. Resequencing of this open chromatin region in 643 individuals provided strong evidence that rs342293 is the only putative causative variant in this region. We demonstrated that the C- and G-alleles differentially bind the transcription factor EVI1 affecting PIK3CG gene expression in platelets and macrophages. A protein-protein interaction network including up- and down-regulated genes in Pik3cg knockout mice indicated that PIK3CG is associated with gene pathways with an established role in platelet membrane biogenesis and thrombus formation. Thus, rs342293 is the functional common variant at this locus; to the best of our knowledge this is the first such variant to be elucidated among the known platelet quantitative trait loci (QTLs). Our data suggested a molecular mechanism by which a non-coding GWA index SNP modulates platelet phenotype.


Subject(s)
Chromatin/genetics , Genome-Wide Association Study , Animals , Blood Platelets/metabolism , Chromosomes, Human, Pair 7/genetics , Class Ib Phosphatidylinositol 3-Kinase/genetics , DNA-Binding Proteins/metabolism , Erythroblasts/metabolism , Female , Gene Expression Profiling , Humans , MDS1 and EVI1 Complex Locus Protein , Macrophages/metabolism , Megakaryocytes/metabolism , Mice , Mice, Inbred C57BL , Mice, Knockout , Models, Genetic , Phenotype , Proto-Oncogenes , Quantitative Trait Loci , Signal Transduction/genetics , Transcription Factors/metabolism