ABSTRACT
Factor VII (FVII) is an important component of the coagulation cascade. Few genetic loci regulating FVII activity and/or levels have been discovered to date. We conducted a meta-analysis of 9 genome-wide association studies of plasma FVII levels (7 FVII activity and 2 FVII antigen) among 27 495 participants of European and African ancestry. Each study performed ancestry-specific association analyses. Inverse variance weighted meta-analysis was performed within each ancestry group and then combined for a trans-ancestry meta-analysis. Our primary analysis included the 7 studies that measured FVII activity, and a secondary analysis included all 9 studies. We provided functional genomic validation for newly identified significant loci by silencing candidate genes in a human liver cell line (HuH7) using small-interfering RNA and then measuring F7 messenger RNA and FVII protein expression. Lastly, we used meta-analysis results to perform Mendelian randomization analysis to estimate the causal effect of FVII activity on coronary artery disease, ischemic stroke (IS), and venous thromboembolism. We identified 2 novel (REEP3 and JAZF1-AS1) and 6 known loci associated with FVII activity, explaining 19.0% of the phenotypic variance. Adding FVII antigen data to the meta-analysis did not result in the discovery of further loci. Silencing REEP3 in HuH7 cells upregulated FVII, whereas silencing JAZF1 downregulated FVII. Mendelian randomization analyses suggest that FVII activity has a positive causal effect on the risk of IS. Variants at REEP3 and JAZF1 contribute to FVII activity by regulating F7 expression levels. FVII activity appears to contribute to the etiology of IS in the general population.
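The two-stage pooling described in this abstract (inverse-variance weighting within each ancestry group, then across groups) can be sketched as follows. This is a minimal illustration of fixed-effect inverse-variance weighting for a single variant, not the authors' pipeline; the function name `ivw_meta` and every effect size and standard error below are hypothetical.

```python
import math

def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance weighted estimate for one variant.

    betas: per-study effect estimates; ses: their standard errors.
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    weights = [1.0 / se ** 2 for se in ses]      # weight = 1 / variance
    total_w = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_w
    se = math.sqrt(1.0 / total_w)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))       # two-sided normal p-value
    return beta, se, z, p

# Hypothetical per-study estimates, grouped by ancestry (illustration only).
eur = ivw_meta([0.10, 0.14, 0.08], [0.02, 0.03, 0.025])
afr = ivw_meta([0.12, 0.09], [0.04, 0.05])

# Trans-ancestry step: pool the two ancestry-specific results.
trans = ivw_meta([eur[0], afr[0]], [eur[1], afr[1]])
```

Under a fixed-effect model, pooling the ancestry-specific results gives the same estimate as pooling all studies at once, so the two-stage design mainly serves to keep ancestry-specific results available for inspection.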
Subjects
Brain Ischemia/etiology , Factor VII/genetics , Genome-Wide Association Study , Membrane Transport Proteins/genetics , Neoplasm Proteins/genetics , Single Nucleotide Polymorphism , Stroke/etiology , Brain Ischemia/metabolism , Brain Ischemia/pathology , Co-Repressor Proteins , Cohort Studies , Coronary Artery Disease/etiology , Coronary Artery Disease/metabolism , Coronary Artery Disease/pathology , DNA-Binding Proteins , Factor VII/metabolism , Female , Follow-Up Studies , Genetic Loci , Genetic Predisposition to Disease , Humans , Male , Membrane Transport Proteins/metabolism , Mendelian Randomization Analysis , Middle Aged , Neoplasm Proteins/metabolism , Phenotype , Prognosis , Stroke/metabolism , Stroke/pathology , Venous Thromboembolism/etiology , Venous Thromboembolism/metabolism , Venous Thromboembolism/pathology
ABSTRACT
Recent advances in highly multiplexed immunoassays have allowed systematic large-scale measurement of hundreds of plasma proteins in large cohort studies. In combination with genotyping, such studies offer the prospect to 1) identify mechanisms involved with regulation of protein expression in plasma, and 2) determine whether the plasma proteins are likely to be causally implicated in disease. We report here the results of genome-wide association (GWA) studies of 83 proteins considered relevant to cardiovascular disease (CVD), measured in 3,394 individuals with multiple CVD risk factors. We identified 79 genome-wide significant (p<5e-8) association signals, 55 of which replicated at P<0.0007 in separate validation studies (n = 2,639 individuals). Using automated text mining, manual curation, and network-based methods incorporating information on expression quantitative trait loci (eQTL), we propose plausible causal mechanisms for 25 trans-acting loci, including a potential post-translational regulation of stem cell factor by matrix metalloproteinase 9 and receptor-ligand pairs such as RANK-RANK ligand. Using public GWA study data, we further evaluate all 79 loci for their causal effect on coronary artery disease, and highlight several potentially causal associations. Overall, a majority of the plasma proteins studied showed evidence of regulation at the genetic level. Our results enable future studies of the causal architecture of human disease, which in turn should aid discovery of new drug targets.
Subjects
Biomarkers/blood , Blood Proteins/genetics , Cardiovascular Diseases/blood , Cardiovascular Diseases/genetics , Quantitative Trait Loci , Coronary Artery Disease/blood , Coronary Artery Disease/genetics , Female , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Male
ABSTRACT
There is a clear clinical need for high-specificity plasma biomarkers for predicting risk of venous thromboembolism (VTE), but thus far, such markers have remained elusive. Utilizing affinity reagents from the Human Protein Atlas project and multiplexed immunoassays, we extensively analyzed plasma samples from 2 individual studies to identify candidate protein markers associated with VTE risk. We screened plasma samples from 88 VTE cases and 85 matched controls, collected as part of the Swedish "Venous Thromboembolism Biomarker Study," using suspension bead arrays composed of 755 antibodies targeting 408 candidate proteins. We identified significant associations between VTE occurrence and plasma levels of human immunodeficiency virus type I enhancer binding protein 1 (HIVEP1), von Willebrand factor (VWF), glutathione peroxidase 3 (GPX3), and platelet-derived growth factor β (PDGFB). For replication, we profiled plasma samples of 580 cases and 589 controls from the French FARIVE study. These results confirmed the association of VWF and PDGFB with VTE after correction for multiple testing, whereas only weak trends were observed for HIVEP1 and GPX3. Although plasma levels of VWF and PDGFB correlated modestly (ρ ∼ 0.30) with each other, they were independently associated with VTE risk in a joint model in FARIVE (VWF P < .001; PDGFB P = .002). PDGFB was verified as the target of the capture antibody by immunocapture mass spectrometry and sandwich enzyme-linked immunosorbent assay. In conclusion, we demonstrate that high-throughput affinity plasma proteomic profiling is a valuable research strategy to identify potential candidate biomarkers for thrombosis-related disorders, and our study suggests a novel association of PDGFB plasma levels with VTE.
Subjects
Proteomics , Proto-Oncogene Proteins c-sis/blood , Venous Thromboembolism/blood , Biomarkers/blood , DNA-Binding Proteins/blood , Female , Glutathione Peroxidase/blood , Humans , Male , Risk Factors , Transcription Factors/blood , von Willebrand Factor/metabolism
ABSTRACT
A complex disease has, by definition, multiple genetic causes. In theory, these causes could be identified individually, but their identification will likely benefit from informed use of anticipated interactions between causes. In addition, characterizing and understanding interactions must be considered key to revealing the etiology of any complex disease. Large-scale collaborative efforts are now paving the way for comprehensive studies of interaction. As a consequence, there is a need for methods with a computational efficiency sufficient for modern data sets as well as for improvements of statistical accuracy and power. Another issue is that, currently, the relation between different methods for interaction inference is in many cases not transparent, complicating the comparison and interpretation of results between different interaction studies. In this paper we present computationally efficient tests of interaction for the complete family of generalized linear models (GLMs). The tests can be applied for inference of single or multiple interaction parameters, but we show, by simulation, that jointly testing the full set of interaction parameters yields superior power and control of false positive rate. Based on these tests we also describe how to combine results from multiple independent studies of interaction in a meta-analysis. We investigate the impact of several assumptions commonly made when modeling interactions. We also show that, across the important class of models with a full set of interaction parameters, jointly testing the interaction parameters yields identical results. Further, we apply our method to genetic data for cardiovascular disease. This allowed us to identify a putative interaction involved in Lp(a) plasma levels between two 'tag' variants in the LPA locus (p = 2.42 × 10^-9) as well as replicate the interaction (p = 6.97 × 10^-7). Finally, our meta-analysis method is used in a small (N = 16,181) study of interactions in myocardial infarction.
Subjects
Chromosome Mapping/methods , Genetic Epistasis/genetics , Genetic Association Studies/methods , Genome-Wide Association Study/methods , Linear Models , Genetic Models , Algorithms , Animals , Humans , Theoretical Models
ABSTRACT
Despite the success of genome-wide association studies in medical genetics, the underlying genetics of many complex diseases remains enigmatic. One plausible reason for this could be the failure to account for the presence of genetic interactions in current analyses. Exhaustive investigations of interactions are typically infeasible because the vast number of possible interactions imposes hard statistical and computational challenges. There is, therefore, a need for computationally efficient methods that build on models appropriately capturing interaction. We introduce a new methodology where we augment the interaction hypothesis with a set of simpler hypotheses that are tested, in order of their complexity, against a saturated alternative hypothesis representing interaction. This sequential testing provides an efficient way to reduce the number of non-interacting variant pairs before the final interaction test. We devise two different methods, one that relies on a priori estimated numbers of marginally associated variants to correct for multiple tests, and a second that does this adaptively. We show that our methodology in general has an improved statistical power in comparison to seven other methods, and, using the idea of closed testing, that it controls the family-wise error rate. We apply our methodology to genetic data from the PROCARDIS coronary artery disease case/control cohort and discover three distinct interactions. While analyses on simulated data suggest that the statistical power may suffice for an exhaustive search of all variant pairs in ideal cases, we explore strategies for a priori selecting subsets of variant pairs to test. Our new methodology facilitates identification of new disease-relevant interactions from existing and future genome-wide association data, which may involve genes with previously unknown association to the disease. Moreover, it enables construction of interaction networks that provide a systems biology view of complex diseases, serving as a basis for more comprehensive understanding of disease pathophysiology and its clinical consequences.
Subjects
Genetic Epistasis , Genome-Wide Association Study , Likelihood Functions , Humans , Theoretical Models
ABSTRACT
In the version of the article published, the surname of author Aaron Isaacs is misspelled as Issacs.
ABSTRACT
Reads from paired-end and mate-pair libraries are often utilized to find structural variation in genomes, and one common approach is to use their fragment length for detection. After aligning read pairs to the reference, read pair distances are analyzed for statistically significant deviations. However, previously proposed methods are based on a simplified model of observed fragment lengths that does not agree with data. We show how this model limits statistical analysis of identifying variants and propose a new model by adapting a model we have previously introduced for contig scaffolding, which agrees with data. From this model, we derive an improved null hypothesis that, when applied in the variant caller CLEVER, reduces the number of false positives and corrects a bias that contributes to more deletion calls than insertion calls. We advise developers of variant callers with statistical fragment length-based methods to adapt the concepts in our proposed model and null hypothesis.
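For orientation, the kind of simplified fragment-length test that this abstract argues against can be sketched as a z-test on the mean distance of read pairs spanning a position. This is only an illustrative baseline under the assumption that observed distances share the library mean and standard deviation; it is neither the authors' improved null hypothesis nor CLEVER's implementation, and the function name and numbers are hypothetical.

```python
import math

def fragment_length_z(observed_lengths, lib_mean, lib_sd):
    """Naive z-test: do spanning read pairs deviate from the library mean?

    Assumes (as the simplified model does) that observed lengths are drawn
    from the library-wide distribution with mean lib_mean and sd lib_sd.
    Returns (z, two_sided_p).
    """
    n = len(observed_lengths)
    mean_obs = sum(observed_lengths) / n
    z = (mean_obs - lib_mean) / (lib_sd / math.sqrt(n))
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Hypothetical distances (bp) of four read pairs spanning one position.
z, p = fragment_length_z([500, 510, 490, 500], lib_mean=400, lib_sd=50)
# z > 0: pairs map further apart on the reference than expected,
# which such methods read as evidence for a deletion; z < 0 as an insertion.
```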
Subjects
Algorithms , High-Throughput Nucleotide Sequencing/methods , Metagenomics/methods , DNA Sequence Analysis/methods , Software , Bias , Human Genome , Genomic Structural Variation , Humans , Genetic Models
ABSTRACT
An increasing number of genome-wide association (GWA) studies are now using the higher resolution 1000 Genomes Project reference panel (1000G) for imputation, with the expectation that 1000G imputation will lead to the discovery of additional associated loci when compared to HapMap imputation. In order to assess the improvement of 1000G over HapMap imputation in identifying associated loci, we compared the results of GWA studies of circulating fibrinogen based on the two reference panels. Using both HapMap and 1000G imputation we performed a meta-analysis of 22 studies comprising the same 91,953 individuals. We identified six additional signals using 1000G imputation, while 29 loci were associated using both HapMap and 1000G imputation. One locus identified using HapMap imputation was not significant using 1000G imputation. The genome-wide significance threshold of 5×10^-8 is based on the number of independent statistical tests using HapMap imputation, and 1000G imputation may lead to further independent tests that should be corrected for. When using a stricter Bonferroni correction for the 1000G GWA study (P-value < 2.5×10^-8), the number of loci significant only using HapMap imputation increased to 4 while the number of loci significant only using 1000G decreased to 5. In conclusion, 1000G imputation enabled the identification of 20% more loci than HapMap imputation, although the advantage of 1000G imputation became less clear when a stricter Bonferroni correction was used. More generally, our results provide insights that are applicable to the implementation of other dense reference panels that are under development.
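The two significance thresholds quoted in this abstract can be related to numbers of independent tests with a short Bonferroni calculation. The sketch below assumes a family-wise alpha of 0.05, which the abstract does not state; under that assumption the conventional threshold corresponds to about one million independent tests and the stricter one to about two million. The function names are illustrative.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold for a family-wise error rate alpha."""
    return alpha / n_tests

def implied_tests(alpha, threshold):
    """Number of independent tests implied by a given per-test threshold."""
    return alpha / threshold

ALPHA = 0.05  # assumed family-wise error rate (not stated in the abstract)

hapmap_threshold = bonferroni_threshold(ALPHA, 1_000_000)   # about 5e-8
stricter_threshold = bonferroni_threshold(ALPHA, 2_000_000) # about 2.5e-8

# Going the other way: how many tests does the stricter threshold imply?
tests_for_stricter = implied_tests(ALPHA, 2.5e-8)
```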
Subjects
Genome-Wide Association Study , HapMap Project , Humans
ABSTRACT
To characterize type 2 diabetes (T2D)-associated variation across the allele frequency spectrum, we conducted a meta-analysis of genome-wide association data from 26,676 T2D case and 132,532 control subjects of European ancestry after imputation using the 1000 Genomes multiethnic reference panel. Promising association signals were followed up in additional data sets (of 14,545 or 7,397 T2D case and 38,994 or 71,604 control subjects). We identified 13 novel T2D-associated loci (P < 5 × 10^-8), including variants near the GLP2R, GIP, and HLA-DQA1 genes. Our analysis brought the total number of independent T2D associations to 128 distinct signals at 113 loci. Despite substantially increased sample size and more complete coverage of low-frequency variation, all novel associations were driven by common single nucleotide variants. Credible sets of potentially causal variants were generally larger than those based on imputation with earlier reference panels, consistent with resolution of causal signals to common risk haplotypes. Stratification of T2D-associated loci based on T2D-related quantitative trait associations revealed tissue-specific enrichment of regulatory annotations in pancreatic islet enhancers for loci influencing insulin secretion and in adipocytes, monocytes, and hepatocytes for insulin action-associated loci. These findings highlight the predominant role played by common variants of modest effect and the diversity of biological mechanisms influencing T2D pathophysiology.
Subjects
Type 2 Diabetes Mellitus/genetics , Gene Expression Regulation/physiology , Genome-Wide Association Study , White People , Genetic Variation , Humans
ABSTRACT
Large-scale whole-genome sequence data sets offer novel opportunities to identify genetic variation underlying human traits. Here we apply genotype imputation based on whole-genome sequence data from the UK10K and 1000 Genomes Project into 35,981 study participants of European ancestry, followed by association analysis with 20 quantitative cardiometabolic and hematological traits. We describe 17 new associations, including 6 rare (minor allele frequency (MAF) < 1%) or low-frequency (1% < MAF < 5%) variants with platelet count (PLT), red blood cell indices (MCH and MCV) and HDL cholesterol. Applying fine-mapping analysis to 233 known and new loci associated with the 20 traits, we resolve the associations of 59 loci to credible sets of 20 or fewer variants and describe trait enrichments within regions of predicted regulatory function. These findings improve understanding of the allelic architecture of risk factors for cardiometabolic and hematological diseases and provide additional functional insights with the identification of potentially novel biological targets.
Subjects
Genetic Loci , Human Genome , Genome-Wide Association Study , Heart Diseases/genetics , Hematologic Diseases/genetics , Female , Genetic Predisposition to Disease , Genetic Variation , Humans , Male , Quantitative Trait Loci , DNA Sequence Analysis
ABSTRACT
We performed fine mapping of 39 established type 2 diabetes (T2D) loci in 27,206 cases and 57,574 controls of European ancestry. We identified 49 distinct association signals at these loci, including five mapping in or near KCNQ1. 'Credible sets' of the variants most likely to drive each distinct signal mapped predominantly to noncoding sequence, implying that association with T2D is mediated through gene regulation. Credible set variants were enriched for overlap with FOXA2 chromatin immunoprecipitation binding sites in human islet and liver cells, including at MTNR1B, where fine mapping implicated rs10830963 as driving T2D association. We confirmed that the T2D risk allele for this SNP increases FOXA2-bound enhancer activity in islet- and liver-derived cells. We observed allele-specific differences in NEUROD1 binding in islet-derived cells, consistent with evidence that the T2D risk allele increases islet MTNR1B expression. Our study demonstrates how integration of genetic and genomic information can define molecular mechanisms through which variants underlying association signals exert their effects on disease.
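A credible set of the kind mentioned in this abstract is commonly built by ranking the variants at a signal by posterior probability and keeping the smallest group whose probabilities sum to a chosen coverage. The sketch below illustrates that general construction only; it assumes a single causal variant per signal, equal prior probability for every variant, and 99% coverage, none of which is stated in the abstract. The variant labels and log Bayes factors are hypothetical.

```python
import math

def posterior_probs(log_bfs):
    """Posterior probability per variant from log Bayes factors,
    assuming one causal variant and equal priors across variants."""
    m = max(log_bfs)                          # subtract max for stability
    scaled = [math.exp(x - m) for x in log_bfs]
    total = sum(scaled)
    return [s / total for s in scaled]

def credible_set(variant_ids, log_bfs, coverage=0.99):
    """Smallest set of top-ranked variants whose posteriors reach coverage."""
    probs = posterior_probs(log_bfs)
    ranked = sorted(zip(variant_ids, probs), key=lambda t: t[1], reverse=True)
    chosen, cumulative = [], 0.0
    for vid, prob in ranked:
        chosen.append((vid, prob))
        cumulative += prob
        if cumulative >= coverage:
            break
    return chosen

# Hypothetical signal with four variants (labels and values are made up).
cs = credible_set(["var1", "var2", "var3", "var4"], [10.0, 8.0, 2.0, 1.0])
```

A small credible set indicates that the data single out a few candidate variants; a large one indicates that many variants in linkage disequilibrium remain similarly plausible.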