ABSTRACT
Critical illness in COVID-19 is an extreme and clinically homogeneous disease phenotype that we have previously shown [1] to be highly efficient for discovery of genetic associations [2]. Despite the advanced stage of illness at presentation, we have shown that host genetics in patients who are critically ill with COVID-19 can identify immunomodulatory therapies with strong beneficial effects in this group [3]. Here we analyse 24,202 cases of COVID-19 with critical illness comprising a combination of microarray genotype and whole-genome sequencing data from cases of critical illness in the international GenOMICC (11,440 cases) study, combined with other studies recruiting hospitalized patients with a strong focus on severe and critical disease: ISARIC4C (676 cases) and the SCOURGE consortium (5,934 cases). To put these results in the context of existing work, we conduct a meta-analysis of the new GenOMICC genome-wide association study (GWAS) results with previously published data. We find 49 genome-wide significant associations, of which 16 have not been reported previously. To investigate the therapeutic implications of these findings, we infer the structural consequences of protein-coding variants, and combine our GWAS results with gene expression data using a monocyte transcriptome-wide association study (TWAS) model, as well as gene and protein expression using Mendelian randomization. We identify potentially druggable targets in multiple systems, including inflammatory signalling (JAK1), monocyte-macrophage activation and endothelial permeability (PDE4A), immunometabolism (SLC2A5 and AK5), and host factors required for viral entry and replication (TMPRSS2 and RAB2A).
Subject(s)
COVID-19, Critical Illness, Genetic Predisposition to Disease, Genetic Variation, Genome-Wide Association Study, Humans, COVID-19/genetics, Genetic Predisposition to Disease/genetics, Genetic Variation/genetics, Genotype, Genotyping Techniques, Monocytes/metabolism, Phenotype, rab GTP-Binding Proteins/genetics, Transcriptome, Whole Genome Sequencing
ABSTRACT
Critical COVID-19 is caused by immune-mediated inflammatory lung injury. Host genetic variation influences the development of illness requiring critical care [1] or hospitalization [2-4] after infection with SARS-CoV-2. The GenOMICC (Genetics of Mortality in Critical Care) study enables the comparison of genomes from individuals who are critically ill with those of population controls to find underlying disease mechanisms. Here we use whole-genome sequencing in 7,491 critically ill individuals compared with 48,400 controls to discover and replicate 23 independent variants that significantly predispose to critical COVID-19. We identify 16 new independent associations, including variants within genes that are involved in interferon signalling (IL10RB and PLSCR1), leucocyte differentiation (BCL11A) and blood-type antigen secretor status (FUT2). Using transcriptome-wide association and colocalization to infer the effect of gene expression on disease severity, we find evidence that implicates multiple genes in critical disease, including reduced expression of a membrane flippase (ATP11A) and increased expression of a mucin (MUC1). Mendelian randomization provides evidence in support of causal roles for myeloid cell adhesion molecules (SELE, ICAM5 and CD209) and the coagulation factor F8, all of which are potentially druggable targets. Our results are broadly consistent with a multi-component model of COVID-19 pathophysiology, in which at least two distinct mechanisms can predispose to life-threatening disease: failure to control viral replication; or an enhanced tendency towards pulmonary inflammation and intravascular coagulation. We show that comparison between cases of critical illness and population controls is highly efficient for the detection of therapeutically relevant mechanisms of disease.
Subject(s)
COVID-19, Critical Illness, Human Genome, Host-Pathogen Interactions, Whole Genome Sequencing, ATP-Binding Cassette Transporters, COVID-19/genetics, COVID-19/mortality, COVID-19/pathology, COVID-19/virology, Cell Adhesion Molecules, Critical Care, Critical Illness/mortality, E-Selectin, Factor VIII, Fucosyltransferases, Human Genome/genetics, Genome-Wide Association Study, Host-Pathogen Interactions/genetics, Humans, Interleukin-10 Receptor beta Subunit, C-Type Lectins, Mucin-1, Nerve Tissue Proteins, Phospholipid Transfer Proteins, Cell Surface Receptors, Repressor Proteins, SARS-CoV-2/pathogenicity, Galactoside 2-alpha-L-fucosyltransferase
ABSTRACT
Host-mediated lung inflammation is present [1], and drives mortality [2], in the critical illness caused by coronavirus disease 2019 (COVID-19). Host genetic variants associated with critical illness may identify mechanistic targets for therapeutic development [3]. Here we report the results of the GenOMICC (Genetics Of Mortality In Critical Care) genome-wide association study in 2,244 critically ill patients with COVID-19 from 208 UK intensive care units. We have identified and replicated the following new genome-wide significant associations: on chromosome 12q24.13 (rs10735079, P = 1.65 × 10⁻⁸) in a gene cluster that encodes antiviral restriction enzyme activators (OAS1, OAS2 and OAS3); on chromosome 19p13.2 (rs74956615, P = 2.3 × 10⁻⁸) near the gene that encodes tyrosine kinase 2 (TYK2); on chromosome 19p13.3 (rs2109069, P = 3.98 × 10⁻¹²) within the gene that encodes dipeptidyl peptidase 9 (DPP9); and on chromosome 21q22.1 (rs2236757, P = 4.99 × 10⁻⁸) in the interferon receptor gene IFNAR2. We identified potential targets for repurposing of licensed medications: using Mendelian randomization, we found evidence that low expression of IFNAR2, or high expression of TYK2, are associated with life-threatening disease; and transcriptome-wide association in lung tissue revealed that high expression of the monocyte-macrophage chemotactic receptor CCR2 is associated with severe COVID-19. Our results identify robust genetic signals relating to key host antiviral defence mechanisms and mediators of inflammatory organ damage in COVID-19. Both mechanisms may be amenable to targeted treatment with existing drugs. However, large-scale randomized clinical trials will be essential before any change to clinical practice.
Subject(s)
COVID-19/genetics, COVID-19/physiopathology, Critical Illness, 2',5'-Oligoadenylate Synthetase/genetics, COVID-19/pathology, Human Chromosome Pair 12/genetics, Human Chromosome Pair 19/genetics, Human Chromosome Pair 21/genetics, Critical Care, Dipeptidyl-Peptidases and Tripeptidyl-Peptidases/genetics, Drug Repositioning, Female, Genome-Wide Association Study, Humans, Inflammation/genetics, Inflammation/pathology, Inflammation/physiopathology, Lung/pathology, Lung/physiopathology, Lung/virology, Male, Multigene Family/genetics, Interferon alpha-beta Receptor/genetics, CCR2 Receptors/genetics, TYK2 Kinase/genetics, United Kingdom
ABSTRACT
Coagulation factor VIII (FVIII) and its carrier protein von Willebrand factor (VWF) are critical to coagulation and platelet aggregation. We leveraged whole-genome sequence data from the Trans-Omics for Precision Medicine (TOPMed) program along with TOPMed-based imputation of genotypes in additional samples to identify genetic associations with circulating FVIII and VWF levels in a single-variant meta-analysis, including up to 45 289 participants. Gene-based aggregate tests were implemented in TOPMed. We identified 3 candidate causal genes and tested their functional effect on FVIII release from human liver endothelial cells (HLECs) and VWF release from human umbilical vein endothelial cells (HVECs). Mendelian randomization was also performed to provide evidence for causal associations of FVIII and VWF with thrombotic outcomes. We identified associations (P < 5 × 10⁻⁹) at 7 new loci for FVIII (ST3GAL4, CLEC4M, B3GNT2, ASGR1, F12, KNG1, and TREM1/NCR2) and 1 for VWF (B3GNT2). VWF, ABO, and STAB2 were associated with FVIII and VWF in gene-based analyses. Multiphenotype analysis of FVIII and VWF identified another 3 new loci, including PDIA3. Silencing of B3GNT2 and the previously reported CD36 gene decreased release of FVIII by HLECs, whereas silencing of B3GNT2, CD36, and PDIA3 decreased release of VWF by HVECs. Mendelian randomization supports causal association of higher FVIII and VWF with increased risk of thrombotic outcomes. Seven new loci were identified for FVIII and 1 for VWF, with evidence supporting causal associations of FVIII and VWF with thrombotic outcomes. B3GNT2, CD36, and PDIA3 modulate the release of FVIII and/or VWF in vitro.
Subject(s)
Cell Adhesion Molecules, Factor VIII, Kininogens, C-Type Lectins, Cell Surface Receptors, von Willebrand Factor, Humans, von Willebrand Factor/genetics, von Willebrand Factor/metabolism, Factor VIII/genetics, Factor VIII/metabolism, Single Nucleotide Polymorphism, Human Umbilical Vein Endothelial Cells/metabolism, Mendelian Randomization Analysis, Genome-Wide Association Study, Thrombosis/genetics, Thrombosis/blood, Genetic Association Studies, Male, Endothelial Cells/metabolism, Female
ABSTRACT
Cardiometabolic diseases, such as type 2 diabetes and cardiovascular disease, have a high public health burden. Understanding the genetically determined regulation of proteins that are dysregulated in disease can help to dissect the complex biology underpinning them. Here, we perform a protein quantitative trait locus (pQTL) analysis of 248 serum proteins relevant to cardiometabolic processes in 2893 individuals. Meta-analyzing whole-genome sequencing (WGS) data from two Greek cohorts, MANOLIS (n = 1356; 22.5× WGS) and Pomak (n = 1537; 18.4× WGS), we detect 301 independently associated pQTL variants for 170 proteins, including 12 rare variants (minor allele frequency < 1%). We additionally find 15 pQTL variants that are rare in non-Finnish European populations but have drifted up in frequency in the discovery cohorts here. We identify proteins causally associated with cardiometabolic traits, including Mep1b for high-density lipoprotein (HDL) levels, and describe a knock-out (KO) Mep1b mouse model. Our findings furnish insights into the genetic architecture of the serum proteome, identify new protein-disease relationships and demonstrate the importance of isolated populations in pQTL analysis.
Subject(s)
Cardiovascular Diseases, Type 2 Diabetes Mellitus, Animals, Mice, Phenotype, Whole Genome Sequencing, Blood Proteins/genetics, Genome-Wide Association Study
ABSTRACT
Changes in the N-glycosylation of immunoglobulin G (IgG) are often observed in pathological states, such as autoimmune, inflammatory, neurodegenerative and cardiovascular diseases and some types of cancer. However, in most cases, it is not clear whether disease onset causes these changes, or whether the changes in IgG N-glycosylation are among the risk factors for the diseases. The aim of this study was to investigate the causal relationships between IgG N-glycosylation traits and 12 diseases in which alterations of the IgG N-glycome were previously reported, using a two-sample Mendelian randomization (MR) approach. We performed two-sample MR using publicly available summary statistics of genome-wide association studies of IgG N-glycosylation and disease risks. Our results indicate a positive causal effect of systemic lupus erythematosus (SLE) on the abundance of N-glycans with bisecting N-acetylglucosamine in the total IgG N-glycome. Therefore, we suggest regarding this IgG glycosylation trait as a biomarker of SLE. We also emphasize the need for more powerful GWAS studies of IgG N-glycosylation to further elucidate the causal effect of the IgG N-glycome on the diseases.
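As a rough illustration of the two-sample MR approach described above, the following Python sketch combines per-variant Wald ratios with a fixed-effect inverse-variance weighted (IVW) estimator. The abstract does not state which estimator the authors used, so IVW is an assumption here, and the function names and summary statistics are hypothetical toy values, not data from the study.

```python
import math

def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Per-variant causal estimate and its first-order standard error."""
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)
    return estimate, se

def ivw_estimate(variants):
    """Fixed-effect inverse-variance weighted combination of Wald ratios.

    `variants` is a list of (beta_exposure, beta_outcome, se_outcome) tuples.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for beta_x, beta_y, se_y in variants:
        estimate, se = wald_ratio(beta_x, beta_y, se_y)
        weight = 1.0 / se ** 2
        total_weight += weight
        weighted_sum += weight * estimate
    pooled = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se

# Hypothetical summary statistics (NOT from the study):
# (effect on exposure, effect on outcome, SE of effect on outcome)
toy_variants = [
    (0.10, 0.020, 0.010),
    (0.20, 0.050, 0.012),
    (0.15, 0.030, 0.015),
]

pooled, pooled_se = ivw_estimate(toy_variants)
z_score = pooled / pooled_se
p_value = math.erfc(abs(z_score) / math.sqrt(2))  # two-sided normal P value
print(f"IVW estimate = {pooled:.3f} (SE {pooled_se:.3f}), P = {p_value:.2g}")
```

In a bidirectional analysis such as the one described, the same calculation would be run twice with the roles of exposure and outcome swapped, using a separate set of instruments for each direction.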
Subject(s)
Immunoglobulin G, Systemic Lupus Erythematosus, Genome-Wide Association Study, Glycosylation, Humans, Immunoglobulin G/genetics, Immunoglobulin G/metabolism, Systemic Lupus Erythematosus/genetics, Polysaccharides/genetics
ABSTRACT
AIMS/HYPOTHESIS: We previously demonstrated that N-glycosylation of plasma proteins and IgGs is different in children with recent-onset type 1 diabetes compared with their healthy siblings. To search for genetic variants contributing to these changes, we undertook a genetic association study of the plasma protein and IgG N-glycome in type 1 diabetes. METHODS: A total of 1105 recent-onset type 1 diabetes patients from the Danish Registry of Childhood and Adolescent Diabetes were genotyped at 183,546 genetic markers, testing these for genetic association with variable levels of 24 IgG and 39 plasma protein N-glycan traits. In the follow-up study, significant associations were validated in 455 samples. RESULTS: This study confirmed previously known plasma protein and/or IgG N-glycosylation loci (candidate genes MGAT3, MGAT5 and ST6GAL1, encoding beta-1,4-mannosyl-glycoprotein 4-beta-N-acetylglucosaminyltransferase, alpha-1,6-mannosylglycoprotein 6-beta-N-acetylglucosaminyltransferase and ST6 beta-galactoside alpha-2,6-sialyltransferase 1, respectively) and identified novel associations that were not previously reported for the general European population. First, novel genetic associations of IgG-bound glycans were found with SNPs on chromosome 22 residing in two genomic intervals close to candidate gene MGAT3; these include core fucosylated digalactosylated disialylated IgG N-glycan with bisecting N-acetylglucosamine (GlcNAc) (p_discovery = 7.65 × 10⁻¹², p_replication = 8.33 × 10⁻⁶ for the top associated SNP rs5757680) and core fucosylated digalactosylated glycan with bisecting GlcNAc (p_discovery = 2.88 × 10⁻¹⁰, p_replication = 3.03 × 10⁻³ for the top associated SNP rs137702). The most significant genetic associations of IgG-bound glycans were those with MGAT3.
Second, two SNPs in high linkage disequilibrium (missense rs1047286 and synonymous rs2230203) located on chromosome 19 within the protein coding region of the complement C3 gene (C3) showed association with the oligomannose plasma protein N-glycan (p_discovery = 2.43 × 10⁻¹¹, p_replication = 8.66 × 10⁻⁴ for the top associated SNP rs1047286). CONCLUSIONS/INTERPRETATION: This study identified novel genetic associations driving the distinct N-glycosylation of plasma proteins and IgGs identified previously at type 1 diabetes onset. Our results highlight the importance of further exploring the potential role of N-glycosylation and its influence on complement activation and type 1 diabetes susceptibility.
Subject(s)
Type 1 Diabetes Mellitus, Adolescent, Child, Humans, Glycosylation, Type 1 Diabetes Mellitus/genetics, Glycomics/methods, Follow-Up Studies, N-Acetylglucosaminyltransferases/genetics, Immunoglobulin G/metabolism, Blood Proteins/metabolism, Polysaccharides/metabolism
ABSTRACT
BACKGROUND: SARS-CoV-2, the causal agent of COVID-19, enters human cells using the ACE2 (angiotensin-converting enzyme 2) protein as a receptor. ACE2 is thus key to the infection and treatment of the coronavirus. ACE2 is highly expressed in the heart and respiratory and gastrointestinal tracts, playing important regulatory roles in the cardiovascular and other biological systems. However, the genetic basis of the ACE2 protein levels is not well understood. METHODS: We have conducted the largest genome-wide association meta-analysis of plasma ACE2 levels in >28 000 individuals of the SCALLOP Consortium (Systematic and Combined Analysis of Olink Proteins). We summarize the cross-sectional epidemiological correlates of circulating ACE2. Using the summary statistics-based high-definition likelihood method, we estimate relevant genetic correlations with cardiometabolic phenotypes, COVID-19, and other human complex traits and diseases. We perform causal inference of soluble ACE2 on vascular disease outcomes and COVID-19 severity using mendelian randomization. We also perform in silico functional analysis by integrating with other types of omics data. RESULTS: We identified 10 loci, including 8 novel, capturing 30% of the heritability of the protein. We detected that plasma ACE2 was genetically correlated with vascular diseases, severe COVID-19, and a wide range of human complex diseases and medications. An X-chromosome cis-protein quantitative trait loci-based mendelian randomization analysis suggested a causal effect of elevated ACE2 levels on COVID-19 severity (odds ratio, 1.63 [95% CI, 1.10-2.42]; P=0.01), hospitalization (odds ratio, 1.52 [95% CI, 1.05-2.21]; P=0.03), and infection (odds ratio, 1.60 [95% CI, 1.08-2.37]; P=0.02). Tissue- and cell type-specific transcriptomic and epigenomic analysis revealed that the ACE2 regulatory variants were enriched for DNA methylation sites in blood immune cells. 
CONCLUSIONS: Human plasma ACE2 shares a genetic basis with cardiovascular disease, COVID-19, and other related diseases. The genetic architecture of the ACE2 protein is mapped, providing a useful resource for further biological and clinical studies on this coronavirus receptor.
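The odds ratios and confidence intervals quoted in the RESULTS can be related to their P values with a short worked calculation. The Python sketch below assumes a Wald-type 95% interval that is symmetric on the log-odds scale (an assumption; the abstract does not say how the intervals were computed), recovers the standard error of the log odds ratio from the interval, and derives an approximate two-sided P value. Because the published figures are rounded, the results should only be expected to agree roughly with the reported P values.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def log_or_se_from_ci(ci_low, ci_high):
    """Standard error of log(OR), assuming a symmetric Wald interval on the log scale."""
    return (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)

def two_sided_p(odds_ratio, ci_low, ci_high):
    """Approximate two-sided normal P value for the null hypothesis OR = 1."""
    se = log_or_se_from_ci(ci_low, ci_high)
    z_score = math.log(odds_ratio) / se
    return math.erfc(abs(z_score) / math.sqrt(2))

# Odds ratios and 95% CIs as quoted in the abstract.
reported = {
    "severity": (1.63, 1.10, 2.42),
    "hospitalization": (1.52, 1.05, 2.21),
    "infection": (1.60, 1.08, 2.37),
}

p_values = {name: two_sided_p(*values) for name, values in reported.items()}
for name, p in p_values.items():
    print(f"{name}: approximate P = {p:.3f}")
```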
Subject(s)
Angiotensin-Converting Enzyme 2, COVID-19, Angiotensin-Converting Enzyme 2/genetics, COVID-19/genetics, Cross-Sectional Studies, Genome-Wide Association Study, Humans, Coronavirus Receptors, SARS-CoV-2
ABSTRACT
The N-glycosylation of immunoglobulin G (IgG) affects its structure and function. It has been demonstrated that IgG N-glycosylation patterns are inherited as complex quantitative traits. Genome-wide association studies identified loci harboring genes encoding enzymes directly involved in protein glycosylation as well as loci likely to be involved in regulation of glycosylation biochemical pathways. Many of these loci could be linked to immune functions and risk of inflammatory and autoimmune diseases. The aim of the present study was to discover and replicate new loci associated with IgG N-glycosylation and to investigate possible pleiotropic effects of these loci on immune function and the risk of inflammatory and autoimmune diseases. We conducted a multivariate genome-wide association analysis of 23 IgG N-glycosylation traits measured in 8090 individuals of European ancestry. The discovery stage was followed up by replication in 3147 people and in silico functional analysis. Our study increased the total number of replicated loci from 22 to 29. For the discovered loci, we suggest a number of genes potentially involved in the control of IgG N-glycosylation. Among the new loci, two (near RNF168 and TNFRSF13B) were previously implicated in rare immune deficiencies and were associated with levels of circulating immunoglobulins. For one new locus (near AP5B1/OVOL1), we demonstrated a potential pleiotropic effect on the risk of asthma. Our findings underline an important link between IgG N-glycosylation and immune function and provide new clues to understanding their interplay.
Subject(s)
Genetic Loci/genetics, Genetic Pleiotropy/genetics, Genome-Wide Association Study/methods, Immunity/genetics, Immunoglobulin G/genetics, Alleles, Autoimmune Diseases/genetics, Cohort Studies, Computer Simulation, Gene Frequency, Genome-Wide Association Study/statistics & numerical data, Genotype, Glycosylation, Humans, Immunoglobulin G/metabolism, Inflammation/genetics, Multivariate Analysis, Phenotype, Single Nucleotide Polymorphism, Quantitative Trait Loci/genetics
ABSTRACT
BACKGROUND: Human plasma contains a wide variety of circulating proteins. These proteins can be important clinical biomarkers in disease and also possible drug targets. Large-scale genomic studies of circulating proteins can identify genetic variants that influence relative protein abundance. METHODS: We conducted a meta-analysis of genome-wide association studies of autosomal chromosomes in 22,997 individuals of primarily European ancestry across 12 cohorts to identify protein quantitative trait loci (pQTL) for 92 cardiometabolic-associated plasma proteins. RESULTS: We identified 503 (337 cis and 166 trans) conditionally independent pQTLs, including several novel variants not reported in the literature. We conducted a sex-stratified analysis and found that 118 (23.5%) of pQTLs demonstrated heterogeneity between sexes. The direction of effect was preserved but there were differences in effect size and significance. Additionally, we annotate trans-pQTLs with nearest genes and report plausible biological relationships. Using Mendelian randomization, we identified causal associations for 18 proteins across 19 phenotypes, of which 10 have additional genetic colocalization evidence. We highlight proteins associated with a constellation of cardiometabolic traits including angiopoietin-related protein 7 (ANGPTL7) and Semaphorin 3F (SEMA3F). CONCLUSION: Through large-scale analysis of protein quantitative trait loci, we provide a comprehensive overview of common variants associated with plasma proteins. We highlight possible biological relationships which may serve as a basis for further investigation into possible causal roles in cardiometabolic diseases.
ABSTRACT
BACKGROUND: Atherosclerotic cardiovascular disease (CVD) is the leading cause of death in diabetes, but the full range of biomarkers reflecting atherosclerotic burden and CVD risk in people with diabetes is unknown. Metabolomics may help identify novel biomarkers potentially involved in development of atherosclerosis. We investigated the serum metabolomic profile of subclinical atherosclerosis, measured using ankle brachial index (ABI), in people with type 2 diabetes, compared with the profile for symptomatic CVD in the same population. METHODS: The Edinburgh Type 2 Diabetes Study is a cohort of 1,066 individuals with type 2 diabetes. ABI was measured at baseline and at years 4 and 10, with cardiovascular events assessed at baseline and during 10 years of follow-up. A panel of 228 metabolites was measured at baseline using nuclear magnetic resonance spectrometry, and their association with both ABI and prevalent CVD was explored using univariate regression models and least absolute shrinkage and selection operator (LASSO). Metabolites associated with baseline ABI were further explored for association with follow-up ABI and incident CVD. RESULTS: Mean (standard deviation, SD) ABI at baseline was 0.97 (0.18, n = 1025), and prevalence of CVD was 35.0%. During 10-year follow-up, mean (SD) change in ABI was +0.006 (0.178, n = 436), and 257 CVD events occurred. Lactate, glycerol, creatinine and glycoprotein acetyls levels were associated with baseline ABI in both univariate regression [βs (95% confidence interval, CI) ranged from -0.025 (-0.036, -0.015) to -0.023 (-0.034, -0.013), all p < 0.0002] and LASSO analysis. The associations remained nominally significant after adjustment for major vascular risk factors. In prospective analyses, lactate was nominally associated with ABI measured at years 4 and 10 after adjustment for baseline ABI.
The four ABI-associated metabolites were all positively associated with prevalent CVD [odds ratios (ORs) ranged from 1.29 (1.13, 1.47) to 1.49 (1.29, 1.74), all p < 0.0002], and they were also positively associated with incident CVD [ORs (95% CI) ranged from 1.19 (1.02, 1.39) to 1.35 (1.17, 1.56), all p < 0.05]. CONCLUSIONS: Serum metabolites relating to glycolysis, fluid balance and inflammation were independently associated with both a marker of subclinical atherosclerosis and with symptomatic CVD in people with type 2 diabetes. Additional investigation is warranted to determine their roles as possible etiological and/or predictive biomarkers for atherosclerotic CVD.
Subject(s)
Atherosclerosis, Cardiovascular Diseases, Type 2 Diabetes Mellitus, Biomarkers, Cardiovascular Diseases/complications, Cardiovascular Diseases/diagnosis, Cardiovascular Diseases/epidemiology, Type 2 Diabetes Mellitus/complications, Type 2 Diabetes Mellitus/diagnosis, Type 2 Diabetes Mellitus/epidemiology, Humans, Lactates, Metabolomics, Phenotype, Prospective Studies
ABSTRACT
Human population isolates provide a snapshot of the impact of historical demographic processes on population genetics. Such data facilitate studies of the functional impact of rare sequence variants on biomedical phenotypes, as strong genetic drift can result in higher frequencies of variants that are otherwise rare. We present the first whole genome sequencing (WGS) study of the VIKING cohort, a representative collection of samples from the isolated Shetland population in northern Scotland, and explore how its genetic characteristics compare to a mainland Scottish population. Our analyses reveal the strong contributions played by the founder effect and genetic drift in shaping genomic variation in the VIKING cohort. About one tenth of all high-quality variants discovered are unique to the VIKING cohort or are seen at frequencies at least tenfold higher than in more cosmopolitan control populations. Multiple lines of evidence also suggest relaxation of purifying selection during the evolutionary history of the Shetland isolate. We demonstrate enrichment of ultra-rare VIKING variants in exonic regions and, for the first time, we also show that ultra-rare variants are enriched within regulatory regions, particularly promoters, suggesting that gene expression patterns may diverge relatively rapidly in human isolates.
Subject(s)
Demography, Genetic Variation/genetics, Population Genetics, Nucleic Acid Regulatory Sequences/genetics, 5' Untranslated Regions/genetics, Alleles, Chromatin/genetics, Europe, Exons/genetics, Founder Effect, Genetic Drift, Genome-Wide Association Study, Genomics, Humans, Phenotype, Single Nucleotide Polymorphism/genetics, Genetic Promoter Regions/genetics, Scotland, Whole Genome Sequencing
ABSTRACT
Glycosylation is a common post-translational modification of proteins. Glycosylation is associated with a number of human diseases. Defining genetic factors altering glycosylation may provide a basis for novel approaches to diagnostic and pharmaceutical applications. Here we report a genome-wide association study of the human blood plasma N-glycome composition in up to 3811 people measured by Ultra Performance Liquid Chromatography (UPLC) technology. Starting with the 36 original traits measured by UPLC, we computed an additional 77 derived traits leading to a total of 113 glycan traits. We studied associations between these traits and genetic polymorphisms located on human autosomes. We discovered and replicated 12 loci. This allowed us to demonstrate an overlap in genetic control between total plasma protein and IgG glycosylation. The majority of revealed loci contained genes that encode enzymes directly involved in glycosylation (FUT3/FUT6, FUT8, B3GAT1, ST6GAL1, B4GALT1, ST3GAL4, MGAT3 and MGAT5) and a known regulator of plasma protein fucosylation (HNF1A). However, we also found loci that could possibly reflect other, more complex aspects of the glycosylation process. Functional genomic annotation suggested a role for several genes including DERL3, CHCHD10, TMEM121, IGH and IKZF1. The hypotheses we generated may serve as a starting point for further functional studies in this research area.
Subject(s)
Fucosyltransferases/genetics, Glycosyltransferases/genetics, Polysaccharides/blood, High Pressure Liquid Chromatography, Cohort Studies, Fucosyltransferases/blood, Fucosyltransferases/chemistry, Genome-Wide Association Study, Glucuronosyltransferase/blood, Glucuronosyltransferase/chemistry, Glycosylation, Hepatocyte Nuclear Factor 1-alpha/blood, Hepatocyte Nuclear Factor 1-alpha/chemistry, Humans, Immunoglobulin G/metabolism, Membrane Proteins/metabolism, Genetic Polymorphism, Quantitative Trait Loci
ABSTRACT
BACKGROUND: With the exponential growth in available biomedical data, there is a need for data integration methods that can extract information about relationships between the data sets. However, these data sets might have very different characteristics. For interpretable results, data-specific variation needs to be quantified. For this task, Two-way Orthogonal Partial Least Squares (O2PLS) has been proposed. To facilitate application and development of the methodology, free and open-source software is required. However, no such software has been available for O2PLS. RESULTS: We introduce OmicsPLS, an open-source implementation of the O2PLS method in R. It can handle both low- and high-dimensional datasets efficiently. Generic methods for inspecting and visualizing results are implemented. Both a standard and a faster alternative cross-validation method are available to determine the number of components. A simulation study shows good performance of OmicsPLS compared to alternatives, in terms of accuracy and CPU runtime. We demonstrate OmicsPLS by integrating genetic and glycomic data. CONCLUSIONS: We propose the OmicsPLS R package: a free and open-source implementation of O2PLS for statistical data integration. OmicsPLS is available at https://cran.r-project.org/package=OmicsPLS and can be installed in R via install.packages("OmicsPLS").
Subject(s)
Genomics/methods, Metabolomics/methods, Humans, Least-Squares Analysis, Software
ABSTRACT
BACKGROUND: Glycosylation is one of the most common post-translational modifications with large influences on protein structure and function. The effector function of immunoglobulin G (IgG) alternates between pro- and anti-inflammatory, based on its glycosylation. IgG glycan synthesis is highly complex and dynamic. METHODS: With the use of two different analytical methods for assessing IgG glycosylation, we aim to elucidate the link between DNA methylation and glycosylation of IgG by means of epigenome-wide association studies. In total, 3000 individuals from 4 cohorts were analyzed. RESULTS: The overlap of the results from the two glycan measurement panels identified DNA methylation at 7 CpG sites in 5 genomic locations as associated with IgG glycosylation: cg25189904 (chr.1, GNG12); cg05951221, cg21566642 and cg01940273 (chr.2, ALPPL2); cg05575921 (chr.5, AHRR); cg06126421 (6p21.33); and cg03636183 (chr.19, F2RL3). Mediation analyses with respect to smoking revealed that the effect of smoking on IgG glycosylation may be at least partially mediated via DNA methylation levels at these 7 CpG sites. CONCLUSION: Our results suggest the presence of an indirect link between DNA methylation and IgG glycosylation that may in part capture environmental exposures. GENERAL SIGNIFICANCE: An epigenome-wide analysis conducted in four population-based cohorts revealed an association between DNA methylation and IgG glycosylation patterns. Presumably, DNA methylation mediates the effect of smoking on IgG glycosylation.
Subject(s)
DNA Methylation, Immunoglobulin G/chemistry, Post-Translational Protein Processing, Smoking/adverse effects, Chromosome Mapping, Cohort Studies, CpG Islands, Epigenomics/methods, Europe, Glycosylation, Humans, Immunoglobulin G/metabolism, Multicenter Studies as Topic, Polysaccharides/analysis, Twin Studies as Topic
ABSTRACT
BACKGROUND: Type 2 diabetes results from interplay between genetic and acquired factors. Glycans on proteins reflect genetic, metabolic and environmental factors. However, associations of IgG glycans with type 2 diabetes have not been described. We compared IgG N-glycan patterns in type 2 diabetes with healthy subjects. METHODS: In the DiaGene study, a population-based case-control study (1886 cases and 854 controls), 58 IgG glycan traits were analyzed. Findings were replicated in the population-based CROATIA-Korcula-CROATIA-Vis-ORCADES studies (162 cases and 3162 controls), and meta-analyzed. AUCs of ROC curves were calculated using 10-fold cross-validation for clinical characteristics, IgG glycans and their combination. RESULTS: After correction for extensive clinical covariates, 5 IgG glycans and 13 derived traits were significantly associated with type 2 diabetes in meta-analysis (after Bonferroni correction). Adding IgG glycans to age and sex increased the AUC from 0.542 to 0.734. Adding them to the extensive model did not substantially improve the AUC. The AUC for IgG glycans alone was 0.729. CONCLUSIONS: Several IgG glycans and traits are firmly associated with type 2 diabetes, reflecting a pro-inflammatory and biologically aged state. IgG glycans showed limited improvement of AUCs. However, IgG glycans showed good prediction alone, indicating they may capture information of combined covariates. The associations found may yield insights into type 2 diabetes pathophysiology. GENERAL SIGNIFICANCE: This work shows that IgG glycomic changes have biomarker potential and may yield important insights into pathophysiology of complex public health diseases, illustrated here for the first time in type 2 diabetes.
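For readers unfamiliar with the AUC figures quoted above, the Python sketch below computes the area under the ROC curve as the probability that a randomly chosen case scores higher than a randomly chosen control (ties count as one half), which is equivalent to the trapezoidal area under the empirical ROC curve. It also shows one simple way to split samples into 10 folds. The scores, the function names and the round-robin fold assignment are illustrative assumptions; the abstract does not describe the prediction model or the fold construction.

```python
def auc(case_scores, control_scores):
    """AUC as the fraction of case/control pairs ranked correctly (ties = 0.5)."""
    if not case_scores or not control_scores:
        raise ValueError("both groups need at least one score")
    concordant = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                concordant += 1.0
            elif case == control:
                concordant += 0.5
    return concordant / (len(case_scores) * len(control_scores))

def fold_indices(n_samples, n_folds=10):
    """Assign sample indices to folds round-robin (no shuffling)."""
    return [list(range(fold, n_samples, n_folds)) for fold in range(n_folds)]

# Hypothetical predicted risks (NOT study data).
toy_cases = [0.9, 0.7, 0.6, 0.4]
toy_controls = [0.8, 0.5, 0.3, 0.2]
toy_auc = auc(toy_cases, toy_controls)
print(f"AUC = {toy_auc:.2f}")

folds = fold_indices(25)
print([len(fold) for fold in folds])
```

In a cross-validated analysis, the scores passed to `auc` would come from models fitted without the held-out fold, so that the estimate is not inflated by overfitting.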
Subject(s)
Diabetes Mellitus, Type 2/etiology , Immunoglobulin G/metabolism , Aged , Area Under Curve , Female , Galactose/metabolism , Glycosylation , Humans , Male , Middle Aged , N-Acetylneuraminic Acid/metabolism
ABSTRACT
The biological and clinical relevance of glycosylation is becoming increasingly recognized, leading to a growing interest in large-scale clinical and population-based studies. In the past few years, several methods for high-throughput analysis of glycans have been developed, but thorough validation and standardization of these methods are required before significant resources are invested in large-scale studies. In this study, we compared liquid chromatography, capillary gel electrophoresis, and two MS methods for quantitative profiling of N-glycosylation of IgG in the same data set of 1201 individuals. To evaluate the accuracy of the four methods, we then analyzed associations with genetic polymorphisms and age. Chromatographic methods with either fluorescent or MS detection yielded slightly stronger associations than MS-only and multiplexed capillary gel electrophoresis, but at the expense of lower throughput. Advantages and disadvantages of each method were identified, which should inform the selection of the most appropriate method in future studies.
Subject(s)
High-Throughput Screening Assays/methods , Immunoglobulin G/genetics , Mass Spectrometry/methods , Polysaccharides/genetics , Adult , Chromatography, Liquid , Electrophoresis, Capillary , Glycosylation , Humans , Hydrophobic and Hydrophilic Interactions , Polymorphism, Genetic , Polysaccharides/isolation & purification
ABSTRACT
For breast and ovarian cancer risk assessment in the isolated populations of the Northern Isles of Orkney and Shetland (in Scotland, UK) and their diasporas, quantifying genetically drifted BRCA1 and BRCA2 pathogenic variants is important. Two actionable variants in these genes have reached much higher frequencies than in cosmopolitan UK populations. Here, we report a BRCA2 splice acceptor variant, c.517-2A>G, found in breast and ovarian cancer families from Shetland. We investigated the frequency and origin of this variant in a population-based research cohort of people of Shetland ancestry, VIKING I. The variant segregates with female breast and ovarian cancer in diagnosed cases and is classified as pathogenic. Exome sequence data from 2108 VIKING I participants with three or more Shetlandic grandparents was used to estimate the population prevalence of c.517-2A>G in Shetlanders. Nine VIKING I research volunteers carry this variant, on a shared haplotype (carrier frequency 0.4%). This frequency is ~130-fold higher than in UK Biobank, where the small group of carriers has a different haplotype. Records of birth, marriage and death indicate genealogical linkage of VIKING I carriers to a founder from the Isle of Whalsay, Shetland, similar to our observations for the BRCA1 founder variant c.5207T>C from Westray, Orkney. In total, 93.5% of pathogenic BRCA variant carriers in Northern Isles exomes are accounted for by these two drifted variants. We thus provide the scientific evidence of an opportunity for screening people of Orcadian and Shetlandic origins for each drifted pathogenic variant, particularly women with Westray or Whalsay ancestry.
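The carrier frequency reported above follows directly from the counts given in the abstract. A minimal sketch of that arithmetic is shown here; the variable names are illustrative, and the implied reference frequency is only back-calculated from the approximate 130-fold figure, so it is an assumption and not a value stated in the abstract.

```python
# Counts taken from the abstract above.
carriers = 9
participants = 2108

# Carrier frequency among VIKING I participants with exome data.
carrier_frequency = carriers / participants
print(f"Carrier frequency: {carrier_frequency:.1%}")

# Assumption: the "~130-fold" ratio is approximate, so the reference
# frequency derived from it is a rough estimate only.
fold_enrichment = 130
implied_reference_frequency = carrier_frequency / fold_enrichment
print(f"Implied reference frequency: about 1 in "
      f"{round(1 / implied_reference_frequency):,}")
```

The same division applies to any founder variant for which carrier and cohort counts are available.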
ABSTRACT
Biological age captures physiological deterioration better than chronological age and is amenable to interventions. Blood-based biomarkers have been identified as suitable candidates for biological age estimation. This study aims to improve biological age estimation using machine learning models and a feature set of 60 circulating biomarkers available from the UK Biobank (n = 306,116). We implement an Elastic-Net derived Cox model with 25 selected biomarkers to predict mortality risk (C-Index = 0.778; 95% CI [0.767-0.788]), which outperforms the well-known blood-biomarker-based PhenoAge model (C-Index = 0.750; 95% CI [0.739-0.761]), providing a C-Index lift of 0.028, representing an 11% relative increase in predictive value. Importantly, we then show that using common clinical assay panels with few biomarkers, alongside imputation and the model derived on the full set of biomarkers, does not substantially degrade predictive accuracy from the theoretical maximum achievable for the available biomarkers. Biological age is estimated as the equivalent age within the same-sex population that corresponds to an individual's mortality risk. Values ranged between 20 years younger and 20 years older than individuals' chronological age, exposing the magnitude of ageing signals contained in blood markers. Thus, we demonstrate a practical and cost-efficient method of estimating an improved measure of Biological Age, available to the general population.
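The abstract does not say how the 11% relative increase is derived from the C-Index lift of 0.028. One reading that reproduces the reported figure is sketched below, assuming the lift is measured against the PhenoAge model's margin over a chance-level C-Index of 0.5; that baseline is an assumption, not something stated in the abstract.

```python
# C-Index values reported in the abstract above.
c_index_new = 0.778       # Elastic-Net derived Cox model
c_index_phenoage = 0.750  # PhenoAge model

# Absolute lift between the two models.
lift = c_index_new - c_index_phenoage

# Assumption: the relative increase is taken over the margin above 0.5,
# the C-Index expected from random ranking. The abstract does not state
# which baseline was used.
CHANCE_C_INDEX = 0.5
relative_increase = lift / (c_index_phenoage - CHANCE_C_INDEX)

print(f"Lift: {lift:.3f}")
print(f"Relative increase over chance: {relative_increase:.0%}")
```

Dividing the lift by the raw PhenoAge C-Index instead would give roughly 4%, which is why the chance-adjusted reading appears to be the more plausible one.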