ABSTRACT
OBJECTIVE: Primary sclerosing cholangitis (PSC) is characterised by bile duct strictures and progressive liver disease, eventually requiring liver transplantation. Although the pathogenesis of PSC remains incompletely understood, strong associations with HLA-class II haplotypes have been described. As specific HLA-DP molecules can bind the activating NK-cell receptor NKp44, we investigated the role of HLA-DP/NKp44-interactions in PSC. DESIGN: Liver tissue, intrahepatic and peripheral blood lymphocytes of individuals with PSC and control individuals were characterised using flow cytometry, immunohistochemical and immunofluorescence analyses. HLA-DPA1 and HLA-DPB1 imputation and association analyses were performed in 3408 individuals with PSC and 34 213 controls. NK cell activation upon NKp44/HLA-DP interactions was assessed in vitro using plate-bound HLA-DP molecules and HLA-DPB wildtype versus knock-out human cholangiocyte organoids. RESULTS: NKp44+NK cells were enriched in livers, and intrahepatic bile ducts of individuals with PSC showed higher expression of HLA-DP. HLA-DP haplotype analysis revealed a highly elevated PSC risk for HLA-DPA1*02:01~B1*01:01 (OR 1.99, p=6.7×10⁻⁵⁰). Primary NKp44+NK cells exhibited significantly higher degranulation in response to plate-bound HLA-DPA1*02:01-DPB1*01:01 compared with control HLA-DP molecules, which was inhibited by anti-NKp44 blocking. Human cholangiocyte organoids expressing HLA-DPA1*02:01-DPB1*01:01 after IFN-γ-exposure demonstrated significantly increased binding to NKp44-Fc constructs compared with unstimulated controls. Importantly, HLA-DPA1*02:01-DPB1*01:01-expressing organoids increased degranulation of NKp44+NK cells compared with HLA-DPB1-KO organoids. CONCLUSION: Our studies identify a novel PSC risk haplotype HLA-DPA1*02:01~DPB1*01:01 and provide clinical and functional data implicating NKp44+NK cells that recognise HLA-DPA1*02:01-DPB1*01:01 expressed on cholangiocytes in PSC pathogenesis.
Subject(s)
Cholangitis, Sclerosing , Humans , Haplotypes , Cholangitis, Sclerosing/genetics , HLA-DP alpha-Chains/genetics , Killer Cells, Natural
ABSTRACT
Given the highly variable clinical phenotype of Coronavirus disease 2019 (COVID-19), a deeper analysis of the host genetic contribution to severe COVID-19 is important to improve our understanding of underlying disease mechanisms. Here, we describe an extended genome-wide association meta-analysis of a well-characterized cohort of 3255 COVID-19 patients with respiratory failure and 12 488 population controls from Italy, Spain, Norway and Germany/Austria, including stratified analyses based on age, sex and disease severity, as well as targeted analyses of chromosome Y haplotypes, the human leukocyte antigen region and the SARS-CoV-2 peptidome. By inversion imputation, we traced a reported association at 17q21.31 to a ~0.9-Mb inversion polymorphism that creates two highly differentiated haplotypes and characterized the potential effects of the inversion in detail. Our data, together with the 5th release of summary statistics from the COVID-19 Host Genetics Initiative including non-Caucasian individuals, also identified a new locus at 19q13.33, including NAPSA, a gene which is expressed primarily in alveolar cells responsible for gas exchange in the lung.
Subject(s)
COVID-19 , Humans , COVID-19/genetics , SARS-CoV-2/genetics , Genome-Wide Association Study , Haplotypes , Polymorphism, Genetic
ABSTRACT
BACKGROUND & AIMS: Ulcerative colitis (UC) is characterized by severe inflammation and destruction of the intestinal epithelium, and is associated with specific risk single nucleotide polymorphisms in HLA class II. Given the recently discovered interactions between subsets of HLA-DP molecules and the activating natural killer (NK) cell receptor NKp44, genetic associations of UC and HLA-DP haplotypes and their functional implications were investigated. METHODS: HLA-DP haplotype and UC risk association analyses were performed (UC: n = 13,927; control: n = 26,764). Expression levels of HLA-DP on intestinal epithelial cells (IECs) in individuals with and without UC were quantified. Human intestinal 3-dimensional (3D) organoid cocultures with human NK cells were used to determine functional consequences of interactions between HLA-DP and NKp44. RESULTS: These studies identified HLA-DPA1∗01:03-DPB1∗04:01 (HLA-DP401) as a risk haplotype and HLA-DPA1∗01:03-DPB1∗03:01 (HLA-DP301) as a protective haplotype for UC in European populations. HLA-DP expression was significantly higher on IECs of individuals with UC compared with controls. IECs in human intestinal 3D organoids derived from HLA-DP401pos individuals showed significantly stronger binding of NKp44 compared with HLA-DP301pos IECs. HLA-DP401pos IECs in organoids triggered increased degranulation and tumor necrosis factor production by NKp44+ NK cells in cocultures, resulting in enhanced epithelial cell death compared with HLA-DP301pos organoids. Blocking of HLA-DP401-NKp44 interactions (anti-NKp44) abrogated NK cell activity in cocultures. CONCLUSIONS: We identified an UC risk HLA-DP haplotype that engages NKp44 and activates NKp44+ NK cells, mediating damage to intestinal epithelial cells in an HLA-DP haplotype-dependent manner. The molecular interaction between NKp44 and HLA-DP401 in UC can be targeted by therapeutic interventions to reduce NKp44+ NK cell-mediated destruction of the intestinal epithelium in UC.
Subject(s)
Colitis, Ulcerative , HLA-DP Antigens , Humans , HLA-DP Antigens/genetics , Colitis, Ulcerative/genetics , Killer Cells, Natural , Haplotypes , Epithelial Cells
ABSTRACT
Patients with cystic fibrosis (CF) exhibit pronounced respiratory damage and were initially considered among those at highest risk for serious harm from SARS-CoV-2 infection. Numerous clinical studies have subsequently reported that individuals with CF in North America and Europe, while susceptible to severe COVID-19, are often spared from the highest levels of virus-associated mortality. To understand features that might influence COVID-19 among patients with cystic fibrosis, we studied relationships between SARS-CoV-2 and the gene responsible for CF (i.e., the cystic fibrosis transmembrane conductance regulator, CFTR). In contrast to previous reports, we found no association between CFTR carrier status (mutation heterozygosity) and more severe COVID-19 clinical outcomes. We did observe an unexpected trend toward higher mortality among control individuals compared with silent carriers of the common F508del CFTR variant, a finding that will require further study. We next performed experiments to test the influence of homozygous CFTR deficiency on viral propagation and showed that SARS-CoV-2 production in primary airway cells was not altered by the absence of functional CFTR using two independent protocols. On the contrary, experiments performed in vitro strongly indicated that virus proliferation depended on features of the mucosal fluid layer known to be disrupted by absent CFTR in patients with CF, including both low pH and increased viscosity. These results point to the acidic, viscous, and mucus-obstructed airways in patients with cystic fibrosis as unfavorable for the establishment of coronaviral infection. Our findings provide new and important information concerning relationships between the CF clinical phenotype and severity of COVID-19.
Subject(s)
COVID-19 , Cystic Fibrosis , Humans , Cystic Fibrosis/complications , Cystic Fibrosis/genetics , Cystic Fibrosis Transmembrane Conductance Regulator/genetics , Mutation , Patient Acuity , SARS-CoV-2
ABSTRACT
Autoimmune neurological syndromes (AINS) with autoantibodies against the 65 kDa isoform of the glutamic acid decarboxylase (GAD65) present with limbic encephalitis, including temporal lobe seizures or epilepsy, cerebellitis with ataxia, and stiff-person-syndrome or overlap forms. Anti-GAD65 autoantibodies are also detected in autoimmune diabetes mellitus, which has a strong genetic susceptibility conferred by human leukocyte antigen (HLA) and non-HLA genomic regions. We investigated the genetic predisposition in patients with anti-GAD65 AINS. We performed a genome-wide association study (GWAS) and an association analysis of the HLA region in a large German cohort of 1214 individuals. These included 167 patients with anti-GAD65 AINS, recruited by the German Network for Research on Autoimmune Encephalitis (GENERATE), and 1047 individuals without neurological or endocrine disease as population-based controls. Predictions of protein expression changes based on GWAS findings were further explored and validated in the CSF proteome of a virtually independent cohort of 10 patients with GAD65-AINS and 10 controls. Our GWAS identified 16 genome-wide significant (P < 5 × 10⁻⁸) loci for the susceptibility to anti-GAD65 AINS. The top variant, rs2535288 [P = 4.42 × 10⁻¹⁶, odds ratio (OR) = 0.26, 95% confidence interval (CI) = 0.187-0.358], localized to an intergenic segment in the middle of the HLA class I region. The great majority of variants in these loci (>90%) mapped to non-coding regions of the genome. Over 40% of the variants have known regulatory functions on the expression of 48 genes in disease relevant cells and tissues, mainly CD4+ T cells and the cerebral cortex. The annotation of epigenomic marks suggested specificity for neural and immune cells. A network analysis of the implicated protein-coding genes highlighted the role of protein kinase C beta (PRKCB) and identified an enrichment of numerous biological pathways participating in immunity and neural function.
Analysis of the classical HLA alleles and haplotypes showed no genome-wide significant associations. The strongest associations were found for the DQA1*03:01-DQB1*03:02-DRB1*04:01 HLA haplotype (P = 4.39 × 10⁻⁴, OR = 2.5, 95% CI = 1.499-4.157) and the DRB1*04:01 allele (P = 8.3 × 10⁻⁵, OR = 2.4, 95% CI = 1.548-3.682) identified in our cohort. As predicted, the CSF proteome of patients with anti-GAD65 AINS showed differential levels of five proteins (HLA-A/B, C4A, ATG4D and NEO1) encoded by expression quantitative trait loci genes from our GWAS. These findings suggest a strong genetic predisposition with direct functional implications for immunity and neural function in anti-GAD65 AINS, mainly conferred by genomic regions outside the classical HLA alleles.
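The odds ratios and confidence intervals quoted above can be related to one another with a short calculation. The following is a minimal sketch, assuming the reported intervals are Wald intervals that are symmetric on the log-odds scale (the abstract does not state how they were derived); the function name `wald_from_ci` is hypothetical and chosen for illustration only.

```python
import math

def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover the log-OR, its standard error, the Wald z statistic and a
    two-sided p-value from an odds ratio and its 95% confidence interval.

    Assumption: the interval is a Wald interval, i.e. symmetric on the
    log-odds scale. This is an illustration, not the authors' method.
    """
    log_or = math.log(odds_ratio)
    # Width of the interval on the log scale equals 2 * z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return log_or, se, z, p

# Values quoted in the abstract for the DRB1*04:01 allele.
log_or, se, z, p = wald_from_ci(2.4, 1.548, 3.682)
print(f"log-OR={log_or:.3f}, SE={se:.3f}, z={z:.2f}, p={p:.1e}")
```

Under these assumptions the recovered p-value should be of the same order of magnitude as the one reported, though it need not match exactly, since the published analysis may have used a different test or covariate adjustment.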
Subject(s)
Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Genetic Predisposition to Disease/genetics , Proteome/genetics , Histocompatibility Antigens Class II , HLA Antigens , Haplotypes , Alleles , Autoantibodies , HLA-DRB1 Chains/genetics
ABSTRACT
Inflammatory bowel disease (IBD) is a chronic inflammatory disease of the gut. Genetic association studies have identified the highly variable human leukocyte antigen (HLA) region as the strongest susceptibility locus for IBD and specifically DRB1*01:03 as a determining factor for ulcerative colitis (UC). However, for most of the association signal such a delineation could not be made because of tight structures of linkage disequilibrium within the HLA. The aim of this study was therefore to further characterize the HLA signal using a transethnic approach. We performed a comprehensive fine mapping of single HLA alleles in UC in a cohort of 9272 individuals with African American, East Asian, Puerto Rican, Indian and Iranian descent and 40 691 previously analyzed Caucasians, additionally analyzing whole HLA haplotypes. We computationally characterized the binding of associated HLA alleles to human self-peptides and analyzed the physicochemical properties of the HLA proteins and predicted self-peptidomes. Highlighting alleles of the HLA-DRB1*15 group and their correlated HLA-DQ-DR haplotypes, we not only identified consistent associations (regarding effect directions/magnitudes) across different ethnicities but also identified population-specific signals (regarding differences in allele frequencies). We observed that DRB1*01:03 is mostly present in individuals of Western European descent and hardly present in non-Caucasian individuals. We found peptides predicted to bind to risk HLA alleles to be rich in positively charged amino acids. We conclude that the HLA plays an important role for UC susceptibility across different ethnicities. This research further implicates specific features of peptides that are predicted to bind risk and protective HLA proteins.
Subject(s)
Colitis, Ulcerative/genetics , Ethnicity/genetics , Genetic Predisposition to Disease , HLA Antigens/genetics , HLA-DQ Antigens/genetics , HLA-DRB1 Chains/genetics , Peptides/genetics , Alleles , Cohort Studies , Gene Frequency , Genetic Association Studies , Genotype , Haplotypes , Humans , Linkage Disequilibrium , Polymorphism, Single Nucleotide , Protein Binding
ABSTRACT
BACKGROUND: There is considerable variation in disease behavior among patients infected with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus that causes coronavirus disease 2019 (Covid-19). Genomewide association analysis may allow for the identification of potential genetic factors involved in the development of Covid-19. METHODS: We conducted a genomewide association study involving 1980 patients with Covid-19 and severe disease (defined as respiratory failure) at seven hospitals in the Italian and Spanish epicenters of the SARS-CoV-2 pandemic in Europe. After quality control and the exclusion of population outliers, 835 patients and 1255 control participants from Italy and 775 patients and 950 control participants from Spain were included in the final analysis. In total, we analyzed 8,582,968 single-nucleotide polymorphisms and conducted a meta-analysis of the two case-control panels. RESULTS: We detected cross-replicating associations with rs11385942 at locus 3p21.31 and with rs657152 at locus 9q34.2, which were significant at the genomewide level (P<5×10⁻⁸) in the meta-analysis of the two case-control panels (odds ratio, 1.77; 95% confidence interval [CI], 1.48 to 2.11; P = 1.15×10⁻¹⁰; and odds ratio, 1.32; 95% CI, 1.20 to 1.47; P = 4.95×10⁻⁸, respectively). At locus 3p21.31, the association signal spanned the genes SLC6A20, LZTFL1, CCR9, FYCO1, CXCR6 and XCR1. The association signal at locus 9q34.2 coincided with the ABO blood group locus; in this cohort, a blood-group-specific analysis showed a higher risk in blood group A than in other blood groups (odds ratio, 1.45; 95% CI, 1.20 to 1.75; P = 1.48×10⁻⁴) and a protective effect in blood group O as compared with other blood groups (odds ratio, 0.65; 95% CI, 0.53 to 0.79; P = 1.06×10⁻⁵).
CONCLUSIONS: We identified a 3p21.31 gene cluster as a genetic susceptibility locus in patients with Covid-19 with respiratory failure and confirmed a potential involvement of the ABO blood-group system. (Funded by Stein Erik Hagen and others.).
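Combining two case-control panels into one estimate, as described above, is commonly done by inverse-variance weighting. The sketch below assumes a fixed-effect model; the abstract does not name the meta-analysis method, and the per-panel numbers are invented placeholders rather than results from the study. The function name `fixed_effect_meta` is hypothetical.

```python
import math

def fixed_effect_meta(estimates):
    """Inverse-variance fixed-effect meta-analysis.

    estimates: list of (log_odds_ratio, standard_error) tuples, one per panel.
    Returns the pooled log-OR and its standard error.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se

# Hypothetical per-panel estimates (NOT taken from the study).
panels = [(math.log(1.70), 0.13), (math.log(1.85), 0.13)]
pooled, pooled_se = fixed_effect_meta(panels)

# Back-transform to an odds ratio with a 95% Wald interval.
z_crit = 1.959964
pooled_or = math.exp(pooled)
ci = (math.exp(pooled - z_crit * pooled_se), math.exp(pooled + z_crit * pooled_se))
print(f"pooled OR={pooled_or:.2f}, 95% CI {ci[0]:.2f} to {ci[1]:.2f}")
```

Panels with smaller standard errors receive larger weights, so the pooled estimate leans toward the more precise panel; with equal standard errors, as in this placeholder input, it is simply the mean of the two log odds ratios.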
Subject(s)
ABO Blood-Group System/genetics , Betacoronavirus , Chromosomes, Human, Pair 3/genetics , Coronavirus Infections/genetics , Genetic Predisposition to Disease , Pneumonia, Viral/genetics , Polymorphism, Single Nucleotide , Respiratory Insufficiency/genetics , Aged , COVID-19 , Case-Control Studies , Chromosomes, Human, Pair 9/genetics , Coronavirus Infections/complications , Female , Genetic Loci , Genome-Wide Association Study , Humans , Italy , Male , Middle Aged , Multigene Family , Pandemics , Pneumonia, Viral/complications , Respiratory Insufficiency/etiology , SARS-CoV-2 , Spain
ABSTRACT
Coronavirus disease 2019 (COVID-19) caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) may lead to life-threatening respiratory symptoms. Understanding the genetic basis of the prognosis of COVID-19 is important for risk profiling of potentially severe symptoms. Here, we conducted a genome-wide epistasis study of COVID-19 severity in 2243 patients with severe symptoms and 12,612 patients with no or mild symptoms from the UK Biobank, followed by a replication study in an independent Spanish cohort (1416 cases, 4382 controls). Our study highlighted 3 interactions with genome-wide significance in the discovery phase, nominal significance in the replication phase, and enhanced significance in the meta-analysis. For example, the lead interaction was found between rs9792388 upstream of PDGFRL and rs3025892 downstream of SNAP25, where the composite genotype of rs3025892 CT and rs9792388 CA/AA showed a higher risk of severe disease than any other genotype (P = 2.77 × 10⁻¹², proportion of severe cases = 0.24 ~ 0.29 vs. 0.09 ~ 0.18, genotypic OR = 1.96 ~ 2.70). This interaction was replicated in the Spanish cohort (P = 0.002, proportion of severe cases = 0.30 ~ 0.36 vs. 0.14 ~ 0.25, genotypic OR = 1.45 ~ 2.37) and showed enhanced significance in the meta-analysis (P = 4.97 × 10⁻¹⁴). Notably, these interactions indicated a possible molecular mechanism by which SARS-CoV-2 affects the nervous system. This first exhaustive genome-wide screening for epistasis improved our understanding of the genetic basis underlying the severity of COVID-19.
Subject(s)
COVID-19 , Humans , COVID-19/genetics , SARS-CoV-2/genetics , Epistasis, Genetic , Genotype
ABSTRACT
OBJECTIVE: One of the current hypotheses to explain the proinflammatory immune response in IBD is a dysregulated T cell reaction to yet unknown intestinal antigens. As such, it may be possible to identify disease-associated T cell clonotypes by analysing the peripheral and intestinal T-cell receptor (TCR) repertoire of patients with IBD and controls. DESIGN: We performed bulk TCR repertoire profiling of both the TCR alpha and beta chains using high-throughput sequencing in peripheral blood samples of a total of 244 patients with IBD and healthy controls as well as from matched blood and intestinal tissue of 59 patients with IBD and disease controls. We further characterised specific T cell clonotypes via single-cell RNAseq. RESULTS: We identified a group of clonotypes, characterised by semi-invariant TCR alpha chains, to be significantly enriched in the blood of patients with Crohn's disease (CD) and particularly expanded in the CD8+ T cell population. Single-cell RNAseq data showed an innate-like phenotype of these cells, with a comparable gene expression to unconventional T cells such as mucosal associated invariant T and natural killer T (NKT) cells, but with distinct TCRs. CONCLUSIONS: We identified and characterised a subpopulation of unconventional Crohn-associated invariant T (CAIT) cells. Multiple lines of evidence suggest that these cells are part of the NKT type II population. The potential implications of this population for CD or a subset thereof remain to be elucidated, and the immunophenotype and antigen reactivity of CAIT cells need further investigation in future studies.
Subject(s)
Crohn Disease , Natural Killer T-Cells , CD8-Positive T-Lymphocytes , Crohn Disease/genetics , Humans , Receptors, Antigen, T-Cell/metabolism , Receptors, Antigen, T-Cell, alpha-beta/genetics
ABSTRACT
Estimated glomerular filtration rate (eGFR) reflects kidney function. Progressive eGFR-decline can lead to kidney failure, necessitating dialysis or transplantation. Hundreds of loci from genome-wide association studies (GWAS) for eGFR help explain population cross section variability. Since the contribution of these or other loci to eGFR-decline remains largely unknown, we derived GWAS for annual eGFR-decline and meta-analyzed 62 longitudinal studies with eGFR assessed twice over time in all 343,339 individuals and in high-risk groups. We also explored different covariate adjustments. Twelve genome-wide significant independent variants for eGFR-decline unadjusted or adjusted for eGFR-baseline (11 novel, one known for this phenotype) were identified, including nine variants robustly associated across models. All loci for eGFR-decline were known for cross-sectional eGFR and thus distinguished a subgroup of eGFR loci. Seven of the nine variants showed variant-by-age interaction on eGFR cross section (further about 350,000 individuals), which linked genetic associations for eGFR-decline with age-dependency of genetic cross-section associations. Clinically important were two to four-fold greater genetic effects on eGFR-decline in high-risk subgroups. Five variants associated also with chronic kidney disease progression mapped to genes with functional in-silico evidence (UMOD, SPATA7, GALNTL5, TPPP). An unfavorable versus favorable nine-variant genetic profile showed increased odds ratios of 1.35 for kidney failure (95% confidence interval 1.03-1.77) and 1.27 for acute kidney injury (95% confidence interval 1.08-1.50), in over 2000 cases each with matched controls. Thus, we provide a large data resource, genetic loci, and prioritized genes for kidney function decline, which help inform drug development pipelines and reveal important insights into the age-dependency of kidney function genetics.
Subject(s)
N-Acetylgalactosaminyltransferases , Renal Insufficiency, Chronic , Renal Insufficiency , Cross-Sectional Studies , Genetic Loci , Genome-Wide Association Study , Glomerular Filtration Rate/genetics , Humans , Kidney , Longitudinal Studies , N-Acetylgalactosaminyltransferases/genetics , Renal Insufficiency/genetics
ABSTRACT
OBJECTIVE: Haemorrhoidal disease (HEM) affects a large and silently suffering fraction of the population but its aetiology, including suspected genetic predisposition, is poorly understood. We report the first genome-wide association study (GWAS) meta-analysis to identify genetic risk factors for HEM to date. DESIGN: We conducted a GWAS meta-analysis of 218 920 patients with HEM and 725 213 controls of European ancestry. Using GWAS summary statistics, we performed multiple genetic correlation analyses between HEM and other traits as well as calculated HEM polygenic risk scores (PRS) and evaluated their translational potential in independent datasets. Using functional annotation of GWAS results, we identified HEM candidate genes, whose differential expression and coexpression in HEM tissues were evaluated employing RNA-seq analyses. The localisation of expressed proteins at selected loci was investigated by immunohistochemistry. RESULTS: We demonstrate modest heritability and genetic correlation of HEM with several other diseases from the GI, neuroaffective and cardiovascular domains. HEM PRS validated in 180 435 individuals from independent datasets allowed the identification of those at risk and correlated with younger age of onset and recurrent surgery. We identified 102 independent HEM risk loci harbouring genes whose expression is enriched in blood vessels and GI tissues, and in pathways associated with smooth muscles, epithelial and endothelial development and morphogenesis. Network transcriptomic analyses highlighted HEM gene coexpression modules that are relevant to the development and integrity of the musculoskeletal and epidermal systems, and the organisation of the extracellular matrix. CONCLUSION: HEM has a genetic component that predisposes to smooth muscle, epithelial and connective tissue dysfunction.
ABSTRACT
BACKGROUND: The human leukocyte antigen (HLA) proteins play a fundamental role in the adaptive immune system as they present peptides to T cells. Mass-spectrometry-based immunopeptidomics is a promising and powerful tool for characterizing the immunopeptidomic landscape of HLA proteins, that is the peptides presented on HLA proteins. Despite the growing interest in the technology, and the recent rise of immunopeptidomics-specific identification pipelines, there is still a gap in data-analysis and software tools that are specialized in analyzing and visualizing immunopeptidomics data. RESULTS: We present the IPTK library which is an open-source Python-based library for analyzing, visualizing, comparing, and integrating different omics layers with the identified peptides for an in-depth characterization of the immunopeptidome. Using different datasets, we illustrate the ability of the library to enrich the result of the identified peptidomes. Also, we demonstrate the utility of the library in developing other software and tools by developing an easy-to-use dashboard that can be used for the interactive analysis of the results. CONCLUSION: IPTK provides a modular and extendable framework for analyzing and integrating immunopeptidomes with different omics layers. The library is deployed into PyPI at https://pypi.org/project/IPTKL/ and into Bioconda at https://anaconda.org/bioconda/iptkl , while the source code of the library and the dashboard, along with the online tutorials are available at https://github.com/ikmb/iptoolkit .
Subject(s)
Data Analysis , Software , Histocompatibility Antigens Class I , Humans , Mass Spectrometry , Peptides
ABSTRACT
Rapid decline of glomerular filtration rate estimated from creatinine (eGFRcrea) is associated with severe clinical endpoints. In contrast to cross-sectionally assessed eGFRcrea, the genetic basis for rapid eGFRcrea decline is largely unknown. To help define this, we meta-analyzed 42 genome-wide association studies from the Chronic Kidney Diseases Genetics Consortium and United Kingdom Biobank to identify genetic loci for rapid eGFRcrea decline. Two definitions of eGFRcrea decline were used: 3 mL/min/1.73 m²/year or more ("Rapid3"; encompassing 34,874 cases, 107,090 controls) and eGFRcrea decline 25% or more and eGFRcrea under 60 mL/min/1.73 m² at follow-up among those with eGFRcrea 60 mL/min/1.73 m² or more at baseline ("CKDi25"; encompassing 19,901 cases, 175,244 controls). Seven independent variants were identified across six loci for Rapid3 and/or CKDi25, consisting of five variants at four loci with genome-wide significance (near UMOD-PDILT (2), PRKAG2, WDR72, OR2S2) and two variants among 265 known eGFRcrea variants (near GATM, LARP4B). All these loci were novel for Rapid3 and/or CKDi25 and our bioinformatic follow-up prioritized variants and genes underneath these loci. The OR2S2 locus is novel for any eGFRcrea trait and includes interesting candidates. For the five genome-wide significant lead variants, we found supporting effects for annual change in blood urea nitrogen or cystatin-based eGFR, but not for GATM or LARP4B. Individuals at high compared to those at low genetic risk (8-14 vs. 0-5 adverse alleles) had a 1.20-fold increased risk of acute kidney injury (95% confidence interval 1.08-1.33). Thus, our identified loci for rapid kidney function decline may help prioritize therapeutic targets and identify mechanisms and individuals at risk for sustained deterioration of kidney function.
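The two case definitions above translate directly into threshold checks. The following is a minimal sketch of how they might be applied to a pair of eGFRcrea measurements; the function names are hypothetical, and how controls were defined in the study is not stated in the abstract, so the code only distinguishes cases, non-cases and ineligible individuals.

```python
from typing import Optional

def annual_decline(egfr_baseline: float, egfr_followup: float, years: float) -> float:
    """Annual eGFRcrea decline in mL/min/1.73 m^2 per year (positive = loss)."""
    if years <= 0:
        raise ValueError("follow-up time must be positive")
    return (egfr_baseline - egfr_followup) / years

def is_rapid3(egfr_baseline: float, egfr_followup: float, years: float) -> bool:
    """'Rapid3': decline of 3 mL/min/1.73 m^2 per year or more."""
    return annual_decline(egfr_baseline, egfr_followup, years) >= 3.0

def is_ckdi25(egfr_baseline: float, egfr_followup: float) -> Optional[bool]:
    """'CKDi25': decline of 25% or more AND follow-up eGFRcrea under 60,
    evaluated only among those with baseline eGFRcrea of 60 or more.
    Returns None when the individual is not eligible (baseline under 60).
    """
    if egfr_baseline < 60:
        return None
    relative_drop = (egfr_baseline - egfr_followup) / egfr_baseline
    return relative_drop >= 0.25 and egfr_followup < 60

# Illustrative values only (not study data).
print(is_rapid3(90, 75, 4))   # 3.75 per year -> True
print(is_ckdi25(80, 55))      # 31% drop, follow-up under 60 -> True
print(is_ckdi25(100, 70))     # 30% drop, but follow-up still 60 or more -> False
```

Returning `None` for ineligible individuals keeps them out of both the case and non-case groups, which mirrors the "among those with eGFRcrea 60 or more at baseline" restriction in the definition.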
Subject(s)
Genome-Wide Association Study , Kidney , AMP-Activated Protein Kinases , Creatinine , Glomerular Filtration Rate/genetics , Humans , Protein Disulfide-Isomerases , United Kingdom
ABSTRACT
Genotype imputation of the human leukocyte antigen (HLA) region is a cost-effective means to infer classical HLA alleles from inexpensive and dense SNP array data. In the research setting, imputation helps avoid costs for wet lab-based HLA typing and thus renders association analyses of the HLA in large cohorts feasible. Yet, most HLA imputation reference panels target Caucasian ethnicities and multi-ethnic panels are scarce. We compiled a high-quality multi-ethnic reference panel based on genotypes measured with Illumina's Immunochip genotyping array and HLA types established using a high-resolution next generation sequencing approach. Our reference panel includes more than 1,300 samples from Germany, Malta, China, India, Iran, Japan and Korea and samples of African American ancestry for all classical HLA class I and II alleles including HLA-DRB3/4/5. Applying extensive cross-validation, we benchmarked the imputation using the HLA imputation tool HIBAG, our multi-ethnic reference and an independent, previously published data set compiled of subpopulations of the 1000 Genomes project. We achieved average imputation accuracies higher than 0.924 for the commonly studied HLA-A, -B, -C, -DQB1 and -DRB1 genes across all ethnicities. We investigated allele-specific imputation challenges in regard to geographic origin of the samples using sensitivity and specificity measurements as well as allele frequencies and identified HLA alleles that are challenging to impute for each of the populations separately. In conclusion, our new multi-ethnic reference data set allows for high resolution HLA imputation of genotypes at all classical HLA class I and II genes including the HLA-DRB3/4/5 loci based on diverse ancestry populations.
Subject(s)
Histocompatibility Antigens Class II/genetics , Histocompatibility Antigens Class I/genetics , Black or African American/ethnology , Black or African American/genetics , Alleles , Asian People , Benchmarking , Cluster Analysis , Ethnicity , Gene Frequency , Genotype , HLA Antigens/genetics , HLA-DRB3 Chains/genetics , HLA-DRB4 Chains/genetics , HLA-DRB5 Chains/genetics , Haplotypes , High-Throughput Nucleotide Sequencing , Humans , Polymorphism, Single Nucleotide , Retrospective Studies , White People/ethnology , White People/genetics
ABSTRACT
Machine learning methods and in particular random forests are promising approaches for prediction based on high dimensional omics data sets. They provide variable importance measures to rank predictors according to their predictive power. If building a prediction model is the main goal of a study, often a minimal set of variables with good prediction performance is selected. However, if the objective is the identification of involved variables to find active networks and pathways, approaches that aim to select all relevant variables should be preferred. We evaluated several variable selection procedures based on simulated data as well as publicly available experimental methylation and gene expression data. Our comparison included the Boruta algorithm, the Vita method, recurrent relative variable importance, a permutation approach and its parametric variant (Altmann) as well as recursive feature elimination (RFE). In our simulation studies, Boruta was the most powerful approach, followed closely by the Vita method. Both approaches demonstrated similar stability in variable selection, while Vita was the most robust approach under a pure null model without any predictor variables related to the outcome. In the analysis of the different experimental data sets, Vita demonstrated slightly better stability in variable selection and was less computationally intensive than Boruta. In conclusion, we recommend the Boruta and Vita approaches for the analysis of high-dimensional data sets. Vita is considerably faster than Boruta and thus more suitable for large data sets, but only Boruta can also be applied in low-dimensional settings.
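The idea behind all-relevant selection with shadow variables, on which Boruta is built, can be illustrated without a random forest. The sketch below is a deliberate simplification: it swaps in the absolute Pearson correlation as the importance score and a fixed hit-count threshold in place of Boruta's random forest importance and statistical test, so it shows the principle rather than the published algorithm. All names (`abs_corr`, `shadow_select`, `min_hits`) are hypothetical.

```python
import random
import statistics

def abs_corr(x, y):
    """Absolute Pearson correlation, used here as a stand-in importance score."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return abs(sxy) / (sxx * syy) ** 0.5

def shadow_select(columns, y, rounds=20, min_hits=18, seed=0):
    """Keep variables whose importance beats the best shadow variable
    in at least `min_hits` of `rounds` repetitions."""
    rng = random.Random(seed)
    importance = [abs_corr(col, y) for col in columns]
    hits = [0] * len(columns)
    for _ in range(rounds):
        # Shadow variables are permuted copies of the real columns: the
        # marginal distribution is kept, any link to the outcome is destroyed.
        shadow_max = 0.0
        for col in columns:
            shadow = col[:]
            rng.shuffle(shadow)
            shadow_max = max(shadow_max, abs_corr(shadow, y))
        for j, imp in enumerate(importance):
            if imp > shadow_max:
                hits[j] += 1
    return [j for j, h in enumerate(hits) if h >= min_hits]

# Simulated data: only the first column is related to the outcome.
data_rng = random.Random(1)
n = 200
x0 = [data_rng.gauss(0, 1) for _ in range(n)]
x1 = [data_rng.gauss(0, 1) for _ in range(n)]
x2 = [data_rng.gauss(0, 1) for _ in range(n)]
y = [a + 0.1 * data_rng.gauss(0, 1) for a in x0]
selected = shadow_select([x0, x1, x2], y)
print(selected)
```

Comparing against the best shadow variable gives a data-driven threshold for "better than chance", which is why such methods aim to retain every relevant variable rather than a minimal predictive subset.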
Subject(s)
Algorithms , Biomarkers, Tumor/genetics , Breast Neoplasms/genetics , Computational Biology/methods , Computer Simulation , DNA Methylation , Gene Expression Profiling/methods , Female , Humans , Machine Learning
ABSTRACT
BACKGROUND: Fifteen percent of atopic dermatitis (AD) liability-scale heritability could be attributed to 31 susceptibility loci identified by using genome-wide association studies, with only 3 of them (IL13, IL-6 receptor [IL6R], and filaggrin [FLG]) resolved to protein-coding variants. OBJECTIVE: We examined whether a significant portion of unexplained AD heritability is further explained by low-frequency and rare variants in the gene-coding sequence. METHODS: We evaluated common, low-frequency, and rare protein-coding variants using exome chip and replication genotype data of 15,574 patients and 377,839 control subjects combined with whole-transcriptome data on lesional, nonlesional, and healthy skin samples of 27 patients and 38 control subjects. RESULTS: An additional 12.56% (SE, 0.74%) of AD heritability is explained by rare protein-coding variation. We identified docking protein 2 (DOK2) and CD200 receptor 1 (CD200R1) as novel genome-wide significant susceptibility genes. Rare coding variants associated with AD are further enriched in 5 genes (IL-4 receptor [IL4R], IL13, Janus kinase 1 [JAK1], JAK2, and tyrosine kinase 2 [TYK2]) of the IL13 pathway, all of which are targets for novel systemic AD therapeutics. Multiomics-based network and RNA sequencing analysis revealed DOK2 as a central hub interacting with, among others, CD200R1, IL6R, and signal transducer and activator of transcription 3 (STAT3). Multitissue gene expression profile analysis for 53 tissue types from the Genotype-Tissue Expression project showed that disease-associated protein-coding variants exert their greatest effect in skin tissues. CONCLUSION: Our discoveries highlight a major role of rare coding variants in AD acting independently of common variants. Further extensive functional studies are required to detect all potential causal variants and to specify the contribution of the novel susceptibility genes DOK2 and CD200R1 to overall disease susceptibility.
Subject(s)
Adaptor Proteins, Signal Transducing/genetics , Dermatitis, Atopic/genetics , Genotype , Orexin Receptors/genetics , Phosphoproteins/genetics , Skin/metabolism , Adult , Cohort Studies , Filaggrin Proteins , Gene Frequency , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Organ Specificity , Polymorphism, Genetic , Risk , Transcriptome
ABSTRACT
BACKGROUND: The aim of our study was the identification of genetic variants associated with postoperative complications after cardiac surgery. METHODS: We conducted a prospective, double-blind, multicenter, randomized trial (RIPHeart). We performed a genome-wide association study (GWAS) in 1170 patients of both genders (871 males, 299 females) from the RIPHeart-Study cohort. Patients undergoing non-emergent cardiac surgery were included. The primary endpoint was a binary composite complication rate covering atrial fibrillation, delirium, non-fatal myocardial infarction, acute renal failure and/or any new stroke until hospital discharge, with a maximum of fourteen days after surgery. RESULTS: A total of 547,644 genotyped markers were available for analysis. Following quality control and adjustment for clinical covariates, one SNP reached genome-wide significance (PHLPP2, rs78064607, p = 3.77 × 10⁻⁸) and 139 SNPs (adjusted for all other outcomes) showed promising association with p < 1 × 10⁻⁵ in the GWAS. CONCLUSIONS: We identified several potential loci, in particular PHLPP2, BBS9, RyR2, DUSP4 and HSPA8, associated with new onset of atrial fibrillation, delirium, myocardial infarction, acute kidney injury and stroke after cardiac surgery. TRIAL REGISTRATION: The study was registered with ClinicalTrials.gov NCT01067703, prospectively registered on 11 Feb 2010.
Subject(s)
Acute Kidney Injury/genetics , Atrial Fibrillation/genetics , Cardiac Surgical Procedures/adverse effects , Delirium/genetics , Myocardial Infarction/genetics , Polymorphism, Single Nucleotide , Stroke/genetics , Acute Kidney Injury/diagnosis , Aged , Atrial Fibrillation/diagnosis , Cytoskeletal Proteins/genetics , Delirium/diagnosis , Dual-Specificity Phosphatases/genetics , Female , Genetic Predisposition to Disease , Genome-Wide Association Study , HSC70 Heat-Shock Proteins/genetics , Humans , Male , Middle Aged , Mitogen-Activated Protein Kinase Phosphatases/genetics , Multicenter Studies as Topic , Myocardial Infarction/diagnosis , Phosphoprotein Phosphatases/genetics , Randomized Controlled Trials as Topic , Risk Factors , Ryanodine Receptor Calcium Release Channel/genetics , Stroke/diagnosis , Treatment Outcome
Subject(s)
Colitis, Ulcerative , Crohn Disease , Humans , T-Lymphocytes , Crohn Disease/drug therapy
ABSTRACT
Coenzyme Q10 (CoQ10) is a lipophilic redox molecule that is present in membranes of almost all cells in human tissues. CoQ10 is, amongst other functions, essential for the respiratory transport chain and is a modulator of inflammatory processes and gene expression. Rare monogenetic CoQ10 deficiencies show noticeable symptoms in tissues (e.g. kidney) and cell types (e.g. neurons) with a high energy demand. To identify common genetic variants influencing serum CoQ10 levels, we performed a fixed-effects meta-analysis in two independent cross-sectional Northern German cohorts comprising 1300 individuals in total. We identified two genome-wide significant susceptibility loci. The best associated single nucleotide polymorphism (SNP) was rs9952641 (P value = 1.31 × 10⁻⁸, β = 0.063, 95% CI [0.041, 0.085]) within the COLEC12 gene on chromosome 18. The SNP rs933585 within the NRXN-1 gene on chromosome 2 also showed genome-wide significance (P value = 3.64 × 10⁻⁸, β = -0.034, 95% CI [-0.046, -0.022]). Both genes have been previously linked to neuronal diseases like Alzheimer's disease, autism and schizophrenia. Among our 'top-10' associated variants, four additional loci with known neuronal connections showed suggestive associations with CoQ10 levels. In summary, this study demonstrates that serum CoQ10 levels are associated with common genetic loci that are linked to neuronal diseases.
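As an illustration of the fixed-effects meta-analysis mentioned in this abstract, the following is a minimal Python sketch of standard inverse-variance pooling across cohorts. It is not the authors' pipeline: the function name `fixed_effect_meta` and the per-cohort effect sizes and standard errors in the usage lines are hypothetical values chosen only to show the arithmetic.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance fixed-effects pooling of per-cohort effect estimates.

    betas: per-cohort effect sizes (e.g. regression coefficients for one SNP)
    ses:   matching standard errors
    Returns (pooled beta, pooled SE, z statistic, two-sided p value).
    """
    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p value under a standard normal reference distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p_value


# Hypothetical per-cohort estimates for a single SNP (not taken from the study).
beta, se, z, p = fixed_effect_meta([0.060, 0.066], [0.016, 0.016])
ci_low, ci_high = beta - 1.96 * se, beta + 1.96 * se
print(f"beta={beta:.3f}, 95% CI [{ci_low:.3f}, {ci_high:.3f}], p={p:.2e}")
```

With equal standard errors the pooled estimate is simply the mean of the two cohort estimates. Cohorts with smaller standard errors would pull the pooled value towards their own estimate.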
Subject(s)
Nerve Degeneration/genetics , Ubiquinone/analogs & derivatives , Adult , Aged , Ataxia/genetics , Ataxia/metabolism , Calcium-Binding Proteins , Cell Adhesion Molecules, Neuronal/genetics , Collectins/genetics , Cross-Sectional Studies , Female , Genetic Loci/genetics , Genetic Predisposition to Disease/genetics , Genetic Variation/genetics , Genome-Wide Association Study , Genotype , Humans , Male , Middle Aged , Mitochondrial Diseases/genetics , Mitochondrial Diseases/metabolism , Muscle Weakness/genetics , Muscle Weakness/metabolism , Nerve Degeneration/etiology , Nerve Tissue Proteins/genetics , Neural Cell Adhesion Molecules , Neurons , Polymorphism, Single Nucleotide/genetics , Receptors, Scavenger/genetics , Ubiquinone/blood , Ubiquinone/deficiency , Ubiquinone/genetics , Ubiquinone/metabolism
ABSTRACT
BACKGROUND & AIMS: The role of tobacco smoke in the etiology of inflammatory bowel disease (IBD) is unclear. We investigated interactions between genes and smoking (gene-smoking interactions) that affect risk for Crohn's disease (CD) and ulcerative colitis (UC) in a case-only study of patients and in mouse models of IBD. METHODS: We used 55 Immunochip-wide datasets that included 19,735 IBD cases (10,856 CD cases and 8879 UC cases) of known smoking status. We performed 3 meta-analyses each for CD, UC, and IBD (CD and UC combined), comparing data for never vs ever smokers, never vs current smokers, and never vs former smokers. We studied the effects of exposure to cigarette smoke in Il10-/- and Nod2-/- mice, as well as in Balb/c mice without disruption of these genes (wild-type mice). Mice were exposed to the smoke of 5 cigarettes per day, 5 days a week, for 8 weeks, in a ventilated smoking chamber, or ambient air (controls). Intestines were collected and analyzed histologically and by reverse transcription polymerase chain reaction. RESULTS: We identified 64 single nucleotide polymorphisms (SNPs) for which the association between the SNP and IBD was modified by smoking behavior (meta-analysis Wald test P < 5.0 × 10⁻⁵; heterogeneity Cochran Q test P > .05). Twenty of these variants were located within the HLA region at 6p21. Analysis of classical HLA alleles (imputed from SNP genotypes) revealed an interaction with smoking. We replicated the interaction of a variant in NOD2 with current smoking in relation to the risk for CD (frameshift variant fs1007insC; rs5743293). We identified 2 variants in the same genomic region (rs2270368 and rs17221417) that interact with smoking in relation to CD risk. Approximately 45% of the SNPs that interact with smoking were in close vicinity (≤1 Mb) to SNPs previously associated with IBD; many were located near or within genes that regulate mucosal barrier function and immune tolerance. Smoking modified the disease risk of some variants in opposite directions for CD vs UC. Exposure of interleukin 10 (Il10)-deficient mice to cigarette smoke accelerated development of colitis and increased expression of interferon gamma in the small intestine compared to wild-type mice exposed to smoke. NOD2-deficient mice exposed to cigarette smoke developed ileitis, characterized by increased expression of interferon gamma, compared to wild-type mice exposed to smoke. CONCLUSIONS: In an analysis of 55 Immunochip-wide datasets, we identified 64 SNPs whose association with risk for IBD is modified by tobacco smoking. Gene-smoking interactions were confirmed in mice with disruption of Il10 and Nod2; variants of these genes have been associated with risk for IBD. Our findings from mice and humans revealed that the effects of smoking on risk for IBD depend on genetic variants.
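The heterogeneity criterion quoted in the results (Q test P > .05) can be illustrated with a short Python sketch of Cochran's Q statistic for per-dataset effect estimates. This is a generic textbook formulation and an assumption about how such a filter might be computed, not the study's code. The helper names `chi2_sf` and `cochran_q` and the example inputs are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j=0}^{df/2-1} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_j (x/2)^(j-1/2) / Gamma(j+1/2)
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(half) / math.gamma(1.5)
    for j in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * term
        term *= half / (j + 0.5)
    return total


def cochran_q(betas, ses):
    """Cochran's Q for heterogeneity of effect estimates across datasets.

    Returns (Q, degrees of freedom, p value); a small p value suggests that
    the per-dataset effects differ more than sampling error alone explains.
    """
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    return q, df, chi2_sf(q, df)


# Hypothetical interaction estimates from three datasets (illustration only).
q, df, p_het = cochran_q([0.21, 0.18, 0.25], [0.05, 0.06, 0.07])
print(f"Q={q:.3f}, df={df}, heterogeneity p={p_het:.3f}")
```

Under a filter like the one described, a SNP would be retained when the pooled Wald test is significant and the heterogeneity p value stays above .05, meaning the datasets broadly agree on the effect.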