ABSTRACT
An outbreak of acute hepatitis of unknown aetiology in children was reported in Scotland1 in April 2022 and has now been identified in 35 countries2. Several recent studies have suggested an association of human adenovirus, a virus not commonly associated with hepatitis, with this outbreak. Here we report a detailed case-control investigation and find an association between adeno-associated virus 2 (AAV2) infection and host genetics in disease susceptibility. Using next-generation sequencing, PCR with reverse transcription, serology and in situ hybridization, we detected recent infection with AAV2 in plasma and liver samples in 26 out of 32 (81%) cases of hepatitis compared with 5 out of 74 (7%) of samples from unaffected individuals. Furthermore, AAV2 was detected within ballooned hepatocytes alongside a prominent T cell infiltrate in liver biopsy samples. In keeping with a CD4+ T-cell-mediated immune pathology, the human leukocyte antigen (HLA) class II HLA-DRB1*04:01 allele was identified in 25 out of 27 cases (93%) compared with a background frequency of 10 out of 64 (16%; P = 5.49 × 10^-12). In summary, we report an outbreak of acute paediatric hepatitis associated with AAV2 infection (most likely acquired as a co-infection with human adenovirus that is usually required as a 'helper virus' to support AAV2 replication) and disease susceptibility related to HLA class II status.
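The case-control counts above can be turned into effect sizes with a short calculation. The following is a minimal sketch, not the authors' analysis: the function names are illustrative, and Fisher's exact test is an assumed choice, since the abstract does not state which test produced the reported P value, so the result need not match it exactly.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table [[a, b], [c, d]].

    a = cases with the feature, b = cases without,
    c = controls with the feature, d = controls without.
    """
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test (assumed test, see lead-in).

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that x of the col1 "feature present" individuals are cases.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Counts taken from the abstract.
hla_or = odds_ratio(25, 2, 10, 54)                # HLA-DRB1*04:01: 25/27 cases vs 10/64 controls
aav2_or = odds_ratio(26, 6, 5, 69)                # AAV2 detection: 26/32 cases vs 5/74 unaffected
hla_p = fisher_exact_two_sided(25, 2, 10, 54)
print(hla_or, aav2_or, hla_p)
```

Both odds ratios are large, in line with the strong associations the abstract reports. With counts this small, an exact test is a safer default than a chi-squared approximation.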
Subject(s)
Adenovirus Infections, Human , Dependovirus , Hepatitis , Child , Humans , Acute Disease/epidemiology , Adenovirus Infections, Human/epidemiology , Adenovirus Infections, Human/genetics , Adenovirus Infections, Human/virology , Alleles , Case-Control Studies , CD4-Positive T-Lymphocytes/immunology , Coinfection/epidemiology , Coinfection/virology , Dependovirus/isolation & purification , Genetic Predisposition to Disease , Helper Viruses/isolation & purification , Hepatitis/epidemiology , Hepatitis/genetics , Hepatitis/virology , Hepatocytes/virology , HLA-DRB1 Chains/genetics , HLA-DRB1 Chains/immunology , Liver/virology
ABSTRACT
Critical illness in COVID-19 is an extreme and clinically homogeneous disease phenotype that we have previously shown1 to be highly efficient for discovery of genetic associations2. Despite the advanced stage of illness at presentation, we have shown that host genetics in patients who are critically ill with COVID-19 can identify immunomodulatory therapies with strong beneficial effects in this group3. Here we analyse 24,202 cases of COVID-19 with critical illness comprising a combination of microarray genotype and whole-genome sequencing data from cases of critical illness in the international GenOMICC (11,440 cases) study, combined with other studies recruiting hospitalized patients with a strong focus on severe and critical disease: ISARIC4C (676 cases) and the SCOURGE consortium (5,934 cases). To put these results in the context of existing work, we conduct a meta-analysis of the new GenOMICC genome-wide association study (GWAS) results with previously published data. We find 49 genome-wide significant associations, of which 16 have not been reported previously. To investigate the therapeutic implications of these findings, we infer the structural consequences of protein-coding variants, and combine our GWAS results with gene expression data using a monocyte transcriptome-wide association study (TWAS) model, as well as gene and protein expression using Mendelian randomization. We identify potentially druggable targets in multiple systems, including inflammatory signalling (JAK1), monocyte-macrophage activation and endothelial permeability (PDE4A), immunometabolism (SLC2A5 and AK5), and host factors required for viral entry and replication (TMPRSS2 and RAB2A).
Subject(s)
COVID-19 , Critical Illness , Genetic Predisposition to Disease , Genetic Variation , Genome-Wide Association Study , Humans , COVID-19/genetics , Genetic Predisposition to Disease/genetics , Genetic Variation/genetics , Genotype , Genotyping Techniques , Monocytes/metabolism , Phenotype , rab GTP-Binding Proteins/genetics , Transcriptome , Whole Genome Sequencing
ABSTRACT
Critical COVID-19 is caused by immune-mediated inflammatory lung injury. Host genetic variation influences the development of illness requiring critical care1 or hospitalization2-4 after infection with SARS-CoV-2. The GenOMICC (Genetics of Mortality in Critical Care) study enables the comparison of genomes from individuals who are critically ill with those of population controls to find underlying disease mechanisms. Here we use whole-genome sequencing in 7,491 critically ill individuals compared with 48,400 controls to discover and replicate 23 independent variants that significantly predispose to critical COVID-19. We identify 16 new independent associations, including variants within genes that are involved in interferon signalling (IL10RB and PLSCR1), leucocyte differentiation (BCL11A) and blood-type antigen secretor status (FUT2). Using transcriptome-wide association and colocalization to infer the effect of gene expression on disease severity, we find evidence that implicates multiple genes, including reduced expression of a membrane flippase (ATP11A) and increased expression of a mucin (MUC1), in critical disease. Mendelian randomization provides evidence in support of causal roles for myeloid cell adhesion molecules (SELE, ICAM5 and CD209) and the coagulation factor F8, all of which are potentially druggable targets. Our results are broadly consistent with a multi-component model of COVID-19 pathophysiology, in which at least two distinct mechanisms can predispose to life-threatening disease: failure to control viral replication; or an enhanced tendency towards pulmonary inflammation and intravascular coagulation. We show that comparison between cases of critical illness and population controls is highly efficient for the detection of therapeutically relevant mechanisms of disease.
Subject(s)
COVID-19 , Critical Illness , Genome, Human , Host-Pathogen Interactions , Whole Genome Sequencing , ATP-Binding Cassette Transporters , COVID-19/genetics , COVID-19/mortality , COVID-19/pathology , COVID-19/virology , Cell Adhesion Molecules , Critical Care , Critical Illness/mortality , E-Selectin , Factor VIII , Fucosyltransferases , Genome, Human/genetics , Genome-Wide Association Study , Host-Pathogen Interactions/genetics , Humans , Interleukin-10 Receptor beta Subunit , Lectins, C-Type , Mucin-1 , Nerve Tissue Proteins , Phospholipid Transfer Proteins , Receptors, Cell Surface , Repressor Proteins , SARS-CoV-2/pathogenicity , Galactoside 2-alpha-L-fucosyltransferase
ABSTRACT
Host-mediated lung inflammation is present1, and drives mortality2, in the critical illness caused by coronavirus disease 2019 (COVID-19). Host genetic variants associated with critical illness may identify mechanistic targets for therapeutic development3. Here we report the results of the GenOMICC (Genetics Of Mortality In Critical Care) genome-wide association study in 2,244 critically ill patients with COVID-19 from 208 UK intensive care units. We have identified and replicated the following new genome-wide significant associations: on chromosome 12q24.13 (rs10735079, P = 1.65 × 10^-8) in a gene cluster that encodes antiviral restriction enzyme activators (OAS1, OAS2 and OAS3); on chromosome 19p13.2 (rs74956615, P = 2.3 × 10^-8) near the gene that encodes tyrosine kinase 2 (TYK2); on chromosome 19p13.3 (rs2109069, P = 3.98 × 10^-12) within the gene that encodes dipeptidyl peptidase 9 (DPP9); and on chromosome 21q22.1 (rs2236757, P = 4.99 × 10^-8) in the interferon receptor gene IFNAR2. We identified potential targets for repurposing of licensed medications: using Mendelian randomization, we found evidence that low expression of IFNAR2, or high expression of TYK2, are associated with life-threatening disease; and transcriptome-wide association in lung tissue revealed that high expression of the monocyte-macrophage chemotactic receptor CCR2 is associated with severe COVID-19. Our results identify robust genetic signals relating to key host antiviral defence mechanisms and mediators of inflammatory organ damage in COVID-19. Both mechanisms may be amenable to targeted treatment with existing drugs. However, large-scale randomized clinical trials will be essential before any change to clinical practice.
Subject(s)
COVID-19/genetics , COVID-19/physiopathology , Critical Illness , 2',5'-Oligoadenylate Synthetase/genetics , COVID-19/pathology , Chromosomes, Human, Pair 12/genetics , Chromosomes, Human, Pair 19/genetics , Chromosomes, Human, Pair 21/genetics , Critical Care , Dipeptidyl-Peptidases and Tripeptidyl-Peptidases/genetics , Drug Repositioning , Female , Genome-Wide Association Study , Humans , Inflammation/genetics , Inflammation/pathology , Inflammation/physiopathology , Lung/pathology , Lung/physiopathology , Lung/virology , Male , Multigene Family/genetics , Receptor, Interferon alpha-beta/genetics , Receptors, CCR2/genetics , TYK2 Kinase/genetics , United Kingdom
ABSTRACT
BACKGROUND: SARS-CoV-2, the causal agent of COVID-19, enters human cells using the ACE2 (angiotensin-converting enzyme 2) protein as a receptor. ACE2 is thus key to the infection and treatment of the coronavirus. ACE2 is highly expressed in the heart and respiratory and gastrointestinal tracts, playing important regulatory roles in the cardiovascular and other biological systems. However, the genetic basis of the ACE2 protein levels is not well understood. METHODS: We have conducted the largest genome-wide association meta-analysis of plasma ACE2 levels in >28 000 individuals of the SCALLOP Consortium (Systematic and Combined Analysis of Olink Proteins). We summarize the cross-sectional epidemiological correlates of circulating ACE2. Using the summary statistics-based high-definition likelihood method, we estimate relevant genetic correlations with cardiometabolic phenotypes, COVID-19, and other human complex traits and diseases. We perform causal inference of soluble ACE2 on vascular disease outcomes and COVID-19 severity using mendelian randomization. We also perform in silico functional analysis by integrating with other types of omics data. RESULTS: We identified 10 loci, including 8 novel, capturing 30% of the heritability of the protein. We detected that plasma ACE2 was genetically correlated with vascular diseases, severe COVID-19, and a wide range of human complex diseases and medications. An X-chromosome cis-protein quantitative trait loci-based mendelian randomization analysis suggested a causal effect of elevated ACE2 levels on COVID-19 severity (odds ratio, 1.63 [95% CI, 1.10-2.42]; P=0.01), hospitalization (odds ratio, 1.52 [95% CI, 1.05-2.21]; P=0.03), and infection (odds ratio, 1.60 [95% CI, 1.08-2.37]; P=0.02). Tissue- and cell type-specific transcriptomic and epigenomic analysis revealed that the ACE2 regulatory variants were enriched for DNA methylation sites in blood immune cells. 
CONCLUSIONS: Human plasma ACE2 shares a genetic basis with cardiovascular disease, COVID-19, and other related diseases. The genetic architecture of the ACE2 protein is mapped, providing a useful resource for further biological and clinical studies on this coronavirus receptor.
Subject(s)
Angiotensin-Converting Enzyme 2 , COVID-19 , Angiotensin-Converting Enzyme 2/genetics , COVID-19/genetics , Cross-Sectional Studies , Genome-Wide Association Study , Humans , Receptors, Coronavirus , SARS-CoV-2
ABSTRACT
By uniformly analyzing 723 RNA-seq data from 91 tissues and cell types, we built a comprehensive gene atlas and studied tissue specificity of genes in cattle. We demonstrated that tissue-specific genes significantly reflected the tissue-relevant biology, showing distinct promoter methylation and evolution patterns (e.g., brain-specific genes evolve slowest, whereas testis-specific genes evolve fastest). Through integrative analyses of those tissue-specific genes with large-scale genome-wide association studies, we detected relevant tissues/cell types and candidate genes for 45 economically important traits in cattle, including blood/immune system (e.g., CCDC88C) for male fertility, brain (e.g., TRIM46 and RAB6A) for milk production, and multiple growth-related tissues (e.g., FGF6 and CCND2) for body conformation. We validated these findings by using epigenomic data across major somatic tissues and sperm. Collectively, our findings provided novel insights into the genetic and biological mechanisms underlying complex traits in cattle, and our transcriptome atlas can serve as a primary source for biological interpretation, functional validation, studies of adaptive evolution, and genomic improvement in livestock.
Subject(s)
Cattle/genetics , Transcriptome , Animals , Cattle/growth & development , Cattle/physiology , DNA Methylation , Female , Genes , Milk , Organ Specificity , RNA-Seq , Reproduction
ABSTRACT
To efficiently transform genetic associations into drug targets requires evidence that a particular gene, and its encoded protein, contribute causally to a disease. To achieve this, we employ a three-step proteome-by-phenome Mendelian Randomization (MR) approach. In step one, 154 protein quantitative trait loci (pQTLs) were identified and independently replicated. From these pQTLs, 64 replicated locally-acting variants were used as instrumental variables for proteome-by-phenome MR across 846 traits (step two). When its assumptions are met, proteome-by-phenome MR is equivalent to simultaneously running many randomized controlled trials. Step two yielded 38 proteins that significantly predicted variation in traits and diseases in 509 instances. Step three revealed that amongst the 271 instances from GeneAtlas (UK Biobank), 77 showed little evidence of pleiotropy (HEIDI), and 92 showed evidence of colocalization (eCAVIAR). Results were wide ranging, including, for example, new evidence for a causal role of tyrosine-protein phosphatase non-receptor type substrate 1 (SHPS1; SIRPA) in schizophrenia, and a new finding that intestinal fatty acid binding protein (FABP2) abundance contributes to the pathogenesis of cardiovascular disease. We also demonstrated confirmatory evidence for the causal role of four further proteins (FGF5, IL6R, LPL, LTA) in cardiovascular disease risk.
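When a single locally-acting variant serves as the instrument, an MR estimate is commonly obtained as a Wald ratio. The sketch below illustrates that general technique under stated assumptions: the abstract does not say which estimator was used, the function name is illustrative, and the input effect sizes are made-up numbers rather than results from the study.

```python
from math import erfc, sqrt


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-instrument Mendelian randomization estimate (Wald ratio).

    beta_exposure: variant effect on protein level (the pQTL effect)
    beta_outcome:  variant effect on the trait or disease
    se_outcome:    standard error of beta_outcome

    The standard error uses the first-order approximation that treats the
    variant-exposure effect as known without error.
    """
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)
    z = estimate / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided normal P value
    return estimate, se, p_value


# Hypothetical inputs for illustration only.
estimate, se, p_value = wald_ratio(beta_exposure=0.2, beta_outcome=0.1, se_outcome=0.02)
print(estimate, se, p_value)
```

The ratio is interpretable as the change in the outcome per unit change in protein level only if the usual instrumental-variable assumptions hold, which is why the abstract follows up with pleiotropy and colocalization checks.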
Subject(s)
Cardiovascular Diseases/genetics , Mendelian Randomization Analysis , Proteome/genetics , Schizophrenia/genetics , Antigens, Differentiation/genetics , Cardiovascular Diseases/pathology , Fatty Acid-Binding Proteins/genetics , Female , Fibroblast Growth Factor 5/genetics , Genetic Association Studies/methods , Humans , Lipoprotein Lipase/genetics , Lymphotoxin-alpha/genetics , Male , Quantitative Trait Loci , Receptors, Immunologic/genetics , Receptors, Interleukin-6/genetics , Schizophrenia/pathology
ABSTRACT
DNA methylation (DNAm) measured in lymphoblastoid cell lines has been repeatedly demonstrated to differ between various human populations. Due to the role that DNAm plays in controlling gene expression, these differences could significantly contribute to ethnic phenotypic differences. However, because previous studies have compared distinct ethnic groups where genetic and environmental context are confounded, their relative contribution to phenotypic differences between ethnicities remains unclear. Using DNAm assayed in whole blood and colorectal tissue of 132 admixed individuals from Colombia, we identified sites where differential DNAm levels were associated with the local ancestral genetic context. Our results are consistent with population-specific DNAm being primarily driven by between-population genetic differences in cis, with little environmental contribution, and with consistent effects across tissues. The findings offer new insights into a possible mechanism driving phenotypic differences among different ethnic groups, and could help explain ethnic differences in colorectal cancer incidence.
Subject(s)
Colorectal Neoplasms/genetics , DNA Methylation/genetics , Epigenomics , Genetics, Population , Colombia/epidemiology , Colorectal Neoplasms/epidemiology , CpG Islands/genetics , Female , Genotype , Hispanic or Latino , Humans , Male
ABSTRACT
Phenotypic correlations among partners for traits such as longevity or late-onset disease have been found to be comparable to phenotypic correlations in first-degree relatives. How these correlations arise in late life is poorly understood. Here we introduce a novel paradigm to establish the presence of indirect assortment on factors correlated across generations, by examining correlations between parents of couples, i.e., in-laws. Using correlations in additive genetic values we further corroborate the presence of indirect assortment on heritable factors. Specifically, using couples from the UK Biobank cohort, we show that longevity and disease history of the parents of White British couples are correlated, with correlations of up to 0.09. The correlations in parental longevity are replicated in the FamiLinx cohort, a larger and geographically more diverse historical ancestry dataset spanning a broader time frame. These correlations in parental longevity significantly (pval < 0.0093 for all pairs of parents) exceed what would be expected due to variations in lifespan based on year and location of birth. For cardiovascular diseases, in particular hypertension, we find significant correlations (r = 0.028, pval = 0.005) in genetic values among partners, supporting a model where partners assort for risk factors to some extent genetically correlated with cardiovascular disease. Partitioning the relative importance of indirect assortative mating and shared common environment will require large, well-characterized longitudinal cohorts aimed at understanding phenotypic correlations among non-blood relatives. Identifying the factors that mediate indirect assortment on longevity and human disease risk will help to unravel factors affecting human disease and ultimately longevity.
Subject(s)
Longevity/genetics , Reproduction/genetics , Environment , Female , Humans , Male , Phenotype , White People/genetics
ABSTRACT
DNA methylation (DNAm) has been linked to changes in chromatin structure, gene expression and disease. The DNAm level can be affected by genetic variation; although, how this differs by CpG dinucleotide density and genic location of the DNAm site is not well understood. Moreover, the effect of disease causing variants on the DNAm level in a tissue relevant to disease has yet to be fully elucidated. To this end, we investigated the phenotypic profiles, genetic effects and regional genomic heritability for 196080 DNAm sites in healthy colorectum tissue from 132 unrelated Colombian individuals. DNAm sites in regions of low-CpG density were more variable, on average more methylated and were more likely to be significantly heritable when compared with DNAm sites in regions of high-CpG density. DNAm sites located in intergenic regions had a higher mean DNAm level and were more likely to be heritable when compared with DNAm sites in the transcription start site (TSS) of a gene expressed in colon tissue. Within CpG-dense regions, the propensity of the DNAm level to be heritable was lower in the TSS of genes expressed in colon tissue than in the TSS of genes not expressed in colon tissue. In addition, regional genetic variation was associated with variation in local DNAm level no more frequently for DNAm sites within colorectal cancer risk regions than it was for DNAm sites outside such regions. Overall, DNAm sites located in different genomic contexts exhibited distinguishable profiles and may have a different biological function.
Subject(s)
Colon/metabolism , DNA Methylation/genetics , Epigenesis, Genetic , Rectum/metabolism , Colonic Polyps/genetics , Colonic Polyps/metabolism , CpG Islands/genetics , Female , Gene Expression Regulation , Genome, Human , Genomics , Humans , Male , Oligonucleotide Array Sequence Analysis , Promoter Regions, Genetic
ABSTRACT
Despite extensive global research into genetic predisposition for severe COVID-19, knowledge on the role of rare host genetic variants and their relation to other risk factors remains limited. Here, 52 genes with prior etiological evidence were sequenced in 1,772 severe COVID-19 cases and 5,347 population-based controls from Spain/Italy. Rare deleterious TLR7 variants were present in 2.4% of young (<60 years) cases with no reported clinical risk factors (n = 378), compared to 0.24% of controls (odds ratio [OR] = 12.3, p = 1.27 × 10^-10). Incorporation of the results of either functional assays or protein modeling led to a pronounced increase in effect size (ORmax = 46.5, p = 1.74 × 10^-15). Association signals for the X-chromosomal gene TLR7 were also detected in the female-only subgroup, suggesting the existence of additional mechanisms beyond X-linked recessive inheritance in males. Additionally, supporting evidence was generated for a contribution to severe COVID-19 of the previously implicated genes IFNAR2, IFIH1, and TBK1. Our results refine the genetic contribution of rare TLR7 variants to severe COVID-19 and strengthen evidence for the etiological relevance of genes in the interferon signaling pathway.
Subject(s)
COVID-19 , Genetic Predisposition to Disease , SARS-CoV-2 , Toll-Like Receptor 7 , Humans , Toll-Like Receptor 7/genetics , COVID-19/genetics , COVID-19/epidemiology , Male , Female , Middle Aged , SARS-CoV-2/genetics , Adult , Spain/epidemiology , Case-Control Studies , Italy/epidemiology , Aged , Severity of Illness Index , Genetic Variation/genetics
ABSTRACT
The Farm Animal Genotype-Tissue Expression (FarmGTEx) project has been established to develop a public resource of genetic regulatory variants in livestock, which is essential for linking genetic polymorphisms to variation in phenotypes, helping fundamental biological discovery and exploitation in animal breeding and human biomedicine. Here we show results from the pilot phase of PigGTEx by processing 5,457 RNA-sequencing and 1,602 whole-genome sequencing samples passing quality control from pigs. We build a pig genotype imputation panel and associate millions of genetic variants with five types of transcriptomic phenotypes in 34 tissues. We evaluate tissue specificity of regulatory effects and elucidate molecular mechanisms of their action using multi-omics data. Leveraging this resource, we decipher regulatory mechanisms underlying 207 pig complex phenotypes and demonstrate the similarity of pigs to humans in gene expression and the genetic regulation behind complex phenotypes, supporting the importance of pigs as a human biomedical model.
Subject(s)
Gene Expression Profiling , Gene Expression Regulation , Swine/genetics , Animals , Humans , Genotype , Phenotype , Sequence Analysis, RNA
ABSTRACT
There is increasing evidence that the complexity of the retinal vasculature measured as fractal dimension, Df, might offer earlier insights into the progression of coronary artery disease (CAD) before traditional biomarkers can be detected. This association could be partly explained by a common genetic basis; however, the genetic component of Df is poorly understood. We present a genome-wide association study (GWAS) of 38,000 individuals with white British ancestry from the UK Biobank, designed to study the genetic component of Df comprehensively and to analyse its relationship with CAD. We replicated 5 Df loci and found 4 additional loci with suggestive significance (P < 1e-05) to contribute to Df variation, which previously were reported in retinal tortuosity and complexity, hypertension, and CAD studies. Significant negative genetic correlation estimates support the inverse relationship between Df and CAD, and between Df and myocardial infarction (MI), one of CAD's fatal outcomes. Fine-mapping of Df loci revealed Notch signalling regulatory variants supporting a shared mechanism with MI outcomes. We developed a predictive model for MI incident cases, recorded over a 10-year period following clinical and ophthalmic evaluation, combining clinical information, Df, and a CAD polygenic risk score (PRS). Internal cross-validation demonstrated a considerable improvement in the area under the curve (AUC) of our predictive model (AUC = 0.770 ± 0.001) compared with an established risk model, SCORE (AUC = 0.741 ± 0.002), and extensions thereof leveraging the PRS (AUC = 0.728 ± 0.001). This indicates that Df provides risk information beyond demographic, lifestyle, and genetic risk factors. Our findings shed new light on the genetic basis of Df, unveiling a common control with MI, and highlighting the benefits of its application in individualised MI risk prediction.
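The AUC values quoted above summarise how well a risk score ranks people who go on to have an MI above those who do not. As a minimal sketch of the metric itself (not the authors' pipeline, and using made-up scores), the AUC can be computed as the fraction of case-control pairs in which the case receives the higher score, with ties counted as half.

```python
def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen case scores higher than a
    randomly chosen control (ties count as 0.5).
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical predicted risks, for illustration only.
cases = [0.9, 0.8, 0.4]
controls = [0.7, 0.3, 0.2]
print(auc(cases, controls))  # 8 of 9 pairs are ranked correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a difference such as 0.770 versus 0.741 reflects a modest but consistent gain in ranking ability. The quadratic loop is fine for a toy example; a rank-based formulation would be preferable for cohort-sized data.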
Subject(s)
Coronary Artery Disease , Myocardial Infarction , Humans , Genome-Wide Association Study , Genetic Predisposition to Disease , Myocardial Infarction/genetics , Coronary Artery Disease/genetics , Risk Factors
ABSTRACT
BACKGROUND: The mammalian thalamus relays sensory information from the periphery to the cerebral cortex for cognitive processing via the thalamocortical tract. The thalamocortical tract forms during embryonic development controlled by mechanisms that are not fully understood. β-catenin is a nuclear and cytosolic protein that transduces signals from secreted signaling molecules to regulate both cell motility via the cytoskeleton and gene expression in the nucleus. In this study we tested whether β-catenin is likely to play a role in thalamocortical connectivity by examining its expression and activity in developing thalamic neurons and their axons. RESULTS: At embryonic day (E)15.5, the time when thalamocortical axonal projections are forming, we found that the thalamus is a site of particularly high β-catenin mRNA and protein expression. As well as being expressed at high levels in thalamic cell bodies, β-catenin protein is enriched in thalamic axons and their growth cones, and its growth cone concentration is sensitive to Netrin-1. Using mice carrying the β-catenin reporter BAT-gal we find high levels of reporter activity in the thalamus. Further, Netrin-1 induces BAT-gal reporter expression and upregulates levels of endogenous transcripts encoding β-actin and L1 proteins in cultured thalamic cells. We found that β-catenin mRNA is enriched in thalamic axons and its 3'UTR is phylogenetically conserved and is able to direct heterologous mRNAs along the thalamic axon, where they can be translated. CONCLUSION: We provide evidence that β-catenin protein is likely to be an important player in thalamocortical development. It is abundant both in the nucleus and in the growth cones of post-mitotic thalamic cells during the development of thalamocortical connectivity and β-catenin mRNA is targeted to thalamic axons and growth cones where it could potentially be translated. β-catenin is involved in transducing the Netrin-1 signal to thalamic cells, suggesting a mechanism by which Netrin-1 guides thalamocortical development.
Subject(s)
Axons/metabolism , Cerebral Cortex/metabolism , Neurons/metabolism , Thalamus/metabolism , beta Catenin/metabolism , Animals , Cells, Cultured , Cerebral Cortex/embryology , Gene Expression Regulation, Developmental , Growth Cones/metabolism , Mice , Nerve Growth Factors/genetics , Nerve Growth Factors/metabolism , Netrin-1 , Neural Pathways/embryology , Neural Pathways/metabolism , Thalamus/embryology , Tumor Suppressor Proteins/genetics , Tumor Suppressor Proteins/metabolism , beta Catenin/genetics
ABSTRACT
Background and Objectives: Based on previous case reports and disease-based cohorts, a minority of patients with cerebral small vessel disease (cSVD) have a monogenic cause, with many also manifesting extracerebral phenotypes. We investigated the frequency, penetrance, and phenotype associations of putative pathogenic variants in cSVD genes in the UK Biobank (UKB), a large population-based study. Methods: We used a systematic review of previous literature and ClinVar to identify putative pathogenic rare variants in CTSA, TREX1, HTRA1, and COL4A1/2. We mapped phenotypes previously attributed to these variants (phenotypes-of-interest) to disease coding systems used in the UKB's linked health data from UK hospital admissions, death records, and primary care. Among 199,313 exome-sequenced UKB participants, we assessed the following: the proportion of participants carrying ≥1 variant(s); phenotype-of-interest penetrance; and the association between variant carrier status and phenotypes-of-interest using a binary (any phenotype present/absent) and phenotype burden (linear score of the number of phenotypes a participant possessed) approach. Results: Among UKB participants, 0.5% had ≥1 variant(s) in studied genes. Using hospital admission and death records, 4%-20% of variant carriers per gene had an associated phenotype. This increased to 7%-55% when including primary care records. Only COL4A1 variant carrier status was significantly associated with having ≥1 phenotype-of-interest and a higher phenotype score (OR = 1.29, p = 0.006). Discussion: While putative pathogenic rare variants in monogenic cSVD genes occur in 1:200 people in the UKB population, only approximately half of variant carriers have a relevant disease phenotype recorded in their linked health data. We could not replicate most previously reported gene-phenotype associations, suggesting lower penetrance rates, overestimated pathogenicity, and/or limited statistical power.
ABSTRACT
BACKGROUND: Cross-species comparison of transcriptomes is important for elucidating evolutionary molecular mechanisms underpinning phenotypic variation between and within species, yet to date it has been essentially limited to model organisms with relatively small sample sizes. RESULTS: Here, we systematically analyze and compare 10,830 and 4866 publicly available RNA-seq samples in humans and cattle, respectively, representing 20 common tissues. Focusing on 17,315 orthologous genes, we demonstrate that mean/median gene expression, inter-individual variation of expression, expression quantitative trait loci, and gene co-expression networks are generally conserved between humans and cattle. By examining large-scale genome-wide association studies for 46 human traits (average n = 327,973) and 45 cattle traits (average n = 24,635), we reveal that the heritability of complex traits in both species is significantly more enriched in transcriptionally conserved than diverged genes across tissues. CONCLUSIONS: In summary, our study provides a comprehensive comparison of transcriptomes between humans and cattle, which might help decipher the genetic and evolutionary basis of complex traits in both species.
Subject(s)
Genome-Wide Association Study , Transcriptome , Animals , Cattle/genetics , Humans , Multifactorial Inheritance , Phenotype , Quantitative Trait Loci
ABSTRACT
Characterization of genetic regulatory variants acting on livestock gene expression is essential for interpreting the molecular mechanisms underlying traits of economic value and for increasing the rate of genetic gain through artificial selection. Here we build a Cattle Genotype-Tissue Expression atlas (CattleGTEx) as part of the pilot phase of the Farm animal GTEx (FarmGTEx) project for the research community based on 7,180 publicly available RNA-sequencing (RNA-seq) samples. We describe the transcriptomic landscape of more than 100 tissues/cell types and report hundreds of thousands of genetic associations with gene expression and alternative splicing for 23 distinct tissues. We evaluate the tissue-sharing patterns of these genetic regulatory effects, and functionally annotate them using multiomics data. Finally, we link gene expression in different tissues to 43 economically important traits using both transcriptome-wide association and colocalization analyses to decipher the molecular regulatory mechanisms underpinning such agronomic traits in cattle.
Subject(s)
Quantitative Trait Loci , Transcriptome , Animals , Cattle/genetics , Gene Expression Regulation , Phenotype , Quantitative Trait Loci/genetics , Sequence Analysis, RNA , Transcriptome/genetics
ABSTRACT
Indirect genetic effects, the effects of the genotype of one individual on the phenotype of other individuals, are environmental factors associated with human disease and complex trait variation that could help to expand our understanding of the environment linked to complex traits. Here, we study indirect genetic effects in 80,889 human couples of European ancestry for 105 complex traits. Using a linear mixed model approach, we estimate partner indirect heritability and find evidence of partner heritability on ~50% of the analysed traits. Follow-up analysis suggests that in at least ~25% of these traits, the partner heritability is consistent with the existence of indirect genetic effects including a wide variety of traits such as dietary traits, mental health and disease. This shows that the environment linked to complex traits is partially explained by the genotype of other individuals and motivates the need to find new ways of studying the environment.
Subject(s)
Gene-Environment Interaction , Genotype , Health Status , Inheritance Patterns , Life Style , Phenotype , Adult , Female , Humans , Male , Sex Factors , Spouses , White People
ABSTRACT
Males and females present differences in complex traits and in the risk of a wide array of diseases. Genotype by sex (GxS) interactions are thought to account for some of these differences. However, the extent and basis of GxS are poorly understood. In the present study, we provide insights into both the scope and the mechanism of GxS across the genome of about 450,000 individuals of European ancestry and 530 complex traits in the UK Biobank. We found small yet widespread differences in genetic architecture across traits. We also found that, in some cases, sex-agnostic analyses may be missing trait-associated loci and looked into possible improvements in the prediction of high-level phenotypes. Finally, we studied the potential functional role of the differences observed through sex-biased gene expression and gene-level analyses. Our results suggest the need to consider sex-aware analyses for future studies to shed light onto possible sex-specific molecular mechanisms.