Results 1 - 20 of 30
1.
Am J Hum Genet ; 110(11): 1888-1902, 2023 11 02.
Article in English | MEDLINE | ID: mdl-37890495

ABSTRACT

Admixed individuals offer unique opportunities for addressing limited transferability in polygenic scores (PGSs), given the substantial trans-ancestry genetic correlation in many complex traits. However, they are rarely considered in PGS training, given the challenges in representing ancestry-matched linkage-disequilibrium reference panels for admixed individuals. Here we present inclusive PGS (iPGS), which captures ancestry-shared genetic effects by finding the exact solution for penalized regression on individual-level data and is thus naturally applicable to admixed individuals. We validate our approach in a simulation study across 33 configurations with varying heritability, polygenicity, and ancestry composition in the training set. When iPGS is applied to n = 237,055 ancestry-diverse individuals in the UK Biobank, it shows the greatest improvements in Africans by 48.9% on average across 60 quantitative traits and up to 50-fold improvements for some traits (neutrophil count, R² = 0.058) over the baseline model trained on the same number of European individuals. When we allowed iPGS to use n = 284,661 individuals, we observed an average improvement of 60.8% for African, 11.6% for South Asian, 7.3% for non-British White, 4.8% for White British, and 17.8% for the other individuals. We further developed iPGS+refit to jointly model the ancestry-shared and -dependent genetic effects when heterogeneous genetic associations were present. For neutrophil count, for example, iPGS+refit showed the highest predictive performance in the African group (R² = 0.115), which exceeds the best predictive performance for the White British group (R² = 0.090 in the iPGS model), even though only 1.49% of individuals used in the iPGS training are of African ancestry. Our results indicate the power of including diverse individuals for developing more equitable PGS models.


Subject(s)
Multifactorial Inheritance , White People , Humans , Multifactorial Inheritance/genetics , White People/genetics , Phenotype , Black People/genetics , Asian People/genetics , Genome-Wide Association Study/methods
2.
Cell ; 186(20): 4386-4403.e29, 2023 09 28.
Article in English | MEDLINE | ID: mdl-37774678

ABSTRACT

Altered microglial states affect neuroinflammation, neurodegeneration, and disease but remain poorly understood. Here, we report 194,000 single-nucleus microglial transcriptomes and epigenomes across 443 human subjects and diverse Alzheimer's disease (AD) pathological phenotypes. We annotate 12 microglial transcriptional states, including AD-dysregulated homeostatic, inflammatory, and lipid-processing states. We identify 1,542 AD-differentially-expressed genes, including both microglia-state-specific and disease-stage-specific alterations. By integrating epigenomic, transcriptomic, and motif information, we infer upstream regulators of microglial cell states, gene-regulatory networks, enhancer-gene links, and transcription-factor-driven microglial state transitions. We demonstrate that ectopic expression of our predicted homeostatic-state activators induces homeostatic features in human iPSC-derived microglia-like cells, while inhibiting activators of inflammation can block inflammatory progression. Lastly, we pinpoint the expression of AD-risk genes in microglial states and differential expression of AD-risk genes and their regulators during AD progression. Overall, we provide insights underlying microglial states, including state-specific and AD-stage-specific microglial alterations at unprecedented resolution.


Subject(s)
Alzheimer Disease , Microglia , Humans , Alzheimer Disease/genetics , Alzheimer Disease/pathology , Gene Expression Regulation , Inflammation/pathology , Microglia/metabolism , Transcription Factors/metabolism , Transcriptome , Epigenome
3.
Cell Metab ; 34(10): 1578-1593.e6, 2022 10 04.
Article in English | MEDLINE | ID: mdl-36198295

ABSTRACT

Exercise training is critical for the prevention and treatment of obesity, but its underlying mechanisms remain incompletely understood given the challenge of profiling heterogeneous effects across multiple tissues and cell types. Here, we address this challenge and opposing effects of exercise and high-fat diet (HFD)-induced obesity at single-cell resolution in subcutaneous and visceral white adipose tissue and skeletal muscle in mice with diet and exercise training interventions. We identify a prominent role of mesenchymal stem cells (MSCs) in obesity and exercise-induced tissue adaptation. Among the pathways regulated by exercise and HFD in MSCs across the three tissues, extracellular matrix remodeling and circadian rhythm are the most prominent. Inferred cell-cell interactions implicate within- and multi-tissue crosstalk centered around MSCs. Overall, our work reveals the intricacies and diversity of multi-tissue molecular responses to exercise and obesity and uncovers a previously underappreciated role of MSCs in tissue-specific and multi-tissue beneficial effects of exercise.


Subject(s)
Adipose Tissue , Mesenchymal Stem Cells , Adipose Tissue/metabolism , Animals , Diet, High-Fat , Mesenchymal Stem Cells/metabolism , Mice , Mice, Inbred C57BL , Muscle, Skeletal/metabolism , Obesity/metabolism
4.
Ann Appl Stat ; 16(3): 1891-1918, 2022 Sep.
Article in English | MEDLINE | ID: mdl-36091495

ABSTRACT

In high-dimensional regression problems, often a relatively small subset of the features are relevant for predicting the outcome, and methods that impose sparsity on the solution are popular. When multiple correlated outcomes are available (multitask), reduced rank regression is an effective way to borrow strength and capture latent structures that underlie the data. Our proposal is motivated by the UK Biobank population-based cohort study, where we are faced with large-scale, ultrahigh-dimensional features, and have access to a large number of outcomes (phenotypes): lifestyle measures, biomarkers, and disease outcomes. We are hence led to fit sparse reduced-rank regression models, using computational strategies that allow us to scale to problems of this size. We use a scheme that alternates between solving the sparse regression problem and solving the reduced rank decomposition. For the sparse regression component we propose a scalable iterative algorithm based on adaptive screening that leverages the sparsity assumption and enables us to focus on solving much smaller subproblems. The full solution is reconstructed and tested via an optimality condition to make sure it is a valid solution for the original problem. We further extend the method to cope with practical issues, such as the inclusion of confounding variables and imputation of missing values among the phenotypes. Experiments on both synthetic data and the UK Biobank data demonstrate the effectiveness of the method and the algorithm. We present the multiSnpnet package, available at http://github.com/junyangq/multiSnpnet, which works on top of PLINK2 files and which we anticipate will be a valuable tool for generating polygenic risk scores from human genetic studies.

5.
Nat Commun ; 13(1): 5107, 2022 08 30.
Article in English | MEDLINE | ID: mdl-36042219

ABSTRACT

The SARS-CoV-2 pandemic has differentially impacted populations across race and ethnicity. A multi-omic approach represents a powerful tool to examine risk across multi-ancestry genomes. We leverage a pandemic tracking strategy in which we sequence viral and host genomes and transcriptomes from nasopharyngeal swabs of 1049 individuals (736 SARS-CoV-2 positive and 313 SARS-CoV-2 negative) and integrate them with digital phenotypes from electronic health records from a diverse catchment area in Northern California. Genome-wide association disaggregated by admixture mapping reveals novel COVID-19-severity-associated regions containing previously reported markers of neurologic, pulmonary and viral disease susceptibility. Phylodynamic tracking of consensus viral genomes reveals no association with disease severity or inferred ancestry. Summary data from multiomic investigation reveals metagenomic and HLA associations with severe COVID-19. The wealth of data available from residual nasopharyngeal swabs in combination with clinical data abstracted automatically at scale highlights a powerful strategy for pandemic tracking, and reveals distinct epidemiologic, genetic, and biological associations for those at the highest risk.


Subject(s)
COVID-19 , Pandemics , COVID-19/epidemiology , Genome, Viral , Genome-Wide Association Study , Humans , SARS-CoV-2/genetics
6.
PLoS Comput Biol ; 18(8): e1010378, 2022 08.
Article in English | MEDLINE | ID: mdl-36040971

ABSTRACT

We present WhichTF, a computational method to identify functionally important transcription factors (TFs) from chromatin accessibility measurements. To rank TFs, WhichTF applies an ontology-guided functional approach to compute novel enrichment by integrating accessibility measurements, high-confidence pre-computed conservation-aware TF binding sites, and putative gene-regulatory models. Comparison with prior sheer abundance-based methods reveals the unique ability of WhichTF to identify context-specific TFs with functional relevance, including NF-κB family members in lymphocytes and GATA factors in cardiac cells. To distinguish the transcriptional regulatory landscape in closely related samples, we apply differential analysis and demonstrate its utility in lymphocyte, mesoderm developmental, and disease cells. We find suggestive, under-characterized TFs, such as RUNX3 in mesoderm development and GLI1 in systemic lupus erythematosus. We also find TFs known for stress response, suggesting routine experimental caveats that warrant careful consideration. WhichTF yields biological insight into known and novel molecular mechanisms of TF-mediated transcriptional regulation in diverse contexts, including human and mouse cell types, cell fate trajectories, and disease-associated cells.


Subject(s)
Chromatin , Transcription Factors , Animals , Binding Sites , Chromatin/genetics , Gene Expression Regulation , Humans , Mice , Protein Binding , Transcription Factors/metabolism
7.
Am J Hum Genet ; 109(6): 1055-1064, 2022 06 02.
Article in English | MEDLINE | ID: mdl-35588732

ABSTRACT

Polygenic risk scores (PRSs) quantify the contribution of multiple genetic loci to an individual's likelihood of a complex trait or disease. However, existing PRSs estimate this likelihood with common genetic variants, excluding the impact of rare variants. Here, we report on a method to identify rare variants associated with outlier gene expression and integrate their impact into PRS predictions for body mass index (BMI), obesity, and bariatric surgery. Between the top and bottom 10%, we observed a 20.8% increase in risk for obesity (p = 3 × 10⁻¹⁴), 62.3% increase in risk for severe obesity (p = 1 × 10⁻⁶), and median 5.29 years earlier onset for bariatric surgery (p = 0.008), as a function of expression outlier-associated rare variant burden when controlling for common variant PRS. We show that these predictions were more significant than integrating the effects of rare protein-truncating variants (PTVs), observing a mean 19% increase in phenotypic variance explained with expression outlier-associated rare variants when compared with PTVs (p = 2 × 10⁻¹⁵). We replicated these findings by using data from the Million Veteran Program and demonstrated that PRSs across multiple traits and diseases can benefit from the inclusion of expression outlier-associated rare variants identified through population-scale transcriptome sequencing.


Subject(s)
Multifactorial Inheritance , Obesity , Body Mass Index , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Multifactorial Inheritance/genetics , Obesity/genetics , Phenotype , Risk Factors
8.
PLoS Genet ; 18(3): e1010105, 2022 03.
Article in English | MEDLINE | ID: mdl-35324888

ABSTRACT

We present a systematic assessment of polygenic risk score (PRS) prediction across more than 1,500 traits using genetic and phenotype data in the UK Biobank. We report 813 sparse PRS models with significant (p < 2.5 × 10⁻⁵) incremental predictive performance when compared against the covariate-only model that considers age, sex, types of genotyping arrays, and the principal component loadings of genotypes. We report a significant correlation between the number of genetic variants selected in the sparse PRS model and the incremental predictive performance (Spearman's ρ = 0.61, p = 2.2 × 10⁻⁵⁹ for quantitative traits, ρ = 0.21, p = 9.6 × 10⁻⁴ for binary traits). The sparse PRS model trained on European individuals showed limited transferability when evaluated on non-European individuals in the UK Biobank. We provide the PRS model weights on the Global Biobank Engine (https://biobankengine.stanford.edu/prs).
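
As a rough illustration of what "incremental predictive performance" over a covariate-only model can mean, the sketch below scores individuals as a weighted sum of allele dosages and compares the R² of a covariates-plus-score linear model against the covariates alone. The function names (`polygenic_score`, `r_squared`, `incremental_r2`) are hypothetical, the fit is ordinary least squares evaluated in-sample, and none of this is taken from the authors' pipeline, which evaluates on held-out individuals.

```python
import numpy as np

def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages (rows: individuals, columns: variants)."""
    return np.asarray(dosages, dtype=float) @ np.asarray(weights, dtype=float)

def r_squared(X, y):
    """In-sample R^2 of an ordinary least-squares fit of y on X plus an intercept."""
    y = np.asarray(y, dtype=float)
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def incremental_r2(covariates, score, y):
    """R^2 gained by adding the score to a covariate-only model."""
    full = np.column_stack([covariates, score])
    return r_squared(full, y) - r_squared(covariates, y)

# Toy usage with simulated data (all values are made up for illustration).
rng = np.random.default_rng(0)
covariates = rng.normal(size=(200, 2))
dosages = rng.integers(0, 3, size=(200, 5))
weights = np.array([0.5, -0.3, 0.2, 0.0, 0.4])  # zero weight = variant not selected
score = polygenic_score(dosages, weights)
y = covariates @ np.array([1.0, -1.0]) + score + 0.1 * rng.normal(size=200)
gain = incremental_r2(covariates, score, y)
```

A sparse model corresponds to a weight vector with many exact zeros, so only the selected variants contribute to the score.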


Subject(s)
Genome-Wide Association Study , Multifactorial Inheritance , Biological Specimen Banks , Genetic Predisposition to Disease , Humans , Multifactorial Inheritance/genetics , Phenotype , Risk Factors , United Kingdom
9.
Biostatistics ; 23(2): 522-540, 2022 04 13.
Article in English | MEDLINE | ID: mdl-32989444

ABSTRACT

We develop a scalable and highly efficient algorithm to fit a Cox proportional hazard model by maximizing the $L^1$-regularized (Lasso) partial likelihood function, based on the Batch Screening Iterative Lasso (BASIL) method developed in Qian and others (2019). Our algorithm is particularly suitable for large-scale and high-dimensional data that do not fit in the memory. The output of our algorithm is the full Lasso path, the parameter estimates at all predefined regularization parameters, as well as their validation accuracy measured using the concordance index (C-index) or the validation deviance. To demonstrate the effectiveness of our algorithm, we analyze a large genotype-survival time dataset across 306 disease outcomes from the UK Biobank (Sudlow and others, 2015). We provide a publicly available implementation of the proposed approach for genetics data on top of the PLINK2 package and name it snpnet-Cox.
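
The concordance index mentioned above can be illustrated with a small quadratic-time sketch. It follows the common Harrell-style definition: among pairs where the individual with the shorter time had an observed event, count how often that individual also has the higher predicted risk, with tied scores counting one half. The function name and the tie handling are assumptions of this sketch and may differ from what snpnet-Cox implements.

```python
def concordance_index(times, events, risk_scores):
    """Harrell-style C-index: higher risk score should mean shorter survival.

    times       -- observed follow-up times
    events      -- 1/True if the event was observed, 0/False if censored
    risk_scores -- predicted risk (e.g. the linear predictor of a Cox model)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is only comparable if the earlier time is an event
        for j in range(n):
            if times[j] > times[i]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ordering of event times by predicted risk.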


Subject(s)
Algorithms , Biological Specimen Banks , Humans , Likelihood Functions , Proportional Hazards Models , United Kingdom
11.
Am J Hum Genet ; 108(12): 2354-2367, 2021 12 02.
Article in English | MEDLINE | ID: mdl-34822764

ABSTRACT

Whole-genome sequencing studies applied to large populations or biobanks with extensive phenotyping raise new analytic challenges. The need to consider many variants at a locus or group of genes simultaneously and the potential to study many correlated phenotypes with shared genetic architecture provide opportunities for discovery not addressed by the traditional one variant, one phenotype association study. Here, we introduce a Bayesian model comparison approach called MRP (multiple rare variants and phenotypes) for rare-variant association studies that considers correlation, scale, and direction of genetic effects across a group of genetic variants, phenotypes, and studies, requiring only summary statistic data. We apply our method to exome sequencing data (n = 184,698) across 2,019 traits from the UK Biobank, aggregating signals in genes. MRP demonstrates an ability to recover signals such as associations between PCSK9 and LDL cholesterol levels. We additionally find MRP effective in conducting meta-analyses in exome data. Non-biomarker findings include associations between MC1R and red hair color and skin color, IL17RA and monocyte count, and IQGAP2 and mean platelet volume. Finally, we apply MRP in a multi-phenotype setting; after clustering the 35 biomarker phenotypes based on genetic correlation estimates, we find that joint analysis of these phenotypes results in substantial power gains for gene-trait associations, such as in TNFRSF13B in one of the clusters containing diabetes- and lipid-related traits. Overall, we show that the MRP model comparison approach improves upon useful features from widely used meta-analysis approaches for rare-variant association analyses and prioritizes protective modifiers of disease risk.


Subject(s)
Genetic Variation , Genome-Wide Association Study , Models, Genetic , Bayes Theorem , Female , Humans , Male , Phenotype
12.
Nat Genet ; 53(10): 1415-1424, 2021 10.
Article in English | MEDLINE | ID: mdl-34594039

ABSTRACT

Current genome-wide association studies do not yet capture sufficient diversity in populations and scope of phenotypes. To expand an atlas of genetic associations in non-European populations, we conducted 220 deep-phenotype genome-wide association studies (diseases, biomarkers and medication usage) in BioBank Japan (n = 179,000), by incorporating past medical history and text-mining of electronic medical records. Meta-analyses with the UK Biobank and FinnGen (n_total = 628,000) identified ~5,000 new loci, which improved the resolution of the genomic map of human traits. This atlas elucidated the landscape of pleiotropy as represented by the major histocompatibility complex locus, where we conducted HLA fine-mapping. Finally, we performed statistical decomposition of matrices of phenome-wide summary statistics, and identified latent genetic components, which pinpointed responsible variants and biological mechanisms underlying current disease classifications across populations. The decomposed components enabled genetically informed subtyping of similar diseases (for example, allergic diseases). Our study suggests a potential avenue for hypothesis-free re-investigation of human diseases through genetics.


Subject(s)
Genetic Association Studies , Genetic Predisposition to Disease , ABO Blood-Group System/genetics , Biological Specimen Banks , Genetic Loci , Genetic Pleiotropy , Genome-Wide Association Study , Humans , Major Histocompatibility Complex/genetics , Meta-Analysis as Topic , Mutation/genetics , Phenotype
14.
Lipids Health Dis ; 20(1): 113, 2021 Sep 21.
Article in English | MEDLINE | ID: mdl-34548093

ABSTRACT

BACKGROUND: Hypertriglyceridemia has emerged as a critical coronary artery disease (CAD) risk factor. Rare loss-of-function (LoF) variants in apolipoprotein C-III have been reported to reduce triglycerides (TG) and are cardioprotective in American Indians and Europeans. However, there is a lack of data in other Europeans and non-Europeans. Also, whether genetically increased plasma TG due to ApoC-III is causally associated with increased CAD risk is still unclear and inconsistent. The objectives of this study were to verify the cardioprotective role of six previously reported LoF variants of APOC3 in South Asians and other multi-ethnic cohorts and to evaluate the causal association of TG-raising common variants with increased CAD risk. METHODS: We performed gene-centric and Mendelian randomization analyses and evaluated the role of genetic variation encompassing APOC3 in affecting circulating TG and the risk of developing CAD. RESULTS: One rare LoF variant (rs138326449) with a 37% reduction in TG was associated with lowered risk for CAD in Europeans (p = 0.007), but we could not confirm this association in Asian Indians (p = 0.641). Our data could not validate the cardioprotective role of the other five LoF variants analysed. A common variant, rs5128, in APOC3 was strongly associated with elevated TG levels (p = 2.8 × 10⁻⁴²⁴). Measures of plasma ApoC-III in a small subset of Sikhs revealed a 37% increase in ApoC-III concentrations in homozygous mutant carriers compared with wild-type carriers of rs5128. A genetically instrumented per 1SD increment of plasma TG level of 15 mg/dL would cause a mild increase (3%) in the risk for CAD (p = 0.042). CONCLUSIONS: Our results highlight the challenges of inclusion of rare variant information in clinical risk assessment and the generalizability of implementation of ApoC-III inhibition for treating atherosclerotic disease. More studies would be needed to confirm whether genetically raised TG and ApoC-III concentrations would increase CAD risk.


Subject(s)
Apolipoprotein C-III/genetics , Coronary Artery Disease/genetics , Genetic Variation , Aged , Alleles , Coronary Artery Disease/ethnology , Europe/epidemiology , Female , Genetic Association Studies , Genotype , Heterozygote , Humans , India/epidemiology , Male , Mendelian Randomization Analysis , Middle Aged , Mutation , Risk , Sequence Analysis, DNA , Triglycerides/blood
15.
Bioinformatics ; 37(22): 4148-4155, 2021 11 18.
Article in English | MEDLINE | ID: mdl-34146108

ABSTRACT

MOTIVATION: Large-scale and high-dimensional genome sequencing data poses computational challenges. General-purpose optimization tools are usually not optimal in terms of computational and memory performance for genetic data. RESULTS: We develop two efficient solvers for optimization problems arising from large-scale regularized regressions on millions of genetic variants sequenced from hundreds of thousands of individuals. These genetic variants are encoded by the values in the set {0,1,2,NA}. We take advantage of this fact and use two bits to represent each entry in a genetic matrix, which reduces memory requirement by a factor of 32 compared to a double precision floating point representation. Using this representation, we implemented an iteratively reweighted least squares algorithm to solve Lasso regressions on genetic matrices, which we name snpnet-2.0. When the dataset contains many rare variants, the predictors can be encoded in a sparse matrix. We utilize the sparsity in the predictor matrix to further reduce memory requirements and computation time. Our sparse genetic matrix implementation uses both the compact two-bit representation and a simplified version of compressed sparse block format so that matrix-vector multiplications can be effectively parallelized on multiple CPU cores. To demonstrate the effectiveness of this representation, we implement an accelerated proximal gradient method to solve group Lasso on these sparse genetic matrices. This solver is named sparse-snpnet, and will also be included as part of the snpnet R package. Our implementation is able to solve Lasso and group Lasso, linear, logistic and Cox regression problems on sparse genetic matrices that contain 1 000 000 variants and almost 100 000 individuals within 10 min and using less than 32GB of memory. AVAILABILITY AND IMPLEMENTATION: https://github.com/rivas-lab/snpnet/tree/compact.
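
The factor of 32 follows from storing each genotype in 2 bits instead of a 64-bit double. The sketch below shows one way such a packing could look, with four genotypes per byte. The bit layout, the choice of code 3 for NA, and the function names are assumptions made for illustration; they are not taken from the snpnet source or from the PLINK2 file format.

```python
import numpy as np

NA = 3  # 2-bit code used here for a missing genotype (an assumption of this sketch)

def pack_genotypes(genotypes):
    """Pack genotype codes in {0, 1, 2, NA} into bytes, four entries per byte."""
    g = np.asarray(genotypes, dtype=np.uint8)
    pad = (-len(g)) % 4  # pad with NA up to a multiple of four entries
    g = np.concatenate([g, np.full(pad, NA, dtype=np.uint8)])
    quads = g.reshape(-1, 4)
    # Entry k of each group of four occupies bits 2k and 2k+1 of its byte.
    packed = (quads[:, 0]
              | (quads[:, 1] << 2)
              | (quads[:, 2] << 4)
              | (quads[:, 3] << 6))
    return packed.astype(np.uint8)

def unpack_genotypes(packed, n):
    """Recover the first n genotype codes from a packed byte array."""
    shifts = np.array([0, 2, 4, 6], dtype=np.uint8)
    codes = (np.asarray(packed, dtype=np.uint8)[:, None] >> shifts) & 3
    return codes.reshape(-1)[:n]
```

Eight genotypes stored as doubles take 64 bytes, while the packed form takes 2 bytes, which is the 32-fold reduction quoted in the abstract.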


Subject(s)
Biological Specimen Banks , Genome , Humans , Algorithms , Chromosome Mapping , Least-Squares Analysis
16.
Bioinformatics ; 37(23): 4437-4443, 2021 12 07.
Article in English | MEDLINE | ID: mdl-33560296

ABSTRACT

MOTIVATION: The prediction performance of the Cox proportional hazard model suffers when there are only a few uncensored events in the training data. RESULTS: We propose a Sparse-Group regularized Cox regression method to improve the prediction performance of large-scale and high-dimensional survival data with few observed events. Our approach is applicable when there are one or more other survival responses that (1) have a large number of observed events and (2) share a common set of associated predictors with the rare-event response. This scenario is common in the UK Biobank dataset, where records for a large number of common and less prevalent diseases of the same set of individuals are available. By analyzing these responses together, we hope to achieve higher prediction performance than when they are analyzed individually. To make this approach practical for large-scale data, we developed an accelerated proximal gradient optimization algorithm as well as a screening procedure inspired by Qian et al. AVAILABILITY AND IMPLEMENTATION: https://github.com/rivas-lab/multisnpnet-Cox.
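
Proximal gradient methods for sparse-group penalties rely on a closed-form proximal step. The sketch below shows that step for a single group under a generic penalty of the form lam1 * ||b||_1 + lam2 * ||b||_2, where the group would hold one predictor's coefficients across the responses. The exact penalty, weighting, and acceleration scheme used in multisnpnet-Cox are not given in the abstract, so the parameterization and the names `soft_threshold` and `prox_sparse_group` are assumptions.

```python
import numpy as np

def soft_threshold(x, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def prox_sparse_group(b, step, lam1, lam2):
    """Proximal operator of step * (lam1 * ||b||_1 + lam2 * ||b||_2) for one group.

    First shrink each coefficient (lasso part), then shrink the whole
    group toward zero (group-lasso part).
    """
    s = soft_threshold(np.asarray(b, dtype=float), step * lam1)
    norm = np.linalg.norm(s)
    if norm == 0.0:
        return s  # the whole group is already zero
    return max(0.0, 1.0 - step * lam2 / norm) * s
```

With lam2 large enough, the entire group is set to zero, which drops the predictor from every response at once; lam1 additionally allows a predictor to be active for some responses only.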


Subject(s)
Algorithms , Humans , Survival Analysis , Proportional Hazards Models , Regression Analysis
17.
Eur J Hum Genet ; 29(7): 1071-1081, 2021 07.
Article in English | MEDLINE | ID: mdl-33558700

ABSTRACT

Polygenic risk models have led to significant advances in understanding complex diseases and their clinical presentation. While polygenic risk scores (PRS) can effectively predict outcomes, they do not generally account for disease subtypes or pathways which underlie within-trait diversity. Here, we introduce a latent factor model of genetic risk based on components from Decomposition of Genetic Associations (DeGAs), which we call the DeGAs polygenic risk score (dPRS). We compute DeGAs using genetic associations for 977 traits and find that dPRS performs comparably to standard PRS while offering greater interpretability. We show how to decompose an individual's genetic risk for a trait across DeGAs components, with examples for body mass index (BMI) and myocardial infarction (heart attack) in 337,151 white British individuals in the UK Biobank, with replication in a further set of 25,486 non-British white individuals. We find that BMI polygenic risk factorizes into components related to fat-free mass, fat mass, and overall health indicators like physical activity. Most individuals with high dPRS for BMI have strong contributions from both a fat-mass component and a fat-free mass component, whereas a few "outlier" individuals have strong contributions from only one of the two components. Overall, our method enables fine-scale interpretation of the drivers of genetic risk for complex traits.


Subject(s)
Genetic Association Studies , Genetic Predisposition to Disease , Multifactorial Inheritance , Quantitative Trait, Heritable , Algorithms , Biological Specimen Banks , Databases, Genetic , Genetic Association Studies/methods , Genome-Wide Association Study , Humans , Models, Genetic , Phenotype , Population Surveillance , Reproducibility of Results , Risk Assessment , Risk Factors , United Kingdom/epidemiology
18.
Nat Genet ; 53(2): 185-194, 2021 02.
Article in English | MEDLINE | ID: mdl-33462484

ABSTRACT

Clinical laboratory tests are a critical component of the continuum of care. We evaluate the genetic basis of 35 blood and urine laboratory measurements in the UK Biobank (n = 363,228 individuals). We identify 1,857 loci associated with at least one trait, containing 3,374 fine-mapped associations and additional sets of large-effect (>0.1 s.d.) protein-altering, human leukocyte antigen (HLA) and copy number variant (CNV) associations. Through Mendelian randomization (MR) analysis, we discover 51 causal relationships, including previously known agonistic effects of urate on gout and cystatin C on stroke. Finally, we develop polygenic risk scores (PRSs) for each biomarker and build 'multi-PRS' models for diseases using 35 PRSs simultaneously, which improved chronic kidney disease, type 2 diabetes, gout and alcoholic cirrhosis genetic risk stratification in an independent dataset (FinnGen; n = 135,500) relative to single-disease PRSs. Together, our results delineate the genetic basis of biomarkers and their causal influences on diseases and improve genetic risk stratification for common diseases.


Subject(s)
Biomarkers/blood , Biomarkers/urine , HLA Antigens/genetics , Proteins/genetics , Biological Specimen Banks , Cardiovascular Diseases/genetics , Cardiovascular Diseases/metabolism , DNA Copy Number Variations , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/metabolism , Genetic Pleiotropy , Humans , Linkage Disequilibrium , Liver-Specific Organic Anion Transporter 1/genetics , Mendelian Randomization Analysis , Polymorphism, Single Nucleotide , Renal Insufficiency, Chronic , Serine Endopeptidases/genetics , United Kingdom
19.
Eur J Hum Genet ; 29(1): 154-163, 2021 01.
Article in English | MEDLINE | ID: mdl-32873964

ABSTRACT

Sex differences have been shown in laboratory biomarkers; however, the extent to which this is due to genetics is unknown. In this study, we infer sex-specific genetic parameters (heritability and genetic correlation) across 33 quantitative biomarker traits in 181,064 females and 156,135 males from the UK Biobank study. We apply a Bayesian Mixture Model, Sex Effects Mixture Model (SEMM), to Genome-wide Association Study summary statistics in order to (1) estimate the contributions of sex to the genetic variance of these biomarkers and (2) identify variants whose statistical association with these traits is sex-specific. We find that the genetics of most biomarker traits are shared between males and females, with the notable exception of testosterone, where we identify 119 female and 445 male-specific variants. These include protein-altering variants in steroid hormone production genes (POR, UGT2B7). Using the sex-specific variants as genetic instruments for Mendelian randomization, we find evidence for causal links between testosterone levels and height, body mass index, waist and hip circumference, and type 2 diabetes. We also show that sex-specific polygenic risk score models for testosterone outperform a combined model. Overall, these results demonstrate that while sex has a limited role in the genetics of most biomarker traits, sex plays an important role in testosterone genetics.


Subject(s)
Biomarkers/analysis , Multifactorial Inheritance/genetics , Sex Characteristics , Body Composition , Cytochrome P-450 Enzyme System/genetics , Female , Glucuronosyltransferase/genetics , Humans , Male , Mendelian Randomization Analysis , Testosterone/genetics
20.
Circ Genom Precis Med ; 13(6): e003014, 2020 12.
Article in English | MEDLINE | ID: mdl-33125279

ABSTRACT

BACKGROUND: The aortic valve is an important determinant of cardiovascular physiology and anatomic location of common human diseases. METHODS: From a sample of 34 287 white British ancestry participants, we estimated functional aortic valve area by planimetry from prospectively obtained cardiac magnetic resonance imaging sequences of the aortic valve. Aortic valve area measurements were submitted to genome-wide association testing, followed by polygenic risk scoring and phenome-wide screening, to identify genetic comorbidities. RESULTS: A genome-wide association study of aortic valve area in these UK Biobank participants showed 3 significant associations, indexed by rs71190365 (chr13:50764607, DLEU1, P=1.8×10⁻⁹), rs35991305 (chr12:94191968, CRADD, P=3.4×10⁻⁸), and chr17:45013271:C:T (GOSR2, P=5.6×10⁻⁸). Replication on an independent set of 8145 unrelated European ancestry participants showed consistent effect sizes in all 3 loci, although rs35991305 did not meet nominal significance. We constructed a polygenic risk score for aortic valve area, which in a separate cohort of 311 728 individuals without imaging demonstrated that smaller aortic valve area is predictive of increased risk for aortic valve disease (odds ratio, 1.14; P=2.3×10⁻⁶). After excluding subjects with a medical diagnosis of aortic valve stenosis (remaining n=308 683 individuals), phenome-wide association of >10 000 traits showed multiple links between the polygenic score for aortic valve disease and key health-related comorbidities involving the cardiovascular system and autoimmune disease. Genetic correlation analysis supports a shared genetic etiology between aortic valve area and birth weight along with other cardiovascular conditions. CONCLUSIONS: These results illustrate the use of automated phenotyping of cardiac imaging data from the general population to investigate the genetic etiology of aortic valve disease, perform clinical prediction, and uncover new clinical and genetic correlates of cardiac anatomy.


Subject(s)
Aortic Valve/diagnostic imaging , Biological Specimen Banks , Cardiovascular Diseases/diagnostic imaging , Cardiovascular Diseases/genetics , Genome-Wide Association Study , Magnetic Resonance Imaging , Adult , Aged , Aortic Valve/pathology , Aortic Valve Stenosis/diagnostic imaging , Aortic Valve Stenosis/genetics , Comorbidity , Female , Genome, Human , Humans , Male , Middle Aged , Multifactorial Inheritance/genetics , Phenomics , Phenotype , Survival Analysis , United Kingdom