ABSTRACT
Transcriptome-wide association study (TWAS) tools have been applied to conduct proteome-wide association studies (PWASs) by integrating proteomics data with genome-wide association study (GWAS) summary data. The genetic effects of PWAS-identified significant genes are potentially mediated through genetically regulated protein abundance, thus informing the underlying disease mechanisms better than GWAS loci. However, existing TWAS/PWAS tools are limited by considering only one statistical model. We propose an omnibus PWAS pipeline to account for multiple statistical models and demonstrate improved performance by simulation and application studies of Alzheimer disease (AD) dementia. We employ the Aggregated Cauchy Association Test to derive omnibus PWAS (PWAS-O) p values from PWAS p values obtained by three existing tools that assume complementary statistical models: TIGAR, PrediXcan, and FUSION. Our simulation studies demonstrated improved power, with well-calibrated type I error, for PWAS-O over all three individual tools. We applied PWAS-O to studying AD dementia with reference proteomic data profiled from dorsolateral prefrontal cortex of postmortem brains from individuals of European ancestry. We identified 43 risk genes, including 5 not identified by previous studies, which are interconnected through a protein-protein interaction network that includes the well-known AD risk genes TOMM40, APOC1, and APOC2. We also validated causal genetic effects mediated through the proteome for 27 (63%) PWAS-O risk genes, providing insights into the underlying biological mechanisms of AD dementia and highlighting promising targets for therapeutic development. PWAS-O can be easily applied to studying other complex diseases.
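The omnibus step described above can be illustrated with a minimal sketch of the Cauchy combination rule that underlies the Aggregated Cauchy Association Test. This is not the authors' pipeline: the function name `acat`, the equal default weights, and the three per-tool p values are hypothetical and chosen only for illustration.

```python
import math

def acat(p_values, weights=None):
    """Combine p values in (0, 1) with the Cauchy combination rule."""
    if weights is None:
        # Assumption: equal weights for every contributing tool.
        weights = [1.0] * len(p_values)
    total = sum(weights)
    # Each p value is mapped to a standard Cauchy variate; a small p
    # contributes a large positive term to the statistic.
    stat = sum(
        (w / total) * math.tan((0.5 - p) * math.pi)
        for w, p in zip(weights, p_values)
    )
    # Upper-tail probability of the standard Cauchy distribution.
    return 0.5 - math.atan(stat) / math.pi

# Hypothetical p values for one gene from three tools (not study data).
p_tigar, p_predixcan, p_fusion = 1e-4, 0.03, 0.2
p_omnibus = acat([p_tigar, p_predixcan, p_fusion])
print(f"omnibus p = {p_omnibus:.3g}")
```

The combined p value is dominated by the smallest input, which is why the rule suits a setting where the best-fitting model differs from gene to gene. For extremely small p values the tangent becomes numerically unstable, so practical implementations typically switch to an approximation in that range.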
Subjects
Alzheimer Disease , Genetic Predisposition to Disease , Genome-Wide Association Study , Proteome , Alzheimer Disease/genetics , Alzheimer Disease/metabolism , Humans , Proteome/genetics , Proteome/metabolism , Proteomics/methods , Apolipoprotein C-I/genetics , Apolipoprotein C-I/metabolism , Single Nucleotide Polymorphism , Risk Factors , Transcriptome , Mitochondrial Precursor Protein Import Complex Proteins
ABSTRACT
AIMS/HYPOTHESIS: Several studies have reported associations between specific proteins and type 2 diabetes risk in European populations. To better understand the role played by proteins in type 2 diabetes aetiology across diverse populations, we conducted a large proteome-wide association study using genetic instruments across four racial and ethnic groups: African; Asian; Hispanic/Latino; and European. METHODS: Genome and plasma proteome data from the Multi-Ethnic Study of Atherosclerosis (MESA) study involving 182 African, 69 Asian, 284 Hispanic/Latino and 409 European individuals residing in the USA were used to establish protein prediction models by using potentially associated cis- and trans-SNPs. The models were applied to genome-wide association study summary statistics of 250,127 type 2 diabetes cases and 1,222,941 controls from different racial and ethnic populations. RESULTS: We identified three, 44 and one protein associated with type 2 diabetes risk in Asian, European and Hispanic/Latino populations, respectively. Meta-analysis identified 40 proteins associated with type 2 diabetes risk across the populations, including well-established as well as novel proteins not yet implicated in type 2 diabetes development. CONCLUSIONS/INTERPRETATION: Our study improves our understanding of the aetiology of type 2 diabetes in diverse populations. DATA AVAILABILITY: The summary statistics of multi-ethnic type 2 diabetes GWAS of MVP, DIAMANTE, Biobank Japan and other studies are available from The database of Genotypes and Phenotypes (dbGaP) under accession number phs001672.v3.p1. MESA genetic, proteome and covariate data can be accessed through dbGaP under phs000209.v13.p3. All code is available on GitHub ( https://github.com/Arthur1021/MESA-1K-PWAS ).
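The abstract does not state which meta-analysis model was used to combine the ancestry-specific results. As one common possibility, the sketch below shows an inverse-variance fixed-effect meta-analysis of per-population effect estimates for a single protein. The function name and the input values are hypothetical.

```python
import math

def fixed_effect_meta(betas, ses):
    """Inverse-variance fixed-effect meta-analysis of effect estimates."""
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in ses]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    z = beta / se
    # Two-sided p value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p

# Hypothetical per-ancestry estimates for one protein (not study data).
beta, se, z, p = fixed_effect_meta([0.12, 0.08, 0.15], [0.05, 0.04, 0.09])
print(f"pooled beta = {beta:.3f}, se = {se:.3f}, p = {p:.3g}")
```

A random-effects model would be the usual alternative when effect sizes are expected to differ between populations.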
ABSTRACT
It remains challenging to translate the findings from genome-wide association studies (GWAS) of autoimmune diseases (AIDs) into interventional targets, presumably because little is known about how GWAS risk variants contribute to AIDs. In addition, current immunomodulatory drugs for AIDs are broad in action rather than disease-specific. We performed a comprehensive protein-centric omics integration analysis to identify AID-associated plasma proteins by integrating plasma protein quantitative trait loci datasets (1348 proteins and 7213 individuals) with a total of ten large-scale GWAS summary statistics of AIDs under a systematic analytic framework. Specifically, we first screened for protein-AID associations using a proteome-wide association study (PWAS), followed by enrichment analysis to reveal the underlying biological processes and pathways. We then performed both Mendelian randomization (MR) and colocalization analyses to further identify protein-AID pairs with putatively causal relationships. Finally, we prioritized potential drug targets for AIDs. A total of 174 protein-AID associations were identified by PWAS. AID-associated plasma proteins were significantly enriched in immune-related biological processes and pathways, such as inflammatory response (P = 3.96 × 10^-10). MR analysis further identified 97 protein-AID pairs with potential causal relationships, of which 21 pairs were strongly supported by colocalization analysis (PP.H4 > 0.75); 10 of these 21 pairs were newly discovered and had not been reported in previous GWAS analyses. Further exploration showed that four proteins (TLR3, FCGR2A, IL23R, TCN1) have corresponding drugs and that 17 proteins are druggable. These findings will help to further clarify the biological mechanisms of AIDs and highlight the potential of these proteins as therapeutic targets for AIDs.
Subjects
Autoimmune Diseases , Blood Proteins , Genome-Wide Association Study , Quantitative Trait Loci , Humans , Autoimmune Diseases/genetics , Blood Proteins/genetics , Blood Proteins/metabolism , Mendelian Randomization Analysis , Single Nucleotide Polymorphism , Genetic Predisposition to Disease , Proteome , Proteomics/methods
ABSTRACT
BACKGROUND: Genome-wide association studies have identified dozens of genomic loci for obesity. However, the functional genes and the detailed genetic mechanisms underlying these loci are largely unknown. In this study, we conducted an integrative study to prioritize plausibly functional genes by combining information from genome-, transcriptome- and proteome-wide association analyses. METHODS: We first conducted proteome-wide association analyses and transcriptome-wide association analyses for the six obesity-related traits. We then performed colocalization analysis on the identified loci shared between the proteome- and transcriptome-association analyses. Finally, we validated the identified genes with other plasma/blood reference panels. The highlighted genes were assessed for expression in other tissues, single-cell and tissue specificity, and druggability. RESULTS: We prioritized 4 high-confidence genes (FASN, ICAM1, PDCD6IP, and YWHAB) by proteome-wide association studies, transcriptome-wide association studies, and colocalization analyses, which consistently influenced the variation of obesity traits at both mRNA and protein levels. These 4 genes were successfully validated using other plasma/blood reference panels and shared regulatory structures in obesity-related tissues. Single-cell and tissue-specific analyses showed that FASN and ICAM1 were specifically expressed in metabolism- and immunity-related tissues and cells. Furthermore, FASN and ICAM1 have been developed as drug targets. CONCLUSION: Our study provided novel promising protein targets for further mechanistic and therapeutic studies of obesity.
ABSTRACT
Ancestry-specific proteome-wide association studies (PWAS) based on genetically predicted protein expression can reveal complex disease etiology specific to certain ancestral groups. These studies require ancestry-specific models for protein expression as a function of SNP genotypes. In order to improve protein expression prediction in ancestral populations historically underrepresented in genomic studies, we propose a new penalized maximum likelihood estimator for fitting ancestry-specific joint protein quantitative trait loci models. Our estimator borrows information across ancestral groups, while simultaneously allowing for heterogeneous error variances and regression coefficients. We propose an alternative parameterization of our model that makes the objective function convex and the penalty scale invariant. To improve computational efficiency, we propose an approximate version of our method and study its theoretical properties. Our method provides a substantial improvement in protein expression prediction accuracy in individuals of African ancestry, and in a downstream PWAS analysis, leads to the discovery of multiple associations between protein expression and blood lipid traits in the African ancestry population.
Subjects
Single Nucleotide Polymorphism , Quantitative Trait Loci , Humans , Genome-Wide Association Study/statistics & numerical data , Regression Analysis , Likelihood Functions , Black People/genetics , Black People/statistics & numerical data , Proteome , Computer Simulation , Statistical Models , Biometry/methods
ABSTRACT
The pathogenesis of ocular diseases (ODs) remains unclear, although genome-wide association studies (GWAS) have identified numerous associated genetic risk loci. We integrated protein quantitative trait loci (pQTL) datasets and five large-scale GWAS summary statistics of ODs under a cutting-edge systematic analytic framework. Proteome-wide association studies (PWAS) identified plasma and brain proteins associated with ODs, and 11 plasma proteins were identified by Mendelian randomization (MR) and colocalization (COLOC) analyses as being potentially causally associated with ODs. Five of these proteins (protein-coding genes ECI1, LCT, and NPTXR for glaucoma, WARS1 for age-related macular degeneration (AMD), and SIGLEC14 for diabetic retinopathy (DR)) are newly reported. Twenty brain-protein-OD pairs were identified by COLOC analysis. Eight pairs (protein-coding genes TOM1L2, MXRA7, RHPN2, and HINT1 for senile cataract, WARS1 and TDRD7 for AMD, STAT6 for myopia, and TPPP3 for DR) are newly reported in this study. Phenotype-disease mapping analysis revealed 10 genes related to the eye/vision phenotype or ODs. Combined with a drug exploration analysis, we found that the drugs related to C3 and TXN have been used for the treatment of ODs, and another eight genes (GSTM3 for senile cataract, IGFBP7 and CFHR1 for AMD, PTPMT1 for glaucoma, EFEMP1 and ACP1 for myopia, SIRPG and CTSH for DR) are promising targets for pharmacological interventions. Our study highlights the role played by proteins in ODs, in which brain proteins were taken into account due to the deepening of eye-brain connection studies. The potential pathogenic proteins finally identified provide a more reliable reference range for subsequent medical studies.
Subjects
Brain , Eye Diseases , Genome-Wide Association Study , Quantitative Trait Loci , Humans , Eye Diseases/genetics , Eye Diseases/metabolism , Eye Diseases/blood , Brain/metabolism , Blood Proteins/genetics , Blood Proteins/metabolism , Genetic Predisposition to Disease , Mendelian Randomization Analysis , Single Nucleotide Polymorphism , Proteome/metabolism
ABSTRACT
BACKGROUND: Genome-wide association studies (GWAS) have revealed numerous loci associated with stroke. However, the underlying mechanisms at these loci in the pathogenesis of stroke and effective stroke drug targets are elusive. Therefore, we aimed to identify causal genes in the pathogenesis of stroke and its subtypes. METHODS: Utilizing multidimensional high-throughput data generated, we integrated proteome-wide association study (PWAS), transcriptome-wide association study (TWAS), Mendelian randomization (MR), and Bayesian colocalization analysis to prioritize genes that contribute to stroke and its subtypes risk via affecting their expression and protein abundance in brain and blood. RESULTS: Our integrative analysis revealed that ICA1L was associated with small-vessel stroke (SVS), according to robust evidence at both protein and transcriptional levels based on brain-derived data. We also identified NBEAL1 that was causally related to SVS via its cis-regulated brain expression level. In blood, we identified 5 genes (MMP12, SCARF1, ABO, F11, and CKAP2) that had causal relationships with stroke and stroke subtypes. CONCLUSIONS: Together, via using an integrative analysis to deal with multidimensional data, we prioritized causal genes in the pathogenesis of SVS, which offered hints for future biological and therapeutic studies.
Subjects
Genome-Wide Association Study , Stroke , Bayes Theorem , Brain/metabolism , Genetic Predisposition to Disease , Humans , Single Nucleotide Polymorphism/genetics , Proteome/genetics , Proteome/metabolism , Stroke/complications , Transcriptome/genetics
ABSTRACT
Small vessel strokes (SVS) and intracerebral haemorrhages (ICH) are acute outcomes of cerebral small vessel disease (SVD). Genetic studies combining both phenotypes have identified three loci associated with both traits. However, the genetic cis-regulation at the protein level associated with SVD has not been studied before. We performed a proteome-wide association study (PWAS) using FUSION to integrate a genome-wide association study (GWAS) and brain proteomic data to discover the common mechanisms regulating both SVS and ICH. Dorsolateral prefrontal cortex (dPFC) brain proteomes from the ROS/MAP study (N = 376 subjects and 1443 proteins) and the summary statistics for the SVS GWAS from the MEGASTROKE study (N = 237,511) and multi-trait analysis of GWAS (MTAG)-ICH−SVS from Chung et al. (N = 240,269) were selected. We performed PWAS and then a co-localization analysis with COLOC. The significant and nominal results were validated using a replication dPFC proteome (N = 152). The replicated results (q-value < 0.05) were further investigated for a causal relationship using summary data-based Mendelian randomization (SMR). One protein (ICA1L) was significantly associated with SVS (z-score = −4.42 and p-value = 9.6 × 10^-6) and non-lobar ICH (z-score = −4.8 and p-value = 1.58 × 10^-6) in the discovery PWAS, with a high co-localization posterior probability for hypothesis 4 (PP4). In the validation PWAS, ICA1L remained significantly associated with both traits. The SMR results for ICA1L indicated a causal association of protein expression levels in the brain with SVS (p-value = 3.66 × 10^-5) and non-lobar ICH (p-value = 1.81 × 10^-5). Our results show that the association of ICA1L with SVS and non-lobar ICH is conditioned by the cis-regulation of its protein levels in the brain.
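As background for the PWAS z-scores reported above, the following sketch shows how a summary-statistics PWAS test of the kind implemented in FUSION is commonly formulated. The protein-level z-score is the weighted sum of GWAS z-scores, scaled by the variance implied by linkage disequilibrium (LD) among the SNPs. The function names and the toy inputs are hypothetical; this is not the FUSION code.

```python
import math

def pwas_z(weights, gwas_z, ld):
    """Protein-level z-score: w'Z / sqrt(w' * LD * w)."""
    n = len(weights)
    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    # Variance of the weighted sum under the null, given the LD matrix.
    variance = sum(
        weights[i] * ld[i][j] * weights[j]
        for i in range(n) for j in range(n)
    )
    return numerator / math.sqrt(variance)

def two_sided_p(z):
    """Two-sided p value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical cis-SNP prediction weights, GWAS z-scores, and LD matrix.
w = [0.4, -0.2, 0.1]
z_gwas = [-3.1, 2.0, -1.2]
ld = [[1.0, 0.3, 0.1],
      [0.3, 1.0, 0.2],
      [0.1, 0.2, 1.0]]
z_protein = pwas_z(w, z_gwas, ld)
print(f"PWAS z = {z_protein:.2f}, p = {two_sided_p(z_protein):.3g}")
```

The same conversion from z-score to two-sided p value links the z-scores and p-values quoted in the abstract, up to rounding of the reported z-scores.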
Subjects
Proteome , Stroke , Cerebral Hemorrhage/complications , Cerebral Hemorrhage/genetics , Genome-Wide Association Study , Humans , Proteome/genetics , Proteomics , Stroke/etiology
ABSTRACT
Numerous genome-wide association studies have identified risk genes for chronic pain, yet the mechanisms by which genetic variants modify susceptibility have remained elusive. We sought to identify key genes modulating chronic pain risk by regulating brain protein expression. We integrated brain proteomic data with the largest genome-wide dataset for multisite chronic pain (N = 387,649) in a proteome-wide association study (PWAS) using discovery and confirmatory proteomic datasets (N = 376 and 152) from the dorsolateral prefrontal cortex. Leveraging summary data-based Mendelian randomization and Bayesian colocalization analysis, we pinpointed potential causal genes, while a transcriptome-wide association study integrating 452 human brain transcriptomes investigated whether cis-effects on protein abundance extended to the transcriptome. Single-cell RNA-sequencing data and single-nucleus transcriptomic data revealed cell-type-specific expression patterns for identified causal genes in the dorsolateral prefrontal cortex and dorsal root ganglia (DRG), complemented by RNA microarray analysis of expression profiles in other pain-related brain regions. Of the 22 genes cis-regulating protein abundance identified by the discovery PWAS, 18 (82%) were deemed causal by summary data-based Mendelian randomization or Bayesian colocalization analyses, with 7 of these 18 genes (39%) replicating in the confirmatory PWAS, including guanosine diphosphate-mannose pyrophosphorylase B, which was also associated at the transcriptome level. Several causal genes exhibited selective expression in excitatory and inhibitory neurons, oligodendrocytes, and astrocytes, while most identified genes were expressed across additional pain-related brain regions.
This integrative proteogenomic approach identified 18 high-confidence causal genes for chronic pain, regulated by cis-effects on brain protein levels, suggesting promising avenues for treatment research and indicating a contributory role for the DRG. PERSPECTIVE: The current post genome-wide association study analyses identified 18 high-confidence causal genes regulating chronic pain risk via cis-modulation of brain protein abundance, suggesting promising avenues for future chronic pain therapies. Additionally, the significant expression of these genes in the DRG indicated a potential contributory role, warranting further investigation.
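Summary data-based Mendelian randomization (SMR), used above to assess causality, can be sketched for a single top cis-pQTL as follows. The sketch assumes the widely used approximate SMR test statistic and does not include the HEIDI heterogeneity test. The function name and the input values are hypothetical.

```python
import math

def smr(b_zx, se_zx, b_zy, se_zy):
    """Approximate SMR test for one instrument.

    z = SNP (instrument), x = protein abundance, y = trait.
    """
    z_zx = b_zx / se_zx          # SNP effect on protein, as a z-score
    z_zy = b_zy / se_zy          # SNP effect on trait, as a z-score
    b_xy = b_zy / b_zx           # estimated effect of protein on trait
    # Approximate chi-square statistic with 1 degree of freedom.
    t_smr = (z_zy ** 2 * z_zx ** 2) / (z_zy ** 2 + z_zx ** 2)
    # Upper tail of chi-square(1) equals a two-sided normal tail.
    p = math.erfc(math.sqrt(t_smr / 2.0))
    return b_xy, t_smr, p

# Hypothetical pQTL and GWAS effect estimates (not study data).
b_xy, t_smr, p_smr = smr(b_zx=0.50, se_zx=0.05, b_zy=-0.04, se_zy=0.01)
print(f"b_xy = {b_xy:.3f}, T_SMR = {t_smr:.2f}, p = {p_smr:.3g}")
```

Because the statistic is bounded by the weaker of the two z-scores, an SMR result can be no more significant than the GWAS or pQTL association it is built from.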
Subjects
Brain , Chronic Pain , Genome-Wide Association Study , Proteome , Humans , Chronic Pain/genetics , Chronic Pain/metabolism , Brain/metabolism , Proteome/metabolism , Transcriptome , Proteomics
ABSTRACT
BACKGROUND: The underlying pathogenesis of anxiety remains elusive, making the pinpointing of potential therapeutic and diagnostic biomarkers for anxiety paramount to its efficient treatment. METHODS: We undertook a proteome-wide association study (PWAS), fusing human brain proteomes from both discovery (ROS/MAP; N = 376) and validation cohorts (Banner; N = 152) with anxiety genome-wide association study (GWAS) summary statistics. Complementing this, we executed transcriptome-wide association studies (TWAS) leveraging human brain transcriptomic data from the Common Mind Consortium (CMC) to discern the confluence of genetic influences spanning both proteomic and transcriptomic levels. We further scrutinized significant genes through a suite of methodologies. RESULTS: We discerned 14 genes instrumental in the genesis of anxiety through their specific cis-regulated brain protein abundance. Of these, 6 were corroborated in the confirmatory PWAS, with 4 also showing associations with anxiety via their cis-regulated brain mRNA levels. A heightened confidence level was attributed to 5 genes (RAB27B, CCDC92, BTN2A1, TMEM106B, and DOC2A), taking into account corroborative evidence from both the confirmatory PWAS and TWAS, coupled with insights from Mendelian randomization analysis and colocalization evaluations. A majority of the identified genes are expressed in brain regions intricately linked to anxiety and predominantly take part in lysosomal metabolic processes. LIMITATIONS: The limited scope of the brain proteome reference datasets, stemming from a relatively modest sample size, potentially curtails our grasp of the entire gamut of genetic effects. CONCLUSION: The genes pinpointed in our research present a promising groundwork for crafting therapeutic interventions and diagnostic tools for anxiety.
Subjects
Anxiety , Brain , Genome-Wide Association Study , Proteome , Humans , Proteome/genetics , Brain/metabolism , Anxiety/genetics , Anxiety/metabolism , Transcriptome , Proteomics , Anxiety Disorders/genetics , Anxiety Disorders/metabolism
ABSTRACT
Background: Endometriosis (EM) is a chronic painful condition that predominantly affects women of reproductive age. Currently, surgery or medication can only provide limited symptom relief. This study used a comprehensive genetic analytical approach to explore potential drug targets for EM in the plasma proteome. Methods: In this study, 2,923 plasma proteins were selected as exposures and EM as the outcome for two-sample Mendelian randomization (MR) analyses. The plasma proteomic data were derived from the UK Biobank Pharma Proteomics Project (UKB-PPP), while the EM dataset was obtained from the FinnGen consortium R10 release. Several sensitivity analyses were performed, including summary-data-based MR (SMR) analyses, the heterogeneity in dependent instruments (HEIDI) test, reverse MR analyses, the Steiger detection test, and Bayesian co-localization analyses. Furthermore, a proteome-wide association study (PWAS) and single-cell transcriptomic analyses were conducted to validate the findings. Results: Six significant (p < 3.06 × 10^-5) plasma protein-EM pairs were identified by MR analyses. EPHB4 (OR = 1.40, 95% CI: 1.20 - 1.63), FSHB (OR = 3.91, 95% CI: 3.13 - 4.87), RSPO3 (OR = 1.60, 95% CI: 1.38 - 1.86), SEZ6L2 (OR = 1.44, 95% CI: 1.23 - 1.68) and WASHC3 (OR = 2.00, 95% CI: 1.54 - 2.59) were identified as risk factors, whereas KDR (OR = 0.80, 95% CI: 0.75 - 0.90) was found to be a protective factor. All six plasma proteins passed the SMR test (P < 8.33 × 10^-3), but only four passed the HEIDI heterogeneity test (P_HEIDI > 0.05), namely FSHB, RSPO3, SEZ6L2 and EPHB4. These four proteins showed strong evidence of co-localization (PP.H4 > 0.7). In particular, RSPO3 and EPHB4 were replicated in the validation PWAS. Single-cell analyses revealed high expression of SEZ6L2 and EPHB4 in stromal and epithelial cells within EM lesions, while RSPO3 exhibited elevated expression in stromal cells and fibroblasts.
Conclusion: Our study identified FSHB, RSPO3, SEZ6L2, and EPHB4 as potential drug targets for EM and highlighted the critical role of stromal and epithelial cells in disease development. These findings provide new insights into the diagnosis and treatment of EM.
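The odds ratios and 95% confidence intervals reported above follow the usual convention of exponentiating an MR estimate on the log-odds scale. The sketch below shows that conversion and its inverse, assuming a normal approximation with a critical value of 1.96. The function names are hypothetical, and the back-calculated standard error is only approximate because the published values are rounded.

```python
import math

def odds_ratio_ci(log_or, se, z_crit=1.96):
    """Convert a log-odds estimate and its SE to an OR with a 95% CI."""
    return (
        math.exp(log_or),
        math.exp(log_or - z_crit * se),
        math.exp(log_or + z_crit * se),
    )

def se_from_ci(lower, upper, z_crit=1.96):
    """Recover the log-scale standard error from a reported CI."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z_crit)

# Back-calculate from the EPHB4 values reported in the abstract.
se_ephb4 = se_from_ci(1.20, 1.63)
or_, lo, hi = odds_ratio_ci(math.log(1.40), se_ephb4)
print(f"OR = {or_:.2f} (95% CI: {lo:.2f} - {hi:.2f})")
```

A CI built this way is symmetric around the estimate on the log scale, which offers a quick consistency check for reported intervals.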
Subjects
Endometriosis , Proteome , Proteomics , Humans , Female , Endometriosis/blood , Endometriosis/drug therapy , Endometriosis/metabolism , Proteome/metabolism , Proteomics/methods , Blood Proteins/metabolism , Adult , Mendelian Randomization Analysis , Biomarkers/blood , Genome-Wide Association Study , Thrombospondins/metabolism , Thrombospondins/genetics
ABSTRACT
Because the currently known risk genes only partly explain the etiology of amyotrophic lateral sclerosis (ALS), additional causative genes need to be identified using novel approaches. In this study, we conducted a two-stage proteome-wide association study (PWAS) using ALS genome-wide association study (GWAS) data (N = 152,268) and two distinct human brain protein quantitative trait loci (pQTL) datasets (ROSMAP N = 376 and Banner N = 152) to identify ALS risk genes, and prioritized candidate genes with Mendelian randomization (MR) and Bayesian colocalization analysis. Next, we verified the aberrant expression of risk genes in multiple tissues, including lower motor neurons, skeletal muscle, and whole blood. Six ALS risk genes (SCFD1, SARM1, TMEM175, BCS1L, WIPI2, and DHRS11) were found during the PWAS discovery phase, and SARM1 and BCS1L were confirmed during the validation phase. The subsequent MR (p = 2.10 × 10^-7) and Bayesian colocalization analysis (ROSMAP PP4 = 0.999, Banner PP4 = 0.999) confirmed the causal association between SARM1 and ALS. Further differential expression analysis revealed that SARM1 was markedly downregulated in lower motor neurons (p = 7.64 × 10^-3), skeletal muscle (p = 9.34 × 10^-3), and whole blood (p = 1.94 × 10^-3). Our findings identified some promising protein candidates for future investigation as therapeutic targets. The dysregulation of SARM1 in multiple tissues provides a new way to explain ALS pathology.
Subjects
Amyotrophic Lateral Sclerosis , Humans , Amyotrophic Lateral Sclerosis/metabolism , Genome-Wide Association Study , Bayes Theorem , Brain/metabolism , Proteome/metabolism , Messenger RNA/genetics , ATPases Associated with Diverse Cellular Activities/genetics , ATPases Associated with Diverse Cellular Activities/metabolism , Electron Transport Complex III/metabolism , 17-Hydroxysteroid Dehydrogenases/metabolism
ABSTRACT
Inflammatory bowel disease (IBD) is a chronic disease that includes Crohn's disease (CD) and ulcerative colitis (UC). Although genome-wide association studies (GWASs) have identified many relevant genetic risk loci, the impact of these loci on protein abundance and their potential utility as clinical therapeutic targets remain uncertain. Therefore, this study aimed to investigate the pathogenesis of IBD and identify effective therapeutic targets through a comprehensive and integrated analysis. We systematically integrated GWAS data related to IBD, UC and CD (N = 25,305) by the study of de Lange KM with the human blood proteome (N = 7213) by the Atherosclerosis Risk in Communities (ARIC) study. Proteome-wide association study (PWAS), mendelian randomisation (MR) and Bayesian colocalisation analysis were used to identify proteins contributing to the risk of IBD. Integrative analysis revealed that genetic variations in IBD, UC and CD affected the abundance of five (ERAP2, RIPK2, TALDO1, CADM2 and RHOC), three (VSIR, HGFAC and CADM2) and two (MST1 and FLRT3) cis-regulated plasma proteins, respectively (P < 0.05). Among the proteins identified via Bayesian colocalisation analysis, CADM2 was found to be an important common protein between IBD and UC. A drug and five druggable target genes were identified from DGIdb after Bayesian colocalisation analysis. Our study's findings from genetic and proteomic approaches have identified compelling proteins that may serve as important leads for future functional studies and potential drug targets for IBD (UC and CD).
Subjects
Bayes Theorem , Genome-Wide Association Study , Inflammatory Bowel Diseases , Proteomics , Humans , Proteomics/methods , Inflammatory Bowel Diseases/genetics , Inflammatory Bowel Diseases/drug therapy , Inflammatory Bowel Diseases/blood , Ulcerative Colitis/genetics , Ulcerative Colitis/drug therapy , Ulcerative Colitis/blood , Genetic Predisposition to Disease , Crohn Disease/genetics , Crohn Disease/drug therapy , Crohn Disease/blood , Proteome/metabolism , Single Nucleotide Polymorphism , Blood Proteins/genetics , Blood Proteins/metabolism , Mendelian Randomization Analysis
ABSTRACT
BACKGROUND: Genome-wide association studies (GWAS) have identified numerous variants associated with psychiatric disorders. However, how GWAS risk variants contribute to psychiatric disorders remains largely unknown. METHODS: By integrating the two largest publicly available, independent plasma protein quantitative trait loci datasets with nine large-scale GWAS summary statistics of psychiatric disorders, we first performed a proteome-wide association study (PWAS) to identify plasma proteins associated with psychiatric disorders, followed by enrichment analysis to reveal the underlying biological processes and pathways. We then conducted Mendelian randomization (MR) and Bayesian colocalization (COLOC) analyses, with both discovery and parallel replication datasets, to further identify protein-disorder pairs with putatively causal relationships. Finally, we prioritized potential drug targets using the Drug Gene Interaction Database. RESULTS: PWAS identified a total of 112 proteins, which were significantly enriched in biological processes relevant to immune regulation and response to stimulus, including regulation of immune system process (adjusted P = 1.69 × 10^-7) and response to external stimulus (adjusted P = 4.13 × 10^-7), and in viral infection-related pathways, including COVID-19 (adjusted P = 2.94 × 10^-2). MR and COLOC analyses further identified 26 potentially causal protein-disorder pairs in both discovery and replication analyses. Notably, eight protein-coding genes were immune-related, such as IRF3, CSK, and ACE, and five of 16 druggable genes were reported to interact with drugs: ACE, CSK, PSMB4, XPNPEP1, and MICB. CONCLUSIONS: Our findings highlight the immunological hypothesis and identify potentially causal plasma proteins for psychiatric disorders, providing biological insights into pathogenesis that may benefit the development of preventive or therapeutic drugs for psychiatric disorders.
ABSTRACT
How genome-wide associated loci confer risk for Parkinson's disease is unclear. We aim to reveal causal genes through effects on brain proteins to provide new pathogenesis insights for Parkinson's disease. Proteome-wide and transcriptome-wide associations were determined by functional summary-based imputation leveraging data from genome-wide association summary (56 306 Europeans, 1.4 million controls), brain proteomes (528 cases from 2 separate data sets), and transcriptome (452 cases), followed by Mendelian randomization, Bayesian colocalization, cell-type-specific and brain regional expression, and drug-gene interaction analyses. As a result, genetically regulated protein abundances of 11 genes were associated with Parkinson's disease. Five genes (CD38, GPNMB, TMEM175, RAB7L1, and HIP1R) were colocalized. Four genes (GPNMB, SEC23IP, CD38, and DGKQ) demonstrated Mendelian randomized correlations (p < 8.10 × 10^-5). Higher GPNMB level (1.47, 1.28-1.68) and lower CD38 level (0.319, 0.24-0.43) were causally associated with higher risk of Parkinson's disease, consistent with transcriptomic evaluations. CD38 and GPNMB were preferentially enriched in astrocytes and oligodendrocyte precursor cells, respectively. CD38 and GPNMB were also suggested to be targets of many oncological drugs in the Drug-Gene Interaction database. In conclusion, utilizing multidimensional data, GPNMB and CD38 were prioritized as the causal genes of Parkinson's disease, crucial for mechanistic and therapeutic investigations.
Subjects
Parkinson Disease , Humans , Parkinson Disease/genetics , Parkinson Disease/metabolism , Proteome/metabolism , Genome-Wide Association Study/methods , Transcriptome , Mendelian Randomization Analysis/methods , Bayes Theorem , Brain/metabolism , Single Nucleotide Polymorphism , Genetic Predisposition to Disease , Membrane Glycoproteins/genetics , Membrane Glycoproteins/metabolism
ABSTRACT
OBJECTIVE: Genetic approaches are increasingly advantageous in characterizing treatment-resistant schizophrenia (TRS). We aimed to identify TRS-associated functional brain proteins, providing a potential pathway for improving psychiatric classification and developing better-tailored therapeutic targets. METHODS: TRS-related proteome-wide association studies (PWAS) were conducted on genome-wide association studies (GWAS) from CLOZUK and the Psychiatric Genomics Consortium (PGC), which provided TRS individuals (n = 10,501) and non-TRS individuals (n = 20,325), respectively. The reference datasets for the human brain proteome were obtained from ROS/MAP and Banner, with 8,356 and 11,518 proteins collected, respectively. We then performed colocalization analysis and functional enrichment analysis to further explore the biological functions of the proteins identified by PWAS. RESULTS: In PWAS, two statistically significant proteins were identified using the ROS/MAP and then replicated using the Banner reference dataset, including CPT2 (P_PWAS-ROS/MAP = 4.15 × 10^-2 and P_PWAS-Banner = 3.38 × 10^-3) and APOL2 (P_PWAS-ROS/MAP = 4.49 × 10^-3 and P_PWAS-Banner = 8.26 × 10^-3). Colocalization analysis identified three variants that were causally related to protein expression in the human brain, including CCDC91 (PP4 = 0.981), PRDX1 (PP4 = 0.894), and WARS2 (PP4 = 0.757). We extended PWAS results from gene-based analysis to pathway-based analysis, identifying 14 gene ontology (GO) terms and the only candidate pathway for TRS, metabolic pathways (all P < 0.05). CONCLUSIONS: Our results identified two protein biomarkers, and cautiously support that the pathological mechanism of TRS is linked to lipid oxidation and inflammation, where mitochondria-related functions may play a role.
Subjects
Schizophrenia , Humans , Schizophrenia/drug therapy , Schizophrenia/genetics , Proteome/genetics , Schizophrenia, Treatment-Resistant , Genome-Wide Association Study , Reactive Oxygen Species/therapeutic use , Brain/metabolism
ABSTRACT
Objectives: This study aimed to identify plasma proteins that are associated with and causative of breast cancer through proteome- and transcriptome-wide association studies combined with Mendelian randomization. Methods: Utilizing high-throughput datasets, we designed a two-phase analytical framework aimed at identifying novel plasma proteins that are both associated with and causative of breast cancer. Initially, we conducted proteome/transcriptome-wide association studies (P/TWAS) to identify plasma proteins with significant associations. Subsequently, Mendelian randomization was employed to ascertain causation. The validity and robustness of our findings were further reinforced through external validation and various sensitivity analyses, including Bayesian colocalization, Steiger filtering, and tests for heterogeneity and pleiotropy. Additionally, we performed functional enrichment analysis of the identified proteins to better understand their roles in breast cancer and to assess their potential as druggable targets. Results: We identified 5 plasma proteins demonstrating strong associations and causative links with breast cancer. Specifically, PEX14 (OR = 1.201, p = 0.016) and CTSF (OR = 1.114, p < 0.001) both displayed a positive, causal association with breast cancer. In contrast, SNUPN (OR = 0.905, p < 0.001), CSK (OR = 0.962, p = 0.038), and PARK7 (OR = 0.954, p < 0.001) were negatively associated with the disease. For the ER-positive subtype, 3 plasma proteins were identified, with CSK and CTSF exhibiting consistent trends, while GDI2 (OR = 0.920, p < 0.001) was distinct to this subtype. In the ER-negative subtype, PEX14 (OR = 1.645, p < 0.001) stood out as the sole protein, showing an even stronger causal effect than in overall breast cancer. These associations were robustly supported by colocalization and sensitivity analyses.
Conclusion: By integrating multiple data dimensions, our study pinpointed plasma proteins significantly associated with and causative of breast cancer, offering valuable insights for future research as well as potential new biomarkers and therapeutic targets.
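The odds ratios reported in this abstract come from Mendelian randomization. As a rough illustration only, and not the authors' actual pipeline, the simplest single-instrument estimator (the Wald ratio) can be sketched as follows. The function name, the first-order standard error, and the input numbers in the usage note are all hypothetical.

```python
import math

def wald_ratio_or(beta_exposure, beta_outcome, se_outcome):
    """Single-instrument Wald-ratio MR estimate, reported as an odds ratio.

    beta_exposure: SNP effect on protein abundance (pQTL effect)
    beta_outcome:  SNP effect on disease (log odds, from GWAS)
    se_outcome:    standard error of beta_outcome
    Assumption: the first-order SE ignores uncertainty in beta_exposure.
    """
    if beta_exposure == 0:
        raise ValueError("instrument has no effect on the exposure")
    beta = beta_outcome / beta_exposure          # causal log odds ratio
    se = se_outcome / abs(beta_exposure)         # first-order standard error
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))         # two-sided normal p value
    ci = (math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se))
    return {"or": math.exp(beta), "ci": ci, "p": p}
```

For example, `wald_ratio_or(0.5, 0.1, 0.02)` (made-up inputs) gives an odds ratio of exp(0.2) per unit increase in genetically predicted protein level. An OR above 1 corresponds to the "positive" associations in the abstract and an OR below 1 to the "negative" ones.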
ABSTRACT
Background: The genome-wide association study (GWAS) is a common tool to identify genetic variants associated with complex traits, including psychiatric disorders (PDs). However, post-GWAS analyses are needed to extend the statistical inference to biologically relevant entities, e.g., genes, proteins, and pathways. To achieve this goal, researchers developed methods that incorporate biologically relevant intermediate molecular phenotypes, such as gene expression and protein abundance, which are posited to mediate the variant-trait association. Transcriptome-wide association study (TWAS) and proteome-wide association study (PWAS) are commonly used methods to test the association between these molecular mediators and the trait. Summary: In this review, we discuss the most recent developments in TWAS and PWAS. These methods integrate existing "omic" information with the GWAS summary statistics for trait(s) of interest. Specifically, they impute transcript/protein data and test the association between imputed gene expression/protein level with phenotype of interest by using (i) GWAS summary statistics and (ii) reference transcriptomic/proteomic/genomic datasets. TWAS and PWAS are suitable as analysis tools for (i) primary association scan and (ii) fine-mapping to identify potentially causal genes for PDs. Key Messages: As post-GWAS analyses, TWAS and PWAS have the potential to highlight causal genes for PDs. These prioritized genes could indicate targets for the development of novel drug therapies. For researchers attempting such analyses, we recommend Mendelian randomization tools that use GWAS statistics for both trait and reference datasets, e.g., summary Mendelian randomization (SMR). 
We base our recommendation on (i) the ability to use the same tool for both TWAS and PWAS, (ii) the absence of any need for pre-computed weights (which makes it easier to update for larger reference datasets), and (iii) the fact that most larger transcriptome reference datasets are publicly available and easy to transform into a compatible format for SMR analysis.
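The summary-statistic association test that this review describes (imputing a molecular trait from reference weights and testing it against GWAS statistics) is commonly written as a weighted sum of SNP z-scores scaled by linkage disequilibrium (LD). The following is a minimal sketch under that assumption; the function names and inputs are hypothetical and it is not the code of any specific tool.

```python
import math

def pwas_z(weights, gwas_z, ld):
    """Gene-level z-score: Z = (w . z) / sqrt(w' R w).

    weights: per-SNP prediction weights from a reference panel
    gwas_z:  per-SNP GWAS z-scores for the trait
    ld:      SNP-by-SNP LD correlation matrix R (list of lists)
    """
    n = len(weights)
    if len(gwas_z) != n or len(ld) != n:
        raise ValueError("weights, z-scores and LD matrix must align")
    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    variance = sum(weights[i] * ld[i][j] * weights[j]
                   for i in range(n) for j in range(n))
    if variance <= 0:
        raise ValueError("predicted trait has zero variance")
    return numerator / math.sqrt(variance)

def two_sided_p(z):
    """Two-sided p value for a standard-normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2))
```

With two SNPs of equal weight, z-scores of 1 each, and an LD correlation of 0.5, the numerator is 2 and the variance is 3, so the gene-level z-score is 2/sqrt(3). The LD term matters here: treating correlated SNPs as independent would overstate the evidence.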
ABSTRACT
Vertigo is a leading symptom of various peripheral and central vestibular disorders. Although genome-wide association studies (GWASs) have identified multiple risk variants for vertigo, how these risk variants contribute to the risk of vertigo remains unknown. A discovery proteome-wide association study (PWAS) was first performed by integrating the protein quantitative trait loci from the dorsolateral prefrontal cortex (DLPFC) in the Banner Sun Health Research Institute dataset (n = 152) with GWAS summary data for vertigo (n = 942 613), followed by a replication PWAS using the protein quantitative trait loci from the DLPFC in the Religious Orders Study or the Rush Memory and Aging Project dataset (n = 376). Transcriptome-wide association studies (TWASs) were then performed by integrating the same GWAS dataset for vertigo (n = 942 613) with mRNA expression references from human fetal brain and DLPFC. Chemical-related gene set enrichment analysis (GSEA) and Gene Ontology/Kyoto Encyclopedia of Genes and Genomes pathway enrichment analyses were finally conducted to further reveal the pathogenesis of vertigo. Permutation-based empirical P values were calculated in PWAS, TWAS, and GSEA. By integrating the GWAS of vertigo and two independent brain proteomes from human DLPFC, we identified three genes whose genetically regulated protein abundance levels were associated with vertigo and which were not previously implicated by GWAS: MTERFD2 (P_Banner = 0.045, P_ROSMAP = 0.031), MGST1 (P_Banner = 0.014, P_ROSMAP = 0.018), and RAB3B (P_Banner = 0.045, P_ROSMAP = 0.035). Comparison with the TWAS results identified two overlapping genes, RAB3B (P_TWAS = 0.017) and MTERFD2 (P_TWAS = 0.003), that showed significant associations with vertigo at both proteome-wide and transcriptome-wide levels. Chemical-related GSEA identified multiple chemicals that might be associated with vertigo, such as nickel (P = 0.007), glycidamide (P = 0.005), and proanthocyanidins (P = 0.015).
Our study provides novel clues for understanding the biological mechanism of vertigo and highlights several chemicals that may act as risk factors or therapeutic agents for vertigo.
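The abstract does not say how its permutation-based empirical P values were computed. One common convention, shown here purely as an assumption, counts how many null statistics are at least as extreme as the observed one and adds one to both numerator and denominator so that the estimate is never zero. The function name and inputs are hypothetical.

```python
def empirical_p(observed, null_stats):
    """Two-sided empirical p value from permutation statistics.

    observed:   test statistic from the real data (e.g. a z-score)
    null_stats: statistics recomputed on permuted data
    Assumption: p = (1 + #{|null| >= |observed|}) / (1 + N).
    """
    if not null_stats:
        raise ValueError("need at least one permutation")
    extreme = sum(1 for s in null_stats if abs(s) >= abs(observed))
    return (1 + extreme) / (1 + len(null_stats))
```

For instance, `empirical_p(3.0, [1.0, -2.0, 3.0, -4.0])` counts two null values with absolute value at least 3 and returns (1 + 2) / (1 + 4) = 0.6. With N permutations the smallest attainable value is 1 / (N + 1), which bounds how small a reported empirical P value can be.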
ABSTRACT
BACKGROUND: Compared with pain in the general population, pain in patients with depression has a more complex biological mechanism. We aimed to explore the etiological mechanism of pain in patients with depression from a genetic perspective. METHODS: Using UK Biobank samples with self-reported depression status or a PHQ score ≥10, we conducted genome-wide association studies (GWAS) of seven pain traits (N = 1,133-58,349). Additionally, we used the FUSION pipeline to perform a proteome-wide association study (PWAS) and a transcriptome-wide association study (TWAS) by integrating GWAS summary data with two different proteome reference weights (ROS/MAP and Banner) and RNA-seq gene expression reference weights, respectively. RESULTS: GWAS identified 3 significant genes associated with different pain traits in patients with depression: TRIOBP (P_GWAS = 4.48 × 10⁻⁸) for stomach or abdominal pain, SLC9A9 (P_GWAS = 2.77 × 10⁻⁸) for multisite chronic pain (MCP), and ADGRF1 (P_GWAS = 1.51 × 10⁻⁸) for neck or shoulder pain. In addition, PWAS and TWAS analyses identified multiple candidate genes associated with different pain traits in patients with depression, such as TPRG1L (P_PWAS-Banner = 3.38 × 10⁻²) and SIRPA (P_PWAS-Banner = 3.65 × 10⁻²) for MCP. Notably, when comparing the results of the PWAS and TWAS analyses, we found overlapping candidate genes for these pain traits, such as GSTM3 (P_PWAS-Banner = 3.38 × 10⁻², P_TWAS = 6.92 × 10⁻³) for stomach or abdominal pain and ATG7 (P_PWAS-ROSMAP = 3.15 × 10⁻², P_TWAS = 2.98 × 10⁻²) for MCP. CONCLUSIONS: We identified multiple novel candidate genes for pain traits in patients with depression from different genetic perspectives, providing novel clues for understanding the genetic mechanisms underlying pain in these patients.