1.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38487847

ABSTRACT

Causal discovery is a powerful tool to disclose underlying structures by analyzing purely observational data. Genetic variants can provide useful complementary information for structure learning. Recently, Mendelian randomization (MR) studies have provided abundant marginal causal relationships of traits. Here, we propose a causal network pruning algorithm, MRSL (MR-based structure learning algorithm), based on these marginal causal relationships. MRSL combines graph theory with multivariable MR to learn the conditional causal structure using only genome-wide association study (GWAS) summary statistics. Specifically, MRSL utilizes topological sorting to improve the precision of structure learning. It proposes MR-separation instead of d-separation and three candidate sufficient separating sets for MR-separation. Simulation results revealed that MRSL had an up to 2-fold higher F1 score and 100 times faster computing time than eight other competing methods. Furthermore, we applied MRSL to 26 biomarkers and 44 International Classification of Diseases 10 (ICD10)-defined diseases using GWAS summary data from UK Biobank. The results cover most of the expected causal links that have biological interpretations and several new links supported by clinical case reports or previous observational literature.


Subject(s)
Algorithms , Genome-Wide Association Study , Causality , Phenotype , Protein Transport , Mendelian Randomization Analysis , Polymorphism, Single Nucleotide
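
The abstract above states that MRSL uses topological sorting to improve the precision of structure learning. As a rough illustration of that single ingredient (not the MRSL algorithm itself), the sketch below orders the nodes of a small directed acyclic graph with Kahn's algorithm. The function name, trait names and edges are hypothetical placeholders and are not taken from the paper.

```python
from collections import deque


def topological_order(edges):
    """Return one topological order of a DAG given as (cause, effect) pairs.

    Uses Kahn's algorithm; raises ValueError if the graph contains a cycle.
    """
    nodes = sorted({node for edge in edges for node in edge})
    children = {node: [] for node in nodes}
    indegree = {node: 0 for node in nodes}
    for cause, effect in edges:
        children[cause].append(effect)
        indegree[effect] += 1

    # Start from nodes that have no incoming edge (no known cause).
    queue = deque(node for node in nodes if indegree[node] == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)

    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle")
    return order


# Hypothetical marginal causal graph (illustrative names only).
EXAMPLE_EDGES = [("BMI", "LDL"), ("BMI", "CHD"), ("LDL", "CHD")]
EXAMPLE_ORDER = topological_order(EXAMPLE_EDGES)
```

In an ordering like this, every cause precedes its effects, so a pruning step only has to consider earlier nodes as candidate parents of a later node.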
2.
PLoS Genet ; 18(3): e1010107, 2022 03.
Article in English | MEDLINE | ID: mdl-35298462

ABSTRACT

Nonrandom selection in one-sample Mendelian Randomization (MR) results in biased estimates and inflated type I error rates only when the selection effects are sufficiently large. In two-sample MR, the different selection mechanisms in two samples may more seriously affect the causal effect estimation. Firstly, we propose sufficient conditions for causal effect invariance under different selection mechanisms using two-sample MR methods. In the simulation study, we consider 49 possible selection mechanisms in two-sample MR, which depend on genetic variants (G), exposures (X), outcomes (Y) and their combination. We further compare eight pleiotropy-robust methods under different selection mechanisms. Results of simulation reveal that nonrandom selection in sample II has a larger influence on biases and type I error rates than that in sample I. Furthermore, selections depending on X+Y, G+Y, or G+X+Y in sample II lead to larger biases than other selection mechanisms. Notably, when selection depends on Y, bias of causal estimation for non-zero causal effect is larger than that for null causal effect. In particular, the mode-based estimate has the largest standard errors among the eight methods. In the absence of pleiotropy, selections depending on Y or G in sample II show nearly unbiased causal effect estimations when the causal effect is null. In the scenarios of balanced pleiotropy, all eight MR methods, especially MR-Egger, demonstrate large biases because the nonrandom selections result in the violation of the Instrument Strength Independent of Direct Effect (InSIDE) assumption. When directional pleiotropy exists, nonrandom selections have a severe impact on the eight MR methods. Application demonstrates that the nonrandom selection in sample II (coronary heart disease patients) can magnify the causal effect estimation of obesity on HbA1c levels.
In conclusion, nonrandom selection in two-sample MR exacerbates the bias of causal effect estimation for pleiotropy-robust MR methods.


Subject(s)
Genetic Variation , Mendelian Randomization Analysis , Bias , Causality , Genetic Pleiotropy , Humans , Mendelian Randomization Analysis/methods
3.
Am J Hum Genet ; 108(2): 240-256, 2021 02 04.
Article in English | MEDLINE | ID: mdl-33434493

ABSTRACT

A transcriptome-wide association study (TWAS) integrates data from genome-wide association studies and gene expression mapping studies for investigating the gene regulatory mechanisms underlying diseases. Existing TWAS methods are primarily univariate in nature, focusing on analyzing one outcome trait at a time. However, many complex traits are correlated with each other and share a common genetic basis. Consequently, analyzing multiple traits jointly through multivariate analysis can potentially improve the power of TWASs. Here, we develop a method, moPMR-Egger (multiple outcome probabilistic Mendelian randomization with Egger assumption), for analyzing multiple outcome traits in TWAS applications. moPMR-Egger examines one gene at a time, relies on its cis-SNPs that are in potential linkage disequilibrium with each other to serve as instrumental variables, and tests its causal effects on multiple traits jointly. A key feature of moPMR-Egger is its ability to test and control for potential horizontal pleiotropic effects from instruments, thus maximizing power while minimizing false associations for TWASs. In simulations, moPMR-Egger provides calibrated type I error control for both causal effects testing and horizontal pleiotropic effects testing and is more powerful than existing univariate TWAS approaches in detecting causal associations. We apply moPMR-Egger to analyze 11 traits from 5 trait categories in the UK Biobank. In the analysis, moPMR-Egger identified 13.15% more gene associations than univariate approaches across trait categories and revealed distinct regulatory mechanisms underlying systolic and diastolic blood pressures.


Subject(s)
Genetic Association Studies , Multifactorial Inheritance , Transcriptome , Blood Pressure/genetics , Computer Simulation , Genetic Pleiotropy , Humans , Linkage Disequilibrium , Mendelian Randomization Analysis , Models, Genetic , Multivariate Analysis , Phenotype , Polymorphism, Single Nucleotide
4.
Clin Endocrinol (Oxf) ; 100(3): 294-303, 2024 03.
Article in English | MEDLINE | ID: mdl-38214116

ABSTRACT

This study aimed to evaluate whether there is a causal relationship between autoimmune thyroid disorders (AITDs) and telomere length (TL) in the European population and whether there is reverse causality. In this study, Mendelian randomization (MR) and colocalization analysis were conducted to assess the potential causal relationship between AITDs and TL using summary statistics from large-scale genome-wide association studies, followed by analysis of the relationship between TL and thyroid stimulating hormone and free thyroxine (FT4) to help interpret the findings. The inverse variance weighted (IVW) method was used to obtain the causal estimates. The weighted median, MR-Egger and leave-one-out methods were used as sensitivity analyses. The IVW results showed a significant causal relationship between autoimmune hyperthyroidism and TL (β = -1.93 × 10⁻²; p = 4.54 × 10⁻⁵). There was no causal relationship between autoimmune hypothyroidism and TL (β = -3.99 × 10⁻³; p = 0.324). The results of the reverse MR analysis showed that genetically predicted TL had a significant causal effect on autoimmune hyperthyroidism (IVW: odds ratio (OR) = 0.49; p = 2.83 × 10⁻⁴) and autoimmune hypothyroidism (IVW: OR = 0.86; p = 7.46 × 10⁻³). Both horizontal pleiotropy and heterogeneity tests indicated the validity of our bidirectional MR study. Finally, colocalization analysis suggested that there were shared causal variants between autoimmune hyperthyroidism and TL, further highlighting the robustness of the results. In conclusion, autoimmune hyperthyroidism may accelerate telomere attrition, and telomere attrition is a causal factor for AITDs.


Subject(s)
Graves Disease , Hashimoto Disease , Hypothyroidism , Thyroiditis, Autoimmune , Humans , Genome-Wide Association Study , Mendelian Randomization Analysis , Telomere/genetics , Hypothyroidism/genetics
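
The abstract above names the inverse variance weighted (IVW) method as its main estimator. A minimal sketch of the standard fixed-effect IVW formula from summary statistics follows, assuming uncorrelated instruments and ignoring uncertainty in the SNP-exposure estimates. The function name and the toy numbers are illustrative assumptions, not data from the study.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse variance weighted (IVW) causal estimate.

    Each list holds one value per genetic instrument:
      beta_exposure: SNP-exposure effect estimates
      beta_outcome:  SNP-outcome effect estimates
      se_outcome:    standard errors of the SNP-outcome estimates
    Returns (estimate, standard_error).
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")

    # Weighted regression of outcome effects on exposure effects
    # through the origin, with weights 1 / se_outcome**2.
    numerator = sum(
        bx * by / se ** 2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(
        bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)
    )
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Toy summary statistics (hypothetical): every ratio by/bx equals 0.5.
estimate, standard_error = ivw_estimate([0.1, 0.2], [0.05, 0.10], [0.01, 0.01])
```

With these toy inputs every per-SNP ratio is 0.5, so the pooled estimate is also 0.5; real analyses would add the sensitivity methods the abstract lists.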
5.
Respir Res ; 25(1): 8, 2024 Jan 04.
Article in English | MEDLINE | ID: mdl-38178157

ABSTRACT

BACKGROUND: The mortality rate of acute respiratory distress syndrome (ARDS) increases with age (≥ 65 years old) in critically ill patients, and it is necessary to prevent mortality in elderly patients with ARDS in the intensive care unit (ICU). Among the potential risk factors, dynamic subphenotypes of respiratory rate (RR), heart rate (HR), and respiratory rate-oxygenation (ROX) and their associations with 28-day mortality have not been clearly explored. METHODS: Based on the eICU Collaborative Research Database (eICU-CRD), this study used a group-based trajectory model to identify longitudinal subphenotypes of RR, HR, and ROX during the first 72 h of ICU stays. A logistic model was used to evaluate the associations of trajectories with 28-day mortality considering the group with the lowest rate of mortality as a reference. Restricted cubic spline was used to quantify linear and nonlinear effects of static RR-related factors during the first 72 h of ICU stays on 28-day mortality. Receiver operating characteristic (ROC) curves were used to assess the prediction models with the Delong test. RESULTS: A total of 938 critically ill elderly patients with ARDS were included, with 5 trajectories each for RR and HR. A total of 204 patients fit 4 ROX trajectories. In the subphenotypes of RR, when compared with group 4, the odds ratio (OR) and 95% confidence interval (CI) of group 3 were 2.74 (1.48-5.07) (P = 0.001). Regarding the HR subphenotypes, in comparison to group 1, the ORs and 95% CIs were 2.20 (1.19-4.08) (P = 0.012) for group 2, 2.70 (1.40-5.23) (P = 0.003) for group 3, and 2.16 (1.04-4.49) (P = 0.040) for group 5. Low last ROX had a higher mortality risk (P linear = 0.023, P nonlinear = 0.010). Trajectories of RR and HR improved the predictive ability for 28-day mortality (AUC increased by 2.5%, P = 0.020).
CONCLUSIONS: For RR and HR, longitudinal subphenotypes are risk factors for 28-day mortality and have additional predictive enrichment, whereas the last ROX during the first 72 h of ICU stays is associated with 28-day mortality. These findings indicate that maintaining healthy dynamic subphenotypes of RR and HR in the ICU and elevating static ROX after initial critical care may have potentially beneficial effects on prognosis in critically ill elderly patients with ARDS.


Subject(s)
Critical Illness , Respiratory Distress Syndrome , Humans , Aged , Respiratory Distress Syndrome/diagnosis , Lung , Prognosis , Vital Signs , Retrospective Studies
6.
Stat Med ; 2024 Jun 24.
Article in English | MEDLINE | ID: mdl-38922944

ABSTRACT

Brain functional connectivity can typically be represented as a brain functional network, where nodes represent regions of interest (ROIs) and edges symbolize their connections. Studying group differences in brain functional connectivity can help identify brain regions and recover the brain functional network linked to neurodegenerative diseases. This process, known as differential network analysis, focuses on the differences between estimated precision matrices for two groups. Current methods struggle with individual heterogeneity in measuring brain connectivity, false discovery rate (FDR) control, and accounting for confounding factors, resulting in biased estimates and diminished power. To address these issues, we present a two-stage FDR-controlled feature selection method for differential network analysis using functional magnetic resonance imaging (fMRI) data. First, we create individual brain connectivity measures using a high-dimensional precision matrix estimation technique. Next, we devise a penalized logistic regression model that employs individual brain connectivity data and integrates a new knockoff filter for FDR control when detecting significant differential edges. Through extensive simulations, we showcase the superiority of our approach compared to other methods. Additionally, we apply our technique to fMRI data to identify differential edges between Alzheimer's disease and control groups. Our results are consistent with prior experimental studies, emphasizing the practical applicability of our method.

7.
Hum Genet ; 2023 Dec 24.
Article in English | MEDLINE | ID: mdl-38143258

ABSTRACT

It remains challenging to translate the findings from genome-wide association studies (GWAS) of autoimmune diseases (AIDs) into interventional targets, presumably due to the lack of knowledge on how the GWAS risk variants contribute to AIDs. In addition, current immunomodulatory drugs for AIDs are broad in action rather than disease-specific. We performed a comprehensive protein-centric omics integration analysis to identify AIDs-associated plasma proteins by integrating plasma protein quantitative trait loci datasets (1348 proteins and 7213 individuals) and a total of ten large-scale GWAS summary statistics of AIDs under a cutting-edge systematic analytic framework. Specifically, we initially screened the protein-AID associations using a proteome-wide association study (PWAS), followed by enrichment analysis to reveal the underlying biological processes and pathways. Then, we performed both Mendelian randomization (MR) and colocalization analyses to further identify protein-AID pairs with putatively causal relationships. We finally prioritized the potential drug targets for AIDs. A total of 174 protein-AID associations were identified by PWAS. AIDs-associated plasma proteins were significantly enriched in immune-related biological processes and pathways, such as inflammatory response (P = 3.96 × 10⁻¹⁰). MR analysis further identified 97 protein-AID pairs with potential causal relationships, among which 21 pairs were highly supported by colocalization analysis (PP.H4 > 0.75); 10 of these 21 were newly discovered pairs not reported in previous GWAS analyses. Further explorations showed that four proteins (TLR3, FCGR2A, IL23R, TCN1) have corresponding drugs, and 17 proteins have druggability. These findings will help us to further understand the biological mechanism of AIDs and highlight the potential of these proteins to be developed as therapeutic targets for AIDs.

8.
BMC Genomics ; 23(1): 562, 2022 Aug 06.
Article in English | MEDLINE | ID: mdl-35933330

ABSTRACT

BACKGROUND: Transcriptome-wide association studies (TWASs) have shown great promise in interpreting the findings from genome-wide association studies (GWASs) and exploring disease mechanisms by integrating GWAS and eQTL mapping studies. Almost all TWAS methods focus on only one gene at a time, with the exception of two published multiple-gene methods, which nevertheless fail to account for the inter-dependence and the network structure among multiple genes. This may lead to power loss in TWAS analysis, as complex diseases are often due to multiple genes that interact with each other as a biological network. We therefore developed a Network Regression method in a two-stage TWAS framework (NeRiT) to detect whether a given network is associated with the traits of interest. NeRiT adopts the flexible Bayesian Dirichlet process regression to obtain the gene expression prediction weights in the first stage, uses pointwise mutual information to represent the general between-node correlation in the second stage and can effectively take the network structure among different gene nodes into account. RESULTS: Comprehensive and realistic simulations indicated that NeRiT had calibrated type I error control for testing both the node effect and edge effect, and yielded higher power than existing methods, especially in testing the edge effect. The results were consistent regardless of the GWAS sample size, the gene expression prediction model in the first step of TWAS, the network structure and the correlation pattern among different gene nodes. Real data applications analyzing systolic blood pressure and diastolic blood pressure from UK Biobank showed that NeRiT can simultaneously identify the trait-related nodes as well as the trait-related edges. CONCLUSIONS: NeRiT is a powerful and efficient network regression method in TWAS.


Subject(s)
Genome-Wide Association Study , Transcriptome , Bayes Theorem , Genetic Predisposition to Disease , Genome-Wide Association Study/methods , Humans , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Regression Analysis
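
The abstract above says NeRiT uses pointwise mutual information to represent between-node correlation. The abstract does not describe how the probabilities are estimated from gene expression, so the sketch below only shows the generic definition, PMI(x, y) = log2(p(x, y) / (p(x) p(y))), for two events. The function name and the probabilities are illustrative assumptions.

```python
import math


def pointwise_mutual_information(p_xy, p_x, p_y):
    """Generic pointwise mutual information in bits.

    p_xy: joint probability of events x and y
    p_x, p_y: marginal probabilities of x and y
    A value of 0 means the two events are independent; positive values
    mean they co-occur more often than independence would predict.
    """
    if min(p_xy, p_x, p_y) <= 0.0:
        raise ValueError("probabilities must be strictly positive")
    return math.log2(p_xy / (p_x * p_y))


# Hypothetical probabilities for two binarized gene-expression events.
independent = pointwise_mutual_information(0.25, 0.5, 0.5)
co_occurring = pointwise_mutual_information(0.5, 0.5, 0.5)
```

Here the first call describes independent events and the second describes events that always occur together, which yields one bit of shared information.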
9.
Hum Mol Genet ; 29(13): 2261-2274, 2020 08 03.
Article in English | MEDLINE | ID: mdl-32329512

ABSTRACT

Observational studies have shown an inverse association between birth weight and chronic kidney disease (CKD) in adulthood. However, whether such an association is causal remains elusive. Moreover, no prior study distinguished the direct fetal effect from the indirect maternal effect. Herein, we aimed to investigate the causal relationship between birth weight and CKD and to understand the relative fetal and maternal contributions. Meta-analysis (n = ~22 million) showed that low birth weight led to ~83% (95% confidence interval [CI] 37-146%) higher risk of CKD in late life. With summary statistics from large-scale GWASs (n = ~300 000 for birth weight and ~481 000 for CKD), linkage disequilibrium score regression demonstrated that birth weight had a negative maternal, but not fetal, genetic correlation with CKD and several other kidney-function related phenotypes. Furthermore, with multiple instruments of birth weight, Mendelian randomization showed a negative fetal causal association (OR = 1.10, 95% CI 1.01-1.16) between birth weight and CKD; a negative but non-significant maternal causal association (OR = 1.09, 95% CI 0.98-1.21) was also identified. Those associations were robust against various sensitivity analyses. However, no maternal/fetal causal effects of birth weight were significant for other kidney-function related phenotypes. Overall, our study confirmed the inverse association between birth weight and CKD observed in prior studies, and further revealed the shared maternal genetic foundation between low birth weight and CKD; the direct fetal and indirect maternal causal effects of birth weight may together drive this negative relationship.


Subject(s)
Birth Weight/genetics , Kidney/metabolism , Renal Insufficiency, Chronic/genetics , Birth Weight/physiology , Female , Genome-Wide Association Study , Humans , Infant, Low Birth Weight/growth & development , Infant, Low Birth Weight/metabolism , Infant, Newborn , Kidney/pathology , Male , Mendelian Randomization Analysis , Meta-Analysis as Topic , Polymorphism, Single Nucleotide/genetics , Renal Insufficiency, Chronic/epidemiology , Renal Insufficiency, Chronic/physiopathology , Systematic Reviews as Topic
10.
BMC Med ; 20(1): 214, 2022 06 22.
Article in English | MEDLINE | ID: mdl-35729600

ABSTRACT

BACKGROUND: The current genome-wide association study (GWAS) of Lewy body dementia (LBD) suffers from low power due to a limited sample size. In addition, the genetic determinants underlying LBD and the shared genetic etiology with Alzheimer's disease (AD) and Parkinson's disease (PD) remain poorly understood. METHODS: Using the largest GWAS summary statistics of LBD to date (2591 cases and 4027 controls), late-onset AD (86,531 cases and 676,386 controls), and PD (33,674 cases and 449,056 controls), we comprehensively investigated the genetic basis of LBD and shared genetic etiology among LBD, AD, and PD. We first conducted genetic correlation analysis using linkage disequilibrium score regression (LDSC), followed by multi-trait analysis of GWAS (MTAG) and association analysis based on SubSETs (ASSET) to identify the trait-specific SNPs. We then performed SNP-level functional annotation to identify significant genomic risk loci paired with Bayesian fine-mapping and colocalization analysis to identify potential causal variants. Parallel gene-level analysis including GCTA-fastBAT and transcriptome-wide association analysis (TWAS) was implemented to explore novel LBD-associated genes, followed by pathway enrichment analysis to understand underlying biological mechanisms. RESULTS: Pairwise LDSC analysis found positive genome-wide genetic correlations between LBD and AD (rg = 0.6603, se = 0.2001; P = 0.0010), between LBD and PD (rg = 0.6352, se = 0.1880; P = 0.0007), and between AD and PD (rg = 0.2136, se = 0.0860; P = 0.0130). We identified 13 significant loci for LBD, including 5 previously reported loci (1q22, 2q14.3, 4p16.3, 4q22.1, and 19q13.32) and 8 novel biologically plausible genetic associations (5q12.1, 5q33.3, 6p21.1, 8p23.1, 8p21.1, 16p11.2, 17p12, and 17q21.31), among which APOC1 (19q13.32), SNCA (4q22.1), TMEM175 (4p16.3), CLU (8p21.1), MAPT (17q21.31), and FBXL19 (16p11.2) were also validated by gene-level analysis. 
Pathway enrichment analysis of 40 common genes identified by GCTA-fastBAT and TWAS implicated a significant role of neurofibrillary tangle assembly (GO:1902988, adjusted P = 1.55 × 10⁻²). CONCLUSIONS: Our findings provide novel insights into the genetic determinants of LBD and the shared genetic etiology and biological mechanisms of LBD, AD, and PD, which could benefit the understanding of the co-pathology as well as the potential treatment of these diseases simultaneously.


Subject(s)
Alzheimer Disease , Lewy Body Disease , Parkinson Disease , Alzheimer Disease/genetics , Alzheimer Disease/pathology , Bayes Theorem , Genome-Wide Association Study , Humans , Lewy Body Disease/genetics , Lewy Body Disease/pathology , Parkinson Disease/genetics
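
The genetic correlations in the abstract above are reported with standard errors and P values. As a quick consistency check, and assuming these P values come from a two-sided normal (Wald) test of rg / se, the sketch below recomputes them from the reported estimates. The function name is hypothetical, and the abstract does not state which test was used.

```python
import math


def two_sided_normal_p(estimate, standard_error):
    """Two-sided P value for z = estimate / standard_error under N(0, 1)."""
    z = abs(estimate / standard_error)
    # P(|Z| > z) for a standard normal equals erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


# (rg, se) pairs as reported in the abstract.
reported = {
    "LBD-AD": (0.6603, 0.2001),
    "LBD-PD": (0.6352, 0.1880),
    "AD-PD": (0.2136, 0.0860),
}
recomputed = {pair: two_sided_normal_p(rg, se) for pair, (rg, se) in reported.items()}
```

Under that assumption the recomputed values agree with the reported P values of 0.0010, 0.0007 and 0.0130 to the precision given.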
11.
BMC Cancer ; 22(1): 1070, 2022 Oct 17.
Article in English | MEDLINE | ID: mdl-36253742

ABSTRACT

BACKGROUND: Breast cancer (BC) is one of the most prevalent cancers worldwide but its etiology remains unclear. Obesity is recognized as a risk factor for BC, and many obesity-related genes may be involved in its occurrence and development. Research assessing the complex genetic mechanisms of BC should not only consider the effect of a single gene on the disease, but also focus on the interaction between genes. This study sought to construct a gene interaction network to identify potential pathogenic BC genes. METHODS: The study included 953 BC patients and 963 control individuals. Chi-square analysis was used to assess the correlation between demographic characteristics and BC. The joint density-based non-parametric differential interaction network analysis and classification (JDINAC) method was used to build a BC gene interaction network using single nucleotide polymorphisms (SNPs). The odds ratio (OR) and 95% confidence interval (95% CI) of hub gene SNPs were evaluated using a logistic regression model. To assess reliability, the hub genes were quantified with the edgeR program using BC RNA-seq data from The Cancer Genome Atlas (TCGA), and identical edges were verified by logistic regression using UK Biobank datasets. GO and KEGG enrichment analyses were used to explore the biological functions of interactive genes. RESULTS: Body mass index (BMI) and menopause are important risk factors for BC. After adjusting for potential confounding factors, the BC gene interaction network was identified using JDINAC. LEP, LEPR, XRCC6, and RETN were identified as hub genes and both hub genes and edges were verified. LEPR genetic polymorphisms (rs1137101 and rs4655555) were also significantly associated with BC. Enrichment analysis showed that the identified genes were mainly involved in energy regulation and fat-related signaling pathways. CONCLUSION: We explored the interaction network of genes derived from SNP data in BC progression.
Gene interaction networks provide new insight into the underlying mechanisms of BC.


Subject(s)
Breast Neoplasms , Breast Neoplasms/pathology , Female , Gene Expression Regulation, Neoplastic , Gene Regulatory Networks , Humans , Machine Learning , Obesity/genetics , Polymorphism, Single Nucleotide , Reproducibility of Results
12.
Int J Mol Sci ; 23(21)2022 Nov 04.
Article in English | MEDLINE | ID: mdl-36362342

ABSTRACT

Genome-wide association studies (GWAS) of juvenile idiopathic arthritis (JIA) suffer from low power due to limited sample size and are difficult to interpret because most signals are located in non-coding regions. Gene-level analysis could alleviate these issues. Using GWAS summary statistics, we performed two typical gene-level analyses of JIA, a transcriptome-wide association study (TWAS) using FUnctional Summary-based ImputatiON (FUSION) and gene-based analysis using eQTL Multi-marker Analysis of GenoMic Annotation (eMAGMA), followed by comprehensive enrichment analysis. Among 33 overlapping significant genes from these two methods, 11 were previously reported, including TYK2 (P_FUSION = 5.12 × 10⁻⁶, P_eMAGMA = 1.94 × 10⁻⁷ for whole blood), IL-6R (P_FUSION = 8.63 × 10⁻⁷, P_eMAGMA = 2.74 × 10⁻⁶ for cells EBV-transformed lymphocytes), and Fas (P_FUSION = 5.21 × 10⁻⁵, P_eMAGMA = 1.08 × 10⁻⁶ for muscle skeletal). Several novel, plausible JIA-associated genes are also reported, including IL-27 (P_FUSION = 2.10 × 10⁻⁷, P_eMAGMA = 3.93 × 10⁻⁸ for liver), LAT (P_FUSION = 1.53 × 10⁻⁴, P_eMAGMA = 4.62 × 10⁻⁷ for artery aorta), and MAGI3 (P_FUSION = 1.30 × 10⁻⁵, P_eMAGMA = 1.73 × 10⁻⁷ for muscle skeletal). Enrichment analysis further highlighted 4 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways and 10 Gene Ontology (GO) terms. Our findings can benefit the understanding of genetic determinants and potential therapeutic targets for JIA.


Subject(s)
Arthritis, Juvenile , Transcriptome , Humans , Genome-Wide Association Study/methods , Arthritis, Juvenile/genetics , RNA, Messenger/genetics , Gene Ontology , Genetic Predisposition to Disease , Polymorphism, Single Nucleotide
13.
BMC Genet ; 21(1): 90, 2020 08 26.
Article in English | MEDLINE | ID: mdl-32847502

ABSTRACT

BACKGROUND: Genome-wide association studies (GWAS) have successfully identified genetic susceptibility variants for complex diseases. However, the underlying mechanism of such associations remains largely unknown. Most disease-associated genetic variants have been shown to reside in noncoding regions, leading to the hypothesis that regulation of gene expression may be the primary biological mechanism. Current methods to characterize how gene expression mediates the effect of a genetic variant on disease often analyze one gene at a time and ignore the network structure. The impact of a genetic variant can propagate to other genes along the links in the network, and then to the final disease. There could be multiple pathways from the genetic variant to the final disease, each having a chain structure, since the first node is one specific SNP (Single Nucleotide Polymorphism) variant and the end is the disease outcome. One key but inadequately addressed question is how to measure the between-node connection strength and rank the effects of such chain-type pathways, which can provide statistical evidence for prioritizing some pathways for potential drug development in a cost-effective manner. RESULTS: We first introduce the maximal correlation coefficient (MCC) to represent the between-node connection, and then integrate MCC with the K shortest paths algorithm to rank and identify the potential pathways from genetic variant to disease. The pathway importance score (PIS) was further provided to quantify the importance of each pathway. We termed this method "MCC-SP". Various simulations were conducted to illustrate that MCC is a better measurement of the between-node connection strength than other quantities including Pearson correlation, Spearman correlation, distance correlation, mutual information, and maximal information coefficient.
Finally, we applied MCC-SP to analyze one real dataset from the Religious Orders Study and the Memory and Aging Project, and successfully detected 2 typical pathways from APOE genotype to Alzheimer's disease (AD) through gene expression enriched in Alzheimer's disease pathway. CONCLUSIONS: MCC-SP has powerful and robust performance in identifying the pathway(s) from the genetic variant to the disease. The source code of MCC-SP is freely available at GitHub ( https://github.com/zhuyuchen95/ADnet ).


Subject(s)
Genetic Predisposition to Disease , Genome-Wide Association Study , Polymorphism, Single Nucleotide , Algorithms , Alzheimer Disease/genetics , Computer Simulation , Genotype , Humans , Models, Genetic , Software
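
The abstract above ranks chain-type pathways from a SNP to a disease by combining a connection strength with a K shortest paths search. The toy sketch below illustrates the ranking idea only. It enumerates all simple paths by brute force rather than using a dedicated K shortest paths algorithm, and it converts each strength s into an edge length of -log(s), which is an assumption made here for illustration and not necessarily the transformation used in MCC-SP. Node names and strengths are hypothetical.

```python
import math


def k_strongest_paths(strength, source, target, k):
    """Return the k lowest-cost simple paths as (cost, path) tuples.

    strength maps a directed edge (u, v) to a connection strength in (0, 1].
    Edge cost is -log(strength), so a low total cost means a high product
    of strengths along the path. Brute force: suitable for tiny graphs only.
    """
    neighbors = {}
    for (u, v), s in strength.items():
        neighbors.setdefault(u, []).append((v, -math.log(s)))

    found = []

    def walk(node, path, cost):
        if node == target:
            found.append((cost, path))
            return
        for nxt, weight in neighbors.get(node, []):
            if nxt not in path:  # keep paths simple (no repeated node)
                walk(nxt, path + [nxt], cost + weight)

    walk(source, [source], 0.0)
    found.sort(key=lambda item: item[0])
    return found[:k]


# Hypothetical network: one SNP, two genes, one disease node.
TOY_STRENGTH = {
    ("SNP", "G1"): 0.9,
    ("G1", "D"): 0.8,
    ("SNP", "G2"): 0.5,
    ("G2", "D"): 0.5,
    ("G1", "G2"): 0.9,
}
TOP_TWO = k_strongest_paths(TOY_STRENGTH, "SNP", "D", 2)
```

In this toy network the path through G1 alone has the largest product of strengths (0.72), followed by the path through G1 and G2 (0.405).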
14.
BMC Med Genet ; 19(1): 153, 2018 08 29.
Article in English | MEDLINE | ID: mdl-30157802

ABSTRACT

BACKGROUND: Previous studies have reported that the potassium voltage-gated channel subfamily Q member 1 (KCNQ1) gene is associated with diabetes in both European and Asian populations. This study aims to identify a single nucleotide polymorphism (SNP) that predicts the risk of metabolic syndrome (MetS) by investigating the association of SNPs in the KCNQ1 gene with MetS in Han Chinese women from a northern urban area. METHODS: Six SNPs were selected and genotyped in 1381 unrelated women aged 21 and above who had a physical check-up at Shandong Provincial Qianfoshan Hospital. A Cox proportional hazards model was used to assess the association between SNPs and MetS. RESULTS: Sixty-one women developed MetS between 2010 and 2015 during 3055 person-years of follow-up. The cumulative incidence density was 19.964/1000 person-years. The SNP rs163182 was associated with MetS both in the additive genetic model (RR = 1.658, 95% CI: 1.144-2.402) and in the recessive genetic model (RR = 2.461, 95% CI: 1.347-4.496). It remained significant after adjustment. This relationship was also observed in MetS components (BMI and SBP). CONCLUSION: A novel association between rs163182 and MetS was found in this study, which can predict the occurrence of MetS among northern urban Han Chinese women. Further investigation is needed to assess the possible pathway through which the KCNQ1 gene affects MetS.


Subject(s)
Asian People/genetics , Genetic Predisposition to Disease/genetics , KCNQ1 Potassium Channel/economics , Metabolic Syndrome/genetics , Polymorphism, Single Nucleotide/genetics , Adult , Cohort Studies , Female , Genotype , Humans , Incidence , Middle Aged , Risk Factors
15.
BMC Endocr Disord ; 18(1): 17, 2018 Mar 07.
Article in English | MEDLINE | ID: mdl-29514621

ABSTRACT

BACKGROUND: Thyroid nodules are highly prevalent, but a robust, feasible method for malignancy differentiation has not yet been well documented. This study aimed to establish a practical model for thyroid nodule discrimination. METHODS: Records for 2984 patients who underwent thyroidectomy were analyzed. Clinical, laboratory, and US variables were assessed retrospectively. Multivariate logistic regression analysis was performed and a mathematical model was established for malignancy prediction. RESULTS: The results showed that the malignant group was younger and had smaller nodules than the benign group (43.5 ± 11.6 vs. 48.5 ± 11.5 y, p < 0.001; 1.96 ± 1.16 vs. 2.75 ± 1.70 cm, p < 0.001, respectively). The serum thyrotropin (TSH) level (median = 1.63 mIU/L, IQR (0.89-2.66) vs. 1.19 (0.59-2.10), p < 0.001) was higher in the malignant group than in the benign group. Patients with malignancies tested positive for anti-thyroglobulin antibody (TGAb) and anti-thyroid peroxidase antibody (TPOAb) more frequently than those with benign nodules (TGAb, 30.3% vs. 15.0%, p < 0.001; TPOAb, 25.6% vs. 18.0%, p = 0.028). The prevalence of ultrasound (US) features (irregular shape, ill-defined margin, solid structure, hypoechogenicity, microcalcifications, macrocalcifications and central intranodular flow) was significantly higher in the malignant group. Multivariate logistic regression analysis confirmed that age (OR = 0.963, 95% CI = 0.934-0.993, p = 0.017), TGAb (OR = 4.435, 95% CI = 1.902-10.345, p = 0.001), hypoechogenicity (OR = 2.830, 95% CI = 1.113-7.195, p = 0.029), microcalcifications (OR = 4.624, 95% CI = 2.008-10.646, p < 0.001), and central intranodular flow (OR = 2.155, 95% CI = 1.011-4.594, p < 0.05) were independent predictors of thyroid malignancy. A predictive model including four variables (age, TGAb, hypoechogenicity and microcalcification) showed an optimal discriminatory accuracy (area under the curve, AUC) of 0.808 (95% CI = 0.761-0.855). 
The best cut-off value for prediction was 0.52, achieving sensitivity and specificity of 84.6% and 76.3%, respectively. CONCLUSION: A predictive model of malignancy that combines clinical, laboratory and sonographic characteristics would aid clinicians in avoiding unnecessary procedures and making better clinical decisions.
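
The cut-off evaluation reported above can be illustrated with a minimal sketch. It assumes predicted malignancy probabilities are already available from some fitted model; the function name, the toy inputs, and the choice to classify a probability equal to the cut-off as positive are illustrative assumptions, not details taken from the study.

```python
def sensitivity_specificity(probs, labels, cutoff=0.52):
    """Return (sensitivity, specificity) for binary labels at a probability cut-off.

    probs  : predicted probabilities of malignancy (hypothetical model output)
    labels : 1 = malignant, 0 = benign
    cutoff : a probability >= cutoff is classified as malignant (assumed convention)
    """
    tp = fn = tn = fp = 0
    for p, y in zip(probs, labels):
        predicted = 1 if p >= cutoff else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one malignant and one benign case")
    return tp / (tp + fn), tn / (tn + fp)
```

Sweeping `cutoff` over a grid and keeping the value that maximizes, for example, sensitivity + specificity is one common way to choose a "best" cut-off; the abstract does not state which criterion was used.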


Subject(s)
Autoantibodies/blood , Models, Theoretical , Thyroid Hormones/blood , Thyroid Neoplasms/diagnosis , Thyroid Nodule/diagnosis , Ultrasonography/methods , Adult , Female , Follow-Up Studies , Humans , Male , Middle Aged , Prognosis , Retrospective Studies , Thyroid Neoplasms/blood , Thyroid Neoplasms/diagnostic imaging , Thyroid Nodule/blood , Thyroid Nodule/diagnostic imaging
16.
Lipids Health Dis ; 17(1): 78, 2018 Apr 11.
Article in English | MEDLINE | ID: mdl-29642923

ABSTRACT

BACKGROUND: Macrosomia is a serious public health problem worldwide due to its increasing prevalence and adverse influences on maternal and neonatal outcomes. Maternal dyslipidemia exerts potential and adverse impacts on pregnant women and newborns. However, the association between maternal serum lipids and the risk of macrosomia has not yet been clearly elucidated. We explored the association between the maternal lipid profile at late gestation and the risk of macrosomia among women without diabetes mellitus (DM). METHODS: The medical records of 5407 pregnant women giving birth to single live babies at term were retrospectively analyzed. Subjects with DM, hypertension, thyroid disorders and fetal malformation were excluded. Maternal fasting serum lipids were measured during late pregnancy. Logistic regression analysis was used to analyze the variables associated with the risk of macrosomia. RESULTS: Maternal serum triglyceride (TG) and high-density lipoprotein cholesterol (HDL-C) levels were related to macrosomia; each 1 mmol/L increase in TG resulted in a 27% increase in macrosomia risk, while each 1 mmol/L increase in HDL-C level resulted in a 37% decrease in macrosomia risk, even after adjusting for potential confounders. Notably, the risk of macrosomia increased progressively with increased maternal serum TG levels and decreased HDL-C levels. Compared with women with serum TG levels < 2.5 mmol/L, women with TG levels greater than 3.92 mmol/L had an approximately 2.8-fold increased risk of macrosomia. Compared with women with serum HDL-C levels above 2.23 mmol/L, women with HDL-C levels of less than 1.62 mmol/L had a 1.9-fold increased risk of giving birth to an infant with macrosomia. 
In addition, a higher risk of macrosomia was observed in women with simultaneous hypertriglyceridemia and low serum HDL-C levels (odds ratio [OR] 2.400, 95% confidence interval [CI]: 1.760-3.274) compared to those with hypertriglyceridemia or low serum HDL-C alone (OR 2.074, 95% CI: 1.609-2.673 and OR 1.363, 95% CI: 1.028-1.809, respectively). CONCLUSIONS: Maternal serum TG levels and HDL-C levels at late gestation are independent predictors of macrosomia in women without DM.
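
As a worked illustration of the per-unit effects above, the sketch below rescales an odds ratio to a different change in the predictor. It assumes that the reported 27% increase and 37% decrease are changes in odds from the logistic regression (odds ratios of 1.27 and 0.63 per 1 mmol/L) and that the log-odds are linear in the lipid level; the abstract does not confirm either point, and the function name is hypothetical.

```python
import math


def odds_ratio_for_change(or_per_unit, delta):
    """Odds ratio implied for a change of `delta` units in the predictor.

    Under a logistic model, log-odds change by beta per unit, where
    beta = ln(OR per unit), so the OR for `delta` units is exp(beta * delta).
    """
    beta = math.log(or_per_unit)
    return math.exp(beta * delta)
```

Under these assumptions, `odds_ratio_for_change(1.27, 2)` gives about 1.61 for a 2 mmol/L rise in TG. This will not reproduce the category comparisons in the abstract, which contrast groups defined by thresholds rather than a fixed increment.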


Subject(s)
Diabetes, Gestational/blood , Fetal Macrosomia/blood , Lipids/blood , Adult , Birth Weight , Cholesterol, HDL/blood , Female , Humans , Infant, Newborn , Logistic Models , Multivariate Analysis , Pregnancy , Risk Factors , Triglycerides/blood
18.
BMC Med Res Methodol ; 17(1): 177, 2017 12 28.
Article in English | MEDLINE | ID: mdl-29281984

ABSTRACT

BACKGROUND: Confounders can produce spurious associations between exposure and outcome in observational studies. For the majority of epidemiologists, adjusting for confounders using a logistic regression model is the habitual method, though it has some problems in accuracy and precision. It is, therefore, important to highlight the problems of logistic regression and to search for alternative methods. METHODS: Four causal diagram models were defined to summarize confounding equivalence. Both theoretical proofs and simulation studies were performed to verify whether conditioning on different confounding equivalence sets had the same bias-reducing potential and then to select the optimum adjusting strategy, in which the logistic regression model and the inverse probability weighting based marginal structural model (IPW-based-MSM) were compared. The "do-calculus" was used to calculate the true causal effect of exposure on outcome, and the bias and standard error were then used to evaluate the performance of the different strategies. RESULTS: Adjusting for different sets of confounding equivalence, as judged by identical Markov boundaries, produced different bias-reducing potential in the logistic regression model. For the sets that satisfied G-admissibility, adjusting for the set including all the confounders reduced bias to the same extent as adjusting for the set containing the parent nodes of the outcome, whereas the bias after adjusting for the parent nodes of exposure was not equivalent to these. In addition, all causal effect estimations through logistic regression were biased, although the estimation after adjusting for the parent nodes of exposure was nearest to the true causal effect. However, conditioning on different confounding equivalence sets had the same bias-reducing potential under IPW-based-MSM. 
Compared with logistic regression, the IPW-based-MSM could obtain unbiased causal effect estimation when the adjusted confounders satisfied G-admissibility; the optimal strategy was to adjust for the parent nodes of the outcome, which gave the highest precision. CONCLUSIONS: All adjustment strategies through logistic regression were biased for causal effect estimation, while IPW-based-MSM could always obtain unbiased estimation when the adjusted set satisfied G-admissibility. Thus, IPW-based-MSM is recommended for confounder adjustment.
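
The idea behind inverse probability weighting can be shown with a minimal sketch for a single binary exposure and one discrete confounder. This is not the authors' simulation code or a full marginal structural model fit: the propensity is estimated nonparametrically from stratum frequencies, and the function names and toy counts are illustrative assumptions.

```python
from collections import Counter


def crude_risk_difference(rows):
    """Unadjusted risk difference P(Y=1 | A=1) - P(Y=1 | A=0)."""
    events, totals = Counter(), Counter()
    for _, a, y in rows:
        events[a] += y
        totals[a] += 1
    return events[1] / totals[1] - events[0] / totals[0]


def ipw_risk_difference(rows):
    """IPW estimate of the marginal risk difference.

    rows: list of (c, a, y) with confounder stratum c, binary exposure a
    and binary outcome y. Each subject is weighted by 1 / P(A=a | C=c),
    with the probability estimated from the observed stratum frequencies.
    """
    n_c = Counter(c for c, _, _ in rows)
    n_ca = Counter((c, a) for c, a, _ in rows)
    num = {0: 0.0, 1: 0.0}
    den = {0: 0.0, 1: 0.0}
    for c, a, y in rows:
        weight = n_c[c] / n_ca[(c, a)]  # equals 1 / P(A=a | C=c)
        num[a] += weight * y
        den[a] += weight
    return num[1] / den[1] - num[0] / den[0]


def toy_rows():
    """Hypothetical data: (confounder, exposure, group size, events)."""
    cells = [(0, 1, 20, 10), (0, 0, 80, 24), (1, 1, 80, 64), (1, 0, 20, 12)]
    rows = []
    for c, a, n, events in cells:
        rows += [(c, a, 1)] * events + [(c, a, 0)] * (n - events)
    return rows
```

In the toy data the exposure raises risk by 0.20 within each confounder stratum. The crude contrast is inflated because the confounder is associated with both exposure and outcome, while the weighted contrast recovers the stratum-standardized difference.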


Subject(s)
Algorithms , Confounding Factors, Epidemiologic , Logistic Models , Models, Theoretical , Bias , Computer Simulation , Humans
19.
BMC Bioinformatics ; 17: 86, 2016 Feb 12.
Article in English | MEDLINE | ID: mdl-26867929

ABSTRACT

BACKGROUND: Complex disease is largely determined by a number of biomolecules interwoven into networks, rather than a single biomolecule. A key but inadequately addressed issue is how to test possible differences of the networks between two groups. Group-level comparison of network properties may shed light on underlying disease mechanisms and benefit the design of drug targets for complex diseases. We therefore proposed a powerful score-based statistic to detect group difference in weighted networks, which simultaneously captures the vertex changes and edge changes. RESULTS: Simulation studies indicated that the proposed network difference measure (NetDifM) was stable and outperformed other existing methods under various sample sizes and network topologies. An application to real GWAS data on leprosy successfully identified the specific gene interaction network contributing to leprosy. For additional gene expression data of ovarian cancer, two candidate subnetworks, PI3K-AKT and Notch signaling pathways, were considered and identified respectively. CONCLUSIONS: The proposed method, accounting for the vertex changes and edge changes simultaneously, is valid and powerful for capturing the group difference of biological networks.


Subject(s)
Gene Regulatory Networks , Leprosy/genetics , Models, Statistical , Ovarian Neoplasms/genetics , Signal Transduction , Epistasis, Genetic , Female , Humans
20.
BMC Genet ; 17: 51, 2016 Mar 09.
Article in English | MEDLINE | ID: mdl-26957081

ABSTRACT

BACKGROUND: We propose a novel Markov Blanket-based repeated-fishing strategy (MBRFS) in an attempt to increase the power of the existing Markov Blanket method (DASSO-MB) and maintain its advantages in omic data analysis. RESULTS: Both simulation and real data analysis were conducted to assess its performance in comparison with other methods, including the χ² test with Bonferroni and B-H adjustment, least absolute shrinkage and selection operator (LASSO) and DASSO-MB. A series of simulation studies showed that the true discovery rate (TDR) of the proposed MBRFS was always close to zero under the null hypothesis (odds ratio = 1 for each SNP) with excellent stability in all three scenarios: independent phenotype-related SNPs without linkage disequilibrium (LD) around them, correlated phenotype-related SNPs without LD around them, and phenotype-related SNPs with strong LD around them. As expected, under different odds ratios and minor allele frequencies (MAFs), MBRFS always had the best performance in capturing the true phenotype-related biomarkers, with a higher Matthews correlation coefficient (MCC) for all three scenarios above. More importantly, because the proposed MBRFS uses the repeated-fishing strategy, it still captured more phenotype-related SNPs with minor effects when phenotype-related SNPs were non-significant under the χ² test after Bonferroni multiple correction. Analyses of various real omics data, including GWAS data, DNA methylation data, gene expression data and metabolite data, indicated that the proposed MBRFS always detected relatively reasonable biomarkers. CONCLUSIONS: Our proposed MBRFS can accurately capture the true phenotype-related biomarkers with a reduced false negative rate when the phenotype-related biomarkers are independent or correlated, as well as when phenotype-related biomarkers are associated with non-phenotype-related ones.
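
The Matthews correlation coefficient used above as a performance measure can be computed directly from confusion-matrix counts. The sketch below applies the standard formula; the function name and the convention of returning 0.0 when the denominator is zero are assumptions for illustration and are not taken from the study.

```python
import math


def matthews_corrcoef(tp, fp, fn, tn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Here a "positive" would be a marker selected by the method, and a
    "true positive" a selected marker that is truly phenotype-related.
    Returns 0.0 when any marginal total is zero (assumed convention).
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator
```

MCC ranges from -1 to 1 and, unlike raw accuracy, stays informative when truly associated markers are a small fraction of all markers tested, which is the usual situation in SNP selection.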


Subject(s)
Genetic Markers , Genomics/methods , Markov Chains , Phenotype , Asian People/genetics , Breast Neoplasms/diagnosis , Breast Neoplasms/genetics , Case-Control Studies , Computer Simulation , DNA Methylation , Databases, Genetic , Gene Frequency , Genome-Wide Association Study , Humans , Leprosy/diagnosis , Leprosy/genetics , Linkage Disequilibrium , Models, Theoretical , Polymorphism, Single Nucleotide , Schizophrenia/diagnosis , Schizophrenia/genetics