Results 1 - 20 of 36
1.
Stat Med ; 2024 Jun 24.
Article in English | MEDLINE | ID: mdl-38922944

ABSTRACT

Brain functional connectivity can typically be represented as a brain functional network, where nodes represent regions of interest (ROIs) and edges symbolize their connections. Studying group differences in brain functional connectivity can help identify brain regions and recover the brain functional network linked to neurodegenerative diseases. This process, known as differential network analysis, focuses on the differences between estimated precision matrices for two groups. Current methods struggle with individual heterogeneity in measuring brain connectivity, false discovery rate (FDR) control, and accounting for confounding factors, resulting in biased estimates and diminished power. To address these issues, we present a two-stage FDR-controlled feature selection method for differential network analysis using functional magnetic resonance imaging (fMRI) data. First, we create individual brain connectivity measures using a high-dimensional precision matrix estimation technique. Next, we devise a penalized logistic regression model that employs individual brain connectivity data and integrates a new knockoff filter for FDR control when detecting significant differential edges. Through extensive simulations, we showcase the superiority of our approach compared to other methods. Additionally, we apply our technique to fMRI data to identify differential edges between Alzheimer's disease and control groups. Our results are consistent with prior experimental studies, emphasizing the practical applicability of our method.
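
The abstract above does not describe its new knockoff filter in detail. As a rough illustration of how knockoff-based FDR control selects features in general, the following minimal sketch implements the standard knockoff/knockoff+ threshold rule; it is not the authors' method. The function names and the example statistics `W` are hypothetical, and `W[j]` is assumed to be a feature-importance statistic in which large positive values favor the real edge over its knockoff copy.

```python
import numpy as np

def knockoff_threshold(W, q=0.1, offset=1):
    """Smallest threshold t whose estimated false discovery proportion is <= q.

    offset=1 gives the knockoff+ rule, offset=0 the plain knockoff rule.
    Returns np.inf when no threshold satisfies the bound.
    """
    W = np.asarray(W, dtype=float)
    # Candidate thresholds are the nonzero magnitudes of the statistics.
    candidates = np.sort(np.abs(W[W != 0]))
    for t in candidates:
        # Negative statistics estimate how many false positives exceed t.
        fdp_hat = (offset + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return t
    return np.inf

def knockoff_select(W, q=0.1, offset=1):
    """Indices of the features (edges) selected at target FDR level q."""
    W = np.asarray(W, dtype=float)
    return np.flatnonzero(W >= knockoff_threshold(W, q, offset))

# Made-up statistics for ten candidate edges (illustration only).
W = [5, 4, 3.5, 3, 2.5, 2, 1.5, -1, 0.5, -0.2]
print(knockoff_select(W, q=0.2))
```

With a stricter target such as `q=0.1`, no threshold qualifies for these made-up statistics and the selection is empty, which reflects the conservative nature of the knockoff+ offset when few features are available.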

2.
NAR Genom Bioinform ; 6(2): lqae071, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38881578

ABSTRACT

Mass spectrometry is a powerful and widely used tool for generating proteomics, lipidomics and metabolomics profiles, which is pivotal for elucidating biological processes and identifying biomarkers. However, missing values in mass spectrometry-based omics data may pose a critical challenge for the comprehensive identification of biomarkers and elucidation of the biological processes underlying human complex disorders. To alleviate this issue, various imputation methods for mass spectrometry-based omics data have been developed. However, a comprehensive comparison of these imputation methods is still lacking, and researchers are frequently confronted with a multitude of options without a clear rationale for method selection. To address this pressing need, we developed omicsMIC (mass spectrometry-based omics with Missing values Imputation methods Comparison platform), an interactive platform that provides researchers with a versatile framework to evaluate the performance of 28 diverse imputation methods. omicsMIC offers a nuanced perspective, acknowledging the inherent heterogeneity in biological data and the unique attributes of each dataset. Our platform empowers researchers to make data-driven decisions in imputation method selection based on real-time visualizations of the outcomes associated with different imputation strategies. The comprehensive benchmarking and versatility of omicsMIC make it a valuable tool for the scientific community engaged in mass spectrometry-based omics research. omicsMIC is freely available at https://github.com/WQLin8/omicsMIC.
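
To make the idea of comparing imputation methods concrete, here is a minimal sketch of a mask-and-score evaluation: values are hidden at random from a complete matrix, imputed, and scored against the hidden truth. This is not omicsMIC's code or interface. The three simple imputers, the NRMSE definition (root mean squared error divided by the standard deviation of the hidden values), and all function names are assumptions chosen for illustration.

```python
import numpy as np

def nrmse(truth, imputed, mask):
    """Root mean squared error on the masked cells, scaled by their standard deviation."""
    err = truth[mask] - imputed[mask]
    return float(np.sqrt(np.mean(err ** 2)) / np.std(truth[mask]))

def impute_column_stat(X, stat):
    """Fill each column's missing cells with a statistic of its observed cells."""
    out = X.copy()
    for j in range(X.shape[1]):
        missing = np.isnan(X[:, j])
        out[missing, j] = stat(X[~missing, j])
    return out

def compare_imputers(X_full, missing_rate=0.2, seed=0):
    """Hide cells completely at random, impute them, and score each imputer."""
    rng = np.random.default_rng(seed)
    mask = rng.random(X_full.shape) < missing_rate
    X_miss = X_full.copy()
    X_miss[mask] = np.nan
    imputers = {
        "mean": lambda X: impute_column_stat(X, np.mean),
        "median": lambda X: impute_column_stat(X, np.median),
        "half-min": lambda X: impute_column_stat(X, lambda v: v.min() / 2.0),
    }
    return {name: nrmse(X_full, f(X_miss), mask) for name, f in imputers.items()}

# Simulated, strictly positive intensities (illustration only).
X = np.random.default_rng(1).lognormal(size=(60, 5))
print(compare_imputers(X))
```

Missingness in mass spectrometry data is often not completely at random (for example, values below a detection limit), so a realistic comparison would also simulate other missingness mechanisms.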

3.
Nat Genet ; 56(2): 348-356, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38279040

ABSTRACT

Transcriptome-wide association studies (TWASs) aim to integrate genome-wide association studies with expression-mapping studies to identify genes with genetically predicted expression (GReX) associated with a complex trait. In the present report, we develop a method, GIFT (gene-based integrative fine-mapping through conditional TWAS), that performs conditional TWAS analysis by explicitly controlling for GReX of all other genes residing in a local region to fine-map putatively causal genes. GIFT is frequentist in nature, explicitly models both expression correlation and cis-single nucleotide polymorphism linkage disequilibrium across multiple genes and uses a likelihood framework to account for expression prediction uncertainty. As a result, GIFT produces calibrated P values and is effective for fine-mapping. We apply GIFT to analyze six traits in the UK Biobank, where GIFT narrows down the set size of putatively causal genes by 32.16-91.32% compared with existing TWAS fine-mapping approaches. The genes identified by GIFT highlight the importance of vessel regulation in determining blood pressures and lipid metabolism for regulating lipid levels.


Subject(s)
Genome-Wide Association Study; Transcriptome; Humans; Genome-Wide Association Study/methods; Quantitative Trait Loci/genetics; Phenotype; Linkage Disequilibrium; Polymorphism, Single Nucleotide/genetics; Genetic Predisposition to Disease/genetics

4.
bioRxiv ; 2023 Sep 15.
Article in English | MEDLINE | ID: mdl-37745599

ABSTRACT

Mass spectrometry is a powerful and widely used tool for generating proteomics, lipidomics, and metabolomics profiles, which is pivotal for elucidating biological processes and identifying biomarkers. However, missing values in spectrometry-based omics data may pose a critical challenge for the comprehensive identification of biomarkers and elucidation of the biological processes underlying human complex disorders. To alleviate this issue, various imputation methods for mass spectrometry-based omics data have been developed. However, a comprehensive and systematic comparison of these imputation methods is still lacking, and researchers are frequently confronted with a multitude of options without a clear rationale for method selection. To address this pressing need, we developed omicsMIC (mass spectrometry-based omics with Missing values Imputation methods Comparison platform), an interactive platform that provides researchers with a versatile framework to simulate and evaluate the performance of 28 diverse imputation methods. omicsMIC offers a nuanced perspective, acknowledging the inherent heterogeneity in biological data and the unique attributes of each dataset. Our platform empowers researchers to make data-driven decisions in imputation method selection based on real-time visualizations of the outcomes associated with different imputation strategies. The comprehensive benchmarking and versatility of omicsMIC make it a valuable tool for the scientific community engaged in mass spectrometry-based omics research. OmicsMIC is freely available at https://github.com/WQLin8/omicsMIC.

5.
China CDC Wkly ; 5(9): 206-212, 2023 Mar 03.
Article in English | MEDLINE | ID: mdl-37007865

ABSTRACT

Introduction: Biases in cancer incidence characteristics have led to significant imbalances in databases constructed by prospective cohort studies. Since they use imbalanced databases, many traditional algorithms for training cancer risk prediction models perform poorly. Methods: To improve prediction performance, we introduced a Bagging ensemble framework to an absolute risk model based on ensemble penalized Cox regression (EPCR). We then tested whether the EPCR model outperformed other traditional regression models by varying the censoring rate of the simulated data. Results: Six different simulation studies were performed with 100 replicates. To assess model performance, we calculated mean false discovery rate, false omission rate, true positive rate, true negative rate, and the areas under the receiver operating characteristic curve (AUC) values. We found that the EPCR procedure could reduce the false discovery rate (FDR) for important variables at the same true positive rate (TPR), thereby achieving more accurate variable screening. In addition, we used the EPCR procedure to build a breast cancer risk prediction model based on the Breast Cancer Cohort Study in Chinese Women database. AUCs for 3- and 5-year predictions were 0.691 and 0.642, representing improvements of 0.189 and 0.117 over the classical Gail model, respectively. Discussion: We conclude that the EPCR procedure can overcome challenges posed by imbalanced data and improve the performance of cancer risk assessment tools.
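
The variable-screening metrics named in the abstract above can be computed from the set of truly important variables and the set a model selects. The sketch below is a minimal illustration with a hypothetical function name and made-up inputs; undefined ratios (zero denominators) are reported as 0.0 by assumption.

```python
def selection_metrics(true_support, selected, n_features):
    """FDR, FOR, TPR and TNR of a variable-selection result."""
    true_support, selected = set(true_support), set(selected)
    tp = len(true_support & selected)   # important and selected
    fp = len(selected - true_support)   # unimportant but selected
    fn = len(true_support - selected)   # important but missed
    tn = n_features - tp - fp - fn      # unimportant and not selected

    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "FDR": ratio(fp, tp + fp),  # false discovery rate
        "FOR": ratio(fn, fn + tn),  # false omission rate
        "TPR": ratio(tp, tp + fn),  # true positive rate
        "TNR": ratio(tn, tn + fp),  # true negative rate
    }

# Ten candidate variables, four truly important, four selected (made-up example).
print(selection_metrics({0, 1, 2, 3}, {0, 1, 2, 7}, n_features=10))
```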

6.
Genes (Basel) ; 14(3), 2023 02 25.
Article in English | MEDLINE | ID: mdl-36980857

ABSTRACT

Transcriptome-wide association studies (TWASs) aim to detect associations between genetically predicted gene expression and complex diseases or traits through integrating genome-wide association studies (GWASs) and expression quantitative trait loci (eQTL) mapping studies. Most current TWAS methods analyze one gene at a time, ignoring the correlations between multiple genes. Few of the existing TWAS methods focus on survival outcomes. Here, we propose a novel method, namely a COx proportional hazards model for NEtwork regression in TWAS (CoNet), that is applicable for identifying the association between one given network and the survival time. CoNet considers the general relationship among the predicted gene expression as edges of the network and quantifies it through pointwise mutual information (PMI) within a two-stage TWAS framework. Extensive simulation studies illustrate that CoNet not only achieves calibrated type I error control in testing both the node effect and edge effect, but also gains more power compared with currently available methods. In addition, it demonstrates superior performance in a real data application utilizing the breast cancer survival data of UK Biobank. CoNet effectively accounts for network structure and can simultaneously identify the nodes and edges potentially related to survival outcomes in TWAS.


Subject(s)
Breast Neoplasms; Transcriptome; Humans; Female; Transcriptome/genetics; Breast Neoplasms/genetics; Genome-Wide Association Study/methods; Computer Simulation; Survival Analysis

7.
J Clin Endocrinol Metab ; 108(4): 941-949, 2023 03 10.
Article in English | MEDLINE | ID: mdl-36263677

ABSTRACT

INTRODUCTION: Systemic lupus erythematosus (SLE) and hypothyroidism often coexist in observational studies; however, the causal relationship between them remains controversial. METHODS: Complementary genetic approaches, including genetic correlation, Mendelian randomization (MR), and colocalization analysis, were conducted to assess the potential causal association between SLE and primary hypothyroidism using summary statistics from large-scale genome-wide association studies. The association between SLE and thyroid-stimulating hormone (TSH) was further analyzed to help interpret the findings. In addition, findings were verified using a validation data set, as well as through different MR methods with different model assumptions. RESULTS: The linkage disequilibrium score regression revealed a shared genetic structure between SLE and primary hypothyroidism, with the significant genetic correlation estimated to be 0.2488 (P = 6.00 × 10⁻⁴). MR analysis with the inverse variance weighted method demonstrated a bidirectional causal relationship between SLE and primary hypothyroidism. The odds ratio (OR) of SLE on primary hypothyroidism was 1.037 (95% CI, 1.013-1.061; P = 2.00 × 10⁻³) and that of primary hypothyroidism on SLE was 1.359 (95% CI, 1.217-1.520; P < 0.001). The OR of SLE on TSH was 1.007 (95% CI, 1.001-1.013; P = 0.032). However, TSH was not causally associated with SLE (P = 0.152). Similar results were found using different MR methods. In addition, colocalization analysis suggested that shared causal variants existed between SLE and primary hypothyroidism. The results of the validation analysis indicated a bidirectional causal relationship between SLE and primary hypothyroidism, as well as shared loci. CONCLUSION: In summary, a bidirectional causal relationship between SLE and primary hypothyroidism was observed with complementary genetic approaches.
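
The inverse variance weighted (IVW) method mentioned above combines per-variant ratio estimates into a single causal estimate. The following is a minimal sketch of the fixed-effect IVW estimator from summary statistics; the function name and the input numbers are made up for illustration and are not taken from the study.

```python
import numpy as np

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate and its standard error.

    Each variant contributes the ratio beta_outcome / beta_exposure,
    weighted by beta_exposure**2 / se_outcome**2.
    """
    bx = np.asarray(beta_exposure, dtype=float)
    by = np.asarray(beta_outcome, dtype=float)
    se = np.asarray(se_outcome, dtype=float)
    weights = bx ** 2 / se ** 2
    beta = np.sum(weights * (by / bx)) / np.sum(weights)
    se_beta = np.sqrt(1.0 / np.sum(weights))
    return float(beta), float(se_beta)

# Three hypothetical variants (illustration only).
beta, se = ivw_estimate([0.1, 0.2, 0.3], [0.05, 0.10, 0.15], [0.01, 0.01, 0.01])
odds_ratio = np.exp(beta)
ci_low, ci_high = np.exp(beta - 1.96 * se), np.exp(beta + 1.96 * se)
print(odds_ratio, ci_low, ci_high)
```

For a binary outcome, the estimate is on the log-odds scale, so exponentiating it and its confidence limits gives an odds ratio with a 95% interval in the same form as the results reported above.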


Subject(s)
Hypothyroidism; Lupus Erythematosus, Systemic; Humans; Genome-Wide Association Study; Polymorphism, Single Nucleotide; Hypothyroidism/complications; Hypothyroidism/epidemiology; Hypothyroidism/genetics; Thyrotropin/genetics; Lupus Erythematosus, Systemic/complications; Lupus Erythematosus, Systemic/epidemiology; Lupus Erythematosus, Systemic/genetics; Mendelian Randomization Analysis

8.
Genetics ; 222(4), 2022 11 30.
Article in English | MEDLINE | ID: mdl-36227056

ABSTRACT

Transcriptome-wide association studies aim to integrate genome-wide association studies and expression quantitative trait loci mapping studies for exploring the gene regulatory mechanisms underlying diseases. Existing transcriptome-wide association study methods primarily focus on 1 gene at a time. However, complex diseases seldom result from the abnormality of a single gene but rather from a biological network involving multiple genes. In addition, binary or ordinal categorical phenotypes are commonly encountered in biomedicine. We develop a proportional odds logistic model for network regression in transcriptome-wide association study, Proportional Odds LOgistic model for NEtwork regression in Transcriptome-wide association study, to detect the association between a network and a binary or ordinal categorical phenotype. Proportional Odds LOgistic model for NEtwork regression in Transcriptome-wide association study relies on a 2-stage transcriptome-wide association study framework. It first adopts the distribution-robust nonparametric Dirichlet process regression model in expression quantitative trait loci study to obtain the SNP effect estimate on each gene within the network. Then, Proportional Odds LOgistic model for NEtwork regression in Transcriptome-wide association study uses pointwise mutual information to represent the general relationship among the network nodes of predicted gene expression in genome-wide association study, followed by the association analysis with all nodes and edges involved in the proportional odds logistic model. A key feature of Proportional Odds LOgistic model for NEtwork regression in Transcriptome-wide association study is its ability to simultaneously identify the disease-related network nodes or edges. With extensive realistic simulations including those under various between-node correlation patterns, we show that Proportional Odds LOgistic model for NEtwork regression in Transcriptome-wide association study can provide calibrated type I error control and yield higher power than other existing methods. We finally apply Proportional Odds LOgistic model for NEtwork regression in Transcriptome-wide association study to analyze bipolar and major depression status and blood pressure from UK Biobank to illustrate its benefits in real data analysis.


Subject(s)
Genome-Wide Association Study; Transcriptome; Humans; Genome-Wide Association Study/methods; Quantitative Trait Loci; Phenotype; Regression Analysis; Polymorphism, Single Nucleotide; Genetic Predisposition to Disease

9.
BMC Cancer ; 22(1): 1070, 2022 Oct 17.
Article in English | MEDLINE | ID: mdl-36253742

ABSTRACT

BACKGROUND: Breast cancer (BC) is one of the most prevalent cancers worldwide, but its etiology remains unclear. Obesity is recognized as a risk factor for BC, and many obesity-related genes may be involved in its occurrence and development. Research assessing the complex genetic mechanisms of BC should not only consider the effect of a single gene on the disease, but also focus on the interaction between genes. This study sought to construct a gene interaction network to identify potential pathogenic BC genes. METHODS: The study included 953 BC patients and 963 control individuals. Chi-square analysis was used to assess the correlation between demographic characteristics and BC. The joint density-based non-parametric differential interaction network analysis and classification (JDINAC) was used to build a BC gene interaction network using single nucleotide polymorphisms (SNPs). The odds ratio (OR) and 95% confidence interval (95% CI) of hub gene SNPs were evaluated using a logistic regression model. To assess reliability, the hub genes were quantified with the edgeR program using BC RNA-seq data from The Cancer Genome Atlas (TCGA), and identical edges were verified by logistic regression using UK Biobank datasets. GO and KEGG enrichment analyses were used to explore the biological functions of interactive genes. RESULTS: Body mass index (BMI) and menopause are important risk factors for BC. After adjusting for potential confounding factors, the BC gene interaction network was identified using JDINAC. LEP, LEPR, XRCC6, and RETN were identified as hub genes, and both hub genes and edges were verified. LEPR genetic polymorphisms (rs1137101 and rs4655555) were also significantly associated with BC. Enrichment analysis showed that the identified genes were mainly involved in energy regulation and fat-related signaling pathways. CONCLUSION: We explored the interaction network of genes derived from SNP data in BC progression. Gene interaction networks provide new insight into the underlying mechanisms of BC.


Subject(s)
Breast Neoplasms; Breast Neoplasms/pathology; Female; Gene Expression Regulation, Neoplastic; Gene Regulatory Networks; Humans; Machine Learning; Obesity/genetics; Polymorphism, Single Nucleotide; Reproducibility of Results

10.
BMC Genomics ; 23(1): 562, 2022 Aug 06.
Article in English | MEDLINE | ID: mdl-35933330

ABSTRACT

BACKGROUND: Transcriptome-wide association studies (TWASs) have shown great promise in interpreting the findings from genome-wide association studies (GWASs) and exploring the disease mechanisms, by integrating GWAS and eQTL mapping studies. Almost all TWAS methods focus on one gene at a time; the only two published multiple-gene methods nevertheless fail to account for the inter-dependence and the network structure among multiple genes. This may lead to power loss in TWAS analysis, as complex diseases are often attributable to multiple genes that interact with each other as a biological network. We therefore developed a Network Regression method in a two-stage TWAS framework (NeRiT) to detect whether a given network is associated with the traits of interest. NeRiT adopts the flexible Bayesian Dirichlet process regression to obtain the gene expression prediction weights in the first stage, uses pointwise mutual information to represent the general between-node correlation in the second stage and can effectively take the network structure among different gene nodes into account. RESULTS: Comprehensive and realistic simulations indicated that NeRiT had calibrated type I error control for testing both the node effect and edge effect, and yielded higher power than existing methods, especially in testing the edge effect. The results were consistent regardless of the GWAS sample size, the gene expression prediction model in the first step of TWAS, the network structure, and the correlation pattern among different gene nodes. Real data applications through analyzing systolic blood pressure and diastolic blood pressure from UK Biobank showed that NeRiT can simultaneously identify the trait-related nodes as well as the trait-related edges. CONCLUSIONS: NeRiT is a powerful and efficient network regression method in TWAS.


Subject(s)
Genome-Wide Association Study; Transcriptome; Bayes Theorem; Genetic Predisposition to Disease; Genome-Wide Association Study/methods; Humans; Polymorphism, Single Nucleotide; Quantitative Trait Loci; Regression Analysis

12.
Front Oncol ; 12: 899900, 2022.
Article in English | MEDLINE | ID: mdl-35761863

ABSTRACT

Background: With the rapid development and wide application of high-throughput sequencing technology, biomedical research has entered the era of large-scale omics data. We aim to identify genes associated with breast cancer prognosis by integrating multi-omics data. Method: Gene-gene interactions were taken into account, and we applied two differential network methods, JDINAC and LGCDG, to identify differential genes. The patients were divided into case and control groups according to their survival time. The TCGA and METABRIC databases were used as the training and validation sets, respectively. Result: In the TCGA dataset, C11orf1, OLA1, RPL31, SPDL1 and IL33 were identified to be associated with prognosis of breast cancer. In the METABRIC database, ZNF273, ZBTB37, TRIM52, TSGA10, ZNF727, TRAF2, TSPAN17, USP28 and ZNF519 were identified as hub genes. In addition, RPL31, TMEM163 and ZNF273 were screened out in both datasets. GO enrichment analysis shows that most of these hub genes were involved in zinc ion binding. Conclusion: In this study, a total of 15 hub genes associated with long-term survival of breast cancer were identified, which can promote understanding of the molecular mechanism of breast cancer and provide new insight into clinical research and treatment.

13.
Biostatistics ; 23(3): 967-989, 2022 07 18.
Article in English | MEDLINE | ID: mdl-33769450

ABSTRACT

Growing evidence has shown that the brain connectivity network experiences alterations for complex diseases such as Alzheimer's disease (AD). Network comparison, also known as differential network analysis, is thus particularly powerful to reveal the disease pathologies and identify clinical biomarkers for medical diagnoses (classification). Data from neurophysiological measurements are multidimensional and in matrix form. A naive vectorization approach is not sufficient, as it ignores the structural information within the matrix. In this article, we adopt the Kronecker product covariance matrices framework to capture both spatial and temporal correlations of the matrix-variate data while the temporal covariance matrix is treated as a nuisance parameter. By recognizing that the strengths of network connections may vary across subjects, we develop an ensemble-learning procedure, which identifies the differential interaction patterns of brain regions between the case group and the control group and conducts medical diagnosis (classification) of the disease simultaneously. Simulation studies are conducted to assess the performance of the proposed method. We apply the proposed procedure to the functional connectivity analysis of a functional magnetic resonance imaging study on AD. The hub nodes and differential interaction patterns identified are consistent with existing experimental studies, and satisfactory out-of-sample classification performance is achieved for medical diagnosis of AD.
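
The Kronecker product covariance framework referred to above can be illustrated with a small numerical sketch. Assuming a matrix-variate normal model with regions as rows and time points as columns, the covariance of the column-stacked data is the Kronecker product of the temporal and spatial covariance matrices. The function names and the toy matrices below are hypothetical, and this is not the authors' estimation procedure.

```python
import numpy as np

def kronecker_covariance(sigma_spatial, sigma_temporal):
    """Covariance of vec(X) (columns stacked) for a regions-by-time matrix X."""
    return np.kron(sigma_temporal, sigma_spatial)

def sample_matrix_normal(sigma_spatial, sigma_temporal, rng):
    """Draw X = A Z B^T, where A A^T is the spatial and B B^T the temporal covariance."""
    A = np.linalg.cholesky(sigma_spatial)
    B = np.linalg.cholesky(sigma_temporal)
    Z = rng.standard_normal((sigma_spatial.shape[0], sigma_temporal.shape[0]))
    return A @ Z @ B.T

# Toy example: 2 brain regions observed at 3 time points.
sigma_s = np.array([[1.0, 0.5], [0.5, 1.0]])
sigma_t = np.array([[1.0, 0.3, 0.0], [0.3, 1.0, 0.3], [0.0, 0.3, 1.0]])
X = sample_matrix_normal(sigma_s, sigma_t, np.random.default_rng(0))
print(X.shape, kronecker_covariance(sigma_s, sigma_t).shape)
```

The separable structure needs only a 2 x 2 and a 3 x 3 matrix here instead of a full 6 x 6 covariance, which is the saving that makes the framework attractive when the numbers of regions and time points are large.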


Subject(s)
Alzheimer Disease; Brain; Alzheimer Disease/diagnostic imaging; Brain/diagnostic imaging; Humans; Magnetic Resonance Imaging/methods

14.
Biometrics ; 77(4): 1409-1421, 2021 12.
Article in English | MEDLINE | ID: mdl-32829503

ABSTRACT

Brain functional connectivity reveals the synchronization of brain systems through correlations in neurophysiological measures of brain activities. Growing evidence now suggests that the brain connectivity network experiences alterations with the presence of numerous neurological disorders, thus differential brain network analysis may provide new insights into disease pathologies. The data from neurophysiological measurement are often multidimensional and in a matrix form, posing a challenge in brain connectivity analysis. Existing graphical model estimation methods either assume a vector normal distribution that in essence requires the columns of the matrix data to be independent or fail to address the estimation of differential networks across different populations. To tackle these issues, we propose an innovative matrix-variate differential network (MVDN) model. We exploit the D-trace loss function and a Lasso-type penalty to directly estimate the spatial differential partial correlation matrix and use an alternating direction method of multipliers algorithm for the optimization problem. Theoretical and simulation studies demonstrate that MVDN significantly outperforms other state-of-the-art methods in dynamic differential network analysis. We illustrate with a functional connectivity analysis of an attention deficit hyperactivity disorder dataset. The hub nodes and differential interaction patterns identified are consistent with existing experimental studies.


Subject(s)
Brain; Magnetic Resonance Imaging; Algorithms; Brain/diagnostic imaging; Brain Mapping/methods; Magnetic Resonance Imaging/methods; Normal Distribution

15.
Front Genet ; 11: 556259, 2020.
Article in English | MEDLINE | ID: mdl-33193633

ABSTRACT

Complex diseases are believed to be the consequence of intracellular network(s) involving a range of factors. An improved understanding of a disease-predisposing biological network could lead to better identification of genes and pathways that confer disease risk and therefore inform drug development. The group difference in biological networks, as is often characterized by graphs of nodes and edges, is attributable to effects of these nodes and edges. Here we introduced pointwise mutual information (PMI) as a measure of the connection between a pair of nodes with either a linear relationship or nonlinear dependence. We then proposed a PMI-based network regression (PMINR) model to differentiate patterns of network changes (in node or edge) linking a disease outcome. Through simulation studies with various sample sizes and inter-node correlation structures, we showed that PMINR can accurately identify these changes with higher power than current methods and be robust to the network topology. Finally, we illustrated, with publicly available data on lung cancer and gene methylation data on aging and Alzheimer's disease, an evaluation of the practical performance of PMINR. We concluded that PMI is able to capture the generic inter-node correlation pattern in biological networks, and PMINR is a powerful and efficient approach for biological network analysis.
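
Pointwise mutual information compares the joint probability of two outcomes with what independence would predict: PMI(x, y) = log[p(x, y) / (p(x) p(y))]. The sketch below computes it for discrete variables from a table of joint counts. This is a simplifying assumption for illustration: the abstract does not say how PMI is estimated for expression data, continuous measurements would need a density estimate instead, and the function name is hypothetical.

```python
import numpy as np

def pointwise_mutual_information(joint_counts):
    """PMI for every cell of a contingency table of joint counts.

    Cells with zero joint count get -inf; cells whose row or column total
    is zero get nan.
    """
    counts = np.asarray(joint_counts, dtype=float)
    p_xy = counts / counts.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)   # row marginals
    p_y = p_xy.sum(axis=0, keepdims=True)   # column marginals
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.log(p_xy / (p_x * p_y))

# Independent variables give PMI = 0 everywhere.
print(pointwise_mutual_information([[1, 1], [1, 1]]))
# Perfectly co-occurring states give log(2) on the diagonal.
print(pointwise_mutual_information([[4, 0], [0, 4]]))
```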

16.
Lipids Health Dis ; 19(1): 233, 2020 Nov 04.
Article in English | MEDLINE | ID: mdl-33148263

ABSTRACT

PURPOSE: Previous studies have shown that serum carcinoembryonic antigen (CEA) is independently associated with metabolic syndrome (MetS). However, these studies were mainly cross-sectional analyses, and cause was not clarified. In the present study, two bidirectional cohort studies were conducted to investigate the bidirectional associations between CEA and MetS using a Chinese male sample cohort. METHODS: The initial longitudinal cohort included 9629 Chinese males enrolled from January 2010 to December 2015. Two bidirectional cohorts were conducted in the study: subcohort A (from CEA to MetS, n = 6439) included participants without MetS at baseline to estimate the risk of developing incident MetS; subcohort B (from MetS to CEA, n = 8533) included participants without an elevated CEA level (Hyper-CEA) at baseline to examine the risk of developing incident Hyper-CEA. Hazard ratios (HRs) and 95% confidence intervals (CIs) were estimated using Cox proportional hazards models. RESULTS: In subcohort A, the incidence densities of MetS among participants with and without Hyper-CEA were 84.56 and 99.28 per 1000 person-years, respectively. No significant effects of Hyper-CEA on incident MetS were observed in subcohort A (HR, 0.89; 95% CI, 0.71 to 1.12; P = 0.326). In subcohort B, a higher incidence density of Hyper-CEA was found among participants with MetS (33.42 and 29.13 per 1000 person-years for those with and without MetS, respectively). For nonsmoking participants aged > 65 years, MetS increased the risk of incident Hyper-CEA (HR, 1.87; 95% CI, 1.09 to 3.20; P = 0.022). CONCLUSION: For the direction of CEA on incident MetS, no significant association was observed. For the direction of MetS on incident Hyper-CEA, MetS in nonsmoking elderly men could increase the risk of incident Hyper-CEA, while this association was not found in other stratified participants. The clinical implications of the association between CEA and MetS should be interpreted with caution.


Subject(s)
Carcinoembryonic Antigen/blood; Metabolic Syndrome/blood; Adult; Asian People; Cohort Studies; Humans; Incidence; Male; Metabolic Syndrome/epidemiology; Middle Aged; Smoking

17.
BMC Genet ; 21(1): 90, 2020 08 26.
Article in English | MEDLINE | ID: mdl-32847502

ABSTRACT

BACKGROUND: Genome-wide association studies (GWAS) have successfully identified genetic susceptibility variants for complex diseases. However, the underlying mechanism of such association remains largely unknown. Most disease-associated genetic variants have been shown to reside in noncoding regions, leading to the hypothesis that regulation of gene expression may be the primary biological mechanism. Current methods that characterize how gene expression mediates the effect of a genetic variant on disease often analyze one gene at a time and ignore the network structure. The impact of a genetic variant can propagate to other genes along the links in the network, then to the final disease. There could be multiple pathways from the genetic variant to the final disease, with each having the chain structure since the first node is one specific SNP (Single Nucleotide Polymorphism) variant and the end is disease outcome. One key but inadequately addressed question is how to measure the between-node connection strength and rank the effects of such chain-type pathways, which can provide statistical evidence to prioritize some pathways for potential drug development in a cost-effective manner. RESULTS: We first introduce the maximal correlation coefficient (MCC) to represent the between-node connection, and then integrate MCC with K shortest paths algorithm to rank and identify the potential pathways from genetic variant to disease. The pathway importance score (PIS) was further provided to quantify the importance of each pathway. We termed this method "MCC-SP". Various simulations are conducted to illustrate that MCC is a better measurement of the between-node connection strength than other quantities including Pearson correlation, Spearman correlation, distance correlation, mutual information, and maximal information coefficient. Finally, we applied MCC-SP to analyze one real dataset from the Religious Orders Study and the Memory and Aging Project, and successfully detected 2 typical pathways from APOE genotype to Alzheimer's disease (AD) through gene expression enriched in Alzheimer's disease pathway. CONCLUSIONS: MCC-SP has powerful and robust performance in identifying the pathway(s) from the genetic variant to the disease. The source code of MCC-SP is freely available at GitHub (https://github.com/zhuyuchen95/ADnet).


Subject(s)
Genetic Predisposition to Disease; Genome-Wide Association Study; Polymorphism, Single Nucleotide; Algorithms; Alzheimer Disease/genetics; Computer Simulation; Genotype; Humans; Models, Genetic; Software

18.
Environ Health ; 19(1): 12, 2020 01 30.
Article in English | MEDLINE | ID: mdl-32000783

ABSTRACT

BACKGROUND: Exposure to air pollution is associated with chronic obstructive pulmonary disease (COPD). However, findings on the effects of air pollution on lung function and systemic inflammation in Chinese COPD patients are inconsistent and scarce. This study aims to evaluate the effects of ambient air pollution on lung function parameters and serum cytokine levels in a COPD cohort in Beijing, China. METHODS: We enrolled COPD participants on a rolling basis from December 2015 to September 2017 in Beijing, China. Follow-ups were performed every 3 months for each participant. Serum levels of 20 cytokines were detected every 6 months. Hourly ambient pollutant levels over the same periods were obtained from 35 monitoring stations across Beijing. Geocoded residential addresses of the participants were used to estimate daily mean pollution exposures. A linear mixed-effect model was applied to explore the effects of air pollutants on health in the first year of follow-up. RESULTS: A total of 84 COPD patients were enrolled at baseline. Of those, 75 COPD patients completed the first year of follow-up. We found adverse cumulative effects of particulate matter less than 2.5 µm in aerodynamic diameter (PM2.5), nitrogen dioxide (NO2), sulfur dioxide (SO2) and carbon monoxide (CO) on the forced vital capacity % predicted (FVC % pred) in patients with COPD. Further analyses illustrated that among COPD patients, air pollution exposure was associated with reduced levels of serum eotaxin, interleukin 4 (IL-4) and IL-13 and was correlated with increased serum IL-2, IL-12, IL-17A, interferon γ (IFNγ), monocyte chemoattractant protein 1 (MCP-1) and soluble CD40 ligand (sCD40L). CONCLUSION: Acute exposures to PM2.5, NO2, SO2 and CO were associated with a reduction in FVC % pred in COPD patients. Furthermore, short-term exposure to air pollutants increased systemic inflammation in COPD patients; this may be attributed to increased Th1 and Th17 cytokines and decreased Th2 cytokines.


Subject(s)
Air Pollution/adverse effects , Cytokines/blood , Inflammation/physiopathology , Lung/physiopathology , Pulmonary Disease, Chronic Obstructive/complications , Adult , Aged , Beijing , Female , Humans , Inflammation/chemically induced , Male , Middle Aged , Patients , Respiratory Function Tests , Serum/chemistry , Time Factors , Young Adult
19.
Stat Med ; 39(30): 4869-4884, 2020 Dec 30.
Article in English | MEDLINE | ID: mdl-33617001

ABSTRACT

Multiomics, or integrative omics, data have become increasingly common in biomedical studies and hold promise for a better understanding of human health and disease. In this article, we propose an integrative copula discrimination analysis classifier for two-class classification, which relaxes the common Gaussian assumption and gains power by borrowing information from multiple omics data types in discriminant analysis. Numerical studies are conducted to assess the finite-sample performance of the new classifier. We apply our model to data from the Religious Orders Study and Memory and Aging Project (ROSMAP), integrating gene expression and DNA methylation data for better prediction.


Subject(s)
DNA Methylation , Discriminant Analysis , Humans , Normal Distribution
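
The abstract of record 19 does not say how the Gaussian assumption is relaxed. As a rough illustration only, one device commonly used with Gaussian copula models is to replace each feature by rank-based normal scores before a Gaussian-style analysis. The sketch below shows that transform; the function name `normal_scores` and the `rank / (n + 1)` scaling are assumptions made for this example and are not taken from the article.

```python
from statistics import NormalDist


def normal_scores(values):
    """Map a sample to approximately standard-normal scores via its ranks.

    Illustrative assumption: ranks are scaled by (n + 1) so that the
    empirical CDF stays strictly inside (0, 1) before the normal quantile
    function is applied. Tied values share their average rank.
    """
    n = len(values)
    order = sorted(range(n), key=lambda k: values[k])
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        # Find the end of the current run of tied values.
        end = pos
        while end + 1 < n and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1  # 1-based average rank of the run
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank
        pos = end + 1
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf(r / (n + 1)) for r in ranks]


# Invented, heavily skewed toy values: only their ordering matters.
scores = normal_scores([10, 1000, 100])
```

Because the transform depends only on ranks, any strictly increasing rescaling of a feature (for example a log transform) yields the same scores, which is why this kind of step does not require the raw data to be Gaussian.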
20.
Clin Epidemiol ; 11: 1047-1055, 2019.
Article in English | MEDLINE | ID: mdl-31849535

ABSTRACT

OBJECTIVE: Dyslipidemia has been recognized as a major risk factor for several diseases, and early prevention and management of dyslipidemia are effective in the primary prevention of cardiovascular events. The present study aims to develop risk models for predicting dyslipidemia using Random Survival Forest (RSF), which takes the complex relationships between variables into account. METHODS: We used data from 6328 participants aged between 19 and 90 years who were free of dyslipidemia at baseline, with a maximum follow-up of 5 years. RSF was applied to develop gender-specific risk models for predicting dyslipidemia using anthropometric and laboratory test variables from the cohort. Cox regression was also fitted for comparison with the RSF model, and Harrell's concordance statistic with 10-fold cross-validation was used to validate the models. RESULTS: The incidence density of dyslipidemia was 101/1000 overall, and the subgroup incidence densities were 121/1000 for men and 69/1000 for women. Twenty-four predictors were identified in the prediction model for males and 23 in the model for females. The C-statistics of the RSF prediction models were 0.731 for males and 0.801 for females, indicating better discriminative performance than the Cox proportional hazards (CPH) models (0.719 for males and 0.787 for females). Moreover, some predictors were observed to have a nonlinear effect on dyslipidemia. CONCLUSION: The RSF model is a promising method for identifying high-risk individuals for the prevention of dyslipidemia and related diseases.
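
For readers unfamiliar with the validation metric named in record 20, the following is a minimal sketch of Harrell's concordance statistic for right-censored data. It is not the authors' code: the function name `harrell_c` and the toy data are invented for illustration, and tied follow-up times are simply treated as non-comparable, which is a simplification.

```python
def harrell_c(times, events, risk_scores):
    """Harrell's concordance statistic for right-censored survival data.

    A pair (i, j) is comparable when subject i had the event and did so
    strictly before subject j's follow-up time. The pair is concordant
    when subject i also has the higher predicted risk; tied risk scores
    count as half. Pairs with tied times are skipped (simplification).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Invented toy data: follow-up time, event indicator (1 = event,
# 0 = censored), and a model's predicted risk score per subject.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.5, 0.1]
c_index = harrell_c(times, events, risk)  # 4 of 5 comparable pairs concordant: 0.8
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a scale for reading the reported C-statistics of 0.719 to 0.801.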

...