Results 1 - 20 of 39
1.
Heliyon ; 10(11): e31373, 2024 Jun 15.
Article in English | MEDLINE | ID: mdl-38841513

ABSTRACT

Objective: This study examines the traditional Chinese patent medicine (TCPM) Simo decoction (Simo decoction oral solution), whose primary ingredient, Arecae semen (Binglang, Areca catechu L.), is known for its potential carcinogenic effects. The research aims to analyze the effectiveness and potential risks of Simo decoction, particularly as a carcinogen, and to suggest a framework for evaluating the risks and benefits of other herbal medicines. Methods: The study is based on post-marketing research of Simo decoction and Arecae semen. It utilized a wide range of sources, including ancient and modern literature, focusing on the efficacy and safety of Simo decoction. The research includes retrospective data on the sources, varieties, and toxicological studies of Arecae semen from databases such as PubMed, Clinical Trials, Chinese Clinical Trial Registry, China National Knowledge Infrastructure, WHO-UMC Vigibase, and China National Center for ADR Monitoring. Results: Common adverse drug reactions (ADRs) associated with Simo decoction include skin rash, nausea, vomiting, abdominal pain, and diarrhea. However, no studies have reported severe ADRs, such as carcinogenic effects. Arecae semen is distributed across approximately 60 varieties in tropical Asia and Australia. According to the WHO-UMC Vigibase and the National Adverse Drug Reaction Monitoring System databases, there are currently no reports of toxicity related to Arecae semen in the International System for Classification of ADRs (ISCR) or clinical studies. Conclusion: Risk-benefit analysis presents more challenges for TCPM than for conventional drugs. The development of a practical pharmacovigilance system and risk-benefit analysis framework is crucial for marketing authorization holders, researchers, and regulatory bodies. This approach is vital for scientific supervision and ensuring the safety and efficacy of drug applications, thus protecting public health.

2.
Sci Rep ; 14(1): 3948, 2024 02 17.
Article in English | MEDLINE | ID: mdl-38366092

ABSTRACT

Feature selection is an indispensable step for the analysis of high-dimensional molecular data. Despite its importance, consensus is lacking on how to choose the most appropriate feature selection methods, especially when the performance of the feature selection methods itself depends on hyper-parameters. Bayesian optimization has demonstrated its advantages in automatically configuring the settings of hyper-parameters for various models. However, it remains unclear whether Bayesian optimization can benefit feature selection methods. In this research, we conducted extensive simulation studies to compare the performance of various feature selection methods, with a particular focus on the impact of Bayesian optimization on those where hyper-parameter tuning is needed. We further utilized the gene expression data obtained from the Alzheimer's Disease Neuroimaging Initiative to predict various brain imaging-related phenotypes, where various feature selection methods were employed to mine the data. We found through simulation studies that feature selection methods with hyper-parameters tuned using Bayesian optimization often yield better recall rates, and the analysis of transcriptomic data further revealed that Bayesian optimization-guided feature selection can improve the accuracy of disease risk prediction models. In conclusion, Bayesian optimization can facilitate feature selection methods when hyper-parameter tuning is needed and has the potential to substantially benefit downstream tasks.


Subjects
Gene Expression Profiling , Neuroimaging , Bayes Theorem , Computer Simulation
3.
Front Pharmacol ; 15: 1288479, 2024.
Article in English | MEDLINE | ID: mdl-38318135

ABSTRACT

Background: This study aimed to assess the overall reporting quality of randomized controlled trials (RCTs) in Chinese herbal medicine (CHM) formulas for patients with diabetes, and to identify factors associated with better reporting quality. Methods: Four databases including PubMed, Embase, Cochrane Library and Web of Science were systematically searched from their inception to December 2022. The reporting quality was assessed based on the Consolidated Standards of Reporting Trials (CONSORT) statement and its CHM formula extension. The overall CONSORT and its CHM formula extension scores were calculated and expressed as proportions separately. We also analyzed the pre-specified study characteristics and performed exploratory regressions to determine their associations with the reporting quality. Results: Seventy-two RCTs were included. Overall reporting quality (mean adherence) was 53.56% and 45.71% on the CONSORT statement and its CHM formula extension, respectively. The strongest associations with reporting quality based on the CONSORT statement were multiple centers and larger author numbers. Compliance with the CHM formula extension, particularly regarding the disclosure of the targeted traditional Chinese medicine (TCM) pattern(s), was generally insufficient. Conclusion: The reporting quality of RCTs in CHM formulas for diabetes remains unsatisfactory, and the adherence to the CHM formula extension is even poorer. In order to ensure transparent and standardized reporting of RCTs, it is essential for both authors and editors to advocate for or even mandate adherence to the CONSORT statement and its CHM formula extension when reporting trials in CHM formulas for diabetes.

4.
Biomolecules ; 13(12)2023 12 08.
Article in English | MEDLINE | ID: mdl-38136632

ABSTRACT

The detection of Parkinson's disease (PD) in its early stages is of great importance for its treatment and management, but consensus is lacking on what information is necessary and what models should be used to best predict PD risk. In our study, we first grouped PD-associated factors based on their cost and accessibility, and then gradually incorporated them into risk predictions, which were built using eight commonly used machine learning models to allow for comprehensive assessment. Finally, the Shapley Additive Explanations (SHAP) method was used to investigate the contributions of each factor. We found that models built with demographic variables, hospital admission examinations, clinical assessment, and polygenic risk score achieved the best prediction performance, and the inclusion of invasive biomarkers could not further enhance its accuracy. Among the eight machine learning models considered, penalized logistic regression and XGBoost were the most accurate algorithms for assessing PD risk, with penalized logistic regression achieving an area under the curve of 0.94 and a Brier score of 0.08. Olfactory function and polygenic risk scores were the most important predictors for PD risk. Our research has offered a practical framework for PD risk assessment, where necessary information and efficient machine learning tools were highlighted.


Subjects
Parkinson Disease , Humans , Parkinson Disease/diagnosis , Parkinson Disease/genetics , Algorithms , Genetic Risk Score , Hospitalization , Machine Learning
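
The abstract above summarizes model accuracy with an area under the curve (AUC) and a Brier score. As a minimal sketch of how these two metrics are computed from predicted probabilities, the following uses plain Python; the function names and the toy data are hypothetical and are not taken from the study.

```python
def brier_score(y_true, p_pred):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def auc(y_true, p_pred):
    """Probability that a random case is scored above a random control.

    Ties between a case and a control count as half a win.
    """
    cases = [p for y, p in zip(y_true, p_pred) if y == 1]
    controls = [p for y, p in zip(y_true, p_pred) if y == 0]
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: three cases (1) and three controls (0).
y = [1, 1, 1, 0, 0, 0]
p = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]

print("AUC:", auc(y, p))
print("Brier score:", brier_score(y, p))
```

A higher AUC indicates better discrimination between cases and controls, while a lower Brier score indicates predicted probabilities that sit closer to the observed outcomes.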
5.
Front Genet ; 14: 1267704, 2023.
Article in English | MEDLINE | ID: mdl-37928242

ABSTRACT

Motivation: Family-based study design is one of the popular designs used in genetic research, and the whole-genome sequencing data obtained from family-based studies offer many unique features for risk prediction studies. They can not only provide a more comprehensive view of many complex diseases, but also utilize information in the design to further improve the prediction accuracy. While promising, existing analytical methods often ignore the information embedded in the study design and overlook the predictive effects of rare variants, leading to a prediction model with sub-optimal performance. Results: We proposed a Bayesian linear mixed model for the prediction analysis of sequencing data obtained from family-based studies. Our method can not only capture predictive effects from both common and rare variants, but also easily accommodate various disease model assumptions. It uses information embedded in the study design to form surrogates, where the predictive effects from unmeasured/unknown genetic and environmental risk factors can be modelled. Through extensive simulation studies and the analysis of sequencing data obtained from the Michigan State University Twin Registry study, we have demonstrated that the proposed method outperforms commonly adopted techniques. Availability: R package is available at https://github.com/yhai943/FBLMM.

6.
Bioinformatics ; 39(11)2023 11 01.
Article in English | MEDLINE | ID: mdl-37882747

ABSTRACT

MOTIVATION: Accurate disease risk prediction is an essential step in the modern quest for precision medicine. While high-dimensional multi-omics data have provided unprecedented data resources for prediction studies, their high-dimensionality and complex inter/intra-relationships have posed significant analytical challenges. RESULTS: We proposed a two-step Bayesian linear mixed model framework (TBLMM) for risk prediction analysis on multi-omics data. TBLMM models the predictive effects from multi-omics data using a hybrid of the sparsity regression and linear mixed model with multiple random effects. It can resemble the shape of the true effect size distributions and account for non-linear effects, including interaction effects, among multi-omics data via kernel fusion. It infers its parameters via a computationally efficient variational Bayes algorithm. Through extensive simulation studies and the prediction analyses on the positron emission tomography imaging outcomes using data obtained from the Alzheimer's Disease Neuroimaging Initiative, we have demonstrated that TBLMM can consistently outperform existing methods in predicting the risk of complex traits. AVAILABILITY AND IMPLEMENTATION: The corresponding R package is available on GitHub (https://github.com/YaluWen/TBLMM).


Subjects
Algorithms , Multiomics , Bayes Theorem , Linear Models , Computer Simulation
7.
Front Psychiatry ; 14: 1144697, 2023.
Article in English | MEDLINE | ID: mdl-37426090

ABSTRACT

Introduction: The comorbidity between major depressive disorder (MDD) and coronavirus disease of 2019 (COVID-19) related traits has long been identified in clinical settings, but their shared genetic foundation and causal relationships are unknown. Here, we investigated the genetic mechanisms behind COVID-19 related traits and MDD using cross-trait meta-analysis, and evaluated the underlying causal relationships between MDD and 3 different COVID-19 outcomes (severe COVID-19, hospitalized COVID-19, and COVID-19 infection). Methods: In this study, we conducted a comprehensive analysis using the most up-to-date and publicly available GWAS summary statistics to explore the shared genetic etiology and the causality between MDD and COVID-19 outcomes. We first used genome-wide cross-trait meta-analysis to identify the pleiotropic genomic SNPs and the genes shared by MDD and COVID-19 outcomes, and then explored the potential bidirectional causal relationships between MDD and COVID-19 outcomes by implementing a bidirectional Mendelian randomization (MR) study design. We further conducted functional annotation analyses to obtain biological insight for shared genes from the results of cross-trait meta-analysis. Results: We have identified 71 SNPs, located on 25 different genes, that are shared between MDD and COVID-19 outcomes. We have also found that genetic liability to MDD is a causal factor for COVID-19 outcomes. In particular, we found that MDD has a causal effect on severe COVID-19 (OR = 1.832, 95% CI = 1.037-3.236) and hospitalized COVID-19 (OR = 1.412, 95% CI = 1.021-1.953). Functional analysis suggested that the shared genes are enriched in Cushing syndrome and neuroactive ligand-receptor interaction. Discussion: Our findings provide convincing evidence on shared genetic etiology and causal relationships between MDD and COVID-19 outcomes, which is crucial to the prevention and therapeutic treatment of MDD and COVID-19.
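
The odds ratios and confidence intervals reported above can be related to a test statistic with a short calculation. The sketch below assumes that each 95% CI is a symmetric Wald interval on the log-odds scale, which the abstract does not state; the function name `wald_from_or_ci` is hypothetical. Under that assumption it recovers the standard error, z statistic, and two-sided p-value from an odds ratio and its interval.

```python
import math


def wald_from_or_ci(or_est, ci_low, ci_high, z_crit=1.959964):
    """Recover log(OR), its standard error, z, and a two-sided p-value.

    Assumes the CI is log(OR) +/- z_crit * SE (a Wald interval).
    """
    log_or = math.log(or_est)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided normal tail probability: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return log_or, se, z, p


# Estimates as reported in the abstract.
for label, est in [("severe COVID-19", (1.832, 1.037, 3.236)),
                   ("hospitalized COVID-19", (1.412, 1.021, 1.953))]:
    log_or, se, z, p = wald_from_or_ci(*est)
    print(f"{label}: log(OR)={log_or:.3f}, SE={se:.3f}, z={z:.2f}, p={p:.3f}")
```

Both intervals exclude 1, so under the Wald assumption each recovered p-value falls below 0.05, consistent with the causal effects described in the abstract.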

8.
Sci Rep ; 13(1): 5478, 2023 04 04.
Article in English | MEDLINE | ID: mdl-37015993

ABSTRACT

While high-dimensional biological data have provided unprecedented data resources for the identification of biomarkers, consensus is still lacking on how to best analyze them. The recently developed Gaussian mirror (GM) and Model-X (MX) knockoff-based methods have closely related model assumptions, which makes them appealing for the detection of new biomarkers. However, there are no guidelines for their practical use. In this research, we systematically compared the performance of MX-based and GM methods, where the impacts of the distribution of explanatory variables, their relatedness and the signal-to-noise ratio were evaluated. MX with knockoffs generated using the second-order approximation (MX-SO) has the best performance compared with other MX-based methods. MX-SO and GM have similar levels of power and computational speed under most of the simulations, but GM is more robust in the control of false discovery rate (FDR). In particular, MX-SO can only control the FDR well when there are weak correlations among explanatory variables and the sample size is at least moderate. In contrast, GM can have the desired FDR as long as explanatory variables are not highly correlated. We further used GM and MX-based methods to detect biomarkers that are associated with the Alzheimer's disease-related PET-imaging trait and the Parkinson's disease-related T-tau of cerebrospinal fluid. We found that MX-based and GM methods are both powerful for the analysis of big biological data. Although genes selected from MX-based methods are more similar to one another than to those from the GM method, both MX-based and GM methods can identify the well-known disease-associated genes for each disease. While MX-based methods can have slightly higher power than the GM method, they are less robust, especially for data with small sample sizes, unknown distributions, and high correlations.


Subjects
Alzheimer Disease , Parkinson Disease , Humans , Alzheimer Disease/genetics , Alzheimer Disease/cerebrospinal fluid , Biomarkers/cerebrospinal fluid , Phenotype , Normal Distribution , Parkinson Disease/genetics
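
The abstract above compares methods by how well they control the false discovery rate (FDR). As an illustration of the selection step that knockoff-type procedures commonly share, the sketch below applies the standard knockoff+ threshold rule to a vector of feature statistics `w`, where large positive values favor a real feature over its knockoff. This is a generic sketch, not the authors' code: how `w` is computed is not shown, and the function names and toy values are hypothetical.

```python
def knockoff_plus_threshold(w, q):
    """Smallest t with (1 + #{w_j <= -t}) / max(1, #{w_j >= t}) <= q.

    Returns infinity when no candidate threshold meets the target level q.
    """
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        neg = sum(1 for x in w if x <= -t)  # estimated false discoveries
        pos = sum(1 for x in w if x >= t)   # features that would be selected
        if (1 + neg) / max(1, pos) <= q:
            return t
    return float("inf")


def select_features(w, q):
    """Indices of features whose statistic reaches the knockoff+ threshold."""
    t = knockoff_plus_threshold(w, q)
    return [j for j, x in enumerate(w) if x >= t]


# Hypothetical statistics for ten features, target FDR level 0.2.
w = [5.0, 4.0, 3.5, 3.0, 2.5, 2.0, -1.0, 0.5, -0.3, 1.5]
print(knockoff_plus_threshold(w, 0.2))
print(select_features(w, 0.2))
```

The count of negative statistics below `-t` serves as an estimate of how many positive statistics above `t` are false, which is why the ratio in the rule acts as an FDR estimate.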
9.
Front Genet ; 13: 1017380, 2022.
Article in English | MEDLINE | ID: mdl-36276959

ABSTRACT

Brain imaging outcomes are important for Alzheimer's disease (AD) detection, and their prediction based on both genetic and demographic risk factors can facilitate the ongoing prevention and treatment of AD. Existing studies have identified numerous significantly AD-associated SNPs. However, how to make the best use of them for prediction analyses remains unknown. In this research, we first explored the relationship between genetic architecture and prediction accuracy of linear mixed models via visualizing the Manhattan plots generated based on the data obtained from the Wellcome Trust Case Control Consortium, and then constructed prediction models for eleven AD-related brain imaging outcomes using data from United Kingdom Biobank and Alzheimer's Disease Neuroimaging Initiative studies. We found that the simple Manhattan plots can be informative for the selection of prediction models. For traits that do not exhibit any significant signals from the Manhattan plots, the simple genomic best linear unbiased prediction (gBLUP) model is recommended due to its robust and accurate prediction performance as well as its computational efficiency. For diseases and traits that show spiked signals on the Manhattan plots, the latent Dirichlet process regression is preferred, as it can flexibly accommodate both the oligogenic and omnigenic models. For the prediction of AD-related traits, the Manhattan plots suggest their polygenic nature, and gBLUP has achieved robust performance for all these traits. We found that for these AD-related traits, genetic factors themselves only explain a very small proportion of the heritability, and the well-known AD risk factors can substantially improve the prediction model.

10.
Bioinformatics ; 38(23): 5222-5228, 2022 11 30.
Article in English | MEDLINE | ID: mdl-36205617

ABSTRACT

MOTIVATION: Linear mixed models (LMMs) have long been the method of choice for risk prediction analysis on high-dimensional data. However, it remains computationally challenging to simultaneously model a large number of variants that can be noise or have predictive effects of complex forms. RESULTS: In this work, we have developed a penalized LMM with generalized method of moments (pLMMGMM) estimators for prediction analysis. pLMMGMM is built within the LMM framework, where random effects are used to model the joint predictive effects from all variants within a region. Different from existing methods that focus on linear relationships and use empirical criteria for variable screening, pLMMGMM can efficiently detect regions that harbor genetic variants with both linear and non-linear predictive effects. In addition, unlike existing LMMs that can only handle a very limited number of random effects, pLMMGMM is much less computationally demanding. It can jointly consider a large number of regions and accurately detect those that are predictive. Through theoretical investigations, we have shown that our method has selection consistency and asymptotic normality. Through extensive simulations and the analysis of PET-imaging outcomes, we have demonstrated that pLMMGMM outperforms existing models and can accurately detect regions that harbor risk factors with various forms of predictive effects. AVAILABILITY AND IMPLEMENTATION: The R-package is available at https://github.com/XiaQiong/GMMLasso. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Linear Models , Phenotype
11.
PLoS Comput Biol ; 18(7): e1010328, 2022 07.
Article in English | MEDLINE | ID: mdl-35839250

ABSTRACT

Building an accurate disease risk prediction model is an essential step in the modern quest for precision medicine. While high-dimensional genomic data provide valuable data resources for the investigation of disease risk, their huge amount of noise and the complex relationships between predictors and outcomes have brought tremendous analytical challenges. Deep learning models are the state-of-the-art methods for many prediction tasks, and they offer a promising framework for the analysis of genomic data. However, deep learning models generally suffer from the curse of dimensionality and the lack of biological interpretability, both of which have greatly limited their applications. In this work, we have developed a deep neural network (DNN) based prediction modeling framework. We first proposed a group-wise feature importance score for feature selection, where genes harboring genetic variants with both linear and non-linear effects are efficiently detected. We then designed an explainable transfer-learning based DNN method, which can directly incorporate information from feature selection and accurately capture complex predictive effects. The proposed DNN-framework is biologically interpretable, as it is built based on the selected predictive genes. It is also computationally efficient and can be applied to genome-wide data. Through extensive simulations and real data analyses, we have demonstrated that our proposed method can not only efficiently detect predictive features, but also accurately predict disease risk, as compared to many existing methods.


Subjects
Genomics , Neural Networks, Computer , Machine Learning
12.
Brief Bioinform ; 23(4)2022 07 18.
Article in English | MEDLINE | ID: mdl-35649346

ABSTRACT

With the advances in high-throughput biotechnologies, high-dimensional multi-layer omics data become increasingly available. They can provide both confirmatory and complementary information to disease risk and thus have offered unprecedented opportunities for risk prediction studies. However, the high-dimensionality and complex inter/intra-relationships among multi-omics data have brought tremendous analytical challenges. Here we present a computationally efficient penalized linear mixed model with generalized method of moments estimator (MpLMMGMM) for the prediction analysis on multi-omics data. Our method extends the widely used linear mixed model proposed for genomic risk predictions to model multi-omics data, where kernel functions are used to capture various types of predictive effects from different layers of omics data and penalty terms are introduced to reduce the impact of noise. Compared with existing penalized linear mixed models, the proposed method adopts the generalized method of moments estimator and it is much more computationally efficient. Through extensive simulation studies and the analysis of positron emission tomography imaging outcomes, we have demonstrated that MpLMMGMM can simultaneously consider a large number of variables and efficiently select those that are predictive from the corresponding omics layers. It can capture both linear and nonlinear predictive effects and achieves better prediction performance than competing methods.


Subjects
Algorithms , Genomics , Genome , Genomics/methods , Linear Models , Research Design
14.
Stat Med ; 41(3): 517-542, 2022 02 10.
Article in English | MEDLINE | ID: mdl-34811777

ABSTRACT

Converging evidence from genetic studies and population genetics theory suggests that complex diseases are characterized by remarkable genetic heterogeneity, and individual rare mutations with different effects could collectively play an important role in human diseases. Many existing statistical models for association analysis assume homogeneous effects of genetic variants across all individuals, and could be subject to power loss in the presence of genetic heterogeneity. To consider possible heterogeneous genetic effects among individuals, we propose a conditional autoregressive model. In the proposed method, the genetic effect is considered as a random effect and a score test is developed to test the variance component of the genetic random effect. Through simulations, we compare the type I error and power performance of the proposed method with those of the generalized genetic random field and the sequence kernel association test methods under different disease scenarios. We find that our method outperforms the other two methods when (i) the rare variants have the major contribution to the disease, or (ii) the genetic effects vary in different individuals or subgroups of individuals. Finally, we illustrate the new method by applying it to the whole genome sequencing data from the Alzheimer's Disease Neuroimaging Initiative.


Subjects
Genetic Heterogeneity , Models, Genetic , Genetic Testing , Genetic Variation , Humans , Models, Statistical
15.
Sci Total Environ ; 804: 150194, 2022 Jan 15.
Article in English | MEDLINE | ID: mdl-34798737

ABSTRACT

Biochar has been utilized as a renewable biomass resource to develop sustainable and eco-friendly pavements. This study focuses on the influence of biochar as an asphalt modifier on the improvement of high-temperature performance of asphalt. A series of tests were performed to comprehensively evaluate the high-temperature performance of the biochar-modified binder. The interaction mechanism between the biochar and the binder was explored using scanning electron microscopy and Fourier-transform infrared spectroscopy (FTIR). The results indicated that the complex modulus and penetration of the biochar-modified asphalt binder could be increased by up to 35% and 36.5%, respectively, compared with those of the matrix asphalt, thereby improving the deformation resistance. In addition, the observed increase in the complex modulus, rutting factor, and viscosity-temperature index contributed to the improvement of temperature sensitivity and anti-rutting properties. These relationships are attributed to the fact that biochar has a fibrous porous structure and forms a skeleton and stiffening zone in the binder. Although biochar has a negative effect on the low-temperature properties of the binder, this can be alleviated by controlling the biochar content. Moreover, the FTIR results showed that no new chemical functional groups appeared after the incorporation of biochar into the binder. The internal chemical environment of the biochar-modified asphalt binder was different from that of the matrix asphalt. In conclusion, biochar is feasible as a modifier for binders owing to its high-temperature properties.


Subjects
Charcoal , Hydrocarbons , Temperature
16.
Zhongguo Zhong Yao Za Zhi ; 46(7): 1839-1845, 2021 Apr.
Article in Chinese | MEDLINE | ID: mdl-33982489

ABSTRACT

According to the notices on revision of the instructions for traditional Chinese medicine injections (TCMIs) issued by the National Medical Products Administration (NMPA) from January 2006 to May 2020, the revised contents in the instructions for the 29 varieties involved in the notices were sorted out, and the existing problems in the instructions for TCMIs were analyzed, so as to provide a basis for dynamic revision of the instructions. It was found that the revised items of the instructions for the 29 varieties all involved adverse reactions, contraindications and precautions, and warnings were added for 82.76% of the 29 TCMI preparations, indicating that all the revised contents were related to safety issues. In addition, 33.33% of the drug risks mentioned in the precautions were not indicated in the adverse reactions; 82.76% of the instructions did not indicate drug interactions; 17.24% lacked medication notes for special populations; 48.28% did not indicate the traditional Chinese medicine (TCM) syndromes of the main disease; 44.83% did not indicate the type and stage of the indication; and 86.21% did not indicate the course of treatment. It could be concluded that known drug risks are not fully reflected in the adverse reactions of the instructions for TCMIs and that the information on effectiveness is not comprehensive. The risk control measures proposed in the precautions require follow-up evaluation of their effect, and information on drug interactions and medication for special populations is lacking. As an important part of the full life-cycle management of drugs, the revision of instructions for TCMIs should be continuously improved to provide a basis for the safe and reasonable application of TCMIs. Based on the above problems, it is proposed that the marketing license holder, as the main body responsible for the revision of instructions, should actively carry out post-marketing basic and clinical research in accordance with the characteristics of TCM, combine the updated research with the guidance of TCM theory, and improve the revision level of instructions for TCMIs to provide a basis for post-marketing evaluation.


Subjects
Drugs, Chinese Herbal , Medicine, Chinese Traditional , Humans , Injections , Syndrome
17.
Bioinformatics ; 36(22-23): 5415-5423, 2021 Apr 01.
Article in English | MEDLINE | ID: mdl-33331865

ABSTRACT

MOTIVATION: Accurate disease risk prediction is essential for precision medicine. Existing models assume either that diseases are caused by groups of predictors with small-to-moderate effects or that they are caused by a few isolated predictors with large effects. Their performance can be sensitive to the underlying disease mechanisms, which are usually unknown in advance. RESULTS: We developed a Bayesian linear mixed model (BLMM), where genetic effects were modelled using a hybrid of the sparsity regression and linear mixed model with multiple random effects. The parameters in BLMM were inferred through a computationally efficient variational Bayes algorithm. The proposed method can resemble the shape of the true effect size distributions, capture the predictive effects from both common and rare variants, and remain robust against various disease models. Through extensive simulations and the application to a whole-genome sequencing dataset obtained from the Alzheimer's Disease Neuroimaging Initiative, we have demonstrated that BLMM has better prediction performance than existing methods and can detect variables and/or genetic regions that are predictive. AVAILABILITY AND IMPLEMENTATION: The R-package is available at https://github.com/yhai943/BLMM. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

18.
Stat Med ; 39(9): 1311-1327, 2020 04 30.
Article in English | MEDLINE | ID: mdl-31985088

ABSTRACT

Linear mixed models (LMMs) and their extensions have been widely used for high-dimensional genomic data analyses. While LMMs hold great promise for risk prediction research, the high dimensionality of the data and different effect sizes of genomic regions bring great analytical and computational challenges. In this work, we present a multikernel linear mixed model with adaptive lasso (KLMM-AL) to predict phenotypes using high-dimensional genomic data. We develop two algorithms for estimating parameters from our model and also establish the asymptotic properties of LMM with adaptive lasso when only one dependent observation is available. The proposed KLMM-AL can account for heterogeneous effect sizes from different genomic regions, capture both additive and nonadditive genetic effects, and adaptively and efficiently select predictive genomic regions and their corresponding effects. Through simulation studies, we demonstrate that KLMM-AL outperforms most existing methods. Moreover, KLMM-AL achieves high sensitivity and specificity in selecting predictive genomic regions. KLMM-AL is further illustrated by an application to the sequencing dataset obtained from the Alzheimer's Disease Neuroimaging Initiative.


Subjects
Algorithms , Genomics , Computer Simulation , Linear Models , Phenotype
19.
Bioinformatics ; 36(8): 2365-2374, 2020 04 15.
Article in English | MEDLINE | ID: mdl-31913435

ABSTRACT

MOTIVATION: The emerging multilayer omics data provide unprecedented opportunities for detecting biomarkers that are associated with complex diseases at various molecular levels. However, the high-dimensionality of multiomics data and the complex disease etiologies have brought tremendous analytical challenges. RESULTS: We developed a U-statistics-based non-parametric framework for the association analysis of multilayer omics data, where consensus and permutation-based weighting schemes are developed to account for various types of disease models. Our proposed method is flexible for analyzing different types of outcomes as it makes no assumptions about their distributions. Moreover, it explicitly accounts for various types of underlying disease models through weighting schemes and thus provides robust performance against them. Through extensive simulations and the application to a dataset obtained from the Alzheimer's Disease Neuroimaging Initiative, we demonstrated that our method outperformed the commonly used kernel regression-based methods. AVAILABILITY AND IMPLEMENTATION: The R-package is available at https://github.com/YaluWen/Uomic. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Genomics , Proteomics , Research Design , Biomarkers , Software
20.
Stat Methods Med Res ; 29(1): 44-56, 2020 01.
Article in English | MEDLINE | ID: mdl-30612522

ABSTRACT

Genetic association studies using high-throughput genotyping and sequencing technologies have identified a large number of genetic variants associated with complex human diseases. These findings have provided an unprecedented opportunity to identify individuals in the population at high risk for disease who carry causal genetic mutations and hold great promise for early intervention and individualized medicine. While interest is high in building risk prediction models based on recent genetic findings, it is crucial to have appropriate statistical measurements to assess the performance of a genetic risk prediction model. Predictiveness curves were recently proposed as a graphic tool for evaluating a risk prediction model on the basis of a single continuous biomarker. The curve evaluates a risk prediction model for classification performance as well as its usefulness when applied to a population. In this article, we extend the predictiveness curve to measure the collective contribution of multiple genetic variants. We further propose a nonparametric, U-statistics-based measurement, referred to as the U-Index, to quantify the performance of a multi-locus predictiveness curve. In particular, a global U-Index and a partial U-Index can be used in the general population and a subpopulation of particular clinical interest, respectively. Through simulation studies, we demonstrate that the proposed U-Index has advantages over several existing summary statistics under various disease models. We also show that the partial U-Index can have its own uniqueness when rare variants have a substantial contribution to disease risk. Finally, we use the proposed predictiveness curve and its corresponding U-Index to evaluate the performance of a genetic risk prediction model for nicotine dependence.


Subjects
Genetic Predisposition to Disease , Models, Genetic , Models, Statistical , Tobacco Use Disorder/genetics , Biomarkers/analysis , Genetic Variation , Genome-Wide Association Study , Genotype , Humans , Predictive Value of Tests , Risk Assessment
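
The predictiveness curve discussed in the abstract above plots, for each population fraction v, the v-th quantile of predicted risk. The following minimal sketch builds the empirical curve for a single risk score; it does not implement the multi-locus extension or the U-Index, whose definitions are not given in the abstract. Function names and the toy risks are hypothetical.

```python
def predictiveness_curve(risks):
    """Empirical curve: pairs (v, R(v)), with R(v) the v-th risk quantile.

    For n individuals, the k-th smallest risk is paired with v = k / n.
    """
    ordered = sorted(risks)
    n = len(ordered)
    return [((k + 1) / n, r) for k, r in enumerate(ordered)]


def proportion_above(risks, threshold):
    """Share of the population whose predicted risk exceeds a threshold."""
    return sum(1 for r in risks if r > threshold) / len(risks)


# Hypothetical predicted risks for four individuals.
risks = [0.30, 0.10, 0.20, 0.40]
for v, r in predictiveness_curve(risks):
    print(f"v={v:.2f}  risk quantile={r:.2f}")
print("Proportion with risk > 0.25:", proportion_above(risks, 0.25))
```

A steeper curve means the model spreads individuals across a wider range of risks, which is what makes it useful for separating high-risk from low-risk subgroups.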
...