Search | BVS Ecuador Search Portal

GUESS-ing polygenic associations with multiple phenotypes using a GPU-based evolutionary stochastic search algorithm.

Bottolo, Leonardo; Chadeau-Hyam, Marc; Hastie, David I; Zeller, Tanja; Liquet, Benoit; Newcombe, Paul; Yengo, Loic; Wild, Philipp S; Schillert, Arne; Ziegler, Andreas; Nielsen, Sune F; Butterworth, Adam S; Ho, Weang Kee; Castagné, Raphaële; Munzel, Thomas; Tregouet, David; Falchi, Mario; Cambien, François; Nordestgaard, Børge G; Fumeron, Fredéric; Tybjærg-Hansen, Anne; Froguel, Philippe; Danesh, John; Petretto, Enrico; Blankenberg, Stefan; Tiret, Laurence; Richardson, Sylvia.

PLoS Genet ; 9(8): e1003657, 2013.

Article in English | MEDLINE | ID: mdl-23950726

ABSTRACT

Genome-wide association studies (GWAS) yielded significant advances in defining the genetic architecture of complex traits and disease. Still, a major hurdle of GWAS is narrowing down multiple genetic associations to a few causal variants for functional studies. This becomes critical in multi-phenotype GWAS where detection and interpretability of complex SNP(s)-trait(s) associations are complicated by complex Linkage Disequilibrium patterns between SNPs and correlation between traits. Here we propose a computationally efficient algorithm (GUESS) to explore complex genetic-association models and maximize genetic variant detection. We integrated our algorithm with a new Bayesian strategy for multi-phenotype analysis to identify the specific contribution of each SNP to different trait combinations and study genetic regulation of lipid metabolism in the Gutenberg Health Study (GHS). Despite the relatively small size of GHS (n = 3,175), when compared with the largest published meta-GWAS (n > 100,000), GUESS recovered most of the major associations and was better at refining multi-trait associations than alternative methods. Amongst the new findings provided by GUESS, we revealed a strong association of SORT1 with TG-APOB and LIPC with TG-HDL phenotypic groups, which were overlooked in the larger meta-GWAS and not revealed by competing approaches, associations that we replicated in two independent cohorts. Moreover, we demonstrated the increased power of GUESS over alternative multi-phenotype approaches, both Bayesian and non-Bayesian, in a simulation study that mimics real-case scenarios. We showed that our parallel implementation based on Graphics Processing Units outperforms alternative multi-phenotype methods. Beyond multivariate modelling of multi-phenotypes, our Bayesian model employs a flexible hierarchical prior structure for genetic effects that adapts to any correlation structure of the predictors and increases the power to identify associated variants. This provides a powerful tool for the analysis of diverse genomic features, for instance including gene expression and exome sequencing data, where complex dependencies are present in the predictor space.

Subject(s)

Algorithms , Biological Evolution , Genome-Wide Association Study , Quantitative Trait Loci/genetics , Bayes Theorem , Exome/genetics , Gene Expression , Humans , Linkage Disequilibrium , Phenotype , Single Nucleotide Polymorphism/genetics

PReMiuM: An R Package for Profile Regression Mixture Models Using Dirichlet Processes.

Liverani, Silvia; Hastie, David I; Azizi, Lamiae; Papathomas, Michail; Richardson, Sylvia.

J Stat Softw ; 64(7): 1-30, 2015 Mar 20.

Article in English | MEDLINE | ID: mdl-27307779

ABSTRACT

PReMiuM is a recently developed R package for Bayesian clustering using a Dirichlet process mixture model. This model is an alternative to regression models, non-parametrically linking a response vector to covariate data through cluster membership (Molitor, Papathomas, Jerrett, and Richardson 2010). The package allows binary, categorical, count and continuous response, as well as continuous and discrete covariates. Additionally, predictions may be made for the response, and missing values for the covariates are handled. Several samplers and label switching moves are implemented along with diagnostic tools to assess convergence. A number of R functions for post-processing of the output are also provided. In addition to fitting mixtures, it may additionally be of interest to determine which covariates actively drive the mixture components. This is implemented in the package as variable selection.

A semi-parametric approach to estimate risk functions associated with multi-dimensional exposure profiles: application to smoking and lung cancer.

Hastie, David I; Liverani, Silvia; Azizi, Lamiae; Richardson, Sylvia; Stücker, Isabelle.

BMC Med Res Methodol ; 13: 129, 2013 Oct 23.

Article in English | MEDLINE | ID: mdl-24152389

ABSTRACT

BACKGROUND: A common characteristic of environmental epidemiology is the multi-dimensional aspect of exposure patterns, frequently reduced to a cumulative exposure for simplicity of analysis. By adopting a flexible Bayesian clustering approach, we explore the risk function linking exposure history to disease. This approach is applied here to study the relationship between different smoking characteristics and lung cancer in the framework of a population based case control study. METHODS: Our study includes 4658 males (1995 cases, 2663 controls) with full smoking history (intensity, duration, time since cessation, pack-years) from the ICARE multi-centre study conducted from 2001-2007. We extend Bayesian clustering techniques to explore predictive risk surfaces for covariate profiles of interest. RESULTS: We were able to partition the population into 12 clusters with different smoking profiles and lung cancer risk. Our results confirm that when compared to intensity, duration is the predominant driver of risk. On the other hand, using pack-years of cigarette smoking as a single summary leads to a considerable loss of information. CONCLUSIONS: Our method estimates a disease risk associated to a specific exposure profile by robustly accounting for the different dimensions of exposure and will be helpful in general to give further insight into the effect of exposures that are accumulated through different time patterns.
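The information loss from a single pack-years summary can be illustrated with a short arithmetic sketch. It assumes the conventional definition of pack-years (packs of 20 cigarettes per day multiplied by years smoked); the function name and the two smokers are hypothetical and are not taken from the ICARE data.

```python
def pack_years(cigarettes_per_day: float, years_smoked: float) -> float:
    """Cumulative exposure: packs per day (assuming 20 cigarettes per pack) times years smoked."""
    return (cigarettes_per_day / 20.0) * years_smoked


# Two hypothetical smokers with very different intensity and duration.
heavy_short = pack_years(cigarettes_per_day=40, years_smoked=10)
light_long = pack_years(cigarettes_per_day=10, years_smoked=40)

# Both collapse to the same summary value, so the summary alone cannot
# distinguish a duration-driven profile from an intensity-driven one.
print(heavy_short, light_long)
```

Because both profiles map to the same number, any analysis based only on pack-years treats them as identical, which is the kind of loss the abstract describes.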

Subject(s)

Adenocarcinoma/etiology , Lung Neoplasms/etiology , Smoking/adverse effects , Bayes Theorem , Case-Control Studies , Statistical Data Interpretation , Environmental Exposure/adverse effects , Humans , Male , Statistical Models , Multivariate Analysis , Odds Ratio , Risk Factors , Sensitivity and Specificity

ESS++: a C++ objected-oriented algorithm for Bayesian stochastic search model exploration.

Bottolo, Leonardo; Chadeau-Hyam, Marc; Hastie, David I; Langley, Sarah R; Petretto, Enrico; Tiret, Laurence; Tregouet, David; Richardson, Sylvia.

Bioinformatics ; 27(4): 587-8, 2011 Feb 15.

Article in English | MEDLINE | ID: mdl-21233165

ABSTRACT

SUMMARY: ESS++ is a C++ implementation of a fully Bayesian variable selection approach for single and multiple response linear regression. ESS++ works well both when the number of observations is larger than the number of predictors and in the 'large p, small n' case. In the current version, ESS++ can handle several hundred observations, thousands of predictors and a few responses simultaneously. The core engine of ESS++ for the selection of relevant predictors is based on Evolutionary Monte Carlo. Our implementation is open source, allowing community-based alterations and improvements. AVAILABILITY: C++ source code and documentation including compilation instructions are available under GNU licence at http://bgx.org.uk/software/ESS.html.
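As a rough illustration of one ingredient of Evolutionary Monte Carlo as it is commonly described, the sketch below shows an exchange move between two chains run at different temperatures. It is not the ESS++ implementation: the function names, the list-based chain representation, and the temperature ladder are all assumptions made for this example.

```python
import math
import random


def exchange_log_ratio(logpost_i, logpost_j, temp_i, temp_j):
    """Log Metropolis ratio for swapping the states of two tempered chains.

    Chain k targets posterior**(1 / temp_k), so the swap ratio is
    exp((1/temp_i - 1/temp_j) * (logpost_j - logpost_i)).
    """
    return (1.0 / temp_i - 1.0 / temp_j) * (logpost_j - logpost_i)


def try_exchange(states, logposts, temps, i, j, rng=random):
    """Propose swapping chains i and j in place; return True if accepted."""
    log_ratio = exchange_log_ratio(logposts[i], logposts[j], temps[i], temps[j])
    accept_prob = math.exp(min(0.0, log_ratio))
    if rng.random() < accept_prob:
        states[i], states[j] = states[j], states[i]
        logposts[i], logposts[j] = logposts[j], logposts[i]
        return True
    return False


# Hypothetical example: the hotter chain (temperature 2) holds the better model,
# so the swap that moves it to the cold chain is always accepted.
states = [{"x3"}, {"x1", "x7"}]   # sets of selected predictors (illustrative)
logposts = [-10.0, -5.0]          # unnormalised log posteriors (illustrative)
temps = [1.0, 2.0]
accepted = try_exchange(states, logposts, temps, 0, 1)
```

Exchange moves of this kind let well-supported models found by the more freely moving hot chains migrate to the chain that samples the actual posterior.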

Subject(s)

Algorithms , Bayes Theorem , Statistical Models , Programming Languages , Software , Gene Expression Regulation , Linear Models , Stochastic Processes

Sampling from Dirichlet process mixture models with unknown concentration parameter: mixing issues in large data implementations.

Hastie, David I; Liverani, Silvia; Richardson, Sylvia.

Stat Comput ; 25(5): 1023-1037, 2015.

Article in English | MEDLINE | ID: mdl-26321800

ABSTRACT

We consider the question of Markov chain Monte Carlo sampling from a general stick-breaking Dirichlet process mixture model, with concentration parameter [Formula: see text]. This paper introduces a Gibbs sampling algorithm that combines the slice sampling approach of Walker (Communications in Statistics - Simulation and Computation 36:45-54, 2007) and the retrospective sampling approach of Papaspiliopoulos and Roberts (Biometrika 95(1):169-186, 2008). Our general algorithm is implemented as efficient open source C++ software, available as an R package, and is based on a blocking strategy similar to that suggested by Papaspiliopoulos (A note on posterior sampling from Dirichlet mixture models, 2008) and implemented by Yau et al. (Journal of the Royal Statistical Society, Series B (Statistical Methodology) 73:37-57, 2011). We discuss the difficulties of achieving good mixing in MCMC samplers of this nature in large data sets and investigate sensitivity to initialisation. We additionally consider the challenges when an additional layer of hierarchy is added such that joint inference is to be made on [Formula: see text]. We introduce a new label-switching move and compute the marginal partition posterior to help to surmount these difficulties. Our work is illustrated using a profile regression (Molitor et al. Biostatistics 11(3):484-498, 2010) application, where we demonstrate good mixing behaviour for both synthetic and real examples.
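For readers unfamiliar with the stick-breaking representation mentioned in this abstract, the following minimal sketch draws mixture weights from a truncated stick-breaking prior. The fixed truncation is a simplification for illustration only and is not the slice or retrospective sampler the paper describes; the names `alpha` (the concentration parameter) and `n_components` are assumptions introduced here.

```python
import random


def stick_breaking_weights(alpha, n_components, rng=random):
    """Draw the first n_components stick-breaking weights.

    Each break fraction v_k ~ Beta(1, alpha); weight k is v_k times the
    length of stick left after the previous breaks. Returns the weights
    and the unallocated remainder of the stick.
    """
    weights = []
    remaining = 1.0
    for _ in range(n_components):
        v = rng.betavariate(1.0, alpha)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    return weights, remaining


# Hypothetical settings: a seeded generator makes the draw reproducible.
rng = random.Random(0)
weights, leftover = stick_breaking_weights(alpha=2.0, n_components=50, rng=rng)
```

Smaller values of `alpha` concentrate mass on the first few components, while larger values spread it over many, which is why inference on the concentration parameter affects how many clusters the sampler must explore.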
