Search | PAHO/WHO Uruguay

Improving GWAS discovery and genomic prediction accuracy in biobank data.

Orliac, Etienne J; Trejo Banos, Daniel; Ojavee, Sven E; Läll, Kristi; Mägi, Reedik; Visscher, Peter M; Robinson, Matthew R.

Proc Natl Acad Sci U S A ; 119(31): e2121279119, 2022 08 02.

Article in English | MEDLINE | ID: mdl-35905320

ABSTRACT

Genetically informed, deep-phenotyped biobanks are an important research resource and it is imperative that the most powerful, versatile, and efficient analysis approaches are used. Here, we apply our recently developed Bayesian grouped mixture of regressions model (GMRM) in the UK and Estonian Biobanks and obtain the highest genomic prediction accuracy reported to date across 21 heritable traits. When compared to other approaches, GMRM accuracy was greater than annotation prediction models run in the LDAK or LDPred-funct software by 15% (SE 7%) and 14% (SE 2%), respectively, and was 18% (SE 3%) greater than a baseline BayesR model without single-nucleotide polymorphism (SNP) markers grouped into minor allele frequency-linkage disequilibrium (MAF-LD) annotation categories. For height, the prediction accuracy $R^2$ was 47% in a UK Biobank holdout sample, which was 76% of the estimated $h^2_{\mathrm{SNP}}$. We then extend our GMRM prediction model to provide mixed-linear model association (MLMA) SNP marker estimates for genome-wide association (GWAS) discovery, which increased the independent loci detected to 16,162 in unrelated UK Biobank individuals, compared to 10,550 from BoltLMM and 10,095 from Regenie, a 62 and 65% increase, respectively. The average $\chi^2$ value of the leading markers increased by 15.24 (SE 0.41) for every 1% increase in prediction accuracy gained over a baseline BayesR model across the traits. Thus, we show that modeling genetic associations accounting for MAF and LD differences among SNP markers, and incorporating prior knowledge of genomic function, is important for both genomic prediction and discovery in large-scale individual-level studies.

Subject(s)

Genetic Databases, Genome-Wide Association Study, Precision Medicine, Heritable Quantitative Trait, Bayes Theorem, England, Estonia, Genomics, Genotype, Humans, Phenotype, Single Nucleotide Polymorphism

Closed-loop cycles of experiment design, execution, and learning accelerate systems biology model development in yeast.

Coutant, Anthony; Roper, Katherine; Trejo-Banos, Daniel; Bouthinon, Dominique; Carpenter, Martin; Grzebyta, Jacek; Santini, Guillaume; Soldano, Henry; Elati, Mohamed; Ramon, Jan; Rouveirol, Celine; Soldatova, Larisa N; King, Ross D.

Proc Natl Acad Sci U S A ; 116(36): 18142-18147, 2019 09 03.

Article in English | MEDLINE | ID: mdl-31420515

ABSTRACT

One of the most challenging tasks in modern science is the development of systems biology models: Existing models are often very complex but generally have low predictive performance. The construction of high-fidelity models will require hundreds/thousands of cycles of model improvement, yet few current systems biology research studies complete even a single cycle. We combined multiple software tools with integrated laboratory robotics to execute three cycles of model improvement of the prototypical eukaryotic cellular transformation, the yeast (Saccharomyces cerevisiae) diauxic shift. In the first cycle, a model outperforming the best previous diauxic shift model was developed using bioinformatic and systems biology tools. In the second cycle, the model was further improved using automatically planned experiments. In the third cycle, hypothesis-led experiments improved the model to a greater extent than achieved using high-throughput experiments. All of the experiments were formalized and communicated to a cloud laboratory automation system (Eve) for automatic execution, and the results stored on the semantic web for reuse. The final model adds a substantial amount of knowledge about the yeast diauxic shift: 92 genes (+45%), and 1,048 interactions (+147%). This knowledge is also relevant to understanding cancer, the immune system, and aging. We conclude that systems biology software tools can be combined and integrated with laboratory robots in closed-loop cycles.

Subject(s)

Computational Biology, Fungal Gene Expression Regulation, Robotics, Saccharomyces cerevisiae, Software, Systems Biology, Saccharomyces cerevisiae/genetics, Saccharomyces cerevisiae/metabolism

A Bayesian approach for structure learning in oscillating regulatory networks.

Trejo Banos, Daniel; Millar, Andrew J; Sanguinetti, Guido.

Bioinformatics ; 31(22): 3617-24, 2015 Nov 15.

Article in English | MEDLINE | ID: mdl-26177966

ABSTRACT

MOTIVATION: Oscillations lie at the core of many biological processes, from the cell cycle, to circadian oscillations and developmental processes. Time-keeping mechanisms are essential to enable organisms to adapt to varying conditions in environmental cycles, from day/night to seasonal. Transcriptional regulatory networks are one of the mechanisms behind these biological oscillations. However, while identifying cyclically expressed genes from time series measurements is relatively easy, determining the structure of the interaction network underpinning the oscillation is a far more challenging problem. RESULTS: Here, we explicitly leverage the oscillatory nature of the transcriptional signals and present a method for reconstructing network interactions tailored to this special but important class of genetic circuits. Our method is based on projecting the signal onto a set of oscillatory basis functions using a Discrete Fourier Transform. We build a Bayesian Hierarchical model within a frequency domain linear model in order to enforce sparsity and incorporate prior knowledge about the network structure. Experiments on real and simulated data show that the method can lead to substantial improvements over competing approaches if the oscillatory assumption is met, and remains competitive also in cases it is not. AVAILABILITY: DSS, experiment scripts and data are available at http://homepages.inf.ed.ac.uk/gsanguin/DSS.zip. CONTACT: d.trejo-banos@sms.ed.ac.uk SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
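
The projection step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' DSS code and it omits the Bayesian hierarchical model entirely; it only shows how a Discrete Fourier Transform expresses a time series in terms of oscillatory basis functions. The function names (`dft`, `dominant_frequency`) and the synthetic expression profile are assumptions made for illustration.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform: one complex coefficient per frequency index."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def dominant_frequency(signal):
    """Frequency index (cycles per sampling window) with the largest non-constant component."""
    coeffs = dft(signal)
    half = len(signal) // 2
    return max(range(1, half + 1), key=lambda k: abs(coeffs[k]))


# Hypothetical expression profile: 48 hourly samples, baseline 5.0,
# amplitude 2.0, period 24 h (i.e. 2 full cycles in the window).
n_samples = 48
expression = [
    5.0 + 2.0 * math.cos(2 * math.pi * 2 * t / n_samples) for t in range(n_samples)
]

coeffs = dft(expression)
peak = dominant_frequency(expression)

# For a real cosine at a frequency strictly between 0 and n/2,
# the amplitude is 2 * |coefficient| / n.
amplitude = 2 * abs(coeffs[peak]) / n_samples
baseline = coeffs[0].real / n_samples

print(peak, round(amplitude, 6), round(baseline, 6))
```

In this toy case a single frequency component carries the whole oscillation, which is the kind of compact representation a frequency-domain linear model can exploit; real expression data would be noisy and spread over several components.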

Subject(s)

Gene Regulatory Networks, Arabidopsis/genetics, Bayes Theorem, Cell Cycle/genetics, Circadian Clocks/genetics, Computer Simulation, Genetic Databases, Gene Expression Profiling, Fungal Gene Expression Regulation, Plant Gene Expression Regulation, Saccharomyces cerevisiae/cytology, Saccharomyces cerevisiae/genetics

Genomic architecture and prediction of censored time-to-event phenotypes with a Bayesian genome-wide analysis.

Ojavee, Sven E; Kousathanas, Athanasios; Trejo Banos, Daniel; Orliac, Etienne J; Patxot, Marion; Läll, Kristi; Mägi, Reedik; Fischer, Krista; Kutalik, Zoltan; Robinson, Matthew R.

Nat Commun ; 12(1): 2337, 2021 04 20.

Article in English | MEDLINE | ID: mdl-33879782

ABSTRACT

While recent advancements in computation and modelling have improved the analysis of complex traits, our understanding of the genetic basis of the time at symptom onset remains limited. Here, we develop a Bayesian approach (BayesW) that provides probabilistic inference of the genetic architecture of age-at-onset phenotypes in a sampling scheme that facilitates biobank-scale time-to-event analyses. We show in extensive simulation work the benefits BayesW provides in terms of number of discoveries, model performance and genomic prediction. In the UK Biobank, we find many thousands of common genomic regions underlying the age-at-onset of high blood pressure (HBP), cardiac disease (CAD), and type-2 diabetes (T2D), and for the genetic basis of onset reflecting the underlying genetic liability to disease. Age-at-menopause and age-at-menarche are also highly polygenic, but with higher variance contributed by low frequency variants. Genomic prediction into the Estonian Biobank data shows that BayesW gives higher prediction accuracy than other approaches.

Subject(s)

Age of Onset, Human Genome, Genetic Models, Multifactorial Inheritance, Age Factors, Algorithms, Bayes Theorem, Cardiovascular Diseases/genetics, Computer Simulation, Genetic Databases, Type 2 Diabetes Mellitus/genetics, Estonia, Female, Genetic Association Studies, Genome-Wide Association Study, Genomics, Humans, Hypertension/genetics, Menarche/genetics, Menopause/genetics, Phenotype, Single Nucleotide Polymorphism, United Kingdom

Multi-method genome- and epigenome-wide studies of inflammatory protein levels in healthy older adults.

Hillary, Robert F; Trejo-Banos, Daniel; Kousathanas, Athanasios; McCartney, Daniel L; Harris, Sarah E; Stevenson, Anna J; Patxot, Marion; Ojavee, Sven Erik; Zhang, Qian; Liewald, David C; Ritchie, Craig W; Evans, Kathryn L; Tucker-Drob, Elliot M; Wray, Naomi R; McRae, Allan F; Visscher, Peter M; Deary, Ian J; Robinson, Matthew R; Marioni, Riccardo E.

Genome Med ; 12(1): 60, 2020 07 08.

Article in English | MEDLINE | ID: mdl-32641083

ABSTRACT

BACKGROUND: The molecular factors which control circulating levels of inflammatory proteins are not well understood. Furthermore, association studies between molecular probes and human traits are often performed by linear model-based methods which may fail to account for complex structure and interrelationships within molecular datasets. METHODS: In this study, we perform genome- and epigenome-wide association studies (GWAS/EWAS) on the levels of 70 plasma-derived inflammatory protein biomarkers in healthy older adults (Lothian Birth Cohort 1936; n = 876; Olink® inflammation panel). We employ a Bayesian framework (BayesR+) which can account for issues pertaining to data structure and unknown confounding variables (with sensitivity analyses using ordinary least squares- (OLS) and mixed model-based approaches). RESULTS: We identified 13 SNPs associated with 13 proteins (n = 1 SNP each) concordant across OLS and Bayesian methods. We identified 3 CpG sites spread across 3 proteins (n = 1 CpG each) that were concordant across OLS, mixed-model and Bayesian analyses. Tagged genetic variants accounted for up to 45% of variance in protein levels (for MCP2, 36% of variance alone attributable to 1 polymorphism). Methylation data accounted for up to 46% of variation in protein levels (for CXCL10). Up to 66% of variation in protein levels (for VEGFA) was explained using genetic and epigenetic data combined. We demonstrated putative causal relationships between CD6 and IL18R1 with inflammatory bowel disease and between IL12B and Crohn's disease. CONCLUSIONS: Our data may aid understanding of the molecular regulation of the circulating inflammatory proteome as well as causal relationships between inflammatory mediators and disease.

Subject(s)

Biomarkers, Epigenomics, Genome-Wide Association Study, Genomics, Proteins/genetics, Age Factors, Aged, Aged 80 and over, Blood Proteins/genetics, Computational Biology/methods, DNA Methylation, Disease Susceptibility, Genetic Epigenesis, Epigenomics/methods, Female, Gene Expression Regulation, Genomics/methods, Healthy Volunteers, Humans, Inflammation/etiology, Inflammation/metabolism, Inflammation Mediators, Male, Middle Aged, Single Nucleotide Polymorphism, Proteins/metabolism, Quantitative Trait Loci

Bayesian reassessment of the epigenetic architecture of complex traits.

Trejo Banos, Daniel; McCartney, Daniel L; Patxot, Marion; Anchieri, Lucas; Battram, Thomas; Christiansen, Colette; Costeira, Ricardo; Walker, Rosie M; Morris, Stewart W; Campbell, Archie; Zhang, Qian; Porteous, David J; McRae, Allan F; Wray, Naomi R; Visscher, Peter M; Haley, Chris S; Evans, Kathryn L; Deary, Ian J; McIntosh, Andrew M; Hemani, Gibran; Bell, Jordana T; Marioni, Riccardo E; Robinson, Matthew R.

Nat Commun ; 11(1): 2865, 2020 06 08.

Article in English | MEDLINE | ID: mdl-32513961

ABSTRACT

Linking epigenetic marks to clinical outcomes improves insight into molecular processes, disease prediction, and therapeutic target identification. Here, a statistical approach is presented to infer the epigenetic architecture of complex disease, determine the variation captured by epigenetic effects, and estimate phenotype-epigenetic probe associations jointly. Implicitly adjusting for probe correlations, data structure (cell-count or relatedness), and single-nucleotide polymorphism (SNP) marker effects, improves association estimates and in 9,448 individuals, 75.7% (95% CI 71.70-79.3) of body mass index (BMI) variation and 45.6% (95% CI 37.3-51.9) of cigarette consumption variation was captured by whole blood methylation array data. Pathway-linked probes of blood cholesterol, lipid transport and sterol metabolism for BMI, and xenobiotic stimuli response for smoking, showed >1.5 times larger associations with >95% posterior inclusion probability. Prediction accuracy improved by 28.7% for BMI and 10.2% for smoking over a LASSO model, with age-, and tissue-specificity, implying associations are a phenotypic consequence rather than causal.

Subject(s)

Genetic Epigenesis, Heritable Quantitative Trait, Adult, Algorithms, Bayes Theorem, Biomarkers/analysis, Body Mass Index, Computer Simulation, DNA Methylation/genetics, Humans, Molecular Sequence Annotation, Organ Specificity/genetics, Reproducibility of Results
