Results 1 - 20 of 36
1.
PLoS Genet ; 16(6): e1008775, 2020 06.
Article in English | MEDLINE | ID: mdl-32492070

ABSTRACT

Late-Onset Alzheimer's disease (LOAD) is a common, complex genetic disorder well-known for its heterogeneous pathology. The genetic heterogeneity underlying common, complex diseases poses a major challenge for targeted therapies and the identification of novel disease-associated variants. Case-control approaches are often limited to examining a specific outcome in a group of heterogeneous patients with different clinical characteristics. Here, we developed a novel approach to define relevant transcriptomic endophenotypes and stratify decedents based on molecular profiles in three independent human LOAD cohorts. By integrating post-mortem brain gene co-expression data from 2114 human samples with LOAD, we developed a novel quantitative, composite phenotype that can better account for the heterogeneity in genetic architecture underlying the disease. We used iterative weighted gene co-expression network analysis (WGCNA) to reduce data dimensionality and to isolate gene sets that are highly co-expressed within disease subtypes and represent specific molecular pathways. We then performed single variant association testing using whole-genome sequencing data for the novel composite phenotype in order to identify genetic loci that contribute to disease heterogeneity. Distinct LOAD subtypes were identified for all three study cohorts (two in ROSMAP, three in Mayo Clinic, and two in Mount Sinai Brain Bank). Single variant association analysis identified a genome-wide significant variant in TMEM106B (p-value < 5×10^-8, rs1990620G) in the ROSMAP cohort that confers protection from the inflammatory LOAD subtype. Taken together, our novel approach can be used to stratify LOAD into distinct molecular subtypes based on affected disease pathways.
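As a rough illustration of the co-expression step named in the abstract, the sketch below computes the unsigned soft-threshold adjacency that standard WGCNA starts from, a_ij = |cor(x_i, x_j)|^beta. It is not the authors' iterative pipeline: the function names, the toy expression values, and the default power of 6 are assumptions made for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def soft_threshold_adjacency(expr, power=6):
    """Unsigned adjacency |cor|^power for every ordered pair of distinct genes.

    `expr` maps a gene name to its expression values across samples.
    The default power is an illustrative assumption, not a value from the paper.
    """
    genes = list(expr)
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** power
        for g in genes
        for h in genes
        if g != h
    }


# Hypothetical toy data: genes A and B are perfectly co-expressed, C is not.
toy = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [1, -1, 1, -1]}
adjacency = soft_threshold_adjacency(toy)
```

Raising the correlation to a power suppresses weak correlations, so only strongly co-expressed gene pairs retain substantial edge weight before module detection.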


Subject(s)
Alzheimer Disease/genetics , Genes, Modifier , Transcriptome , Aged , Aged, 80 and over , Alzheimer Disease/pathology , Cerebral Cortex/metabolism , Cerebral Cortex/pathology , Female , Gene Expression Profiling/methods , Genetic Heterogeneity , Genome-Wide Association Study/methods , Humans , Male , Membrane Proteins/genetics , Nerve Tissue Proteins/genetics , Polymorphism, Single Nucleotide
2.
Bioinformatics ; 35(14): i568-i576, 2019 07 15.
Article in English | MEDLINE | ID: mdl-31510680

ABSTRACT

MOTIVATION: Late onset Alzheimer's disease is currently a disease with no known effective treatment options. To better understand disease, new multi-omic datasets have recently been generated with the goal of identifying molecular causes of disease. However, most analytic studies using these datasets focus on uni-modal analysis of the data. Here, we propose a data driven approach to integrate multiple data types and analytic outcomes and aggregate evidence supporting the hypothesis that a gene is a genetic driver of the disease. The main algorithmic contributions of our article are: (i) a general machine learning framework to learn the key characteristics of a few known driver genes from multiple feature sets and to identify other potential driver genes which have similar feature representations, and (ii) a flexible ranking scheme with the ability to integrate external validation in the form of Genome Wide Association Study summary statistics. While we currently focus on demonstrating the effectiveness of the approach using different analytic outcomes from RNA-Seq studies, this method is easily generalizable to other data modalities and analysis types. RESULTS: We demonstrate the utility of our machine learning algorithm on two benchmark multiview datasets by significantly outperforming the baseline approaches in predicting missing labels. We then use the algorithm to predict and rank potential drivers of Alzheimer's. We show that our ranked genes show a significant enrichment for single nucleotide polymorphisms associated with Alzheimer's and are enriched in pathways that have been previously associated with the disease. AVAILABILITY AND IMPLEMENTATION: Source code and links to all feature sets are available at https://github.com/Sage-Bionetworks/EvidenceAggregatedDriverRanking.


Subject(s)
Algorithms , Alzheimer Disease , Genome-Wide Association Study , Alzheimer Disease/genetics , Humans , Machine Learning , Software
3.
PLoS Comput Biol ; 12(5): e1004888, 2016 05.
Article in English | MEDLINE | ID: mdl-27145341

ABSTRACT

We present a computational framework, called DISCERN (DIfferential SparsE Regulatory Network), to identify informative topological changes in gene-regulator dependence networks inferred on the basis of mRNA expression datasets within distinct biological states. DISCERN takes two expression datasets as input: an expression dataset of diseased tissues from patients with a disease of interest and another expression dataset from matching normal tissues. DISCERN estimates the extent to which each gene is perturbed, that is, has distinct regulator connectivity in the inferred gene-regulator dependencies between the disease and normal conditions. This approach has distinct advantages over existing methods. First, DISCERN infers conditional dependencies between candidate regulators and genes, where conditional dependence relationships discriminate the evidence for direct interactions from indirect interactions more precisely than pairwise correlation. Second, DISCERN uses a new likelihood-based scoring function to alleviate concerns about accuracy of the specific edges inferred in a particular network. DISCERN identifies perturbed genes more accurately in synthetic data than existing methods to identify perturbed genes between distinct states. In expression datasets from patients with acute myeloid leukemia (AML), breast cancer and lung cancer, genes with high DISCERN scores in each cancer are enriched for known tumor drivers, genes associated with the biological processes known to be important in the disease, and genes associated with patient prognosis, in the respective cancer. Finally, we show that DISCERN can uncover potential mechanisms underlying network perturbation by explaining observed epigenomic activity patterns in cancer and normal tissue types more accurately than alternative methods, based on the available epigenomic data from the ENCODE project.


Subject(s)
Gene Regulatory Networks , Models, Genetic , Neoplasms/genetics , Breast Neoplasms/genetics , Computational Biology , Computer Simulation , Databases, Genetic , Epigenesis, Genetic , Female , Gene Expression Regulation, Neoplastic , Humans , Leukemia, Myeloid, Acute/genetics , Likelihood Functions , Lung Neoplasms/genetics , Prognosis
4.
Nucleic Acids Res ; 43(3): 1332-44, 2015 Feb 18.
Article in English | MEDLINE | ID: mdl-25583238

ABSTRACT

We define a new category of candidate tumor drivers in cancer genome evolution: 'selected expression regulators' (SERs), genes driving dysregulated transcriptional programs in cancer evolution. The SERs are identified from genome-wide tumor expression data with a novel method, namely SPARROW (SPARse selected expRessiOn regulators identified With penalized regression). SPARROW uncovers a previously unknown connection between cancer expression variation and driver events, by using a novel sparse regression technique. Our results indicate that SPARROW is a powerful complementary approach to identify candidate genes containing driver events that are hard to detect from sequence data, due to a large number of passenger mutations and lack of comprehensive sequence information from a sufficiently large number of samples. SERs identified by SPARROW reveal known driver mutations in multiple human cancers, along with known cancer-associated processes and survival-associated genes, better than popular methods for inferring gene expression networks. We demonstrate that when applied to acute myeloid leukemia expression data, SPARROW identifies an apoptotic biomarker (PYCARD) for an investigational drug obatoclax. The PYCARD and obatoclax association is validated in 30 AML patient samples.


Subject(s)
Brain Neoplasms/genetics , Gene Expression Profiling , Glioblastoma/genetics , Leukemia, Myeloid, Acute/genetics , Gene Regulatory Networks , Humans , Mutation
5.
Genet Epidemiol ; 38(1): 21-30, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24482836

ABSTRACT

Recently, many statistical methods have been proposed to test for associations between rare genetic variants and complex traits. Most of these methods test for association by aggregating genetic variations within a predefined region, such as a gene. Although there is evidence that "aggregate" tests are more powerful than the single marker test, these tests generally ignore neutral variants and therefore are unable to identify specific variants driving the association with phenotype. We propose a novel aggregate rare-variant test that explicitly models a fraction of variants as neutral, tests associations at the gene-level, and infers the rare-variants driving the association. Simulations show that in the practical scenario where there are many variants within a given region of the genome with only a fraction causal our approach has greater power compared to other popular tests such as the Sequence Kernel Association Test (SKAT), the Weighted Sum Statistic (WSS), and the collapsing method of Morris and Zeggini (MZ). Our algorithm leverages a fast variational Bayes approximate inference methodology to scale to exome-wide analyses, a significant computational advantage over exact inference model selection methodologies. To demonstrate the efficacy of our methodology we test for associations between von Willebrand Factor (VWF) levels and VWF missense rare-variants imputed from the National Heart, Lung, and Blood Institute's Exome Sequencing project into 2,487 African Americans within the VWF gene. Our method suggests that a relatively small fraction (~10%) of the imputed rare missense variants within VWF are strongly associated with lower VWF levels in African Americans.


Subject(s)
Bayes Theorem , Genetic Association Studies/methods , Genetic Variation/genetics , von Willebrand Factor/genetics , Black or African American/genetics , Algorithms , Exome/genetics , Female , Humans , Male , Models, Genetic , Mutation, Missense/genetics , National Heart, Lung, and Blood Institute (U.S.) , Phenotype , Research Design , Sequence Analysis, DNA , Software , United States , von Willebrand Factor/analysis
6.
Am J Hum Genet ; 91(5): 794-808, 2012 Nov 02.
Article in English | MEDLINE | ID: mdl-23103231

ABSTRACT

Researchers have successfully applied exome sequencing to discover causal variants in selected individuals with familial, highly penetrant disorders. We demonstrate the utility of exome sequencing followed by imputation for discovering low-frequency variants associated with complex quantitative traits. We performed exome sequencing in a reference panel of 761 African Americans and then imputed newly discovered variants into a larger sample of more than 13,000 African Americans for association testing with the blood cell traits hemoglobin, hematocrit, white blood count, and platelet count. First, we illustrate the feasibility of our approach by demonstrating genome-wide-significant associations for variants that are not covered by conventional genotyping arrays; for example, one such association is that between higher platelet count and an MPL c.117G>T (p.Lys39Asn) variant encoding a p.Lys39Asn amino acid substitution of the thrombopoietin receptor gene (p = 1.5 × 10^-11). Second, we identified an association between missense variants of LCT and higher white blood count (p = 4 × 10^-13). Third, we identified low-frequency coding variants that might account for allelic heterogeneity at several known blood cell-associated loci: MPL c.754T>C (p.Tyr252His) was associated with higher platelet count; CD36 c.975T>G (p.Tyr325*) was associated with lower platelet count; and several missense variants at the α-globin gene locus were associated with lower hemoglobin. By identifying low-frequency missense variants associated with blood cell traits not previously reported by genome-wide association studies, we establish that exome sequencing followed by imputation is a powerful approach to dissecting complex, genetically heterogeneous traits in large population-based studies.


Subject(s)
Black or African American/genetics , Blood Cells/metabolism , Exome , Quantitative Trait Loci , Quantitative Trait, Heritable , Adult , Aged , Female , Gene Frequency , Genome-Wide Association Study , Hematocrit , Hematologic Diseases/genetics , Hemoglobins/genetics , Humans , Leukocytes/metabolism , Male , Middle Aged , Platelet Count , Polymorphism, Single Nucleotide , United States , Young Adult
7.
PLoS Genet ; 8(3): e1002491, 2012.
Article in English | MEDLINE | ID: mdl-22423221

ABSTRACT

Several genetic variants associated with platelet count and mean platelet volume (MPV) were recently reported in people of European ancestry. In this meta-analysis of 7 genome-wide association studies (GWAS) enrolling African Americans, our aim was to identify novel genetic variants associated with platelet count and MPV. For all cohorts, GWAS analysis was performed using additive models after adjusting for age, sex, and population stratification. For both platelet phenotypes, meta-analyses were conducted using inverse-variance weighted fixed-effect models. Platelet aggregation assays in whole blood were performed in the participants of the GeneSTAR cohort. Genetic variants in ten independent regions were associated with platelet count (N = 16,388) with p < 5×10^-8, of which 5 have not been associated with platelet count in previous GWAS. The novel genetic variants associated with platelet count were in the following regions (the most significant SNP, closest gene, and p-value): 6p22 (rs12526480, LRRC16A, p = 9.1×10^-9), 7q11 (rs13236689, CD36, p = 2.8×10^-9), 10q21 (rs7896518, JMJD1C, p = 2.3×10^-12), 11q13 (rs477895, BAD, p = 4.9×10^-8), and 20q13 (rs151361, SLMO2, p = 9.4×10^-9). Three of these loci (10q21, 11q13, and 20q13) were replicated in European Americans (N = 14,909) and one (11q13) in Hispanic Americans (N = 3,462). For MPV (N = 4,531), genetic variants in 3 regions were significant at p < 5×10^-8, two of which were also associated with platelet count. Previously reported regions that were also significant in this study were 6p21, 6q23, 7q22, 12q24, and 19p13 for platelet count and 7q22, 17q11, and 19p13 for MPV. The most significant SNP in 1 region was also associated with ADP-induced maximal platelet aggregation in whole blood (12q24). Thus through a meta-analysis of GWAS enrolling African Americans, we have identified 5 novel regions associated with platelet count, of which 3 were replicated in other ethnic groups. In addition, we also found one region associated with platelet aggregation that may play a potential role in atherothrombosis.
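The inverse-variance weighted fixed-effect model mentioned in the abstract can be sketched as follows. Each cohort contributes an effect estimate and its standard error, and the pooled estimate weights each cohort by 1/SE^2. The function name and the input numbers are hypothetical and are not taken from the study.

```python
import math


def fixed_effect_meta(betas, ses):
    """Pool per-cohort effect estimates with inverse-variance weights.

    Returns the pooled effect, its standard error, the Z-score,
    and a two-sided p-value from the normal approximation.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return beta, se, z, p_value


# Hypothetical per-cohort estimates for a single SNP.
pooled_beta, pooled_se, z_score, p = fixed_effect_meta([0.10, 0.20], [0.1, 0.1])
```

Because the pooled standard error shrinks as cohorts are added, a SNP with modest evidence in each cohort can still reach the genome-wide threshold of 5×10^-8 in the combined analysis.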


Subject(s)
Black or African American/genetics , Blood Platelets , Genome-Wide Association Study , Platelet Count , Adult , Aged , Blood Platelets/metabolism , Female , Genotype , Humans , Male , Middle Aged , Platelet Aggregation/genetics , Polymorphism, Single Nucleotide
8.
PLoS Comput Biol ; 9(6): e1003101, 2013.
Article in English | MEDLINE | ID: mdl-23825936

ABSTRACT

Penalized Multiple Regression (PMR) can be used to discover novel disease associations in GWAS datasets. In practice, proposed PMR methods have not been able to identify well-supported associations in GWAS that are undetectable by standard association tests and thus these methods are not widely applied. Here, we present a combined algorithmic and heuristic framework for PUMA (Penalized Unified Multiple-locus Association) analysis that solves the problems of previously proposed methods including computational speed, poor performance on genome-scale simulated data, and identification of too many associations for real data to be biologically plausible. The framework includes a new minorize-maximization (MM) algorithm for generalized linear models (GLM) combined with heuristic model selection and testing methods for identification of robust associations. The PUMA framework implements the penalized maximum likelihood penalties previously proposed for GWAS analysis (i.e. Lasso, Adaptive Lasso, NEG, MCP), as well as a penalty that has not been previously applied to GWAS (i.e. LOG). Using simulations that closely mirror real GWAS data, we show that our framework has high performance and reliably increases power to detect weak associations, while existing PMR methods can perform worse than single marker testing in overall performance. To demonstrate the empirical value of PUMA, we analyzed GWAS data for type 1 diabetes, Crohn's disease, and rheumatoid arthritis, three autoimmune diseases from the original Wellcome Trust Case Control Consortium. Our analysis replicates known associations for these diseases and we discover novel etiologically relevant susceptibility loci that are invisible to standard single marker tests, including six novel associations implicating genes involved in pancreatic function, insulin pathways and immune-cell function in type 1 diabetes; three novel associations implicating genes in pro- and anti-inflammatory pathways in Crohn's disease; and one novel association implicating a gene involved in apoptosis pathways in rheumatoid arthritis. We provide software for applying our PUMA analysis framework.
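To make the idea of penalized multiple regression concrete, here is a minimal sketch of the Lasso penalty fitted by cyclic coordinate descent on a linear model. This is deliberately not the PUMA minorize-maximization algorithm for GLMs described in the abstract; it is a simpler, well-known stand-in, and the function names and toy genotype matrix are assumptions for illustration.

```python
def soft_threshold(value, lam):
    """Shrink `value` toward zero by `lam`; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent.

    X is a list of rows (samples), y a list of phenotype values.
    No intercept is fitted, so inputs are assumed to be centered.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residuals for the current beta (all zeros initially)
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # monomorphic marker: nothing to estimate
            # Correlation of marker j with the partial residual (marker j added back).
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


# Hypothetical centered data: marker 0 has a strong effect, marker 1 a weak one.
X = [[1, 0], [-1, 0], [0, 1], [0, -1]]
y = [2.0, -2.0, 0.1, -0.1]
coefficients = lasso_coordinate_descent(X, y, lam=0.2)
```

In this toy example the weak marker is shrunk exactly to zero while the strong marker is retained with a reduced coefficient, which is the variable-selection behaviour that makes such penalties attractive for multi-locus association analysis.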


Subject(s)
Genome-Wide Association Study , Models, Theoretical , Regression Analysis , Humans
9.
PLoS Comput Biol ; 9(5): e1003047, 2013.
Article in English | MEDLINE | ID: mdl-23671412

ABSTRACT

Breast cancer is the most common malignancy in women and is responsible for hundreds of thousands of deaths annually. As with most cancers, it is a heterogeneous disease and different breast cancer subtypes are treated differently. Understanding the difference in prognosis for breast cancer based on its molecular and phenotypic features is one avenue for improving treatment by matching the proper treatment with molecular subtypes of the disease. In this work, we employed a competition-based approach to modeling breast cancer prognosis using large datasets containing genomic and clinical information and an online real-time leaderboard program used to speed feedback to the modeling team and to encourage each modeler to work towards achieving a higher ranked submission. We find that machine learning methods combined with molecular features selected based on expert prior knowledge can improve survival predictions compared to current best-in-class methodologies and that ensemble models trained across multiple user submissions systematically outperform individual models within the ensemble. We also find that model scores are highly consistent across multiple independent evaluations. This study serves as the pilot phase of a much larger competition open to the whole research community, with the goal of understanding general strategies for model optimization using clinical and molecular profiling data and providing an objective, transparent system for assessing prognostic models.


Subject(s)
Breast Neoplasms , Computational Biology/methods , Models, Biological , Models, Statistical , Survival Analysis , Algorithms , Cluster Analysis , Databases, Factual , Female , Gene Expression Profiling , Humans , Prognosis
10.
Alzheimers Dement (N Y) ; 10(2): e12461, 2024.
Article in English | MEDLINE | ID: mdl-38650747

ABSTRACT

INTRODUCTION: Alzheimer's disease (AD) is the predominant dementia globally, with heterogeneous presentation and penetrance of clinical symptoms, variable presence of mixed pathologies, potential disease subtypes, and numerous associated endophenotypes. Beyond the difficulty of designing treatments that address the core pathological characteristics of the disease, therapeutic development is challenged by the uncertainty of which endophenotypic areas and specific targets implicated by those endophenotypes to prioritize for further translational research. However, publicly funded consortia driving large-scale open science efforts have produced multiple omic analyses that address both disease risk relevance and biological process involvement of genes across the genome. METHODS: Here we report the development of an informatic pipeline that draws from genetic association studies, predicted variant impact, and linkage with dementia associated phenotypes to create a genetic risk score. This is paired with a multi-omic risk score utilizing extensive sets of both transcriptomic and proteomic studies to identify system-level changes in expression associated with AD. These two elements combined constitute our target risk score that ranks AD risk genome-wide. The ranked genes are organized into endophenotypic space through the development of 19 biological domains associated with AD in the described genetics and genomics studies and accompanying literature. The biological domains are constructed from exhaustive Gene Ontology (GO) term compilations, allowing automated assignment of genes into objectively defined disease-associated biology. This rank-and-organize approach, performed genome-wide, allows the characterization of aggregations of AD risk across biological domains. 
RESULTS: The top AD-risk-associated biological domains are Synapse, Immune Response, Lipid Metabolism, Mitochondrial Metabolism, Structural Stabilization, and Proteostasis, with slightly lower levels of risk enrichment present within the other 13 biological domains. DISCUSSION: This provides an objective methodology to localize risk within specific biological endophenotypes and drill down into the most significantly associated sets of GO terms and annotated genes for potential therapeutic targets.

11.
Bioinformatics ; 28(13): 1738-44, 2012 Jul 01.
Article in English | MEDLINE | ID: mdl-22563072

ABSTRACT

MOTIVATION: For many complex traits, including height, the majority of variants identified by genome-wide association studies (GWAS) have small effects, leaving a significant proportion of the heritable variation unexplained. Although many penalized multiple regression methodologies have been proposed to increase the power to detect associations for complex genetic architectures, they generally lack mechanisms for false-positive control and diagnostics for model over-fitting. Our methodology is the first penalized multiple regression approach that explicitly controls Type I error rates and provides model over-fitting diagnostics through a novel normally distributed statistic defined for every marker within the GWAS, based on results from a variational Bayes spike regression algorithm. RESULTS: We compare the performance of our method to the lasso and single marker analysis on simulated data and demonstrate that our approach has superior performance in terms of power and Type I error control. In addition, using the Women's Health Initiative (WHI) SNP Health Association Resource (SHARe) GWAS of African-Americans, we show that our method has power to detect additional novel associations with body height. These findings replicate by reaching a stringent cutoff of marginal association in a larger cohort. AVAILABILITY: An R-package, including an implementation of our variational Bayes spike regression (vBsr) algorithm, is available at http://kooperberg.fhcrc.org/soft.html.


Subject(s)
Body Height/genetics , Genome-Wide Association Study , Models, Statistical , Black or African American/genetics , Algorithms , Bayes Theorem , Genetic Loci , Humans , Polymorphism, Single Nucleotide , Regression Analysis
12.
BMC Bioinformatics ; 13: 53, 2012 Apr 02.
Article in English | MEDLINE | ID: mdl-22471599

ABSTRACT

BACKGROUND: We propose a novel variational Bayes network reconstruction algorithm to extract the most relevant disease factors from high-throughput genomic data-sets. Our algorithm is the only scalable method for regularized network recovery that employs Bayesian model averaging and that can internally estimate an appropriate level of sparsity to ensure few false positives enter the model without the need for cross-validation or a model selection criterion. We use our algorithm to characterize the effect of genetic markers and liver gene expression traits on mouse obesity related phenotypes, including weight, cholesterol, glucose, and free fatty acid levels, in an experiment previously used for discovery and validation of network connections: an F2 intercross between the C57BL/6J and C3H/HeJ mouse strains, where apolipoprotein E is null on the background. RESULTS: We identified eleven genes, Gch1, Zfp69, Dlgap1, Gna14, Yy1, Gabarapl1, Folr2, Fdft1, Cnr2, Slc24a3, and Ccl19, and a quantitative trait locus directly connected to weight, glucose, cholesterol, or free fatty acid levels in our network. None of these genes were identified by other network analyses of this mouse intercross data-set, but all have been previously associated with obesity or related pathologies in independent studies. In addition, through both simulations and data analysis we demonstrate that our algorithm achieves better power and type I error control than other network recovery algorithms that use the lasso and have bounds on type I error control. CONCLUSIONS: Our final network contains 118 previously associated and novel genes affecting weight, cholesterol, glucose, and free fatty acid levels that are excellent obesity risk candidates.


Subject(s)
Algorithms , Bayes Theorem , Obesity/genetics , Obesity/metabolism , Animals , Apolipoproteins E/genetics , Computer Simulation , Humans , Mice , Mice, Inbred C3H , Mice, Inbred C57BL , Quantitative Trait Loci
13.
Am J Epidemiol ; 176(2): 164-73, 2012 Jul 15.
Article in English | MEDLINE | ID: mdl-22771729

ABSTRACT

In this article, the authors propose to simultaneously test for marginal genetic association and gene-environment interaction to discover single nucleotide polymorphisms that may be involved in gene-environment or gene-treatment interaction. The asymptotic independence of the marginal association estimator and various interaction estimators leads to a simple and flexible way of combining the 2 tests, allowing for exploitation of gene-environment independence in estimating gene-environment interaction. The proposed test differs from the 2-df test proposed by Kraft et al. (Hum Hered. 2007;63(2):111-119) in two respects. First, for the genetic association component, it tests for marginal association, which is often the primary objective in inference, rather than the main effect in a model with gene-environment interaction. Second, the gene-environment testing component can easily exploit putative gene-environment independence using either the case-only estimator or the empirical Bayes estimator, depending on whether the goal is gene-treatment interaction in a randomized trial or gene-environment interaction in an observational study. The use of the proposed joint test is illustrated through simulations and a genetic study (1993-2005) from the Women's Health Initiative.
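One simple way to combine two asymptotically independent test statistics is to sum their squared Z-scores, which gives a chi-square statistic with 2 degrees of freedom under the null. The sketch below shows only that generic construction; the abstract does not state the exact combination rule the authors use, so treat the rule, the function name, and the input values as assumptions.

```python
import math


def joint_marginal_interaction_test(z_marginal, z_interaction):
    """Combine two independent Z-scores into a 2-df chi-square test.

    Assumes both Z-scores are standard normal and independent under the null.
    For 2 degrees of freedom the chi-square survival function is exp(-x / 2).
    """
    statistic = z_marginal ** 2 + z_interaction ** 2
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value


# Hypothetical Z-scores for one SNP: marginal association and interaction.
stat, p = joint_marginal_interaction_test(3.0, 4.0)
```

The independence assumption is what allows the two components to be estimated separately, for example with a case-only interaction estimator, and then combined without modelling their covariance.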


Subject(s)
Gene-Environment Interaction , Genetic Predisposition to Disease/epidemiology , Models, Genetic , Genome-Wide Association Study , Humans , Models, Statistical
14.
Sci Rep ; 12(1): 6117, 2022 04 12.
Article in English | MEDLINE | ID: mdl-35413975

ABSTRACT

Genetics play an important role in late-onset Alzheimer's Disease (AD) etiology and dozens of genetic variants have been implicated in AD risk through large-scale GWAS meta-analyses. However, the precise mechanistic effects of most of these variants have yet to be determined. Deeply phenotyped cohort data can reveal physiological changes associated with genetic risk for AD across an age spectrum that may provide clues to the biology of the disease. We utilized over 2000 high-quality quantitative measurements obtained from blood of 2831 cognitively normal adult clients of a consumer-based scientific wellness company, each with CLIA-certified whole-genome sequencing data. Measurements included: clinical laboratory blood tests, targeted chip-based proteomics, and metabolomics. We performed a phenome-wide association study utilizing this diverse blood marker data and 25 known AD genetic variants and an AD-specific polygenic risk score (PGRS), adjusting for sex, age, vendor (for clinical labs), and the first four genetic principal components; sex-SNP interactions were also assessed. We observed statistically significant SNP-analyte associations for five genetic variants after correction for multiple testing (for SNPs in or near NYAP1, ABCA7, INPP5D, and APOE), with effects detectable from early adulthood. The ABCA7 SNP and the APOE2 and APOE4 encoding alleles were associated with lipid variability, as seen in previous studies; in addition, six novel proteins were associated with the e2 allele. The most statistically significant finding was between the NYAP1 variant and PILRA and PILRB protein levels, supporting previous functional genomic studies in the identification of a putative causal variant within the PILRA gene. We did not observe associations between the PGRS and any analyte. Sex modified the effects of four genetic variants, with multiple interrelated immune-modulating effects associated with the PICALM variant. 
In post-hoc analysis, sex-stratified GWAS results from an independent AD case-control meta-analysis supported sex-specific disease effects of the PICALM variant, highlighting the importance of sex as a biological variable. Known AD genetic variation influenced lipid metabolism and immune response systems in a population of non-AD individuals, with associations observed from early adulthood onward. Further research is needed to determine whether and how these effects are implicated in early-stage biological pathways to AD. These analyses aim to complement ongoing work on the functional interpretation of AD-associated genetic variants.
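The abstract reports associations that survive correction for multiple testing but does not name the procedure. As an illustration only, the sketch below implements two common choices, Bonferroni and Benjamini-Hochberg adjusted p-values; which one (if either) the study used is an assumption, and the input p-values are made up.

```python
def bonferroni(pvalues):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical SNP-analyte p-values.
raw = [0.01, 0.04, 0.03, 0.20]
fdr_adjusted = benjamini_hochberg(raw)
fwer_adjusted = bonferroni(raw)
```

With thousands of analytes tested against each variant, the choice between family-wise and false-discovery-rate control substantially changes how many associations are declared significant.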


Subject(s)
Alzheimer Disease , ATP-Binding Cassette Transporters/genetics , Adult , Alzheimer Disease/genetics , Apolipoprotein E2/genetics , Female , Genetic Predisposition to Disease , Genome-Wide Association Study , Genomics , Humans , Male , Polymorphism, Single Nucleotide
16.
PLoS Comput Biol ; 6(12): e1001014, 2010 Dec 02.
Article in English | MEDLINE | ID: mdl-21152011

ABSTRACT

Cellular gene expression measurements contain regulatory information that can be used to discover novel network relationships. Here, we present a new algorithm for network reconstruction powered by the adaptive lasso, a theoretically and empirically well-behaved method for selecting the regulatory features of a network. Any algorithms designed for network discovery that make use of directed probabilistic graphs require perturbations, produced by either experiments or naturally occurring genetic variation, to successfully infer unique regulatory relationships from gene expression data. Our approach makes use of appropriately selected cis-expression Quantitative Trait Loci (cis-eQTL), which provide a sufficient set of independent perturbations for maximum network resolution. We compare the performance of our network reconstruction algorithm to four other approaches: the PC-algorithm, QTLnet, the QDG algorithm, and the NEO algorithm, all of which have been used to reconstruct directed networks among phenotypes leveraging QTL. We show that the adaptive lasso can outperform these algorithms for networks of ten genes and ten cis-eQTL, and is competitive with the QDG algorithm for networks with thirty genes and thirty cis-eQTL, with rich topologies and hundreds of samples. Using this novel approach, we identify unique sets of directed relationships in Saccharomyces cerevisiae when analyzing genome-wide gene expression data for an intercross between a wild strain and a lab strain. We recover novel putative network relationships between a tyrosine biosynthesis gene (TYR1), and genes involved in endocytosis (RCY1), the spindle checkpoint (BUB2), sulfonate catabolism (JLP1), and cell-cell communication (PRM7). Our algorithm provides a synthesis of feature selection methods and graphical model theory that has the potential to reveal new directed regulatory relationships from the analysis of population level genetic and gene expression data.


Subject(s)
Algorithms; Gene Expression Regulation, Fungal/genetics; Gene Regulatory Networks/genetics; Models, Genetic; Systems Biology/methods; Gene Expression Regulation, Fungal/physiology; Gene Regulatory Networks/physiology; Genome, Fungal; Metabolic Networks and Pathways/genetics; Metabolic Networks and Pathways/physiology; Phenotype; Quantitative Trait Loci/genetics; Quantitative Trait Loci/physiology; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae/physiology; Signal Transduction/genetics; Signal Transduction/physiology
17.
Alzheimers Dement (Amst) ; 13(1): e12140, 2021.
Article in English | MEDLINE | ID: mdl-34027015

ABSTRACT

INTRODUCTION: Genome-wide association studies (GWAS) for late-onset Alzheimer's disease (AD) may miss genetic variants relevant for delineating disease stages when using clinically defined case/control as a phenotype due to its loose definition and heterogeneity. METHODS: We use a transfer learning technique to train three-dimensional convolutional neural network (CNN) models based on structural magnetic resonance imaging (MRI) from the screening stage in the Alzheimer's Disease Neuroimaging Initiative consortium to derive image features that reflect AD progression. RESULTS: CNN-derived image phenotypes are significantly associated with fasting metabolites related to early lipid metabolic changes as well as insulin resistance and with genetic variants mapped to candidate genes enriched for amyloid beta degradation, tau phosphorylation, calcium ion binding-dependent synaptic loss, APP-regulated inflammation response, and insulin resistance. DISCUSSION: This is the first attempt to show that non-invasive MRI biomarkers are linked to AD progression characteristics, reinforcing their use in early AD diagnosis and monitoring.

18.
Genome Med ; 13(1): 76, 2021 05 04.
Article in English | MEDLINE | ID: mdl-33947463

ABSTRACT

BACKGROUND: Alzheimer's disease (AD) is an incurable neurodegenerative disease currently affecting 1.75% of the US population, with projected growth to 3.46% by 2050. Identifying common genetic variants driving differences in transcript expression that confer AD risk is necessary to elucidate AD mechanism and develop therapeutic interventions. We modify the FUSION transcriptome-wide association study (TWAS) pipeline to ingest gene expression values from multiple neocortical regions. METHODS: A combined dataset of 2003 genotypes clustered to 1000 Genomes individuals from Utah with Northern and Western European ancestry (CEU) was used to construct a training set of 790 genotypes paired to 888 RNASeq profiles from temporal cortex (TCX = 248), prefrontal cortex (FP = 50), inferior frontal gyrus (IFG = 41), superior temporal gyrus (STG = 34), parahippocampal cortex (PHG = 34), and dorsolateral prefrontal cortex (DLPFC = 461). Following within-tissue normalization and covariate adjustment, predictive weights to impute expression components based on a gene's surrounding cis-variants were trained. The FUSION pipeline was modified to support input of pre-scaled expression values and to support cross-validation with a repeated-measure design arising from the presence of multiple transcriptome samples from the same individual across different tissues. RESULTS: Cis-variant architecture alone was informative to train weights and impute expression for 6780 (49.67%) autosomal genes, the majority of which significantly correlated with gene expression; FDR < 5%: N = 6775 (99.92%), Bonferroni: N = 6716 (99.06%). Validation of weights in 515 matched genotype-to-RNASeq profiles from the CommonMind Consortium (CMC) was 72.14% in DLPFC profiles. Association of imputed expression components from all 2003 genotype profiles yielded 8 genes significantly associated with AD (FDR < 0.05): APOC1, EED, CD2AP, CEACAM19, CLPTM1, MTCH2, TREM2, and KNOP1. CONCLUSIONS: We provide evidence of cis-genetic variation conferring AD risk through 8 genes across six distinct genomic loci. Moreover, we provide expression weights for 6780 genes as a valuable resource to the community, which can be abstracted across the neocortex and a wide range of neuronal phenotypes.
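
The RESULTS above report gene counts under two multiple-testing criteria (FDR < 5% and Bonferroni). As a rough illustration of why the FDR count is the larger one, here is a minimal sketch comparing the two rules. It assumes the Benjamini-Hochberg step-up procedure for FDR control, which the abstract does not name, and the p-values are made up for illustration rather than taken from the study.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject a test when its p-value is at most alpha divided by the test count."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_reject(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up: reject the k smallest p-values, where k is
    the largest rank with p_(k) <= (k / m) * alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    reject = [False] * m
    for idx in order[:max_rank]:
        reject[idx] = True
    return reject


# Hypothetical p-values, one per gene (illustrative only).
toy_p = [0.001, 0.008, 0.012, 0.03, 0.04, 0.2, 0.6, 0.9]

bonf = bonferroni_reject(toy_p)
bh = benjamini_hochberg_reject(toy_p)
print("Bonferroni rejections:", sum(bonf))
print("Benjamini-Hochberg rejections:", sum(bh))
```

Bonferroni controls the chance of any false positive and so applies one strict cutoff to every test, whereas the step-up rule relaxes the cutoff with rank. At the same `alpha`, the step-up rule therefore never rejects fewer tests than Bonferroni.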


Subject(s)
Alzheimer Disease/genetics; Genetic Predisposition to Disease; Genome-Wide Association Study; Neocortex/metabolism; Quantitative Trait Loci; Transcriptome; Computational Biology/methods; Gene Expression Regulation; Genome-Wide Association Study/methods; Humans; Organ Specificity/genetics
19.
Nat Commun ; 12(1): 7035, 2021 12 02.
Article in English | MEDLINE | ID: mdl-34857756

ABSTRACT

RNA editing is a feature of RNA maturation resulting in the formation of transcripts whose sequence differs from the genome template. Brain RNA editing may be altered in Alzheimer's disease (AD). Here, we analyzed data from 1,865 brain samples covering 9 brain regions from 1,074 unrelated subjects on a transcriptome-wide scale to identify inter-regional differences in RNA editing. We expand the list of known brain editing events by identifying 58,761 previously unreported events. We note that only a small proportion of these editing events are found at the protein level in our proteome-wide validation effort. We also identified the occurrence of editing events associated with AD dementia, neuropathological measures and longitudinal cognitive decline in: SYT11, MCUR1, SOD2, ORAI2, HSDL2, PFKP, and GPRC5B. Thus, we present an extended reference set of brain RNA editing events, identify a subset that are found to be expressed at the protein level, and extend the narrative of transcriptomic perturbation in AD to RNA editing.


Subject(s)
Alzheimer Disease/genetics; ORAI2 Protein/genetics; RNA Editing; RNA/genetics; Synaptotagmins/genetics; Transcriptome; Alzheimer Disease/metabolism; Alzheimer Disease/pathology; Atlases as Topic; Brain/metabolism; Brain/pathology; Brain Chemistry; Gene Expression Profiling; Humans; Hydroxysteroid Dehydrogenases/genetics; Hydroxysteroid Dehydrogenases/metabolism; Membrane Proteins/genetics; Membrane Proteins/metabolism; Mitochondrial Proteins/genetics; Mitochondrial Proteins/metabolism; ORAI2 Protein/metabolism; Phosphofructokinase-1, Type C/genetics; Phosphofructokinase-1, Type C/metabolism; RNA/metabolism; Receptors, G-Protein-Coupled/genetics; Receptors, G-Protein-Coupled/metabolism; Superoxide Dismutase/genetics; Superoxide Dismutase/metabolism; Synaptotagmins/metabolism
20.
Front Aging Neurosci ; 13: 735524, 2021.
Article in English | MEDLINE | ID: mdl-34707490

ABSTRACT

Late-onset Alzheimer's disease (AD; LOAD) is the most common human neurodegenerative disease; however, the availability and efficacy of disease-modifying interventions are severely lacking. Despite exceptional efforts to understand disease progression via legacy amyloidogenic transgene mouse models, focus on disease translation with innovative mouse strains that better model the complexity of human AD is required to accelerate the development of future treatment modalities. LOAD within the human population is a polygenic and environmentally influenced disease with many risk factors acting in concert to produce disease processes parallel to those often muted by the early and aggressive aggregate formation in popular mouse strains. In addition to extracellular deposits of amyloid plaques and inclusions of the microtubule-associated protein tau, AD is also defined by synaptic/neuronal loss, vascular deficits, and neuroinflammation. These underlying processes, and how the disease progresses with age, need to be better defined and compared to human-relevant outcomes. To create more translatable mouse models, MODEL-AD (Model Organism Development and Evaluation for Late-onset AD) groups are identifying and integrating disease-relevant, humanized gene sequences from public databases beginning with APOEε4 and Trem2*R47H, two of the most powerful risk factors present in human LOAD populations. Mice expressing endogenous, humanized APOEε4 and Trem2*R47H gene sequences were extensively aged and assayed using a multi-disciplined phenotyping approach associated with and relative to human AD pathology. Robust analytical pipelines measured behavioral, transcriptomic, metabolic, and neuropathological phenotypes in cross-sectional cohorts for progression of disease hallmarks at all life stages. In vivo PET/MRI neuroimaging revealed regional alterations in glycolytic metabolism and vascular perfusion. Transcriptional profiling by RNA-Seq of brain hemispheres identified sex and age as the main sources of variation between genotypes, including age-specific enrichment of AD-related processes. Similarly, age was the strongest determinant of behavioral change. In the absence of mouse amyloid plaque formation, many of the hallmarks of AD were not observed in this strain. However, as a sensitized baseline model with many additional alleles and environmental modifications already appended, the dataset from this initial MODEL-AD strain serves an important role in establishing the individual effects and interaction between two strong genetic risk factors for LOAD in a mouse host.
