Search | VHL Regional Portal

The Alzheimer's Knowledge Base: A Knowledge Graph for Alzheimer Disease Research.

Romano, Joseph D; Truong, Van; Kumar, Rachit; Venkatesan, Mythreye; Graham, Britney E; Hao, Yun; Matsumoto, Nick; Li, Xi; Wang, Zhiping; Ritchie, Marylyn D; Shen, Li; Moore, Jason H.

J Med Internet Res ; 26: e46777, 2024 Apr 18.

Article in English | MEDLINE | ID: mdl-38635981

ABSTRACT

BACKGROUND: As global populations age and become susceptible to neurodegenerative illnesses, new therapies for Alzheimer disease (AD) are urgently needed. Existing data resources for drug discovery and repurposing fail to capture relationships central to the disease's etiology and response to drugs. OBJECTIVE: We designed the Alzheimer's Knowledge Base (AlzKB) to alleviate this need by providing a comprehensive knowledge representation of AD etiology and candidate therapeutics. METHODS: We designed the AlzKB as a large, heterogeneous graph knowledge base assembled using 22 diverse external data sources describing biological and pharmaceutical entities at different levels of organization (eg, chemicals, genes, anatomy, and diseases). AlzKB uses a Web Ontology Language 2 ontology to enforce semantic consistency and allow for ontological inference. We provide a public version of AlzKB and allow users to run and modify local versions of the knowledge base. RESULTS: AlzKB is freely available on the web and currently contains 118,902 entities with 1,309,527 relationships between those entities. To demonstrate its value, we used graph data science and machine learning to (1) propose new therapeutic targets based on similarities of AD to Parkinson disease and (2) repurpose existing drugs that may treat AD. For each use case, AlzKB recovers known therapeutic associations while proposing biologically plausible new ones. CONCLUSIONS: AlzKB is a new, publicly available knowledge resource that enables researchers to discover complex translational associations for AD drug discovery. Through 2 use cases, we show that it is a valuable tool for proposing novel therapeutic hypotheses based on public biomedical knowledge.
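As a rough illustration of the heterogeneous graph described in this abstract, the sketch below stores typed entities and typed relationships and runs one two-hop drug repurposing query. It is not AlzKB's implementation: the class, the relation names `ASSOCIATED_WITH` and `BINDS`, and all node names are hypothetical placeholders, and no ontology enforcement or machine learning is shown.

```python
class KnowledgeGraph:
    """Minimal typed graph: nodes carry a type, edges are (source, relation, target)."""

    def __init__(self):
        self.node_types = {}
        self.triples = set()

    def add_node(self, name, node_type):
        self.node_types[name] = node_type

    def add_edge(self, source, relation, target):
        # Reject edges whose endpoints were never declared as typed nodes.
        if source not in self.node_types or target not in self.node_types:
            raise KeyError("both endpoints must be added as typed nodes first")
        self.triples.add((source, relation, target))

    def sources(self, relation, target):
        """All nodes that point at `target` through `relation`."""
        return {s for s, r, t in self.triples if r == relation and t == target}


def candidate_drugs(graph, disease):
    """Drugs that bind any gene associated with the disease (a two-hop path)."""
    drugs = set()
    for gene in graph.sources("ASSOCIATED_WITH", disease):
        drugs |= graph.sources("BINDS", gene)
    return drugs


# Hypothetical toy graph; the names are placeholders, not AlzKB content.
kg = KnowledgeGraph()
kg.add_node("disease_1", "Disease")
kg.add_node("gene_1", "Gene")
kg.add_node("drug_1", "Drug")
kg.add_edge("gene_1", "ASSOCIATED_WITH", "disease_1")
kg.add_edge("drug_1", "BINDS", "gene_1")
print(candidate_drugs(kg, "disease_1"))
```

A production knowledge base would typically sit in a graph database and validate node and edge types against its ontology; the set of triples here only keeps the example self-contained.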

Subject(s)

Alzheimer Disease , Humans , Alzheimer Disease/drug therapy , Alzheimer Disease/genetics , Pattern Recognition, Automated , Knowledge Bases , Machine Learning , Knowledge

Estimating prevalence of human traits among populations from polygenic risk scores.

Graham, Britney E; Plotkin, Brian; Muglia, Louis; Moore, Jason H; Williams, Scott M.

Hum Genomics ; 15(1): 70, 2021 Dec 13.

Article in English | MEDLINE | ID: mdl-34903281

ABSTRACT

The genetic basis of phenotypic variation across populations has not been well explained for most traits. Several factors may cause disparities, from variation in environments to divergent population genetic structure. We hypothesized that a population-level polygenic risk score (PRS) can explain phenotypic variation among geographic populations based solely on risk allele frequencies. We applied a population-specific PRS (psPRS) to 26 populations from the 1000 Genomes Project for four phenotypes: lactase persistence (LP), melanoma, multiple sclerosis (MS) and height. Our models assumed additive genetic architecture among the polymorphisms in the psPRSs, as is conventional. Linear psPRSs explained a significant proportion of trait variance, ranging from 0.32 for height in men to 0.88 for melanoma. The best models for LP and height were linear, while those for melanoma and MS were nonlinear. As not all variants in a PRS may confer similar, or even any, risk among diverse populations, we also filtered out SNPs to assess whether variance explained was improved using psPRSs with fewer SNPs. Variance explained usually improved with fewer SNPs in the psPRS and was as high as 0.99 for height in men using only 548 of the initial 4208 SNPs. That reducing SNPs improves psPRS performance may indicate that missing heritability is partially due to complex architecture that does not mandate additivity, undiscovered variants or spurious associations in the databases. We demonstrated that PRS-based analyses can be used across diverse populations and phenotypes for population prediction and that these comparisons can identify the universal risk variants.
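The abstract does not give the psPRS formula, so the following is only a sketch under stated assumptions: the score for a population is taken as the expected additive score, the sum over SNPs of 2 × risk allele frequency × effect size, and "variance explained" is taken as the R² of a simple linear fit of trait prevalence on that score. The function names and all numbers are hypothetical.

```python
def population_prs(risk_allele_freqs, effect_sizes):
    """Expected additive score for one population: sum of 2 * frequency * effect."""
    if len(risk_allele_freqs) != len(effect_sizes):
        raise ValueError("one frequency per SNP effect size is required")
    return sum(2.0 * f * b for f, b in zip(risk_allele_freqs, effect_sizes))


def r_squared(x, y):
    """Proportion of variance in y explained by a simple linear fit on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


# Hypothetical toy data: three SNPs and four populations (not from the paper).
effect_sizes = [0.2, 0.5, 0.1]
freqs_by_population = {
    "pop_1": [0.10, 0.20, 0.50],
    "pop_2": [0.30, 0.25, 0.40],
    "pop_3": [0.60, 0.50, 0.45],
    "pop_4": [0.80, 0.70, 0.55],
}
prevalence = {"pop_1": 0.02, "pop_2": 0.03, "pop_3": 0.07, "pop_4": 0.09}

populations = list(freqs_by_population)
scores = [population_prs(freqs_by_population[p], effect_sizes) for p in populations]
observed = [prevalence[p] for p in populations]
print(r_squared(scores, observed))
```

Dropping SNPs, as the abstract describes, would amount to recomputing the scores with a subset of the frequency and effect-size columns and comparing the resulting R² values.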

Subject(s)

Multifactorial Inheritance , Polymorphism, Single Nucleotide , Genome-Wide Association Study , Humans , Multifactorial Inheritance/genetics , Phenotype , Polymorphism, Single Nucleotide/genetics , Prevalence , Risk Factors

Correction: Whole exome sequencing reveals HSPA1L as a genetic risk factor for spontaneous preterm birth.

Huusko, Johanna M; Karjalainen, Minna K; Graham, Britney E; Zhang, Ge; Farrow, Emily G; Miller, Neil A; Jacobsson, Bo; Eidem, Haley R; Murray, Jeffrey C; Bedell, Bruce; Breheny, Patrick; Brown, Noah W; Bødker, Frans L; Litterman, Nadia K; Jiang, Pan-Pan; Russell, Laura; Hinds, David A; Hu, Youna; Rokas, Antonis; Teramo, Kari; Christensen, Kaare; Williams, Scott M; Rämet, Mika; Kingsmore, Stephen F; Ryckman, Kelli K; Hallman, Mikko; Muglia, Louis J.

PLoS Genet ; 14(9): e1007673, 2018 Sep.

Article in English | MEDLINE | ID: mdl-30212495

ABSTRACT

[This corrects the article DOI: 10.1371/journal.pgen.1007394.].

Whole exome sequencing reveals HSPA1L as a genetic risk factor for spontaneous preterm birth.

PLoS Genet ; 14(7): e1007394, 2018 Jul.

Article in English | MEDLINE | ID: mdl-30001343

ABSTRACT

Preterm birth is a leading cause of morbidity and mortality in infants. Genetic and environmental factors play a role in the susceptibility to preterm birth, but despite many investigations, the genetic basis for preterm birth remains largely unknown. Our objective was to identify rare, possibly damaging, nucleotide variants in mothers from families with recurrent spontaneous preterm births (SPTB). DNA samples from 17 Finnish mothers who delivered at least one infant preterm were subjected to whole exome sequencing. All mothers were of northern Finnish origin and were from seven multiplex families. Additional replication samples of European origin consisted of 93 Danish sister pairs (and two sister triads), all with a history of a preterm delivery. Rare exonic variants (frequency <1%) were analyzed to identify genes and pathways likely to affect SPTB susceptibility. We identified rare, possibly damaging, variants in genes that were common to multiple affected individuals. The glucocorticoid receptor signaling pathway was the most significant (p<1.7e-8) with genes containing these variants in a subgroup of ten Finnish mothers, each having had 2-4 SPTBs. This pathway was replicated among the Danish sister pairs. A gene in this pathway, heat shock protein family A (Hsp70) member 1 like (HSPA1L), contains two likely damaging missense alleles that were found in four different Finnish families. One of the variants (rs34620296) had a higher frequency in cases than in controls (0.0025 vs. 0.0010, p = 0.002) in a large preterm birth genome-wide association study (GWAS) consisting of mothers of general European ancestry. Sister pairs in replication samples also shared rare, likely damaging HSPA1L variants. Furthermore, in silico analysis predicted an additional phosphorylation site generated by rs34620296 that could potentially affect chaperone activity or HSPA1L protein stability. Finally, an in vitro functional experiment showed a link between HSPA1L activity and decidualization. In conclusion, rare, likely damaging, variants in HSPA1L were observed in multiple families with recurrent SPTB.

Subject(s)

Genetic Predisposition to Disease , HSP70 Heat-Shock Proteins/genetics , Premature Birth/genetics , Adenosine Diphosphate/chemistry , Adenosine Diphosphate/metabolism , Case-Control Studies , Cell Line , Exome/genetics , Female , Fibroblasts , Finland , Genome-Wide Association Study , HSP70 Heat-Shock Proteins/chemistry , HSP70 Heat-Shock Proteins/metabolism , Humans , Infant, Newborn , Male , Models, Molecular , Phosphorylation/genetics , Polymorphism, Single Nucleotide , Pregnancy , Receptors, Glucocorticoid/metabolism , Recurrence , Risk Factors , Signal Transduction/genetics , Exome Sequencing

Evolutionarily derived networks to inform disease pathways.

Graham, Britney E; Darabos, Christian; Huang, Minjun; Muglia, Louis J; Moore, Jason H; Williams, Scott M.

Genet Epidemiol ; 41(8): 866-875, 2017 Dec.

Article in English | MEDLINE | ID: mdl-28944497

ABSTRACT

Methods to identify genes or pathways associated with complex diseases are often inadequate to elucidate most risk because they make implicit and oversimplified assumptions about underlying models of disease etiology. These can lead to incomplete or inadequate conclusions. To address this, we previously developed human phenotype networks (HPN), linking phenotypes based on shared biology. However, such visualization alone is often uninterpretable, and requires additional filtering. Here, we expand the HPN to include another method, evolutionary triangulation (ET). ET utilizes the hypothesis that alleles affecting disease risk in multiple populations are distributed consistently with differences in disease prevalence and compares allele frequencies among populations and their relationship to phenotype prevalence. We hypothesized that combining these methods will increase our ability to detect genetic patterns of association in complex diseases. We combined HPN and ET to identify network patterns associated with type 2 diabetes mellitus (T2DM), a leading cause of death worldwide. Fasting glucose, a continuous trait, was used as a proxy for T2DM and differs significantly among continental populations. The combined method identified several diabetes-related traits and several phenotypes related to cardiovascular diseases, for which diabetes is a major risk factor. ET-HPN found more phenotypes related to our target and related phenotypes than the application of either method alone. Not only could we detect phenotype connections related to T2DM, but we also identified phenotypes that are distributed in parallel to it, e.g., amyotrophic lateral sclerosis. Our analyses showed that ET-filtered HPN provides information that neither technique can individually.

Subject(s)

Diabetes Mellitus, Type 2/metabolism , Metabolic Networks and Pathways/genetics , Alleles , Cardiovascular Diseases/genetics , Cardiovascular Diseases/metabolism , Cardiovascular Diseases/pathology , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/pathology , Gene Frequency , Genome-Wide Association Study , Humans , Phenotype , Polymorphism, Single Nucleotide , Risk Factors

Evolutionary triangulation: informing genetic association studies with evolutionary evidence.

Huang, Minjun; Graham, Britney E; Zhang, Ge; Harder, Reed; Kodaman, Nuri; Moore, Jason H; Muglia, Louis; Williams, Scott M.

BioData Min ; 9: 12, 2016.

Article in English | MEDLINE | ID: mdl-27042214

ABSTRACT

Genetic studies of human diseases have identified many variants associated with pathogenesis and severity. However, most studies have used only statistical association to assess putative relationships to disease, and ignored other factors for evaluation. For example, evolution is a factor that has shaped disease risk, changing allele frequencies as human populations migrated into and inhabited new environments. Since many common variants differ among populations in frequency, as does disease prevalence, we hypothesized that patterns of disease and population structure, taken together, will inform association studies. Thus, the population distributions of allelic risk variants should reflect the distributions of their associated diseases. Evolutionary Triangulation (ET) exploits this evolutionary differentiation by comparing population structure among three populations with variable patterns of disease prevalence. By selecting populations based on patterns where two have similar rates of disease that differ substantially from a third, we performed a proof-of-principle analysis for this method. We examined three disease phenotypes: lactase persistence, melanoma, and Type 2 diabetes mellitus. We show that for lactase persistence, a phenotype with a simple genetic architecture, ET identifies the key gene, lactase. For melanoma, ET identifies several genes associated with this disease and/or phenotypes related to it, such as skin color genes. ET was less obviously successful for Type 2 diabetes mellitus, perhaps because of the small effect sizes in known risk loci and recent environmental changes that have altered disease risk. Alternatively, ET may have revealed new genes involved in conferring disease risk for diabetes that did not meet nominal GWAS significance thresholds. We also compared ET to another method used to filter for phenotype-associated genes, population branch statistic (PBS), and show that ET performs better in identifying genes known to associate with diseases appropriately distributed among populations. Our results indicate that ET can filter association results to improve our ability to discover disease loci.
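The three-population comparison can be sketched as a simple filter. The abstract does not state which differentiation measure or cutoffs ET uses, so this example substitutes the absolute difference in allele frequency as a stand-in and uses hypothetical thresholds: a SNP is kept when the two populations with similar prevalence (A and B) have close frequencies and both differ clearly from the third (C).

```python
def et_filter(freqs, similar_max=0.05, different_min=0.30):
    """Keep SNPs whose frequencies mirror the prevalence pattern A ~ B != C.

    freqs maps a SNP id to (f_a, f_b, f_c). The thresholds and the use of
    absolute frequency difference are illustrative assumptions, not the
    measure used in the paper.
    """
    kept = []
    for snp, (f_a, f_b, f_c) in freqs.items():
        close = abs(f_a - f_b) <= similar_max
        far = abs(f_a - f_c) >= different_min and abs(f_b - f_c) >= different_min
        if close and far:
            kept.append(snp)
    return kept


# Hypothetical allele frequencies for three placeholder SNPs.
toy_freqs = {
    "snp_1": (0.10, 0.12, 0.80),  # A and B agree, C differs: kept
    "snp_2": (0.10, 0.50, 0.80),  # A and B disagree: dropped
    "snp_3": (0.40, 0.42, 0.45),  # no population differs: dropped
}
print(et_filter(toy_freqs))
```

The surviving SNPs would then be used to filter association results, which is the role the abstract assigns to ET.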

The multiscale backbone of the human phenotype network based on biological pathways.

Darabos, Christian; White, Marquitta J; Graham, Britney E; Leung, Derek N; Williams, Scott M; Moore, Jason H.

BioData Min ; 7(1): 1, 2014 Jan 25.

Article in English | MEDLINE | ID: mdl-24460644

ABSTRACT

BACKGROUND: Networks are commonly used to represent and analyze large and complex systems of interacting elements. In systems biology, human disease networks show interactions between disorders sharing a common genetic background. We built a pathway-based human phenotype network (PHPN) of over 800 physical attributes, diseases, and behavioral traits, based on about 2,300 genes and 1,200 biological pathways. Using GWAS phenotype-to-gene associations and pathway data from Reactome, we connect human traits based on the common patterns of human biological pathways, detecting more pleiotropic effects and expanding previous studies from a gene-centric approach to one of shared cellular processes. RESULTS: The resulting network has a heavily right-skewed degree distribution, placing it in the scale-free region of the network topologies spectrum. We extract the multi-scale information backbone of the PHPN based on the local densities of the network, discarding weak connections. Using a standard community detection algorithm, we construct phenotype modules of similar traits without applying expert biological knowledge. These modules can be likened to disease classes. However, we are able to classify phenotypes according to shared biology, and not arbitrary disease classes. We present examples of expected clinical connections identified by PHPN as proof of principle. CONCLUSIONS: We unveil a previously uncharacterized connection between phenotype modules and discuss potential mechanistic connections that are obvious only in retrospect. The PHPN shows tremendous potential to become a useful tool both in the unveiling of the diseases' common biology, and in the elaboration of diagnosis and treatments.
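The network construction step can be sketched as follows, assuming that two phenotypes are linked whenever their associated genes fall in at least one common pathway and that the edge weight is the number of shared pathways. The weighting scheme, function names, and toy identifiers are assumptions for illustration; the backbone extraction and community detection steps are not shown.

```python
from itertools import combinations


def phenotype_pathways(phenotype_genes, gene_pathways):
    """Map each phenotype to the union of pathways containing its genes."""
    result = {}
    for phenotype, genes in phenotype_genes.items():
        pathways = set()
        for gene in genes:
            pathways |= gene_pathways.get(gene, set())
        result[phenotype] = pathways
    return result


def build_network(phenotype_genes, gene_pathways):
    """Return {(phenotype_a, phenotype_b): number of shared pathways}."""
    mapped = phenotype_pathways(phenotype_genes, gene_pathways)
    edges = {}
    for a, b in combinations(sorted(mapped), 2):
        shared = mapped[a] & mapped[b]
        if shared:
            edges[(a, b)] = len(shared)
    return edges


# Hypothetical placeholder data standing in for GWAS and pathway annotations.
toy_phenotype_genes = {
    "trait_A": {"GENE1", "GENE2"},
    "trait_B": {"GENE3"},
    "trait_C": {"GENE4"},
}
toy_gene_pathways = {
    "GENE1": {"pathway_X"},
    "GENE2": {"pathway_Y"},
    "GENE3": {"pathway_X"},
    "GENE4": {"pathway_Z"},
}
print(build_network(toy_phenotype_genes, toy_gene_pathways))
```

Note that trait_A and trait_B share no gene in the toy data yet are still connected through pathway_X, which is the move from a gene-centric view to shared processes that the abstract describes.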
