Results 1 - 20 of 22
1.
J Med Internet Res ; 26: e46777, 2024 Apr 18.
Article in English | MEDLINE | ID: mdl-38635981

ABSTRACT

BACKGROUND: As global populations age and become susceptible to neurodegenerative illnesses, new therapies for Alzheimer disease (AD) are urgently needed. Existing data resources for drug discovery and repurposing fail to capture relationships central to the disease's etiology and response to drugs. OBJECTIVE: We designed the Alzheimer's Knowledge Base (AlzKB) to alleviate this need by providing a comprehensive knowledge representation of AD etiology and candidate therapeutics. METHODS: We designed the AlzKB as a large, heterogeneous graph knowledge base assembled using 22 diverse external data sources describing biological and pharmaceutical entities at different levels of organization (eg, chemicals, genes, anatomy, and diseases). AlzKB uses a Web Ontology Language 2 ontology to enforce semantic consistency and allow for ontological inference. We provide a public version of AlzKB and allow users to run and modify local versions of the knowledge base. RESULTS: AlzKB is freely available on the web and currently contains 118,902 entities with 1,309,527 relationships between those entities. To demonstrate its value, we used graph data science and machine learning to (1) propose new therapeutic targets based on similarities of AD to Parkinson disease and (2) repurpose existing drugs that may treat AD. For each use case, AlzKB recovers known therapeutic associations while proposing biologically plausible new ones. CONCLUSIONS: AlzKB is a new, publicly available knowledge resource that enables researchers to discover complex translational associations for AD drug discovery. Through 2 use cases, we show that it is a valuable tool for proposing novel therapeutic hypotheses based on public biomedical knowledge.


Subjects
Alzheimer Disease; Humans; Alzheimer Disease/drug therapy; Alzheimer Disease/genetics; Pattern Recognition, Automated; Knowledge Bases; Machine Learning; Knowledge
2.
Bioinformatics ; 38(3): 878-880, 2022 01 12.
Article in English | MEDLINE | ID: mdl-34677586

ABSTRACT

MOTIVATION: Novel machine learning and statistical modeling studies rely on standardized comparisons to existing methods using well-studied benchmark datasets. Few tools exist that provide rapid access to many of these datasets through a standardized, user-friendly interface that integrates well with popular data science workflows. RESULTS: This release of PMLB (Penn Machine Learning Benchmarks) provides the largest collection of diverse, public benchmark datasets for evaluating new machine learning and data science methods aggregated in one location. v1.0 introduces a number of critical improvements developed following discussions with the open-source community. AVAILABILITY AND IMPLEMENTATION: PMLB is available at https://github.com/EpistasisLab/pmlb. Python and R interfaces for PMLB can be installed through the Python Package Index and Comprehensive R Archive Network, respectively.


Subjects
Benchmarking; Software; Machine Learning; Models, Statistical
3.
Hum Genet ; 141(9): 1529-1544, 2022 Sep.
Article in English | MEDLINE | ID: mdl-34713318

ABSTRACT

The genetic analysis of complex traits has been dominated by parametric statistical methods due to their theoretical properties, ease of use, computational efficiency, and intuitive interpretation. However, there are likely to be patterns arising from complex genetic architectures which are more easily detected and modeled using machine learning methods. Unfortunately, selecting the right machine learning algorithm and tuning its hyperparameters can be daunting for experts and non-experts alike. The goal of automated machine learning (AutoML) is to let a computer algorithm identify the right algorithms and hyperparameters thus taking the guesswork out of the optimization process. We review the promises and challenges of AutoML for the genetic analysis of complex traits and give an overview of several approaches and some example applications to omics data. It is our hope that this review will motivate studies to develop and evaluate novel AutoML methods and software in the genetics and genomics space. The promise of AutoML is to enable anyone, regardless of training or expertise, to apply machine learning as part of their genetic analysis strategy.


Subjects
Machine Learning; Multifactorial Inheritance; Algorithms; Genomics/methods; Humans; Software
4.
Chem Res Toxicol ; 35(8): 1370-1382, 2022 08 15.
Article in English | MEDLINE | ID: mdl-35819939

ABSTRACT

ComptoxAI is a new data infrastructure for computational and artificial intelligence research in predictive toxicology. Here, we describe and showcase ComptoxAI's graph-structured knowledge base in the context of three real-world use-cases, demonstrating that it can rapidly answer complex questions about toxicology that are infeasible using previous technologies and data resources. These use-cases each demonstrate a tool for information retrieval from the knowledge base being used to solve a specific task: The "shortest path" module is used to identify mechanistic links between perfluorooctanoic acid (PFOA) exposure and nonalcoholic fatty liver disease; the "expand network" module identifies communities that are linked to dioxin toxicity; and the quantitative structure-activity relationship (QSAR) dataset generator predicts pregnane X receptor agonism in a set of 4,021 pesticide ingredients. The contents of ComptoxAI's source data are rigorously aggregated from a diverse array of public third-party databases, and ComptoxAI is designed as a free, public, and open-source toolkit to enable diverse classes of users including biomedical researchers, public health and regulatory officials, and the general public to predict toxicology of unknowns and modes of action.
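
As a rough illustration of what a "shortest path" query over a graph-structured knowledge base involves, the sketch below runs a breadth-first search over a toy directed graph. The function name, node names, and edges are invented placeholders for illustration; they are not data from ComptoxAI and not its API.

```python
from collections import deque

def shortest_path(graph, start, goal):
    """Return one shortest path (list of nodes) from start to goal, or None."""
    if start == goal:
        return [start]
    previous = {start: None}          # predecessor map; doubles as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in previous:
                continue
            previous[neighbor] = node
            if neighbor == goal:
                # Walk the predecessor links back to the start.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None

# Hypothetical toy graph; every name here is a placeholder.
toy_graph = {
    "chemical_A": ["gene_1", "gene_2"],
    "gene_1": ["pathway_X"],
    "gene_2": ["pathway_X", "disease_D"],
    "pathway_X": ["disease_D"],
}
```

Calling `shortest_path(toy_graph, "chemical_A", "disease_D")` walks outward one hop at a time, so the first route that reaches the target is also a shortest one.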


Subjects
Computational Biology; Toxicology; Artificial Intelligence; Databases, Factual; Quantitative Structure-Activity Relationship
5.
PLoS Comput Biol ; 16(11): e1008390, 2020 11.
Article in English | MEDLINE | ID: mdl-33180774

ABSTRACT

Papers describing software are an important part of computational fields of scientific research. These "software papers" are unique in a number of ways, and they require special consideration to improve their impact on the scientific community and their efficacy at conveying important information. Here, we discuss 10 specific rules for writing software papers, covering some of the different scenarios and publication types that might be encountered, and important questions that all computational researchers would benefit from asking along the way.


Subjects
Computational Biology; Publishing; Software; Humans; Internet; Research Personnel; Writing
6.
BMC Bioinformatics ; 21(1): 430, 2020 Oct 01.
Article in English | MEDLINE | ID: mdl-32998684

ABSTRACT

BACKGROUND: A typical task in bioinformatics consists of identifying which features are associated with a target outcome of interest and building a predictive model. Automated machine learning (AutoML) systems such as the Tree-based Pipeline Optimization Tool (TPOT) constitute an appealing approach to this end. However, in biomedical data, there are often baseline characteristics of the subjects in a study or batch effects that need to be adjusted for in order to better isolate the effects of the features of interest on the target. Thus, the ability to perform covariate adjustments becomes particularly important for applications of AutoML to biomedical big data analysis. RESULTS: We developed an approach to adjust for covariates affecting features and/or target in TPOT. Our approach is based on regressing out the covariates in a manner that avoids 'leakage' during the cross-validation training procedure. We describe applications of this approach to toxicogenomics and schizophrenia gene expression data sets. The TPOT extensions discussed in this work are available at https://github.com/EpistasisLab/tpot/tree/v0.11.1-resAdj . CONCLUSIONS: In this work, we address an important need in the context of AutoML, which is particularly crucial for applications to bioinformatics and medical informatics, namely covariate adjustments. To this end we present a substantial extension of TPOT, a genetic programming based AutoML approach. We show the utility of this extension by applications to large toxicogenomics and differential gene expression data. The method is generally applicable in many other scenarios from the biomedical field.
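
The idea of regressing out covariates without "leakage" can be sketched in a few lines. The version below is a simplified illustration with a single covariate and plain least squares, not the TPOT extension itself; all function names are hypothetical. The key point is that the adjustment is fitted on the training fold only and then applied unchanged to the held-out fold.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (one covariate).

    Assumes x is not constant, otherwise the slope is undefined.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def residualize(x, y, intercept, slope):
    """Subtract the fitted covariate effect from y."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

def adjust_fold(cov_train, feat_train, cov_test, feat_test):
    """Fit the adjustment on the training fold only, then apply it to both folds."""
    intercept, slope = fit_line(cov_train, feat_train)
    return (residualize(cov_train, feat_train, intercept, slope),
            residualize(cov_test, feat_test, intercept, slope))
```

Fitting the line on the pooled data instead would let information from the held-out samples shape the adjustment, which is the leakage the abstract refers to.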


Subjects
Big Data; Data Analysis; Machine Learning; Algorithms; Automation; Humans
7.
Living Rev Relativ ; 20(1): 2, 2017.
Article in English | MEDLINE | ID: mdl-28690422

ABSTRACT

We review detection methods that are currently in use or have been proposed to search for a stochastic background of gravitational radiation. We consider both Bayesian and frequentist searches using ground-based and space-based laser interferometers, spacecraft Doppler tracking, and pulsar timing arrays; and we allow for anisotropy, non-Gaussianity, and non-standard polarization states. Our focus is on relevant data analysis issues, and not on the particular astrophysical or early Universe sources that might give rise to such backgrounds. We provide a unified treatment of these searches at the level of detector response functions, detection sensitivity curves, and, more generally, at the level of the likelihood function, since the choices of signal and noise models and prior probability distributions are what actually define the search. Pedagogical examples are given whenever possible to compare and contrast different approaches. We have tried to make the article as self-contained and comprehensive as possible, targeting graduate students and new researchers looking to enter this field.

8.
J Biomed Inform ; 54: 10-38, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25592479

ABSTRACT

The characterization of complex diseases remains a great challenge for biomedical researchers due to the myriad interactions of genetic and environmental factors. Network medicine approaches strive to accommodate these factors holistically. Phylogenomic techniques that can leverage available genomic data may provide an evolutionary perspective that may elucidate knowledge for gene networks of complex diseases and provide another source of information for network medicine approaches. Here, an automated method is presented that leverages publicly available genomic data and phylogenomic techniques, resulting in a gene network. The potential of the approach is demonstrated based on a case study of nine genes associated with Alzheimer Disease, a complex neurodegenerative syndrome. The developed technique updates a previously described Perl script called "ASAP" and is implemented as a suite of Ruby scripts entitled "ASAP2." It first compiles a list of sequence-similarity-based orthologues using PSI-BLAST and a recursive NCBI BLAST+ search strategy, then constructs maximum parsimony phylogenetic trees for each set of nucleotide and protein sequences, and finally calculates phylogenetic metrics (Incongruence Length Difference between orthologue sets, partitioned Bremer support values, combined branch scores, and Robinson-Foulds distance) to provide an empirical assessment of evolutionary conservation within a given genetic network. In addition to the individual phylogenetic metrics, ASAP2 provides results in a way that can be used to generate a gene network that represents evolutionary similarity based on topological similarity (the Robinson-Foulds distance).
The results of this study demonstrate the potential for using phylogenomic approaches that enable the study of multiple genes simultaneously to provide insights about potential gene relationships that can be studied within a network medicine framework that may not have been apparent using traditional, single-gene methods. Furthermore, the results provide an initial integrated evolutionary history of an Alzheimer Disease gene network and identify potentially important co-evolutionary clustering that may warrant further investigation.
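
To make the topological-similarity metric concrete, the sketch below computes a rooted variant of the Robinson-Foulds distance: the number of clades that appear in exactly one of two trees. It is written in Python rather than Ruby and is not taken from ASAP2; the nested-tuple tree encoding, the function names, and the leaf labels are assumptions made for illustration.

```python
def clades(tree):
    """Collect the leaf set under every internal node of a nested-tuple tree."""
    found = set()

    def leaves(node):
        if isinstance(node, str):          # a leaf is a plain label
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    leaves(tree)
    return found

def rf_distance(tree_a, tree_b):
    """Rooted Robinson-Foulds distance: clades present in exactly one tree."""
    return len(clades(tree_a) ^ clades(tree_b))

# Hypothetical four-leaf trees; the labels are placeholders, not gene names
# from the study.
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))
```

A distance of zero means the two trees group their leaves identically; pairwise distances between gene trees could then serve as edge weights in a similarity network.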


Subjects
Computational Biology/methods; Genetic Association Studies/methods; Genetic Predisposition to Disease/genetics; Phylogeny; Alzheimer Disease/genetics; Animals; Cluster Analysis; Humans; Mammals/classification; Mammals/genetics; Proteins/genetics; Sequence Analysis, DNA
9.
Article in English | MEDLINE | ID: mdl-38723657

ABSTRACT

The progress of precision medicine research hinges on the gathering and analysis of extensive and diverse clinical datasets. With the continued expansion of modalities, scales, and sources of clinical datasets, it becomes imperative to devise methods for aggregating information from these varied sources to achieve a comprehensive understanding of diseases. In this review, we describe two important approaches for the analysis of diverse clinical datasets, namely the centralized model and federated model. We compare and contrast the strengths and weaknesses inherent in each model and present recent progress in methodologies and their associated challenges. Finally, we present an outlook on the opportunities that both models hold for the future analysis of clinical data.
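
The contrast between the two models can be shown with a deliberately small example: computing a mean either by pooling all records in one place or by combining per-site summaries. This is a toy sketch under the assumption that each site can share a sum and a count; the review does not prescribe this or any specific algorithm, and the function names are hypothetical.

```python
def centralized_mean(site_values):
    """Centralized model: pool every record in one place, then average."""
    pooled = [value for values in site_values for value in values]
    return sum(pooled) / len(pooled)

def federated_mean(site_values):
    """Federated model: each site shares only (sum, count), never raw records."""
    summaries = [(sum(values), len(values)) for values in site_values]
    total = sum(site_sum for site_sum, _ in summaries)
    count = sum(site_count for _, site_count in summaries)
    return total / count
```

For a simple statistic such as a mean the two routes agree exactly; the trade-offs discussed in the review arise for analyses where site-level summaries are harder to define.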

10.
CPT Pharmacometrics Syst Pharmacol ; 12(8): 1072-1079, 2023 08.
Article in English | MEDLINE | ID: mdl-37475158

ABSTRACT

In computational toxicology, prediction of complex endpoints has always been challenging, as they often involve multiple distinct mechanisms. State-of-the-art models are either limited by low accuracy, or lack of interpretability due to their black-box nature. Here, we introduce AIDTox, an interpretable deep learning model which incorporates curated knowledge of chemical-gene connections, gene-pathway annotations, and pathway hierarchy. AIDTox accurately predicts cytotoxicity outcomes in HepG2 and HEK293 cells. It also provides comprehensive explanations of cytotoxicity covering multiple aspects of drug activity, including target interaction, metabolism, and elimination. In summary, AIDTox provides a computational framework for unveiling cellular mechanisms for complex toxicity endpoints.


Subjects
Pattern Recognition, Automated; Humans; HEK293 Cells
11.
Toxins (Basel) ; 15(7), 2023 07 09.
Article in English | MEDLINE | ID: mdl-37505720

ABSTRACT

Venoms are a diverse and complex group of natural toxins that have been adapted to treat many types of human disease, but rigorous computational approaches for discovering new therapeutic activities are scarce. We have designed and validated a new platform, named VenomSeq, to systematically identify putative associations between venoms and drugs/diseases via high-throughput transcriptomics and perturbational differential gene expression analysis. In this study, we describe the architecture of VenomSeq and its evaluation using the crude venoms from 25 diverse animal species and 9 purified teretoxin peptides. By integrating comparisons to public repositories of differential expression, associations between regulatory networks and disease, and existing knowledge of venom activity, we provide a number of new therapeutic hypotheses linking venoms to human diseases supported by multiple layers of preliminary evidence.


Subjects
Peptides; Venoms; Animals; Humans; Venoms/metabolism; Peptides/genetics; Peptides/pharmacology; Peptides/therapeutic use; Gene Expression Profiling; Gene Expression
12.
Comput Toxicol ; 25, 2023 Feb.
Article in English | MEDLINE | ID: mdl-37829618

ABSTRACT

Adverse outcome pathways provide a powerful tool for understanding the biological signaling cascades that lead to disease outcomes following toxicity. The framework outlines downstream responses known as key events, culminating in a clinically significant adverse outcome as a final result of the toxic exposure. Here we use the AOP framework combined with artificial intelligence methods to gain novel insights into genetic mechanisms that underlie toxicity-mediated adverse health outcomes. Specifically, we focus on liver cancer as a case study with diverse underlying mechanisms that are clinically significant. Our approach uses two complementary AI techniques: generative modeling via automated machine learning and genetic algorithms, and graph machine learning. We used data from the US Environmental Protection Agency's Adverse Outcome Pathway Database (AOP-DB; aopdb.epa.gov) and the UK Biobank's genetic data repository. We use the AOP-DB to extract disease-specific AOPs and build graph neural networks used in our final analyses. We use the UK Biobank to retrieve real-world genotype and phenotype data, where genotypes are based on single nucleotide polymorphism data extracted from the AOP-DB, and phenotypes are case/control cohorts for the disease of interest (liver cancer) corresponding to those adverse outcome pathways. We also use propensity score matching to appropriately sample based on important covariates (demographics, comorbidities, and social deprivation indices) and to balance the case and control populations in our machine learning training/testing datasets. Finally, we describe a novel putative risk factor for liver cancer that depends on genetic variation in both the aryl-hydrocarbon receptor (AHR) and ATP binding cassette subfamily B member 11 (ABCB11) genes.
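
Propensity score matching can take several forms, and the abstract does not say which one was used. The sketch below shows one common variant, greedy 1:1 nearest-neighbor matching with a caliper, on propensity scores that are assumed to have been estimated already. The function name, the caliper default, and the example scores are hypothetical.

```python
def match_controls(case_scores, control_scores, caliper=0.1):
    """Greedy 1:1 nearest-neighbor matching on precomputed propensity scores.

    Returns (case_index, control_index) pairs. Each control is used at most
    once, and a case is left unmatched if its nearest control is further
    away than `caliper`.
    """
    available = set(range(len(control_scores)))
    pairs = []
    for i, score in enumerate(case_scores):
        if not available:
            break
        # Nearest remaining control; ties are broken by the lower index.
        best = min(available, key=lambda j: (abs(control_scores[j] - score), j))
        if abs(control_scores[best] - score) <= caliper:
            pairs.append((i, best))
            available.remove(best)
    return pairs
```

Greedy matching depends on the order in which cases are processed; optimal matching methods avoid that dependence at a higher computational cost.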

13.
Pac Symp Biocomput ; 27: 187-198, 2022.
Article in English | MEDLINE | ID: mdl-34890148

ABSTRACT

Quantitative Structure-Activity Relationship (QSAR) modeling is a common computational technique for predicting chemical toxicity, but a lack of new methodological innovations has impeded QSAR performance on many tasks. We show that contemporary QSAR modeling for predictive toxicology can be substantially improved by incorporating semantic graph data aggregated from open-access public databases, and analyzing those data in the context of graph neural networks (GNNs). Furthermore, we introspect the GNNs to demonstrate how they can lead to more interpretable applications of QSAR, and use ablation analysis to explore the contribution of different data elements to the final models' performance.


Subjects
Quantitative Structure-Activity Relationship; Semantics; Computational Biology; Databases, Factual; Humans; Neural Networks, Computer
14.
Patterns (N Y) ; 3(9): 100565, 2022 Sep 09.
Article in English | MEDLINE | ID: mdl-36124309

ABSTRACT

In drug development, a major reason for attrition is the lack of understanding of cellular mechanisms governing drug toxicity. The black-box nature of conventional classification models has limited their utility in identifying toxicity pathways. Here we developed DTox (deep learning for toxicology), an interpretation framework for knowledge-guided neural networks, which can predict compound response to toxicity assays and infer toxicity pathways of individual compounds. We demonstrate that DTox can achieve the same level of predictive performance as conventional models with a significant improvement in interpretability. Using DTox, we were able to rediscover mechanisms of transcription activation by three nuclear receptors, recapitulate cellular activities induced by aromatase inhibitors and pregnane X receptor (PXR) agonists, and differentiate distinctive mechanisms leading to HepG2 cytotoxicity. Virtual screening by DTox revealed that compounds with predicted cytotoxicity are at higher risk for clinical hepatic phenotypes. In summary, DTox provides a framework for deciphering cellular mechanisms of toxicity in silico.

15.
Front Genet ; 10: 368, 2019.
Article in English | MEDLINE | ID: mdl-31114606

ABSTRACT

The discovery of new pharmaceutical drugs is one of the preeminent tasks (scientifically, economically, and socially) in biomedical research. Advances in informatics and computational biology have increased productivity at many stages of the drug discovery pipeline. Nevertheless, drug discovery has slowed, largely due to the reliance on small molecules as the primary source of novel hypotheses. Natural products (such as plant metabolites, animal toxins, and immunological components) comprise a vast and diverse source of bioactive compounds, some of which are supported by thousands of years of traditional medicine, and are largely disjoint from the set of small molecules used commonly for discovery. However, natural products possess unique characteristics that distinguish them from traditional small molecule drug candidates, requiring new methods and approaches for assessing their therapeutic potential. In this review, we investigate a number of state-of-the-art techniques in bioinformatics, cheminformatics, and knowledge engineering for data-driven drug discovery from natural products. We focus on methods that aim to bridge the gap between traditional small-molecule drug candidates and different classes of natural products. We also explore the current informatics knowledge gaps and other barriers that need to be overcome to fully leverage these compounds for drug discovery. Finally, we conclude with a "road map" of research priorities that seeks to realize this goal.

16.
AMIA Jt Summits Transl Sci Proc ; 2019: 335-344, 2019.
Article in English | MEDLINE | ID: mdl-31258986

ABSTRACT

For the past 11 years, the year-in-review (YIR) keynote presentation at the AMIA Informatics summit has been a perennial highlight. We hypothesized that the presented material from these keynotes could be used to assess both the recent trajectory of topics in informatics, especially translational bioinformatics (TBI), and the scientific merit of the crowd-sourced process used to nominate, review, and select the papers presented at the YIR. We compare YIR articles to a background set of non-YIR articles from informatics journals using structured metadata and qualitative thematic analysis, paying specific attention to trends and popularity over time. These trends were inspected both internally (comparing the YIR sessions to each other) and externally (comparing them to the overall content of scientific literature for the same time period). In doing so, we identified some unexpected patterns that suggest important opportunities for TBI research in the future.

17.
Dement Geriatr Cogn Disord ; 25(4): 380-4, 2008.
Article in English | MEDLINE | ID: mdl-18376127

ABSTRACT

BACKGROUND/AIMS: A previous study found a high prevalence of headaches in persons with familial Alzheimer's disease (FAD) due to a PSEN1 mutation. In our study we compared the prevalence of headaches between nondemented FAD mutation carriers (MCs) and non-mutation-carrying controls (NCs). METHODS: A headache questionnaire that assessed the prevalence of significant headaches and diagnosis of migraine and aura by ICHD-2 criteria was administered to 27 individuals at risk for FAD. Frequency of significant headaches, migraine, and aura were compared between MCs and NCs by chi-square or Fisher's exact tests. RESULTS: Twenty-three subjects were at risk for PSEN1 mutations and 4 for an APP substitution. The majority of subjects were female (23/27). MCs were more likely to report significant recurrent headache than NCs (67 vs. 25%, p = 0.031). Forty percent of MCs had headaches that met criteria for migraine whereas 17% of NCs met such criteria. The tendency for a higher prevalence of headaches in MCs held for different PSEN1 and APP mutations but was not significant unless all families were combined. CONCLUSIONS: In this population, headache was more common in nondemented FAD MCs than NCs. Possible mechanisms for this include cerebral inflammation, aberrant processing of Notch3, or disrupted intracellular calcium regulation.


Subjects
Alzheimer Disease/epidemiology; Alzheimer Disease/genetics; Amyloid beta-Protein Precursor/genetics; Headache/epidemiology; Headache/genetics; Presenilin-1/genetics; Adult; Family Health; Female; Genetic Predisposition to Disease/epidemiology; Humans; Male; Mutation; Prevalence; Recurrence
19.
AMIA Jt Summits Transl Sci Proc ; 2016: 209-18, 2016.
Article in English | MEDLINE | ID: mdl-27570672

ABSTRACT

Venoms and venom-derived compounds constitute a rich and largely unexplored source of potentially therapeutic compounds. To facilitate biomedical research, it is necessary to design a robust informatics infrastructure that will allow semantic computation of venom concepts in a standardized, consistent manner. We have designed an ontology of venom-related concepts - named Venom Ontology - that reuses an existing public data source: UniProt's Tox-Prot database. In addition to describing the ontology and its construction, we have performed three separate case studies demonstrating its utility: (1) An exploration of venom peptide similarity networks within specific genera; (2) A broad overview of the distribution of available data among common taxonomic groups spanning the known tree of life; and (3) An analysis of the distribution of venom complexity across those same taxonomic groups. Venom Ontology is publicly available on BioPortal at http://bioportal.bioontology.org/ontologies/CU-VO.

20.
J Am Coll Emerg Physicians Open ; 2(3): e12471, 2021 Jun.
Article in English | MEDLINE | ID: mdl-34142106