Results 1 - 20 of 41
1.
Tuberculosis (Edinb) ; 146: 102500, 2024 May.
Article in English | MEDLINE | ID: mdl-38432118

ABSTRACT

Tuberculosis (TB) is still a major global health challenge, killing over 1.5 million people each year, and hence, there is a need to identify and develop novel treatments for Mycobacterium tuberculosis (M. tuberculosis). The prevalence of infections caused by nontuberculous mycobacteria (NTM) is also increasing and has overtaken TB cases in the United States and much of the developed world. Mycobacterium abscessus (M. abscessus) is one of the most frequently encountered NTM and is difficult to treat. We describe the use of drug-disease association using a semantic knowledge graph approach combined with machine learning models that has enabled the identification of several molecules for testing anti-mycobacterial activity. We established that niclosamide (M. tuberculosis IC90 2.95 µM; M. abscessus IC90 59.1 µM) and tribromsalan (M. tuberculosis IC90 76.92 µM; M. abscessus IC90 147.4 µM) inhibit M. tuberculosis and M. abscessus in vitro. To investigate the mode of action, we determined the transcriptional response of M. tuberculosis and M. abscessus to both compounds in axenic log phase, demonstrating a broad effect on gene expression that differed from known M. tuberculosis inhibitors. Both compounds elicited transcriptional responses indicative of respiratory pathway stress and the dysregulation of fatty acid metabolism.


Subject(s)
Mycobacterium Infections, Nontuberculous , Mycobacterium abscessus , Mycobacterium tuberculosis , Salicylanilides , Tuberculosis , Humans , Mycobacterium tuberculosis/genetics , Mycobacterium Infections, Nontuberculous/microbiology , Niclosamide/pharmacology , Drug Repositioning , Nontuberculous Mycobacteria/genetics , Tuberculosis/drug therapy , Tuberculosis/microbiology
2.
Mil Med ; 188(Suppl 6): 377-384, 2023 11 08.
Article in English | MEDLINE | ID: mdl-37948241

ABSTRACT

INTRODUCTION: The advancement of the Army's National Emergency Tele-Critical Care Network (NETCCN) and planned evolution to an Intelligent Medical System rest on a digital transformation characterized by the application of analytic rigor anchored and machine learning. The goal is an enduring capability for telecritical care in support of the Nation's warfighters and, more broadly, for emergency response, crisis management, and mass casualty situations as the number and intensity of disasters increase nationwide. That said, technology alone is unlikely to solve the most pressing issues in operational medicine and combat casualty care. MATERIALS AND METHODS: A total performance system (TPS) creates opportunities to address vulnerabilities and overcome barriers to success. As applied during the NETCCN project, the TPS captures the best performance-centric information and know-how, increasing the potential to save lives, improve readiness, and accomplish missions. RESULTS: The purpose of this project was to apply a performance-based readiness model to aid in the evaluation of Army telehealth technologies. Through various user-facing surveys, polls, and reporting techniques, the project aimed to measure the perceived value of telehealth technologies within a sample of the project team member population. By providing a detailed approach to the collection of lessons learned, researchers were able to determine the importance of information and methods versus a focus on technology alone. The use of an emoji-based feedback assessment indicated that most lessons learned were helpful to the project team. CONCLUSIONS: Through the NETCCN TPS, we have been able to address product-related measures, knowledge of product efficacy, project metrics, and many implementation considerations that can be further investigated by setting and engagement type. Through the Technology in Disaster Environments learning accelerator, it was possible to rapidly acquire, process, organize, and disseminate best practices and learnings in near real time, providing a critical feedback and improvement loop.


Subject(s)
Emergency Medical Services , Military Personnel , Telemedicine , Humans , Critical Care
3.
J Emerg Manag ; 21(5): 399-419, 2023.
Article in English | MEDLINE | ID: mdl-37932944

ABSTRACT

In this paper, we introduce the Analysis Platform for Risk, Resilience, and Expenditure in Disasters (APRED)-a disaster-analytic platform developed for crisis practitioners and economic developers across the United States (US). APRED provides practitioners with a centralized platform for exploring disaster resilience and vulnerability profiles of all counties across the US. The platform comprises five sections including: (1) Disaster Resilience Index, (2) Business Vulnerability Index, (3) Disaster Declaration History, (4) County Profile, and (5) Storm History sections. We further describe our end-to-end human-centered design and engineering process that involved contextual inquiry, community-based participatory design, and rapid prototyping with the support of US Economic Development Administration representatives and regional economic developers across the US. Findings from our study revealed that distributed cognition, content heuristic, shareability, and human-centered systems are crucial considerations for developing data-intensive visualization platforms for resilience planning. We discuss the implications of these findings and inform future research on developing sociotechnical visualization platforms to support resilience planning.


Subject(s)
Disaster Planning , Disasters , Humans , Data Science , Community Participation , Internet
4.
BMC Bioinformatics ; 23(1): 37, 2022 Jan 12.
Article in English | MEDLINE | ID: mdl-35021991

ABSTRACT

BACKGROUND: LINCS, "Library of Integrated Network-based Cellular Signatures", and IDG, "Illuminating the Druggable Genome", are both NIH projects and consortia that have generated rich datasets for the study of the molecular basis of human health and disease. LINCS L1000 expression signatures provide unbiased systems/omics experimental evidence. IDG provides compiled and curated knowledge for illumination and prioritization of novel drug target hypotheses. Together, these resources can support a powerful new approach to identifying novel drug targets for complex diseases, such as Parkinson's disease (PD), which continues to inflict severe harm on human health, and resist traditional research approaches. RESULTS: Integrating LINCS and IDG, we built the Knowledge Graph Analytics Platform (KGAP) to support an important use case: identification and prioritization of drug target hypotheses for associated diseases. The KGAP approach includes strong semantics interpretable by domain scientists and a robust, high performance implementation of a graph database and related analytical methods. Illustrating the value of our approach, we investigated results from queries relevant to PD. Approved PD drug indications from IDG's resource DrugCentral were used as starting points for evidence paths exploring chemogenomic space via LINCS expression signatures for associated genes, evaluated as target hypotheses by integration with IDG. The KG-analytic scoring function was validated against a gold standard dataset of genes associated with PD as elucidated, published mechanism-of-action drug targets, also from DrugCentral. IDG's resource TIN-X was used to rank and filter KGAP results for novel PD targets, and one, SYNGR3 (Synaptogyrin-3), was manually investigated further as a case study and plausible new drug target for PD. 
CONCLUSIONS: The synergy of LINCS and IDG, via KG methods, empowers graph analytics methods for the investigation of the molecular basis of complex diseases, and specifically for identification and prioritization of novel drug targets. The KGAP approach enables downstream applications via integration with resources similarly aligned with modern KG methodology. The generality of the approach indicates that KGAP is applicable to many disease areas, in addition to PD, the focus of this paper.


Subject(s)
Parkinson Disease , Gene Library , Genome , Humans , Lighting , Parkinson Disease/drug therapy , Parkinson Disease/genetics , Pattern Recognition, Automated
5.
Bioinformatics ; 37(21): 3865-3873, 2021 11 05.
Article in English | MEDLINE | ID: mdl-34086846

ABSTRACT

MOTIVATION: Genome-wide association studies can reveal important genotype-phenotype associations; however, data quality and interpretability issues must be addressed. For drug discovery scientists seeking to prioritize targets based on the available evidence, these issues go beyond the single study. RESULTS: Here, we describe rational ranking, filtering and interpretation of inferred gene-trait associations and data aggregation across studies by leveraging existing curation and harmonization efforts. Each gene-trait association is evaluated for confidence, with scores derived solely from aggregated statistics, linking a protein-coding gene and phenotype. We propose a method for assessing confidence in gene-trait associations from evidence aggregated across studies, including a bibliometric assessment of scientific consensus based on the iCite relative citation ratio, and meanRank scores, to aggregate multivariate evidence. This method, intended for drug target hypothesis generation, scoring and ranking, has been implemented as an analytical pipeline, available as open source, with public datasets of results, and a web application designed for usability by drug discovery scientists. AVAILABILITY AND IMPLEMENTATION: Web application, datasets and source code via https://unmtid-shinyapps.net/tiga/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genome-Wide Association Study , Lighting , Genotype , Polymorphism, Single Nucleotide , Phenotype
6.
BMC Bioinformatics ; 19(1): 265, 2018 07 16.
Article in English | MEDLINE | ID: mdl-30012095

ABSTRACT

BACKGROUND: Netpredictor is an R package for prediction of missing links in any given unipartite or bipartite network. The package provides utilities to compute missing links in bipartite as well as unipartite networks using Random Walk with Restart, the Network inference algorithm, and a combination of both. The package also allows computation of bipartite network properties, visualization of communities for two different sets of nodes, and calculation of significant interactions between two sets of nodes using permutation-based testing. The application can also be used to search for top-K shortest paths between interactome and use enrichment analysis for disease, pathway and ontology. The R standalone package (including detailed introductory vignettes) and associated R Shiny web application are available under the GPL-2 Open Source license and are freely available to download. RESULTS: We compared the performance of different algorithms on different small datasets and found that random walk outperforms the rest of the algorithms. The package is developed to perform network-based prediction of unipartite and bipartite networks and use the results to understand the functionality of proteins in an interactome using enrichment analysis. CONCLUSION: A rapid application development environment like Shiny helps non-programmers to develop fast, rich visualization apps, and we believe it will continue to grow in the future with further enhancements. We plan to update the algorithms in the package in the near future and help scientists to analyse data in a much more streamlined fashion.


Subject(s)
Algorithms , Drug Delivery Systems , Gene Ontology , Protein Interaction Maps , Software
7.
J Cheminform ; 10(1): 24, 2018 May 21.
Article in English | MEDLINE | ID: mdl-29785561

ABSTRACT

Tuberculosis (TB) is the world's leading infectious killer with 1.8 million deaths in 2015 as reported by WHO. It is therefore imperative that alternate routes of identification of novel anti-TB compounds are explored given the time and costs involved in the new drug discovery process. Towards this, we have developed RepTB. This is a unique drug repurposing approach for TB that uses molecular function correlations among known drug-target pairs to predict novel drug-target interactions. In this study, we have created a Gene Ontology based network containing 26,404 edges, 6630 drug and 4083 target nodes. The network, enriched with molecular function ontology, was analyzed using Network Based Inference (NBI). The association scores computed from NBI are used to identify novel drug-target interactions. These interactions are further evaluated based on a combined evidence approach for identification of potential drug repurposing candidates. In this approach, targets which have no known variation in clinical isolates, no human homologs, and are essential for Mtb's survival and/or virulence are prioritized. We analyzed predicted DTIs to identify target pairs whose predicted drugs may have a synergistic bactericidal effect. From the list of predicted DTIs from RepTB, four TB targets, namely, FolP1 (Dihydropteroate synthase), Tmk (Thymidylate kinase), Dut (Deoxyuridine 5'-triphosphate nucleotidohydrolase) and MenB (1,4-dihydroxy-2-naphthoyl-CoA synthase) may be selected for further validation. In addition, we observed that in some cases there is significant chemical structure similarity between predicted and reported drugs of prioritized targets, lending credence to our approach. We also report new chemical space for prioritized targets that may be tested further. We believe that as drug-target interaction datasets grow, RepTB will be able to offer better predictive value, and that it is amenable to identification of drug-repurposing candidates for other disease indications too.
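The Network Based Inference step named in this abstract can be illustrated with a small sketch. It assumes the two-step resource-diffusion formulation that is commonly used for NBI on bipartite drug-target graphs; the abstract does not spell out the exact variant used in RepTB, and the function name `nbi_scores` and the toy edges are hypothetical.

```python
from collections import defaultdict


def nbi_scores(edges, drug):
    """Score targets for one drug by two-step resource diffusion.

    edges: iterable of (drug, target) pairs for known interactions.
    Assumption: plain, unweighted NBI; no ontology enrichment is modeled.
    """
    targets_of = defaultdict(set)
    drugs_of = defaultdict(set)
    for d, t in edges:
        targets_of[d].add(t)
        drugs_of[t].add(d)

    # Step 1: every known target of the query drug starts with one unit of
    # resource and splits it equally among the drugs that bind it.
    drug_resource = defaultdict(float)
    for t in targets_of[drug]:
        share = 1.0 / len(drugs_of[t])
        for d in drugs_of[t]:
            drug_resource[d] += share

    # Step 2: every drug splits what it received equally among its targets.
    target_score = defaultdict(float)
    for d, resource in drug_resource.items():
        share = resource / len(targets_of[d])
        for t in targets_of[d]:
            target_score[t] += share
    return dict(target_score)


# Hypothetical toy network: d1 binds t1; d2 binds t1 and t2.
toy_edges = [("d1", "t1"), ("d2", "t1"), ("d2", "t2")]
scores = nbi_scores(toy_edges, "d1")
```

Targets that receive a non-zero score but are not yet linked to the query drug (here `t2`) are the candidate new interactions; the known links would normally be filtered out before ranking.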

8.
J Biomed Semantics ; 8(1): 42, 2017 Sep 20.
Article in English | MEDLINE | ID: mdl-28931422

ABSTRACT

BACKGROUND: There is a huge variety of data sources relevant to chemical, biological and pharmacological research, but these data sources are highly siloed and cannot be queried together in a straightforward way. Semantic technologies offer the ability to create links and mappings across datasets and manage them as a single, linked network so that searching can be carried out across datasets, independently of the source. We have developed an application called PIBAS FedSPARQL that uses semantic technologies to allow researchers to carry out such searching across a vast array of data sources. RESULTS: PIBAS FedSPARQL is a web-based query builder and result set visualizer of bioinformatics data. As an advanced feature, our system can detect similar data items identified by different Uniform Resource Identifiers (URIs), using a text-mining algorithm based on the processing of named entities to be used in Vector Space Model and Cosine Similarity Measures. To our knowledge, PIBAS FedSPARQL was unique among the systems that we found in that it allows detection of similar data items. As a query builder, our system allows researchers to intuitively construct and run Federated SPARQL queries across multiple data sources, including global initiatives, such as Bio2RDF, Chem2Bio2RDF, EMBL-EBI, and one local initiative called CPCTAS, as well as additional user-specified data sources. From the input topic, subtopic, template and keyword, a corresponding initial Federated SPARQL query is created and executed. Based on the data obtained, end users have the ability to choose the most appropriate data sources in their area of interest and exploit their Resource Description Framework (RDF) structure, which allows users to select certain properties of data to enhance query results. CONCLUSIONS: The developed system is flexible and allows intuitive creation and execution of queries for an extensive range of bioinformatics topics. Also, the novel "similar data items detection" algorithm can be particularly useful for suggesting new data sources and cost optimization for new experiments. PIBAS FedSPARQL can be expanded with new topics, subtopics and templates on demand, rendering information retrieval more robust.
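The similar-item detection described in this abstract rests on a vector space model with cosine similarity. The sketch below shows only that core measure, under the assumption that each data item is reduced to a bag of lower-cased whitespace tokens; the named-entity processing mentioned in the abstract is not reproduced, and the function name `cosine_similarity` is hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between two term-frequency vectors.

    Assumption: whitespace tokens stand in for the named entities that the
    described system extracts before comparison.
    """
    vec_a = Counter(text_a.lower().split())
    vec_b = Counter(text_b.lower().split())
    # Dot product over the terms the two vectors share.
    dot = sum(vec_a[term] * vec_b[term] for term in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(count * count for count in vec_a.values()))
    norm_b = math.sqrt(sum(count * count for count in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Two labels that might be attached to different URIs for the same compound.
same = cosine_similarity("acetylsalicylic acid", "Acid Acetylsalicylic")
different = cosine_similarity("acetylsalicylic acid", "thymidylate kinase")
```

Items whose similarity exceeds a chosen threshold would be flagged as likely referring to the same entity; the threshold itself is a tuning decision that the abstract does not state.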


Subject(s)
Computational Biology , Data Mining/methods , Internet , Software , Databases, Factual , User-Computer Interface
9.
J Cheminform ; 8: 41, 2016.
Article in English | MEDLINE | ID: mdl-27547247

ABSTRACT

BACKGROUND: Highly chemically similar drugs usually possess similar biological activities, but sometimes, small changes in chemistry can result in a large difference in biological effects. Chemically similar drug pairs that show extreme deviations in activity represent distinctive drug interactions having important implications. These associations between chemical and biological similarity are studied as discontinuities in activity landscapes. Particularly, activity cliffs are quantified by the drop in similar activity of chemically similar drugs. In this paper, we construct a landscape using a large drug-target network and consider the rises in similarity and variation in activity along the chemical space. Detailed analysis of structure and activity gives a rigorous quantification of distinctive pairs and the probability of their occurrence. RESULTS: We analyze pairwise similarity (s) and variation (d) in activity of drugs on proteins. Interactions between drugs are quantified by considering pairwise s and d weights jointly with corresponding chemical similarity (c) weights. Similarity and variation in activity are measured as the number of common and uncommon targets of two drugs respectively. Distinctive interactions occur between drugs having high c and above (below) average d (s). Computation of predicted probability of distinctiveness employs joint probability of c, s and of c, d assuming independence of structure and activity. Predictions conform with the observations at different levels of distinctiveness. Results are validated on the data used and another drug ensemble. In the landscape, while s and d decrease as c increases, d maintains value more than s. c ∈ [0.3, 0.64] is the transitional region where rises in d are significantly greater than drops in s. It is fascinating that distinctive interactions filtered with high d and low s are different in nature. It is crucial that high c interactions are more probable of having above average d than s. 
Identification of distinctive interactions is better with high d than low s. These interactions belong to diverse classes. d is greatest between drugs and analogs prepared for treatment of same class of ailments but with different therapeutic specifications. In contrast, analogs having low s would treat ailments from distinct classes. CONCLUSIONS: Intermittent spikes in d along the axis of c represent canyons in the activity landscape. This new representation accounts for distinctiveness through relative rises in s and d. It provides a mathematical basis for predicting the probability of occurrence of distinctiveness. It identifies the drug pairs at varying levels of distinctiveness and non-distinctiveness. The predicted probability formula is validated even if data approximately satisfy the conditions of its construction. Also, the postulated independence of structure and activity is of little significance to the overall assessment. The difference in distinctive interactions obtained by s and d highlights the importance of studying both of them, and reveals how the choice of measurement can affect the interpretation. The methods in this paper can be used to interpret whether or not drug interactions are distinctive and the probability of their occurrence. Practitioners and researchers can rely on this identification for quantitative modeling and assessment.
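The two activity measures defined in this abstract, similarity (s) as the number of common targets and variation (d) as the number of uncommon targets of a drug pair, reduce to simple set operations. The sketch below assumes that "uncommon" means targets hit by exactly one of the two drugs (the symmetric difference); the function name and the toy target sets are hypothetical.

```python
def activity_similarity_and_variation(targets_a, targets_b):
    """Return (s, d) for a pair of drugs.

    s: number of targets shared by both drugs.
    d: number of targets hit by exactly one of the two drugs
       (assumed reading of "uncommon targets").
    """
    a, b = set(targets_a), set(targets_b)
    s = len(a & b)
    d = len(a ^ b)
    return s, d


# Hypothetical target sets for two chemically similar drugs.
s, d = activity_similarity_and_variation({"A", "B", "C"}, {"B", "C", "D"})
```

In the landscape described above, each pair would then be placed by its chemical similarity c together with s and d; how c is computed is not specified in the abstract, so it is left out here.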

10.
J Cheminform ; 7: 40, 2015.
Article in English | MEDLINE | ID: mdl-26300984

ABSTRACT

BACKGROUND: Predicting novel drug-target associations is important not only for developing new drugs, but also for furthering biological knowledge by understanding how drugs work and their modes of action. As more data about drugs, targets, and their interactions becomes available, computational approaches have become an indispensable part of drug target association discovery. In this paper we apply the random walk with restart (RWR) method to a heterogeneous network of drugs and targets compiled from the DrugBank database and investigate the performance of the method under parameter variation and choice of chemical fingerprint methods. RESULTS: We show that choice of chemical fingerprint does not affect the performance of the method when the parameters are tuned to optimal values. We use a subset of the ChEMBL15 dataset that contains 2,763 associations between 544 drugs and 467 target proteins to evaluate our method, and we extracted datasets of bioactivity ≤1 and ≤10 µM activity cutoff. For 1 µM bioactivity cutoff, we find that our method can correctly predict nearly 47, 55, 60% of the given drug-target interactions in the test dataset having more than 0, 1, 2 drug target relations for ChEMBL 1 µM dataset in top 50 rank positions. For 10 µM bioactivity cutoff, we find that our method can correctly predict nearly 32.4, 34.8, 35.3% of the given drug-target interactions in the test dataset having more than 0, 1, 2 drug target relations for ChEMBL 1 µM dataset in top 50 rank positions. We further examine the associations between 110 popular top selling drugs in 2012 and 3,519 targets and find the top ten targets for each drug. CONCLUSIONS: We demonstrate the effectiveness and promise of the approach, RWR on heterogeneous networks using chemical features, for identifying novel drug target interactions and investigate the performance.
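A minimal sketch of random walk with restart may help make the method concrete. It assumes a single undirected, unweighted graph in which every node has at least one neighbor, which is a simplification of the heterogeneous drug-target network with chemical-similarity edges described in the abstract. The restart probability of 0.7, the function name and the toy graph are all hypothetical choices.

```python
def random_walk_with_restart(adjacency, seed, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until convergence.

    adjacency: dict mapping each node to a list of its neighbors.
    Assumption: every node appears as a key and has at least one neighbor,
    so each node passes its probability mass on in equal shares.
    """
    nodes = sorted(adjacency)
    p0 = {n: (1.0 if n == seed else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart term: a fixed fraction of the walker jumps back to the seed.
        nxt = {n: restart * p0[n] for n in nodes}
        # Walk term: the remaining mass moves to neighbors in equal shares.
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(adjacency[u])
            for v in adjacency[u]:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy graph: two drugs that share one target.
graph = {
    "drugA": ["target1"],
    "target1": ["drugA", "drugB"],
    "drugB": ["target1"],
}
proximity = random_walk_with_restart(graph, "drugA")
```

Ranking the target nodes by their steady-state probability for a given seed drug yields the candidate list; the parameter variation studied in the paper corresponds to changing values such as `restart`.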

11.
PLoS One ; 10(7): e0130796, 2015.
Article in English | MEDLINE | ID: mdl-26177200

ABSTRACT

Phenotypic assays have a proven track record for generating leads that become first-in-class therapies. Whole cell assays that inform on a phenotype or mechanism also possess great potential in drug repositioning studies by illuminating new activities for the existing pharmacopeia. The National Center for Advancing Translational Sciences (NCATS) pharmaceutical collection (NPC) is the largest reported collection of approved small molecule therapeutics that is available for screening in a high-throughput setting. Via a wide-ranging collaborative effort, this library was analyzed in the Open Innovation Drug Discovery (OIDD) phenotypic assay modules publicly offered by Lilly. The results of these tests are publicly available online at www.ncats.nih.gov/expertise/preclinical/pd2 and via the PubChem Database (https://pubchem.ncbi.nlm.nih.gov/) (AID 1117321). Phenotypic outcomes for numerous drugs were confirmed, including sulfonylureas as insulin secretagogues and the anti-angiogenesis actions of multikinase inhibitors sorafenib, axitinib and pazopanib. Several novel outcomes were also noted including the Wnt potentiating activities of rotenone and the antifolate class of drugs, and the anti-angiogenic activity of cetaben.


Subject(s)
Drug Repositioning , Cell Line, Tumor , Drug Approval , Drug Evaluation, Preclinical , High-Throughput Screening Assays , Humans , Inhibitory Concentration 50 , Phenotype , Small Molecule Libraries/pharmacology
13.
J Cheminform ; 5(1): 23, 2013 May 08.
Article in English | MEDLINE | ID: mdl-23657106

ABSTRACT

BACKGROUND: Making data available as Linked Data using Resource Description Framework (RDF) promotes integration with other web resources. RDF documents can natively link to related data, and others can link back using Uniform Resource Identifiers (URIs). RDF makes the data machine-readable and uses extensible vocabularies for additional information, making it easier to scale up inference and data analysis. RESULTS: This paper describes recent developments in an ongoing project converting data from the ChEMBL database into RDF triples. Relative to earlier versions, this updated version of ChEMBL-RDF uses recently introduced ontologies, including CHEMINF and CiTO; exposes more information from the database; and is now available as dereferenceable, linked data. To demonstrate these new features, we present novel use cases showing further integration with other web resources, including Bio2RDF, Chem2Bio2RDF, and ChemSpider, and showing the use of standard ontologies for querying. CONCLUSIONS: We have illustrated the advantages of using open standards and ontologies to link the ChEMBL database to other databases. Using those links and the knowledge encoded in standards and ontologies, the ChEMBL-RDF resource creates a foundation for integrated semantic web cheminformatics applications, such as the presented decision support.

14.
J Lab Autom ; 18(4): 264-8, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23592569

ABSTRACT

Electronic laboratory notebooks (ELNs) are increasingly replacing paper notebooks in life science laboratories, including those in industry, academic settings, and hospitals. ELNs offer significant advantages over paper notebooks, but adopting them in a predominantly paper-based environment is usually disruptive. The benefits of ELN increase when they are integrated with other laboratory informatics tools such as laboratory information management systems, chromatography data systems, analytical instrumentation, and scientific data management systems, but there is no well-established path for effective integration of these tools. In this article, we review and evaluate some of the approaches that have been taken thus far and also some radical new methods of integration that are emerging.


Subject(s)
Automation, Laboratory/instrumentation , Clinical Laboratory Information Systems/instrumentation , Electrical Equipment and Supplies , Minicomputers/statistics & numerical data , Minicomputers/trends , Animals , Electronics, Medical/trends , Humans , Research
15.
J Cheminform ; 5(1): 2, 2013 Jan 14.
Article in English | MEDLINE | ID: mdl-23317154

ABSTRACT

BACKGROUND: Mycobacterium tuberculosis encodes 11 putative serine-threonine protein kinases (STPKs), which regulate transcription, cell development and interaction with the host cells. Of the 11 STPKs, three kinases, namely PknA, PknB and PknG, have been related to mycobacterial growth. Previous studies have shown that PknB is essential for mycobacterial growth, is expressed during the log phase of growth, and phosphorylates substrates involved in peptidoglycan biosynthesis. In recent years many high affinity inhibitors have been reported for PknB. Previous implementations of data fusion have shown effective enrichment of active compounds in both structure and ligand based approaches. In this study we have used three types of data fusion ranking algorithms on the PknB dataset, namely sum rank, sum score and reciprocal rank. We found that the reciprocal rank algorithm is capable of selecting active compounds earlier in a virtual screening process. We have also screened the Asinex database with the reciprocal rank algorithm to identify possible inhibitors for PknB. RESULTS: In our work we have used both structure-based and ligand-based approaches for virtual screening, and have combined their results using a variety of data fusion methods. We found that data fusion increases the chance of actives being ranked highly. Specifically, we found that the ranking of Pharmacophore search, ROCS and Glide XP fused with a reciprocal ranking algorithm not only outperforms structure and ligand based approaches but is also capable of ranking actives better than the other two data fusion methods using the BEDROC, robust initial enhancement (RIE) and AUC metrics. These fused results were used to identify 45 candidate compounds for further experimental validation. CONCLUSION: We show that very different structure and ligand based methods for predicting drug-target interactions can be combined effectively using data fusion, outperforming any single method in ranking of actives. Such fused results show promise for a coherent selection of candidates for biological screening.
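The three fusion rules named in this abstract (sum rank, sum score and reciprocal rank) can be sketched in a few lines. The sketch assumes that every method scores the same set of compounds, that a higher score is better, and that scores are already on comparable scales; the function names and the toy scores are hypothetical and do not come from the paper.

```python
def ranks(scores):
    """Map each compound to its 1-based rank (rank 1 = highest score)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {compound: i + 1 for i, compound in enumerate(ordered)}


def sum_score_fusion(methods):
    """Order compounds by the sum of their scores (highest first)."""
    total = {c: sum(m[c] for m in methods) for c in methods[0]}
    return sorted(total, key=total.get, reverse=True)


def sum_rank_fusion(methods):
    """Order compounds by the sum of their ranks (lowest first)."""
    rank_tables = [ranks(m) for m in methods]
    total = {c: sum(r[c] for r in rank_tables) for c in methods[0]}
    return sorted(total, key=total.get)


def reciprocal_rank_fusion(methods):
    """Order compounds by the sum of 1/rank (highest first)."""
    rank_tables = [ranks(m) for m in methods]
    total = {c: sum(1.0 / r[c] for r in rank_tables) for c in methods[0]}
    return sorted(total, key=total.get, reverse=True)


# Hypothetical scores from three screening methods for three compounds.
methods = [
    {"c1": 0.9, "c2": 0.8, "c3": 0.1},
    {"c1": 0.3, "c2": 0.7, "c3": 0.2},
    {"c1": 0.1, "c2": 0.6, "c3": 0.9},
]
fused = reciprocal_rank_fusion(methods)
```

Reciprocal rank rewards a compound that any single method places near the top, because 1/rank falls off steeply, whereas sum rank treats every position step equally. That difference is one plausible reason a reciprocal rule can retrieve actives earlier, although the abstract itself reports only the empirical result.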

16.
J Lab Autom ; 18(2): 126-36, 2013 Apr.
Article in English | MEDLINE | ID: mdl-22895535

ABSTRACT

There are technologies on the horizon that could dramatically change how informatics organizations design, develop, deliver, and support applications and data infrastructures to deliver maximum value to drug discovery organizations. Effective integration of data and laboratory informatics tools promises the ability of organizations to make better informed decisions about resource allocation during the drug discovery and development process and for more informed decisions to be made with respect to the market opportunity for compounds. We propose in this article a new integration model called ELN-centric laboratory informatics tools integration.


Subject(s)
Clinical Laboratory Information Systems/standards , Drug Discovery , Informatics , Models, Biological
17.
Mol Inform ; 32(11-12): 1000-8, 2013 Dec.
Article in English | MEDLINE | ID: mdl-27481145

ABSTRACT

Effective discovery of new drugs for complex diseases demands an integrative analysis of big data aggregated from diverse sources in chemical and biological domains, to help better understand the mechanism of drug actions and to quickly translate discovery to clinical applications. Conventional approaches are confronting critical challenges in the integration of those huge heterogeneous datasets and the rapid transformation from data to knowledge. Semantic technologies, aimed at facilitating the building of a common framework that allows data sharing and utilization across applications and domains in the web, have been developed quickly and have been exhibiting a broad impact in life science. Chemogenomics serves as a bridge to connect various chemical and biological data, thus building a semantic framework for chemogenomics research could not only facilitate the development of this field but also advance the intersection among other domains. During the last few years, such a framework has been developed and applied in addressing real problems. In the review, we will describe the major techniques needed to build a semantic framework, and will discuss the challenges of having such a framework make a broader impact.

18.
PLoS One ; 7(12): e51018, 2012.
Article in English | MEDLINE | ID: mdl-23227228

ABSTRACT

Associative classification mining (ACM) can be used to provide predictive models with high accuracy as well as interpretability. However, traditional ACM ignores the difference of significances among the features used for mining. Although weighted associative classification mining (WACM) addresses this issue by assigning different weights to features, most implementations can only be utilized when pre-assigned weights are available. In this paper, we propose a link-based approach to automatically derive weight information from a dataset using link-based models which treat the dataset as a bipartite model. By combining this link-based feature weighting method with a traditional ACM method-classification based on associations (CBA), a Link-based Associative Classifier (LAC) is developed. We then demonstrate the application of LAC to biomedical datasets for association discovery between chemical compounds and bioactivities or diseases. The results indicate that the novel link-based weighting method is comparable to support vector machine (SVM) and RELIEF method, and is capable of capturing significant features. Additionally, LAC is shown to produce models with high accuracies and discover interesting associations which may otherwise remain unrevealed by traditional ACM.


Subject(s)
Algorithms , Databases as Topic , Cell Line, Tumor , Humans , Models, Biological , Mutagenicity Tests
19.
J Cheminform ; 4(1): 29, 2012 Nov 23.
Article in English | MEDLINE | ID: mdl-23176548

ABSTRACT

Relating chemical features to bioactivities is critical in molecular design and is used extensively in the lead discovery and optimization process. A variety of techniques from statistics, data mining and machine learning have been applied to this process. In this study, we utilize a collection of methods, called associative classification mining (ACM), which are popular in the data mining community, but so far have not been applied widely in cheminformatics. More specifically, classification based on predictive association rules (CPAR), classification based on multiple association rules (CMAR) and classification based on association rules (CBA) are employed on three datasets using various descriptor sets. Experimental evaluations on anti-tuberculosis (antiTB), mutagenicity and hERG (the human Ether-a-go-go-Related Gene) blocker datasets show that these three methods are computationally scalable and appropriate for high speed mining. Additionally, they provide comparable accuracy and efficiency to the commonly used Bayesian and support vector machines (SVM) methods, and produce highly interpretable models.
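Associative classifiers such as CBA, CMAR and CPAR all build on class association rules of the form "feature set implies class label", which are typically filtered by support and confidence. The sketch below computes only those two quantities for a single candidate rule; it does not implement any of the three algorithms, and the function name, feature names and toy records are hypothetical.

```python
def rule_support_confidence(records, antecedent, label):
    """Support and confidence of the rule: antecedent -> label.

    records: list of (feature_set, class_label) pairs.
    support: fraction of all records that contain the antecedent
             and carry the label.
    confidence: fraction of records containing the antecedent
                that carry the label.
    """
    antecedent = frozenset(antecedent)
    matching_labels = [lbl for features, lbl in records if antecedent <= set(features)]
    n_both = sum(1 for lbl in matching_labels if lbl == label)
    support = n_both / len(records) if records else 0.0
    confidence = n_both / len(matching_labels) if matching_labels else 0.0
    return support, confidence


# Hypothetical compounds described by substructure flags and an activity label.
records = [
    ({"nitro", "ring"}, "active"),
    ({"nitro"}, "active"),
    ({"nitro", "ring"}, "inactive"),
    ({"ring"}, "inactive"),
]
support, confidence = rule_support_confidence(records, {"nitro"}, "active")
```

Because each retained rule is a readable statement about chemical features and an outcome, a classifier assembled from such rules stays interpretable, which is the property the abstract highlights.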

20.
PLoS Comput Biol ; 8(7): e1002574, 2012.
Article in English | MEDLINE | ID: mdl-22859915

ABSTRACT

The rapidly increasing amount of public data in chemistry and biology provides new opportunities for large-scale data mining for drug discovery. Systematic integration of these heterogeneous sets and provision of algorithms to data mine the integrated sets would permit investigation of complex mechanisms of action of drugs. In this work we integrated and annotated data from public datasets relating to drugs, chemical compounds, protein targets, diseases, side effects and pathways, building a semantic linked network consisting of over 290,000 nodes and 720,000 edges. We developed a statistical model to assess the association of drug target pairs based on their relation with other linked objects. Validation experiments demonstrate the model can correctly identify known direct drug target pairs with high precision. Indirect drug target pairs (for example drugs which change gene expression level) are also identified but not as strongly as direct pairs. We further calculated the association scores for 157 drugs from 10 disease areas against 1683 human targets, and measured their similarity using a [Formula: see text] score matrix. The similarity network indicates that drugs from the same disease area tend to cluster together in ways that are not captured by structural similarity, with several potential new drug pairings being identified. This work thus provides a novel, validated alternative to existing drug target prediction algorithms. The web service is freely available at: http://chem2bio2rdf.org/slap.


Subject(s)
Data Mining/methods , Databases, Factual , Drug Discovery/methods , Semantics , Algorithms , Computational Biology/methods , Humans , Models, Theoretical , Reproducibility of Results