Results 1 - 20 of 73
1.
J Biomed Semantics ; 9(1): 25, 2018 12 27.
Article in English | MEDLINE | ID: mdl-30587224

ABSTRACT

BACKGROUND: Structured electronic health records are a rich resource for identifying novel correlations, such as co-morbidities and adverse drug reactions. For drug development and better understanding of biomedical phenomena, such correlations need to be supported by viable hypotheses about the mechanisms involved, which can then form the basis of experimental investigations. METHODS: In this study, we demonstrate the use of discovery browsing, a literature-based discovery method, to generate plausible hypotheses elucidating correlations identified from structured clinical data. The method is supported by the Semantic MEDLINE web application, which pinpoints interesting concepts and relevant MEDLINE citations that are then used to build a coherent hypothesis. RESULTS: Discovery browsing revealed a plausible explanation for the correlation between epilepsy and inflammatory bowel disease that was found in an earlier population study. The generated hypothesis involves interleukin-1 beta (IL-1 beta) and glutamate, and suggests that the influence of IL-1 beta on glutamate levels is involved in the etiology of both epilepsy and inflammatory bowel disease. CONCLUSIONS: The approach presented in this paper can supplement population-based correlation studies by enabling the scientist to identify literature that may justify the novel patterns identified in such studies and can underpin basic biomedical research that can lead to improved treatments and better healthcare outcomes.


Subjects
Data Mining , Epilepsy/metabolism , Glutamic Acid/metabolism , Inflammatory Bowel Diseases/metabolism , Interleukin-1beta/metabolism , Brain/metabolism , Humans , MEDLINE , Semantics
2.
ILAR J ; 58(1): 80-89, 2017 07 01.
Article in English | MEDLINE | ID: mdl-28838071

ABSTRACT

Informatics methodologies exploit computer-assisted techniques to help biomedical researchers manage large amounts of information. In this paper, we focus on the biomedical research literature (MEDLINE). We first provide an overview of some text mining techniques that offer assistance in research by identifying biomedical entities (e.g., genes, substances, and diseases) and relations between them in text. We then discuss Semantic MEDLINE, an application that integrates PubMed document retrieval, concept and relation identification, and visualization, thus enabling a user to explore concepts and relations from within a set of retrieved citations. Semantic MEDLINE provides a roadmap through content and helps users discern patterns in large numbers of retrieved citations. We illustrate its use with an informatics method we call "discovery browsing," which provides a principled way of navigating through selected aspects of some biomedical research area. The method supports an iterative process that accommodates learning and hypothesis formation in which a user is provided with high-level connections before delving into details. As a use case, we examine current developments in basic research on mechanisms of Alzheimer's disease. Out of the nearly 90 000 citations returned by the PubMed query "Alzheimer's disease," discovery browsing led us to 73 citations on sortilin and that disorder. We provide a synopsis of the basic research reported in 15 of these. There is widespread consensus among researchers working with a range of animal models and human cells that increased sortilin expression and decreased receptor expression are associated with amyloid beta and/or amyloid precursor protein.


Subjects
Data Mining/methods , Information Storage and Retrieval , MEDLINE , Humans , Semantics
3.
PLoS One ; 12(7): e0179926, 2017.
Article in English | MEDLINE | ID: mdl-28678823

ABSTRACT

Biomedical knowledge claims are often expressed as hypotheses, speculations, or opinions, rather than explicit facts (propositions). Much biomedical text mining has focused on extracting propositions from biomedical literature. One such system is SemRep, which extracts propositional content in the form of subject-predicate-object triples called predications. In this study, we investigated the feasibility of assessing the factuality level of SemRep predications to provide more nuanced distinctions between predications for downstream applications. We annotated semantic predications extracted from 500 PubMed abstracts with seven factuality values (fact, probable, possible, doubtful, counterfact, uncommitted, and conditional). We extended a rule-based, compositional approach that uses lexical and syntactic information to predict factuality levels. We compared this approach to a supervised machine learning method that uses a rich feature set based on the annotated corpus. Our results indicate that the compositional approach is more effective than the machine learning method in recognizing the factuality values of predications. The annotated corpus as well as the source code and binaries for factuality assignment are publicly available. We will also incorporate the results of the better performing compositional approach into SemMedDB, a PubMed-scale repository of semantic predications extracted using SemRep.


Subjects
Biomedical Research , Data Mining , Humans , Machine Learning , Natural Language Processing , Publications , Semantics
4.
Methods Inf Med ; 55(4): 340-6, 2016 Aug 05.
Article in English | MEDLINE | ID: mdl-27435341

ABSTRACT

OBJECTIVES: Literature-based discovery (LBD) is a text mining methodology for automatically generating research hypotheses from existing knowledge. We mimic the process of LBD as a classification problem on a graph of MeSH terms. We employ unsupervised and supervised link prediction methods for predicting previously unknown connections between biomedical concepts. METHODS: We evaluate the effectiveness of link prediction through a series of experiments using a MeSH network that contains the history of link formation between biomedical concepts. We performed link prediction using proximity measures, such as common neighbor (CN), Jaccard coefficient (JC), Adamic/Adar index (AA) and preferential attachment (PA). Our approach relies on the assumption that similar nodes are more likely to establish a link in the future. RESULTS: Applying an unsupervised approach, the AA measure achieved the best performance in terms of area under the ROC curve (AUC = 0.76), followed by CN, JC, and PA. In a supervised approach, we evaluate whether proximity measures can be combined to define a model of link formation across all four predictors. We applied various classifiers, including decision trees, k-nearest neighbors, logistic regression, multilayer perceptron, naïve Bayes, and random forests. The random forest classifier achieved the best performance (AUC = 0.87). CONCLUSIONS: The link prediction approach proved to be effective for LBD processing. Supervised statistical learning approaches clearly outperform an unsupervised approach to link prediction.


Subjects
Data Mining , Knowledge Discovery , Medical Subject Headings , Algorithms , Area Under Curve
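
The four proximity measures named in this abstract (CN, JC, AA, PA) can be illustrated with a minimal Python sketch. It uses the standard textbook definitions of these measures, which may differ in detail from what the authors implemented; the function name `proximity_scores` and the toy graph are hypothetical and are not taken from the paper.

```python
import math

def proximity_scores(graph, u, v):
    """Score a candidate link (u, v) in an undirected graph.

    `graph` maps each node to the set of its neighbours.
    Standard definitions are assumed, not the authors' exact code.
    """
    nu, nv = graph[u], graph[v]
    common = nu & nv
    union = nu | nv
    return {
        # CN: number of shared neighbours
        "CN": len(common),
        # JC: shared neighbours divided by all neighbours of either node
        "JC": len(common) / len(union) if union else 0.0,
        # AA: shared neighbours weighted by 1 / log(degree), so rare hubs count more
        "AA": sum(1.0 / math.log(len(graph[z])) for z in common if len(graph[z]) > 1),
        # PA: product of the two node degrees
        "PA": len(nu) * len(nv),
    }

# Toy co-occurrence graph (invented for illustration only)
toy = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"B", "C"},
}
scores = proximity_scores(toy, "A", "D")
print(scores)
```

In a supervised setting such as the one the abstract describes, the four scores would serve as a feature vector for each candidate pair; the classifier itself is outside the scope of this sketch.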
5.
J Med Syst ; 40(8): 185, 2016 Aug.
Article in English | MEDLINE | ID: mdl-27318993

ABSTRACT

We report on our research in using literature-based discovery (LBD) to provide pharmacological and/or pharmacogenomic explanations for reported adverse drug effects. The goal of LBD is to generate novel and potentially useful hypotheses by analyzing the scientific literature and optionally some additional resources. Our assumption is that drugs have effects on some genes or proteins and that these genes or proteins are associated with the observed adverse effects. Therefore, by using LBD we try to find genes or proteins that link the drugs with the reported adverse effects. These genes or proteins can be used to provide insight into the processes causing the adverse effects. Initial results show that our method has the potential to assist in explaining reported adverse drug effects.


Subjects
Data Mining/methods , Drug-Related Side Effects and Adverse Reactions/epidemiology , Drug-Related Side Effects and Adverse Reactions/genetics , Pharmacogenetics/methods , Pharmacovigilance , Humans
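
The assumption stated in this abstract (a drug affects some genes or proteins, and those genes or proteins are associated with the adverse effect) amounts to finding intermediate nodes between two fixed endpoints. A minimal Python sketch follows; the function name `linking_genes`, the dictionary layout, and all identifiers in the toy data are hypothetical placeholders, not data or code from the study.

```python
def linking_genes(drug_gene, gene_effect, drug, effect):
    """Return genes that the drug affects and that are associated with the effect.

    drug_gene:   maps a drug to the set of genes/proteins it affects
    gene_effect: maps a gene/protein to the set of adverse effects it is linked to
    Both mappings are assumed to come from literature-derived relations.
    """
    affected = drug_gene.get(drug, set())
    return sorted(g for g in affected if effect in gene_effect.get(g, set()))

# Invented placeholder data, for illustration only
drug_gene = {"drug_X": {"GENE_1", "GENE_2"}}
gene_effect = {"GENE_2": {"effect_Y"}, "GENE_3": {"effect_Y"}}

links = linking_genes(drug_gene, gene_effect, "drug_X", "effect_Y")
print(links)
```

Only GENE_2 is returned here, because it is the only gene that is both affected by the drug and linked to the effect. A real system would also rank the candidates, which this sketch does not attempt.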
6.
BMC Bioinformatics ; 17: 163, 2016 Apr 14.
Article in English | MEDLINE | ID: mdl-27080229

ABSTRACT

BACKGROUND: Entity coreference is common in biomedical literature and it can affect text understanding systems that rely on accurate identification of named entities, such as relation extraction and automatic summarization. Coreference resolution is a foundational yet challenging natural language processing task which, if performed successfully, is likely to enhance such systems significantly. In this paper, we propose a semantically oriented, rule-based method to resolve sortal anaphora, a specific type of coreference that forms the majority of coreference instances in biomedical literature. The method addresses all entity types and relies on linguistic components of SemRep, a broad-coverage biomedical relation extraction system. It has been incorporated into SemRep, extending its core semantic interpretation capability from sentence level to discourse level. RESULTS: We evaluated our sortal anaphora resolution method in several ways. The first evaluation specifically focused on sortal anaphora relations. Our methodology achieved an F1 score of 59.6 on the test portion of a manually annotated corpus of 320 Medline abstracts, a 4-fold improvement over the baseline method. Investigating the impact of sortal anaphora resolution on relation extraction, we found that the overall effect was positive, with 50% of the changes involving uninformative relations being replaced by more specific and informative ones, while 35% of the changes had no effect, and only 15% were negative. We estimate that anaphora resolution results in changes in about 1.5% of approximately 82 million semantic relations extracted from the entire PubMed. CONCLUSIONS: Our results demonstrate that a heavily semantic approach to sortal anaphora resolution is largely effective for biomedical literature. Our evaluation and error analysis highlight some areas for further improvements, such as coordination processing and intra-sentential antecedent selection.


Subjects
Biological Ontologies , Databases, Factual , Natural Language Processing , Linguistics , Semantics
7.
J Biomed Inform ; 60: 23-37, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26732995

ABSTRACT

Findings from information-seeking behavior research can inform application development. In this report we provide a system description of Spark, an application based on findings from Serendipitous Knowledge Discovery studies and data structures known as semantic predications. Background information and the previously published IF-SKD model (outlining Serendipitous Knowledge Discovery in online environments) illustrate the potential use of information-seeking behavior in application design. A detailed overview of the Spark system illustrates how methodologies in design and retrieval functionality enable production of semantic predication graphs tailored to evoke Serendipitous Knowledge Discovery in users.


Subjects
Information Seeking Behavior , Knowledge Bases , Medical Informatics Applications , Software , Internet , Models, Theoretical , PubMed , Semantics , User-Computer Interface
8.
AMIA Annu Symp Proc ; 2016: 1238-1247, 2016.
Article in English | MEDLINE | ID: mdl-28269921

ABSTRACT

Words which have different representations but are semantically related, such as dementia and delirium, can pose difficult issues in understanding text. We explore the use of interaction frequency data between semantic elements as a means to differentiate concept pairs, using semantic predications extracted from the biomedical literature. We applied datasets of features drawn from semantic predications for semantically related pairs to two Expectation Maximization clustering processes (without, and with concept labels), then used all data to train and evaluate several concept classifying algorithms. For the unlabeled datasets, 80% displayed expected cluster count and similar or matching proportions; all labeled data exhibited similar or matching proportions when restricting cluster count to unique labels. The highest performing classifier achieved 89% accuracy, with F1 scores for individual concept classification ranging from 0.69 to 1. We conclude with a discussion on how these findings may be applied to natural language processing of clinical text.


Subjects
Algorithms , Natural Language Processing , Semantics , Terminology as Topic , Cluster Analysis , Humans , Software
9.
Stud Health Technol Inform ; 216: 1094, 2015.
Article in English | MEDLINE | ID: mdl-26262393

ABSTRACT

Literature-based discovery (LBD) generates discoveries, or hypotheses, by combining what is already known in the literature. Potential discoveries have the form of relations between biomedical concepts; for example, a drug may be determined to treat a disease other than the one for which it was intended. LBD views the knowledge in a domain as a network; a set of concepts along with the relations between them. As a starting point, we used SemMedDB, a database of semantic relations between biomedical concepts extracted with SemRep from Medline. SemMedDB is distributed as a MySQL relational database, which has some problems when dealing with network data. We transformed and uploaded SemMedDB into the Neo4j graph database, and implemented the basic LBD discovery algorithms with the Cypher query language. We conclude that storing the data needed for semantic LBD is more natural in a graph database. Also, implementing LBD discovery algorithms is conceptually simpler with a graph query language when compared with standard SQL.


Subjects
Data Mining/methods , Databases, Factual , Natural Language Processing , Periodicals as Topic , Terminology as Topic , Vocabulary, Controlled , Database Management Systems , Machine Learning , Semantics
10.
J Biomed Semantics ; 6: 25, 2015.
Article in English | MEDLINE | ID: mdl-25992264

ABSTRACT

OBJECTIVE: Mild traumatic brain injury (mTBI) has high prevalence in the military, among athletes, and in the general population worldwide (largely due to falls). Consequences can include a range of neuropsychological disorders. Unfortunately, such neural injury often goes undiagnosed due to the difficulty in identifying symptoms, so the discovery of an effective biomarker would greatly assist diagnosis; however, no single biomarker has been identified. We identify several body substances as potential components of a panel of biomarkers to support the diagnosis of mild traumatic brain injury. METHODS: Our approach to diagnostic biomarker discovery combines ideas and techniques from systems medicine, natural language processing, and graph theory. We create a molecular interaction network that represents neural injury and is composed of relationships automatically extracted from the literature. We retrieve citations related to neurological injury and extract relationships (semantic predications) that contain potential biomarkers. After linking all relationships together to create a network representing neural injury, we filter the network by relationship frequency and concept connectivity to reduce the set to a manageable size of higher interest substances. RESULTS: 99,437 relevant citations yielded 26,441 unique relations. 18,085 of these contained a potential biomarker as subject or object with a total of 6246 unique concepts. After filtering by graph metrics, the set was reduced to 1021 relationships with 49 unique concepts, including 17 potential biomarkers. CONCLUSION: We created a network of relationships containing substances derived from 99,437 citations and filtered using graph metrics to provide a set of 17 potential biomarkers. We discuss the interaction of several of these (glutamate, glucose, and lactate) as the basis for more effective diagnosis than is currently possible. 
This method provides an opportunity to focus the effort of wet bench research on those substances with the highest potential as biomarkers for mTBI.
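
The filtering step described in this abstract (reducing the network by relationship frequency and concept connectivity) can be sketched in Python as follows. The abstract does not state which graph metrics or thresholds were used, so `min_freq`, `min_degree`, the single-pass strategy, and the toy triples are all assumptions made for illustration.

```python
from collections import Counter

def filter_network(predications, min_freq, min_degree):
    """Keep relations that occur often enough and join well-connected concepts.

    predications: list of (subject, predicate, object) triples, one per extraction,
                  so repeated triples indicate repeated mentions.
    The thresholds are hypothetical; the paper's actual metrics are not given here.
    """
    # Step 1: relationship frequency, counted over repeated extractions
    freq = Counter(predications)
    frequent = {triple for triple, n in freq.items() if n >= min_freq}

    # Step 2: concept connectivity, measured as degree among the frequent relations
    degree = Counter()
    for subj, _, obj in frequent:
        degree[subj] += 1
        degree[obj] += 1

    # Single pass only: degrees are not recomputed after removal
    return {
        (s, p, o) for s, p, o in frequent
        if degree[s] >= min_degree and degree[o] >= min_degree
    }

# Invented toy extractions, for illustration only
toy = (
    [("substance_a", "AFFECTS", "cell_x")] * 3
    + [("substance_b", "AFFECTS", "cell_x")] * 2
    + [("substance_a", "INTERACTS_WITH", "substance_b")] * 2
    + [("substance_c", "AFFECTS", "cell_y")]
)
kept = filter_network(toy, min_freq=2, min_degree=2)
print(sorted(kept))
```

The rarely mentioned `substance_c` relation is dropped at the frequency step, leaving three relations among three well-connected concepts.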

11.
J Biomed Inform ; 54: 141-57, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25661592

ABSTRACT

BACKGROUND: Literature-based discovery (LBD) is characterized by uncovering hidden associations in non-interacting scientific literature. Prior approaches to LBD include use of: (1) domain expertise and structured background knowledge to manually filter and explore the literature, (2) distributional statistics and graph-theoretic measures to rank interesting connections, and (3) heuristics to help eliminate spurious connections. However, manual approaches to LBD are not scalable and purely distributional approaches may not be sufficient to obtain insights into the meaning of poorly understood associations. While several graph-based approaches have the potential to elucidate associations, their effectiveness has not been fully demonstrated. A considerable degree of a priori knowledge, heuristics, and manual filtering is still required. OBJECTIVES: In this paper we implement and evaluate a context-driven, automatic subgraph creation method that captures multifaceted complex associations between biomedical concepts to facilitate LBD. Given a pair of concepts, our method automatically generates a ranked list of subgraphs, which provide informative and potentially unknown associations between such concepts. METHODS: To generate subgraphs, the set of all MEDLINE articles that contain either of the two specified concepts (A, C) are first collected. Then binary relationships or assertions, which are automatically extracted from the MEDLINE articles, called semantic predications, are used to create a labeled directed predications graph. In this predications graph, a path is represented as a sequence of semantic predications. The hierarchical agglomerative clustering (HAC) algorithm is then applied to cluster paths that are bounded by the two concepts (A, C). HAC relies on implicit semantics captured through Medical Subject Heading (MeSH) descriptors, and explicit semantics from the MeSH hierarchy, for clustering. 
Paths that exceed a threshold of semantic relatedness are clustered into subgraphs based on their shared context. Finally, the automatically generated clusters are provided as a ranked list of subgraphs. RESULTS: The subgraphs generated using this approach facilitated the rediscovery of 8 out of 9 existing scientific discoveries. In particular, they directly (or indirectly) led to the recovery of several intermediates (or B-concepts) between A- and C-terms, while also providing insights into the meaning of the associations. Such meaning is derived from predicates between the concepts, as well as the provenance of the semantic predications in MEDLINE. Additionally, by generating subgraphs on different thematic dimensions (such as Cellular Activity, Pharmaceutical Treatment and Tissue Function), the approach may enable a broader understanding of the nature of complex associations between concepts. Finally, in a statistical evaluation to determine the interestingness of the subgraphs, it was observed that an arbitrary association is mentioned in only approximately 4 articles in MEDLINE on average. CONCLUSION: These results suggest that leveraging the implicit and explicit semantics provided by manually assigned MeSH descriptors is an effective representation for capturing the underlying context of complex associations, along multiple thematic dimensions in LBD situations.


Subjects
Cluster Analysis , Data Mining/methods , Knowledge Discovery/methods , Algorithms , Databases, Factual , Humans , Medical Subject Headings , Models, Theoretical , Semantics
12.
BMC Bioinformatics ; 16: 6, 2015 Jan 16.
Article in English | MEDLINE | ID: mdl-25592675

ABSTRACT

BACKGROUND: The proliferation of the scientific literature in the field of biomedicine makes it difficult to keep abreast of current knowledge, even for domain experts. While general Web search engines and specialized information retrieval (IR) systems have made important strides in recent decades, the problem of accurate knowledge extraction from the biomedical literature is far from solved. Classical IR systems usually return a list of documents that have to be read by the user to extract relevant information. This tedious and time-consuming work can be lessened with automatic Question Answering (QA) systems, which aim to provide users with direct and precise answers to their questions. In this work we propose a novel methodology for QA based on semantic relations extracted from the biomedical literature. RESULTS: We extracted semantic relations with the SemRep natural language processing system from 122,421,765 sentences, which came from 21,014,382 MEDLINE citations (i.e., the complete MEDLINE distribution up to the end of 2012). A total of 58,879,300 semantic relation instances were extracted and organized in a relational database. The QA process is implemented as a search in this database, which is accessed through a Web-based application, called SemBT (available at http://sembt.mf.uni-lj.si ). We conducted an extensive evaluation of the proposed methodology in order to estimate the accuracy of extracting a particular semantic relation from a particular sentence. Evaluation was performed by 80 domain experts. In total 7,510 semantic relation instances belonging to 2,675 distinct relations were evaluated 12,083 times. The instances were evaluated as correct 8,228 times (68%). CONCLUSIONS: In this work we propose an innovative methodology for biomedical QA. The system is implemented as a Web-based application that is able to provide precise answers to a wide range of questions. A typical question is answered within a few seconds. 
The tool has some extensions that make it especially useful for interpretation of DNA microarray results.


Subjects
Abstracting and Indexing , Algorithms , Information Storage and Retrieval , Natural Language Processing , Oligonucleotide Array Sequence Analysis , Semantics , Software , Databases, Factual , Humans , Pharmacogenetics
13.
Cancer Inform ; 13(Suppl 1): 103-11, 2014.
Article in English | MEDLINE | ID: mdl-25392688

ABSTRACT

In this study, we report on the performance of an automated approach to discovery of potential prostate cancer drugs from the biomedical literature. We used the semantic relationships in SemMedDB, a database of structured knowledge extracted from all MEDLINE citations using SemRep, to extract potential relationships using knowledge of cancer drug pathways. Two cancer drug pathway schemas were constructed using these relationships extracted from SemMedDB. Through both pathway schemas, we found drugs already used for prostate cancer therapy and drugs not currently listed as prostate cancer medications. Our study demonstrates that the appropriate linking of relevant structured semantic relationships stored in SemMedDB can support the discovery of potential prostate cancer drugs.

14.
Stud Health Technol Inform ; 205: 579-83, 2014.
Article in English | MEDLINE | ID: mdl-25160252

ABSTRACT

Literature-based discovery (LBD) refers to automatic discovery of implicit relations from the scientific literature. Co-occurrence associations between biomedical concepts are commonly used in LBD. These co-occurrences can be represented as a network that consists of a set of nodes representing concepts and a set of edges representing their relationships (or links). In this paper we propose and evaluate a methodology for link prediction of implicit connections in a network of co-occurring Medical Subject Headings (MeSH®). The proposed approach is complementary to, and may augment, existing LBD methods. Link prediction was performed using Jaccard and Adamic-Adar similarity measures. The preliminary results showed high prediction performance, with area under the ROC curve of 0.78 and 0.82 for the two similarity measures, respectively.


Subjects
Artificial Intelligence , MEDLINE , Medical Subject Headings , Natural Language Processing , Pattern Recognition, Automated/methods , Periodicals as Topic , Terminology as Topic , Pilot Projects , Semantics
15.
J Biomed Inform ; 52: 293-310, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25046831

ABSTRACT

Pharmacovigilance involves continually monitoring drug safety after drugs are put to market. To aid this process, algorithms for the identification of strongly correlated drug/adverse drug reaction (ADR) pairs from data sources such as adverse event reporting systems or Electronic Health Records have been developed. These methods are generally statistical in nature, and do not draw upon the large volumes of knowledge embedded in the biomedical literature. In this paper, we investigate the ability of scalable Literature Based Discovery (LBD) methods to identify side effects of pharmaceutical agents. The advantage of LBD methods is that they can provide evidence from the literature to support the plausibility of a drug/ADR association, thereby assisting human review to validate the signal, which is an essential component of pharmacovigilance. To do so, we draw upon vast repositories of knowledge that has been extracted from the biomedical literature by two Natural Language Processing tools, MetaMap and SemRep. We evaluate two LBD methods that scale comfortably to the volume of knowledge available in these repositories. Specifically, we evaluate Reflective Random Indexing (RRI), a model based on concept-level co-occurrence, and Predication-based Semantic Indexing (PSI), a model that encodes the nature of the relationship between concepts to support reasoning analogically about drug-effect relationships. An evaluation set was constructed from the Side Effect Resource 2 (SIDER2), which contains known drug/ADR relations, and models were evaluated for their ability to "rediscover" these relations. In this paper, we demonstrate that both RRI and PSI can recover known drug-adverse event associations. However, PSI performed better overall, and has the additional advantage of being able to recover the literature underlying the reasoning pathways it used to make its predictions.


Subjects
Data Mining/methods , Drug-Related Side Effects and Adverse Reactions , Natural Language Processing , Semantics , Adverse Drug Reaction Reporting Systems , Algorithms , Biomedical Research , Humans , MEDLINE , ROC Curve
16.
PLoS One ; 9(7): e102188, 2014.
Article in English | MEDLINE | ID: mdl-25006672

ABSTRACT

Concept associations can be represented by a network that consists of a set of nodes representing concepts and a set of edges representing their relationships. Complex networks exhibit some common topological features including small diameter, high degree of clustering, power-law degree distribution, and modularity. We investigated the topological properties of a network constructed from co-occurrences between MeSH descriptors in the MEDLINE database. We conducted the analysis on two networks, one constructed from all MeSH descriptors and another using only major descriptors. Network reduction was performed using Pearson's chi-square test for independence. To characterize topological properties of the network we adopted some specific measures, including diameter, average path length, clustering coefficient, and degree distribution. For the full MeSH network the average path length was 1.95 with a diameter of three edges and clustering coefficient of 0.26. The Kolmogorov-Smirnov test rejected the power law as a plausible model for degree distribution. For the major MeSH network the average path length was 2.63 edges with a diameter of seven edges and clustering coefficient of 0.15. The Kolmogorov-Smirnov test failed to reject the power law as a plausible model. The power-law exponent was 5.07. In both networks it was evident that nodes with a lower degree exhibit higher clustering than those with a higher degree. After simulated attack, where we removed 10% of nodes with the highest degrees, the giant component of each of the two networks contained about 90% of all nodes. Because of small average path length and high degree of clustering the MeSH network is small-world. A power-law distribution is not a plausible model for the degree distribution. The network is highly modular, highly resistant to targeted and random attack and with minimal disassortativity.


Subjects
Computational Biology/methods , Medical Subject Headings , Algorithms , Humans , Models, Statistical , Principal Component Analysis
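
Three of the topological measures reported in this abstract (average path length, diameter, and clustering coefficient) can be computed on a small graph with the Python sketch below. It assumes an undirected, connected graph and uses the usual local clustering coefficient averaged over all nodes; the authors' exact definitions and tooling are not stated in the abstract, and the toy graph is invented.

```python
from collections import deque
from itertools import combinations

def bfs_distances(graph, source):
    """Shortest-path lengths (in edges) from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def path_stats(graph):
    """Average shortest path length and diameter; assumes a connected graph."""
    lengths = []
    for source in graph:
        distances = bfs_distances(graph, source)
        lengths.extend(d for node, d in distances.items() if node != source)
    return sum(lengths) / len(lengths), max(lengths)

def average_clustering(graph):
    """Mean local clustering coefficient; nodes with degree < 2 contribute 0."""
    total = 0.0
    for neighbours in graph.values():
        k = len(neighbours)
        if k < 2:
            continue
        # Count edges that exist among this node's neighbours
        links = sum(1 for a, b in combinations(neighbours, 2) if b in graph[a])
        total += 2 * links / (k * (k - 1))
    return total / len(graph)

# Toy graph: a triangle A-B-C with a pendant node D attached to C
toy = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
avg_len, diameter = path_stats(toy)
clustering = average_clustering(toy)
print(avg_len, diameter, clustering)
```

A short average path length combined with high clustering is the informal small-world signature the abstract refers to; a proper assessment would compare both values against a random graph of the same size, which this sketch omits.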
17.
PLoS Comput Biol ; 10(6): e1003666, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24921649

ABSTRACT

Gene regulatory networks are a crucial aspect of systems biology in describing molecular mechanisms of the cell. Various computational models rely on random gene selection to infer such networks from microarray data. While incorporation of prior knowledge into data analysis has been deemed important, in practice, it has generally been limited to referencing genes in probe sets and using curated knowledge bases. We investigate the impact of augmenting microarray data with semantic relations automatically extracted from the literature, with the view that relations encoding gene/protein interactions eliminate the need for random selection of components in non-exhaustive approaches, producing a more accurate model of cellular behavior. A genetic algorithm is then used to optimize the strength of interactions using microarray data and an artificial neural network fitness function. The result is a directed and weighted network providing the individual contribution of each gene to its target. For testing, we used invasive ductile carcinoma of the breast to query the literature and a microarray set containing gene expression changes in these cells over several time points. Our model demonstrates significantly better fitness than the state-of-the-art model, which relies on an initial random selection of genes. Comparison to the component pathways of the KEGG Pathways in Cancer map reveals that the resulting networks contain both known and novel relationships. The p53 pathway results were manually validated in the literature. 60% of non-KEGG relationships were supported (74% for highly weighted interactions). The method was then applied to yeast data and our model again outperformed the comparison model. Our results demonstrate the advantage of combining gene interactions extracted from the literature in the form of semantic relations with microarray analysis in generating contribution-weighted gene regulatory networks. 
This methodology can make a significant contribution to understanding the complex interactions involved in cellular behavior and molecular physiology.


Subjects
Computational Biology/methods , Data Mining/methods , Gene Expression Profiling/methods , Gene Regulatory Networks , Knowledge Bases , Databases, Factual , Oligonucleotide Array Sequence Analysis
18.
J Biomed Inform ; 49: 134-47, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24448204

ABSTRACT

In this study we report on potential drug-drug interactions between drugs occurring in patient clinical data. Results are based on relationships in SemMedDB, a database of structured knowledge extracted from all MEDLINE citations (titles and abstracts) using SemRep. The core of our methodology is to construct two potential drug-drug interaction schemas, based on relationships extracted from SemMedDB. In the first schema, Drug1 and Drug2 interact through Drug1's effect on some gene, which in turn affects Drug2. In the second, Drug1 affects Gene1, while Drug2 affects Gene2. Gene1 and Gene2, together, then have an effect on some biological function. After checking each drug pair from the medication lists of each of 22 patients, we found 19 known and 62 unknown drug-drug interactions using both schemas. For example, our results suggest that the interaction of Lisinopril, an ACE inhibitor commonly prescribed for hypertension, and the antidepressant sertraline can potentially increase the likelihood and possibly the severity of psoriasis. We also assessed the relationships extracted by SemRep from a linguistic perspective and found that the precision of SemRep was 0.58 for 300 randomly selected sentences from MEDLINE. Our study demonstrates that the use of structured knowledge in the form of relationships from the biomedical literature can support the discovery of potential drug-drug interactions occurring in patient clinical data. Moreover, SemMedDB provides a good knowledge resource for expanding the range of drugs, genes, and biological functions considered as elements in various drug-drug interaction pathways.


Subjects
Drug Interactions , Semantics , Angiotensin-Converting Enzyme Inhibitors/administration & dosage , Angiotensin-Converting Enzyme Inhibitors/adverse effects , Humans , Lisinopril/administration & dosage , Lisinopril/adverse effects , Selective Serotonin Reuptake Inhibitors/administration & dosage , Selective Serotonin Reuptake Inhibitors/adverse effects , Sertraline/administration & dosage , Sertraline/adverse effects
19.
AMIA Annu Symp Proc ; 2014: 442-8, 2014.
Article in English | MEDLINE | ID: mdl-25954348

ABSTRACT

Adverse drug events account for two million combined injuries, hospitalizations, or deaths each year. Furthermore, there are few comprehensive, up-to-date, and free sources of drug information. Clinical decision support systems may significantly mitigate the number of adverse drug events. However, these systems depend on up-to-date, comprehensive, and codified data to serve as input. The DailyMed website, a resource managed by the FDA and NLM, contains all currently approved drugs. We used a semantic natural language processing approach that successfully extracted information for adverse drug events, at-risk conditions, and susceptible populations from black box warning labels on this site. The precision, recall, and F-score were 94%, 52%, and 0.67 for adverse drug events; 80%, 53%, and 0.64 for conditions; and 95%, 44%, and 0.61 for populations. Overall performance was 90% precision, 51% recall, and 0.65 F-score. Information extracted can be stored in a structured format and may support clinical decision support systems.
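The F-scores in this abstract can be checked against the reported precision and recall, assuming the usual balanced F-score (the harmonic mean of precision and recall); the abstract does not state which variant was used. The function name `f_score` and the `REPORTED` table are illustrative only. Recomputing from the rounded percentages gives about 0.60 for populations against the reported 0.61, a gap that may simply reflect rounding of the published precision and recall.

```python
def f_score(precision, recall):
    """Balanced F-score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-score) as given in the abstract.
REPORTED = {
    "adverse drug events": (0.94, 0.52, 0.67),
    "conditions": (0.80, 0.53, 0.64),
    "populations": (0.95, 0.44, 0.61),
    "overall": (0.90, 0.51, 0.65),
}

if __name__ == "__main__":
    for name, (p, r, reported) in REPORTED.items():
        print(f"{name}: computed {f_score(p, r):.3f}, reported {reported:.2f}")
```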


Subjects
Drug Labeling , Drug-Related Side Effects and Adverse Reactions , Natural Language Processing , Prescription Drugs/adverse effects , Feasibility Studies , Humans , Internet , Semantics , United States , United States Food and Drug Administration
20.
Biomed Res Int ; 2013: 848952, 2013.
Article in English | MEDLINE | ID: mdl-24350292

ABSTRACT

Diabetic retinopathy (DR) is a secondary complication of diabetes associated with retinal neovascularization and represents the leading cause of blindness in the adult population in the developed world. Despite research efforts, the nature of pathogenetic processes leading to DR is still unknown, making development of novel effective treatments difficult. Advances in omic technologies now offer unprecedented insight into global molecular alterations in DR, but identification of novel treatments based on massive amounts of data generated in omic studies still represents a considerable challenge. For this reason, we attempted to facilitate discovery of novel treatments for DR by complementing the interpretation of omic results with the vast body of information in the published literature, using literature-based discovery (LBD) approaches. To achieve this, we collected data from transcriptomic studies performed on retinal tissue from animal models of DR, performed a meta-analysis of these datasets, and identified altered genes and pathways. Using the SemBT LBD framework, we determined which therapies could regulate perturbed pathways or stabilize the gene expression alterations in DR. We show that with this approach we could not only reidentify drugs currently in use or in clinical trials, but also indicate novel treatment directions for ameliorating neovascularization processes in DR.


Subjects
Diabetic Retinopathy/etiology , Neovascularization, Pathologic/genetics , Transcriptome/genetics , Animals , Humans , Mice , Rats , Signal Transduction/genetics