1.
J Biomed Inform ; 117: 103733, 2021 05.
Article in English | MEDLINE | ID: mdl-33737205

ABSTRACT

The context of medical conditions is an important feature to consider when processing clinical narratives. NegEx and its extension ConText became the most well-known rule-based systems for determining whether a medical condition is negated, historical or experienced by someone other than the patient in English clinical text. In this paper, we present a French adaptation and enrichment of FastContext, which is the most recent, n-trie engine-based implementation of the ConText algorithm. We compiled an extensive list of French lexical cues by automatic and manual translation and enrichment. To evaluate French FastContext, we manually annotated the context of medical conditions present in two types of clinical narratives: (i) death certificates and (ii) electronic health records. Results show good performance across different context values on both types of clinical notes (on average 0.93 and 0.86 F1, respectively). Furthermore, French FastContext outperforms previously reported French systems for negation detection when compared on the same datasets, and it is the first implementation of contextual temporality and experiencer identification reported for French. Finally, French FastContext has been implemented within the SIFR Annotator: a publicly accessible Web service to annotate French biomedical text data (http://bioportal.lirmm.fr/annotator). To our knowledge, this is the first implementation of a Web-based ConText-like system in a publicly accessible platform allowing non-natural-language-processing experts to both annotate and contextualize medical conditions in clinical notes.
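
Note: the rule-based idea behind ConText-like systems (a lexical cue opens a scope, and a condition inside that scope inherits the cue's context value) can be illustrated with a minimal sketch. This is not the FastContext implementation or its n-trie engine. The cue lists, the forward-only scope and the `WINDOW` size are assumptions made for illustration, and the cues are not the published French lexicon.

```python
import re

# Hypothetical French cues, for illustration only (not the published lexicon).
NEGATION_CUES = ["pas de", "absence de", "sans"]
HISTORICAL_CUES = ["antécédent de", "antécédents de"]
EXPERIENCER_CUES = ["mère", "père"]
WINDOW = 6  # assumed scope: number of tokens after a cue that the cue affects


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def find_cue_ends(tokens, cues):
    """Return the token index just past each occurrence of any cue."""
    ends = []
    for cue in cues:
        cue_tokens = cue.split()
        n = len(cue_tokens)
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == cue_tokens:
                ends.append(i + n)
    return ends


def contextualize(sentence, condition):
    """Assign negation, temporality and experiencer values to one condition."""
    tokens = tokenize(sentence)
    cond = tokenize(condition)
    n = len(cond)
    start = next(
        (i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == cond),
        None,
    )
    if start is None:
        raise ValueError("condition not found in sentence")

    def in_scope(cues):
        # Forward scope only: the condition must start within WINDOW tokens
        # after the end of a cue.
        return any(end <= start < end + WINDOW
                   for end in find_cue_ends(tokens, cues))

    return {
        "negated": in_scope(NEGATION_CUES),
        "historical": in_scope(HISTORICAL_CUES),
        "experiencer": "other" if in_scope(EXPERIENCER_CUES) else "patient",
    }
```

For example, `contextualize("Pas de fièvre ce jour.", "fièvre")` marks the condition as negated because "fièvre" directly follows the cue "pas de". A full system would also need backward-scoped cues and scope-terminating terms, which this sketch leaves out.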


Subjects
Language , Natural Language Processing , Algorithms , Electronic Health Records , Humans
2.
BMC Bioinformatics ; 20(Suppl 4): 139, 2019 Apr 18.
Article in English | MEDLINE | ID: mdl-30999867

ABSTRACT

BACKGROUND: Pharmacogenomics (PGx) studies how genomic variations impact variations in drug response phenotypes. Knowledge in pharmacogenomics is typically composed of units that have the form of ternary relationships gene variant - drug - adverse event. Such a relationship states that an adverse event may occur for patients having the specified gene variant and being exposed to the specified drug. State-of-the-art knowledge in PGx is mainly available in reference databases such as PharmGKB and reported in scientific biomedical literature. But, PGx knowledge can also be discovered from clinical data, such as Electronic Health Records (EHRs), and in this case, may either correspond to new knowledge or confirm state-of-the-art knowledge that lacks "clinical counterpart" or validation. For this reason, there is a need for automatic comparison of knowledge units from distinct sources. RESULTS: In this article, we propose an approach, based on Semantic Web technologies, to represent and compare PGx knowledge units. To this end, we developed PGxO, a simple ontology that represents PGx knowledge units and their components. Combined with PROV-O, an ontology developed by the W3C to represent provenance information, PGxO enables encoding and associating provenance information to PGx relationships. Additionally, we introduce a set of rules to reconcile PGx knowledge, i.e. to identify when two relationships, potentially expressed using different vocabularies and levels of granularity, refer to the same, or to different knowledge units. We evaluated our ontology and rules by populating PGxO with knowledge units extracted from PharmGKB (2701), the literature (65,720) and from discoveries reported in EHR analysis studies (only 10, manually extracted); and by testing their similarity. We called PGxLOD (PGx Linked Open Data) the resulting knowledge base that represents and reconciles knowledge units of those various origins. 
CONCLUSIONS: The proposed ontology and reconciliation rules constitute a first step toward a more complete framework for knowledge comparison in PGx. In this direction, the experimental instantiation of PGxO, named PGxLOD, illustrates the ability and difficulties of reconciling various existing knowledge sources.
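
Note: the ternary knowledge unit (gene variant, drug, adverse event) and the idea of deciding when two units from different sources denote the same knowledge can be sketched as follows. This is a simplified illustration, not PGxO or its reconciliation rules: it only handles differences in vocabulary through a hypothetical synonym table (`CANONICAL`), ignores levels of granularity, and uses plain Python rather than Semantic Web technologies. The example terms and source labels are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PGxRelationship:
    """One ternary knowledge unit plus a label recording where it came from."""
    gene_variant: str
    drug: str
    phenotype: str
    source: str  # provenance label (hypothetical values in the examples)


# Hypothetical synonym table mapping surface terms to one canonical term.
CANONICAL = {
    "coumadin": "warfarin",
    "hemorrhage": "bleeding",
    "haemorrhage": "bleeding",
}


def normalize(term):
    """Lowercase a term and map it to its canonical form when one is known."""
    key = term.strip().lower()
    return CANONICAL.get(key, key)


def components(rel):
    """Return the normalized (variant, drug, phenotype) triple of a unit."""
    return (
        normalize(rel.gene_variant),
        normalize(rel.drug),
        normalize(rel.phenotype),
    )


def same_knowledge_unit(a, b):
    """Two units match when all three normalized components are equal.

    Provenance is deliberately ignored: the point is to compare units
    that come from different sources.
    """
    return components(a) == components(b)
```

Under these assumptions, a unit recorded as ("CYP2C9*3", "Warfarin", "Hemorrhage") in one source and ("cyp2c9*3", "coumadin", "bleeding") in another would be reported as the same knowledge unit, while keeping both provenance labels available.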


Subjects
Knowledge Bases , Pharmacogenetics , Data Mining , Factual Databases , Electronic Health Records , Humans , Tissue Banks
3.
Bioinformatics ; 34(11): 1962-1965, 2018 06 01.
Article in English | MEDLINE | ID: mdl-29846492

ABSTRACT

Summary: Second use of clinical data commonly involves annotating biomedical text with terminologies and ontologies. The National Center for Biomedical Ontology Annotator is a frequently used annotation service, originally designed for biomedical data, but not very suitable for clinical text annotation. In order to add new functionalities to the NCBO Annotator without hosting or modifying the original Web service, we have designed a proxy architecture that enables seamless extensions by pre-processing of the input text and parameters, and post processing of the annotations. We have then implemented enhanced functionalities for annotating and indexing free text such as: scoring, detection of context (negation, experiencer, temporality), new output formats and coarse-grained concept recognition (with UMLS Semantic Groups). In this paper, we present the NCBO Annotator+, a Web service which incorporates these new functionalities as well as a small set of evaluation results for concept recognition and clinical context detection on two standard evaluation tasks (Clef eHealth 2017, SemEval 2014). Availability and implementation: The Annotator+ has been successfully integrated into the SIFR BioPortal platform-an implementation of NCBO BioPortal for French biomedical terminologies and ontologies-to annotate English text. A Web user interface is available for testing and ontology selection (http://bioportal.lirmm.fr/ncbo_annotatorplus); however the Annotator+ is meant to be used through the Web service application programming interface (http://services.bioportal.lirmm.fr/ncbo_annotatorplus). The code is openly available, and we also provide a Docker packaging to enable easy local deployment to process sensitive (e.g. clinical) data in-house (https://github.com/sifrproject). Contact: andon.tchechmedjiev@lirmm.fr. Supplementary information: Supplementary data are available at Bioinformatics online.
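
Note: the proxy architecture described above (pre-process the text and parameters, call the unmodified service, post-process the annotations) can be sketched in a few lines. This is an illustration of the pattern only, not the Annotator+ code. The `score` parameter name, the annotation dictionaries and the `upstream` callable are hypothetical stand-ins for the real Web service and its API.

```python
from typing import Callable, Dict, List

Annotation = Dict[str, object]
Upstream = Callable[[str, Dict[str, str]], List[Annotation]]


def make_proxy(upstream: Upstream,
               score_fn: Callable[[Annotation], float]
               ) -> Upstream:
    """Wrap an annotation service without modifying it."""

    def annotate(text: str, params: Dict[str, str]) -> List[Annotation]:
        # Pre-processing: remove proxy-only parameters that the original
        # service would not understand, and normalize whitespace.
        want_score = params.get("score") == "true"
        forwarded = {k: v for k, v in params.items() if k != "score"}
        cleaned = " ".join(text.split())

        # Delegate the actual concept recognition to the wrapped service.
        annotations = upstream(cleaned, forwarded)

        # Post-processing: enrich each annotation with the new feature.
        if want_score:
            annotations = [dict(a, score=score_fn(a)) for a in annotations]
        return annotations

    return annotate
```

Because the wrapper has the same call signature as the service it wraps, further extensions (context detection, new output formats) could be layered in the same way.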


Subjects
Biological Ontologies , Information Storage and Retrieval/methods , Software , Humans
4.
BMC Bioinformatics ; 19(1): 405, 2018 Nov 06.
Article in English | MEDLINE | ID: mdl-30400805

ABSTRACT

BACKGROUND: Despite a wide adoption of English in science, a significant amount of biomedical data are produced in other languages, such as French. Yet a majority of natural language processing or semantic tools as well as domain terminologies or ontologies are only available in English, and cannot be readily applied to other languages, due to fundamental linguistic differences. However, semantic resources are required to design semantic indexes and transform biomedical (text)data into knowledge for better information mining and retrieval. RESULTS: We present the SIFR Annotator ( http://bioportal.lirmm.fr/annotator ), a publicly accessible ontology-based annotation web service to process biomedical text data in French. The service, developed during the Semantic Indexing of French Biomedical Data Resources (2013-2019) project is included in the SIFR BioPortal, an open platform to host French biomedical ontologies and terminologies based on the technology developed by the US National Center for Biomedical Ontology. The portal facilitates use and fostering of ontologies by offering a set of services -search, mappings, metadata, versioning, visualization, recommendation- including for annotation purposes. We introduce the adaptations and improvements made in applying the technology to French as well as a number of language independent additional features -implemented by means of a proxy architecture- in particular annotation scoring and clinical context detection. We evaluate the performance of the SIFR Annotator on different biomedical data, using available French corpora -Quaero (titles from French MEDLINE abstracts and EMEA drug labels) and CépiDC (ICD-10 coding of death certificates)- and discuss our results with respect to the CLEF eHealth information extraction tasks. 
CONCLUSIONS: We show the web service performs comparably to other knowledge-based annotation approaches in recognizing entities in biomedical text and reach state-of-the-art levels in clinical context detection (negation, experiencer, temporality). Additionally, the SIFR Annotator is the first openly web accessible tool to annotate and contextualize French biomedical text with ontology concepts leveraging a dictionary currently made of 28 terminologies and ontologies and 333 K concepts. The code is openly available, and we also provide a Docker packaging for easy local deployment to process sensitive (e.g., clinical) data in-house ( https://github.com/sifrproject ).


Subjects
Abstracting and Indexing , Biological Ontologies , Data Analysis , Personal Health Records , Medical Informatics , Natural Language Processing , Semantics , France , Gene Expression Profiling , Humans , Information Storage and Retrieval , MEDLINE
5.
Sci Data ; 11(1): 479, 2024 May 10.
Article in English | MEDLINE | ID: mdl-38730252

ABSTRACT

This work presents a maturity model for assessing catalogues of semantic artefacts, one of the keystones that permit semantic interoperability of systems. We defined the dimensions and related features to include in the maturity model by analysing the current literature and existing catalogues of semantic artefacts provided by experts. In addition, we assessed 26 different catalogues to demonstrate the effectiveness of the maturity model, which includes 12 different dimensions (Metadata, Openness, Quality, Availability, Statistics, PID, Governance, Community, Sustainability, Technology, Transparency, and Assessment) and 43 related features (or sub-criteria) associated with these dimensions. Such a maturity model is one of the first attempts to provide recommendations for governance and processes for preserving and maintaining semantic artefacts and helps assess/address interoperability challenges.

6.
Front Artif Intell ; 6: 1187090, 2023.
Article in English | MEDLINE | ID: mdl-37908741

ABSTRACT

Vegetable crop farmers diversify their production by growing a range of crops during the season on the same plot. Crop diversification and rotation enables farmers to increase their income and crop yields while enhancing their farm sustainability against climatic events and pest attacks. Farmers must plan their agricultural work per year and over successive years. Planning decisions are made on the basis of their experience regarding previous plans. For the purpose of assisting farmers in planning decisions and monitoring, we developed the Crop Planning and Production Process Ontology (C3PO), i.e., a representation of agricultural knowledge and data for diversified crop production. C3PO is composed of eight modules to capture all crop production dimensions and complexity for representing farming practices and constraints. It encodes agricultural processes and farm plot organization and captures common agricultural knowledge. C3PO introduces a representation of technical itineraries, i.e., sequences of technical farming tasks to grow vegetables, from soil identification and seed selection to harvest and storage. C3PO is the backbone of a knowledge graph which aggregates data from heterogeneous related semantic resources, e.g., organism taxonomies, chemicals, reference crop listings, or development stages. C3PO and its knowledge graph are used by the Elzeard enterprise to develop knowledge-based decision support systems for farmers. This article describes how we built C3PO and its knowledge graph-which are both publicly available-and briefly outlines their applications.

7.
Bioinformatics ; 26(14): 1800-1, 2010 Jul 15.
Article in English | MEDLINE | ID: mdl-20505005

ABSTRACT

SUMMARY: The Unstructured Information Management Architecture (UIMA) framework and web services are emerging as useful tools for integrating biomedical text mining tools. This note describes our work, which wraps the National Center for Biomedical Ontology (NCBO) Annotator-an ontology-based annotation service-to make it available as a component in UIMA workflows. AVAILABILITY: This wrapper is freely available on the web at http://bionlp-uima.sourceforge.net/ as part of the UIMA tools distribution from the Center for Computational Pharmacology (CCP) at the University of Colorado School of Medicine. It has been implemented in Java for support on Mac OS X, Linux and MS Windows.


Subjects
Data Mining/methods , Software , Factual Databases , User-Computer Interface
8.
Nucleic Acids Res ; 37(Web Server issue): W170-3, 2009 Jul.
Article in English | MEDLINE | ID: mdl-19483092

ABSTRACT

Biomedical ontologies provide essential domain knowledge to drive data integration, information retrieval, data annotation, natural-language processing and decision support. BioPortal (http://bioportal.bioontology.org) is an open repository of biomedical ontologies that provides access via Web services and Web browsers to ontologies developed in OWL, RDF, OBO format and Protégé frames. BioPortal functionality includes the ability to browse, search and visualize ontologies. The Web interface also facilitates community-based participation in the evaluation and evolution of ontology content by providing features to add notes to ontology terms, mappings between terms and ontology reviews based on criteria such as usability, domain coverage, quality of content, and documentation and support. BioPortal also enables integrated search of biomedical data resources such as the Gene Expression Omnibus (GEO), ClinicalTrials.gov, and ArrayExpress, through the annotation and indexing of these resources with ontologies in BioPortal. Thus, BioPortal not only provides investigators, clinicians, and developers 'one-stop shopping' to programmatically access biomedical ontologies, but also provides support to integrate data from a variety of biomedical resources.


Subjects
Software , Controlled Vocabulary , Abstracting and Indexing , Biomedical Research , Internet , Natural Language Processing , Systems Integration , User-Computer Interface
9.
Web Semant ; 9(3): 316-324, 2011 Sep 01.
Article in English | MEDLINE | ID: mdl-21918645

ABSTRACT

The volume of publicly available data in biomedicine is constantly increasing. However, these data are stored in different formats and on different platforms. Integrating these data will enable us to facilitate the pace of medical discoveries by providing scientists with a unified view of this diverse information. Under the auspices of the National Center for Biomedical Ontology (NCBO), we have developed the Resource Index-a growing, large-scale ontology-based index of more than twenty heterogeneous biomedical resources. The resources come from a variety of repositories maintained by organizations from around the world. We use a set of over 200 publicly available ontologies contributed by researchers in various domains to annotate the elements in these resources. We use the semantics that the ontologies encode, such as different properties of classes, the class hierarchies, and the mappings between ontologies, in order to improve the search experience for the Resource Index user. Our user interface enables scientists to search the multiple resources quickly and efficiently using domain terms, without even being aware that there is semantics "under the hood."

10.
BMC Bioinformatics ; 10 Suppl 2: S1, 2009 Feb 05.
Article in English | MEDLINE | ID: mdl-19208184

ABSTRACT

The volume of publicly available genomic scale data is increasing. Genomic datasets in public repositories are annotated with free-text fields describing the pathological state of the studied sample. These annotations are not mapped to concepts in any ontology, making it difficult to integrate these datasets across repositories. We have previously developed methods to map text-annotations of tissue microarrays to concepts in the NCI thesaurus and SNOMED-CT. In this work we generalize our methods to map text annotations of gene expression datasets to concepts in the UMLS. We demonstrate the utility of our methods by processing annotations of datasets in the Gene Expression Omnibus. We demonstrate that we enable ontology-based querying and integration of tissue and gene expression microarray data. We enable identification of datasets on specific diseases across both repositories. Our approach provides the basis for ontology-driven data integration for translational research on gene and protein expression data. Based on this work we have built a prototype system for ontology based annotation and indexing of biomedical data. The system processes the text metadata of diverse resource elements such as gene expression data sets, descriptions of radiology images, clinical-trial reports, and PubMed article abstracts to annotate and index them with concepts from appropriate ontologies. The key functionality of this system is to enable users to locate biomedical data resources related to particular ontology concepts.


Subjects
Computational Biology/methods , Gene Expression Profiling , Oligonucleotide Array Sequence Analysis , Controlled Vocabulary , Factual Databases , Genomics , Information Storage and Retrieval , Unified Medical Language System
11.
BMC Bioinformatics ; 10 Suppl 9: S14, 2009 Sep 17.
Article in English | MEDLINE | ID: mdl-19761568

ABSTRACT

The National Center for Biomedical Ontology (NCBO) is developing a system for automated, ontology-based access to online biomedical resources (Shah NH, et al.: Ontology-driven indexing of public datasets for translational bioinformatics. BMC Bioinformatics 2009, 10(Suppl 2):S1). The system's indexing workflow processes the text metadata of diverse resources such as datasets from GEO and ArrayExpress to annotate and index them with concepts from appropriate ontologies. This indexing requires the use of a concept-recognition tool to identify ontology concepts in the resource's textual metadata. In this paper, we present a comparison of two concept recognizers - NLM's MetaMap and the University of Michigan's Mgrep. We utilize a number of data sources and dictionaries to evaluate the concept recognizers in terms of precision, recall, speed of execution, scalability and customizability. Our evaluations demonstrate that Mgrep has a clear edge over MetaMap for large-scale service-oriented applications. Based on our analysis we also suggest areas of potential improvements for Mgrep. We have subsequently used Mgrep to build the Open Biomedical Annotator service. The Annotator service has access to a large dictionary of biomedical terms derived from the Unified Medical Language System (UMLS) and NCBO ontologies. The Annotator also leverages the hierarchical structure of the ontologies and their mappings to expand annotations. The Annotator service is available to the community as a REST Web service for creating ontology-based annotations of their data.
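
Note: the two steps mentioned above, dictionary-based concept recognition followed by expansion of annotations through the ontology hierarchy, can be illustrated with a small sketch. It does not reproduce Mgrep or the Annotator service, whose internals the abstract does not describe. The naive whole-word matcher, the `DICTIONARY` and `PARENTS` tables and the concept identifiers are all hypothetical.

```python
import re

# Hypothetical dictionary: surface term -> concept identifier.
DICTIONARY = {
    "melanoma": "C:melanoma",
    "skin cancer": "C:skin_cancer",
}

# Hypothetical is-a hierarchy: concept -> parent concept.
PARENTS = {
    "C:melanoma": "C:skin_cancer",
    "C:skin_cancer": "C:cancer",
}


def direct_annotations(text):
    """Find whole-word, case-insensitive matches of dictionary terms."""
    lowered = text.lower()
    found = []
    for term, concept in DICTIONARY.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, lowered):
            found.append((concept, match.start(), match.end()))
    return sorted(found, key=lambda item: (item[1], item[0]))


def ancestors(concept):
    """Walk up the hierarchy and list every ancestor, nearest first."""
    chain = []
    while concept in PARENTS:
        concept = PARENTS[concept]
        chain.append(concept)
    return chain


def annotate(text):
    """Return direct annotations plus annotations expanded to ancestors."""
    result = []
    for concept, start, end in direct_annotations(text):
        result.append({"concept": concept, "span": (start, end),
                       "kind": "direct"})
        for level, ancestor in enumerate(ancestors(concept), start=1):
            result.append({"concept": ancestor, "span": (start, end),
                           "kind": "expanded", "level": level})
    return result
```

With this toy hierarchy, text mentioning "melanoma" is also indexed under the broader concepts, so a search for the parent concept would retrieve it. Expansion through mappings between ontologies would follow the same shape with a different lookup table.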


Subjects
Computational Biology/methods , Abstracting and Indexing , Factual Databases , Gene Expression Profiling , Information Storage and Retrieval , Medical Informatics , Controlled Vocabulary
12.
PLoS One ; 13(11): e0198270, 2018.
Article in English | MEDLINE | ID: mdl-30500839

ABSTRACT

Recent advances in high-throughput technologies have resulted in a tremendous increase in the amount of omics data produced in plant science. This increase, in conjunction with the heterogeneity and variability of the data, presents a major challenge to adopting an integrative research approach. We are facing an urgent need to effectively integrate and assimilate complementary datasets to understand the biological system as a whole. The Semantic Web offers technologies for the integration of heterogeneous data and their transformation into explicit knowledge thanks to ontologies. We have developed the Agronomic Linked Data (AgroLD, www.agrold.org), a knowledge-based system relying on Semantic Web technologies and exploiting standard domain ontologies, to integrate data about plant species of high interest for the plant science community, e.g., rice, wheat and Arabidopsis. We present some integration results of the project, which initially focused on genomics, proteomics and phenomics. AgroLD is now an RDF (Resource Description Framework) knowledge base of 100M triples created by annotating and integrating more than 50 datasets coming from 10 data sources (such as Gramene.org and TropGeneDB) with 10 ontologies (such as the Gene Ontology and Plant Trait Ontology). Our evaluation results show users appreciate the multiple query modes which support different use cases. AgroLD's objective is to offer a domain-specific knowledge platform to solve complex biological and agronomical questions related to the implication of genes/proteins in, for instance, plant disease resistance or high yield traits. We expect the resolution of these questions to facilitate the formulation of new scientific hypotheses to be validated with a knowledge-oriented approach.


Subjects
Agriculture , Genomics , Knowledge Bases , Proteomics , Plant Genome
13.
Database (Oxford) ; 2018, 2018 01 01.
Article in English | MEDLINE | ID: mdl-30239679

ABSTRACT

The future of agricultural research depends on data. The sheer volume of agricultural biological data being produced today makes excellent data management essential. Governmental agencies, publishers and science funders require data management plans for publicly funded research. Furthermore, the value of data increases exponentially when they are properly stored, described, integrated and shared, so that they can be easily utilized in future analyses. AgBioData (https://www.agbiodata.org) is a consortium of people working at agricultural biological databases, data archives and knowledgebases who strive to identify common issues in database development, curation and management, with the goal of creating database products that are more Findable, Accessible, Interoperable and Reusable. We strive to promote authentic, detailed, accurate and explicit communication between all parties involved in scientific data. As a step toward this goal, we present the current state of biocuration, ontologies, metadata and persistence, database platforms, programmatic (machine) access to data, communication and sustainability with regard to data curation. Each section describes challenges and opportunities for these topics, along with recommendations and best practices.


Subjects
Agriculture , Genetic Databases , Genomics , Breeding , Gene Ontology , Metadata , Surveys and Questionnaires
14.
J Biomed Semantics ; 8(1): 21, 2017 Jun 07.
Article in English | MEDLINE | ID: mdl-28592275

ABSTRACT

BACKGROUND: Ontologies and controlled terminologies have become increasingly important in biomedical research. Researchers use ontologies to annotate their data with ontology terms, enabling better data integration and interoperability across disparate datasets. However, the number, variety and complexity of current biomedical ontologies make it cumbersome for researchers to determine which ones to reuse for their specific needs. To overcome this problem, in 2010 the National Center for Biomedical Ontology (NCBO) released the Ontology Recommender, which is a service that receives a biomedical text corpus or a list of keywords and suggests ontologies appropriate for referencing the indicated terms. METHODS: We developed a new version of the NCBO Ontology Recommender. Called Ontology Recommender 2.0, it uses a novel recommendation approach that evaluates the relevance of an ontology to biomedical text data according to four different criteria: (1) the extent to which the ontology covers the input data; (2) the acceptance of the ontology in the biomedical community; (3) the level of detail of the ontology classes that cover the input data; and (4) the specialization of the ontology to the domain of the input data. RESULTS: Our evaluation shows that the enhanced recommender provides higher quality suggestions than the original approach, providing better coverage of the input data, more detailed information about their concepts, increased specialization for the domain of the input data, and greater acceptance and use in the community. In addition, it provides users with more explanatory information, along with suggestions of not only individual ontologies but also groups of ontologies to use together. It also can be customized to fit the needs of different ontology recommendation scenarios. CONCLUSIONS: Ontology Recommender 2.0 suggests relevant ontologies for annotating biomedical text data. 
It combines the strengths of its predecessor with a range of adjustments and new features that improve its reliability and usefulness. Ontology Recommender 2.0 recommends over 500 biomedical ontologies from the NCBO BioPortal platform, where it is openly available (both via the user interface at http://bioportal.bioontology.org/recommender , and via a Web service API).
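
Note: combining the four criteria named above (coverage, acceptance, detail, specialization) into a single ranking can be sketched as a weighted sum. The abstract does not state how the criteria are combined, so the weighted-sum formula, the weights in `WEIGHTS`, the assumption that each criterion is already scaled to [0, 1], and the ontology names are all illustrative assumptions rather than the Ontology Recommender 2.0 method.

```python
from dataclasses import dataclass


@dataclass
class CriteriaScores:
    """Per-ontology scores, each assumed to be scaled to [0, 1]."""
    coverage: float        # how much of the input the ontology covers
    acceptance: float      # how accepted it is in the community
    detail: float          # level of detail of the covering classes
    specialization: float  # how specialized it is for the input's domain


# Hypothetical weights, chosen only for illustration; they sum to 1.
WEIGHTS = {
    "coverage": 0.4,
    "acceptance": 0.2,
    "detail": 0.2,
    "specialization": 0.2,
}


def aggregate(scores, weights=WEIGHTS):
    """Weighted sum of the four criteria."""
    return sum(weights[name] * getattr(scores, name) for name in weights)


def rank(candidates, weights=WEIGHTS):
    """Order ontology names from best to worst aggregate score."""
    return sorted(
        candidates,
        key=lambda name: aggregate(candidates[name], weights),
        reverse=True,
    )
```

Passing a different `weights` dictionary to `rank` is one simple way such a scheme could be customized for different recommendation scenarios, for example to favor coverage when annotating a large corpus.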


Subjects
Biological Ontologies , National Institutes of Health (U.S.) , Semantics , United States
15.
F1000Res ; 6: 1843, 2017.
Article in English | MEDLINE | ID: mdl-29333241

ABSTRACT

In this article, we present a joint effort of the wheat research community, along with data and ontology experts, to develop wheat data interoperability guidelines. Interoperability is the ability of two or more systems and devices to cooperate and exchange data, and interpret that shared information. Interoperability is a growing concern to the wheat scientific community, and agriculture in general, as the need to interpret the deluge of data obtained through high-throughput technologies grows. Agreeing on common data formats, metadata, and vocabulary standards is an important step to obtain the required data interoperability level in order to add value by encouraging data sharing, and subsequently facilitate the extraction of new information from existing and new datasets. During a period of more than 18 months, the RDA Wheat Data Interoperability Working Group (WDI-WG) surveyed the wheat research community about the use of data standards, then discussed and selected a set of recommendations based on consensual criteria. The recommendations promote standards for data types identified by the wheat research community as the most important for the coming years: nucleotide sequence variants, genome annotations, phenotypes, germplasm data, gene expression experiments, and physical maps. For each of these data types, the guidelines recommend best practices in terms of use of data formats, metadata standards and ontologies. In addition to the best practices, the guidelines provide examples of tools and implementations that are likely to facilitate the adoption of the recommendations. To maximize the adoption of the recommendations, the WDI-WG used a community-driven approach that involved the wheat research community from the start, took into account their needs and practices, and provided them with a framework to keep the recommendations up to date. We also report this approach's potential to be generalizable to other (agricultural) domains.

16.
Stud Health Technol Inform ; 205: 1008-12, 2014.
Article in English | MEDLINE | ID: mdl-25160340

ABSTRACT

BACKGROUND: The main biomedical information retrieval systems are based on controlled vocabularies, and more specifically on terminologies or ontologies (T/O). These classification structures allow indexing, coding and annotating different kinds of documents. Many T/O have been created for different purposes, and finding specific concepts in the multitude of existing nomenclatures has become a problem. The NCBO (National Center for Biomedical Ontology) BioPortal and the CISMeF (Catalogue et Index des Sites Médicaux de langue Française) HeTOP projects have been developed to tackle this issue. OBJECTIVE: The present work compares both portals. METHODS: We propose a set of criteria to compare bio-ontology portals in terms of goals, features, technologies and usability. RESULTS: BioPortal and HeTOP have been compared based on the given criteria. While both portals are designed to store T/O and make them available to the community, and share many basic features, they differ on several points, mainly because of their basic purposes. CONCLUSION: Based on the comparison criteria, we can assume that a merge between BioPortal and HeTOP is possible in terms of functionalities. The main difficulties will lie in merging the data repositories and applying different policies on T/O content.


Subjects
Algorithms , Biological Ontologies , Data Curation/methods , Documentation/methods , Information Storage and Retrieval/methods , Natural Language Processing , Semantics , Automated Pattern Recognition/methods
17.
J Biomed Semantics ; 1 Suppl 1: S1, 2010 Jun 22.
Article in English | MEDLINE | ID: mdl-20626921

ABSTRACT

BACKGROUND: Researchers in biomedical informatics use ontologies and terminologies to annotate their data in order to facilitate data integration and translational discoveries. As the use of ontologies for annotation of biomedical datasets has risen, a common challenge is to identify ontologies that are best suited to annotating specific datasets. The number and variety of biomedical ontologies is large, and it is cumbersome for a researcher to figure out which ontology to use. METHODS: We present the Biomedical Ontology Recommender web service. The system uses textual metadata or a set of keywords describing a domain of interest and suggests appropriate ontologies for annotating or representing the data. The service makes a decision based on three criteria. The first one is coverage, or the ontologies that provide most terms covering the input text. The second is connectivity, or the ontologies that are most often mapped to by other ontologies. The final criterion is size, or the number of concepts in the ontologies. The service scores the ontologies as a function of scores of the annotations created using the National Center for Biomedical Ontology (NCBO) Annotator web service. We used all the ontologies from the UMLS Metathesaurus and the NCBO BioPortal. RESULTS: We compare and contrast our Recommender by an exhaustive functional comparison to previously published efforts. We evaluate and discuss the results of several recommendation heuristics in the context of three real world use cases. The best recommendations heuristics, rated 'very relevant' by expert evaluators, are the ones based on coverage and connectivity criteria. The Recommender service (alpha version) is available to the community and is embedded into BioPortal.

18.
AMIA Annu Symp Proc ; 2010: 587-91, 2010 Nov 13.
Article in English | MEDLINE | ID: mdl-21347046

ABSTRACT

Domain-specific biomedical lexicons are extensively used by researchers for natural language processing tasks. Currently these lexicons are created manually by expert curators and there is a pressing need for automated methods to compile such lexicons. The Lexicon Builder Web service addresses this need and reduces the investment of time and effort involved in lexicon maintenance. The service has three components: Inclusion - selects one or several ontologies (or their branches) and includes preferred names and synonym terms; Exclusion - filters terms based on the term's Medline frequency, syntactic type, UMLS semantic type and match with stopwords; Output - aggregates information, handles compression and output formats. Evaluation demonstrates that the service has high accuracy and runtime performance. It is currently being evaluated for several use cases to establish its utility in biomedical information processing tasks. The Lexicon Builder promotes collaboration, sharing and standardization of lexicons amongst researchers by automating the creation, maintenance and cross-referencing of custom lexicons.
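
Note: the three components described above (Inclusion, Exclusion, Output) can be pictured as a small pipeline. The sketch below is an assumption-laden illustration, not the Lexicon Builder service: the input data structures are hypothetical, only two of the four exclusion filters (frequency and stopwords) are shown, and the choice to drop terms above a frequency threshold is an assumed direction for that filter.

```python
def build_lexicon(ontologies, selected, stopwords,
                  term_frequency, max_frequency):
    """Build a sorted list of lexicon terms from selected ontologies.

    ontologies: dict mapping an ontology name to a list of concept dicts,
        each with a "preferred_name" and an optional "synonyms" list.
    selected: names of the ontologies to include.
    stopwords: set of lowercase terms to exclude.
    term_frequency: dict of lowercase term -> corpus frequency
        (missing terms count as frequency 0).
    max_frequency: terms more frequent than this are excluded (assumed rule).
    """
    # Inclusion: collect preferred names and synonyms.
    terms = set()
    for name in selected:
        for concept in ontologies[name]:
            terms.add(concept["preferred_name"].lower())
            terms.update(s.lower() for s in concept.get("synonyms", []))

    # Exclusion: drop stopwords and overly frequent terms.
    kept = [
        term for term in terms
        if term not in stopwords
        and term_frequency.get(term, 0) <= max_frequency
    ]

    # Output: a deterministic, sorted list; a real service would also
    # handle compression and alternative output formats.
    return sorted(kept)
```

Filters on syntactic type or semantic type would slot into the exclusion step as further conditions on each term.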


Subjects
Biological Ontologies , Natural Language Processing , Humans , Information Storage and Retrieval , MEDLINE , Semantics , Software , Unified Medical Language System
19.
Summit Transl Bioinform ; 2009: 56-60, 2009 Mar 01.
Article in English | MEDLINE | ID: mdl-21347171

ABSTRACT

The range of publicly available biomedical data is enormous and is expanding fast. This expansion means that researchers now face a hurdle in extracting the data they need from the large amount of data available. Biomedical researchers have turned to ontologies and terminologies to structure and annotate their data with ontology concepts for better search and retrieval. However, this annotation process cannot be easily automated and often requires expert curators. In addition, there is a lack of easy-to-use systems that facilitate the use of ontologies for annotation. This paper presents the Open Biomedical Annotator (OBA), an ontology-based Web service that annotates public datasets with biomedical ontology concepts based on their textual metadata (www.bioontology.org). The biomedical community can use the annotator service to tag datasets automatically with ontology terms (from UMLS and NCBO BioPortal ontologies). Such annotations facilitate translational discoveries by integrating annotated data [1].
