Results 1 - 20 of 32
1.
Bioinformatics ; 38(11): 3141-3142, 2022 05 26.
Article in English | MEDLINE | ID: mdl-35380605

ABSTRACT

SUMMARY: To advance biomedical research, increasingly large amounts of complex data need to be discovered and integrated. This requires syntactic and semantic validation to ensure shared understanding of relevant entities. This article describes the ELIXIR biovalidator, which extends the syntactic validation of the widely used AJV library with ontology-based validation of JSON documents. AVAILABILITY AND IMPLEMENTATION: Source code: https://github.com/elixir-europe/biovalidator, Release: v1.9.1, License: Apache License 2.0, Deployed at: https://www.ebi.ac.uk/biosamples/schema/validator/validate. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
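
The two validation layers described in this abstract can be illustrated with a minimal Python sketch: a structural check followed by an ontology-membership check. The function name, the `organism` field and the in-memory term set are hypothetical stand-ins chosen for illustration; this is not the biovalidator or AJV API, and a real validator would resolve terms against an ontology service.

```python
from typing import Any

# Hypothetical stand-in for an ontology lookup; a real validator
# would query an ontology service instead of a hard-coded set.
ALLOWED_ORGANISM_TERMS = {"NCBITaxon:9606", "NCBITaxon:10090"}


def validate_sample(doc: dict[str, Any]) -> list[str]:
    """Return a list of error messages; an empty list means the document passed."""
    errors: list[str] = []
    # Syntactic layer: required keys must be present and must be strings.
    for key in ("name", "organism"):
        if key not in doc:
            errors.append(f"missing required field: {key}")
        elif not isinstance(doc[key], str):
            errors.append(f"field {key} must be a string")
    # Semantic layer: the organism value must be a known ontology term.
    organism = doc.get("organism")
    if isinstance(organism, str) and organism not in ALLOWED_ORGANISM_TERMS:
        errors.append(f"organism is not an allowed ontology term: {organism}")
    return errors
```

A document such as `{"name": "s1", "organism": "human"}` passes the structural layer but fails the semantic one, which is the gap that ontology-based validation is meant to close.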


Subject(s)
Biological Science Disciplines , Metadata , Semantics , Software
2.
Nucleic Acids Res ; 49(D1): D1311-D1320, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33045747

ABSTRACT

Open Targets Genetics (https://genetics.opentargets.org) is an open-access integrative resource that aggregates human GWAS and functional genomics data including gene expression, protein abundance, chromatin interaction and conformation data from a wide range of cell types and tissues to make robust connections between GWAS-associated loci, variants and likely causal genes. This enables systematic identification and prioritisation of likely causal variants and genes across all published trait-associated loci. In this paper, we describe the public resources we aggregate, the technology and analyses we use, and the functionality that the portal offers. Open Targets Genetics can be searched by variant, gene or study/phenotype. It offers tools that enable users to prioritise causal variants and genes at disease-associated loci and access systematic cross-disease and disease-molecular trait colocalization analysis across 92 cell types and tissues including the eQTL Catalogue. Data visualizations such as Manhattan-like plots, regional plots, credible sets overlap between studies and PheWAS plots enable users to explore GWAS signals in depth. The integrated data is made available through the web portal, for bulk download and via a GraphQL API, and the software is open source. Applications of this integrated data include identification of novel targets for drug discovery and drug repurposing.


Subject(s)
Databases, Genetic , Genome, Human , Inflammatory Bowel Diseases/genetics , Molecular Targeted Therapy/methods , Quantitative Trait Loci , Software , Chromatin/chemistry , Chromatin/metabolism , Datasets as Topic , Drug Discovery/methods , Drug Repositioning/methods , Genome-Wide Association Study , Genotype , Humans , Inflammatory Bowel Diseases/drug therapy , Inflammatory Bowel Diseases/metabolism , Inflammatory Bowel Diseases/pathology , Internet , Phenotype , Quantitative Trait, Heritable
3.
Nucleic Acids Res ; 48(D1): D77-D83, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31665515

ABSTRACT

Expression Atlas is EMBL-EBI's resource for gene and protein expression. It sources and compiles data on the abundance and localisation of RNA and proteins in various biological systems and contexts and provides open access to this data for the research community. With the increased availability of single cell RNA-Seq datasets in the public archives, we have now extended Expression Atlas with a new added-value service to display gene expression in single cells. Single Cell Expression Atlas was launched in 2018 and currently includes 123 single cell RNA-Seq studies from 12 species. The website can be searched by genes within or across species to reveal experiments, tissues and cell types where this gene is expressed or under which conditions it is a marker gene. Within each study, cells can be visualized using a pre-calculated t-SNE plot and can be coloured by different features or by cell clusters based on gene expression. Within each experiment, there are links to downloadable files, such as RNA quantification matrices, clustering results, reports on protocols and associated metadata, such as assigned cell types.


Subject(s)
Computational Biology/methods , Databases, Nucleic Acid , Gene Expression Profiling , Software , Gene Expression Profiling/methods , Organ Specificity , Single-Cell Analysis/methods , User-Computer Interface
4.
Nucleic Acids Res ; 48(D1): D704-D715, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31701156

ABSTRACT

In biology and biomedicine, relating phenotypic outcomes with genetic variation and environmental factors remains a challenge: patient phenotypes may not match known diseases, candidate variants may be in genes that haven't been characterized, research organisms may not recapitulate human or veterinary diseases, environmental factors affecting disease outcomes are unknown or undocumented, and many resources must be queried to find potentially significant phenotypic associations. The Monarch Initiative (https://monarchinitiative.org) integrates information on genes, variants, genotypes, phenotypes and diseases in a variety of species, and allows powerful ontology-based search. We develop many widely adopted ontologies that together enable sophisticated computational analysis, mechanistic discovery and diagnostics of Mendelian diseases. Our algorithms and tools are widely used to identify animal models of human disease through phenotypic similarity, for differential diagnostics and to facilitate translational research. Launched in 2015, Monarch has grown with regard to data (new organisms, more sources, better modeling); new API and standards; ontologies (new Mondo unified disease ontology, improvements to ontologies such as HPO and uPheno); user interface (a redesigned website); and community development. Monarch data, algorithms and tools are being used and extended by resources such as GA4GH and NCATS Translator, among others, to aid mechanistic discovery and diagnostics.


Subject(s)
Computational Biology/methods , Genotype , Phenotype , Algorithms , Animals , Biological Ontologies , Databases, Genetic , Exome , Genetic Association Studies , Genetic Variation , Genomics , Humans , Internet , Software , Translational Research, Biomedical , User-Computer Interface
5.
PLoS Biol ; 15(6): e2001414, 2017 Jun.
Article in English | MEDLINE | ID: mdl-28662064

ABSTRACT

In many disciplines, data are highly decentralized across thousands of online databases (repositories, registries, and knowledgebases). Wringing value from such databases depends on the discipline of data science and on the humble bricks and mortar that make integration possible; identifiers are a core component of this integration infrastructure. Drawing on our experience and on work by other groups, we outline 10 lessons we have learned about the identifier qualities and best practices that facilitate large-scale data integration. Specifically, we propose actions that identifier practitioners (database providers) should take in the design, provision and reuse of identifiers. We also outline the important considerations for those referencing identifiers in various circumstances, including by authors and data generators. While the importance and relevance of each lesson will vary by context, there is a need for increased awareness about how to avoid and manage common identifier problems, especially those related to persistence and web-accessibility/resolvability. We focus strongly on web-based identifiers in the life sciences; however, the principles are broadly relevant to other disciplines.


Subject(s)
Biological Science Disciplines/methods , Computational Biology/methods , Data Mining/methods , Software Design , Software , Biological Science Disciplines/statistics & numerical data , Biological Science Disciplines/trends , Computational Biology/trends , Data Mining/statistics & numerical data , Data Mining/trends , Databases, Factual/statistics & numerical data , Databases, Factual/trends , Forecasting , Humans , Internet
6.
Nucleic Acids Res ; 44(D1): D746-52, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26481351

ABSTRACT

Expression Atlas (http://www.ebi.ac.uk/gxa) provides information about gene and protein expression in animal and plant samples of different cell types, organism parts, developmental stages, diseases and other conditions. It consists of selected microarray and RNA-sequencing studies from ArrayExpress, which have been manually curated, annotated with ontology terms, checked for high quality and processed using standardised analysis methods. Since the last update, Atlas has grown seven-fold (1572 studies as of August 2015), and incorporates baseline expression profiles of tissues from Human Protein Atlas, GTEx and FANTOM5, and of cancer cell lines from ENCODE, CCLE and Genentech projects. Plant studies constitute a quarter of Atlas data. For genes of interest, the user can view baseline expression in tissues, and differential expression for biologically meaningful pairwise comparisons, estimated using consistent methodology across all of Atlas. Our first proteomics study in human tissues is now displayed alongside transcriptomics data in the same tissues. Novel analyses and visualisations include: 'enrichment' in each differential comparison of GO terms, Reactome, Plant Reactome pathways and InterPro domains; hierarchical clustering (by baseline expression) of most variable genes and experimental conditions; and, for a given gene-condition, distribution of baseline expression across biological replicates.


Subject(s)
Databases, Genetic , Gene Expression Profiling , Plants/metabolism , Proteins/metabolism , Proteomics , Animals , Cell Line, Tumor , Humans , Plants/genetics , User-Computer Interface
7.
Proteomics ; 17(19), 2017 Oct.
Article in English | MEDLINE | ID: mdl-28792687

ABSTRACT

The availability of user-friendly software to annotate biological datasets and experimental details is becoming essential in data management practices, both in local storage systems and in public databases. The Ontology Lookup Service (OLS, http://www.ebi.ac.uk/ols) is a popular centralized service to query, browse and navigate biomedical ontologies and controlled vocabularies. Recently, the OLS framework has been completely redeveloped (version 3.0), including enhancements in the data model, like the added support for Web Ontology Language based ontologies, among many other improvements. However, the new OLS is not backwards compatible and new software tools are needed to enable access to this widely used framework now that the previous version is no longer available. We here present the OLS Client as a free, open-source Java library to retrieve information from the new version of the OLS. It enables rapid tool creation by providing a robust, pluggable programming interface and common data model to programmatically access the OLS. The library has already been integrated and is routinely used by several bioinformatics resources and related data annotation tools. Secondly, we also introduce an updated version of the OLS Dialog (version 2.0), a Java graphical user interface that can be easily plugged into Java desktop applications to access the OLS. The software and related documentation are freely available at https://github.com/PRIDE-Utilities/ols-client and https://github.com/PRIDE-Toolsuite/ols-dialog.


Subject(s)
Biological Ontologies , Computational Biology/methods , Databases, Factual , Software , Genomics , Humans , Information Storage and Retrieval , Metabolomics , Proteomics , User-Computer Interface
8.
BMC Bioinformatics ; 18(Suppl 17): 557, 2017 12 21.
Article in English | MEDLINE | ID: mdl-29322915

ABSTRACT

BACKGROUND: The Experimental Factor Ontology (EFO) is an application ontology driven by experimental variables including cell lines to organize and describe the diverse experimental variables and data residing in the EMBL-EBI resources. The Cell Line Ontology (CLO) is an OBO community-based ontology that contains information on immortalized cell lines and relevant experimental components. EFO integrates and extends ontologies from the bio-ontology community to drive a number of practical applications. It is desirable that the community shares design patterns and therefore that EFO reuses the cell line representation from the Cell Line Ontology (CLO). There are, however, challenges to be addressed when developing a common ontology design pattern for representing cell lines in both EFO and CLO. RESULTS: In this study, we developed a strategy to compare and map cell line terms between EFO and CLO. We examined Cellosaurus resources for EFO-CLO cross-references. Text labels of cell lines from both ontologies were verified by biological information axiomatized in each source. The study resulted in the identification of 873 EFO-CLO aligned and 344 EFO unique immortalized permanent cell lines. All of these cell lines were updated to CLO and the cell line related information was merged. A design pattern that integrates EFO and CLO was also developed. CONCLUSION: Our study compared, aligned, and synchronized the cell line information between CLO and EFO. The final updated CLO will be examined as the candidate ontology to import and replace eligible EFO cell line classes thereby supporting the interoperability in the bio-ontology domain. Our mapping pipeline illustrates the use of ontology in aiding biological data standardization and integration through the biological and semantic content of cell lines.


Subject(s)
Algorithms , Biological Ontologies , Cell Physiological Phenomena , Computational Biology/methods , Databases, Factual , Gene Expression Profiling , Cell Line , Data Mining , Humans , Semantics
9.
Nucleic Acids Res ; 42(Database issue): D926-32, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24304889

ABSTRACT

Expression Atlas (http://www.ebi.ac.uk/gxa) is a value-added database providing information about gene, protein and splice variant expression in different cell types, organism parts, developmental stages, diseases and other biological and experimental conditions. The database consists of selected high-quality microarray and RNA-sequencing experiments from ArrayExpress that have been manually curated, annotated with Experimental Factor Ontology terms and processed using standardized microarray and RNA-sequencing analysis methods. The new version of Expression Atlas introduces the concept of 'baseline' expression, i.e. gene and splice variant abundance levels in healthy or untreated conditions, such as tissues or cell types. Differential gene expression data benefit from an in-depth curation of experimental intent, resulting in biologically meaningful 'contrasts', i.e. instances of differential pairwise comparisons between two sets of biological replicates. Other novel aspects of Expression Atlas are its strict quality control of raw experimental data, up-to-date RNA-sequencing analysis methods, expression data at the level of gene sets as well as genes, and a more powerful search interface designed to maximize the biological value provided to the user.


Subject(s)
Databases, Genetic , Gene Expression Profiling , Genomics , Humans , Internet , Oligonucleotide Array Sequence Analysis , Proteins/genetics , Proteins/metabolism , RNA Isoforms/metabolism , Sequence Analysis, RNA
10.
Bioinformatics ; 30(9): 1338-9, 2014 May 01.
Article in English | MEDLINE | ID: mdl-24413672

ABSTRACT

MOTIVATION: Resource description framework (RDF) is an emerging technology for describing, publishing and linking life science data. As a major provider of bioinformatics data and services, the European Bioinformatics Institute (EBI) is committed to making data readily accessible to the community in ways that meet existing demand. The EBI RDF platform has been developed to meet an increasing demand to coordinate RDF activities across the institute and provides a new entry point to querying and exploring integrated resources available at the EBI.


Subject(s)
Computational Biology/methods , Databases, Genetic , Academies and Institutes , Biomedical Research , Internet
11.
BMC Bioinformatics ; 14: 235, 2013 Jul 24.
Article in English | MEDLINE | ID: mdl-23883183

ABSTRACT

BACKGROUND: Constant technological advances have allowed scientists in biology to migrate from conventional single-omics to multi-omics experimental approaches, challenging bioinformatics to bridge this multi-tiered information. Ongoing research in renal biology is no exception. The results of large-scale and/or high throughput experiments, presenting a wealth of information on kidney disease, are scattered across the web. To tackle this problem, we recently presented the KUPKB, a multi-omics data repository for renal diseases. RESULTS: In this article, we describe KUPNetViz, a biological graph exploration tool allowing the exploration of KUPKB data through the visualization of biomolecule interactions. KUPNetViz enables the integration of multi-layered experimental data over different species, renal locations and renal diseases to protein-protein interaction networks and allows association with biological functions, biochemical pathways and other functional elements such as miRNAs. KUPNetViz focuses on the simplicity of its usage and the clarity of resulting networks by reducing and/or automating advanced functionalities present in other biological network visualization packages. In addition, it allows the extrapolation of biomolecule interactions across different species, leading to the formulation of new plausible hypotheses, adequate experiment design and to the suggestion of novel biological mechanisms. We demonstrate the value of KUPNetViz by two usage examples: the integration of calreticulin as a key player in a larger interaction network in renal graft rejection and the novel observation of the strong association of interleukin-6 with polycystic kidney disease. CONCLUSIONS: The KUPNetViz is an interactive and flexible biological network visualization and exploration tool. It provides renal biologists with biological network snapshots of the complex integrated data of the KUPKB allowing the formulation of new hypotheses in a user-friendly manner.


Subject(s)
Internet , Polycystic Kidney Diseases , Software , Animals , Computational Biology/methods , Computational Biology/standards , Disease Models, Animal , Humans , Polycystic Kidney Diseases/diagnosis , Polycystic Kidney Diseases/genetics , Polycystic Kidney Diseases/pathology , Search Engine
12.
FASEB J ; 26(5): 2145-53, 2012 May.
Article in English | MEDLINE | ID: mdl-22345404

ABSTRACT

The information gathered from the large number of omics experiments in renal biology is underexplored, as it is scattered over many publications or held in supplemental data. To address this, we have developed an open-source Kidney and Urinary Pathway Knowledge Base (KUPKB) that facilitates simple exploration of these omics data. The KUPKB currently comprises 220 data sets (miRNA, mRNA, proteins, and metabolites) extracted from existing publications or databases. Researchers can explore the integrated data using the iKUP browser, and a simple template is provided to submit new omics data sets to the knowledge base. As an example of iKUP's use, we show how we identified, in silico, calreticulin as a protein induced in human interstitial fibrosis and tubular atrophy (IFTA) in chronic kidney transplant rejection; a link that would have been difficult to establish using existing Web-based tools. Using immunohistochemistry, we validated in vivo this in silico result in human and rat biopsies of IFTA, thus identifying calreticulin as a potential new player in chronic kidney transplant rejection. The KUPKB provides a simple tool that enables users to quickly survey a wide range of omics data sets and has been shown to facilitate rapid hypothesis generation in the context of renal pathophysiology.


Subject(s)
Databases, Factual , Internet , Kidney Diseases/metabolism , Animals , Calreticulin/metabolism , Disease Models, Animal , Humans , Immunohistochemistry , Male , Rats , Rats, Sprague-Dawley
13.
J Biomed Semantics ; 14(1): 6, 2023 06 01.
Article in English | MEDLINE | ID: mdl-37264430

ABSTRACT

BACKGROUND: The Findable, Accessible, Interoperable and Reusable (FAIR) Principles explicitly require the use of FAIR vocabularies, but what precisely constitutes a FAIR vocabulary remains unclear. Being able to define FAIR vocabularies, identify features of FAIR vocabularies, and provide assessment approaches against the features can guide the development of vocabularies. RESULTS: We differentiate data, data resources and vocabularies used for FAIR, examine the application of the FAIR Principles to vocabularies, align their requirements with the Open Biomedical Ontologies principles, and propose FAIR Vocabulary Features (FVFs). We also design assessment approaches for FAIR vocabularies by mapping the FVFs with existing FAIR assessment indicators. Finally, we demonstrate how they can be used for evaluating and improving vocabularies using exemplary biomedical vocabularies. CONCLUSIONS: Our work proposes features of FAIR vocabularies and corresponding indicators for assessing the FAIR levels of different types of vocabularies, identifies use cases for vocabulary engineers, and guides the evolution of vocabularies.


Subject(s)
Biological Ontologies , Vocabulary, Controlled , Vocabulary
14.
BMC Bioinformatics ; 13 Suppl 1: S5, 2012 Jan 25.
Article in English | MEDLINE | ID: mdl-22373396

ABSTRACT

BACKGROUND: Ontologies are being developed for the life sciences to standardise the way we describe and interpret the wealth of data currently being generated. As more ontology based applications begin to emerge, tools are required that enable domain experts to contribute their knowledge to the growing pool of ontologies. There are many barriers that prevent domain experts engaging in the ontology development process and novel tools are needed to break down these barriers to engage a wider community of scientists. RESULTS: We present Populous, a tool for gathering content with which to construct an ontology. Domain experts need to add content, that is often repetitive in its form, but without having to tackle the underlying ontological representation. Populous presents users with a table based form in which columns are constrained to take values from particular ontologies. Populated tables are mapped to patterns that can then be used to automatically generate the ontology's content. These forms can be exported as spreadsheets, providing an interface that is much more familiar to many biologists. CONCLUSIONS: Populous's contribution is in the knowledge gathering stage of ontology development; it separates knowledge gathering from the conceptualisation and axiomatisation, as well as separating the user from the standard ontology authoring environments. Populous is by no means a replacement for standard ontology editing tools, but instead provides a useful platform for engaging a wider community of scientists in the mass production of ontology content.
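
The table-to-pattern idea in this abstract can be sketched in Python as follows. The pattern text, the column names and the allowed-value sets are all hypothetical and chosen only for illustration; they are not taken from Populous itself, which has its own pattern language and user interface.

```python
# Hypothetical pattern: each table row is expanded into one class definition.
PATTERN = "Class: {label}\n    SubClassOf: {parent} and (part_of some {location})"


def rows_to_axioms(rows: list[dict[str, str]], allowed: dict[str, set[str]]) -> list[str]:
    """Check constrained columns against their allowed terms, then fill the pattern."""
    axioms = []
    for i, row in enumerate(rows):
        # Constrained columns may only hold values drawn from a given term set.
        for column, values in allowed.items():
            if row[column] not in values:
                raise ValueError(f"row {i}: {row[column]!r} is not allowed in column {column!r}")
        axioms.append(PATTERN.format(**row))
    return axioms
```

The domain expert only fills in rows; the pattern carries the ontological representation, which is the separation of concerns the abstract describes.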


Subject(s)
Biological Ontologies , Computational Biology/methods , Databases, Factual , Semantics , Software , User-Computer Interface
16.
Database (Oxford) ; 2022, 2022 05 25.
Article in English | MEDLINE | ID: mdl-35616100

ABSTRACT

Despite progress in the development of standards for describing and exchanging scientific information, the lack of easy-to-use standards for mapping between different representations of the same or similar objects in different databases poses a major impediment to data integration and interoperability. Mappings often lack the metadata needed to be correctly interpreted and applied. For example, are two terms equivalent or merely related? Are they narrow or broad matches? Or are they associated in some other way? Such relationships between the mapped terms are often not documented, which leads to incorrect assumptions and makes them hard to use in scenarios that require a high degree of precision (such as diagnostics or risk prediction). Furthermore, the lack of descriptions of how mappings were done makes it hard to combine and reconcile mappings, particularly curated and automated ones. We have developed the Simple Standard for Sharing Ontological Mappings (SSSOM) which addresses these problems by: (i) Introducing a machine-readable and extensible vocabulary to describe metadata that makes imprecision, inaccuracy and incompleteness in mappings explicit. (ii) Defining an easy-to-use simple table-based format that can be integrated into existing data science pipelines without the need to parse or query ontologies, and that integrates seamlessly with Linked Data principles. (iii) Implementing open and community-driven collaborative workflows that are designed to evolve the standard continuously to address changing requirements and mapping practices. (iv) Providing reference tools and software libraries for working with the standard. In this paper, we present the SSSOM standard, describe several use cases in detail and survey some of the existing work on standardizing the exchange of mappings, with the goal of making mappings Findable, Accessible, Interoperable and Reusable (FAIR). The SSSOM specification can be found at http://w3id.org/sssom/spec. 
Database URL: http://w3id.org/sssom/spec.
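
As a rough illustration of how a table-based mapping file can be consumed without parsing any ontology, the Python sketch below filters a small tab-separated table for exact matches. The column names and the `skos:`/`semapv:` values reflect my understanding of the SSSOM specification and should be checked against it; the identifiers in the rows are invented, and the function name is hypothetical.

```python
import csv

# Toy mapping table. Lines starting with '#' stand in for the commented
# metadata header that such files may carry; the rows are invented.
MAPPINGS_TSV = (
    "# license: (placeholder)\n"
    "subject_id\tpredicate_id\tobject_id\tmapping_justification\n"
    "EX:001\tskos:exactMatch\tEY:100\tsemapv:ManualMappingCuration\n"
    "EX:002\tskos:broadMatch\tEY:200\tsemapv:LexicalMatching\n"
)


def exact_matches(tsv_text: str) -> list[tuple[str, str]]:
    """Return (subject, object) pairs whose predicate asserts an exact match."""
    # Drop comment lines, then let csv.DictReader use the first row as header.
    lines = [line for line in tsv_text.splitlines() if not line.startswith("#")]
    reader = csv.DictReader(lines, delimiter="\t")
    return [
        (row["subject_id"], row["object_id"])
        for row in reader
        if row["predicate_id"] == "skos:exactMatch"
    ]
```

Because the predicate is explicit in each row, a high-precision use case can keep exact matches and discard broader ones instead of treating every mapping as equivalent.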


Subject(s)
Metadata , Semantic Web , Data Management , Databases, Factual , Workflow
17.
Nat Genet ; 53(9): 1290-1299, 2021 09.
Article in English | MEDLINE | ID: mdl-34493866

ABSTRACT

Many gene expression quantitative trait locus (eQTL) studies have published their summary statistics, which can be used to gain insight into complex human traits by downstream analyses, such as fine mapping and co-localization. However, technical differences between these datasets are a barrier to their widespread use. Consequently, target genes for most genome-wide association study (GWAS) signals have still not been identified. In the present study, we present the eQTL Catalogue (https://www.ebi.ac.uk/eqtl), a resource of quality-controlled, uniformly re-computed gene expression and splicing QTLs from 21 studies. We find that, for matching cell types and tissues, the eQTL effect sizes are highly reproducible between studies. Although most QTLs were shared between most bulk tissues, we identified a greater diversity of cell-type-specific QTLs from purified cell types, a subset of which also manifested as new disease co-localizations. Our summary statistics are freely available to enable the systematic interpretation of human GWAS associations across many cell types and tissues.


Subject(s)
Databases, Genetic , Gene Expression Regulation/genetics , Quantitative Trait Loci/genetics , Quantitative Trait, Heritable , CD4-Positive T-Lymphocytes/cytology , Datasets as Topic , Genome-Wide Association Study , Humans , Multifactorial Inheritance/genetics , Polymorphism, Single Nucleotide/genetics
18.
BMC Bioinformatics ; 10 Suppl 5: S1, 2009 May 06.
Article in English | MEDLINE | ID: mdl-19426458

ABSTRACT

BACKGROUND: Ontology construction for any domain is a labour intensive and complex process. Any methodology that can reduce the cost and increase efficiency has the potential to make a major impact in the life sciences. This paper describes an experiment in ontology construction from text for the animal behaviour domain. Our objective was to see how much could be done in a simple and relatively rapid manner using a corpus of journal papers. We used a sequence of pre-existing text processing steps, and here describe the different choices made to clean the input, to derive a set of terms and to structure those terms in a number of hierarchies. We describe some of the challenges, especially that of focusing the ontology appropriately given a starting point of a heterogeneous corpus. RESULTS: Using mainly automated techniques, we were able to construct an 18055 term ontology-like structure with 73% recall of animal behaviour terms, but a precision of only 26%. We were able to clean unwanted terms from the nascent ontology using lexico-syntactic patterns that tested the validity of term inclusion within the ontology. We used the same technique to test for subsumption relationships between the remaining terms to add structure to the initially broad and shallow structure we generated. All outputs are available at http://thirlmere.aston.ac.uk/~kiffer/animalbehaviour/. CONCLUSION: We present a systematic method for the initial steps of ontology or structured vocabulary construction for scientific domains that requires limited human effort and can make a contribution both to ontology learning and maintenance. The method is useful both for the exploration of a scientific domain and as a stepping stone towards formally rigorous ontologies. The filtering of recognised terms from a heterogeneous corpus to focus upon those that are the topic of the ontology is identified to be one of the main challenges for research in ontology learning.
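
The recall and precision figures reported in this abstract follow the standard set-based definitions, which the Python sketch below spells out. The function name is hypothetical and the example terms in the usage note are invented toy values, not data from the study.

```python
def precision_recall(extracted: set[str], reference: set[str]) -> tuple[float, float]:
    """Compare automatically extracted terms with a reference set of domain terms."""
    true_positives = len(extracted & reference)
    # Precision: share of extracted terms that belong to the domain.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: share of domain terms that the extraction recovered.
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall
```

For instance, extracting `{"grooming", "foraging", "table", "figure"}` against a reference of `{"grooming", "foraging", "mating"}` recovers two of three domain terms while half of the output is noise. This is the same pattern of high recall with low precision that motivates the term-filtering step the abstract describes.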


Subject(s)
Computational Biology/methods , Vocabulary, Controlled , Algorithms , Animals , Database Management Systems , Information Storage and Retrieval , Pattern Recognition, Automated
19.
BMC Bioinformatics ; 10 Suppl 10: S14, 2009 Oct 01.
Article in English | MEDLINE | ID: mdl-19796398

ABSTRACT

BACKGROUND: Semantically-enriched browsing has enhanced the browsing experience by providing contextualized dynamically generated Web content, and quicker access to searched-for information. However, adoption of Semantic Web technologies is limited and user perception in the non-IT domain is sceptical. Furthermore, little attention has been given to evaluating semantic browsers with real users to demonstrate the enhancements and obtain valuable feedback. The Sealife project investigates semantic browsing and its application to the life science domain. Sealife's main objective is to develop the notion of context-based information integration by extending three existing Semantic Web browsers (SWBs) to link the existing Web to the eScience infrastructure. METHODS: This paper describes a user-centred evaluation framework that was developed to evaluate the Sealife SWBs that elicited feedback on users' perceptions on ease of use and information findability. Three sources of data: i) web server logs; ii) user questionnaires; and iii) semi-structured interviews were analysed and comparisons made between each browser and a control system. RESULTS: It was found that the evaluation framework used successfully elicited users' perceptions of the three distinct SWBs. The results indicate that the browser with the most mature and polished interface was rated higher for usability, and semantic links were used by the users of all three browsers. CONCLUSION: Confirmation or contradiction of our original hypotheses with relation to SWBs is detailed along with observations of implementation issues.


Subject(s)
Computational Biology/methods , Internet , Semantics , User-Computer Interface , Databases, Factual
20.
Drug Discov Today ; 24(10): 2068-2075, 2019 10.
Article in English | MEDLINE | ID: mdl-31158512

ABSTRACT

In this review, we provide a summary of recent progress in ontology mapping (OM) at a crucial time when biomedical research is under a deluge of an increasing amount and variety of data. This is particularly important for realising the full potential of semantically enabled or enriched applications and for meaningful insights, such as drug discovery, using machine-learning technologies. We discuss challenges and solutions for better ontology mappings, as well as how to select ontologies before their application. In addition, we describe tools and algorithms for ontology mapping, including evaluation of tool capability and quality of mappings. Finally, we outline the requirements for an ontology mapping service (OMS) and the progress being made towards implementation of such sustainable services.


Subject(s)
Biological Ontologies , Drug Discovery/methods , Machine Learning , Semantics , Algorithms , Humans