Results 1 - 6 of 6
1.
BMC Bioinformatics ; 24(1): 227, 2023 Jun 02.
Article in English | MEDLINE | ID: mdl-37268890

ABSTRACT

BACKGROUND: Entity normalization is an important information extraction task which has recently gained attention, particularly in the clinical/biomedical and life science domains. State-of-the-art methods perform rather well on several popular benchmark datasets. Yet, we argue that the task is far from resolved. RESULTS: We have selected two gold standard corpora and two state-of-the-art methods to highlight some evaluation biases. We present non-exhaustive initial findings on the existence of evaluation problems in the entity normalization task. CONCLUSIONS: Our analysis suggests better evaluation practices to support methodological research in this field.


Subject(s)
Biological Science Disciplines , Information Storage and Retrieval , Research Design , Bias , Natural Language Processing
2.
BMC Bioinformatics ; 21(Suppl 23): 579, 2020 Dec 29.
Article in English | MEDLINE | ID: mdl-33372606

ABSTRACT

BACKGROUND: Entity normalization is an important information extraction task which has gained renewed attention in the last decade, particularly in the biomedical and life science domains. In these domains, and more generally in all specialized domains, this task is still challenging for the latest machine learning-based approaches, which have difficulty handling highly multi-class and few-shot learning problems. To address this issue, we propose C-Norm, a new neural approach which synergistically combines standard and weak supervision, ontological knowledge integration and distributional semantics. RESULTS: Our approach greatly outperforms all methods evaluated on the Bacteria Biotope datasets of BioNLP Open Shared Tasks 2019, without integrating any manually-designed domain-specific rules. CONCLUSIONS: Our results show that relatively shallow neural network methods can perform well in domains that present highly multi-class and few-shot learning problems.


Subject(s)
Algorithms , Neural Networks, Computer , Bacteria/metabolism , Confidence Intervals , Databases as Topic , Ecosystem , Humans , Knowledge , Machine Learning , Phenotype
3.
Genomics Inform ; 17(2): e20, 2019 Jun.
Article in English | MEDLINE | ID: mdl-31307135

ABSTRACT

Entity normalization, or entity linking in the general domain, is an information extraction task that aims to annotate/bind multiple words/expressions in raw text with semantic references, such as concepts of an ontology. An ontology consists minimally of a formally organized vocabulary or hierarchy of terms, which captures knowledge of a domain. Presently, machine-learning methods, often coupled with distributional representations, achieve good performance. However, these require large training datasets, which are not always available, especially for tasks in specialized domains. CONTES (CONcept-TErm System) is a supervised method that addresses entity normalization with ontology concepts using small training datasets. CONTES has some limitations: it does not scale well with very large ontologies, it tends to overgeneralize predictions, and it lacks valid representations for out-of-vocabulary words. Here, we propose to assess different methods for reducing the dimensionality of the ontology representation. We also propose to calibrate parameters in order to make the predictions more accurate, and to address the problem of out-of-vocabulary words with a specific method.

4.
PLoS Comput Biol ; 14(3): e1005992, 2018 03.
Article in English | MEDLINE | ID: mdl-29543809

ABSTRACT

We present a new educational initiative called Meet-U that aims to train students for collaborative work in computational biology and to bridge the gap between education and research. Meet-U mimics the setup of collaborative research projects and takes advantage of the most popular tools for collaborative work and of cloud computing. Students are grouped in teams of 4-5 people and have to carry out a project from A to Z that answers a challenging question in biology. Meet-U promotes "coopetition," as the students collaborate within and across the teams and are also in competition with each other to develop the best final product. Meet-U fosters interactions between different actors of education and research through the organization of a meeting day, open to everyone, where the students present their work to a jury of researchers and jury members give research seminars. This unique combination of education and research is strongly motivating for the students and provides a formidable opportunity for a scientific community to unite and increase its visibility. We report on our experience with Meet-U in two French universities with master's students in bioinformatics and modeling, with protein-protein docking as the subject of the course. Meet-U is easy to implement and can be straightforwardly transferred to other fields and/or universities. All the information and data are available at www.meet-u.org.


Subject(s)
Computational Biology/education , Computational Biology/methods , Research/education , Humans , Research Design , Students , Universities
5.
Med Sci (Paris) ; 34(12): 1111-1114, 2018 12.
Article in French | MEDLINE | ID: mdl-30623769
6.
J Biomed Semantics ; 8(1): 53, 2017 Nov 23.
Article in English | MEDLINE | ID: mdl-29169408

ABSTRACT

BACKGROUND: High-throughput technologies produce huge amounts of heterogeneous biological data at all cellular levels. Structuring these data together with biological knowledge is a critical issue in biology and requires integrative tools and methods such as bio-ontologies to extract and share valuable information. In parallel, the recent development of whole-cell models using a systemic cell description has opened alternatives for data integration. Integrating a systemic cell description within a bio-ontology would help to advance whole-cell data integration and modeling synergistically. RESULTS: We present BiPON, an ontology integrating a multi-scale systemic representation of bacterial cellular processes. BiPON consists of two sub-ontologies, bioBiPON and modelBiPON. bioBiPON organizes the systemic description of biological information while modelBiPON describes the mathematical models (including parameters) associated with biological processes. bioBiPON and modelBiPON are related using bridge rules on classes during automatic reasoning. Biological processes are thus automatically related to mathematical models. 37% of BiPON classes stem from different well-established bio-ontologies, while the others have been manually defined and curated. Currently, BiPON integrates the main processes involved in bacterial gene expression. CONCLUSIONS: BiPON is a proof of concept of how to formally combine systems biology and bio-ontologies. The knowledge formalization is highly flexible and generic. Most of the known cellular processes, new participants or new mathematical models could be inserted in BiPON. Altogether, BiPON opens up promising perspectives for knowledge integration and sharing and can be used by biologists, systems and computational biologists, and the emerging community of whole-cell modeling.


Subject(s)
Bacterial Physiological Phenomena , Biological Ontologies , Computational Biology/methods , Databases, Factual , Prokaryotic Cells/metabolism , Models, Biological , Semantics , Software , Vocabulary, Controlled