1.
BMC Bioinformatics ; 25(1): 251, 2024 Jul 31.
Article in English | MEDLINE | ID: mdl-39085787

ABSTRACT

BACKGROUND: Detecting event triggers in biomedical texts, which contain domain knowledge and context-dependent terms, is more challenging than in general-domain texts. Most state-of-the-art models rely mainly on external resources such as linguistic tools and knowledge bases to improve system performance. However, they lack effective mechanisms to obtain semantic clues from label specification and sentence context. Given its success in image classification, label representation learning is a promising approach to enhancing biomedical event trigger detection models by leveraging the rich semantics of pre-defined event type labels. RESULTS: In this paper, we propose the Biomedical Label-based Synergistic representation Learning (BioLSL) model, which effectively utilizes event type labels by learning their correlation with trigger words and enriches the representation contextually. The BioLSL model consists of three modules. Firstly, the Domain-specific Joint Encoding module employs a transformer-based, domain-specific pre-trained architecture to jointly encode input sentences and pre-defined event type labels. Secondly, the Label-based Synergistic Representation Learning module learns the semantic relationships between input texts and event type labels, and generates a Label-Trigger Aware Representation (LTAR) and a Label-Context Aware Representation (LCAR) for enhanced semantic representations. Finally, the Trigger Classification module makes structured predictions, where each label is predicted with respect to its neighbours. We conduct experiments on three benchmark BioNLP datasets, namely MLEE, GE09, and GE11, to evaluate our proposed BioLSL model. Results show that BioLSL has achieved state-of-the-art performance, outperforming the baseline models. CONCLUSIONS: The proposed BioLSL model demonstrates good performance for biomedical event trigger detection without using any external resources. This suggests that label representation learning and context-aware enhancement are promising directions for improving the task. The key enhancement is that BioLSL effectively learns to construct semantic linkages between event mentions and type labels, which provide the latent information of label-trigger and label-context relationships in biomedical texts. Moreover, additional experiments show that BioLSL performs exceptionally well with limited training data in data-scarce scenarios.
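The abstract does not give BioLSL's equations, so the following is only a minimal sketch of the general idea of relating tokens to event type labels. It uses plain scaled dot-product attention from each token vector to a set of label vectors. The function name `label_aware_tokens`, the toy 2-dimensional embeddings, and the choice to concatenate the label summary onto the token vector are all assumptions made for illustration, not the authors' LTAR or LCAR definitions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_aware_tokens(token_vecs, label_vecs):
    """Hypothetical helper: each token attends over all label vectors.

    Returns, per token, a pair (token vector concatenated with the
    attention-weighted label summary, attention weights over labels).
    """
    dim = len(label_vecs[0])
    enriched = []
    for tok in token_vecs:
        # Scaled dot-product score between this token and every label.
        scores = [
            sum(t * l for t, l in zip(tok, lab)) / math.sqrt(dim)
            for lab in label_vecs
        ]
        weights = softmax(scores)
        # Weighted sum of label vectors, one value per dimension.
        summary = [
            sum(w * lab[i] for w, lab in zip(weights, label_vecs))
            for i in range(dim)
        ]
        enriched.append((tok + summary, weights))
    return enriched


# Toy embeddings, invented for illustration only.
tokens = {"phosphorylation": [1.0, 0.0], "of": [0.1, 0.1]}
labels = {"Phosphorylation": [1.0, 0.0], "Binding": [0.0, 1.0]}
result = label_aware_tokens(list(tokens.values()), list(labels.values()))
```

In this toy setup the token "phosphorylation" puts more attention weight on the matching label than on "Binding", while the function word "of" spreads its weight evenly. In a real system the vectors would come from a pre-trained encoder rather than being hand-written.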


Subject(s)
Semantics; Natural Language Processing; Machine Learning; Data Mining/methods; Algorithms
2.
Neural Netw ; 178: 106424, 2024 Oct.
Article in English | MEDLINE | ID: mdl-38875934

ABSTRACT

In natural language processing, fact verification is a very challenging task, which requires retrieving multiple evidence sentences from a reliable corpus to verify the authenticity of a claim. Although most current deep learning methods use the attention mechanism for fact verification, they do not impose attention constraints on important related words in the claim and evidence sentences, so some attention is misallocated to irrelevant words. In this paper, we propose a syntactic evidence network (SENet) model which incorporates entity keywords, syntactic information and sentence attention for fact verification. The SENet model extracts entity keywords from claim and evidence sentences and uses a pre-trained syntactic dependency parser to extract the corresponding syntactic sentence structures. It then incorporates the extracted syntactic information into the attention mechanism for language-driven word representation. In addition, the sentence attention mechanism is applied to obtain a richer semantic representation. We have conducted experiments on the FEVER and UKP Snopes datasets for performance evaluation. Our SENet model achieved 78.69% in Label Accuracy and 75.63% in FEVER Score on the FEVER dataset. It also achieved 65.0% in precision and 61.2% in macro F1 on the UKP Snopes dataset. The experimental results show that our proposed SENet model outperforms the baseline models and achieves state-of-the-art performance for fact verification.
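The two FEVER metrics reported above measure different things, which the following sketch makes concrete. Label Accuracy counts a claim as correct when the predicted label matches. FEVER Score additionally requires, for claims that are not labelled NOT ENOUGH INFO, that the predicted evidence contains at least one complete gold evidence set. The record layout, the function name `fever_metrics`, and the example claims are assumptions for illustration; the cap of five predicted evidence sentences follows the FEVER shared task convention.

```python
def fever_metrics(predictions, gold, max_evidence=5):
    """Return (label accuracy, FEVER score) for parallel lists of records.

    Hypothetical record layout:
      prediction: {"label": str, "evidence": [(page, sentence_id), ...]}
      gold:       {"label": str, "evidence_sets": [[(page, sentence_id), ...], ...]}
    """
    correct_label = 0
    strict = 0
    for pred, ref in zip(predictions, gold):
        if pred["label"] != ref["label"]:
            continue
        correct_label += 1
        if ref["label"] == "NOT ENOUGH INFO":
            # No evidence is required for this class.
            strict += 1
            continue
        retrieved = set(pred["evidence"][:max_evidence])
        # At least one gold evidence set must be fully retrieved.
        if any(set(group) <= retrieved for group in ref["evidence_sets"]):
            strict += 1
    n = len(gold)
    return correct_label / n, strict / n


# Invented example data.
gold = [
    {"label": "SUPPORTS", "evidence_sets": [[("PageA", 0)]]},
    {"label": "REFUTES", "evidence_sets": [[("PageB", 2), ("PageC", 1)]]},
    {"label": "NOT ENOUGH INFO", "evidence_sets": []},
    {"label": "SUPPORTS", "evidence_sets": [[("PageD", 4)]]},
]
predictions = [
    {"label": "SUPPORTS", "evidence": [("PageA", 0), ("PageX", 3)]},
    {"label": "REFUTES", "evidence": [("PageB", 2)]},  # incomplete evidence set
    {"label": "NOT ENOUGH INFO", "evidence": []},
    {"label": "REFUTES", "evidence": [("PageD", 4)]},  # wrong label
]
label_accuracy, fever_score = fever_metrics(predictions, gold)
```

Here three of four labels are correct, but only two claims also carry complete evidence, so the FEVER Score is lower than the Label Accuracy. The same relationship holds for the figures in the abstract (75.63% versus 78.69%).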


Subject(s)
Natural Language Processing; Semantics; Humans; Deep Learning; Neural Networks, Computer; Attention/physiology; Language
3.
Comput Methods Programs Biomed ; 88(3): 283-94, 2007 Dec.
Article in English | MEDLINE | ID: mdl-17983685

ABSTRACT

Traditional Chinese Medicine (TCM) has been actively researched through various approaches, including computational techniques. A review of the basic elements of TCM is provided to illuminate various challenges and progress in its study using computational methods. Information on various TCM formulations, in particular resources on databases of TCM formulations and their integration with Western medicine, is analyzed in several facets, such as TCM classifications, types of databases, and mining tools. Aspects of computational TCM diagnosis, namely inspection, auscultation, and pulse analysis, as well as TCM expert systems, are reviewed in terms of their benefits and drawbacks. Various approaches to exploring relationships among TCM components and finding genes/proteins related to TCM symptom complexes are also studied. This survey provides a summary of advances in computational approaches for TCM and will be useful for future knowledge discovery in this area.


Subject(s)
Medicine, Chinese Traditional; Data Collection; Information Storage and Retrieval
4.
IEEE Trans Cybern ; 47(6): 1562-1575, 2017 Jun.
Article in English | MEDLINE | ID: mdl-27352402

ABSTRACT

Parallel test paper generation is a biobjective distributed resource optimization problem, which aims to generate multiple similarly optimal test papers automatically according to multiple user-specified assessment criteria. Generating high-quality parallel test papers is challenging due to its NP-hardness in both of the collective objective functions. In this paper, we propose a submodular memetic approximation algorithm for solving this problem. The proposed algorithm is an adaptive memetic algorithm (MA), which exploits the submodular property of the collective objective functions to design greedy-based approximation algorithms for enhancing steps of the multiobjective MA. Synergizing the intensification of submodular local search mechanism with the diversification of the population-based submodular crossover operator, our algorithm can jointly optimize the total quality maximization objective and the fairness quality maximization objective. Our MA can achieve provable near-optimal solutions in a huge search space of large datasets in efficient polynomial runtime. Performance results on various datasets have shown that our algorithm has drastically outperformed the current techniques in terms of paper quality and runtime efficiency.
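The abstract mentions greedy-based approximation algorithms that exploit submodularity but does not spell them out. As a rough illustration of that building block only, the sketch below runs the standard greedy rule for a monotone submodular function under a cardinality constraint, for which the classic guarantee is a (1 - 1/e) fraction of the optimum. The objective used here (number of distinct topics covered by the chosen questions), the function name `greedy_max_coverage`, and the question pool are assumptions; this is not the authors' memetic algorithm or their quality objectives.

```python
def greedy_max_coverage(candidates, k):
    """Pick up to k items, each time taking the largest marginal gain.

    candidates maps an item name to the set of topics it covers.
    Coverage (size of the union) is monotone and submodular.
    """
    chosen = []
    covered = set()
    for _ in range(k):
        remaining = [c for c in candidates if c not in chosen]
        if not remaining:
            break
        # Marginal gain = topics this item adds beyond what is covered.
        best = max(remaining, key=lambda c: len(candidates[c] - covered))
        if not candidates[best] - covered:
            break  # no remaining item adds anything new
        chosen.append(best)
        covered |= candidates[best]
    return chosen, covered


# Invented question pool: question id -> topics it assesses.
questions = {
    "q1": {"algebra", "geometry"},
    "q2": {"geometry"},
    "q3": {"calculus", "statistics", "algebra"},
    "q4": {"statistics"},
}
chosen, covered = greedy_max_coverage(questions, k=2)
```

With this pool the greedy rule first takes `q3` (three new topics) and then `q1` (which adds geometry), covering all four topics with two questions. A full system along the lines of the abstract would embed steps like this inside a population-based search and would track two objectives rather than one.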

5.
Artif Intell Med ; 47(2): 105-19, 2009 Oct.
Article in English | MEDLINE | ID: mdl-19376690

ABSTRACT

OBJECTIVE: Recently, much research has proposed using nature-inspired algorithms to perform complex machine learning tasks. Ant colony optimization (ACO) is one such algorithm based on swarm intelligence and is derived from a model inspired by the collective foraging behavior of ants. Taking advantage of ACO traits such as self-organization and robustness, this paper investigates ant-based algorithms for gene expression data clustering and associative classification. METHODS AND MATERIAL: An ant-based clustering (Ant-C) algorithm and an ant-based association rule mining (Ant-ARM) algorithm are proposed for gene expression data analysis. The proposed algorithms make use of the natural behavior of ants, such as cooperation and adaptation, to allow for a flexible, robust search for a good candidate solution. RESULTS: Ant-C has been tested on three datasets selected from the Stanford Genomic Resource Database and achieved relatively high accuracy compared to other classical clustering methods. Ant-ARM has been tested on the acute lymphoblastic leukemia (ALL)/acute myeloid leukemia (AML) dataset and generated about 30 classification rules with high accuracy. CONCLUSIONS: Ant-C can generate an optimal number of clusters without incorporating any other algorithms such as K-means or agglomerative hierarchical clustering. For associative classification, while a few well-known algorithms such as Apriori, FP-growth and Magnum Opus are unable to mine any association rules from the ALL/AML dataset within a reasonable period of time, Ant-ARM is able to extract associative classification rules.
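The abstract does not describe how Ant-C moves data items, so the sketch below shows only the pick-up and drop probabilities of the classic ant-clustering model from the wider literature, as one plausible reading of "natural behavior of ants". Whether Ant-C uses these exact rules is an assumption. The function names and the constants `k1` and `k2` are illustrative, and `f` stands for the local density of similar items around an ant, scaled to the range 0 to 1.

```python
def pick_probability(f, k1=0.1):
    """Chance that an unladen ant picks up an item.

    High when few similar items are nearby (f close to 0).
    """
    return (k1 / (k1 + f)) ** 2


def drop_probability(f, k2=0.15):
    """Chance that a laden ant drops its item.

    High when many similar items are nearby (f close to 1).
    """
    return (f / (k2 + f)) ** 2


# An isolated item is always picked up and never dropped in empty space.
isolated_pick = pick_probability(0.0)
isolated_drop = drop_probability(0.0)
# An item inside a dense group of similar items tends to stay put.
dense_pick = pick_probability(0.9)
dense_drop = drop_probability(0.9)
```

Repeated over many ants and steps, rules of this shape move isolated items toward groups of similar items, so clusters form without the number of clusters being fixed in advance.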


Subject(s)
Algorithms; Ants; Gene Expression; Animals; Cluster Analysis; Feeding Behavior