Results 1 - 7 of 7
1.
J Biomed Inform ; 149: 104578, 2024 01.
Article in English | MEDLINE | ID: mdl-38122841

ABSTRACT

OBJECTIVE: Coreference resolution (CR) is a natural language processing (NLP) task that is concerned with finding all expressions within a single document that refer to the same entity. This makes it crucial in supporting downstream NLP tasks such as summarization, question answering and information extraction. Despite great progress in CR, our experiments have highlighted a substandard performance of the existing open-source CR tools in the clinical domain. We set out to explore some practical solutions to fine-tune their performance on clinical data. METHODS: We first explored the possibility of automatically producing silver standards following the success of such an approach in other clinical NLP tasks. We designed an ensemble approach that leverages multiple models to automatically annotate co-referring mentions. Subsequently, we looked into other ways of incorporating human feedback to improve the performance of an existing neural network approach. We proposed a semi-automatic annotation process to facilitate the manual annotation process. We also compared the effectiveness of active learning relative to random sampling in an effort to further reduce the cost of manual annotation. RESULTS: Our experiments demonstrated that the silver standard approach was ineffective in fine-tuning the CR models. Our results indicated that active learning should also be applied with caution. The semi-automatic annotation approach combined with continued training was found to be well suited for the rapid transfer of CR models under low-resource conditions. The ensemble approach demonstrated a potential to further improve accuracy by leveraging multiple fine-tuned models. CONCLUSION: Overall, we have effectively transferred a general CR model to a clinical domain. Our findings based on extensive experimentation have been summarized into practical suggestions for rapid transferring of CR models across different styles of clinical narratives.


Subjects
Information Storage and Retrieval , Neural Networks, Computer , Humans , Natural Language Processing , Narration , Empirical Research
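
The abstract of record 1 describes an ensemble that leverages multiple models to annotate co-referring mentions automatically, but it does not say how the model outputs are combined. The following is a minimal sketch of one possible scheme, simple vote counting over mention pairs. The mention representation, the `vote_links` and `link` helpers, and the `min_votes` threshold are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import FrozenSet, Iterable, Set, Tuple

# Hypothetical representation: a mention is a (start, end) token span,
# and a link is an unordered pair of mentions predicted to co-refer.
Mention = Tuple[int, int]
Link = FrozenSet[Mention]


def link(a: Mention, b: Mention) -> Link:
    """Build an unordered co-reference link between two mentions."""
    return frozenset((a, b))


def vote_links(model_outputs: Iterable[Set[Link]], min_votes: int) -> Set[Link]:
    """Keep every link proposed by at least `min_votes` models.

    This is an assumed combination rule for illustration only.
    """
    counts = Counter(l for links in model_outputs for l in links)
    return {l for l, votes in counts.items() if votes >= min_votes}


# Illustrative predictions from three imaginary models on one document.
outputs = [
    {link((0, 1), (5, 5)), link((7, 8), (12, 12))},
    {link((0, 1), (5, 5))},
    {link((0, 1), (5, 5)), link((7, 8), (12, 12)), link((2, 2), (9, 9))},
]

# Links supported by at least two of the three models form the silver annotation.
silver = vote_links(outputs, min_votes=2)
```

Raising `min_votes` trades recall for precision in the resulting silver annotations; the abstract reports that silver standards alone were ineffective for fine-tuning, so a sketch like this is best read as an illustration of the idea rather than a recommended pipeline.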
2.
Entropy (Basel) ; 26(6)2024 Jun 19.
Article in English | MEDLINE | ID: mdl-38920537

ABSTRACT

Coreference resolution is a key task in Natural Language Processing. It is difficult to evaluate the similarity of long-span texts, which makes text-level encoding somewhat challenging. This paper first compares the impact of commonly used methods to improve the global information collection ability of the model on the BERT encoding performance. Based on this, a multi-scale context information module is designed to improve the applicability of the BERT encoding model under different text spans. In addition, linear separability is improved through dimension expansion. Finally, cross-entropy loss is used as the loss function. After adding the module designed in this article to BERT and SpanBERT, F1 increased by 0.5% and 0.2%, respectively.

3.
BMC Med Inform Decis Mak ; 22(1): 116, 2022 04 30.
Article in English | MEDLINE | ID: mdl-35501781

ABSTRACT

BACKGROUND: Bio-entity Coreference Resolution (CR) is a vital task in biomedical text mining. An important issue in CR is the differential representation of identical mentions, as their similar representations may make the coreference more puzzling. However, when extracting features, existing neural network-based models may bring additional noise to the distinction of identical mentions since they tend to get similar or even identical feature representations. METHODS: We propose a context-aware feature attention model to distinguish similar or identical text units effectively for better resolving coreference. The new model can represent the identical mentions based on different contexts by adaptively exploiting features, which enables the model to reduce the text noise and capture the semantic information effectively. RESULTS: The experimental results show that the proposed model brings significant improvements over most of the baselines for coreference resolution and mention detection on the BioNLP and CRAFT-CR datasets. The empirical studies further demonstrate its superior performance on the differential representation and coreferential link of identical mentions. CONCLUSIONS: Identical mentions impose difficulties on the current methods of Bio-entity coreference resolution. Thus, we propose the context-aware feature attention model to better distinguish identical mentions and achieve superior performance on both coreference resolution and mention detection, which will further improve the performance of the downstream tasks.


Subjects
Data Mining , Semantics , Data Mining/methods , Humans , Neural Networks, Computer
4.
Sensors (Basel) ; 21(3)2021 Jan 30.
Article in English | MEDLINE | ID: mdl-33573265

ABSTRACT

Visual dialog demonstrates several important aspects of multimodal artificial intelligence; however, it is hindered by visual grounding and visual coreference resolution problems. To overcome these problems, we propose the novel neural module network for visual dialog (NMN-VD). NMN-VD is an efficient question-customized modular network model that combines only the modules required for deciding answers after analyzing input questions. In particular, the model includes a Refer module that effectively finds the visual area indicated by a pronoun using a reference pool to solve a visual coreference resolution problem, which is an important challenge in visual dialog. In addition, the proposed NMN-VD model includes a method for distinguishing and handling impersonal pronouns that do not require visual coreference resolution from general pronouns. Furthermore, a new Compare module that effectively handles comparison questions found in visual dialogs is included in the model, as well as a Find module that applies a triple-attention mechanism to solve visual grounding problems between the question and the image. The results of various experiments conducted using a set of large-scale benchmark data verify the efficacy and high performance of our proposed NMN-VD model.

5.
J Biomed Inform ; 60: 309-18, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26925515

ABSTRACT

BACKGROUND: Coreference resolution is an essential task in information extraction from the published biomedical literature. It supports the discovery of complex information by linking referring expressions such as pronouns and appositives to their referents, which are typically entities that play a central role in biomedical events. Correctly establishing these links allows detailed understanding of all the participants in events, and connecting events together through their shared participants. RESULTS: As an initial step towards the development of a novel coreference resolution system for the biomedical domain, we have categorised the characteristics of coreference relations by type of anaphor as well as broader syntactic and semantic characteristics, and have compared the performance of a domain adaptation of a state-of-the-art general system to published results from domain-specific systems in terms of this categorisation. We also develop a rule-based system for anaphoric coreference resolution in the biomedical domain with simple modules derived from available systems. Our results show that the domain-specific systems outperform the general system overall. Whilst this result is unsurprising, our proposed categorisation enables a detailed quantitative analysis of the system performance. We identify limitations of each system and find that there remain important gaps in the state-of-the-art systems, which are clearly identifiable with respect to the categorisation. CONCLUSION: We have analysed in detail the performance of existing coreference resolution systems for the biomedical literature and have demonstrated that there are clear gaps in their coverage. The approach developed in the general domain needs to be tailored for portability to the biomedical domain. The specific framework for class-based error analysis of existing systems that we propose has benefits for identifying specific limitations of those systems. This in turn provides insights for further system development.


Subjects
Data Mining/methods , Electronic Health Records , Language , Natural Language Processing , Algorithms , False Negative Reactions , Humans , Medical Informatics , Pattern Recognition, Automated , Problem Solving , Publications , Reproducibility of Results , Semantics
6.
J Med Syst ; 40(9): 206, 2016 Sep.
Article in English | MEDLINE | ID: mdl-27518854

ABSTRACT

This paper describes the design of an ellipsis and coreference resolution module integrated in a computerized virtual patient dialogue system. Real medical diagnosis dialogues have been collected and analyzed. Several groups of diagnosis-related concepts were defined and used to construct rules, patterns, and features to detect and resolve ellipsis and coreference. The best F-scores of ellipsis detection and resolution were 89.15 % and 83.40 %, respectively. The best F-scores of phrasal coreference detection and resolution were 93.83 % and 83.40 %, respectively. The accuracy of pronominal anaphora resolution was 92 % for the 3rd-person singular pronouns referring to specific entities, and 97.31 % for other pronouns.


Subjects
Communication , Physician-Patient Relations , User-Computer Interface , Taiwan
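
Record 6 reports its detection and resolution results as F-scores without restating the definition. As a reminder of how such a score is derived, here is a short sketch of the standard F-measure computed from precision and recall. The function name `f_score` and the input values are illustrative assumptions; the precision and recall figures behind the reported scores are not given in the abstract.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Standard F-measure: weighted harmonic mean of precision and recall.

    With beta = 1 this is the usual F1 score.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    if precision == 0 and recall == 0:
        return 0.0  # conventional value when nothing is correctly found
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Illustrative values only, not taken from the paper.
example_f1 = f_score(precision=0.9, recall=0.8)
```

Because the harmonic mean is pulled toward the lower of the two inputs, a system cannot reach a high F-score by maximising precision or recall alone.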
7.
Front Hum Neurosci ; 13: 398, 2019.
Article in English | MEDLINE | ID: mdl-31803033

ABSTRACT

In this EEG study, we used pre-registered and exploratory ERP and time-frequency analyses to investigate the resolution of anaphoric and non-anaphoric noun phrases during discourse comprehension. Participants listened to story contexts that described two antecedents, and subsequently read a target sentence with a critical noun phrase that lexically matched one antecedent ('old'), matched two antecedents ('ambiguous'), partially matched one antecedent in terms of semantic features ('partial-match'), or introduced another referent (non-anaphoric, 'new'). After each target sentence, participants judged whether the noun referred back to an antecedent (i.e., an 'old/new' judgment), which was easiest for ambiguous nouns and hardest for partially matching nouns. The noun-elicited N400 ERP component demonstrated initial sensitivity to repetition and semantic overlap, corresponding to repetition and semantic priming effects, respectively. New and partially matching nouns both elicited a subsequent frontal positivity, which suggested that partially matching anaphors may have been processed as new nouns temporarily. ERPs in an even later time window and ERPs time-locked to sentence-final words suggested that new and partially matching nouns had different effects on comprehension, with partially matching nouns incurring additional processing costs up to the end of the sentence. In contrast to the ERP results, the time-frequency results primarily demonstrated sensitivity to noun repetition, and did not differentiate partially matching anaphors from new nouns. In sum, our results show the ERP and time-frequency effects of referent repetition during discourse comprehension, and demonstrate the potentially demanding nature of establishing the anaphoric meaning of a novel noun.
