1.
Int J Med Inform ; 191: 105539, 2024 Nov.
Article in English | MEDLINE | ID: mdl-39084086

ABSTRACT

BACKGROUND: Adverse Drug Events (ADEs) are key information present in the unstructured portions of Electronic Health Records. They pose a significant challenge in healthcare, ranging from mild discomfort to severe complications, and can impact patient safety and treatment outcomes. METHODS: We explore the influence of domain shift between a set of dummy clinical notes and a real-world hospital corpus of Japanese clinical notes on breast cancer treatment when extracting ADEs from free text. We annotated a subset of the hospital dataset and used it to fine-tune a Named Entity Recognition (NER) model initially trained on the set of dummy documents. We used increasing amounts of the annotated data and evaluated the impact on the model's performance. Additionally, we examined the extracted information to identify combinations of drugs that are likely to cause ADEs. RESULTS: We show that domain adaptation can significantly improve model performance in the new domain: fine-tuning on a small subset of 100 documents yielded a 40% improvement in model performance. However, we also noticed diminishing returns when fine-tuning the model with a larger dataset. For instance, feeding eight times more data yielded only a further 18% improvement in extraction performance. CONCLUSION: Variations in writing style and vocabulary in clinical corpora can significantly impact the quality of NER results. We show that domain adaptation can be of great aid in mitigating these discrepancies and achieving better performance. Yet, while providing in-domain data to a model helps, there are diminishing returns when fine-tuning with large amounts of data.


Subjects
Breast Neoplasms , Drug-Related Side Effects and Adverse Reactions , Electronic Health Records , Natural Language Processing , Humans , Breast Neoplasms/drug therapy , Female , Data Mining/methods
2.
JMIR Med Inform ; 12: e59680, 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-38954456

ABSTRACT

BACKGROUND: Named entity recognition (NER) is a fundamental task in natural language processing. However, it is typically preceded by named entity annotation, which poses several challenges, especially in the clinical domain. For instance, determining entity boundaries is one of the most common sources of disagreements between annotators due to questions such as whether modifiers or peripheral words should be annotated. If unresolved, these can induce inconsistency in the produced corpora, yet, on the other hand, strict guidelines or adjudication sessions can further prolong an already slow and convoluted process. OBJECTIVE: The aim of this study is to address these challenges by evaluating 2 novel annotation methodologies, lenient span and point annotation, aiming to mitigate the difficulty of precisely determining entity boundaries. METHODS: We evaluate their effects through an annotation case study on a Japanese medical case report data set. We compare annotation time, annotator agreement, and the quality of the produced labeling and assess the impact on the performance of an NER system trained on the annotated corpus. RESULTS: We saw significant improvements in the labeling process efficiency, with up to a 25% reduction in overall annotation time and even a 10% improvement in annotator agreement compared to the traditional boundary-strict approach. However, even the best-achieved NER model presented some drop in performance compared to the traditional annotation methodology. CONCLUSIONS: Our findings demonstrate a balance between annotation speed and model performance. Although disregarding boundary information affects model performance to some extent, this is counterbalanced by significant reductions in the annotator's workload and notable improvements in the speed of the annotation process. These benefits may prove valuable in various applications, offering an attractive compromise for developers and researchers.
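
The boundary disagreements described in this abstract can be illustrated with a small sketch. It is not the authors' method or code: the span representation (character offsets), the function names, and the example offsets are all assumptions made for illustration. It compares a boundary-strict agreement score with an overlap-based (lenient) one for two hypothetical annotators.

```python
from typing import Callable, List, Tuple

# Hypothetical representation: (start, end) character offsets, end exclusive.
Span = Tuple[int, int]


def strict_match(a: Span, b: Span) -> bool:
    """Spans agree only if both boundaries are identical."""
    return a == b


def lenient_match(a: Span, b: Span) -> bool:
    """Spans agree if they overlap by at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def agreement(spans_a: List[Span], spans_b: List[Span],
              match: Callable[[Span, Span], bool]) -> float:
    """F1-style agreement between two annotators under a matching rule."""
    if not spans_a or not spans_b:
        return 0.0
    hits_a = sum(any(match(a, b) for b in spans_b) for a in spans_a)
    hits_b = sum(any(match(b, a) for a in spans_a) for b in spans_b)
    precision = hits_a / len(spans_a)
    recall = hits_b / len(spans_b)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented example: the annotators disagree on whether a leading modifier
# belongs to the first entity, and agree exactly on the second entity.
annotator_1 = [(0, 12), (20, 28)]
annotator_2 = [(5, 12), (20, 28)]

strict_score = agreement(annotator_1, annotator_2, strict_match)
lenient_score = agreement(annotator_1, annotator_2, lenient_match)
print(strict_score, lenient_score)
```

Under these assumptions, the single modifier disagreement halves the strict score while leaving the lenient score untouched, which is the kind of boundary sensitivity the abstract refers to.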
