Search | Biblioteca Virtual em Saúde (Virtual Health Library)

The future of large language models in fighting emerging outbreaks: lights and shadows.

Rizzo, Alberto; Mensa, Enrico; Giacomelli, Andrea.

Lancet Microbe ; : 100954, 2024 Jul 30.

Article in English | MEDLINE | ID: mdl-39094589

Editorial: Information extraction for health documents.

Mensa, Enrico; Martínez Fernández, Paloma; Roller, Roland; Radicioni, Daniele P.

Front Artif Intell ; 6: 1224529, 2023.

Article in English | MEDLINE | ID: mdl-37396971

Violence detection explanation via semantic roles embeddings.

Mensa, Enrico; Colla, Davide; Dalmasso, Marco; Giustini, Marco; Mamo, Carlo; Pitidis, Alessio; Radicioni, Daniele P.

BMC Med Inform Decis Mak ; 20(1): 263, 2020 Oct 15.

Article in English | MEDLINE | ID: mdl-33059690

ABSTRACT

BACKGROUND: Emergency room reports pose specific challenges to natural language processing techniques. In this setting, episodes of violence against women, the elderly and children are often under-reported. Categorizing textual descriptions as containing violence-related injuries (V) vs. non-violence-related injuries (NV) is thus a relevant task for devising alerting mechanisms to track (and prevent) violence episodes.
METHODS: We present VIDES (so dubbed after VIOLENCE DETECTION SYSTEM), a system to detect episodes of violence from narrative texts in emergency room reports. It employs a deep neural network to categorize textual ER report data, and complements this output by making explicit which elements corroborate the interpretation of the record as reporting violence-related injuries. To these ends we designed a novel hybrid technique for filling semantic frames that employs distributed representations of terms herein, along with syntactic and semantic information. The system has been validated on real data annotated with two sorts of information: the presence vs. absence of violence-related injuries, and some semantic roles that can be interpreted as major cues for violent episodes, such as the agent that committed violence, the victim, the body district involved, etc. The employed dataset contains over 150K records annotated with class (V, NV) information, and 200 records with finer-grained information on the aforementioned semantic roles.
RESULTS: We used data coming from an Italian branch of the EU-Injury Database (EU-IDB) project, compiled by hospital staff. Categorization figures approach full precision and recall for negative cases, and 0.97 precision and 0.94 recall on positive cases. As regards the recognition of semantic roles, we recorded an accuracy varying from 0.28 to 0.90 according to the semantic roles involved. Moreover, the system allowed unveiling annotation errors committed by hospital staff.
CONCLUSIONS: Explaining systems' results, so as to make their output more comprehensible and convincing, is today necessary for AI systems. Our proposal is to combine distributed and symbolic (frame-like) representations as a possible answer to this pressing demand for interpretability. Although presently focused on the medical domain, the proposed methodology is general and, in principle, can be extended to further application areas and categorization tasks.

Subjects

Natural Language Processing; Neural Networks, Computer; Semantics; Violence; Aged; Child; Female; Humans; Italy
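
The precision and recall figures quoted in the abstract above are per-class measures for the V and NV categories. As a minimal sketch of how such figures are conventionally computed from gold and predicted labels, the following may help; the function name `precision_recall`, the `ClassMetrics` container, and the toy labels are assumptions made for illustration and are not taken from the VIDES system or its data.

```python
from dataclasses import dataclass


@dataclass
class ClassMetrics:
    precision: float
    recall: float


def precision_recall(gold, predicted, target_label):
    """Precision and recall for one class, from two parallel label lists."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    pairs = list(zip(gold, predicted))
    # True positives: records of the target class that were labeled as such.
    tp = sum(1 for g, p in pairs if g == target_label and p == target_label)
    # False positives: records labeled as the target class that are not.
    fp = sum(1 for g, p in pairs if g != target_label and p == target_label)
    # False negatives: records of the target class that were missed.
    fn = sum(1 for g, p in pairs if g == target_label and p != target_label)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return ClassMetrics(precision, recall)


# Toy labels invented for illustration only (not data from the study).
gold = ["V", "NV", "NV", "V", "NV", "V"]
pred = ["V", "NV", "V", "V", "NV", "NV"]

v_metrics = precision_recall(gold, pred, "V")
nv_metrics = precision_recall(gold, pred, "NV")
print(f"V:  precision={v_metrics.precision:.2f} recall={v_metrics.recall:.2f}")
print(f"NV: precision={nv_metrics.precision:.2f} recall={nv_metrics.recall:.2f}")
```

Reporting the two classes separately matters when, as in the abstract, one class is far easier to recognize than the other: a single overall accuracy would hide the weaker recall on positive cases.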

Sense identification data: A dataset for lexical semantics.

Colla, Davide; Mensa, Enrico; Radicioni, Daniele P.

Data Brief ; 32: 106267, 2020 Oct.

Article in English | MEDLINE | ID: mdl-32984463

ABSTRACT

Sense Identification is a newly proposed task; in considering a pair of terms to assess their conceptual similarity, human raters are postulated to preliminarily select a sense pair. The senses involved in this pair are those actually subject to similarity rating. The sense identification task consists in searching for the senses selected during the similarity rating. The task is important for investigating the strategies and sense inventories underlying human lexical access and, moreover, it is a relevant complement to the semantic similarity task. Identifying which senses are involved in the similarity rating is also crucial in order to fully assess those ratings: if we have no idea of which two senses were retrieved, on what basis can we assess the score expressing their semantic proximity? The Sense Identification Dataset (SID) has been built to provide a common experimental ground for systems and approaches dealing with the sense identification task. It is the first dataset specifically designed for experimenting on this task. The SID dataset was created by manually annotating with sense identifiers the term pairs from an existing dataset, the SemEval-2017 Task 2 English dataset. The original dataset was conceived for experimenting on the semantic similarity task, and it contains a score expressing the human similarity rating for each term pair. For each such term pair we added a pair of annotated senses: in particular, senses were annotated such that they are compatible with (explicative of) the existing similarity ratings. The SID dataset contains BabelNet sense identifiers. This sense inventory is a broadly adopted 'naming convention' for word senses, and such identifiers can be easily mapped onto further resources such as WordNet and WikiData, thereby enabling further processing tasks and usages in the Natural Language Processing pipeline.
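
To make the task described in the abstract above more concrete, here is a minimal sketch of how a sense-annotated term pair could be represented and how a system's output could be scored against it. Everything in it is an assumption for illustration: the `SenseAnnotatedPair` type, the `sense_pair_accuracy` function, the exact-match scoring rule, the similarity values, and the placeholder sense strings (which are not real BabelNet identifiers) do not come from the SID dataset or its documentation.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SenseAnnotatedPair:
    term1: str
    term2: str
    similarity: float          # human similarity rating from the source dataset
    senses: Tuple[str, str]    # one annotated sense identifier per term


def sense_pair_accuracy(
    gold: List[SenseAnnotatedPair],
    predicted: Dict[Tuple[str, str], Tuple[str, str]],
) -> float:
    """Fraction of term pairs whose predicted sense pair matches the annotation.

    Exact match on both senses is an assumed scoring rule, chosen for simplicity.
    A term pair with no prediction counts as a miss.
    """
    if not gold:
        return 0.0
    hits = sum(
        1 for item in gold
        if predicted.get((item.term1, item.term2)) == item.senses
    )
    return hits / len(gold)


# Invented records with placeholder sense strings (hypothetical, not BabelNet IDs).
gold = [
    SenseAnnotatedPair("bank", "river", 2.5, ("sense:bank#shore", "sense:river#stream")),
    SenseAnnotatedPair("bank", "money", 3.0, ("sense:bank#institution", "sense:money#currency")),
]

# A hypothetical system picks the financial sense of "bank" in both cases.
predicted = {
    ("bank", "river"): ("sense:bank#institution", "sense:river#stream"),
    ("bank", "money"): ("sense:bank#institution", "sense:money#currency"),
}

accuracy = sense_pair_accuracy(gold, predicted)
print(f"sense pair accuracy: {accuracy:.2f}")
```

The example shows why the similarity score alone is not enough to evaluate a system: the same term ("bank") is annotated with different senses depending on the term it is paired with.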
