Results 1 - 6 of 6
1.
J Biomed Inform ; 130: 104050, 2022 06.
Article in English | MEDLINE | ID: mdl-35346854

ABSTRACT

Assigning International Classification of Diseases (ICD) codes to health records is an extreme multi-label classification task: each record must be labelled with the set of ICD codes relevant to it. We implemented PlaBERT, a new multi-label text classification head with per-label attention, on top of a BERT model. The model is assessed on electronic health records, specifically discharge summaries, in three languages: English, Spanish, and Swedish. The study focuses on 157 diagnostic codes from the ICD. We additionally measure the labelling noise to estimate the consistency of the gold standard. Our specialised attention mechanism computes an attention weight for each pair of input token and label, yielding the specific relevance of every word to each ICD code. The PlaBERT model outputs the computed attention importance for each token and label, allowing for visualisation. Our best results are 40.65, 38.36, and 41.13 F1-Score points on the English, Spanish and Swedish datasets, respectively, for the 157 gastrointestinal codes. Moreover, Precision is the metric that improves most owing to the attention mechanism of PlaBERT, with increases of 44.63, 40.93, and 12.92 points for the Spanish, Swedish and English datasets, respectively.
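
The per-label attention idea described in this abstract can be illustrated with a minimal sketch. This is not the PlaBERT implementation: the function name `per_label_attention`, the dot-product scoring, the one-query-vector-per-label design and the toy numbers are all assumptions made for illustration, and plain Python lists stand in for BERT token embeddings.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def per_label_attention(token_vecs, label_queries, label_weights, label_biases):
    """Hypothetical label-wise attention head (illustrative only).

    token_vecs:    one embedding per input token (stand-in for encoder output)
    label_queries: one query vector per label (assumed design)
    Returns (attn, probs): attn[l][t] is the weight of token t for label l,
    probs[l] is an independent sigmoid probability for label l.
    """
    dim = len(token_vecs[0])
    attn, probs = [], []
    for q, w, b in zip(label_queries, label_weights, label_biases):
        # One attention distribution over the tokens for this label.
        weights = softmax([dot(q, h) for h in token_vecs])
        # Label-specific summary of the document.
        context = [sum(a * h[i] for a, h in zip(weights, token_vecs))
                   for i in range(dim)]
        logit = dot(w, context) + b
        probs.append(1.0 / (1.0 + math.exp(-logit)))
        attn.append(weights)
    return attn, probs


# Toy input: 3 tokens, 2 labels, embedding size 2 (made-up values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
queries = [[2.0, 0.0], [0.0, 2.0]]
weights = [[1.0, -1.0], [-1.0, 1.0]]
biases = [0.0, 0.0]
attn, probs = per_label_attention(tokens, queries, weights, biases)
```

Because each row of `attn` is a distribution over tokens for a single label, it can be rendered as a per-code heat map over the text, which is the kind of visualisation the abstract mentions.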


Subjects
International Classification of Diseases , Language , Electronic Health Records , Humans , Natural Language Processing , Patient Discharge , Sweden
2.
Sensors (Basel) ; 22(11)2022 May 25.
Article in English | MEDLINE | ID: mdl-35684615

ABSTRACT

The linguistic and social impact of multiculturalism can no longer be neglected in any sector, creating an urgent need for systems and procedures for managing and sharing cultural heritage in both supranational and multi-literate contexts. To achieve this goal, text sensing appears to be one of the most crucial research areas. The long-term objective of the DigitalMaktaba project, born from interdisciplinary collaboration between computer scientists, historians, librarians, engineers and linguists, is to establish procedures for the creation, management and cataloguing of archival heritage in non-Latin alphabets. In this paper, we discuss the ongoing design of an innovative workflow and tool in the area of text sensing, for the automatic extraction of knowledge and cataloguing of documents written in languages that use non-Latin alphabets (Arabic, Persian and Azerbaijani). The current prototype leverages different OCR, text processing and information extraction techniques to provide both highly accurate extracted text and rich metadata content (including automatically identified cataloguing metadata), overcoming typical limitations of current state-of-the-art approaches. The initial tests provide promising results. The paper includes a discussion of future steps (e.g., AI-based techniques further leveraging the extracted data/metadata and making the system learn from user feedback) and of the many foreseen advantages of this research, from both a technical and a broader cultural-preservation and sharing point of view.


Subjects
Information Storage and Retrieval , Natural Language Processing , Humans , Language
3.
Front Artif Intell ; 3: 35, 2020.
Article in English | MEDLINE | ID: mdl-33733153

ABSTRACT

The recent turn to "big data" from social media corpora has enabled sociolinguists to investigate patterns of language variation and change at unprecedented scales. However, research in this paradigm has been slow to address variable phenomena in minority languages, where data scarcity and the absence of computational tools (e.g., taggers, parsers) often present significant barriers to entry. This article analyzes socio-syntactic variation in one minority language variety, Hasidic Yiddish, focusing on a variable for which tokens can be identified in raw text using purely morphological criteria. In non-finite particle verbs, the overt tense marker tsu (cf. English to, German zu) is variably realized either between the preverbal particle and verb (e.g., oyf-tsu-es-n up-to-eat-INF 'to eat up'; the conservative variant) or before both elements (tsu oyf-es-n to up-eat-INF; the innovative variant). Nearly 38,000 tokens of non-finite particle verbs were extracted from the popular Hasidic Yiddish discussion forum Kave Shtiebel (the 'coffee room'; kaveshtiebel.com). A mixed-effects regression analysis reveals that despite a forum-wide favoring effect for the innovative variant, users favor the conservative variant the longer their accounts remain open and active. This process of rapid implicit standardization is supported by ethnographic evidence highlighting the spread of language norms among Hasidic writers on the internet, most of whom did not have the opportunity to express themselves in written Yiddish prior to the advent of social media.
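
The abstract notes that tokens of the variable can be identified in raw text using purely morphological criteria. A minimal sketch of that idea follows, under strong assumptions: it works on romanized, hyphen-segmented forms like the two examples quoted in the abstract, whereas the forum text itself is not in this format, and the particle list `PARTICLES`, the patterns and the function `classify_tsu_placement` are hypothetical, not the authors' extraction procedure.

```python
import re

# Illustrative particle list: only "oyf" appears in the abstract's examples;
# a real inventory of preverbal particles would have to be supplied.
PARTICLES = ("oyf",)

_alt = "|".join(re.escape(p) for p in PARTICLES)

# Conservative variant: particle-tsu-verb, e.g. "oyf-tsu-es-n".
CONSERVATIVE = re.compile(rf"\b(?:{_alt})-tsu-\w+", re.IGNORECASE)
# Innovative variant: tsu before particle and verb, e.g. "tsu oyf-es-n".
INNOVATIVE = re.compile(rf"\btsu\s+(?:{_alt})-\w+", re.IGNORECASE)


def classify_tsu_placement(text):
    """Return 'conservative', 'innovative', or None if neither pattern matches."""
    if CONSERVATIVE.search(text):
        return "conservative"
    if INNOVATIVE.search(text):
        return "innovative"
    return None


examples = ["oyf-tsu-es-n", "tsu oyf-es-n", "es-n"]
labels = [classify_tsu_placement(e) for e in examples]
```

Labels of this kind, one per extracted token, are the sort of binary outcome that a regression analysis such as the one described above could take as its dependent variable.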

4.
Front Psychol ; 11: 570587, 2020.
Article in English | MEDLINE | ID: mdl-33192860

ABSTRACT

The present study investigates the linguistic and cognitive effects of bilingualism with a minority language acquired through medium education at school. If bilingualism has an effect on cognition and language abilities, regardless of language prestige or opportunities of use, young adult Gaelic-English speakers attending Gaelic medium education (GME) could have an advantage on linguistic and cognitive tasks targeting executive functions. Their performance on such tasks is reported and compared to that of monolingual speakers living in the same area. Furthermore, this study investigates whether Home Speakers of Gaelic (speakers who acquired the language at home) differ from New Speakers of the language, i.e., whether an immersive context, such as the one offered in medium education, compensates for not being native. A group of 23 monolingual English young adult speakers was compared with a group of 25 bilingual speakers who had attended a GME school since the age of 5. Participants were tested on comprehension of a set of English sentences of incremental complexity, on their capacity to inhibit a distractor using the Test of Everyday Attention (TEA), and on their performance in a Digit Span task. Bilinguals tended to perform better than monolinguals on the more complex linguistic and cognitive tasks, with a further advantage for New Speakers compared to Home Speakers. The study supports the idea that being bilingual in a minority language is as beneficial as speaking any other combination of languages. An immersive context of acquisition can be a good ground for developing advantages on both linguistic and cognitive tasks, with a further advantage for New Speakers of the language.

5.
Front Psychol ; 8: 1907, 2017.
Article in English | MEDLINE | ID: mdl-29163288

ABSTRACT

This study explores the effects of bilingualism in Sardinian as a regional minority language on linguistic competence in Italian as the dominant language and on non-linguistic cognitive abilities. Sardinian/Italian adult speakers and monolingual Italian speakers living in the same geographical area of Sardinia were compared in two kinds of tasks: (a) verbal and non-verbal cognitive tasks targeting working memory and attentional control and (b) tasks of linguistic abilities in Italian focused on the comprehension of sentences differing in grammatical complexity. Although no difference was found between bilinguals and monolinguals in the cognitive control of attention, bilinguals performed better on working memory tasks. Bilinguals with lower formal education were found to be faster at comprehension of one type of complex sentence (center-embedded object relative clauses). In contrast, bilinguals and monolinguals with higher education showed comparably slower processing of complex sentences. These results show that the effects of bilingualism are modulated by type of language experience and educational background: positive effects of active bilingualism on the dominant language are visible in bilinguals with lower education, whereas the effects of higher literacy in Italian obliterate those of active bilingualism in bilinguals and monolinguals with higher education.

6.
Front Psychol ; 6: 1898, 2015.
Article in English | MEDLINE | ID: mdl-26733903

ABSTRACT

We report the results of a study which tested receptive Italian grammatical competence and general cognitive abilities in bilingual Italian-Sardinian children and age-matched monolingual Italian children attending the first and second year of primary school in the Nuoro province of Sardinia, where Sardinian is still widely spoken. The results show that across age groups the performance of Sardinian-Italian bilingual children is in most cases indistinguishable from that of monolingual Italian children, in terms of both Italian language skills and general cognitive abilities. However, where there are differences, these emerge gradually over time and are mostly in favor of bilingual children.
