1.
PLoS One ; 19(5): e0301738, 2024.
Article in English | MEDLINE | ID: mdl-38701052

ABSTRACT

Adapters and Low-Rank Adaptation (LoRA) are parameter-efficient fine-tuning techniques designed to make the training of language models more efficient. Previous results have shown that these methods can even improve performance on some classification tasks. This paper complements existing research by investigating how these techniques influence classification performance and computation costs compared to full fine-tuning. We focus specifically on multilingual text classification tasks (genre, framing, and persuasion technique detection, with different input lengths, numbers of predicted classes, and classification difficulty), some of which have limited training data. In addition, we conduct in-depth analyses of the techniques' efficacy across different training scenarios (training on the original multilingual data, on translations into English, and on a subset of English-only data) and different languages. Our findings provide valuable insights into the applicability of parameter-efficient fine-tuning techniques, particularly for multilabel classification and for non-parallel multilingual tasks that analyse input texts of varying length.


Subject(s)
Multilingualism, Humans, Language, Algorithms
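
The abstract above names LoRA without describing its mechanics. As a rough illustration only, the sketch below shows the generally known LoRA idea: the pretrained weight matrix stays frozen and a scaled low-rank product is added to it. All names (`lora_param_counts`, `lora_effective_weight`), the matrix sizes, and the scaling by `alpha / rank` are assumptions taken from the common description of the technique, not from this paper's experimental setup.

```python
def lora_param_counts(d_out, d_in, rank):
    """Compare trainable parameters: full fine-tuning vs. a rank-`rank` update."""
    full = d_out * d_in             # every entry of W is trainable
    lora = rank * (d_out + d_in)    # only B (d_out x r) and A (r x d_in) are trainable
    return full, lora


def matmul(a, b):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where W itself is never modified."""
    rank = len(a)                   # A has shape (r, d_in)
    scale = alpha / rank
    delta = matmul(b, a)            # shape (d_out, d_in), rank at most r
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


if __name__ == "__main__":
    # Hypothetical layer size, chosen only to show the order of magnitude.
    print(lora_param_counts(768, 768, 8))
```

With B initialised to zeros, the effective weight equals the frozen weight, so training starts from the pretrained model's behaviour.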
2.
PLoS One ; 16(9): e0256874, 2021.
Article in English | MEDLINE | ID: mdl-34492073

ABSTRACT

The Coronavirus (COVID-19) pandemic has led to a rapidly growing 'infodemic' of health information online. This has created a need for accurate semantic search and retrieval of reliable COVID-19 information across millions of documents in multiple languages. To address this challenge, this paper proposes a novel high-precision, high-recall neural Multistage BiCross encoder approach. It is a sequential three-stage ranking pipeline that uses the Okapi BM25 retrieval algorithm, a transformer-based bi-encoder, and a cross-encoder to rank documents with respect to a given query. We present experimental results from our participation in the Multilingual Information Access (MLIA) shared task on COVID-19 multilingual semantic search. The independently evaluated MLIA results validate our approach and demonstrate that it outperforms other state-of-the-art approaches on nearly all evaluation metrics for both monolingual and bilingual runs.


Subject(s)
COVID-19/epidemiology, Information Storage and Retrieval/methods, Algorithms, Humans, Language, Multilingualism, Semantics
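
The three-stage pipeline described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the stage order is assumed to follow the order in which the abstract lists the components, the BM25 variant uses a common non-negative IDF form, and `bi_score` and `cross_score` are hypothetical callables standing in for the transformer models, which the abstract does not specify.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with Okapi BM25 (whitespace tokens)."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    doc_freq = Counter(term for t in tokenized for term in set(t))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            n_t = doc_freq[term]
            idf = math.log((n_docs - n_t + 0.5) / (n_t + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def multistage_rank(query, docs, bi_score, cross_score, keep_stage1=100, keep_stage2=10):
    """Stage 1: BM25 recall. Stage 2: bi-encoder rerank. Stage 3: cross-encoder rerank."""
    lexical = bm25_scores(query, docs)
    cand = sorted(range(len(docs)), key=lambda i: lexical[i], reverse=True)[:keep_stage1]

    # Hypothetical bi-encoder: scores query and document from independent embeddings.
    stage2 = {i: bi_score(query, docs[i]) for i in cand}
    cand = sorted(cand, key=lambda i: stage2[i], reverse=True)[:keep_stage2]

    # Hypothetical cross-encoder: scores the (query, document) pair jointly.
    stage3 = {i: cross_score(query, docs[i]) for i in cand}
    return sorted(cand, key=lambda i: stage3[i], reverse=True)
```

The design idea is that each stage is more expensive per document than the previous one, so it is applied to a progressively smaller candidate set.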
3.
Dement Neuropsychol ; 8(3): 227-235, 2014.
Article in English | MEDLINE | ID: mdl-29213908

ABSTRACT

Discourse production is an important aspect in the evaluation of brain-injured individuals. We believe that studies comparing the performance of brain-injured subjects with that of healthy controls must use groups with comparable levels of education. A pioneering application of machine learning methods using Brazilian Portuguese for clinical purposes is described, highlighting education as an important variable in the Brazilian scenario. OBJECTIVE: The aims were to describe how to: (i) develop machine learning classifiers using features generated by natural language processing tools to separate descriptions produced by healthy individuals into classes based on their years of education; and (ii) automatically identify the features that best distinguish the groups. METHODS: The approach proposed here extracts linguistic features automatically from the written descriptions with the aid of two Natural Language Processing tools: Coh-Metrix-Port and AIC. It also includes nine task-specific features: three new ones, two extracted manually, and description time, type of scene described (simple or complex), presentation order (which type of picture was described first), and age. This study used the descriptions produced by 144 of the subjects studied in Toledo18, a study that included 200 healthy Brazilians of both genders. RESULTS AND CONCLUSION: A Support Vector Machine (SVM) with a radial basis function (RBF) kernel is the most recommended approach for the binary classification of our data, classifying three of the four initial classes. CfsSubsetEval (CFS) is a strong candidate to replace manual feature selection methods.
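
For readers unfamiliar with the classifier named in the results, the sketch below illustrates how an already trained binary SVM with an RBF kernel scores a feature vector. It is an illustration under stated assumptions, not the study's pipeline: the support vectors, dual coefficients, bias, and `gamma` are hypothetical inputs that would normally come from training on the linguistic features, and the function names are invented for this example.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def svm_decision(x, support_vectors, dual_coefs, bias, gamma):
    """Decision value: sum_i coef_i * K(sv_i, x) + bias, with coef_i = alpha_i * y_i."""
    return sum(
        coef * rbf_kernel(sv, x, gamma)
        for sv, coef in zip(support_vectors, dual_coefs)
    ) + bias


def svm_predict(x, support_vectors, dual_coefs, bias, gamma):
    """Binary label in {+1, -1} from the sign of the decision value."""
    return 1 if svm_decision(x, support_vectors, dual_coefs, bias, gamma) >= 0 else -1
```

Because the kernel decays with squared distance, a new sample is classified mostly by the support vectors closest to it in feature space.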


