Results 1 - 6 of 6
1.
PLoS One ; 19(1): e0296929, 2024.
Article in English | MEDLINE | ID: mdl-38277376

ABSTRACT

Thousands of news articles are published on the web every day, and filtering tools can be used to extract knowledge on specific topics. The categorization of news into a predefined set of topics has been widely studied in the literature; however, most works are restricted to documents in English. In this work, we make two contributions. First, we introduce a Portuguese news dataset collected from WikiNews, an open-source media outlet that provides news from different sources. Since datasets for Portuguese are scarce, and an existing one comes from a single news channel, we aim to introduce a dataset drawn from different news channels. The availability of comprehensive datasets plays a key role in advancing research. Second, we compare different architectures for Portuguese news classification, exploring different text representations (BoW, TF-IDF, Embedding) and classification techniques (SVM, CNN, DJINN, BERT), covering classical methods and current technologies. We show the trade-off between accuracy and training time for this application. We aim to show the capabilities of available algorithms and the challenges faced in the area.
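
As an illustration of one of the text representations named in the abstract, the following is a minimal sketch of TF-IDF weighting in plain Python. It assumes the common `tf * log(N / df)` variant with no smoothing; the abstract does not say which variant or library the authors used. The function name `tf_idf` and the toy headlines are hypothetical and are not taken from the WikiNews dataset.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: weight = (count / doc_length) * log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical toy headlines (assumption, for illustration only).
docs = [
    "governo anuncia novo plano".split(),
    "time vence jogo final".split(),
    "governo vence eleição".split(),
]
vectors = tf_idf(docs)
```

Terms that occur in fewer documents receive larger weights, so in the first toy headline `anuncia` outweighs `governo`. Vectors of this kind could then be passed to a classifier such as an SVM.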


Subject(s)
Datasets as Topic; Internet; Humans; Algorithms; Portugal
2.
Artif Intell Rev ; : 1-69, 2023 Jan 31.
Article in English | MEDLINE | ID: mdl-36743267

ABSTRACT

A huge amount of data is generated daily, leading to big data challenges. One of them is related to text mining, especially text classification. To perform this task, we usually need a large set of labeled data that can be expensive, time-consuming, or difficult to obtain. In this scenario, semi-supervised learning (SSL), the branch of machine learning concerned with using labeled and unlabeled data, has expanded in volume and scope. Since no recent survey gives an overview of how SSL has been used in text classification, we aim to fill this gap and present an up-to-date review of SSL for text classification. We retrieved 1794 works from the last 5 years from IEEE Xplore, ACM Digital Library, Science Direct, and Springer. Of these, 157 articles were selected for inclusion in this review. We present the application domains, datasets, and languages employed in the works, as well as the text representations and machine learning algorithms. We also summarize and organize the works following a recent taxonomy of SSL. We analyze the percentage of labeled data used, the evaluation metrics, and the results obtained. Lastly, we present some limitations and future trends in the area. We aim to provide researchers and practitioners with an outline of the area as well as useful information for their current research.
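
To make the idea of learning from labeled and unlabeled data concrete, here is a minimal sketch of self-training, one well-known SSL strategy. It is an assumption chosen for illustration: the abstract does not say which methods the survey covers or how its taxonomy is organized. The nearest-centroid classifier, the `margin` confidence rule, and all names below are hypothetical, and the sketch assumes at least two classes in the labeled set.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def self_train(labeled, unlabeled, rounds=5, margin=1.0):
    """labeled: list of (vector, label); unlabeled: list of vectors.

    Each round fits class centroids on the labeled set, then moves
    confidently classified unlabeled vectors into the labeled set.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        by_class = {}
        for vec, lab in labeled:
            by_class.setdefault(lab, []).append(vec)
        centroids = {lab: centroid(vecs) for lab, vecs in by_class.items()}
        remaining = []
        for vec in pool:
            ranked = sorted(centroids, key=lambda lab: sq_dist(vec, centroids[lab]))
            best, second = ranked[0], ranked[1]
            # Confidence: how much closer the best centroid is than the runner-up.
            gap = sq_dist(vec, centroids[second]) - sq_dist(vec, centroids[best])
            if gap >= margin:
                labeled.append((vec, best))  # accept the pseudo-label
            else:
                remaining.append(vec)
        if len(remaining) == len(pool):
            break  # nothing new was labeled; stop early
        pool = remaining
    return labeled, pool


# Hypothetical 2-D feature vectors (assumption, for illustration only).
seed_labels = [([0.0, 0.0], "a"), ([10.0, 10.0], "b")]
result, leftover = self_train(seed_labels, [[1.0, 1.0], [9.0, 9.0], [5.0, 5.0]])
```

The point equidistant from both classes stays unlabeled, which shows why a confidence threshold matters: pseudo-labels assigned to ambiguous examples would propagate errors into later rounds.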

3.
IEEE Trans Neural Netw Learn Syst ; 33(8): 3522-3532, 2022 08.
Article in English | MEDLINE | ID: mdl-33539304

ABSTRACT

Link prediction (LP) in networks aims at determining future interactions among elements; it is a critical machine-learning tool in different domains, ranging from genomics to social networks to marketing, especially in e-commerce recommender systems. Although many LP techniques have been developed in the prior art, most of them consider only static structures of the underlying networks, rarely incorporating the network's information flow. Exploiting the impact of dynamic streams, such as information diffusion, is still an open research topic for LP. Information diffusion allows nodes to receive information beyond their social circles, which, in turn, can influence the creation of new links. In this work, we analyze the LP effects through two diffusion approaches, susceptible-infected-recovered and independent cascade. As a result, we propose the progressive-diffusion (PD) method for LP based on nodes' propagation dynamics. The proposed model leverages a stochastic discrete-time rumor model centered on each node's propagation dynamics. It presents low-memory and low-processing footprints and is amenable to parallel and distributed processing implementation. Finally, we also introduce an evaluation metric for LP methods considering both the information diffusion capacity and the LP accuracy. Experimental results on a series of benchmarks attest to the proposed method's effectiveness compared with the prior art in both criteria.
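
The independent cascade process mentioned in the abstract can be sketched as follows. This is the textbook form of the model with a single activation probability `p` for every edge. It is not the progressive-diffusion (PD) method itself, whose details the abstract does not give. The function name, the graph, and the parameter values are hypothetical.

```python
import random


def independent_cascade(graph, seeds, p, rng):
    """graph: {node: [neighbors]}; returns the set of activated nodes.

    Each newly activated node gets exactly one chance to activate each
    of its still-inactive neighbors, succeeding with probability p.
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for node in frontier:
            for neighbor in graph.get(node, []):
                if neighbor not in active and rng.random() < p:
                    active.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return active


# Hypothetical directed toy graph (assumption, for illustration only).
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": [], "e": ["a"]}
rng = random.Random(0)
reached_all = independent_cascade(graph, ["a"], 1.0, rng)
reached_none = independent_cascade(graph, ["a"], 0.0, rng)
```

With `p = 1.0` the cascade reaches every node reachable from the seed, and with `p = 0.0` it never leaves the seed. Averaging the size of the activated set over many runs at intermediate `p` gives one rough measure of a network's diffusion capacity.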


Subject(s)
Machine Learning; Neural Networks, Computer; Diffusion; Genomics
4.
J. health inform ; 13(4): 139-144, Oct.-Dec. 2021. illus, tab
Article in Portuguese | LILACS | ID: biblio-1359310

ABSTRACT

Objective: This study aimed to survey and characterize healthbot applications in Portuguese, considering their roles in the digital transformation of the patient journey. Methods: Narrative literature review investigating the accessibility and objectivity of the applications, with the patient as the end user. The articles were analyzed regarding the use of bots, the information technologies and devices used, the purpose of the applications, the medical area of intervention, and disciplinarity in the development of the solutions. Results: Of thirteen articles selected in the search that contained applications with task automation, only five described the use of bots. Conclusion: Healthbots have the potential to improve the patient journey. However, the development and use of such applications are not yet widespread in Brazil.


Subject(s)
Medical Informatics; Telemedicine; Information Technology; Primary Health Care; Professional-Patient Relations; Brazil; Education, Distance; Telemonitoring; Telescreening, Medical
5.
Appl Soft Comput ; 101: 107057, 2021 Mar.
Article in English | MEDLINE | ID: mdl-33519326

ABSTRACT

Twitter is a social media platform with more than 500 million users worldwide. It has become a tool for spreading news, discussing ideas, and commenting on world events. Twitter is also an important source of health-related information, given the amount of news, opinions, and information shared by both citizens and official sources. Identifying interesting and useful content in large text streams in different languages is a challenge, and few works have explored languages other than English. In this paper, we use topic identification and sentiment analysis to explore a large number of tweets from two countries with high COVID-19 spread and death counts, Brazil and the USA. We employ 3,332,565 tweets in English and 3,155,277 tweets in Portuguese to compare and discuss the effectiveness of topic identification and sentiment analysis in both languages. We ranked ten topics and analyzed the content discussed on Twitter for four months, providing an assessment of how the discourse evolved over time. The topics we identified were representative of the news outlets during April and August in both countries. We contribute to the study of the Portuguese language, to the analysis of sentiment trends over a long period and their relation to announced news, and to the comparison of human behavior in two different geographical locations affected by this pandemic. It is important to understand public reactions, information dissemination, and consensus building in all major forms, including social media in different countries.

6.
Sci Rep ; 9(1): 10833, 2019 07 25.
Article in English | MEDLINE | ID: mdl-31346237

ABSTRACT

Link prediction (LP) makes it possible to infer missing or future connections in a network. The network organization defines how information spreads through the nodes. In turn, the spreading may induce changes in the connections and speed up the network evolution. Although many LP methods have been reported in the literature, as well as some methodologies to evaluate them as a classification task or ranking problem, none has systematically investigated their effects on spreading and on the structural evolution of the network. Here, we systematically analyze LP algorithms in a framework concerning: (1) different diffusion processes (Epidemics, Information, and Rumor models); (2) which LP method most improves the spreading on the network through the addition of new links; (3) the structural properties of the LP-evolved networks. From extensive numerical simulations with representative existing LP methods on different datasets, we show that spreading improves in evolved scale-free networks with lower shortest-path lengths and fewer structural holes. We also find that properties like triangles, modularity, assortativity, or coreness may not increase the propagation. This work contributes an overview of LP methods and network evolution and can be used as a practical guide for selecting and evaluating LP methods in terms of computational cost, spreading capacity, and network structure.
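
For readers unfamiliar with link prediction, the sketch below scores candidate links by their number of common neighbors, a classic similarity-based baseline. It is an assumption chosen for illustration: the abstract does not name the LP methods that were evaluated. The function name and the toy graph are hypothetical.

```python
from itertools import combinations


def common_neighbor_scores(adj):
    """adj: {node: set(neighbors)} for an undirected graph.

    Returns unlinked node pairs ranked by how many neighbors they share.
    """
    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # the link already exists
        shared = len(adj[u] & adj[v])
        if shared:
            scores[(u, v)] = shared
    # Highest score first; ties broken by node names for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy graph (assumption, for illustration only).
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c"},
}
ranking = common_neighbor_scores(adj)
```

Adding the top-ranked pairs as new links and rerunning a diffusion simulation on the evolved graph is one simple way to examine how predicted links change spreading.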

...