1.
Syst Rev ; 12(1): 100, 2023 06 20.
Article in English | MEDLINE | ID: mdl-37340494

ABSTRACT

BACKGROUND: Conducting a systematic review demands a significant amount of effort in screening titles and abstracts. To accelerate this process, various tools that use active learning have been proposed. These tools allow the reviewer to interact with machine learning software to identify relevant publications as early as possible. The goal of this study is to gain a comprehensive understanding of active learning models for reducing the workload in systematic reviews through a simulation study. METHODS: The simulation study mimics the process of a human reviewer screening records while interacting with an active learning model. Different active learning models were compared based on four classification techniques (naive Bayes, logistic regression, support vector machines, and random forest) and two feature extraction strategies (TF-IDF and doc2vec). The performance of the models was compared for six systematic review datasets from different research areas. The evaluation of the models was based on the Work Saved over Sampling (WSS) and recall. Additionally, this study introduces two new statistics, Time to Discovery (TD) and Average Time to Discovery (ATD). RESULTS: The models reduce the number of publications that need to be screened by 63.9 to 91.7% while still finding 95% of all relevant records (WSS@95). Recall of the models, defined as the proportion of relevant records found after screening 10% of all records, ranges from 53.6 to 99.8%. The ATD values range from 1.4% to 11.7% and indicate the average proportion of labeling decisions the researcher needs to make to detect a relevant record. The ATD values display a similar ranking across the simulations as the recall and WSS values. CONCLUSIONS: Active learning models for screening prioritization demonstrate significant potential for reducing the workload in systematic reviews. The Naive Bayes + TF-IDF model yielded the best results overall.
The Average Time to Discovery (ATD) measures the performance of active learning models throughout the entire screening process without the need for an arbitrary cut-off point. This makes the ATD a promising metric for comparing the performance of different models across different datasets.


Subjects
Machine Learning , Software , Humans , Bayes Theorem , Systematic Reviews as Topic , Computer Simulation
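
The metrics named in the abstract of entry 1 can be illustrated with a small sketch. This is not the authors' implementation: it assumes a single simulation run represented as a list of 0/1 labels in the order the records were screened (1 = relevant), with at least one relevant record, and it uses the common definition of WSS as the share of records left unscreened at the target recall minus (1 - recall). The function names are hypothetical.

```python
import math


def time_to_discovery(labels):
    """Return the 1-based screening position of each relevant record (label 1)."""
    return [i for i, y in enumerate(labels, start=1) if y == 1]


def average_time_to_discovery(labels):
    """Mean discovery position of the relevant records, as a fraction of all records."""
    td = time_to_discovery(labels)
    return sum(td) / (len(td) * len(labels))


def recall_at(labels, fraction):
    """Share of relevant records found after screening `fraction` of all records."""
    n_screened = math.floor(fraction * len(labels))
    return sum(labels[:n_screened]) / sum(labels)


def wss_at(labels, recall_level):
    """Work Saved over Sampling at the given recall level (assumed definition)."""
    n = len(labels)
    needed = math.ceil(recall_level * sum(labels))  # relevant records to find
    found = 0
    for k, y in enumerate(labels, start=1):
        found += y
        if found >= needed:
            # Fraction left unscreened, minus what random screening would save.
            return (n - k) / n - (1 - recall_level)


# Hypothetical screening order: relevant records at positions 1, 3 and 7 of 10.
order = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0]
print(time_to_discovery(order))
print(average_time_to_discovery(order), recall_at(order, 0.10), wss_at(order, 0.95))
```

Unlike recall at 10% or WSS@95, the ATD in this sketch needs no cut-off argument, which is the property the abstract highlights.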
2.
Front Res Metr Anal ; 8: 1178181, 2023.
Article in English | MEDLINE | ID: mdl-37260784

ABSTRACT

Introduction: This study examines the performance of active learning-aided systematic reviews using a deep learning-based model compared to traditional machine learning approaches, and explores the potential benefits of model-switching strategies. Methods: Comprising four parts, the study: 1) analyzes the performance and stability of active learning-aided systematic review; 2) implements a convolutional neural network classifier; 3) compares classifier and feature extractor performance; and 4) investigates the impact of model-switching strategies on review performance. Results: Lighter models perform well in early simulation stages, while other models show increased performance in later stages. Model-switching strategies generally improve performance compared to using the default classification model alone. Discussion: The study's findings support the use of model-switching strategies in active learning-based systematic review workflows. It is advised to begin the review with a light model, such as Naïve Bayes or logistic regression, and switch to a heavier classification model based on a heuristic rule when needed.
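
A model-switching rule of the kind recommended in the abstract of entry 2 might look like the sketch below. The abstract does not state which heuristic the study used, so the rule here (switch once a fixed number of consecutive records have been labeled irrelevant) is purely an assumption, as are the names `select_model` and `patience` and the labels "light" and "heavy".

```python
def consecutive_irrelevant(labels):
    """Count the irrelevant labels (0) at the end of the screening history."""
    count = 0
    for y in reversed(labels):
        if y == 1:
            break
        count += 1
    return count


def select_model(labels, current="light", patience=50):
    """Pick the classifier for the next iteration.

    Assumed heuristic: stay with the light model (e.g. naive Bayes) until
    `patience` records in a row were irrelevant, then switch to the heavy
    model and never switch back.
    """
    if current == "heavy":
        return "heavy"
    if consecutive_irrelevant(labels) >= patience:
        return "heavy"
    return "light"


# Hypothetical history: one relevant record followed by three irrelevant ones.
print(select_model([1, 0, 0, 0], patience=3))
```

Making the switch one-way keeps the sketch simple; a real workflow could also compare the two models on held-out labels before switching.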

3.
Data Brief ; 38: 107327, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34504913

ABSTRACT

This data article describes user-generated data of Funda.nl, the largest online housing market website of the Netherlands. The data contain the inflow and outflow of hits (mouse clicks, opening of webpages, etc.) at the municipality level. The municipality of the user defines the origin and the municipality of the property that is viewed defines the destination. The data capture real behavior of the platform users. The flow data are based on 1.1 billion hits made by the users of the website in the first six months of 2018. The underlying data are collected by Google Analytics, the web analytics tool of Google. Funda uses the data for platform stability, security, product development, etc. The proprietary data of Funda are used to generate the information flows between municipalities. The full sample contains 148,216 information flows between municipalities in the Netherlands, of which 313 are zero flows. The data include subsamples for different types of platform users, as user search intentions range from serious to fully recreational. The data enable researchers to analyze housing search behavior from a novel perspective. The data are, for instance, relevant for housing market researchers, digital economists, and economic geographers.
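
The origin-destination structure described in the abstract of entry 3 can be sketched as follows. The actual file layout of the dataset is not given here, so this example assumes each hit is a hypothetical `(user_municipality, property_municipality)` pair, and the sample records are invented for illustration only.

```python
from collections import Counter


def aggregate_flows(hits):
    """Count hits per (origin, destination) municipality pair."""
    return Counter((origin, destination) for origin, destination in hits)


def inflow_outflow(flows):
    """Sum flows into each destination and out of each origin."""
    inflow, outflow = Counter(), Counter()
    for (origin, destination), n_hits in flows.items():
        outflow[origin] += n_hits
        inflow[destination] += n_hits
    return inflow, outflow


# Invented sample: the user's municipality first, the viewed property's second.
sample_hits = [
    ("Amsterdam", "Utrecht"),
    ("Amsterdam", "Utrecht"),
    ("Utrecht", "Amsterdam"),
    ("Amsterdam", "Amsterdam"),
]
flows = aggregate_flows(sample_hits)
inflow, outflow = inflow_outflow(flows)
print(flows[("Amsterdam", "Utrecht")], inflow["Utrecht"], outflow["Amsterdam"])
```

Hits within one municipality are counted in both totals here; whether the published data treat them that way is not stated in the abstract.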

4.
PLoS One ; 16(3): e0247712, 2021.
Article in English | MEDLINE | ID: mdl-33760839

ABSTRACT

In this paper we apply a gravity framework to user-generated data of a large online housing market platform. We show that gravity describes the patterns of inflow and outflow of hits (mouse clicks, etc.) from one municipality to another, where the municipality of the user defines the origin and the municipality of the property that is viewed defines the destination. By distinguishing serious searchers from recreational searchers we demonstrate that the gravity framework describes geographic search patterns of both types of users. The results indicate that recreational search is centered more around the user's location than serious search. However, this finding is driven entirely by differences in border effects as there is no difference in the distance effect. By demonstrating that geographic search patterns of both serious and recreational searchers are explained by their physical locations, we present clear evidence that physical location is an important determinant of economic behavior in the virtual realm too.


Subjects
Commerce/trends , Exploratory Behavior , Housing/trends , Psychological Models , Geography , Housing/economics , Humans , Internet , Netherlands
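
For orientation, a generic log-linear gravity specification of the kind referred to in the abstract of entry 4 is shown below. The abstract does not give the authors' exact equation, so every symbol is an assumption: $F_{ij}$ is the number of hits from users in municipality $i$ on properties in municipality $j$, $M_i$ and $M_j$ are size measures of the two municipalities, $D_{ij}$ is the distance between them, and $B_{ij}$ is a dummy equal to 1 when the flow crosses a border.

```latex
\ln F_{ij} = \beta_0 + \beta_1 \ln M_i + \beta_2 \ln M_j
             - \beta_3 \ln D_{ij} - \beta_4 B_{ij} + \varepsilon_{ij}
```

In this form, $\beta_3$ is the distance effect and $\beta_4$ the border effect that the abstract contrasts between serious and recreational searchers. Note that the logarithm is undefined when $F_{ij} = 0$, so zero flows need separate treatment in any such specification.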