Search | BVS Doenças Infecciosas e Parasitárias

Correlations of Cross-Entropy Loss in Machine Learning.

Connor, Richard; Dearle, Alan; Claydon, Ben; Vadicamo, Lucia.

Entropy (Basel); 26(6), 2024 Jun 03.

Article in English | MEDLINE | ID: mdl-38920500

ABSTRACT

Cross-entropy loss is crucial in training many deep neural networks. In this context, we show a number of novel and strong correlations among various related divergence functions. In particular, we demonstrate that, in some circumstances, (a) cross-entropy is almost perfectly correlated with the little-known triangular divergence, and (b) cross-entropy is strongly correlated with the Euclidean distance over the logits from which the softmax is derived. The consequences of these observations are as follows. First, triangular divergence may be used as a cheaper alternative to cross-entropy. Second, logits can be used as features in a Euclidean space which is strongly synergistic with the classification process. This justifies the use of Euclidean distance over logits as a measure of similarity, in cases where the network is trained using softmax and cross-entropy. We establish these correlations via empirical observation, supported by a mathematical explanation encompassing a number of strongly related divergence functions.
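The three quantities compared in this abstract can be computed side by side. The following is a minimal sketch, not the authors' code: it assumes the common definition of triangular divergence, sum((p_i - q_i)^2 / (p_i + q_i)), which may differ from the paper's by a constant factor, and the logit values are made up for illustration.

```python
import math

def softmax(logits):
    # Subtract the maximum for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(p, q):
    # H(p, q) = -sum_i p_i * log(q_i); softmax outputs are strictly positive.
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def triangular_divergence(p, q):
    # Assumed definition: sum_i (p_i - q_i)^2 / (p_i + q_i).
    # It needs no logarithm, which is why it can be cheaper to evaluate.
    return sum((pi - qi) ** 2 / (pi + qi) for pi, qi in zip(p, q) if pi + qi > 0)

def euclidean(a, b):
    # Plain Euclidean distance, applied here to raw logits rather than probabilities.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical logit vectors for two inputs (illustrative values only).
logits_a = [2.0, 0.5, -1.0]
logits_b = [1.5, 1.0, -0.5]
p, q = softmax(logits_a), softmax(logits_b)

print("cross-entropy:        ", cross_entropy(p, q))
print("triangular divergence:", triangular_divergence(p, q))
print("Euclidean over logits:", euclidean(logits_a, logits_b))
```

A single pair of vectors shows only how each quantity is computed. Observing the reported correlations would require evaluating many pairs drawn from a trained network.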

Interactive video retrieval evaluation at a distance: comparing sixteen interactive video search systems in a remote setting at the 10th Video Browser Showdown.

Heller, Silvan; Gsteiger, Viktor; Bailer, Werner; Gurrin, Cathal; Jónsson, Björn Þór; Lokoc, Jakub; Leibetseder, Andreas; Mejzlík, Frantisek; Peska, Ladislav; Rossetto, Luca; Schall, Konstantin; Schoeffmann, Klaus; Schuldt, Heiko; Spiess, Florian; Tran, Ly-Duyen; Vadicamo, Lucia; Veselý, Patrik; Vrochidis, Stefanos; Wu, Jiaxin.

Int J Multimed Inf Retr; 11(1): 1-18, 2022.

Article in English | MEDLINE | ID: mdl-35096506

ABSTRACT

The Video Browser Showdown addresses difficult video search challenges through an annual interactive evaluation campaign attracting research teams focusing on interactive video retrieval. The campaign aims to provide insights into the performance of participating interactive video retrieval systems, tested on selected search tasks over large video collections. For the first time in its ten-year history, the Video Browser Showdown 2021 was organized in a fully remote setting and hosted a record number of sixteen scoring systems. In this paper, we describe the competition setting, tasks and results and give an overview of state-of-the-art methods used by the competing systems. By looking at query result logs provided by ten systems, we analyze differences in retrieval model performance and browsing times before a correct submission. Through advances in data gathering methodology and tools, we provide a comprehensive analysis of ad-hoc video search tasks and discuss results, task design and methodological challenges. We highlight that almost all top-performing systems utilize some sort of joint embedding for text-image retrieval and enable specification of temporal context in queries for known-item search. Whereas a combination of these techniques drives the current top-performing systems, we identify several future challenges for interactive video search engines and the Video Browser Showdown competition itself.

The VISIONE Video Search System: Exploiting Off-the-Shelf Text Search Engines for Large-Scale Video Retrieval.

Amato, Giuseppe; Bolettieri, Paolo; Carrara, Fabio; Debole, Franca; Falchi, Fabrizio; Gennaro, Claudio; Vadicamo, Lucia; Vairo, Claudio.

J Imaging; 7(5), 2021 Apr 23.

Article in English | MEDLINE | ID: mdl-34460672

ABSTRACT

This paper describes VISIONE in detail, a video search system that allows users to search for videos using textual keywords, the occurrence of objects and their spatial relationships, the occurrence of colors and their spatial relationships, and image similarity. These modalities can be combined to express complex queries and meet users' needs. The peculiarity of our approach is that we encode all information extracted from the keyframes, such as visual deep features, tags, color and object locations, using a convenient textual encoding that is indexed in a single text retrieval engine. This offers great flexibility when results corresponding to various parts of the query (visual, text and locations) need to be merged. In addition, we report an extensive analysis of the retrieval performance of the system, using the query logs generated during the Video Browser Showdown (VBS) 2019 competition. This allowed us to fine-tune the system by choosing the optimal parameters and strategies from those we tested.
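The idea of turning keyframe information into text that a standard search engine can index can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration: the abstract does not specify the encoding, so the 7x7 grid, the token formats (`r0c1_dog`, `f0`), the `scale` parameter and all function names are hypothetical and are not VISIONE's actual scheme.

```python
def encode_objects(objects, grid=7):
    """Turn bounding boxes into grid-cell tokens (hypothetical scheme).

    objects: list of (label, x_min, y_min, x_max, y_max), with coordinates
    normalized to [0, 1]. One token is emitted per grid cell the box covers.
    """
    tokens = []
    for label, x0, y0, x1, y1 in objects:
        c0, c1 = min(int(x0 * grid), grid - 1), min(int(x1 * grid), grid - 1)
        r0, r1 = min(int(y0 * grid), grid - 1), min(int(y1 * grid), grid - 1)
        for row in range(r0, r1 + 1):
            for col in range(c0, c1 + 1):
                tokens.append(f"r{row}c{col}_{label}")
    return tokens

def encode_feature(vector, scale=10):
    """Turn a non-negative feature vector into repeated terms (hypothetical).

    Component i becomes the term "f{i}" repeated int(v * scale) times, so a
    text engine's term-frequency scoring loosely reflects component magnitude.
    """
    tokens = []
    for i, v in enumerate(vector):
        tokens.extend([f"f{i}"] * int(max(v, 0.0) * scale))
    return tokens

def keyframe_document(tags, objects, feature):
    # One text document per keyframe, combining all modalities in one field.
    return " ".join(list(tags) + encode_objects(objects) + encode_feature(feature))

# Illustrative keyframe: two tags, one detected object, a 3-dimensional feature.
doc = keyframe_document(
    ["beach", "sunset"],
    [("dog", 0.0, 0.0, 0.2, 0.1)],
    [0.31, 0.0, 0.12],
)
print(doc)
```

Because every modality ends up as ordinary terms, a query mixing keywords, object positions and visual similarity can be expressed as a single text query against one index. This is the flexibility the abstract refers to.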
