Results 1 - 3 of 3
1.
Article in English | MEDLINE | ID: mdl-30998468

ABSTRACT

In this paper, we present a novel approach to finding informative and anomalous samples in videos by exploiting the concept of typicality from information theory. In most video analysis tasks, selecting the most informative samples from a huge pool of training data in order to learn a good recognition model is an important problem, and it also reduces annotation cost, since annotating unlabeled samples is time-consuming. Typicality is a simple and powerful technique that can be applied to compress the training data while still learning a good classification model. In a continuous video clip, an activity is strongly correlated with the activities that precede it; we therefore assume that the activity samples appearing in a video form a Markov chain, and we show explicitly how typicality can be utilized in this setting. Using typicality and the Markov property, we compute an atypicality score for each sample, which we apply to two challenging vision problems: (a) sample selection for learning activity recognition models, and (b) anomaly detection. In the first case, our approach leads to a significant reduction in manual labeling cost while achieving recognition performance similar to or better than a model trained on the entire training set. In the second case, the atypicality score is used to identify anomalous activities in videos, where our results demonstrate the effectiveness of the proposed framework over other recent strategies.
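The transition-scoring idea behind the atypicality score can be illustrated with a minimal sketch. This is not the authors' implementation: the activity labels, the first-order Markov model, and the Laplace smoothing below are assumptions made for the example, and a real system would estimate transitions from recognized activity classes in video.

```python
from math import log
from collections import defaultdict

def transition_probs(sequence, smoothing=1.0):
    """Estimate first-order Markov transition probabilities from an
    observed activity sequence, with additive (Laplace) smoothing so
    unseen transitions keep nonzero probability."""
    states = sorted(set(sequence))
    counts = defaultdict(lambda: defaultdict(float))
    for prev, cur in zip(sequence, sequence[1:]):
        counts[prev][cur] += 1.0
    probs = {}
    for s in states:
        total = sum(counts[s].values()) + smoothing * len(states)
        probs[s] = {t: (counts[s][t] + smoothing) / total for t in states}
    return probs

def atypical_scores(sequence, probs):
    """Score each transition by its negative log-probability:
    transitions that were rare in training get high scores."""
    return [-log(probs[prev][cur]) for prev, cur in zip(sequence, sequence[1:])]

# Toy activity sequence; "sit" -> "sit" never occurs in training,
# so that transition should score as most atypical at test time.
train = ["walk", "walk", "sit", "walk", "walk", "sit", "walk"]
P = transition_probs(train)
test = ["walk", "sit", "sit"]
scores = atypical_scores(test, P)
```

Samples with the highest scores would be the ones flagged as anomalous, or selected first for manual annotation.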

2.
IEEE Trans Image Process ; 28(7): 3286-3300, 2019 Jul.
Article in English | MEDLINE | ID: mdl-30703026

ABSTRACT

With advanced image editing tools, one can easily alter the semantic meaning of an image using manipulation techniques such as copy-clone, object splicing, and object removal, which can mislead viewers. Identifying these manipulations is a very challenging task, however, because manipulated regions are not visually apparent. This paper proposes a high-confidence manipulation localization architecture that combines resampling features, long short-term memory (LSTM) cells, and an encoder-decoder network to segment manipulated regions from non-manipulated ones. Resampling features capture artifacts such as JPEG quality loss, upsampling, downsampling, rotation, and shearing. The proposed network exploits larger receptive fields (spatial maps) and frequency-domain correlation to analyze the discriminative characteristics between manipulated and non-manipulated regions by incorporating the encoder and LSTM network. Finally, the decoder network learns the mapping from low-resolution feature maps to pixel-wise predictions for image tamper localization. With the predicted mask provided by the final (softmax) layer of the proposed architecture, end-to-end training is performed to learn the network parameters through back-propagation against ground-truth masks. Furthermore, a large image splicing dataset is introduced to guide the training process. The proposed method localizes image manipulations at the pixel level with high precision, as demonstrated through rigorous experimentation on three diverse datasets.
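The LSTM encoder-decoder itself is not reproduced here, but the abstract's final step, turning per-pixel softmax output into a binary tamper mask and evaluating it pixel-wise, can be sketched in a few lines. The probability map, threshold, and F1 metric below are illustrative assumptions, not details taken from the paper.

```python
def tamper_mask(softmax_probs, threshold=0.5):
    """Threshold per-pixel softmax probabilities of the 'manipulated'
    class into a binary localization mask."""
    return [[1 if p >= threshold else 0 for p in row] for row in softmax_probs]

def pixel_f1(pred, truth):
    """Pixel-wise F1 score between a predicted and a ground-truth mask."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            tp += int(p == 1 and t == 1)
            fp += int(p == 1 and t == 0)
            fn += int(p == 0 and t == 1)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

# Toy 2x3 probability map standing in for a network's softmax output.
probs = [[0.9, 0.2, 0.7], [0.8, 0.1, 0.3]]
truth = [[1, 0, 1], [1, 0, 0]]
mask = tamper_mask(probs)
score = pixel_f1(mask, truth)
```

During training, the continuous softmax probabilities (not the thresholded mask) would be compared against the ground-truth mask so that gradients can flow through back-propagation.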

3.
IEEE Trans Pattern Anal Mach Intell ; 38(7): 1397-410, 2016 07.
Article in English | MEDLINE | ID: mdl-26441444

ABSTRACT

Distributed algorithms have recently gained immense popularity. Among computer vision applications, distributed multi-target tracking in a camera network is a fundamental problem: the goal is for all cameras to maintain accurate state estimates for all targets. Distributed estimation algorithms work by exchanging information between sensors that are communication neighbors. Vision-based distributed multi-target state estimation has at least two characteristics that distinguish it from other applications. First, cameras are directional sensors, so neighboring sensors often do not observe the same target, i.e., they are naive with respect to that target. Second, in the presence of clutter and multiple targets, each camera must solve a data association problem. This paper presents an information-weighted, consensus-based, distributed multi-target tracking algorithm, the Multi-target Information Consensus (MTIC) algorithm, designed to address both the naivety and the data association problems; it converges to the centralized minimum mean square error estimate. The proposed MTIC algorithm and its extension to non-linear camera models, the Extended MTIC (EMTIC), are robust to false measurements and to constraints on power, bandwidth, and real-time operation. Simulation and experimental analysis are provided to support the theoretical results.
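The information-weighted consensus principle can be illustrated with a deliberately simplified sketch: scalar states, a fully connected network, and no naivety handling or data association (the parts that are MTIC's actual contribution). Each node averages both its information-weighted state and its information weight with its neighbors; the ratio of the two converges to the centralized information-weighted estimate. All variable names and parameters below are invented for the example.

```python
def information_consensus(estimates, information, rounds=50, rate=0.2):
    """Toy information-weighted average consensus over a fully
    connected network of scalar sensors. Each node i holds
    y_i = w_i * x_i and w_i; repeated neighbor averaging drives
    y_i / w_i toward sum(w * x) / sum(w), the centralized
    information-weighted estimate."""
    y = [w * x for x, w in zip(estimates, information)]  # weighted states
    w = list(information)
    for _ in range(rounds):
        y = [yi + rate * sum(yj - yi for yj in y) for yi in y]
        w = [wi + rate * sum(wj - wi for wj in w) for wi in w]
    return [yi / wi for yi, wi in zip(y, w)]

# Three cameras: the third is twice as informative (e.g. better view).
# Centralized estimate = (1*1.0 + 1*2.0 + 2*4.0) / 4 = 2.75.
fused = information_consensus([1.0, 2.0, 4.0], [1.0, 1.0, 2.0])
```

Weighting by information is what lets confident sensors dominate the fused estimate; a naive camera would simply carry a near-zero information weight for a target it cannot see.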
