Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 14 de 14
Filtrar
1.
Sensors (Basel) ; 23(12)2023 Jun 07.
Artigo em Inglês | MEDLINE | ID: mdl-37420553

RESUMO

Maritime obstacle detection is critical for safe navigation of autonomous surface vehicles (ASVs). While the accuracy of image-based detection methods has advanced substantially, their computational and memory requirements prohibit deployment on embedded devices. In this paper, we analyze the current best-performing maritime obstacle detection network, WaSR. Based on the analysis, we then propose replacements for the most computationally intensive stages and propose its embedded-compute-ready variant, eWaSR. In particular, the new design follows the most recent advancements of transformer-based lightweight networks. eWaSR achieves comparable detection results to state-of-the-art WaSR with only a 0.52% F1 score performance drop and outperforms other state-of-the-art embedded-ready architectures by over 9.74% in F1 score. On a standard GPU, eWaSR runs 10× faster than the original WaSR (115 FPS vs. 11 FPS). Tests on a real embedded sensor OAK-D show that, while WaSR cannot run due to memory restrictions, eWaSR runs comfortably at 5.5 FPS. This makes eWaSR the first practical embedded-compute-ready maritime obstacle detection network. The source code and trained eWaSR models are publicly available.


Assuntos
Veículos Autônomos , Fontes de Energia Elétrica , Software
2.
Sensors (Basel) ; 22(23)2022 Nov 25.
Artigo em Inglês | MEDLINE | ID: mdl-36501841

RESUMO

Robust maritime obstacle detection is critical for safe navigation of autonomous boats and timely collision avoidance. The current state-of-the-art is based on deep segmentation networks trained on large datasets. However, per-pixel ground truth labeling of such datasets is labor-intensive and expensive. We propose a new scaffolding learning regime (SLR) that leverages weak annotations consisting of water edges, the horizon location, and obstacle bounding boxes to train segmentation-based obstacle detection networks, thereby reducing the required ground truth labeling effort by a factor of twenty. SLR trains an initial model from weak annotations and then alternates between re-estimating the segmentation pseudo-labels and improving the network parameters. Experiments show that maritime obstacle segmentation networks trained using SLR on weak annotations not only match but outperform the same networks trained with dense ground truth labels, which is a remarkable result. In addition to the increased accuracy, SLR also increases domain generalization and can be used for domain adaptation with a low manual annotation load. The SLR code and pre-trained models are freely available online.


Assuntos
Trabalho de Parto , Aprendizagem , Gravidez , Feminino , Humanos , Aclimatação , Água , Processamento de Imagem Assistida por Computador
3.
IEEE Trans Cybern ; 52(12): 12661-12674, 2022 Dec.
Artigo em Inglês | MEDLINE | ID: mdl-34232901

RESUMO

Obstacle detection using semantic segmentation has become an established approach in autonomous vehicles. However, existing segmentation methods, primarily developed for ground vehicles, are inadequate in an aquatic environment as they produce many false positive (FP) detections in the presence of water reflections and wakes. We propose a novel deep encoder-decoder architecture, a water segmentation and refinement (WaSR) network, specifically designed for the marine environment to address these issues. A deep encoder based on ResNet101 with atrous convolutions enables the extraction of rich visual features, while a novel decoder gradually fuses them with inertial information from the inertial measurement unit (IMU). The inertial information greatly improves the segmentation accuracy of the water component in the presence of visual ambiguities, such as fog on the horizon. Furthermore, a novel loss function for semantic separation is proposed to enforce the separation of different semantic components to increase the robustness of the segmentation. We investigate different loss variants and observe a significant reduction in FPs and an increase in true positives (TPs). Experimental results show that WaSR outperforms the current state of the art by approximately 4% in F1 score on a challenging unmanned surface vehicle dataset. WaSR shows remarkable generalization capabilities and outperforms the state of the art by over 24% in F1 score on a strict domain generalization experiment.


Assuntos
Processamento de Imagem Assistida por Computador , Redes Neurais de Computação , Processamento de Imagem Assistida por Computador/métodos , Água
4.
IEEE Trans Pattern Anal Mach Intell ; 44(12): 9742-9755, 2022 Dec.
Artigo em Inglês | MEDLINE | ID: mdl-34941502

RESUMO

Template-based discriminative trackers are currently the dominant tracking paradigm due to their robustness, but are restricted to bounding box tracking and a limited range of transformation models, which reduces their localization accuracy. We propose a discriminative single-shot segmentation tracker - D3S 2, which narrows the gap between visual object tracking and video object segmentation. A single-shot network applies two target models with complementary geometric properties, one invariant to a broad range of transformations, including non-rigid deformations, the other assuming a rigid object to simultaneously achieve robust online target segmentation. The overall tracking reliability is further increased by decoupling the object and feature scale estimation. Without per-dataset finetuning, and trained only for segmentation as the primary output, D3S 2 outperforms all published trackers on the recent short-term tracking benchmark VOT2020 and performs very close to the state-of-the-art trackers on the GOT-10k, TrackingNet, OTB100 and LaSoT. D3S 2 outperforms the leading segmentation tracker SiamMask on video object segmentation benchmarks and performs on par with top video object segmentation algorithms.

5.
IEEE Trans Cybern ; 51(12): 6305-6318, 2021 Dec.
Artigo em Inglês | MEDLINE | ID: mdl-32248144

RESUMO

A long-term visual object tracking performance evaluation methodology and a benchmark are proposed. Performance measures are designed by following a long-term tracking definition to maximize the analysis probing strength. The new measures outperform existing ones in interpretation potential and in better distinguishing between different tracking behaviors. We show that these measures generalize the short-term performance measures, thus linking the two tracking problems. Furthermore, the new measures are highly robust to temporal annotation sparsity and allow annotation of sequences hundreds of times longer than in the current datasets without increasing manual annotation labor. A new challenging dataset of carefully selected sequences with many target disappearances is proposed. A new tracking taxonomy is proposed to position trackers on the short-term/long-term spectrum. The benchmark contains an extensive evaluation of the largest number of long-term trackers and comparison to state-of-the-art short-term trackers. We analyze the influence of tracking architecture implementations to long-term performance and explore various redetection strategies as well as the influence of visual model update strategies to long-term tracking drift. The methodology is integrated in the VOT toolkit to automate experimental analysis and benchmarking and to facilitate the future development of long-term trackers.


Assuntos
Algoritmos , Gravação em Vídeo
6.
IEEE Trans Cybern ; 48(6): 1849-1861, 2018 Jun.
Artigo em Inglês | MEDLINE | ID: mdl-28678728

RESUMO

Deformable parts models show a great potential in tracking by principally addressing nonrigid object deformations and self occlusions, but according to recent benchmarks, they often lag behind the holistic approaches. The reason is that potentially large number of degrees of freedom have to be estimated for object localization and simplifications of the constellation topology are often assumed to make the inference tractable. We present a new formulation of the constellation model with correlation filters that treats the geometric and visual constraints within a single convex cost function and derive a highly efficient optimization for maximum a posteriori inference of a fully connected constellation. We propose a tracker that models the object at two levels of detail. The coarse level corresponds a root correlation filter and a novel color model for approximate object localization, while the mid-level representation is composed of the new deformable constellation of correlation filters that refine the object location. The resulting tracker is rigorously analyzed on a highly challenging OTB, VOT2014, and VOT2015 benchmarks, exhibits a state-of-the-art performance and runs in real-time.

7.
IEEE Trans Image Process ; 25(3): 1261-74, 2016 Mar.
Artigo em Inglês | MEDLINE | ID: mdl-26812723

RESUMO

The problem of visual tracking evaluation is sporting a large variety of performance measures, and largely suffers from lack of consensus about which measures should be used in experiments. This makes the cross-paper tracker comparison difficult. Furthermore, as some measures may be less effective than others, the tracking results may be skewed or biased toward particular tracking aspects. In this paper, we revisit the popular performance measures and tracker performance visualizations and analyze them theoretically and experimentally. We show that several measures are equivalent from the point of information they provide for tracker comparison and, crucially, that some are more brittle than the others. Based on our analysis, we narrow down the set of potential measures to only two complementary ones, describing accuracy and robustness, thus pushing toward homogenization of the tracker evaluation methodology. These two measures can be intuitively interpreted and visualized and have been employed by the recent visual object tracking challenges as the foundation for the evaluation methodology.

8.
IEEE Trans Pattern Anal Mach Intell ; 38(11): 2137-2155, 2016 11.
Artigo em Inglês | MEDLINE | ID: mdl-26766217

RESUMO

This paper addresses the problem of single-target tracker performance evaluation. We consider the performance measures, the dataset and the evaluation system to be the most important components of tracker evaluation and propose requirements for each of them. The requirements are the basis of a new evaluation methodology that aims at a simple and easily interpretable tracker comparison. The ranking-based methodology addresses tracker equivalence in terms of statistical significance and practical differences. A fully-annotated dataset with per-frame annotations with several visual attributes is introduced. The diversity of its visual properties is maximized in a novel way by clustering a large number of videos according to their visual attributes. This makes it the most sophistically constructed and annotated dataset to date. A multi-platform evaluation system allowing easy integration of third-party trackers is presented as well. The proposed evaluation methodology was tested on the VOT2014 challenge on the new dataset and 38 trackers, making it the largest benchmark to date. Most of the tested trackers are indeed state-of-the-art since they outperform the standard baselines, resulting in a highly-challenging benchmark. An exhaustive analysis of the dataset from the perspective of tracking difficulty is carried out. To facilitate tracker comparison a new performance visualization technique is proposed.

9.
IEEE Trans Cybern ; 46(3): 641-54, 2016 Mar.
Artigo em Inglês | MEDLINE | ID: mdl-25838534

RESUMO

Obstacle detection plays an important role in unmanned surface vehicles (USVs). The USVs operate in a highly diverse environments in which an obstacle may be a floating piece of wood, a scuba diver, a pier, or a part of a shoreline, which presents a significant challenge to continuous detection from images taken on board. This paper addresses the problem of online detection by constrained, unsupervised segmentation. To this end, a new graphical model is proposed that affords a fast and continuous obstacle image-map estimation from a single video stream captured on board a USV. The model accounts for the semantic structure of marine environment as observed from USV by imposing weak structural constraints. A Markov random field framework is adopted and a highly efficient algorithm for simultaneous optimization of model parameters and segmentation mask estimation is derived. Our approach does not require computationally intensive extraction of texture features and comfortably runs in real time. The algorithm is tested on a new, challenging, dataset for segmentation, and obstacle detection in marine environments, which is the largest annotated dataset of its kind. Results on this dataset show that our model outperforms the related approaches, while requiring a fraction of computational effort.

10.
IEEE Trans Cybern ; 44(3): 355-65, 2014 Mar.
Artigo em Inglês | MEDLINE | ID: mdl-23757555

RESUMO

We propose a new method for a supervised online estimation of probabilistic discriminative models for classification tasks. The method estimates the class distributions from a stream of data in the form of Gaussian mixture models (GMMs). The reconstructive updates of the distributions are based on the recently proposed online kernel density estimator (oKDE). We maintain the number of components in the model low by compressing the GMMs from time to time. We propose a new cost function that measures loss of interclass discrimination during compression, thus guiding the compression toward simpler models that still retain discriminative properties. The resulting classifier thus independently updates the GMM of each class, but these GMMs interact during their compression through the proposed cost function. We call the proposed method the online discriminative kernel density estimator (odKDE). We compare the odKDE to oKDE, batch state-of-the-art kernel density estimators (KDEs), and batch/incremental support vector machines (SVM) on the publicly available datasets. The odKDE achieves comparable classification performance to that of best batch KDEs and SVM, while allowing online adaptation from large datasets, and produces models of lower complexity than the oKDE.

11.
IEEE Trans Pattern Anal Mach Intell ; 35(4): 941-53, 2013 Apr.
Artigo em Inglês | MEDLINE | ID: mdl-22802114

RESUMO

This paper addresses the problem of tracking objects which undergo rapid and significant appearance changes. We propose a novel coupled-layer visual model that combines the target's global and local appearance by interlacing two layers. The local layer in this model is a set of local patches that geometrically constrain the changes in the target's appearance. This layer probabilistically adapts to the target's geometric deformation, while its structure is updated by removing and adding the local patches. The addition of these patches is constrained by the global layer that probabilistically models the target's global visual properties, such as color, shape, and apparent local motion. The global visual properties are updated during tracking using the stable patches from the local layer. By this coupled constraint paradigm between the adaptation of the global and the local layer, we achieve a more robust tracking through significant appearance changes. We experimentally compare our tracker to 11 state-of-the-art trackers. The experimental results on challenging sequences confirm that our tracker outperforms the related trackers in many cases by having a smaller failure rate as well as better accuracy. Furthermore, the parameter analysis shows that our tracker is stable over a range of parameter values.


Assuntos
Processamento de Imagem Assistida por Computador/métodos , Algoritmos , Humanos , Modelos Teóricos , Movimento , Gravação em Vídeo
12.
Sensors (Basel) ; 13(1): 241-73, 2012 Dec 24.
Artigo em Inglês | MEDLINE | ID: mdl-23262485

RESUMO

We present a novel system for detection, localization and tracking of multiple people, which fuses a multi-view computer vision approach with a radio-based localization system. The proposed fusion combines the best of both worlds, excellent computer-vision-based localization, and strong identity information provided by the radio system, and is therefore able to perform tracking by identification, which makes it impervious to propagated identity switches. We present comprehensive methodology for evaluation of systems that perform person localization in world coordinate system and use it to evaluate the proposed system as well as its components. Experimental results on a challenging indoor dataset, which involves multiple people walking around a realistically cluttered room, confirm that proposed fusion of both systems significantly outperforms its individual components. Compared to the radio-based system, it achieves better localization results, while at the same time it successfully prevents propagation of identity switches that occur in pure computer-vision-based tracking.

13.
IEEE Trans Syst Man Cybern B Cybern ; 40(6): 1505-20, 2010 Dec.
Artigo em Inglês | MEDLINE | ID: mdl-20215085

RESUMO

We propose a new dynamic model which can be used within blob trackers to track the target's center of gravity. A strong point of the model is that it is designed to track a variety of motions which are usually encountered in applications such as pedestrian tracking, hand tracking, and sports. We call the dynamic model a two-stage dynamic model due to its particular structure, which is a composition of two models: a liberal model and a conservative model. The liberal model allows larger perturbations in the target's dynamics and is able to account for motions in between the random-walk dynamics and the nearly constant-velocity dynamics. On the other hand, the conservative model assumes smaller perturbations and is used to further constrain the liberal model to the target's current dynamics. We implement the two-stage dynamic model in a two-stage probabilistic tracker based on the particle filter and apply it to two separate examples of blob tracking: 1) tracking entire persons and 2) tracking of a person's hands. Experiments show that, in comparison to the widely used models, the proposed two-stage dynamic model allows tracking with smaller number of particles in the particle filter (e.g., 25 particles), while achieving smaller errors in the state estimation and a smaller failure rate. The results suggest that the improved performance comes from the model's ability to actively adapt to the target's motion during tracking.


Assuntos
Algoritmos , Inteligência Artificial , Interpretação de Imagem Assistida por Computador/métodos , Modelos Teóricos , Reconhecimento Automatizado de Padrão/métodos , Simulação por Computador , Movimento (Física)
14.
ISA Trans ; 43(3): 329-42, 2004 Jul.
Artigo em Inglês | MEDLINE | ID: mdl-15272790

RESUMO

In this paper a global vision scheme for estimation of positions and orientations of mobile robots is presented. It is applied to robot soccer application which is a fast dynamic game and therefore needs an efficient and robust vision system implemented. General applicability of the vision system can be found in other robot applications such as mobile transport robots in production, warehouses, attendant robots, fast vision tracking of targets of interest and entertainment robotics. Basic operation of the vision system is divided into two steps. In the first, the incoming image is scanned and pixels are classified into a finite number of classes. At the same time, a segmentation algorithm is used to find corresponding regions belonging to one of the classes. In the second step, all the regions are examined. Selection of the ones that are a part of the observed object is made by means of simple logic procedures. The novelty is focused on optimization of the processing time needed to finish the estimation of possible object positions. Better results of the vision system are achieved by implementing camera calibration and shading correction algorithm. The former corrects camera lens distortion, while the latter increases robustness to irregular illumination conditions.


Assuntos
Algoritmos , Inteligência Artificial , Comportamento Cooperativo , Interpretação de Imagem Assistida por Computador/métodos , Reconhecimento Automatizado de Padrão , Robótica/métodos , Interpretação de Imagem Assistida por Computador/instrumentação , Futebol , Gravação em Vídeo/instrumentação , Gravação em Vídeo/métodos
SELEÇÃO DE REFERÊNCIAS
DETALHE DA PESQUISA