1.
Sensors (Basel); 23(17), 2023 Aug 29.
Article in English | MEDLINE | ID: mdl-37687946

ABSTRACT

One of the main challenges faced by iris recognition systems is working with people in motion, where the sensor is at an increasing distance (more than 1 m) from the person. The ultimate goal is to make the system less intrusive and to require less cooperation from the person. When this scenario is implemented with a single static sensor, the sensor must have a wide field of view and the system must process a large number of frames per second (fps). Under these conditions, many of the captured eye images will not have adequate quality (contrast or resolution). This paper describes the implementation, on an MPSoC (multiprocessor system-on-chip), of an eye image detection system that integrates, in the programmable logic (PL) part, a functional block to evaluate the level of defocus blur of the captured images. The system can thereby discard images that lack the focus quality required by the subsequent processing steps. The proposed blocks were successfully designed using Vitis High-Level Synthesis (VHLS) and integrated into an eye detection framework capable of processing over 57 fps from a 16-Mpixel sensor. Experimental evaluation on an extended version of the CASIA-Iris-distance V4 database shows that the proposed framework successfully discards unfocused eye images. More importantly, in a real implementation this proposal discards up to 97% of out-of-focus eye images, which therefore never have to be processed by the segmentation and normalised iris pattern extraction blocks.
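The abstract does not specify the focus metric computed by the PL block. As an illustrative software analogue only, the sketch below scores defocus blur with the variance of the Laplacian, a common high-frequency energy measure; the function names and the threshold value are hypothetical and would have to be calibrated on focused/unfocused samples from the actual sensor.

```python
import numpy as np
from scipy.ndimage import laplace

def defocus_score(gray: np.ndarray) -> float:
    """Variance of the Laplacian: higher means sharper edges (better focus)."""
    return float(laplace(gray.astype(np.float64)).var())

def is_focused(gray: np.ndarray, threshold: float = 100.0) -> bool:
    # threshold is an illustrative placeholder, not the paper's value;
    # it must be tuned on real focused/unfocused eye images.
    return defocus_score(gray) >= threshold
```

A gate like this, applied per detected eye image, is what would allow out-of-focus candidates to be dropped before the costlier segmentation and iris-pattern extraction stages.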


Subjects
Biometry; Humans; Databases, Factual; Iris; Motion (Physics)
2.
Sensors (Basel); 16(12), 2016 Nov 26.
Article in English | MEDLINE | ID: mdl-27898029

ABSTRACT

Some image processing applications, such as tracking or pattern recognition, do not require the same resolution across the whole image sensor. In fact, they only need to keep resolution as high as possible in a relatively small region while covering a wide field of view. This is the aim of foveal vision systems. Briefly, these systems sense a large field of view at a spatially variant resolution: one relatively small region, the fovea, is mapped at high resolution, while the rest of the image is captured at a lower resolution. In these systems the fovea must be moved from one region of interest to another in order to scan the visual scene. Ideally, the part of the scene covered by the fovea should not be merely spatial but closely related to perceptual objects. Segmentation and attention are thus intimately tied together: while the segmentation process is responsible for extracting perceptually coherent entities (proto-objects) from the scene, attention can guide segmentation. From this loop the concept of foveal attention arises. This work proposes a hardware system for mapping a uniformly sampled sensor to a space-variant one. Furthermore, this mapping is tied to a software-based foveal attention mechanism that takes as input the stream of generated foveal images. The whole hardware/software architecture has been designed to be embedded within an All Programmable System-on-Chip (AP SoC). Our results show the flexibility of the data port for exchanging information between the mapping and attention parts of the architecture and the good performance of the mapping procedure. Experimental evaluation also demonstrates that the segmentation method and the attention model provide results comparable to other, more computationally expensive algorithms.
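As a rough software sketch of the uniform-to-foveal mapping idea (not the paper's hardware design), the following assumes a grayscale image and builds a fovea plus concentric regions at successively halved resolutions via block averaging. The function name and parameters are hypothetical.

```python
import numpy as np

def foveate(gray: np.ndarray, cx: int, cy: int,
            fovea: int = 32, rings: int = 3) -> list:
    """Fovea plus concentric levels at halved resolutions.

    Level i covers a window of side fovea * 2**i centred on (cx, cy),
    downsampled by a factor 2**i via block averaging (grayscale assumed).
    """
    h, w = gray.shape
    levels = []
    for i in range(rings + 1):
        half = (fovea << i) // 2
        y0, y1 = max(cy - half, 0), min(cy + half, h)
        x0, x1 = max(cx - half, 0), min(cx + half, w)
        win = gray[y0:y1, x0:x1].astype(np.float64)
        s = 1 << i
        # crop to a multiple of the scale factor, then block-average
        win = win[: win.shape[0] // s * s, : win.shape[1] // s * s]
        levels.append(
            win.reshape(win.shape[0] // s, s, win.shape[1] // s, s).mean(axis=(1, 3)))
    return levels
```

Moving the fixation point (cx, cy) from one proto-object to another is what the attention mechanism in the paper drives; here it is simply an argument.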

3.
Sensors (Basel); 14(6): 9522-45, 2014 May 28.
Article in English | MEDLINE | ID: mdl-24878593

ABSTRACT

One of the main issues within the field of social robotics is to endow robots with the ability to direct attention to people with whom they are interacting. Different approaches follow bio-inspired mechanisms, merging audio and visual cues to localize a person using multiple sensors. However, most of these fusion mechanisms have been used in fixed systems, such as those used in video-conference rooms, and thus, they may incur difficulties when constrained to the sensors with which a robot can be equipped. Besides, within the scope of interactive autonomous robots, there is a lack in terms of evaluating the benefits of audio-visual attention mechanisms, compared to only audio or visual approaches, in real scenarios. Most of the tests conducted have been within controlled environments, at short distances and/or with off-line performance measurements. With the goal of demonstrating the benefit of fusing sensory information with a Bayes inference for interactive robotics, this paper presents a system for localizing a person by processing visual and audio data. Moreover, the performance of this system is evaluated and compared via considering the technical limitations of unimodal systems. The experiments show the promise of the proposed approach for the proactive detection and tracking of speakers in a human-robot interactive framework.
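The abstract names Bayesian inference as the fusion mechanism but gives no details. A minimal sketch of the general idea, with a hypothetical discretisation of speaker azimuth and hypothetical Gaussian sensor models (the paper's actual likelihoods are not published here):

```python
import numpy as np

# Hypothetical discretisation: 36 azimuth bins of 10 degrees each.
N_BINS = 36
belief = np.full(N_BINS, 1.0 / N_BINS)   # uniform prior over azimuth

def bayes_update(belief: np.ndarray, likelihood: np.ndarray) -> np.ndarray:
    """Fuse one sensor's likelihood into the current belief."""
    posterior = belief * likelihood
    return posterior / posterior.sum()

def gaussian_likelihood(measured_bin: int, sigma: float) -> np.ndarray:
    """Soft evidence around a measured direction; sigma reflects sensor noise."""
    bins = np.arange(N_BINS)
    d = np.minimum(np.abs(bins - measured_bin), N_BINS - np.abs(bins - measured_bin))
    return np.exp(-0.5 * (d / sigma) ** 2)

# Audio direction-of-arrival is typically noisier (larger sigma)
# than a visual face detection, so it contributes softer evidence.
belief = bayes_update(belief, gaussian_likelihood(measured_bin=12, sigma=3.0))
belief = bayes_update(belief, gaussian_likelihood(measured_bin=11, sigma=1.0))
print("most likely speaker bin:", int(belief.argmax()))
```

The appeal of this formulation is that each modality can be dropped or down-weighted when unreliable, which is exactly the situation on a mobile robot with limited sensors.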


Subjects
Auditory Perception; Head/physiology; Models, Biological; Robotics/instrumentation; Robotics/methods; Visual Perception; Algorithms; Humans; Image Processing, Computer-Assisted; Man-Machine Systems; Signal Processing, Computer-Assisted
4.
Cogn Process; 14(1): 13-8, 2013 Mar.
Article in English | MEDLINE | ID: mdl-23328946

ABSTRACT

In biological vision systems, attention mechanisms are responsible for selecting the relevant information from the sensed field of view, so that the complete scene can be analyzed using a sequence of rapid eye saccades. In recent years, efforts have been made to imitate such attentional behavior in artificial vision systems, because it allows computational resources to be optimized by focusing them on a set of selected regions. In the framework of mobile robot navigation, this work proposes an artificial model in which attention is deployed at the level of objects (visual landmarks) and new processes are employed for estimating bottom-up and top-down (target-based) saliency maps. Bottom-up attention is implemented through a hierarchical process whose final result is the perceptual grouping of the image content. The hierarchical grouping is performed using a Combinatorial Pyramid that represents each level of the hierarchy by a combinatorial map; the process takes into account both image regions (faces in the map) and edges (arcs in the map). Top-down attention searches for previously detected landmarks, enabling their re-detection when the robot presumes that it is revisiting a known location. Since landmarks are described by combinatorial submaps, this search is conducted through an error-tolerant submap isomorphism procedure.
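The combinatorial-pyramid grouping itself is intricate; as a much-simplified stand-in for the idea of bottom-up perceptual grouping, the sketch below merges 4-connected pixels whose intensity difference falls below a threshold, using union-find. All names and the threshold are hypothetical, and this flat, single-level grouping ignores the hierarchy and the face/arc structure the paper relies on.

```python
import numpy as np

def group_regions(gray: np.ndarray, tau: float = 10.0) -> np.ndarray:
    """Greedy 4-neighbour grouping: merge pixels whose intensity
    difference is below tau. Returns a label map of proto-objects."""
    h, w = gray.shape
    parent = np.arange(h * w)

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    g = gray.astype(np.float64)
    for y in range(h):
        for x in range(w):
            if x + 1 < w and abs(g[y, x] - g[y, x + 1]) < tau:
                union(y * w + x, y * w + x + 1)
            if y + 1 < h and abs(g[y, x] - g[y + 1, x]) < tau:
                union(y * w + x, (y + 1) * w + x)
    return np.array([find(i) for i in range(h * w)]).reshape(h, w)
```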


Subjects
Attention/physiology; Robotics/instrumentation; Robotics/methods; Visual Perception/physiology; Computer Simulation; Models, Theoretical; Space Perception/physiology
5.
Article in English | MEDLINE | ID: mdl-25177289

ABSTRACT

Artificial vision systems cannot process all the information they receive from the world in real time, because doing so is highly expensive and inefficient in computational terms. Inspired by biological perception systems, artificial attention models aim to select only the relevant parts of the scene. In human vision, it is also well established that these units of attention are not merely spatial but closely related to perceptual objects (proto-objects). This implies a strong bidirectional relationship between segmentation and attention processes: while the segmentation process is responsible for extracting the proto-objects from the scene, attention can guide segmentation, giving rise to the concept of foveal attention. When the focus of attention is deployed from one visual unit to another, the rest of the scene is still perceived, but at a lower resolution than the focused object. The result is a multi-resolution visual perception in which the fovea, a dimple on the central retina, provides the highest-resolution vision. In this paper, a bottom-up foveal attention model is presented. In this model, the input image is a foveal image represented using a Cartesian Foveal Geometry (CFG), which encodes the field of view of the sensor as a fovea (placed at the focus of attention) surrounded by a set of concentric rings of decreasing resolution. Multi-resolution perceptual segmentation is then performed by building a foveal polygon using the Bounded Irregular Pyramid (BIP). Bottom-up attention is enclosed in the same structure, allowing the fovea to be set over the most salient image proto-object. Saliency is computed as a linear combination of multiple low-level features such as colour and intensity contrast, symmetry, orientation and roundness. Results obtained from natural images show that the combination of hierarchical foveal segmentation and saliency estimation performs well in terms of both accuracy and speed.
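The abstract states the saliency recipe (a linear combination of low-level feature maps) but not the weights. A minimal sketch with hypothetical equal weights and random placeholder feature maps, showing only the combination step and the selection of the next fixation:

```python
import numpy as np

def normalise(m: np.ndarray) -> np.ndarray:
    """Rescale a feature map to [0, 1] so maps are commensurable."""
    lo, hi = m.min(), m.max()
    return (m - lo) / (hi - lo) if hi > lo else np.zeros_like(m)

def saliency(features: dict, weights: dict) -> np.ndarray:
    """Weighted sum of normalised low-level feature maps."""
    return np.sum([weights[k] * normalise(v) for k, v in features.items()], axis=0)

# Placeholder feature maps; in the paper these come from colour and
# intensity contrast, symmetry, orientation, and roundness measures.
feats = {k: np.random.rand(64, 64) for k in
         ("colour", "intensity", "symmetry", "orientation", "roundness")}
w = {k: 1.0 / len(feats) for k in feats}   # hypothetical equal weights

s = saliency(feats, w)
fovea_y, fovea_x = np.unravel_index(s.argmax(), s.shape)  # next fixation point
```

In the paper the argmax is taken over proto-objects in the BIP structure rather than over raw pixels, so the fovea lands on a coherent region instead of an isolated peak.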
