Results 1 - 8 of 8
1.
IEEE Trans Pattern Anal Mach Intell ; 44(1): 154-180, 2022 Jan.
Article in English | MEDLINE | ID: mdl-32750812

ABSTRACT

Event cameras are bio-inspired sensors that differ from conventional frame cameras: instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes and output a stream of events that encode the time, location, and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (on the order of µs), very high dynamic range (140 dB versus 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz), resulting in reduced motion blur. Hence, event cameras have large potential for robotics and computer vision in scenarios that challenge traditional cameras, such as low-latency, high-speed, and high-dynamic-range applications. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available, and the tasks that they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world.
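To make the event-stream representation above concrete, here is a minimal sketch (the `Event` record and `accumulate` helper are illustrative constructions, not code from the paper) that sums per-pixel event polarities into a frame-like image, one common way of converting an asynchronous event stream back into a dense representation:

```python
from dataclasses import dataclass

@dataclass
class Event:
    t: float   # timestamp in seconds
    x: int     # pixel column
    y: int     # pixel row
    p: int     # polarity: +1 brightness increase, -1 decrease

def accumulate(events, width, height):
    """Sum event polarities per pixel into a 2D frame-like image."""
    img = [[0] * width for _ in range(height)]
    for e in events:
        img[e.y][e.x] += e.p
    return img

events = [Event(0.001, 2, 1, +1), Event(0.002, 2, 1, +1), Event(0.003, 0, 0, -1)]
frame = accumulate(events, width=4, height=3)
```

Real event-camera pipelines typically also exploit the microsecond timestamps, e.g. by weighting events by recency, which this sketch omits.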


Subjects
Algorithms , Robotics , Neural Networks, Computer
2.
Comput Vis ECCV ; 12363: 1-17, 2020.
Article in English | MEDLINE | ID: mdl-35822859

ABSTRACT

Automated capture of animal pose is transforming how we study neuroscience and social behavior. Movements carry important social cues, but current methods are not able to robustly estimate pose and shape of animals, particularly for social animals such as birds, which are often occluded by each other and objects in the environment. To address this problem, we first introduce a model and multi-view optimization approach, which we use to capture the unique shape and pose space displayed by live birds. We then introduce a pipeline and experiments for keypoint, mask, pose, and shape regression that recovers accurate avian postures from single views. Finally, we provide extensive multi-view keypoint and mask annotations collected from a group of 15 social birds housed together in an outdoor aviary. The project website with videos, results, code, mesh model, and the Penn Aviary Dataset can be found at https://marcbadger.github.io/avian-mesh.

3.
IEEE Trans Pattern Anal Mach Intell ; 41(4): 901-914, 2019 Apr.
Article in English | MEDLINE | ID: mdl-29993801

ABSTRACT

Recovering 3D full-body human pose is a challenging problem with many applications. It has been successfully addressed by motion capture systems with body-worn markers and multiple cameras. In this paper, we address the more challenging case of not only using a single camera but also not leveraging markers: going directly from 2D appearance to 3D geometry. Deep learning approaches have shown remarkable abilities to discriminatively learn 2D appearance features. The missing piece is how to integrate 2D, 3D, and temporal information to recover 3D geometry and account for the uncertainties arising from the discriminative model. We introduce a novel approach that treats 2D joint locations as latent variables whose uncertainty distributions are given by a deep fully convolutional neural network. The unknown 3D poses are modeled by a sparse representation, and the 3D parameter estimates are realized via an Expectation-Maximization algorithm, where it is shown that the 2D joint location uncertainties can be conveniently marginalized out during inference. Extensive evaluation on benchmark datasets shows that the proposed approach achieves greater accuracy than state-of-the-art baselines. Notably, the proposed approach does not require synchronized 2D-3D data for training and is applicable to "in-the-wild" images, which is demonstrated with the MPII dataset.
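As a toy illustration of treating a 2D joint location as uncertain, the sketch below (hypothetical, not the paper's implementation) computes the expected joint position under a CNN-style heatmap by summing over the distribution, the simplest way to reason over the full 2D location distribution rather than a single detected point:

```python
def expected_location(heatmap):
    """Expected (x, y) of a joint under a heatmap, treated as an
    (unnormalized) probability distribution over pixel locations."""
    total = sum(sum(row) for row in heatmap)
    ex = sum(x * v for row in heatmap for x, v in enumerate(row)) / total
    ey = sum(y * v for y, row in enumerate(heatmap) for v in row) / total
    return ex, ey

# Toy heatmap: probability mass split between pixels (1, 1) and (2, 1).
hm = [[0.0, 0.0, 0.0],
      [0.0, 0.5, 0.5],
      [0.0, 0.0, 0.0]]
ex, ey = expected_location(hm)
```

The paper's EM formulation goes further, marginalizing these distributions analytically inside the 3D inference rather than collapsing them to a point estimate first.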

4.
IEEE Trans Pattern Anal Mach Intell ; 39(8): 1648-1661, 2017 Aug.
Article in English | MEDLINE | ID: mdl-28113356

ABSTRACT

We investigate the problem of estimating the 3D shape of an object defined by a set of 3D landmarks, given their 2D correspondences in a single image. A successful approach to alleviating the reconstruction ambiguity is the 3D deformable shape model, and a sparse representation is often used to capture complex shape variability. However, model inference remains challenging due to the nonconvexity of the joint optimization of shape and viewpoint. In contrast to prior work that relies on an alternating scheme whose solution depends on initialization, we propose a convex approach to addressing this challenge and develop an efficient algorithm to solve the proposed convex program. We further propose a robust model to handle gross errors in the 2D correspondences. We demonstrate the exact recovery property of the proposed method, its advantage over several nonconvex baselines, and its applicability to recovering 3D human poses and car models from single images.
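The deformable shape model mentioned above can be sketched as a mean shape plus a linear combination of basis shapes, observed under a simple orthographic projection. This is a hedged, simplified stand-in: the paper's actual contribution, a convex solver for shape and viewpoint jointly, is not reproduced here, and all names are illustrative.

```python
def compose_shape(mean, bases, coeffs):
    """3D shape as mean plus a (sparse) linear combination of basis shapes.
    Shapes are sequences of (X, Y, Z) landmark coordinates."""
    shape = [list(p) for p in mean]
    for c, B in zip(coeffs, bases):
        for i, (bx, by, bz) in enumerate(B):
            shape[i][0] += c * bx
            shape[i][1] += c * by
            shape[i][2] += c * bz
    return shape

def project_orthographic(shape):
    """Drop the depth coordinate: (X, Y, Z) -> (X, Y)."""
    return [(X, Y) for X, Y, Z in shape]

mean = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0)]
basis = [[(0.0, 1.0, 0.0), (0.0, 1.0, 0.0)]]  # one basis: shift both landmarks in Y
pts2d = project_orthographic(compose_shape(mean, basis, [2.0]))
```

The hard inverse problem the paper addresses is recovering both the coefficients and the (here omitted) rotation from `pts2d` alone.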

5.
IEEE Trans Pattern Anal Mach Intell ; 36(9): 1832-46, 2014 Sep.
Article in English | MEDLINE | ID: mdl-26352235

ABSTRACT

In this paper, we propose a new approach to compute the scale space of any central projection system, such as catadioptric, fisheye, or conventional cameras. Since these systems can be explained using a unified model, the single parameter that defines each type of system is used to automatically compute the corresponding Riemannian metric. This metric, combined with the partial differential equations framework on manifolds, allows us to compute the Laplace-Beltrami (LB) operator, enabling the computation of the scale space of any central projection system. Scale space is essential for intrinsic scale selection and neighborhood description in features like SIFT. We perform experiments with synthetic and real images to validate the generalization of our approach to any central projection system. We compare our approach with the best existing methods, showing competitive results for all types of cameras: catadioptric, fisheye, and perspective.
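A scale space of this kind can be pictured as diffusion under the heat equation driven by a Laplacian. The sketch below is an illustrative 1D Euclidean special case, not the paper's Laplace-Beltrami construction: it smooths a signal by explicit heat-equation steps and stacks the progressively blurred versions.

```python
def heat_step(u, dt=0.2):
    """One explicit step of the 1D heat equation u_t = u_xx with periodic
    boundaries (the unified model replaces this Euclidean Laplacian with
    the Laplace-Beltrami operator of the camera's Riemannian metric)."""
    n = len(u)
    return [u[i] + dt * (u[(i - 1) % n] - 2 * u[i] + u[(i + 1) % n])
            for i in range(n)]

def scale_space(u, levels=3, steps_per_level=5):
    """Stack of progressively smoothed versions of the signal."""
    space = [u]
    for _ in range(levels):
        v = space[-1]
        for _ in range(steps_per_level):
            v = heat_step(v)
        space.append(v)
    return space

signal = [0.0] * 8
signal[4] = 1.0          # unit impulse
pyramid = scale_space(signal)
```

Diffusion preserves total mass while flattening peaks, which is what makes the stack usable for scale selection.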

6.
IEEE Trans Pattern Anal Mach Intell ; 34(4): 818-24, 2012 Apr.
Article in English | MEDLINE | ID: mdl-22144517

ABSTRACT

This paper presents two new, efficient solutions to the two-view, relative pose problem from three image point correspondences and one common reference direction. This three-plus-one problem can be used either as a substitute for the classic five-point algorithm, using a vanishing point for the reference direction, or to make use of an inertial measurement unit commonly available on robots and mobile devices where the gravity vector becomes the reference direction. We provide a simple, closed-form solution and a solution based on algebraic geometry which offers numerical advantages. In addition, we introduce a new method for computing visual odometry with RANSAC and four point correspondences per hypothesis. In a set of real experiments, we demonstrate the power of our approach by comparing it to the five-point method in a hypothesize-and-test visual odometry setting.
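The hypothesize-and-test loop mentioned above follows the standard RANSAC pattern: fit a model from a minimal sample, count inliers, keep the best hypothesis. The generic sketch below uses four correspondences per hypothesis as in the paper, but the translation model and all names are illustrative placeholders, not the paper's relative-pose solver.

```python
import random

def ransac(correspondences, fit, error, sample_size=4, iters=100, thresh=0.1, seed=0):
    """Generic hypothesize-and-test: fit a model from minimal samples,
    keep the hypothesis with the most inliers."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iters):
        sample = rng.sample(correspondences, sample_size)
        model = fit(sample)
        inliers = [c for c in correspondences if error(model, c) < thresh]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers

# Toy model: a pure 2D translation between matched points.
def fit_translation(sample):
    n = len(sample)
    return (sum(q[0] - p[0] for p, q in sample) / n,
            sum(q[1] - p[1] for p, q in sample) / n)

def translation_error(t, corr):
    (px, py), (qx, qy) = corr
    return abs(px + t[0] - qx) + abs(py + t[1] - qy)

pts = [((float(i), float(i)), (i + 1.0, i + 2.0)) for i in range(8)]
pts.append(((0.0, 0.0), (9.0, 9.0)))  # one gross outlier
model, inliers = ransac(pts, fit_translation, translation_error)
```

Shrinking the minimal sample from five points to three-plus-one (or four, with the reference direction) reduces the number of RANSAC iterations needed for a given inlier ratio, which is the practical payoff of the paper's solvers.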


Subjects
Equipment and Supplies/standards , Imaging, Three-Dimensional/methods , Robotics , Algorithms , Visual Perception/physiology
7.
IEEE Trans Pattern Anal Mach Intell ; 28(7): 1170-5, 2006 Jul.
Article in English | MEDLINE | ID: mdl-16792105

ABSTRACT

This paper addresses the problem of rotation estimation directly from images defined on the sphere and without correspondence. The method is particularly useful for the alignment of large rotations and has potential impact on 3D shape alignment. The foundation of the method lies in the fact that the spherical harmonic coefficients undergo a unitary mapping when the original image is rotated. The correlation between two images is a function of rotations and we show that it has an SO(3)-Fourier transform equal to the pointwise product of spherical harmonic coefficients of the original images. The resolution of the rotation space depends on the bandwidth we choose for the harmonic expansion and the rotation estimate is found through a direct search in this 3D discretized space. A refinement of the rotation estimate can be obtained from the conservation of harmonic coefficients in the rotational shift theorem. A novel decoupling of the shift theorem with respect to the Euler angles is presented and exploited in an iterative scheme to refine the initial rotation estimates. Experiments show the suitability of the method for large rotations and the dependence of the method on bandwidth and the choice of the spherical harmonic coefficients.
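The idea of estimating a rotation as the peak of a correlation function has a simple 1D analogue: finding a circular shift between two signals. The sketch below is illustrative only, and evaluates the correlation by brute force; the paper works on the sphere and computes the correlation efficiently as a pointwise product of spherical harmonic coefficients via the SO(3) Fourier transform.

```python
import math

def circular_correlation_peak(a, b):
    """Find the circular shift of `a` that best matches `b`, by exhaustive
    correlation over all shifts (1D analogue of correlating over rotations)."""
    n = len(a)
    best_shift, best_score = 0, float("-inf")
    for s in range(n):
        score = sum(a[(i - s) % n] * b[i] for i in range(n))
        if score > best_score:
            best_shift, best_score = s, score
    return best_shift

# A sine wave with a spike, and a copy circularly shifted by 7 samples.
signal = [math.sin(2 * math.pi * i / 32) + (1.0 if i == 5 else 0.0) for i in range(32)]
shifted = [signal[(i - 7) % 32] for i in range(32)]
```

As in the paper, the resolution of the recovered shift is limited by the sampling of the search space; a refinement step then sharpens the initial estimate.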


Subjects
Algorithms , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Imaging, Three-Dimensional/methods , Information Storage and Retrieval/methods , Rotation
8.
IEEE Trans Pattern Anal Mach Intell ; 26(5): 667-71, 2004 May.
Article in English | MEDLINE | ID: mdl-15460289

ABSTRACT

In this paper, we study the Vapnik-Chervonenkis (VC) dimension of set systems arising in 2D polygonal and 3D polyhedral configurations where a subset consists of all points visible from one camera. In the past, it has been shown that the VC-dimension of planar visibility systems is bounded by 23 if the cameras are allowed to be anywhere inside a polygon without holes. Here, we consider the case of exterior visibility, where the cameras lie in a constrained region outside the polygon and have to observe the entire boundary. We present results for the cases of cameras lying on a circle containing a polygon (VC-dimension = 2) or lying outside the convex hull of a polygon (VC-dimension = 5). The main result of this paper concerns the 3D case: we prove that the VC-dimension is unbounded if the cameras lie on a sphere containing the polyhedron, hence the term exterior visibility.
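The notion of shattering behind the VC-dimension can be checked by brute force on small examples. The sketch below is illustrative (intervals on a line, not the paper's visibility systems): it tests whether every subset of a point set is realized as an intersection with some set in the system, and searches for the largest shattered subset.

```python
from itertools import combinations

def shatters(set_system, points):
    """True if every subset of `points` equals S ∩ points for some S."""
    achieved = {frozenset(S & set(points)) for S in set_system}
    needed = {frozenset(c) for r in range(len(points) + 1)
              for c in combinations(points, r)}
    return needed <= achieved

def vc_dimension(set_system, ground):
    """Largest size of a shattered subset of `ground` (brute force)."""
    d = 0
    for r in range(1, len(ground) + 1):
        if any(shatters(set_system, c) for c in combinations(ground, r)):
            d = r
    return d

# Intervals over {1, 2, 3}: a classic system with VC-dimension 2
# (no interval can pick out {1, 3} without also containing 2).
ground = [1, 2, 3]
intervals = [set(range(a, b + 1)) for a in ground for b in ground if a <= b] + [set()]
```

For the visibility systems in the paper, the sets would instead be "all boundary points visible from camera position c", and the unbounded 3D result says no such brute-force bound exists.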


Subjects
Algorithms , Artificial Intelligence , Image Interpretation, Computer-Assisted/methods , Imaging, Three-Dimensional/methods , Information Storage and Retrieval/methods , Pattern Recognition, Automated , Subtraction Technique , Cluster Analysis , Computer Simulation , Image Enhancement/methods , Numerical Analysis, Computer-Assisted , Reproducibility of Results , Sensitivity and Specificity , Signal Processing, Computer-Assisted