1.
Article in English | MEDLINE | ID: mdl-39150798

ABSTRACT

We introduce Metric3D v2, a geometric foundation model for zero-shot metric depth and surface normal estimation from a single image, which is crucial for metric 3D recovery. While depth and normal are geometrically related and highly complementary, they present distinct challenges. State-of-the-art (SoTA) monocular depth methods achieve zero-shot generalization by learning affine-invariant depths, which cannot recover real-world metrics. Meanwhile, SoTA normal estimation methods have limited zero-shot performance due to the lack of large-scale labeled data. To tackle these issues, we propose solutions for both metric depth estimation and surface normal estimation. For metric depth estimation, we show that the key to a zero-shot single-view model lies in resolving the metric ambiguity from various camera models and large-scale data training. We propose a canonical camera space transformation module, which explicitly addresses the ambiguity problem and can be effortlessly plugged into existing monocular models. For surface normal estimation, we propose a joint depth-normal optimization module to distill diverse data knowledge from metric depth, enabling normal estimators to learn beyond normal labels. Equipped with these modules, our depth-normal models can be stably trained on over 16 million images from thousands of camera models with different types of annotations, resulting in zero-shot generalization to in-the-wild images with unseen camera settings. Our method currently ranks first on various zero-shot and non-zero-shot benchmarks for metric depth, affine-invariant depth, and surface-normal prediction, as shown in Fig. 1. Notably, we surpassed the very recent MarigoldDepth and DepthAnything on various depth benchmarks, including NYUv2 and KITTI. Our method enables the accurate recovery of metric 3D structures on randomly collected internet images, paving the way for plausible single-image metrology.
The potential benefits extend to downstream tasks, which can be significantly improved by simply plugging in our model. For example, our model relieves the scale drift issues of monocular SLAM (Fig. 3), leading to high-quality metric-scale dense mapping. These applications highlight the versatility of Metric3D v2 models as geometric foundation models. Our project page is at https://JUGGHM.github.io/Metric3Dv2.
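The canonical camera space idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: `CANONICAL_FOCAL` is a hypothetical value, and the nearest-neighbour resize stands in for whatever resampling the real pipeline uses. The key point is that resizing the image by `f_canonical / f` makes inputs from different cameras look as if taken by one canonical camera, and the predicted canonical depth is mapped back to metric depth by the inverse factor.

```python
import numpy as np

CANONICAL_FOCAL = 1000.0  # hypothetical canonical focal length, in pixels


def to_canonical(image: np.ndarray, focal: float):
    """Resize an image so its focal length matches the canonical camera.

    Returns the resized image and the factor that maps the network's
    canonical-space depth back to real metric depth.
    """
    scale = CANONICAL_FOCAL / focal
    h, w = image.shape[:2]
    new_h, new_w = int(round(h * scale)), int(round(w * scale))
    # Nearest-neighbour resize via index sampling (stand-in for a real resizer).
    rows = np.clip((np.arange(new_h) / scale).astype(int), 0, h - 1)
    cols = np.clip((np.arange(new_w) / scale).astype(int), 0, w - 1)
    return image[np.ix_(rows, cols)], 1.0 / scale


def from_canonical(depth_canonical: np.ndarray, focal: float) -> np.ndarray:
    """Undo the canonical transform: depth_metric = depth_canonical * f / f_c."""
    return depth_canonical * (focal / CANONICAL_FOCAL)
```

Because the ambiguity is handled entirely by this pre/post transform, the depth network in between never needs to know the true focal length, which is what lets it be trained on images from thousands of different cameras.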

2.
Sci Robot ; 9(90): eadj8124, 2024 May 29.
Article in English | MEDLINE | ID: mdl-38809998

ABSTRACT

Neuromorphic vision sensors, or event cameras, have made visual perception with extremely low reaction time possible, opening new avenues for high-dynamic robotics applications. An event camera's output depends on both motion and texture. However, the event camera fails to capture object edges that are parallel to the camera motion. This is a problem intrinsic to the sensor and therefore challenging to solve algorithmically. Human vision deals with perceptual fading through the active mechanism of small involuntary eye movements, the most prominent of which are called microsaccades. By moving the eyes constantly and slightly during fixation, microsaccades can substantially maintain texture stability and persistence. Inspired by microsaccades, we designed an event-based perception system capable of simultaneously maintaining low reaction time and stable texture. In this design, a rotating wedge prism was mounted in front of the aperture of an event camera to redirect light and trigger events. The geometrical optics of the rotating wedge prism allows for algorithmic compensation of the additional rotational motion, resulting in a stable texture appearance and high informational output independent of external motion. The hardware device and software solution are integrated into a system, which we call the artificial microsaccade-enhanced event camera (AMI-EV). Benchmark comparisons validated the superior data quality of AMI-EV recordings in scenarios where both standard cameras and event cameras fail to deliver. Various real-world experiments demonstrated the potential of the system to facilitate robotics perception in both low-level and high-level vision tasks.
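The compensation step can be illustrated with a small sketch. A rotating wedge prism deflects the line of sight by a fixed angle, so the image traces a circle whose radius in pixels depends on the focal length and the prism's deviation angle. If the prism's rotation rate and phase are known (e.g., from the motor encoder), the induced circular shift can be subtracted from each event's coordinates. The function below is an illustrative assumption about that geometry, not the authors' published code; `prism_rate` and `deviation_px` are hypothetical parameters.

```python
import math


def compensate_event(x, y, t, prism_rate, deviation_px, phase=0.0):
    """Remove the wedge-prism-induced circular image shift from one event.

    prism_rate:   prism rotation rate in rad/s (assumed known from the motor).
    deviation_px: radius of the circular image shift in pixels,
                  roughly f * tan(delta) for deviation angle delta.
    """
    theta = prism_rate * t + phase       # prism angle at the event timestamp
    dx = deviation_px * math.cos(theta)  # known circular offset at this angle
    dy = deviation_px * math.sin(theta)
    return x - dx, y - dy
```

Because events carry microsecond timestamps, each one can be compensated individually at its own `t`, leaving only the scene-induced motion in the corrected event stream.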


Subjects
Algorithms, Equipment Design, Robotics, Saccades, Visual Perception, Robotics/instrumentation, Humans, Saccades/physiology, Visual Perception/physiology, Motion (Physics), Software, Reaction Time/physiology, Biomimetics/instrumentation, Ocular Fixation/physiology, Eye Movements/physiology, Ocular Vision/physiology
3.
IEEE Trans Neural Netw Learn Syst ; 34(8): 4868-4880, 2023 Aug.
Article in English | MEDLINE | ID: mdl-34767515

ABSTRACT

Identifying independently moving objects is an essential task for dynamic scene understanding. However, traditional cameras used in dynamic scenes may suffer from motion blur or exposure artifacts due to their sampling principle. By contrast, event-based cameras are novel bio-inspired sensors that offer advantages to overcome such limitations. They report pixel-wise intensity changes asynchronously, which enables them to acquire visual information at exactly the same rate as the scene dynamics. We develop a method to identify independently moving objects acquired with an event-based camera, that is, to solve the event-based motion segmentation problem. We cast the problem as an energy minimization problem involving the fitting of multiple motion models. We jointly solve two sub-problems, namely event-cluster assignment (labeling) and motion model fitting, in an iterative manner by exploiting the structure of the input event data in the form of a spatio-temporal graph. Experiments on available datasets demonstrate the versatility of the method in scenes with different motion patterns and numbers of moving objects. The evaluation shows state-of-the-art results without having to predetermine the number of expected moving objects. We release the software and dataset under an open-source license to foster research in the emerging topic of event-based motion segmentation.
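The alternation between labeling and model fitting can be demonstrated on a toy version of the problem. The sketch below is far simpler than the paper's graph-based energy: events are 1-D positions `x` at times `t`, each motion model is a single constant velocity, and the residual of an event under model `j` is `|x - v_j * t|`. The alternating structure (assign events to the best-fitting model, then refit each model to its events) mirrors the joint solution of the two sub-problems described above; everything else is an illustrative simplification.

```python
import numpy as np


def segment_motions(x, t, n_models, iters=20, seed=0):
    """Toy alternating minimization for event-based motion segmentation.

    Alternates an event-to-model assignment step with a per-model
    least-squares velocity refit, k-means style.
    """
    rng = np.random.default_rng(seed)
    v = rng.normal(size=n_models)                  # random initial velocities
    labels = np.zeros(len(x), dtype=int)
    for _ in range(iters):
        # Assignment step: residual of every event under every model.
        residuals = np.abs(x[None, :] - v[:, None] * t[None, :])
        labels = residuals.argmin(axis=0)
        # Model-fitting step: least-squares velocity per cluster.
        for j in range(n_models):
            mask = labels == j
            if mask.any():
                v[j] = (x[mask] @ t[mask]) / (t[mask] @ t[mask] + 1e-12)
    return labels, v
```

The real method replaces the independent per-event assignment with inference over a spatio-temporal graph, which is what lets it regularize the labeling and avoid fixing the number of objects in advance.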

4.
Sensors (Basel) ; 22(7), 2022 Apr 02.
Article in English | MEDLINE | ID: mdl-35408352

ABSTRACT

Traditional calibration methods rely on the accurate localization of the chessboard points in images, and their maximum likelihood estimation (MLE)-based optimization models implicitly require all detected points to have an identical uncertainty. The uncertainties of the detected control points are mainly determined by the camera pose, the slant of the chessboard, and the inconsistent imaging capabilities of the camera. The negative influence of the uncertainties induced by the first two factors can be eliminated by adequate data sampling. However, the last factor causes the detected control points from some sensor areas to have larger uncertainties than those from other sensor areas. This causes the final calibrated parameters to overfit the control points located in the poorer sensor areas. In this paper, we present a method for measuring the uncertainties of the detected control points and incorporating these measured uncertainties into the optimization model of the geometric calibration. The new model suppresses the influence of control points with large uncertainties while amplifying the contributions of points with small uncertainties to the final convergence. We demonstrate the usability of the proposed method by first using eight cameras to collect a calibration dataset and then comparing our method with other recent works and the calibration module in OpenCV on that dataset.
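The core modeling change, weighting each residual by its measured uncertainty, can be shown in miniature with a weighted least-squares fit. The snippet below is a generic sketch, not the paper's calibration pipeline: it fits a linear model where each observation carries its own standard deviation `sigma_i`, so high-uncertainty points are down-weighted exactly as high-uncertainty control points are in the weighted calibration objective.

```python
import numpy as np


def weighted_fit(A, b, sigma):
    """Weighted least squares: minimize sum_i ((A @ x - b)_i / sigma_i)^2.

    Observations with large sigma contribute little to the solution,
    analogous to down-weighting noisy control points in calibration.
    """
    w = 1.0 / np.asarray(sigma, dtype=float)
    Aw = A * w[:, None]   # scale each row of the design matrix by 1/sigma_i
    bw = b * w
    x, *_ = np.linalg.lstsq(Aw, bw, rcond=None)
    return x
```

With uniform weights this reduces to the standard MLE model the paper criticizes; the improvement comes entirely from measuring per-point `sigma` values rather than assuming they are identical.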


Subjects
Diagnostic Imaging, Calibration, Uncertainty
5.
J Comput Chem ; 42(13): 908-916, 2021 May 15.
Article in English | MEDLINE | ID: mdl-33729600

ABSTRACT

The noncovalent interactions involving the heteronuclear ethylene analogues H2CEH2 (E = Si, Ge and Sn) have been studied with Møller-Plesset perturbation theory to investigate the competition and cooperativity between the hydrogen/halogen bond and the π-hole bond. H2CEH2 plays a dual role as both Lewis base and Lewis acid, with a region of π-electron accumulation above the carbon atom and a region of π-electron depletion (π-hole) above the E atom, allowing it to participate in the NCX···CE (X = H and Cl) hydrogen/halogen bond and the CE···NCY (Y = H, Cl, Li and Na) π-hole bond, respectively. When HCN/ClCN interacts with H2CEH2 through both sites, the hydrogen/halogen bond is stronger than the π-hole bond. The π-hole bond becomes markedly stronger when the metal-substituted YCN (Y = Li and Na) interacts with H2CEH2, showing partial covalent character; its strength then far exceeds that of the hydrogen/halogen bond. In the ternary complexes, both the hydrogen/halogen bond and the π-hole bond are strengthened simultaneously compared with the binary complexes, especially in the systems containing an alkali metal.
