Results 1 - 5 of 5
1.
Nature; 591(7849): 234-239, 2021 Mar.
Article in English | MEDLINE | ID: mdl-33692557

ABSTRACT

The ability to present three-dimensional (3D) scenes with continuous depth sensation has a profound impact on virtual and augmented reality, human-computer interaction, education and training. Computer-generated holography (CGH) enables high-spatio-angular-resolution 3D projection via numerical simulation of diffraction and interference [1]. Yet, existing physically based methods fail to produce holograms with both per-pixel focal control and accurate occlusion [2,3]. The computationally taxing Fresnel diffraction simulation further places an explicit trade-off between image quality and runtime, making dynamic holography impractical [4]. Here we demonstrate a deep-learning-based CGH pipeline capable of synthesizing a photorealistic colour 3D hologram from a single RGB-depth image in real time. Our convolutional neural network (CNN) is extremely memory efficient (below 620 kilobytes) and runs at 60 hertz for a resolution of 1,920 × 1,080 pixels on a single consumer-grade graphics processing unit. Leveraging low-power on-device artificial intelligence acceleration chips, our CNN also runs interactively on mobile (iPhone 11 Pro at 1.1 hertz) and edge (Google Edge TPU at 2.0 hertz) devices, promising real-time performance in future-generation virtual and augmented-reality mobile headsets. We enable this pipeline by introducing a large-scale CGH dataset (MIT-CGH-4K) with 4,000 pairs of RGB-depth images and corresponding 3D holograms. Our CNN is trained with differentiable wave-based loss functions [5] and physically approximates Fresnel diffraction. With an anti-aliasing phase-only encoding method, we experimentally demonstrate speckle-free, natural-looking, high-resolution 3D holograms. Our learning-based approach and the Fresnel hologram dataset will help to unlock the full potential of holography and enable applications in metasurface design [6,7], optical and acoustic tweezer-based microscopic manipulation [8-10], holographic microscopy [11] and single-exposure volumetric 3D printing [12,13].
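
The wave-based supervision described above rests on numerically simulating Fresnel diffraction. The following Python sketch is our illustration, not the authors' code; the grid size, pixel pitch and wavelength are assumed values. It shows the standard angular-spectrum way to propagate a complex hologram field, the kind of differentiable operation such a loss can be built on.

# Minimal sketch (not the paper's code): free-space propagation of a complex
# hologram field via the angular spectrum method, a standard way to simulate
# Fresnel diffraction. All names, grid sizes and wavelengths are illustrative.
import numpy as np

def angular_spectrum_propagate(field, wavelength, pitch, z):
    """Propagate a complex field by distance z (metres).

    field      : 2D complex array (hologram plane)
    wavelength : light wavelength in metres, e.g. 520e-9 for green
    pitch      : pixel pitch of the hologram in metres
    z          : propagation distance in metres
    """
    ny, nx = field.shape
    # Spatial-frequency grids for the hologram plane.
    fx = np.fft.fftfreq(nx, d=pitch)
    fy = np.fft.fftfreq(ny, d=pitch)
    FX, FY = np.meshgrid(fx, fy)
    # Free-space transfer function; evanescent components are suppressed.
    arg = 1.0 - (wavelength * FX) ** 2 - (wavelength * FY) ** 2
    kz = 2 * np.pi / wavelength * np.sqrt(np.maximum(arg, 0.0))
    H = np.exp(1j * kz * z) * (arg > 0)
    return np.fft.ifft2(np.fft.fft2(field) * H)

# Toy usage: a random phase-only hologram and the intensity it would form
# 3 mm away. A learned CGH model could be supervised by comparing such
# propagated intensities against a target focal stack.
phase = np.random.uniform(-np.pi, np.pi, (256, 256))
hologram = np.exp(1j * phase)
image_plane = angular_spectrum_propagate(hologram, 520e-9, 8e-6, 3e-3)
intensity = np.abs(image_plane) ** 2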


Subject(s)
Computer Graphics; Computer Systems; Holography/methods; Holography/standards; Neural Networks, Computer; Animals; Augmented Reality; Color; Datasets as Topic; Deep Learning; Microscopy; Optical Tweezers; Printing, Three-Dimensional; Time Factors; Virtual Reality
2.
Nature; 569(7758): 698-702, 2019 May.
Article in English | MEDLINE | ID: mdl-31142856

ABSTRACT

Humans can feel, weigh and grasp diverse objects, and simultaneously infer their material properties while applying the right amount of force, a challenging set of tasks for a modern robot [1]. Mechanoreceptor networks that provide sensory feedback and enable the dexterity of the human grasp [2] remain difficult to replicate in robots. Whereas computer-vision-based robot grasping strategies [3-5] have progressed substantially with the abundance of visual data and emerging machine-learning tools, there are as yet no equivalent sensing platforms and large-scale datasets with which to probe the use of the tactile information that humans rely on when grasping objects. Studying the mechanics of how humans grasp objects will complement vision-based robotic object handling. Importantly, the inability to record and analyse tactile signals currently limits our understanding of the role of tactile information in the human grasp itself; for example, how tactile maps are used to identify objects and infer their properties is unknown [6]. Here we use a scalable tactile glove and deep convolutional neural networks to show that sensors uniformly distributed over the hand can be used to identify individual objects, estimate their weight and explore the typical tactile patterns that emerge while grasping objects. The sensor array (548 sensors) is assembled on a knitted glove and consists of a piezoresistive film connected by a network of conductive thread electrodes that are passively probed. Using this low-cost (about US$10) scalable tactile glove sensor array, we record a large-scale tactile dataset of 135,000 frames, each covering the full hand, while interacting with 26 different objects. This set of interactions with different objects reveals the key correspondences between different regions of a human hand while it is manipulating objects. Insights from the tactile signatures of the human grasp, viewed through the lens of an artificial analogue of the natural mechanoreceptor network, can thus aid the future design of prosthetics [7], robot grasping tools and human-robot interactions [1,8-10].
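
As an illustration of the learning side of this setup, the Python sketch below shows a small convolutional classifier over single tactile pressure frames. It is not the authors' network; the 32x32 input grid, layer sizes and 26-way output are assumptions chosen only to mirror the 26-object task.

# Minimal sketch (not the authors' model): a small CNN that maps one tactile
# pressure frame to object-class scores. Grid shape and layer sizes are
# illustrative assumptions; the glove itself has 548 sensors.
import torch
import torch.nn as nn

class TactileCNN(nn.Module):
    def __init__(self, num_objects=26):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),   # 32x32 -> 16x16
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),   # 16x16 -> 8x8
        )
        self.classifier = nn.Linear(32 * 8 * 8, num_objects)

    def forward(self, pressure_frame):
        # pressure_frame: (batch, 1, 32, 32) normalized piezoresistive readings
        x = self.features(pressure_frame)
        return self.classifier(x.flatten(1))

# Toy usage with random pressure maps standing in for recorded tactile frames.
model = TactileCNN()
frames = torch.rand(8, 1, 32, 32)
logits = model(frames)   # (8, 26) scores over object classes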


Subject(s)
Clothing; Data Analysis; Hand Strength/physiology; Hand/physiology; Neural Networks, Computer; Touch/physiology; Datasets as Topic; Humans; Man-Machine Systems
4.
IEEE Trans Vis Comput Graph; 24(6): 2037-2050, 2018 Jun.
Article in English | MEDLINE | ID: mdl-28504938

ABSTRACT

We propose a system to infer binocular disparity from a monocular video stream in real time. Unlike classic reconstruction of physical depth in computer vision, we compute perceptually plausible disparity that is numerically inaccurate but produces a very similar overall depth impression, with plausible overall layout, sharp edges, fine details and agreement between luminance and disparity. We use several simple monocular cues to estimate disparity maps and confidence maps of low spatial and temporal resolution in real time. These are complemented by spatially varying, appearance-dependent and class-specific disparity prior maps learned from example stereo images; scene classification selects this prior at runtime. Prior and cues are fused by robust MAP inference on a dense spatio-temporal conditional random field with high spatial and temporal resolution. Modelling the distributions as normal (Gaussian) allows this inference to run as constant-time, parallel per-pixel work. We compare our approach to previous 2D-to-3D conversion systems using several metrics and a user study, and validate our notion of perceptually plausible disparity.
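
The constant-time, per-pixel fusion enabled by the normal-distribution assumption can be illustrated with a simplified Python sketch. It ignores the spatio-temporal coupling of the full conditional random field and simply fuses a cue-based estimate with a prior map as independent per-pixel Gaussians; all array shapes and confidence values below are illustrative.

# Minimal sketch (a simplification, not the paper's full CRF): if each disparity
# cue and the learned prior at a pixel are modelled as independent Gaussians,
# their MAP fusion is a precision-weighted average, which is constant-time,
# parallel per-pixel work.
import numpy as np

def fuse_gaussian_estimates(means, variances):
    """Fuse per-pixel Gaussian disparity estimates.

    means, variances : arrays of shape (num_estimates, H, W)
    Returns the fused mean and variance per pixel.
    """
    precisions = 1.0 / variances
    fused_var = 1.0 / precisions.sum(axis=0)
    fused_mean = fused_var * (precisions * means).sum(axis=0)
    return fused_mean, fused_var

# Toy usage: a coarse cue-based disparity map plus a class-specific prior map,
# each with its own per-pixel confidence (low variance = high confidence).
H, W = 120, 160
cue_mean = np.random.rand(H, W)      # e.g. from simple monocular cues
cue_var = np.full((H, W), 0.05)
prior_mean = np.random.rand(H, W)    # e.g. scene-class disparity prior
prior_var = np.full((H, W), 0.20)
disparity, uncertainty = fuse_gaussian_estimates(
    np.stack([cue_mean, prior_mean]), np.stack([cue_var, prior_var]))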

5.
IEEE Trans Vis Comput Graph; 23(4): 1322-1331, 2017 Apr.
Article in English | MEDLINE | ID: mdl-28129167

ABSTRACT

Accommodative depth cues, a wide field of view, and ever-higher resolutions all present major hardware design challenges for near-eye displays. Optimizing a design to overcome one of these challenges typically leads to a trade-off in the others. We tackle this problem by introducing an all-in-one solution: a new wide-field-of-view, gaze-tracked near-eye display for augmented-reality applications. The key component of our solution is a single see-through, varifocal deformable membrane mirror per eye, each reflecting a display. The membranes are controlled by airtight cavities and change their effective focal power to present a virtual image at a target depth plane determined by the gaze tracker. The benefits of using the membranes include a wide field of view (100° diagonal) and fast depth switching (from 20 cm to infinity within 300 ms). Our subjective experiment verifies the prototype and demonstrates its potential benefits for near-eye see-through displays.
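
To make the varifocal control idea concrete, the Python sketch below converts a gaze-tracked fixation depth into the optical power a varifocal element must supply, using simple thin-element vergence arithmetic. This is our illustration rather than the authors' control code; the 5 cm display distance is an assumed value, while the 20 cm-to-infinity range comes from the abstract.

# Minimal sketch (illustrative): map the fixated depth to the focal power a
# varifocal element needs so the display appears at that depth.
import math

def required_power_diopters(target_depth_m, display_distance_m=0.05):
    """Thin-element vergence calculation.

    Light from the display reaches the element with vergence -1/display_distance.
    It must leave with vergence -1/target_depth so the virtual image appears at
    the fixated depth. The element supplies the difference, in diopters.
    """
    target_vergence = 0.0 if math.isinf(target_depth_m) else -1.0 / target_depth_m
    display_vergence = -1.0 / display_distance_m
    return target_vergence - display_vergence

# Toy usage: sweep fixation depth from 20 cm out to optical infinity.
for depth in (0.2, 0.5, 1.0, 2.0, math.inf):
    print(f"fixation at {depth} m -> {required_power_diopters(depth):.1f} D")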
