ABSTRACT
Physical objects are usually not designed with interaction capabilities for controlling digital content. Nevertheless, they are an untapped source of interactions, since every object could be used to control our digital lives. We call this the missing interface problem: instead of embedding computational capacity into objects, we can simply detect users' gestures on them. However, gesture detection on such unmodified objects has to date been limited in spatial resolution and detection fidelity. To address this gap, we investigated micro-gesture detection on physical objects using Google's Soli radar sensor. We introduce two novel deep learning architectures for processing range-Doppler images: a three-dimensional convolutional neural network (Conv3D) and a spectrogram-based ConvNet. The results show that our architectures enable robust on-object gesture detection, achieving an accuracy of approximately 94% on a five-gesture set and surpassing previous state-of-the-art results by up to 39%. We also show that the decibel (dB) Doppler range setting has a significant effect on system performance, as accuracy can vary by up to 20% across the dB range. Accordingly, we provide guidelines on how to best calibrate the radar sensor.
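A minimal PyTorch sketch of the kind of 3D-CNN described above, operating on short clips of range-Doppler images; the layer sizes, the 16x32x32 input, and the five-way output are illustrative assumptions rather than the paper's exact architecture:

```python
# Minimal sketch of a 3D-CNN for on-object gesture classification from
# range-Doppler clips (PyTorch). Layer sizes, input resolution, and the
# five-gesture output are illustrative assumptions, not the paper's
# exact architecture.
import torch
import torch.nn as nn

class RangeDopplerConv3D(nn.Module):
    def __init__(self, num_gestures=5):
        super().__init__()
        self.features = nn.Sequential(
            # Input: (batch, 1 channel, T frames, range bins, Doppler bins)
            nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
            nn.Linear(32, num_gestures),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = RangeDopplerConv3D()
# A batch of 8 clips, each 16 frames of 32x32 range-Doppler images.
logits = model(torch.randn(8, 1, 16, 32, 32))  # -> shape (8, 5)
```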
Subject(s)
Gestures, Radar, Algorithms, Neural Networks (Computer), Recognition (Psychology)
ABSTRACT
The vergence-accommodation conflict (VAC) presents a major perceptual challenge for head-mounted displays with a fixed image plane. Varifocal and layered display designs can mitigate the VAC. However, the image quality of varifocal displays is affected by imprecise eye tracking, whereas layered displays suffer from reduced image contrast as the distance between layers increases. Combined designs support a larger workspace and tolerate some eye-tracking error. However, any layered design with a fixed layer spacing restricts the amount of error compensation and limits the in-focus contrast. We extend previous hybrid designs by introducing confidence-driven volume control, which adjusts the size of the view volume at runtime. We use the eye tracker's confidence to control the spacing of the display layers and optimize the trade-off between the display's view volume and the amount of eye-tracking error the display can compensate for. When the focus-point estimate is of high quality, our approach provides high in-focus contrast, whereas low-quality eye tracking enlarges the view volume to tolerate the error. We describe our design, present its implementation as an optical see-through head-mounted display using a multiplicative layer combination, and present an evaluation comparing our design with previous approaches.
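A minimal sketch of the confidence-driven control described above, assuming a two-layer display and a simple linear mapping from tracker confidence to layer spacing; the diopter bounds and function names are hypothetical, not taken from the paper:

```python
# Illustrative sketch of confidence-driven volume control: high tracker
# confidence narrows the spacing of two display layers (higher in-focus
# contrast), low confidence widens it so the view volume absorbs the
# tracking error. The diopter bounds and linear mapping are assumptions.

MIN_SPACING_D = 0.4  # diopters: tight spacing for confident gaze estimates
MAX_SPACING_D = 1.6  # diopters: wide spacing to tolerate tracking error

def layer_spacing(confidence: float) -> float:
    """Map eye-tracker confidence in [0, 1] to display-layer spacing."""
    c = min(max(confidence, 0.0), 1.0)
    return MAX_SPACING_D - c * (MAX_SPACING_D - MIN_SPACING_D)

def layer_planes(focus_diopters: float, confidence: float) -> tuple:
    """Place two layers symmetrically around the estimated focus depth."""
    half = 0.5 * layer_spacing(confidence)
    return (focus_diopters - half, focus_diopters + half)
```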
ABSTRACT
Systems with occlusion capabilities, such as those used in vision augmentation, image processing, and optical see-through head-mounted displays (OST-HMDs), have gained popularity. Achieving precise (hard-edge) occlusion in these systems is challenging, often requiring complex optical designs and bulky volumes. On the other hand, a single transparent liquid crystal display (LCD) is a simple means of creating occlusion masks. However, the generated mask appears defocused (soft-edge), resulting in insufficient blocking or occlusion leakage. In our work, we investigate the perception of soft-edge occlusion by the human visual system and present a preference-based optimal expansion method that minimizes perceived occlusion leakage. In a user study involving 20 participants, we made the noteworthy observation that the human eye perceives the occlusion mask's edge blur as sharper when seeing through the mask and gazing at a far distance than a camera system observes. Moreover, our study revealed significant individual differences in how human vision perceives soft-edge masks when focusing. These differences may lead to varying demands for mask size among individuals. Our evaluation demonstrates that our method successfully accounts for individual differences and achieves optimal masking effects at arbitrary distances and pupil sizes.
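A hedged sketch of how such a mask expansion could be computed, assuming a thin-lens defocus model and a per-user preference factor obtained from calibration; the formula and names are illustrative, not the paper's exact method:

```python
# Hedged sketch of an occlusion-mask expansion: the mask on the transparent
# LCD is dilated by roughly half the angular defocus blur of its edge,
# scaled by a per-user preference factor from a calibration step. The
# thin-lens blur model and the user factor are simplifying assumptions.
import math

def edge_blur_rad(pupil_mm: float, mask_m: float, focus_m: float) -> float:
    """Angular defocus blur (radians) of a mask at mask_m meters while the
    eye focuses at focus_m meters, for a pupil of pupil_mm millimeters."""
    aperture_m = pupil_mm / 1000.0
    return aperture_m * abs(1.0 / mask_m - 1.0 / focus_m)

def mask_expansion_px(pupil_mm: float, mask_m: float, focus_m: float,
                      px_per_rad: float, user_factor: float = 1.0) -> int:
    """LCD pixels to dilate the mask; px_per_rad is the LCD's angular pixel
    density as seen from the eye, user_factor a per-user preference."""
    blur = edge_blur_rad(pupil_mm, mask_m, focus_m)
    return math.ceil(0.5 * blur * px_per_rad * user_factor)
```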
ABSTRACT
This paper presents guitARhero, an Augmented Reality application for interactively teaching guitar playing to beginners through responsive visualizations overlaid on the guitar neck. We support two types of visual guidance, highlighting of the frets to be pressed and a 3D hand overlay, as well as two display scenarios, a desktop magic mirror and a video see-through head-mounted display. We conducted a user study with 20 participants to evaluate how well users could follow instructions presented with different guidance and display combinations, comparing these to a baseline where users followed video instructions. Our study highlights the trade-off between the amount of information provided and the visual clarity affecting the user's ability to interpret and follow instructions for fine-grained tasks. We show that the perceived usefulness of integrating instructions into an HMD view depends highly on the hardware capabilities and the level of instruction detail.
ABSTRACT
Accurate camera localization is an essential part of tracking systems. However, localization results are strongly affected by illumination. Including data collected under various lighting conditions can improve the robustness of the localization algorithm to lighting variation, but collecting such data is time-consuming. Synthetic images are easy to accumulate, and their illumination can be controlled. However, synthetic images do not perfectly match real images of the same scene; that is, there is a gap between real and synthetic images that also affects the accuracy of camera localization. To reduce the impact of this gap, we introduce the "real-to-synthetic feature transform" (REST), a fully connected neural network that converts real features into their synthetic counterparts. The converted features can then be matched against the accumulated database for robust camera localization. Our experimental results show that REST improves matching accuracy by approximately 28% compared with a naïve method.
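A minimal PyTorch sketch of a REST-style transform, assuming 128-D feature descriptors, an illustrative layer layout, and an L2 objective; none of these choices are taken from the paper itself:

```python
# Minimal PyTorch sketch of a REST-style transform: a fully connected
# network pushing real-image descriptors toward their synthetic
# counterparts. The 128-D descriptors, layer widths, and L2 loss are
# illustrative assumptions, not the paper's configuration.
import torch
import torch.nn as nn

rest = nn.Sequential(
    nn.Linear(128, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 128),
)
opt = torch.optim.Adam(rest.parameters(), lr=1e-3)

def train_step(real_feat: torch.Tensor, synth_feat: torch.Tensor) -> float:
    """One step on a batch of matched real/synthetic descriptor pairs."""
    loss = nn.functional.mse_loss(rest(real_feat), synth_feat)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# At query time, rest(real_descriptor) is matched against the synthetic
# feature database instead of the raw real descriptor.
```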
Subject(s)
Lighting, Automated Pattern Recognition, Algorithms, Factual Databases, Lighting/methods, Automated Pattern Recognition/methods, Photic Stimulation
ABSTRACT
In optical see-through augmented reality (AR), information is often distributed between real and virtual contexts and often appears at different distances from the user. To integrate this information, users must repeatedly switch context and change focal distance. If the task is conducted under time pressure, users may attempt to integrate information while the eye is still changing focal distance, a phenomenon we term transient focal blur. Gabbard, Mehra, and Swan (2018) previously examined these issues using a text-based visual search task on a one-eye optical see-through AR display. This paper reports an experiment that partially replicates and extends this task on a custom-built AR Haploscope. The experiment examined the effects of context switching, focal switching distance, binocular versus monocular viewing, and transient focal blur on task performance and eye fatigue. Context switching increased eye fatigue but did not decrease performance. Increasing the focal switching distance increased eye fatigue and decreased performance. Monocular viewing also increased eye fatigue and decreased performance. The transient focal blur effect caused additional performance decrements and adds to knowledge about AR user interface design.
Subject(s)
Asthenopia, Augmented Reality, Computer Graphics, Humans, Task Performance and Analysis
ABSTRACT
Triangle meshes are used in many important shape-related applications, including geometric modeling, animation production, system simulation, and visualization. However, these meshes are typically generated in raw form with several defects and poor-quality elements, hindering their practical use. Over the past decades, different surface remeshing techniques have been presented to improve these poor-quality meshes prior to downstream use. A typical surface remeshing algorithm converts an input mesh into a higher-quality mesh that satisfies given quality requirements while remaining an acceptable approximation of the input. In recent years, surface remeshing has gained significant attention from researchers and engineers, and several remeshing algorithms have been proposed. However, there has been no general survey of remeshing methods with a defined search strategy and article selection mechanism that covers recent approaches in the surface remeshing domain and connects them to classical approaches. In this article, we present a survey of surface remeshing techniques, classifying all collected articles into categories and analyzing specific methods with their advantages, disadvantages, and possible future improvements. Following the systematic literature review methodology, we define step-by-step guidelines for the review process, including the search strategy, literature inclusion/exclusion criteria, article quality assessment, and data extraction. For literature collection and classification based on data extraction, we summarize the collected articles in terms of their key remeshing objectives, the way mesh quality is defined and improved, and the way their techniques are compared with previous methods. Remeshing objectives are described by angle range control, feature preservation, error control, valence optimization, and remeshing compatibility. The metrics used in the literature for evaluating surface remeshing algorithms are discussed. Remeshing techniques are compared with related methods in a comprehensive table indexed by method name, the remeshing challenge addressed, the category the method belongs to, and the year of publication. We expect this survey to serve as a practical reference for surface remeshing in terms of literature classification, method analysis, and future prospects.
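As one concrete example of the quality metrics such evaluations use, the following sketch computes a standard per-triangle quality measure (the inradius-to-circumradius ratio, normalized to 1 for an equilateral triangle); it illustrates the genre of metric rather than any specific surveyed method:

```python
# A common per-triangle quality metric: 2r/R, the ratio of inradius to
# circumradius, scaled so an equilateral triangle scores 1.0. Shown as an
# illustration, not as a metric from any specific surveyed paper.
import math

def triangle_quality(a, b, c) -> float:
    """Quality 2r/R in (0, 1] for 3D vertices a, b, c (degenerate -> 0)."""
    la, lb, lc = math.dist(b, c), math.dist(c, a), math.dist(a, b)
    s = 0.5 * (la + lb + lc)                                # semi-perimeter
    area_sq = max(s * (s - la) * (s - lb) * (s - lc), 0.0)  # Heron's formula
    area = math.sqrt(area_sq)
    if area == 0.0:
        return 0.0
    r = area / s                        # inradius
    R = la * lb * lc / (4.0 * area)     # circumradius
    return 2.0 * r / R

print(triangle_quality((0, 0, 0), (1, 0, 0), (0.5, math.sqrt(3) / 2, 0)))  # 1.0
```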
Subject(s)
Algorithms, Computer Graphics, Computer Simulation
ABSTRACT
We present a novel method that robustly estimates reflectance, even in an environment with dynamically changing light. To control the appearance of an object using a projector-camera system, an accurate estimate of the object's reflectance is vital to creating an appropriate projection image. Most conventional estimation methods assume static light conditions; in practice, however, the appearance is affected by both the reflectance and the environmental light. In an environment with dynamically changing light, conventional reflectance estimation methods require recalibration every time the conditions change. In contrast, our method requires no additional calibration because it simultaneously estimates both the reflectance and the environmental light. Our method creates two different light conditions by switching the projection at a rate faster than the human eye can perceive and captures images of the target object separately under each condition. The reflectance and environmental light are then simultaneously estimated from the pair of images acquired under these two conditions. We implemented a projector-camera system that switches the projection on and off at 120 Hz. Experiments confirm the robustness of our method under changing environmental light. Further, our method can robustly estimate the reflectance under practical indoor lighting conditions.
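The two-condition idea admits a direct per-pixel solution. Assuming linear camera and projector responses, the captures are C_on = R * (P + E) and C_off = R * E, so R = (C_on - C_off) / P and E = C_off / R; the sketch below implements this under that simplifying assumption:

```python
# Sketch of the per-pixel solution, assuming linear camera and projector
# responses: with the projector on, C_on = R * (P + E); with it off,
# C_off = R * E. Solving the pair gives R and E directly.
import numpy as np

def estimate_reflectance(c_on: np.ndarray, c_off: np.ndarray,
                         p: np.ndarray, eps: float = 1e-6):
    """Per-pixel reflectance R and environmental light E from one on/off
    image pair; p is the known projected intensity."""
    r = (c_on - c_off) / np.maximum(p, eps)   # R = (C_on - C_off) / P
    e = c_off / np.maximum(r, eps)            # E = C_off / R
    return r, e
```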
ABSTRACT
Recent technical advancements support the application of Optical See-Through Head-Mounted Displays (OST-HMDs) in critical situations like navigation and manufacturing. However, while the form factor of an OST-HMD occupies less of the user's visual field than in the past, it can still cause critical oversights, e.g., missing a pedestrian while driving a car. In this paper, we design and compare two methods to compensate for the loss of awareness due to the occlusion caused by OST-HMDs. Instead of presenting the occluded content to the user, we detect motion that is not visible to the user and highlight its direction either on the edge of the HMD screen or by activating LEDs placed in the user's peripheral vision. The methods involve an offline stage, where the occluded visual field and the location of each indicator and its associated occluded region of interest (OROI) are determined, and an online stage, where an enhanced optical flow algorithm tracks the motion in the occluded visual field. We have implemented both methods on a Microsoft HoloLens and an ODG R-9. Our prototype systems achieved success rates of 100% in an objective evaluation and 98.90% in a pilot user study. Our methods compensate for the loss of safety-critical information in the occluded visual field for state-of-the-art OST-HMDs and can be extended to their future generations.
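A hedged sketch of the online stage, using OpenCV's Farnebäck dense optical flow restricted to a precomputed occluded-region mask; the mask handling, threshold, and direction summary are illustrative assumptions, not the authors' exact pipeline:

```python
# Hedged sketch of the online stage: dense optical flow is computed and
# sampled inside a precomputed occluded-region mask, and the dominant
# motion direction selects which screen-edge indicator or LED to activate.
import cv2
import numpy as np

def occluded_motion_direction(prev_gray, gray, mask, min_mag=1.0):
    """Mean flow angle (radians) inside the occluded region, or None if
    there is no salient motion to indicate."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    fx = flow[..., 0][mask > 0]
    fy = flow[..., 1][mask > 0]
    if fx.size == 0 or np.hypot(fx, fy).mean() < min_mag:
        return None
    # The angle selects the indicator closest to the motion's direction.
    return float(np.arctan2(fy.mean(), fx.mean()))
```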
Subject(s)
Computer Graphics, Computer-Assisted Image Processing/methods, Virtual Reality, Visual Fields/physiology, Algorithms, Equipment Design, Head/physiology, Humans, User-Computer Interface
ABSTRACT
In recent years, optical see-through head-mounted displays (OST-HMDs) have moved from conceptual research to a market of mass-produced devices, with new models and applications released continuously. It remains challenging to deploy augmented reality (AR) applications that require consistent spatial visualization; examples include maintenance, training, and medical tasks, as the view of the attached scene camera is shifted from the user's view. A calibration step can compute the relationship between the HMD screen and the user's eye to align the digital content. However, this alignment remains valid only as long as the display does not move, an assumption that rarely holds for an extended period of time. As a consequence, continuous recalibration is necessary. Manual calibration methods are tedious and rarely support practical applications. Existing automated methods do not account for user-specific parameters and are error-prone. We propose combining a pre-calibrated display with a per-frame estimation of the user's cornea position to estimate the individual eye center and continuously recalibrate the system. With this, we also obtain the gaze direction, which allows for instantaneous, uncalibrated eye gaze tracking without the need for additional hardware and complex illumination. Contrary to existing methods, we use simple image processing and do not rely on iris tracking, which is typically noisy and can be ambiguous. Evaluation with simulated and real data shows that our approach achieves a more accurate and stable eye pose estimation, which results in an improved and practical calibration with a much-improved distribution of projection error.
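One standard way to turn per-frame cornea positions into an eye-center estimate, shown here as an illustrative stand-in for the paper's estimator: the cornea center moves on a sphere around the stationary eye rotation center, so a least-squares sphere fit over recent frames recovers that center, and the gaze ray runs from it through the current cornea center.

```python
# Illustrative stand-in for the eye-center estimation: tracked cornea
# centers lie on a sphere around the eye rotation center, which a
# least-squares sphere fit over recent frames recovers; the gaze ray then
# runs from that center through the current cornea center.
import numpy as np

def fit_sphere_center(points: np.ndarray) -> np.ndarray:
    """Least-squares sphere center for an (N, 3) array of cornea centers.
    Uses |p|^2 = 2 p.c + (r^2 - |c|^2), which is linear in the unknowns."""
    A = np.hstack([2.0 * points, np.ones((len(points), 1))])
    b = (points ** 2).sum(axis=1)
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    return sol[:3]  # eye rotation center

def gaze_direction(cornea: np.ndarray, eye_center: np.ndarray) -> np.ndarray:
    """Unit gaze ray from the eye center through the current cornea center."""
    g = cornea - eye_center
    return g / np.linalg.norm(g)
```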