Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 14 de 14
Filtrar
Mais filtros












Base de dados
Intervalo de ano de publicação
1.
IEEE Trans Pattern Anal Mach Intell ; 45(12): 14821-14837, 2023 Dec.
Artigo em Inglês | MEDLINE | ID: mdl-37713218

RESUMO

Neural Radiance Fields (NeRFs) have shown great potential for tasks like novel view synthesis of static 3D scenes. Since NeRFs are trained on a large number of input images, it is not trivial to change their content afterwards. Previous methods to modify NeRFs provide some control but they do not support direct shape deformation which is common for geometry representations like triangle meshes. In this paper, we present a NeRF geometry editing method that first extracts a triangle mesh representation of the geometry inside a NeRF. This mesh can be modified by any 3D modeling tool (we use ARAP mesh deformation). The mesh deformation is then extended into a volume deformation around the shape which establishes a mapping between ray queries to the deformed NeRF and the corresponding queries to the original NeRF. The basic shape editing mechanism is extended towards more powerful and more meaningful editing handles by generating box abstractions of the NeRF shapes which provide an intuitive interface to the user. By additionally assigning semantic labels, we can even identify and combine parts from different objects. We demonstrate the performance and quality of our method in a number of experiments on synthetic data as well as real captured scenes.

2.
IEEE Trans Pattern Anal Mach Intell ; 45(7): 8713-8728, 2023 Jul.
Artigo em Inglês | MEDLINE | ID: mdl-37015513

RESUMO

The recently proposed neural radiance fields (NeRF) use a continuous function formulated as a multi-layer perceptron (MLP) to model the appearance and geometry of a 3D scene. This enables realistic synthesis of novel views, even for scenes with view dependent appearance. Many follow-up works have since extended NeRFs in different ways. However, a fundamental restriction of the method remains that it requires a large number of images captured from densely placed viewpoints for high-quality synthesis and the quality of the results quickly degrades when the number of captured views is insufficient. To address this problem, we propose a novel NeRF-based framework capable of high-quality view synthesis using only a sparse set of RGB-D images, which can be easily captured using cameras and LiDAR sensors on current consumer devices. First, a geometric proxy of the scene is reconstructed from the captured RGB-D images. Renderings of the reconstructed scene along with precise camera parameters can then be used to pre-train a network. Finally, the network is fine-tuned with a small number of real captured images. We further introduce a patch discriminator to supervise the network under novel views during fine-tuning, as well as a 3D color prior to improve synthesis quality. We demonstrate that our method can generate arbitrary novel views of a 3D scene from as few as 6 RGB-D images. Extensive experiments show the improvements of our method compared with the existing NeRF-based methods, including approaches that also aim to reduce the number of input images.

3.
Sci Adv ; 8(47): eade0828, 2022 Nov 25.
Artigo em Inglês | MEDLINE | ID: mdl-36427303

RESUMO

To design advanced functional materials, different concepts are currently pursued, including machine learning and high-throughput calculations. Here, a different approach is presented, which uses the innate structure of the multidimensional property space. Clustering algorithms confirm the intricate structure of property space and relate the different property classes to different chemical bonding mechanisms. For the inorganic compounds studied here, four different property classes are identified and related to ionic, metallic, covalent, and recently identified metavalent bonding. These different bonding mechanisms can be quantified by two quantum chemical bonding descriptors, the number of electrons transferred and the number of electrons shared between adjacent atoms. Hence, we can link these bonding descriptors to the corresponding property portfolio, turning bonding descriptors into property predictors. The close relationship between material properties and quantum chemical bonding descriptors can be used for an inverse material design, identifying particularly promising materials based on a set of target functionalities.

4.
IEEE Trans Vis Comput Graph ; 28(6): 2430-2444, 2022 Jun.
Artigo em Inglês | MEDLINE | ID: mdl-33079671

RESUMO

Segmenting arbitrary 3D objects into constituent parts that are structurally meaningful is a fundamental problem encountered in a wide range of computer graphics applications. Existing methods for 3D shape segmentation suffer from complex geometry processing and heavy computation caused by using low-level features and fragmented segmentation results due to the lack of global consideration. We present an efficient method, called SEG-MAT, based on the medial axis transform (MAT) of the input shape. Specifically, with the rich geometrical and structural information encoded in the MAT, we are able to develop a simple and principled approach to effectively identify the various types of junctions between different parts of a 3D shape. Extensive evaluations and comparisons show that our method outperforms the state-of-the-art methods in terms of segmentation quality and is also one order of magnitude faster.

5.
IEEE Trans Vis Comput Graph ; 27(1): 83-97, 2021 Jan.
Artigo em Inglês | MEDLINE | ID: mdl-31449026

RESUMO

We present a learning-based approach to reconstructing high-resolution three-dimensional (3D) shapes with detailed geometry and high-fidelity textures. Albeit extensively studied, algorithms for 3D reconstruction from multi-view depth-and-color (RGB-D) scans are still prone to measurement noise and occlusions; limited scanning or capturing angles also often lead to incomplete reconstructions. Propelled by recent advances in 3D deep learning techniques, in this paper, we introduce a novel computation- and memory-efficient cascaded 3D convolutional network architecture, which learns to reconstruct implicit surface representations as well as the corresponding color information from noisy and imperfect RGB-D maps. The proposed 3D neural network performs reconstruction in a progressive and coarse-to-fine manner, achieving unprecedented output resolution and fidelity. Meanwhile, an algorithm for end-to-end training of the proposed cascaded structure is developed. We further introduce Human10, a newly created dataset containing both detailed and textured full-body reconstructions as well as corresponding raw RGB-D scans of 10 subjects. Qualitative and quantitative experimental results on both synthetic and real-world datasets demonstrate that the presented approach outperforms existing state-of-the-art work regarding visual quality and accuracy of reconstructed models.

6.
IEEE Trans Vis Comput Graph ; 27(3): 2085-2100, 2021 Mar.
Artigo em Inglês | MEDLINE | ID: mdl-31536004

RESUMO

Example-based mesh deformation methods are powerful tools for realistic shape editing. However, existing techniques typically combine all the example deformation modes, which can lead to overfitting, i.e., using an overly complicated model to explain the user-specified deformation. This leads to implausible or unstable deformation results, including unexpected global changes outside the region of interest. To address this fundamental limitation, we propose a sparse blending method that automatically selects a smaller number of deformation modes to compactly describe the desired deformation. This along with a suitably chosen deformation basis including spatially localized deformation modes leads to significant advantages, including more meaningful, reliable, and efficient deformations because fewer and localized deformation modes are applied. To cope with large rotations, we develop a simple but effective representation based on polar decomposition of deformation gradients, which resolves the ambiguity of large global rotations using an as-consistent-as-possible global optimization. This simple representation has a closed form solution for derivatives, making it efficient for our sparse localized representation and thus ensuring interactive performance. Experimental results show that our method outperforms state-of-the-art data-driven mesh deformation methods, for both quality of results and efficiency.

7.
IEEE Trans Vis Comput Graph ; 27(6): 3007-3018, 2021 Jun.
Artigo em Inglês | MEDLINE | ID: mdl-32746265

RESUMO

In geometry processing, symmetry is a universal type of high-level structural information of 3D models and benefits many geometry processing tasks including shape segmentation, alignment, matching, and completion. Thus it is an important problem to analyze various symmetry forms of 3D shapes. Planar reflective symmetry is the most fundamental one. Traditional methods based on spatial sampling can be time-consuming and may not be able to identify all the symmetry planes. In this article, we present a novel learning framework to automatically discover global planar reflective symmetry of a 3D shape. Our framework trains an unsupervised 3D convolutional neural network to extract global model features and then outputs possible global symmetry parameters, where input shapes are represented using voxels. We introduce a dedicated symmetry distance loss along with a regularization loss to avoid generating duplicated symmetry planes. Our network can also identify generalized cylinders by predicting their rotation axes. We further provide a method to remove invalid and duplicated planes and axes. We demonstrate that our method is able to produce reliable and accurate results. Our neural network based method is hundreds of times faster than the state-of-the-art methods, which are based on sampling. Our method is also robust even with noisy or incomplete input surfaces.

8.
IEEE Comput Graph Appl ; 40(3): 19-31, 2020.
Artigo em Inglês | MEDLINE | ID: mdl-32078540

RESUMO

Digitalization of three-dimensional (3-D) objects and scenes using modern depth sensors and high-resolution RGB cameras enables the preservation of human cultural artifacts at an unprecedented level of detail. Interactive visualization of these large datasets, however, is challenging without degradation in visual fidelity. A common solution is to fit the dataset into available video memory by downsampling and compression. The achievable reproduction accuracy is thereby limited for interactive scenarios, such as immersive exploration in virtual reality (VR). This degradation in visual realism ultimately hinders the effective communication of human cultural knowledge. This article presents a method to render 3-D scan datasets with minimal loss of visual fidelity. A point-based rendering approach visualizes scan data as a dense splat cloud. For improved surface approximation of thin and sparsely sampled objects, we propose oriented 3-D ellipsoids as rendering primitives. To render massive texture datasets, we present a virtual texturing system that dynamically loads required image data. It is paired with a single-pass page prediction method that minimizes visible texturing artifacts. Our system renders a challenging dataset in the order of 70 million points and a texture size of 1.2 TB consistently at 90 frames per second in stereoscopic VR.

9.
IEEE Trans Vis Comput Graph ; 26(11): 3217-3230, 2020 Nov.
Artigo em Inglês | MEDLINE | ID: mdl-31150341

RESUMO

We present a novel approach to integrate data from multiple sensor types for dense 3D reconstruction of indoor scenes in realtime. Existing algorithms are mainly based on a single RGBD camera and thus require continuous scanning of areas with sufficient geometric features. Otherwise, tracking may fail due to unreliable frame registration. Inspired by the fact that the fusion of multiple sensors can combine their strengths towards a more robust and accurate self-localization, we incorporate multiple types of sensors which are prevalent in modern robot systems, including a 2D range sensor, an inertial measurement unit (IMU), and wheel encoders. We fuse their measurements to reinforce the tracking process and to eventually obtain better 3D reconstructions. Specifically, we develop a 2D truncated signed distance field (TSDF) volume representation for the integration and ray-casting of laser frames, leading to a unified cost function in the pose estimation stage. For validation of the estimated poses in the loop-closure optimization process, we train a classifier for the features extracted from heterogeneous sensors during the registration progress. To evaluate our method on challenging use case scenarios, we assembled a scanning platform prototype to acquire real-world scans. We further simulated synthetic scans based on high-fidelity synthetic scenes for quantitative evaluation. Extensive experimental evaluation on these two types of scans demonstrate that our system is capable of robustly acquiring dense 3D reconstructions and outperforms state-of-the-art RGBD and LiDAR systems.

10.
IEEE Trans Vis Comput Graph ; 24(4): 1623-1632, 2018 04.
Artigo em Inglês | MEDLINE | ID: mdl-29543179

RESUMO

In virtual environments, the space that can be explored by real walking is limited by the size of the tracked area. To enable unimpeded walking through large virtual spaces in small real-world surroundings, redirection techniques are used. These unnoticeably manipulate the user's virtual walking trajectory. It is important to know how strongly such techniques can be applied without the user noticing the manipulation-or getting cybersick. Previously, this was estimated by measuring a detection threshold (DT) in highly-controlled psychophysical studies, which experimentally isolate the effect but do not aim for perceived immersion in the context of VR applications. While these studies suggest that only relatively low degrees of manipulation are tolerable, we claim that, besides establishing detection thresholds, it is important to know when the user's immersion breaks. We hypothesize that the degree of unnoticed manipulation is significantly different from the detection threshold when the user is immersed in a task. We conducted three studies: a) to devise an experimental paradigm to measure the threshold of limited immersion (TLI), b) to measure the TLI for slowly decreasing and increasing rotation gains, and c) to establish a baseline of cybersickness for our experimental setup. For rotation gains greater than 1.0, we found that immersion breaks quite late after the gain is detectable. However, for gains lesser than 1.0, some users reported a break of immersion even before established detection thresholds were reached. Apparently, the developed metric measures an additional quality of user experience. This article contributes to the development of effective spatial compression methods by utilizing the break of immersion as a benchmark for redirection techniques.


Assuntos
Enjoo devido ao Movimento/prevenção & controle , Realidade Virtual , Caminhada/fisiologia , Adolescente , Adulto , Gráficos por Computador , Feminino , Humanos , Masculino , Rotação , Adulto Jovem
11.
IEEE Trans Pattern Anal Mach Intell ; 39(9): 1744-1756, 2017 09.
Artigo em Inglês | MEDLINE | ID: mdl-27662671

RESUMO

Accurately determining the position and orientation from which an image was taken, i.e., computing the camera pose, is a fundamental step in many Computer Vision applications. The pose can be recovered from 2D-3D matches between 2D image positions and points in a 3D model of the scene. Recent advances in Structure-from-Motion allow us to reconstruct large scenes and thus create the need for image-based localization methods that efficiently handle large-scale 3D models while still being effective, i.e., while localizing as many images as possible. This paper presents an approach for large scale image-based localization that is both efficient and effective. At the core of our approach is a novel prioritized matching step that enables us to first consider features more likely to yield 2D-to-3D matches and to terminate the correspondence search as soon as enough matches have been found. Matches initially lost due to quantization are efficiently recovered by integrating 3D-to-2D search. We show how visibility information from the reconstruction process can be used to improve the efficiency of our approach. We evaluate the performance of our method through extensive experiments and demonstrate that it offers the best combination of efficiency and effectiveness among current state-of-the-art approaches for localization.

12.
Comput Intell Neurosci ; 2016: 6730249, 2016.
Artigo em Inglês | MEDLINE | ID: mdl-26819588

RESUMO

We present a nonparametric facial feature localization method using relative directional information between regularly sampled image segments and facial feature points. Instead of using any iterative parameter optimization technique or search algorithm, our method finds the location of facial feature points by using a weighted concentration of the directional vectors originating from the image segments pointing to the expected facial feature positions. Each directional vector is calculated by linear combination of eigendirectional vectors which are obtained by a principal component analysis of training facial segments in feature space of histogram of oriented gradient (HOG). Our method finds facial feature points very fast and accurately, since it utilizes statistical reasoning from all the training data without need to extract local patterns at the estimated positions of facial features, any iterative parameter optimization algorithm, and any search algorithm. In addition, we can reduce the storage size for the trained model by controlling the energy preserving level of HOG pattern space.


Assuntos
Algoritmos , Face , Interpretação de Imagem Assistida por Computador , Reconhecimento Automatizado de Padrão , Reconhecimento Visual de Modelos/fisiologia , Humanos , Orientação/fisiologia , Análise de Componente Principal , Estatísticas não Paramétricas
13.
IEEE Trans Vis Comput Graph ; 21(12): 1390-402, 2015 Dec.
Artigo em Inglês | MEDLINE | ID: mdl-26529460

RESUMO

With broader availability of large-scale 3D model repositories, the need for efficient and effective exploration becomes more and more urgent. Existing model retrieval techniques do not scale well with the size of the database since often a large number of very similar objects are returned for a query, and the possibilities to refine the search are quite limited. We propose an interactive approach where the user feeds an active learning procedure by labeling either entire models or parts of them as "like" or "dislike" such that the system can automatically update an active set of recommended models. To provide an intuitive user interface, candidate models are presented based on their estimated relevance for the current query. From the methodological point of view, our main contribution is to exploit not only the similarity between a query and the database models but also the similarities among the database models themselves. We achieve this by an offline pre-processing stage, where global and local shape descriptors are computed for each model and a sparse distance metric is derived that can be evaluated efficiently even for very large databases. We demonstrate the effectiveness of our method by interactively exploring a repository containing over 100 K models.

14.
IEEE Trans Vis Comput Graph ; 20(10): 1461-73, 2014 Oct.
Artigo em Inglês | MEDLINE | ID: mdl-26357392

RESUMO

An ever broader availability of freeform designs together with an increasing demand for product customization has lead to a rising interest in efficient physical realization of such designs, the trend toward personal fabrication. Not only large-scale architectural applications are (becoming increasingly) popular but also different consumer-level rapid-prototyping applications, including toy and 3D puzzle creation. In this work we present a method for do-it-yourself reproduction of freeform designs without the typical limitation of state-of-the-art approaches requiring manufacturing custom parts using semi-professional laser cutters or 3D printers. Our idea is based on a popular mathematical modeling system (Zometool) commonly used for modeling higher dimensional polyhedra and symmetric structures such as molecules and crystal lattices. The proposed method extends the scope of Zometool modeling to freeform, disk-topology surfaces. While being an efficient construction system on the one hand (consisting only of a single node type and nine different edge types), this inherent discreteness of the Zometool system, on the other hand gives rise to a hard approximation problem. We base our method on a marching front approach, where elements are not added in a greedy sense, but rather whole regions on the front are filled optimally, using a set of problem specific heuristics to keep complexity under control.

SELEÇÃO DE REFERÊNCIAS
DETALHE DA PESQUISA
...