Results 1 - 20 of 31
1.
Article in English | MEDLINE | ID: mdl-38345957

ABSTRACT

In this paper, we introduce Neural-ABC, a novel parametric model based on neural implicit functions that can represent clothed human bodies with disentangled latent spaces for identity, clothing, shape, and pose. Traditional mesh-based representations struggle to represent articulated bodies with clothes due to the diversity of human body shapes and clothing styles, as well as the complexity of poses. The model provides a unified framework for parametric modeling that can represent the identity, clothing, shape and pose of the clothed human body. Our approach utilizes the power of neural implicit functions as the underlying representation and integrates well-designed structures to meet the necessary requirements. Specifically, we represent the underlying body as a signed distance function and clothing as an unsigned distance function, and both can be uniformly represented as unsigned distance fields. Different types of clothing do not require predefined topological structures or classifications, and can follow changes in the underlying body to stay fitted to it. Additionally, we construct poses using a controllable articulated structure. The model is trained on both open and newly constructed datasets, and our decoupling strategy is carefully designed to ensure optimal performance. Our model excels at disentangling clothing and identity across different shapes and poses while preserving the style of the clothing. We demonstrate that Neural-ABC fits new observations of different types of clothing. Compared to other state-of-the-art parametric models, Neural-ABC shows clear advantages in the reconstruction of clothed human bodies, as evidenced by fitting raw scans, depth maps and images. We show that the attributes of the fitted results can be further edited by adjusting their identity, clothing, shape and pose codes. The dataset and trained parametric model will be available at https://ustc3dv.github.io/NeuralABC/.
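
As a rough illustration of how a signed body field and an unsigned clothing field can live in one unsigned representation, the sketch below uses analytic stand-ins (a spherical body, a thin shell garment); the function names and shapes are hypothetical and this is not the authors' learned model.

```python
import numpy as np

def body_sdf(p, radius=1.0):
    """Signed distance to a spherical stand-in for the body (negative inside)."""
    return np.linalg.norm(p, axis=-1) - radius

def clothing_udf(p, inner=1.02, outer=1.10):
    """Unsigned distance to a thin spherical shell standing in for a garment."""
    r = np.linalg.norm(p, axis=-1)
    return np.maximum.reduce([inner - r, r - outer, np.zeros_like(r)])

def unified_udf(p):
    """Both surfaces expressed as unsigned distance fields: |SDF| for the body,
    the UDF as-is for the clothing. A surface point is where either field is ~0."""
    return np.abs(body_sdf(p)), clothing_udf(p)

query = np.array([[0.0, 0.0, 1.05], [0.0, 0.0, 0.5]])
print(unified_udf(query))
```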

2.
IEEE Trans Image Process ; 33: 625-638, 2024.
Article in English | MEDLINE | ID: mdl-38198242

ABSTRACT

Modeling the effect of reflection is crucial for the single image reflection removal (SIRR) task. Modern SIRR methods usually simplify the reflection formulation with the assumption of a linear combination of a transmission layer and a reflection layer. However, the large variations in image content and real-world picture-taking conditions often result in far more complex reflections. In this paper, we introduce a new screen-blur combination based on two important factors, namely the intensity and the blurriness of reflection, to better characterize the reflection formulation in SIRR. Specifically, we present Screen-blur Reflection Networks (SRNet), which executes the screen-blur formulation in its network design and adapts to the complex reflections in real scenes. Technically, SRNet consists of three components: a blended image generator, a reflection estimator and a reflection removal module. The image generator exploits the screen-blur combination to synthesize the training blended images. The reflection estimator learns the reflection layer and a blur degree that measures the level of blurriness of the reflection. The reflection removal module further uses the blended image, blur degree and reflection layer to filter out the transmission layer in a cascaded manner. Superior results are reported for three different SIRR methods when their training data are generated according to the screen-blur combination. Moreover, extensive experiments on six datasets quantitatively and qualitatively demonstrate the efficacy of SRNet over the state-of-the-art methods.
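
The exact screen-blur formulation is not given in the abstract; the sketch below shows one plausible reading: blur the reflection by a blur degree, scale it by an intensity, and composite with the standard screen blend. The intensity and blur_sigma parameters are assumptions, not the paper's values.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def screen_blur_blend(transmission, reflection, intensity=0.6, blur_sigma=3.0):
    """Illustrative screen-blur composition: blur the reflection by a blur degree,
    scale it by an intensity factor, then apply the classic screen blend
    B = 1 - (1 - T) * (1 - R'). Images are float arrays in [0, 1]."""
    blurred = gaussian_filter(reflection, sigma=(blur_sigma, blur_sigma, 0))
    r = np.clip(intensity * blurred, 0.0, 1.0)
    return 1.0 - (1.0 - transmission) * (1.0 - r)

T = np.random.rand(64, 64, 3)
R = np.random.rand(64, 64, 3)
B = screen_blur_blend(T, R)   # synthetic blended training image
```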

3.
Materials (Basel) ; 16(17)2023 Aug 30.
Article in English | MEDLINE | ID: mdl-37687635

ABSTRACT

The welding and construction of H-type thick-plate bridge steel involves complex multi-pass welding, which makes it difficult to ensure welding performance. Accordingly, it is crucial to explore the inherent correlations between the welding process parameters and welding quality, and to apply them to welding robots, eliminating the instability of manual welding. In order to improve welding quality, the GMAW (gas metal arc welding) process parameters are simulated using the Q345qD bridge steel flat-joint model. Four welds with X-shaped grooves are designed to optimize the welding current, welding voltage, and welding speed. The optimal welding process parameters are investigated through thermal-elastic-plastic simulation analysis and experimental verification. The results indicate that, when the welding current is set to 230 A, the welding voltage to 32 V, and the welding speed to 0.003 m/s, the maximum deformation of the welded plate is 0.52 mm, with a maximum welding residual stress of 345 MPa. Both the simulation results of multi-pass welding and the experimental tests meet the welding requirements, as they show no excessive stress or strain. These parameters can be applied to building large steel-frame bridges using welding robots, improving the quality of welded joints.
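
For a quick sanity check of the reported optimum (230 A, 32 V, 0.003 m/s), one can compute the standard arc heat-input Q = η·U·I/v; this formula and the assumed GMAW efficiency η ≈ 0.8 are textbook values, not taken from the paper.

```python
def heat_input_kj_per_mm(current_a, voltage_v, speed_m_per_s, efficiency=0.8):
    """Standard arc heat-input formula Q = eta * U * I / v, converted to kJ/mm.
    The thermal efficiency eta ~ 0.8 for GMAW is an assumed textbook value."""
    joules_per_m = efficiency * voltage_v * current_a / speed_m_per_s
    return joules_per_m / 1e6  # J/m -> kJ/mm

# Reported optimum from the abstract: 230 A, 32 V, 0.003 m/s
print(f"{heat_input_kj_per_mm(230, 32, 0.003):.2f} kJ/mm")  # ~1.96 kJ/mm
```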

4.
Article in English | MEDLINE | ID: mdl-37590116

ABSTRACT

Recently, many works have been proposed that utilize neural radiance fields for novel view synthesis of human performers. However, most of these methods require hours of training, making them impractical to use. To address this problem, we propose IntrinsicNGP, which can be trained from scratch and achieves high-fidelity results within a few minutes from videos of a human performer. To this end, we introduce a continuous and optimizable intrinsic coordinate, rather than the original explicit Euclidean coordinate, in the hash encoding module of instant-NGP. With this novel intrinsic coordinate, IntrinsicNGP can aggregate inter-frame information for dynamic objects with the help of proxy geometry shapes. Moreover, the results trained with the given rough geometry shapes can be further refined with an optimizable offset field based on the intrinsic coordinate. Extensive experimental results on several datasets demonstrate the effectiveness and efficiency of IntrinsicNGP. We also illustrate our approach's ability to edit the shape of reconstructed subjects.
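
A minimal sketch of the surface-anchored coordinate idea, assuming a crude nearest-vertex projection onto a proxy mesh; the (u, v, offset) encoding and all names here are hypothetical and do not reproduce the instant-NGP hash encoding.

```python
import numpy as np
from scipy.spatial import cKDTree

def intrinsic_coords(queries, proxy_vertices, proxy_normals, proxy_uv):
    """Map Euclidean query points to surface-anchored coordinates:
    (u, v) of the nearest proxy vertex plus a signed offset along its normal.
    A crude nearest-vertex stand-in for a proper closest-point projection."""
    tree = cKDTree(proxy_vertices)
    _, idx = tree.query(queries)
    delta = queries - proxy_vertices[idx]
    offset = np.einsum('ij,ij->i', delta, proxy_normals[idx])  # signed normal offset
    return np.concatenate([proxy_uv[idx], offset[:, None]], axis=1)

# Toy proxy: points on a unit sphere, normals = positions, uv = (theta, phi)
verts = np.random.randn(1000, 3); verts /= np.linalg.norm(verts, axis=1, keepdims=True)
uv = np.stack([np.arccos(verts[:, 2]), np.arctan2(verts[:, 1], verts[:, 0])], axis=1)
print(intrinsic_coords(np.array([[0.0, 0.0, 1.1]]), verts, verts, uv))
```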

5.
IEEE Trans Pattern Anal Mach Intell ; 45(8): 9681-9698, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37027610

ABSTRACT

Non-rigid 3D registration, which deforms a source 3D shape in a non-rigid way to align with a target 3D shape, is a classical problem in computer vision. Such problems can be challenging because of imperfect data (noise, outliers and partial overlap) and high degrees of freedom. Existing methods typically adopt an ℓp-type robust norm to measure the alignment error and regularize the smoothness of deformation, and use a proximal algorithm to solve the resulting non-smooth optimization problem. However, the slow convergence of such algorithms limits their wide application. In this paper, we propose a formulation for robust non-rigid registration based on a globally smooth robust norm for alignment and regularization, which can effectively handle outliers and partial overlaps. The problem is solved using the majorization-minimization algorithm, which reduces each iteration to a convex quadratic problem with a closed-form solution. We further apply Anderson acceleration to speed up the convergence of the solver, enabling the solver to run efficiently on devices with limited compute capability. Extensive experiments demonstrate the effectiveness of our method for non-rigid alignment between two shapes with outliers and partial overlaps, with quantitative evaluation showing that it outperforms state-of-the-art methods in terms of registration accuracy and computational speed. The source code is available at https://github.com/yaoyx689/AMM_NRR.
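
The solver-level ingredient named in the abstract, Anderson acceleration, can be sketched generically for any fixed-point map g(x); this is not the paper's registration solver, just the acceleration scheme it builds on.

```python
import numpy as np

def anderson_accelerate(g, x0, m=5, iters=50, tol=1e-10):
    """Generic Anderson acceleration of the fixed-point iteration x <- g(x).
    Keeps the last m residual/iterate differences and takes a least-squares
    combination of recent g-values instead of the plain Picard update."""
    x = x0.copy()
    X, G = [], []                                   # histories of iterates and g(x)
    for _ in range(iters):
        gx = g(x)
        X.append(x.copy()); G.append(gx.copy())
        if len(X) > m + 1:
            X.pop(0); G.pop(0)
        F = [gi - xi for gi, xi in zip(G, X)]       # residuals f_i = g(x_i) - x_i
        if len(F) == 1:
            x_new = gx
        else:
            dF = np.stack([F[i + 1] - F[i] for i in range(len(F) - 1)], axis=1)
            dG = np.stack([G[i + 1] - G[i] for i in range(len(G) - 1)], axis=1)
            gamma, *_ = np.linalg.lstsq(dF, F[-1], rcond=None)
            x_new = gx - dG @ gamma
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Example: accelerate the contraction g(x) = cos(x) toward its fixed point
print(anderson_accelerate(np.cos, np.array([1.0])))   # ~0.739085
```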


Subject(s)
Algorithms , Software
6.
IEEE Trans Pattern Anal Mach Intell ; 45(8): 9726-9742, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37022866

ABSTRACT

Point clouds are characterized by irregularity and unstructuredness, which pose challenges in efficient data exploitation and discriminative feature extraction. In this paper, we present an unsupervised deep neural architecture called Flattening-Net to represent irregular 3D point clouds of arbitrary geometry and topology as a completely regular 2D point geometry image (PGI) structure, in which coordinates of spatial points are captured in colors of image pixels. Intuitively, Flattening-Net implicitly approximates a locally smooth 3D-to-2D surface flattening process while effectively preserving neighborhood consistency. As a generic representation modality, PGI inherently encodes the intrinsic property of the underlying manifold structure and facilitates surface-style point feature aggregation. To demonstrate its potential, we construct a unified learning framework directly operating on PGIs to achieve diverse types of high-level and low-level downstream applications driven by specific task networks, including classification, segmentation, reconstruction, and upsampling. Extensive experiments demonstrate that our methods perform favorably against the current state-of-the-art competitors.
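
A small sketch of why the PGI layout is convenient: once points sit in a regular H x W grid whose channels store xyz, plain 2D convolutions aggregate surface-style features. The toy PGI below comes from a sphere parameterization, not from Flattening-Net, and the tiny feature network is a placeholder.

```python
import torch
import torch.nn as nn

# Toy PGI: a regular 32x32 grid whose three channels store xyz of sphere points.
H = W = 32
theta = torch.linspace(0.1, 3.04, H).view(H, 1).expand(H, W)
phi = torch.linspace(-3.14, 3.14, W).view(1, W).expand(H, W)
pgi = torch.stack([torch.sin(theta) * torch.cos(phi),
                   torch.sin(theta) * torch.sin(phi),
                   torch.cos(theta)], dim=0).unsqueeze(0)    # (1, 3, H, W)

# Because the PGI is a regular image, plain 2D convolutions aggregate
# neighborhood (surface-style) features without any graph machinery.
feature_net = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                            nn.Conv2d(16, 32, 3, padding=1))
features = feature_net(pgi)                                   # (1, 32, H, W)
points = pgi.squeeze(0).reshape(3, -1).t()                    # recover the point cloud
print(features.shape, points.shape)
```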


Subject(s)
Algorithms , Learning
7.
IEEE Trans Vis Comput Graph ; 29(4): 2203-2210, 2023 Apr.
Article in English | MEDLINE | ID: mdl-34752397

ABSTRACT

Caricature is an artistic style of depicting human faces that attracts considerable attention in the entertainment industry. So far, only a few 3D caricature generation methods exist, and all of them require some caricature information (e.g., a caricature sketch or 2D caricature) as input. This kind of input, however, is difficult for non-professional users to provide. In this paper, we propose an end-to-end deep neural network model that generates high-quality 3D caricatures directly from a normal 2D face photo. The most challenging issue for our system is that the source domain of face photos (characterized by normal 2D faces) is significantly different from the target domain of 3D caricatures (characterized by 3D exaggerated face shapes and textures). To address this challenge, we: (1) build a large dataset of 5,343 3D caricature meshes and use it to establish a PCA model in the 3D caricature shape space; (2) reconstruct a normal full 3D head from the input face photo and use its PCA representation in the 3D caricature shape space to establish correspondences between the input photo and 3D caricature shape; and (3) propose a novel character loss and a novel caricature loss based on previous psychological studies on caricatures. Experiments including a novel two-level user study show that our system can generate high-quality 3D caricatures directly from normal face photos.
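
Step (1), building a PCA model of the caricature shape space from co-registered meshes, can be sketched with plain SVD; the mesh array and component count below are placeholders, not the actual 5,343-mesh dataset.

```python
import numpy as np

def fit_pca_shape_space(meshes, n_components=50):
    """meshes: (num_meshes, num_vertices, 3) array of co-registered caricature
    meshes. Returns the mean shape and the leading PCA basis vectors."""
    X = meshes.reshape(len(meshes), -1)              # flatten each mesh to a row
    mean = X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]                   # basis rows span the shape space

def project(mesh, mean, basis):
    """PCA coefficients of one mesh in the caricature shape space."""
    return basis @ (mesh.reshape(-1) - mean)

meshes = np.random.rand(200, 1000, 3)                # placeholder for the real dataset
mean, basis = fit_pca_shape_space(meshes, n_components=50)
print(project(meshes[0], mean, basis).shape)         # (50,)
```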

8.
IEEE Trans Neural Netw Learn Syst ; 34(11): 8566-8578, 2023 Nov.
Article in English | MEDLINE | ID: mdl-35226610

ABSTRACT

Mesh is a type of data structure commonly used for 3-D shapes. Representation learning for 3-D meshes is essential in many computer vision and graphics applications. The recent success of convolutional neural networks (CNNs) for structured data (e.g., images) suggests the value of adapting insights from CNNs to 3-D shapes. However, 3-D shape data are irregular since each node's neighbors are unordered. Various graph neural networks for 3-D shapes have been developed with isotropic filters or predefined local coordinate systems to overcome the node inconsistency on graphs. However, isotropic filters and predefined local coordinate systems both limit the representation power. In this article, we propose a local structure-aware anisotropic convolutional operation (LSA-Conv) that learns adaptive weighting matrices for each template node according to its neighboring structure and applies shared anisotropic filters. In fact, the learnable weighting matrix is similar to the attention matrix in the Random Synthesizer, a recent Transformer model for natural language processing (NLP). Since the learnable weighting matrices require a large number of parameters for high-resolution 3-D shapes, we introduce a matrix factorization technique, denoted LSA-small, that notably reduces the parameter count. Furthermore, a residual connection with a linear transformation is introduced to improve the performance of our LSA-Conv. Comprehensive experiments demonstrate that our model produces significant improvement in 3-D shape reconstruction compared to state-of-the-art methods.
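
A hypothetical sketch of the core LSA-Conv idea as described: a learnable K x K weighting matrix per template node recombines a fixed neighbor list, followed by a shared filter. Tensor shapes, the identity initialization, and the class name are assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class LSAConvSketch(nn.Module):
    """Hypothetical sketch of a local structure-aware anisotropic convolution:
    every template node owns a learnable K x K matrix that recombines its K
    precomputed neighbors, followed by a shared (anisotropic) linear filter."""
    def __init__(self, num_nodes, k, in_ch, out_ch):
        super().__init__()
        self.mix = nn.Parameter(torch.eye(k).repeat(num_nodes, 1, 1))    # (N, K, K)
        self.shared = nn.Linear(k * in_ch, out_ch)

    def forward(self, x, neighbor_idx):
        # x: (B, N, C) per-node features; neighbor_idx: (N, K) fixed template neighbors
        B, N, C = x.shape
        gathered = x[:, neighbor_idx, :]                                 # (B, N, K, C)
        mixed = torch.einsum('nkl,bnlc->bnkc', self.mix, gathered)       # adaptive mixing
        return self.shared(mixed.reshape(B, N, -1))                      # (B, N, out_ch)

conv = LSAConvSketch(num_nodes=100, k=6, in_ch=16, out_ch=32)
feats = torch.randn(2, 100, 16)
nbrs = torch.randint(0, 100, (100, 6))
print(conv(feats, nbrs).shape)        # torch.Size([2, 100, 32])
```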

9.
IEEE Trans Vis Comput Graph ; 29(9): 3826-3839, 2023 Sep.
Article in English | MEDLINE | ID: mdl-35503829

ABSTRACT

The freeform architectural modeling process often involves two important stages: concept design and digital modeling. In the first stage, architects usually make a quick sketch of the overall 3D shape and the panel layout on physical or digital paper. In the second stage, a digital 3D model is created using the sketch as a reference. The digital model needs to incorporate geometric requirements for its components, such as the planarity of panels imposed by construction costs, which makes the modeling process more challenging. In this work, we present a novel sketch-based system to bridge the concept design and digital modeling of freeform roof-like shapes represented as planar quadrilateral (PQ) meshes. Our system allows the user to sketch the surface boundary and contour lines under axonometric projection and supports the sketching of occluded regions. In addition, the user can sketch feature lines to provide directional guidance to the PQ mesh layout. Given the 2D sketch input, we propose a deep neural network to infer in real-time the underlying surface shape along with a dense conjugate direction field, both of which are used to extract the final PQ mesh. To train and validate our network, we generate a large synthetic dataset that mimics architect sketching of freeform quadrilateral patches. The effectiveness and usability of our system are demonstrated with quantitative and qualitative evaluation as well as user studies.
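
The planarity requirement on quad panels can be checked with a standard measure, the distance between the two diagonals (zero exactly when the panel is planar); this is a common choice for PQ meshes, not necessarily the paper's exact energy.

```python
import numpy as np

def quad_planarity(v0, v1, v2, v3):
    """Distance between the diagonals (v0,v2) and (v1,v3) of a quad,
    normalized by the mean diagonal length. Zero means a perfectly planar panel."""
    d1, d2 = v2 - v0, v3 - v1
    n = np.cross(d1, d2)
    n_norm = np.linalg.norm(n)
    if n_norm < 1e-12:                      # degenerate/parallel diagonals
        return 0.0
    dist = abs(np.dot(v1 - v0, n)) / n_norm
    return dist / (0.5 * (np.linalg.norm(d1) + np.linalg.norm(d2)))

print(quad_planarity(np.array([0.0, 0, 0]), np.array([1.0, 0, 0]),
                     np.array([1.0, 1, 0]), np.array([0.0, 1, 0.1])))
```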

10.
IEEE Trans Vis Comput Graph ; 28(2): 1274-1287, 2022 02.
Article in English | MEDLINE | ID: mdl-32746288

ABSTRACT

Facial expression retargeting from humans to virtual characters is a useful technique in computer graphics and animation. Traditional methods use markers or blendshapes to construct a mapping between the human and avatar faces. However, these approaches require a tedious 3D modeling process, and the performance relies on the modelers' experience. In this article, we propose a new solution to this cross-domain expression transfer problem via nonlinear expression embedding and expression domain translation. We first build low-dimensional latent spaces for the human and avatar facial expressions with variational autoencoders. Then we construct correspondences between the two latent spaces guided by geometric and perceptual constraints. Specifically, we design geometric correspondences to reflect geometric matching and utilize a triplet data structure to express users' perceptual preference of avatar expressions. A user-friendly method is proposed to automatically generate triplets, allowing users to annotate the correspondences easily and efficiently. Using both geometric and perceptual correspondences, we train a network for expression domain translation from human to avatar. Extensive experimental results and user studies demonstrate that even nonprofessional users can apply our method to generate high-quality facial expression retargeting results with less time and effort.
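
One plausible way the triplet data structure can constrain the translation network is a margin-based triplet loss in the avatar latent space, sketched below; the loss form and tensor names are assumptions, not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def perceptual_triplet_loss(mapped_human_code, avatar_pos, avatar_neg, margin=0.2):
    """Triplet constraint on the latent translation: the mapped human expression
    code should be closer to the user-preferred avatar code (positive) than to
    the rejected one (negative) by at least `margin`. A hypothetical loss form."""
    d_pos = F.pairwise_distance(mapped_human_code, avatar_pos)
    d_neg = F.pairwise_distance(mapped_human_code, avatar_neg)
    return F.relu(d_pos - d_neg + margin).mean()

z = torch.randn(8, 32)                 # mapped human expression codes (batch, dim)
loss = perceptual_triplet_loss(z, torch.randn(8, 32), torch.randn(8, 32))
print(loss.item())
```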


Subject(s)
Computer Graphics , Facial Expression , Humans , User-Computer Interface
11.
IEEE Trans Pattern Anal Mach Intell ; 44(7): 3450-3466, 2022 Jul.
Article in English | MEDLINE | ID: mdl-33497327

ABSTRACT

The iterative closest point (ICP) algorithm and its variants are a fundamental technique for rigid registration between two point sets, with wide applications in different areas from robotics to 3D reconstruction. The main drawbacks of ICP are its slow convergence and its sensitivity to outliers, missing data, and partial overlaps. Recent work such as Sparse ICP achieves robustness via sparsity optimization at the cost of computational speed. In this paper, we propose a new method for robust registration with fast convergence. First, we show that the classical point-to-point ICP can be treated as a majorization-minimization (MM) algorithm, and propose an Anderson acceleration approach to speed up its convergence. In addition, we introduce a robust error metric based on Welsch's function, which is minimized efficiently using the MM algorithm with Anderson acceleration. On challenging datasets with noise and partial overlaps, we achieve similar or better accuracy than Sparse ICP while being at least an order of magnitude faster. Finally, we extend the robust formulation to point-to-plane ICP, and solve the resulting problem using a similar Anderson-accelerated MM strategy. Our robust ICP methods improve the registration accuracy on benchmark datasets while being competitive in computational time.
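
A minimal sketch of one robust MM-style ICP iteration, assuming Welsch weights w_i = exp(-r_i^2 / (2 nu^2)) and a weighted closed-form (Kabsch/SVD) rigid update; Anderson acceleration and the point-to-plane variant are omitted, and the parameters are placeholders.

```python
import numpy as np
from scipy.spatial import cKDTree

def welsch_icp_step(src, tgt, R, t, nu=0.05):
    """One illustrative robust ICP iteration: find closest points, weight the
    residuals with Welsch's function, then solve the weighted rigid alignment
    in closed form (weighted Kabsch)."""
    moved = src @ R.T + t
    _, idx = cKDTree(tgt).query(moved)
    q = tgt[idx]
    r = np.linalg.norm(moved - q, axis=1)
    w = np.exp(-r**2 / (2 * nu**2))                       # Welsch weights
    ws = w / w.sum()
    p_bar, q_bar = ws @ src, ws @ q                       # weighted centroids
    H = (src - p_bar).T @ ((q - q_bar) * ws[:, None])
    U, _, Vt = np.linalg.svd(H)
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R_new = Vt.T @ S @ U.T
    t_new = q_bar - R_new @ p_bar
    return R_new, t_new

src = np.random.rand(500, 3)
tgt = src + np.array([0.05, 0.0, 0.0])
R0, t0 = np.eye(3), np.zeros(3)
for _ in range(20):
    R0, t0 = welsch_icp_step(src, tgt, R0, t0)
print(np.round(t0, 3))   # translation estimate; should move toward [0.05, 0, 0]
```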

12.
IEEE Trans Vis Comput Graph ; 28(12): 3959-3973, 2022 Dec.
Article in English | MEDLINE | ID: mdl-34495834

ABSTRACT

This article addresses the problem of mesh super-resolution, such that geometric details that are not well represented in the low-resolution models can be recovered and represented well in the generated high-quality models. The main challenges of this problem are the nonregularity of 3D mesh representation and the high complexity of 3D shapes. We propose a deep neural network called GDR-Net to solve this ill-posed problem, which resolves the two challenges simultaneously. First, to overcome the nonregularity, we regress a displacement in radial basis function parameter space instead of the vertex-wise coordinates in Euclidean space. Second, to overcome the high complexity, we apply the detail recovery process to small surface patches extracted from the input surface and obtain the overall high-quality mesh by fusing the refined surface patches. To train the network, we constructed a dataset composed of both real-world and synthetic scanned models, including high/low-quality pairs. Our experimental results demonstrate that GDR-Net works well for general models and outperforms previous methods for recovering geometric details.
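
A small sketch of what "a displacement in radial basis function parameter space" can mean: the displacement field is a sum of Gaussian RBFs, so a network only needs to output centers and coefficients. The values below are random placeholders, not regressed by GDR-Net.

```python
import numpy as np

def rbf_displacement(vertices, centers, coeffs, sigma=0.1):
    """Evaluate a Gaussian-RBF displacement field at mesh vertices:
    d(x) = sum_j c_j * exp(-||x - center_j||^2 / (2 sigma^2)).
    `coeffs` has shape (num_centers, 3): one 3D displacement per basis function."""
    d2 = ((vertices[:, None, :] - centers[None, :, :]) ** 2).sum(-1)   # (V, J)
    phi = np.exp(-d2 / (2 * sigma**2))
    return phi @ coeffs                                                 # (V, 3)

verts = np.random.rand(2000, 3)           # low-resolution patch vertices
centers = np.random.rand(64, 3)           # RBF centers (placeholder)
coeffs = 0.01 * np.random.randn(64, 3)    # displacements a network might regress
refined = verts + rbf_displacement(verts, centers, coeffs)
print(refined.shape)
```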

13.
IEEE Trans Vis Comput Graph ; 28(12): 4930-4939, 2022 Dec.
Article in English | MEDLINE | ID: mdl-34478373

ABSTRACT

In this article, we develop a novel method for fast geodesic distance queries. The key idea is to embed the mesh into a high-dimensional space, such that the Euclidean distance in the high-dimensional space can induce the geodesic distance on the original manifold surface. However, directly solving the high-dimensional embedding problem is not feasible due to the large number of variables and the fact that the embedding problem is highly nonlinear. We overcome the challenges with two novel ideas. First, instead of taking all vertices as variables, we embed only the saddle vertices, which greatly reduces the problem complexity. We then compute a local embedding for each non-saddle vertex. Second, to reduce the large approximation error resulting from the purely Euclidean embedding, we propose a cascaded optimization approach that repeatedly introduces additional embedding coordinates with a non-Euclidean function to reduce the approximation residual. Using the precomputed data, our approach can determine the geodesic distance between any two vertices in near-constant time. Computational results show that our method compares favorably with previous geodesic distance query methods.
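
The basic query mechanism can be sketched with a purely Euclidean embedding computed by classical MDS on a small all-pairs geodesic matrix; the paper's saddle-vertex reduction and cascaded non-Euclidean refinement are not reproduced here, and the toy "mesh" is a circle.

```python
import numpy as np

def mds_embedding(geo_dist, dim=8):
    """Classical multidimensional scaling: embed vertices so that Euclidean
    distances approximate the given all-pairs geodesic distances."""
    n = geo_dist.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (geo_dist ** 2) @ J
    w, V = np.linalg.eigh(B)
    order = np.argsort(w)[::-1][:dim]
    return V[:, order] * np.sqrt(np.maximum(w[order], 0.0))

def query(embedding, u, v):
    """Near-constant-time distance query: one Euclidean norm in embedding space."""
    return np.linalg.norm(embedding[u] - embedding[v])

# Toy example: vertices on a circle, geodesic = arc length along the circle
n = 100
angles = np.linspace(0, 2 * np.pi, n, endpoint=False)
arc = np.abs(angles[:, None] - angles[None, :])
geo = np.minimum(arc, 2 * np.pi - arc)
X = mds_embedding(geo, dim=8)
print(query(X, 0, 25), geo[0, 25])      # embedded vs. true geodesic distance
```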

14.
IEEE Trans Image Process ; 30: 3815-3827, 2021.
Article in English | MEDLINE | ID: mdl-33735079

ABSTRACT

We present a novel method to jointly learn a 3D face parametric model and 3D face reconstruction from diverse sources. Previous methods usually learn 3D face modeling from one kind of source, such as scanned data or in-the-wild images. Although 3D scanned data contain accurate geometric information of face shapes, the capture system is expensive and such datasets usually contain a small number of subjects. On the other hand, in-the-wild face images are easy to obtain and available in large numbers. However, facial images do not contain explicit geometric information. In this paper, we propose a method to learn a unified face model from diverse sources. Besides scanned face data and face images, we also utilize a large number of RGB-D images captured with an iPhone X to bridge the gap between the two sources. Experimental results demonstrate that with training data from more sources, we can learn a more powerful face model.


Subject(s)
Face/diagnostic imaging , Three-Dimensional Imaging/methods , Factual Databases , Female , Humans , Male , Smartphone
15.
IEEE Trans Pattern Anal Mach Intell ; 43(2): 579-594, 2021 Feb.
Article in English | MEDLINE | ID: mdl-31398106

ABSTRACT

In this paper, we propose a parallel and scalable approach for geodesic distance computation on triangle meshes. Our key observation is that the recovery of geodesic distance with the heat method [1] can be reformulated as optimization of its gradients subject to integrability, which can be solved using an efficient first-order method that requires no linear system solving and converges quickly. Afterward, the geodesic distance is efficiently recovered by parallel integration of the optimized gradients in breadth-first order. Moreover, we employ a similar breadth-first strategy to derive a parallel Gauss-Seidel solver for the diffusion step in the heat method. To further lower the memory consumption from gradient optimization on faces, we also propose a formulation that optimizes the projected gradients on edges, which reduces the memory footprint by about 50 percent. Our approach is trivially parallelizable, with a low memory footprint that grows linearly with respect to the model size. This makes it particularly suitable for handling large models. Experimental results show that it can efficiently compute geodesic distance on meshes with more than 200 million vertices on a desktop PC with 128 GB RAM, outperforming the original heat method and other state-of-the-art geodesic distance solvers.
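
A sequential sketch of the breadth-first integration step, assuming the per-edge directional derivatives of the distance are already known (in the paper they are first optimized to be integrable); the edge values below are placeholders.

```python
from collections import deque

def integrate_gradients_bfs(num_vertices, edges, edge_grad, source=0):
    """Recover per-vertex distance by breadth-first integration: for each edge
    (u, v) with known directional derivative g[(u, v)] of the distance field,
    set d[v] = d[u] + g[(u, v)] the first time v is reached from the source."""
    adj = {i: [] for i in range(num_vertices)}
    for (u, v), g in zip(edges, edge_grad):
        adj[u].append((v, g))
        adj[v].append((u, -g))          # walking the edge backwards negates it
    dist = [None] * num_vertices
    dist[source] = 0.0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v, g in adj[u]:
            if dist[v] is None:
                dist[v] = dist[u] + g
                queue.append(v)
    return dist

# Toy path graph 0-1-2-3 with unit edge-length gradients (placeholder values)
print(integrate_gradients_bfs(4, [(0, 1), (1, 2), (2, 3)], [1.0, 1.0, 1.0]))
```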

16.
IEEE Comput Graph Appl ; 41(6): 152-163, 2021.
Article in English | MEDLINE | ID: mdl-32946388

ABSTRACT

This article presents a simple yet effective algorithm for automatically transferring face colors in portrait videos. We extract the facial features and vectorize the faces in the input video using Poisson vector graphics, which encode the low-frequency colors as the boundary colors of diffusion curves and the high-frequency colors as Poisson regions. Then, we transfer the face color of a reference image/video to the first frame of the input video by applying optimal mass transport between the boundary colors of diffusion curves. Next, the boundary colors of the first frame are transferred to the subsequent frames by matching the curves. Finally, with the original or modified Poisson regions, we render the video using an efficient random-access Poisson solver. Thanks to our efficient diffusion curve matching algorithm, transferring colors for the vectorized video takes less than 1 millisecond per frame. Our method is particularly well suited to frequent transfers from multiple references because it reuses information across them. The simple diffusion curve matching also greatly improves the performance of video vectorization, since we only need to solve an optimization problem for the first frame. Since our method does not require correspondence between the reference image/video and the input video, it is flexible and robust in handling faces with significantly different geometries and postures, which often pose challenges to existing methods. Moreover, by manipulating Poisson regions, we can enhance or reduce the highlight and contrast so that the reference color fits into the input video naturally. We demonstrate the efficacy of our method on image-to-video transfer and color swap in videos.
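
Optimal mass transport between two sets of colors has a simple closed form in 1D (monotone matching by sorting); the per-channel sketch below illustrates that flavor of transfer and is not the authors' diffusion-curve pipeline.

```python
import numpy as np

def ot_transfer_1d(source_colors, reference_colors):
    """Per-channel 1D optimal transport: the optimal monotone matching between
    two 1D distributions is obtained by sorting, so each source sample is moved
    to the reference value of the same rank (interpolated for unequal sizes)."""
    out = np.empty_like(source_colors)
    n = len(source_colors)
    for c in range(source_colors.shape[1]):
        order = np.argsort(source_colors[:, c])
        ranks = np.linspace(0, 1, n)
        ref_sorted = np.sort(reference_colors[:, c])
        ref_at_rank = np.interp(ranks, np.linspace(0, 1, len(ref_sorted)), ref_sorted)
        out[order, c] = ref_at_rank
    return out

src = np.random.rand(500, 3)             # e.g. sampled boundary colors of curves
ref = 0.5 + 0.3 * np.random.rand(300, 3)
print(ot_transfer_1d(src, ref).mean(axis=0))   # pushed toward the reference palette
```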

17.
IEEE Trans Vis Comput Graph ; 26(8): 2560-2575, 2020 08.
Article in English | MEDLINE | ID: mdl-32324557

ABSTRACT

Human bodies exhibit various shapes for different identities or poses, but the body shape has certain similarities in structure and thus can be embedded in a low-dimensional space. This article presents an autoencoder-like network architecture to learn disentangled shape and pose embeddings specifically for the 3D human body. This is inspired by recent progress in deformation-based latent representation learning. To improve the reconstruction accuracy, we propose a hierarchical reconstruction pipeline for the disentangling process and construct a large dataset of human body models with consistent connectivity for training the neural network. Our learned embedding can not only achieve superior reconstruction accuracy but also provide great flexibility in 3D human body generation via interpolation, bilinear interpolation, and latent space sampling. The results from extensive experiments demonstrate the power of our learned 3D human body embedding in various applications.
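
The bilinear interpolation mentioned above amounts to blending four latent codes with weights (u, v); a tiny sketch, with placeholder codes and without the learned decoder:

```python
import numpy as np

def bilinear_latent(z00, z10, z01, z11, u, v):
    """Bilinearly blend four latent codes (e.g. spanning two shapes and two poses)
    with weights u, v in [0, 1]; the blended code would then be fed to the
    learned decoder to generate a new body mesh."""
    return ((1 - u) * (1 - v) * z00 + u * (1 - v) * z10
            + (1 - u) * v * z01 + u * v * z11)

codes = [np.random.randn(64) for _ in range(4)]     # placeholder embedding codes
print(bilinear_latent(*codes, u=0.25, v=0.75).shape)
```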


Subject(s)
Computer Graphics , Computer-Assisted Image Processing/methods , Neural Networks (Computer) , Algorithms , Female , Humans , Three-Dimensional Imaging , Male , Posture/physiology
18.
IEEE Trans Pattern Anal Mach Intell ; 42(10): 2552-2566, 2020 10.
Article in English | MEDLINE | ID: mdl-31144624

ABSTRACT

Existing convolutional neural network (CNN) based face recognition algorithms typically learn a discriminative feature mapping, using a loss function that enforces separation of features from different classes and/or aggregation of features within the same class. However, they may suffer from biases in the training data, such as uneven sampling density, because they optimize the adjacency relationship of the learned features without considering the proximity of the underlying faces. Moreover, since they only use facial images for training, the learned feature mapping may not correctly indicate the relationship of other attributes such as gender and ethnicity, which can be important for some face recognition applications. In this paper, we propose a new CNN-based face recognition approach that incorporates such attributes into the training process. Using an attribute-aware loss function that regularizes the feature mapping using attribute proximity, our approach learns more discriminative features that are correlated with the attributes. We train our face recognition model on a large-scale RGB-D data set with over 100K identities captured under real application conditions. By comparing our approach with other methods in a variety of experiments, we demonstrate that the depth channel and the attribute-aware loss greatly improve the accuracy and robustness of face recognition.
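
One plausible form of an attribute-aware regularizer is to make pairwise feature distances within a batch track pairwise attribute distances; the sketch below is an assumption about the loss shape, not the paper's exact formulation.

```python
import torch

def attribute_aware_regularizer(features, attributes):
    """One plausible attribute-aware term: pairwise feature distances within the
    batch should track pairwise attribute distances (e.g. encoded gender/ethnicity
    vectors), so the embedding respects attribute proximity."""
    d_feat = torch.cdist(features, features)
    d_attr = torch.cdist(attributes, attributes)
    return ((d_feat / (d_feat.mean() + 1e-8)
             - d_attr / (d_attr.mean() + 1e-8)) ** 2).mean()

feat = torch.randn(32, 128, requires_grad=True)   # face embeddings from the CNN
attr = torch.randn(32, 4)                         # per-image attribute vectors
loss = attribute_aware_regularizer(feat, attr)    # added to the identity loss
loss.backward()
print(loss.item())
```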


Subject(s)
Automated Facial Recognition/methods , Face , Neural Networks (Computer) , Algorithms , Face/anatomy & histology , Face/diagnostic imaging , Female , Humans , Male
19.
IEEE Trans Vis Comput Graph ; 25(4): 1774-1787, 2019 Apr.
Article in English | MEDLINE | ID: mdl-29993982

ABSTRACT

The joint bilateral filter, which enables feature-preserving signal smoothing according to the structural information from a guidance, has been applied to various tasks in geometry processing. Existing methods rely either on a static guidance, which may be inconsistent with the input and lead to unsatisfactory results, or on a dynamic guidance, which is automatically updated but sensitive to noise and outliers. Inspired by recent advances in image filtering, we propose a new geometry filtering technique called the static/dynamic filter, which utilizes both static and dynamic guidances to achieve state-of-the-art results. The proposed filter is based on a nonlinear optimization that enforces smoothness of the signal while preserving variations that correspond to features of certain scales. We develop an efficient iterative solver for the problem, which unifies existing filters that are based on static or dynamic guidances. The filter can be applied to mesh face normals followed by vertex position updates, to achieve scale-aware and feature-preserving filtering of mesh geometry. It also works well for other types of signals defined on mesh surfaces, such as texture colors. Extensive experimental results demonstrate the effectiveness of the proposed filter for various geometry processing applications such as mesh denoising, geometry feature enhancement, and texture color filtering.
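
A minimal sketch of one joint bilateral pass over mesh face normals, with a guidance normal field driving the range weight; the static/dynamic optimization itself is not reproduced, and all parameters and the toy data are placeholders.

```python
import numpy as np

def joint_bilateral_normals(centroids, normals, guidance, neighbors,
                            sigma_s=0.1, sigma_r=0.3):
    """One joint bilateral pass over face normals: spatial weight from centroid
    distance, range weight from the *guidance* normals, then renormalize."""
    out = np.empty_like(normals)
    for i, nbrs in enumerate(neighbors):
        ws = np.exp(-np.sum((centroids[nbrs] - centroids[i])**2, 1) / (2*sigma_s**2))
        wr = np.exp(-np.sum((guidance[nbrs] - guidance[i])**2, 1) / (2*sigma_r**2))
        n = (ws * wr) @ normals[nbrs]
        out[i] = n / (np.linalg.norm(n) + 1e-12)
    return out

# Toy data: 100 faces with noisy upward normals, guidance = the noisy normals themselves
c = np.random.rand(100, 3)
n = np.tile([0.0, 0.0, 1.0], (100, 1)) + 0.2 * np.random.randn(100, 3)
n /= np.linalg.norm(n, axis=1, keepdims=True)
nbrs = [np.argsort(np.sum((c - c[i])**2, 1))[:8] for i in range(100)]
print(joint_bilateral_normals(c, n, n.copy(), nbrs)[0])
```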

20.
Article in English | MEDLINE | ID: mdl-29994214

ABSTRACT

3D face reconstruction from a single image is a classical and challenging problem, with wide applications in many areas. Inspired by recent works on face animation from RGBD or monocular video inputs, we develop a novel method for reconstructing 3D faces from unconstrained 2D images, using a coarse-to-fine optimization strategy. First, a smooth coarse 3D face is generated from an example-based bilinear face model, by aligning the projection of 3D face landmarks with 2D landmarks detected from the input image. Afterwards, using local corrective deformation fields, the coarse 3D face is refined using photometric consistency constraints, resulting in a medium face shape. Finally, a shape-from-shading method is applied on the medium face to recover fine geometric details. Our method outperforms state-of-the-art approaches in terms of accuracy and detail recovery, as demonstrated in extensive experiments using real-world models and publicly available datasets.
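
The first stage relies on an example-based bilinear face model; evaluating such a model is a tensor contraction of a core tensor with identity and expression coefficients, sketched below with a random placeholder core rather than the actual model.

```python
import numpy as np

def bilinear_face(core, w_id, w_exp):
    """Contract a bilinear face model with identity and expression weights:
    core has shape (3 * num_vertices, num_id, num_exp); the result is the coarse
    face geometry that the landmark-alignment stage would then fit to the image."""
    return np.einsum('vij,i,j->v', core, w_id, w_exp).reshape(-1, 3)

num_vertices, num_id, num_exp = 5000, 50, 25
core = np.random.randn(3 * num_vertices, num_id, num_exp)   # placeholder core tensor
w_id = np.random.randn(num_id)
w_exp = np.random.randn(num_exp)
print(bilinear_face(core, w_id, w_exp).shape)    # (5000, 3) coarse face vertices
```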
