Results 1 - 20 of 67
1.
Cyborg Bionic Syst ; 5: 0063, 2024.
Article in English | MEDLINE | ID: mdl-38188983

ABSTRACT

Respiratory motion-induced vertebral movements can adversely affect intraoperative spine surgery, resulting in inaccurate positional information about the target region and unexpected damage during the operation. In this paper, we propose a novel deep learning architecture for respiratory motion prediction that can adapt to different patients. The proposed method uses an LSTM-AE network with an attention mechanism that can be trained on few-shot datasets during the operation. To ensure real-time performance, a dimension reduction method based on the respiration-induced physical movement of the vertebral bodies is introduced. Data were collected from prone-positioned patients under general anaesthesia to validate the prediction accuracy and time efficiency of the LSTM-AE-based motion prediction method. The experimental results demonstrate that the presented method (RMSE: 4.39%) outperforms other methods in terms of accuracy within a learning time of 2 min. The maximum predictive errors at a latency of 333 ms along the x, y, and z axes of the optical camera system were 0.13, 0.07, and 0.10 mm, respectively, within a motion range of 2 mm.

2.
J Orthop Surg Res ; 18(1): 708, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-38178197

ABSTRACT

BACKGROUND: This study aimed to investigate the positional consistency between the guidewire and the screw in spinal internal fixation surgery. METHODS: This study involved 64 patients who underwent robot-assisted thoracic or lumbar pedicle screw fixation surgery. Guidewires were inserted with the assistance of the TiRobot. Either cannulated screws or solid screws were inserted. Guidewire and screw accuracy was measured on CT images using the Gertzbein and Robbins scale. The positional consistency between guidewire and screw was evaluated on fused CT images, which demonstrate the consistency both graphically and quantitatively. The consistency was graded using a system that considered the maximum distance and angulation between the centerlines of the guidewire and the screw in the region of the pedicle. RESULTS: A total of 322 screws were placed, including 206 cannulated and 116 solid screws. Based on the Gertzbein and Robbins scale, 97.5% of the guidewires were grade A, and 94.1% of the screws were grade A. Based on our guidewire-screw consistency scale, 85% in the cannulated group and 69.8% in the solid group were grade A. Both solid and cannulated screws may alter trajectory compared to the guidewires. The positional accuracy and guidewire-screw consistency in the solid screw group were significantly worse than those in the cannulated screw group. The cortical bone of the pedicle has a positive guiding effect on both solid and cannulated screws. CONCLUSION: Pedicle screws may alter trajectory despite the guidance of the guidewires. Solid screws show worse positional accuracy and guidewire-screw consistency than cannulated screws. TRIAL REGISTRATION: The study was retrospectively registered and approved by our center's institutional review board.
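
As an illustration of the Gertzbein and Robbins grading used in this record, the sketch below maps a measured cortical breach distance to a grade. The 2 mm thresholds follow the scale as it is commonly described; the function names and the example measurements are hypothetical and are not taken from the study. The study's own guidewire-screw consistency scale is not specified in the abstract, so it is not modelled here.

```python
from collections import Counter


def gertzbein_robbins_grade(breach_mm: float) -> str:
    """Map a cortical breach distance (mm) to a Gertzbein-Robbins grade."""
    if breach_mm < 0:
        raise ValueError("breach distance cannot be negative")
    if breach_mm == 0:
        return "A"  # screw fully contained within the pedicle
    if breach_mm < 2:
        return "B"
    if breach_mm < 4:
        return "C"
    if breach_mm < 6:
        return "D"
    return "E"


def grade_share(breaches_mm, grade="A"):
    """Fraction of screws that receive the given grade."""
    counts = Counter(gertzbein_robbins_grade(b) for b in breaches_mm)
    return counts[grade] / len(breaches_mm)


# Hypothetical breach measurements, for illustration only.
example_breaches = [0.0, 0.0, 1.2, 0.0, 4.5]
print(grade_share(example_breaches, "A"))
```

A grade A rate such as the 94.1% reported for screws would correspond to `grade_share(...)` evaluated over all measured screws.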


Subjects
Pedicle Screws; Robotic Surgical Procedures; Robotics; Surgery, Computer-Assisted; Humans; Robotic Surgical Procedures/methods; Spine; Surgery, Computer-Assisted/methods
3.
IEEE Trans Pattern Anal Mach Intell ; 46(5): 3422-3437, 2024 May.
Article in English | MEDLINE | ID: mdl-38100347

ABSTRACT

Neural radiance fields (NeRF) have shown great success in novel view synthesis. However, recovering high-quality details from real-world scenes is still challenging for existing NeRF-based approaches, owing to potentially imperfect calibration information and inaccurate scene representation. Even with high-quality training frames, the synthetic novel views produced by NeRF models still suffer from notable rendering artifacts, such as noise and blur. To address this, we propose NeRFLiX, a general NeRF-agnostic restorer paradigm that learns a degradation-driven inter-viewpoint mixer. Specifically, we design a NeRF-style degradation modeling approach and construct large-scale training data, enabling deep neural networks to effectively remove NeRF-native rendering artifacts. Moreover, beyond degradation removal, we propose an inter-viewpoint aggregation framework that fuses highly related high-quality training images, pushing the performance of cutting-edge NeRF models to entirely new levels and producing highly photo-realistic synthetic views. Based on this paradigm, we further present NeRFLiX++ with a stronger two-stage NeRF degradation simulator and a faster inter-viewpoint mixer, achieving superior performance with significantly improved computational efficiency. Notably, NeRFLiX++ is capable of restoring photo-realistic ultra-high-resolution outputs from noisy low-resolution NeRF-rendered views. Extensive experiments demonstrate the excellent restoration ability of NeRFLiX++ on various novel view synthesis benchmarks.

4.
Eur Radiol ; 2023 Oct 18.
Article in English | MEDLINE | ID: mdl-37848772

ABSTRACT

OBJECTIVES: To develop an automatic computer-based method that can help clinicians assess spine growth potential based on EOS radiographs. METHODS: We developed a deep learning-based (DL) algorithm that can mimic the human judgment process to automatically determine spine growth potential and the Risser sign based on full-length spine EOS radiographs. A total of 3383 EOS cases were collected and used for training and testing the algorithm. Subsequently, the completed DL algorithm underwent clinical validation on an additional 440 cases and was compared to the evaluations of four clinicians. RESULTS: Regarding the Risser sign, the weighted kappa value of our DL algorithm was 0.933, while that of the four clinicians ranged from 0.909 to 0.930. In the assessment of spine growth potential, the kappa value of our DL algorithm was 0.944, while the kappa values of the four clinicians were 0.916, 0.934, 0.911, and 0.920, respectively. Furthermore, our DL algorithm obtained a slightly higher accuracy (0.973) and Youden index (0.952) compared to the best values achieved by the four clinicians. In addition, the speed of our DL algorithm was 15.2 ± 0.3 s/40 cases, much faster than the clinicians, whose speeds ranged from 177.2 ± 28.0 s/40 cases to 241.2 ± 64.1 s/40 cases. CONCLUSIONS: Our algorithm demonstrated comparable or even better performance compared to clinicians in assessing spine growth potential. This stable, efficient, and convenient algorithm seems to be a promising approach to assist doctors in clinical practice and deserves further study. CLINICAL RELEVANCE STATEMENT: This method can quickly ascertain spine growth potential based on EOS radiographs, and it holds promise for assisting busy doctors in certain clinical scenarios. KEY POINTS:
• In the clinic, there is no available computer-based method that can automatically assess spine growth potential.
• We developed a deep learning-based method that could automatically ascertain spine growth potential.
• Compared with the clinicians, our algorithm achieved comparable results.
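
The agreement statistics quoted in this record can be sketched as follows. This is a minimal illustration, assuming ordinal labels coded as integers from 0 to `n_classes - 1`; the abstract does not state whether linear or quadratic weights were used for the weighted kappa, so both are offered, and the function names are hypothetical.

```python
import numpy as np


def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


def weighted_kappa(rater_a, rater_b, n_classes, weights="linear"):
    """Weighted Cohen's kappa for two raters with ordinal integer labels."""
    observed = np.zeros((n_classes, n_classes))
    for i, j in zip(rater_a, rater_b):
        observed[i, j] += 1
    observed /= observed.sum()
    # Agreement expected by chance, from the two raters' marginal distributions.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0))
    idx = np.arange(n_classes)
    distance = np.abs(idx[:, None] - idx[None, :])
    if weights == "linear":
        w = distance
    elif weights == "quadratic":
        w = distance ** 2
    else:
        raise ValueError("weights must be 'linear' or 'quadratic'")
    return 1.0 - (w * observed).sum() / (w * expected).sum()
```

For example, `weighted_kappa(model_labels, reference_labels, n_classes=6)` would compare a model's ordinal grades against a reference standard; the variable names and the class count in that call are placeholders.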

5.
Bioengineering (Basel) ; 10(7)2023 Jun 26.
Article in English | MEDLINE | ID: mdl-37508796

ABSTRACT

The purpose of this study is to develop an automated method for identifying the menarche status of adolescents based on EOS radiographs. We designed a deep-learning-based algorithm that contains a region-of-interest detection network and a classification network. The algorithm was trained and tested on a retrospective dataset of 738 adolescent EOS cases using a five-fold cross-validation strategy and was subsequently tested on a clinical validation set of 259 adolescent EOS cases. On the clinical validation set, our algorithm achieved an accuracy of 0.942, a macro precision of 0.933, a macro recall of 0.938, and a macro F1-score of 0.935. The algorithm showed almost perfect performance in distinguishing between males and females, with the main classification errors found in females aged 12 to 14 years. Specifically for females, the algorithm had an accuracy of 0.910, a sensitivity of 0.943, and a specificity of 0.855 in estimating menarche status, with an area under the curve of 0.959. The kappa value of the algorithm against the actual menarche status was 0.806, indicating strong agreement. This method can efficiently analyze EOS radiographs and identify the menarche status of adolescents. It is expected to become a routine clinical tool and provide references for doctors' decisions under specific clinical conditions.
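
The macro-averaged figures in this record can be reproduced from a confusion matrix along the following lines. This is a sketch under assumptions: rows are taken to be true classes and columns predicted classes, macro F1 is computed as the mean of per-class F1 scores (one common convention; the abstract does not say which was used), and the function name is hypothetical.

```python
import numpy as np


def macro_metrics(confusion):
    """Accuracy and macro precision/recall/F1 from a square confusion matrix.

    Rows are true classes, columns are predicted classes (an assumption).
    """
    cm = np.asarray(confusion, dtype=float)
    true_pos = np.diag(cm).copy()
    predicted = cm.sum(axis=0)  # how often each class was predicted
    actual = cm.sum(axis=1)     # how often each class truly occurred
    precision = np.divide(true_pos, predicted,
                          out=np.zeros_like(true_pos), where=predicted > 0)
    recall = np.divide(true_pos, actual,
                       out=np.zeros_like(true_pos), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom,
                   out=np.zeros_like(true_pos), where=denom > 0)
    return {
        "accuracy": true_pos.sum() / cm.sum(),
        "macro_precision": precision.mean(),
        "macro_recall": recall.mean(),
        "macro_f1": f1.mean(),
    }
```

Macro averaging weights every class equally, which matters here because the classes (for instance male, pre-menarche, post-menarche) are unlikely to be balanced.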

6.
IEEE Trans Image Process ; 32: 4036-4045, 2023.
Article in English | MEDLINE | ID: mdl-37440404

ABSTRACT

Transformer, the model of choice for natural language processing, has drawn scant attention from the medical imaging community. Given their ability to exploit long-term dependencies, transformers are promising to help atypical convolutional neural networks learn more contextualized visual representations. However, most recently proposed transformer-based segmentation approaches simply treat transformers as assisting modules that help encode global context into convolutional representations. To address this issue, we introduce nnFormer (i.e., not-another transFormer), a 3D transformer for volumetric medical image segmentation. nnFormer not only exploits the combination of interleaved convolution and self-attention operations, but also introduces local and global volume-based self-attention mechanisms to learn volume representations. Moreover, nnFormer proposes to use skip attention to replace the traditional concatenation/summation operations in the skip connections of U-Net-like architectures. Experiments show that nnFormer significantly outperforms previous transformer-based counterparts by large margins on three public datasets. Compared to nnUNet, the most widely recognized convnet-based 3D medical segmentation model, nnFormer produces significantly lower HD95 and is much more computationally efficient. Furthermore, we show that nnFormer and nnUNet are highly complementary to each other in model ensembling. Codes and models of nnFormer are available at https://git.io/JSf3i.
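
For readers unfamiliar with the HD95 metric cited in this record, here is a minimal sketch of a 95th-percentile Hausdorff distance between two sets of surface points. Definitions vary between toolkits; this version takes the larger of the two directed 95th percentiles, which is an assumption and may differ from the evaluation code used by the authors. The brute-force distance matrix is only suitable for small point sets.

```python
import numpy as np


def hd95(points_a, points_b):
    """95th-percentile Hausdorff distance between two point sets (N x D, M x D)."""
    a = np.asarray(points_a, dtype=float)
    b = np.asarray(points_b, dtype=float)
    # Pairwise Euclidean distances, shape (N, M).
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    a_to_b = dist.min(axis=1)  # nearest point in B for every point in A
    b_to_a = dist.min(axis=0)  # nearest point in A for every point in B
    return float(max(np.percentile(a_to_b, 95), np.percentile(b_to_a, 95)))
```

Using the 95th percentile rather than the maximum makes the metric less sensitive to a handful of outlier voxels, which is why lower HD95 is read as better boundary agreement.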

7.
Article in English | MEDLINE | ID: mdl-37467083

ABSTRACT

Modeling 3D avatars benefits various application scenarios such as AR/VR, gaming, and filming. Character faces contribute significant diversity and vividness as a vital component of avatars. However, building 3D character face models usually requires a heavy workload with commercial tools, even for experienced artists. Various existing sketch-based tools fail to support amateurs in modeling diverse facial shapes and rich geometric details. In this paper, we present SketchMetaFace, a sketching system targeting amateur users to model high-fidelity 3D faces in minutes. We carefully design both the user interface and the underlying algorithm. First, curvature-aware strokes are adopted to better support the controllability of carving facial details. Second, considering the key problem of mapping a 2D sketch map to a 3D model, we develop a novel learning-based method termed "Implicit and Depth Guided Mesh Modeling" (IDGMM). It fuses the advantages of mesh, implicit, and depth representations to achieve high-quality results with high efficiency. In addition, to further support usability, we present a coarse-to-fine 2D sketching interface design and a data-driven stroke suggestion tool. User studies demonstrate the superiority of our system over existing modeling tools in terms of ease of use and visual quality of results. Experimental analyses also show that IDGMM reaches a better trade-off between accuracy and efficiency.

8.
Brain Sci ; 13(4)2023 Mar 31.
Article in English | MEDLINE | ID: mdl-37190563

ABSTRACT

Delayed neurocognitive recovery (dNCR) is a common complication that occurs post-surgery, especially in elderly individuals. The soluble N-ethylmaleimide-sensitive factor attachment protein receptor (SNARE) complex plays an essential role in various membrane fusion events, such as synaptic vesicle exocytosis and autophagosome-lysosome fusion. Although SNARE complex dysfunction has been observed in several neurodegenerative disorders, the causal link between SNARE-mediated membrane fusion and dNCR remains unclear. We previously demonstrated that surgical stimuli caused cognitive impairment in aged rats by inducing α-synuclein accumulation, inhibiting autophagy, and disrupting neurotransmitter release in hippocampal synaptosomes. Here, we evaluated the effects of propofol anesthesia plus surgery on learning and memory and investigated levels of SNARE proteins and chaperones in hippocampal synaptosomes. Aged rats that received propofol anesthesia and surgery exhibited learning and memory impairments in a Morris water maze test and decreased levels of synaptosome-associated protein 25, synaptobrevin/vesicle-associated membrane protein 2, and syntaxin 1. Levels of SNARE chaperones, including mammalian uncoordinated-18, complexins 1 and 2, cysteine string protein-α, and N-ethylmaleimide-sensitive factor, were all significantly decreased following anesthesia with surgical stress. However, the synaptic vesicle marker synaptophysin was unaffected. The autophagy-enhancer rapamycin attenuated structural and functional disturbances of the SNARE complex and ameliorated disrupted neurotransmitter release. Our results indicate that perturbations of SNARE proteins in hippocampal synaptosomes may underlie the occurrence of dNCR. Moreover, the protective effect of rapamycin may partially occur through recovery of SNARE structural and functional abnormalities. Our findings provide insight into the molecular mechanisms underlying dNCR.

9.
Expert Rev Med Devices ; 20(6): 427-432, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37027325

ABSTRACT

INTRODUCTION: The application of robotic navigation during spine surgery has advanced rapidly over the past two decades, especially in the last 5 years. Robotic systems in spine surgery may offer potential advantages for both patients and surgeons. This article serves as an update to our previous review and explores the current status of spine surgery robots in clinical settings. AREAS COVERED: We evaluated the literature published from 2020 to 2022 on the outcomes of robotics-assisted spine surgery, including accuracy and its influencing factors, radiation exposure, and follow-up results. EXPERT OPINION: The application of robotics in spine surgery has driven spine surgery into a new era of precision treatment through a form of artificial intelligence assistance that compensates for the limitations of human abilities. Modularized robot configurations, intelligent alignment and planning incorporating multimodal images, efficient and simple human-machine interaction, accurate surgical status monitoring, and safe control strategies are the main technical features for the development of orthopedic surgical robots. The use of robotics-assisted decompression, osteotomies, and decision-making warrants further study. Future investigations should focus on patients' needs while continuing to explore in-depth medical-industrial collaborative development innovations that improve the overall utilization of artificial intelligence and sophistication in disease treatment.


Subjects
Robotic Surgical Procedures; Robotics; Surgery, Computer-Assisted; Humans; Artificial Intelligence; Robotic Surgical Procedures/methods; Spine/surgery; Surgery, Computer-Assisted/methods
10.
Article in English | MEDLINE | ID: mdl-37028041

ABSTRACT

Large-scale datasets and deep generative models have enabled impressive progress in human face reenactment. Existing solutions for face reenactment have focused on processing real face images through facial landmarks by generative models. Different from real human faces, artistic human faces (e.g., those in paintings, cartoons, etc.) often involve exaggerated shapes and various textures. Therefore, directly applying existing solutions to artistic faces often fails to preserve the characteristics of the original artistic faces (e.g., face identity and decorative lines along face contours) due to the domain gap between real and artistic faces. To address these issues, we present ReenactArtFace, the first effective solution for transferring the poses and expressions from human videos to various artistic face images. We achieve artistic face reenactment in a coarse-to-fine manner. First, we perform 3D artistic face reconstruction, which reconstructs a textured 3D artistic face through a 3D morphable model (3DMM) and a 2D parsing map from an input artistic image. The 3DMM can not only rig the expressions better than facial landmarks but also render images under different poses/expressions as coarse reenactment results robustly. However, these coarse results suffer from self-occlusions and lack contour lines. Second, we thus perform artistic face refinement by using a personalized conditional adversarial generative model (cGAN) fine-tuned on the input artistic image and the coarse reenactment results. For high-quality refinement, we propose a contour loss to supervise the cGAN to faithfully synthesize contour lines. Quantitative and qualitative experiments demonstrate that our method achieves better results than the existing solutions.

11.
IEEE Trans Pattern Anal Mach Intell ; 45(9): 11079-11095, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37018106

ABSTRACT

We present a deep reinforcement learning method of progressive view inpainting for colored semantic point cloud scene completion under volume guidance, achieving high-quality scene reconstruction from only a single RGB-D image with severe occlusion. Our approach is end-to-end, consisting of three modules: 3D scene volume reconstruction, 2D RGB-D and segmentation image inpainting, and multi-view selection for completion. Given a single RGB-D image, our method first predicts its semantic segmentation map and goes through the 3D volume branch to obtain a volumetric scene reconstruction as a guide to the next view inpainting step, which attempts to fill in the missing information; the third step involves projecting the volume under the same view as the input, concatenating them to complete the current-view RGB-D and segmentation map, and integrating all RGB-D and segmentation maps into the point cloud. Since the occluded areas are unavailable, we resort to an A3C network to glance around and pick the next best view for large hole completion progressively until a scene is adequately reconstructed while guaranteeing validity. All steps are learned jointly to achieve robust and consistent results. We perform qualitative and quantitative evaluations with extensive experiments on the 3D-FUTURE data, obtaining better results than state-of-the-art methods.

12.
J Orthop Surg Res ; 18(1): 271, 2023 Apr 03.
Article in English | MEDLINE | ID: mdl-37013564

ABSTRACT

BACKGROUND: This study aimed to evaluate the safety and efficacy of robot-assisted percutaneous pars-pedicle screw fixation surgery for treating Hangman's fracture. METHODS: The study involved 33 patients with Hangman's fracture who underwent robot-assisted fixation surgery using cannulated pars-pedicle screws through a percutaneous approach. The primary parameter evaluated was the accuracy of the screws according to the Gertzbein-Robbins scale, using postoperative CT images. Secondary parameters included the duration of surgery, intraoperative blood loss, postoperative hospital stay, and neurovascular injury. RESULTS: A total of 60 pars-pedicle screws were placed in 33 patients. Based on the Levine and Edwards classification, the patients included 12 cases of type I, 15 cases of type II, five cases of type IIa, and one atypical case. The average operative time was 92.4 ± 37.4 min, and the average blood loss was 22.4 ± 17.9 ml. Fifty-five of 60 screws were successfully placed within the bone. No screw-related neurovascular injury was observed, and satisfactory reduction was achieved in all cases. CONCLUSION: Robot-assisted percutaneous pars-pedicle screw fixation is a safe and feasible method for treating Hangman's fracture. TRIAL REGISTRATION: The study was retrospectively registered and approved by our center's institutional review board.


Subjects
Fractures, Bone; Pedicle Screws; Robotics; Spinal Fractures; Humans; Spinal Fractures/diagnostic imaging; Spinal Fractures/surgery; Fracture Fixation, Internal/methods; Fractures, Bone/diagnostic imaging; Fractures, Bone/surgery; Retrospective Studies
13.
IEEE Trans Med Imaging ; 42(6): 1875-1884, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37022815

ABSTRACT

Chronic glaucoma is an eye disease with progressive optic nerve damage. It is the second leading cause of blindness after cataract and the leading cause of irreversible blindness. Glaucoma forecasting predicts the future eye state of a patient by analyzing historical fundus images, which helps with early detection and intervention in potential patients and with avoiding blindness. In this paper, we propose a GLaucoma forecast transformer based on Irregularly saMpled fundus images, named GLIM-Net, to predict the probability of developing glaucoma in the future. The main challenge is that existing fundus images are often sampled at irregular times, making it difficult to accurately capture the subtle progression of glaucoma over time. We therefore introduce two novel modules, namely time positional encoding and time-sensitive MSA (multi-head self-attention) modules, to address this challenge. Unlike many existing works that focus on prediction for an unspecified future time, we also propose an extended model that is further capable of prediction conditioned on a specific future time. The experimental results on the benchmark dataset SIGF show that the accuracy of our method exceeds that of state-of-the-art models. In addition, the ablation experiments confirm the effectiveness of the two modules we propose, which can provide a good reference for the optimization of Transformer models.
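
To make the idea of a time positional encoding for irregularly sampled images concrete, the sketch below feeds real acquisition times into the standard sinusoidal encoding in place of integer sequence positions. This is only one plausible reading of the abstract: the exact formulation used in GLIM-Net is not given there, and the function name, the `base` constant, and the choice of time unit are assumptions.

```python
import numpy as np


def time_positional_encoding(times, d_model, base=10000.0):
    """Sinusoidal encoding evaluated at real-valued timestamps.

    times: acquisition times of the images (e.g. months since the first visit).
    d_model: embedding width, assumed even.
    """
    times = np.asarray(times, dtype=float)
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    # One frequency per sine/cosine pair, as in the standard Transformer encoding.
    freqs = base ** (-np.arange(0, d_model, 2) / d_model)
    angles = times[:, None] * freqs[None, :]
    encoding = np.empty((len(times), d_model))
    encoding[:, 0::2] = np.sin(angles)
    encoding[:, 1::2] = np.cos(angles)
    return encoding
```

With this construction, two visits that are far apart in time receive encodings that differ more than two visits close together, even when they are adjacent in the sequence.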


Subjects
Glaucoma; Humans; Glaucoma/diagnostic imaging; Fundus Oculi; Blindness
14.
IEEE Trans Med Imaging ; 42(4): 947-958, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36355729

ABSTRACT

Recently, deep neural networks, which require a large number of annotated samples, have been widely applied in nuclei instance segmentation of H&E stained pathology images. However, it is inefficient and unnecessary to label all pixels for a dataset of nuclei images, which usually contain similar and redundant patterns. Although unsupervised and semi-supervised learning methods have been studied for nuclei segmentation, very few works have delved into the selective labeling of samples to reduce the annotation workload. Thus, in this paper, we propose a novel full nuclei segmentation framework that chooses only a few image patches to be annotated, augments the training set from the selected samples, and achieves nuclei segmentation in a semi-supervised manner. In the proposed framework, we first develop a novel consistency-based patch selection method to determine which image patches are the most beneficial to training. Then we introduce a conditional single-image GAN with a component-wise discriminator to synthesize more training samples. Lastly, our proposed framework trains an existing segmentation model with the above augmented samples. The experimental results show that our proposed method can reach the same level of performance as a fully supervised baseline by annotating less than 5% of pixels on some benchmarks.


Subjects
Cell Nucleus; Neural Networks, Computer; Supervised Machine Learning
15.
IEEE Trans Pattern Anal Mach Intell ; 45(3): 3677-3694, 2023 Mar.
Article in English | MEDLINE | ID: mdl-35648876

ABSTRACT

Domain Adaptive Object Detection (DAOD) focuses on improving the generalization ability of object detectors via knowledge transfer. Recent advances in DAOD strive to shift the emphasis of the adaptation process from global to local by virtue of fine-grained feature alignment methods. However, both the global and local alignment approaches fail to capture the topological relations among different foreground objects, as the explicit dependencies and interactions between and within domains are neglected. In this case, seeking only one-vs-one alignment does not necessarily ensure precise knowledge transfer. Moreover, conventional alignment-based approaches may be vulnerable to catastrophic overfitting on less transferable regions (e.g., backgrounds) due to the accumulation of inaccurate localization results in the target domain. To remedy these issues, we first formulate DAOD as an open-set domain adaptation problem, in which the foregrounds and backgrounds are seen as the "known classes" and "unknown class" respectively. Accordingly, we propose a new and general framework for DAOD, named Foreground-aware Graph-based Relational Reasoning (FGRR), which incorporates graph structures into the detection pipeline to explicitly model the intra- and inter-domain foreground object relations in both pixel and semantic spaces, thereby endowing the DAOD model with the capability of relational reasoning beyond the popular alignment-based paradigm. FGRR first identifies the foreground pixels and regions by searching reliable correspondence and cross-domain similarity regularization respectively. The inter-domain visual and semantic correlations are hierarchically modeled via bipartite graph structures, and the intra-domain relations are encoded via graph attention mechanisms. Through message-passing, each node aggregates semantic and contextual information from the same and opposite domain to substantially enhance its expressive power. Empirical results demonstrate that the proposed FGRR exceeds the state-of-the-art performance on four DAOD benchmarks.

16.
BMC Surg ; 22(1): 378, 2022 Nov 04.
Article in English | MEDLINE | ID: mdl-36333797

ABSTRACT

BACKGROUND: To evaluate the accuracy of screw placement using the TiRobot surgical robot in the Harms procedure and to assess the clinical outcomes of this technique. METHODS: This retrospective study included 21 patients with atlantoaxial instability treated by posterior atlantoaxial internal fixation (Harms procedure) using the TiRobot surgical robot between March 2016 and June 2021. The precision of screw placement, perioperative parameters and clinical outcomes were recorded. Screw placement was assessed based on intraoperative guiding pin accuracy measurements on intraoperative C-arm cone-beam computed tomography (CT) images using overlay technology and the incidence of screw encroachment identified on CT images. RESULTS: Among the 21 patients, the mean age was 44.8 years, and the causes of atlantoaxial instability were os odontoideum (n = 11), rheumatoid arthritis (n = 2), unknown pathogenesis (n = 3), and type II odontoid fracture (n = 5). A total of 82 screws were inserted with robotic assistance. From intraoperative guiding pin accuracy measurements, the average translational and angular deviations were 1.52 ± 0.35 mm (range 1.14-2.25 mm) and 2.25° ± 0.45° (range 1.73°-3.20°), respectively. Screw placement was graded as A for 80.5% of screws, B for 15.9%, and C for 3.7%. No complications related to screw misplacement were observed. After the 1-year follow-up, all patients with a neurological deficit experienced neurological improvement based on Nurick Myelopathy Scale scores, and all patients with preoperative neck pain reported improvement based on Visual Analog Scale scores. CONCLUSIONS: Posterior atlantoaxial internal fixation using the Harms technique assisted by a 3D-based navigation robot is safe, accurate, and effective for treating atlantoaxial instability.


Subjects
Atlanto-Axial Joint; Joint Instability; Robotics; Spinal Diseases; Spinal Fusion; Humans; Adult; Atlanto-Axial Joint/diagnostic imaging; Atlanto-Axial Joint/surgery; Spinal Fusion/methods; Retrospective Studies; Joint Instability/surgery
17.
Article in English | MEDLINE | ID: mdl-35503827

ABSTRACT

Sampling, grouping, and aggregation are three important components in the multi-scale analysis of point clouds. In this paper, we present a novel data-driven sampler learning strategy for point-wise analysis tasks. Unlike the widely used sampling technique, Farthest Point Sampling (FPS), we propose to learn sampling and downstream applications jointly. Our key insight is that uniform sampling methods like FPS are not always optimal for different tasks: sampling more points around boundary areas can make point-wise classification easier for segmentation. To this end, we propose a novel sampler learning strategy that learns sampling point displacement supervised by task-related ground truth information and can be trained jointly with the underlying tasks. We further demonstrate our methods in various point-wise analysis tasks, including semantic part segmentation, point cloud completion, and keypoint detection. Our experiments show that jointly learning the sampler and the task brings better performance than using FPS in various point-based networks.
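
For reference, the Farthest Point Sampling baseline that this record contrasts against can be sketched as below. This is the standard greedy algorithm, not the learned sampler proposed by the authors; the function name and the fixed `start` index are illustrative choices.

```python
import numpy as np


def farthest_point_sampling(points, k, start=0):
    """Greedy FPS: repeatedly pick the point farthest from those already chosen.

    points: array of shape (N, D); returns a list of k selected indices.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the number of points")
    selected = [start]
    # Distance from every point to its nearest selected point so far.
    min_dist = np.linalg.norm(pts - pts[start], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(pts - pts[nxt], axis=1))
    return selected
```

Because each step maximizes distance to the current sample set, FPS spreads samples roughly uniformly over the shape, which is exactly the property the abstract argues is not always optimal for tasks that benefit from denser sampling near boundaries.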

18.
IEEE Trans Vis Comput Graph ; 28(6): 2415-2429, 2022 Jun.
Article in English | MEDLINE | ID: mdl-33048679

ABSTRACT

In the game and film industries, modeling 3D heads plays a very important role in designing characters. Although human head modeling has been researched for a long time, few works have focused on animal-like heads, which are of more diverse shapes and richer geometric details. In this article, we present SAniHead, an interactive system for creating animal-like heads with a mesh representation from dual-view sketches. Our core technical contribution is a view-surface collaborative mesh generative network. Initially, a graph convolutional neural network (GCNN) is trained to learn the deformation of a template mesh to fit the shape of sketches, giving rise to a coarse model. It is then projected into vertex maps where image-to-image translation networks are performed for detail inference. After back-projecting the inferred details onto the meshed surface, a new GCNN is trained for further detail refinement. The modules of view-based detail inference and surface-based detail refinement are conducted in an alternating cascaded fashion, collaboratively improving the model. A refinement sketching interface is also implemented to support direct mesh manipulation. Experimental results show the superiority of our approach and the usability of our interactive system. Our work also contributes a 3D animal head dataset with corresponding line drawings.


Subjects
Computer Graphics; Animals; Head; Neural Networks, Computer
19.
IEEE Trans Pattern Anal Mach Intell ; 44(10): 6454-6471, 2022 Oct.
Article in English | MEDLINE | ID: mdl-34101584

ABSTRACT

This paper focuses on the challenging task of learning 3D object surface reconstructions from RGB images. Existing methods achieve varying degrees of success by using different surface representations. However, they all have their own drawbacks, and cannot properly reconstruct the surface shapes of complex topologies, arguably due to a lack of constraints on the topological structures in their learning frameworks. To this end, we propose to learn and use the topology-preserved, skeletal shape representation to assist the downstream task of object surface reconstruction from RGB images. Technically, we propose the novel SkeletonNet design that learns a volumetric representation of a skeleton via a bridged learning of a skeletal point set, where we use parallel decoders each responsible for the learning of points on 1D skeletal curves and 2D skeletal sheets, as well as an efficient module of globally guided subvolume synthesis for a refined, high-resolution skeletal volume; we present a differentiable Point2Voxel layer to make SkeletonNet end-to-end and trainable. With the learned skeletal volumes, we propose two models, the Skeleton-Based Graph Convolutional Neural Network (SkeGCNN) and the Skeleton-Regularized Deep Implicit Surface Network (SkeDISN), which respectively build upon and improve over the existing frameworks of explicit mesh deformation and implicit field learning for the downstream surface reconstruction task. We conduct thorough experiments that verify the efficacy of our proposed SkeletonNet. SkeGCNN and SkeDISN outperform existing methods as well, and they have their own merits when measured by different metrics. Additional results in generalized task settings further demonstrate the usefulness of our proposed methods. We have made our implementation code publicly available at https://github.com/tangjiapeng/SkeletonNet.


Subjects
Algorithms; Learning; Machine Learning; Neural Networks, Computer
20.
IEEE Trans Pattern Anal Mach Intell ; 44(6): 3000-3014, 2022 Jun.
Article in English | MEDLINE | ID: mdl-33434125

ABSTRACT

Estimating 3D human pose from a single image is a challenging task. This work attempts to address the uncertainty of lifting the detected 2D joints to 3D space by introducing an intermediate state, Part-Centric Heatmap Triplets (HEMlets), which shortens the gap between the 2D observation and the 3D interpretation. The HEMlets utilize three joint-heatmaps to represent the relative depth information of the end-joints for each skeletal body part. In our approach, a Convolutional Network (ConvNet) is first trained to predict HEMlets from the input image, followed by a volumetric joint-heatmap regression. We leverage the integral operation to extract the joint locations from the volumetric heatmaps, guaranteeing end-to-end learning. Despite the simplicity of the network design, the quantitative comparisons show a significant performance improvement over the best-of-grade methods (e.g., 20 percent on Human3.6M). The proposed method naturally supports training with "in-the-wild" images, where only weakly-annotated relative depth information of skeletal joints is available. This further improves the generalization ability of our model, as validated by qualitative comparisons on outdoor images. Leveraging the strength of the HEMlets pose estimation, we further design and append a shallow yet effective network module to regress the SMPL parameters of the body pose and shape. We term the entire HEMlets-based human pose and shape recovery pipeline HEMlets PoSh. Extensive quantitative and qualitative experiments on the existing human body recovery benchmarks justify the state-of-the-art results obtained with our HEMlets PoSh approach.
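
The integral operation mentioned in this record is commonly implemented as a soft-argmax: the heatmap is normalized into a probability volume and the joint location is its expected coordinate, which keeps the step differentiable. The sketch below illustrates that idea in NumPy for a single joint; the softmax normalization and the (depth, height, width) axis order are assumptions, and this is not the authors' implementation.

```python
import numpy as np


def soft_argmax_3d(heatmap):
    """Expected (x, y, z) voxel coordinate of a volumetric heatmap.

    heatmap: array of shape (D, H, W) holding unnormalized scores.
    """
    scores = np.asarray(heatmap, dtype=float)
    # Softmax over the whole volume, shifted by the maximum for stability.
    prob = np.exp(scores - scores.max())
    prob /= prob.sum()
    depth, height, width = scores.shape
    zs, ys, xs = np.meshgrid(np.arange(depth), np.arange(height),
                             np.arange(width), indexing="ij")
    return np.array([(prob * xs).sum(), (prob * ys).sum(), (prob * zs).sum()])
```

Unlike a hard argmax, the result is a continuous coordinate and has well-defined gradients with respect to the heatmap, which is what allows end-to-end training.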


Subjects
Algorithms; Imaging, Three-Dimensional; Humans; Imaging, Three-Dimensional/methods