1.
Article in English | MEDLINE | ID: mdl-38630565

ABSTRACT

Some robust point cloud registration approaches with controllable pose refinement magnitude, such as ICP and its variants, are commonly used to improve 6D pose estimation accuracy. However, the effectiveness of these methods gradually diminishes with the advancement of deep learning techniques and the enhancement of initial pose accuracy, primarily due to their lack of specific design for pose refinement. In this paper, we propose Point Cloud Completion and Keypoint Refinement with Fusion Data (PCKRF), a new pose refinement pipeline for 6D pose estimation. The pipeline consists of two steps. First, it completes the input point clouds via a novel pose-sensitive point completion network. The network uses both local and global features with pose information during point completion. Then, it registers the completed object point cloud with the corresponding target point cloud by our proposed Color supported Iterative KeyPoint (CIKP) method. The CIKP method introduces color information into registration and registers a point cloud around each keypoint to increase stability. The PCKRF pipeline can be integrated with existing popular 6D pose estimation methods, such as the full flow bidirectional fusion network, to further improve their pose estimation accuracy. Experiments demonstrate that our method exhibits superior stability compared to existing approaches when optimizing initial poses with relatively high precision. Notably, the results indicate that our method effectively complements most existing pose estimation techniques, leading to improved performance in most cases. Furthermore, our method achieves promising results even in challenging scenarios involving textureless and symmetrical objects. Our source code is available at https://github.com/zhanhz/KRF.
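As an illustration of the kind of color-supported, keypoint-centred rigid registration step the CIKP description suggests, the sketch below performs one weighted Kabsch alignment in which correspondences with mismatched colors are down-weighted. This is a minimal reading of the idea, not the authors' implementation; the function name, the Gaussian color weighting, and the assumption that correspondences are already established are all ours.

    import numpy as np

    def color_weighted_rigid_fit(src, dst, src_rgb, dst_rgb, sigma=0.1):
        # src/dst: (N, 3) corresponding points gathered around one keypoint;
        # src_rgb/dst_rgb: (N, 3) colors of those points (hypothetical inputs).
        w = np.exp(-np.linalg.norm(src_rgb - dst_rgb, axis=1) ** 2 / (2 * sigma ** 2))
        w /= w.sum()                                        # color-based correspondence weights
        mu_s = (w[:, None] * src).sum(0)
        mu_d = (w[:, None] * dst).sum(0)
        H = (src - mu_s).T @ (w[:, None] * (dst - mu_d))    # weighted cross-covariance
        U, _, Vt = np.linalg.svd(H)
        d = np.sign(np.linalg.det(Vt.T @ U.T))
        R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T             # rotation (weighted Kabsch)
        t = mu_d - R @ mu_s                                 # translation
        return R, t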

2.
IEEE Trans Vis Comput Graph; 30(1): 606-616, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37871082

ABSTRACT

As communications are increasingly taking place virtually, the ability to present well online is becoming an indispensable skill. Online speakers are facing unique challenges in engaging with remote audiences. However, there has been a lack of evidence-based analytical systems for people to comprehensively evaluate online speeches and further discover possibilities for improvement. This paper introduces SpeechMirror, a visual analytics system facilitating reflection on a speech based on insights from a collection of online speeches. The system estimates the impact of different speech techniques on effectiveness and applies them to a speech to give users awareness of the performance of speech techniques. A similarity recommendation approach based on speech factors or script content supports guided exploration to expand knowledge of presentation evidence and accelerate the discovery of speech delivery possibilities. SpeechMirror provides intuitive visualizations and interactions for users to understand speech factors. Among them, SpeechTwin, a novel multimodal visual summary of speech, supports rapid understanding of critical speech factors and comparison of different speech samples, and SpeechPlayer augments the speech video by integrating visualization of the speaker's body language with interaction, for focused analysis. The system utilizes visualizations suited to the distinct nature of different speech factors for user comprehension. The proposed system and visualization techniques were evaluated with domain experts and amateurs, demonstrating usability for users with low visualization literacy and its efficacy in assisting users to develop insights for potential improvement.


Subjects
Computer Graphics, Speech, Humans, Communication
3.
Article in English | MEDLINE | ID: mdl-38145513

ABSTRACT

As a significant geometric feature of 3D point clouds, sharp features play an important role in shape analysis, 3D reconstruction, registration, localization, etc. Current sharp feature detection methods are still sensitive to the quality of the input point cloud, and their detection performance is affected by random noisy points and non-uniform densities. In this paper, using prior knowledge of geometric features, we propose the Multi-scale Laplace Network (MSL-Net), a new deep-learning-based method built on an intrinsic neighbor shape descriptor, to detect sharp features in 3D point clouds. First, we establish a discrete intrinsic neighborhood of the point cloud based on the Laplacian graph, which reduces the error of local implicit surface estimation. Then, we design a new intrinsic shape descriptor based on the intrinsic neighborhood, combined with enhanced normal extraction and a cosine-based field estimation function. Finally, we present the backbone of MSL-Net based on the intrinsic shape descriptor. Benefiting from the intrinsic neighborhood and shape descriptor, MSL-Net has a simple architecture and produces accurate feature predictions that satisfy the manifold distribution while avoiding complex intrinsic metric calculations. Extensive experimental results demonstrate that, with the multi-scale structure, MSL-Net has a strong ability to analyze local perturbations of point clouds. Compared with state-of-the-art methods, MSL-Net is more robust and accurate. The code is publicly available at.
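For readers unfamiliar with graph-based neighborhoods, the sketch below shows one generic way to build a geodesic ("intrinsic") neighborhood on a k-NN graph, so that neighborhoods do not leak across nearby but disconnected surface sheets. It only illustrates the general idea; the k value, the radius, and the use of SciPy's Dijkstra routine are assumptions, not details of MSL-Net.

    import numpy as np
    from scipy.spatial import cKDTree
    from scipy.sparse import csr_matrix
    from scipy.sparse.csgraph import dijkstra

    def intrinsic_neighbourhoods(points, k=16, geodesic_radius=0.05):
        n = len(points)
        dist, idx = cKDTree(points).query(points, k=k + 1)   # first column is the point itself
        rows = np.repeat(np.arange(n), k)
        graph = csr_matrix((dist[:, 1:].ravel(), (rows, idx[:, 1:].ravel())), shape=(n, n))
        # Graph (geodesic) distances on the k-NN graph, truncated at geodesic_radius.
        geo = dijkstra(graph, directed=False, limit=geodesic_radius)
        return [np.flatnonzero(np.isfinite(geo[i])) for i in range(n)]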

4.
Zootaxa; 5319(1): 76-90, 2023 Jul 24.
Article in English | MEDLINE | ID: mdl-37518249

ABSTRACT

A new species of the genus Hebius Thompson, 1913 is described from Youjiang District, Baise City, Guangxi Zhuang Autonomous Region, China, based on a single adult female specimen. It can be distinguished from its congeners by the following combination of characters: (1) dorsal scale rows 19-17-17, feebly keeled except the outermost row; (2) tail comparatively long, TAL/TL ratio 0.30 in females; (3) ventrals 160 (+ 3 preventrals); (4) subcaudals 112; (5) supralabials 9, the fourth to sixth in contact with the eye; (6) infralabials 10, the first 5 touching the first pair of chin shields; (7) preocular 1; (8) postoculars 2; (9) temporals 4, arranged in three rows (1+1+2); (10) maxillary teeth 30, the last 3 enlarged, without diastema; (11) postocular streak present; (12) background color of the dorsum brownish black, with a conspicuous, uniform, continuous beige stripe extending from behind the eye to the end of the tail; (13) anterior venter creamy yellow, gradually fading posteriorly, with irregular black blotches on the middle and outer quarter of the ventrals, the posterior part almost completely black. The discovery of the new species increases the number of species in the genus Hebius to 51.


Subjects
Colubridae, Lizards, Female, Animals, China, Animal Distribution, Tail, Animal Structures, Phylogeny
5.
IEEE Trans Image Process; 32: 3136-3149, 2023.
Article in English | MEDLINE | ID: mdl-37227918

ABSTRACT

Benefiting from the intuitiveness and naturalness of sketch interaction, sketch-based video retrieval (SBVR) has received considerable attention in the video retrieval research area. However, most existing SBVR research still lacks the capability of accurate video retrieval with fine-grained scene content. To address this problem, in this paper we investigate a new task, which focuses on retrieving the target video by utilizing a fine-grained storyboard sketch depicting the scene layout and the major foreground instances' visual characteristics (e.g., appearance, size, and pose) of the video; we call such a task "fine-grained scene-level SBVR". The most challenging issue in this task is how to perform scene-level cross-modal alignment between sketch and video. Our solution consists of two parts. First, we construct a scene-level sketch-video dataset called SketchVideo, in which sketch-video pairs are provided and each pair contains a clip-level storyboard sketch and several keyframe sketches (corresponding to video frames). Second, we propose a novel deep learning architecture called Sketch Query Graph Convolutional Network (SQ-GCN). In SQ-GCN, we first adaptively sample the video frames to improve video encoding efficiency, and then construct appearance and category graphs to jointly model visual and semantic alignment between sketch and video. Experiments show that our fine-grained scene-level SBVR framework with the SQ-GCN architecture outperforms the state-of-the-art fine-grained retrieval methods. The SketchVideo dataset and SQ-GCN code are available on the project webpage https://iscas-mmsketch.github.io/FG-SL-SBVR/.
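The abstract's "adaptively sample the video frames" admits several readings; one simple, hedged illustration is to place keyframes so that the cumulative inter-frame change is split evenly between them, as sketched below. The function and its parameters are hypothetical and not taken from the SQ-GCN code.

    import numpy as np

    def adaptive_sample(frames, m=8):
        # frames: (T, H, W, C) array; returns indices of m content-adaptive keyframes.
        frames = frames.astype(np.float32)
        change = np.abs(np.diff(frames, axis=0)).mean(axis=(1, 2, 3))   # per-step change
        cum = np.concatenate([[0.0], np.cumsum(change)])
        targets = np.linspace(0.0, cum[-1], m)
        return np.searchsorted(cum, targets).clip(0, len(frames) - 1)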

6.
Article in English | MEDLINE | ID: mdl-37220037

ABSTRACT

3D dense captioning aims to semantically describe each object detected in a 3D scene, which plays a significant role in 3D scene understanding. Previous works lack a complete definition of 3D spatial relationships and directly integrate the visual and language modalities, thus ignoring the discrepancies between the two. To address these issues, we propose a novel complete 3D relationship extraction modality alignment network, which consists of three steps: 3D object detection, complete 3D relationship extraction, and modality alignment caption. To comprehensively capture 3D spatial relationship features, we define a complete set of 3D spatial relationships, including the local spatial relationships between objects and the global spatial relationship between each object and the entire scene. To this end, we propose a complete 3D relationship extraction module based on message passing and self-attention to mine multi-scale spatial relationship features, and inspect the transformation to obtain features in different views. In addition, we propose a modality alignment caption module that fuses multi-scale relationship features and generates descriptions to bridge the semantic gap from the visual space to the language space using the prior information in the word embeddings, helping generate improved descriptions for the 3D scene. Extensive experiments demonstrate that the proposed model outperforms the state-of-the-art methods on the ScanRefer and Nr3D datasets.
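To make the "message passing over spatial relationships" step concrete, the PyTorch sketch below passes messages over a fully connected graph of detected objects, with each edge carrying a relative-position feature as a stand-in for a local spatial relationship. Layer sizes, the choice of mean aggregation, and the module name are assumptions for illustration only.

    import torch
    import torch.nn as nn

    class RelationMessagePassing(nn.Module):
        def __init__(self, feat_dim=256):
            super().__init__()
            self.edge_mlp = nn.Sequential(nn.Linear(2 * feat_dim + 3, feat_dim), nn.ReLU())
            self.node_mlp = nn.Sequential(nn.Linear(2 * feat_dim, feat_dim), nn.ReLU())

        def forward(self, feats, centers):
            # feats: (N, D) object features; centers: (N, 3) detected box centres.
            N, D = feats.shape
            rel = centers[:, None, :] - centers[None, :, :]                    # (N, N, 3)
            pair = torch.cat([feats[:, None].expand(N, N, D),
                              feats[None, :].expand(N, N, D), rel], dim=-1)
            messages = self.edge_mlp(pair).mean(dim=1)                         # aggregate over neighbours
            return self.node_mlp(torch.cat([feats, messages], dim=-1))         # updated object features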

7.
Article in English | MEDLINE | ID: mdl-37021894

ABSTRACT

Choreography with artificial intelligence has recently attracted increasing attention from 3D animators. However, most existing deep learning methods rely mainly on music for dance generation and lack sufficient control over the generated dance motions. To address this issue, we introduce the idea of keyframe interpolation for music-driven dance generation and present a novel transition generation technique for choreography. Specifically, this technique synthesizes visually diverse and plausible dance motions by using normalizing flows to learn the probability distribution of dance motions conditioned on a piece of music and a sparse set of key poses. Thus, the generated dance motions respect both the input musical beats and the key poses. To achieve robust transitions of varying lengths between the key poses, we introduce a time embedding at each timestep as an additional condition. Extensive experiments show that our model generates more realistic, diverse, and beat-matching dance motions than the compared state-of-the-art methods, both qualitatively and quantitatively. Our experimental results demonstrate the superiority of keyframe-based control for improving the diversity of the generated dance motions.
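The per-timestep condition could be realized in many ways; a common choice, shown below purely as an assumption, is a sinusoidal time embedding concatenated with the music and key-pose features before being fed to the flow.

    import numpy as np

    def time_embedding(t, total_steps, dim=32):
        # t: frame index within the transition between two key poses; returns a (dim,) vector.
        pos = t / max(total_steps - 1, 1)
        freqs = 2.0 ** np.arange(dim // 2)
        return np.concatenate([np.sin(np.pi * pos * freqs), np.cos(np.pi * pos * freqs)])

    # Hypothetical usage: cond = np.concatenate([music_feat, key_pose_feat, time_embedding(t, T)])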

8.
Sci Rep; 13(1): 2995, 2023 Feb 21.
Article in English | MEDLINE | ID: mdl-36810767

ABSTRACT

Positive human-agent relationships can effectively improve human experience and performance in human-machine systems or environments. The characteristics of agents that enhance this relationship have garnered attention in human-agent and human-robot interaction. In this study, building on the persona effect, we examine the effect of an agent's social cues on human-agent relationships and human performance. We constructed a tedious task in an immersive virtual environment, designing virtual partners with varying levels of human likeness and responsiveness. Human likeness encompassed appearance, sound, and behavior, while responsiveness referred to the way agents responded to humans. Based on the constructed environment, we present two studies exploring the effects of an agent's human likeness and responsiveness on participants' performance and their perception of human-agent relationships during the task. The results indicate that when participants work with an agent, its responsiveness attracts attention and induces positive feelings. Agents with responsiveness and appropriate social response strategies have a significant positive effect on human-agent relationships. These results shed some light on how to design virtual agents to improve user experience and performance in human-agent interactions.


Subjects
Attention, Emotions, Humans, Man-Machine Systems
9.
IEEE Trans Vis Comput Graph; 29(3): 1785-1798, 2023 Mar.
Article in English | MEDLINE | ID: mdl-34851826

ABSTRACT

3D reconstruction from single-view images is a long-standing research problem. There have been various methods based on point clouds and volumetric representations. Despite their success in generating 3D models, these approaches find it quite challenging to deal with models that have complex topology and fine geometric details. Thanks to recent advances in deep shape representations, learning the structure and detail representation using deep neural networks is a promising direction. In this article, we propose a novel approach named STD-Net to reconstruct 3D models using a mesh representation, which is well suited for characterizing complex structures and geometric details. Our method consists of (1) an auto-encoder network for recovering the structure of an object with a bounding box representation from a single-view image; (2) a topology-adaptive GCN for updating vertex positions for meshes of complex topology; and (3) a unified mesh deformation block that deforms the structural boxes into structure-aware meshes. Evaluation on ShapeNet and PartNet shows that STD-Net performs better than state-of-the-art methods in reconstructing complex structures and fine geometric details.

10.
IEEE Trans Vis Comput Graph; 29(4): 2203-2210, 2023 Apr.
Article in English | MEDLINE | ID: mdl-34752397

ABSTRACT

Caricature is a type of artistic style of human faces that attracts considerable attention in the entertainment industry. So far, only a few 3D caricature generation methods exist, and all of them require some caricature information (e.g., a caricature sketch or 2D caricature) as input. Such input, however, is difficult for non-professional users to provide. In this paper, we propose an end-to-end deep neural network model that generates high-quality 3D caricatures directly from a normal 2D face photo. The most challenging issue for our system is that the source domain of face photos (characterized by normal 2D faces) is significantly different from the target domain of 3D caricatures (characterized by 3D exaggerated face shapes and textures). To address this challenge, we: (1) build a large dataset of 5,343 3D caricature meshes and use it to establish a PCA model of the 3D caricature shape space; (2) reconstruct a normal full 3D head from the input face photo and use its PCA representation in the 3D caricature shape space to establish correspondences between the input photo and the 3D caricature shape; and (3) propose a novel character loss and a novel caricature loss based on previous psychological studies of caricatures. Experiments, including a novel two-level user study, show that our system can generate high-quality 3D caricatures directly from normal face photos.
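Step (1) amounts to ordinary PCA on meshes that share a common topology; a minimal sketch, with invented dimensions and component counts, is given below to show how such a shape space and the projection used in step (2) might look.

    import numpy as np

    def fit_pca_shape_space(meshes, n_components=50):
        # meshes: (M, V, 3) caricature meshes with shared connectivity.
        X = meshes.reshape(len(meshes), -1)
        mean = X.mean(axis=0)
        _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
        return mean, Vt[:n_components]                     # mean shape and principal directions

    def project_to_shape_space(mesh, mean, basis):
        coeff = basis @ (mesh.reshape(-1) - mean)          # PCA coefficients of a new head
        recon = mean + basis.T @ coeff                     # its reconstruction in the caricature space
        return coeff, recon.reshape(-1, 3)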

11.
IEEE Trans Pattern Anal Mach Intell; 45(1): 905-918, 2023 Jan.
Article in English | MEDLINE | ID: mdl-35104210

ABSTRACT

Face portrait line drawing is a unique style of art that is highly abstract and expressive. However, due to its high semantic constraints, many existing methods learn to generate portrait drawings using paired training data, which are costly and time-consuming to obtain. In this paper, we propose a novel method to automatically transform face photos into portrait drawings using unpaired training data, with two new features: our method can (1) learn to generate high-quality portrait drawings in multiple styles using a single network and (2) generate portrait drawings in a "new style" unseen in the training data. To achieve these benefits, we (1) propose a novel quality metric for portrait drawings that is learned from human perception, and (2) introduce a quality loss to guide the network toward generating better-looking portrait drawings. We observe that existing unpaired translation methods, such as CycleGAN, tend to embed invisible reconstruction information indiscriminately in the whole drawing due to the significant information imbalance between the photo and portrait drawing domains, which leads to important facial features being lost. To address this problem, we propose a novel asymmetric cycle mapping that enforces the reconstruction information to be visible and embedded only in the selected facial regions. Along with localized discriminators for important facial regions, our method preserves all important facial features well in the generated drawings. Generator dissection further shows that our model learns to incorporate face semantic information during drawing generation. Extensive experiments, including a user study, show that our model outperforms state-of-the-art methods.
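One way to read "reconstruction information ... only embedded in the selected facial regions" is a reconstruction term restricted by a facial-region mask, sketched below with PyTorch. The mask source and the L1 form of the penalty are assumptions, not the paper's exact loss.

    import torch

    def masked_cycle_loss(photo, reconstructed, face_mask):
        # photo, reconstructed: (B, 3, H, W); face_mask: (B, 1, H, W), 1 inside selected
        # facial regions (e.g., eyes, nose, mouth) and 0 elsewhere (hypothetical input).
        diff = (photo - reconstructed).abs() * face_mask   # penalise errors only inside the mask
        return diff.sum() / face_mask.sum().clamp(min=1.0)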

12.
IEEE Trans Pattern Anal Mach Intell; 45(2): 2009-2023, 2023 Feb.
Article in English | MEDLINE | ID: mdl-35471870

ABSTRACT

Recent works have achieved remarkable performance in action recognition with human skeletal data by utilizing graph convolutional models. Existing models mainly focus on developing graph convolutional operations to encode the structural properties of a skeletal graph, whose topology is manually predefined and fixed over all action samples. Some recent works further take sample-dependent relationships among joints into consideration. However, the complex relationships between arbitrary pairwise joints are difficult to learn, and the temporal features between frames are not fully exploited by simply using traditional convolutions with small local kernels. In this paper, we propose a motif-based graph convolution method, which makes use of sample-dependent latent relations among non-physically connected joints to impose high-order locality and assigns different semantic roles to the physical neighbors of a joint to encode hierarchical structures. Furthermore, we propose a sparsity-promoting loss function to learn a sparse motif adjacency matrix for latent dependencies in non-physical connections. For extracting effective temporal information, we propose an efficient local temporal block. It adopts partial dense connections to reuse temporal features in local time windows and enriches the information flow through gradient combination. In addition, we introduce a non-local temporal block to capture global dependencies among frames. Our model can capture local and non-local relationships both spatially and temporally by integrating the local and non-local temporal blocks into the sparse motif-based graph convolutional networks (SMotif-GCNs). Comprehensive experiments on four large-scale datasets show that our model outperforms the state-of-the-art methods. Our code is publicly available at https://github.com/wenyh1616/SAMotif-GCN.
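The sparsity-promoting loss on the motif adjacency matrix could take several forms; the sketch below uses the textbook L1 penalty on a learnable adjacency, purely as an illustrative stand-in for whatever exact regularizer the paper uses.

    import torch

    def sparse_adjacency_penalty(adj_logits, weight=1e-4):
        # adj_logits: learnable (J, J) logits for latent links between non-physically connected joints.
        adj = torch.sigmoid(adj_logits)
        return weight * adj.sum()        # entries are non-negative, so the sum equals the L1 norm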

13.
IEEE Trans Vis Comput Graph; 29(12): 4964-4977, 2023 Dec.
Article in English | MEDLINE | ID: mdl-35925853

ABSTRACT

Point cloud upsampling aims to generate dense point clouds from given sparse ones, which is a challenging task due to the irregular and unordered nature of point sets. To address this issue, we present a novel deep learning-based model, called PU-Flow, which incorporates normalizing flows and weight prediction techniques to produce dense points uniformly distributed on the underlying surface. Specifically, we exploit the invertibility of normalizing flows to transform points between Euclidean and latent spaces and formulate the upsampling process as an ensemble of neighbouring points in latent space, where the ensemble weights are adaptively learned from the local geometric context. Extensive experiments show that our method is competitive and, in most test cases, outperforms state-of-the-art methods in terms of reconstruction quality, proximity-to-surface accuracy, and computational efficiency. The source code will be publicly available at https://github.com/unknownue/puflow.
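The phrase "upsampling as an ensemble of neighbouring points in latent space" suggests that each new latent point is a learned convex combination of its neighbours' latent codes; the sketch below shows only that combination step, with the flow and the weight-predicting network omitted and all shapes assumed.

    import numpy as np

    def interpolate_latents(latent_neighbors, weight_logits):
        # latent_neighbors: (M, k, D) latent codes of the k neighbours of each new point.
        # weight_logits:    (M, k) unnormalised ensemble weights (assumed to come from a network).
        w = np.exp(weight_logits - weight_logits.max(axis=1, keepdims=True))
        w /= w.sum(axis=1, keepdims=True)                        # softmax per new point
        return (w[..., None] * latent_neighbors).sum(axis=1)     # (M, D) new latent points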

14.
IEEE Trans Vis Comput Graph; 29(12): 5250-5264, 2023 Dec.
Article in English | MEDLINE | ID: mdl-36103450

ABSTRACT

Simulating liquid-textile interaction has received great attention in computer graphics recently. Most existing methods model textiles as particles or parameterized meshes. Although these methods can generate visually pleasing results, they cannot simulate water content at a microscopic level due to the lack of geometric modeling of the textile's anisotropic structure. In this paper, we develop a method for yarn-level simulation of the hygroscopicity of textiles and evaluate it using various quantitative metrics. We model textiles in a fiber-yarn-fabric multi-scale manner and consider the dynamic, coupled physical mechanisms of liquid spreading, including wetting, wicking, moisture sorption/desorption, and transient moisture-heat transfer in textiles. Our method can accurately simulate liquid spreading on textiles with different fiber materials and geometrical structures, taking air temperature and humidity conditions into consideration. It visualizes the hygroscopicity of textiles to demonstrate their moisture management ability. We conduct qualitative and quantitative experiments to validate our method and explore various factors to analyze their influence on liquid spreading and the hygroscopicity of textiles.

15.
IEEE Trans Image Process; 31: 3737-3751, 2022.
Article in English | MEDLINE | ID: mdl-35594232

ABSTRACT

Sketch-based image retrieval (SBIR) is a long-standing research topic in computer vision. Existing methods mainly focus on category-level or instance-level image retrieval. This paper investigates the fine-grained scene-level SBIR problem, where a free-hand sketch depicting a scene is used to retrieve desired images. This problem is useful yet challenging mainly because of two entangled facts: 1) achieving an effective representation of the input query data and scene-level images is difficult, as it requires modeling information across multiple modalities such as object layout, relative size, and visual appearance, and 2) there is a great domain gap between the query sketch input and the target images. We present SceneSketcher-v2, a Graph Convolutional Network (GCN) based architecture to address these challenges. SceneSketcher-v2 employs a carefully designed graph convolution network to fuse the multi-modality information in the query sketch and target images and uses a triplet training process in an end-to-end training manner to alleviate the domain gap. Extensive experiments demonstrate that SceneSketcher-v2 outperforms state-of-the-art scene-level SBIR models by a significant margin.
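The "triplet training process" presumably pairs each sketch with a matching and a non-matching image; a generic triplet objective of that kind is sketched below, with the margin, the cosine distance, and the embedding sources all being assumptions rather than details from the paper.

    import torch
    import torch.nn.functional as F

    def sketch_image_triplet_loss(sketch_emb, pos_img_emb, neg_img_emb, margin=0.2):
        # All inputs: (B, D) embeddings from hypothetical sketch/image encoders.
        d_pos = 1.0 - F.cosine_similarity(sketch_emb, pos_img_emb)
        d_neg = 1.0 - F.cosine_similarity(sketch_emb, neg_img_emb)
        return F.relu(d_pos - d_neg + margin).mean()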

16.
Bioact Mater; 18: 138-150, 2022 Dec.
Article in English | MEDLINE | ID: mdl-35387155

ABSTRACT

Despite the recent advances in artificial tissue and organ engineering, generating large, viable, and functional complex organs remains a grand challenge for regenerative medicine. Three-dimensional bioprinting has demonstrated its advantages as one of the major methods for fabricating simple tissues, yet it still faces difficulties in generating vasculature and preserving cell functions in complex organ production. Here, we overcome the limitations of conventional bioprinting systems by converting a six degree-of-freedom robotic arm into a bioprinter, which enables cell printing on complex-shaped 3D vascular scaffolds from all directions. We also developed an oil bath-based cell printing method to better preserve natural cell functions after printing. Together with a self-designed bioreactor and a repeated print-and-culture strategy, our bioprinting system is capable of generating vascularized, contractile, and long-term surviving cardiac tissues. Such a bioprinting strategy mimics the in vivo organ development process and presents a promising solution for the in vitro fabrication of complex organs.

17.
IEEE Trans Pattern Anal Mach Intell; 44(9): 5826-5846, 2022 Sep.
Article in English | MEDLINE | ID: mdl-33739920

ABSTRACT

Unlike conventional facial expressions, micro-expressions are involuntary and transient facial expressions capable of revealing the genuine emotions that people attempt to hide. Therefore, they can provide important information in a broad range of applications such as lie detection and criminal detection. Since micro-expressions are transient and of low intensity, however, their detection and recognition are difficult and rely heavily on expert experience. Due to its intrinsic particularity and complexity, video-based micro-expression analysis is attractive but challenging, and has recently become an active area of research. Although there have been numerous developments in this area, thus far there has been no comprehensive survey that provides researchers with a systematic overview of these developments together with a unified evaluation. Accordingly, in this survey paper, we first highlight the key differences between macro- and micro-expressions, then use these differences to guide our research survey of video-based micro-expression analysis in a cascaded structure, encompassing the neuropsychological basis, datasets, features, spotting algorithms, recognition algorithms, applications, and evaluation of state-of-the-art approaches. For each aspect, the basic techniques, advanced developments, and major challenges are addressed and discussed. Furthermore, after considering the limitations of existing micro-expression datasets, we present and release a new dataset - called the micro-and-macro expression warehouse (MMEW) - containing more video samples and more labeled emotion types. We then perform a unified comparison of representative methods on CAS(ME)² for spotting, and on MMEW and SAMM for recognition. Finally, some potential future research directions are explored and outlined.


Subjects
Algorithms, Facial Expression, Emotions, Humans
18.
IEEE Trans Vis Comput Graph; 28(10): 3376-3390, 2022 Oct.
Article in English | MEDLINE | ID: mdl-33750692

ABSTRACT

Cartoons are a common art form in our daily life, and the automatic generation of cartoon images from photos is highly desirable. However, state-of-the-art single-style methods can only generate one style of cartoon images from photos, and existing multi-style image style transfer methods still struggle to produce high-quality cartoon images due to their highly simplified and abstract nature. In this article, we propose a novel multi-style generative adversarial network (GAN) architecture, called MS-CartoonGAN, which can transform photos into multiple cartoon styles. MS-CartoonGAN uses only unpaired photos and cartoon images of multiple styles for training. To achieve this, we propose to use (1) a hierarchical semantic loss with sparse regularization to retain semantic content and recover flat shading at different abstraction levels, (2) a new edge-promoting adversarial loss for producing fine edges, and (3) a style loss to enhance the difference between output cartoon styles and make the training process more stable. We also develop a multi-domain architecture, where the generator consists of a shared encoder and multiple decoders for different cartoon styles, along with multiple discriminators for individual styles. By observing that cartoon images drawn by different artists have their own unique styles while sharing some common characteristics, our shared network architecture exploits the common characteristics of cartoon styles, achieving better cartoonization and being more efficient than single-style cartoonization. We show that our multi-domain architecture is theoretically guaranteed to output the desired multiple cartoon styles. Through extensive experiments, including a user study, we demonstrate the superiority of the proposed method, which outperforms state-of-the-art single-style and multi-style image style transfer methods.

19.
IEEE Trans Vis Comput Graph; 28(1): 508-517, 2022 Jan.
Article in English | MEDLINE | ID: mdl-34591763

ABSTRACT

What makes speeches effective has long been a subject of debate, and to this day there is broad controversy among public speaking experts about which factors make a speech effective and what roles these factors play. Moreover, there is a lack of quantitative analysis methods to help understand effective speaking strategies. In this paper, we propose E-ffective, a visual analytic system allowing speaking experts and novices to analyze both the role of speech factors and their contribution to effective speeches. Through interviews with domain experts and a review of the existing literature, we identified important factors to consider in inspirational speeches. We extracted these factors from multi-modal data and related them to effectiveness data. Our system supports rapid understanding of critical factors in inspirational speeches, including the influence of emotions, by means of novel visualization methods and interaction. Two novel visualizations are E-spiral (which shows the emotional shifts in speeches in a visually compact way) and E-script (which connects speech content with key speech delivery information). In our evaluation, we studied the influence of our system on experts' domain knowledge about speech factors. We further studied the usability of the system with speaking novices and experts in assisting the analysis of inspirational speech effectiveness.


Subjects
Computer Graphics, Speech, Emotions
20.
IEEE Trans Image Process; 30: 8847-8860, 2021.
Article in English | MEDLINE | ID: mdl-34694996

ABSTRACT

Video over-segmentation into supervoxels is an important pre-processing technique for many computer vision tasks. Videos are an order of magnitude larger than images. Most existing methods for generating supervoxels are either memory- or time-inefficient, which limits their application in subsequent video processing tasks. In this paper, we present an anisotropic supervoxel method, which is memory-efficient and can be executed on the graphics processing unit (GPU). As a result, our algorithm achieves a good balance among segmentation quality, memory usage, and processing time. In order to provide accurate segmentation for moving objects in video, we use optical flow information to design a new non-Euclidean metric to calculate the anisotropic distances between seeds and voxels. To efficiently compute the anisotropic metric, we adapt the classic jump flooding algorithm (which is designed for parallel execution on the GPU) to generate an anisotropic Voronoi tessellation in the combined color and spatio-temporal space. We evaluate our method and representative supervoxel algorithms in terms of segmentation performance, computation speed, and memory efficiency. We also apply the supervoxel results to foreground propagation in videos to test the performance on a practical problem. Experiments show that our algorithm is much faster than existing methods and achieves a good balance between segmentation quality and efficiency.
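As a rough illustration of a flow-aware, non-Euclidean seed-to-voxel distance of the general kind described above, the sketch below advects the seed along its optical flow before measuring the spatial term and combines it with color and temporal terms. The weights and the exact functional form are invented for illustration only.

    import numpy as np

    def anisotropic_distance(seed_xyt, seed_lab, seed_flow, voxel_xyt, voxel_lab,
                             w_color=1.0, w_space=0.5, w_time=0.25):
        # seed_xyt, voxel_xyt: (x, y, t); seed_lab, voxel_lab: CIELAB colors;
        # seed_flow: per-frame optical flow (dx, dy) at the seed (all hypothetical inputs).
        dt = voxel_xyt[2] - seed_xyt[2]
        predicted_xy = seed_xyt[:2] + seed_flow * dt            # advect the seed along its flow
        d_space = np.linalg.norm(voxel_xyt[:2] - predicted_xy)
        d_color = np.linalg.norm(voxel_lab - seed_lab)
        return w_color * d_color + w_space * d_space + w_time * abs(dt)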
