Results 1 - 19 of 19
1.
IEEE Trans Image Process ; 31: 3111-3124, 2022.
Article in English | MEDLINE | ID: mdl-35380961

ABSTRACT

The success of current deep saliency models heavily depends on large amounts of annotated human fixation data to fit the highly non-linear mapping between the stimuli and visual saliency. Such fully supervised data-driven approaches are annotation-intensive and often fail to consider the underlying mechanisms of visual attention. In contrast, in this paper, we introduce a model based on various cognitive theories of visual saliency, which learns visual attention patterns in a weakly supervised manner. Our approach incorporates insights from cognitive science as differentiable submodules, resulting in a unified, end-to-end trainable framework. Specifically, our model encapsulates the following important components motivated by biological vision. (a) As scene semantics are closely related to visually attentive regions, our model encodes discriminative spatial information for scene understanding through spatial visual semantics embedding. (b) To model the objectness factors in visual attention deployment, we incorporate object-level semantics embedding and object relation information. (c) Considering the "winner-take-all" mechanism in visual stimuli processing, we model the competition mechanism among objects with softmax-based neural attention. (d) Lastly, a conditional center prior is learned to mimic the spatial distribution bias of visual attention. Furthermore, we propose novel loss functions to utilize supervision cues from image-level semantics, saliency prior knowledge, and self-information compression. Experiments show that our method achieves promising results, and even outperforms many of its fully supervised counterparts. Overall, our weakly supervised saliency method takes an essential step towards reducing the annotation budget of current approaches, as well as providing a more comprehensive understanding of the visual attention mechanism. Our code is available at: https://github.com/ashleylqx/WeakFixation.git.


Subject(s)
Data Compression , Semantics , Humans
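
Components (c) and (d) of the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation (that is in the linked repository): the function names, the temperature parameter, and the fixed Gaussian are assumptions made for illustration, and the paper's center prior is learned and conditional rather than fixed.

```python
import math

def softmax_attention(scores, temperature=1.0):
    """Turn raw per-object scores into competition weights that sum to 1."""
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scores)
    exps = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def gaussian_center_prior(height, width, sigma=0.3):
    """Fixed isotropic Gaussian peaked at the image center (hypothetical stand-in)."""
    prior = []
    for y in range(height):
        row = []
        for x in range(width):
            # Offsets of the pixel center from the image center, in [-0.5, 0.5].
            dy = (y + 0.5) / height - 0.5
            dx = (x + 0.5) / width - 0.5
            row.append(math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma)))
        prior.append(row)
    return prior

# Three hypothetical object scores; a lower temperature sharpens the competition.
weights = softmax_attention([2.0, 0.5, 0.1], temperature=0.5)
prior = gaussian_center_prior(5, 5)
```

Lowering the temperature pushes the weights toward a hard winner-take-all choice, while a large temperature spreads attention almost evenly across objects.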
2.
IEEE Trans Vis Comput Graph ; 26(11): 3163-3176, 2020 Nov.
Article in English | MEDLINE | ID: mdl-31217120

ABSTRACT

Video stabilization usually comprises three stages: feature trajectory extraction, trajectory smoothing, and frame warping. Most previous approaches treat them as three separate stages. This paper proposes a method that combines the last two stages, trajectory smoothing and frame warping, into a single optimization framework. The novelty lies in how we combine them: the trajectory smoothing part plays the major role while the frame warping part plays an auxiliary role. With this design, we can conveniently strengthen the trajectory smoothing part with a robust first-order derivative term, which makes it possible to produce very aggressive stabilization effects. In the frame warping part, we adopt adaptive weighting mechanisms so that the warp follows the smoothed trajectories as closely as possible while regularizing other regions to be as similar as possible. Our method can robustly use both foreground and background features, as well as very short trajectories. Using all of this information in turn increases the accuracy of the proposed method. We also provide a simplified implementation of our method, which is less accurate but more efficient. Experiments on various kinds of videos demonstrate the effectiveness of our method.
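
As a rough illustration of trajectory smoothing with a first-order derivative term, the sketch below smooths a single 1D feature trajectory. It is only an assumption-laden toy: it uses a plain quadratic penalty instead of the robust term the abstract mentions, it ignores frame warping entirely, and the names `smooth_trajectory` and `lam` are hypothetical. It minimizes sum (p_t - c_t)^2 + lam * sum (p_{t+1} - p_t)^2, whose normal equations form a tridiagonal system.

```python
def smooth_trajectory(path, lam=10.0):
    """Minimize sum (p_t - c_t)^2 + lam * sum (p_{t+1} - p_t)^2 for a 1D path c."""
    n = len(path)
    if n < 2:
        return list(path)
    # Normal equations: (I + lam * D^T D) p = c, a symmetric tridiagonal system.
    diag = [1.0 + lam * (1.0 if i in (0, n - 1) else 2.0) for i in range(n)]
    off = -lam
    # Forward elimination (Thomas algorithm).
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = off / diag[0]
    d_prime[0] = path[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - off * c_prime[i - 1]
        c_prime[i] = off / denom
        d_prime[i] = (path[i] - off * d_prime[i - 1]) / denom
    # Back substitution.
    smooth = [0.0] * n
    smooth[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        smooth[i] = d_prime[i] - c_prime[i] * smooth[i + 1]
    return smooth

def roughness(path):
    """Sum of squared first-order differences."""
    return sum((b - a) ** 2 for a, b in zip(path, path[1:]))

shaky = [0.0, 2.0, 1.0, 3.0, 2.0, 4.0, 3.0, 5.0]
steady = smooth_trajectory(shaky, lam=10.0)
```

Raising `lam` gives a more aggressive stabilization at the cost of drifting further from the original camera path.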

3.
Article in English | MEDLINE | ID: mdl-31494548

ABSTRACT

We present a method for synopsizing multiple videos captured by a set of surveillance cameras with partially overlapping fields of view. Object-based approaches that directly shift objects along the time axis can already compute compact synopsis results for multiple surveillance videos. The challenge is how to present the multiple synopsis results in a more compact and understandable way. Previous approaches show them side by side on the screen, which is difficult for users to comprehend. In this paper, we solve the problem by joint object shifting and camera view switching. First, we synchronize the input videos and group together the instances of the same object in different videos. Then we shift the groups of objects along the time axis to obtain multiple synopsis videos. Instead of showing them simultaneously, we show only one of them at a time and allow switching among the views of the different synopsis videos. With this view switching, we obtain a single synopsis result consisting of content from all the input videos, which is much easier for users to follow and understand. To obtain the best synopsis result, we construct a simultaneous object-shifting and view-switching optimization framework instead of solving the two problems separately. We also present an alternative optimization strategy composed of graph cuts and dynamic programming to solve the unified optimization. Experiments demonstrate that our single synopsis video generated from multiple input videos is compact, complete, and easy to understand.
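
The dynamic-programming side of view switching can be sketched as a Viterbi-style recursion. This is only an illustration under assumptions: the per-frame view scores, the constant switching penalty, and the function name are hypothetical, and the graph-cut part of the paper's optimization is not modeled.

```python
def best_view_sequence(scores, switch_penalty):
    """Pick one view per frame, maximizing total score minus switching penalties.

    scores[t][v] is a hypothetical benefit of showing view v at frame t.
    """
    n_views = len(scores[0])
    best = list(scores[0])  # best[v]: best total for sequences ending in view v
    back = []               # back[t][v]: predecessor view chosen for v
    for frame in scores[1:]:
        new_best, pointers = [], []
        for v in range(n_views):
            # Staying on v is free; arriving from another view costs the penalty.
            prev = max(
                range(n_views),
                key=lambda u: best[u] - (switch_penalty if u != v else 0.0),
            )
            cost = switch_penalty if prev != v else 0.0
            new_best.append(best[prev] - cost + frame[v])
            pointers.append(prev)
        best = new_best
        back.append(pointers)
    # Backtrack from the best final view.
    view = max(range(n_views), key=lambda v: best[v])
    total = best[view]
    path = [view]
    for pointers in reversed(back):
        view = pointers[view]
        path.append(view)
    path.reverse()
    return path, total

scores = [[5, 1], [5, 1], [1, 5], [1, 5]]
path, total = best_view_sequence(scores, switch_penalty=2)
```

With a small penalty the sequence follows the better view and switches once; with a large penalty it stays on a single view for the whole clip.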

4.
Article in English | MEDLINE | ID: mdl-31562093

ABSTRACT

This paper presents a new surveillance video synopsis method which performs much better than previous approaches in terms of both compression ratio and artifacts. Previously, a surveillance video was usually compressed by shifting the moving objects of that video forward along the time axis, which inevitably yielded serious collision and chronological disorder artifacts between the shifted objects. The main observation of this paper is that these artifacts can be alleviated by changing the speed or size of the objects, since with varied speed and size the objects can move more flexibly to avoid collision points or to keep chronological relationships. Based on this observation, we propose a video synopsis method that performs object shifting, speed changing, and size scaling simultaneously. We show how to integrate the three heterogeneous operations into a single optimization framework and achieve high-quality synopsis results. Unlike previous approaches that usually use alternative optimization strategies to solve synopsis optimizations, we develop a Metropolis sampling algorithm to find the solution for our three-variable optimization problem. A variety of experiments demonstrate the effectiveness of our method.
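
To give a feel for Metropolis sampling on a synopsis-style energy, here is a heavily simplified sketch. It samples only temporal shifts (no speed or size changes), treats any temporal overlap between two objects as a collision, and uses a made-up energy with hypothetical weights; none of these details come from the paper.

```python
import math
import random

def energy(starts, durations, length_weight=1.0, collision_weight=5.0):
    """Toy energy: synopsis length plus a penalty for pairwise temporal overlap."""
    ends = [s + d for s, d in zip(starts, durations)]
    overlap = 0
    for i in range(len(starts)):
        for j in range(i + 1, len(starts)):
            overlap += max(0, min(ends[i], ends[j]) - max(starts[i], starts[j]))
    return length_weight * max(ends) + collision_weight * overlap

def metropolis_synopsis(durations, horizon, steps=5000, temperature=1.0, seed=0):
    """Sample start times with Metropolis acceptance and keep the best state seen."""
    rng = random.Random(seed)
    state = [0] * len(durations)
    current = energy(state, durations)
    best_state, best_energy = list(state), current
    for _ in range(steps):
        # Symmetric proposal: redraw the start time of one random object.
        k = rng.randrange(len(durations))
        proposal = list(state)
        proposal[k] = rng.randint(0, horizon - durations[k])
        candidate = energy(proposal, durations)
        # Always accept improvements; accept worse states with Boltzmann probability.
        accept = candidate <= current or rng.random() < math.exp(
            (current - candidate) / temperature
        )
        if accept:
            state, current = proposal, candidate
            if current < best_energy:
                best_state, best_energy = list(state), current
    return best_state, best_energy

durations = [3, 2, 4]
best_state, best_energy = metropolis_synopsis(durations, horizon=12)
```

Because worse states are occasionally accepted, the sampler can escape local minima that a purely greedy shift search would get stuck in.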

5.
Article in English | MEDLINE | ID: mdl-31449021

ABSTRACT

This paper proposes a novel residual attentive learning network architecture for predicting dynamic eye-fixation maps. The proposed model addresses two essential issues, i.e., effective spatiotemporal feature integration and multi-scale saliency learning. For the first issue, appearance and motion streams are tightly coupled via dense residual cross connections, which integrate appearance information with multi-layer, comprehensive motion features in a residual and dense way. Unlike traditional two-stream models that learn appearance and motion features separately, this design allows early, multi-path information exchange between the two domains, leading to a unified and powerful spatiotemporal learning architecture. For the second issue, we propose a composite attention mechanism that learns multi-scale local attentions and global attention priors end-to-end. It enhances the fused spatiotemporal features by emphasizing important features at multiple scales. A lightweight convolutional Gated Recurrent Unit (convGRU), which is well suited to settings with little training data, is used to model long-term temporal characteristics. Extensive experiments on four benchmark datasets clearly demonstrate the advantage of the proposed video saliency model over its competitors and the effectiveness of each component of our network. Our code and all the results will be available at https://github.com/ashleylqx/STRA-Net.

6.
IEEE Trans Cybern ; 49(1): 159-170, 2019 Jan.
Article in English | MEDLINE | ID: mdl-29990074

ABSTRACT

Currently, the most widely used methods for generating point trajectories estimate the trajectories from dense optical flow, using a consistency check strategy to detect occluded regions. However, these methods miss some important trajectories, which breaks up smooth areas without any structure, especially around motion boundaries (MBs). We suggest exploring MBs in video to generate more accurate dense point trajectories. Estimating MBs from the video improves point trajectory accuracy in discontinuous or occluded areas. Then, we obtain trajectories by tracking the initial feature points through all frames. The experimental results demonstrate that our method outperforms the state-of-the-art methods on the challenging benchmark.

7.
IEEE Trans Cybern ; 2018 Feb 27.
Article in English | MEDLINE | ID: mdl-29994594

ABSTRACT

In this paper, we propose a new multiobject visual tracking algorithm based on submodular optimization. The proposed algorithm is composed of two main stages. In the first stage, a new tracklet selection strategy is proposed to cope with the occlusion problem. We generate low-level tracklets using overlap criteria and min-cost flow, respectively, and then integrate them into a candidate tracklet set. In the second stage, we formulate the multiobject tracking problem as a submodular maximization problem subject to related constraints. The submodular function selects the correct tracklets from the candidate set to form the object trajectory. Then, we design a connecting process which connects the corresponding trajectories to overcome the occlusion problem. Experimental results demonstrate the effectiveness of our tracking algorithm. Our source code is available at https://github.com/shenjianbing/submodulartrack.
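
Submodular maximization is commonly approached with a greedy rule that repeatedly adds the element with the largest marginal gain. The sketch below shows that rule on a coverage objective with a simple cardinality budget. The objective, the budget constraint, and the tracklet and detection names are assumptions for illustration; the abstract does not specify the paper's actual function or constraints.

```python
def greedy_select(candidates, budget):
    """Greedily pick up to `budget` candidates, maximizing covered detections.

    candidates maps a hypothetical tracklet name to the set of detection ids
    it explains. Coverage is a standard monotone submodular function.
    """
    chosen, covered = [], set()
    for _ in range(budget):
        best_name, best_gain = None, 0
        for name, items in candidates.items():
            if name in chosen:
                continue
            # Marginal gain: detections this tracklet adds beyond what is covered.
            gain = len(items - covered)
            if gain > best_gain:
                best_name, best_gain = name, gain
        if best_name is None:
            break  # no remaining candidate adds anything new
        chosen.append(best_name)
        covered |= candidates[best_name]
    return chosen, covered

candidates = {
    "a": {1, 2, 3, 4},
    "b": {3, 4, 5},
    "c": {5, 6},
    "d": {1, 2},
}
chosen, covered = greedy_select(candidates, budget=2)
```

After "a" is chosen, "b" adds only detection 5 while "c" adds 5 and 6, so the diminishing-returns property steers the second pick toward "c".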

8.
IEEE Trans Vis Comput Graph ; 24(3): 1260-1273, 2018 03.
Article in English | MEDLINE | ID: mdl-28186900

ABSTRACT

Turbulent vortices in smoke flows are crucial for a visually interesting appearance. Unfortunately, it is challenging to simulate these appealing effects efficiently in the framework of vortex filament methods. The vortex-filaments-in-grids scheme efficiently generates turbulent smoke with macroscopic vortical structures, but it suffers from projection-related dissipation, so small-scale vortical structures below the grid resolution are hard to capture. In addition, this scheme cannot be applied to wall-bounded turbulent smoke simulation, which requires efficiently handling smoke-obstacle interaction and creating vorticity at the obstacle boundary. To tackle these issues, we propose an effective filament-mesh particle-particle (FMPP) method for fast wall-bounded turbulent smoke simulation with ample details. The Filament-Mesh component approximates the smooth long-range interactions by splatting vortex filaments onto the grid, solving the Poisson problem with a fast solver, and then interpolating back to the smoke particles. The Particle-Particle component introduces a smoothed particle hydrodynamics (SPH) turbulence model for particles in the same grid cell, where interactions between particles cannot be properly captured at the grid resolution. Then, we sample the surface of obstacles with boundary particles, allowing the interaction between smoke and obstacles to be treated as pressure forces in SPH. In addition, a vortex formation region is defined behind obstacles, which gives smoke particles flowing past the separation particles a vorticity force to simulate the subsequent vortex shedding phenomenon. The proposed approach can synthesize the lost small-scale vortical structures and also achieve smoke-obstacle interaction with vortex shedding at obstacle boundaries in a lightweight manner. The experimental results demonstrate that our FMPP method achieves more appealing visual effects than the vortex-filaments-in-grids scheme by efficiently simulating more vivid thin turbulent features.

9.
IEEE Trans Image Process ; 27(1): 164-178, 2018.
Article in English | MEDLINE | ID: mdl-28792900

ABSTRACT

Stitching videos captured by hand-held mobile cameras can substantially enhance the entertainment experience of ordinary users. However, such videos usually contain heavy shakiness and large parallax, which make them challenging to stitch. In this paper, we propose a novel approach to video stitching and stabilization for videos captured by mobile devices. The main component of our method is a unified video stitching and stabilization optimization that computes stitching and stabilization simultaneously rather than individually. In this way, we can obtain the best stitching and stabilization results relative to each other without bias toward either of them. To make the optimization robust, we propose a method to identify the background of each input video as well as their common background. This allows us to apply our optimization to background regions only, which is the key to handling the large-parallax problem. Since stitching relies on feature matches between input videos, and false matches inevitably exist, we propose a method to distinguish between correct and false matches, and we encapsulate the false match elimination scheme and our optimization in a loop to prevent the optimization from being affected by bad feature matches. We test the proposed approach on videos casually captured by smartphones while walking along busy streets, and we use stitching and stability scores to evaluate the produced panoramic videos quantitatively. Experiments on a diverse set of examples show that our results are much better than (in challenging cases) or at least on par with (in simple cases) the results of previous approaches.

10.
IEEE Trans Vis Comput Graph ; 23(10): 2328-2341, 2017 10.
Article in English | MEDLINE | ID: mdl-27775524

ABSTRACT

Wide-baseline street image interpolation is useful but very challenging. Existing approaches rely on either heavyweight 3D reconstruction or computationally intensive deep networks. We present a lightweight and efficient method which uses simple homography computing and refining operators to estimate piecewise smooth homographies between input views. To achieve this goal, we show how to combine homography fitting and homography propagation based on discrimination between reliable and unreliable superpixels. Such a combination, rather than homography fitting alone, dramatically increases the accuracy and robustness of the estimated homographies. Then, we integrate the concepts of homography and mesh warping, and propose a novel homography-constrained warping formulation which enforces smoothness between neighboring homographies by utilizing the first-order continuity of the warped mesh. This further eliminates small artifacts such as overlapping and stretching. The proposed method is lightweight and flexible, and it allows wide-baseline interpolation. It improves the state of the art and demonstrates that homography computation suffices for interpolation. Experiments on city and rural datasets validate the efficiency and effectiveness of our method.

11.
IEEE Trans Vis Comput Graph ; 20(9): 1303-15, 2014 Sep.
Article in English | MEDLINE | ID: mdl-26357379

ABSTRACT

Video synopsis aims at removing the less important information in a video while preserving its key content for fast browsing, retrieval, or efficient storage. Previous video synopsis methods, including frame-based and object-based approaches that remove valueless whole frames or combine objects from time shots, cannot handle videos whose redundancy lies in the movements of a video object. In this paper, we present a novel part-based object movement synopsis method, which can effectively compress the redundant information of a moving video object and represent the synopsized object seamlessly. Our method works by part-based assembling and stitching. The object movement sequence is first divided into several part movement sequences. Then, we optimally assemble moving parts from different part sequences to produce an initial synopsis result. The optimal assembling is formulated as a part movement assignment problem on a Markov Random Field (MRF), which guarantees that the most important moving parts are selected while preserving both the spatial compatibility between assembled parts and the chronological order of parts. Finally, we present a non-linear spatiotemporal optimization formulation to stitch the assembled parts seamlessly and obtain the final compact video object synopsis. Experiments on a variety of input video objects demonstrate the effectiveness of the presented synopsis method.

12.
IEEE Trans Vis Comput Graph ; 19(10): 1664-76, 2013 Oct.
Article in English | MEDLINE | ID: mdl-23929846

ABSTRACT

Video synopsis aims at providing condensed representations of video data sets, which can now be easily captured with digital cameras, especially daily surveillance videos. Previous work in video synopsis usually moves active objects along the time axis, which inevitably causes collisions among the moving objects when the video is compressed heavily. In this paper, we propose a novel approach for compact video synopsis using a unified spatiotemporal optimization. Our approach globally shifts moving objects in both the spatial and temporal domains, shifting objects temporally to reduce the length of the video and shifting colliding objects spatially to avoid visible collision artifacts. Furthermore, using a multilevel patch relocation (MPR) method, the moving space of the original video is expanded into a compact background based on environmental content to fit the shifted objects. The shifted objects are finally composited with the expanded moving space to obtain a high-quality video synopsis, which is more condensed while remaining free of collision artifacts. Our experimental results show that the compact video synopsis we produce can be browsed quickly, preserves relative spatiotemporal relationships, and avoids motion collisions.

13.
IEEE Trans Vis Comput Graph ; 14(2): 426-39, 2008.
Article in English | MEDLINE | ID: mdl-18192720

ABSTRACT

This paper presents an approach for replacing the textures of specified regions in an input image or video using stretch-based mesh optimization. The retexturing results exhibit distortion and shading effects that conform to the underlying geometry and lighting conditions. For replacing textures in a single image, two important steps are developed: a stretch-based mesh parametrization incorporating the recovered normal information is derived to imitate the perspective distortion of the region of interest, and a Poisson-based refinement process is used to account for texture distortion at fine scale. The luminance of the input image is preserved through color transfer in YCbCr color space. Our approach is independent of the replaced textures. Once the input image is processed, any new texture can be applied to efficiently generate the retexturing results. For video retexturing, we propose key-frame-based texture replacement, extended and generalized from the image retexturing. Our approach repeatedly propagates the replacement result of the key frame to the rest of the frames. We develop a local motion optimization scheme to deal with the inaccuracies and errors of robust optical flow when tracking moving objects. Visibility shifting and texture drifting are effectively alleviated using a graph-cut segmentation algorithm and a global optimization that smooths the trajectories of the tracked points over the temporal domain. Our experimental results show that the proposed approach can generate visually pleasing results for both images and video.

14.
IEEE Trans Vis Comput Graph ; 14(1): 73-83, 2008.
Article in English | MEDLINE | ID: mdl-17993703

ABSTRACT

This paper presents a layer-based representation of polyhedrons and its use for point-in-polyhedron tests. In the representation, the facets and edges of a polyhedron are sequentially arranged, so the binary search algorithm can be used efficiently to speed up inclusion tests. In comparison with conventional representations of polyhedrons, the layer-based representation we propose greatly reduces the storage requirement because it represents much information implicitly, though it still has a storage complexity of O(n). It is simple to implement, and robust for inclusion tests because many singularities are eliminated when constructing the layer-based representation. By incorporating an octree structure for organizing polyhedrons, our approach can run at a speed comparable with Binary Space Partitioning (BSP)-based inclusion tests, and at the same time greatly reduce storage and preprocessing time when treating large polyhedrons. We have developed an efficient solution for point-in-polyhedron tests with a time complexity varying between O(n) and O(log n), depending on the polyhedron shape and the constructed representation, and less than O(log^3 n) in most cases. The time complexity of preprocessing is between O(n) and O(n^2), varying with the polyhedron, where n is the number of edges of a polyhedron.


Subject(s)
Algorithms , Computer Graphics , Image Interpretation, Computer-Assisted/methods , Imaging, Three-Dimensional/methods , Pattern Recognition, Automated/methods , Signal Processing, Computer-Assisted , Image Enhancement/methods , Information Storage and Retrieval/methods , Numerical Analysis, Computer-Assisted , Reproducibility of Results , Sensitivity and Specificity
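
The idea of arranging boundary elements sequentially so that inclusion tests reduce to binary searches can be illustrated in 2D with the classical slab method for point-in-polygon tests. This is a loose analog chosen for brevity, not the layer-based polyhedron representation of the paper, and all function names are hypothetical. Points lying exactly on the boundary are classified arbitrarily.

```python
from bisect import bisect_right

def x_at(edge, y):
    """x coordinate where a non-horizontal edge meets the horizontal line at y."""
    (x1, y1), (x2, y2) = edge
    return x1 + (x2 - x1) * (y - y1) / (y2 - y1)

def build_slabs(polygon):
    """Cut a simple polygon into horizontal slabs between consecutive vertex heights."""
    ys = sorted({y for _, y in polygon})
    edges = list(zip(polygon, polygon[1:] + polygon[:1]))
    slabs = []
    for low, high in zip(ys, ys[1:]):
        mid_height = (low + high) / 2.0
        # Keep the non-horizontal edges that span the whole slab.
        crossing = [
            e for e in edges
            if e[0][1] != e[1][1]
            and min(e[0][1], e[1][1]) <= low
            and max(e[0][1], e[1][1]) >= high
        ]
        # Edges of a simple polygon do not cross inside a slab, so this order is stable.
        crossing.sort(key=lambda e: x_at(e, mid_height))
        slabs.append(crossing)
    return ys, slabs

def contains(ys, slabs, point):
    """Two binary searches: locate the slab, then count the edges left of the point."""
    x, y = point
    if y < ys[0] or y >= ys[-1]:
        return False
    slab = slabs[bisect_right(ys, y) - 1]
    lo, hi = 0, len(slab)
    while lo < hi:
        mid = (lo + hi) // 2
        if x_at(slab[mid], y) < x:
            lo = mid + 1
        else:
            hi = mid
    # An odd number of edges to the left means the point is inside.
    return lo % 2 == 1

# U-shaped polygon with a notch between x = 2 and x = 4 above y = 2.
u_shape = [(0, 0), (6, 0), (6, 4), (4, 4), (4, 2), (2, 2), (2, 4), (0, 4)]
ys, slabs = build_slabs(u_shape)
```

Each query costs one binary search over the slab boundaries and one over the edges of the selected slab, while the preprocessing cost grows with how many slabs each edge spans.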
15.
IEEE Trans Vis Comput Graph ; 13(5): 914-24, 2007.
Article in English | MEDLINE | ID: mdl-17622676

ABSTRACT

A new efficient biorthogonal wavelet analysis based on the √3 subdivision is proposed in this paper by using the lifting scheme. Since the √3 subdivision has the slowest topological refinement among the traditional triangular subdivisions, the multiresolution analysis based on the √3 subdivision is more balanced than the existing wavelet analyses on triangular meshes, and accordingly offers more levels of detail for processing polygonal models. In order to optimize the multiresolution analysis process, the new wavelets, whether they are interior or on boundaries, are orthogonalized with the local scaling functions based on a discrete inner product with subdivision masks. Because the wavelet analysis and synthesis algorithms are composed of a series of local lifting operations, they can be performed in linear time. The experiments demonstrate the efficiency and stability of the wavelet analysis for both closed and open triangular meshes with √3 subdivision connectivity. The √3-subdivision-based biorthogonal wavelets can be used in many applications such as progressive transmission, shape approximation, multiresolution editing and rendering of 3D geometric models.


Subject(s)
Algorithms , Data Compression/methods , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Imaging, Three-Dimensional/methods , Computer Graphics , Reproducibility of Results , Sensitivity and Specificity
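
The lifting scheme mentioned in the abstract above builds a wavelet transform from local predict and update steps, each trivially invertible, which is why analysis and synthesis run in linear time. The sketch below shows this on the simplest possible case, a 1D Haar-like transform of an even-length list. It is only meant to show the mechanics and is an assumption-level illustration; the paper's wavelets live on subdivided triangular meshes and use different masks.

```python
def lifting_forward(signal):
    """One analysis level: split, predict, update (expects an even-length list)."""
    even, odd = signal[0::2], signal[1::2]
    # Predict: each odd sample is predicted by its even neighbor.
    detail = [o - e for e, o in zip(even, odd)]
    # Update: lift the even samples so the coarse signal keeps the pairwise mean.
    coarse = [e + d / 2 for e, d in zip(even, detail)]
    return coarse, detail

def lifting_inverse(coarse, detail):
    """Synthesis: undo the lifting steps in reverse order with flipped signs."""
    even = [c - d / 2 for c, d in zip(coarse, detail)]
    odd = [d + e for e, d in zip(even, detail)]
    signal = []
    for e, o in zip(even, odd):
        signal.extend([e, o])
    return signal

original = [2.0, 4.0, 6.0, 6.0, 1.0, 3.0]
coarse, detail = lifting_forward(original)
restored = lifting_inverse(coarse, detail)
```

Because every step touches only neighboring samples and is undone by the same step with the opposite sign, reconstruction is exact regardless of the chosen predictor.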
16.
Article in English | MEDLINE | ID: mdl-15544244

ABSTRACT

A major requirement for surgical simulation is to allow virtual tissue cutting. This paper presents a scalable and adaptive cutting technique based on a mass-spring mesh. By analogy with digital logic design, an arbitrary incision is modeled systematically by translating the cutting process into a state diagram. Subdivision of mesh elements is driven by the state transitions. Node redistribution, local re-meshing, and deformation are applied to refine the subdivided mesh.


Subject(s)
Computer Simulation , Surgical Procedures, Operative , Hong Kong
17.
Artif Intell Med ; 32(1): 51-69, 2004 Sep.
Article in English | MEDLINE | ID: mdl-15350624

ABSTRACT

Modeling of tissue deformation is of great importance to virtual reality (VR)-based medical simulations. Considerable effort has been dedicated to the development of interactively deformable virtual tissues. In this paper, an efficient and scalable deformable model is presented for VR-based medical applications. It treats deformation as a localized force transmittal process governed by algorithms based on breadth-first search (BFS). The computational speed can be scaled to facilitate real-time interaction by adjusting the penetration depth. Simulated annealing (SA) algorithms are developed to optimize the model parameters using reference data generated with the linear static finite element method (FEM). The mechanical behavior and timing performance of the model have been evaluated. The model has been applied to simulate the typical behavior of living tissues and anisotropic materials. Integration with a haptic device has also been achieved on a generic personal computer (PC) platform. The proposed technique provides a feasible solution for VR-based medical simulations and has the potential for multi-user collaborative work in virtual environments.


Subject(s)
Algorithms , Computer Simulation , Connective Tissue/physiology , Models, Theoretical , User-Computer Interface , Biomechanical Phenomena , Humans , Rheology
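
The BFS-governed force transmittal with an adjustable penetration depth can be pictured with the following sketch. The mesh is a plain adjacency dictionary, and the geometric attenuation per layer is a hypothetical rule chosen for illustration; the abstract does not state how force decays between layers.

```python
from collections import deque

def propagate_force(neighbors, contact, force, max_depth, attenuation=0.5):
    """Spread a contact force layer by layer with BFS, up to max_depth layers.

    neighbors maps each mesh node to its adjacent nodes. The attenuation
    factor per layer is an assumed rule, not taken from the paper.
    """
    depth = {contact: 0}
    received = {contact: force}
    queue = deque([contact])
    while queue:
        node = queue.popleft()
        if depth[node] == max_depth:
            continue  # penetration depth reached: do not expand further
        for nxt in neighbors[node]:
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                received[nxt] = force * attenuation ** depth[nxt]
                queue.append(nxt)
    return received

# A chain of five nodes: 0 - 1 - 2 - 3 - 4.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
forces = propagate_force(chain, contact=0, force=8.0, max_depth=2)
```

Only nodes within `max_depth` layers of the contact point are visited, so lowering the depth trades deformation extent for speed.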
18.
IEEE Trans Inf Technol Biomed ; 7(4): 358-63, 2003 Dec.
Article in English | MEDLINE | ID: mdl-15000361

ABSTRACT

An effective deformable model based on a successive force propagation process is proposed. It avoids the laborious stiffness matrix formulation and is scalable simply by controlling the penetration depth. Mechanical tests are performed to evaluate its feasibility for modeling real tissues. An interactive system is developed using a commercial haptic device.


Subject(s)
Computer Simulation , Connective Tissue/physiology , Education, Medical/methods , Models, Biological , Palpation/methods , Physical Stimulation/methods , Teaching/methods , Touch/physiology , Environment , Feedback , Humans , Liver/physiology , Models, Anatomic , Stress, Mechanical , Stress, Physiological , User-Computer Interface
19.
Stud Health Technol Inform ; 94: 187-9, 2003.
Article in English | MEDLINE | ID: mdl-15455889

ABSTRACT

Chinese acupuncture is a traditional medical treatment with a long history in China. Recent evidence shows that this treatment is effective. However, acupuncture students can only practice on either real patients or mannequins. In this project, we propose a virtual reality training system for acupuncture. The system provides not only a 3D stereo display but also realistic haptic feedback in real time. Since acupuncture usually involves thrust-and-lift needle motion, we also propose a bi-directional haptic model tailored to this application. Our results show that the novel haptic model conforms with practitioners' tactile experience.


Subject(s)
Acupuncture , Computer Simulation , User-Computer Interface , Humans