Results 1 - 17 of 17
1.
Article in English | MEDLINE | ID: mdl-29994117

ABSTRACT

The ability to train on a large dataset of labeled samples is critical to the success of deep learning in many domains. In this paper, we focus on motor vehicle classification and localization from a single video frame and introduce the "MIOvision Traffic Camera Dataset" (MIO-TCD) in this context. MIO-TCD is the largest dataset for motorized traffic analysis to date. It includes 11 traffic object classes, such as cars, trucks, buses, motorcycles, bicycles, and pedestrians. It contains 786,702 annotated images acquired at different times of the day and different periods of the year by hundreds of traffic surveillance cameras deployed across Canada and the United States. The dataset consists of two parts: a "localization dataset", containing 137,743 full video frames with bounding boxes around traffic objects, and a "classification dataset", containing 648,959 crops of traffic objects from the 11 classes. We also report results from the 2017 CVPR MIO-TCD Challenge, which leveraged this dataset, and compare them with results for state-of-the-art deep learning architectures. These results demonstrate the viability of deep learning methods for vehicle localization and classification from a single video frame in real-life traffic scenarios. The top-performing methods achieve both accuracy and Kappa score above 96% on the classification dataset and a mean average precision of 77% on the localization dataset. We also identify scenarios in which state-of-the-art methods still fail and suggest avenues to address these challenges. Both the dataset and detailed results are publicly available online [1].
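As an illustrative aside (not part of the paper), the accuracy and Cohen's Kappa figures quoted above can be computed from a confusion matrix as follows; the matrix below is toy data, not MIO-TCD results:

```python
import numpy as np

def accuracy_and_kappa(conf):
    """Overall accuracy and Cohen's kappa from a confusion matrix,
    where conf[i, j] counts samples of true class i predicted as class j."""
    conf = np.asarray(conf, dtype=float)
    total = conf.sum()
    observed = np.trace(conf) / total                        # observed agreement
    expected = (conf.sum(0) * conf.sum(1)).sum() / total**2  # chance agreement
    return observed, (observed - expected) / (1.0 - expected)

# Toy 3-class confusion matrix (illustrative numbers only)
conf = [[50, 2, 1],
        [3, 45, 2],
        [1, 1, 48]]
acc, kappa = accuracy_and_kappa(conf)
```

Kappa discounts the agreement expected by chance, which is why it is reported alongside raw accuracy on class-imbalanced datasets such as this one.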

2.
Brain; 141(2): 422-458, 2018 Feb 01.
Article in English | MEDLINE | ID: mdl-29360998

ABSTRACT

The mechanisms underpinning concussion, traumatic brain injury, and chronic traumatic encephalopathy, and the relationships between these disorders, are poorly understood. We examined post-mortem brains from teenage athletes in the acute-subacute period after mild closed-head impact injury and found astrocytosis, myelinated axonopathy, microvascular injury, perivascular neuroinflammation, and phosphorylated tau protein pathology. To investigate causal mechanisms, we developed a mouse model of lateral closed-head impact injury that uses momentum transfer to induce traumatic head acceleration. Unanaesthetized mice subjected to unilateral impact exhibited abrupt onset, transient course, and rapid resolution of a concussion-like syndrome characterized by altered arousal, contralateral hemiparesis, truncal ataxia, locomotor and balance impairments, and neurobehavioural deficits. Experimental impact injury was associated with axonopathy, blood-brain barrier disruption, astrocytosis, microgliosis (with activation of triggering receptor expressed on myeloid cells, TREM2), monocyte infiltration, and phosphorylated tauopathy in cerebral cortex ipsilateral and subjacent to impact. Phosphorylated tauopathy was detected in ipsilateral axons by 24 h, bilateral axons and soma by 2 weeks, and distant cortex bilaterally at 5.5 months post-injury. Impact pathologies co-localized with serum albumin extravasation in the brain that was diagnostically detectable in living mice by dynamic contrast-enhanced MRI. These pathologies were also accompanied by early, persistent, and bilateral impairment in axonal conduction velocity in the hippocampus and defective long-term potentiation of synaptic neurotransmission in the medial prefrontal cortex, brain regions distant from acute brain injury. 
Surprisingly, acute neurobehavioural deficits at the time of injury did not correlate with blood-brain barrier disruption, microgliosis, neuroinflammation, phosphorylated tauopathy, or electrophysiological dysfunction. Furthermore, concussion-like deficits were observed after impact injury, but not after blast exposure under experimental conditions matched for head kinematics. Computational modelling showed that impact injury generated focal point loading on the head and seven-fold greater peak shear stress in the brain compared to blast exposure. Moreover, intracerebral shear stress peaked before onset of gross head motion. By comparison, blast induced distributed force loading on the head and diffuse, lower-magnitude shear stress in the brain. We conclude that force loading mechanics at the time of injury shape acute neurobehavioural responses, structural brain damage, and neuropathological sequelae triggered by neurotrauma. These results indicate that closed-head impact injuries, independent of concussive signs, can induce traumatic brain injury as well as early pathologies and functional sequelae associated with chronic traumatic encephalopathy. These results also shed light on the origins of concussion and its relationship to traumatic brain injury and its aftermath.


Subjects
Athletic Injuries/complications, Brain Concussion/etiology, Craniocerebral Trauma/complications, Craniocerebral Trauma/etiology, Tauopathies/etiology, Vascular System Injuries/etiology, Action Potentials/physiology, Adolescent, Animals, Athletes, Brain/pathology, Calcium-Binding Proteins, Cohort Studies, Computer Simulation, Craniocerebral Trauma/diagnostic imaging, DNA-Binding Proteins/metabolism, Disease Models, Animal, Female, Gene Expression Regulation/physiology, Hippocampus/physiopathology, Humans, Male, Mice, Mice, Inbred C57BL, Microfilament Proteins, Models, Neurological, Prefrontal Cortex/physiopathology, Receptors, CCR2/genetics, Receptors, CCR2/metabolism, Receptors, Interleukin-8A/genetics, Receptors, Interleukin-8A/metabolism, Young Adult
3.
Opt Express; 23(11): 15072-87, 2015 Jun 01.
Article in English | MEDLINE | ID: mdl-26072864

ABSTRACT

Resolution improvement through signal processing techniques for integrated circuit imaging is becoming more crucial as the rapid decrease in integrated circuit dimensions continues. Although there is a significant effort to push the limits of optical resolution for backside fault analysis through the use of solid immersion lenses, higher order laser beams, and beam apodization, signal processing techniques are required for additional improvement. In this work, we propose a sparse image reconstruction framework which couples overcomplete dictionary-based representation with a physics-based forward model to improve resolution and localization accuracy in high numerical aperture confocal microscopy systems for backside optical integrated circuit analysis. The effectiveness of the framework is demonstrated on experimental data.

4.
IEEE Trans Image Process; 23(11): 4663-79, 2014 Nov.
Article in English | MEDLINE | ID: mdl-25122568

ABSTRACT

Change detection is one of the most commonly encountered low-level tasks in computer vision and video processing. A plethora of algorithms have been developed to date, yet no widely accepted, realistic, large-scale video data set exists for benchmarking different methods. Presented here is a unique change detection video data set consisting of nearly 90,000 frames in 31 video sequences representing six categories selected to cover a wide range of challenges in two modalities (color and thermal infrared). A distinguishing characteristic of this benchmark video data set is that each frame is meticulously annotated by hand for ground-truth foreground, background, and shadow area boundaries, an effort that goes much beyond a simple binary label denoting the presence of change. This enables objective and precise quantitative comparison and ranking of video-based change detection algorithms. This paper discusses various aspects of the new data set, quantitative performance metrics used, and comparative results for over two dozen change detection algorithms. It draws important conclusions on solved and remaining issues in change detection, and describes future challenges for the scientific community. The data set, evaluation tools, and algorithm rankings are available to the public on a website and will be updated with feedback from academia and industry in the future.
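The kind of objective, pixel-level comparison this ground truth enables can be sketched with standard precision/recall/F-measure scoring of a binary change mask; the masks below are synthetic toys, not frames from the data set:

```python
import numpy as np

def change_detection_scores(pred, gt):
    """Precision, recall, and F-measure for a binary change mask vs ground truth."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()        # changed pixels correctly detected
    fp = np.logical_and(pred, ~gt).sum()       # false detections
    fn = np.logical_and(~pred, gt).sum()       # missed changes
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f

gt = np.zeros((8, 8), dtype=bool); gt[2:6, 2:6] = True   # 16 true-change pixels
pred = np.zeros_like(gt); pred[3:7, 3:7] = True          # detector output, shifted by 1
p, r, f = change_detection_scores(pred, gt)
```

Scoring whole sequences is then just an accumulation of these per-frame counts before computing the ratios.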

5.
IEEE Trans Image Process; 22(9): 3485-96, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23799697

ABSTRACT

Despite a significant growth in the last few years, the availability of 3D content is still dwarfed by that of its 2D counterpart. To close this gap, many 2D-to-3D image and video conversion methods have been proposed. Methods involving human operators have been most successful but also time-consuming and costly. Automatic methods, which typically make use of a deterministic 3D scene model, have not yet achieved the same level of quality, for they rely on assumptions that are often violated in practice. In this paper, we propose a new class of methods that are based on the radically different approach of learning the 2D-to-3D conversion from examples. We develop two types of methods. The first is based on learning a point mapping from local image/video attributes, such as color, spatial position, and, in the case of video, motion at each pixel, to scene-depth at that pixel using a regression-type idea. The second method is based on globally estimating the entire depth map of a query image directly from a repository of 3D images (image+depth pairs or stereopairs) using a nearest-neighbor regression-type idea. We demonstrate both the efficacy and the computational efficiency of our methods on numerous 2D images and discuss their drawbacks and benefits. Although far from perfect, our results demonstrate that repositories of 3D content can be used for effective 2D-to-3D image conversion. An extension to video is immediate by enforcing temporal continuity of computed depth maps.
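The retrieval idea behind the second method can be sketched as nearest-neighbour depth fusion; everything below (the descriptors, repository, and choice of k) is a synthetic illustration, not the paper's implementation:

```python
import numpy as np

def knn_depth(query_feat, repo_feats, repo_depths, k=3):
    """Estimate a depth map for a query image by averaging the depth maps of
    its k nearest neighbours (Euclidean distance on global descriptors)."""
    dists = np.linalg.norm(repo_feats - query_feat, axis=1)
    nearest = np.argsort(dists)[:k]            # indices of the k closest images
    return repo_depths[nearest].mean(axis=0)   # fused depth estimate

rng = np.random.default_rng(0)
repo_feats = rng.normal(size=(10, 16))         # 10 repository descriptors
repo_depths = rng.uniform(size=(10, 4, 4))     # their matching 4x4 depth maps
query_feat = repo_feats[0] + 0.01              # query nearly identical to item 0
depth = knn_depth(query_feat, repo_feats, repo_depths, k=1)
```

A real system would use richer global descriptors and refine the fused depth, but the retrieve-and-average core is as above.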

6.
IEEE Trans Image Process; 22(6): 2479-94, 2013 Jun.
Article in English | MEDLINE | ID: mdl-23508265

ABSTRACT

We propose a general framework for fast and accurate recognition of actions in video using empirical covariance matrices of features. A dense set of spatio-temporal feature vectors is computed from video to provide a localized description of the action, and subsequently aggregated in an empirical covariance matrix to compactly represent the action. Two supervised learning methods for action recognition are developed using feature covariance matrices. Common to both methods is the transformation of the classification problem in the closed convex cone of covariance matrices into an equivalent problem in the vector space of symmetric matrices via the matrix logarithm. The first method applies nearest-neighbor classification using a suitable Riemannian metric for covariance matrices. The second method approximates the logarithm of a query covariance matrix by a sparse linear combination of the logarithms of training covariance matrices. The action label is then determined from the sparse coefficients. Both methods achieve state-of-the-art classification performance on several datasets, and are robust to action variability, viewpoint changes, and low object resolution. The proposed framework is conceptually simple and has low storage and computational requirements, making it attractive for real-time implementation.
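The covariance-plus-matrix-logarithm pipeline of the first method can be sketched as follows; the feature sets are synthetic, and the matrix logarithm is taken via eigendecomposition (valid for symmetric positive-definite matrices):

```python
import numpy as np

def log_cov(features):
    """Map per-frame feature vectors (n_samples x n_features) to the matrix
    logarithm of their empirical covariance, via eigendecomposition."""
    c = np.cov(features, rowvar=False)
    c += 1e-6 * np.eye(c.shape[0])          # regularize to keep it positive definite
    w, v = np.linalg.eigh(c)
    return (v * np.log(w)) @ v.T            # log acts on the eigenvalues

def nearest_action(query_feats, train_feats, labels):
    """Nearest-neighbour classification in the log-Euclidean metric."""
    q = log_cov(query_feats)
    d = [np.linalg.norm(q - log_cov(t)) for t in train_feats]  # Frobenius distances
    return labels[int(np.argmin(d))]

rng = np.random.default_rng(1)
a = rng.normal(size=(200, 3))                    # "action A": isotropic features
b = rng.normal(size=(200, 3)) * [1.0, 5.0, 1.0]  # "action B": stretched features
query = rng.normal(size=(200, 3)) * [1.0, 5.0, 1.0]
label = nearest_action(query, [a, b], ["A", "B"])
```

Taking the logarithm flattens the cone of covariance matrices into a vector space, so plain Euclidean (Frobenius) distances become meaningful there.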

7.
IEEE Trans Image Process; 21(9): 4244-55, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22614646

ABSTRACT

Background subtraction has been a driving engine for many computer vision and video analytics tasks. Although many variants of it exist, they all share the underlying assumption that photometric scene properties are either static or exhibit temporal stationarity. While this works in many applications, the model fails when one is interested in discovering changes in scene dynamics instead of changes in the scene's photometric properties; unusual pedestrian and motor traffic patterns are but two examples. We propose a new model and computational framework that assume the dynamics of a scene, not its photometry, to be stationary, i.e., a dynamic background serves as the reference for the dynamics of an observed scene. Central to our approach is the concept of an event, which we define as short-term scene dynamics captured over a time window at a specific spatial location in the camera field of view. Unlike in our earlier work, we compute events by time-aggregating vector object descriptors that can combine multiple features, such as object size, direction of movement, speed, etc. We characterize events probabilistically, but use low-memory, low-complexity surrogates in a practical implementation. Using these surrogates amounts to behavior subtraction, a new algorithm for effective and efficient temporal anomaly detection and localization. Behavior subtraction is resilient to spurious background motion, such as that due to camera jitter, and is content-blind, i.e., it works equally well on humans, cars, animals, and other objects in both uncluttered and highly cluttered scenes. Clearly, treating video as a collection of events rather than colored pixels opens new possibilities for video analytics.

9.
IEEE Trans Image Process; 18(11): 2572-83, 2009 Nov.
Article in English | MEDLINE | ID: mdl-19586819

ABSTRACT

Efficient browsing of long video sequences is a key tool in visual surveillance, e.g., for postevent video forensics, but can also be used for fast review of motion pictures and home videos. While frame skipping (fixed or adaptive) is straightforward to implement, its performance is quite limited. Although more efficient techniques have been developed, such as video summarization and video montage, they lose either the temporal or semantic context of events. A recently proposed method called video synopsis deals with some of these issues but involves multiple processing stages and is fairly complex. Video condensation, which we propose here, is novel in the way information is removed from the space-time video volume, is conceptually simple, and is relatively easy to implement. We introduce the concept of a video ribbon inspired by that of a seam recently proposed for image resizing. We recursively carve ribbons out by minimizing an activity-aware cost function using dynamic programming. The ribbon model we develop is flexible and permits an easy adjustment of the compromise between temporal condensation ratio and anachronism of events. We also propose sliding-window ribbon carving to handle streaming video and demonstrate the method's efficiency on motor and pedestrian traffic data.
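The ribbon-carving idea is closely related to seam carving, and its dynamic-programming core can be sketched on a 2-D activity-cost map (a toy stand-in for the full space-time video volume):

```python
import numpy as np

def min_seam(cost):
    """Minimum-cost vertical seam through a 2-D cost map via dynamic
    programming; returns one column index per row."""
    h, w = cost.shape
    acc = cost.astype(float).copy()
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(0, x - 1), min(w, x + 2)
            acc[y, x] += acc[y - 1, lo:hi].min()   # cheapest connected predecessor
    seam = np.empty(h, dtype=int)
    seam[-1] = int(np.argmin(acc[-1]))
    for y in range(h - 2, -1, -1):                 # backtrack upwards
        x = seam[y + 1]
        lo, hi = max(0, x - 1), min(w, x + 2)
        seam[y] = lo + int(np.argmin(acc[y, lo:hi]))
    return seam

activity = np.ones((4, 5))
activity[:, 2] = 0.0           # a zero-activity column the seam should follow
seam = min_seam(activity)
```

In the paper the carved entity is a ribbon through the space-time volume and the cost is activity-aware, but the accumulate-then-backtrack structure is the same.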

10.
IEEE Trans Image Process; 17(8): 1443-51, 2008 Aug.
Article in English | MEDLINE | ID: mdl-18632352

ABSTRACT

Optical flow can be reliably estimated between areas visible in two images, but not in occlusion areas. If optical flow is needed in the whole image domain, one approach is to use additional views of the same scene. If such views are unavailable, an often-used alternative is to extrapolate optical flow in occlusion areas. Since the location of such areas is usually unknown prior to optical flow estimation, this is typically performed in three steps. First, occlusion-ignorant optical flow is estimated, then occlusion areas are identified using the estimated (unreliable) optical flow, and, finally, the optical flow is corrected using the computed occlusion areas. This approach, however, does not permit interaction between optical flow and occlusion estimates. In this paper, we permit such interaction by proposing a variational formulation that jointly computes optical flow, implicitly detects occlusions, and extrapolates optical flow in occlusion areas. The extrapolation mechanism is based on anisotropic diffusion and uses the underlying image gradient to preserve structure, such as optical flow discontinuities. Our results show significant improvements in the computed optical flow fields over other approaches, both qualitatively and quantitatively.
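The extrapolation step can be sketched with plain isotropic diffusion, a simplified stand-in for the anisotropic, gradient-steered diffusion the paper actually uses:

```python
import numpy as np

def fill_flow(flow, occluded, iters=200):
    """Fill flow values inside an occlusion mask by iterative neighbourhood
    averaging (isotropic diffusion; the paper steers the diffusion
    anisotropically using the image gradient to preserve discontinuities)."""
    f = flow.copy()
    for _ in range(iters):
        avg = 0.25 * (np.roll(f, 1, 0) + np.roll(f, -1, 0) +
                      np.roll(f, 1, 1) + np.roll(f, -1, 1))
        f[occluded] = avg[occluded]        # update occluded pixels only
    return f

flow = np.full((6, 6), 2.0)               # known horizontal flow: 2 px everywhere
occluded = np.zeros((6, 6), dtype=bool)
occluded[2:4, 2:4] = True                 # region with unknown flow
flow[occluded] = 0.0
filled = fill_flow(flow, occluded)
```

Known flow values act as boundary conditions; the diffusion propagates them into the masked region until it converges.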


Subjects
Algorithms, Image Interpretation, Computer-Assisted/methods, Imaging, Three-Dimensional/methods, Rheology/methods, Video Recording/methods, Image Enhancement/methods, Reproducibility of Results, Sensitivity and Specificity
11.
J Neuroeng Rehabil; 4: 25, 2007 Jul 10.
Article in English | MEDLINE | ID: mdl-17623080

ABSTRACT

BACKGROUND: There is a need for effective and early functional rehabilitation of patients with gait and balance problems, including those with spinal cord injury, neurological diseases, and those recovering from hip fractures, a common consequence of falls, especially in the elderly population. Gait training in these patients using partial body weight support (BWS) on a treadmill, a technique that involves unloading the subject through a harness, improves walking better than training with full weight bearing. One problem with this technique, not commonly acknowledged, is that the harness provides external support that essentially eliminates the anticipatory postural adjustments (APAs) required for independent gait. We have developed a device to address this issue and conducted a training study as a proof of concept of efficacy. METHODS: We present a tool that can enhance the concept of BWS training by allowing natural APAs to occur mediolaterally. While in a supine position in a 90-degree tilted environment built around a modified hospital bed, subjects wear a backpack frame that moves freely on air-bearings (cf. a puck on an air hockey table) and is attached through a cable to a pneumatic cylinder that provides a load that can be set to emulate various G-like loads. Veridical visual input is provided through two 3-D automultiscopic displays that allow glasses-free 3-D vision representing a virtual surrounding environment that may be acquired from sites chosen by the patient. Two groups of 12 healthy subjects were exposed to either strength training alone or a combination of strength and balance training in such a tilted environment over a period of four weeks. RESULTS: Isokinetic strength measured during upright squat extension improved similarly in both groups. Measures of balance assessed in the upright position showed statistically significant improvements only when balance was part of the training in the tilted environment. Postural measures indicated less reliance on visual cues and/or increased use of somatosensory cues after training. CONCLUSION: Upright balance function can be improved following balance-specific training performed in a supine position in an environment providing the perception of an upright position with respect to gravity. Future studies will implement this concept in patients.


Subjects
Gait Disorders, Neurologic/rehabilitation, Rehabilitation/instrumentation, Sensation Disorders/rehabilitation, User-Computer Interface, Adult, Body Weight, Equipment Design, Female, Gravitation, Humans, Male, Postural Balance
12.
Nature; 448(7151): 330-2, 2007 Jul 19.
Article in English | MEDLINE | ID: mdl-17637664

ABSTRACT

On Jupiter's moon Io, volcanic plumes and evaporating lava flows provide hot gases to form an atmosphere that is subsequently ionized. Some of Io's plasma is captured by the planet's strong magnetic field to form a co-rotating torus at Io's distance; the remaining ions and electrons form Io's ionosphere. The torus and ionosphere are also depleted by three time-variable processes that produce a banana-shaped cloud orbiting with Io, a giant nebula extending out to about 500 Jupiter radii, and a jet close to Io. No spatial constraints exist for the sources of the first two; they have been inferred only from modelling the patterns seen in the trace gas sodium observed far from Io. Here we report observations that reveal a spatially confined stream that ejects sodium only from the wake of the Io-torus interaction, together with a visually distinct, spherically symmetrical outflow region arising from atmospheric sputtering. The spatial extent of the ionospheric wake that feeds the stream is more than twice that observed by the Galileo spacecraft and modelled successfully. This implies considerable variability, and therefore the need for additional modelling of volcanically-driven, episodic states of the great jovian nebula.

13.
IEEE Trans Image Process; 15(2): 364-76, 2006 Feb.
Article in English | MEDLINE | ID: mdl-16479806

ABSTRACT

We address the issue of image sequence analysis jointly in space and time. While typical approaches to such an analysis consider two image frames at a time, we propose to perform this analysis jointly over multiple frames. We concentrate on spatiotemporal segmentation of image sequences and on analysis of occlusion effects therein. The segmentation process is three-dimensional (3-D); we search for a volume carved out by each moving object in the image sequence domain, or "object tunnel," a new space-time concept. We pose the problem in a variational framework by using only motion information (no intensity edges). The resulting formulation can be viewed as volume competition, a 3-D generalization of region competition. We parameterize the unknown surface to be estimated, but rather than using an active-surface approach, we embed it into a higher-dimensional function and apply the level-set methodology. We first develop simple models for the detection of moving objects over static background; no motion models are needed. Then, in order to improve segmentation accuracy, we incorporate motion models for objects and background. We further extend the method by including explicit models for occluded and newly exposed areas that lead to "occlusion volumes," another new space-time concept. Since, in this case, multiple volumes are sought, we apply a multiphase variant of the level-set method. We present various experimental results for synthetic and natural image sequences.


Subjects
Algorithms, Artificial Intelligence, Image Interpretation, Computer-Assisted/methods, Imaging, Three-Dimensional/methods, Pattern Recognition, Automated/methods, Subtraction Technique, Video Recording/methods, Image Enhancement/methods, Information Storage and Retrieval/methods, Motion, Time Factors
14.
IEEE Trans Image Process; 15(1): 128-40, 2006 Jan.
Article in English | MEDLINE | ID: mdl-16435544

ABSTRACT

A new type of three-dimensional (3-D) display recently introduced on the market holds great promise for the future of 3-D visualization, communication, and entertainment. This so-called automultiscopic display can deliver multiple views without glasses, thus allowing a limited "look-around" (correct motion-parallax). Central to this technology is the process of multiplexing several views into a single viewable image. This multiplexing is a complex process involving irregular subsampling of the original views. If not preceded by low-pass filtering, it results in aliasing that leads to texture as well as depth distortions. In order to eliminate this aliasing, we propose to model the multiplexing process with lattices, find their parameters, and then design optimal anti-alias filters. To this effect, we use multidimensional sampling theory and basic optimization tools. We derive optimal anti-alias filters for a specific automultiscopic monitor using three models: the orthogonal lattice, the nonorthogonal lattice, and the union of shifted lattices. In the first case, the resulting separable low-pass filter offers significant aliasing reduction that is further improved by a hexagonal-passband low-pass filter for the nonorthogonal lattice model. A more accurate model is obtained using a union of shifted lattices, but due to the complex nature of repeated spectra, practical filters designed in this case offer no additional improvement. We also describe a practical method to design finite-precision, low-complexity filters that can be implemented using modern graphics cards.


Subjects
Algorithms, Data Display, Image Interpretation, Computer-Assisted/methods, Imaging, Three-Dimensional/methods, Pattern Recognition, Automated/methods, Signal Processing, Computer-Assisted, User-Computer Interface, Artifacts, Computer Graphics, Computer Simulation, Image Enhancement/methods, Information Storage and Retrieval/methods, Models, Statistical, Sample Size
15.
IEEE Trans Image Process; 14(6): 713-25, 2005 Jun.
Article in English | MEDLINE | ID: mdl-15971771

ABSTRACT

This paper presents a novel approach to the reconstruction of images from nonuniformly spaced samples. This problem is often encountered in digital image processing applications. Nonrecursive video coding with motion compensation, spatiotemporal interpolation of video sequences, and generation of new views in multicamera systems are three possible applications. We propose a new reconstruction algorithm based on a spline model for images. We use regularization, since this is an ill-posed inverse problem. We minimize a cost function composed of two terms: one related to the approximation error and the other related to the smoothness of the modeling function. All the processing is carried out in the space of spline coefficients; this space is discrete, although the problem itself is of a continuous nature. The coefficients of regularization and approximation filters are computed exactly by using the explicit expressions of B-spline functions in the time domain. The regularization is carried out locally, while the computation of the regularization factor accounts for the structure of the nonuniform sampling grid. The linear system of equations obtained is solved iteratively. Our results show a very good performance in motion-compensated interpolation applications.
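The regularized least-squares formulation can be sketched in 1-D with a nearest-grid-point sampling matrix and a second-difference smoothness penalty (a simplification of the paper's B-spline model, and solved directly here rather than iteratively, for brevity):

```python
import numpy as np

def reconstruct(sample_pos, sample_val, n, lam=0.1):
    """Recover n uniform grid values from nonuniform samples by minimizing
    ||A c - y||^2 + lam * ||D c||^2, where A selects the sampled grid points
    and D is a second-difference smoothness penalty."""
    A = np.zeros((len(sample_pos), n))
    A[np.arange(len(sample_pos)), sample_pos] = 1.0   # sampling operator
    D = np.diff(np.eye(n), n=2, axis=0)               # second-difference operator
    M = A.T @ A + lam * (D.T @ D)                     # normal equations matrix
    b = A.T @ np.asarray(sample_val, dtype=float)
    return np.linalg.solve(M, b)                      # paper solves this iteratively

# Nonuniform samples of the line y = x on an 8-point grid
c = reconstruct([0, 3, 7], [0.0, 3.0, 7.0], n=8)
```

Because a straight line has zero second difference, both terms of the cost vanish at the true signal, so the reconstruction recovers it exactly; noisy or conflicting samples would instead be smoothed according to lam.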


Subjects
Algorithms, Image Enhancement/methods, Image Interpretation, Computer-Assisted/methods, Imaging, Three-Dimensional/methods, Information Storage and Retrieval/methods, Video Recording/methods, Computer Simulation, Models, Biological, Models, Statistical, Numerical Analysis, Computer-Assisted, Pattern Recognition, Automated/methods, Photography/methods, Reproducibility of Results, Sample Size, Sensitivity and Specificity, Signal Processing, Computer-Assisted, Subtraction Technique
16.
IEEE Trans Image Process; 12(2): 201-20, 2003.
Article in English | MEDLINE | ID: mdl-18237901

ABSTRACT

Segmentation of motion in an image sequence is one of the most challenging problems in image processing, while at the same time one that finds numerous applications. To date, a wealth of approaches to motion segmentation have been proposed. Many of them suffer from the local nature of the models used. Global models, such as those based on Markov random fields, perform, in general, better. In this paper, we propose a new approach to motion segmentation that is based on a global model. The novelty of the approach is twofold. First, inspired by recent work of other researchers, we formulate the problem as that of region competition, but we solve it using the level set methodology. The key features of a level set representation, as compared to active contours, often used in this context, are its ability to handle variations in the topology of the segmentation and its numerical stability. The second novelty of the paper is the formulation in which, unlike in many other motion segmentation algorithms, we do not use intensity boundaries as an accessory; the segmentation is based purely on motion. This permits accurate estimation of motion boundaries of an object even when its intensity boundaries are hardly visible. Since occasionally intensity boundaries may prove beneficial, we extend the formulation to account for the coincidence of motion and intensity boundaries. In addition, we generalize the approach to multiple motions. We discuss possible discretizations of the evolution (PDE) equations and we give details of an initialization scheme so that the results can be duplicated. We show numerous experimental results for various formulations on natural images with either synthetic or natural motion.

17.
IEEE Trans Image Process; 11(4): 351-62, 2002.
Article in English | MEDLINE | ID: mdl-18244637

ABSTRACT

Region-based functionality offered by the MPEG-4 video compression standard is also appealing for still images, for example to permit object-based queries of a still-image database. A popular method for still-image compression is fractal coding. However, traditional fractal image coding uses rectangular range and domain blocks. Although new schemes have been proposed that merge small blocks into irregular shapes, the merging process does not, in general, produce semantically meaningful regions. We propose a new approach to fractal image coding that permits region-based functionalities; images are coded region by region according to a previously computed segmentation map. We use rectangular range and domain blocks, but divide boundary blocks into segments belonging to different regions. Since this prevents the use of a standard dissimilarity measure, we propose a new measure adapted to segment shape. We propose two approaches: one in the spatial and one in the transform domain. While providing additional functionality, the proposed methods perform similarly to other tested methods in terms of PSNR but often result in images that are subjectively better. Due to the limited domain-block codebook size, the new methods are faster than other fractal coding methods tested. The results are very encouraging and show the potential of this approach for various internet and still-image database applications.
