Results 1 - 11 of 11
1.
Sensors (Basel); 18(8), 2018 Jul 26.
Article in English | MEDLINE | ID: mdl-30049979

ABSTRACT

In this paper, the Relative Pose based Redundancy Removal (RPRR) scheme is presented, designed for mobile RGB-D sensor networks operating in bandwidth-constrained scenarios. The scheme considers a multiview setting in which pairs of sensors observe the same scene from different viewpoints, and detects redundant visual and depth information so that it is not transmitted, leading to a significant improvement in wireless channel usage efficiency and power savings. We envisage applications in which the environment is static and rapid 3D mapping of an enclosed area of interest is required, such as disaster recovery and support operations after earthquakes or industrial accidents. Experimental results show that wireless channel utilization improves by 250% and battery consumption is halved when the RPRR scheme is used instead of sending the sensor images independently.
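The redundancy detection the abstract describes rests on a standard geometric step: back-projecting one sensor's depth pixels and checking, via the relative pose, whether they fall inside the other sensor's view. A minimal NumPy sketch of that step, assuming pinhole intrinsics `K` and a known relative pose `(R, t)` (function names are illustrative, not the paper's API):

```python
import numpy as np

def project_depth(depth, K, R, t):
    """Back-project every pixel of a depth map and express the resulting
    3-D points in a second camera's frame using the relative pose (R, t)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T
    pts = np.linalg.inv(K) @ pix * depth.reshape(-1)   # 3 x N points in cam 1
    return R @ pts + t[:, None]                        # same points in cam 2

def redundancy_mask(depth1, K, R, t, w, h):
    """Mark pixels of view 1 whose 3-D points land inside view 2's image;
    those pixels are candidates for suppression before transmission."""
    pts2 = project_depth(depth1, K, R, t)
    z = pts2[2]
    uv = (K @ pts2)[:2] / np.where(z > 0, z, np.nan)
    visible = (z > 0) & (uv[0] >= 0) & (uv[0] < w) & (uv[1] >= 0) & (uv[1] < h)
    return visible.reshape(depth1.shape)

K = np.array([[100.0, 0, 32], [0, 100.0, 24], [0, 0, 1]])
depth = np.full((48, 64), 2.0)                 # flat scene 2 m away
R, t = np.eye(3), np.array([0.0, 0.0, 0.0])    # identical viewpoints
mask = redundancy_mask(depth, K, R, t, 64, 48)
```

With identical viewpoints every pixel is redundant; as the relative translation grows, the overlapping (and hence suppressible) region shrinks.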

2.
IEEE Trans Pattern Anal Mach Intell; 45(11): 12783-12797, 2023 Nov.
Article in English | MEDLINE | ID: mdl-36215373

ABSTRACT

Tracking a time-varying, indefinite number of objects in a video sequence over time remains a challenge despite recent advances in the field. Most existing approaches cannot properly handle multi-object tracking challenges such as occlusion, in part because they ignore long-term temporal information. To address these shortcomings, we present MO3TR: a truly end-to-end Transformer-based online multi-object tracking (MOT) framework that learns to handle occlusions, track initiation, and termination without the need for an explicit data association module or any heuristics. MO3TR encodes object interactions into long-term temporal embeddings using a combination of spatial and temporal Transformers, and recursively uses this information jointly with the input data to estimate the states of all tracked objects over time. The spatial attention mechanism enables our framework to learn implicit representations among all the objects and between the objects and the measurements, while the temporal attention mechanism focuses on specific parts of past information, allowing our approach to resolve occlusions over multiple frames. Our experiments demonstrate the potential of this new approach, achieving results on par with or better than the current state-of-the-art on multiple MOT metrics for several popular multi-object tracking benchmarks.
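The spatial and temporal Transformers mentioned above are both built from the same primitive: scaled dot-product attention, in which each query (e.g. a track embedding) gathers information from a set of keys (e.g. detection embeddings). A minimal NumPy sketch of that primitive, with toy embeddings standing in for MO3TR's learned ones; this illustrates the mechanism only, not the paper's architecture:

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: each query row produces a convex
    combination of the value rows, weighted by query-key similarity."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))  # stable softmax
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V, w

rng = np.random.default_rng(0)
tracks = rng.normal(size=(4, 8))   # 4 track embeddings (queries)
dets = rng.normal(size=(6, 8))     # 6 detection embeddings (keys/values)
updated, weights = attention(tracks, dets, dets)
```

Each row of `weights` sums to one, so every updated track state is an interpolation over the current detections; stacking such layers over space and time is what lets the framework reason about interactions and occlusions.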

3.
Top Cogn Sci; 13(1): 252-255, 2021 Jan.
Article in English | MEDLINE | ID: mdl-32096601

ABSTRACT

Events and event prediction are pivotal concepts across much of cognitive science, as demonstrated by the papers in this special issue. We first discuss how the study of events and the predictive processing framework may fruitfully inform each other. We then briefly point to some links to broader philosophical questions about events.


Subjects
Cognitive Science; Humans
4.
IEEE Trans Med Imaging; 40(10): 2911-2925, 2021 Oct.
Article in English | MEDLINE | ID: mdl-33531297

ABSTRACT

Recently, ultra-widefield (UWF) 200° fundus imaging by Optos cameras has gradually been adopted because it captures far more of the fundus than regular 30°-60° fundus cameras. Compared with UWF fundus images, regular fundus images offer a large amount of high-quality, well-annotated data. Due to the domain gap, models trained on regular fundus images perform poorly when recognizing UWF fundus images. Hence, given that annotating medical data is labor intensive and time consuming, in this paper we explore how to leverage regular fundus images to supplement the limited UWF fundus data and annotations for more efficient training. We propose a modified cycle generative adversarial network (CycleGAN) model to bridge the gap between regular and UWF fundus images and to generate additional UWF fundus images for training. A consistency regularization term is added to the GAN loss to improve and regulate the quality of the generated data. Our method does not require that images from the two domains be paired or even that the semantic labels be the same, which provides great convenience for data collection. Furthermore, we show that our method is robust to noise and errors introduced by the generated unlabeled data with the pseudo-labeling technique. We evaluated the effectiveness of our methods on several common fundus diseases and tasks, such as diabetic retinopathy (DR) classification, lesion detection and tessellated fundus segmentation. The experimental results demonstrate that our proposed method simultaneously achieves superior generalizability of the learned representations and performance improvements in multiple tasks.
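The core CycleGAN idea the abstract builds on is cycle consistency: mapping an image to the other domain and back should reproduce the input. A toy NumPy sketch of just that term, with simple invertible functions standing in for the trained generators (the adversarial and consistency-regularization terms of the paper are omitted; all names here are illustrative):

```python
import numpy as np

def l1(a, b):
    """Mean absolute difference, the usual cycle-consistency penalty."""
    return np.mean(np.abs(a - b))

def cycle_loss(x_reg, x_uwf, G, F):
    """Cycle-consistency terms of a CycleGAN-style objective.
    G maps regular -> UWF, F maps UWF -> regular."""
    cyc_reg = l1(F(G(x_reg)), x_reg)   # regular -> UWF -> regular
    cyc_uwf = l1(G(F(x_uwf)), x_uwf)   # UWF -> regular -> UWF
    return cyc_reg + cyc_uwf

# Toy "generators": perfect inverses of each other, so the loss is zero.
G = lambda x: 2.0 * x + 1.0
F = lambda y: (y - 1.0) / 2.0
x_reg = np.linspace(0, 1, 16).reshape(4, 4)
x_uwf = np.linspace(1, 2, 16).reshape(4, 4)
loss = cycle_loss(x_reg, x_uwf, G, F)
```

In the real model `G` and `F` are convolutional networks and the loss is minimized jointly with the adversarial terms; the zero loss here simply shows what the objective rewards.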


Subjects
Diabetic Retinopathy; Diabetic Retinopathy/diagnostic imaging; Fluorescein Angiography; Fundus Oculi; Humans; Photography; Tomography, Optical Coherence
5.
IEEE Trans Vis Comput Graph; 16(3): 355-368, 2010.
Article in English | MEDLINE | ID: mdl-20224132

ABSTRACT

In this paper, we present three techniques for 6DOF natural feature tracking in real time on mobile phones. We achieve interactive frame rates of up to 30 Hz for natural feature tracking from textured planar targets on current generation phones. We use an approach based on heavily modified state-of-the-art feature descriptors, namely SIFT and Ferns, plus a template-matching-based tracker. While SIFT is known to be a strong but computationally expensive feature descriptor, Ferns classification is fast but requires large amounts of memory. This renders both original designs unsuitable for mobile phones. We give detailed descriptions of how we modified both approaches to make them suitable for mobile phones. The template-based tracker further increases the performance and robustness of the SIFT- and Ferns-based approaches. We present evaluations of robustness and performance and discuss their appropriateness for Augmented Reality applications.
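The template-matching tracker mentioned above can be illustrated with exhaustive normalized cross-correlation (NCC): slide the template over the image and score each window by its correlation with the template. A minimal NumPy sketch (the paper's tracker is far more optimized for phone hardware; this shows only the matching criterion):

```python
import numpy as np

def ncc_match(image, template):
    """Exhaustive NCC search: return the top-left corner and score of the
    window most similar to the template (score 1.0 = exact match up to
    brightness offset and scale)."""
    th, tw = template.shape
    t = template - template.mean()
    tn = np.sqrt((t ** 2).sum())
    best, best_pos = -np.inf, (0, 0)
    for y in range(image.shape[0] - th + 1):
        for x in range(image.shape[1] - tw + 1):
            win = image[y:y + th, x:x + tw]
            wc = win - win.mean()
            denom = np.sqrt((wc ** 2).sum()) * tn
            if denom == 0:
                continue            # flat window: NCC undefined, skip
            score = (wc * t).sum() / denom
            if score > best:
                best, best_pos = score, (y, x)
    return best_pos, best

rng = np.random.default_rng(1)
img = rng.random((40, 40))
tmpl = img[12:20, 25:33].copy()     # an 8x8 patch cut from the image
pos, score = ncc_match(img, tmpl)
```

Because the template was cut from the image, the search recovers its original location with a near-perfect score; real trackers restrict the search window around the previous pose to stay within a frame-rate budget.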


Subjects
Algorithms; Cell Phone; Computer Graphics; Image Interpretation, Computer-Assisted/methods; Imaging, Three-Dimensional/methods; Pattern Recognition, Automated/methods; User-Computer Interface; Computer Systems; Information Storage and Retrieval/methods; Software
6.
IEEE Trans Pattern Anal Mach Intell; 42(1): 15-26, 2020 Jan.
Article in English | MEDLINE | ID: mdl-30334782

ABSTRACT

In this paper, we introduce a novel methodology for characterizing the performance of deep learning networks (ResNets and DenseNet) with respect to training convergence and generalization as a function of mini-batch size and learning rate for image classification. This methodology is based on novel measurements derived from the eigenvalues of the approximate Fisher information matrix, which can be efficiently computed even for high-capacity deep models. Our proposed measurements can help practitioners to monitor and control the training process (by actively tuning the mini-batch size and learning rate) to allow for good training convergence and generalization. Furthermore, the proposed measurements also allow us to show that it is possible to optimize the training process with a new dynamic sampling training approach that continuously and automatically changes the mini-batch size and learning rate during training. Finally, we show that the proposed dynamic sampling training approach has a faster training time and a competitive classification accuracy compared to the current state of the art.
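One common approximation behind such measurements is the empirical Fisher information: the average outer product of per-example gradients, whose eigenvalue spectrum can then be examined. A small NumPy sketch on a toy logistic-regression model (the paper uses a more efficient approximation suited to deep networks; the model and names here are illustrative):

```python
import numpy as np

def empirical_fisher(grads):
    """Empirical Fisher information matrix: the average outer product of
    per-example gradients (grads is N x P)."""
    return grads.T @ grads / grads.shape[0]

# Toy model: per-example log-loss gradients of logistic regression.
rng = np.random.default_rng(2)
X = rng.normal(size=(256, 5))
y = rng.integers(0, 2, size=256)
w = np.zeros(5)
p = 1.0 / (1.0 + np.exp(-X @ w))      # predicted probabilities
grads = (p - y)[:, None] * X          # per-example gradient rows
F = empirical_fisher(grads)
eigvals = np.linalg.eigvalsh(F)       # spectrum to monitor during training
```

The matrix is symmetric positive semidefinite by construction, so its eigenvalues are real and non-negative; tracking how this spectrum evolves with mini-batch size and learning rate is the kind of monitoring the paper proposes.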

7.
IEEE Trans Pattern Anal Mach Intell; 31(3): 570-576, 2009 Mar.
Article in English | MEDLINE | ID: mdl-19147883

ABSTRACT

We introduce a classification-based approach to finding occluding texture boundaries. The classifier is composed of a set of weak learners that operate on discriminative image-intensity features, which are defined on small patches and are fast to compute. A database designed to simulate digitized occluding contours of textured objects in natural images is used to train the weak learners. The trained classifier score is then used to obtain a probabilistic model for the presence of texture transitions, which can readily be used for line-search texture boundary detection in the direction normal to an initial boundary estimate. This method is fast and therefore suitable for real-time and interactive applications. It works as a robust estimator that requires only a ribbon-like search region and can handle complex texture structures without requiring a large number of observations. We demonstrate results both in the context of interactive 2-D delineation and fast 3-D tracking and compare its performance with other existing methods for line-search boundary detection.
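The line-search step can be pictured in one dimension: sample an intensity profile along the normal to the initial boundary estimate and pick the offset where the two sides differ most. A simplified NumPy sketch, scoring each offset by the difference of side means rather than the trained boosted classifier the paper actually uses:

```python
import numpy as np

def line_search_boundary(profile, half):
    """Scan a 1-D intensity profile sampled normal to an initial boundary
    estimate and return the offset with the strongest transition between
    the two sides (mean difference here; the paper scores each offset
    with a trained classifier instead)."""
    n = len(profile)
    best, best_i = -1.0, None
    for i in range(half, n - half):
        left = profile[i - half:i]
        right = profile[i:i + half]
        score = abs(left.mean() - right.mean())
        if score > best:
            best, best_i = score, i
    return best_i

# Synthetic profile: dark texture on the left, bright on the right.
profile = np.concatenate([np.full(20, 0.2), np.full(20, 0.9)])
edge = line_search_boundary(profile, half=5)
```

For real textures a simple mean difference is too weak, which is exactly why the paper learns discriminative patch features; the search structure along the normal, however, is the same.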


Subjects
Algorithms; Artificial Intelligence; Image Interpretation, Computer-Assisted/methods; Imaging, Three-Dimensional/methods; Pattern Recognition, Automated/methods; Subtraction Technique; Computer Simulation; Image Enhancement/methods; Models, Statistical; Reproducibility of Results; Sensitivity and Specificity
8.
Top Cogn Sci; 13(1): 243-247, 2021 Jan.
Article in English | MEDLINE | ID: mdl-33063929
9.
IEEE Trans Pattern Anal Mach Intell; 26(4): 479-494, 2004 Apr.
Article in English | MEDLINE | ID: mdl-15382652

ABSTRACT

This paper presents a new Bayesian framework for motion segmentation (dividing a frame from an image sequence into layers representing different moving objects) by tracking edges between frames. Edges are found using the Canny edge detector, and the Expectation-Maximization algorithm is then used to fit motion models to these edges and to calculate the probabilities of the edges obeying each motion model. The edges are also used to segment the image into regions of similar color. The most likely labeling for these regions is then calculated using the edge probabilities, in association with a Markov Random Field-style prior. The relative depth ordering of the different motion layers is also determined as an integral part of the process. An efficient implementation of this framework is presented for segmenting two motions (foreground and background) using two frames. It is then demonstrated how, by tracking the edges into further frames, the probabilities may be accumulated to provide an even more accurate and robust estimate, and to segment an entire sequence. Further extensions are then presented to address the segmentation of more than two motions. Here, a hierarchical method of initializing the Expectation-Maximization algorithm is described, and it is demonstrated that the Minimum Description Length principle may be used to automatically select the best number of motion layers. Results from over 30 sequences (demonstrating both two and three motions) are presented and discussed.
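The EM loop at the heart of this framework alternates between computing each edge's probability of obeying each motion model (E-step) and re-fitting the models from those probabilities (M-step). A 1-D caricature in NumPy, fitting two translational motions to scalar edge displacements (the paper works with full 2-D motion models and image edges; names and noise model here are illustrative):

```python
import numpy as np

def em_two_motions(d, iters=50, sigma=0.5):
    """Fit two translational motion models to edge displacements d with EM.
    Returns the two estimated translations and the per-edge
    responsibilities (probability of obeying each model)."""
    mu = np.array([d.min(), d.max()])      # crude initialization
    for _ in range(iters):
        # E-step: responsibilities under isotropic Gaussian noise.
        r = np.exp(-0.5 * ((d[:, None] - mu[None, :]) / sigma) ** 2)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted mean displacement per model.
        mu = (r * d[:, None]).sum(axis=0) / r.sum(axis=0)
    return mu, r

rng = np.random.default_rng(3)
d = np.concatenate([rng.normal(0.0, 0.2, 100),    # background edges
                    rng.normal(5.0, 0.2, 100)])   # foreground edges
mu, resp = em_two_motions(d)
```

The recovered translations land near 0 and 5, and each row of `resp` is the soft edge-to-layer assignment that the paper then combines with color regions and an MRF prior.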


Subjects
Algorithms; Artificial Intelligence; Image Interpretation, Computer-Assisted/methods; Information Storage and Retrieval/methods; Movement/physiology; Pattern Recognition, Automated; Subtraction Technique; Cluster Analysis; Computer Graphics; Computer Simulation; Depth Perception; Humans; Image Enhancement/methods; Imaging, Three-Dimensional/methods; Motion; Numerical Analysis, Computer-Assisted; Reproducibility of Results; Sensitivity and Specificity; Signal Processing, Computer-Assisted
10.
Article in English | MEDLINE | ID: mdl-23365890

ABSTRACT

Implanted visual prostheses provide bionic vision with very low spatial and intensity resolution compared with healthy human vision. Vision processing converts camera video to low-resolution imagery for bionic vision with the aim of preserving salient features such as edges. Transformative Reality extends and improves upon traditional vision processing in three ways. Firstly, a combination of visual and non-visual sensors is used to provide multi-modal data of a person's surroundings. This enables the sensing of features that are difficult to sense with only a camera. Secondly, robotic sensing algorithms construct models of the world in real time. This enables the detection of complex features such as navigable empty ground or people. Thirdly, models are visually rendered so that visually complex entities such as people can be represented effectively at low resolution. Preliminary simulated prosthetic vision trials, where a head-mounted display is used to constrain a subject's vision to 25×25 binary phosphenes, suggest that Transformative Reality provides functional bionic vision for tasks such as indoor navigation, object manipulation and people detection in scenes where traditional processing is unusable.
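The 25×25 binary phosphene constraint can be reproduced with a simple reduction: block-average the camera image onto the phosphene grid and threshold each cell. A minimal NumPy sketch of that reduction (the trial's actual rendering pipeline is more involved; names and the threshold are illustrative):

```python
import numpy as np

def to_phosphenes(image, grid=25, threshold=0.5):
    """Reduce a grayscale image (values in [0, 1]) to a grid x grid binary
    phosphene pattern by block-averaging and thresholding."""
    h, w = image.shape
    ys = (np.arange(grid + 1) * h) // grid   # block row boundaries
    xs = (np.arange(grid + 1) * w) // grid   # block column boundaries
    out = np.empty((grid, grid))
    for i in range(grid):
        for j in range(grid):
            out[i, j] = image[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].mean()
    return out > threshold

img = np.zeros((100, 100))
img[:, 50:] = 1.0                     # bright right half of the scene
ph = to_phosphenes(img, grid=25)
```

Seeing how much detail this reduction destroys motivates the paper's approach: rather than downsampling raw video, render a simplified world model whose entities survive 25×25 binary display.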


Subjects
Algorithms; Models, Theoretical; Prosthesis Design; Robotics; Visual Prostheses; Humans; Robotics/instrumentation; Robotics/methods
11.
IEEE Trans Pattern Anal Mach Intell; 32(1): 105-119, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19926902

ABSTRACT

The repeatability and efficiency of a corner detector determines how likely it is to be useful in a real-world application. The repeatability is important because the same scene viewed from different positions should yield features which correspond to the same real-world 3D locations. The efficiency is important because this determines whether the detector combined with further processing can operate at frame rate. Three advances are described in this paper. First, we present a new heuristic for feature detection and, using machine learning, we derive a feature detector from this which can fully process live PAL video using less than 5 percent of the available processing time. By comparison, most other detectors cannot even operate at frame rate (Harris detector 115 percent, SIFT 195 percent). Second, we generalize the detector, allowing it to be optimized for repeatability, with little loss of efficiency. Third, we carry out a rigorous comparison of corner detectors based on the above repeatability criterion applied to 3D scenes. We show that, despite being principally constructed for speed, on these stringent tests, our heuristic detector significantly outperforms existing feature detectors. Finally, the comparison demonstrates that using machine learning produces significant improvements in repeatability, yielding a detector that is both very fast and of very high quality.
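The heuristic underlying this detector family is the segment test: a pixel is a corner if a long contiguous arc of the 16 pixels on a Bresenham circle of radius 3 around it is uniformly brighter or darker than the center by a threshold. A direct NumPy sketch of that test (the paper's contribution is a machine-learned decision tree that evaluates this far faster; the test itself is as below):

```python
import numpy as np

# Offsets (dy, dx) of the 16-pixel Bresenham circle of radius 3.
CIRCLE = [(0, 3), (1, 3), (2, 2), (3, 1), (3, 0), (3, -1), (2, -2), (1, -3),
          (0, -3), (-1, -3), (-2, -2), (-3, -1), (-3, 0), (-3, 1), (-2, 2), (-1, 3)]

def segment_test(img, y, x, t=20, n=9):
    """Return True if at least n contiguous circle pixels around (y, x)
    are all brighter than center + t or all darker than center - t."""
    c = int(img[y, x])
    vals = np.array([int(img[y + dy, x + dx]) for dy, dx in CIRCLE])
    for mask in (vals > c + t, vals < c - t):
        run, best = 0, 0
        for m in np.concatenate([mask, mask]):   # doubled for wrap-around
            run = run + 1 if m else 0
            best = max(best, run)
        if best >= n:
            return True
    return False

img = np.zeros((20, 20), dtype=np.uint8)
img[8:, 8:] = 255                     # bright square: corner at (8, 8)
```

At the square's corner, 11 contiguous circle pixels are darker than the center, so the test fires; in the flat interior no arc qualifies. Production detectors add non-maximal suppression on a corner score, which this sketch omits.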
