1.
PLoS One ; 17(7): e0269174, 2022.
Article in English | MEDLINE | ID: mdl-35834472

ABSTRACT

This paper presents a systematic study of the effects of hyperspectral pixel dimensionality reduction on the pixel classification task. We use five dimensionality reduction methods (PCA, KPCA, ICA, AE, and DAE) to compress 301-dimensional hyperspectral pixels. The compressed pixels are subsequently used to perform pixel classification. Pixel classification accuracies, together with the compression method, compression rates, and reconstruction errors, provide a new lens for studying the suitability of a compression method for the task of pixel classification. We use three high-resolution hyperspectral image datasets, representing three common landscape types (i.e., urban, transitional suburban, and forests), collected by the Remote Sensing and Spatial Ecosystem Modeling laboratory of the University of Toronto. We found that PCA, KPCA, and ICA show greater signal reconstruction capability; however, when compression rates exceed 90%, these methods show lower classification scores. The AE and DAE methods achieve better classification accuracy at a 95% compression rate, but their performance drops as the compression rate approaches 97%. Our results suggest that both the compression method and the compression rate are important considerations when designing a hyperspectral pixel classification pipeline.
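
To make the compression rates quoted above more concrete, the following minimal sketch converts a rate into the number of values kept per 301-dimensional pixel. It assumes the compression rate is defined as 1 - k/d, where k is the number of retained dimensions and d the number of input bands; the abstract does not state the definition, so both this formula and the function name `retained_dims` are assumptions made for illustration.

```python
def retained_dims(input_dims: int, compression_rate: float) -> int:
    """Number of dimensions kept at a given compression rate.

    Assumes compression_rate = 1 - k / input_dims. This definition is an
    assumption for illustration and is not taken from the paper. The result
    is rounded to the nearest integer and never drops below 1.
    """
    if input_dims <= 0:
        raise ValueError("input_dims must be positive")
    if not 0.0 <= compression_rate < 1.0:
        raise ValueError("compression_rate must be in [0, 1)")
    k = round(input_dims * (1.0 - compression_rate))
    return max(1, k)


if __name__ == "__main__":
    # The three rates mentioned in the abstract, applied to 301 bands.
    for rate in (0.90, 0.95, 0.97):
        print(f"{rate:.0%} compression -> {retained_dims(301, rate)} dimensions")
```

Under this assumed definition, a 95% rate keeps about 15 of the 301 values and a 97% rate keeps about 9, which gives a sense of how little room separates the two operating points discussed in the abstract.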


Subject(s)
Data Compression , Ecosystem , Data Compression/methods , Forests , Physical Phenomena
2.
IEEE Trans Pattern Anal Mach Intell ; 44(2): 905-923, 2022 Feb.
Article in English | MEDLINE | ID: mdl-32780697

ABSTRACT

There has been rapid growth in hardware and software systems capable of producing vast amounts of image and video data. These systems are rich sources of continuous image and video streams. This motivates researchers to build scalable computer vision systems that use data-streaming concepts to process visual data streams. However, several challenges exist in building large-scale computer vision systems. For example, computer vision algorithms have different accuracy and speed profiles depending on the content, type, and speed of incoming data. It is also unclear how to adaptively tune these algorithms in large-scale systems. These challenges exist because we lack formal frameworks for building and optimizing large-scale visual processing. This paper presents formal methods and algorithms that aim to overcome these challenges and to improve how large-scale computer vision systems are built and optimized. We describe a formal algebra framework for the mathematical description of computer vision pipelines for processing image and video streams. The algebra naturally describes feedback control and provides a formal and abstract method for optimizing computer vision pipelines. We then show that a general optimizer can be used with the feedback-control mechanisms of our stream algebra to provide a common online parameter optimization method for computer vision pipelines.

3.
J Imaging ; 7(12)2021 Nov 30.
Article in English | MEDLINE | ID: mdl-34940723

ABSTRACT

Powered wheelchairs have enhanced the mobility and quality of life of people with special needs. The next step in the development of powered wheelchairs is to incorporate sensors and electronic systems for new control applications and capabilities, such as obstacle avoidance or autonomous driving, that improve their usability and the safety of their operation. However, autonomous powered wheelchairs require safe navigation in different environments and scenarios, making their development complex. In our research, we propose instead to develop contactless control for powered wheelchairs in which the position of the caregiver is used as a control reference. Hence, we used a depth camera to recognize the caregiver and, at the same time, measure their relative distance from the powered wheelchair. In this paper, we compared two different approaches for real-time object recognition: a 3DHOG hand-crafted object descriptor based on a 3D extension of the histogram of oriented gradients (HOG), and a convolutional neural network based on YOLOv4-Tiny. To evaluate both approaches, we constructed Miun-Feet, a custom dataset of labeled images of the caregiver's feet in scenarios with different backgrounds, objects, and lighting conditions. The experimental results showed that the YOLOv4-Tiny approach outperformed 3DHOG in all the analyzed cases. In addition, the results showed that the recognition accuracy was not improved by using the depth channel, which makes it possible to use a monocular RGB camera instead of a depth camera and eases the computational cost and heat dissipation limitations. Hence, the paper proposes an additional method to compute the caregiver's distance and angle from the powered wheelchair using only the RGB data. This work shows that it is feasible to use the location of the caregiver's feet as a control signal for a powered wheelchair and that it is possible to use a monocular RGB camera to compute their relative positions.

4.
J Imaging ; 7(11)2021 Oct 27.
Article in English | MEDLINE | ID: mdl-34821858

ABSTRACT

Object detection for sky surveillance is a challenging problem because it involves small objects in a large volume and a constantly changing background, which requires high-resolution frames. One example is detecting flying birds in wind farms to prevent collisions with the wind turbines. This paper proposes a YOLOv4-based ensemble model for bird detection in grayscale videos captured around wind turbines in wind farms. To tackle this problem, we introduce two datasets, (1) Klim and (2) Skagen, collected at two locations in Denmark. We use the Klim training set to train three increasingly capable YOLOv4-based models. Model 1 uses YOLOv4 trained on the Klim dataset, Model 2 introduces tiling to improve small bird detection, and the last model uses tiling and temporal stacking and achieves the best mAP values on both the Klim and Skagen datasets. We used this model to set up an ensemble detector, which further improves mAP values on both datasets. The three models achieve testing mAP values of 82%, 88%, and 90% on the Klim dataset. The mAP values for Model 1 and Model 3 on the Skagen dataset are 60% and 92%. Improving object detection accuracy could help reduce bird mortality by informing the choice of locations for wind farms and for individual turbines. It can also be used to improve the collision avoidance systems used in wind energy facilities.
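
The tiling idea mentioned for Model 2 can be sketched as follows: a high-resolution frame is cut into overlapping tiles so that small objects occupy a larger share of each detector input, and boxes found in a tile are shifted back into frame coordinates. The abstract does not describe how tiling is implemented, so the tile size, the overlap, and the helper names `tile_origins` and `to_frame_coords` are hypothetical choices for illustration only.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


def _starts(length: int, tile: int, stride: int) -> List[int]:
    """Tile start offsets along one axis; the last tile sits flush with the edge."""
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile + 1, stride))
    if starts[-1] != length - tile:
        starts.append(length - tile)
    return starts


def tile_origins(width: int, height: int, tile: int, overlap: int) -> List[Tuple[int, int]]:
    """Top-left corners of square tiles covering a width x height frame.

    Tile size and overlap are hypothetical parameters, not values from the paper.
    """
    if tile <= 0 or not 0 <= overlap < tile:
        raise ValueError("need tile > 0 and 0 <= overlap < tile")
    stride = tile - overlap
    return [(x, y)
            for y in _starts(height, tile, stride)
            for x in _starts(width, tile, stride)]


def to_frame_coords(box: Box, origin: Tuple[int, int]) -> Box:
    """Shift a box detected inside a tile back into full-frame coordinates."""
    ox, oy = origin
    x0, y0, x1, y1 = box
    return (x0 + ox, y0 + oy, x1 + ox, y1 + oy)


if __name__ == "__main__":
    origins = tile_origins(width=1000, height=600, tile=400, overlap=100)
    print(origins)
    print(to_frame_coords((10, 20, 30, 40), origins[-1]))
```

Because neighbouring tiles overlap, the same bird may be detected twice; a complete pipeline would typically merge such duplicates, for example with non-maximum suppression, which this sketch leaves out.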

5.
IEEE Trans Pattern Anal Mach Intell ; 37(4): 847-61, 2015 Apr.
Article in English | MEDLINE | ID: mdl-26353298

ABSTRACT

We developed a new method for extracting 3D flight trajectories of droplets using high-speed stereo capture. We noticed that traditional multi-camera tracking techniques fare poorly on our problem, in part because all droplets have very similar shapes, sizes, and appearances. Our method uses local motion models to track individual droplets in each frame. The 2D tracks are used to learn a global, non-linear motion model, which in turn can be used to estimate the 3D locations of individual droplets even when these are not visible in any camera. We evaluated the proposed method on both synthetic and real data, and it is able to reconstruct the 3D flight trajectories of hundreds of droplets. The proposed technique solves for both the 3D trajectory of a droplet and its motion model concomitantly, and we have found it to be superior to 3D reconstruction via triangulation. Furthermore, the learned global motion model allows us to relax the simultaneity assumptions of stereo camera systems. Our results suggest that, even when full stereo information is available, our unsynchronized reconstruction using the global motion model can significantly improve the 3D estimation accuracy.
