Results 1 - 14 of 14
1.
Sensors (Basel) ; 24(11)2024 May 24.
Article in English | MEDLINE | ID: mdl-38894173

ABSTRACT

Pedestrian monitoring in crowded areas like train stations has an important impact on the overall operation and management of those public spaces. An organized distribution of the different elements located inside a station contributes not only to the safety of all passengers but also to a more efficient flow of regular activities, including entering/leaving the station, boarding/alighting from trains, and waiting. Such an improved distribution can only be achieved with sufficiently accurate information on passengers' positions and their derivatives, such as speeds, densities, and traffic flow. The work described here addresses this need with an artificial intelligence approach based on computer vision and convolutional neural networks. Using videos taken regularly at subway stations, two methods are tested. One tracks each person's bounding box, from which filtered 3D kinematics are derived, including position, velocity, and density. The other infers a person's pose and activity by analyzing their main body keypoints. Measurements of these quantities would enable a sensible and efficient design of inner spaces in places like railway and subway stations.
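As an illustration of how such derivative quantities can follow from tracked bounding boxes, the sketch below estimates a smoothed speed profile and a local density from per-frame ground-plane positions. It is a minimal example assuming positions have already been projected to metres; the function names, filter window, and density radius are illustrative, not the paper's.

```python
import numpy as np

def kinematics_from_track(positions, dt, win=5):
    """Smoothed speed from one person's tracked ground-plane positions
    (N x 2 array, metres), sampled every dt seconds."""
    vel = np.gradient(positions, dt, axis=0)       # per-frame velocity (m/s)
    speed = np.linalg.norm(vel, axis=1)            # scalar speed per frame
    kernel = np.ones(win) / win                    # moving-average filter
    return np.convolve(speed, kernel, mode="same")

def local_density(positions, centre, radius=2.0):
    """People per square metre within `radius` metres of `centre`."""
    d = np.linalg.norm(positions - centre, axis=1)
    return np.count_nonzero(d < radius) / (np.pi * radius ** 2)
```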

2.
Sensors (Basel) ; 23(6)2023 Mar 17.
Article in English | MEDLINE | ID: mdl-36991934

ABSTRACT

Methods based on 64-beam LiDAR can provide very precise 3D object detection. However, highly accurate LiDAR sensors are extremely costly: a 64-beam model can cost approximately USD 75,000. We previously proposed SLS-Fusion (sparse LiDAR and stereo fusion), which fuses a low-cost four-beam LiDAR with stereo cameras and outperforms most advanced stereo-LiDAR fusion methods. In this paper, we analyze how the stereo and LiDAR sensors contribute to the performance of the SLS-Fusion model for 3D object detection, as a function of the number of LiDAR beams used. Data coming from the stereo camera play a significant role in the fusion model, but it is necessary to quantify this contribution and to identify how it varies with the number of LiDAR beams used inside the model. Thus, to evaluate the roles of the parts of the SLS-Fusion network that represent the LiDAR and stereo camera architectures, we propose dividing the model into two independent decoder networks. The results of this study show that, starting from four beams, increasing the number of LiDAR beams has no significant impact on SLS-Fusion performance. The presented results can guide practitioners' design decisions.
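To experiment with this kind of beam analysis, a low-beam scan can be approximated from a 64-beam point cloud by bucketing points into elevation rows and keeping a subset. The paper does not specify this exact procedure, so the quantization strategy below is an assumption:

```python
import numpy as np

def subsample_beams(points, n_beams=4, total_beams=64):
    """Approximate an n-beam scan from a 64-beam point cloud (N x 3)
    by quantizing points into elevation rows and keeping every k-th row.
    A rough stand-in when the sensor's true ring index is unavailable."""
    elev = np.arcsin(points[:, 2] / np.linalg.norm(points, axis=1))
    rows = np.digitize(elev, np.linspace(elev.min(), elev.max(), total_beams))
    keep = set(range(0, total_beams, total_beams // n_beams))
    return points[np.isin(rows, list(keep))]
```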

3.
Sensors (Basel) ; 23(3)2023 Jan 26.
Article in English | MEDLINE | ID: mdl-36772438

ABSTRACT

Recently, the scientific community has placed great emphasis on the recognition of human activity, especially in the area of health and care for the elderly. There are already practical applications that recognize activities and unusual conditions using body sensors such as wrist-worn devices or neck pendants. These relatively simple devices may be prone to errors, can be uncomfortable to wear, may be forgotten or not worn, and are unable to detect more subtle conditions such as incorrect postures. Therefore, other proposed methods are based on images and videos to carry out human activity recognition, even in open spaces and with multiple people. However, the resulting increase in the size and complexity of image data requires the most recent advances in machine learning and deep learning. This paper presents a deep learning approach with attention mechanisms for recognizing activities from multiple frames. Feature extraction is performed by estimating the pose of the human skeleton, and classification is performed with a neural network based on Bidirectional Encoder Representations from Transformers (BERT). The algorithm was trained on the public UP-Fall dataset, with a Generative Adversarial Network (GAN) generating artificial data to balance the classes, and evaluated on real data, outperforming other activity recognition methods on the same dataset.
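A minimal sketch of a BERT-style encoder over per-frame skeleton keypoints, of the kind the abstract describes. All dimensions and layer counts are illustrative, positional encodings are omitted for brevity, and this is not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class PoseSequenceClassifier(nn.Module):
    """Transformer encoder over a window of 2D skeleton keypoints,
    classified from a learned [CLS] token (illustrative sizes)."""
    def __init__(self, n_joints=17, d_model=128, n_classes=12):
        super().__init__()
        self.embed = nn.Linear(n_joints * 2, d_model)   # (x, y) per joint
        self.cls = nn.Parameter(torch.zeros(1, 1, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, poses):                  # poses: (B, T, n_joints*2)
        x = self.embed(poses)
        cls = self.cls.expand(x.size(0), -1, -1)
        x = self.encoder(torch.cat([cls, x], dim=1))
        return self.head(x[:, 0])              # classify from [CLS] token
```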


Subjects
Algorithms, Neural Networks (Computer), Humans, Aged, Machine Learning, Skeleton, Posture
4.
Sensors (Basel) ; 22(11)2022 May 25.
Article in English | MEDLINE | ID: mdl-35684613

ABSTRACT

In recent years, much effort has been devoted to the development of applications capable of detecting different types of human activity. In this field, fall detection is particularly relevant, especially for the elderly. On the one hand, some applications use wearable sensors integrated into cell phones, necklaces, or smart bracelets to detect sudden movements of the person wearing the device. The main drawback of these systems is that the devices must be placed on the person's body: they can be uncomfortable, and such systems cannot be deployed in open spaces or with unfamiliar people. In contrast, other approaches perform activity recognition from video camera images, which has many advantages since the user is not required to wear any sensors; as a result, these applications can be deployed in open spaces and with unknown people. This paper presents a vision-based algorithm for activity recognition. The main contribution of this work is the use of human skeleton pose estimation as a feature extraction method for activity detection in video camera images, which allows the detection of multiple people's activities in the same scene. The algorithm can also classify multi-frame activities, i.e., those that need more than one frame to be detected. The method is evaluated on the public UP-FALL dataset and compared to similar algorithms using the same dataset.
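One way to support multi-frame activities for several people at once is to keep a rolling window of keypoints per tracked person and classify only full windows. A sketch under those assumptions; the window length and callback names are hypothetical:

```python
from collections import defaultdict, deque

WINDOW = 30  # frames per classification window (illustrative)

buffers = defaultdict(lambda: deque(maxlen=WINDOW))

def on_frame(detections, classify):
    """detections: {person_id: keypoints for this frame};
    classify: a model over a full window of keypoints.
    One rolling buffer per tracked person lets activities that span
    many frames (e.g., falls) be recognised independently per person."""
    labels = {}
    for pid, kps in detections.items():
        buffers[pid].append(kps)
        if len(buffers[pid]) == WINDOW:
            labels[pid] = classify(list(buffers[pid]))
    return labels
```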


Subjects
Algorithms, Human Activities, Aged, Humans, Skeleton
5.
J Imaging ; 7(11)2021 Oct 26.
Article in English | MEDLINE | ID: mdl-34821856

ABSTRACT

Breast cancer is one of the leading causes of death among women, more than any other cancer. Its accurate diagnosis is very difficult due to the complexity of the disease, changing treatment procedures, and different patient population samples. Diagnostic techniques with better performance are very important for personalized care and treatment and to reduce and control the recurrence of cancer. The main objective of this research was to apply feature selection techniques, using correlation analysis and the variance of input features, before passing the significant features to a classification method. We used an ensemble method to improve the classification of breast cancer. The proposed approach was evaluated on the public Wisconsin Breast Cancer Dataset (WBCD). Correlation analysis and principal component analysis were used for dimensionality reduction. Performance was evaluated for well-known machine learning classifiers, and the best seven were chosen for the next step. Hyperparameter tuning was performed to improve their performance. The best-performing classification algorithms were then combined with two different voting techniques: hard voting predicts the class that receives the majority of votes, whereas soft voting predicts the class with the highest average probability. The proposed approach performed better than state-of-the-art work, achieving an accuracy of 98.24%, high precision (99.29%), and a recall of 95.89%.
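The overall pipeline can be approximated with scikit-learn, which ships the diagnostic variant of the Wisconsin data. The sketch below drops one feature of each highly correlated pair, reduces dimensionality with PCA, and soft-votes over three classifiers; the paper tuned seven classifiers, so the estimator list and thresholds here are illustrative:

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)   # sklearn's copy of the WBCD

# Correlation-based selection: drop one of each highly correlated pair.
corr = np.corrcoef(X, rowvar=False)
drop = {j for i in range(corr.shape[0]) for j in range(i + 1, corr.shape[0])
        if abs(corr[i, j]) > 0.95}
X = np.delete(X, list(drop), axis=1)

ensemble = make_pipeline(
    StandardScaler(),
    PCA(n_components=10),
    VotingClassifier(
        estimators=[("lr", LogisticRegression(max_iter=1000)),
                    ("svc", SVC(probability=True)),
                    ("dt", DecisionTreeClassifier())],
        voting="soft"))                       # soft = average probabilities

print(cross_val_score(ensemble, X, y, cv=5).mean())
```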

6.
Sensors (Basel) ; 21(20)2021 Oct 09.
Article in English | MEDLINE | ID: mdl-34695925

ABSTRACT

The role of sensors such as cameras or LiDAR (Light Detection and Ranging) is crucial for the environmental awareness of self-driving cars. However, the data collected from these sensors are subject to distortions in extreme weather conditions such as fog, rain, and snow, which can lead to safety problems while operating a self-driving vehicle. The purpose of this study is to analyze the effects of fog on the detection of objects in driving scenes and to propose methods for improvement. Collecting and processing data in adverse weather is often more difficult than in good weather, so a synthetic dataset that simulates bad weather conditions is a simpler and more economical way to validate a method before working with real data. In this paper, we apply fog synthesis to the public KITTI dataset to generate the Multifog KITTI dataset, covering both images and point clouds. In terms of processing tasks, we test our previous camera-LiDAR 3D object detector, the Sparse LiDAR Stereo Fusion network (SLS-Fusion), to see how it is affected by foggy weather conditions. We propose training on both the original and the augmented dataset to improve performance in fog while keeping good performance under normal conditions. Experiments on the KITTI and the proposed Multifog KITTI datasets show that, before any improvement, 3D detection performance on Moderate objects drops by 42.67% in foggy conditions. With the proposed training strategy, results improve significantly by 26.72%, with a drop of only 8.23% on the original dataset. In summary, fog often causes 3D detection to fail on driving scenes; by additionally training with the augmented dataset, we significantly improve the performance of the proposed 3D object detection algorithm for self-driving cars in foggy weather conditions.
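Image fog synthesis of this kind is commonly built on the standard atmospheric scattering model. A minimal sketch, assuming per-pixel depth is available (as in KITTI); the fog density and airlight values are illustrative, and the paper's pipeline may differ in detail:

```python
import numpy as np

def add_fog(image, depth, beta=0.05, airlight=0.9):
    """Atmospheric scattering model:
        I(x) = J(x) * t(x) + A * (1 - t(x)),  t(x) = exp(-beta * d(x))
    where J is the clear image (float array in [0, 1]), d the per-pixel
    depth in metres, beta the fog density, and A the airlight."""
    t = np.exp(-beta * depth)[..., None]      # (H, W, 1) transmission map
    return image * t + airlight * (1.0 - t)
```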


Subjects
Automobile Driving, Algorithms, Rain, Research Design, Weather
7.
Comput Intell Neurosci ; 2021: 5570870, 2021.
Article in English | MEDLINE | ID: mdl-34007266

ABSTRACT

Classroom communication involves the teacher's behavior and the students' responses. Extensive research has been done on the analysis of students' facial expressions, but the impact of the instructor's facial expressions is still an unexplored area of research. Facial expression recognition has the potential to predict the impact of a teacher's emotions in a classroom environment. Intelligent assessment of instructor behavior during lecture delivery could not only improve the learning environment but also save the time and resources consumed by manual assessment strategies. To address this, we propose an approach for recognizing the instructor's facial expressions within a classroom using a feedforward learning model. First, the face is detected in the acquired lecture videos and key frames are selected, discarding all redundant frames for effective high-level feature extraction. Then, deep features are extracted using multiple convolutional neural networks, along with parameter tuning, and fed to a classifier. For fast learning and good generalization, a regularized extreme learning machine (RELM) classifier is employed, which classifies five different expressions of the instructor within the classroom. Experiments are conducted on a newly created dataset of instructors' facial expressions in classroom environments plus three benchmark facial datasets: Cohn-Kanade, the Japanese Female Facial Expression (JAFFE) dataset, and the Facial Expression Recognition 2013 (FER2013) dataset. Furthermore, the proposed method is compared with state-of-the-art techniques, traditional classifiers, and convolutional neural models. Experimental results indicate significant performance gains in accuracy, F1-score, and recall.
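For reference, a regularized extreme learning machine reduces to a random hidden layer followed by a closed-form ridge regression for the output weights. A minimal NumPy sketch, not the paper's exact model; hidden-layer size and regularization strength are illustrative:

```python
import numpy as np

class RELM:
    """Regularized extreme learning machine: fixed random hidden layer,
    output weights solved in closed form by ridge regression."""
    def __init__(self, n_hidden=500, lam=1e-2, seed=0):
        self.n_hidden, self.lam = n_hidden, lam
        self.rng = np.random.default_rng(seed)

    def _h(self, X):
        return np.tanh(X @ self.W + self.b)    # random hidden activations

    def fit(self, X, y, n_classes):
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        T = np.eye(n_classes)[y]               # one-hot targets
        H = self._h(X)
        # Closed-form ridge solution: beta = (H'H + lam*I)^-1 H'T
        A = H.T @ H + self.lam * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)
        return self

    def predict(self, X):
        return (self._h(X) @ self.beta).argmax(axis=1)
```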


Subjects
Facial Recognition, Emotions, Face, Facial Expression, Female, Humans, Neural Networks (Computer)
8.
Sensors (Basel) ; 20(21)2020 Nov 02.
Article in English | MEDLINE | ID: mdl-33147784

ABSTRACT

The main source of delays in public transport systems (buses, trams, metros, railways) is their stations. For example, a public transport vehicle may travel at 60 km/h between stations, but its commercial speed (average en-route speed, including any intermediate delay) does not reach more than half of that value. Therefore, the problem that public transport operators must solve is how to reduce the delay in stations. From the perspective of transport engineering, there are several ways to approach this issue, from the design of infrastructure and vehicles to passenger traffic management. The tools normally available to traffic engineers are analytical models, microscopic traffic simulation, and, ultimately, real-scale laboratory experiments. In any case, the required data are the number of passengers that get on and off the vehicles, as well as the number of passengers waiting on platforms. Traditionally, such data have been collected manually, by field counts or through videos that are then processed by hand. On the other hand, public transport networks, especially metropolitan railways, have an extensive monitoring infrastructure based on standard video cameras. These are traditionally observed manually or with very basic signal processing support, so there is significant scope for improving data capture and for automating the analysis of site usage, safety, and surveillance. This article shows a way of collecting and analyzing the data needed both to feed traffic models and to analyze laboratory experimentation, exploiting recent intelligent sensing approaches. The paper presents a new public video dataset gathered from real-scale laboratory recordings. Part of this dataset has been annotated by hand, marking up head locations to provide a ground truth on which to train and evaluate deep learning detection and tracking algorithms. Tracking outputs are then used to count people getting on and off, achieving a mean accuracy of 92% with less than 0.15% standard deviation on 322 mostly unseen video sequences.
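Given per-person head tracks, boarding/alighting counts reduce to detecting crossings of a virtual gate line such as a door threshold. A sketch with purely illustrative geometry; the gate position, track format, and direction convention are assumptions:

```python
def count_crossings(tracks, gate_y):
    """Count tracked heads crossing a virtual gate at image row gate_y.
    tracks maps a person id to its list of (x, y) head positions over
    time; a sign change relative to the gate marks one crossing."""
    boarding = alighting = 0
    for positions in tracks.values():
        ys = [y for _, y in positions]
        for prev, curr in zip(ys, ys[1:]):
            if prev < gate_y <= curr:
                boarding += 1       # moved downward across the gate
            elif prev >= gate_y > curr:
                alighting += 1      # moved upward across the gate
    return boarding, alighting
```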


Subjects
Algorithms, Motor Vehicles, Computer-Assisted Signal Processing, Transportation, Video Recording, Humans
9.
Sensors (Basel) ; 20(7)2020 Mar 25.
Article in English | MEDLINE | ID: mdl-32218350

ABSTRACT

We present a deep learning-based multitask framework for joint 3D human pose estimation and action recognition from simple RGB cameras. The approach proceeds in two stages. In the first, a real-time 2D pose detector determines the precise pixel locations of important keypoints of the human body, and a two-stream deep neural network is designed and trained to map the detected 2D keypoints into 3D poses. In the second stage, the Efficient Neural Architecture Search (ENAS) algorithm is deployed to find an optimal network architecture that models the spatio-temporal evolution of the estimated 3D poses via an image-based intermediate representation and performs action recognition. Experiments on the Human3.6M, MSR Action3D, and SBU Kinect Interaction datasets verify the effectiveness of the proposed method on the targeted tasks. Moreover, the method requires a low computational budget for training and inference. In particular, the experimental results show that, using a monocular RGB sensor, we can develop a 3D pose estimation and human action recognition approach that reaches the performance of RGB-depth sensors. This opens up many opportunities for leveraging RGB cameras (which are much cheaper than depth cameras and extensively deployed in private and public places) to build intelligent recognition systems.
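The first-stage 2D-to-3D lifting can be sketched as a small network mapping detected 2D keypoints to 3D joint positions. The single-stream MLP below is a stand-in for the paper's two-stream design; layer sizes and joint count are illustrative:

```python
import torch.nn as nn

class Lifter2Dto3D(nn.Module):
    """Lift 2D keypoints (B, n_joints*2) to 3D joints (B, n_joints, 3).
    A simplified single-stream version of the lifting stage."""
    def __init__(self, n_joints=17, width=1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_joints * 2, width), nn.ReLU(),
            nn.Linear(width, width), nn.ReLU(),
            nn.Linear(width, n_joints * 3))    # (x, y, z) per joint

    def forward(self, kp2d):
        return self.net(kp2d).view(-1, kp2d.size(1) // 2, 3)
```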

10.
Sensors (Basel) ; 20(4)2020 Feb 14.
Article in English | MEDLINE | ID: mdl-32075119

ABSTRACT

Vehicle make and model recognition (VMMR) is a key task for automated vehicular surveillance (AVS) and various intelligent transport system (ITS) applications. In this paper, we propose and study the suitability of the bag of expressions (BoE) approach for VMMR-based applications. The method includes neighborhood information in addition to visual words, improving on the established bag of words (BoW) approach in occlusion handling, scale invariance, and view independence. The proposed approach extracts features using a combination of different keypoint detectors and a Histogram of Oriented Gradients (HOG) descriptor. An optimized dictionary of expressions is formed from visual words acquired through k-means clustering, and the histogram of expressions for an image is created by counting the occurrences of each expression in it. For classification, a multiclass linear support vector machine (SVM) is trained over the BoE feature representation. The approach has been evaluated with cross-validation on the publicly available National Taiwan Ocean University-Make and Model Recognition (NTOU-MMR) dataset, and experimental results show that it outperforms recent VMMR approaches. With multiclass linear SVM classification, promising average accuracy and processing speed are obtained using a combination of keypoint detectors with HOG-based BoE description, making it applicable to real-time VMMR systems.
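The dictionary and histogram steps follow the standard bag-of-words recipe, which the sketch below reproduces with scikit-learn; the BoE-specific neighborhood encoding is omitted, and the vocabulary size is illustrative:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

def build_vocabulary(descriptor_list, k=400):
    """Cluster local HOG/keypoint descriptors (one array per training
    image) into k visual words. The BoE step would additionally encode
    each word's neighborhood into expressions."""
    return KMeans(n_clusters=k, n_init=10).fit(np.vstack(descriptor_list))

def image_histogram(vocab, descriptors):
    """L1-normalised histogram of visual-word occurrences for one image."""
    words = vocab.predict(descriptors)
    hist = np.bincount(words, minlength=vocab.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)

# Training: one histogram per image, then a multiclass linear SVM, e.g.
#   X_train = np.array([image_histogram(vocab, d) for d in train_descs])
#   clf = LinearSVC().fit(X_train, y_train)
```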

11.
J Imaging ; 6(6)2020 May 29.
Article in English | MEDLINE | ID: mdl-34460585

ABSTRACT

Breast cancer is among the most common causes of death for women worldwide, so the ability of artificial intelligence systems to detect possible breast cancer is very important. In this paper, an ensemble classification mechanism is proposed based on majority voting. First, the performance of different state-of-the-art machine learning classification algorithms was evaluated on the Wisconsin Breast Cancer Dataset (WBCD). The three best classifiers were then selected based on their F3 score; the F3 score is used to emphasize the importance of false negatives (recall) in breast cancer classification. These three classifiers (simple logistic regression, a support vector machine with stochastic gradient descent optimization, and a multilayer perceptron network) are then combined into an ensemble using a voting mechanism. We evaluated both hard and soft voting: for hard voting, a majority-based mechanism was used, and for soft voting we used voting methods based on the average, product, maximum, and minimum of the predicted probabilities. The hard (majority-based) voting mechanism shows the best performance, with 99.42% accuracy, compared to the state-of-the-art algorithms for the WBCD.
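The four soft-voting rules named here are simple reductions over the per-classifier probability matrix, and hard voting is a majority count. A NumPy sketch; the array shapes are assumptions:

```python
import numpy as np

def soft_vote(probas, rule="average"):
    """Fuse class probabilities from C classifiers over K classes
    (probas: C x K array) with one of the four soft-voting rules."""
    rules = {"average": probas.mean(axis=0),
             "product": probas.prod(axis=0),
             "maximum": probas.max(axis=0),
             "minimum": probas.min(axis=0)}
    return int(np.argmax(rules[rule]))

def hard_vote(predictions):
    """Majority vote over per-classifier integer label predictions."""
    return int(np.bincount(predictions).argmax())
```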

12.
Sensors (Basel) ; 19(12)2019 Jun 21.
Article in English | MEDLINE | ID: mdl-31234366

ABSTRACT

Human action recognition (HAR) has emerged as a core research domain for video understanding and analysis, attracting many researchers. Although significant results have been achieved in simple scenarios, HAR remains challenging due to view dependence, occlusion, and the inter-class variation observed in realistic scenarios. Previous research efforts have widely used the classical bag of visual words approach and its variations. In this paper, we propose a Dynamic Spatio-Temporal Bag of Expressions (D-STBoE) model for human action recognition that retains the strengths of the classical bag of visual words approach. Expressions are formed based on the density of a spatio-temporal cube around a visual word, and, to handle inter-class variation, we use class-specific visual word representations for expression generation. In contrast to the Bag of Expressions (BoE) model, which constructs neighborhoods with a fixed number of neighbors and can therefore include non-relevant information that makes an expression less discriminative under occlusion and changing viewpoints, the formation of visual expressions here is based on the density of the spatio-temporal cubes built around each visual word. This makes the model more robust to the occlusion and viewpoint challenges present in realistic scenarios. Furthermore, we train a multiclass Support Vector Machine (SVM) to classify bags of expressions into action classes. Comprehensive experiments on four publicly available datasets (KTH, UCF Sports, UCF11, and UCF50) show that the proposed model outperforms existing state-of-the-art human action recognition methods, reaching accuracies of 99.21%, 98.60%, 96.94%, and 94.10%, respectively.
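Density-based expression formation can be sketched as gathering, for each detected visual word, the labels of the words that fall inside a spatio-temporal cube around it. The following is an illustrative reading of that step, not the paper's exact encoding:

```python
import numpy as np

def expressions_from_density(word_points, labels, radius):
    """For each visual-word detection at a spatio-temporal point
    (x, y, t), gather the sorted labels of all detections inside a cube
    of half-width `radius` around it. word_points: (N, 3); labels: (N,).
    Cube size and label encoding are illustrative."""
    expressions = []
    for p in word_points:
        inside = np.all(np.abs(word_points - p) <= radius, axis=1)
        expressions.append(np.sort(labels[inside]))
    return expressions
```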


Subjects
Human Activities, Computer-Assisted Image Processing/methods, Automated Pattern Recognition/methods, Spatio-Temporal Analysis, Algorithms, Humans, Sports/physiology, Video Recording
13.
Sensors (Basel) ; 19(8)2019 Apr 24.
Article in English | MEDLINE | ID: mdl-31022945

ABSTRACT

Designing motion representations for 3D human action recognition from skeleton sequences is an important yet challenging task. An effective representation should be robust to noise, invariant to viewpoint changes, and perform well at low computational cost. The two main challenges are how to efficiently represent the spatio-temporal patterns of skeletal movements and how to learn discriminative features from them for classification. This paper presents a novel skeleton-based representation and a deep learning framework for 3D action recognition using RGB-D sensors. We propose building an action map called the SPMF (Skeleton Posture-Motion Feature), a compact image representation built from skeleton poses and their motions. An Adaptive Histogram Equalization (AHE) algorithm is then applied to the SPMF to enhance its local patterns, forming an enhanced action map, the Enhanced-SPMF. For learning and classification, we exploit deep convolutional neural networks based on the DenseNet architecture to directly learn an end-to-end mapping from input skeleton sequences to action labels via the Enhanced-SPMFs. The proposed method is evaluated on four challenging benchmark datasets covering individual actions, interactions, multiview, and large-scale settings. The experimental results demonstrate that the proposed method outperforms previous state-of-the-art approaches on all benchmarks, while requiring little computational time for training and inference.
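In the spirit of the SPMF, a skeleton sequence can be encoded as a color image with frames along one axis, joints along the other, and normalized coordinates as RGB, then enhanced with adaptive histogram equalization. A simplified sketch; the real SPMF also encodes explicit motion between consecutive poses:

```python
import numpy as np
from skimage import exposure

def skeleton_to_action_map(seq):
    """Encode a skeleton sequence (T frames x J joints x 3 coords) as a
    J x T colour image: normalised (x, y, z) mapped to RGB channels."""
    lo, hi = seq.min(), seq.max()
    img = (seq - lo) / (hi - lo + 1e-8)        # scale to [0, 1]
    img = np.transpose(img, (1, 0, 2))         # joints x frames x 3
    # Per-channel adaptive histogram equalisation enhances local
    # patterns (the Enhanced-SPMF step).
    return np.stack([exposure.equalize_adapthist(img[..., c])
                     for c in range(3)], axis=-1)
```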

14.
IEEE Trans Cybern ; 44(6): 936-49, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24144690

ABSTRACT

A novel embedding-based dimensionality reduction approach, called structural Laplacian Eigenmaps, is proposed to learn models representing any concept that can be defined by a set of multivariate sequences. The approach expresses the intrinsic structure of the multivariate sequences as structural constraints, which are imposed on the dimensionality reduction process to generate a compact, data-driven manifold in a low-dimensional space. This manifold is a mathematical representation of the intrinsic nature of the concept of interest, regardless of the stylistic variability found in its instances. The approach is further extended to jointly model several related concepts within a unified representation, creating a continuous space between concept manifolds. Since a generated manifold encodes the unique characteristics of the concept of interest, it can be employed to classify unknown instances of concepts. Exhaustive experimental evaluation on different datasets confirms the superiority of the proposed methodology over other state-of-the-art dimensionality reduction methods. Finally, the practical value of this novel dimensionality reduction method is demonstrated in three challenging computer vision applications: view-dependent and view-independent action recognition, as well as human-human interaction classification.
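For context, plain Laplacian eigenmaps (the baseline on which the structural constraints are imposed) can be written in a few lines: heat-kernel affinities, a graph Laplacian, and the bottom non-trivial generalized eigenvectors. A dense-matrix sketch with illustrative parameters:

```python
import numpy as np
from scipy.linalg import eigh
from scipy.sparse.csgraph import laplacian

def laplacian_eigenmaps(X, n_components=2, sigma=1.0):
    """Embed samples X (N x D) into n_components dimensions via the
    generalised eigenproblem L v = lambda D v of the affinity graph."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))         # heat-kernel affinities
    np.fill_diagonal(W, 0.0)
    L, D = laplacian(W, normed=False, return_diag=True)
    vals, vecs = eigh(L, np.diag(D))           # ascending eigenvalues
    return vecs[:, 1:n_components + 1]         # skip the constant vector
```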


Subjects
Algorithms, Artificial Intelligence, Automated Pattern Recognition/methods, Factual Databases, Humans, Movement/physiology, Video Recording