Results 1 - 20 of 158
1.
Sensors (Basel) ; 24(13)2024 Jun 24.
Article in English | MEDLINE | ID: mdl-39000878

ABSTRACT

Fourier Ptychographic Microscopy (FPM) is a microscopy imaging technique based on optical principles. It employs Fourier optics to separate and combine different optical information from a sample. However, noise introduced during the imaging process often results in poor resolution of the reconstructed image. This article presents an approach based on a residual local mixture network to improve the quality of Fourier ptychographic reconstruction images. By incorporating channel attention and spatial attention into the FPM reconstruction process, the network improves reconstruction efficiency and reduces reconstruction time. Additionally, the introduction of a Gaussian diffusion model further reduces coherent artifacts and improves image reconstruction quality. Comparative experimental results indicate that this network achieves better reconstruction quality, outperforming existing methods in both subjective observation and objective quantitative evaluation.

2.
Sensors (Basel) ; 24(4)2024 Feb 19.
Article in English | MEDLINE | ID: mdl-38400482

ABSTRACT

The common channel attention mechanism maps feature statistics to feature weights. However, the effectiveness of this mechanism may not be assured in remote sensing images due to statistical differences across multiple bands. This paper proposes a novel channel attention mechanism based on feature information, called the feature information entropy attention mechanism (FEM). The FEM constructs a relationship between features based on feature information entropy and then maps this relationship to their importance. The Vaihingen and OpenEarthMap datasets are selected for experiments. The proposed method is compared with the squeeze-and-excitation mechanism (SEM), the convolutional block attention mechanism (CBAM), and the frequency channel attention mechanism (FCA). Compared with these three channel attention mechanisms, the mIoU of the FEM on the Vaihingen dataset is improved by 0.90%, 1.10%, and 0.40%, and on the OpenEarthMap dataset by 2.30%, 2.20%, and 2.10%, respectively. The proposed channel attention mechanism shows better performance in remote sensing land use classification.
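The first sentence of this abstract, that common channel attention maps feature statistics to feature weights, can be illustrated with a minimal squeeze-and-excitation style sketch. This is not the FEM proposed in the paper; the function name `se_channel_attention` and the tiny weight matrices `w1` and `w2` are hypothetical and chosen only to keep the example readable.

```python
import math

def se_channel_attention(feature_maps, w1, w2):
    """Illustrative squeeze-and-excitation style gating (not the paper's FEM).

    feature_maps: list of C channels, each a 2D list (H x W).
    w1: hidden x C weight matrix, w2: C x hidden weight matrix (hypothetical).
    """
    # Squeeze: one statistic (the global average) per channel.
    squeezed = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature_maps]
    # Excitation: bottleneck layer with ReLU, then one sigmoid gate per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: reweight every channel by its gate.
    scaled = [[[g * v for v in row] for row in ch]
              for g, ch in zip(gates, feature_maps)]
    return gates, scaled

features = [
    [[1.0, 1.0], [1.0, 1.0]],  # channel 0, mean 1.0
    [[0.0, 2.0], [4.0, 2.0]],  # channel 1, mean 2.0
]
w1 = [[0.5, 0.5]]              # 2 channels -> 1 hidden unit (made-up weights)
w2 = [[1.0], [-1.0]]           # 1 hidden unit -> 2 channel gates (made-up weights)
gates, scaled = se_channel_attention(features, w1, w2)
print(gates)
```

Because the gates depend only on per-channel averages, bands with very different statistics can receive misleading weights, which is the limitation the abstract points to.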

3.
Sensors (Basel) ; 24(3)2024 Jan 23.
Article in English | MEDLINE | ID: mdl-38339433

ABSTRACT

Around 70 million people worldwide are affected by epilepsy, a neurological disorder characterized by non-induced seizures that occur at irregular and unpredictable intervals. During an epileptic seizure, transient symptoms emerge as a result of extreme abnormal neural activity. Epilepsy imposes limitations on individuals and has a significant impact on the lives of their families. Therefore, the development of reliable diagnostic tools for the early detection of this condition is considered beneficial to alleviate the social and emotional distress experienced by patients. While the Bonn University dataset contains five collections of EEG data, few studies focus specifically on subsets D and E. These subsets correspond to EEG recordings from the epileptogenic zone during ictal and interictal events. In this work, the parallel ictal-net (PIN) neural network architecture is introduced, which uses scalograms obtained through a continuous wavelet transform to achieve high-accuracy classification of EEG signals into ictal or interictal states. The results demonstrate the effectiveness of the proposed PIN model in distinguishing between ictal and interictal events with a high degree of confidence. This is validated by the computed accuracy, precision, recall, and F1 scores, all of which consistently reach around 99%, surpassing previous approaches in the related literature.


Subjects
Electroencephalography , Epilepsy , Humans , Electroencephalography/methods , Seizures/diagnosis , Epilepsy/diagnosis , Neural Networks, Computer , Wavelet Analysis
4.
Sensors (Basel) ; 24(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38276379

ABSTRACT

Image dehazing has become a crucial prerequisite for most outdoor computer applications. The majority of existing dehazing models can remove haze, but they fail to preserve colors and fine details. To address this problem, we introduce a novel high-performing attention-based dehazing model (ADMC2-net) that incorporates both RGB and HSV color spaces to maintain color properties. The model consists of two parallel densely connected sub-models (RGB and HSV) followed by a new efficient attention module. This attention module comprises pixel-attention and channel-attention mechanisms to extract more haze-relevant features. Experimental analyses show that the proposed model (ADMC2-net) achieves superior results on synthetic and real-world datasets and outperforms most state-of-the-art methods.
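As a small illustration of feeding the same image to an RGB branch and an HSV branch, the sketch below converts pixels with the standard-library `colorsys` module. The helper name `rgb_and_hsv_views` and the flat pixel-list representation are assumptions made for this example and say nothing about how ADMC2-net itself prepares its inputs.

```python
import colorsys

def rgb_and_hsv_views(pixels):
    """Return the RGB view and the HSV view of the same pixels.

    pixels: list of (r, g, b) tuples with components in [0, 1].
    """
    rgb = [tuple(p) for p in pixels]
    hsv = [colorsys.rgb_to_hsv(*p) for p in pixels]
    return rgb, hsv

# A saturated red pixel and a grey, haze-like pixel.
rgb, hsv = rgb_and_hsv_views([(1.0, 0.0, 0.0), (0.5, 0.5, 0.5)])
for original, converted in zip(rgb, hsv):
    print(original, "->", converted)
```

In the HSV view the grey pixel has zero saturation, which hints at why a separate HSV branch can help a model reason about washed-out colors.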

5.
Sensors (Basel) ; 24(18)2024 Sep 13.
Article in English | MEDLINE | ID: mdl-39338697

ABSTRACT

Deep learning models are limited in extracting features from one-dimensional signals, and high model complexity leads to low training accuracy and heavy consumption of computing resources. To address these issues, this paper proposes a rolling bearing fault diagnosis method based on the Gramian Angular Field (GAF) and an enhanced lightweight residual network. First, the one-dimensional signal is transformed into a two-dimensional GAF image, fully preserving the signal's temporal dependency. Second, to address the parameter redundancy and high computational complexity of the ResNet-18 model, its residual blocks are improved. The second convolutional layer in the downsampling residual blocks is removed, traditional convolutional layers are replaced with depthwise separable convolutions, and the lightweight Efficient Channel Attention (ECA) module is embedded after each residual block. This further enhances the model's ability to capture key features while maintaining low computational cost, resulting in a lightweight model referred to as E-ResNet13. Finally, the generated GAF feature maps are fed into the E-ResNet13 model for training and, through a global average pooling layer, are mapped to a fully connected layer that classifies the rolling bearing faults. Experimental results show that the GAF image encoding method achieves higher fault recognition accuracy than other encoding methods. Compared with other intelligent diagnosis methods, the E-ResNet13 model demonstrates strong diagnostic performance and generalization capability under both a single condition and complex varying conditions, supporting the novelty and practicality of the method.
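The first step, turning a one-dimensional signal into a two-dimensional GAF image, can be sketched as follows. The abstract does not say whether the summation or the difference variant is used; this example assumes the summation field (entries cos(phi_i + phi_j)) with min-max rescaling to [-1, 1], and the function name is hypothetical.

```python
import math

def gramian_angular_summation_field(series):
    """Encode a 1D series as a 2D Gramian Angular Summation Field (assumed variant)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    # Rescale to [-1, 1] so that arccos is defined for every sample.
    scaled = [(2.0 * x - hi - lo) / (hi - lo) for x in series]
    # Polar encoding: each sample becomes an angle (clamped against rounding).
    phi = [math.acos(max(-1.0, min(1.0, x))) for x in scaled]
    # Entry (i, j) relates time steps i and j, so temporal order is kept
    # along both image axes.
    return [[math.cos(a + b) for b in phi] for a in phi]

gasf = gramian_angular_summation_field([0.0, 1.0, 2.0])
for row in gasf:
    print([round(v, 3) for v in row])
```

The resulting matrix is symmetric, and its size grows quadratically with the signal length, so long vibration signals are usually segmented or downsampled before encoding.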

6.
Sensors (Basel) ; 24(14)2024 Jul 22.
Article in English | MEDLINE | ID: mdl-39066153

ABSTRACT

Recent research has made significant progress in automated unmanned systems that use Artificial Intelligence (AI)-based image processing to optimize the rebar manufacturing process and minimize defects such as twisting during production. Despite various studies, including those employing data augmentation through Generative Adversarial Networks (GANs), the performance of rebar twist prediction has been limited by image quality degradation caused by environmental noise, such as inconsistent lighting conditions, in rebar processing environments. To address these challenges, we propose a novel approach for real-time rebar twist prediction in manufacturing processes. Our method restores low-quality grayscale images to high resolution and employs an object detection model to identify and track rebar endpoints. We then apply regression analysis to the coordinates obtained from the bounding boxes to estimate the error rate of the rebar endpoint positions, thereby determining the occurrence of twisting. To achieve this, we first developed a Unified-Channel Attention (UCA) module that is robust to changes in intensity and contrast in grayscale images. The UCA can be integrated into image restoration models so that object detection models detect rebar endpoint characteristics more accurately. Furthermore, we introduce a method for predicting the future positions of rebar endpoints using various linear and non-linear regression models. The predicted positions are used to calculate the error rate in rebar endpoint locations, determined by the distance between the actual and predicted positions, which is then used to classify the presence of rebar twisting. Our experimental results demonstrate that integrating the UCA module with our image restoration model significantly improves on existing models in the Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM) metrics. Moreover, employing regression models to predict future rebar endpoint positions enhances the F1 score for twist prediction. As a result, our approach offers a practical solution for rapid defect detection in rebar manufacturing processes.

7.
Sensors (Basel) ; 24(4)2024 Feb 17.
Article in English | MEDLINE | ID: mdl-38400437

ABSTRACT

Most current trajectory prediction algorithms have difficulty simulating actual traffic behavior and still suffer from large prediction errors. Therefore, this paper proposes a multi-object trajectory prediction algorithm based on lane information and foresight information. A Hybrid Dilated Convolution module based on the Channel Attention mechanism (CA-HDC) is developed to extract features, which improves lane feature extraction in complicated environments and addresses the poor robustness of the original PINet. A lane information fusion module and a trajectory adjustment module based on the foresight information are developed. A socially acceptable trajectory model based on Generative Adversarial Networks (S-GAN) is developed to reduce the error of the trajectory prediction algorithm. The lane detection accuracy in special scenarios such as crowded, shadow, arrow, crossroad, and night is improved on the CULane dataset. The average F1-measure of the proposed lane detection is increased by 4.1% compared to the original PINet. The trajectory prediction test based on D2-City indicates that the average displacement error of the proposed trajectory prediction algorithm is reduced by 4.27%, and the final displacement error is reduced by 7.53%. The proposed algorithm achieves good results in lane detection and multi-object trajectory prediction tasks.
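The two trajectory metrics quoted above can be computed as in the following sketch, which uses the usual definitions: average displacement error (ADE) is the mean Euclidean distance over all predicted time steps, and final displacement error (FDE) is the distance at the last step. The function name and the toy trajectories are illustrative and are not taken from the paper.

```python
import math

def displacement_errors(predicted, actual):
    """Return (ADE, FDE) for one predicted trajectory against the ground truth.

    predicted, actual: equal-length lists of (x, y) positions, one per time step.
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("trajectories must be non-empty and of equal length")
    distances = [math.dist(p, a) for p, a in zip(predicted, actual)]
    ade = sum(distances) / len(distances)  # mean error over all time steps
    fde = distances[-1]                    # error at the final time step
    return ade, fde

predicted = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
actual = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(predicted, actual)
print(ade, fde)
```

For multi-object prediction, these per-trajectory values are typically averaged over all agents in the scene.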

8.
Sensors (Basel) ; 24(10)2024 May 11.
Article in English | MEDLINE | ID: mdl-38793924

ABSTRACT

Underwater images suffer from low contrast and color distortion. To improve the quality of underwater images while reducing storage and computational requirements, this paper proposes a lightweight model, Rep-UWnet, to enhance underwater images. The model consists of a fully connected convolutional network and three densely connected RepConv blocks in series, with the input images connected to the output of each block by a skip connection. First, the original underwater image is subjected to feature extraction by the SimSPPF module, and the result is summed with the original image to produce the input image. Then, the first convolutional layer, with a kernel size of 3 × 3, generates 64 feature maps, and the multi-scale hybrid convolutional attention module enhances the useful features by reweighting the features of different channels. Next, three RepConv blocks are connected to reduce the number of parameters used in feature extraction and increase the test speed. Finally, a convolutional layer with 3 kernels generates the enhanced underwater image. Our method reduces the number of parameters from 2.7 M to 0.45 M (around 83% reduction) while outperforming state-of-the-art algorithms in extensive experiments. Furthermore, we demonstrate that Rep-UWnet effectively improves high-level vision tasks such as edge detection and single image depth estimation. The method not only surpasses the compared methods in objective quality, but also significantly improves the contrast, colorimetry, and clarity of underwater images in subjective quality.

9.
Sensors (Basel) ; 24(11)2024 May 21.
Article in English | MEDLINE | ID: mdl-38894060

ABSTRACT

To enhance the accuracy of detecting objects in front of intelligent vehicles in urban road scenarios, this paper proposes a dual-layer voxel feature fusion augmentation network (DL-VFFA). It aims to address object misrecognition caused by local occlusion or a limited field of view. The network employs a point cloud voxelization architecture and uses the Mahalanobis distance to associate similar point clouds within neighborhood voxel units. It integrates local and global information through weight sharing to extract boundary point information within each voxel unit. The relative position encoding of voxel features is computed using an improved attention Gaussian deviation matrix in point cloud space to focus on the relative positions of different voxel sequences within channels. During the fusion of point cloud and image features, learnable weight parameters are designed to decouple fine-grained regions, enabling two-layer feature fusion from voxel to voxel and from point cloud to image. Extensive experiments on the KITTI dataset demonstrate the strong performance of DL-VFFA. Compared to the baseline network SECOND, DL-VFFA performs better in medium- and high-difficulty scenarios. Furthermore, compared to the voxel fusion module in MVX-Net, the voxel feature fusion results in this paper are more accurate, effectively capturing fine-grained object features after voxelization. Through ablation experiments, we conducted in-depth analyses of the three voxel fusion modules in DL-VFFA, which enhance the performance of the baseline detector and achieve superior results.
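The voxelization step that the network builds on can be sketched generically: each point is assigned to the grid cell that contains it. This is only the basic grouping operation, not the Mahalanobis-based association or the fusion modules described in the abstract; the function name, voxel size, and sample points are assumptions for illustration.

```python
import math
from collections import defaultdict

def voxelize(points, voxel_size, origin=(0.0, 0.0, 0.0)):
    """Group 3D points by the integer index of the voxel that contains them."""
    voxels = defaultdict(list)
    for point in points:
        # Floor division of the offset from the origin gives the voxel index
        # along each axis (negative coordinates map to negative indices).
        key = tuple(math.floor((c - o) / voxel_size) for c, o in zip(point, origin))
        voxels[key].append(point)
    return dict(voxels)

points = [(0.1, 0.1, 0.1), (0.4, 0.2, 0.3), (1.2, 0.0, 0.0), (-0.1, 0.0, 0.0)]
voxels = voxelize(points, voxel_size=0.5)
for index, members in sorted(voxels.items()):
    print(index, len(members))
```

Per-voxel features are then computed from the points in each cell, which is where occlusion and sparse sampling can leave a voxel with too little information.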

10.
Sensors (Basel) ; 24(18)2024 Sep 18.
Article in English | MEDLINE | ID: mdl-39338770

ABSTRACT

Because the fusion of infrared and visible images is easily affected by lighting, we propose an adaptive illumination perception fusion mechanism and integrate it into an infrared and visible image fusion network. Spatial attention mechanisms are applied to both infrared and visible images for feature extraction, and deep convolutional neural networks are used for further feature extraction. The adaptive illumination perception fusion mechanism is then integrated into the image reconstruction process to reduce the impact of lighting variations on the fused images. A Median Strengthening Channel and Spatial Attention Module (MSCS) is designed for integration into the backbone of YOLOv8. We use the fusion network to create a dataset named ivifdata for training the target recognition network. The experimental results indicate that the improved YOLOv8 network gains a further 2.3%, 1.4%, and 8.2% in the Recall, mAP50, and mAP50-95 metrics, respectively. The experiments show that the improved YOLOv8 network has advantages in recognition rate and completeness, while also reducing the rates of false negatives and false positives.

11.
Sensors (Basel) ; 24(9)2024 May 05.
Article in English | MEDLINE | ID: mdl-38733038

ABSTRACT

With the continuous advancement of autonomous driving and monitoring technologies, non-intrusive target monitoring and recognition is receiving increasing attention. This paper proposes an ArcFace SE-attention model-agnostic meta-learning approach (AS-MAML), which integrates attention mechanisms into residual networks for pedestrian gait recognition using frequency-modulated continuous-wave (FMCW) millimeter-wave radar through meta-learning. We enhance the feature extraction capability of the base network using channel attention mechanisms and integrate the additive angular margin loss function (ArcFace loss) into the inner loop of MAML to constrain inner loop optimization and improve radar discrimination. This network is then used to classify small-sample micro-Doppler images obtained from millimeter-wave radar as the data source for pose recognition. Experimental tests were conducted on pose estimation and image classification tasks. The results demonstrate significant detection and recognition performance, with an accuracy of 94.5%, accompanied by a 95% confidence interval. Additionally, on the open-source dataset DIAT-µRadHAR, which is specially processed to increase classification difficulty, the network achieves a classification accuracy of 85.9%.
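The additive angular margin (ArcFace) loss mentioned above can be sketched for a single sample as follows: the margin is added to the angle between the embedding and its target class center before the scaled softmax cross-entropy is applied. The function name, the margin and scale defaults, and the example cosines are assumptions for illustration; production implementations also guard the case where the angle plus the margin exceeds pi, which this sketch omits.

```python
import math

def arcface_loss(cosines, target, margin=0.5, scale=30.0):
    """Additive angular margin loss for one sample (simplified sketch).

    cosines: cosine similarity between the embedding and each class center.
    target: index of the true class.
    """
    logits = []
    for k, c in enumerate(cosines):
        c = max(-1.0, min(1.0, c))  # keep acos within its domain
        if k == target:
            # Penalize the target class by adding the margin to its angle.
            c = math.cos(math.acos(c) + margin)
        logits.append(scale * c)
    # Numerically stable softmax cross-entropy.
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_sum - logits[target]

cosines = [0.8, 0.1, -0.3]
plain = arcface_loss(cosines, target=0, margin=0.0)
with_margin = arcface_loss(cosines, target=0, margin=0.5)
print(plain, with_margin)
```

With the margin, the same embedding incurs a larger loss, so training keeps pushing same-class embeddings closer together in angle, which is the discrimination benefit the abstract refers to.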


Subjects
Pedestrians , Radar , Humans , Algorithms , Gait/physiology , Pattern Recognition, Automated/methods , Machine Learning
12.
Sensors (Basel) ; 24(10)2024 May 15.
Article in English | MEDLINE | ID: mdl-38794007

ABSTRACT

In recent years, deep learning methods have achieved remarkable success in hyperspectral image classification (HSIC), and convolutional neural networks (CNNs) have proven to be highly effective. However, several critical issues still need to be addressed in the HSIC task, such as the lack of labeled training samples, which constrains the classification accuracy and generalization ability of CNNs. To address this problem, a deep multi-scale attention fusion network (DMAF-NET) is proposed in this paper. The network is based on multi-scale features and fully exploits the deep features of samples from multiple levels and different perspectives, with the aim of enhancing HSIC results using limited samples. The innovation of this article is mainly reflected in three aspects. First, a novel baseline network for multi-scale feature extraction is designed with a pyramid structure and a densely connected 3D octave convolutional network, enabling the extraction of deep-level information from features at different granularities. Second, a multi-scale spatial-spectral attention module and a pyramidal multi-scale channel attention module are designed. These allow modeling of the comprehensive dependencies of coordinates and directions, local and global, in four dimensions. Finally, a multi-attention fusion module is designed to effectively combine feature mappings extracted from multiple branches. Extensive experiments on four popular datasets demonstrate that the proposed method achieves high classification accuracy even with fewer labeled samples.

13.
BMC Bioinformatics ; 24(1): 285, 2023 Jul 18.
Article in English | MEDLINE | ID: mdl-37464322

ABSTRACT

Deep learning-based medical image segmentation has made great progress over the past decades. Scholars have proposed many novel transformer-based segmentation networks to solve the problems of building long-range dependencies and global context connections in convolutional neural networks (CNNs). However, these methods usually replace the CNN-based blocks with improved transformer-based structures, which weakens local feature extraction, and these structures require a large amount of data for training. Moreover, these methods do not pay attention to edge information, which is essential in medical image segmentation. To address these problems, we propose a new network structure called P-TransUNet. It combines the designed efficient P-Transformer and a fusion module, which extract distance-related long-range dependencies and local information, respectively, and produce the fused features. In addition, we introduce an edge loss into training to focus the network's attention on the edge of the lesion area and improve segmentation performance. Extensive experiments across four medical image segmentation tasks demonstrate the effectiveness of P-TransUNet and show that our network outperforms other state-of-the-art methods.


Subjects
Electric Power Supplies , Neural Networks, Computer , Image Processing, Computer-Assisted
14.
Curr Genomics ; 24(3): 171-186, 2023 Nov 22.
Article in English | MEDLINE | ID: mdl-38178985

ABSTRACT

Introduction: N4 acetylcytidine (ac4C) is a highly conserved nucleoside modification that is essential for the regulation of immune functions in organisms. Currently, the identification of ac4C is primarily achieved using biological methods, which can be time-consuming and labor-intensive. In contrast, accurate identification of ac4C by computational methods has become a more effective approach to classification and prediction. Aim: To the best of our knowledge, although there are several computational methods for ac4C locus prediction, the performance of the models they construct is poor, and the network structures they use are relatively simple and suffer from network degradation. This study aims to address these limitations by proposing a predictive model based on integrated deep learning to better identify ac4C sites. Methods: In this study, we propose a new integrated deep learning prediction framework, DLC-ac4C. First, we encode RNA sequences using three feature encoding schemes, namely C2 encoding, nucleotide chemical property (NCP) encoding, and nucleotide density (ND) encoding. Second, one-dimensional convolutional layers and densely connected convolutional networks (DenseNet) are used to learn local features, and bi-directional long short-term memory networks (Bi-LSTM) are used to learn global features. Third, a channel attention mechanism is introduced to determine the importance of sequence characteristics. Finally, a homomorphic integration strategy is used to limit the generalization error of the model, which further improves its performance. Results: On the independent test data, the DLC-ac4C model achieved a sensitivity (Sn), specificity (Sp), accuracy (Acc), Matthews correlation coefficient (MCC), and area under the curve (AUC) of 86.23%, 79.71%, 82.97%, 66.08%, and 90.42%, respectively, which was significantly better than the prediction accuracy of existing methods. Conclusion: Our model not only combines DenseNet and Bi-LSTM, but also uses the channel attention mechanism to better capture hidden information features from a sequence perspective, and can identify ac4C sites more effectively.
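The evaluation metrics named in the Results can be computed from a binary confusion matrix with their standard definitions, as in the sketch below. The confusion counts are made up for illustration and are not the paper's data; the function name is hypothetical.

```python
import math

def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, accuracy, and MCC from confusion-matrix counts."""
    sn = tp / (tp + fn)                    # sensitivity (recall on positives)
    sp = tn / (tn + fp)                    # specificity (recall on negatives)
    acc = (tp + tn) / (tp + fn + tn + fp)  # overall accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally reported as 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "Acc": acc, "MCC": mcc}

# Illustrative counts only (not taken from the paper).
metrics = binary_metrics(tp=40, fn=10, tn=45, fp=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

MCC uses all four cells of the confusion matrix, so it stays informative even when sensitivity and specificity differ noticeably, as they do in the reported results.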

15.
Sensors (Basel) ; 23(3)2023 Jan 21.
Article in English | MEDLINE | ID: mdl-36772279

ABSTRACT

Tool wear is a key factor in the machining process, affecting both tool life and the quality of the machined workpiece. Therefore, it is crucial to monitor and diagnose the tool condition. An improved CaAt-ResNet-1d model for multi-sensor tool wear diagnosis is proposed. The ResNet18 structure based on a one-dimensional convolutional neural network is adopted as the basic model architecture, because a one-dimensional convolutional neural network is more suitable for feature extraction from time series data. The CaAt1 channel attention mechanism is added to the residual network block, and the CaAt5 channel attention mechanism automatically learns the features of different channels. The proposed method is validated on the PHM2010 dataset. Validation results show that CaAt-ResNet-1d reaches 89.27% accuracy, an improvement of about 7% over Gated-Transformer and 3% over ResNet18. The experimental results demonstrate the capacity and effectiveness of the proposed method for tool wear monitoring.

16.
Sensors (Basel) ; 23(18)2023 Sep 06.
Article in English | MEDLINE | ID: mdl-37765747

ABSTRACT

Compressed sensing (CS) MRI has shown great potential in enhancing time efficiency. Deep learning techniques, specifically generative adversarial networks (GANs), have emerged as potent tools for speedy CS-MRI reconstruction. Yet increasing the complexity of deep learning reconstruction models can lead to prolonged reconstruction time and challenges in achieving convergence. In this study, we present a novel GAN-based model that delivers superior performance without escalating model complexity. Our generator module, built on the U-net architecture, incorporates dilated residual (DR) networks, thus expanding the network's receptive field without increasing parameters or computational load. At every step of the downsampling path, this revamped generator module includes a DR network, with the dilation rates adjusted according to the depth of the network layer. Moreover, we have introduced a channel attention mechanism (CAM) to distinguish between channels and reduce background noise, thereby focusing on key information. This mechanism combines global maximum and average pooling approaches to refine channel attention. We conducted comprehensive experiments with the designed model using public domain MRI datasets of the human brain. Ablation studies affirmed the efficacy of the modified modules within the network. Incorporating DR networks and CAM raised the peak signal-to-noise ratios (PSNR) of the reconstructed images by about 1.2 and 0.8 dB, respectively, on average, even at 10× CS acceleration. Compared to other relevant models, our proposed model exhibits exceptional performance, achieving not only excellent stability but also outperforming most of the compared networks in terms of PSNR and SSIM. When compared with U-net, DR-CAM-GAN's average gains in SSIM and PSNR were 14% and 15%, respectively. Its MSE was reduced by a factor that ranged from two to seven. The model presents a promising pathway for enhancing the efficiency and quality of CS-MRI reconstruction.
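To put the reported PSNR gains in context, the sketch below computes PSNR from the mean squared error and converts a gain in dB into the equivalent MSE reduction factor. It uses the standard definition PSNR = 10 log10(MAX² / MSE); the function names, the flattened-list image representation, and the sample values are assumptions for illustration.

```python
import math

def psnr(reference, reconstruction, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized, flattened images."""
    if len(reference) != len(reconstruction) or not reference:
        raise ValueError("images must be non-empty and of equal size")
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE shrinks for a given PSNR gain (same max_value)."""
    return 10.0 ** (gain_db / 10.0)

print(psnr([0.0, 0.0], [0.1, 0.1]))
print(mse_ratio_for_gain(1.2), mse_ratio_for_gain(0.8))
```

Under this definition, a 1.2 dB gain corresponds to an MSE that is roughly 24% lower, and a 0.8 dB gain to roughly 17% lower, assuming the same peak value in both measurements.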

17.
Sensors (Basel) ; 23(16)2023 Aug 21.
Article in English | MEDLINE | ID: mdl-37631836

ABSTRACT

Fourier ptychographic microscopy (FPM) is a novel computational microscopic imaging technique that allows imaging of samples such as pathology sections. However, due to the influence of systematic errors and noise, the quality of images reconstructed using FPM is often poor, and the reconstruction efficiency is low. In this paper, a hybrid attention network that incorporates both spatial and channel attention mechanisms into FPM reconstruction is introduced. Spatial attention extracts fine spatial features and reduces redundant features; combined with residual channel attention, it adaptively readjusts the hierarchical features to convert low-resolution complex amplitude images into high-resolution ones. The high-resolution images generated by this method can be applied to medical cell recognition, segmentation, classification, and other related studies, providing a better foundation for relevant research.

18.
Sensors (Basel) ; 23(2)2023 Jan 15.
Article in English | MEDLINE | ID: mdl-36679791

ABSTRACT

Point cloud registration is a crucial preprocessing step for point cloud data analysis and applications. Many deep-learning-based methods have been proposed to improve registration quality. These methods typically use the sum of two cross-entropy terms as the loss function to train the model, which may lead to mismatching in overlapping regions. In this paper, we design a new loss function based on cross-entropy and apply it to the ROPNet point cloud registration model. We also improve ROPNet by adding a channel attention mechanism so that the network focuses on both global and local important information, thus improving registration performance and reducing the point cloud registration error. We tested our method on the ModelNet40 dataset, and the experimental results demonstrate its effectiveness.


Subjects
Data Analysis , Entropy
19.
Sensors (Basel) ; 23(2)2023 Jan 16.
Article in English | MEDLINE | ID: mdl-36679819

ABSTRACT

To address the high bit error rate (BER) of demodulation and the low classification accuracy of modulation signals with a low signal-to-noise ratio (SNR), we propose a double-residual denoising autoencoder method with a channel attention mechanism, referred to as DRdA-CA, to improve the SNR of modulation signals. The proposed DRdA-CA consists of an encoding module and a decoding module. A squeeze-and-excitation (SE) ResNet module containing one residual connection is modified and then introduced into the autoencoder as the channel attention mechanism, to better extract the characteristics of the modulation signals and reduce the computational complexity of the model. Moreover, a second residual connection is added inside the encoding and decoding modules to mitigate the network degradation problem, which helps to fully exploit the multi-level features of modulation signals and improve the reconstruction quality of the signal. The ablation experiments show that both the improved SE module and the dual residual connections play an important role in improving the denoising performance. The subsequent experimental results show that the proposed DRdA-CA significantly improves the SNR values of eight modulation types in the range of -12 dB to 8 dB. For 16QAM and 64QAM in particular, the SNR is improved by 8.38 dB and 8.27 dB on average, respectively. Compared to the DnCNN denoising method, the proposed DRdA-CA raises the average classification accuracy by 67.59% to 74.94% over the entire SNR range. For demodulation, compared with the RLS and DnCNN denoising algorithms, the proposed method reduces the BER of 16QAM by an average of 63.5% and 40.5%, and reduces the BER of 64QAM by an average of 46.7% and 18.6%. These results show that the proposed DRdA-CA achieves the best noise reduction effect.
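The two quantities this abstract reports, SNR in dB and bit error rate, can be computed as in the following sketch using their standard definitions. The function names and the toy signal are assumptions for illustration; the abstract does not describe the authors' measurement procedure.

```python
import math

def snr_db(clean, noisy):
    """SNR in dB, treating (noisy - clean) as the noise component."""
    if len(clean) != len(noisy) or not clean:
        raise ValueError("signals must be non-empty and of equal length")
    signal_power = sum(c * c for c in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    if noise_power == 0:
        return math.inf
    return 10.0 * math.log10(signal_power / noise_power)

def bit_error_rate(sent_bits, received_bits):
    """Fraction of demodulated bits that differ from the transmitted bits."""
    if len(sent_bits) != len(received_bits) or not sent_bits:
        raise ValueError("bit sequences must be non-empty and of equal length")
    errors = sum(s != r for s, r in zip(sent_bits, received_bits))
    return errors / len(sent_bits)

clean = [1.0, -1.0, 1.0, -1.0]
noisy = [1.1, -0.9, 1.1, -0.9]
print(snr_db(clean, noisy))
print(bit_error_rate([1, 0, 1, 1], [1, 1, 1, 0]))
```

A denoiser's SNR improvement is then the difference between `snr_db(clean, denoised)` and `snr_db(clean, noisy)`, which is the kind of dB gain quoted for 16QAM and 64QAM.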


Subjects
Algorithms , Signal-To-Noise Ratio
20.
Sensors (Basel) ; 23(8)2023 Apr 08.
Article in English | MEDLINE | ID: mdl-37112176

ABSTRACT

Driver distraction is considered a main cause of road accidents. Every year, thousands of people sustain serious injuries, and most of them lose their lives. In addition, road accidents caused by driver distractions, such as talking, drinking, and using electronic devices, continue to increase. Several researchers have developed traditional deep learning techniques for the efficient detection of driver activity. However, current studies need further improvement because of the high number of false predictions in real time. To cope with these issues, it is important to develop an effective technique that detects driver behavior in real time to protect human lives and property. In this work, we develop a convolutional neural network (CNN)-based technique that integrates a channel attention (CA) mechanism for efficient and effective detection of driver behavior. Moreover, we compare the proposed model with standalone and CA-integrated versions of various backbone models, such as VGG16, VGG16+CA, ResNet50, ResNet50+CA, Xception, Xception+CA, InceptionV3, InceptionV3+CA, and EfficientNetB0. The proposed model obtained the best performance in terms of accuracy, precision, recall, and F1-score on two well-known datasets, AUC Distracted Driver (AUCD2) and State Farm Distracted Driver Detection (SFD3). It achieved 99.58% accuracy on SFD3 and 98.97% accuracy on AUCD2.


Subjects
Automobile Driving , Distracted Driving , Humans , Accidents, Traffic/prevention & control , Neural Networks, Computer