Results 1 - 20 of 57
1.
Sensors (Basel) ; 24(14)2024 Jul 11.
Article in English | MEDLINE | ID: mdl-39065881

ABSTRACT

Addressing the limitations of current railway track foreign object detection techniques, which suffer from inadequate real-time performance and diminished accuracy in detecting small objects, this paper introduces an innovative vision-based perception methodology harnessing the power of deep learning. Central to this approach is the construction of a railway boundary model utilizing a sophisticated track detection method, along with an enhanced UNet semantic segmentation network to achieve autonomous segmentation of diverse track categories. By employing equal interval division and row-by-row traversal, critical track feature points are precisely extracted, and the track linear equation is derived through the least squares method, thus establishing an accurate railway boundary model. We optimized the YOLOv5s detection model in four aspects: incorporating the SE attention mechanism into the Neck network layer to enhance the model's feature extraction capabilities, adding a prediction layer to improve the detection performance for small objects, proposing a linear size scaling method to obtain suitable anchor boxes, and utilizing Inner-IoU to refine the boundary regression loss function, thereby increasing the positioning accuracy of the bounding boxes. We conducted a detection accuracy validation for railway track foreign object intrusion using a self-constructed image dataset. The results indicate that the proposed semantic segmentation model achieved an MIoU of 91.8%, representing a 3.9% improvement over the previous model, effectively segmenting railway tracks. Additionally, the optimized detection model could effectively detect foreign object intrusions on the tracks, reducing missed and false alarms and achieving a 7.4% increase in the mean average precision (IoU = 0.5) compared to the original YOLOv5s model. The model exhibits strong generalization capabilities in scenarios involving small objects. 
This proposed approach represents an effective exploration of deep learning techniques for railway track foreign object intrusion detection, suitable for use in complex environments to ensure the operational safety of rail lines.
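The track-line fitting step described above can be sketched in a few lines. Regressing x on y (rather than y on x) is an assumption made here because rails run near-vertically in image coordinates; the function name and sample points are illustrative, not the paper's code:

```python
def fit_track_line(points):
    """Fit x = a*y + b to track feature points (x, y) by least squares.

    Rails are near-vertical in image space, so x is regressed on y
    to avoid the infinite-slope problem of y = m*x + c.
    """
    n = len(points)
    sum_y = sum(y for _, y in points)
    sum_x = sum(x for x, _ in points)
    sum_yy = sum(y * y for _, y in points)
    sum_xy = sum(x * y for x, y in points)
    denom = n * sum_yy - sum_y * sum_y
    a = (n * sum_xy - sum_y * sum_x) / denom
    b = (sum_x - a * sum_y) / n
    return a, b

# Example: feature points sampled at equal row intervals along one rail.
a, b = fit_track_line([(100, 0), (110, 50), (120, 100), (130, 150)])
```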

2.
Sensors (Basel) ; 24(14)2024 Jul 12.
Article in English | MEDLINE | ID: mdl-39065920

ABSTRACT

Simultaneous Localization and Mapping (SLAM) is one of the key technologies with which to address the autonomous navigation of mobile robots, utilizing environmental features to determine a robot's position and create a map of its surroundings. Currently, visual SLAM algorithms typically yield precise and dependable outcomes in static environments, and many algorithms opt to filter out the feature points in dynamic regions. However, when there is an increase in the number of dynamic objects within the camera's view, this approach might result in decreased accuracy or tracking failures. Therefore, this study proposes a solution called YPL-SLAM based on ORB-SLAM2. The solution adds a target recognition and region segmentation module to determine the dynamic region, potential dynamic region, and static region; determines the state of the potential dynamic region using the RANSAC method with epipolar geometric constraints; and removes the dynamic feature points. It then extracts the line features of the non-dynamic region and finally performs the point-line fusion optimization process using a weighted fusion strategy, considering the image dynamic score and the number of successful feature point-line matches, thus ensuring the system's robustness and accuracy. A large number of experiments have been conducted using the publicly available TUM dataset to compare YPL-SLAM with globally leading SLAM algorithms. The results demonstrate that the new algorithm surpasses ORB-SLAM2 in terms of accuracy (with a maximum improvement of 96.1%) while also exhibiting a significantly enhanced operating speed compared to Dyna-SLAM.
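The epipolar inlier test that a RANSAC loop of this kind applies to each potential dynamic point can be sketched as below. The fundamental matrix and the 1-pixel threshold are illustrative placeholders, and the function names are mine, not the authors':

```python
def epipolar_distance(F, p1, p2):
    """Distance from p2 to the epipolar line F @ [p1, 1] in image 2."""
    x1, y1 = p1
    x2, y2 = p2
    # Epipolar line l = (a, b, c) such that a*x2 + b*y2 + c = 0 for inliers.
    a = F[0][0] * x1 + F[0][1] * y1 + F[0][2]
    b = F[1][0] * x1 + F[1][1] * y1 + F[1][2]
    c = F[2][0] * x1 + F[2][1] * y1 + F[2][2]
    return abs(a * x2 + b * y2 + c) / (a * a + b * b) ** 0.5

def is_static(F, p1, p2, thresh=1.0):
    """A match consistent with the epipolar constraint is treated as static;
    large residuals suggest the point moved between frames."""
    return epipolar_distance(F, p1, p2) < thresh

# Illustrative F for a pure horizontal camera translation: inlier matches
# then lie on the same image row.
F_translate = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
```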

3.
Sensors (Basel) ; 24(12)2024 Jun 11.
Article in English | MEDLINE | ID: mdl-38931575

ABSTRACT

Vehicle detection is a research direction in the field of target detection and is widely used in intelligent transportation, automatic driving, urban planning, and other fields. To balance the high-speed advantage of lightweight networks and the high-precision advantage of multiscale networks, a vehicle detection algorithm based on a lightweight backbone network and a multiscale neck network is proposed. The MobileNetV3 lightweight network, based on depthwise separable convolution, is used as the backbone network to improve the speed of vehicle detection. The ICBAM attention mechanism module is used to strengthen the processing of the vehicle feature information detected by the backbone network to enrich the input information of the neck network. The BiFPN and ICBAM attention mechanism modules are integrated into the neck network to improve the detection accuracy of vehicles of different sizes and categories. A vehicle detection experiment on the UA-DETRAC dataset verifies that the proposed algorithm can effectively balance vehicle detection accuracy and speed. The detection accuracy is 71.19%, the number of parameters is 3.8 MB, and the detection speed is 120.02 FPS, which meets the practical requirements for parameter count, detection speed, and accuracy of a vehicle detection algorithm embedded in mobile devices.
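The parameter saving that motivates a MobileNetV3-style backbone comes from depthwise separable convolution. A quick count (bias terms omitted; channel sizes are illustrative) shows the reduction:

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution layer (bias omitted)."""
    return k * k * c_in * c_out

def dws_conv_params(c_in, c_out, k):
    """Depthwise separable = k x k depthwise + 1 x 1 pointwise."""
    return k * k * c_in + c_in * c_out

std = conv_params(64, 128, 3)      # standard 3x3 conv
dws = dws_conv_params(64, 128, 3)  # depthwise separable equivalent
```

For a 3x3 layer mapping 64 to 128 channels this is roughly an 8x reduction, which is where the speedup of the lightweight backbone comes from.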

4.
Math Biosci Eng ; 21(4): 5782-5802, 2024 Apr 26.
Article in English | MEDLINE | ID: mdl-38872558

ABSTRACT

With the widespread integration of deep learning in intelligent transportation and various industrial sectors, target detection technology is gradually becoming one of the key research areas. Accurately detecting road vehicles and pedestrians is of great significance for the development of autonomous driving technology. Road object detection faces problems such as complex backgrounds, significant scale changes, and occlusion. To accurately identify traffic targets in complex environments, this paper proposes a road target detection algorithm based on an enhanced YOLOv5s. This algorithm introduces the weighted enhanced polarized self-attention (WEPSA) mechanism, which uses spatial attention and channel attention to strengthen the important features extracted by the feature extraction network and suppress insignificant background information. In the neck network, we designed a weighted feature fusion network (CBiFPN) to enhance neck feature representation and enrich semantic information. This strategic feature fusion not only boosts the algorithm's adaptability to intricate scenes, but also contributes to its robust performance. Then, the bounding box regression loss function uses EIoU to accelerate model convergence and reduce losses. Finally, a large number of experiments show that the improved YOLOv5s algorithm achieves mAP@0.5 scores of 92.8% and 53.5% on the open-source KITTI and Cityscapes datasets, and 88.7% on the self-built dataset, improvements of 1.7%, 3.8%, and 3.3% over YOLOv5s on the three datasets, respectively, ensuring real-time performance while improving detection accuracy. In addition, compared to the more recent YOLOv7 and YOLOv8, the improved YOLOv5 shows good overall performance on the open-source datasets.
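As a rough sketch of the EIoU regression loss adopted above: it adds center-distance, width, and height penalties to 1 - IoU, each normalized by the smallest enclosing box. The function follows the published EIoU formulation but is my own illustration, not the paper's code:

```python
def eiou_loss(box_p, box_g):
    """EIoU loss for boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = box_p
    gx1, gy1, gx2, gy2 = box_g
    # Intersection and union for the IoU term.
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    union = (px2 - px1) * (py2 - py1) + (gx2 - gx1) * (gy2 - gy1) - inter
    iou = inter / union
    # Smallest enclosing box, used to normalize the penalty terms.
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    center = ((px1 + px2 - gx1 - gx2) ** 2 + (py1 + py2 - gy1 - gy2) ** 2) / 4
    wd = ((px2 - px1) - (gx2 - gx1)) ** 2   # width difference
    hd = ((py2 - py1) - (gy2 - gy1)) ** 2   # height difference
    return 1 - iou + center / (cw ** 2 + ch ** 2) + wd / cw ** 2 + hd / ch ** 2
```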

5.
Sci Rep ; 14(1): 12178, 2024 May 28.
Article in English | MEDLINE | ID: mdl-38806585

ABSTRACT

The resolution of traffic congestion and personal safety issues holds paramount importance for human life. The ability of an autonomous driving system to navigate complex road conditions is crucial. Deep learning has greatly facilitated machine vision perception in autonomous driving. To address the weakness of the traditional YOLOv5s in small-target detection, this paper proposes an optimized target detection algorithm. The C3 module in the algorithm's backbone is upgraded to the CBAMC3 module, and a GELU activation function and EfficiCIoU loss function are introduced, which accelerate convergence on the position loss lbox, confidence loss lobj, and classification loss lcls, enhance the network's image learning capability, and mitigate the inaccurate detection of small targets. Testing with a vehicle-mounted camera on a predefined route effectively identifies road vehicles and analyzes depth position information. The avoidance model, combined with Pure Pursuit and MPC control algorithms, exhibits more stable variations in vehicle speed, front-wheel steering angle, lateral acceleration, etc., compared to the non-optimized version. The robustness of the driving system's visual avoidance functionality is enhanced, further ameliorating congestion issues and ensuring personal safety.
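The GELU activation mentioned above is commonly implemented with the tanh approximation; a minimal sketch (this is the standard approximation, not code from the paper):

```python
import math

def gelu(x):
    """Tanh approximation of GELU: x * Phi(x), with Phi the standard
    normal CDF approximated via tanh."""
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi)
                                      * (x + 0.044715 * x ** 3)))
```

Unlike ReLU, GELU is smooth and passes small negative values through with reduced weight, which tends to help gradient flow during convergence.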

6.
Heliyon ; 10(10): e31029, 2024 May 30.
Article in English | MEDLINE | ID: mdl-38779013

ABSTRACT

Submarine mud poses a risk to channel navigation safety, and traditional detection methods lack efficiency and accuracy. This paper therefore proposes an enhanced shallow submarine mud detection algorithm, leveraging an improved YOLOv5s model to increase accuracy and effectiveness in identifying such hazards in marine channels. Firstly, a sub-bottom profiler was employed to survey the submarine channel of Lianyungang Port and acquire image data of the shallow mud sound print. Concurrently, the analysis incorporated the characteristics of changes in sound intensity peaks to precisely identify the shallow mud's location. Furthermore, the incorporation of the C2F feature module into the backbone module enhances the gradient flow of the algorithm, augments the feature extraction information, and improves the algorithm's detection performance. Subsequently, the Efficient Multi-Scale Attention (EMA) mechanism is incorporated into the neck module, aiming to optimize the algorithm's channel dimensions, minimize computational overhead, and enhance its detection efficiency. Finally, the study introduces the Normalized Wasserstein Distance (NWD) loss function into the bounding box regression loss function. This integration effectively addresses the issue of multi-scale defects, emphasizes the transformation of target planar position deviation, and improves the accuracy of the algorithm's detection capabilities. The results indicate that the improved YOLOv5s-EF algorithm outperforms the original YOLOv5s algorithm and other widely used detection algorithms. It achieved a validation set precision rate of 97.8%, recall rate of 97.6%, F1 value of 97.7%, mean Average Precision (mAP)@0.5 of 98.2%, mAP@0.95 of 69.6%, and Frames Per Second (FPS) of 51.8. The YOLOv5s-EF algorithm proposed in this study offers a novel technical approach for detecting mud in submarine channels, which is important for ensuring the safe operation and maintenance of dredging in such channels.
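The NWD term treats each box as a 2D Gaussian and exponentiates a scaled Wasserstein distance between the two Gaussians, which is why it degrades more gracefully than IoU for tiny or barely-overlapping targets. A minimal sketch, where the constant `c` is dataset-dependent and the value below is only a placeholder:

```python
import math

def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein Distance between (cx, cy, w, h) boxes.

    Each box is modeled as a 2D Gaussian N([cx, cy], diag(w^2/4, h^2/4));
    the squared 2-Wasserstein distance between such Gaussians has the
    closed form below.
    """
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    w2 = ((ax - bx) ** 2 + (ay - by) ** 2
          + ((aw - bw) ** 2 + (ah - bh) ** 2) / 4)
    return math.exp(-math.sqrt(w2) / c)
```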

7.
Front Plant Sci ; 15: 1364185, 2024.
Article in English | MEDLINE | ID: mdl-38685961

ABSTRACT

Peanut pod rot is one of the major plant diseases affecting peanut production and quality across China; it causes large productivity losses and is challenging to control. Breeding is one significant strategy for improving the disease resistance of peanuts, and crucial preventative and management measures include grading peanut pod rot and screening genes that confer high resistance to it. A machine vision-based grading approach for individual cases of peanut pod rot was proposed in this study, which avoids time-consuming, labor-intensive, and inaccurate manual categorization and provides dependable technical assistance for breeding studies on peanut pod rot resistance. The Shuffle Attention module has been added to the YOLOv5s (You Only Look Once version 5 small) feature extraction backbone network to overcome occlusion, overlap, and adhesion in complex backgrounds. Additionally, to reduce missed and false identification of peanut pods, the loss function CIoU (Complete Intersection over Union) was replaced with EIoU (Enhanced Intersection over Union). The recognition results can be further improved by introducing a grade classification module, which reads the information from the identified RGB images and outputs the numbers of non-rotted and rotten peanut pods, the rotten pod rate, and the pod rot grade. The Precision value of the improved YOLOv5s reached 93.8%, which was 7.8%, 8.4%, and 7.3% higher than YOLOv5s, YOLOv8n, and YOLOv8s, respectively; the mAP (mean Average Precision) value was 92.4%, an increase of 6.7%, 7.7%, and 6.5%, respectively. The improved YOLOv5s shows an average improvement of 6.26% over YOLOv5s in recognition accuracy: 95.7% for non-rotted peanut pods and 90.8% for rotten peanut pods.
This article presented a machine vision-based grade classification method for peanut pod rot, offering technological guidance for selecting high-quality cultivars with high resistance to pod rot in peanuts.
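The grade-classification step that turns detector counts into a rot rate and an ordinal grade can be sketched as below. The thresholds are illustrative placeholders, not the grading standard used in the study:

```python
def pod_rot_stats(n_healthy, n_rotten, thresholds=(0.05, 0.15, 0.30)):
    """Rotten-pod rate and an ordinal rot grade for one plant.

    `thresholds` are hypothetical cut points; grade 0 means at or below
    the first threshold, grade 3 means above the last.
    """
    total = n_healthy + n_rotten
    rate = n_rotten / total if total else 0.0
    grade = sum(rate > t for t in thresholds)
    return rate, grade
```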

8.
Sensors (Basel) ; 24(7)2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38610405

ABSTRACT

With the increase in the scale of breeding at modern pastures, the management of dairy cows has become much more challenging, and individual recognition is the key to the implementation of precision farming. Based on the need for low-cost and accurate herd management and for non-stressful and non-invasive individual recognition, we propose a vision-based automatic recognition method for dairy cow ear tags. Firstly, for the detection of cow ear tags, the lightweight Small-YOLOV5s is proposed, and then a differentiable binarization network (DBNet) combined with a convolutional recurrent neural network (CRNN) is used to achieve the recognition of the numbers on ear tags. The experimental results demonstrated notable improvements: Compared to those of YOLOV5s, Small-YOLOV5s enhanced recall by 1.5%, increased the mean average precision by 0.9%, reduced the number of model parameters by 5,447,802, and enhanced the average prediction speed for a single image by 0.5 ms. The final accuracy of the ear tag number recognition was an impressive 92.1%. Moreover, this study introduces two standardized experimental datasets specifically designed for the ear tag detection and recognition of dairy cows. These datasets will be made freely available to researchers in the global dairy cattle community with the intention of fostering intelligent advancements in the breeding industry.


Subjects
Agriculture, Recognition (Psychology), Animals, Female, Cattle, Farms, Industries, Intelligence
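Reading a digit string off an ear tag with a CRNN typically ends with greedy CTC decoding: collapse repeated per-frame predictions, then drop blanks. A minimal sketch of that final step (the CRNN itself is omitted):

```python
def ctc_greedy_decode(frame_argmax, blank=0):
    """Collapse a CRNN's per-frame argmax into a label sequence.

    Repeated symbols are merged, then blanks removed -- standard
    greedy CTC decoding.
    """
    out, prev = [], None
    for sym in frame_argmax:
        if sym != prev and sym != blank:
            out.append(sym)
        prev = sym
    return out

# Frames predicting "1 1 blank 2 2 2 blank blank 7" decode to [1, 2, 7].
digits = ctc_greedy_decode([1, 1, 0, 2, 2, 2, 0, 0, 7])
```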
9.
Math Biosci Eng ; 21(2): 1765-1790, 2024 Jan 02.
Article in English | MEDLINE | ID: mdl-38454659

ABSTRACT

Detecting abnormal surface features is an important method for identifying abnormal fish. However, existing methods face challenges in excessive subjectivity, limited accuracy, and poor real-time performance. To solve these challenges, a real-time and accurate detection model of abnormal surface features of in-water fish is proposed, based on improved YOLOv5s. The specific enhancements include: 1) We optimize the complete intersection over union and non-maximum suppression through the normalized Gaussian Wasserstein distance metric to improve the model's ability to detect tiny targets. 2) We design the DenseOne module to enhance the reusability of abnormal surface features, and introduce MobileViTv2 to improve detection speed, which are integrated into the feature extraction network. 3) According to the ACmix principle, we fuse the omni-dimensional dynamic convolution and convolutional block attention module to solve the challenge of extracting deep features within complex backgrounds. We carried out comparative experiments on 160 validation sets of in-water abnormal fish, achieving precision, recall, mAP50, and mAP50:95 of 99.5%, 99.1%, 99.1%, and 73.9%, respectively, at 88 frames per second (FPS). These results surpass the baseline by 1.4%, 1.2%, 3.2%, and 8.2%, and 1 FPS. Moreover, the improved model outperforms other state-of-the-art models regarding comprehensive evaluation indexes.


Subjects
Fishes, Water, Animals, Normal Distribution
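Greedy NMS with the similarity metric factored out makes it easy to swap a Wasserstein-style measure in for IoU when suppressing duplicate tiny-target boxes, as the enhancement above does. This is a sketch under that reading, not the authors' implementation:

```python
def iou_xyxy(a, b):
    """Plain IoU for (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, similarity, thresh=0.5):
    """Greedy NMS with a pluggable box-similarity function."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it is not too similar to any already-kept box.
        if all(similarity(boxes[i], boxes[k]) < thresh for k in keep):
            keep.append(i)
    return keep

kept = nms([(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)],
           [0.9, 0.8, 0.7], iou_xyxy)
```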
10.
Math Biosci Eng ; 21(3): 4269-4285, 2024 Feb 26.
Article in English | MEDLINE | ID: mdl-38549327

ABSTRACT

In response to the issues of low efficiency and high cost in traditional manual methods for road surface crack detection, an improved YOLOv5s (you only look once version 5 small) algorithm was proposed. Based on this improvement, a road surface crack object recognition model was established using YOLOv5s. First, based on the Res2Net (a new multi-scale backbone architecture) network, an improved multi-scale Res2-C3 module was suggested to enhance feature extraction performance. Second, the feature fusion network and backbone of YOLOv5 were merged with the GAM (global attention mechanism) attention mechanism, reducing information dispersion and enhancing the interaction of features across global dimensions. We incorporated dynamic snake convolution into the feature fusion network section to enhance the model's ability to handle irregular shapes and deformation problems. Experimental results showed that the final revision of the model dramatically increased both the detection speed and the accuracy of road surface crack identification. The mean average precision (mAP) reached 93.9%, with an average precision improvement of 12.6% compared to the YOLOv5s model. The frames per second (FPS) value was 49.97. The difficulties of low accuracy and slow speed in road surface crack identification were effectively addressed by the modified model, demonstrating that the enhanced model achieved relatively high accuracy while maintaining inference speed.

11.
Sci Rep ; 14(1): 6707, 2024 Mar 20.
Article in English | MEDLINE | ID: mdl-38509164

ABSTRACT

To address the slow detection speed, large parameter count, and heavy computational load of deep-learning-based gangue target detection methods, we propose an improved gangue target detection algorithm based on YOLOv5s. First, the lightweight EfficientViT network is used as the backbone to increase target detection speed. Second, C3_Faster replaces the C3 part in the HEAD module, reducing model complexity. Third, the 20 × 20 feature map branch in the Neck region is deleted, which further reduces model complexity. Fourth, the CIoU loss function is replaced by the MPDIoU loss function. In addition, the introduction of the SE attention mechanism makes the model pay more attention to critical features, improving detection performance. Experimental results show that the improved coal gangue detection model reduces the model size by 77.8%, the number of parameters by 78.3%, and the computational cost by 77.8%, while the frame count is reduced by 30.6%; it can serve as a reference for intelligent coal gangue classification.
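A sketch of the MPDIoU loss substituted in above: 1 - IoU plus the squared distances between corresponding corner points of the two boxes, normalized by the squared image diagonal. This follows the published MPDIoU formulation; the function is illustrative, not the paper's code:

```python
def mpdiou_loss(pred, gt, img_w, img_h):
    """MPDIoU loss for (x1, y1, x2, y2) boxes on an img_w x img_h image."""
    iw = max(0.0, min(pred[2], gt[2]) - max(pred[0], gt[0]))
    ih = max(0.0, min(pred[3], gt[3]) - max(pred[1], gt[1]))
    inter = iw * ih
    union = ((pred[2] - pred[0]) * (pred[3] - pred[1])
             + (gt[2] - gt[0]) * (gt[3] - gt[1]) - inter)
    iou = inter / union
    d1 = (pred[0] - gt[0]) ** 2 + (pred[1] - gt[1]) ** 2  # top-left corners
    d2 = (pred[2] - gt[2]) ** 2 + (pred[3] - gt[3]) ** 2  # bottom-right corners
    norm = img_w ** 2 + img_h ** 2
    return 1 - (iou - d1 / norm - d2 / norm)
```

Because the corner distances vanish only when the boxes coincide exactly, the loss keeps a useful gradient even when IoU is already high.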

12.
Sensors (Basel) ; 24(4)2024 Feb 11.
Article in English | MEDLINE | ID: mdl-38400340

ABSTRACT

In complex industrial environments, accurate recognition and localization of industrial targets are crucial. This study aims to improve the precision and accuracy of object detection in industrial scenarios by effectively fusing feature information at different scales and levels, and introducing edge detection head algorithms and attention mechanisms. We propose an improved YOLOv5-based algorithm for industrial object detection. Our improved algorithm incorporates the Crossing Bidirectional Feature Pyramid (CBiFPN), effectively addressing the information loss issue in multi-scale and multi-level feature fusion. Therefore, our method can enhance detection performance for objects of varying sizes. Concurrently, we have integrated the attention mechanism (C3_CA) into YOLOv5s to augment feature expression capabilities. Furthermore, we introduce the Edge Detection Head (EDH) method, which is adept at tackling detection challenges in scenes with occluded objects and cluttered backgrounds by merging edge information and amplifying it within the features. Experiments conducted on the modified ITODD dataset demonstrate that the original YOLOv5s algorithm achieves 82.11% and 60.98% on mAP@0.5 and mAP@0.5:0.95, respectively, with its precision and recall being 86.8% and 74.75%, respectively. The performance of the modified YOLOv5s algorithm on mAP@0.5 and mAP@0.5:0.95 has been improved by 1.23% and 1.44%, respectively, and the precision and recall have been enhanced by 3.68% and 1.06%, respectively. The results show that our method significantly boosts the accuracy and robustness of industrial target recognition and localization.

13.
Pest Manag Sci ; 80(6): 2577-2586, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38243837

ABSTRACT

BACKGROUND: The polyphagous mirid bug Apolygus lucorum (Meyer-Dür) and the green leafhopper Empoasca spp. Walsh are small pests that are widely distributed and important pests of many economically important crops, especially kiwis. Conventional monitoring methods are expensive, laborious and error-prone. Currently, deep learning methods are ineffective at recognizing them. This study proposes a new deep-learning-based YOLOv5s_HSSE model to automatically detect and count them on sticky card traps. RESULTS: Based on a database of 1502 images, all images were collected from kiwi orchards at multiple locations and times. We trained the YOLOv5s model to detect and count them and then changed the activation function to Hard swish in YOLOv5s, introduced the SIoU Loss function, and added the squeeze-and-excitation attention mechanism to form a new YOLOv5s_HSSE model. Mean average precision of this model in the test dataset was 95.9%, the recall rate was 93.9% and the frames per second was 155, which are higher than those of other single-stage deep-learning models, such as SSD, YOLOv3 and YOLOv4. CONCLUSION: The proposed YOLOv5s_HSSE model can be used to identify and count A. lucorum and Empoasca spp., and it is a new, efficient and accurate monitoring method. Pest detection will benefit from the broader applications of deep learning. © 2024 Society of Chemical Industry.


Subjects
Hemiptera, Heteroptera, Animals, Deep Learning, Insect Control/methods, Image Processing, Computer-Assisted/methods
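The Hard swish activation swapped into YOLOv5s above is a cheap, piecewise-linear stand-in for swish: x * ReLU6(x + 3) / 6. A one-line sketch:

```python
def hard_swish(x):
    """Hard swish: x * ReLU6(x + 3) / 6, clamping the gate to [0, 1]."""
    return x * max(0.0, min(6.0, x + 3.0)) / 6.0
```

The clamped gate avoids the exponentials of sigmoid-based swish, which is why it is popular on embedded and mobile targets.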
14.
Biomimetics (Basel) ; 9(1)2024 Jan 03.
Article in English | MEDLINE | ID: mdl-38248602

ABSTRACT

Steel strip is an important raw material for the engineering, automotive, shipbuilding, and aerospace industries. However, during the production process, the surface of the steel strip is prone to cracks, pitting, and other defects that affect its appearance and performance. It is important to use machine vision technology to detect defects on the surface of a steel strip in order to improve its quality. To address the difficulties in classifying the fine-grained features of strip steel surface images and to improve the defect detection rate, we propose an improved YOLOv5s model called YOLOv5s-FPD (Fine Particle Detection). The SPPF-A (Spatial Pyramid Pooling Fast-Advance) module was constructed to adjust the spatial pyramid structure, and the ASFF (Adaptively Spatial Feature Fusion) and CARAFE (Content-Aware ReAssembly of FEatures) modules were introduced to improve the feature extraction and fusion capabilities of strip images. The CSBL (Convolutional Separable Bottleneck) module was also constructed, and the DCNv2 (Deformable ConvNets v2) module was introduced to improve the model's lightweight properties. The CBAM (Convolutional Block Attention Module) attention module is used to extract key and important information, further improving the model's feature extraction capability. Experimental results on the NEU_DET (NEU surface defect database) dataset show that YOLOv5s-FPD improves the mAP50 accuracy by 2.6% before data enhancement and 1.8% after SSIE (steel strip image enhancement) data enhancement, compared to the YOLOv5s prototype. It also improves the detection accuracy of all six defects in the dataset. Experimental results on the VOC2007 public dataset demonstrate that YOLOv5s-FPD improves the mAP50 accuracy by 4.6% before data enhancement, compared to the YOLOv5s prototype. Overall, these results confirm the validity and usefulness of the proposed model.

15.
Front Plant Sci ; 14: 1323453, 2023.
Article in English | MEDLINE | ID: mdl-38148868

ABSTRACT

Introduction: With continuously increasing labor costs, an urgent need for automated apple-picking equipment has emerged in the agricultural sector. Prior to apple harvesting, it is imperative that the equipment not only accurately locates the apples, but also discerns the graspability of the fruit. While numerous studies on apple detection have been conducted, the challenges related to determining apple graspability remain unresolved. Methods: This study introduces a method for detecting multi-occluded apples based on an enhanced YOLOv5s model, with the aim of identifying the type of apple occlusion in complex orchard environments and determining apple graspability. Using bootstrap your own latent (BYOL) and knowledge transfer (KT) strategies, we effectively enhance the classification accuracy for multi-occluded apples while reducing data production costs. A selective kernel (SK) module is also incorporated, enabling the network model to more precisely identify various apple occlusion types. To evaluate the performance of our network model, we define three key metrics: APGA, APTUGA, and APUGA, representing the average detection accuracy for graspable, temporarily ungraspable, and ungraspable apples, respectively. Results: Experimental results indicate that the improved YOLOv5s model performs exceptionally well, achieving detection accuracies of 94.78%, 93.86%, and 94.98% for APGA, APTUGA, and APUGA, respectively. Discussion: Compared to current lightweight network models such as YOLOX-s and YOLOv7s, our proposed method demonstrates significant advantages across multiple evaluation metrics. In future research, we intend to integrate fruit posture and occlusion detection to further enhance the visual perception capabilities of apple-picking equipment.

16.
Front Plant Sci ; 14: 1246717, 2023.
Article in English | MEDLINE | ID: mdl-37915513

ABSTRACT

Introduction: The accurate extraction of navigation paths is crucial for the automated navigation of agricultural robots. Navigation line extraction in complex environments such as a Panax notoginseng shade house can be challenging due to factors including similar colors between the fork rows and soil, and the shadows cast by shade nets. Methods: In this paper, we propose a new method for navigation line extraction based on deep learning and least squares (DL-LS) algorithms. We improve the YOLOv5s algorithm by introducing MobileNetv3 and ECANet. The trained model detects the seven-fork roots in the effective area between rows and uses the root point substitution method to determine the coordinates of the localization base points of the seven-fork root points. The seven-fork column lines on both sides of the planting ridge are fitted using the least squares method. Results: The experimental results indicate that Im-YOLOv5s achieves higher detection performance than other detection models. Through these improvements, Im-YOLOv5s achieves a mAP (mean Average Precision) of 94.9%. Compared to YOLOv5s, Im-YOLOv5s improves the average accuracy and frame rate by 1.9% and 27.7%, respectively, and the weight size is reduced by 47.9%. The results also reveal the ability of DL-LS to accurately extract seven-fork row lines, with a maximum deviation of the navigation baseline row direction of 1.64°, meeting the requirements of robot navigation line extraction. Discussion: The results show that, compared to existing models, this model is more effective in detecting the seven-fork roots in images, and its computational complexity is smaller. Our proposed method provides a basis for the intelligent mechanization of Panax notoginseng planting.

17.
Math Biosci Eng ; 20(9): 16148-16168, 2023 Aug 09.
Article in English | MEDLINE | ID: mdl-37920007

ABSTRACT

Aerial image target detection technology has essential application value in navigation security, traffic control, and environmental monitoring. Compared with natural scene images, the background of aerial images is more complex, and there are more small targets, which puts higher requirements on the detection accuracy and real-time performance of the algorithm. To further improve the detection accuracy of lightweight networks for small targets in aerial images, we propose a cross-scale multi-feature fusion target detection method (CMF-YOLOv5s) for aerial images. Based on the original YOLOv5s, a bidirectional cross-scale feature fusion sub-network (BsNet) is constructed, using a newly designed multi-scale fusion module (MFF) and a cross-scale feature fusion strategy to enhance the algorithm's ability to fuse multi-scale feature information and reduce the loss of small-target feature information. To reduce the high missed-detection rate of small targets in aerial images, we constructed a multi-scale detection head containing four outputs to improve the network's ability to perceive small targets. To enhance the network's recognition rate of small target samples, we improve the K-means algorithm by introducing a genetic algorithm that optimizes the prediction frame size to generate anchor boxes more suitable for aerial images. The experimental results show that on the aerial image small target dataset VisDrone-2019, the proposed method can detect more small targets in aerial images with complex backgrounds. With a detection speed of 116 FPS, compared with the original algorithm, the detection accuracy metrics mAP0.5 and mAP0.5:0.95 for small targets are improved by 5.5% and 3.6%, respectively. Meanwhile, compared with eight advanced lightweight networks such as YOLOv7-Tiny and PP-PicoDet-s, mAP0.5 improves by more than 3.3%, and mAP0.5:0.95 improves by more than 1.9%.
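Anchor fitting of this kind is typically k-means over box (w, h) pairs with 1 - IoU as the distance. A minimal sketch, with `centers` standing in for whatever seeding the genetic-algorithm step supplies (the data and seeds below are illustrative):

```python
def kmeans_anchors(whs, centers, iters=20):
    """k-means over (w, h) pairs using 1 - IoU as the distance.

    `whs` are dataset box sizes; `centers` is the initial anchor guess.
    """
    def iou(a, b):
        # IoU of two boxes aligned at the origin, so only sizes matter.
        inter = min(a[0], b[0]) * min(a[1], b[1])
        return inter / (a[0] * a[1] + b[0] * b[1] - inter)

    for _ in range(iters):
        clusters = [[] for _ in centers]
        for wh in whs:
            # Highest IoU == smallest (1 - IoU) distance.
            best = max(range(len(centers)), key=lambda i: iou(wh, centers[i]))
            clusters[best].append(wh)
        centers = [
            (sum(w for w, _ in c) / len(c), sum(h for _, h in c) / len(c))
            if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers

anchors = kmeans_anchors([(10, 10), (12, 12), (100, 100), (90, 110)],
                         [(11, 11), (95, 105)])
```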

18.
Sensors (Basel) ; 23(21)2023 Oct 28.
Article in English | MEDLINE | ID: mdl-37960492

ABSTRACT

The hoist cage is used to lift miners in a coal mine's auxiliary shaft. Monitoring miners' unsafe behaviors and their status in the hoist cage is crucial to production safety in coal mines. In this study, a visual detection model is proposed to estimate the number and categories of miners, and to identify whether the miners are wearing helmets and whether they have fallen in the hoist cage. A dataset with eight categories of miners' statuses in hoist cages was developed for training and validating the model. Using the dataset, the classical models were trained for comparison, from which the YOLOv5s model was selected to be the basic model. Due to small-sized targets, poor lighting conditions, coal dust, and occlusion, the detection accuracy of the YOLOv5s model was only 89.2%. To obtain better detection accuracy, the k-means++ clustering algorithm, a BiFPN-based feature fusion network, the convolutional block attention module (CBAM), and a CIoU loss function were proposed to improve the YOLOv5s model, and an attentional multi-scale cascaded feature fusion-based YOLOv5s model (AMCFF-YOLOv5s) was subsequently developed. The training results on the self-built dataset indicate that its detection accuracy increased to 97.6%. Moreover, the AMCFF-YOLOv5s model was proven to be robust to noise and light.
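BiFPN-style fusion of the kind used in the AMCFF neck is a fast normalized weighted sum: learnable non-negative weights are normalized to sum to one, then applied element-wise. A sketch on plain lists standing in for feature maps (the data is illustrative):

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """BiFPN-style fast normalized fusion of equally-shaped inputs."""
    w = [max(0.0, wi) for wi in weights]  # clamp weights to be non-negative
    total = sum(w) + eps                  # eps keeps the division stable
    fused = [0.0] * len(features[0])
    for feat, wi in zip(features, w):
        for j, v in enumerate(feat):
            fused[j] += wi / total * v
    return fused

fused = fast_normalized_fusion([[1.0, 1.0], [3.0, 3.0]], [1.0, 1.0])
```

The normalization avoids the softmax used in earlier attention-weighted fusion while keeping each output a convex-like combination of its inputs.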

19.
Animals (Basel) ; 13(19)2023 Oct 07.
Article in English | MEDLINE | ID: mdl-37835740

ABSTRACT

A forest wildlife detection algorithm based on an improved YOLOv5s network model is proposed to advance forest wildlife monitoring and improve detection accuracy in complex forest environments. This research utilizes a data set from the Hunan Hupingshan National Nature Reserve in China, to which data augmentation and expansion methods are applied to extensively train the proposed model. To enhance the feature extraction ability of the proposed model, a weighted channel stitching method based on channel attention is introduced. The Swin Transformer module is combined with a CNN network to add a Self-Attention mechanism, thus improving the perceptual field for feature extraction. Furthermore, a new loss function (DIOU_Loss) and an adaptive class suppression loss (L_BCE) are adopted to accelerate the model's convergence speed, reduce false detections in confusing categories, and increase its accuracy. When comparing our improved algorithm with the original YOLOv5s network model under the same experimental conditions and data set, significant improvements are observed, in particular, the mean average precision (mAP) is increased from 72.6% to 89.4%, comprising an accuracy improvement of 16.8%. Our improved algorithm also outperforms popular target detection algorithms, including YOLOv5s, YOLOv3, RetinaNet, and Faster-RCNN. Our proposed improvement measures can well address the challenges posed by the low contrast between background and targets, as well as occlusion and overlap, in forest wildlife images captured by trap cameras. These measures provide practical solutions for enhanced forest wildlife protection and facilitate efficient data acquisition.

20.
Sensors (Basel) ; 23(19)2023 Oct 06.
Article in English | MEDLINE | ID: mdl-37837095

ABSTRACT

In response to the problem of high computational and parameter requirements of fatigued-driving detection models, as well as weak facial-feature keypoint extraction capability, this paper proposes a lightweight and real-time fatigued-driving detection model based on an improved YOLOv5s and the Attention Mesh 3D keypoint extraction method. The main strategies are as follows: (1) Using Shufflenetv2_BD to reconstruct the Backbone network to reduce parameter complexity and computational load. (2) Introducing and improving the fusion method of the Cross-scale Aggregation Module (CAM) between the Backbone and Neck networks to reduce information loss in shallow features of the closed-eyes and closed-mouth categories. (3) Building a lightweight Context Information Fusion Module by combining the Efficient Multi-Scale Module (EAM) and Depthwise Over-Parameterized Convolution (DoConv) to enhance the Neck network's ability to extract facial features. (4) Redefining the loss function using Wise-IoU (WIoU) to accelerate model convergence. Finally, the fatigued-driving detection model is constructed by combining the classification detection results with the thresholds of continuous closed-eye frames, continuous yawning frames, and PERCLOS (Percentage of Eyelid Closure over the Pupil over Time) of the eyes and mouth. With the number of parameters and the size of the baseline model reduced by 58% and 56.3%, respectively, and floating-point computation of only 5.9 GFLOPs, the average accuracy of the baseline model is increased by 1%, and the fatigue recognition rate is 96.3%, which proves that the proposed algorithm achieves accurate and stable real-time detection while remaining lightweight. It provides strong support for the lightweight deployment of vehicle terminals.
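The final fatigue decision described above combines PERCLOS with a count of consecutive closed-eye frames; a minimal sketch (the threshold values are illustrative, not the calibrated ones from the paper):

```python
def perclos(eye_closed, window=None):
    """PERCLOS: fraction of frames with eyes closed over the window."""
    frames = eye_closed if window is None else eye_closed[-window:]
    return sum(frames) / len(frames)

def is_fatigued(eye_closed, perclos_thresh=0.25, consec_thresh=3):
    """Flag fatigue when PERCLOS or the longest run of consecutive
    closed-eye frames crosses its (hypothetical) threshold."""
    run = longest = 0
    for closed in eye_closed:
        run = run + 1 if closed else 0
        longest = max(longest, run)
    return perclos(eye_closed) >= perclos_thresh or longest >= consec_thresh
```

A real deployment would evaluate this over a sliding time window and combine it with the analogous yawning-frame signal, as the paper describes.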
