Results 1 - 20 of 323
1.
Data Brief ; 55: 110701, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39100771

ABSTRACT

One of the most popular and well-established forms of payment in use today is paper money. Handling paper money might be challenging for those with vision impairments. Assistive technology has been reinventing itself throughout time to better serve the elderly and disabled people. To detect paper currency and extract other useful information from them, image processing techniques and other advanced technologies, such as Artificial Intelligence, Deep Learning, etc., can be used. In this paper, we present a meticulously curated and comprehensive dataset named 'NSTU-BDTAKA' tailored for the simultaneous detection and recognition of a specific object of cultural significance - the Bangladeshi paper currency (in Bengali it is called 'Taka'). This research aims to facilitate the development and evaluation of models for both taka detection and recognition tasks, offering a rich resource for researchers and practitioners alike. The dataset is divided into two distinct components: (i) taka detection, and (ii) taka recognition. The taka detection subset comprises 3,111 high-resolution images, each meticulously annotated with rectangular bounding boxes that encompass instances of the taka. These annotations serve as ground truth for training and validating object detection models, and we adopt the state-of-the-art YOLOv5 architecture for this purpose. In the taka recognition subset, the dataset has been extended to include a vast collection of 28,875 images, each showcasing various instances of the taka captured in diverse contexts and environments. The recognition dataset is designed to address the nuanced task of taka recognition providing researchers with a comprehensive set of images to train, validate, and test recognition models. This subset encompasses challenges such as variations in lighting, scale, orientation, and occlusion, further enhancing the robustness of developed recognition algorithms. 
The dataset NSTU-BDTAKA not only serves as a benchmark for taka detection and recognition but also fosters advancements in object detection and recognition methods that can be extrapolated to other cultural artifacts and objects. We envision that the dataset will catalyze research efforts in the field of computer vision, enabling the development of more accurate, robust, and efficient models for both detection and recognition tasks.
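
The rectangular bounding-box annotations described above are typically stored, for YOLOv5 training, as one text line per object with normalized center coordinates. The abstract does not state the on-disk label format of NSTU-BDTAKA, so the following is only a minimal sketch assuming that common convention; the function names and the class id are hypothetical.

```python
def to_yolo_box(x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space corner box to normalized (cx, cy, w, h).

    Assumption: the target format is the normalized center/size layout
    commonly used for YOLOv5 labels; the dataset's real format may differ.
    """
    cx = (x_min + x_max) / 2.0 / img_w
    cy = (y_min + y_max) / 2.0 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return cx, cy, w, h


def yolo_label_line(class_id, box):
    """Format one annotation as 'class cx cy w h' with 6 decimals."""
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in box)


if __name__ == "__main__":
    # Hypothetical 400x200 image with one banknote box.
    box = to_yolo_box(100, 50, 300, 150, img_w=400, img_h=200)
    print(yolo_label_line(0, box))
```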

2.
Sensors (Basel) ; 24(15)2024 Aug 02.
Article in English | MEDLINE | ID: mdl-39124057

ABSTRACT

With the increasing importance of subways in urban public transportation systems, pedestrian flow simulation for supporting station management and risk analysis becomes more necessary. There is a need to calibrate the simulation model parameters with real-world pedestrian flow data to achieve a simulation closer to the real situation. This study presents a calibration approach based on YOLOv5 for calibrating the simulation model parameters in the social force model inserted in Anylogic. This study compared the simulation results after model calibration with real data. The results show that (1) the parameters calibrated in this paper can reproduce the characteristics of pedestrian flow in the station; (2) the calibration model not only decreases global errors but also overcomes the common phenomenon of large differences between simulation and reality.

3.
Water Environ Res ; 96(8): e11092, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39129273

ABSTRACT

Water pollution has become a major concern in recent years, affecting over 2 billion people worldwide, according to UNESCO. This pollution can occur either naturally, as with algal blooms, or through human activity, when toxic substances are released into water bodies like lakes, rivers, springs, and oceans. To address this issue and monitor surface-level water pollution in local water bodies, an informative real-time vision-based surveillance system has been developed in conjunction with large language models (LLMs). This system has an integrated camera connected to a Raspberry Pi for processing input frames and is further linked to LLMs for generating contextual information regarding the type, causes, and impact of pollutants on both human health and the environment. This multi-model setup enables local authorities to monitor water pollution and take necessary steps to mitigate it. To train the vision model, seven major types of pollutants found in water bodies were used to achieve accurate detection: algal blooms, synthetic foams, dead fish, oil spills, wooden logs, industrial waste run-offs, and trash. The ChatGPT API has been integrated with the model to generate contextual information about the pollution detected. Thus, the multi-model system can conduct surveillance over water bodies and autonomously alert local authorities to take immediate action, eliminating the need for human intervention. PRACTITIONER POINTS: Combines cameras and LLMs with Raspberry Pi for processing and generating pollutant information. Uses YOLOv5 to detect algal blooms, synthetic foams, dead fish, oil spills, and industrial waste. Supports various modules and environments, including drones and mobile apps for broad monitoring. Educates on environmental health and alerts authorities about water pollution.


Subject(s)
Environmental Monitoring , Water Pollution , Environmental Monitoring/methods , Water Pollution/analysis , Artificial Intelligence , Models, Theoretical
4.
Phys Eng Sci Med ; 2024 Aug 12.
Article in English | MEDLINE | ID: mdl-39133370

ABSTRACT

The cervical vertebral maturation (CVM) method is essential to determine the timing of orthodontic and orthopedic treatment. In this paper, a target detection model called DC-YOLOv5 is proposed to achieve fully automated detection and staging of CVM. A total of 1800 cephalometric radiographs were labeled and categorized based on the CVM stages. DC-YOLOv5 is based on YOLOv5 and optimized for the specific characteristics of CVM. This optimization includes replacing the original bounding box regression loss calculation method with Wise-IoU to address the issue of mutual interference between vertical and horizontal losses in Complete-IoU (CIoU), which made model convergence challenging. We incorporated the Res-dcn-head module structure to enhance the focus on small target features, improving the model's sensitivity to subtle sample differences. Additionally, we introduced the Convolutional Block Attention Module (CBAM) dual-channel attention mechanism to enhance focus and understanding of critical features, thereby enhancing the accuracy and efficiency of target detection. Loss functions, precision, recall, mean average precision (mAP), and F1 scores were used as the main algorithm evaluation metrics to assess the performance of these models. Furthermore, we attempted to analyze regions important for model predictions using gradient Class Activation Mapping (CAM) techniques. For CVM identification, the DC-YOLOv5 model reached a final F1 score of 0.993, an mAP@0.5 of 0.994, and an mAP@0.5:0.95 of 0.943, with faster convergence and more accurate and more robust detection than the other four models. The DC-YOLOv5 algorithm shows high accuracy and robustness in CVM identification, which provides strong support for fast and accurate CVM identification and has a positive effect on the development of the medical field and clinical diagnosis.
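
Precision, recall, and F1, which are listed among the evaluation metrics above, follow directly from detection counts. The sketch below uses the standard definitions; the counts in the usage example are illustrative and are not taken from the paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from true/false positive and
    false negative counts, using the standard definitions."""
    if tp + fp == 0 or tp + fn == 0:
        raise ValueError("precision or recall is undefined for these counts")
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Illustrative counts only.
    print(precision_recall_f1(tp=8, fp=2, fn=2))
```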

5.
Sci Rep ; 14(1): 15879, 2024 Jul 10.
Article in English | MEDLINE | ID: mdl-38982140

ABSTRACT

Spinal diseases and frozen shoulder are prevalent health problems in Asian populations. Early assessment and treatment are very important to prevent the disease from getting worse and reduce pain. In the field of computer vision, it is a challenging problem to assess the range of motion. In order to realize efficient, real-time and accurate assessment of the range of motion, an assessment system combining MediaPipe and YOLOv5 technologies was proposed in this study. On this basis, Convolutional Block Attention Module (CBAM) is introduced into the YOLOv5 target detection model, which can enhance the extraction of feature information, suppress background interference, and improve the generalization ability of the model. In order to meet the requirements of large-scale computing, a client/server (C/S) framework structure is adopted. The evaluation results can be obtained quickly after the client uploads the image data, providing a convenient and practical solution. In addition, a game of "Picking Bayberries" was developed as an auxiliary treatment method to provide patients with interesting rehabilitation training.
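
A range-of-motion value of the kind described above can be derived as the angle at a joint formed by three pose keypoints. The abstract does not say how the system computes it, so this is a minimal sketch under the assumption that 2D keypoints (for example hip, shoulder, and elbow positions from a pose estimator) are already available; the function name and coordinates are hypothetical.

```python
import math


def joint_angle_deg(a, b, c):
    """Angle in degrees at vertex b between the rays b->a and b->c.

    a, b, c are (x, y) keypoints. atan2 of |cross| and dot is used
    instead of acos to stay numerically stable near 0 and 180 degrees.
    """
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    return math.degrees(math.atan2(abs(cross), dot))


if __name__ == "__main__":
    # Hypothetical keypoints: arm raised at a right angle to the torso.
    hip, shoulder, elbow = (0.0, 1.0), (0.0, 0.0), (1.0, 0.0)
    print(joint_angle_deg(hip, shoulder, elbow))
```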


Subject(s)
Bursitis , Range of Motion, Articular , Spinal Diseases , Humans , Bursitis/physiopathology , Bursitis/therapy , Bursitis/diagnosis , Spinal Diseases/diagnosis , Spinal Diseases/physiopathology , Spinal Diseases/therapy , Male , Female , Adult , Middle Aged
6.
Front Plant Sci ; 15: 1404772, 2024.
Article in English | MEDLINE | ID: mdl-39055359

ABSTRACT

Accurate detection and counting of flax plant organs are crucial for obtaining phenotypic data and are the cornerstone of flax variety selection and management strategies. In this study, a Flax-YOLOv5 model is proposed for obtaining flax plant phenotypic data. Based on the solid foundation of the original YOLOv5x feature extraction network, the network structure was extended to include the BiFormer module, which seamlessly integrates bi-directional encoders and converters, enabling it to focus on key features in an adaptive query manner. As a result, this improves the computational performance and efficiency of the model. In addition, we introduced the SIoU function to compute the regression loss, which effectively solves the problem of mismatch between predicted and actual frames. The flax plants grown in Lanzhou were collected to produce the training, validation, and test sets, and the detection results on the validation set showed that the average accuracy (mAP@0.5) was 99.29%. In the test set, the correlation coefficients (R) of the model's prediction results with the manually measured number of flax fruits, plant height, main stem length, and number of main stem divisions were 99.59%, 99.53%, 99.05%, and 92.82%, respectively. This study provides a stable and reliable method for the detection and quantification of flax phenotypic characteristics. It opens up a new technical way of selecting and breeding good varieties.
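
The correlation coefficients (R) reported above compare model predictions with manual measurements; a value written as 99.59% corresponds to R = 0.9959. A minimal sketch of the standard Pearson coefficient follows; the sample values are illustrative and not from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Illustrative fruit counts: manual measurement vs. model prediction.
    manual = [10, 12, 15, 20]
    predicted = [11, 12, 14, 21]
    print(round(pearson_r(manual, predicted), 4))
```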

7.
Front Neurorobot ; 18: 1423738, 2024.
Article in English | MEDLINE | ID: mdl-39015151

ABSTRACT

Introduction: Road cracks significantly shorten the service life of roads. Manual detection methods are inefficient and costly. The YOLOv5 model has made some progress in road crack detection. However, issues arise when deployed on edge computing devices. The main problem is that edge computing devices are directly connected to sensors. This results in the collection of noisy, poor-quality data. This problem adds computational burden to the model, potentially impacting its accuracy. To address these issues, this paper proposes a novel road crack detection algorithm named EMG-YOLO. Methods: First, an Efficient Decoupled Header is introduced in YOLOv5 to optimize the head structure. This approach separates the classification task from the localization task. Each task can then focus on learning its most relevant features. This significantly reduces the model's computational resources and time. It also achieves faster convergence rates. Second, the IOU loss function in the model is upgraded to the MPDIOU loss function. This function works by minimizing the top-left and bottom-right point distances between the predicted bounding box and the actual labeled bounding box. The MPDIOU loss function addresses the complex computation and high computational burden of the current YOLOv5 model. Finally, the GCC3 module replaces the traditional convolution. It performs global context modeling with the input feature map to obtain global context information. This enhances the model's detection capabilities on edge computing devices. Results: Experimental results show that the improved model has better performance in all parameter indicators compared to current mainstream algorithms. The EMG-YOLO model improves the accuracy of the YOLOv5 model by 2.7%. The mAP (0.5) and mAP (0.9) are improved by 2.9% and 0.9%, respectively. The new algorithm also outperforms the YOLOv5 model in complex environments on edge computing devices. 
Discussion: The EMG-YOLO algorithm proposed in this paper effectively addresses the issues of poor data quality and high computational burden on edge computing devices. This is achieved through optimizing the model head structure, upgrading the loss function, and introducing global context modeling. Experimental results demonstrate significant improvements in both accuracy and efficiency, especially in complex environments. Future research can further optimize this algorithm and explore more lightweight and efficient object detection models for edge computing devices.
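
The MPDIoU loss described in the Methods, which penalizes the distances between the top-left and bottom-right corners of the predicted and labeled boxes, can be sketched as follows. This follows the commonly published MPDIoU formulation (IoU minus the two squared corner distances, each normalized by the squared image diagonal); it is an assumption that EMG-YOLO uses exactly this form, and the function names are hypothetical.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2) with x1 < x2, y1 < y2."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


def mpdiou_loss(pred, target, img_w, img_h):
    """1 - MPDIoU, where MPDIoU = IoU - d1^2/(w^2+h^2) - d2^2/(w^2+h^2).

    d1 is the distance between top-left corners, d2 between bottom-right
    corners, and w, h are the input image width and height.
    """
    d1_sq = (pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2
    d2_sq = (pred[2] - target[2]) ** 2 + (pred[3] - target[3]) ** 2
    norm = img_w ** 2 + img_h ** 2
    return 1.0 - (iou(pred, target) - d1_sq / norm - d2_sq / norm)


if __name__ == "__main__":
    print(mpdiou_loss((0, 0, 2, 2), (1, 1, 3, 3), img_w=10, img_h=10))
```

Because both corner terms vanish only when the boxes coincide, the loss still provides a gradient signal when two boxes have the same IoU but different corner offsets.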

8.
Nan Fang Yi Ke Da Xue Xue Bao ; 44(7): 1217-1226, 2024 Jul 20.
Article in Chinese | MEDLINE | ID: mdl-39051067

ABSTRACT

The development of various models for automated image screening has significantly enhanced the efficiency and accuracy of cervical cytology image analysis. Single-stage target detection models are capable of fast detection of abnormalities in cervical cytology, but an accurate diagnosis of abnormal cells relies not only on identification of a single cell itself but also on comparison with the surrounding cells. Herein we present the Trans-YOLOv5 model, an automated abnormal cell detection model based on the YOLOv5 model incorporating the global-local attention mechanism to allow efficient multiclassification detection of abnormal cells in cervical cytology images. The experimental results using a large cervical cytology image dataset demonstrated the efficiency and accuracy of this model in comparison with the state-of-the-art methods, with an mAP reaching 65.9% and an AR reaching 53.3%, showing the great potential of this model in automated cervical cancer screening based on cervical cytology images.


Subject(s)
Cervix Uteri , Uterine Cervical Neoplasms , Humans , Female , Uterine Cervical Neoplasms/pathology , Uterine Cervical Neoplasms/diagnosis , Cervix Uteri/pathology , Cervix Uteri/cytology , Image Processing, Computer-Assisted/methods , Algorithms , Vaginal Smears/methods , Cytology
9.
BMC Med Imaging ; 24(1): 187, 2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39054448

ABSTRACT

OBJECTIVE: There are two major issues in the MRI image diagnosis task for Parkinson's disease. Firstly, there are only slight differences in MRI images between healthy individuals and Parkinson's patients, and the medical field has not yet established precise lesion localization standards, which poses a huge challenge for the effective prediction of Parkinson's disease through MRI images. Secondly, the early diagnosis of Parkinson's disease traditionally relies on the subjective judgment of doctors, which leads to insufficient accuracy and consistency. This article proposes an improved YOLOv5 detection algorithm based on deep learning for predicting and classifying Parkinson's images. METHODS: This article uses an improved YOLOv5s network as the basic framework. Firstly, the CA attention mechanism was introduced to enable the model to dynamically adjust attention based on local features of the image, significantly enhancing the sensitivity of the model to PD-related small pathological features. Secondly, a dynamic full-dimensional convolution module was substituted in to optimize the multi-level extraction of image features. Finally, the coupling head strategy was adopted to improve the execution efficiency of classification and localization tasks separately. RESULTS: We validated the effectiveness of the proposed method using a dataset of 582 MRI images from 108 patients. The results show that the proposed method achieves 0.961, 0.974, and 0.986 in Precision, Recall, and mAP, respectively, and the experimental results are superior to other algorithms. CONCLUSION: The improved model achieves high detection accuracy and can accurately detect and recognize complex Parkinson's MRI images. SIGNIFICANCE: This algorithm has shown good performance in the early diagnosis of Parkinson's disease and can provide clinical assistance for doctors in early diagnosis. It compensates for the limitations of traditional methods.


Subject(s)
Deep Learning , Magnetic Resonance Imaging , Parkinson Disease , Humans , Parkinson Disease/diagnostic imaging , Parkinson Disease/classification , Magnetic Resonance Imaging/methods , Algorithms , Female , Male , Image Interpretation, Computer-Assisted/methods , Aged , Middle Aged , Early Diagnosis
10.
Sensors (Basel) ; 24(14)2024 Jul 11.
Article in English | MEDLINE | ID: mdl-39065881

ABSTRACT

Addressing the limitations of current railway track foreign object detection techniques, which suffer from inadequate real-time performance and diminished accuracy in detecting small objects, this paper introduces an innovative vision-based perception methodology harnessing the power of deep learning. Central to this approach is the construction of a railway boundary model utilizing a sophisticated track detection method, along with an enhanced UNet semantic segmentation network to achieve autonomous segmentation of diverse track categories. By employing equal interval division and row-by-row traversal, critical track feature points are precisely extracted, and the track linear equation is derived through the least squares method, thus establishing an accurate railway boundary model. We optimized the YOLOv5s detection model in four aspects: incorporating the SE attention mechanism into the Neck network layer to enhance the model's feature extraction capabilities, adding a prediction layer to improve the detection performance for small objects, proposing a linear size scaling method to obtain suitable anchor boxes, and utilizing Inner-IoU to refine the boundary regression loss function, thereby increasing the positioning accuracy of the bounding boxes. We conducted a detection accuracy validation for railway track foreign object intrusion using a self-constructed image dataset. The results indicate that the proposed semantic segmentation model achieved an MIoU of 91.8%, representing a 3.9% improvement over the previous model, effectively segmenting railway tracks. Additionally, the optimized detection model could effectively detect foreign object intrusions on the tracks, reducing missed and false alarms and achieving a 7.4% increase in the mean average precision (IoU = 0.5) compared to the original YOLOv5s model. The model exhibits strong generalization capabilities in scenarios involving small objects. 
This proposed approach represents an effective exploration of deep learning techniques for railway track foreign object intrusion detection, suitable for use in complex environments to ensure the operational safety of rail lines.
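
The least squares step mentioned above, which turns extracted track feature points into a linear equation, can be sketched with the closed-form solution for a straight line. The abstract does not give the parameterization; this sketch assumes the column x is modeled as a function of the image row y (convenient for near-vertical rails sampled row by row), and the function name is hypothetical.

```python
def fit_track_line(points):
    """Least squares fit of x = slope * y + intercept.

    points: list of (y, x) feature points, one per sampled image row.
    Assumption: rails are modeled as x(y), which avoids infinite slopes
    for near-vertical tracks; the paper's parameterization may differ.
    """
    n = len(points)
    sum_y = sum(y for y, _ in points)
    sum_x = sum(x for _, x in points)
    sum_yy = sum(y * y for y, _ in points)
    sum_yx = sum(y * x for y, x in points)
    denom = n * sum_yy - sum_y * sum_y
    if n < 2 or denom == 0:
        raise ValueError("need at least two points on distinct rows")
    slope = (n * sum_yx - sum_y * sum_x) / denom
    intercept = (sum_x - slope * sum_y) / n
    return slope, intercept


if __name__ == "__main__":
    # Illustrative feature points sampled at equal row intervals.
    print(fit_track_line([(0, 1.0), (100, 52.0), (200, 99.0), (300, 151.0)]))
```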

11.
Sensors (Basel) ; 24(14)2024 Jul 11.
Article in English | MEDLINE | ID: mdl-39065900

ABSTRACT

Traditionally, monitoring insect populations involved the use of externally placed sticky paper traps, which were periodically inspected by a human operator. To automate this process, a specialized sensing device and an accurate model for detecting and counting insect pests are essential. Despite considerable progress in insect pest detector models, their practical application is hindered by the shortage of insect trap images. To attenuate the "lack of data" issue, the literature proposes data augmentation. However, our knowledge about data augmentation is still quite limited, especially in the field of insect pest detection. The aim of this experimental study was to investigate the effect of several widely used augmentation techniques and their combinations on remote-sensed trap images with the YOLOv5 (small) object detector model. This study was carried out systematically on two different datasets starting from the single geometric and photometric transformation toward their combinations. Our results show that the model's mean average precision value (mAP50) could be increased from 0.844 to 0.992 and from 0.421 to 0.727 on the two datasets using the appropriate augmentation methods combination. In addition, this study also points out that the integration of photometric image transformations into the mosaic augmentation can be more efficient than the native combination of augmentation techniques because this approach further improved the model's mAP50 values to 0.999 and 0.756 on the two test sets, respectively.
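
To make the distinction between geometric and photometric transformations concrete, the sketch below applies one of each to a tiny grayscale image stored as nested lists. It is illustrative only and not the study's pipeline: a horizontal flip must also move the bounding boxes, while a brightness change leaves them untouched. Boxes are assumed to be in normalized (cx, cy, w, h) form, and all names are hypothetical.

```python
def hflip(image, boxes):
    """Geometric transform: mirror the image left-right.

    image: list of rows of pixel values; boxes: list of normalized
    (cx, cy, w, h). Only the box center x changes, to 1 - cx.
    """
    flipped = [list(reversed(row)) for row in image]
    new_boxes = [(1.0 - cx, cy, w, h) for cx, cy, w, h in boxes]
    return flipped, new_boxes


def adjust_brightness(image, factor):
    """Photometric transform: scale 8-bit pixel values and clamp to 255.

    Bounding boxes are unaffected, so they are not passed in.
    """
    return [[min(255, int(round(v * factor))) for v in row] for row in image]


if __name__ == "__main__":
    img = [[10, 20, 30], [40, 50, 60]]
    print(hflip(img, [(0.25, 0.5, 0.2, 0.4)]))
    print(adjust_brightness(img, 1.5))
```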


Subject(s)
Insects , Remote Sensing Technology , Animals , Insects/physiology , Remote Sensing Technology/methods , Image Processing, Computer-Assisted/methods , Algorithms , Humans
12.
Sensors (Basel) ; 24(14)2024 Jul 12.
Article in English | MEDLINE | ID: mdl-39065920

ABSTRACT

Simultaneous Localization and Mapping (SLAM) is one of the key technologies with which to address the autonomous navigation of mobile robots, utilizing environmental features to determine a robot's position and create a map of its surroundings. Currently, visual SLAM algorithms typically yield precise and dependable outcomes in static environments, and many algorithms opt to filter out the feature points in dynamic regions. However, when there is an increase in the number of dynamic objects within the camera's view, this approach might result in decreased accuracy or tracking failures. Therefore, this study proposes a solution called YPL-SLAM based on ORB-SLAM2. The solution adds a target recognition and region segmentation module to determine the dynamic region, potential dynamic region, and static region; determines the state of the potential dynamic region using the RANSAC method with polar geometric constraints; and removes the dynamic feature points. It then extracts the line features of the non-dynamic region and finally performs the point-line fusion optimization process using a weighted fusion strategy, considering the image dynamic score and the number of successful feature point-line matches, thus ensuring the system's robustness and accuracy. A large number of experiments have been conducted using the publicly available TUM dataset to compare YPL-SLAM with globally leading SLAM algorithms. The results demonstrate that the new algorithm surpasses ORB-SLAM2 in terms of accuracy (with a maximum improvement of 96.1%) while also exhibiting a significantly enhanced operating speed compared to Dyna-SLAM.
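
The geometric check used to decide whether a matched feature point is dynamic can be illustrated with the standard epipolar constraint: a static point in the second image should lie close to the epipolar line induced by its match in the first image. This sketch covers only that distance test, not the RANSAC estimation of the fundamental matrix, and it assumes the abstract's "polar geometric constraints" refers to this constraint. The threshold and the example matrix are hypothetical.

```python
import math


def epipolar_distance(F, p1, p2):
    """Distance from p2 to the epipolar line l = F @ [x1, y1, 1].

    F is a 3x3 fundamental matrix as nested lists; p1 and p2 are
    matched pixel coordinates (x, y) in the first and second image.
    """
    h1 = (p1[0], p1[1], 1.0)
    line = [sum(F[i][j] * h1[j] for j in range(3)) for i in range(3)]
    norm = math.hypot(line[0], line[1])
    if norm == 0:
        raise ValueError("degenerate epipolar line")
    return abs(line[0] * p2[0] + line[1] * p2[1] + line[2]) / norm


def is_dynamic(F, p1, p2, threshold=1.0):
    """Flag a match as dynamic when it violates the constraint.

    The 1-pixel default threshold is an illustrative assumption.
    """
    return epipolar_distance(F, p1, p2) > threshold


if __name__ == "__main__":
    # Hypothetical F for a pure sideways camera translation with
    # identity intrinsics: static points keep their image row.
    F = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
    print(is_dynamic(F, (10, 5), (12, 5)), is_dynamic(F, (10, 5), (12, 8)))
```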

13.
Sensors (Basel) ; 24(13)2024 Jun 26.
Article in English | MEDLINE | ID: mdl-39000937

ABSTRACT

Although existing 3D object-detection methods have achieved promising results on conventional datasets, it is still challenging to detect objects in data collected under adverse weather conditions. Data distortion from LiDAR and cameras in such conditions leads to poor performance of traditional single-sensor detection methods. Multi-modal data-fusion methods struggle with data distortion and low alignment accuracy, making accurate target detection difficult. To address this, we propose a multi-modal object-detection algorithm, Snow-CLOCs, specifically for snowy conditions. In image detection, we improved the YOLOv5 algorithm by integrating the InceptionNeXt network to enhance feature extraction and using the Wise-IoU algorithm to reduce dependency on high-quality data. For LiDAR point-cloud detection, we built upon the SECOND algorithm and employed the DROR filter to remove noise, enhancing detection accuracy. We combined the detection results from the camera and LiDAR into a unified detection set, represented using a sparse tensor, and extracted features through a 2D convolutional neural network to achieve object detection and localization. Snow-CLOCs achieved a detection accuracy of 86.61% for vehicle detection in snowy conditions.

14.
Sensors (Basel) ; 24(13)2024 Jul 05.
Article in English | MEDLINE | ID: mdl-39001150

ABSTRACT

Quickly and accurately assessing the damage level of buildings is a challenging task for post-disaster emergency response. Most of the existing research mainly adopts semantic segmentation and object detection methods, which have yielded good results. However, for high-resolution Unmanned Aerial Vehicle (UAV) imagery, these methods may result in the problem of various damage categories within a building and fail to accurately extract building edges, thus hindering post-disaster rescue and fine-grained assessment. To address this issue, we proposed an improved instance segmentation model that enhances classification accuracy by incorporating a Mixed Local Channel Attention (MLCA) mechanism in the backbone and improves small object segmentation accuracy by refining the Neck part. The method was tested on the Yangbi earthquake UAV images. The experimental results indicated that the modified model outperformed the original model by 1.07% and 1.11% in the two mean Average Precision (mAP) evaluation metrics, mAPbbox50 and mAPseg50, respectively. Importantly, the classification accuracy of the intact category was improved by 2.73% and 2.73%, respectively, while the collapse category saw an improvement of 2.58% and 2.14%. In addition, the proposed method was also compared with state-of-the-art instance segmentation models, e.g., Mask-R-CNN and YOLO V9-Seg. The results demonstrated that the proposed model exhibits advantages in both accuracy and efficiency. Specifically, the proposed model is three times faster than other models with similar accuracy. The proposed method can provide a valuable solution for fine-grained building damage evaluation.

15.
Sensors (Basel) ; 24(13)2024 Jul 05.
Article in English | MEDLINE | ID: mdl-39001149

ABSTRACT

The efficient and accurate identification of traffic signs is crucial to the safety and reliability of active driving assistance and driverless vehicles. However, the accurate detection of traffic signs under extreme cases remains challenging. Aiming at the problems of missed detection and false detection in traffic sign recognition in foggy traffic scenes, this paper proposes a recognition algorithm for traffic signs based on pix2pixHD+YOLOv5-T. Firstly, the defogging model is generated by training the pix2pixHD network to meet the advanced visual task. Secondly, in order to better match the defogging algorithm with the target detection algorithm, the algorithm YOLOv5-Transformer is proposed by introducing a transformer module into the backbone of YOLOv5. Finally, the defogging algorithm pix2pixHD is combined with the improved YOLOv5 detection algorithm to complete the recognition of traffic signs in foggy environments. Comparative experiments proved that the traffic sign recognition algorithm proposed in this paper can effectively reduce the impact of a foggy environment on traffic sign recognition. Compared with the YOLOv5-T and YOLOv5 algorithms in moderate fog environments, the proposed algorithm achieves an overall improvement. The precision of traffic sign recognition of the algorithm in the foggy traffic scene reached 78.5%, the recall rate was 72.2%, and mAP@0.5 was 82.8%.

16.
Sensors (Basel) ; 24(13)2024 Jul 06.
Article in English | MEDLINE | ID: mdl-39001173

ABSTRACT

Microplastics (MPs, size ≤ 5 mm) have emerged as a significant worldwide concern, threatening marine and freshwater ecosystems, and the lack of MP detection technologies is notable. The main goal of this research is the development of a camera sensor for the detection of MPs and measuring their size and velocity while in motion. This study introduces a novel methodology involving computer vision and artificial intelligence (AI) for the detection of MPs. Three different camera systems, including fixed-focus 2D and autofocus (2D and 3D), were implemented and compared. A YOLOv5-based object detection model was used to detect MPs in the captured image. DeepSORT was then implemented for tracking MPs through consecutive images. In real-time testing in a laboratory flume setting, the precision in MP counting was found to be 97%, and during field testing in a local river, the precision was 96%. This study provides foundational insights into utilizing AI for detecting MPs in different environmental settings, contributing to more effective efforts and strategies for managing and mitigating MP pollution.
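
Once a tracker such as DeepSORT has linked detections across frames, a particle's velocity can be estimated from its track. The abstract does not describe the exact computation, so the following is a minimal sketch that reports the net displacement speed between the first and last observation; the frame rate and the millimeter-per-pixel calibration are hypothetical inputs.

```python
import math


def mean_speed_mm_per_s(track, fps, mm_per_pixel):
    """Net displacement speed of one tracked particle.

    track: list of (frame_index, cx, cy) box centers in pixels, sorted
    by frame. fps and mm_per_pixel are assumed calibration values.
    Returns straight-line distance between the first and last center
    divided by the elapsed time.
    """
    f0, x0, y0 = track[0]
    f1, x1, y1 = track[-1]
    if f1 <= f0:
        raise ValueError("track must span at least two distinct frames")
    distance_mm = math.hypot(x1 - x0, y1 - y0) * mm_per_pixel
    elapsed_s = (f1 - f0) / fps
    return distance_mm / elapsed_s


if __name__ == "__main__":
    # Illustrative track: 50 px of travel over 10 frames at 30 fps.
    print(mean_speed_mm_per_s([(0, 0, 0), (10, 30, 40)], fps=30, mm_per_pixel=0.1))
```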

17.
Sci Rep ; 14(1): 16848, 2024 Jul 22.
Article in English | MEDLINE | ID: mdl-39039263

ABSTRACT

Pomegranate is an important fruit crop that is usually managed manually through experience. Intelligent management systems for pomegranate orchards can improve yields and address labor shortages. Fast and accurate detection of pomegranates is one of the key technologies of this management system, crucial for yield and scientific management. Currently, most solutions use deep learning to achieve pomegranate detection, but existing deep learning models are not effective at detecting small targets, have large parameter counts, and compute slowly; therefore, there is room for improvement in the pomegranate detection task. Based on the improved You Only Look Once version 5 (YOLOv5) algorithm, a lightweight pomegranate growth period detection algorithm, YOLO-Granada, is proposed. A lightweight ShuffleNetv2 network is used as the backbone to extract pomegranate features. Using grouped convolution reduces the computational effort of ordinary convolution, and using channel shuffle increases the interaction between different channels. In addition, the attention mechanism can help the neural network suppress less significant features in the channels or space, and the Convolutional Block Attention Module attention mechanism can improve the effect of attention and optimize the object detection accuracy by using the contribution factor of weights. The average accuracy of the improved network reaches 0.922. It is less than 1% lower than the original YOLOv5s model (0.929) but brings a speed increase and a compression of the model size: the detection speed is 17.3% faster than the original network. The parameters, floating-point operations, and model size of this network are compressed to 54.7%, 51.3%, and 56.3% of the original network, respectively. In addition, the algorithm detects 8.66 images per second, achieving real-time results.
In this study, the Nihui convolutional neural network framework was further utilized to develop an Android-based application for real-time pomegranate detection. The method provides a more accurate and lightweight solution for intelligent management devices in pomegranate orchards, which can provide a reference for the design of neural networks in agricultural applications.
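
The channel shuffle operation mentioned above interleaves channels from different convolution groups so that information can mix across groups in the next layer. A minimal sketch on a plain list of channel indices follows; it mirrors the usual reshape, transpose, and flatten description of ShuffleNet-style shuffling rather than the paper's own code.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups.

    Equivalent to reshaping the channel axis to (groups, n // groups),
    transposing to (n // groups, groups), and flattening again.
    """
    n = len(channels)
    if groups <= 0 or n % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = n // groups
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


if __name__ == "__main__":
    # Channels 0-2 belong to group 0 and channels 3-5 to group 1.
    print(channel_shuffle(list(range(6)), groups=2))
```

After the shuffle, each consecutive block of channels contains members of every original group, which is what lets the following grouped convolution see features from all groups.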


Subject(s)
Algorithms , Fruit , Neural Networks, Computer , Pomegranate , Pomegranate/chemistry , Deep Learning
18.
Sci Rep ; 14(1): 15901, 2024 Jul 10.
Article in English | MEDLINE | ID: mdl-38987266

ABSTRACT

The rapid development of the logistics industry has driven innovations in parcel sorting technology, among which the swift and precise positioning and classification of parcels have become key to enhancing the performance of logistics systems. This study aims to address the limitations of traditional light curtain positioning methods in logistics sorting workshops amidst high-speed upgrades. This paper proposes a high-speed classification and location algorithm for logistics parcels utilizing a monocular camera. The algorithm combines traditional visual processing methods with an enhanced version of the lightweight YOLOv5 object detection algorithm, achieving high-speed, high-precision parcel positioning. Through the adjustment of the network structure and the incorporation of new feature extraction modules and ECIOU loss functions, the model's robustness and detection accuracy have been significantly improved. Experimental results demonstrate that this model exhibits outstanding performance on a customized logistics parcel dataset, notably enhancing the model's parameter efficiency and computational speed, thereby offering an effective solution for industrial applications in high-speed logistics supply.

19.
Sci Rep ; 14(1): 17508, 2024 Jul 30.
Article in English | MEDLINE | ID: mdl-39079949

ABSTRACT

The effective identification of fruit tree leaf disease is of great practical significance for reducing pesticide spraying, improving fruit yield, and realizing ecological agriculture. Computer vision technology can effectively identify and help prevent plant diseases and insect pests. However, existing detection models give insufficient consideration to disease diversity and accuracy, which hinders their application and development in the field of plant pest detection. This paper proposes A-Net, an efficient detection model for apple leaf disease spots, built by improving the traditional Yolov5 detection network. To significantly increase A-Net's detection speed and accuracy, the model applies the Wise-IoU loss function, which includes an attention mechanism and a dynamic focusing mechanism, to the Yolov5 network model. The RepVGG module is then used to replace the original model's convolution module. The experimental results show that the improved model effectively suppresses the growth of some error weights. Compared with several object detection models, the improved A-Net model achieves a mean average precision at an IoU threshold of 0.5 (mAP@0.5) and an accuracy of 92.7%, which indicates that the improved A-Net model has advantages in detecting apple leaf diseases.
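
The IoU threshold of 0.5 mentioned in this abstract can be made concrete with a small sketch. It assumes axis-aligned boxes given as `(x1, y1, x2, y2)` with positive area. The helper names `iou` and `is_true_positive` are hypothetical, and the matching rule shown is the common convention for mAP@0.5, not necessarily the exact evaluation code used in the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap width and height, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """A prediction counts as a hit when its IoU reaches the threshold."""
    return iou(pred_box, gt_box) >= threshold


# Illustrative boxes only (not data from the paper).
half_shifted = iou((0, 0, 2, 2), (1, 0, 3, 2))
taller = iou((0, 0, 2, 2), (0, 0, 2, 3))
```

Here a box shifted by half its width overlaps the ground truth with an IoU of one third and would be rejected at the 0.5 threshold, while the slightly taller box would be accepted.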


Subject(s)
Malus , Plant Diseases , Plant Leaves , Plant Diseases/parasitology , Plant Diseases/prevention & control , Plant Leaves/parasitology , Neural Networks, Computer
20.
Math Biosci Eng ; 21(4): 5782-5802, 2024 Apr 26.
Article in English | MEDLINE | ID: mdl-38872558

ABSTRACT

With the widespread integration of deep learning in intelligent transportation and various industrial sectors, target detection technology is gradually becoming one of the key research areas. Accurately detecting road vehicles and pedestrians is of great significance for the development of autonomous driving technology. Road object detection faces problems such as complex backgrounds, significant scale changes, and occlusion. To accurately identify traffic targets in complex environments, this paper proposes a road target detection algorithm based on an enhanced YOLOv5s. The algorithm introduces the weighted enhanced polarization self-attention (WEPSA) mechanism, which uses spatial attention and channel attention to strengthen the important features extracted by the feature extraction network and suppress insignificant background information. In the neck network, we designed a weighted feature fusion network (CBiFPN) to enhance neck feature representation and enrich semantic information. This feature fusion not only boosts the algorithm's adaptability to intricate scenes, but also contributes to its robust performance. The bounding box regression loss function then uses EIoU to accelerate model convergence and reduce losses. Finally, extensive experiments show that the improved YOLOv5s algorithm achieves mAP@0.5 scores of 92.8% and 53.5% on the open-source datasets KITTI and Cityscapes and 88.7% on the self-built dataset, which are 1.7%, 3.8%, and 3.3% higher than YOLOv5s, respectively, ensuring real-time performance while improving detection accuracy. In addition, compared to the latest YOLOv7 and YOLOv8, the improved YOLOv5 shows good overall performance on the open-source datasets.
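
The EIoU regression loss named in this abstract is commonly defined as 1 - IoU plus three normalized penalties: the squared center distance, the squared width difference, and the squared height difference, each divided by the corresponding squared size of the smallest enclosing box. The sketch below follows that common definition for a single pair of `(x1, y1, x2, y2)` boxes with positive area. The function name `eiou_loss` is hypothetical, and the paper's own implementation may differ in details such as batching or numerical safeguards.

```python
def eiou_loss(pred, target):
    """EIoU loss for one predicted and one target (x1, y1, x2, y2) box."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Plain IoU term.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = pw * ph + tw * th - inter
    iou = inter / union

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared distance between the two box centers.
    center_dist2 = (((px1 + px2) - (tx1 + tx2)) / 2) ** 2 \
        + (((py1 + py2) - (ty1 + ty2)) / 2) ** 2

    return (1.0 - iou
            + center_dist2 / (cw ** 2 + ch ** 2)
            + (pw - tw) ** 2 / cw ** 2
            + (ph - th) ** 2 / ch ** 2)


# Illustrative boxes only (not data from the paper).
perfect = eiou_loss((0, 0, 2, 2), (0, 0, 2, 2))
shifted = eiou_loss((0, 0, 2, 2), (1, 0, 3, 2))
```

Unlike plain IoU, the extra terms stay informative when boxes barely overlap or differ only in aspect ratio, which is the usual argument for faster convergence of this family of losses.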
