Results 1 - 20 of 380
1.
J Imaging ; 10(8)2024 Aug 13.
Article in English | MEDLINE | ID: mdl-39194986

ABSTRACT

Existing deep learning methods still exhibit many limitations in multi-target detection, such as low accuracy and high false-detection and missed-detection rates. This paper proposes an improved Faster R-CNN algorithm aimed at enhancing the capability to detect multi-scale targets. The algorithm makes three improvements to Faster R-CNN. First, it uses the ResNet101 network for feature extraction from the detection image, which provides stronger feature representation. Second, it integrates Online Hard Example Mining (OHEM), Soft non-maximum suppression (Soft-NMS), and Distance Intersection over Union (DIoU) modules, which mitigates the positive-negative sample imbalance and reduces the chance of small targets being missed during training. Finally, the Region Proposal Network (RPN) is simplified to achieve faster detection and a lower miss rate. A multi-scale training (MST) strategy is also used to train the improved Faster R-CNN, balancing detection accuracy and efficiency. Compared with other detection models, the improved Faster R-CNN demonstrates significant advantages in mAP@0.5, F1-score, and log-average miss rate (LAMR). The model proposed in this paper offers valuable insights for many fields, such as smart agriculture, medical diagnosis, and face recognition.
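
For readers unfamiliar with the DIoU term used above, the following is a minimal NumPy sketch of the standard Distance-IoU formulation (not code from the paper): the IoU is penalized by the normalized squared distance between box centers.

```python
import numpy as np

def diou(box_a, box_b):
    """Distance-IoU between two axis-aligned boxes given as [x1, y1, x2, y2].

    DIoU = IoU - d^2 / c^2, where d is the distance between box centers and
    c is the diagonal length of the smallest box enclosing both boxes.
    """
    # Intersection area
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    iou = inter / (area_a + area_b - inter + 1e-9)

    # Squared distance between box centers
    ca = np.array([(box_a[0] + box_a[2]) / 2, (box_a[1] + box_a[3]) / 2])
    cb = np.array([(box_b[0] + box_b[2]) / 2, (box_b[1] + box_b[3]) / 2])
    d2 = np.sum((ca - cb) ** 2)

    # Squared diagonal of the smallest enclosing box
    ex1, ey1 = min(box_a[0], box_b[0]), min(box_a[1], box_b[1])
    ex2, ey2 = max(box_a[2], box_b[2]), max(box_a[3], box_b[3])
    c2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2 + 1e-9

    return iou - d2 / c2

# The DIoU loss for a predicted/ground-truth pair is simply 1 - DIoU
print(1.0 - diou([10, 10, 50, 50], [20, 15, 60, 55]))
```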

2.
Phys Eng Sci Med ; 2024 Aug 12.
Article in English | MEDLINE | ID: mdl-39133370

ABSTRACT

The cervical vertebral maturation (CVM) method is essential for determining the timing of orthodontic and orthopedic treatment. In this paper, a target detection model called DC-YOLOv5 is proposed to achieve fully automated detection and staging of CVM. A total of 1800 cephalometric radiographs were labeled and categorized by CVM stage. DC-YOLOv5 is optimized for the specific characteristics of CVM based on YOLOv5. The optimization includes replacing the original bounding box regression loss with Wise-IoU to address the mutual interference between vertical and horizontal losses in Complete-IoU (CIoU), which made model convergence challenging. We incorporated the Res-dcn-head module structure to strengthen the focus on small-target features, improving the model's sensitivity to subtle differences between samples. Additionally, we introduced the Convolutional Block Attention Module (CBAM), a dual-channel attention mechanism, to sharpen the focus on critical features and thereby improve the accuracy and efficiency of target detection. Loss values, precision, recall, mean average precision (mAP), and F1 scores were used as the main evaluation metrics, and gradient-based Class Activation Mapping (CAM) was used to analyze the regions most important for the model's predictions. The DC-YOLOv5 model achieved a final F1 score of 0.993, an mAP@0.5 of 0.994, and an mAP@0.5:0.95 of 0.943 for CVM identification, with faster convergence and more accurate, more robust detection than the other four comparison models. The DC-YOLOv5 algorithm thus shows high accuracy and robustness in CVM identification, providing strong support for fast and accurate CVM staging in clinical diagnosis.
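
As context for the CBAM component mentioned above, here is a minimal PyTorch sketch of a generic CBAM block (channel attention followed by spatial attention), following the original CBAM design; the reduction ratio and kernel size are illustrative defaults, not values taken from DC-YOLOv5.

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Generic Convolutional Block Attention Module: channel attention
    followed by spatial attention, applied multiplicatively to the input."""

    def __init__(self, channels: int, reduction: int = 16, spatial_kernel: int = 7):
        super().__init__()
        # Channel attention: shared MLP over average- and max-pooled descriptors
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )
        # Spatial attention: conv over channel-wise average and max maps
        self.spatial = nn.Conv2d(2, 1, spatial_kernel,
                                 padding=spatial_kernel // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))
        x = x * torch.sigmoid(avg + mx)                      # channel attention
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))            # spatial attention

feats = torch.randn(1, 64, 40, 40)
print(CBAM(64)(feats).shape)  # torch.Size([1, 64, 40, 40])
```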

3.
Sensors (Basel) ; 24(15)2024 Jul 24.
Article in English | MEDLINE | ID: mdl-39123850

ABSTRACT

Robust object detection in complex environments, poor visual conditions, and open scenarios presents significant technical challenges in autonomous driving. These challenges necessitate the development of advanced fusion methods for millimeter-wave (mmWave) radar point cloud data and visual images. To address these issues, this paper proposes a radar-camera robust fusion network (RCRFNet), which leverages self-supervised learning and open-set recognition to effectively utilise the complementary information from both sensors. Specifically, the network uses matched radar-camera data through a frustum association approach to generate self-supervised signals, enhancing network training. The integration of global and local depth consistencies between radar point clouds and visual images, along with image features, helps construct object class confidence levels for detecting unknown targets. Additionally, these techniques are combined with a multi-layer feature extraction backbone and a multimodal feature detection head to achieve robust object detection. Experiments on the nuScenes public dataset demonstrate that RCRFNet outperforms state-of-the-art (SOTA) methods, particularly in conditions of low visual visibility and when detecting unknown class objects.
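
The frustum association step can be pictured as projecting radar points into the image and keeping those whose projections fall inside a candidate 2D box. The sketch below is a simplified NumPy illustration under an assumed pinhole camera model; it is not the RCRFNet implementation, and the intrinsics and points are made up.

```python
import numpy as np

def associate_radar_to_box(radar_xyz_cam, K, box_xyxy):
    """Return the radar points (in camera coordinates) whose image projections
    fall inside a 2D detection box -- a simplified frustum association.

    radar_xyz_cam: (N, 3) radar points already transformed into the camera frame.
    K:             (3, 3) pinhole intrinsic matrix (assumed known from calibration).
    box_xyxy:      [x1, y1, x2, y2] detection box in pixels.
    """
    pts = radar_xyz_cam[radar_xyz_cam[:, 2] > 0]   # keep points in front of the camera
    uvw = (K @ pts.T).T                            # project with the pinhole model
    uv = uvw[:, :2] / uvw[:, 2:3]
    x1, y1, x2, y2 = box_xyxy
    inside = ((uv[:, 0] >= x1) & (uv[:, 0] <= x2) &
              (uv[:, 1] >= y1) & (uv[:, 1] <= y2))
    return pts[inside]

K = np.array([[1266.0, 0.0, 816.0],
              [0.0, 1266.0, 491.0],
              [0.0, 0.0, 1.0]])                    # illustrative intrinsics only
radar = np.array([[1.0, 0.5, 20.0], [-8.0, 1.0, 15.0]])
print(associate_radar_to_box(radar, K, [700, 400, 900, 600]))  # keeps the first point
```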

4.
Front Neurorobot ; 18: 1431897, 2024.
Article in English | MEDLINE | ID: mdl-39108349

ABSTRACT

We propose a visual Simultaneous Localization and Mapping (SLAM) algorithm that integrates target detection and clustering techniques in dynamic scenarios to address the vulnerability of traditional SLAM algorithms to moving targets. The proposed algorithm integrates the target detection module into the front end of the SLAM system and identifies dynamic objects within the visual range using an improved YOLOv5. Feature points associated with dynamic objects are discarded, and only those corresponding to static targets are used for frame-to-frame matching. This approach effectively addresses camera pose estimation in dynamic environments, enhances positioning accuracy, and optimizes overall visual SLAM performance. Experiments on the TUM public dataset and comparisons with the traditional ORB-SLAM3 and DS-SLAM algorithms show that the proposed visual SLAM algorithm improves positioning accuracy in highly dynamic scenarios by an average of 85.70% and 30.92%, respectively. Compared with the DynaSLAM system based on Mask R-CNN, our system exhibits superior real-time performance while maintaining a comparable absolute trajectory error (ATE). These results highlight that the proposed SLAM algorithm effectively reduces pose estimation errors, enhances positioning accuracy, and offers greater robustness than conventional visual SLAM algorithms.
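
The core front-end idea, discarding feature points that fall inside detections of movable object classes, can be sketched as follows. This is a hedged illustration: the class list, the detector output format, and the keypoint representation are assumptions, not details of the paper's system.

```python
import numpy as np

DYNAMIC_CLASSES = {"person", "car", "bicycle"}   # assumed set of movable classes

def filter_static_keypoints(keypoints_xy, detections):
    """Keep only keypoints lying outside every bounding box of a dynamic class.

    keypoints_xy: (N, 2) array of pixel coordinates of extracted features.
    detections:   list of (class_name, [x1, y1, x2, y2]) from the detector.
    """
    keep = np.ones(len(keypoints_xy), dtype=bool)
    for cls, (x1, y1, x2, y2) in detections:
        if cls not in DYNAMIC_CLASSES:
            continue
        inside = ((keypoints_xy[:, 0] >= x1) & (keypoints_xy[:, 0] <= x2) &
                  (keypoints_xy[:, 1] >= y1) & (keypoints_xy[:, 1] <= y2))
        keep &= ~inside
    return keypoints_xy[keep]

kps = np.array([[50, 60], [320, 240], [610, 400]], dtype=float)
dets = [("person", [300, 200, 400, 480]), ("chair", [590, 380, 640, 470])]
print(filter_static_keypoints(kps, dets))   # the keypoint inside the person box is dropped
```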

5.
Anal Chim Acta ; 1319: 342982, 2024 Aug 29.
Article in English | MEDLINE | ID: mdl-39122269

ABSTRACT

BACKGROUND: The importance of multi-target simultaneous detection lies in its ability to significantly boost detection efficiency, making it invaluable for rapid and cost-effective testing. Photoelectrochemical (PEC) sensors have emerged as promising candidates for detecting harmful substances and biomarkers, owing to their high sensitivity, minimal background signal, cost-effectiveness, simple equipment, and outstanding repeatability. However, designing an effective multi-target detection strategy remains a challenging task in the PEC sensing field, so there is a pressing need to develop PEC sensors capable of simultaneously detecting multiple targets. RESULTS: CdIn2S4/V-MoS2 heterojunctions were successfully prepared via a hydrothermal method. These heterojunctions exhibited a high photocurrent intensity, a 1.53-fold enhancement compared to CdIn2S4 alone. Next, we designed a multi-channel aptasensing chip using ITO as the substrate. Three working electrodes were created by laser etching and subsequently modified with CdIn2S4/V-MoS2 heterojunctions. Thiolated aptamers were then self-assembled onto the CdIn2S4/V-MoS2 heterojunctions via covalent bonds to serve as recognition elements. By employing the CdIn2S4/V-MoS2 heterojunctions as the sensing platform and the aptamers as recognition elements, we developed a disposable aptasensing chip for the simultaneous PEC detection of three typical mycotoxins: aflatoxin B1 (AFB1), ochratoxin A (OTA), and zearalenone (ZEN). The chip exhibited wide detection ranges for AFB1 (0.05-50 ng/mL), OTA (0.05-500 ng/mL), and ZEN (0.1-250 ng/mL), with ultra-low detection limits of 0.017 ng/mL for AFB1, 0.016 ng/mL for OTA, and 0.033 ng/mL for ZEN. SIGNIFICANCE AND NOVELTY: The aptasensing chip stands out for its cost-effectiveness, simplicity of fabrication, and multi-channel capability. Its versatility and practicality make it a powerful platform for designing multi-channel PEC aptasensors. With its ability to detect multiple targets with high sensitivity and specificity, the chip holds considerable potential for applications across diverse fields where multi-target detection is crucial, such as environmental monitoring, clinical diagnostics, and food safety monitoring.


Subjects
Aptamers, Nucleotide; Disulfides; Electrochemical Techniques; Molybdenum; Semiconductors; Molybdenum/chemistry; Electrochemical Techniques/instrumentation; Electrochemical Techniques/methods; Aptamers, Nucleotide/chemistry; Disulfides/chemistry; Limit of Detection; Nanostructures/chemistry; Photochemical Processes; Mycotoxins/analysis; Biosensing Techniques; Cadmium Compounds/chemistry; Ochratoxins/analysis
6.
Med Eng Phys ; 130: 104198, 2024 08.
Article in English | MEDLINE | ID: mdl-39160026

ABSTRACT

Detecting the intention behind a reaching movement is important for myoelectric human-machine collaboration applications. A comprehensive set of handcrafted features was extracted from windows of electromyogram (EMG) signals recorded from upper-limb muscles while subjects reached toward nine nearby targets, resembling activities of daily living. A scoring-based feature selection method, neighborhood component analysis (NCA), selected the relevant feature subset. Finally, the target was recognized by a support vector machine (SVM) model. Classification performance was estimated with a nested cross-validation structure in which the optimal feature subset was selected in the inner loop. Given the low spatial resolution of the target locations on the display and the consequently slight differences between the signals for different targets, the best classification accuracy of 77.11% was achieved by concatenating the features of two segments with lengths of 2 s and 0.25 s. Because the EMG varies only subtly when reaching toward different targets, a wide range of features was applied to capture additional aspects of the information contained in the EMG signals. Furthermore, since NCA selected the features with the greatest discriminant power, it became feasible to employ various combinations of features, and even to concatenate features extracted from different parts of the movement, to improve classification performance.
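
A nested cross-validation pipeline of this kind can be sketched with scikit-learn as below. Note that scikit-learn does not expose an NCA-based feature scorer, so a univariate ANOVA F-score selector stands in for the NCA weighting described above; the synthetic data, feature counts, and hyperparameter grids are placeholders.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for windowed EMG features: 180 reaches x 120 handcrafted
# features, 9 target classes. Real work would compute time/frequency features per window.
rng = np.random.default_rng(0)
X = rng.normal(size=(180, 120))
y = rng.integers(0, 9, size=180)

pipe = Pipeline([
    ("scale", StandardScaler()),
    # ANOVA F-score selection used here as a stand-in for NCA feature weighting
    ("select", SelectKBest(score_func=f_classif)),
    ("svm", SVC(kernel="rbf")),
])

# Inner loop: choose the feature-subset size and SVM hyperparameters
inner = GridSearchCV(pipe, {"select__k": [10, 30, 60], "svm__C": [1, 10]}, cv=3)

# Outer loop: unbiased estimate of generalization accuracy
scores = cross_val_score(inner, X, y, cv=5)
print(f"nested-CV accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```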


Subjects
Electromyography; Movement; Pattern Recognition, Automated; Signal Processing, Computer-Assisted; Support Vector Machine; Humans; Male; Adult; Female; Young Adult; Activities of Daily Living
7.
Sensors (Basel) ; 24(15)2024 Aug 05.
Article in English | MEDLINE | ID: mdl-39124108

ABSTRACT

Side-scan sonar is a principal technique for subsea target detection, and the quantity of sonar images of seabed targets significantly influences the accuracy of intelligent target recognition. To expand the number of representative side-scan sonar target image samples, a novel augmentation method based on self-training with a Disrupted Student model (DS-SIAUG) is designed. The process begins with a dataset of side-scan sonar target images, whose samples are augmented by an adversarial network consisting of a DDPM (Denoising Diffusion Probabilistic Model) and a YOLO (You Only Look Once) detection model. The Disrupted Student model is then used to filter out representative target images, and the selected images are reused as a new dataset to repeat the adversarial filtering process. Experimental results indicate that selection with the Disrupted Student model achieves a target recognition accuracy comparable to manual selection, improving intelligent target recognition accuracy by approximately 5% over direct adversarial-network augmentation.

8.
Am J Primatol ; : e23676, 2024 Aug 15.
Article in English | MEDLINE | ID: mdl-39148233

ABSTRACT

Using unmanned aerial vehicles (UAVs) to survey homeothermic (warm-blooded) animals has gained prominence because they provide practical and precise dynamic censuses, contributing to the development and refinement of conservation strategies. However, the practical application of UAVs for animal monitoring requires automated image interpretation to be effective. Drawing on our past experience, we present the Sichuan snub-nosed monkey (Rhinopithecus roxellana) as a case study illustrating the effective use of UAV-mounted thermal cameras for monitoring monkey populations in Qinling, a region of remarkable biodiversity. We used a local-contrast-based small infrared target detection algorithm to estimate the total population size. Using an experimental group, we determined the average optimal grayscale threshold, and a validation group confirmed that this threshold enables automatic detection and counting of target animals in similar datasets. The precision obtained in the experiments ranged from 85.14% to 97.60%. Our findings reveal a negative correlation between the minimum average distance between thermal spots and the count of detected individuals, indicating greater interference in images with closely spaced thermal spots. We propose a formula for adjusting primate population estimates based on the detection rates obtained from UAV surveys. Our results demonstrate the practical applicability of UAV-based thermal imagery and automated detection algorithms for primate monitoring, albeit with consideration of environmental factors and the need for data preprocessing. This study advances the application of UAV technology in wildlife monitoring, with implications for conservation management and research.
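
The counting step, thresholding a thermal frame at a grayscale cutoff and counting the resulting hot blobs, can be sketched with OpenCV as below. The threshold and minimum blob area are illustrative assumptions; the paper derives its own optimal grayscale threshold from the experimental group.

```python
import cv2
import numpy as np

def count_thermal_hotspots(thermal_gray: np.ndarray, threshold: int = 200,
                           min_area: int = 20) -> int:
    """Count connected hot regions above a grayscale threshold.

    thermal_gray: single-channel 8-bit thermal image.
    threshold:    assumed grayscale cutoff (illustrative, not the paper's value).
    min_area:     discard tiny blobs that are likely noise.
    """
    _, mask = cv2.threshold(thermal_gray, threshold, 255, cv2.THRESH_BINARY)
    n_labels, _, stats, _ = cv2.connectedComponentsWithStats(mask, connectivity=8)
    # Label 0 is the background; keep components larger than min_area pixels.
    return int(np.sum(stats[1:, cv2.CC_STAT_AREA] >= min_area))

# Toy example: two warm blobs on a cold background
img = np.zeros((120, 160), dtype=np.uint8)
cv2.circle(img, (40, 60), 6, 255, -1)
cv2.circle(img, (110, 30), 8, 255, -1)
print(count_thermal_hotspots(img))  # 2
```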

9.
Sensors (Basel) ; 24(14)2024 Jul 22.
Article in English | MEDLINE | ID: mdl-39066144

ABSTRACT

In large public places such as railway stations and airports, dense pedestrian detection is important for safety and security. Deep learning methods provide relatively effective solutions but still face problems such as difficult feature extraction, multi-scale variation across images, and high missed-detection rates, which pose great challenges for research in this field. In this paper, we propose GR-yolo, an improved dense pedestrian detection algorithm based on YOLOv8. GR-yolo introduces the RepC3 module to optimize the backbone network and enhance feature extraction, adopts an aggregation-distribution mechanism to reconstruct the YOLOv8 neck so that multi-level information is fused and exchanged more efficiently, and uses the GIoU loss to help the model converge better, improve target localization accuracy, and reduce missed detections. Experiments show that GR-yolo improves detection performance over YOLOv8, with gains in mean detection accuracy of 3.1% on the WiderPerson dataset, 7.2% on the CrowdHuman dataset, and 11.7% on the People Detection Images dataset. The proposed GR-yolo algorithm is therefore suitable for dense, multi-scale, and scene-variable pedestrian detection, and the improvements also provide a new idea for solving dense pedestrian detection in real scenes.
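
For reference, the GIoU quantity used in the loss above can be computed as in this minimal NumPy sketch (the training loss is then 1 - GIoU); this is the standard formulation, not the GR-yolo code.

```python
import numpy as np

def giou(box_a, box_b):
    """Generalized IoU for two [x1, y1, x2, y2] boxes:
    GIoU = IoU - (enclosing area not covered by the union) / enclosing area."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    # Smallest axis-aligned box enclosing both inputs
    ex1, ey1 = min(box_a[0], box_b[0]), min(box_a[1], box_b[1])
    ex2, ey2 = max(box_a[2], box_b[2]), max(box_a[3], box_b[3])
    enclose = (ex2 - ex1) * (ey2 - ey1)

    iou = inter / (union + 1e-9)
    return iou - (enclose - union) / (enclose + 1e-9)

# The GIoU loss used during training is simply 1 - GIoU
print(1.0 - giou([10, 10, 50, 50], [30, 30, 70, 70]))
```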

10.
Plant Methods ; 20(1): 105, 2024 Jul 16.
Article in English | MEDLINE | ID: mdl-39014411

ABSTRACT

BACKGROUND: Rice field weed object detection can provide key information on weed species and locations for precise spraying, which is of great significance in actual agricultural production. However, in complex and changing real farm environments, traditional object detection methods still have difficulty identifying small, occluded, and densely distributed weed instances. To address these problems, this paper proposes a multi-scale feature enhanced DETR network, named RMS-DETR. By adding multi-scale feature extraction branches on top of DETR, the model fully utilizes information from different semantic feature layers to improve recognition of rice field weeds in real-world scenarios. METHODS: Introducing multi-scale feature layers on the basis of the DETR model, we adopt a differentiated design for the different semantic feature layers. The high-level semantic feature layer adopts a Transformer structure to extract contextual information between barnyard grass and rice plants, while the low-level semantic feature layer uses a CNN structure to extract local detail features of barnyard grass. Because introducing multi-scale feature layers inevitably increases model computation and thus lowers inference speed, we employ PConv (partial convolution) to replace the traditional standard convolutions in the model. RESULTS: Compared to the original DETR model, the proposed RMS-DETR model improved average recognition accuracy by 3.6% and 4.4% on our constructed rice field weeds dataset and the DOTA public dataset, respectively, reaching average recognition accuracies of 0.792 and 0.851. The RMS-DETR model size is 40.8 M, with an inference time of 0.0081 s. Compared with three classical DETR models (Deformable DETR, Anchor DETR, and DAB-DETR), RMS-DETR improved average precision by 2.1%, 4.9%, and 2.4%, respectively. DISCUSSION: The model accurately identifies rice field weeds in complex real-world scenarios, providing key technical support for precision spraying and the management of variable-rate spraying systems.
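
The PConv idea from FasterNet, convolving only a fraction of the channels and passing the rest through unchanged, can be sketched in PyTorch as below; the partial ratio and kernel size are illustrative defaults rather than the RMS-DETR settings.

```python
import torch
import torch.nn as nn

class PConv(nn.Module):
    """Partial convolution (FasterNet-style): apply a 3x3 convolution to only a
    fraction of the channels and pass the remaining channels through untouched,
    reducing FLOPs and memory access compared with a full standard convolution."""

    def __init__(self, channels: int, partial_ratio: float = 0.25):
        super().__init__()
        self.conv_channels = max(1, int(channels * partial_ratio))
        self.conv = nn.Conv2d(self.conv_channels, self.conv_channels,
                              kernel_size=3, padding=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x_conv, x_id = torch.split(
            x, [self.conv_channels, x.shape[1] - self.conv_channels], dim=1)
        return torch.cat([self.conv(x_conv), x_id], dim=1)

x = torch.randn(1, 64, 32, 32)
print(PConv(64)(x).shape)  # torch.Size([1, 64, 32, 32])
```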

11.
Front Plant Sci ; 15: 1404772, 2024.
Article in English | MEDLINE | ID: mdl-39055359

ABSTRACT

Accurate detection and counting of flax plant organs are crucial for obtaining phenotypic data and are the cornerstone of flax variety selection and management strategies. In this study, a Flax-YOLOv5 model is proposed for obtaining flax plant phenotypic data. Building on the original YOLOv5x feature extraction network, the network structure was extended with the BiFormer module, which integrates bi-directional encoders and converters and focuses on key features in an adaptive-query manner, improving the computational performance and efficiency of the model. In addition, we introduced the SIoU function to compute the regression loss, which effectively addresses the mismatch between predicted and ground-truth boxes. Flax plants grown in Lanzhou were collected to produce the training, validation, and test sets, and the detection results on the validation set showed a mean average precision (mAP@0.5) of 99.29%. On the test set, the correlation coefficients (R) between the model's predictions and the manually measured number of flax fruits, plant height, main stem length, and number of main stem divisions were 99.59%, 99.53%, 99.05%, and 92.82%, respectively. This study provides a stable and reliable method for detecting and quantifying flax phenotypic characteristics and opens a new technical route for selecting and breeding improved varieties.
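
The reported R values compare model predictions with manual measurements; a correlation of that kind can be computed as in this small sketch, where the counts are made-up illustrative numbers.

```python
import numpy as np

# Illustrative phenotype values only: model-predicted vs. manually measured fruit counts
predicted = np.array([12, 18, 9, 22, 15, 30, 11, 25])
measured  = np.array([13, 17, 9, 23, 14, 31, 10, 26])

# Pearson correlation coefficient R between predictions and ground truth
r = np.corrcoef(predicted, measured)[0, 1]
print(f"R = {r:.4f}, R as a percentage = {100 * r:.2f}%")
```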

12.
Sci Rep ; 14(1): 17235, 2024 Jul 26.
Article in English | MEDLINE | ID: mdl-39060388

ABSTRACT

With the rise of global smart city construction, target detection technology plays a crucial role in optimizing urban functions and improving the quality of life. However, existing target detection technologies still have shortcomings in terms of accuracy, real-time performance, and adaptability. To address this challenge, this study proposes an innovative target detection model. Our model adopts the structure of YOLOv8-DSAF, comprising three key modules: depthwise separable convolution (DSConv), dual-path attention gate module (DPAG), and feature enhancement module (FEM). Firstly, DSConv technology optimizes computational complexity, enabling real-time target detection within limited hardware resources. Secondly, the DPAG module introduces a dual-channel attention mechanism, allowing the model to selectively focus on crucial areas, thereby improving detection accuracy in high-dynamic traffic scenarios. Finally, the FEM module highlights crucial features to prevent their loss, further enhancing detection accuracy. Additionally, we propose an Internet of Things smart city framework consisting of four main layers: the application domain, the Internet of Things infrastructure layer, the edge layer, and the cloud layer. The proposed algorithm utilizes the Internet of Things infrastructure layer, edge layer, and cloud layer to collect and process data in real-time, achieving faster response times. Experimental results on the KITTI V and Cityscapes datasets indicate that our model outperforms the YOLOv8 model. This suggests that in complex urban traffic scenarios, our model exhibits superior performance with higher detection accuracy and adaptability. We believe that this innovative model will significantly propel the development of smart cities and advance target detection technology.
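
A generic depthwise separable convolution block, the building idea behind the DSConv module named above, can be sketched in PyTorch as follows; the normalization and activation choices are common defaults assumed for illustration, not details of YOLOv8-DSAF.

```python
import torch
import torch.nn as nn

class DSConv(nn.Module):
    """Depthwise separable convolution: a per-channel (depthwise) 3x3 convolution
    followed by a 1x1 pointwise convolution, trading a small accuracy cost for a
    large reduction in parameters and multiply-adds versus a standard convolution."""

    def __init__(self, in_ch: int, out_ch: int, stride: int = 1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.SiLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

x = torch.randn(1, 32, 80, 80)
print(DSConv(32, 64, stride=2)(x).shape)  # torch.Size([1, 64, 40, 40])
```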

13.
Bioengineering (Basel) ; 11(7)2024 Jul 05.
Article in English | MEDLINE | ID: mdl-39061768

ABSTRACT

Automated detection of cervical lesion cells/clumps in cervical cytological images is essential for computer-aided diagnosis. In this task, the shape and size of lesion cells/clumps vary considerably, which reduces detection performance. To address this issue, we propose an adaptive feature extraction network for cervical lesion cell/clump detection, called AFE-Net. Specifically, we propose an adaptive module to acquire the features of cervical lesion cells/clumps and introduce a global bias mechanism to acquire global average information, combining the adaptive features with global information to improve the representation of target features in the model and thus enhance detection performance. Furthermore, we analyze how popular bounding box losses behave on this model and propose a new bounding box loss, tendency-IoU (TIoU). The network achieves a mean average precision (mAP) of 64.8% on the CDetector dataset with 30.7 million parameters. Compared with YOLOv7 (62.6% mAP, 34.8 M parameters), the model improves mAP by 2.2% and reduces the number of parameters by 11.8%.

14.
Foods ; 13(14)2024 Jul 13.
Article in English | MEDLINE | ID: mdl-39063292

ABSTRACT

The lack of spatial pose information and the low positioning accuracy of picking targets are key factors limiting the picking function of citrus-picking robots. In this paper, a new method for automatic citrus fruit harvesting is proposed that uses semantic segmentation and rotated target detection to estimate the pose of each individual fruit. First, Faster R-CNN is used for grasp detection to identify candidate grasp boxes, while the semantic segmentation network extracts the contour information of the citrus fruit to be harvested. Then, for each target fruit, the grasp box with the highest confidence is selected using the segmentation results, and a rough angle is estimated. Using image-processing techniques and a camera imaging model, the method further segments the mask image of the fruit and its attached branches and fits the contour, fruit centroid, minimum bounding rectangle, and three-dimensional bounding box. The positional relationship between the citrus fruit and its attached branches is then used to estimate the fruit's three-dimensional pose. The effectiveness of the method was verified in experiments on cultivated citrus plants, and field picking experiments were then carried out in a natural orchard environment. The results showed that the success rate of citrus fruit recognition and positioning was 93.6%, the average pose-estimation angle error was 7.9°, and the picking success rate was 85.1%. The average picking time was 5.6 s, indicating that the robot can effectively perform intelligent picking operations.
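
The 2D geometry-fitting step, recovering the contour, centroid, and minimum-area rotated rectangle from a fruit mask, can be sketched with OpenCV as below; the mask is synthetic and the code illustrates the standard routines, not the paper's pipeline.

```python
import cv2
import numpy as np

def fit_fruit_geometry(mask: np.ndarray):
    """From a binary fruit mask, fit the contour, centroid, and minimum-area
    rotated rectangle -- the 2D ingredients a pose estimate can build on."""
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    contour = max(contours, key=cv2.contourArea)          # largest blob = the fruit

    m = cv2.moments(contour)
    centroid = (m["m10"] / m["m00"], m["m01"] / m["m00"]) # mask centroid in pixels

    rotated_rect = cv2.minAreaRect(contour)               # ((cx, cy), (w, h), angle)
    return contour, centroid, rotated_rect

# Toy mask: a filled ellipse standing in for a segmented citrus fruit
mask = np.zeros((200, 200), dtype=np.uint8)
cv2.ellipse(mask, (100, 110), (40, 30), 25, 0, 360, 255, -1)
_, centroid, rect = fit_fruit_geometry(mask)
print(centroid, rect)   # centroid near (100, 110), plus the rotated box parameters
```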

15.
Front Plant Sci ; 15: 1381367, 2024.
Article in English | MEDLINE | ID: mdl-38966144

ABSTRACT

Introduction: Pine wilt disease spreads rapidly, leading to the death of large numbers of pine trees. Exploring prevention and control measures appropriate to the different stages of pine wilt disease is therefore of great significance. Methods: To address the need for rapid detection of pine wilt over a large field of view, we used a drone to collect multiple sets of diseased-tree samples at different times of the year, which made the deep learning model more generalizable. We improved the YOLO v4 (You Only Look Once version 4) network for detecting pine wilt disease, using a channel attention mechanism module to improve the learning ability of the neural network. Results: Ablation experiments found that adding the SENet attention module combined with a self-designed, feature-pyramid-based feature enhancement module gave the best improvement, and the mAP of the improved model was 79.91%. Discussion: Comparing the improved YOLO v4 model with SSD, Faster R-CNN, YOLO v3, and YOLO v5 showed that its mAP was significantly higher than that of the other four models, providing an efficient solution for intelligent diagnosis of pine wood nematode disease. The improved YOLO v4 model enables precise location and identification of pine wilt trees under changing light conditions, and deployment on a UAV enables large-scale detection, helping to address the challenges of rapid detection and prevention of pine wilt disease.
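
A standard Squeeze-and-Excitation (SENet) channel attention block, as referenced above, can be written in a few lines of PyTorch; the reduction ratio here is the common default, not necessarily the value used in the improved YOLO v4.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation channel attention: squeeze spatial information with
    global average pooling, then excite (re-weight) channels through a small MLP."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3)))        # per-channel weights in [0, 1]
        return x * w.view(b, c, 1, 1)

feat = torch.randn(2, 128, 26, 26)
print(SEBlock(128)(feat).shape)  # torch.Size([2, 128, 26, 26])
```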

16.
Med Biol Eng Comput ; 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-38967693

ABSTRACT

Lumbar disc herniation is one of the most prevalent orthopedic conditions in clinical practice. The lumbar spine is a crucial joint for movement and weight-bearing, so the resulting back pain can significantly impact patients' everyday lives, and the condition is prone to recurrence. The pathogenesis of lumbar disc herniation is complex and diverse, making it difficult to identify and assess once it has occurred. Magnetic resonance imaging (MRI) is the most effective method for detecting the injury, but it requires examination by medical experts to determine the extent of the injury, a process that is time-consuming and susceptible to error. This study proposes an enhanced model, BE-YOLOv5, for hierarchical detection of lumbar disc herniation in MRI images. To tailor training to the task, a specialized dataset was created; the data were cleaned and refined before final calibration, yielding a training set of 2083 samples and a test set of 100 samples. The YOLOv5 model was enhanced by integrating the ECA-Net attention mechanism module with a 3 × 3 convolutional kernel, substituting its feature extraction network with a BiFPN, and applying structural pruning. The model achieved an 89.7% mean average precision (mAP) and 48.7 frames per second (FPS) on the test set. Compared with Faster R-CNN, the original YOLOv5, and the more recent YOLOv8, the model performs better in both accuracy and speed for detecting and grading lumbar disc herniation from MRI, validating the effectiveness of the combined enhancements. The proposed model is expected to support efficient, high-precision diagnosis of lumbar disc herniation from MRI images.
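
The ECA-Net attention mechanism referenced above is conventionally a small convolution over pooled channel descriptors; below is a minimal generic sketch with kernel size 3, offered as an illustration rather than the BE-YOLOv5 implementation.

```python
import torch
import torch.nn as nn

class ECA(nn.Module):
    """Efficient Channel Attention: global average pooling followed by a 1D
    convolution across channels (no dimensionality reduction) and a sigmoid gate."""

    def __init__(self, kernel_size: int = 3):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, kernel_size=kernel_size,
                              padding=kernel_size // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        y = x.mean(dim=(2, 3)).view(b, 1, c)     # (B, 1, C): channel descriptor
        y = torch.sigmoid(self.conv(y)).view(b, c, 1, 1)
        return x * y

feat = torch.randn(1, 256, 20, 20)
print(ECA(kernel_size=3)(feat).shape)  # torch.Size([1, 256, 20, 20])
```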

17.
Heliyon ; 10(12): e33016, 2024 Jun 30.
Article in English | MEDLINE | ID: mdl-38994116

ABSTRACT

To address the challenges of detecting surface defects on ceramic disks, such as difficulty detecting small defects, variation in defect sizes, and inaccurate defect localization, we propose an enhanced YOLOv5s algorithm. First, we improve the anchor box structure of the YOLOv5s model to enhance its generalization ability, enabling robust detection of defects of varying sizes. Second, we introduce the ECA attention mechanism to improve the model's accuracy on small targets. Under identical experimental conditions, the enhanced YOLOv5s algorithm shows significant improvements, with precision, F1 score, and mAP increasing by 3.1%, 3%, and 4.5%, respectively. Moreover, the accuracy in detecting crack, damage, slag, and spot defects increases by 0.2%, 4.7%, 5.4%, and 1.9%, respectively, and the detection speed improves from 232 frames/s to 256 frames/s. Comparative analysis with other algorithms shows superior performance over the YOLOv3 and YOLOv4 models, with an enhanced ability to identify small defects while achieving real-time detection.
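
Anchor refinement for YOLOv5-style detectors is commonly done by re-clustering the training-set box dimensions; the sketch below shows a plain k-means variant with an IoU-based distance, under made-up box sizes. The paper does not publish its exact anchor procedure, so this is a generic illustration, not the authors' method.

```python
import numpy as np

def kmeans_anchors(wh: np.ndarray, k: int = 9, iters: int = 100, seed: int = 0):
    """Cluster (width, height) pairs of ground-truth boxes into k anchor shapes
    using 1 - IoU as the distance, a common way to refit YOLO-style anchors."""
    rng = np.random.default_rng(seed)
    anchors = wh[rng.choice(len(wh), k, replace=False)]
    for _ in range(iters):
        # IoU between every box and every anchor, treating both as centered at the origin
        inter = np.minimum(wh[:, None, 0], anchors[None, :, 0]) * \
                np.minimum(wh[:, None, 1], anchors[None, :, 1])
        union = wh[:, 0:1] * wh[:, 1:2] + anchors[:, 0] * anchors[:, 1] - inter
        assign = np.argmax(inter / union, axis=1)          # nearest anchor per box
        for j in range(k):
            if np.any(assign == j):
                anchors[j] = np.median(wh[assign == j], axis=0)
    return anchors[np.argsort(anchors.prod(axis=1))]

# Illustrative defect-box sizes in pixels; real use would read them from the label files
wh = np.abs(np.random.default_rng(1).normal(loc=[40, 30], scale=[25, 15], size=(500, 2))) + 4
print(kmeans_anchors(wh, k=9))
```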

18.
Sci Rep ; 14(1): 16108, 2024 Jul 12.
Article in English | MEDLINE | ID: mdl-38997415

ABSTRACT

In marine environmental engineering, the swift and accurate detection of underwater targets is of considerable significance. Methods based on Convolutional Neural Networks (CNNs) have recently been applied to enhance the detection of such targets, but deep neural networks usually require a large number of parameters, resulting in slow processing, and existing methods struggle to accurately detect small and densely arranged underwater targets. To address these issues, we propose a new neural network model, YOLOv8-LA, for improving underwater target detection. First, we design a Lightweight Efficient Partial Convolution (LEPC) module that optimizes spatial feature extraction by selectively processing input channels, improving efficiency and significantly reducing redundant computation and storage. Second, we develop the AP-FasterNet architecture for the small targets commonly found in underwater datasets; by integrating depthwise separable convolutions with different dilation rates into FasterNet, AP-FasterNet enhances the model's ability to capture detailed features of small targets. Finally, we integrate the lightweight, efficient content-aware reassembly of features (CARAFE) up-sampling operation into YOLOv8 to aggregate contextual information over a large receptive field and mitigate information loss during up-sampling. Evaluation on the URPC2021 dataset shows that YOLOv8-LA achieves 84.7% mean average precision (mAP) on a single NVIDIA GeForce RTX 3090 and runs at 189.3 frames per second (FPS), outperforming existing state-of-the-art methods. This demonstrates the model's ability to ensure high detection accuracy while maintaining real-time processing capability.

19.
Sensors (Basel) ; 24(13)2024 Jul 01.
Article in English | MEDLINE | ID: mdl-39001052

ABSTRACT

With the continuous advancement of the economy and technology, the number of cars keeps increasing, and traffic congestion on some key roads is becoming increasingly serious. This paper proposes a new vehicle information feature map (VIFM) method and a multi-branch convolutional neural network (MBCNN) model and applies them to camera-image-based traffic congestion detection. The aim of this study is to build a deep learning model that takes traffic images as input and outputs congestion detection results, providing a new method for automatic detection of traffic congestion. The deep learning-based method in this article can effectively utilize the existing massive camera network in the transportation system without requiring large hardware investments. The study first uses an object detection model to identify vehicles in images, then proposes a method for extracting a VIFM, and finally constructs a traffic congestion detection model based on the MBCNN. The application of this method is verified on the Chinese City Traffic Image Database (CCTRIB). Compared with other convolutional neural networks, other deep learning models, and baseline models, the proposed method yields superior results, obtaining an F1 score of 98.61% and an accuracy of 98.62%. The experimental results show that the method effectively addresses traffic congestion detection and provides a powerful tool for traffic management.
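
A toy multi-branch CNN, parallel convolution branches with different receptive fields concatenated before a small classification head, can be sketched as below. The channel counts, kernel sizes, and the binary congested/uncongested head are assumptions for illustration; the actual MBCNN and VIFM construction are not specified here.

```python
import torch
import torch.nn as nn

class TinyMBCNN(nn.Module):
    """Toy multi-branch CNN: parallel branches with different receptive fields are
    applied to the same input map and concatenated before classification."""

    def __init__(self, in_ch: int = 3, num_classes: int = 2):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(nn.Conv2d(in_ch, 16, k, padding=k // 2), nn.ReLU(inplace=True))
            for k in (3, 5, 7)                         # three receptive-field sizes
        ])
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16 * 3, num_classes))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = torch.cat([b(x) for b in self.branches], dim=1)
        return self.head(feats)

# A VIFM-like input could be an image-sized map encoding the detected-vehicle layout;
# here we simply feed a random 3-channel tensor of the right shape.
x = torch.randn(4, 3, 128, 128)
print(TinyMBCNN()(x).shape)  # torch.Size([4, 2])
```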

20.
Sensors (Basel) ; 24(13)2024 Jul 04.
Article in English | MEDLINE | ID: mdl-39001117

ABSTRACT

With rising living standards, there has been a significant surge in the quantity and diversity of household waste. To safeguard the environment and optimize resource utilization, there is an urgent demand for effective and cost-efficient intelligent waste classification methods. This study presents MRS-YOLO (Multi-Resolution Strategy YOLO), a waste detection and classification model. The paper introduces the SlideLoss_IOU technique for detecting small objects, integrates the RepViT module, drawing on the Transformer mechanism, and devises a novel feature extraction strategy that combines multi-dimensional and dynamic convolution mechanisms. These enhancements not only raise detection accuracy and speed but also strengthen the robustness of the current YOLO model. Validation on a dataset of 12,072 samples across 10 categories, including recyclable metal and paper, shows a 3.6% improvement in mAP@50 accuracy compared to YOLOv8, together with a 15.09% reduction in model size. The model also detects small targets more accurately and performs well across diverse scenarios. For transparency and to facilitate further research, the source code and related datasets used in this study have been made publicly available on GitHub.
