ABSTRACT
Introduction: In the field of agriculture, automated harvesting of Camellia oleifera fruit has become an important research area. However, accurately detecting Camellia oleifera fruit in natural environments is challenging due to factors such as shadows, which can impede the performance of traditional detection techniques, highlighting the need for more robust methods. Methods: To overcome these challenges, we propose an efficient deep learning method called YOLO-CFruit, which is specifically designed to accurately detect Camellia oleifera fruits in challenging natural environments. First, we collected images of Camellia oleifera fruits and created a dataset, then applied data augmentation to further enhance its diversity. Our YOLO-CFruit model combines a CBAM module for identifying regions of interest in scenes containing Camellia oleifera fruit with a CSP module incorporating a Transformer for capturing global information. In addition, we improve YOLO-CFruit by replacing the CIoU loss of the original YOLOv5 with the EIoU loss. Results: In tests of the trained network, the method performs well, achieving an average precision of 98.2%, a recall of 94.5%, an accuracy of 98%, an F1 score of 96.2%, and an inference time of 19.02 ms. The experimental results show that our method improves average precision by 1.2% over the conventional YOLOv5s network and achieves the highest accuracy and a higher F1 score than all compared state-of-the-art networks. Discussion: The robust performance of YOLO-CFruit under different real-world conditions, including different lighting and shading scenarios, signifies its high reliability and lays a solid foundation for the development of automated picking devices.
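The EIoU substitution described above can be illustrated with a minimal sketch. This is a generic plain-Python rendering of the published EIoU formulation, not code from the YOLO-CFruit authors:

```python
def eiou_loss(box_a, box_b, eps=1e-9):
    """EIoU loss between two (x1, y1, x2, y2) boxes.

    EIoU extends CIoU by penalizing the center distance and the
    width/height differences separately, each normalized by the
    smallest enclosing box. Illustrative sketch only.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection and union for plain IoU
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    iou = inter / (area_a + area_b - inter + eps)
    # Smallest enclosing box
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    # Center-distance term, normalized by the enclosing diagonal
    d2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
       + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    c2 = cw ** 2 + ch ** 2 + eps
    # Separate width and height penalties
    dw2 = ((ax2 - ax1) - (bx2 - bx1)) ** 2
    dh2 = ((ay2 - ay1) - (by2 - by1)) ** 2
    return 1 - iou + d2 / c2 + dw2 / (cw ** 2 + eps) + dh2 / (ch ** 2 + eps)
```

Identical boxes give a loss near zero, while separated boxes are additionally penalized for center offset and shape mismatch, which is why EIoU converges faster than CIoU on hard regression cases.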
ABSTRACT
This study proposes an improved full-scale aggregated MobileUNet (FA-MobileUNet) model to achieve more complete detection of oil spill areas in synthetic aperture radar (SAR) images. The convolutional block attention module (CBAM) in the FA-MobileUNet was modified based on morphological concepts. By introducing the morphological attention module (MAM), the improved FA-MobileUNet model reduces fragments and holes in the detection results, yielding complete oil spill areas that are better suited to describing the location and scope of oil pollution incidents. In addition, to overcome the inherent class imbalance of the dataset, label smoothing was applied during training to reduce the model's overconfidence in majority-class samples while improving its generalization ability. The improved FA-MobileUNet model reached a mean intersection over union (mIoU) of 84.55%, which was 17.15% higher than that of the original U-Net model. The effectiveness of the proposed model was then verified on oil pollution incidents that significantly impacted Taiwan's marine environment. Experimental results showed that the extent of the detected oil spills was consistent with the oil pollution areas recorded in the incident reports.
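The label-smoothing step described above admits a short sketch. The epsilon value below is a common default, not one reported in the paper:

```python
def smooth_labels(one_hot, epsilon=0.1):
    """Apply label smoothing to a one-hot target vector.

    Each hard 0/1 target becomes epsilon / K for the wrong classes and
    1 - epsilon + epsilon / K for the true class, which discourages
    overconfident predictions on majority-class samples. Illustrative
    sketch; epsilon = 0.1 is a standard default.
    """
    k = len(one_hot)
    return [(1 - epsilon) * t + epsilon / k for t in one_hot]
```

The smoothed targets still sum to 1, so they remain a valid distribution for cross-entropy training.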
Subjects
Environmental Monitoring, Petroleum Pollution, Radar, Petroleum Pollution/analysis, Environmental Monitoring/methods, Taiwan, Algorithms
ABSTRACT
Brain tumors are diseases characterized by abnormal cell growth within or around brain tissues and include various benign and malignant types. However, early detection and precise localization of brain tumors in MRI images remain lacking, posing challenges to diagnosis and treatment. In this context, accurate detection of brain tumors in MRI images becomes particularly important, as it can improve the timeliness of diagnosis and the effectiveness of treatment. To address this challenge, we propose a novel approach: the YOLO-NeuroBoost model. This model combines the improved YOLOv8 algorithm with several innovative techniques, including the KernelWarehouse dynamic convolution, the Convolutional Block Attention Module (CBAM) attention mechanism, and the Inner-GIoU loss function. Our experimental results demonstrate that our method achieves mAP scores of 99.48 and 97.71 on the Br35H dataset and the open-source Roboflow dataset, respectively, indicating the high accuracy and efficiency of this method in detecting brain tumors in MRI images. This research holds significant importance for improving early diagnosis and treatment of brain tumors and opens new possibilities for the field of medical image analysis.
ABSTRACT
With the rapid advancement of modern medical technology, microscopy imaging systems have become one of the key technologies in medical image analysis. However, manual operation of microscopes presents issues such as operator dependency, inefficiency, and time consumption. To enhance the efficiency and accuracy of medical image capture and reduce the burden of subsequent quantitative analysis, this paper proposes an improved microscope salient object detection algorithm based on U2-Net, incorporating deep learning technology. The improved algorithm first enhances the network's key information extraction capability by incorporating the Convolutional Block Attention Module (CBAM) into U2-Net. It then reduces network complexity by constructing a Simple Pyramid Pooling Module (SPPM) and uses Ghost convolution to make the model lightweight. Additionally, data augmentation is applied to the slides to improve the algorithm's robustness and generalization. The experimental results show that the improved model is 72.5 MB, a 56.85% reduction compared to the original U2-Net model size of 168.0 MB. Moreover, the model's prediction accuracy has increased from 92.24% to 97.13%, providing an efficient means for subsequent image processing and analysis tasks in microscopy imaging systems.
ABSTRACT
To improve the reading efficiency of pointer meters, this paper proposes a reading method based on LinkNet. First, the meter dial area is detected using YOLOv8. The detected images are then fed into an improved LinkNet segmentation network. In this network, we replace traditional convolution with partial convolution, which reduces the number of model parameters while ensuring accuracy is not affected, and we remove one pair of encoder-decoder modules to further compress the model size. In the feature fusion part of the model, the CBAM (Convolutional Block Attention Module) attention module is added, and the direct summation operation is replaced by the AFF (Attention Feature Fusion) module, which enhances the model's feature extraction capability for the segmented target. In the subsequent rotation correction step, this paper divides rotation angle prediction into classification and regression stages, effectively addressing the inaccurate predictions of CNN networks for axisymmetric images within the 0-360° range. This ensures that the final reading step receives an image at the correct angle, thereby improving the accuracy of the overall reading algorithm. The final experimental results indicate that our proposed reading method has a mean absolute error of 0.20 and runs at 15 frames per second.
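The classification-plus-regression angle scheme described above can be sketched as a decoding step: a coarse bin is classified, then a fine offset is regressed inside it. The bin count of 36 is an illustrative assumption, not a value from the paper:

```python
def decode_angle(bin_index, offset, num_bins=36):
    """Recover a rotation angle from a class bin plus a regressed offset.

    Direct regression over 0-360 degrees is ambiguous for angles near
    the wrap-around (1 degree and 359 degrees are numerically far but
    angularly close); classifying a coarse bin first and regressing a
    fractional offset (in [0, 1)) within that bin avoids the problem.
    Illustrative sketch; num_bins = 36 is an assumed choice.
    """
    bin_size = 360.0 / num_bins
    return (bin_index * bin_size + offset * bin_size) % 360.0
```

The modulo keeps the decoded angle in [0, 360), so bin 35 with a half-bin offset maps to 355 degrees rather than overflowing.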
ABSTRACT
When it comes to road environment perception, millimeter-wave radar with a camera facilitates more reliable detection than a single sensor. However, the limited utilization of radar features and insufficient extraction of important features remain pertinent issues, especially with regard to the detection of small and occluded objects. To address these concerns, we propose a camera-radar fusion with radar channel extension and a dual-CBAM-FPN (CRFRD), which incorporates a radar channel extension (RCE) module and a dual-CBAM-FPN (DCF) module into the camera-radar fusion net (CRF-Net). In the RCE module, we design an azimuth-weighted RCS parameter and extend three radar channels, which leverage the secondary redundant information to achieve richer feature representation. In the DCF module, we present the dual-CBAM-FPN, which enables the model to focus on important features by inserting CBAM at the input and the fusion process of FPN simultaneously. Comparative experiments conducted on the NuScenes dataset and real data demonstrate the superior performance of the CRFRD compared to CRF-Net, as its weighted mean average precision (wmAP) increases from 43.89% to 45.03%. Furthermore, ablation studies verify the indispensability of the RCE and DCF modules and the effectiveness of azimuth-weighted RCS.
ABSTRACT
Background and Objective: Bronchoscopy is a widely used diagnostic and therapeutic procedure for respiratory disorders such as infections and tumors. However, visualizing the bronchial tubes and lungs can be challenging due to the presence of various objects, such as mucus, blood, and foreign bodies. Accurately identifying the anatomical location of the bronchi can be quite challenging, especially for medical professionals who are new to the field. Deep learning-based object detection algorithms can assist doctors in analyzing images or videos of the bronchial tubes to identify key features such as the epiglottis, vocal cord, and right basal bronchus. This study aims to improve the accuracy of object detection in bronchoscopy images by integrating a YOLO-based algorithm with a CBAM attention mechanism. Methods: The CBAM attention module is implemented in the YOLO-V7 and YOLO-V8 object detection models to improve their object identification and classification capabilities in bronchoscopy images. Various YOLO-based object detection algorithms, such as YOLO-V5, YOLO-V7, and YOLO-V8, are compared on this dataset. Experiments are conducted to evaluate the performance of the proposed method and different algorithms. Results: The proposed method significantly improves the accuracy and reliability of object detection for bronchoscopy images. This approach demonstrates the potential benefits of incorporating an attention mechanism in medical imaging and of utilizing object detection algorithms in bronchoscopy. In the experiments, the YOLO-V8-based model achieved a mean Average Precision (mAP) of 87.09% on the given dataset with an Intersection over Union (IoU) threshold of 0.5. After incorporating the Convolutional Block Attention Module (CBAM) into the YOLO-V8 architecture, the proposed method achieved significantly enhanced mAP@0.5 and mAP@0.5:0.95 scores of 88.27% and 55.39%, respectively.
Conclusions: Our findings indicate that by incorporating a CBAM attention mechanism with a YOLO-based algorithm, there is a noticeable improvement in object detection performance in bronchoscopy images. This study provides valuable insights into enhancing the performance of attention mechanisms for object detection in medical imaging.
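The IoU threshold behind the mAP@0.5 figures reported above is the standard box-overlap measure, sketched below; this is the generic definition, not code from the study:

```python
def iou(box_a, box_b, eps=1e-9):
    """Intersection over Union for two (x1, y1, x2, y2) boxes.

    A predicted box counts as a true positive at mAP@0.5 when its IoU
    with a ground-truth box is at least 0.5. Generic sketch.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap rectangle (clamped to zero when boxes are disjoint)
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / (union + eps)
```

Two unit-offset 2 × 2 boxes overlap in a 1 × 1 square, giving IoU = 1/7, below the 0.5 matching threshold.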
ABSTRACT
This study addresses the challenges of low detection precision and limited generalization across various ripeness levels and varieties for large non-green-ripe citrus fruits in complex scenarios. We present a high-precision and lightweight model, YOLOC-tiny, built upon YOLOv7, which utilizes EfficientNet-B0 as the feature extraction backbone network. To augment sensing capabilities and improve detection accuracy, we embed a spatial and channel composite attention mechanism, the convolutional block attention module (CBAM), into the head's efficient aggregation network. Additionally, we introduce an adaptive and complete intersection over union regression loss function, designed by integrating the phenotypic features of large non-green-ripe citrus, to mitigate the impact of data noise and efficiently calculate detection loss. Finally, a layer-based adaptive magnitude pruning strategy is employed to further eliminate redundant connections and parameters in the model. Targeting three types of citrus widely planted in Sichuan Province-navel orange, Ehime Jelly orange, and Harumi tangerine-YOLOC-tiny achieves an impressive mean average precision (mAP) of 83.0%, surpassing most other state-of-the-art (SOTA) detectors in the same class. Compared with YOLOv7 and YOLOv8x, its mAP improved by 1.7% and 1.9%, respectively, with a parameter count of only 4.2M. In picking robot deployment applications, YOLOC-tiny attains an accuracy of 92.8% at a rate of 59 frames per second. This study provides a theoretical foundation and technical reference for upgrading and optimizing low-computing-power ground-based robots, such as those used for fruit picking and orchard inspection.
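The layer-based magnitude pruning step described above can be sketched per layer as follows; only the core magnitude criterion is shown, and the adaptive per-layer sparsity schedule of YOLOC-tiny is not reproduced:

```python
def prune_layer(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights in one layer.

    Layer-based magnitude pruning computes a threshold per layer, so
    heavily and lightly parameterized layers are pruned independently
    rather than against one global cutoff. Illustrative sketch; ties at
    the threshold may prune slightly more weights than requested.
    """
    n_prune = int(len(weights) * sparsity)
    if n_prune == 0:
        return list(weights)
    # Threshold = magnitude of the n-th smallest weight in this layer
    threshold = sorted(abs(w) for w in weights)[n_prune - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]
```

In practice the surviving weights are then fine-tuned to recover the accuracy lost to pruning.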
ABSTRACT
Pomegranate is an important fruit crop that is usually managed manually through experience. Intelligent management systems for pomegranate orchards can improve yields and address labor shortages. Fast and accurate detection of pomegranates is one of the key technologies of such a management system, crucial for yield estimation and scientific management. Currently, most solutions use deep learning to detect pomegranates, but existing models struggle with small targets, carry large parameter counts, and run slowly; there is therefore room for improvement in the pomegranate detection task. Based on the improved You Only Look Once version 5 (YOLOv5) algorithm, a lightweight pomegranate growth-period detection algorithm, YOLO-Granada, is proposed. A lightweight ShuffleNetv2 network is used as the backbone to extract pomegranate features: grouped convolution reduces the computational cost of ordinary convolution, and channel shuffle increases the interaction between different channels. In addition, an attention mechanism can help the neural network suppress less significant features in the channel or spatial dimensions, and the Convolutional Block Attention Module improves the effect of attention and optimizes detection accuracy by weighting feature contributions. The average accuracy of the improved network reaches 0.922, less than 1% below the original YOLOv5s model (0.929), while bringing faster inference and a smaller model: detection is 17.3% faster than the original network, and the parameters, floating-point operations, and model size are compressed to 54.7%, 51.3%, and 56.3% of the original, respectively. The algorithm processes 8.66 images per second, achieving real-time performance.
In this study, the Nihui convolutional neural network framework was further utilized to develop an Android-based application for real-time pomegranate detection. The method provides a more accurate and lightweight solution for intelligent management devices in pomegranate orchards, which can provide a reference for the design of neural networks in agricultural applications.
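The channel-shuffle operation that ShuffleNetv2 pairs with grouped convolution can be sketched on channel indices alone (an illustrative reordering, not the YOLO-Granada code):

```python
def channel_shuffle(channels, groups):
    """ShuffleNetV2-style channel shuffle on a list of channel labels.

    Grouped convolution only mixes information within each group;
    interleaving the channels between groups lets the next grouped
    convolution see features from every group. This sketch shows the
    reordering only, with no convolution arithmetic.
    """
    n = len(channels)
    per_group = n // groups
    # Conceptually: reshape to (groups, per_group), transpose, flatten
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]
```

The shuffle is a pure permutation: it costs no parameters and no multiply-adds, which is why it suits lightweight backbones.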
Subjects
Algorithms, Fruit, Neural Networks, Computer, Punica granatum, Punica granatum/chemistry, Deep Learning
ABSTRACT
In response to the challenges of accurately identifying and localizing garbage in intricate urban street environments, this paper proposes EcoDetect-YOLO, a garbage exposure detection algorithm based on the YOLOv5s framework, utilizing an intricate-environment waste exposure detection dataset constructed in this study. Initially, a convolutional block attention module (CBAM) is integrated between the second level (P2) and the third level (P3) of the feature pyramid network to optimize the extraction of relevant garbage features while mitigating background noise. Subsequently, a P2 small-target detection head enhances the model's efficacy in identifying small garbage targets. Lastly, a bidirectional feature pyramid network (BiFPN) is introduced to strengthen the model's capability for deep feature fusion. Experimental results demonstrate EcoDetect-YOLO's adaptability to urban environments and its superior small-target detection capabilities, effectively recognizing nine types of garbage, such as paper and plastic trash. Compared to the baseline YOLOv5s model, EcoDetect-YOLO achieved a 4.7% increase in mAP0.5, reaching 58.1%, with a compact model size of 15.7 MB and an FPS of 39.36. Notably, even in the presence of strong noise, the model maintained an mAP0.5 exceeding 50%, underscoring its robustness. In summary, EcoDetect-YOLO boasts high precision, efficiency, and compactness, rendering it suitable for deployment on mobile devices for real-time detection and management of urban garbage exposure, thereby advancing urban automation governance and digital economic development.
ABSTRACT
Introduction: Field wheat ear counting is an important step in wheat yield estimation. How to count wheat ears rapidly and effectively in a field environment, to ensure the stability of the food supply and provide more reliable data support for agricultural management and policy making, is a key concern in the current agricultural field. Methods: There are still bottlenecks and challenges in solving the dense wheat counting problem with currently available methods. To address these issues, we propose a new method based on the YOLACT framework that aims to improve the accuracy and efficiency of dense wheat counting. We replace the pooling layer in the CBAM module with a GeM pooling layer and introduce a density map into the FPN; together, these improvements make our method better able to cope with the challenges of dense scenes. Results: Experiments show our model improves wheat ear counting performance in complex backgrounds: the improved attention mechanism reduces the RMSE from 1.75 to 1.57, and with the improved CBAM the R2 increases from 0.9615 to 0.9798. Through pixel-level density estimation, the density map mechanism accurately discerns overlapping count targets and provides more granular information. Discussion: The findings demonstrate the practical potential of our framework for intelligent agriculture applications.
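The GeM pooling substitution described in the Methods can be sketched for a single channel; p = 3 is a common initialization for the exponent, not a value reported by the authors:

```python
def gem_pool(activations, p=3.0, eps=1e-6):
    """Generalized-mean (GeM) pooling over one channel's activations.

    GeM interpolates between average pooling (p = 1) and max pooling
    (p -> infinity) via an exponent p that can be learned, which is why
    it can replace the fixed average/max pooling inside CBAM.
    Illustrative sketch; p = 3 is an assumed initialization.
    """
    # Clamp to eps so the fractional power is well-defined
    clamped = [max(a, eps) for a in activations]
    return (sum(a ** p for a in clamped) / len(clamped)) ** (1.0 / p)
```

For p = 1 the result equals the plain mean; for larger p it moves toward the maximum activation, emphasizing the strongest responses.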
ABSTRACT
BACKGROUND: Rapid identification and classification of bats are critical for practical applications. However, species identification of bats is typically a demanding and time-consuming manual task that depends on taxonomists and well-trained experts. Deep Convolutional Neural Networks (DCNNs) provide a practical approach for the extraction of visual features and classification of objects, with potential application to bat classification. RESULTS: In this study, we investigated the capability of deep learning models to classify 7 horseshoe bat taxa (CHIROPTERA: Rhinolophus) from Southern China. We constructed an image dataset of 879 front, oblique, and lateral targeted facial images of live individuals collected during surveys between 2012 and 2021. All images were taken using a standard photograph protocol and settings aimed at enhancing the effectiveness of DCNN classification. The results demonstrated that our customized VGG16-CBAM model achieved up to 92.15% classification accuracy, with better performance than other mainstream models. Furthermore, Grad-CAM visualization reveals that the model pays more attention to the taxonomic key regions in the decision-making process, and these regions are often preferred by bat taxonomists for the classification of horseshoe bats, corroborating the validity of our methods. CONCLUSION: Our findings will inspire further research on image-based automatic classification of chiropteran species for early detection and potential application in taxonomy.
ABSTRACT
BACKGROUND: The novel coronavirus pneumonia (COVID-19) outbreak in late 2019 killed millions worldwide. Coronaviruses cause diseases such as severe acute respiratory syndrome (SARS-CoV) and SARS-CoV-2. Many peptides in the host defense system have antiviral activity. Establishing a set of efficient models to identify anti-coronavirus peptides is therefore a meaningful study. METHODS: Given this, a new prediction model, EACVP, is proposed. This model uses the evolutionary scale language model (ESM-2 LM) to characterize peptide sequence information. The ESM model is a natural language processing model trained by machine learning technology on a highly diverse and dense dataset (UR50/D 2021_04); the pre-trained language model yields peptide sequence features with 320 dimensions. Compared with traditional feature extraction methods, the information represented by the ESM-2 LM is more comprehensive and stable. The features are then input into a convolutional neural network (CNN), and the convolutional block attention module (CBAM), a lightweight attention module, performs attention operations on the CNN in the spatial and channel dimensions. To verify the rationality of the model structure, we performed ablation experiments on the benchmark and independent test datasets, and we compared EACVP with existing methods on the independent test dataset. RESULTS: Experimental results show that ACC, F1-score, and MCC are 3.95%, 35.65%, and 0.0725 higher than the most advanced methods, respectively. At the same time, we tested EACVP on the ENNAVIA-C and ENNAVIA-D datasets, and the results showed that EACVP transfers well and is a powerful tool for predicting anti-coronavirus peptides. CONCLUSION: The results prove that the EACVP model can fully characterize peptide information and achieve high prediction accuracy, and it generalizes to different datasets. The data and code of the article have been uploaded to https://github.com/JYY625/EACVP.git.
ABSTRACT
Detecting abnormal surface features is an important method for identifying abnormal fish. However, existing methods face challenges of excessive subjectivity, limited accuracy, and poor real-time performance. To address these challenges, a real-time and accurate detection model for abnormal surface features of in-water fish is proposed, based on improved YOLOv5s. The specific enhancements include: 1) We optimize the complete intersection over union and non-maximum suppression through the normalized Gaussian Wasserstein distance metric to improve the model's ability to detect tiny targets. 2) We design the DenseOne module to enhance the reusability of abnormal surface features and introduce MobileViTv2 to improve detection speed, both integrated into the feature extraction network. 3) Following the ACmix principle, we fuse omni-dimensional dynamic convolution and the convolutional block attention module to address the challenge of extracting deep features within complex backgrounds. We carried out comparative experiments on 160 validation sets of in-water abnormal fish, achieving precision, recall, mAP50, mAP50:95, and frames per second (FPS) of 99.5%, 99.1%, 99.1%, 73.9%, and 88 FPS, respectively. Our model surpasses the baseline by 1.4%, 1.2%, 3.2%, 8.2%, and 1 FPS. Moreover, the improved model outperforms other state-of-the-art models on comprehensive evaluation indexes.
Subjects
Fishes, Water, Animals, Normal Distribution
ABSTRACT
Introduction: Early detection of leaf diseases is necessary to control the spread of plant diseases, and one of the important steps is the segmentation of leaf and disease images. Uneven light and leaf overlap in complex situations make segmentation of leaves and diseases quite difficult. Moreover, the significant differences in the ratios of leaf and disease pixels result in a challenge in identifying diseases. Methods: To solve the above issues, a UNet with a residual attention mechanism combined with atrous spatial pyramid pooling and a weight compression loss, named RAAWC-UNet, is proposed. Firstly, the weight compression loss introduces a modulation factor in front of the cross-entropy loss, aiming to solve the imbalance between foreground and background pixels. Secondly, the residual network and the convolutional block attention module are combined to form Res_CBAM, which can accurately localize pixels at the edge of the disease and alleviate gradient vanishing and the loss of semantic information caused by downsampling. Finally, in the last layer of downsampling, atrous spatial pyramid pooling is used instead of two convolutions to solve the problem of insufficient spatial context information. Results: The experimental results show that the proposed RAAWC-UNet increases the intersection over union in leaf and disease segmentation by 1.91% and 5.61%, respectively, and the pixel accuracy of disease by 4.65% compared with UNet. Discussion: The effectiveness of the proposed method was further verified by better results in comparison with deep learning methods of similar network architectures.
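The modulation-factor idea behind the weight compression loss resembles the focal-style modulation sketched below for a single pixel; the exact factor used by RAAWC-UNet may differ:

```python
import math

def modulated_ce(p_true, gamma=2.0):
    """Cross-entropy for one pixel, scaled by a modulation factor.

    Multiplying the cross-entropy -log(p) by (1 - p)^gamma shrinks the
    loss of easy, well-classified pixels (mostly background), so the
    rare disease pixels dominate training. This is the classic focal
    modulation, used here only to illustrate the idea; gamma = 2 is an
    assumed value.
    """
    return (1.0 - p_true) ** gamma * (-math.log(p_true))
```

A confidently correct background pixel (p = 0.9) contributes about 100x less loss than under plain cross-entropy, rebalancing the gradient toward foreground pixels.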
ABSTRACT
BACKGROUND: 5-Methylcytosine (5mC) plays a very important role in gene stability, transcription, and development. Therefore, accurate identification of 5mC sites is of key importance in genetic and pathological studies. However, traditional experimental methods for identifying 5mC sites are time-consuming and costly, so there is an urgent need for computational methods that automatically detect and identify these sites. RESULTS: Deep learning methods have shown great potential in 5mC site prediction, so we developed a deep learning combinatorial model called i5mC-DCGA. The model innovatively uses the Convolutional Block Attention Module (CBAM) to improve the Dense Convolutional Network (DenseNet) so that it extracts richer local feature information. Subsequently, we combined a Bidirectional Gated Recurrent Unit (BiGRU) and a self-attention mechanism to extract global feature information. Our model can learn abstract and complex feature representations from simple sequence encodings while addressing the sample imbalance in the benchmark datasets. The experimental results show that the i5mC-DCGA model achieves 97.02%, 96.52%, 96.58%, and 85.58% in sensitivity (Sn), specificity (Sp), accuracy (Acc), and Matthews correlation coefficient (MCC), respectively. CONCLUSIONS: The i5mC-DCGA model outperforms other existing prediction tools in predicting 5mC sites and is currently the most representative promoter 5mC site prediction tool. The benchmark dataset and source code for the i5mC-DCGA model can be found at https://github.com/leirufeng/i5mC-DCGA.
Subjects
5-Methylcytosine, Benchmarking, Promoter Regions, Genetic, Research Design, Software
ABSTRACT
The European Council completed the legislative procedure to establish the Carbon Border Adjustment Mechanism (CBAM) on April 25, 2023, which will be launched in 2027. The iron and steel sector is the main target of the forthcoming CBAM due to the industry's energy-intensive consumption with high carbon dioxide (CO2) emissions. However, minimal existing research has been conducted in this regard. This study employs GTAP-e 11.0 and TOPSIS models to estimate the effects of CBAM implementation on the major nations around the world from 2027 to 2030, examining countries' GDP, social welfare, iron and steel production, trade balance, and CO2 emissions to the global environment. This study concludes: (1) The GDP and social welfare of important iron and steel trade partners throughout the world will be significantly impacted by the application of CBAM. Most nations, including those in the EU, will experience negative GDP effects, with China undergoing the most pronounced social welfare declines followed by India. In contrast, the EU27 will benefit the most in terms of social welfare, followed by the US, Japan, and Russia. (2) Iron and steel production will decrease in all countries outside the EU, but it will have a positive impact on the trade balance of most countries. (3) The CO2 emissions of all countries except for the EU and Japan will decrease, with a positive impact on preventing carbon leakage in the international iron and steel trade. (4) Comprehensive analysis demonstrates that the EU will benefit the most, and China will suffer the most from CBAM application. Based on the above conclusions, this study proposes corresponding policy recommendations.
Subjects
Carbon Dioxide, Iron, Carbon Dioxide/analysis, Steel, China, India
ABSTRACT
To improve the segmentation of brain tumor images and address the loss of feature information during convolutional neural network (CNN) training, we present an MRI brain tumor segmentation method that leverages an enhanced U-Net architecture. First, the ResNet50 network is used as the backbone of the improved U-Net; the deeper CNN improves feature extraction. Next, the residual module is enhanced by incorporating the Convolutional Block Attention Module (CBAM) to increase its characterization capability, focusing on important features and suppressing unnecessary ones. Finally, the cross-entropy loss function and the Dice similarity coefficient are mixed to compose the network's loss function, addressing the class imbalance of the data and enhancing the tumor-region segmentation outcome. The method's segmentation performance was evaluated on the test set, where the enhanced U-Net achieved an average Intersection over Union (IoU) of 86.64% and a Dice score of 87.47%, which were 3.13% and 2.06% higher, respectively, than those of the original U-Net and R-Unet models. Consequently, the enhanced U-Net proposed in this study significantly improves brain tumor segmentation efficacy, offering valuable technical support for MRI diagnosis and treatment.
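The mixed cross-entropy and Dice loss described above can be sketched for a flattened binary mask; the 50/50 weighting is an illustrative assumption, not a value from the paper:

```python
import math

def dice_coefficient(pred, target, eps=1e-6):
    """Soft Dice coefficient between a probability mask and a binary mask."""
    inter = sum(p * t for p, t in zip(pred, target))
    return (2 * inter + eps) / (sum(pred) + sum(target) + eps)

def mixed_loss(pred, target, alpha=0.5, eps=1e-6):
    """Blend of binary cross-entropy and Dice loss over flattened masks.

    Dice directly optimizes region overlap and is insensitive to the
    background/foreground imbalance that plain cross-entropy suffers
    from; mixing the two combines stable gradients with an
    overlap-driven objective. Sketch only; alpha = 0.5 is assumed.
    """
    bce = -sum(t * math.log(max(p, eps)) + (1 - t) * math.log(max(1 - p, eps))
               for p, t in zip(pred, target)) / len(pred)
    return alpha * bce + (1 - alpha) * (1 - dice_coefficient(pred, target, eps))
```

A perfect prediction drives both terms to zero, while an inverted mask is heavily penalized by both components.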
Subjects
Brain Neoplasms, Humans, Brain Neoplasms/diagnostic imaging, Magnetic Resonance Imaging, Neuroimaging, Entropy, Neural Networks, Computer, Image Processing, Computer-Assisted
ABSTRACT
In light of the prevalent issues in the mechanical grading of fresh tea leaves, characterized by high damage rates and poor accuracy, as well as the limited grading precision achieved by integrating machine vision with machine learning (ML) algorithms, this study presents an innovative approach for classifying the quality grade of fresh tea leaves. The approach leverages image recognition and a deep learning (DL) algorithm to classify tea leaf grades by identifying distinct bud and leaf combinations. The method begins by acquiring separate images of orderly scattered and randomly stacked fresh tea leaves. These images undergo data augmentation, such as rotation, flipping, and contrast adjustment, to form the scattered and stacked tea leaf datasets. Subsequently, the YOLOv8x model was enhanced with spatial pyramid pooling improvements (SPPCSPC) and the convolutional block attention module (CBAM). The resulting YOLOv8x-SPPCSPC-CBAM model is evaluated against popular DL models, including Faster R-CNN, YOLOv5x, and YOLOv8x. The experimental findings reveal that the YOLOv8x-SPPCSPC-CBAM model delivers the most impressive results: for scattered tea leaves, the mean average precision, precision, and recall are 98.2%, 95.8%, and 96.7%, with 2.77 images processed per second, while for stacked tea leaves they are 99.1%, 99.1%, and 97.7%, with 2.35 images per second. This study provides a robust framework for accurately classifying the quality grade of fresh tea leaves.
Subjects
Algorithms, Machine Learning, Mental Recall, Plant Leaves, Tea
ABSTRACT
As an important direction in computer vision, human pose estimation has received extensive attention in recent years. A High-Resolution Network (HRNet), a classical human pose estimation method, can achieve effective estimation results. However, the model's complex structure is not conducive to deployment under limited computing resources. Therefore, an improved Efficient and Lightweight HRNet (EL-HRNet) model is proposed. In detail, point-wise and grouped convolutions were used to construct a lightweight residual module, replacing the original 3 × 3 module to reduce the parameters. To compensate for the information loss caused by the network's lightweight design, the Convolutional Block Attention Module (CBAM) is introduced after the new lightweight residual module to construct the Lightweight Attention Basicblock (LA-Basicblock) module and achieve high-precision human pose estimation. To verify the effectiveness of the proposed EL-HRNet, experiments were carried out on the COCO2017 and MPII datasets. The experimental results show that the EL-HRNet model requires only 5 million parameters and 2.0 GFLOPs of computation, achieving an AP score of 67.1% on the COCO2017 validation set. In addition, the mean PCKh@0.5 is 87.7% on the MPII validation set, and EL-HRNet shows a good balance between model complexity and human pose estimation accuracy.
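The parameter savings from replacing a plain 3 × 3 convolution with a grouped convolution plus a point-wise (1 × 1) convolution can be checked arithmetically; the group count below is an illustrative choice, not a value from the paper:

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a 2D convolution layer (bias terms ignored).

    Grouped convolution splits the input channels into `groups`
    independent slices, so each filter sees only c_in / groups channels.
    """
    return k * k * (c_in // groups) * c_out

def lightweight_block_params(c_in, c_out, k=3, groups=4):
    """Parameters of a grouped k x k convolution followed by a
    point-wise 1 x 1 convolution, the kind of substitution EL-HRNet
    makes for a plain 3 x 3 convolution. groups = 4 is an assumed
    illustrative choice.
    """
    return conv_params(c_in, c_out, k, groups) + conv_params(c_out, c_out, 1)
```

For 64-channel input and output, the plain 3 × 3 layer holds 36,864 weights, while the grouped-plus-pointwise pair holds 13,312, roughly a 2.8x reduction per block.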