1.
Sensors (Basel) ; 23(6)2023 Mar 12.
Article in English | MEDLINE | ID: mdl-36991768

ABSTRACT

The accurate estimation of a 3D human pose is of great importance in many fields, such as human-computer interaction, motion recognition and automatic driving. In view of the difficulty of obtaining 3D ground truth labels for 3D pose estimation datasets, we take 2D images as the research object in this paper and propose a self-supervised 3D pose estimation model called Pose ResNet. ResNet50 is used as the basic network to extract features. First, a convolutional block attention module (CBAM) is introduced to refine the selection of significant pixels. Then, a waterfall atrous spatial pooling (WASP) module is used to capture multi-scale contextual information from the extracted features to increase the receptive field. Finally, the features are input into a deconvolution network to acquire the volume heat map, which is later processed by a soft argmax function to obtain the coordinates of the joints. In addition to the two learning strategies of transfer learning and synthetic occlusion, a self-supervised training method is also used in this model, in which the 3D labels are constructed by the epipolar geometry transformation to supervise the training of the network. Without the need for 3D ground truths for the dataset, accurate estimation of the 3D human pose can be realized from a single 2D image. The results show that the mean per joint position error (MPJPE) is 74.6 mm without the need for 3D ground truth labels. Compared with other approaches, the proposed method achieves better results.
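
The soft argmax step and the MPJPE metric mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: it works on a 2D heat map rather than the volume heat map used in the paper, and the function names `soft_argmax_2d` and `mpjpe` are hypothetical.

```python
import math


def soft_argmax_2d(heatmap):
    """Return the expected (x, y) position under a softmax of the heat map.

    `heatmap` is a list of rows; x indexes columns and y indexes rows.
    This is a 2D simplification of the volumetric case (an assumption).
    """
    peak = max(v for row in heatmap for v in row)
    # Subtract the maximum before exponentiating for numerical stability.
    exps = [[math.exp(v - peak) for v in row] for row in heatmap]
    total = sum(sum(row) for row in exps)
    x = sum(p * j for row in exps for j, p in enumerate(row)) / total
    y = sum(p * i for i, row in enumerate(exps) for p in row) / total
    return x, y


def mpjpe(predicted, reference):
    """Mean per joint position error: average Euclidean distance per joint."""
    distances = [math.dist(p, r) for p, r in zip(predicted, reference)]
    return sum(distances) / len(distances)
```

Unlike a hard argmax, the soft argmax is differentiable, so the coordinate output can be trained end to end; the result is a sub-pixel position weighted by the whole heat map.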


Subjects
Automobile Driving , Self-Management , Humans , Hot Temperature , Learning , Motion
2.
Sensors (Basel) ; 23(13)2023 Jun 30.
Article in English | MEDLINE | ID: mdl-37447900

ABSTRACT

Accurate detection and timely treatment of component defects in substations is an important measure to ensure the safe operation of power systems. In this study, taking substation meters as an example, a dataset of common meter defects, such as a fuzzy or damaged dial on the meter and broken meter housing, is constructed from the images of manual inspection in power systems. There are several challenges involved in accurately detecting defects in substation meter images, such as the complex background, different meter sizes and large differences in the shapes of meter defects. Therefore, this paper proposes the PHAM-YOLO (Parallel Hybrid Attention Mechanism You Only Look Once) network for automatic detection of substation meter defects. In order to make the network pay attention to the key areas against the complex background of the meter defect images and the differences between different defect features, a Parallel Hybrid Attention Mechanism (PHAM) module is designed and added to the backbone of YOLOv5. By integrating local and non-local correlation information, PHAM can highlight these differences while remaining focused on the meter defect features. To improve the expressive ability of the feature map, a Spatial Pyramid Pooling Fast (SPPF) module is introduced, which pools the input feature map using a continuous fixed convolution kernel, fusing the feature maps of different receptive fields. Bounding box regression (BBR) is the key way to determine object positioning performance in defect detection. EIOU (Efficient Intersection over Union) is, therefore, introduced as a boundary loss function to solve the ambiguity of the CIOU (Complete Intersection over Union) loss function, making BBR more accurate. The experimental results show that the mean Average Precision (mAP), Precision (P) and Recall (R) of the proposed PHAM-YOLO network on the dataset are 78.3%, 78.3%, and 79.9%, respectively, with mAP being improved by 2.7% compared to the original model and higher than SSD, Fast R-CNN, etc.
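
As a rough illustration of the EIOU boundary loss named above, the sketch below uses the commonly cited formulation: one minus IoU, plus penalties for centre distance, width difference and height difference, each normalised by the smallest enclosing box. The function name `eiou_loss` and the `(x1, y1, x2, y2)` box layout are assumptions, and this is not the PHAM-YOLO implementation.

```python
def eiou_loss(box_a, box_b):
    """EIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumes both boxes have positive width and height.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Widths and heights of the two boxes.
    wa, ha = ax2 - ax1, ay2 - ay1
    wb, hb = bx2 - bx1, by2 - by1

    # Intersection over union.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    iou = inter / (wa * ha + wb * hb - inter)

    # Smallest box enclosing both inputs.
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)

    # Squared centre distance, normalised by the enclosing-box diagonal.
    dx = (ax1 + ax2) / 2 - (bx1 + bx2) / 2
    dy = (ay1 + ay2) / 2 - (by1 + by2) / 2
    centre_term = (dx * dx + dy * dy) / (cw * cw + ch * ch)

    # Width and height penalties, normalised by the enclosing-box sides.
    width_term = (wa - wb) ** 2 / (cw * cw)
    height_term = (ha - hb) ** 2 / (ch * ch)

    return 1.0 - iou + centre_term + width_term + height_term
```

Penalising width and height separately, rather than through an aspect-ratio term as in CIOU, is the usual explanation for how EIOU avoids the ambiguity the abstract refers to.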


Subjects
Algorithms , Records , Spine
3.
Sci Rep ; 13(1): 6132, 2023 04 15.
Article in English | MEDLINE | ID: mdl-37061550

ABSTRACT

Ski jumping is a high-speed sport, which makes it difficult to accurately analyze the technical motion in a subjective way. To solve this problem, we propose an image-based pose estimation method for analyzing the motion of ski jumpers. First, an image keypoint dataset of ski jumpers (KDSJ) was constructed. Next, in order to improve the precision of ski jumper pose estimation, an efficient channel attention (ECA) module was embedded in the residual structures of a high-resolution network (HRNet) to fuse more useful feature information. At the training stage, we used a transfer learning method which involved pre-training on Common Objects in Context (COCO2017) to obtain feature knowledge from COCO2017 for use in the task of ski jumper pose estimation. Finally, the detected keypoints of the ski jumpers were used to analyze the motion characteristics, using hip and knee angles over time (frames) as an example. Our experimental results showed that the proposed ECA-HRNet achieved an average precision of 73.4% on the COCO2017 test-dev set and an average precision of 86.4% on the KDSJ test set using the ground truth bounding boxes. These research results can provide guidance for auxiliary training and motion evaluation of ski jumpers.
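
The hip and knee angle analysis described above can be sketched as the angle between two limb segments meeting at a joint. The example assumes 2D keypoints in image coordinates; the function name `joint_angle` and the keypoint values are hypothetical and not taken from the KDSJ dataset.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at point b, formed by the segments b->a and b->c.

    For a knee angle, a, b and c would be the hip, knee and ankle keypoints.
    Assumes the three points are distinct.
    """
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))


# Hypothetical keypoints for one frame: hip, knee, ankle.
knee_angle = joint_angle((0.0, 0.0), (0.0, 1.0), (1.0, 1.0))
```

Applying the function to the keypoints of every frame would give the angle-over-time curve the abstract mentions.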


Subjects
Skiing , Biomechanical Phenomena , Motion , Knee Joint , Lower Extremity
4.
Sci Rep ; 12(1): 2183, 2022 02 09.
Article in English | MEDLINE | ID: mdl-35140287

ABSTRACT

The accurate detection and identification of tea leaf diseases are conducive to their precise prevention and control. Convolutional neural networks (CNNs) can automatically extract the features of diseased tea leaves in the images. However, tea leaf images taken in natural environments have problems, such as complex backgrounds, dense leaves, and large-scale changes. The existing CNNs have low accuracy in detecting and identifying tea leaf diseases. This study proposes an improved RetinaNet target detection and identification network, AX-RetinaNet, which is used for the automatic detection and identification of tea leaf diseases in natural scene images. AX-RetinaNet uses an improved multiscale feature fusion module, the X-module, and adds a channel attention module, Attention. The feature fusion module of the X-module obtains feature maps with rich information through multiple fusions of multi-scale features. The attention module assigns a weight, adaptively optimized by the network, to each feature map channel so that the network can select more effective features and reduce the interference of redundant features. This study also uses data augmentation methods to solve the problem of insufficient samples. Experimental results show that the detection and identification accuracy of AX-RetinaNet for tea leaf diseases in natural scene images is better than that of existing target detection and identification networks, such as SSD, RetinaNet, YOLO-v3, YOLO-v4, Centernet, M2det, and EfficientNet. The AX-RetinaNet detection and identification results indicated an mAP value of 93.83% and an F1-score of 0.954. Compared with the original network, the mAP value, recall value, and identification accuracy increased by nearly 4%, by 4%, and by nearly 1.5%, respectively.
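
For readers unfamiliar with the reported metrics, the sketch below shows how precision, recall and the F1-score relate to detection counts. The function name `precision_recall_f1` and the counts used are illustrative assumptions, not data from the study.

```python
def precision_recall_f1(true_pos, false_pos, false_neg):
    """Compute precision, recall and F1 from detection counts.

    Returns 0.0 for any metric whose denominator would be zero.
    """
    detected = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / detected if detected else 0.0
    recall = true_pos / actual if actual else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Illustrative counts only: 8 correct detections, 2 false alarms, 2 misses.
p, r, f1 = precision_recall_f1(8, 2, 2)
```

Because F1 is a harmonic mean, it stays high only when precision and recall are both high.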


Subjects
Computer Neural Networks , Plant Leaves/anatomy & histology , Tea/anatomy & histology , Image Enhancement , Computer-Assisted Image Processing , Plant Diseases
5.
PLoS One ; 14(5): e0217168, 2019.
Article in English | MEDLINE | ID: mdl-31136610

ABSTRACT

This paper focuses on fine-grained image retrieval based on sketches. Sketches capture detailed information, but their highly abstract nature makes visual comparisons with images more difficult. Although the existing models take fine-grained details into account, they cannot accurately highlight the distinctive local features, and they ignore the correlation between features. To solve this problem, we design a gradually focused bilinear attention model to extract detailed information more effectively. Specifically, the attention model accurately focuses on representative local positions and then uses weighted bilinear coding to find more discriminative feature representations. Finally, the global triplet loss function is used to avoid oversampling or undersampling. The experimental results show that the proposed method outperforms the state-of-the-art sketch-based image retrieval methods.
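
As background for the loss mentioned above, here is a minimal sketch of the standard triplet loss on embedding vectors. It is not the paper's global triplet loss, whose exact form the abstract does not give; the function name `triplet_loss`, the margin value and the Euclidean distance are assumptions.

```python
import math


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss with Euclidean distance.

    The loss is zero once the negative is farther from the anchor than the
    positive by at least `margin`. In sketch-based retrieval the anchor would
    be a sketch embedding, the positive its matching image and the negative
    a non-matching image.
    """
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)
```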


Subjects
Algorithms , Factual Databases , Computer-Assisted Image Processing , Information Storage and Retrieval , Computer Neural Networks , Automated Pattern Recognition , Humans