Results 1 - 3 of 3
1.
Sensors (Basel); 24(5), 2024 Feb 22.
Article in English | MEDLINE | ID: mdl-38474944

ABSTRACT

In this paper, we introduce a novel panoptic segmentation method called the Mask-Pyramid Network. Existing Mask RCNN-based methods first generate a large number of box proposals and then filter them at each feature level. This requires substantial computational resources, even though most of the box proposals are suppressed and discarded during Non-Maximum Suppression. Additionally, for panoptic segmentation, it is difficult to properly fuse the semantic segmentation results with the instance segmentation results produced by Mask RCNN. To address these issues, we propose a new mask pyramid mechanism that distinguishes objects and generates far fewer proposals by referring to existing segmented masks, thereby reducing computing resource consumption. The Mask-Pyramid Network generates object proposals and predicts masks from larger to smaller sizes. It records the pixel area occupied by the larger object masks and then generates proposals only on the unoccupied areas. Each object mask is represented as an H × W × 1 logit, which matches the format of the semantic segmentation logits. Applying SoftMax to the concatenated semantic and instance segmentation logits therefore fuses both segmentation results easily and naturally. We empirically demonstrate that the proposed Mask-Pyramid Network achieves comparable accuracy on the Cityscapes and COCO datasets. Furthermore, we demonstrate the computational efficiency of the proposed method and obtain competitive results.
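
The fusion step described in this abstract (SoftMax over concatenated semantic and instance logits) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name fuse_logits, the nested-list layout (a list of channels, each an H × W grid), and the toy values are assumptions made only for illustration.

```python
import math


def fuse_logits(semantic_logits, instance_logits):
    """Concatenate channels and apply a per-pixel softmax (illustrative only).

    Both arguments are lists of channels; each channel is an H x W nested
    list of floats. Returns (probs, labels), where probs[y][x] is the
    softmax distribution over all channels and labels[y][x] is its argmax.
    """
    # Concatenation along the channel axis: semantic channels come first.
    channels = list(semantic_logits) + list(instance_logits)
    height, width = len(channels[0]), len(channels[0][0])
    probs = [[None] * width for _ in range(height)]
    labels = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            logits = [channel[y][x] for channel in channels]
            peak = max(logits)  # subtract the maximum for numerical stability
            exps = [math.exp(value - peak) for value in logits]
            total = sum(exps)
            distribution = [e / total for e in exps]
            probs[y][x] = distribution
            labels[y][x] = max(range(len(distribution)),
                               key=distribution.__getitem__)
    return probs, labels


# Toy 1 x 2 image: two hypothetical semantic channels and one instance mask.
semantic = [[[2.0, 0.5]], [[0.0, 1.0]]]
instance = [[[3.0, -1.0]]]
fused_probs, fused_labels = fuse_logits(semantic, instance)
```

In this toy input the instance channel (index 2) has the largest logit at the first pixel and the second semantic channel (index 1) at the second pixel, so each pixel receives exactly one label from the shared distribution.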

2.
Sensors (Basel); 21(14), 2021 Jul 10.
Article in English | MEDLINE | ID: mdl-34300460

ABSTRACT

Human action recognition methods for videos that are based on deep convolutional neural networks usually use random cropping or its variants for data augmentation. However, this traditional data augmentation approach may generate many non-informative samples (video patches covering only a small part of the foreground or only the background) that are not related to a specific action. These samples can be regarded as noisy samples with incorrect labels, and they reduce the overall action recognition performance. In this paper, we attempt to mitigate the impact of noisy samples by proposing an Auto-augmented Siamese Neural Network (ASNet). In this framework, we propose backpropagating salient patches and randomly cropped samples in the same iteration to perform gradient compensation, which alleviates the adverse gradient effects of non-informative samples. Salient patches are samples that contain critical information for human action recognition. The generation of salient patches is formulated as a Markov decision process, and a reinforcement learning agent called SPA (Salient Patch Agent) is introduced to extract patches in a weakly supervised manner without extra labels. Extensive experiments were conducted on two well-known datasets, UCF-101 and HMDB-51, to verify the effectiveness of the proposed SPA and ASNet.
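
To make the notion of a non-informative random crop concrete, the sketch below measures how much of a foreground box a crop covers. It is an illustration only and not part of ASNet or SPA: the box format (x0, y0, x1, y1), the function names, and the frame and crop sizes are all assumptions.

```python
import random


def box_area(box):
    """Area of an (x0, y0, x1, y1) box; degenerate boxes have area 0."""
    x0, y0, x1, y1 = box
    return max(0, x1 - x0) * max(0, y1 - y0)


def foreground_coverage(crop, foreground):
    """Fraction of the foreground box that lies inside the crop."""
    intersection = (
        max(crop[0], foreground[0]),
        max(crop[1], foreground[1]),
        min(crop[2], foreground[2]),
        min(crop[3], foreground[3]),
    )
    foreground_area = box_area(foreground)
    if foreground_area == 0:
        return 0.0
    return box_area(intersection) / foreground_area


def random_crop(frame_width, frame_height, crop_width, crop_height, rng):
    """Sample a crop box uniformly at random inside the frame."""
    x0 = rng.randint(0, frame_width - crop_width)
    y0 = rng.randint(0, frame_height - crop_height)
    return (x0, y0, x0 + crop_width, y0 + crop_height)


# Hypothetical 320 x 240 frame with the actor in a small foreground box.
rng = random.Random(0)
actor_box = (200, 60, 260, 200)
crop = random_crop(320, 240, 112, 112, rng)
coverage = foreground_coverage(crop, actor_box)
```

A crop whose coverage is near zero shows mostly background yet still inherits the video's action label, which is the kind of noisy sample the abstract describes.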


Subjects
Neural Networks, Computer; Recognition, Psychology; Human Activities; Humans; Learning; Markov Chains
3.
Sensors (Basel); 21(7), 2021 Mar 29.
Article in English | MEDLINE | ID: mdl-33805558

ABSTRACT

Deep reinforcement learning (DRL) has been utilized in numerous computer vision tasks, such as object detection and autonomous driving. However, relatively few DRL methods have been proposed in the area of image segmentation, particularly for left ventricle (LV) segmentation. Reinforcement learning-based methods in earlier works often rely on learning proper thresholds to perform segmentation, and the segmentation results are inaccurate because of their sensitivity to the threshold. To tackle this problem, a novel DRL agent is designed to imitate the human process of performing LV segmentation. For this purpose, we formulate the segmentation problem as a Markov decision process and innovatively optimize it through DRL. The proposed DRL agent consists of two neural networks, i.e., First-P-Net and Next-P-Net. The First-P-Net locates the initial edge point, and the Next-P-Net locates the remaining edge points successively and ultimately obtains a closed segmentation result. The experimental results show that the proposed model outperforms previous reinforcement learning methods and achieves performance comparable to deep learning baselines on two widely used LV endocardium segmentation datasets, namely the Automated Cardiac Diagnosis Challenge (ACDC) 2017 dataset and the Sunnybrook 2009 dataset. Moreover, the proposed model achieves a higher F-measure than deep learning methods when trained with a very limited number of samples.
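
The two-stage procedure in this abstract (one network picks the first edge point, a second network picks each subsequent point until the contour closes) can be sketched as a simple episode loop. The loop, the exact-match closing rule, and the stub policies below are assumptions for illustration; they stand in for First-P-Net and Next-P-Net and do not reproduce the authors' networks, rewards, or training.

```python
def trace_contour(first_point, next_point, max_steps=100):
    """Run one edge-tracing episode (illustrative only).

    first_point() returns the initial edge point; next_point(contour)
    returns the next edge point given the points found so far. The
    episode stops when the agent returns to the starting point or when
    max_steps is reached. Returns (contour, closed).
    """
    start = first_point()
    contour = [start]
    for _ in range(max_steps):
        candidate = next_point(contour)
        # Returning to the start after at least three points closes the contour.
        if candidate == start and len(contour) >= 3:
            return contour, True
        contour.append(candidate)
    return contour, False


# Stub policies that walk the corners of a hypothetical square boundary.
SQUARE = [(0, 0), (4, 0), (4, 4), (0, 4)]


def stub_first_point():
    return SQUARE[0]


def stub_next_point(contour):
    return SQUARE[len(contour) % len(SQUARE)]


contour, closed = trace_contour(stub_first_point, stub_next_point)
```

With the stub policies the loop visits the four corners and stops when the next proposed point equals the starting corner, giving a closed polygon whose interior would be the segmentation result.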


Subjects
Heart Ventricles; Neural Networks, Computer; Heart; Heart Ventricles/diagnostic imaging; Humans