Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 2 de 2
Filtrar
Mais filtros

Base de dados
Ano de publicação
Tipo de documento
País de afiliação
Intervalo de ano de publicação
1.
Sensors (Basel) ; 23(2)2023 Jan 08.
Artigo em Inglês | MEDLINE | ID: mdl-36679517

RESUMO

Visual geo-localization plays a crucial role in positioning and navigation for unmanned aerial vehicles, whose goal is to match the same geographic target from different views. This is a challenging task due to the drastic variations in different viewpoints and appearances. Previous methods have been focused on mining features inside the images. However, they underestimated the influence of external elements and the interaction of various representations. Inspired by multimodal and bilinear pooling, we proposed a pioneering feature fusion network (MBF) to address these inherent differences between drone and satellite views. We observe that UAV's status, such as flight height, leads to changes in the size of image field of view. In addition, local parts of the target scene act a role of importance in extracting discriminative features. Therefore, we present two approaches to exploit those priors. The first module is to add status information to network by transforming them into word embeddings. Note that they concatenate with image embeddings in Transformer block to learn status-aware features. Then, global and local part feature maps from the same viewpoint are correlated and reinforced by hierarchical bilinear pooling (HBP) to improve the robustness of feature representation. By the above approaches, we achieve more discriminative deep representations facilitating the geo-localization more effectively. Our experiments on existing benchmark datasets show significant performance boosting, reaching the new state-of-the-art result. Remarkably, the recall@1 accuracy achieves 89.05% in drone localization task and 93.15% in drone navigation task in University-1652, and shows strong robustness at different flight heights in the SUES-200 dataset.


Assuntos
Conscientização , Benchmarking , Humanos , Fontes de Energia Elétrica , Aprendizagem , Dispositivos Aéreos não Tripulados
2.
Sensors (Basel) ; 23(5)2023 Feb 27.
Artigo em Inglês | MEDLINE | ID: mdl-36904814

RESUMO

The past decade has demonstrated the potential of human activity recognition (HAR) with WiFi signals owing to non-invasiveness and ubiquity. Previous research has largely concentrated on enhancing precision through sophisticated models. However, the complexity of recognition tasks has been largely neglected. Thus, the performance of the HAR system is markedly diminished when tasked with increasing complexities, such as a larger classification number, the confusion of similar actions, and signal distortion To address this issue, we eliminated conventional convolutional and recurrent backbones and proposed WiTransformer, a novel tactic based on pure Transformers. Nevertheless, Transformer-like models are typically suited to large-scale datasets as pretraining models, according to the experience of the Vision Transformer. Therefore, we adopted the Body-coordinate Velocity Profile, a cross-domain WiFi signal feature derived from the channel state information, to reduce the threshold of the Transformers. Based on this, we propose two modified transformer architectures, united spatiotemporal Transformer (UST) and separated spatiotemporal Transformer (SST) to realize WiFi-based human gesture recognition models with task robustness. SST intuitively extracts spatial and temporal data features using two encoders, respectively. By contrast, UST can extract the same three-dimensional features with only a one-dimensional encoder, owing to its well-designed structure. We evaluated SST and UST on four designed task datasets (TDSs) with varying task complexities. The experimental results demonstrate that UST has achieved recognition accuracy of 86.16% on the most complex task dataset TDSs-22, outperforming the other popular backbones. Simultaneously, the accuracy decreases by at most 3.18% when the task complexity increases from TDSs-6 to TDSs-22, which is 0.14-0.2 times that of others. However, as predicted and analyzed, SST fails because of excessive lack of inductive bias and the limited scale of the training data.


Assuntos
Fontes de Energia Elétrica , Gestos , Humanos , Reconhecimento Psicológico , Sorbitol
SELEÇÃO DE REFERÊNCIAS
DETALHE DA PESQUISA