Results 1 - 20 of 83
1.
Sci Rep ; 14(1): 20090, 2024 Aug 29.
Article in English | MEDLINE | ID: mdl-39209928

ABSTRACT

Remote Sensing Image Object Detection (RSIOD) faces the challenges of multi-scale objects, dense overlap of objects, and uneven data distribution in practical applications. To solve these problems, this paper proposes a YOLO-ACPHD RSIOD algorithm. The algorithm adopts Adaptive Condition Awareness Technology (ACAT), which dynamically adjusts the parameters of the convolution kernel to adapt to objects of different scales and positions. Compared with a traditional fixed convolution kernel, this dynamic adjustment better accommodates the diversity of object scale, direction, and shape, thus improving the accuracy and robustness of Object Detection (OD). In addition, a High-Dimensional Decoupling Technology (HDDT) reduces the amount of calculation to 1/N by performing deep convolution on the input data and then performing spatial convolution on each channel. When dealing with large-scale Remote Sensing Image (RSI) data, this reduction in computation significantly improves the efficiency of the algorithm and accelerates OD, so as to better meet the needs of practical application scenarios. In experimental verification on the RSOD RSI dataset, the YOLO-ACPHD model shows very satisfactory performance: the F1 value reaches 0.99, the Precision value reaches 1, the Precision-Recall value reaches 0.994, the Recall value reaches 1, and the mAP value reaches 99.36%, indicating that the model achieves the highest level of accuracy and comprehensiveness in OD.
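The "1/N" computation claim for HDDT resembles the generic depthwise-then-pointwise factorization of a standard convolution (as popularized by MobileNet). The abstract does not specify HDDT's exact form, so the sketch below is illustrative: it counts multiplications for both variants and shows the familiar cost ratio of roughly 1/C_out + 1/k².

```python
# Cost comparison: standard convolution vs. a depthwise + pointwise
# factorization. Sizes are arbitrary; this is not the paper's exact HDDT.

def conv_mults(h, w, c_in, c_out, k):
    """Multiplications for a standard k x k convolution (stride 1, same padding)."""
    return h * w * c_in * c_out * k * k

def separable_mults(h, w, c_in, c_out, k):
    """Depthwise k x k per channel, then 1 x 1 pointwise across channels."""
    depthwise = h * w * c_in * k * k
    pointwise = h * w * c_in * c_out
    return depthwise + pointwise

h = w = 64; c_in = c_out = 128; k = 3
std = conv_mults(h, w, c_in, c_out, k)
sep = separable_mults(h, w, c_in, c_out, k)
print(f"standard: {std:,}  separable: {sep:,}  ratio: {sep/std:.3f}")
# ratio = 1/c_out + 1/k**2, i.e. roughly 1/9 for 3x3 kernels with many channels
```

With many output channels the k² term dominates, which is where the order-of-magnitude savings for large RSI inputs comes from.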

2.
Sensors (Basel) ; 24(14)2024 Jul 15.
Article in English | MEDLINE | ID: mdl-39065972

ABSTRACT

Recently, the low-rank representation (LRR) model has been widely used in the field of remote sensing image denoising due to its excellent noise suppression capability. However, those low-rank-based methods always discard important edge details as residuals, leading to a common issue of blurred edges in denoised results. To address this problem, we take a new look at low-rank residuals and try to extract edge information from them. Therefore, a hierarchical denoising framework was combined with a low-rank model to extract edge information from low-rank residuals within the edge subspace. A prior knowledge matrix was designed to enable the model to learn necessary structural information rather than noise. Also, such traditional model-driven approaches require multiple iterations, and the solutions may be very complex and computationally intensive. To further enhance the noise suppression performance and computing efficiency, a hierarchical low-rank denoising model based on deep unrolling (HLR-DUR) was proposed, integrating deep neural networks into the hierarchical low-rank denoising framework to expand the information capture and representation capabilities of the proposed shallow model. Sufficient experiments on optical images, hyperspectral images (HSI), and synthetic aperture radar (SAR) images showed that HLR-DUR achieved state-of-the-art (SOTA) denoising results.
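The low-rank prior the abstract builds on can be demonstrated with a plain truncated-SVD projection: keep the top singular directions of a noisy matrix and discard the rest as residual. This is only the baseline LRR idea; the paper's hierarchical edge-subspace extraction and deep unrolling are not reproduced here.

```python
import numpy as np

# Minimal low-rank denoising sketch: rank-r truncation of a noisy matrix.
rng = np.random.default_rng(0)
r = 3
# Synthetic rank-3 "clean image" plus Gaussian noise.
clean = rng.standard_normal((64, r)) @ rng.standard_normal((r, 64))
noisy = clean + 0.1 * rng.standard_normal((64, 64))

U, s, Vt = np.linalg.svd(noisy, full_matrices=False)
denoised = (U[:, :r] * s[:r]) @ Vt[:r]          # rank-r approximation

err_noisy = np.linalg.norm(noisy - clean)
err_denoised = np.linalg.norm(denoised - clean)
print(err_denoised < err_noisy)  # the projection removes most of the noise energy
```

Note that whatever edge detail is not low-rank ends up in the discarded residual, which is exactly the blurring problem the paper's edge-subspace stage targets.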

3.
Entropy (Basel) ; 26(6)2024 May 25.
Article in English | MEDLINE | ID: mdl-38920454

ABSTRACT

Salient object detection (SOD) aims to accurately identify significant geographical objects in remote sensing images (RSI), providing reliable support and guidance for extensive geographical information analyses and decisions. However, SOD in RSI faces numerous challenges, including shadow interference, inter-class feature confusion, as well as unclear target edge contours. Therefore, we designed an effective Global Semantic-aware Aggregation Network (GSANet) to aggregate salient information in RSI. GSANet computes the information entropy of different regions, prioritizing areas with high information entropy as potential target regions, thereby achieving precise localization and semantic understanding of salient objects in remote sensing imagery. Specifically, we proposed a Semantic Detail Embedding Module (SDEM), which explores the potential connections among multi-level features, adaptively fusing shallow texture details with deep semantic features, efficiently aggregating the information entropy of salient regions, enhancing information content of salient targets. Additionally, we proposed a Semantic Perception Fusion Module (SPFM) to analyze map relationships between contextual information and local details, enhancing the perceptual capability for salient objects while suppressing irrelevant information entropy, thereby addressing the semantic dilution issue of salient objects during the up-sampling process. The experimental results on two publicly available datasets, ORSSD and EORSSD, demonstrated the outstanding performance of our method. The method achieved 93.91% Sα, 98.36% Eξ, and 89.37% Fß on the EORSSD dataset.
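The region-entropy idea can be sketched with the Shannon entropy of image patches: flat background scores near zero, textured candidate-target regions score high. Patch size and the 8-bit histogram here are assumptions for illustration, not GSANet's actual computation.

```python
import numpy as np

# Shannon entropy of an 8-bit image patch, a toy stand-in for prioritising
# high-information-entropy regions as candidate salient targets.
def patch_entropy(patch, bins=256):
    hist, _ = np.histogram(patch, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]                       # drop empty bins before taking logs
    return float(-(p * np.log2(p)).sum())

flat = np.full((32, 32), 128, dtype=np.uint8)                       # featureless region
textured = np.random.default_rng(1).integers(0, 256, (32, 32), dtype=np.uint8)

print(patch_entropy(flat), patch_entropy(textured))
# The textured patch scores far higher, so it would be prioritised.
```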

4.
Sensors (Basel) ; 24(12)2024 Jun 19.
Article in English | MEDLINE | ID: mdl-38931757

ABSTRACT

Remote sensing images are inevitably affected by the degradation of haze with complex appearance and non-uniform distribution, which remarkably affects the effectiveness of downstream remote sensing visual tasks. However, most current methods principally operate in the original pixel space of the image, which hinders the exploration of the frequency characteristics of remote sensing images, resulting in these models failing to fully exploit their representation ability to produce high-quality images. This paper proposes a frequency-oriented remote sensing dehazing Transformer named FOTformer, to explore information in the frequency domain to eliminate disturbances caused by haze in remote sensing images. It contains three components. Specifically, we developed a frequency-prompt attention evaluator to estimate the self-correlation of features in the frequency domain rather than the spatial domain, improving the image restoration performance. We propose a content reconstruction feed-forward network that captures information between different scales in features and integrates and processes global frequency domain information and local multi-scale spatial information in Fourier space to reconstruct the global content under the guidance of the amplitude spectrum. We designed a spatial-frequency aggregation block to exchange and fuse features from the frequency domain and spatial domain of the encoder and decoder to facilitate the propagation of features from the encoder stream to the decoder and alleviate the problem of information loss in the network. The experimental results show that the FOTformer achieved a more competitive performance against other remote sensing dehazing methods on commonly used benchmark datasets.
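The amplitude/phase decomposition that frequency-domain dehazing methods like FOTformer operate on is just the 2-D FFT split shown below; recombining the two recovers the image exactly, which is why a network can safely process them as separate streams.

```python
import numpy as np

# Split an image into FFT amplitude and phase, then recombine losslessly.
rng = np.random.default_rng(0)
img = rng.random((32, 32))

spec = np.fft.fft2(img)
amplitude = np.abs(spec)       # haze mostly perturbs the amplitude spectrum
phase = np.angle(spec)         # scene structure lives largely in the phase

recon = np.fft.ifft2(amplitude * np.exp(1j * phase)).real
print(np.allclose(recon, img))  # True: the decomposition is invertible
```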

5.
Sensors (Basel) ; 24(9)2024 May 06.
Article in English | MEDLINE | ID: mdl-38733059

ABSTRACT

In response to the challenges posed by small objects in remote sensing images, such as low resolution, complex backgrounds, and severe occlusions, this paper proposes a lightweight improved model based on YOLOv8n. During the detection of small objects, the feature fusion part of the YOLOv8n algorithm retrieves relatively fewer features of small objects from the backbone network compared to large objects, resulting in low detection accuracy for small objects. To address this issue, firstly, this paper adds a dedicated small object detection layer in the feature fusion network to better integrate the features of small objects into the feature fusion part of the model. Secondly, the SSFF module is introduced to facilitate multi-scale feature fusion, enabling the model to capture more gradient paths and further improve accuracy while reducing model parameters. Finally, the HPANet structure is proposed, replacing the Path Aggregation Network with HPANet. Compared to the original YOLOv8n algorithm, the recognition accuracy of mAP@0.5 on the VisDrone data set and the AI-TOD data set has increased by 14.3% and 17.9%, respectively, while the recognition accuracy of mAP@0.5:0.95 has increased by 17.1% and 19.8%, respectively. The proposed method reduces the parameter count by 33% and the model size by 31.7% compared to the original model. Experimental results demonstrate that the proposed method can quickly and accurately identify small objects in complex backgrounds.

6.
Sensors (Basel) ; 24(10)2024 May 18.
Article in English | MEDLINE | ID: mdl-38794065

ABSTRACT

This study focuses on advancing the field of remote sensing image target detection, addressing challenges such as small target detection, complex background handling, and dense target distribution. We propose solutions based on enhancing the YOLOv7 algorithm. Firstly, we improve the multi-scale feature enhancement (MFE) method of YOLOv7, enhancing its adaptability and precision in detecting small targets and complex backgrounds. Secondly, we design a modified YOLOv7 global information DP-MLP module to effectively capture and integrate global information, thereby improving target detection accuracy and robustness, especially in handling large-scale variations and complex scenes. Lastly, we explore a semi-supervised learning model (SSLM) target detection algorithm incorporating unlabeled data, leveraging information from unlabeled data to enhance the model's generalization ability and performance. Experimental results demonstrate that despite the outstanding performance of YOLOv7, the mean average precision (MAP) can still be improved by 1.9%. Specifically, under testing on the TGRS-HRRSD-Dataset, the MFE and DP-MLP models achieve MAP values of 93.4% and 93.1%, respectively. Across the NWPU VHR-10 dataset, the three models achieve MAP values of 93.1%, 92.1%, and 92.2%, respectively. Significant improvements are observed across various metrics compared to the original model. This study enhances the adaptability, accuracy, and generalization of remote sensing image object detection.

7.
Sci Rep ; 14(1): 11558, 2024 May 21.
Article in English | MEDLINE | ID: mdl-38773140

ABSTRACT

Remote sensing image fusion is dedicated to obtaining a high-resolution multispectral (HRMS) image without spatial or spectral distortion compared to the single source image. In this paper, a novel fusion algorithm based on Bayesian estimation for remote sensing images is proposed from the new perspective of risk decisions. In this study, an observation model based on Bayesian estimation for remote sensing image fusion is constructed. Three categories of probabilities including prior, conditional and posterior probabilities are calculated after an intensity-hue-saturation (IHS) transformation is applied to the original low-resolution MS image. To obtain the desired HRMS image, with the corrected posterior probability, a fusion rule based on Bayesian decisions is designed to estimate which pixels to select from the panchromatic (PAN) image and the intensity component of the MS image. The selected pixels constitute a new component that will participate in an IHS inverse transformation to yield the fused image. Extensive experiments were performed on the Pleiades, WorldView-3, and IKONOS datasets, and the results demonstrate the effectiveness of the proposed method.
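The IHS substitution step underlying this family of methods has a compact "fast IHS" form: compute the intensity component, then inject the PAN detail into every band. The paper's contribution is a Bayesian per-pixel choice between PAN and intensity; the sketch below shows only the plain substitution baseline, with made-up data.

```python
import numpy as np

# Fast IHS-style component substitution for pansharpening (baseline only).
rng = np.random.default_rng(0)
ms = rng.random((3, 16, 16))          # low-res multispectral, upsampled to the PAN grid
pan = rng.random((16, 16))            # high-res panchromatic band

intensity = ms.mean(axis=0)           # I component of the IHS transform
fused = ms + (pan - intensity)        # add the spatial detail to every band

# After substitution, the fused intensity equals the PAN image exactly.
print(np.allclose(fused.mean(axis=0), pan))
```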

8.
Sensors (Basel) ; 24(7)2024 Mar 24.
Article in English | MEDLINE | ID: mdl-38610290

ABSTRACT

Remote sensing images are a vital basis for land management decisions. Many scholars have applied blockchain's notarization function to the protection of remote sensing images, yet research on efficient retrieval of such images on the blockchain remains sparse. Addressing this issue, this paper introduces a blockchain-based spatial index verification method using Hyperledger Fabric. It linearizes the spatial information of remote sensing images via Geohash and integrates it with LSM trees for effective retrieval and verification. The system also incorporates IPFS as an underlying storage unit for Hyperledger Fabric, ensuring the safe storage and transmission of images. The experiments indicate that this method significantly reduces the latency of data retrieval and verification without impacting the write performance of Hyperledger Fabric, enhancing throughput and providing a solid foundation for efficient blockchain-based verification of remote sensing images in land registry systems.
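The Geohash linearization the paper relies on is a standard algorithm: alternately bisect longitude and latitude, emit one bit per bisection, and pack every five bits into a base-32 character, turning a 2-D location into a prefix-searchable string. A minimal encoder (the LSM-tree and Fabric integration are of course omitted):

```python
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash_encode(lat, lon, precision=5):
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    out, even, ch, bit = [], True, 0, 0
    while len(out) < precision:
        if even:                            # even bits refine longitude
            mid = (lon_lo + lon_hi) / 2
            ch = ch * 2 + (lon >= mid)
            if lon >= mid: lon_lo = mid
            else: lon_hi = mid
        else:                               # odd bits refine latitude
            mid = (lat_lo + lat_hi) / 2
            ch = ch * 2 + (lat >= mid)
            if lat >= mid: lat_lo = mid
            else: lat_hi = mid
        even = not even
        bit += 1
        if bit == 5:                        # every 5 bits become one base-32 char
            out.append(_BASE32[ch])
            bit, ch = 0, 0
    return "".join(out)

print(geohash_encode(42.6, -5.6))  # "ezs42", the standard reference example
```

Nearby locations share string prefixes, which is what makes Geohash keys amenable to ordered key-value stores such as LSM trees.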

9.
Sensors (Basel) ; 24(8)2024 Apr 11.
Article in English | MEDLINE | ID: mdl-38676059

ABSTRACT

The identification of maritime targets plays a critical role in ensuring maritime safety and safeguarding against potential threats. While satellite remote-sensing imagery serves as the primary data source for monitoring maritime targets, it only provides positional and morphological characteristics without detailed identity information, presenting limitations as a sole data source. To address this issue, this paper proposes a method for enhancing maritime target identification and positioning accuracy through the fusion of Automatic Identification System (AIS) data and satellite remote-sensing imagery. The AIS utilizes radio communication to acquire multidimensional feature information describing targets, serving as an auxiliary data source to complement the limitations of image data and achieve maritime target identification. Additionally, the positional information provided by the AIS can serve as maritime control points to correct positioning errors and enhance accuracy. By utilizing data from the Jilin-1 Spectral-01 satellite imagery with a resolution of 5 m and AIS data, the feasibility of the proposed method is validated through experiments. Following preprocessing, maritime target fusion is achieved using a point-set matching algorithm based on positional features and a fuzzy comprehensive decision method incorporating attribute features. Subsequently, the successful fusion of target points is utilized for positioning error correction. Experimental results demonstrate a significant improvement in maritime target positioning accuracy compared to raw data, with over a 70% reduction in root mean square error and positioning errors controlled within 4 pixels, providing relatively accurate target positions that essentially meet practical requirements.
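The error-correction step can be illustrated with matched control points: estimate the systematic offset between image-derived and AIS positions by a least-squares translation, subtract it, and measure the RMSE reduction. The data, bias, and pure-translation error model below are assumptions; the paper's point-set matching and fuzzy decision stages are taken as given.

```python
import numpy as np

# Positioning-error correction using AIS positions as control points.
rng = np.random.default_rng(0)
true_pos = rng.uniform(0, 1000, (20, 2))              # AIS positions (control points)
bias = np.array([3.2, -2.5])                          # assumed systematic geolocation error
observed = true_pos + bias + 0.3 * rng.standard_normal((20, 2))

def rmse(a, b):
    return float(np.sqrt(np.mean(np.sum((a - b) ** 2, axis=1))))

offset = (observed - true_pos).mean(axis=0)           # least-squares translation estimate
corrected = observed - offset

print(rmse(observed, true_pos), rmse(corrected, true_pos))
# Correction removes the systematic bias, leaving only the random residual.
```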

10.
Sensors (Basel) ; 24(4)2024 Feb 09.
Article in English | MEDLINE | ID: mdl-38400288

ABSTRACT

Remote sensing image classification (RSIC) is designed to assign specific semantic labels to aerial images, which is significant and fundamental in many applications. In recent years, substantial work has been conducted on RSIC with the help of deep learning models. Even though these models have greatly enhanced the performance of RSIC, the issues of diversity in the same class and similarity between different classes in remote sensing images remain huge challenges for RSIC. To solve these problems, a duplex-hierarchy representation learning (DHRL) method is proposed. The proposed DHRL method aims to explore duplex-hierarchy spaces, including a common space and a label space, to learn discriminative representations for RSIC. The proposed DHRL method consists of three main steps: First, paired images are fed to a pretrained ResNet network for extracting the corresponding features. Second, the extracted features are further explored and mapped into a common space for reducing the intra-class scatter and enlarging the inter-class separation. Third, the obtained representations are used to predict the categories of the input images, and the discrimination loss in the label space is minimized to further promote the learning of discriminative representations. Meanwhile, a confusion score is computed and added to the classification loss for guiding the discriminative representation learning via backpropagation. The comprehensive experimental results show that the proposed method is superior to the existing state-of-the-art methods on two challenging remote sensing image scene datasets, demonstrating that the proposed method is significantly effective.

11.
Sensors (Basel) ; 24(4)2024 Feb 16.
Article in English | MEDLINE | ID: mdl-38400425

ABSTRACT

To address the challenges of handling imprecise building boundary information and reducing false-positive outcomes during the process of detecting building changes in remote sensing images, this paper proposes a Siamese transformer architecture based on a difference module. This method introduces a layered transformer to provide global context modeling capability and multiscale features to better process building boundary information, and a difference module is used to better obtain the difference features of a building before and after a change. The difference features before and after the change are then fused, and the fused difference features are used to generate a change map, which reduces the false-positive problem to a certain extent. Experiments were conducted on two publicly available building change detection datasets, LEVIR-CD and WHU-CD. The F1 scores for LEVIR-CD and WHU-CD reached 89.58% and 84.51%, respectively. The experimental results demonstrate that when utilized for building change detection in remote sensing images, the proposed method exhibits improved robustness and detection performance. Additionally, this method serves as a valuable technical reference for the identification of building damage in remote sensing images.

12.
Sci Rep ; 14(1): 5054, 2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38424135

ABSTRACT

Deep neural networks combined with superpixel segmentation have proven highly effective for high-resolution remote sensing image (HRI) classification. Currently, most HRI classification methods that combine deep learning and superpixel segmentation use stacking on multiple scales to extract contextual information from segmented objects. However, this approach does not take into account the contextual dependencies between segmented objects. To solve this problem, a joint superpixel and Transformer (JST) framework is proposed for HRI classification. In JST, the HRI is first segmented into superpixel objects as input, and a Transformer is used to model the long-range dependencies. The contextual relationship between each input superpixel object is obtained, and the class of the analyzed objects is output by an encoding-decoding Transformer. Additionally, we explore the effect of semantic range on classification accuracy. JST is tested on two HRI datasets, achieving overall classification accuracy, average accuracy, and Kappa coefficients of 0.79, 0.70, 0.78 and 0.91, 0.85, 0.89, respectively. The effectiveness of the proposed method is compared qualitatively and quantitatively, and its results are competitive with, and consistently better than, those of the benchmark comparison methods.
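The overall accuracy and Kappa coefficient reported above are computed from a confusion matrix in the standard way: observed agreement on the diagonal versus chance agreement from the marginals. A short sketch with a made-up matrix:

```python
import numpy as np

# Overall accuracy and Cohen's kappa from a classification confusion matrix.
def oa_and_kappa(cm):
    cm = np.asarray(cm, dtype=float)
    n = cm.sum()
    po = np.trace(cm) / n                                 # overall accuracy
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n**2   # chance agreement
    return po, (po - pe) / (1 - pe)

cm = [[50, 2, 3],     # rows: true class, columns: predicted class (illustrative)
      [4, 40, 1],
      [2, 3, 45]]
oa, kappa = oa_and_kappa(cm)
print(round(oa, 3), round(kappa, 3))  # 0.9 0.849
```

Kappa below overall accuracy is expected: it discounts the agreement that class frequencies alone would produce.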

13.
Entropy (Basel) ; 26(1)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38275499

ABSTRACT

The profound impacts of severe air pollution on human health, ecological balance, and economic stability are undeniable. Precise air quality forecasting stands as a crucial necessity, enabling governmental bodies and vulnerable communities to proactively take essential measures to reduce exposure to detrimental pollutants. Previous research has primarily focused on predicting air quality using only time-series data. However, the importance of remote-sensing image data has received limited attention. This paper proposes a new multi-modal deep-learning model, Res-GCN, which integrates high spatial resolution remote-sensing images and time-series air quality data from multiple stations to forecast future air quality. Res-GCN employs two deep-learning networks, one utilizing the residual network to extract hidden visual information from remote-sensing images, and another using a dynamic spatio-temporal graph convolution network to capture spatio-temporal information from time-series data. By extracting features from two different modalities, improved predictive performance can be achieved. To demonstrate the effectiveness of the proposed model, experiments were conducted on two real-world datasets. The results show that the Res-GCN model effectively extracts multi-modal features, significantly enhancing the accuracy of multi-step predictions. Compared to the best-performing baseline model, the multi-step prediction's mean absolute error, root mean square error, and mean absolute percentage error improved by approximately 6%, 7%, and 7%, respectively.

14.
Sensors (Basel) ; 24(2)2024 Jan 21.
Article in English | MEDLINE | ID: mdl-38276366

ABSTRACT

The present study proposes a novel deep-learning model for remote sensing image enhancement that maintains image details while enhancing brightness in the feature extraction module. An improved hierarchical model named Global Spatial Attention Network (GSA-Net), based on U-Net, is proposed to improve enhancement performance. To circumvent the issue of insufficient sample data, gamma correction is applied to create low-light images, which are then used as training examples. A loss function is constructed using the Structural Similarity (SSIM) and Peak Signal-to-Noise Ratio (PSNR) indices. The GSA-Net network and loss function are utilized to restore images obtained via low-light remote sensing. This proposed method was tested on the Northwestern Polytechnical University Very-High-Resolution 10 (NWPU VHR-10) dataset, and its overall superiority was demonstrated in comparison with other state-of-the-art algorithms using various objective assessment indicators, such as PSNR, SSIM, and Learned Perceptual Image Patch Similarity (LPIPS). Furthermore, in high-level visual tasks such as object detection, this novel method provides better remote sensing images with distinct details and higher contrast than the competing methods.
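Two ingredients of this pipeline are easy to make concrete: synthesizing low-light training inputs with gamma correction, and scoring restoration with PSNR. The gamma value and image below are illustrative, and the "restoration" is the ideal inverse rather than a trained network.

```python
import numpy as np

# Low-light synthesis via gamma correction, scored with PSNR.
def psnr(ref, test, peak=1.0):
    mse = np.mean((ref - test) ** 2)
    return float(10 * np.log10(peak**2 / mse))

rng = np.random.default_rng(0)
img = rng.random((64, 64))            # "normal-light" image in [0, 1]
low = img ** 2.2                      # gamma > 1 darkens: synthetic low-light input
restored = low ** (1 / 2.2)           # ideal inverse recovers the original

print(psnr(img, low), psnr(img, restored))  # restoration PSNR is far higher
```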

15.
Front Neurorobot ; 17: 1267231, 2023.
Article in English | MEDLINE | ID: mdl-37885769

ABSTRACT

In light of advancing socio-economic development and urban infrastructure, urban traffic congestion and accidents have become pressing issues. High-resolution remote sensing images are crucial for supporting urban geographic information systems (GIS), road planning, and vehicle navigation. Additionally, the emergence of robotics presents new possibilities for traffic management and road safety. This study introduces an innovative approach that combines attention mechanisms and robotic multimodal information fusion for retrieving traffic scenes from remote sensing images. Attention mechanisms focus on specific road and traffic features, reducing computation and enhancing detail capture. Graph neural algorithms improve scene retrieval accuracy. To achieve efficient traffic scene retrieval, a robot equipped with advanced sensing technology autonomously navigates urban environments, capturing high-accuracy, wide-coverage images. This facilitates comprehensive traffic databases and real-time traffic information retrieval for precise traffic management. Extensive experiments on large-scale remote sensing datasets demonstrate the feasibility and effectiveness of this approach. The integration of attention mechanisms, graph neural algorithms, and robotic multimodal information fusion enhances traffic scene retrieval, promising improved information extraction accuracy for more effective traffic management, road safety, and intelligent transportation systems. In conclusion, this interdisciplinary approach, combining attention mechanisms, graph neural algorithms, and robotic technology, represents significant progress in traffic scene retrieval from remote sensing images, with potential applications in traffic management, road safety, and urban planning.

16.
Sensors (Basel) ; 23(20)2023 Oct 22.
Article in English | MEDLINE | ID: mdl-37896728

ABSTRACT

The lack of labeled training samples restricts the improvement of Hyperspectral Remote Sensing Image (HRSI) classification accuracy based on deep learning methods. In order to improve the HRSI classification accuracy when there are few training samples, a Lightweight 3D Dense Autoencoder Network (L3DDAN) is proposed. Structurally, the L3DDAN is designed as a stacked autoencoder which consists of an encoder and a decoder. The encoder is a hybrid combination of 3D convolutional operations and 3D dense block for extracting deep features from raw data. The decoder composed of 3D deconvolution operations is designed to reconstruct data. The L3DDAN is trained by unsupervised learning without labeled samples and supervised learning with a small number of labeled samples, successively. The network composed of the fine-tuned encoder and trained classifier is used for classification tasks. The extensive comparative experiments on three benchmark HRSI datasets demonstrate that the proposed framework with fewer trainable parameters can maintain superior performance to the other eight state-of-the-art algorithms when there are only a few training samples. The proposed L3DDAN can be applied to HRSI classification tasks, such as vegetation classification. Future work mainly focuses on training time reduction and applications on more real-world datasets.

17.
Sensors (Basel) ; 23(17)2023 Aug 28.
Article in English | MEDLINE | ID: mdl-37687940

ABSTRACT

The degradation of visual quality in remote sensing images caused by haze presents significant challenges in interpreting and extracting essential information. To effectively mitigate the impact of haze on image quality, we propose an unsupervised generative adversarial network specifically designed for remote sensing image dehazing. This network includes two generators with identical structures and two discriminators with identical structures. One generator is focused on image dehazing, while the other generates images with added haze. The two discriminators are responsible for distinguishing whether an image is real or generated. The generator, employing an encoder-decoder architecture, is designed based on the proposed multi-scale feature-extraction modules and attention modules. The proposed multi-scale feature-extraction module, comprising three distinct branches, aims to extract features with varying receptive fields. Each branch comprises dilated convolutions and attention modules. The proposed attention module includes both channel and spatial attention components. It guides the feature-extraction network to emphasize haze and texture within the remote sensing image. For enhanced generator performance, a multi-scale discriminator is also designed with three branches. Furthermore, an improved loss function is introduced by incorporating color-constancy loss into the conventional loss framework. In comparison to state-of-the-art methods, the proposed approach achieves the highest peak signal-to-noise ratio and structural similarity index metrics. These results convincingly demonstrate the superior performance of the proposed method in effectively removing haze from remote sensing images.

18.
Sensors (Basel) ; 23(17)2023 Aug 30.
Article in English | MEDLINE | ID: mdl-37687999

ABSTRACT

Remote sensing image denoising is of great significance for the subsequent use and research of images. Gaussian noise and salt-and-pepper noise are prevalent in images. Contemporary denoising algorithms often exhibit limitations when addressing such mixed noise, yielding suboptimal denoising outcomes and potentially blurring image edges in the process. To address these problems, a two-stage removal method for mixed noise in remote sensing images was proposed. In the first stage, dilated convolution was introduced into the DnCNN (denoising convolutional neural network) framework to increase the receptive field of the network, so that more feature information could be extracted from remote sensing images. Meanwhile, a DropoutLayer was introduced after the deep convolution layer to build the noise reduction model, preventing the network from overfitting and simplifying training; this model then performs preliminary noise reduction on the images. To further improve the image quality of the preliminary results, effectively remove the salt-and-pepper component of the mixed noise, and preserve more edge details and texture features, the method employs a second stage based on adaptive median filtering, in which the median value of the original filter window is replaced by the weighted median of the nearest-neighbor pixels; the preliminary result is thus subjected to secondary processing to obtain the final denoising result for the mixed noise. To verify the feasibility and effectiveness of the algorithm, remote sensing image denoising experiments and denoised-image edge detection experiments were carried out.
When the experimental results are analyzed through subjective visual assessment, images denoised using the proposed method exhibit clearer and more natural details, and they effectively retain edge and texture features. In terms of objective evaluation, the performance of different denoising algorithms is compared using metrics such as mean square error (MSE), peak signal-to-noise ratio (PSNR), and mean structural similarity index (MSSIM). The experimental outcomes indicate that the proposed method outperforms traditional denoising techniques on mixed noise in remote sensing images, achieving a clearer image restoration effect.
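The second-stage idea can be illustrated with a plain 3x3 median filter on salt-and-pepper impulses; this is a simplified stand-in, since the paper's version adapts the window size and replaces the plain window median with a nearest-neighbor weighted median.

```python
import numpy as np

# Plain 3x3 median filtering of salt-and-pepper noise (simplified stand-in).
def median3x3(img):
    padded = np.pad(img, 1, mode="edge")
    out = np.empty_like(img)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = np.median(padded[i:i + 3, j:j + 3])
    return out

rng = np.random.default_rng(0)
img = np.full((32, 32), 0.5)                 # flat synthetic "scene"
noisy = img.copy()
mask = rng.random(img.shape) < 0.1           # ~10% impulse corruption
noisy[mask] = rng.choice([0.0, 1.0], size=mask.sum())

denoised = median3x3(noisy)
bad_before = float(np.mean(np.abs(noisy - img) > 0.1))
bad_after = float(np.mean(np.abs(denoised - img) > 0.1))
print(bad_before, bad_after)  # the fraction of corrupted pixels drops sharply
```

Isolated impulses rarely dominate a 3x3 window, so the median discards them while leaving the flat region untouched; edges fare worse, which motivates the weighted variant.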

19.
Sensors (Basel) ; 23(17)2023 Sep 01.
Article in English | MEDLINE | ID: mdl-37688055

ABSTRACT

Due to the increasing capabilities of cybercriminals and the vast quantity of sensitive data, it is necessary to protect remote sensing images during data transmission with "Belt and Road" countries. Joint image compression and encryption techniques exhibit reliability and cost-effectiveness for data transmission. However, the existing methods for multiband remote sensing images have limitations, such as extensive preprocessing times, incompatibility with multiple bands, and insufficient security. To address the aforementioned issues, we propose a joint encryption and compression algorithm (JECA) for multiband remote sensing images, including a preprocessing encryption stage, crypto-compression stage, and decoding stage. In the first stage, multiple bands from an input image can be spliced together in order from left to right to generate a grayscale image, which is then scrambled at the block level by a chaotic system. In the second stage, we encrypt the DC coefficient and AC coefficient. In the final stage, we first decrypt the DC coefficient and AC coefficient, and then restore the out-of-order block through the chaotic system to get the correct grayscale image. Finally, we postprocess the grayscale image and reconstruct it into a remote sensing image. The experimental results show that JECA can reduce the preprocessing time of the sender by 50% compared to existing joint encryption and compression methods. It is also compatible with multiband remote sensing images. Furthermore, JECA improves security while maintaining the same compression ratio as existing methods, especially in terms of visual security and key sensitivity.
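The block-level scrambling in JECA's first stage can be sketched with a logistic-map-driven permutation: iterate the chaotic map from a secret seed, sort the trajectory to get a key-dependent block order, and invert it on decryption. The key (x0, r), block size, and image size below are illustrative, and the DCT-coefficient encryption of the later stages is omitted.

```python
import numpy as np

# Chaotic block scrambling: logistic-map trajectory -> block permutation.
def logistic_permutation(n, x0=0.3456, r=3.99):
    x, xs = x0, []
    for _ in range(n):
        x = r * x * (1 - x)         # logistic map iteration (chaotic for r near 4)
        xs.append(x)
    return np.argsort(xs)           # key-dependent permutation of block indices

def scramble(img, block=8, inverse=False):
    h, w = img.shape
    blocks = [img[i:i + block, j:j + block]
              for i in range(0, h, block) for j in range(0, w, block)]
    perm = logistic_permutation(len(blocks))
    order = np.argsort(perm) if inverse else perm   # argsort(perm) inverts perm
    shuffled = [blocks[k] for k in order]
    per_row = w // block
    rows = [np.hstack(shuffled[i:i + per_row])
            for i in range(0, len(shuffled), per_row)]
    return np.vstack(rows)

rng = np.random.default_rng(0)
img = rng.integers(0, 256, (32, 32), dtype=np.uint8)
enc = scramble(img)
dec = scramble(enc, inverse=True)
print(np.array_equal(dec, img))  # True: same key yields an exact inverse
```

Key sensitivity follows from the map's chaos: a tiny change in x0 or r yields an unrelated trajectory and hence an unrelated permutation.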

20.
PeerJ Comput Sci ; 9: e1488, 2023.
Article in English | MEDLINE | ID: mdl-37547419

ABSTRACT

Pan-sharpening is a fundamental and crucial task in the remote sensing image processing field, which generates a high-resolution multi-spectral image by fusing a low-resolution multi-spectral image and a high-resolution panchromatic image. Recently, deep learning techniques have shown competitive results in pan-sharpening. However, diverse features in the multi-spectral and panchromatic images are not fully extracted and exploited in existing deep learning methods, which leads to information loss in the pan-sharpening process. To solve this problem, a novel pan-sharpening method based on multi-resolution transformer and two-stage feature fusion is proposed in this article. Specifically, a transformer-based multi-resolution feature extractor is designed to extract diverse image features. Then, to fully exploit features with different content and characteristics, a two-stage feature fusion strategy is adopted. In the first stage, a multi-resolution fusion module is proposed to fuse multi-spectral and panchromatic features at each scale. In the second stage, a shallow-deep fusion module is proposed to fuse shallow and deep features for detail generation. Experiments over QuickBird and WorldView-3 datasets demonstrate that the proposed method outperforms current state-of-the-art approaches visually and quantitatively with fewer parameters. Moreover, the ablation study and feature map analysis also prove the effectiveness of the transformer-based multi-resolution feature extractor and the two-stage fusion scheme.
