1.
Article in English | MEDLINE | ID: mdl-38833391

ABSTRACT

Accurately distinguishing between background and anomalous objects within hyperspectral images poses a significant challenge. The primary obstacle lies in the inadequate modeling of prior knowledge, which creates a performance bottleneck in hyperspectral anomaly detection (HAD). In response to this challenge, we put forth a coupling paradigm that combines model-driven low-rank representation (LRR) methods with data-driven deep learning techniques by learning disentangled priors (LDP). LDP seeks to capture complete priors for effectively modeling the background, thereby extracting anomalies from hyperspectral images more accurately. LDP follows a model-driven deep unfolding architecture, in which the prior knowledge is separated into an explicit low-rank prior formulated from expert knowledge and implicit priors learned by deep networks. The internal relationships between the explicit and implicit priors within LDP are modeled through a skip residual connection. Furthermore, we provide a mathematical proof of the convergence of our proposed model. Our experiments, conducted on multiple widely recognized datasets, demonstrate that LDP surpasses most current advanced HAD techniques, excelling in both detection performance and generalization capability.
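To make the low-rank background idea concrete, here is a minimal sketch of a classical residual-based detector. It is not the LDP model described in the abstract: it has no learned priors and no deep unfolding, and simply fits a rank-`rank` background with a truncated SVD and scores each pixel by its residual norm. The function name `lowrank_anomaly_scores`, the default rank, and the synthetic cube are all assumptions made for illustration.

```python
import numpy as np

def lowrank_anomaly_scores(cube, rank=3):
    """Score each pixel by its residual from a rank-`rank` background model.

    cube: array of shape (H, W, B) with B spectral bands (hypothetical layout).
    Returns an (H, W) array; larger values suggest anomalies.
    """
    h, w, b = cube.shape
    X = cube.reshape(-1, b).astype(float)      # one row per pixel
    Xc = X - X.mean(axis=0)                    # remove the mean spectrum
    # The truncated SVD is the best rank-r approximation in Frobenius norm.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    background = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    residual = Xc - background                 # what the background cannot explain
    return np.linalg.norm(residual, axis=1).reshape(h, w)

# Synthetic demo: a rank-3 background plus small noise and one implanted pixel.
rng = np.random.default_rng(0)
h, w, b, r = 20, 20, 30, 3
basis = rng.normal(size=(r, b))
abundances = rng.normal(size=(h * w, r))
cube = (abundances @ basis).reshape(h, w, b) + 0.01 * rng.normal(size=(h, w, b))
cube[5, 7] += 5.0 * rng.normal(size=b)         # implanted anomaly
scores = lowrank_anomaly_scores(cube, rank=r)
peak = np.unravel_index(np.argmax(scores), scores.shape)
```

In this synthetic setting the implanted pixel should receive the highest score. Real scenes generally need a more robust background model, which is the gap the abstract says LDP targets.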

2.
Article in English | MEDLINE | ID: mdl-38381635

ABSTRACT

Multimodal image fusion involves tasks such as pan-sharpening and depth super-resolution. Both tasks aim to generate high-resolution target images by fusing complementary information from texture-rich guidance images and their low-resolution target counterparts, and both are inherently concerned with reconstructing high-frequency information. Despite this inherent connection to the frequency domain, most existing methods operate solely in the spatial domain and rarely explore solutions in the frequency domain. This study addresses this limitation by proposing solutions in both the spatial and frequency domains. We devise a Spatial-Frequency Information Integration Network, abbreviated as SFINet, for this purpose. SFINet includes a core module tailored for image fusion. This module consists of three key components: a spatial-domain information branch, a frequency-domain information branch, and a dual-domain interaction. The spatial-domain information branch employs invertible neural operators equipped with spatial convolutions to integrate local information from different modalities in the spatial domain. Meanwhile, the frequency-domain information branch adopts a modality-aware deep Fourier transformation to capture an image-wide receptive field for exploring global contextual information. In addition, the dual-domain interaction facilitates information flow and the learning of complementary representations. We further present an improved version of SFINet, SFINet++, which enhances the representation of spatial information by replacing the basic convolution unit in the original spatial-domain branch with the information-lossless invertible neural operator. We conduct extensive experiments to validate the effectiveness of the proposed networks and demonstrate their outstanding performance against state-of-the-art methods in two representative multimodal image fusion tasks: pan-sharpening and depth super-resolution.
The source code is publicly available at https://github.com/manman1995/Awaresome-pansharpening.
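The frequency-domain intuition can be illustrated with a hand-written, non-learned fusion rule. The sketch below is not SFINet and does not come from the linked repository. It keeps the low frequencies of an upsampled target image and takes the high frequencies from the guidance image, which shows how a pointwise choice in the Fourier domain affects every pixel at once. The function name `frequency_fuse` and the `cutoff` parameter (in cycles per pixel) are assumptions for illustration.

```python
import numpy as np

def frequency_fuse(target_up, guidance, cutoff=0.1):
    """Fuse two (H, W) images in the Fourier domain.

    Frequencies with radius <= cutoff come from `target_up` (coarse content);
    all higher frequencies come from `guidance` (fine texture).
    """
    if target_up.shape != guidance.shape:
        raise ValueError("inputs must have the same shape")
    h, w = target_up.shape
    fy = np.fft.fftfreq(h)[:, None]            # vertical frequencies
    fx = np.fft.fftfreq(w)[None, :]            # horizontal frequencies
    low = np.sqrt(fy ** 2 + fx ** 2) <= cutoff # radially symmetric low-pass mask
    T = np.fft.fft2(target_up)
    G = np.fft.fft2(guidance)
    fused = np.where(low, T, G)
    # The mask is symmetric, so the result is real up to round-off error.
    return np.fft.ifft2(fused).real

rng = np.random.default_rng(0)
guidance = rng.random((16, 16))                # stand-in for a texture-rich image
target_up = np.full((16, 16), 0.5)             # stand-in for a smooth upsampled target
fused = frequency_fuse(target_up, guidance)
```

Because the zero-frequency coefficient is taken from the target, the fused image keeps the target's mean intensity while inheriting the guidance image's fine detail. A learned model would replace the fixed mask with trainable frequency-domain operations.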

3.
IEEE Trans Pattern Anal Mach Intell ; 46(8): 5227-5244, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38568772

ABSTRACT

Foundation models have recently garnered significant attention due to their potential to revolutionize visual representation learning in a self-supervised manner. While most foundation models are tailored to process RGB images for various visual tasks, there is a noticeable gap in research focused on spectral data, which offers valuable information for scene understanding, especially in remote sensing (RS) applications. To fill this gap, we created for the first time a universal RS foundation model, named SpectralGPT, which is purpose-built to handle spectral RS images using a novel 3D generative pretrained transformer (GPT). Compared to existing foundation models, SpectralGPT 1) accommodates input images with varying sizes, resolutions, time series, and regions in a progressive training fashion, enabling full utilization of extensive RS Big Data; 2) leverages 3D token generation for spatial-spectral coupling; 3) captures spectrally sequential patterns via multi-target reconstruction; and 4) trains on one million spectral RS images, yielding models with over 600 million parameters. Our evaluation on four downstream tasks (single-label scene classification, multi-label scene classification, semantic segmentation, and change detection) shows significant performance improvements with pretrained SpectralGPT models, indicating substantial potential for advancing spectral RS Big Data applications within the field of geoscience.
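As a rough illustration of what 3D token generation could look like, the sketch below cuts a spectral cube into non-overlapping blocks that span both bands and pixels and flattens each block into one token vector. This is an assumption-laden sketch, not SpectralGPT's actual tokenizer: the function name `tokenize_3d`, the `(3, 8, 8)` patch size, and the 12-band 96x96 input are all hypothetical.

```python
import numpy as np

def tokenize_3d(cube, patch=(3, 8, 8)):
    """Split a (B, H, W) spectral cube into flattened 3D patches.

    Each token covers `patch[0]` adjacent bands and a `patch[1]` x `patch[2]`
    spatial window, so it mixes spectral and spatial context.
    Returns an array of shape (num_tokens, patch[0] * patch[1] * patch[2]).
    """
    b, h, w = cube.shape
    pb, ph, pw = patch
    if b % pb or h % ph or w % pw:
        raise ValueError("cube dimensions must be divisible by the patch size")
    x = cube.reshape(b // pb, pb, h // ph, ph, w // pw, pw)
    x = x.transpose(0, 2, 4, 1, 3, 5)          # block indices first, then in-block offsets
    return x.reshape(-1, pb * ph * pw)

cube = np.arange(12 * 96 * 96, dtype=np.float32).reshape(12, 96, 96)
tokens = tokenize_3d(cube)
print(tokens.shape)  # (576, 192)
```

A transformer would typically project each 192-value token to an embedding and, for masked pretraining, hide a subset of tokens to be reconstructed. Those steps are omitted here.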
