1.
Int J Appl Earth Obs Geoinf ; 110: 102804, 2022 Jun.
Article in English | MEDLINE | ID: mdl-36338308

ABSTRACT

Humans rely on clean water for their health, well-being, and various socio-economic activities. During the past few years, the COVID-19 pandemic has been a constant reminder of the importance of hygiene and sanitation for public health. The most common approach to securing clean water supplies for this purpose is via wastewater treatment. To date, an effective method of detecting wastewater treatment plants (WWTP) accurately and automatically via remote sensing has been unavailable. In this paper, we provide a solution to this task by proposing a novel joint deep learning (JDL) method that consists of a fine-tuned object detection network and a multi-task residual attention network (RAN). By leveraging OpenStreetMap (OSM) and multimodal remote sensing (RS) data, our JDL method is able to tackle two different tasks simultaneously: land use land cover (LULC) classification and WWTP detection. Moreover, JDL exploits the complementary effects between these tasks for a performance gain. We train JDL using 4,187 WWTP features and 4,200 LULC samples and validate the performance of the proposed method over a selected area around Stuttgart with 723 WWTP features and 1,200 LULC samples to generate an LULC classification map and a WWTP detection map. Extensive experiments conducted with different comparative methods demonstrate the effectiveness and efficiency of our JDL method for automatic WWTP detection in comparison with single-modality/single-task or traditional survey methods. Moreover, the lessons learned pave the way for future work that simultaneously and effectively addresses multiple large-scale mapping tasks (e.g., both mapping LULC and detecting WWTP) from multimodal RS data via deep learning.
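
To make the joint-learning idea concrete, the sketch below shows the generic pattern of a shared backbone feeding two task heads whose losses are summed; the layer sizes, channel counts, and head design are illustrative assumptions, not the authors' JDL architecture.

```python
# Minimal multi-task sketch: one shared backbone, two task heads
# (illustrative only; not the authors' JDL implementation).
import torch
import torch.nn as nn

class JointHeadsNet(nn.Module):
    def __init__(self, in_channels=6, n_lulc=10):
        super().__init__()
        # Shared convolutional backbone over multimodal RS patches.
        self.backbone = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.lulc_head = nn.Linear(64, n_lulc)   # LULC classification
        self.wwtp_head = nn.Linear(64, 2)        # WWTP vs. background

    def forward(self, x):
        z = self.backbone(x)
        return self.lulc_head(z), self.wwtp_head(z)

model = JointHeadsNet()
x = torch.randn(4, 6, 64, 64)                   # 4 multimodal patches
lulc_logits, wwtp_logits = model(x)
loss = nn.functional.cross_entropy(lulc_logits, torch.randint(0, 10, (4,))) \
     + nn.functional.cross_entropy(wwtp_logits, torch.randint(0, 2, (4,)))
loss.backward()                                 # joint gradient step
```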

2.
ISPRS J Photogramm Remote Sens ; 178: 68-80, 2021 Aug.
Article in English | MEDLINE | ID: mdl-34433999

ABSTRACT

As remote sensing (RS) data obtained from different sensors become largely and openly available, multimodal data processing and analysis techniques have been garnering increasing interest in the RS and geoscience community. However, due to the gap between different modalities in terms of imaging sensors, resolutions, and contents, embedding their complementary information into a consistent, compact, accurate, and discriminative representation remains, to a great extent, challenging. To this end, we propose a shared and specific feature learning (S2FL) model. S2FL is capable of decomposing multimodal RS data into modality-shared and modality-specific components, enabling more effective blending of information across modalities, particularly for heterogeneous data sources. Moreover, to better assess multimodal baselines and the newly-proposed S2FL model, three multimodal RS benchmark datasets, i.e., Houston2013 (hyperspectral and multispectral data), Berlin (hyperspectral and synthetic aperture radar (SAR) data), and Augsburg (hyperspectral, SAR, and digital surface model (DSM) data), are released and used for land cover classification. Extensive experiments conducted on the three datasets demonstrate the superiority and advancement of our S2FL model in the task of land cover classification in comparison with previously-proposed state-of-the-art baselines. Furthermore, the baseline codes and datasets used in this paper will be made freely available at https://github.com/danfenghong/ISPRS_S2FL.
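
The decomposition can be illustrated with a toy split of each modality's embedding into a shared half (pulled together across modalities) and a specific half (kept distinct); all dimensions and loss weights below are assumptions, not the released S2FL code.

```python
# Toy shared/specific decomposition for two modalities
# (an assumption-laden illustration, not the S2FL model).
import torch
import torch.nn as nn

enc_hsi = nn.Linear(144, 64)   # hyperspectral encoder (dims assumed)
enc_sar = nn.Linear(4, 64)     # SAR encoder

def split(z):
    # First half: modality-shared; second half: modality-specific.
    return z[:, :32], z[:, 32:]

x_hsi, x_sar = torch.randn(8, 144), torch.randn(8, 4)
sh_h, sp_h = split(enc_hsi(x_hsi))
sh_s, sp_s = split(enc_sar(x_sar))

align = (sh_h - sh_s).pow(2).mean()        # pull shared parts together
ortho = (sh_h * sp_h).sum(1).pow(2).mean() # keep specific parts distinct
loss = align + 0.1 * ortho
loss.backward()
```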

3.
ISPRS J Photogramm Remote Sens ; 167: 12-23, 2020 Sep.
Article in English | MEDLINE | ID: mdl-32904376

ABSTRACT

This paper addresses the problem of semi-supervised transfer learning with limited cross-modality data in remote sensing. A large amount of multi-modal earth observation imagery, such as multispectral imagery (MSI) or synthetic aperture radar (SAR) data, is openly available on a global scale, enabling the parsing of global urban scenes through remote sensing imagery. However, its ability to identify materials (pixel-wise classification) remains limited, due to noisy collection environments, poorly discriminative information, and the limited number of well-annotated training images. To this end, we propose a novel cross-modal deep-learning framework called X-ModalNet, with three well-designed modules: a self-adversarial module, an interactive learning module, and a label propagation module, which learns to transfer more discriminative information from a small-scale hyperspectral image (HSI) into a classification task using large-scale MSI or SAR data. Significantly, X-ModalNet generalizes well, owing to the propagation of labels on an updatable graph constructed from high-level features on top of the network, yielding semi-supervised cross-modality learning. We evaluate X-ModalNet on two multi-modal remote sensing datasets (HSI-MSI and HSI-SAR) and achieve a significant improvement over several state-of-the-art methods.
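
The label propagation component can be sketched generically: build a kNN affinity graph over features and iterate the classic update F ← αSF + (1-α)Y. X-ModalNet builds and updates its graph from high-level network features, which this standalone toy does not reproduce.

```python
# Generic graph label propagation on feature vectors (illustrative).
import numpy as np

def propagate(features, labels, mask, k=10, alpha=0.99, iters=50):
    d = ((features[:, None] - features[None]) ** 2).sum(-1)
    W = np.exp(-d / d.mean())                  # affinity graph
    idx = np.argsort(d, 1)[:, k + 1:]
    np.put_along_axis(W, idx, 0.0, axis=1)     # keep k nearest neighbors
    W = 0.5 * (W + W.T)
    S = W / W.sum(1, keepdims=True)            # row-normalized transition
    Y = np.zeros((len(labels), labels.max() + 1))
    Y[mask, labels[mask]] = 1.0                # one-hot seeds on labeled nodes
    F = Y.copy()
    for _ in range(iters):                     # F <- alpha*S*F + (1-alpha)*Y
        F = alpha * S @ F + (1 - alpha) * Y
    return F.argmax(1)

X = np.random.randn(100, 16)
y = np.random.randint(0, 3, 100)
labeled = np.zeros(100, bool); labeled[:10] = True
pred = propagate(X, y, labeled)
```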

4.
ISPRS J Photogramm Remote Sens ; 158: 35-49, 2019 Dec.
Article in English | MEDLINE | ID: mdl-31853165

ABSTRACT

Hyperspectral dimensionality reduction (HDR), an important preprocessing step prior to high-level data analysis, has been garnering growing attention in the remote sensing community. Although a variety of methods, both unsupervised and supervised, have been proposed for this task, the discriminative ability of the resulting feature representations remains limited due to the lack of a powerful tool that effectively exploits labeled and unlabeled data in the HDR process. A semi-supervised HDR approach, called iterative multitask regression (IMR), is proposed in this paper to address this need. IMR aims to learn a low-dimensional subspace by jointly considering the labeled and unlabeled data, and by bridging the learned subspace with two regression tasks: labels and pseudo-labels initialized by a given classifier. More significantly, IMR dynamically propagates the labels on a learnable graph and progressively refines the pseudo-labels, yielding a well-conditioned feedback system. Experiments conducted on three widely-used hyperspectral image datasets demonstrate that the dimension-reduced features learned by the proposed IMR framework are superior, in terms of classification or recognition accuracy, to those of related state-of-the-art HDR approaches.
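
A loose sketch of the label/pseudo-label feedback loop: fit a linear projection by ridge regression against the mixed targets, reclassify, and refresh the pseudo-labels. The closed-form step and all hyperparameters below are placeholders, not IMR's actual objective.

```python
# Toy loop in the spirit of iterative multitask regression.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))                  # 200 pixels, 50 bands
y = rng.integers(0, 4, 200)
labeled = np.zeros(200, bool); labeled[:40] = True

Y = np.eye(4)[y]                                 # one-hot targets
pseudo = Y.copy()                                # init pseudo-labels
for it in range(5):
    T = np.where(labeled[:, None], Y, pseudo)    # labels + pseudo-labels
    # Ridge regression for the projection W: X W ~= T.
    W = np.linalg.solve(X.T @ X + 1e-2 * np.eye(50), X.T @ T)
    scores = X @ W
    pseudo = np.eye(4)[scores.argmax(1)]         # refine pseudo-labels
Z = X @ W                                        # low-dimensional features
```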

5.
ISPRS J Photogramm Remote Sens ; 147: 193-205, 2019 Jan.
Article in English | MEDLINE | ID: mdl-30774220

ABSTRACT

In this paper, we aim to tackle a general but interesting cross-modality feature learning question in the remote sensing community: can a limited amount of highly-discriminative (e.g., hyperspectral) training data improve the performance of a classification task that uses a large amount of poorly-discriminative (e.g., multispectral) data? Traditional semi-supervised manifold alignment methods do not perform sufficiently well for such problems, since hyperspectral data, unlike multispectral data, are too expensive and time-consuming to collect at scale. To this end, we propose a novel semi-supervised cross-modality learning framework, called learnable manifold alignment (LeMA). LeMA learns a joint graph structure directly from the data instead of using a fixed graph defined by a Gaussian kernel function. With the learned graph, we can further capture the data distribution by graph-based label propagation, which enables finding a more accurate decision boundary. Additionally, an optimization strategy based on the alternating direction method of multipliers (ADMM) is designed to solve the proposed model. Extensive experiments on two hyperspectral-multispectral datasets demonstrate the superiority and effectiveness of the proposed method in comparison with several state-of-the-art methods.
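
For readers unfamiliar with ADMM, the generic iteration pattern is shown below on a simple lasso problem; LeMA's actual objective (joint graph learning plus projection) is more involved, so this is only an illustration of the solver machinery.

```python
# Generic ADMM iterations on a lasso problem (illustration only).
import numpy as np

def admm_lasso(A, b, lam=0.1, rho=1.0, iters=100):
    n = A.shape[1]
    x = z = u = np.zeros(n)
    L = np.linalg.cholesky(A.T @ A + rho * np.eye(n))
    for _ in range(iters):
        # x-update: quadratic subproblem via cached Cholesky factor.
        rhs = A.T @ b + rho * (z - u)
        x = np.linalg.solve(L.T, np.linalg.solve(L, rhs))
        # z-update: soft-thresholding (proximal step for the l1 norm).
        v = x + u
        z = np.sign(v) * np.maximum(np.abs(v) - lam / rho, 0.0)
        u += x - z                     # dual ascent on the constraint x = z
    return z

A = np.random.randn(80, 40)
b = A @ np.random.randn(40) + 0.01 * np.random.randn(80)
x_hat = admm_lasso(A, b)
```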

6.
Article in English | MEDLINE | ID: mdl-38833391

ABSTRACT

Accurately distinguishing between the background and anomalous objects within hyperspectral images poses a significant challenge. The primary obstacle lies in the inadequate modeling of prior knowledge, leading to a performance bottleneck in hyperspectral anomaly detection (HAD). In response to this challenge, we put forth a coupling paradigm that combines model-driven low-rank representation (LRR) methods with data-driven deep learning techniques by learning disentangled priors (LDP). LDP seeks to capture complete priors for effectively modeling the background, thereby extracting anomalies from hyperspectral images more accurately. LDP follows a model-driven deep unfolding architecture, where the prior knowledge is separated into an explicit low-rank prior formulated by expert knowledge and implicit learnable priors realized by deep networks. The internal relationships between the explicit and implicit priors within LDP are modeled through a skip residual connection. Furthermore, we provide a mathematical proof of the convergence of our proposed model. Our experiments, conducted on multiple widely recognized datasets, demonstrate that LDP surpasses most current advanced HAD techniques, excelling in both detection performance and generalization capability.
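
One unfolded stage of such a scheme might pair an explicit low-rank step (singular value thresholding) with a small learnable network on the residual, joined by a skip connection; the sketch below is a loose, assumption-laden reading of that idea, not the LDP model.

```python
# One unfolded stage: explicit low-rank prior + learnable residual prior.
import torch
import torch.nn as nn

def svt(M, tau):
    # Explicit prior: singular value thresholding toward low rank.
    U, s, Vh = torch.linalg.svd(M, full_matrices=False)
    return U @ torch.diag_embed(torch.clamp(s - tau, min=0)) @ Vh

class Stage(nn.Module):
    def __init__(self, bands=32):
        super().__init__()
        # Implicit prior: a small learnable network on the residual.
        self.net = nn.Sequential(nn.Linear(bands, 64), nn.ReLU(),
                                 nn.Linear(64, bands))

    def forward(self, Y):
        B = svt(Y, tau=0.1)              # model-driven background estimate
        R = self.net(Y - B)              # data-driven correction
        return B + R                     # skip residual connection

Y = torch.randn(1, 100, 32)              # 100 pixels x 32 bands
background = Stage()(Y)
anomaly_score = (Y - background).abs().sum(-1)
```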

7.
Article in English | MEDLINE | ID: mdl-38381635

ABSTRACT

Multimodal image fusion involves tasks such as pan-sharpening and depth super-resolution. Both tasks aim to generate high-resolution target images by fusing complementary information from texture-rich guidance images and their low-resolution target counterparts, and both inherently involve reconstructing high-frequency information. Despite this inherent frequency-domain connection, most existing methods operate solely in the spatial domain and rarely explore solutions in the frequency domain. This study addresses that limitation by proposing solutions in both the spatial and frequency domains. We devise a Spatial-Frequency Information Integration Network, abbreviated as SFINet, for this purpose. SFINet includes a core module tailored for image fusion, consisting of three key components: a spatial-domain information branch, a frequency-domain information branch, and a dual-domain interaction. The spatial-domain information branch employs spatial convolution-equipped invertible neural operators to integrate local information from different modalities in the spatial domain. Meanwhile, the frequency-domain information branch adopts a modality-aware deep Fourier transformation to capture an image-wide receptive field for exploring global contextual information. In addition, the dual-domain interaction facilitates information flow and the learning of complementary representations. We further present an improved version of SFINet, SFINet++, which enhances the representation of spatial information by replacing the basic convolution unit in the original spatial-domain branch with an information-lossless invertible neural operator. We conduct extensive experiments to validate the effectiveness of the proposed networks and demonstrate their outstanding performance against state-of-the-art methods in two representative multimodal image fusion tasks: pan-sharpening and depth super-resolution. The source code is publicly available at https://github.com/manman1995/Awaresome-pansharpening.
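
A minimal frequency-domain branch can be written as: FFT, a learnable mix of the real/imaginary channels, inverse FFT. The sketch below conveys why such a branch has an image-wide receptive field; it is not the released SFINet module.

```python
# Bare-bones frequency-domain branch (illustrative sketch).
import torch
import torch.nn as nn

class FourierBranch(nn.Module):
    def __init__(self, c=8):
        super().__init__()
        # 1x1 conv acts on real/imag parts stacked along channels.
        self.mix = nn.Conv2d(2 * c, 2 * c, kernel_size=1)

    def forward(self, x):                      # x: (B, C, H, W)
        F = torch.fft.rfft2(x, norm="ortho")   # global frequency content
        z = torch.cat([F.real, F.imag], dim=1)
        z = self.mix(z)
        re, im = torch.chunk(z, 2, dim=1)
        F = torch.complex(re, im)
        return torch.fft.irfft2(F, s=x.shape[-2:], norm="ortho")

x = torch.randn(2, 8, 64, 64)
y = FourierBranch()(x)                          # same spatial size as x
```

Because every output pixel depends on every frequency coefficient, even a 1x1 operation in the Fourier domain aggregates global context, in contrast to the local support of a spatial convolution.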

8.
IEEE Trans Pattern Anal Mach Intell ; 46(8): 5227-5244, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38568772

ABSTRACT

The foundation model has recently garnered significant attention due to its potential to revolutionize the field of visual representation learning in a self-supervised manner. While most foundation models are tailored to effectively process RGB images for various visual tasks, there is a noticeable gap in research focused on spectral data, which offers valuable information for scene understanding, especially in remote sensing (RS) applications. To fill this gap, we create, for the first time, a universal RS foundation model, named SpectralGPT, which is purpose-built to handle spectral RS images using a novel 3D generative pretrained transformer (GPT). Compared to existing foundation models, SpectralGPT 1) accommodates input images with varying sizes, resolutions, time series, and regions in a progressive training fashion, enabling full utilization of extensive RS Big Data; 2) leverages 3D token generation for spatial-spectral coupling; 3) captures spectrally sequential patterns via multi-target reconstruction; and 4) trains on one million spectral RS images, yielding models with over 600 million parameters. Our evaluation highlights significant performance improvements with pretrained SpectralGPT models, signifying substantial potential in advancing spectral RS Big Data applications within the field of geoscience across four downstream tasks: single/multi-label scene classification, semantic segmentation, and change detection.
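
The pretraining objective resembles masked reconstruction on spatial-spectral tokens: mask most tokens, encode the rest, and penalize reconstruction error on the masked ones. The tokenizer, sizes, and masking scheme below are stand-ins (this simplified variant zeroes masked tokens instead of dropping them), not SpectralGPT's actual pipeline.

```python
# Masked multi-target reconstruction in miniature (illustrative).
import torch
import torch.nn as nn

B, N, D = 2, 96, 128                  # batch, tokens, embed dim
tokens = torch.randn(B, N, D)         # assume 3D-patchified spectral cube
mask = torch.rand(B, N) < 0.75        # mask 75% of tokens

encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(D, nhead=8, batch_first=True), num_layers=2)
decoder = nn.Linear(D, D)             # toy reconstruction head

visible = tokens.masked_fill(mask.unsqueeze(-1), 0.0)
recon = decoder(encoder(visible))
loss = ((recon - tokens)[mask]).pow(2).mean()   # loss on masked tokens only
loss.backward()
```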

9.
IEEE Trans Cybern ; 53(1): 679-691, 2023 Jan.
Article in English | MEDLINE | ID: mdl-35609106

ABSTRACT

Recently, low-rank representation (LRR) methods have been widely applied to hyperspectral anomaly detection, due to their potential to separate backgrounds and anomalies. However, existing LRR models generally convert 3-D hyperspectral images (HSIs) into 2-D matrices, inevitably destroying the intrinsic 3-D structural properties of HSIs. To this end, we propose a novel tensor low-rank and sparse representation (TLRSR) method for hyperspectral anomaly detection. A 3-D TLR model is extended to separate the low-rank background part, represented by a tensorial background dictionary and corresponding coefficients; this representation characterizes the multiple-subspace property of the complex low-rank background. Based on the weighted tensor nuclear norm and the L_{F,1} sparse norm, a dictionary is designed so that its atoms are more relevant to the background. Moreover, principal component analysis (PCA) can be applied as a preprocessing step to extract a subset of HSI bands, retaining enough object information while reducing the computational time of the subsequent tensorial operations. The proposed model is efficiently solved by a well-designed alternating direction method of multipliers (ADMM). Experimental comparison with existing algorithms establishes the competitiveness of the proposed method with state-of-the-art competitors on the hyperspectral anomaly detection task.
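
The core decomposition idea, stripped of tensors and dictionaries, is a low-rank-plus-sparse split in which the sparse residual flags anomalies. The alternating loop below is a simplified matrix (RPCA-style) illustration, not the TLRSR solver.

```python
# Low-rank plus sparse separation as an anomaly detector (matrix toy).
import numpy as np

def lr_plus_sparse(Y, lam=0.05, tau=1.0, iters=30):
    L = np.zeros_like(Y); S = np.zeros_like(Y)
    for _ in range(iters):
        # Low-rank update: singular value thresholding of Y - S.
        U, sig, Vt = np.linalg.svd(Y - S, full_matrices=False)
        L = U @ np.diag(np.maximum(sig - tau, 0)) @ Vt
        # Sparse update: entrywise soft-thresholding of Y - L.
        R = Y - L
        S = np.sign(R) * np.maximum(np.abs(R) - lam, 0)
    return L, S

Y = np.random.randn(1000, 30)          # pixels x bands (e.g., PCA-reduced)
L, S = lr_plus_sparse(Y)
score = np.abs(S).sum(1)               # large residual energy = anomaly
```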

10.
Article in English | MEDLINE | ID: mdl-37379187

ABSTRACT

It is generally known that pan-sharpening is fundamentally a PAN-guided multispectral (MS) image super-resolution problem that involves learning the nonlinear mapping from low-resolution (LR) to high-resolution (HR) MS images. Since an infinite number of HR-MS images can be downsampled to produce the same LR-MS image, learning the mapping from LR-MS to HR-MS images is typically ill-posed, and the space of possible pan-sharpening functions can be extremely large, making it difficult to estimate the optimal mapping solution. To address this issue, we propose a closed-loop scheme that learns the two opposite mappings, namely pan-sharpening and its corresponding degradation process, simultaneously, to regularize the solution space in a single pipeline. More specifically, an invertible neural network (INN) is introduced to perform a bidirectional closed loop: the forward operation for LR-MS pan-sharpening and the backward operation for learning the corresponding HR-MS image degradation process. In addition, given the vital importance of high-frequency textures for pan-sharpened MS images, we further strengthen the INN by designing a dedicated multiscale high-frequency texture extraction module. Extensive experimental results demonstrate that the proposed algorithm performs favorably against state-of-the-art methods both qualitatively and quantitatively, with fewer parameters. Ablation studies also verify the effectiveness of the closed-loop mechanism in pan-sharpening. The source code is made publicly available at https://github.com/manman1995/pan-sharpening-Team-zhouman/.
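
Invertibility comes from coupling layers: the same weights implement the forward map and its exact inverse. Below is a generic affine coupling layer demonstrating this property; channel counts and the subnetwork are arbitrary choices, not the paper's architecture.

```python
# Generic affine coupling layer: one set of weights, two directions.
import torch
import torch.nn as nn

class Coupling(nn.Module):
    def __init__(self, c=8):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(c // 2, c, 3, padding=1),
                                 nn.ReLU(),
                                 nn.Conv2d(c, c, 3, padding=1))

    def forward(self, x):
        x1, x2 = torch.chunk(x, 2, dim=1)
        s, t = torch.chunk(self.net(x1), 2, dim=1)
        y2 = x2 * torch.exp(torch.tanh(s)) + t    # transform one half
        return torch.cat([x1, y2], dim=1)

    def inverse(self, y):
        y1, y2 = torch.chunk(y, 2, dim=1)
        s, t = torch.chunk(self.net(y1), 2, dim=1)
        x2 = (y2 - t) * torch.exp(-torch.tanh(s)) # exact inversion
        return torch.cat([y1, x2], dim=1)

layer = Coupling()
x = torch.randn(1, 8, 32, 32)
assert torch.allclose(layer.inverse(layer(x)), x, atol=1e-5)
```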

11.
Article in English | MEDLINE | ID: mdl-37015404

ABSTRACT

Learning-based infrared small object detection methods currently rely heavily on classification backbone networks, which tends to result in the loss of tiny objects and limited feature distinguishability as the network depth increases. Furthermore, small objects in infrared images frequently appear both bright and dim, posing severe demands on obtaining precise object contrast information. For this reason, we propose in this paper a simple and effective "U-Net in U-Net" framework, UIU-Net for short, to detect small objects in infrared images. As the name suggests, UIU-Net embeds a tiny U-Net into a larger U-Net backbone, enabling multi-level and multi-scale representation learning of objects. Moreover, UIU-Net can be trained from scratch, and the learned features can effectively enhance global and local contrast information. More specifically, the UIU-Net model is divided into two modules: the resolution-maintenance deep supervision (RM-DS) module and the interactive-cross attention (IC-A) module. RM-DS integrates Residual U-blocks into a deep supervision network to generate deep multi-scale resolution-maintenance features while learning global context information. Further, IC-A encodes the local context information between low-level details and high-level semantic features. Extensive experiments conducted on two infrared single-frame image datasets, i.e., the SIRST and Synthetic datasets, show the effectiveness and superiority of the proposed UIU-Net in comparison with several state-of-the-art infrared small object detection methods. The proposed UIU-Net also shows strong generalization performance on video-sequence infrared small object datasets, e.g., the ATR ground/air video sequence dataset. The code of this work is openly available at https://github.com/danfenghong/IEEE_TIP_UIU-Net.
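
The nesting idea can be shown in miniature: a tiny encoder-decoder block sits at the bottleneck of an outer U-Net, with skip connections at both levels. Depths, widths, and pooling choices below are toy assumptions, far smaller than UIU-Net.

```python
# A toy "U-Net inside a U-Net" (illustrative nesting pattern only).
import torch
import torch.nn as nn

def conv(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU())

class TinyUNet(nn.Module):
    def __init__(self, c):
        super().__init__()
        self.down, self.up = conv(c, c), conv(2 * c, c)

    def forward(self, x):
        d = self.down(nn.functional.max_pool2d(x, 2))
        u = nn.functional.interpolate(d, scale_factor=2)
        return self.up(torch.cat([x, u], dim=1))   # inner skip connection

class OuterUNet(nn.Module):
    def __init__(self, c=16):
        super().__init__()
        self.enc, self.dec = conv(1, c), conv(2 * c, 1)
        self.inner = TinyUNet(c)                   # U-Net in the bottleneck

    def forward(self, x):
        e = self.enc(x)
        b = self.inner(e)
        return self.dec(torch.cat([e, b], dim=1))  # outer skip connection

mask = OuterUNet()(torch.randn(1, 1, 64, 64))      # small-object saliency map
```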

12.
IEEE Trans Image Process ; 31: 5079-5092, 2022.
Article in English | MEDLINE | ID: mdl-35881603

ABSTRACT

Recently, embedding- and metric-based few-shot learning (FSL) has been introduced into hyperspectral image classification (HSIC) and achieved impressive progress. To further enhance performance with few labeled samples, we propose in this paper a novel FSL framework for HSIC with a class-covariance metric (CMFSL). Overall, CMFSL learns global class representations for each training episode by interactively using training samples from the base and novel classes, and a synthesis strategy is employed on the novel classes to avoid overfitting. During meta-training and meta-testing, class labels are determined directly by the Mahalanobis distance measurement rather than an extra classifier. Benefiting from task-adapted class-covariance estimations, CMFSL can construct more flexible decision boundaries than the commonly used Euclidean metric. Additionally, a lightweight cross-scale convolutional network (LXConvNet) consisting of 3D and 2D convolutions is designed to thoroughly exploit spectral-spatial information at the high-frequency and low-frequency scales with low computational complexity. Furthermore, we devise a spectral-prior-based refinement module (SPRM) for the initial stage of feature extraction, which not only forces the network to emphasize the most informative bands while suppressing useless ones, but also alleviates the effects of domain shift between the base and novel categories to learn a collaborative embedding mapping. Extensive experimental results on four benchmark data sets demonstrate that the proposed CMFSL outperforms state-of-the-art methods with few annotated samples.
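
The metric itself is easy to state in code: classify a query by its Mahalanobis distance to class means under a regularized covariance estimate. The sketch below uses a single pooled covariance for simplicity; CMFSL estimates task-adapted class covariances episodically, which this plain version does not attempt.

```python
# Mahalanobis-distance classification against class means (sketch).
import numpy as np

def mahalanobis_classify(X_train, y_train, X_test, eps=1e-3):
    classes = np.unique(y_train)
    means = np.stack([X_train[y_train == c].mean(0) for c in classes])
    # Pooled covariance, regularized for the few-shot regime.
    cov = np.cov(X_train.T) + eps * np.eye(X_train.shape[1])
    P = np.linalg.inv(cov)
    d = np.stack([np.einsum('ni,ij,nj->n', X_test - m, P, X_test - m)
                  for m in means], axis=1)
    return classes[d.argmin(1)]

Xtr = np.random.randn(25, 8)                 # 5-way, 5-shot support set
ytr = np.repeat(np.arange(5), 5)
pred = mahalanobis_classify(Xtr, ytr, np.random.randn(10, 8))
```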

13.
IEEE Trans Neural Netw Learn Syst ; 33(11): 6518-6531, 2022 Nov.
Article in English | MEDLINE | ID: mdl-34048352

ABSTRACT

Over the past decades, enormous efforts have been made to improve the performance of linear and nonlinear mixing models for hyperspectral unmixing (HU), yet their ability to simultaneously generalize across various spectral variabilities (SVs) and extract physically meaningful endmembers remains limited, due to their weakness in data fitting and reconstruction and their sensitivity to various SVs. Inspired by the powerful learning ability of deep learning (DL), we develop a general DL approach for HU that fully considers the properties of endmembers extracted from the hyperspectral imagery, called the endmember-guided unmixing network (EGU-Net). Beyond a standalone autoencoder-like architecture, EGU-Net is a two-stream Siamese deep network that learns an additional network from pure or nearly pure endmembers to correct the weights of the other unmixing network, by sharing network parameters and adding spectrally meaningful constraints (e.g., nonnegativity and sum-to-one) toward a more accurate and interpretable unmixing solution. Furthermore, the resulting general framework is not limited to pixelwise spectral unmixing but is also applicable to spatial information modeling with convolutional operators for spatial-spectral unmixing. Experimental results on three different datasets with ground-truth abundance maps for each material demonstrate the effectiveness and superiority of EGU-Net over state-of-the-art unmixing algorithms. The codes will be available at https://github.com/danfenghong/IEEE_TNNLS_EGU-Net.
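
The two physical constraints the abstract mentions translate directly into code: a softmax keeps abundances nonnegative and sum-to-one, and nonnegative decoder weights act as endmember spectra. The sketch below is a generic single-stream unmixing autoencoder, not the two-stream EGU-Net.

```python
# Minimal unmixing autoencoder with physical constraints (sketch).
import torch
import torch.nn as nn

bands, n_end = 100, 5
encoder = nn.Sequential(nn.Linear(bands, 32), nn.ReLU(),
                        nn.Linear(32, n_end))
endmembers = nn.Parameter(torch.rand(n_end, bands))   # decoder "dictionary"

x = torch.rand(64, bands)                              # 64 mixed pixels
abund = torch.softmax(encoder(x), dim=1)               # >=0, sums to one
recon = abund @ endmembers.clamp(min=0)                # nonnegative spectra
loss = (recon - x).pow(2).mean()
loss.backward()
```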

14.
IEEE Trans Cybern ; 51(7): 3602-3615, 2021 Jul.
Article in English | MEDLINE | ID: mdl-33175688

ABSTRACT

Conventional nonlinear subspace learning techniques (e.g., manifold learning) usually suffer drawbacks in explainability (explicit mapping), cost effectiveness (linearization), generalization capability (out-of-sample), and representability (spatial-spectral discrimination). To overcome these shortcomings, a novel linearized subspace analysis technique with spatial-spectral manifold alignment is developed for semisupervised hyperspectral dimensionality reduction (HDR), called joint and progressive subspace analysis (JPSA). JPSA learns a high-level, semantically meaningful, joint spatial-spectral feature representation from hyperspectral (HS) data by: 1) jointly learning latent subspaces and a linear classifier to find an effective projection direction favorable for classification; 2) progressively searching several intermediate states of subspaces to approach an optimal mapping from the original space to a potentially more discriminative subspace; and 3) spatially and spectrally aligning a manifold structure in each learned latent subspace in order to preserve the same or similar topological properties between the compressed data and the original data. A simple but effective classifier, the nearest neighbor (NN), is used to validate the performance of different HDR approaches. Extensive experiments demonstrate the superiority and effectiveness of the proposed JPSA on two widely used HS datasets: 1) Indian Pines (92.98%) and 2) the University of Houston (86.09%), in comparison with previous state-of-the-art HDR methods. The demo of the basic version of this work (i.e., ECCV2018) is openly available at https://github.com/danfenghong/ECCV2018_J-Play.
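
The NN-based evaluation protocol is simple enough to sketch: project both splits into the learned subspace and label each test sample by its nearest training neighbor. The random projection P below is a stand-in for the learned JPSA subspace.

```python
# Evaluating a dimensionality-reduction output with a 1-NN classifier.
import numpy as np

rng = np.random.default_rng(1)
X_train, X_test = rng.normal(size=(300, 200)), rng.normal(size=(100, 200))
y_train, y_test = rng.integers(0, 6, 300), rng.integers(0, 6, 100)

P = rng.normal(size=(200, 20))          # stand-in learned projection
Ztr, Zte = X_train @ P, X_test @ P      # compress to the subspace

d = ((Zte[:, None] - Ztr[None]) ** 2).sum(-1)
pred = y_train[d.argmin(1)]             # nearest-neighbor label
acc = (pred == y_test).mean()
```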

15.
IEEE Trans Neural Netw Learn Syst ; 32(2): 826-840, 2021 Feb.
Article in English | MEDLINE | ID: mdl-32275618

ABSTRACT

Interferometric phase restoration has been investigated for decades, and most state-of-the-art methods have achieved promising performance for InSAR phase restoration. These methods generally follow a nonlocal filtering processing chain, aiming to circumvent the staircase effect and preserve the details of phase variations. In this article, we propose an alternative approach to InSAR phase restoration, namely complex convolutional sparse coding (ComCSC) and its gradient-regularized version. To the best of the authors' knowledge, this is the first time the InSAR phase restoration problem has been solved in a deconvolutional fashion. The proposed methods not only suppress interferometric phase noise, but also avoid the staircase effect and preserve details. Furthermore, they provide insight into the elementary phase components of the interferometric phases. Experimental results on synthetic and realistic high- and medium-resolution datasets from TerraSAR-X StripMap and Sentinel-1 interferometric wide swath mode, respectively, show that our method outperforms previous state-of-the-art methods based on nonlocal InSAR filters, particularly the state-of-the-art InSAR-BM3D. The source code of this article will be made publicly available for reproducible research inside the community.
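
Convolutional sparse coding itself can be sketched with a plain ISTA loop: represent an image as a sum of dictionary filters convolved with sparse coefficient maps. The real-valued toy below only conveys the deconvolutional formulation; ComCSC works with complex-valued data and adds gradient regularization, and the step size here is an untuned guess.

```python
# Convolutional sparse coding via ISTA (real-valued toy, not ComCSC).
import numpy as np
from scipy.signal import fftconvolve

rng = np.random.default_rng(3)
s = rng.normal(size=(64, 64))                        # observed image
D = rng.normal(size=(4, 9, 9))
D /= np.linalg.norm(D, axis=(1, 2), keepdims=True)   # unit-norm filters
X = np.zeros((4, 64, 64))                            # sparse coefficient maps
lam, step = 0.05, 0.01

for _ in range(50):
    recon = sum(fftconvolve(X[k], D[k], mode="same") for k in range(4))
    r = recon - s
    for k in range(4):
        # Gradient w.r.t. X[k]: residual correlated with filter k.
        g = fftconvolve(r, D[k][::-1, ::-1], mode="same")
        Xk = X[k] - step * g
        X[k] = np.sign(Xk) * np.maximum(np.abs(Xk) - step * lam, 0)  # prox l1
```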

16.
Article in English | MEDLINE | ID: mdl-30418901

ABSTRACT

Hyperspectral imagery collected from airborne or satellite sources inevitably suffers from spectral variability, making it difficult for spectral unmixing to accurately estimate abundance maps. The classical unmixing model, the linear mixing model (LMM), generally fails to handle this thorny issue effectively. To this end, we propose a novel spectral mixture model, called the augmented linear mixing model (ALMM), to address spectral variability by applying a data-driven learning strategy to the inverse problem of hyperspectral unmixing. The proposed approach models the main spectral variability (i.e., scaling factors), generated by variations in illumination or topography, separately by means of the endmember dictionary. It then models other spectral variabilities, caused by environmental conditions (e.g., local temperature and humidity, atmospheric effects) and instrumental configurations (e.g., sensor noise), as well as material nonlinear mixing effects, by introducing a spectral variability dictionary. To effectively run the data-driven learning strategy, we also propose a reasonable prior for the spectral variability dictionary, whose atoms are assumed to have low coherence with the spectral signatures of the endmembers, leading to a well-known low-coherence dictionary learning problem. A dictionary learning technique is thus embedded in the spectral unmixing framework so that the algorithm can learn the spectral variability dictionary and estimate the abundance maps simultaneously. Extensive experiments on synthetic and real datasets demonstrate the superiority and effectiveness of the proposed method in comparison with previous state-of-the-art methods.
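
The ALMM forward model is compact enough to simulate directly: a pixel is endmembers times scaled abundances, plus a variability dictionary times its coefficients, plus noise. All dimensions and distributions below are illustrative assumptions.

```python
# Forward simulation of the augmented linear mixing model (sketch).
import numpy as np

rng = np.random.default_rng(2)
bands, n_end, n_var, pixels = 120, 4, 6, 500

E = np.abs(rng.normal(size=(bands, n_end)))    # endmember signatures
D = rng.normal(size=(bands, n_var)) * 0.05     # spectral-variability atoms
A = rng.dirichlet(np.ones(n_end), pixels).T    # abundances (sum to one)
s = rng.uniform(0.8, 1.2, pixels)              # per-pixel scaling factors
B = rng.normal(size=(n_var, pixels)) * 0.1     # variability coefficients

# Observed pixels: scaled mixing + variability term + sensor noise.
Y = E @ (A * s) + D @ B + 0.01 * rng.normal(size=(bands, pixels))
```

Inverting this model (estimating A, B, and D from Y) is the dictionary-learning problem the paper solves; the low-coherence prior keeps the atoms of D from absorbing the endmember signatures in E.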

18.
PLoS One ; 9(7): e101866, 2014.
Article in English | MEDLINE | ID: mdl-24992328

ABSTRACT

As palmprints are captured using non-contact devices, image blur is inevitably introduced by defocus, which degrades the recognition performance of the system. To solve this problem, we propose a stable-feature extraction method based on a Vese-Osher (VO) decomposition model to recognize blurred palmprints effectively. A Gaussian defocus degradation model is first established to simulate image blur. By analyzing the blur theoretically, stable features are found to exist in the image under different degrees of blurring. Then, a VO decomposition model is used to obtain the structure and texture layers of the blurred palmprint images. The structure layer is stable under different degrees of blurring (a theoretical conclusion that is further verified experimentally). Next, an algorithm based on the weighted robustness histogram of oriented gradients (WRHOG) is designed to extract the stable features from the structure layer of the blurred palmprint image. Finally, a normalized correlation coefficient is introduced to measure the similarity between palmprint features. We also designed and performed a series of experiments to show the benefits of the proposed method. The experimental results confirm the theoretical conclusion that the structure layer is stable across different blurring scales, and WRHOG proves to be a robust way of distinguishing blurred palmprints. The recognition results obtained using the proposed method on two palmprint databases (PolyU and Blurred-PolyU) are stable and superior to previous high-performance methods (the equal error rate is only 0.132%). In addition, the authentication time is less than 1.3 s, fast enough to meet real-time demands. Therefore, the proposed method is a feasible way of implementing blurred palmprint recognition.
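
The degradation model and the structure/texture intuition can be sketched in a few lines: simulate defocus with Gaussian blur at several scales and split each result into a smooth layer and a residual. Here a crude Gaussian low-pass stands in for the VO decomposition, and the sigma values are illustrative.

```python
# Simulating defocus blur and a rough structure/texture split (sketch).
import numpy as np
from scipy.ndimage import gaussian_filter

palm = np.random.rand(128, 128)                    # stand-in palmprint image
for sigma in (1.0, 2.0, 4.0):                      # increasing defocus levels
    blurred = gaussian_filter(palm, sigma=sigma)   # Gaussian degradation model
    structure = gaussian_filter(blurred, sigma=8)  # crude "structure layer"
    texture = blurred - structure                  # residual "texture layer"
    # The structure layer should vary little with sigma; the texture
    # layer absorbs most of the blur-induced change.
    print(sigma, structure.std(), texture.std())
```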


Subject(s)
Dermatoglyphics , Image Interpretation, Computer-Assisted/methods , Algorithms , Biometric Identification/methods , Databases, Factual , Humans