Results 1 - 13 of 13
1.
Article in English | MEDLINE | ID: mdl-38568772

ABSTRACT

The foundation model has recently garnered significant attention due to its potential to revolutionize the field of visual representation learning in a self-supervised manner. While most foundation models are tailored to effectively process RGB images for various visual tasks, there is a noticeable gap in research focused on spectral data, which offers valuable information for scene understanding, especially in remote sensing (RS) applications. To fill this gap, we created the first universal RS foundation model, named SpectralGPT, which is purpose-built to handle spectral RS images using a novel 3D generative pretrained transformer (GPT). Compared to existing foundation models, SpectralGPT 1) accommodates input images with varying sizes, resolutions, time series, and regions in a progressive training fashion, enabling full utilization of extensive RS Big Data; 2) leverages 3D token generation for spatial-spectral coupling; 3) captures spectrally sequential patterns via multi-target reconstruction; and 4) trains on one million spectral RS images, yielding models with over 600 million parameters. Our evaluation highlights significant performance improvements with pretrained SpectralGPT models, signifying substantial potential in advancing spectral RS Big Data applications within the field of geoscience across four downstream tasks: single-label scene classification, multi-label scene classification, semantic segmentation, and change detection.
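The 3D token generation behind SpectralGPT's spatial-spectral coupling can be sketched in a few lines: the spectral cube is cut into small spatial-spectral sub-cubes, each flattened into one token. This is a minimal illustration only; the patch sizes `ps` and `pb` and the toy cube shape are assumptions, not the paper's settings.

```python
import numpy as np

def to_3d_tokens(cube, ps=4, pb=2):
    """Split a (H, W, B) spectral cube into flattened 3D spatial-spectral
    tokens. Patch sizes ps (spatial) and pb (spectral) are illustrative."""
    H, W, B = cube.shape
    assert H % ps == 0 and W % ps == 0 and B % pb == 0
    t = cube.reshape(H // ps, ps, W // ps, ps, B // pb, pb)
    t = t.transpose(0, 2, 4, 1, 3, 5)       # gather the patch indices up front
    return t.reshape(-1, ps * ps * pb)      # (num_tokens, token_dim)

rng = np.random.default_rng(0)
cube = rng.random((8, 8, 4))                # toy 8x8 tile with 4 spectral bands
tokens = to_3d_tokens(cube)                 # 2*2*2 = 8 tokens of 4*4*2 = 32 values
```

Each token thus mixes a small spatial window with a short run of bands, which is what lets the transformer model spatial and spectral structure jointly.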

2.
IEEE Trans Image Process ; 32: 6047-6060, 2023.
Article in English | MEDLINE | ID: mdl-37917517

ABSTRACT

In recent years, advanced research has focused on the direct learning and analysis of remote-sensing images using natural language processing (NLP) techniques. The ability to accurately describe changes occurring in multi-temporal remote sensing images is becoming increasingly important for geospatial understanding and land planning. Unlike natural image change captioning tasks, remote sensing change captioning aims to capture the most significant changes, irrespective of various influential factors such as illumination, seasonal effects, and complex land covers. In this study, we highlight the significance of accurately describing changes in remote sensing images and present a comparison of the change captioning task for natural and synthetic images and remote sensing images. To address the challenge of generating accurate captions, we propose an attentive changes-to-captions network, called Chg2Cap for short, for bi-temporal remote sensing images. The network comprises three main components: 1) a Siamese CNN-based feature extractor to collect high-level representations for each image pair; 2) an attentive encoder that includes a hierarchical self-attention block to locate change-related features and a residual block to generate the image embedding; and 3) a transformer-based caption generator to decode the relationship between the image embedding and the word embedding into a description. The proposed Chg2Cap network is evaluated on two representative remote sensing datasets, and a comprehensive experimental analysis is provided. The code and pre-trained models will be available online at https://github.com/ShizhenChang/Chg2Cap.
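The pipeline above can be reduced to a toy sketch: a shared (Siamese) extractor maps both images to token sets, and a single-head self-attention pass stands in for the hierarchical self-attention block. The shapes, the identity query/key/value projections, and the `extract` stand-in are all assumptions for illustration, not the Chg2Cap architecture itself.

```python
import numpy as np

def self_attention(x):
    """Single-head scaled dot-product self-attention with identity projections,
    a stand-in for the paper's hierarchical self-attention block."""
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)      # rows are convex weights
    return w @ x

def extract(img):
    """Stand-in for the shared CNN backbone: flatten pixels to feature tokens."""
    return img.reshape(-1, img.shape[-1])

rng = np.random.default_rng(0)
img_t1 = rng.random((4, 4, 8))              # "before" image features (toy)
img_t2 = rng.random((4, 4, 8))              # "after" image features (toy)
# Siamese idea: the SAME extractor is applied to both images, and the joint
# token set is attended over so change-related features can interact.
pair = np.concatenate([extract(img_t1), extract(img_t2)], axis=0)   # (32, 8)
attended = self_attention(pair)
```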

3.
IEEE Trans Image Process ; 32: 5737-5750, 2023.
Article in English | MEDLINE | ID: mdl-37847620

ABSTRACT

The synthesis of high-resolution remote sensing images based on text descriptions has great potential in many practical application scenarios. Although deep neural networks have achieved great success in many important remote sensing tasks, generating realistic remote sensing images from text descriptions is still very difficult. To address this challenge, we propose a novel text-to-image modern Hopfield network (Txt2Img-MHN). The main idea of Txt2Img-MHN is to conduct hierarchical prototype learning on both text and image embeddings with modern Hopfield layers. Instead of directly learning concrete but highly diverse text-image joint feature representations for different semantics, Txt2Img-MHN aims to learn the most representative prototypes from text-image embeddings, achieving a coarse-to-fine learning strategy. These learned prototypes can then be utilized to represent more complex semantics in the text-to-image generation task. To better evaluate the realism and semantic consistency of the generated images, we further conduct zero-shot classification on real remote sensing data using the classification model trained on synthesized images. Despite its simplicity, we find that the overall accuracy in the zero-shot classification may serve as a good metric to evaluate the ability to generate an image from text. Extensive experiments on the benchmark remote sensing text-image dataset demonstrate that the proposed Txt2Img-MHN can generate more realistic remote sensing images than existing methods. Code and pre-trained models are available online (https://github.com/YonghaoXu/Txt2Img-MHN).
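The prototype retrieval at the heart of a modern Hopfield layer can be sketched as a single softmax-weighted update toward the stored patterns; with a large inverse temperature `beta` (an assumed value here), a query snaps to its nearest prototype, which is the behavior Txt2Img-MHN exploits for representative prototypes.

```python
import numpy as np

def hopfield_retrieve(queries, prototypes, beta=8.0):
    """One modern-Hopfield-layer update: each query is replaced by a
    softmax-weighted combination of the stored prototypes. beta is an
    assumed inverse temperature, not a value from the paper."""
    scores = beta * queries @ prototypes.T
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return w @ prototypes

P = np.eye(4)                            # 4 toy one-hot prototypes
q = np.array([[0.9, 0.1, 0.0, 0.0]])     # a query near prototype 0
out = hopfield_retrieve(q, P)            # retrieval snaps toward prototype 0
```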

4.
IEEE Trans Image Process ; 31: 7419-7434, 2022.
Article in English | MEDLINE | ID: mdl-36417727

ABSTRACT

Semantic segmentation methods based on deep neural networks have achieved great success in recent years. However, training such deep neural networks relies heavily on a large number of images with accurate pixel-level labels, which requires a huge amount of human effort, especially for large-scale remote sensing images. In this paper, we propose a point-based weakly supervised learning framework called the deep bilateral filtering network (DBFNet) for the semantic segmentation of remote sensing images. Compared with pixel-level labels, point annotations are usually sparse and cannot reveal the complete structure of the objects; they also lack boundary information, thus resulting in incomplete prediction within the object and the loss of object boundaries. To address these problems, we incorporate the bilateral filtering technique into deeply learned representations in two respects. First, since a target object contains smooth regions that always belong to the same category, we perform deep bilateral filtering (DBF) to filter the deep features by a nonlinear combination of nearby feature values, which encourages the nearby and similar features to become closer, thus achieving a consistent prediction in the smooth region. In addition, the DBF can distinguish the boundary by enlarging the distance between the features on different sides of the edge, thus preserving the boundary information well. Experimental results on two widely used datasets, the ISPRS 2-D semantic labeling Potsdam and Vaihingen datasets, demonstrate that our proposed DBFNet can achieve a highly competitive performance compared with state-of-the-art fully-supervised methods. Code is available at https://github.com/Luffy03/DBFNet.
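The deep bilateral filtering (DBF) idea, filtering deep features by a nonlinear combination of nearby feature values, can be sketched densely in NumPy: weights decay with both spatial distance and feature dissimilarity, so similar neighbors are pulled together while features across an edge stay apart. The Gaussian form and sigma values are assumptions for illustration; a practical implementation would restrict the neighborhood rather than weight all pixel pairs.

```python
import numpy as np

def deep_bilateral_filter(F, sigma_s=1.5, sigma_r=0.5):
    """Dense bilateral filtering of a (H, W, C) feature map. Each output
    feature is a normalized combination of all features, weighted by spatial
    closeness and feature similarity (illustrative sigmas)."""
    H, W, C = F.shape
    ys, xs = np.mgrid[0:H, 0:W]
    pos = np.stack([ys.ravel(), xs.ravel()], 1).astype(float)   # (N, 2)
    X = F.reshape(-1, C)                                        # (N, C)
    d_s = ((pos[:, None] - pos[None]) ** 2).sum(-1)             # spatial dist^2
    d_r = ((X[:, None] - X[None]) ** 2).sum(-1)                 # feature dist^2
    w = np.exp(-d_s / (2 * sigma_s ** 2) - d_r / (2 * sigma_r ** 2))
    w /= w.sum(1, keepdims=True)                                # normalize rows
    return (w @ X).reshape(H, W, C)

rng = np.random.default_rng(0)
feat = rng.random((4, 4, 3))            # toy feature map
smoothed = deep_bilateral_filter(feat)
```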

5.
Int J Appl Earth Obs Geoinf ; 110: 102804, 2022 Jun.
Article in English | MEDLINE | ID: mdl-36338308

ABSTRACT

Humans rely on clean water for their health, well-being, and various socio-economic activities. During the past few years, the COVID-19 pandemic has been a constant reminder of the importance of hygiene and sanitation for public health. The most common approach to securing clean water supplies for this purpose is via wastewater treatment. To date, an effective method of detecting wastewater treatment plants (WWTPs) accurately and automatically via remote sensing is unavailable. In this paper, we provide a solution to this task by proposing a novel joint deep learning (JDL) method that consists of a fine-tuned object detection network and a multi-task residual attention network (RAN). By leveraging OpenStreetMap (OSM) and multimodal remote sensing (RS) data, our JDL method is able to simultaneously tackle two different tasks: land use land cover (LULC) and WWTP classification. Moreover, JDL exploits the complementary effects between these tasks for a performance gain. We train JDL using 4,187 WWTP features and 4,200 LULC samples and validate the performance of the proposed method over a selected area around Stuttgart with 723 WWTP features and 1,200 LULC samples to generate an LULC classification map and a WWTP detection map. Extensive experiments conducted with different comparative methods demonstrate the effectiveness and efficiency of our JDL method for automatic WWTP detection in comparison with single-modality/single-task or traditional survey methods. Moreover, the lessons learned pave the way for future work to simultaneously and effectively address multiple large-scale mapping tasks (e.g., both mapping LULC and detecting WWTPs) from multimodal RS data via deep learning.
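The joint-task idea can be reduced to a toy objective: two heads on a shared representation, with a weighted sum of per-task cross-entropies so that each task regularizes the other. The weight `alpha` and the tiny class counts below are assumptions for illustration, not the paper's configuration.

```python
import numpy as np

def jdl_loss(p_lulc, y_lulc, p_wwtp, y_wwtp, alpha=0.5):
    """Sketch of a joint multi-task objective: a shared backbone feeds an LULC
    classification head and a WWTP detection head, and their cross-entropy
    losses are combined. alpha is an assumed task weighting."""
    eps = 1e-8
    ce = lambda p, y: -np.log(p[np.arange(len(y)), y] + eps).mean()
    return alpha * ce(p_lulc, y_lulc) + (1 - alpha) * ce(p_wwtp, y_wwtp)

p_lulc = np.array([[0.7, 0.2, 0.1]])   # toy LULC softmax output, 3 classes
p_wwtp = np.array([[0.9, 0.1]])        # toy WWTP / non-WWTP softmax output
loss = jdl_loss(p_lulc, np.array([0]), p_wwtp, np.array([0]))
```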

6.
Article in English | MEDLINE | ID: mdl-36083964

ABSTRACT

As the foundation of image interpretation, semantic segmentation is an active topic in the field of remote sensing. Facing the complex combinations of multiscale objects in remote sensing images (RSIs), the exploration and modeling of contextual information have become the key to accurately identifying objects at different scales. Although several methods have been proposed in the past decade, they suffer from insufficient context modeling of global or local information, which easily results in the fragmentation of large-scale objects, the neglect of small-scale objects, and blurred boundaries. To address these issues, we propose a contextual representation enhancement network (CRENet) to strengthen the global context (GC) and local context (LC) modeling in high-level features. The core components of the CRENet are the local feature alignment enhancement module (LFAEM) and the superpixel affinity loss (SAL). The LFAEM aligns and enhances the LC in low-level features by constructing contextual contrast through multilayer cascaded deformable convolution, and the result is then supplemented with high-level features to refine the segmentation map. The SAL assists the network in accurately capturing the GC by supervising the semantic information and relationships learned from superpixels. The proposed method is plug-and-play and can be embedded in any FCN-based network. Experiments on two popular RSI datasets demonstrate the effectiveness of our proposed network, with competitive performance in both qualitative and quantitative aspects.

7.
IEEE Trans Image Process ; 31: 5038-5051, 2022.
Article in English | MEDLINE | ID: mdl-35877807

ABSTRACT

Deep learning algorithms have achieved great success in the semantic segmentation of very high-resolution (VHR) remote sensing images. Nevertheless, training these models generally requires a large amount of accurate pixel-wise annotations, which are very laborious and time-consuming to collect. To reduce the annotation burden, this paper proposes a consistency-regularized region-growing network (CRGNet) to achieve semantic segmentation of VHR remote sensing images with point-level annotations. The key idea of CRGNet is to iteratively select unlabeled pixels with high confidence to expand the annotated area from the original sparse points. However, since there may exist errors and noise in the expanded annotations, directly learning from them may mislead the training of the network. To this end, we further propose a consistency regularization strategy, where a base classifier and an expanded classifier are employed. Specifically, the base classifier is supervised by the original sparse annotations, while the expanded classifier aims to learn from the expanded annotations generated by the base classifier with the region-growing mechanism. The consistency regularization is thereby achieved by minimizing the discrepancy between the predictions from both the base and the expanded classifiers. We find that such a simple regularization strategy is nevertheless very useful for controlling the quality of the region-growing mechanism. Extensive experiments on two benchmark datasets demonstrate that the proposed CRGNet significantly outperforms existing state-of-the-art methods. Codes and pre-trained models are available online (https://github.com/YonghaoXu/CRGNet).
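A minimal sketch of the training objective described above, assuming an L2 penalty for the consistency term and a weighting `lam` not taken from the paper: the base classifier is penalized on the sparse points, the expanded classifier on the region-grown labels, and the two predictions are pulled together.

```python
import numpy as np

def crg_loss(p_base, p_exp, y_sparse, y_grown, m_sparse, m_grown, lam=1.0):
    """CRGNet-style objective sketch: masked cross-entropy for the base
    classifier on sparse point labels, masked cross-entropy for the expanded
    classifier on region-grown labels, plus an L2 consistency term between
    the two predictions. lam is an assumed weighting."""
    eps = 1e-8
    def ce(p, y, m):
        # cross-entropy averaged over the pixels the mask marks as labeled
        return -(np.log(p[np.arange(len(y)), y] + eps) * m).sum() / max(m.sum(), 1)
    return (ce(p_base, y_sparse, m_sparse)
            + ce(p_exp, y_grown, m_grown)
            + lam * ((p_base - p_exp) ** 2).mean())

p = np.array([[0.9, 0.1]])          # softmax output for one pixel, 2 classes
y = np.array([0])                   # label index
m = np.array([1.0])                 # labeled-pixel mask
loss = crg_loss(p, p, y, y, m, m)   # identical predictions: no consistency cost
```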


Subject(s)
Algorithms, Semantics, Benchmarking
8.
Sensors (Basel) ; 22(9)2022 Apr 19.
Article in English | MEDLINE | ID: mdl-35590797

ABSTRACT

This work evaluates the performance of three machine learning (ML) techniques, namely logistic regression (LGR), linear regression (LR), and support vector machines (SVM), and two multi-criteria decision-making (MCDM) techniques, namely the analytical hierarchy process (AHP) and the technique for order of preference by similarity to ideal solution (TOPSIS), for mapping landslide susceptibility in the Chitral district, northern Pakistan. Moreover, we create landslide inventory maps from LANDSAT-8 satellite images through the change vector analysis (CVA) change detection method. The change detection yields more than 500 landslide spots. After manual post-processing corrections, the landslide inventory spots are randomly split into two sets with a 70/30 ratio for training and validating the performance of the ML techniques. Sixteen topographical, hydrological, and geological landslide-related factors of the study area are prepared as GIS layers. They are used to produce landslide susceptibility maps (LSMs) with weighted overlay techniques using different weights for the landslide-related factors. The accuracy assessment shows that the ML techniques outperform the MCDM methods, with SVM yielding the highest accuracy of 88% for the resulting LSM.
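The evaluation protocol, a 70/30 split followed by fitting one of the ML models, can be sketched on synthetic data with a hand-rolled logistic regression standing in for the LGR model; the data, learning rate, and iteration count are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 16))            # 16 landslide-related factors (toy data)
w_true = rng.normal(size=16)
y = (X @ w_true > 0).astype(float)        # synthetic landslide / no-landslide labels

idx = rng.permutation(len(X))
cut = int(0.7 * len(X))                   # the 70/30 train/validation split
tr, te = idx[:cut], idx[cut:]

# logistic regression fitted by gradient descent (stand-in for the LGR model)
w = np.zeros(16)
for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-X[tr] @ w))
    w -= 0.1 * X[tr].T @ (p - y[tr]) / len(tr)

# validation accuracy on the held-out 30%
acc = ((1.0 / (1.0 + np.exp(-X[te] @ w)) > 0.5) == (y[te] > 0.5)).mean()
```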


Subject(s)
Landslides, Geographic Information Systems, Logistic Models, Pakistan, Support Vector Machine
9.
Sci Rep ; 11(1): 14629, 2021 07 16.
Article in English | MEDLINE | ID: mdl-34272463

ABSTRACT

Earthquakes and heavy rainfall are the two leading causes of landslides around the world. Since landslides often occur across large areas, their detection requires rapid and reliable automatic approaches. Currently, deep learning (DL) approaches, especially various convolutional neural network and fully convolutional network (FCN) algorithms, are reliably achieving cutting-edge accuracies in automatic landslide detection. However, these successful applications of DL approaches have thus far been based on very high resolution satellite images (e.g., GeoEye and WorldView), making it easier to achieve such high detection performance. In this study, we use freely available Sentinel-2 data and the ALOS digital elevation model to investigate the application of two well-known FCN algorithms, namely U-Net and residual U-Net (the so-called ResU-Net), for landslide detection. To our knowledge, this is the first application of FCNs for landslide detection using only freely available data. We adapt the algorithms to the specific aim of landslide detection, then train and test them with data from three different case study areas located in Western Taitung County (Taiwan), Shuzheng Valley (China), and Eastern Iburi (Japan). We train the algorithms with sample patches of three different window sizes. Our results also contain a comprehensive transferability assessment achieved through different training and testing scenarios in the three case studies. The highest F1-score of 73.32% was obtained by ResU-Net, trained with a dataset from Japan and tested on China's holdout testing area, using a sample patch size of 64 × 64 pixels.
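Training FCNs on fixed-size windows as described can be sketched with a simple patch extractor; the tile size, 13-band shape, and non-overlapping stride below are assumptions, with 64 matching the patch size reported for the best result.

```python
import numpy as np

def extract_patches(img, size, stride=None):
    """Cut a (H, W, C) scene into square training patches of the given
    window size; stride defaults to non-overlapping patches."""
    stride = stride or size
    H, W, _ = img.shape
    return np.stack([img[i:i + size, j:j + size]
                     for i in range(0, H - size + 1, stride)
                     for j in range(0, W - size + 1, stride)])

scene = np.zeros((128, 128, 13))    # toy Sentinel-2-like tile, 13 bands
patches = extract_patches(scene, 64)
```

Comparing runs with `size` set to each candidate window (e.g. 32, 64, 128) is how the patch-size sensitivity above would be assessed.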

10.
Sensors (Basel) ; 20(13)2020 Jul 05.
Article in English | MEDLINE | ID: mdl-32635611

ABSTRACT

Geological objects are characterized by an inherently high complexity due to strong compositional variability at all scales and usually unclear class boundaries. Therefore, dedicated processing schemes are required for the analysis of such data for mineralogical mapping. At the same time, the variety of optical sensing technologies reveals different data attributes, and therefore multi-sensor approaches are adopted to solve such complicated mapping problems. In this paper, we devise an adapted multi-optical sensor fusion (MOSFus) workflow which takes these geological characteristics into account. The proposed processing chain exhaustively covers all relevant stages, including data acquisition, preprocessing, feature fusion, and mineralogical mapping. The concept includes (i) spatial feature extraction based on morphological profiles on RGB data with high spatial resolution, (ii) a specific noise reduction applied to the hyperspectral data that assumes mixed sparse and Gaussian contamination, and (iii) a subsequent dimensionality reduction using sparse and smooth low rank analysis. The feature extraction approach allows one to fuse heterogeneous data at variable resolutions, scales, and spectral ranges and improve classification substantially. The last step of the approach, an SVM classifier, is robust to unbalanced and sparse training sets and is particularly efficient with complex imaging data. We evaluate the performance of the procedure on two different multi-optical sensor datasets. The results demonstrate the superiority of this dedicated approach over common strategies.
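Step (i), morphological profiles on high-resolution imagery, builds on grayscale opening and closing; a minimal single-level sketch with a square structuring element (size and padding mode are illustrative assumptions) is:

```python
import numpy as np

def gray_open_close(img, k=3):
    """One level of a morphological profile: grayscale opening (erosion then
    dilation) and closing (dilation then erosion) with a k x k square
    structuring element, computed via padded sliding windows."""
    pad = k // 2
    def filt(a, op):
        ap = np.pad(a, pad, mode="edge")
        out = np.empty_like(a)
        for i in range(a.shape[0]):
            for j in range(a.shape[1]):
                out[i, j] = op(ap[i:i + k, j:j + k])
        return out
    ero = lambda a: filt(a, np.min)
    dil = lambda a: filt(a, np.max)
    return dil(ero(img)), ero(dil(img))     # opening, closing

img = np.zeros((8, 8))
img[3, 3] = 1.0                             # a single bright pixel
opening, closing = gray_open_close(img)     # opening removes it; closing keeps it
```

Stacking openings and closings at several structuring-element sizes yields the profile used as a spatial feature vector.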

11.
Sci Total Environ ; 701: 134474, 2020 Jan 20.
Article in English | MEDLINE | ID: mdl-31704408

ABSTRACT

Air pollution, and especially atmospheric particulate matter (PM), has a profound impact on human mortality and morbidity, the environment, and ecological systems. Accordingly, predicting air quality is highly relevant. Although the application of machine learning (ML) models for predicting air quality parameters, such as PM concentrations, has been evaluated in previous studies, studies on their spatial hazard modeling are very limited. Given the high potential of ML models, spatial modeling of PM can help managers identify pollution hotspots. Accordingly, this study aims at developing new ML models, such as Random Forest (RF), Bagged Classification and Regression Trees (Bagged CART), and Mixture Discriminant Analysis (MDA), for the hazard prediction of PM10 (particles with a diameter of less than 10 µm) in the Barcelona Province, Spain. According to the annual PM10 concentration at 75 stations, healthy and unhealthy locations are determined, and a 70/30 ratio (53/22 stations) is applied for calibrating and validating the ML models to predict the most hazardous areas for PM10. To identify the influential variables of PM modeling, the simulated annealing (SA) feature selection method is used. Seven of the thirteen features are selected as critical. According to the results, all three ML models achieve excellent performance (accuracy > 87% and precision > 86%). However, the Bagged CART and RF models perform equally well and better than the MDA model. The spatial hazard maps predicted by the three models indicate that the most hazardous areas are located in the center of Barcelona Province rather than in Barcelona's Metropolitan Area.

12.
Environ Res ; 179(Pt A): 108770, 2019 12.
Article in English | MEDLINE | ID: mdl-31577962

ABSTRACT

Earth fissures are cracks on the Earth's surface that mainly form in arid and semi-arid basins. The excessive withdrawal of groundwater, as well as of other underground natural resources, has been identified as a significant cause of land subsidence and, potentially, of earth fissuring. Fissuring is rapidly turning into a major disaster responsible for significant economic, social, and environmental damage with devastating consequences. Modeling the earth fissure hazard is particularly important for identifying vulnerable groundwater areas for informed water management and for effectively enforcing groundwater recharge policies toward sustainable conservation plans that preserve existing groundwater resources. Modeling the formation of earth fissures, and ultimately predicting the hazardous areas, has been greatly challenged by the complexity and the multidisciplinary factors involved in predicting earth fissures. This paper aims at proposing novel machine learning models for the prediction of earth fissuring hazards. The simulated annealing feature selection (SAFS) method was applied to identify key features, and the generalized linear model (GLM), multivariate adaptive regression splines (MARS), classification and regression tree (CART), random forest (RF), and support vector machine (SVM) models have been used for the first time to build the prediction models. Results indicated that all the models had good accuracy (>86%) and precision (>81%) in predicting the earth fissure hazard. The GLM model (as a linear model) had the lowest performance, while the RF model was the best. Sensitivity analysis indicated that the hazardous class in the study area was mainly related to low elevations characterized by high groundwater withdrawal, a drop in the groundwater level, high well density, high road density, low precipitation, and Quaternary sediment distribution.
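The SAFS step can be sketched generically: simulated annealing flips one feature in or out of the subset per iteration, accepting worse subsets with a probability that shrinks as the temperature cools. The score function, cooling schedule, and toy data below are all assumptions for illustration, not the paper's setup.

```python
import numpy as np

def sa_feature_select(X, y, score, n_iter=200, seed=0):
    """Simulated-annealing feature selection sketch: flip one feature per step,
    keep the move if the score improves or, otherwise, with a probability that
    decays as the temperature cools (illustrative schedule)."""
    rng = np.random.default_rng(seed)
    mask = rng.random(X.shape[1]) < 0.5
    if not mask.any():
        mask[0] = True
    cur = best = score(X[:, mask], y)
    best_mask = mask.copy()
    for t in range(1, n_iter + 1):
        cand = mask.copy()
        cand[rng.integers(X.shape[1])] ^= True     # flip one feature in/out
        if not cand.any():
            continue
        s = score(X[:, cand], y)
        if s > cur or rng.random() < np.exp((s - cur) * t):
            mask, cur = cand, s
            if s > best:
                best, best_mask = s, mask.copy()
    return best_mask

def r2_score_fn(Xs, y):
    """Toy subset score: R^2 of a least-squares fit, penalized per feature."""
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    resid = y - Xs @ beta
    return 1 - resid @ resid / (y @ y) - 0.01 * Xs.shape[1]

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 13))                     # 13 candidate factors
y = X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.normal(size=200)
selected = sa_feature_select(X, y, r2_score_fn)
```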


Subject(s)
Geological Phenomena, Groundwater, Proportional Hazards Models, Environmental Monitoring/methods, Machine Learning
13.
Sensors (Basel) ; 19(12)2019 Jun 21.
Article in English | MEDLINE | ID: mdl-31234309

ABSTRACT

Rapid, efficient, and reproducible drillcore logging is fundamental in mineral exploration. Drillcore mapping has evolved rapidly in the past decade, especially with the advances in hyperspectral imaging. A wide range of imaging sensors is now available, providing rapidly increasing spectral as well as spatial resolution and coverage. However, the fusion of data acquired with multiple sensors is challenging and usually not conducted operationally. We propose an innovative solution based on recent developments in machine learning to integrate such multi-sensor datasets. Image feature extraction using orthogonal total variation component analysis enables a strong reduction in the dimensionality and memory size of each input dataset, while maintaining the majority of its spatial and spectral information. This is particularly advantageous for sensors with very high spatial and/or spectral resolution, which are otherwise difficult to process jointly due to their large data memory requirements during classification. The extracted features are not bound to absorption features alone but recognize specific and relevant spatial or spectral patterns. We exemplify the workflow with data acquired with five commercially available hyperspectral sensors and a pair of RGB cameras. The robust and efficient spectral-spatial procedure is evaluated on a representative set of geological samples. We validate the process with independent and detailed mineralogical and spectral data. The suggested workflow provides a versatile solution for the integration of multi-source hyperspectral data in a diversity of geological applications. In this study, we show a straightforward integration of visible/near-infrared (VNIR), short-wave infrared (SWIR), and long-wave infrared (LWIR) data for sensors with highly different spatial and spectral resolution that greatly improves drillcore mapping.
