ABSTRACT
The article presents results of using remote sensing images and machine learning to map and assess land potential based on time-series of potential Fraction of Absorbed Photosynthetically Active Radiation (FAPAR) composites. Land potential here refers to the potential vegetation productivity in the hypothetical absence of short-term anthropogenic influence, such as intensive agriculture and urbanization. Knowledge of this ecological land potential could support the assessment of levels of land degradation as well as restoration potentials. Monthly aggregated FAPAR time-series of three percentiles (0.05, 0.50 and 0.95 probability) at 250 m spatial resolution were derived from the 8-day GLASS FAPAR V6 product for 2000-2021 and used to determine long-term trends in FAPAR, as well as to model potential FAPAR in the absence of human pressure. Ca. 3 million training points sampled from 12,500 locations across the globe were overlaid with 68 bio-physical variables representing climate, terrain, landform, and vegetation cover, as well as several variables representing human pressure, including population count, cropland intensity, nightlights and a human footprint index. The training points were used in an ensemble machine learning model that stacks three base learners (extremely randomized trees, gradient descended trees and artificial neural network) using a linear regressor as meta-learner. The potential FAPAR was then projected by removing the impact of urbanization and intensive agriculture in the covariate layers. The results of strict cross-validation show that the global distribution of FAPAR can be explained with an R² of 0.89, with the most important covariates being growing season length, forest cover indicator and annual precipitation. From this model, a global map of potential monthly FAPAR for the most recent year (2021) was produced and used to predict gaps in actual vs. potential FAPAR. The produced global maps of actual vs. potential FAPAR and long-term trends were each spatially matched with stable and transitional land cover classes. The assessment showed large negative FAPAR gaps (actual lower than potential) for the classes urban, needle-leaved deciduous trees, and flooded shrub or herbaceous cover, while strong negative FAPAR trends were found for the classes urban, sparse vegetation and rainfed cropland. On the other hand, the classes irrigated or post-flooded cropland, tree cover mixed leaf type, and broad-leaved deciduous showed largely positive trends. The framework allows land managers to assess potential land degradation from two aspects: as an actual declining trend in observed FAPAR and as a difference between actual and potential vegetation FAPAR.
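The two degradation indicators named in this abstract (a declining trend in observed FAPAR and a gap between actual and potential FAPAR) can be illustrated with a minimal sketch for a single pixel. The abstract does not state which trend estimator was used, so the ordinary least-squares slope below is an assumption, and all input values are hypothetical.

```python
from statistics import mean


def fapar_gap(actual, potential):
    """Gap = actual - potential; negative values mean actual is below potential."""
    return [a - p for a, p in zip(actual, potential)]


def linear_trend(values):
    """Ordinary least-squares slope of the values against their time index.

    Assumption: OLS is used here only for illustration; the abstract does not
    name the trend estimator.
    """
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = mean(values)
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


# Hypothetical annual median FAPAR for one pixel (illustrative values only).
actual = [0.50, 0.48, 0.46, 0.44, 0.42]
potential = [0.60, 0.60, 0.60, 0.60, 0.60]

gaps = fapar_gap(actual, potential)  # negative everywhere: actual below potential
slope = linear_trend(actual)         # negative: declining observed FAPAR
```

A pixel with both a negative gap and a negative slope would be flagged by both indicators; the two can also disagree, which is why the abstract treats them as separate aspects.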
Subjects
Climate, Forests, Humans, Agriculture, Seasons
ABSTRACT
The global potential distribution of biomes (natural vegetation) was modelled using 8,959 training points from the BIOME 6000 dataset and a stack of 72 environmental covariates representing terrain and the current climatic conditions based on historical long-term averages (1979-2013). An ensemble machine learning model based on stacked regularization was used, with multinomial logistic regression as the meta-learner and spatial blocking (100 km) to deal with spatial autocorrelation of the training points. Results of spatial cross-validation for the BIOME 6000 classes show an overall accuracy of 0.67 and R² logloss of 0.61, with "tropical evergreen broadleaf forest" being the class with the highest gain in predictive performance (R² logloss = 0.74) and "prostrate dwarf shrub tundra" the class with the lowest (R² logloss = -0.09) compared to the baseline. Temperature-related covariates were the most important predictors, with the mean diurnal range (BIO2) being shared by all the base learners (i.e., random forest, gradient boosted trees and generalized linear models). The model was next used to predict the distribution of future biomes for the periods 2040-2060 and 2061-2080 under three climate change scenarios (RCP 2.6, 4.5 and 8.5). Comparisons of predictions for the three epochs (present, 2040-2060 and 2061-2080) show that increasing aridity and higher temperatures will likely result in significant shifts in natural vegetation in the tropical area (shifts from tropical forests to savannas up to 1.7 × 10⁵ km² by 2080) and around the Arctic Circle (shifts from tundra to boreal forests up to 2.4 × 10⁵ km² by 2080). Projected global maps at 1 km spatial resolution are provided as probability and hard-class maps for BIOME 6000 classes and as hard-class maps for the IUCN classes (six aggregated classes). Uncertainty maps (prediction error) are also provided and should be used for careful interpretation of the future projections.
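The abstract reports R² logloss "compared to the baseline", including a negative value for one class. A minimal sketch of one common definition follows: one minus the ratio of the model's log loss to the log loss of a baseline that always predicts the observed class frequencies. That this matches the authors' exact formula is an assumption, and the labels and probabilities are hypothetical.

```python
import math


def log_loss(y_true, probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    `probs` is a list of dicts mapping class label -> predicted probability.
    """
    total = 0.0
    for label, p in zip(y_true, probs):
        p_true = min(max(p.get(label, 0.0), eps), 1.0)
        total -= math.log(p_true)
    return total / len(y_true)


def r2_logloss(y_true, probs):
    """1 - logloss(model) / logloss(baseline); assumes at least two classes occur.

    Baseline (assumed): always predict the observed class frequencies.
    1 = perfect, 0 = no better than baseline, < 0 = worse than baseline.
    """
    n = len(y_true)
    prior = {c: y_true.count(c) / n for c in set(y_true)}
    baseline = [prior] * n
    return 1.0 - log_loss(y_true, probs) / log_loss(y_true, baseline)


# Hypothetical two-class example: confidently wrong predictions score below zero.
y = ["tundra", "forest"]
wrong = [{"tundra": 0.1, "forest": 0.9}, {"tundra": 0.9, "forest": 0.1}]
right = [{"tundra": 1.0, "forest": 0.0}, {"tundra": 0.0, "forest": 1.0}]
score_wrong = r2_logloss(y, wrong)
score_right = r2_logloss(y, right)
```

Under this definition a negative score, such as the -0.09 reported for one tundra class, means the model's probabilities for that class were less informative than the class frequencies alone.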
Subjects
Climate Change, Ecosystem, Temperature, Logistic Models, Arctic Regions
ABSTRACT
The article describes the production steps and accuracy assessment of an analysis-ready, open-access European data cube consisting of 2000-2020+ Landsat data, 2017-2021+ Sentinel-2 data and a 30 m resolution digital terrain model (DTM). The main purpose of the data cube is to make annual continental-scale spatiotemporal machine learning tasks accessible to a wider user base by providing a spatially and temporally consistent multidimensional feature space. This has required systematic spatiotemporal harmonization, efficient compression, and imputation of missing values. Sentinel-2 and Landsat reflectance values were aggregated into four quarterly averages approximating the four seasons common in Europe (winter, spring, summer and autumn), as well as the 25th and 75th percentiles, in order to retain intra-seasonal variance. Remaining missing data in the Landsat time-series were imputed with a temporal moving window median (TMWM) approach. An accuracy assessment shows that TMWM performs better in Southern Europe and worse in mountainous regions such as the Scandinavian Mountains, the Alps, and the Pyrenees. We quantify the usability of the different component data sets for spatiotemporal machine learning tasks with a series of land cover classification experiments, which show that models utilizing the full feature space (30 m DTM, 30 m Landsat, 30 m and 10 m Sentinel-2) yield the highest land cover classification accuracy, with different data sets improving the results for different land cover classes. The data sets presented in the article are part of the EcoDataCube platform, which also hosts open vegetation, soil, and land use/land cover (LULC) maps. All data sets are available under CC-BY license as Cloud-Optimized GeoTIFFs (ca. 12 TB in size) through SpatioTemporal Asset Catalog (STAC) and the EcoDataCube data portal.
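The idea behind temporal moving window median imputation can be sketched for a single pixel time-series. This is a simplified illustration, not the data cube's implementation: the expanding symmetric window and the `max_half_window` parameter are assumptions, and the reflectance values are hypothetical.

```python
from statistics import median


def tmwm_impute(series, max_half_window=4):
    """Fill gaps (None) with the median of observed values in a temporal window.

    Simplifying assumption: the window is symmetric around the gap and grows
    one step at a time until it contains at least one observed value. Only
    original observations are used, so imputed values never feed other gaps.
    """
    filled = list(series)
    n = len(series)
    for i, value in enumerate(series):
        if value is not None:
            continue
        for half in range(1, max_half_window + 1):
            lo, hi = max(0, i - half), min(n, i + half + 1)
            window = [v for v in series[lo:hi] if v is not None]
            if window:
                filled[i] = median(window)
                break
    return filled


# Hypothetical seasonal reflectance for one pixel; None marks missing data.
series = [0.20, None, 0.30, None, None, None, 0.10]
filled = tmwm_impute(series)
```

Gaps that sit far from any observation stay unfilled once the window limit is reached, which is one plausible reason imputation quality would drop where cloud or snow cover persists for long periods.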
Subjects
Autoimmune Lymphoproliferative Syndrome, Data Compression, Humans, Europe, Seasons, Climate
ABSTRACT
This dataset presents global soil organic carbon stocks in mangrove forests at 30 m resolution, predicted for 2020. We used spatiotemporal ensemble machine learning to produce predictions of soil organic carbon content and bulk density (BD) to 1 m soil depth, which were then aggregated to calculate soil organic carbon stocks. This was done by using training data points of both SOC (%) and BD in mangroves from a global dataset and from recently published studies, and globally consistent predictive covariate layers. A total of 10,331 soil samples were validated to have SOC (%) measurements and were used for predictive soil mapping. We used time-series remote sensing data specific to time periods when the training data were sampled, as well as long-term (static) layers, to train an ensemble of machine learning models. Ensemble models were used to improve performance, robustness and unbiasedness compared with using a single learner. In addition, we performed spatial cross-validation by using spatial blocking of training data points to assess model performance. We predicted SOC stocks for the 2020 time period and applied them to a 2020 mangrove extent map, presenting both mean predictions and prediction intervals to represent the uncertainty around our predictions. Predictions are available for download under CC-BY license from 10.5281/zenodo.7729491 and also as Cloud-Optimized GeoTIFFs (global mosaics).
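The aggregation of SOC content and bulk density into a stock can be illustrated with the standard per-layer relation: stock equals carbon mass fraction times bulk density times layer thickness. The sketch below assumes that relation, ignores coarse fragments, and uses hypothetical layer values; the abstract does not give the authors' exact aggregation scheme.

```python
def soc_stock_kg_m2(soc_percent, bulk_density_kg_m3, thickness_m):
    """Carbon stock of one soil layer in kg C per square metre.

    Assumption: coarse fragments are ignored for simplicity.
    """
    return (soc_percent / 100.0) * bulk_density_kg_m3 * thickness_m


def profile_stock_t_ha(layers):
    """Sum layer stocks over a profile and convert to tonnes C per hectare."""
    kg_m2 = sum(soc_stock_kg_m2(*layer) for layer in layers)
    return kg_m2 * 10.0  # 1 kg/m2 equals 10 t/ha


# Hypothetical 1 m profile: (SOC %, bulk density in kg/m3, thickness in m).
layers = [(8.0, 600.0, 0.3), (5.0, 800.0, 0.7)]
stock = profile_stock_t_ha(layers)
```

Because SOC content and bulk density enter as a product, errors in either prediction propagate directly into the stock, which is consistent with the abstract reporting prediction intervals alongside the mean.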
ABSTRACT
This article describes a data-driven framework based on spatiotemporal machine learning to produce distribution maps for 16 tree species (Abies alba Mill., Castanea sativa Mill., Corylus avellana L., Fagus sylvatica L., Olea europaea L., Picea abies L. H. Karst., Pinus halepensis Mill., Pinus nigra J. F. Arnold, Pinus pinea L., Pinus sylvestris L., Prunus avium L., Quercus cerris L., Quercus ilex L., Quercus robur L., Quercus suber L. and Salix caprea L.) at high spatial resolution (30 m). Tree occurrence data for a total of three million points were used to train different algorithms: random forest, gradient-boosted trees, generalized linear models, k-nearest neighbors, CART and an artificial neural network. A stack of 305 coarse and high resolution covariates representing spectral reflectance, different biophysical conditions and biotic competition was used as predictors for realized distributions, while potential distribution was modelled with environmental predictors only. Logloss and computing time were used to select the three best algorithms to tune and train an ensemble model based on stacking with a logistic regressor as a meta-learner. An ensemble model was trained for each species: probability and model uncertainty maps of realized distribution were produced for each species using a time window of 4 years for a total of six distribution maps per species, while for potential distributions only one map per species was produced. Results of spatial cross-validation show that the ensemble model consistently outperformed or performed as well as the best individual model in both potential and realized distribution tasks, with potential distribution models achieving higher predictive performance (TSS = 0.898, R² logloss = 0.857) than realized distribution ones on average (TSS = 0.874, R² logloss = 0.839). Ensemble models for Q. suber achieved the best performance in both potential (TSS = 0.968, R² logloss = 0.952) and realized (TSS = 0.959, R² logloss = 0.949) distribution, while P. sylvestris (TSS = 0.731, 0.785, R² logloss = 0.585, 0.670, respectively, for potential and realized distribution) and P. nigra (TSS = 0.658, 0.686, R² logloss = 0.623, 0.664) achieved the worst. Importance of predictor variables differed across species and models, with the green band for summer and the Normalized Difference Vegetation Index (NDVI) for fall being the most frequent and important for realized distribution, and the diffuse irradiation and precipitation of the driest quarter (BIO17) for potential distribution. On average, fine-resolution models outperformed coarse-resolution models (250 m) for realized distribution (TSS = +6.5%, R² logloss = +7.5%). The framework shows how combining continuous and consistent Earth Observation time series data with state-of-the-art machine learning can be used to derive dynamic distribution maps. The produced predictions can be used to quantify temporal trends of potential forest degradation and species composition change.
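The True Skill Statistic (TSS) reported above is, by its standard definition, sensitivity plus specificity minus one. A minimal sketch from presence/absence confusion counts follows; the counts are hypothetical and not taken from the study.

```python
def true_skill_statistic(tp, fn, tn, fp):
    """TSS = sensitivity + specificity - 1, ranging from -1 to +1.

    tp/fn: presences predicted as present/absent;
    tn/fp: absences predicted as absent/present.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0


# Hypothetical counts for one species (illustrative only).
tss = true_skill_statistic(tp=90, fn=10, tn=80, fp=20)
```

Unlike overall accuracy, TSS is not inflated when absences greatly outnumber presences, which makes it a common choice for species distribution models.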
Subjects
Abies, Fagus, Pinus, Quercus, Europe
ABSTRACT
A spatiotemporal machine learning framework for automated prediction and analysis of long-term Land Use/Land Cover dynamics is presented. The framework includes: (1) harmonization and preprocessing of spatial and spatiotemporal input datasets (GLAD Landsat, NPP/VIIRS) including five million harmonized LUCAS and CORINE Land Cover-derived training samples, (2) model building based on spatial k-fold cross-validation and hyper-parameter optimization, (3) prediction of the most probable class, class probabilities and model variance of predicted probabilities per pixel, (4) LULC change analysis on time-series of produced maps. The spatiotemporal ensemble model consists of a random forest, gradient boosted tree classifier, and an artificial neural network, with a logistic regressor as meta-learner. The results show that the most important variables for mapping LULC in Europe are: seasonal aggregates of Landsat green and near-infrared bands, multiple Landsat-derived spectral indices, long-term surface water probability, and elevation. Spatial cross-validation of the model indicates consistent performance across multiple years with overall accuracy (a weighted F1-score) of 0.49, 0.63, and 0.83 when predicting 43 (level-3), 14 (level-2), and five classes (level-1). Additional experiments show that spatiotemporal models generalize better to unknown years, outperforming single-year models on known-year classification by 2.7% and unknown-year classification by 3.5%. Results of the accuracy assessment using 48,365 independent test samples show an 87% match with the validation points. Results of time-series analysis (time-series of LULC probabilities and NDVI images) suggest forest loss in large parts of Sweden, the Alps, and Scotland. Positive and negative trends in NDVI in general match the land degradation and land restoration classes, with "urbanization" showing the most negative NDVI trend.
An advantage of using spatiotemporal ML is that the fitted model can be used to predict LULC in years that were not included in its training dataset, allowing generalization to past and future periods, e.g. to predict LULC for years prior to 2000 and beyond 2020. The generated LULC time-series data stack (ODSE-LULC), including the training points, is publicly available via the ODSE Viewer. Functions used to prepare data and run modeling are available via the eumap library for Python.
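The weighted F1-score used as the overall accuracy measure in this abstract can be sketched as the per-class F1 averaged with weights proportional to each class's number of true samples. The labels below are hypothetical and the function is an illustration, not code from the eumap library.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's support."""
    support = Counter(y_true)
    pairs = list(zip(y_true, y_pred))
    total = 0.0
    for cls, n in support.items():
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        fn = n - tp
        f1 = 2 * tp / (2 * tp + fp + fn)  # denominator >= n > 0
        total += f1 * n
    return total / len(y_true)


# Hypothetical level-1 style labels (illustrative only).
y_true = ["forest", "forest", "forest", "crop", "crop", "urban"]
y_pred = ["forest", "forest", "crop", "crop", "crop", "urban"]
score = weighted_f1(y_true, y_pred)
```

Weighting by support means frequent classes dominate the score, so a drop from five to 43 classes can lower it sharply when many of the added classes are rare and hard to separate.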
Subjects
Environmental Monitoring, Urbanization, Probability, Europe, Time Factors
ABSTRACT
Across South America, the expansion of commodity land uses has underpinned substantial economic development at the expense of natural land cover and associated ecosystem services. Here, we show that such human impact on the continent's land surface, specifically land use conversion and natural land cover modification, expanded by 268 million hectares (Mha), or 60%, from 1985 to 2018. By 2018, 713 Mha, or 40%, of the South American landmass was impacted by human activity. Since 1985, the area of natural tree cover decreased by 16%, and pasture, cropland, and plantation land uses increased by 23, 160, and 288%, respectively. A substantial area of disturbed natural land cover, totaling 55 Mha, had no discernable land use, representing land that is degraded in terms of ecosystem function but not economically productive. These results illustrate the extent of ongoing human appropriation of natural ecosystems in South America, which intensifies threats to ecosystem-scale functions.
ABSTRACT
Soil property and class maps for the continent of Africa have so far been available only at very generalised scales, with many countries not mapped at all. Thanks to an increasing quantity and availability of soil samples collected at field point locations by various government and/or NGO funded projects, it is now possible to produce detailed pan-African maps of soil nutrients, including micro-nutrients, at fine spatial resolutions. In this paper we describe production of a 30 m resolution Soil Information System of the African continent using, to date, the most comprehensive compilation of soil samples ([Formula: see text]) and Earth Observation data. We produced predictions for soil pH, organic carbon (C) and total nitrogen (N), total carbon, effective Cation Exchange Capacity (eCEC), extractable phosphorus (P), potassium (K), calcium (Ca), magnesium (Mg), sulfur (S), sodium (Na), iron (Fe) and zinc (Zn), as well as silt, clay and sand, stone content, bulk density and depth to bedrock, at three depths (0, 20 and 50 cm), using a 2-scale 3D Ensemble Machine Learning framework implemented in the mlr (Machine Learning in R) package. As covariate layers we used 250 m resolution (MODIS, PROBA-V and SM2RAIN products) and 30 m resolution (Sentinel-2, Landsat and DTM derivatives) images. Our fivefold spatial cross-validation results showed varying accuracy levels, ranging from the best performing soil pH (CCC = 0.900) to the less predictable extractable phosphorus (CCC = 0.654), sulfur (CCC = 0.708) and depth to bedrock. Sentinel-2 bands SWIR (B11, B12), NIR (B09, B8A), Landsat SWIR bands, and vertical depth derived from 30 m resolution DTM were the overall most important 30 m resolution covariates. Climatic data images (SM2RAIN, bioclimatic variables and MODIS Land Surface Temperature), however, remained the overall most important variables for predicting soil chemical variables at continental scale.
This publicly available 30-m Soil Information System of Africa aims at supporting numerous applications, including soil and fertilizer policies and investments, agronomic advice to close yield gaps, environmental programs, or targeting of nutrition interventions.
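The accuracy measure reported above, CCC, is read here as Lin's concordance correlation coefficient, which penalizes both scatter and systematic bias between observed and predicted values. A minimal sketch of its standard formula follows; the soil pH values are hypothetical.

```python
from statistics import mean


def concordance_correlation(observed, predicted):
    """Lin's CCC: 2*cov / (var_obs + var_pred + (mean_obs - mean_pred)^2).

    Uses population (divide-by-n) variance and covariance.
    """
    n = len(observed)
    mo, mp = mean(observed), mean(predicted)
    var_o = sum((o - mo) ** 2 for o in observed) / n
    var_p = sum((p - mp) ** 2 for p in predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted)) / n
    return 2 * cov / (var_o + var_p + (mo - mp) ** 2)


# Hypothetical soil pH values: predictions are perfectly correlated with the
# observations but shifted by one unit, so CCC falls well below 1.
observed = [5.0, 6.0, 7.0]
predicted = [6.0, 7.0, 8.0]
ccc = concordance_correlation(observed, predicted)
```

In this example the Pearson correlation would be 1 while CCC is 4/7, which shows why CCC is the stricter measure for judging map predictions against measured values.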