ABSTRACT
The National Forestry Commission of Mexico continuously monitors forest structure across the country's continental territory through the National Forest and Soils Inventory (INFyS). Because of the challenges of collecting data exclusively from field surveys, there are spatial information gaps for important forest attributes. These gaps can introduce bias or increase uncertainty in the estimates required to support forest management decisions. Our objective is to predict the spatial distribution of tree height and tree density in all Mexican forests. We performed wall-to-wall spatial predictions of both attributes on 1-km grids, using ensemble machine learning for each forest type in Mexico. Predictor variables include remote sensing imagery and other geospatial data (e.g., mean precipitation, surface temperature, canopy cover). Training data come from the 2009 to 2014 cycle (n > 26,000 sampling plots). Spatial cross-validation suggested that the model performed better when predicting tree height, r² = .35 [.12, .51] (mean [min, max]), than when predicting tree density, r² = .23 [.05, .42]. The best predictive performance for tree height was in broadleaf and coniferous-broadleaf forests (the model explained ~50% of the variance). The best predictive performance for tree density was in tropical forest (the model explained ~40% of the variance). Although most forests had relatively low uncertainty for tree height predictions (e.g., values <60%), arid and semiarid ecosystems had high uncertainty (e.g., values >80%). Uncertainty values for tree density predictions were >80% in most forests. The applied open science approach we present is easily replicable and scalable, so it can support decision-making and the future of the National Forest and Soils Inventory. This work highlights the need for analytical tools that help exploit the full potential of the Mexican forest inventory datasets.
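The spatial cross-validation mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say how the folds were built, so the square-block scheme, the 10-km block size, the function names, and the plot coordinates below are all assumptions made for illustration, not the authors' procedure. The idea is that plots falling in the same block always share a fold, so nearby plots never sit on both sides of a train/test split.

```python
import math

def spatial_block_folds(coords, block_size_km, n_folds):
    """Give every plot a fold id; plots in the same square block share a fold."""
    # Index of the block (column, row) that contains each plot.
    block_of = [(math.floor(x / block_size_km), math.floor(y / block_size_km))
                for x, y in coords]
    # Deal the distinct blocks out to folds in sorted order (deterministic).
    blocks = sorted(set(block_of))
    fold_of_block = {b: i % n_folds for i, b in enumerate(blocks)}
    return [fold_of_block[b] for b in block_of]

def train_test_splits(fold_ids, n_folds):
    """Yield (train_indices, test_indices), holding out one fold at a time."""
    for k in range(n_folds):
        test = [i for i, f in enumerate(fold_ids) if f == k]
        train = [i for i, f in enumerate(fold_ids) if f != k]
        yield train, test

# Hypothetical projected plot coordinates in km (not INFyS data).
plots = [(0.5, 0.5), (0.9, 0.2), (12.0, 3.0), (12.5, 3.5), (25.0, 25.0), (40.0, 1.0)]
folds = spatial_block_folds(plots, block_size_km=10.0, n_folds=3)
splits = list(train_test_splits(folds, n_folds=3))
```

A model would then be fitted on each `train` index set and scored on the matching `test` set; reporting the mean, minimum, and maximum of the per-fold scores gives a summary in the same form as the r² values quoted above.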
ABSTRACT
In this paper, a monitoring system for agricultural production is modeled as a data fusion system that combines data from local fairs with meteorological data. The proposal uses sales information from agricultural markets to extract knowledge about the associations among them. This association knowledge is used to improve sales predictions with a spatial prediction technique, as shown with data collected from local markets in the Andean region of Ecuador. Commercial activity in these markets uses Alternative Marketing Circuits (CIALCO). This market platform establishes a direct relationship between producer and consumer prices and promotes direct commercial interaction among family groups. The problem is first presented as a general fusion problem over a network of spatially distributed heterogeneous data sources, and is then applied to the prediction of product sales based on association rules mined from the available sales data. Transactional data serve as the basis for extracting the best association rules between products sold in different local markets; this knowledge gives the system a significant improvement in prediction accuracy in the spatial region considered.
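As a rough illustration of the association-rule step, the sketch below mines pairwise rules from a handful of transactions and ranks them by confidence. The abstract does not name the mining algorithm or its thresholds, so the function `pairwise_rules`, the threshold values, and the toy sales records are assumptions; in the setting described above, an item could equally be a (market, product) pair.

```python
from itertools import combinations

def pairwise_rules(transactions, min_support, min_confidence):
    """Return rules (lhs, rhs, support, confidence) between pairs of items."""
    n = len(transactions)
    item_count = {}
    pair_count = {}
    for t in transactions:
        items = sorted(set(t))
        for a in items:
            item_count[a] = item_count.get(a, 0) + 1
        for pair in combinations(items, 2):
            pair_count[pair] = pair_count.get(pair, 0) + 1

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n                      # share of transactions with both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_count[lhs]  # estimate of P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    # Strongest rules first; ties broken alphabetically for a stable order.
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0], r[1]))

# Hypothetical sales transactions (not CIALCO data).
sales = [
    {"potato", "maize"},
    {"potato", "maize", "beans"},
    {"potato", "beans"},
    {"maize"},
]
rules = pairwise_rules(sales, min_support=0.5, min_confidence=0.6)
```

With these toy records the top rule is beans → potato, because every transaction containing beans also contains potato.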
ABSTRACT
Determination of soil properties helps in the correct management of soil fertility. The portable X-ray fluorescence spectrometer (pXRF) has recently been adopted to determine total chemical element contents in soils, allowing soil property inferences. However, these studies are still scarce in Brazil and other countries. The objectives of this work were to predict soil properties using pXRF data, comparing stepwise multiple linear regression (SMLR) and random forest (RF) methods, and to map and validate soil properties. A total of 120 soil samples were collected at three depths and submitted to laboratory analyses. The samples were scanned with pXRF and total element contents were determined. From the pXRF data, SMLR and RF were used to predict soil laboratory results, which reflect soil properties, and the models were validated. The best method was used to spatialize soil properties. With SMLR, models had high R² values (≥0.8); however, the highest accuracy was obtained with RF modeling. Exchangeable Ca, Al, Mg, potential and effective cation exchange capacity, soil organic matter, pH, and base saturation had adequate adjustment and accurate predictions with RF. Eight out of the 10 soil properties predicted by RF using pXRF data had CaO as the most important variable for the predictions, followed by P2O5, Zn, and Cr. Maps generated using RF from pXRF data had high accuracy for six soil properties, reaching R² of up to 0.83. pXRF in association with RF can be used to predict soil properties with high accuracy, at low cost and in little time, and it also provides variables that aid digital soil mapping.
RESUMO Determining soil properties helps in the correct management of soil fertility. The portable X-ray fluorescence spectrometer (pXRF) has recently been adopted to determine the total content of chemical elements in soils, allowing inferences about soil properties. However, such studies are still scarce in Brazil and other countries. The objectives of this work were to predict soil properties from pXRF data, comparing the stepwise multiple linear regression (SMLR) and random forest (RF) methods, and to map and validate soil properties. A total of 120 soil samples were collected at three depths and submitted to laboratory analyses. The samples were scanned with the pXRF and total element contents were determined. From the pXRF data, SMLR and RF were used to predict laboratory results, which reflect soil properties, and the models were validated. The best method was used to spatialize the soil properties. With SMLR, the models showed high R² values (≥0.8), but higher accuracy was obtained with RF modeling. Potential and effective cation exchange capacity, soil organic matter, pH, base saturation, and exchangeable Ca, Al, and Mg contents showed adequate fits and accurate predictions with RF. Of the ten soil properties predicted by RF from pXRF data, seven had CaO as the most important variable for the predictions, followed by P2O5, Zn, and Cr. Maps generated from pXRF data using RF showed adequate R² values for six soil properties, reaching R² of up to 0.83. pXRF combined with RF can be used to predict soil properties with high accuracy, quickly and at low cost, and it also provides variables that support digital soil mapping.
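The validation figures quoted in this record are R² values. As a minimal sketch of how such a figure can be computed on held-out samples, the helper functions below use the common coefficient-of-determination definition, 1 − SS_res/SS_tot. Whether the authors used this definition or the squared correlation is not stated in the abstract, and the observed and predicted values are made-up numbers for illustration only.

```python
import math

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def rmse(observed, predicted):
    """Root mean squared error, in the units of the observed variable."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical laboratory values and model predictions for a validation set.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

print(round(r_squared(observed, predicted), 3))
print(round(rmse(observed, predicted), 3))
```

Computing the same two metrics for each candidate model on the same validation samples is one simple way to compare methods such as SMLR and RF.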
ABSTRACT
Terrain models that represent riverbed topography are used for analyzing geomorphologic changes, calculating water storage capacity, and running hydrologic simulations. These models are generated by interpolating bathymetry points. River bathymetry is usually surveyed through cross-sections, which may lead to a sparse sampling pattern. Hybrid kriging methods, such as regression kriging (RK) and co-kriging (CK), employ the correlation with auxiliary predictors, as well as inter-variable correlation, to improve the predictions of the target variable. In this study, we use the orthogonal distance of an (x, y) point to the river centerline as a covariate for RK and CK. Given that riverbed elevation varies abruptly transversely to the flow direction, it is expected that the greater the Euclidean distance of a point to the thalweg, the greater the bed elevation will be. The aim of this study was to evaluate whether the proposed covariate improves the spatial prediction of riverbed topography. To assess this premise, we performed an external validation: transversal cross-sections are used to make the spatial predictions, and the point data surveyed between sections are used for testing. We compare the results from CK and RK with those obtained from ordinary kriging (OK). The validation indicates that RK yields the lowest RMSE among the interpolators. RK predictions represent the thalweg between cross-sections, whereas the other methods under-predict the river thalweg depth. Therefore, we conclude that RK provides a simple approach for enhancing the quality of spatial prediction from sparse bathymetry data.
RESUMO Terrain models of rivers are used for analyzing geomorphological changes and for hydrological simulations. These models are interpolated from bathymetric points. River bathymetry is usually surveyed through cross-sections, which can result in a sparse sampling grid. Hybrid kriging methods, such as regression kriging (RK) and co-kriging (CK), employ the correlation with auxiliary predictors, in addition to the autocorrelation between variables, in predicting the response variable. In this study, it is suggested that the orthogonal distance from a point to the centerline of a river's thalweg can be used as a covariate for RK and CK. Considering that riverbed elevation varies abruptly transversely to the flow direction, it is expected that the greater the Euclidean distance from a point to the thalweg, the greater its elevation will be. The objective of this study was to evaluate the use of the proposed covariate in hybrid kriging methods for the spatial prediction of riverbed topography. To that end, an external validation was carried out in which cross-sections were used for interpolation and data surveyed between the sections formed the test sample. The results of RK and CK were compared with those of ordinary kriging. RK had the lowest RMSE. In the map resulting from RK, the thalweg was preserved in the unsampled gaps between sections, whereas the other methods underestimated the thalweg depth in these spaces. Thus, it is concluded that RK can improve the spatial prediction of river bathymetric data.
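The covariate described in this record, the distance from a survey point to the river centerline, can be computed with basic geometry. The sketch below assumes the centerline is available as a polyline of (x, y) vertices in a projected coordinate system; the function names and the coordinates are hypothetical, and the kriging step itself is not shown.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest Euclidean distance from point p to the segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: both vertices coincide.
        return math.hypot(px - ax, py - ay)
    # Position of the orthogonal projection along the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance_to_centerline(p, centerline):
    """Distance from p to the nearest segment of a polyline centerline."""
    return min(point_segment_distance(p, a, b)
               for a, b in zip(centerline, centerline[1:]))

# Hypothetical centerline vertices and bathymetry point (projected units, e.g. m).
centerline = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
d = distance_to_centerline((4.0, 3.0), centerline)
```

Evaluating `distance_to_centerline` at every surveyed point and at every prediction grid node would give the auxiliary variable that regression kriging or co-kriging needs at both locations.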
ABSTRACT
Soil bulk density (ρb) data are needed for a wide range of environmental studies. However, ρb is rarely reported in soil surveys. An alternative way to obtain ρb for data-scarce regions, such as the Rio Doce basin in southeastern Brazil, is indirect estimation from less costly covariates using pedotransfer functions (PTF). This study primarily aims to develop region-specific PTFs for ρb using multiple linear regressions (MLR) and random forests (RF). Secondly, it assesses the accuracy of PTFs for data grouped into soil horizons and soil classes. For that purpose, we compared the performance of PTFs compiled from the literature with those developed here. Two groups of data were evaluated as covariates: 1) readily available soil properties and 2) maps derived from a digital elevation model and MODIS satellite imagery, jointly with lithological and pedological maps. The MLR model was applied stepwise to select significant predictors, and its accuracy was assessed by means of cross-validation. The PTFs developed using all data estimated ρb from soil properties by MLR and RF, with R² of 0.41 and 0.51, respectively. Alternatively, using environmental covariates, RF predicted ρb with R² of 0.41. Grouping criteria did not lead to a significant increase in the estimates of ρb. The accuracy of the regional PTFs developed for this study was greater than that found with the compiled PTFs. The best PTF will first be used to assess soil carbon stocks and changes in the Rio Doce basin.
Subject(s)
Soil Analysis, Soil Characteristics, Forecasting, Forests, Statistical Models, Soil