ABSTRACT
As climate change shifts crop exposure to dry and wet extremes, a better understanding of factors governing crop response is needed. Recent studies identified shallow groundwater (groundwater within or near the crop rooting zone) as influential, yet existing evidence is largely based on theoretical crop model simulations, indirect or static groundwater data, or small-scale field studies. Here, we use observational satellite yield data and dynamic water table simulations from 1999 to 2018 to provide field-scale evidence for shallow groundwater effects on maize yields across the United States Corn Belt. We identify three lines of evidence supporting groundwater influence: 1) crop model simulations better match observed yields after improvements in groundwater representation; 2) machine learning analysis of observed yields and modeled groundwater levels reveals a subsidy zone between 1.1 and 2.5 m depths, with yield penalties at shallower depths and no effect at deeper depths; and 3) locations with groundwater typically in the subsidy zone display higher yield stability across time. We estimate an average 3.4% yield increase when groundwater levels are at optimum depth, and this effect roughly doubles in dry conditions. Groundwater yield subsidies occur ~35% of years on average across locations, with 75% of the region benefitting in at least 10% of years. Overall, we estimate that groundwater-yield interactions had a net monetary contribution of approximately $10 billion from 1999 to 2018. This study provides empirical evidence for region-wide groundwater yield impacts and further underlines the need for better quantification of groundwater levels and their dynamic responses to short- and long-term weather conditions.
ABSTRACT
Understanding how microbial communities are shaped across spatial dimensions is of fundamental importance in microbial ecology. However, most studies on soil biogeography have focused on the topsoil microbiome, while the factors driving the subsoil microbiome distribution are largely unknown. Here we used 16S rRNA amplicon sequencing to analyse the factors underlying the bacterial β-diversity along vertical (0-240 cm of soil depth) and horizontal spatial dimensions (~500,000 km²) in the U.S. Corn Belt. With these data we tested whether the horizontal or vertical spatial variation had stronger impacts on the taxonomic (Bray-Curtis) and phylogenetic (weighted UniFrac) β-diversity. Additionally, we assessed whether the distance-decay (horizontal dimension) was greater in the topsoil (0-30 cm) or subsoil (in each 30 cm layer from 30-240 cm) using Mantel tests. The influence of geographic distance versus edaphic variables on the bacterial communities from the different soil layers was also compared. Results indicated that the phylogenetic β-diversity was impacted more by soil depth, while the taxonomic β-diversity changed more between geographic locations. The distance-decay was lower in the topsoil than in all subsoil layers analysed. Moreover, some subsoil layers were influenced more by geographic distance than any edaphic variable, including pH. Although different factors affected the topsoil and subsoil biogeography, niche-based models explained the community assembly of all soil layers. This comprehensive study contributed to elucidating important aspects of soil bacterial biogeography including the major impact of soil depth on the phylogenetic β-diversity, and the greater influence of geographic distance on subsoil than on topsoil bacterial communities in agroecosystems.
Subjects
Soil, Zea mays, Zea mays/genetics, Soil Microbiology, 16S Ribosomal RNA/genetics, Phylogeny
ABSTRACT
Crop yield prediction is of great importance for decision making, yet it remains an ongoing scientific challenge. Interactions among different genetic, environmental, and management factors and uncertainty in input values make crop yield prediction complex. Building upon a previous work in which we coupled crop modeling with machine learning (ML) models to predict maize yields for three US Corn Belt states, here, we expand the concept to the entire US Corn Belt (12 states). More specifically, we built five new ML models and their ensemble models, considering the scenarios with and without crop modeling variables. Additional input values in our models are soil, weather, management, and historical yield data. A unique aspect of our work is the spatial analysis to investigate causes for low or high model prediction errors. Our results indicated that the prediction accuracy increases by coupling crop modeling with machine learning. The ensemble model outperformed the individual ML models, having a relative root mean square error (RRMSE) of about 9% for the test years (2018, 2019, and 2020), which is comparable to previous studies. In addition, analysis of the sources of error revealed that counties and crop reporting districts with low cropland ratios have high RRMSE. Furthermore, we found that soil input data and extreme weather events were responsible for high errors in some regions. The proposed models can be deployed for large-scale prediction at the county level and, contingent upon data availability, can be utilized for field level prediction.
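Several abstracts in this listing report accuracy as a relative root mean square error (RRMSE). As a reading aid, the sketch below computes it under the common convention of dividing the RMSE by the mean observed value. That convention, the function name `rrmse`, and the yield numbers are assumptions made for illustration and are not taken from any of the studies.

```python
import math


def rrmse(observed, predicted):
    """Relative RMSE in percent: RMSE divided by the mean observed value.

    Assumes the common "normalize by the observed mean" convention; the
    cited studies may define RRMSE slightly differently.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over paired observations and predictions.
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    mean_observed = sum(observed) / len(observed)
    return 100.0 * math.sqrt(mse) / mean_observed


# Hypothetical county yields in kg/ha (illustrative values only).
observed = [10000.0, 11000.0, 9000.0]
predicted = [10500.0, 10500.0, 9500.0]
print(f"RRMSE = {rrmse(observed, predicted):.1f}%")
```

Because the metric is scaled by the mean yield, it can be compared across crops and regions with different yield levels, which is likely why these abstracts use it.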
ABSTRACT
Aging in perennial plants is traditionally observed in terms of changes in end-of-season biomass; however, the driving phenological and physiological changes are poorly understood. We found that 3-year-old (mature) stands of the perennial grass Miscanthus×giganteus had 19-30% lower Anet than 1-year-old M.×giganteus (juvenile) stands; 10-34% lower maximum carboxylation rates of Rubisco and 34% lower light-saturated Anet (Asat). These changes could be related to nitrogen (N) limitations, as mature plants were larger and had 14-34% lower leaf N on an area basis (Na) than juveniles. However, N fertilization restored Na to juvenile levels but compensated only 50% of the observed decline in leaf photosynthesis with age. Comparison of leaf photosynthesis per unit of leaf N (PNUE) showed that mature stands had at least 26% lower PNUE than juvenile stands across all N fertilization rates, suggesting that other factors, besides N, may be limiting photosynthesis in mature stands. We hypothesize that sink limitations in mature stands could be causing feedback inhibition of photosynthesis which is associated with the age-related decline in photosynthesis.
Subjects
Nitrogen, Poaceae
ABSTRACT
Limited knowledge about how nitrogen (N) dynamics are affected by climate change, weather variability, and crop management is a major barrier to improving the productivity and environmental performance of soybean-based cropping systems. To fill this knowledge gap, we created a systems understanding of agroecosystem N dynamics and quantified the impact of controllable (management) and uncontrollable (weather, climate) factors on N fluxes and soybean yields. We performed a simulation experiment across 10 soybean production environments in the United States using the Agricultural Production Systems sIMulator (APSIM) model and future climate projections from five global circulation models. Climate change (2020-2080) increased N mineralization (24%) and N2O emissions (19%) but decreased N fixation (32%), seed N (20%), and yields (19%). Soil and crop management practices altered N fluxes at a similar magnitude as climate change but in many different directions, revealing opportunities to improve soybean systems' performance. Among many practices explored, we identified two solutions with great potential: improved residue management (short-term) and water management (long-term). Inter-annual weather variability and management practices affected soybean yield less than N fluxes, which creates opportunities to manage N fluxes without compromising yields, especially in regions with adequate to excess soil moisture. This work provides actionable results (tradeoffs, synergies, directions) to inform decision-making for adapting crop management in a changing climate to improve soybean production systems.
ABSTRACT
The relationship between collared leaf number and growing degree days (GDD) is crucial for predicting maize phenology. Biophysical crop models convert GDD accumulation to leaf numbers by using a constant parameter termed phyllochron (°C-day leaf⁻¹) or leaf appearance rate (LAR; leaf °C-day⁻¹). However, such important parameter values are rarely estimated for modern maize hybrids. To fill this gap, we sourced and analyzed experimental datasets from the United States Corn Belt with the objective to (i) determine phyllochron values for two types of models: linear (1-parameter) and bilinear (3-parameters; phase I and II phyllochron, and transition point) and (ii) explore whether environmental factors such as photoperiod and radiation, and physiological variables such as plant growth rate can explain variability in phyllochron and improve predictability of maize phenology. The datasets included different locations (latitudes between 48° N and 41° N), years (2009-2019), hybrids, and management settings. Results indicated that the bilinear model represented the leaf number vs. GDD relationship more accurately than the linear model (R² = 0.99 vs. 0.95, n = 4,694). Across datasets, first phase phyllochron, transition leaf number, and second phase phyllochron averaged 57.9 ± 7.5°C-day, 9.8 ± 1.2 leaves, and 30.9 ± 5.7°C-day, respectively. Correlation analysis revealed that radiation from the V3 to the V9 developmental stages had a positive relationship with phyllochron (r = 0.69), while photoperiod was positively related to days to flowering or total leaf number (r = 0.89). Additionally, a positive nonlinear relationship between maize LAR and plant growth rate was found. Present findings provide important parameter values for calibration and optimization of maize crop models in the United States Corn Belt, as well as new insights to enhance mechanisms in crop models.
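The bilinear leaf-appearance model described above can be illustrated with a short sketch. It uses the reported average parameters (57.9 °C-day per leaf in phase I, a transition at 9.8 leaves, and 30.9 °C-day per leaf in phase II). The function name `leaf_number`, the zero intercept (no leaves at zero GDD), and the unstated starting point of GDD accumulation are assumptions, so this is a rough reading of the abstract rather than the authors' fitted model.

```python
def leaf_number(gdd, phyllo1=57.9, transition_leaf=9.8, phyllo2=30.9):
    """Collared leaf number from accumulated GDD (deg C-day), bilinear model.

    phyllo1 / phyllo2: phase I and phase II phyllochron (deg C-day per leaf).
    transition_leaf: leaf number at which the slope changes.
    Assumption (not stated in the abstract): zero leaves at zero GDD.
    """
    if gdd < 0:
        raise ValueError("gdd must be non-negative")
    # GDD needed to reach the transition leaf at the phase I rate.
    gdd_at_transition = transition_leaf * phyllo1
    if gdd <= gdd_at_transition:
        return gdd / phyllo1
    # Past the transition, leaves appear at the faster phase II rate.
    return transition_leaf + (gdd - gdd_at_transition) / phyllo2


for gdd in (100, 300, 600, 800):
    print(gdd, round(leaf_number(gdd), 1))
```

With these averages, leaves appear almost twice as fast after the transition point, and a single-slope linear model cannot represent that change in rate.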
ABSTRACT
In the U.S. Corn Belt, annual croplands are the primary source of nitrate loading to waterways. Long periods of fallow cause most nitrate loss, but there is extreme interannual variability in the magnitude of nitrate loss due to weather. Using mean annual (2001-2018) flow-weighted nitrate-N concentration (FWNC; mg NO₃⁻-N L⁻¹), load (kg NO₃⁻-N), and yield (kg NO₃⁻-N ha⁻¹ cropland) for 29 watersheds, our objectives were (a) to quantify the magnitude and interannual variability of 5-yr moving average FWNC, load, and yield; (b) to estimate the probability of measuring 41% reductions in nitrate loss after isolating the effect of weather on nitrate loss by quantifying the interannual variability of nitrate loss in watersheds where there was no trend in 5-yr moving average nitrate loss (Iowa targets a 41% nitrate loss reduction from croplands); and (c) to identify factors that, in the absence of long-term trends in nitrate loss, best explain the interannual variability in nitrate loss. Averaged across all watersheds, the mean probability of measuring a statistically significant 41% reduction in FWNC across 15 yr, should it occur, was 96%. However, the probabilities of measuring 41% reductions in nitrate load and yield were only 44 and 32%. Across watersheds, soil organic matter, tile drainage, interannual variability of precipitation, and watershed area accounted for interannual variability in these nitrate loss indices. Our results have important implications for setting realistic timelines to measure nitrate loss reductions against the background of interannual weather variation and can help to target monitoring intensity across diverse watersheds.
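The three nitrate loss indices named above are related by simple arithmetic, sketched here for orientation. The sketch uses the usual definitions: load is the sum of concentration times flow, FWNC is load divided by total flow, and yield is load divided by cropland area. The function name, the sampling scheme, and all numbers are hypothetical, and the study's actual estimation procedure may differ.

```python
def nitrate_indices(conc_mg_per_l, flow_m3, cropland_ha):
    """Return (FWNC in mg/L, load in kg, yield in kg/ha) for one period.

    conc_mg_per_l: nitrate-N concentration per sampling interval (mg/L).
    flow_m3: water volume discharged in each interval (cubic metres).
    Assumes paired concentration/flow intervals; 1 mg/L equals 1 g/m3.
    """
    if len(conc_mg_per_l) != len(flow_m3) or not flow_m3:
        raise ValueError("inputs must be non-empty and of equal length")
    total_flow = sum(flow_m3)
    if total_flow <= 0 or cropland_ha <= 0:
        raise ValueError("total flow and cropland area must be positive")
    # Mass exported in grams: concentration (g/m3) times volume (m3).
    mass_g = sum(c * q for c, q in zip(conc_mg_per_l, flow_m3))
    fwnc = mass_g / total_flow      # g/m3, numerically equal to mg/L
    load_kg = mass_g / 1000.0       # grams to kilograms
    return fwnc, load_kg, load_kg / cropland_ha


# Hypothetical two-interval record for a 25 ha catchment.
fwnc, load, yield_per_ha = nitrate_indices([10.0, 20.0], [3000.0, 1000.0], 25.0)
print(fwnc, load, yield_per_ha)
```

Load and yield scale directly with water volume while FWNC is normalized by it, which may help explain why reductions in FWNC were far more likely to be detected than reductions in load or yield.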
Subjects
Agriculture, Nitrates, Iowa, Nitrates/analysis, Soil, Zea mays
ABSTRACT
Nitrogen (N) fertilizer recommendations for corn (Zea mays L.) in the US Midwest have been a puzzle for several decades, without agreement among stakeholders for which methodology is the best to balance environmental and economic outcomes. Part of the reason is the lack of long-term data of crop responses to N over multiple fields since trial data is often limited in the number of soils and years it can explore. To overcome this limitation, we designed an analytical platform based on crop simulations run over millions of farming scenarios over extensive geographies. The database was calibrated and validated using data from more than four hundred trials in the region. This dataset can have an important role for research and education in N management, machine learning, and environmental policy analysis. The calibration and validation procedure provides a framework for future gridded crop model studies. We describe dataset characteristics and provide thorough descriptions of the model setup.
ABSTRACT
Biological nitrogen (N) fixation is the most relevant process in soybeans (Glycine max L.) to satisfy plant N demand and sustain seed protein formation. Past studies describing N fixation for field-grown soybeans mainly focused on a single point-in-time measurement (mainly toward the end of the season) and on the partial N budget (fixed-N minus seed N removal), overlooking the seasonal pattern of this process. Therefore, this study synthesized field datasets involving multiple temporal measurements during the crop growing season to characterize N fixation dynamics using both fixed-N (kg ha⁻¹) and N derived from the atmosphere [Ndfa (%)] to define: (i) time to the maximum rate of N fixation (β2), (ii) time to the maximum Ndfa (α2), and (iii) the cumulative fixed-N. The main outcomes of this study are that (1) the maximum rate of N fixation was around the beginning of pod formation (R3 stage), (2) time to the maximum Ndfa (%) was after full pod formation (R4), (3) cumulative fixation was positively associated with the seasonal vapor-pressure deficit (VPD) and growth cycle length but negatively associated with soil clay content, and (4) time to the maximum N fixation rate (β2) was positively impacted by season length and negatively impacted by high temperatures during vegetative growth (but positively for VPD, during the same period). Overall, variation in the timing of the maximum rate of N fixation occurred within a much narrower range of growth stages (R3) than the timing of the maximum Ndfa (%), which varied broadly from flowering (R1) to seed filling (R5-R6) depending on the evaluated studies. From a phenotyping standpoint, N fixation determinations after the R4 growth stage would most likely permit capturing both maximum fixed-N rate and maximum Ndfa (%). Further investigations that more closely screen the interplay between N fixation with soil-plant-environment factors should be pursued.
ABSTRACT
Crop yield prediction is crucial for global food security yet notoriously challenging due to multitudinous factors that jointly determine the yield, including genotype, environment, management, and their complex interactions. Integrating the power of optimization, machine learning, and agronomic insight, we present a new predictive model (referred to as the interaction regression model) for crop yield prediction, which has three salient properties. First, it achieved a relative root mean square error of 8% or less in three Midwest states (Illinois, Indiana, and Iowa) in the US for both corn and soybean yield prediction, outperforming state-of-the-art machine learning algorithms. Second, it identified about a dozen environment by management interactions for corn and soybean yield, some of which are consistent with conventional agronomic knowledge whereas others require additional analysis or experiment to prove or disprove. Third, it quantitatively dissected crop yield into contributions from weather, soil, management, and their interactions, allowing agronomists to pinpoint the factors that favorably or unfavorably affect the yield of a given location under a given weather and management scenario. The most significant contribution of the new prediction model is its capability to produce accurate prediction and explainable insights simultaneously. This was achieved by training the algorithm to select features and interactions that are spatially and temporally robust to balance prediction accuracy for the training data and generalizability to the test data.
ABSTRACT
We investigate the predictive performance of two novel CNN-DNN machine learning ensemble models in predicting county-level corn yields across the US Corn Belt (12 states). The developed data set is a combination of management, environment, and historical corn yields from 1980 to 2019. Two scenarios for ensemble creation are considered: homogenous and heterogenous ensembles. In homogenous ensembles, the base CNN-DNN models are all the same, but they are generated with a bagging procedure to ensure they exhibit a certain level of diversity. Heterogenous ensembles are created from different base CNN-DNN models which share the same architecture but have different hyperparameters. Three types of ensemble creation methods were used to create several ensembles for each scenario: Basic Ensemble Method (BEM), Generalized Ensemble Method (GEM), and stacked generalized ensembles. Results indicated that both designed ensemble types (heterogenous and homogenous) outperform the ensembles created from five individual ML models (linear regression, LASSO, random forest, XGBoost, and LightGBM). Furthermore, by introducing improvements over the heterogenous ensembles, the homogenous ensembles provide the most accurate yield predictions across US Corn Belt states. This model could make 2019 yield predictions with a root mean square error of 866 kg/ha, equivalent to a relative root mean square error of 8.5%, and could successfully explain about 77% of the spatio-temporal variation in the corn grain yields. The significant predictive power of this model can be leveraged for designing a reliable tool for corn yield prediction which will in turn assist agronomic decision makers.
ABSTRACT
The performance of crop models in simulating various aspects of the cropping system is sensitive to parameter calibration. Parameter estimation is challenging, especially for time-dependent parameters such as cultivar parameters with 2-3 years of lifespan. Manual calibration of the parameters is time-consuming, requires expertise, and is prone to error. This research develops a new automated framework to estimate time-dependent parameters for crop models using a parallel Bayesian optimization algorithm. This approach integrates the power of optimization and machine learning with prior agronomic knowledge. To test the proposed time-dependent parameter estimation method, we simulated historical yield increase (from 1985 to 2018) in 25 environments in the US Corn Belt with APSIM. Then we compared yield simulation results and nine parameter estimates from our proposed parallel Bayesian framework, with Bayesian optimization and manual calibration. Results indicated that parameters calibrated using the proposed framework achieved an 11.6% reduction in the prediction error over Bayesian optimization and a 52.1% reduction over manual calibration. We also trained nine machine learning models for yield prediction and found that none of them was able to outperform the proposed method in terms of root mean square error and R2. The most significant contribution of the new automated framework for time-dependent parameter estimation is its capability to find close-to-optimal parameters for the crop model. The proposed approach also produced explainable insight into cultivar traits' trends over 34 years (1985-2018).
ABSTRACT
Identifying mechanisms and pathways involved in gene-environment interplay and phenotypic plasticity is a long-standing challenge. It is highly desirable to establish an integrated framework with an environmental dimension for complex trait dissection and prediction. A critical step is to identify an environmental index that is both biologically relevant and estimable for new environments. With extensive field-observed complex traits, environmental profiles, and genome-wide single nucleotide polymorphisms for three major crops (maize, wheat, and oat), we demonstrated that identifying such an environmental index (i.e., a combination of environmental parameter and growth window) enables genome-wide association studies and genomic selection of complex traits to be conducted with an explicit environmental dimension. Interestingly, genes identified for two reaction-norm parameters (i.e., intercept and slope) derived from flowering time values along the environmental index were less colocalized for a diverse maize panel than for wheat and oat breeding panels, agreeing with the different diversity levels and genetic constitutions of the panels. In addition, we showcased the usefulness of this framework for systematically forecasting the performance of diverse germplasm panels in new environments. This general framework and the companion CERIS-JGRA analytical package should facilitate biologically informed dissection of complex traits, enhanced performance prediction in breeding for future climates, and coordinated efforts to enrich our understanding of mechanisms underlying phenotypic variation.
Subjects
Avena/genetics, Gene-Environment Interaction, Triticum/genetics, Zea mays/genetics, Avena/growth & development, Plant Gene Expression Regulation, Genome-Wide Association Study, Phenotype, Plant Breeding, Single Nucleotide Polymorphism, Triticum/growth & development, Zea mays/growth & development
ABSTRACT
This study investigates whether coupling crop modeling and machine learning (ML) improves corn yield predictions in the US Corn Belt. The main objectives are to explore whether a hybrid approach (crop modeling + ML) would result in better predictions, investigate which combinations of hybrid models provide the most accurate predictions, and determine the features from the crop modeling that are most effective to be integrated with ML for corn yield prediction. Five ML models (linear regression, LASSO, LightGBM, random forest, and XGBoost) and six ensemble models have been designed to address the research question. The results suggest that adding simulation crop model variables (APSIM) as input features to ML models can decrease yield prediction root mean squared error (RMSE) by 7 to 20%. Furthermore, we investigated partial inclusion of APSIM features in the ML prediction models and found that soil moisture related APSIM variables are most influential on the ML predictions, followed by crop-related and phenology-related variables. Finally, based on the feature importance measure, simulated APSIM average drought stress and average water table depth during the growing season are the most important APSIM inputs to ML. This result indicates that weather information alone is not sufficient and ML models need more hydrological inputs to make improved yield predictions.
Subjects
Crop Production, Machine Learning, Zea mays/growth & development, Climate, United States
ABSTRACT
The emergence of new technologies to synthesize and analyze big data with high-performance computing has increased our capacity to more accurately predict crop yields. Recent research has shown that machine learning (ML) can provide reasonable predictions faster and with higher flexibility compared to simulation crop modeling. However, a single machine learning model can be outperformed by a "committee" of models (machine learning ensembles) that can reduce prediction bias, variance, or both and is able to better capture the underlying distribution of the data. Yet, there are many aspects to be investigated with regard to prediction accuracy, time of the prediction, and scale. The earlier the prediction during the growing season the better, but this has not been thoroughly investigated as previous studies considered all data available to predict yields. This paper provides a machine learning based framework to forecast corn yields in three US Corn Belt states (Illinois, Indiana, and Iowa) considering complete and partial in-season weather knowledge. Several ensemble models are designed using a blocked sequential procedure to generate out-of-bag predictions. The forecasts are made at the county level and aggregated to the agricultural district and state levels. Results show that the proposed optimized weighted ensemble and the average ensemble are the most precise models with RRMSE of 9.5%. Stacked LASSO makes the least biased predictions (MBE of 53 kg/ha), while other ensemble models also outperformed the base learners in terms of bias. On the contrary, although random k-fold cross-validation is replaced by a blocked sequential procedure, it is shown that stacked ensembles do not perform as well as weighted ensemble models for time series data sets as they require the data to be non-IID to perform favorably. Comparing our proposed model forecasts with the literature demonstrates the acceptable performance of forecasts made by our proposed ensemble model. 
Results from the scenario of having partial in-season weather knowledge reveal that decent yield forecasts with RRMSE of 9.2% can be made as early as June 1st. Moreover, it was shown that the proposed model performed better than individual models and benchmark ensembles at agricultural district and state-level scales as well as county-level scale. To find the marginal effect of each input feature on the forecasts made by the proposed ensemble model, a methodology is suggested that is the basis for finding feature importance for the ensemble model. The findings suggest that weather features corresponding to weather in weeks 18-24 (May 1st to June 1st) are the most important input features.
ABSTRACT
Despite the detrimental impact that excess moisture can have on soybean (Glycine max [L.] Merr) yields, most of today's crop models do not capture soybean's dynamic responses to waterlogged conditions. In light of this, we synthesized literature data and used the APSIM software to enhance the modeling capacity to simulate plant growth, development, and N fixation response to flooding. Literature data included greenhouse and field experiments from across the U.S. that investigated the impact of flood timing and duration on soybean. Five datasets were used for model parameterization of new functions and three datasets were used for testing. Improvements in prediction accuracy were quantified by comparing model performance before and after the implementation of new stage-dependent excess water functions for phenology, photosynthesis and N-fixation. The relative root mean square error (RRMSE) for yield predictions improved by 26% and the RRMSE predictions of biomass improved by 40%. Extensive model testing found that the improved model accurately simulates plant responses to flooding including how these responses change with flood timing and duration. When used to project soybean response to future climate scenarios, the model showed that intense rain events had a greater negative effect on yield than a 25% increase in rainfall distributed over 1 or 3 month(s). These developments advance our ability to understand, predict and, thereby, mitigate yield loss as increases in climatic volatility lead to more frequent and intense flooding events in the future.
ABSTRACT
A delayed harvest of maize and soybean crops is associated with yield or revenue losses, whereas a premature harvest requires additional costs for artificial grain drying. Accurately predicting the ideal harvest date can increase profitability of US Midwest farms, but today's predictive capacity is low. To fill this gap, we collected and analyzed time-series grain moisture datasets from field experiments in Iowa, Minnesota and North Dakota, US with various maize (n = 102) and soybean (n = 36) genotype-by-environment treatments. Our goal was to examine factors driving the post-maturity grain drying process, and develop scalable algorithms for decision-making. The algorithms evaluated are driven by changes in the grain equilibrium moisture content (function of air relative humidity and temperature) and require three input parameters: moisture content at physiological maturity, a drying coefficient and a power constant. Across independent genotypes and environments, the calibrated algorithms accurately predicted grain dry-down of maize (r2 = 0.79; root mean square error, RMSE = 1.8% grain moisture) and soybean field crops (r2 = 0.72; RMSE = 6.7% grain moisture). Evaluation of variance components and treatment effects revealed that genotypes, weather-years, and planting dates had little influence on the post-maturity drying coefficient, but significantly influenced grain moisture content at physiological maturity. Therefore, accurate implementation of the algorithms across environments would require estimating the initial grain moisture content, via modeling approaches or in-field measurements. Our work contributes new insights to understand the post-maturity grain dry-down and provides a robust and scalable predictive algorithm to forecast grain dry-down and ideal harvest dates across environments in the US Corn Belt.
Subjects
Algorithms, Glycine max/growth & development, Zea mays/growth & development, Crop Production, Desiccation/methods, Edible Grain/chemistry, Genotype, Glycine max/genetics, Temperature, Water/chemistry, Zea mays/genetics
ABSTRACT
Evidence suggests that global maize yield declines with a warming climate, particularly with extreme heat events. However, the degree to which important maize processes such as biomass growth rate, growing season length (GSL) and grain formation are impacted by an increase in temperature is uncertain. Such knowledge is necessary to understand yield responses and develop crop adaptation strategies under a warmer climate. Here crop models, satellite observations, survey, and field data were integrated to investigate how high temperature stress influences maize yield in the U.S. Midwest. We showed that both observational evidence and the crop model ensemble mean (MEM) suggest that the nonlinear sensitivity in yield was driven by the intensified sensitivity of harvest index (HI), but MEM underestimated the warming effects through HI and overstated the effects through GSL. Further analysis showed that the intensified sensitivity in HI mainly results from a greater sensitivity of yield to high temperature stress during the grain filling period, which explained more than half of the yield reduction. When warming effects were decomposed into direct heat stress and indirect water stress (WS), observational data suggest that yield is more reduced by direct heat stress (-4.6 ± 1.0%/°C) than by WS (-1.7 ± 0.65%/°C), whereas MEM gives opposite results. This discrepancy implies that yield reduction by heat stress is underestimated, whereas the yield benefit of increasing atmospheric CO2 might be overestimated in crop models, because elevated CO2 brings yield benefit through water conservation effect but produces limited benefit over heat stress. Our analysis through integrating data and crop models suggests that future adaptation strategies should be targeted at the heat stress during grain formation and changes in agricultural management need to be better accounted for to adequately estimate the effects of heat stress.
Subjects
Hot Temperature, Zea mays, Agriculture, Edible Grain, Temperature
ABSTRACT
Crop yield prediction is extremely challenging due to its dependence on multiple factors such as crop genotype, environmental factors, management practices, and their interactions. This paper presents a deep learning framework using convolutional neural networks (CNNs) and recurrent neural networks (RNNs) for crop yield prediction based on environmental data and management practices. The proposed CNN-RNN model, along with other popular methods such as random forest (RF), deep fully connected neural networks (DFNN), and LASSO, was used to forecast corn and soybean yield across the entire Corn Belt (including 13 states) in the United States for years 2016, 2017, and 2018 using historical data. The new model achieved a root-mean-square-error (RMSE) of 9% and 8% of their respective average yields, substantially outperforming all other methods that were tested. The CNN-RNN has three salient features that make it a potentially useful method for other crop yield prediction studies. (1) The CNN-RNN model was designed to capture the time dependencies of environmental factors and the genetic improvement of seeds over time without having their genotype information. (2) The model demonstrated the capability to generalize the yield prediction to untested environments without significant drop in the prediction accuracy. (3) Coupled with the backpropagation method, the model could reveal the extent to which weather conditions, accuracy of weather predictions, soil conditions, and management practices were able to explain the variation in the crop yields.
ABSTRACT
Historically, crop models have been used to evaluate crop yield responses to nitrogen (N) rates after harvest, when it is too late for farmers to make in-season adjustments. We hypothesize that the use of a crop model as an in-season forecast tool will improve current N decision-making. To explore this, we used the Agricultural Production Systems sIMulator (APSIM) calibrated with long-term experimental data for central Iowa, USA (16 years in continuous corn and 15 years in soybean-corn rotation) combined with actual weather data up to a specific crop stage and historical weather data thereafter. The objectives were to: (1) evaluate the accuracy and uncertainty of corn yield and economic optimum N rate (EONR) predictions at four forecast times (planting time, 6th and 12th leaf, and silking phenological stages); (2) determine whether the use of analogous historical weather years based on precipitation and temperature patterns as opposed to using a 35-year dataset could improve the accuracy of the forecast; and (3) quantify the value added by the crop model in predicting annual EONR and yields using the site-mean EONR and the yield at the EONR to benchmark predicted values. Results indicated that the mean corn yield predictions at planting time (R2 = 0.77) using 35 years of historical weather were close to the observed and predicted yield at maturity (R2 = 0.81). Across all forecasting times, the EONR predictions were more accurate in corn-corn than soybean-corn rotation (relative root mean square error, RRMSE, of 25 vs. 45%, respectively). At planting time, the APSIM model predicted the direction of optimum N rates (above, below or at average site-mean EONR) in 62% of the cases examined (n = 31) with an average error range of ±38 kg N ha-1 (22% of the average N rate). Across all forecast times, prediction error of EONR was about three times higher than yield predictions. 
The use of the 35-year weather record was better than using selected historical weather years to forecast (RRMSE was on average 3% lower). Overall, the proposed approach of using the crop model as a forecasting tool could improve year-to-year predictability of corn yields and optimum N rates. Further improvements in modeling and set-up protocols are needed for more accurate forecasts, especially for extreme weather years with the most significant economic and environmental cost.
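The forecasting setup in this abstract combines actual weather up to the forecast date with historical weather for the rest of the season. The sketch below is one plausible way to build such scenario inputs before passing them to a crop model. The function name, the dictionary layout, and the daily values are hypothetical; the APSIM run itself is not shown.

```python
def build_weather_scenarios(observed_to_date, historical_years):
    """Splice observed weather onto the remainder of each historical year.

    observed_to_date: daily values observed this season up to the forecast day.
    historical_years: dict mapping year -> full-season daily values.
    Returns a dict mapping year -> one full-season scenario.
    Assumes all series are aligned on the same day of season (hypothetical layout).
    """
    n_observed = len(observed_to_date)
    scenarios = {}
    for year, series in historical_years.items():
        if len(series) < n_observed:
            raise ValueError(f"historical year {year} is shorter than the observed record")
        # Observed days first, then that year's weather for the remaining days.
        scenarios[year] = list(observed_to_date) + list(series[n_observed:])
    return scenarios


# Illustrative daily rainfall (mm) for a four-day "season".
observed = [1.0, 2.0]
history = {2000: [9.0, 9.0, 3.0, 4.0], 2001: [8.0, 8.0, 5.0, 6.0]}
print(build_weather_scenarios(observed, history))
```

Running the crop model once per scenario gives a distribution of simulated yields, and the spread of that distribution narrows as more of the season is covered by observed weather.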