Constraining uncertainty in projected gross primary production (GPP) with machine learning

Overview

These recipes reproduce the analysis of Schlund et al. (2020). The paper proposes a machine learning regression (MLR) approach, based on the Gradient Boosted Regression Trees (GBRT) algorithm, that uses observations of process-based diagnostics to constrain uncertainties in projected gross primary production (GPP) under the RCP 8.5 scenario.

Available recipes and diagnostics

Recipes are stored in recipes/

  • schlund20jgr/recipe_schlund20jgr_gpp_abs_rcp85.yml

  • schlund20jgr/recipe_schlund20jgr_gpp_change_1pct.yml

  • schlund20jgr/recipe_schlund20jgr_gpp_change_rcp85.yml

Diagnostics are stored in diag_scripts/

General information (including an example and more details) on machine learning regression (MLR) diagnostics is given here. The API documentation is available here.

Variables

  • co2s (atmos, monthly, longitude, latitude, time)

  • gpp (land, monthly, longitude, latitude, time)

  • gppStderr (land, monthly, longitude, latitude, time)

  • lai (land, monthly, longitude, latitude, time)

  • pr (atmos, monthly, longitude, latitude, time)

  • rsds (atmos, monthly, longitude, latitude, time)

  • tas (atmos, monthly, longitude, latitude, time)

Observations and reformat scripts

References

  • Schlund et al., JGR: Biogeosciences, accepted (2020). TBA

Example plots

../_images/map_prediction_output___GBRT_change.png

Fig. 50 GBRT-based prediction of the fractional GPP change over the 21st century (= GPP(2091-2100) / GPP(1991-2000)).
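
The ratio in the caption above can be sketched for a single grid cell as follows. This is a minimal illustration and not the diagnostic's actual code: it assumes that each term is a time mean over the stated decade, and the function name fractional_change and the sample values are hypothetical.

```python
from statistics import mean


def fractional_change(gpp_future, gpp_historical):
    """Return mean(gpp_future) / mean(gpp_historical) for one grid cell."""
    historical_mean = mean(gpp_historical)
    if historical_mean == 0:
        raise ValueError("historical mean GPP is zero; ratio is undefined")
    return mean(gpp_future) / historical_mean


# Hypothetical annual GPP values for one grid cell (arbitrary units).
gpp_1991_2000 = [2.0, 2.1, 1.9, 2.0, 2.0, 2.1, 1.9, 2.0, 2.0, 2.0]
gpp_2091_2100 = [3.0, 3.1, 2.9, 3.0, 3.0, 3.1, 2.9, 3.0, 3.0, 3.0]

change = fractional_change(gpp_2091_2100, gpp_1991_2000)
print(f"fractional GPP change: {change:.2f}")
```

A value above 1 indicates an increase in GPP relative to 1991-2000, a value below 1 a decrease.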

../_images/map_prediction_output_error___GBRT_change.png

Fig. 51 Corresponding error of the GBRT-based prediction of the fractional GPP change over the 21st century (considering errors in the MLR model and errors in the predictors).

../_images/map_prediction_output___GBRT_abs.png

Fig. 52 GBRT-based prediction of the absolute GPP at the end of the 21st century (2091-2100).

../_images/map_prediction_output_error___GBRT_abs.png

Fig. 53 Corresponding error of the GBRT-based prediction of the absolute GPP at the end of the 21st century (considering errors in the MLR model and errors in the predictors).

../_images/rmse_plot.png

Fig. 54 Boxplot of the root mean square error of prediction (RMSEP) distributions for six different statistical models used to predict future absolute GPP (2091-2100) using a leave-one-model-out cross-validation approach. The distribution for each statistical model contains seven points (black dots, one for each climate model used as truth) and is represented in the following way: the lower and upper limits of the blue boxes correspond to the 25% and 75% quantiles, respectively. The central line in the box shows the median, and the black “x” marks the mean of the distribution. The whiskers outside the box represent the range of the distribution.
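
The leave-one-model-out procedure behind this boxplot can be sketched as follows. This is a hedged illustration, not the recipe's implementation: the statistical model is replaced by a simple stand-in (the mean of the remaining climate models), and the model names, GPP values, and function names are hypothetical. The quartiles use the default convention of Python's statistics.quantiles, which may differ slightly from the convention used for the figure.

```python
from math import sqrt
from statistics import mean, median, quantiles


def rmsep(predicted, truth):
    """Root mean square error of prediction for two equal-length sequences."""
    if len(predicted) != len(truth):
        raise ValueError("sequences must have the same length")
    return sqrt(mean((p - t) ** 2 for p, t in zip(predicted, truth)))


def leave_one_model_out(fields):
    """Return one RMSEP per climate model, using each model once as truth.

    `fields` maps a model name to its GPP values on a common grid.
    Stand-in statistical model (assumption): the cell-wise mean of all
    remaining climate models.
    """
    scores = {}
    for held_out, truth in fields.items():
        others = [vals for name, vals in fields.items() if name != held_out]
        prediction = [mean(cell) for cell in zip(*others)]
        scores[held_out] = rmsep(prediction, truth)
    return scores


# Hypothetical GPP values for seven climate models on a three-cell grid.
fields = {
    "model_a": [1.0, 2.0, 3.0],
    "model_b": [1.2, 2.1, 2.8],
    "model_c": [0.9, 1.8, 3.3],
    "model_d": [1.1, 2.4, 3.1],
    "model_e": [1.4, 1.9, 2.7],
    "model_f": [0.8, 2.2, 3.0],
    "model_g": [1.0, 2.0, 3.5],
}

scores = leave_one_model_out(fields)

# Summary statistics corresponding to the elements drawn in the boxplot.
values = sorted(scores.values())
q25, _, q75 = quantiles(values, n=4)
summary = {
    "min": values[0],          # lower whisker (range of the distribution)
    "q25": q25,                # lower limit of the box
    "median": median(values),  # central line
    "mean": mean(values),      # black "x"
    "q75": q75,                # upper limit of the box
    "max": values[-1],         # upper whisker (range of the distribution)
}
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Repeating this loop for each statistical model yields one seven-point RMSEP distribution per model, which is what the figure compares.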

../_images/feature_importance.png

Fig. 55 Global feature importance of the GBRT model for prediction of the absolute GPP at the end of the 21st century (2091-2100).

../_images/residuals_distribution.png

Fig. 56 Distribution of the residuals of the GBRT model for the prediction of absolute GPP at the end of the 21st century (2091-2100) for the training data (blue) and test data excluded from training (green).

../_images/training_progress.png

Fig. 57 Training progress of the GBRT model for the prediction of absolute GPP at the end of the 21st century (2091-2100) evaluated as normalized root mean square error on the training data (blue) and test data excluded from training (green).