Postprocessing functionalities
Simple postprocessing of MLR model output.
Description
This diagnostic performs postprocessing operations for MLR model output (mean and error).
Project
CRESCENDO
Notes

Prior to postprocessing, this diagnostic groups input datasets according to `tag` and `prediction_name`. For each group, it accepts datasets with three different `var_type`s:

- `prediction_output`: Exactly one necessary; refers to the mean prediction and serves as the reference dataset (regarding shape).
- `prediction_output_error`: Arbitrary number of error datasets. If not given, error calculation is skipped. These may be squared errors (marked by the attribute `squared`) or not. In addition, a single covariance dataset can be specified (`short_name` ending with `_cov`).
- `prediction_input`: Dataset used to estimate the covariance structure of the mean prediction (i.e. the matrix of Pearson correlation coefficients) for error estimation. At most one dataset is allowed. Ignored when no `prediction_output_error` is given. This is only possible when (1) the shape of the `prediction_input` dataset is identical to the shape of the `prediction_output_error` datasets; (2) the number of dimensions of the `prediction_input` dataset is higher than the number of dimensions of the `prediction_output_error` datasets and they have identical trailing (rightmost) dimensions; or (3) the number of dimensions of the `prediction_input` dataset is higher than the number of dimensions of the `prediction_output_error` datasets and all dimensions of the `prediction_output_error` datasets are mapped to corresponding dimensions of `prediction_input` using the `cov_estimate_dim_map` option (e.g. when `prediction_input` has shape `(10, 5, 100, 20)` and `prediction_output_error` has shape `(5, 20)`, you can use `cov_estimate_dim_map: [1, 3]` to map the dimensions of `prediction_output_error` to dimensions 1 and 3 of `prediction_input`).

All data with other `var_type`s (`feature`, `label`, etc.) is ignored.
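The `cov_estimate_dim_map` example above can be sketched with NumPy. This is a minimal illustration of the dimension mapping and correlation estimate, not the diagnostic's actual code; the array contents are random placeholders:

```python
import numpy as np

# Shapes from the example above: `prediction_input` has shape
# (10, 5, 100, 20) and `prediction_output_error` has shape (5, 20).
# cov_estimate_dim_map: [1, 3] maps the error dims to input dims 1 and 3.
rng = np.random.default_rng(0)
prediction_input = rng.standard_normal((10, 5, 100, 20))
dim_map = [1, 3]

# Move the mapped dimensions to the back, then flatten the remaining
# (leading) dimensions into a sample axis of size 10 * 100 = 1000.
reordered = np.moveaxis(prediction_input, dim_map, [-2, -1])  # (10, 100, 5, 20)
samples = reordered.reshape(-1, 5 * 20)                       # (1000, 100)

# Matrix of Pearson correlation coefficients between all
# 5 * 20 = 100 target points, estimated from the 1000 samples.
corr = np.corrcoef(samples, rowvar=False)
print(corr.shape)  # (100, 100)
```

Each point of the error datasets thus corresponds to one column of `samples`, and the leading dimensions of `prediction_input` act as samples for the correlation estimate.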
Real error calculation (using a covariance dataset given as `prediction_output_error`) and error estimation (using a `prediction_input` dataset to estimate the covariance structure) are only possible if the mean prediction cube is collapsed completely during postprocessing, i.e. all coordinates are listed for either `mean` or `sum`.
Configuration options in recipe

- add_var_from_cov: bool, optional (default: True)
  Calculate variances from the covariance matrix (diagonal elements) and add them to the (squared) error datasets. Set to `False` if the variance is already given separately in the prediction output.
- area_weighted: bool, optional (default: True)
  Calculate weighted averages/sums when collapsing over latitude and/or longitude coordinates using grid cell areas (calculated from grid cell bounds). Only possible for datasets on regular grids that contain `latitude` and `longitude` coordinates.
- convert_units_to: str, optional
  Convert units of the input data.
- cov_estimate_dim_map: list of int, optional
  Map dimensions of the `prediction_output_error` datasets to corresponding dimensions of `prediction_input` used for estimating the covariance. Only relevant if both dataset types are given. See the notes above for more information.
- ignore: list of dict, optional
  Ignore specific datasets by specifying multiple `dict`s of metadata.
- landsea_fraction_weighted: str, optional
  If given, calculate weighted averages/sums when collapsing over latitude and/or longitude coordinates using the land/sea fraction (calculated using Natural Earth masks). Only possible if the datasets contain `latitude` and `longitude` coordinates. Must be one of `'land'`, `'sea'`.
- mean: list of str, optional
  Perform mean over the given coordinates.
- pattern: str, optional
  Pattern matched against ancestor file names.
- sum: list of str, optional
  Perform sum over the given coordinates.
- time_weighted: bool, optional (default: True)
  Calculate weighted averages/sums for time (using time bounds).
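These options are set in the diagnostic's script entry of the recipe. An illustrative fragment follows; the diagnostic/script names, the `convert_units_to` value, and the `pattern` are invented for the example and the surrounding recipe layout is an assumption, not verbatim documentation:

```yaml
diagnostics:
  postprocess_mlr:          # hypothetical diagnostic name
    scripts:
      postprocess:          # hypothetical script name
        script: mlr/postprocess.py
        ancestors: ['diag_mlr/main']   # assumed ancestor task
        pattern: '*prediction*.nc'
        convert_units_to: 'Gt yr-1'
        area_weighted: true
        mean: [latitude, longitude]
```

Here all coordinates of the mean prediction cube are listed under `mean`, so the cube is collapsed completely and real error calculation/estimation (see notes above) is possible.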