Postprocessing functionalities
Simple postprocessing of MLR model output.
Description
This diagnostic performs postprocessing operations for MLR model output (mean and error).
Project
CRESCENDO
Notes
Prior to postprocessing, this diagnostic groups input datasets according to
tag
and prediction_name
. For each group, accepts datasets with three
different var_type
s:
prediction_output
: Exactly one necessary, refers to the mean prediction and serves as reference dataset (regarding shape).prediction_output_error
: Arbitrary number of error datasets. If not given, error calculation is skipped. May be squared errors (marked by the attributesquared
) or not. In addition, a single covariance dataset can be specified (short_name
ending with_cov
).prediction_input
: Dataset used to estimate covariance structure of the mean prediction (i.e. matrix of Pearson correlation coefficients) for error estimation. At most one dataset allowed. Ignored when noprediction_output_error
is given. This is only possible when (1) the shape of theprediction_input
dataset is identical to the shape of theprediction_output_error
datasets, (2) the number of dimensions of theprediction_input
dataset is higher than the number of dimensions of theprediction_output_error
datasets and they have identical trailing (rightmost) dimensions or (3) the number of dimensions of theprediction_input
dataset is higher than the number of dimensions ofprediction_output_error
datasets and all dimensions of theprediction_output_error
datasets are mapped to a corresponding dimension of theprediction_input
using thecov_estimate_dim_map
option (e.g. whenprediction_input
has shape(10, 5, 100, 20)
andprediction_output_error
has shape(5, 20)
, you can usecov_estimate_dim_map: [1, 3]
to map the dimensions ofprediction_output_error
to dimension 1 and 3 ofprediction_input
).
All data with other var_type
s is ignored (feature
, label
, etc.).
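The dimension mapping from case (3) above can be illustrated with a small NumPy sketch using the shapes from the example. The variable names are illustrative only, not part of the diagnostic:

```python
import numpy as np

# Hypothetical data with the shapes from the example in the notes above.
prediction_input = np.random.rand(10, 5, 100, 20)
prediction_output_error = np.random.rand(5, 20)

# cov_estimate_dim_map: [1, 3] maps error dimension 0 to input dimension 1
# and error dimension 1 to input dimension 3.
cov_estimate_dim_map = [1, 3]

# The mapped dimensions of prediction_input must match the shape of the
# error datasets exactly.
mapped_shape = tuple(prediction_input.shape[i] for i in cov_estimate_dim_map)
assert mapped_shape == prediction_output_error.shape  # (5, 20)
```

In this setup the remaining dimensions of prediction_input (here 0 and 2) provide the samples from which the correlation structure is estimated.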
Real error calculation (using a covariance dataset given as
prediction_output_error) and error estimation (using a prediction_input
dataset to estimate the covariance structure) are only possible if the mean
prediction cube is collapsed completely during postprocessing, i.e. if all
coordinates are listed for either mean or sum.
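To see why a complete collapse is needed, note that propagating a full covariance matrix C through a collapse yields a single scalar variance, e.g. wᵀCw for a mean with weights w = 1/n. A minimal sketch with a toy covariance matrix (not the diagnostic's actual code):

```python
import numpy as np

# Toy covariance matrix for n = 4 correlated grid cells: variance 1.0 on the
# diagonal, covariance 0.5 everywhere else (values are made up).
n = 4
cov = np.full((n, n), 0.5) + 0.5 * np.eye(n)

# Variance of the mean of correlated quantities: w^T C w with w = 1/n.
w = np.full(n, 1.0 / n)
var_mean = w @ cov @ w   # 0.625 for this toy matrix
std_mean = np.sqrt(var_mean)
```

Ignoring the off-diagonal terms here would give 1.0 / n = 0.25 instead of 0.625, which is why the covariance information matters for the collapsed error.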
Configuration options in recipe
- add_var_from_cov: bool, optional (default: True)
  Calculate variances from the covariance matrix (diagonal elements) and add
  them to the (squared) error datasets. Set to False if the variance is
  already given separately in the prediction output.
- area_weighted: bool, optional (default: True)
  Calculate weighted averages/sums when collapsing over latitude and/or
  longitude coordinates using grid cell areas (calculated from grid cell
  bounds). Only possible for datasets on regular grids that contain latitude
  and longitude coordinates.
- convert_units_to: str, optional
  Convert units of the input data.
- cov_estimate_dim_map: list of int, optional
  Map dimensions of the prediction_output_error datasets to corresponding
  dimensions of prediction_input used for estimating the covariance. Only
  relevant if both dataset types are given. See the notes above for more
  information.
- ignore: list of dict, optional
  Ignore specific datasets by specifying multiple dicts of metadata.
- landsea_fraction_weighted: str, optional
  When given, calculate weighted averages/sums when collapsing over latitude
  and/or longitude coordinates using land/sea fractions (calculated using
  Natural Earth masks). Only possible if the dataset contains latitude and
  longitude coordinates. Must be one of 'land', 'sea'.
- mean: list of str, optional
  Perform a mean over the given coordinates.
- pattern: str, optional
  Pattern matched against ancestor file names.
- sum: list of str, optional
  Perform a sum over the given coordinates.
- time_weighted: bool, optional (default: True)
  Calculate weighted averages/sums over time (using time bounds).
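For orientation, these options might be combined in a recipe's script section as in the following sketch. The diagnostic name, target units, and coordinate choices are assumptions for illustration and must be adapted to your own recipe:

```yaml
postprocess_mlr:                      # hypothetical diagnostic name
  description: Postprocess MLR model output.
  scripts:
    postprocess:
      script: mlr/postprocess.py
      convert_units_to: 'Gt yr-1'     # assumed target units
      area_weighted: true
      time_weighted: true
      mean:                           # collapse all coordinates so that
        - time                        # real error calculation is possible
        - latitude
        - longitude
```

Listing every coordinate under mean (or sum) here satisfies the complete-collapse requirement described in the notes above.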