Shared diagnostic script code

Code that is shared between multiple diagnostic scripts.

Functions

`run_diagnostic`()	Run a Python diagnostic.
`save_figure`(basename, provenance, cfg[, …])	Save a figure to file.
`save_data`(basename, provenance, cfg, cube, …)	Save the data used to create a plot to file.
`get_plot_filename`(basename, cfg)	Get a valid path for saving a diagnostic plot.
`get_diagnostic_filename`(basename, cfg[, …])	Get a valid path for saving a diagnostic data file.
`select_metadata`(metadata, **attributes)	Select specific metadata describing preprocessed data.
`sorted_metadata`(metadata, sort)	Sort a list of metadata describing preprocessed data.
`group_metadata`(metadata, attribute[, sort])	Group metadata describing preprocessed data by attribute.
`sorted_group_metadata`(metadata_groups, sort)	Sort grouped metadata.
`extract_variables`(cfg[, as_iris])	Extract basic variable information from configuration dictionary.
`variables_available`(cfg, short_names)	Check if data from certain variables is available.
`get_cfg`([filename])	Read diagnostic script configuration from settings.yml.
`get_control_exper_obs`(short_name, …)	Get control, exper and obs datasets.
`apply_supermeans`(ctrl, exper, obs_list)	Apply supermeans on data components, i.e. the mean over time.

Classes

`ProvenanceLogger`(cfg)	Open the provenance logger.
`Variable`(short_name, standard_name, …)
`Variables`([cfg])	Class to easily access a recipe’s variables in a diagnostic.
`Datasets`(cfg)	Class to easily access a recipe’s datasets in a diagnostic script.

class esmvaltool.diag_scripts.shared.Datasets(cfg)

Bases: object

Class to easily access a recipe’s datasets in a diagnostic script.

Examples

Get all datasets of a recipe configuration cfg:

datasets = Datasets(cfg)

Access data of a dataset with path dataset_path:

datasets.get_data(path=dataset_path)

Access dataset information of the dataset:

datasets.get_dataset_info(path=dataset_path)

Access the data of all datasets with exp=piControl:

datasets.get_data_list(exp='piControl')

Methods

`add_dataset`(path[, data])	Add dataset to class.
`add_to_data`(data[, path])	Add element to a dataset’s data.
`get_data`([path])	Access a dataset’s data.
`get_data_list`(**dataset_info)	Access the datasets’ data in a list.
`get_dataset_info`([path])	Access a dataset’s information.
`get_dataset_info_list`(**dataset_info)	Access the datasets’ information in a list.
`get_info`(key[, path])	Access a dataset_info’s key.
`get_info_list`(key, **dataset_info)	Access dataset_info’s key values.
`get_path`(**dataset_info)	Access a dataset’s path.
`get_path_list`(**dataset_info)	Access the datasets’ paths in a list.
`set_data`(data[, path])	Set element as a dataset’s data.
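The lookup rules documented for this class (select a dataset by path, or by keyword arguments that must match exactly one dataset, with the list variants sorted by path) can be illustrated with a small stand-in. This is a sketch only, not the ESMValTool implementation: `MiniDatasets`, its internals and the example paths are hypothetical, and its handling of zero matches is an assumption.

```python
class MiniDatasets:
    """Hypothetical stand-in that mimics the documented lookup rules."""

    def __init__(self):
        # Maps each unique path to its (dataset_info, data) pair.
        self._entries = {}

    def add_dataset(self, path, data=None, **dataset_info):
        self._entries[path] = (dict(dataset_info), data)

    def get_path_list(self, **dataset_info):
        # Keep paths whose info matches every keyword, sorted by path.
        return sorted(
            path for path, (info, _) in self._entries.items()
            if all(info.get(key) == val for key, val in dataset_info.items())
        )

    def get_path(self, **dataset_info):
        paths = self.get_path_list(**dataset_info)
        # Assumption: zero matches are treated like ambiguous matches here.
        if len(paths) != 1:
            raise RuntimeError(
                f"Expected exactly one dataset for {dataset_info}, "
                f"found {len(paths)}")
        return paths[0]

    def get_data(self, path=None, **dataset_info):
        if path is None:
            path = self.get_path(**dataset_info)
        return self._entries[path][1]

    def get_data_list(self, **dataset_info):
        return [self._entries[p][1]
                for p in self.get_path_list(**dataset_info)]


datasets = MiniDatasets()
datasets.add_dataset('/b.nc', data=2, dataset='CanESM2', exp='piControl')
datasets.add_dataset('/a.nc', data=1, dataset='MPI-ESM', exp='piControl')
datasets.add_dataset('/c.nc', data=3, dataset='CanESM2', exp='historical')

# Data of all piControl datasets, ordered by path ('/a.nc' before '/b.nc').
pi_control_data = datasets.get_data_list(exp='piControl')
# A description matching exactly one dataset selects it directly.
unique_data = datasets.get_data(dataset='MPI-ESM')
```

Calling `datasets.get_data(dataset='CanESM2')` in this sketch raises RuntimeError, because two datasets match that description.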

add_dataset(path, data=None, **dataset_info)

Add dataset to class.

Parameters

path (str) – (Unique) path to the dataset.
data (optional) – Arbitrary object to be saved as data for the dataset.
**dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.

add_to_data(data, path=None, **dataset_info)

Add element to a dataset’s data.

Notes

Either path or a unique dataset_info description has to be given. Fails when the given information is ambiguous.

Parameters

data – Element to be added to the dataset’s data.
path (str, optional) – Path to the dataset.
**dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.

Raises

RuntimeError – If data given by dataset_info is ambiguous.

get_data(path=None, **dataset_info)

Access a dataset’s data.

Notes

Either path or a unique dataset_info description has to be given. Fails when the given information is ambiguous.

Parameters

path (str, optional) – Path to the dataset.
**dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.

Returns

Data of the selected dataset.

Return type

data_object

Raises

RuntimeError – If data given by dataset_info is ambiguous.

get_data_list(**dataset_info)

Access the datasets’ data in a list.

Notes

The returned data is sorted alphabetically by path.

Parameters: **dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.
Returns: Data of the selected datasets.
Return type: list

get_dataset_info(path=None, **dataset_info)

Access a dataset’s information.

Notes

Either path or a unique dataset_info description has to be given. Fails when the given information is ambiguous.

Parameters

path (str, optional) – Path to the dataset.
**dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.

Returns

All dataset information.

Return type

dict

Raises

RuntimeError – If data given by dataset_info is ambiguous.

get_dataset_info_list(**dataset_info)

Access the datasets’ information in a list.

Notes

The returned data is sorted alphabetically by path.

Parameters: **dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.
Returns: Information dictionaries of the selected datasets.
Return type: list

get_info(key, path=None, **dataset_info)

Access a dataset_info’s key.

Notes

Either path or a unique dataset_info description has to be given. Fails when the given information is ambiguous. If the dataset_info does not contain the key, returns None.

Parameters

key (str) – Desired dictionary key.
path (str, optional) – Path to the dataset.
**dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.

Returns

Value stored under key for the given dataset, or None if the key is absent.

Return type

str

Raises

RuntimeError – If data given by dataset_info is ambiguous.

get_info_list(key, **dataset_info)

Access dataset_info’s key values.

Notes

The returned data is sorted alphabetically by path.

Parameters

key (str) – Desired dictionary key.
**dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.

Returns: Values stored under key for the selected datasets.
Return type: list

get_path(**dataset_info)

Access a dataset’s path.

Notes

A unique dataset_info description has to be given. Fails when given information is ambiguous.

Parameters: **dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.
Returns: Path of the selected dataset.
Return type: str
Raises: RuntimeError – If data given by dataset_info is ambiguous.

get_path_list(**dataset_info)

Access the datasets’ paths in a list.

Notes

The returned paths are sorted alphabetically.

Parameters: **dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.
Returns: Paths of the selected datasets.
Return type: list

set_data(data, path=None, **dataset_info)

Set element as a dataset’s data.

Notes

Either path or a unique dataset_info description has to be given. Fails when the given information is ambiguous.

Parameters

data – Element to be set as the dataset’s data.
path (str, optional) – Path to the dataset.
**dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.

Raises

RuntimeError – If data given by dataset_info is ambiguous.

class esmvaltool.diag_scripts.shared.ProvenanceLogger(cfg)

Bases: object

Open the provenance logger.

Parameters: cfg (dict) – Dictionary with diagnostic configuration.

Methods

`log`(filename, record)	Record provenance.

Example

Use as a context manager:

from esmvaltool.diag_scripts.shared import ProvenanceLogger

record = {
    'caption': "This is a nice plot.",
    'statistics': ['mean'],
    'domain': ['global'],
    'plot_type': ['zonal'],
    'authors': [
        'first_author',
        'second_author',
    ],
    'references': [
        'author20journal',
    ],
    'ancestors': [
        '/path/to/input_file_1.nc',
        '/path/to/input_file_2.nc',
    ],
}
output_file = '/path/to/result.nc'

with ProvenanceLogger(cfg) as provenance_logger:
    provenance_logger.log(output_file, record)

log(filename, record)

Record provenance.

Parameters

filename (str) – Name of the file containing the diagnostic data.
record (dict) – Dictionary with the provenance information to be logged. Typical keys are:
- ancestors
- authors
- caption
- domain
- plot_type
- references
- statistics

Note

See the provenance documentation for more information.
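Before logging, it can be handy to see which of the typical keys listed above a record lacks. The helper below is a minimal sketch under that assumption; `missing_typical_keys` is hypothetical and not part of ESMValTool, and the source describes these keys only as typical, not as required.

```python
# The typical keys listed in the documentation of log().
TYPICAL_KEYS = (
    'ancestors', 'authors', 'caption', 'domain',
    'plot_type', 'references', 'statistics',
)


def missing_typical_keys(record):
    """Return the typical provenance keys that are absent from record."""
    return [key for key in TYPICAL_KEYS if key not in record]


record = {
    'caption': "This is a nice plot.",
    'authors': ['first_author'],
    'ancestors': ['/path/to/input_file_1.nc'],
}
missing = missing_typical_keys(record)
```

Here `missing` lists the four typical keys the example record leaves out.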

class esmvaltool.diag_scripts.shared.Variable(short_name, standard_name, long_name, units)

Bases: tuple

Methods

`count`(value, /)	Return number of occurrences of value.
`index`(value[, start, stop])	Return first index of value.

Attributes

`long_name`	Alias for field number 2
`short_name`	Alias for field number 0
`standard_name`	Alias for field number 1
`units`	Alias for field number 3

count(value, /): Return number of occurrences of value.

index(value, start=0, stop=9223372036854775807, /)

Return first index of value.

Raises ValueError if the value is not present.

property long_name: Alias for field number 2

property short_name: Alias for field number 0

property standard_name: Alias for field number 1

property units: Alias for field number 3
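The tuple base class and the "Alias for field number" attributes match the way `collections.namedtuple` documents its fields. Assuming Variable is such a named tuple, an equivalent type can be sketched as follows; the attribute values used for tas are illustrative only.

```python
from collections import namedtuple

# Field order follows the documented aliases: 0, 1, 2, 3.
Variable = namedtuple(
    'Variable', ['short_name', 'standard_name', 'long_name', 'units'])

tas = Variable('tas', 'air_temperature', 'Near-Surface Air Temperature', 'K')

short = tas.short_name           # same value as tas[0]
units_position = tas.index('K')  # index of the first occurrence of 'K'
kelvin_count = tas.count('K')    # number of occurrences of 'K'
```

Because the fields are positional, the order of arguments matters when a Variable is constructed without keywords.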

class esmvaltool.diag_scripts.shared.Variables(cfg=None, **names)

Bases: object

Class to easily access a recipe’s variables in a diagnostic.

Examples

Get all variables of a recipe configuration cfg:

variables = Variables(cfg)

Access information of a variable tas:

variables.short_name('tas')
variables.standard_name('tas')
variables.long_name('tas')
variables.units('tas')

Access iris-suitable dictionary of a variable tas:

variables.iris_dict('tas')

Check if variables tas and pr are available:

variables.vars_available('tas', 'pr')

Methods

`add_vars`(**names)	Add custom variables to the class.
`iris_dict`(var)	Access `iris` dictionary of the variable.
`long_name`(var)	Access long name.
`modify_var`(var, **names)	Modify an already existing variable of the class.
`short_name`(var)	Access short name.
`short_names`()	Get list of all short_names.
`standard_name`(var)	Access standard name.
`standard_names`()	Get list of all standard_names.
`units`(var)	Access units.
`var_name`(var)	Access var name.
`vars_available`(*args)	Check if given variables are available.

add_vars(**names)

Add custom variables to the class.

Parameters: **names (dict or Variable, optional) – Keyword arguments of the form short_name=Variable_object where Variable_object can be given as dict or Variable.

iris_dict(var)

Access iris dictionary of the variable.

Parameters: var (str) – (Short) name of the variable.
Returns: Dictionary containing all attributes of the variable which can be used directly in iris (short_name replaced by var_name).
Return type: dict

long_name(var)

Access long name.

Parameters: var (str) – (Short) name of the variable.
Returns: Long name of the variable.
Return type: str

modify_var(var, **names)

Modify an already existing variable of the class.

Parameters

var (str) – (Short) name of the existing variable.
**names – Keyword arguments of the form short_name=tas.

Raises

ValueError – If var is not an existing variable.
TypeError – If an invalid keyword argument is given.

short_name(var)

Access short name.

Parameters: var (str) – (Short) name of the variable.
Returns: Short name of the variable.
Return type: str

short_names()

Get list of all short_names.

Returns: List of all short_names.
Return type: list

standard_name(var)

Access standard name.

Parameters: var (str) – (Short) name of the variable.
Returns: Standard name of the variable.
Return type: str

standard_names()

Get list of all standard_names.

Returns: List of all standard_names.
Return type: list

units(var)

Access units.

Parameters: var (str) – (Short) name of the variable.
Returns: Units of the variable.
Return type: str

var_name(var)

Access var name.

Parameters: var (str) – (Short) name of the variable.
Returns: Var name (=short name) of the variable.
Return type: str

vars_available(*args)

Check if given variables are available.

Parameters: *args – Short names of the variables to be tested.
Returns: True if variables are available, False if not.
Return type: bool

esmvaltool.diag_scripts.shared.apply_supermeans(ctrl, exper, obs_list)

Apply supermeans on data components, i.e. the mean over time.

This function is an extension of climate_statistics() meant to ease the time-averaging procedure when dealing with CONTROL, EXPERIMENT and OBS (if any) datasets.

Parameters

ctrl – Dictionary of the CONTROL dataset.
exper – Dictionary of the EXPERIMENT dataset.
obs_list – List of dictionaries for the OBS datasets (zero, one or many).

Returns: Control and experiment cubes and a list of obs cubes.

esmvaltool.diag_scripts.shared.extract_variables(cfg, as_iris=False)

Extract basic variable information from configuration dictionary.

Returns short_name, standard_name, long_name and units keys for each variable.

Parameters

cfg (dict) – Diagnostic script configuration.
as_iris (bool, optional) – Replace short_name by var_name so that the result can be used directly in iris classes.

Returns

Variable information in dicts (values) for each short_name (key).

Return type

dict
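The effect described for as_iris, replacing the short_name key by var_name, can be sketched on a plain dictionary. The helper `to_iris_dict` and the example values are hypothetical; the real function builds its result from cfg, which is not reproduced here.

```python
def to_iris_dict(variable_info):
    """Return a copy of variable_info with 'short_name' renamed to 'var_name'."""
    iris_info = dict(variable_info)  # copy, so the input stays unchanged
    iris_info['var_name'] = iris_info.pop('short_name')
    return iris_info


# Shape of the documented return value: one info dict per short_name.
variables = {
    'tas': {
        'short_name': 'tas',
        'standard_name': 'air_temperature',
        'long_name': 'Near-Surface Air Temperature',
        'units': 'K',
    },
}
iris_variables = {
    name: to_iris_dict(info) for name, info in variables.items()
}
```

The renamed dictionaries use the keyword names that iris expects, which is what makes them usable directly in iris classes.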

esmvaltool.diag_scripts.shared.get_cfg(filename=None): Read diagnostic script configuration from settings.yml.

esmvaltool.diag_scripts.shared.get_control_exper_obs(short_name, input_data, cfg, cmip_type)

Get control, exper and obs datasets.

This function is used when running recipes that need a clear distinction between a control dataset and an experiment dataset and that may have optional obs (OBS, obs4mips, etc.) datasets. Such recipes include recipe_validation and all the autoassess ones.

Parameters

short_name – Variable short name.
input_data – Dictionary containing the input data info.
cfg – Config file as used in this module.

esmvaltool.diag_scripts.shared.get_diagnostic_filename(basename, cfg, extension='nc')

Get a valid path for saving a diagnostic data file.

Parameters

basename (str) – The basename of the file.
cfg (dict) – Dictionary with diagnostic configuration.
extension (str) – File name extension.

Returns

A valid path for saving a diagnostic data file.

Return type

str
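How a basename, a configuration dictionary and an extension combine into a path can be sketched as below. This is an illustration, not the library code: `diagnostic_filename_sketch` is hypothetical, and the 'work_dir' configuration key is an assumption that this section does not confirm.

```python
import os


def diagnostic_filename_sketch(basename, cfg, extension='nc'):
    """Join an output directory, the basename and the extension."""
    # 'work_dir' is an assumed configuration key (not stated in this section).
    return os.path.join(cfg['work_dir'], f"{basename}.{extension}")


cfg = {'work_dir': os.path.join('output', 'work')}
path = diagnostic_filename_sketch('tas_mean', cfg)
csv_path = diagnostic_filename_sketch('tas_mean', cfg, extension='csv')
```

Passing only the basename keeps output locations under the control of the configuration instead of being hard-coded in each diagnostic.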

esmvaltool.diag_scripts.shared.get_plot_filename(basename, cfg)

Get a valid path for saving a diagnostic plot.

Parameters

basename (str) – The basename of the file.
cfg (dict) – Dictionary with diagnostic configuration.

Returns

A valid path for saving a diagnostic plot.

Return type

str

esmvaltool.diag_scripts.shared.group_metadata(metadata, attribute, sort=None)

Group metadata describing preprocessed data by attribute.

Parameters

metadata (list of dict) – A list of metadata describing preprocessed data.
attribute (str) – The attribute name that the metadata should be grouped by.
sort – See sorted_group_metadata.

Returns

A dictionary containing the requested groups.

Return type

dict of list of dict
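The documented return shape, a dict of lists of dicts, can be illustrated with a plain-Python grouping. The function `group_by_attribute` and the metadata entries are hypothetical stand-ins; sorting and the treatment of entries without the attribute are assumptions of this sketch.

```python
from collections import defaultdict


def group_by_attribute(metadata, attribute):
    """Group a list of metadata dicts by the value of attribute."""
    groups = defaultdict(list)
    for entry in metadata:
        # Assumption: entries lacking the attribute are grouped under None.
        groups[entry.get(attribute)].append(entry)
    return dict(groups)


metadata = [
    {'dataset': 'CanESM2', 'short_name': 'tas', 'filename': '/a.nc'},
    {'dataset': 'CanESM2', 'short_name': 'pr', 'filename': '/b.nc'},
    {'dataset': 'MPI-ESM', 'short_name': 'tas', 'filename': '/c.nc'},
]
by_variable = group_by_attribute(metadata, 'short_name')
tas_files = [entry['filename'] for entry in by_variable['tas']]
```

Each key of the result is one value of the attribute, and each value is the list of metadata entries sharing it.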

esmvaltool.diag_scripts.shared.run_diagnostic()

Run a Python diagnostic.

This context manager is the main entry point for most Python diagnostics.

Example

See esmvaltool/diag_scripts/examples/diagnostic.py for an extensive example of how to start your diagnostic.

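Basic usage is as follows. Add these lines at the bottom of your script:

# Place this import with the others at the top of your script.
from esmvaltool.diag_scripts.shared import run_diagnostic


def main(cfg):
    # Your diagnostic code goes here.
    print(cfg)


if __name__ == '__main__':
    # run_diagnostic() yields the script configuration as cfg.
    with run_diagnostic() as cfg:
        main(cfg)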

The cfg dict passed to main contains the script configuration that can be used with the other functions in this module.
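The `with ... as cfg` pattern itself can be tried without ESMValTool using a small stand-in context manager. The sketch below only mimics the shape of the call: `run_diagnostic_sketch`, its `settings` argument and the 'script' key are hypothetical, and none of the setup done by the real run_diagnostic is reproduced.

```python
from contextlib import contextmanager


@contextmanager
def run_diagnostic_sketch(settings):
    """Yield a configuration dict, mimicking the ``with`` pattern only."""
    cfg = dict(settings)  # the real function obtains cfg by itself
    try:
        yield cfg
    finally:
        # Cleanup would go here; this sketch needs none.
        pass


results = []


def main(cfg):
    # Stand-in for the diagnostic code: record which script ran.
    results.append(cfg['script'])


with run_diagnostic_sketch({'script': 'my_diagnostic'}) as cfg:
    main(cfg)
```

Keeping all diagnostic logic inside `main(cfg)` means the function can be exercised with a hand-written dictionary like this one.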

esmvaltool.diag_scripts.shared.save_data(basename, provenance, cfg, cube, **kwargs)

Save the data used to create a plot to file.

Parameters

basename (str) – The basename of the file.
provenance (dict) – The provenance record for the data.
cfg (dict) – Dictionary with diagnostic configuration.
cube (iris.cube.Cube) – Data cube to save.
**kwargs – Extra keyword arguments to pass to iris.save.

Plotting

Module that provides common plot functions.

Functions

`get_path_to_mpl_style`([style_file])	Get path to matplotlib style file.
`get_dataset_style`(dataset[, style_file])	Retrieve the style information for the given dataset.
`global_contourf`(cube[, cbar_center, …])	Plot global filled contour plot.
`global_pcolormesh`(cube[, cbar_center, …])	Plot global color mesh.
`quickplot`(cube, plot_type[, filename])	Plot a cube using one of the iris.quickplot functions.
`multi_dataset_scatterplot`(x_data, y_data, …)	Plot a multi dataset scatterplot.
`scatterplot`(x_data, y_data, filepath, **kwargs)	Plot a scatterplot.

esmvaltool.diag_scripts.shared.plot.get_dataset_style(dataset, style_file=None): Retrieve the style information for the given dataset.

esmvaltool.diag_scripts.shared.plot.get_path_to_mpl_style(style_file=None): Get path to matplotlib style file.

esmvaltool.diag_scripts.shared.plot.global_contourf(cube, cbar_center=None, cbar_label=None, cbar_range=None, cbar_ticks=None, **kwargs)

Plot global filled contour plot.

Note

This is only possible if the cube has the coordinates latitude and longitude. A mean is taken over any additional coordinates.

Parameters

cube (iris.cube.Cube) – Cube to plot.
cbar_center (float, optional) – Central value for the colormap, useful for diverging colormaps. Can only be used if cbar_range is given.
cbar_label (str, optional) – Label for the colorbar.
cbar_range (list of float, optional) – Range of the colorbar (first and second list element) and number of distinct colors (third element). See numpy.linspace.
cbar_ticks (list, optional) – Ticks for the colorbar.
**kwargs – Keyword arguments for iris.plot.contourf().

Returns

Plot object.

Return type

matplotlib.contour.QuadContourSet

Raises

iris.exceptions.CoordinateNotFoundError – iris.cube.Cube does not contain necessary coordinates 'latitude' and 'longitude'.
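The three elements of cbar_range are given in the order that numpy.linspace takes its start, stop and num arguments. The sketch below shows the values such a triple produces, using a pure-Python equivalent so that it runs without NumPy. How global_contourf turns these values into contour levels and colors is not shown in this section, so the sketch makes no claim about that step.

```python
def linspace(start, stop, num):
    """Pure-Python equivalent of numpy.linspace(start, stop, num)."""
    if num == 1:
        return [float(start)]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


# [first value, last value, number of values], as for numpy.linspace.
cbar_range = [-2.0, 2.0, 9]
values = linspace(*cbar_range)
```

A range that is symmetric around zero, as here, is a natural companion to cbar_center=0 for diverging colormaps.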

esmvaltool.diag_scripts.shared.plot.global_pcolormesh(cube, cbar_center=None, cbar_label=None, cbar_ticks=None, **kwargs)

Plot global color mesh.

Note

This is only possible if the cube has the coordinates latitude and longitude. A mean is taken over any additional coordinates.

Parameters

cube (iris.cube.Cube) – Cube to plot.
cbar_center (float, optional) – Central value for the colormap, useful for diverging colormaps. Can only be used if vmin and vmax are given.
cbar_label (str, optional) – Label for the colorbar.
cbar_ticks (list, optional) – Ticks for the colorbar.
**kwargs – Keyword arguments for iris.plot.pcolormesh().

Returns

Plot object.

Return type

matplotlib.collections.QuadMesh

Raises

iris.exceptions.CoordinateNotFoundError – iris.cube.Cube does not contain necessary coordinates 'latitude' and 'longitude'.

esmvaltool.diag_scripts.shared.plot.multi_dataset_scatterplot(x_data, y_data, datasets, filepath, **kwargs)

Plot a multi dataset scatterplot.

Notes

Allowed keyword arguments:

mpl_style_file (str): Path to the matplotlib style file.
dataset_style_file (str): Path to the dataset style file.
plot_kwargs (array-like): Keyword arguments for the plot (e.g. label, markersize, etc.).
save_kwargs (dict): Keyword arguments for saving the plot.
axes_functions (dict): Arbitrary functions for axes, e.g. axes.set_title('title').

Parameters

x_data (array-like) – x data of each dataset.
y_data (array-like) – y data of each dataset.
datasets (array-like) – Names of the datasets.
filepath (str) – Path to which plot is written.
**kwargs – Keyword arguments.

Raises

TypeError – An invalid keyword argument is given or x_data, y_data, datasets or (if given) plot_kwargs is not array-like.
ValueError – x_data, y_data, datasets or plot_kwargs do not have the same size.
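The size requirement behind the ValueError can be checked up front in the calling code. The helper below is a minimal sketch of such a check; `check_same_size` is hypothetical and not part of the plot module, and the example data are made up.

```python
def check_same_size(**named_sequences):
    """Raise ValueError unless all given sequences share one length."""
    sizes = {name: len(seq) for name, seq in named_sequences.items()}
    if len(set(sizes.values())) > 1:
        raise ValueError(f"Inputs differ in size: {sizes}")
    return sizes


# One entry per dataset in each argument.
sizes = check_same_size(
    x_data=[[1, 2], [3, 4]],
    y_data=[[5, 6], [7, 8]],
    datasets=['CanESM2', 'MPI-ESM'],
)
```

Reporting the size of every input in the error message makes it easy to see which argument is out of step.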

esmvaltool.diag_scripts.shared.plot.quickplot(cube, plot_type, filename=None, **kwargs): Plot a cube using one of the iris.quickplot functions.

esmvaltool.diag_scripts.shared.plot.scatterplot(x_data, y_data, filepath, **kwargs)

Plot a scatterplot.

Notes

Allowed keyword arguments:

mpl_style_file (str): Path to the matplotlib style file.
plot_kwargs (array-like): Keyword arguments for the plot (e.g. label, markersize, etc.).
save_kwargs (dict): Keyword arguments for saving the plot.
axes_functions (dict): Arbitrary functions for axes, e.g. axes.set_title('title').

Parameters

x_data (array-like) – x data of each dataset.
y_data (array-like) – y data of each dataset.
filepath (str) – Path to which plot is written.
**kwargs – Keyword arguments.

Raises

TypeError – An invalid keyword argument is given or x_data, y_data or (if given) plot_kwargs is not array-like.
ValueError – x_data, y_data or plot_kwargs do not have the same size.