Shared diagnostic script code

Code that is shared between multiple diagnostic scripts.

Functions

run_diagnostic()

Run a Python diagnostic.

get_diagnostic_filename(basename, cfg[, …])

Get a valid path for saving a diagnostic data file.

get_plot_filename(basename, cfg)

Get a valid path for saving a diagnostic plot.

select_metadata(metadata, **attributes)

Select specific metadata describing preprocessed data.

sorted_metadata(metadata, sort)

Sort a list of metadata describing preprocessed data.

group_metadata(metadata, attribute[, sort])

Group metadata describing preprocessed data by attribute.

sorted_group_metadata(metadata_groups, sort)

Sort grouped metadata.

extract_variables(cfg[, as_iris])

Extract basic variable information from configuration dictionary.

variables_available(cfg, short_names)

Check if data from certain variables is available.

get_cfg([filename])

Read diagnostic script configuration from settings.yml.

get_control_exper_obs(short_name, …)

Get control, experiment and obs datasets.

apply_supermeans(ctrl, exper, obs_list)

Apply supermeans to data components, i.e. take the mean over time.

Classes

ProvenanceLogger(cfg)

Open the provenance logger.

Variable(short_name, standard_name, …)

Variables([cfg])

Class to easily access a recipe’s variables in a diagnostic.

Datasets(cfg)

Class to easily access a recipe’s datasets in a diagnostic script.

class esmvaltool.diag_scripts.shared.Datasets(cfg)[source]

Bases: object

Class to easily access a recipe’s datasets in a diagnostic script.

Examples

Get all datasets of a recipe configuration cfg:

datasets = Datasets(cfg)

Access data of a dataset with path dataset_path:

datasets.get_data(path=dataset_path)

Access dataset information of the dataset:

datasets.get_dataset_info(path=dataset_path)

Access the data of all datasets with exp=piControl:

datasets.get_data_list(exp='piControl')

Methods

add_dataset(path[, data])

Add dataset to class.

add_to_data(data[, path])

Add element to a dataset’s data.

get_data([path])

Access a dataset’s data.

get_data_list(**dataset_info)

Access the datasets’ data in a list.

get_dataset_info([path])

Access a dataset’s information.

get_dataset_info_list(**dataset_info)

Access the datasets’ information in a list.

get_info(key[, path])

Access a key of a dataset’s dataset_info.

get_info_list(key, **dataset_info)

Access the values of a dataset_info key for the selected datasets.

get_path(**dataset_info)

Access a dataset’s path.

get_path_list(**dataset_info)

Access the datasets’ paths in a list.

set_data(data[, path])

Set element as a dataset’s data.

add_dataset(path, data=None, **dataset_info)[source]

Add dataset to class.

Parameters
  • path (str) – (Unique) path to the dataset.

  • data (optional) – Arbitrary object to be saved as data for the dataset.

  • **dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.

add_to_data(data, path=None, **dataset_info)[source]

Add element to a dataset’s data.

Notes

Either path or a unique dataset_info description has to be given. Fails when the given information is ambiguous.

Parameters
  • data – Element to be added to the dataset’s data.

  • path (str, optional) – Path to the dataset.

  • **dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.

Raises

RuntimeError – If data given by dataset_info is ambiguous.

get_data(path=None, **dataset_info)[source]

Access a dataset’s data.

Notes

Either path or a unique dataset_info description has to be given. Fails when the given information is ambiguous.

Parameters
  • path (str, optional) – Path to the dataset.

  • **dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.

Returns

Data of the selected dataset.

Return type

data_object

Raises

RuntimeError – If data given by dataset_info is ambiguous.
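
The lookup rule in the notes above (a path, or a dataset_info description that matches exactly one dataset) can be illustrated with a small stand-in. This is a sketch, not the Datasets implementation: the helper find_unique_path and the sample paths are hypothetical, and only the RuntimeError for ambiguous descriptions is taken from the documentation. Returning None when nothing matches is a simplification of the sketch.

```python
def find_unique_path(datasets, **dataset_info):
    """Return the single path whose info matches all given keywords.

    Hypothetical helper: raises RuntimeError if more than one dataset
    matches, returns None if none does (a simplification of this sketch).
    """
    matches = sorted(
        path for path, info in datasets.items()
        if all(info.get(key) == value for key, value in dataset_info.items())
    )
    if len(matches) > 1:
        raise RuntimeError(
            f"Description {dataset_info} is ambiguous: "
            f"{len(matches)} datasets match")
    return matches[0] if matches else None


# Hypothetical dataset descriptions keyed by their (unique) path.
DATASETS = {
    '/data/CanESM2_piControl_tas.nc':
        {'dataset': 'CanESM2', 'exp': 'piControl', 'short_name': 'tas'},
    '/data/CanESM2_piControl_pr.nc':
        {'dataset': 'CanESM2', 'exp': 'piControl', 'short_name': 'pr'},
}

# A description that matches exactly one dataset.
unique = find_unique_path(DATASETS, dataset='CanESM2', short_name='tas')

# A description that matches both datasets is rejected.
try:
    find_unique_path(DATASETS, exp='piControl')
    ambiguous_failed = False
except RuntimeError:
    ambiguous_failed = True
```

In a diagnostic, the practical consequence is to pass enough keywords (or the path itself) to single out one dataset.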

get_data_list(**dataset_info)[source]

Access the datasets’ data in a list.

Notes

The returned data is sorted alphabetically by path.

Parameters

**dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.

Returns

Data of the selected datasets.

Return type

list

get_dataset_info(path=None, **dataset_info)[source]

Access a dataset’s information.

Notes

Either path or a unique dataset_info description has to be given. Fails when the given information is ambiguous.

Parameters
  • path (str, optional) – Path to the dataset.

  • **dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.

Returns

All dataset information.

Return type

dict

Raises

RuntimeError – If data given by dataset_info is ambiguous.

get_dataset_info_list(**dataset_info)[source]

Access the datasets’ information in a list.

Notes

The returned data is sorted alphabetically by path.

Parameters

**dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.

Returns

Information dictionaries of the selected datasets.

Return type

list

get_info(key, path=None, **dataset_info)[source]

Access a key of a dataset’s dataset_info.

Notes

Either path or a unique dataset_info description has to be given. Fails when the given information is ambiguous. If the dataset_info does not contain the key, returns None.

Parameters
  • key (str) – Desired dictionary key.

  • path (str, optional) – Path to the dataset.

  • **dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.

Returns

Information stored under key for the given dataset.

Return type

str

Raises

RuntimeError – If data given by dataset_info is ambiguous.

get_info_list(key, **dataset_info)[source]

Access the values of a dataset_info key for the selected datasets.

Notes

The returned data is sorted alphabetically by path.

Parameters
  • key (str) – Desired dictionary key.

  • **dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.

Returns

Information stored under key for the selected datasets.

Return type

list

get_path(**dataset_info)[source]

Access a dataset’s path.

Notes

A unique dataset_info description has to be given. Fails when the given information is ambiguous.

Parameters

**dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.

Returns

Path of the selected dataset.

Return type

str

Raises

RuntimeError – If data given by dataset_info is ambiguous.

get_path_list(**dataset_info)[source]

Access the datasets’ paths in a list.

Notes

The returned data is sorted alphabetically by path.

Parameters

**dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.

Returns

Paths of the selected datasets.

Return type

list

set_data(data, path=None, **dataset_info)[source]

Set element as a dataset’s data.

Notes

Either path or a unique dataset_info description has to be given. Fails when the given information is ambiguous.

Parameters
  • data – Element to be set as the dataset’s data.

  • path (str, optional) – Path to the dataset.

  • **dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.

Raises

RuntimeError – If data given by dataset_info is ambiguous.

class esmvaltool.diag_scripts.shared.ProvenanceLogger(cfg)[source]

Bases: object

Open the provenance logger.

Parameters

cfg (dict) – Dictionary with diagnostic configuration.

Methods

log(filename, record)

Record provenance.

Example

Use as a context manager:

record = {
    'caption': "This is a nice plot.",
    'statistics': ['mean'],
    'domain': 'global',
    'plot_type': 'zonal',
    'plot_file': '/path/to/result.png',
    'authors': [
        'first_author',
        'second_author',
    ],
    'references': [
        'acknow_project',
    ],
    'ancestors': [
        '/path/to/input_file_1.nc',
        '/path/to/input_file_2.nc',
    ],
}
output_file = '/path/to/result.nc'

with ProvenanceLogger(cfg) as provenance_logger:
    provenance_logger.log(output_file, record)
log(filename, record)[source]

Record provenance.

Parameters
  • filename (str) – Name of the file containing the diagnostic data.

  • record (dict) –

    Dictionary with the provenance information to be logged.

    Typical keys are:
    • plot_type

    • plot_file

    • caption

    • ancestors

    • authors

    • references

Note

See also esmvaltool/config-references.yml

class esmvaltool.diag_scripts.shared.Variable(short_name, standard_name, long_name, units)

Bases: tuple

Methods

count(value, /)

Return number of occurrences of value.

index(value[, start, stop])

Return first index of value.

Attributes

long_name

Alias for field number 2

short_name

Alias for field number 0

standard_name

Alias for field number 1

units

Alias for field number 3

count(value, /)

Return number of occurrences of value.

index(value, start=0, stop=9223372036854775807, /)

Return first index of value.

Raises ValueError if the value is not present.

property long_name

Alias for field number 2

property short_name

Alias for field number 0

property standard_name

Alias for field number 1

property units

Alias for field number 3
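
Because Variable derives from tuple and exposes its fields as numbered aliases, it behaves like a named tuple. The sketch below uses a stand-in built with collections.namedtuple rather than importing the real class; the field values are illustrative only.

```python
from collections import namedtuple

# Stand-in with the same four fields, in the order given by the aliases above.
Variable = namedtuple(
    'Variable', ['short_name', 'standard_name', 'long_name', 'units'])

tas = Variable('tas', 'air_temperature', 'Near-Surface Air Temperature', 'K')

# Fields are reachable by name, by position, or by tuple unpacking.
short_name, standard_name, long_name, units = tas
```

Access by name (tas.units) and by position (tas[3]) refer to the same field, and the inherited count() and index() methods work as on any tuple.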

class esmvaltool.diag_scripts.shared.Variables(cfg=None, **names)[source]

Bases: object

Class to easily access a recipe’s variables in a diagnostic.

Examples

Get all variables of a recipe configuration cfg:

variables = Variables(cfg)

Access information of a variable tas:

variables.short_name('tas')
variables.standard_name('tas')
variables.long_name('tas')
variables.units('tas')

Access iris-suitable dictionary of a variable tas:

variables.iris_dict('tas')

Check if variables tas and pr are available:

variables.vars_available('tas', 'pr')

Methods

add_vars(**names)

Add custom variables to the class.

iris_dict(var)

Access iris dictionary of the variable.

long_name(var)

Access long name.

modify_var(var, **names)

Modify an already existing variable of the class.

short_name(var)

Access short name.

short_names()

Get list of all short_names.

standard_name(var)

Access standard name.

standard_names()

Get list of all standard_names.

units(var)

Access units.

var_name(var)

Access var name.

vars_available(*args)

Check if given variables are available.

add_vars(**names)[source]

Add custom variables to the class.

Parameters

**names (dict or Variable, optional) – Keyword arguments of the form short_name=Variable_object where Variable_object can be given as dict or Variable.

iris_dict(var)[source]

Access iris dictionary of the variable.

Parameters

var (str) – (Short) name of the variable.

Returns

Dictionary containing all attributes of the variable which can be used directly in iris (short_name replaced by var_name).

Return type

dict
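
The renaming described above (short_name replaced by var_name) can be sketched with plain dictionaries. The helper to_iris_dict and the sample values are hypothetical and only mirror the documented result, not the class internals.

```python
def to_iris_dict(variable_info):
    """Return a copy in which short_name is stored under var_name."""
    iris_info = dict(variable_info)  # leave the input untouched
    iris_info['var_name'] = iris_info.pop('short_name')
    return iris_info


# Illustrative variable attributes.
tas_info = {
    'short_name': 'tas',
    'standard_name': 'air_temperature',
    'long_name': 'Near-Surface Air Temperature',
    'units': 'K',
}
tas_iris = to_iris_dict(tas_info)
```

The resulting keys (var_name, standard_name, long_name, units) match keyword arguments that iris cube constructors accept, which is what makes the dictionary usable directly in iris.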

long_name(var)[source]

Access long name.

Parameters

var (str) – (Short) name of the variable.

Returns

Long name of the variable.

Return type

str

modify_var(var, **names)[source]

Modify an already existing variable of the class.

Parameters
  • var (str) – (Short) name of the existing variable.

  • **names – Keyword arguments of the form short_name=tas.

Raises
  • ValueError – If var is not an existing variable.

  • TypeError – If a non-valid keyword argument is given.

short_name(var)[source]

Access short name.

Parameters

var (str) – (Short) name of the variable.

Returns

Short name of the variable.

Return type

str

short_names()[source]

Get list of all short_names.

Returns

List of all short_names.

Return type

list

standard_name(var)[source]

Access standard name.

Parameters

var (str) – (Short) name of the variable.

Returns

Standard name of the variable.

Return type

str

standard_names()[source]

Get list of all standard_names.

Returns

List of all standard_names.

Return type

list

units(var)[source]

Access units.

Parameters

var (str) – (Short) name of the variable.

Returns

Units of the variable.

Return type

str

var_name(var)[source]

Access var name.

Parameters

var (str) – (Short) name of the variable.

Returns

Var name (=short name) of the variable.

Return type

str

vars_available(*args)[source]

Check if given variables are available.

Parameters

*args – Short names of the variables to be tested.

Returns

True if variables are available, False if not.

Return type

bool

esmvaltool.diag_scripts.shared.apply_supermeans(ctrl, exper, obs_list)[source]

Apply supermeans to data components, i.e. take the mean over time.

This function is an extension of climate_statistics() meant to ease the time-averaging procedure when dealing with CONTROL, EXPERIMENT and OBS (if any) datasets.

Parameters
  • ctrl – Dictionary of the CONTROL dataset.

  • exper – Dictionary of the EXPERIMENT dataset.

  • obs_list – List of dicts for the OBS datasets (0, 1 or many).

Returns

Control and experiment cubes and a list of obs cubes.

esmvaltool.diag_scripts.shared.extract_variables(cfg, as_iris=False)[source]

Extract basic variable information from configuration dictionary.

Returns short_name, standard_name, long_name and units keys for each variable.

Parameters
  • cfg (dict) – Diagnostic script configuration.

  • as_iris (bool, optional) – Replace short_name by var_name, this can be used directly in iris classes.

Returns

Variable information as dicts (values) for each short_name (key).

Return type

dict

esmvaltool.diag_scripts.shared.get_cfg(filename=None)[source]

Read diagnostic script configuration from settings.yml.

esmvaltool.diag_scripts.shared.get_control_exper_obs(short_name, input_data, cfg, cmip_type)[source]

Get control, experiment and obs datasets.

This function is used when running recipes that need a clear distinction between a control dataset and an experiment dataset and that have optional obs (OBS, obs4mips etc.) datasets; such recipes include recipe_validation and all the autoassess ones.

Parameters
  • short_name – Variable short name.

  • input_data – Dict containing the input data info.

  • cfg – Config file as used in this module.

esmvaltool.diag_scripts.shared.get_diagnostic_filename(basename, cfg, extension='nc')[source]

Get a valid path for saving a diagnostic data file.

Parameters
  • basename (str) – The basename of the file.

  • cfg (dict) – Dictionary with diagnostic configuration.

  • extension (str) – File name extension.

Returns

A valid path for saving a diagnostic data file.

Return type

str
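
As a rough illustration of what such a path might look like, the sketch below joins a directory, the basename and the extension. It assumes the configuration holds the output directory under a work_dir key; that key name and the helper diagnostic_filename are assumptions of this sketch and are not stated in this section.

```python
import os


def diagnostic_filename(basename, cfg, extension='nc'):
    """Join the work directory, basename and extension into one path."""
    # 'work_dir' is an assumed configuration key, used here for illustration.
    return os.path.join(cfg['work_dir'], f"{basename}.{extension}")


# Hypothetical configuration with a relative work directory.
cfg = {'work_dir': os.path.join('output', 'work')}

nc_path = diagnostic_filename('tas_mean', cfg)
csv_path = diagnostic_filename('tas_mean', cfg, extension='csv')
```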

esmvaltool.diag_scripts.shared.get_plot_filename(basename, cfg)[source]

Get a valid path for saving a diagnostic plot.

Parameters
  • basename (str) – The basename of the file.

  • cfg (dict) – Dictionary with diagnostic configuration.

Returns

A valid path for saving a diagnostic plot.

Return type

str

esmvaltool.diag_scripts.shared.group_metadata(metadata, attribute, sort=None)[source]

Group metadata describing preprocessed data by attribute.

Parameters
  • metadata (list of dict) – A list of metadata describing preprocessed data.

  • attribute (str) – The attribute name that the metadata should be grouped by.

  • sort – See sorted_group_metadata.

Returns

A dictionary containing the requested groups. If sorting is requested, an OrderedDict will be returned.

Return type

dict of list of dict
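
The grouping can be pictured with a short stand-in that works on plain dictionaries. The helper group_by_attribute is hypothetical; in particular, collecting entries that lack the attribute under the key None is a choice made for this sketch.

```python
from collections import defaultdict


def group_by_attribute(metadata, attribute):
    """Group a list of metadata dicts by the value of one attribute."""
    groups = defaultdict(list)
    for item in metadata:
        # Entries without the attribute end up under None in this sketch.
        groups[item.get(attribute)].append(item)
    return dict(groups)


# Illustrative metadata describing preprocessed data.
metadata = [
    {'dataset': 'CanESM2', 'short_name': 'tas'},
    {'dataset': 'MPI-ESM-LR', 'short_name': 'tas'},
    {'dataset': 'CanESM2', 'short_name': 'pr'},
]
groups = group_by_attribute(metadata, 'dataset')
```

Each value of the result is a list of metadata dictionaries, which is the "dict of list of dict" shape named in the return type.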

esmvaltool.diag_scripts.shared.run_diagnostic()[source]

Run a Python diagnostic.

This context manager is the main entry point for most Python diagnostics.

Example

See esmvaltool/diag_scripts/examples/diagnostic.py for an extensive example of how to start your diagnostic.

Basic usage is as follows, add these lines at the bottom of your script:

def main(cfg):
    # Your diagnostic code goes here.
    print(cfg)

if __name__ == '__main__':
    with run_diagnostic() as cfg:
        main(cfg)

The cfg dict passed to main contains the script configuration that can be used with the other functions in this module.

esmvaltool.diag_scripts.shared.select_metadata(metadata, **attributes)[source]

Select specific metadata describing preprocessed data.

Parameters
  • metadata (list of dict) – A list of metadata describing preprocessed data.

  • **attributes – Keyword arguments specifying the required variable attributes and their values. Use the value ‘*’ to select any variable that has the attribute.

Returns

A list of matching metadata.

Return type

list of dict
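
The selection rule, including the ‘*’ wildcard, can be sketched as follows. The helper select_matching is a hypothetical stand-in that follows the description above; it is not the module's own code.

```python
def select_matching(metadata, **attributes):
    """Keep the metadata dicts that satisfy every requested attribute."""
    selection = []
    for item in metadata:
        if all(
            name in item and (value == '*' or item[name] == value)
            for name, value in attributes.items()
        ):
            selection.append(item)
    return selection


# Illustrative metadata describing preprocessed data.
metadata = [
    {'dataset': 'CanESM2', 'short_name': 'tas', 'exp': 'historical'},
    {'dataset': 'CanESM2', 'short_name': 'pr'},
    {'dataset': 'ERA-Interim', 'short_name': 'tas'},
]

tas_only = select_matching(metadata, short_name='tas')
with_exp = select_matching(metadata, exp='*')  # any entry that has 'exp'
canesm2_pr = select_matching(metadata, dataset='CanESM2', short_name='pr')
```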

esmvaltool.diag_scripts.shared.sorted_group_metadata(metadata_groups, sort)[source]

Sort grouped metadata.

Sorting is done on strings and is not case sensitive.

Parameters
  • metadata_groups (dict of list of dict) – Dictionary containing the groups of metadata.

  • sort (bool or str or list of str) – One or more attributes to sort by or True to just sort the groups but not the lists.

Returns

A dictionary containing the requested groups.

Return type

OrderedDict of list of dict

esmvaltool.diag_scripts.shared.sorted_metadata(metadata, sort)[source]

Sort a list of metadata describing preprocessed data.

Sorting is done on strings and is not case sensitive.

Parameters
  • metadata (list of dict) – A list of metadata describing preprocessed data.

  • sort (str or list of str) – One or more attributes to sort by.

Returns

The sorted list of variable metadata.

Return type

list of dict
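
A case-insensitive sort on one or more attributes can be sketched with the built-in sorted(). The helper sort_case_insensitive is hypothetical, and treating a missing attribute as an empty string is an assumption of this sketch.

```python
def sort_case_insensitive(metadata, sort):
    """Sort metadata dicts by one or more attributes, ignoring case."""
    if isinstance(sort, str):
        sort = [sort]

    def sort_key(item):
        # Compare lower-cased string representations of each attribute.
        return tuple(str(item.get(name, '')).lower() for name in sort)

    return sorted(metadata, key=sort_key)


# Illustrative metadata with mixed-case dataset names.
metadata = [
    {'dataset': 'bcc-csm1-1', 'short_name': 'tas'},
    {'dataset': 'CanESM2', 'short_name': 'tas'},
    {'dataset': 'ACCESS1-0', 'short_name': 'tas'},
    {'dataset': 'CanESM2', 'short_name': 'pr'},
]

by_dataset = sort_case_insensitive(metadata, 'dataset')
by_both = sort_case_insensitive(metadata, ['dataset', 'short_name'])
```

A plain case-sensitive sort would place 'bcc-csm1-1' after 'CanESM2', because lowercase letters sort after uppercase ones; lower-casing the keys avoids that.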

esmvaltool.diag_scripts.shared.variables_available(cfg, short_names)[source]

Check if data from certain variables is available.

Parameters
  • cfg (dict) – Diagnostic script configuration.

  • short_names (list of str) – Variable short_names which should be checked.

Returns

True if all variables are available, False if not.

Return type

bool
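
The check amounts to asking whether every requested short name occurs in the metadata of the input data. The sketch below assumes that the configuration stores this metadata under an input_data key, mapping file paths to metadata dictionaries; that layout and the helper all_variables_available are assumptions made for illustration.

```python
def all_variables_available(cfg, short_names):
    """Return True if every short name occurs in the input metadata."""
    # 'input_data' is an assumed configuration key (path -> metadata dict).
    available = {
        info.get('short_name') for info in cfg['input_data'].values()
    }
    return all(name in available for name in short_names)


# Hypothetical configuration with two preprocessed input files.
cfg = {
    'input_data': {
        '/preproc/tas.nc': {'short_name': 'tas', 'dataset': 'CanESM2'},
        '/preproc/pr.nc': {'short_name': 'pr', 'dataset': 'CanESM2'},
    },
}

has_tas_pr = all_variables_available(cfg, ['tas', 'pr'])
has_psl = all_variables_available(cfg, ['tas', 'psl'])
```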

Plotting

Module that provides common plot functions.

Functions

get_path_to_mpl_style([style_file])

Get path to matplotlib style file.

get_dataset_style(dataset[, style_file])

Retrieve the style information for the given dataset.

quickplot(cube, filename, plot_type, **kwargs)

Plot a cube using one of the iris.quickplot functions.

multi_dataset_scatterplot(x_data, y_data, …)

Plot a multi dataset scatterplot.

scatterplot(x_data, y_data, filepath, **kwargs)

Plot a scatterplot.

esmvaltool.diag_scripts.shared.plot.get_dataset_style(dataset, style_file=None)[source]

Retrieve the style information for the given dataset.

esmvaltool.diag_scripts.shared.plot.get_path_to_mpl_style(style_file=None)[source]

Get path to matplotlib style file.

esmvaltool.diag_scripts.shared.plot.multi_dataset_scatterplot(x_data, y_data, datasets, filepath, **kwargs)[source]

Plot a multi dataset scatterplot.

Notes

Allowed keyword arguments:

  • mpl_style_file (str): Path to the matplotlib style file.

  • dataset_style_file (str): Path to the dataset style file.

  • plot_kwargs (array-like): Keyword arguments for the plot (e.g. label, markersize, etc.).

  • save_kwargs (dict): Keyword arguments for saving the plot.

  • axes_functions (dict): Arbitrary functions for axes, e.g. axes.set_title('title').

Parameters
  • x_data (array-like) – x data of each dataset.

  • y_data (array-like) – y data of each dataset.

  • datasets (array-like) – Names of the datasets.

  • filepath (str) – Path to which plot is written.

  • **kwargs – Keyword arguments.

Raises
  • TypeError – A non-valid keyword argument is given or x_data, y_data, datasets or (if given) plot_kwargs is not array-like.

  • ValueError – x_data, y_data, datasets or plot_kwargs do not have the same size.

esmvaltool.diag_scripts.shared.plot.quickplot(cube, filename, plot_type, **kwargs)[source]

Plot a cube using one of the iris.quickplot functions.

esmvaltool.diag_scripts.shared.plot.scatterplot(x_data, y_data, filepath, **kwargs)[source]

Plot a scatterplot.

Notes

Allowed keyword arguments:

  • mpl_style_file (str): Path to the matplotlib style file.

  • plot_kwargs (array-like): Keyword arguments for the plot (e.g. label, markersize, etc.).

  • save_kwargs (dict): Keyword arguments for saving the plot.

  • axes_functions (dict): Arbitrary functions for axes, e.g. axes.set_title('title').

Parameters
  • x_data (array-like) – x data of each dataset.

  • y_data (array-like) – y data of each dataset.

  • filepath (str) – Path to which plot is written.

  • **kwargs – Keyword arguments.

Raises
  • TypeError – A non-valid keyword argument is given or x_data, y_data or (if given) plot_kwargs is not array-like.

  • ValueError – x_data, y_data or plot_kwargs do not have the same size.