Shared diagnostic script code#
Code that is shared between multiple diagnostic scripts.
Classes:
Datasets – Class to easily access a recipe's datasets in a diagnostic script.
ProvenanceLogger – Open the provenance logger.
Variable – Variable class containing all relevant information.
Variables – Class to easily access a recipe's variables in a diagnostic.
Functions:
apply_supermeans – Apply supermeans on data components, i.e. the MEAN over time.
extract_variables – Extract basic variable information from configuration dictionary.
get_cfg – Read diagnostic script configuration from settings.yml.
get_control_exper_obs – Get control, experiment and observational datasets.
get_diagnostic_filename – Get a valid path for saving a diagnostic data file.
get_plot_filename – Get a valid path for saving a diagnostic plot.
group_metadata – Group metadata describing preprocessed data by attribute.
run_diagnostic – Run a Python diagnostic.
save_data – Save the data used to create a plot to file.
save_figure – Save a figure to file.
select_metadata – Select specific metadata describing preprocessed data.
sorted_group_metadata – Sort grouped metadata.
sorted_metadata – Sort a list of metadata describing preprocessed data.
variables_available – Check if data from certain variables is available.
- class esmvaltool.diag_scripts.shared.Datasets(cfg)[source]#
Bases:
object
Class to easily access a recipe’s datasets in a diagnostic script.
Note
This class has been deprecated in version 2.2 and will be removed two minor releases later in version 2.4.
Examples
Get all datasets of a recipe configuration cfg:
datasets = Datasets(cfg)
Access data of a dataset with path dataset_path:
datasets.get_data(path=dataset_path)
Access dataset information of the dataset:
datasets.get_dataset_info(path=dataset_path)
Access the data of all datasets with exp=piControl:
datasets.get_data_list(exp='piControl')
Methods:
add_dataset(path[, data]) – Add dataset to class.
add_to_data(data[, path]) – Add element to a dataset's data.
get_data([path]) – Access a dataset's data.
get_data_list(**dataset_info) – Access the datasets' data in a list.
get_dataset_info([path]) – Access a dataset's information.
get_dataset_info_list(**dataset_info) – Access datasets' information in a list.
get_info(key[, path]) – Access a dataset_info's key.
get_info_list(key, **dataset_info) – Access dataset_info's key values.
get_path(**dataset_info) – Access a dataset's path.
get_path_list(**dataset_info) – Access datasets' paths in a list.
set_data(data[, path]) – Set element as a dataset's data.
- add_dataset(path, data=None, **dataset_info)[source]#
Add dataset to class.
- Parameters:
path (str) – (Unique) path to the dataset.
data (optional) – Arbitrary object to be saved as data for the dataset.
**dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.
- add_to_data(data, path=None, **dataset_info)[source]#
Add element to a dataset’s data.
Notes
Either path or a unique dataset_info description has to be given. Fails if the given information is ambiguous.
- Parameters:
data – Element to be added to the dataset’s data.
path (str, optional) – Path to the dataset.
**dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.
- Raises:
RuntimeError – If data given by dataset_info is ambiguous.
- get_data(path=None, **dataset_info)[source]#
Access a dataset’s data.
Notes
Either path or a unique dataset_info description has to be given. Fails if the given information is ambiguous.
- Parameters:
path (str, optional) – Path to the dataset.
**dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.
- Returns:
Data of the selected dataset.
- Return type:
data_object
- Raises:
RuntimeError – If data given by dataset_info is ambiguous.
- get_data_list(**dataset_info)[source]#
Access the datasets’ data in a list.
Notes
The returned data is sorted alphabetically with respect to the paths.
- Parameters:
**dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.
- Returns:
Data of the selected datasets.
- Return type:
list
- get_dataset_info(path=None, **dataset_info)[source]#
Access a dataset’s information.
Notes
Either path or a unique dataset_info description has to be given. Fails if the given information is ambiguous.
- Parameters:
path (str, optional) – Path to the dataset.
**dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.
- Returns:
All dataset information.
- Return type:
dict
- Raises:
RuntimeError – If data given by dataset_info is ambiguous.
- get_dataset_info_list(**dataset_info)[source]#
Access datasets' information in a list.
Notes
The returned data is sorted alphabetically with respect to the paths.
- Parameters:
**dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.
- Returns:
Information dictionaries of the selected datasets.
- Return type:
list
- get_info(key, path=None, **dataset_info)[source]#
Access a dataset_info's key.
Notes
Either path or a unique dataset_info description has to be given. Fails if the given information is ambiguous. If the dataset_info does not contain the key, returns None.
- Parameters:
key (str) – Dictionary key to access.
path (str, optional) – Path to the dataset.
**dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.
- Returns:
key information of the given dataset.
- Return type:
- Raises:
RuntimeError – If data given by dataset_info is ambiguous.
- get_info_list(key, **dataset_info)[source]#
Access dataset_info’s key values.
Notes
The returned data is sorted alphabetically with respect to the paths.
- get_path(**dataset_info)[source]#
Access a dataset’s path.
Notes
A unique dataset_info description has to be given. Fails if the given information is ambiguous.
- Parameters:
**dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.
- Returns:
Path of the selected dataset.
- Return type:
str
- Raises:
RuntimeError – If data given by dataset_info is ambiguous.
- get_path_list(**dataset_info)[source]#
Access datasets' paths in a list.
Notes
The returned data is sorted alphabetically with respect to the paths.
- Parameters:
**dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.
- Returns:
Paths of the selected datasets.
- Return type:
list
- set_data(data, path=None, **dataset_info)[source]#
Set element as a dataset’s data.
Notes
Either path or a unique dataset_info description has to be given. Fails if the given information is ambiguous.
- Parameters:
data – Element to be set as the dataset’s data.
path (str, optional) – Path to the dataset.
**dataset_info (optional) – Keyword arguments describing the dataset, e.g. dataset=CanESM2, exp=piControl or short_name=tas.
- Raises:
RuntimeError – If data given by dataset_info is ambiguous.
- class esmvaltool.diag_scripts.shared.ProvenanceLogger(cfg)[source]#
Bases:
object
Open the provenance logger.
- Parameters:
cfg (dict) – Dictionary with diagnostic configuration.
Example
Use as a context manager:
record = {
    'caption': "This is a nice plot.",
    'statistics': ['mean'],
    'domain': ['global'],
    'plot_type': ['zonal'],
    'authors': [
        'first_author',
        'second_author',
    ],
    'references': [
        'author20journal',
    ],
    'ancestors': [
        '/path/to/input_file_1.nc',
        '/path/to/input_file_2.nc',
    ],
}
output_file = '/path/to/result.nc'

with ProvenanceLogger(cfg) as provenance_logger:
    provenance_logger.log(output_file, record)
Methods:
log(filename, record) – Record provenance.
- log(filename, record)[source]#
Record provenance.
- Parameters:
filename (str) – Name of the file containing the diagnostic data.
record (dict) – Dictionary with the provenance information to be logged.
Note
See the provenance documentation for more information.
- class esmvaltool.diag_scripts.shared.Variable(short_name, standard_name, long_name, units)[source]#
Bases:
Variable
Variable class containing all relevant information.
Note
This class has been deprecated in version 2.2 and will be removed two minor releases later in version 2.4.
Methods:
count(value, /) – Return number of occurrences of value.
index(value[, start, stop]) – Return first index of value.
Attributes:
long_name – Alias for field number 2
short_name – Alias for field number 0
standard_name – Alias for field number 1
units – Alias for field number 3
- count(value, /)#
Return number of occurrences of value.
- index(value, start=0, stop=9223372036854775807, /)#
Return first index of value.
Raises ValueError if the value is not present.
- long_name#
Alias for field number 2
- short_name#
Alias for field number 0
- standard_name#
Alias for field number 1
- units#
Alias for field number 3
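Because Variable is a plain namedtuple, the field aliases above fix the positional order. A minimal sketch of the equivalent structure using collections.namedtuple (the example values are invented):

```python
from collections import namedtuple

# Equivalent structure to the deprecated Variable class: a namedtuple
# whose field order matches the "Alias for field number" entries above.
Variable = namedtuple(
    "Variable", ["short_name", "standard_name", "long_name", "units"]
)

tas = Variable("tas", "air_temperature", "Near-Surface Air Temperature", "K")

print(tas.short_name)  # tas (field number 0)
print(tas[1])          # air_temperature (field number 1)
print(tas.index("K"))  # 3, via the inherited tuple method
```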
- class esmvaltool.diag_scripts.shared.Variables(cfg=None, **names)[source]#
Bases:
object
Class to easily access a recipe’s variables in a diagnostic.
Note
This class has been deprecated in version 2.2 and will be removed two minor releases later in version 2.4.
Examples
Get all variables of a recipe configuration cfg:
variables = Variables(cfg)
Access information of a variable tas:
variables.short_name('tas')
variables.standard_name('tas')
variables.long_name('tas')
variables.units('tas')
Access an iris-suitable dictionary of a variable tas:
variables.iris_dict('tas')
Check if variables tas and pr are available:
variables.vars_available('tas', 'pr')
Methods:
add_vars(**names) – Add custom variables to the class.
iris_dict(var) – Access iris dictionary of the variable.
long_name(var) – Access long name.
modify_var(var, **names) – Modify an already existing variable of the class.
short_name(var) – Access short name.
short_names() – Get list of all short_names.
standard_name(var) – Access standard name.
standard_names() – Get list of all standard_names.
units(var) – Access units.
var_name(var) – Access var name.
vars_available(*args) – Check if given variables are available.
- modify_var(var, **names)[source]#
Modify an already existing variable of the class.
- Parameters:
var (str) – (Short) name of the existing variable.
**names – Keyword arguments of the form short_name=tas.
- Raises:
ValueError – If var is not an existing variable.
TypeError – If a non-valid keyword argument is given.
- standard_names()[source]#
Get list of all standard_names.
- Returns:
List of all standard_names.
- Return type:
list
- esmvaltool.diag_scripts.shared.apply_supermeans(ctrl, exper, obs_list)[source]#
Apply supermeans on data components, i.e. the MEAN over time.
This function is an extension of climate_statistics() meant to ease the time-meaning procedure when dealing with CONTROL, EXPERIMENT and OBS (if any) datasets.
- Parameters:
ctrl – Dictionary of the CONTROL dataset.
exper – Dictionary of the EXPERIMENT dataset.
obs_list – List of dicts for OBS datasets (0, 1 or many).
- Returns:
Control and experiment cubes and a list of obs cubes.
- esmvaltool.diag_scripts.shared.extract_variables(cfg, as_iris=False)[source]#
Extract basic variable information from configuration dictionary.
Returns short_name, standard_name, long_name and units keys for each variable.
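The shape of the result can be sketched with plain dictionaries; extract_variables_sketch is a hypothetical stand-in, not the real implementation, and the cfg content below is invented for illustration:

```python
# Hypothetical sketch of the result shape: collect the basic keys for
# each variable found in a cfg-like dictionary (invented example data).
def extract_variables_sketch(cfg):
    keys = ("short_name", "standard_name", "long_name", "units")
    variables = {}
    for attrs in cfg.get("input_data", {}).values():
        variables[attrs["short_name"]] = {
            key: attrs[key] for key in keys if key in attrs
        }
    return variables

cfg = {
    "input_data": {
        "/path/to/tas.nc": {
            "short_name": "tas",
            "standard_name": "air_temperature",
            "long_name": "Near-Surface Air Temperature",
            "units": "K",
        },
    },
}
print(extract_variables_sketch(cfg))
```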
- esmvaltool.diag_scripts.shared.get_cfg(filename=None)[source]#
Read diagnostic script configuration from settings.yml.
- esmvaltool.diag_scripts.shared.get_control_exper_obs(short_name, input_data, cfg, cmip_type=None)[source]#
Get control, exper and obs datasets.
This function is used when running recipes that need a clear distinction between a control dataset, an experiment dataset and optional observational datasets (OBS, obs4MIPs, etc.); such recipes include recipe_validation and all the autoassess ones.
- Parameters:
short_name – Variable short name.
input_data – Dict containing the input data information.
cfg – Configuration dictionary as used in this module.
cmip_type (optional) – CMIP project type (CMIP5 or CMIP6).
- esmvaltool.diag_scripts.shared.get_diagnostic_filename(basename, cfg, extension='nc')[source]#
Get a valid path for saving a diagnostic data file.
- esmvaltool.diag_scripts.shared.get_plot_filename(basename, cfg)[source]#
Get a valid path for saving a diagnostic plot.
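Both helpers essentially join the output directories from cfg with the basename. A sketch under the assumption that the standard cfg keys work_dir, plot_dir and output_file_type are present (the *_sketch helper names are invented; the real path logic may differ):

```python
import os

# Sketch only: join the configured output directory with the basename.
def diagnostic_filename_sketch(basename, cfg, extension="nc"):
    return os.path.join(cfg["work_dir"], f"{basename}.{extension}")

def plot_filename_sketch(basename, cfg):
    return os.path.join(cfg["plot_dir"], f"{basename}.{cfg['output_file_type']}")

cfg = {"work_dir": "/tmp/work", "plot_dir": "/tmp/plots", "output_file_type": "png"}
print(diagnostic_filename_sketch("tas_mean", cfg))  # /tmp/work/tas_mean.nc
print(plot_filename_sketch("tas_map", cfg))         # /tmp/plots/tas_map.png
```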
- esmvaltool.diag_scripts.shared.group_metadata(metadata, attribute, sort=None)[source]#
Group metadata describing preprocessed data by attribute.
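The grouping semantics can be sketched with plain dictionaries: each metadata dict lands in the group given by its value for attribute (group_metadata_sketch is an invented stand-in for the real function):

```python
# Sketch of the grouping semantics on plain dictionaries.
def group_metadata_sketch(metadata, attribute):
    groups = {}
    for item in metadata:
        groups.setdefault(item.get(attribute), []).append(item)
    return groups

metadata = [
    {"dataset": "CanESM2", "exp": "historical"},
    {"dataset": "CanESM2", "exp": "piControl"},
    {"dataset": "MPI-ESM-LR", "exp": "historical"},
]
groups = group_metadata_sketch(metadata, "dataset")
print(sorted(groups))          # ['CanESM2', 'MPI-ESM-LR']
print(len(groups["CanESM2"]))  # 2
```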
- esmvaltool.diag_scripts.shared.run_diagnostic()[source]#
Run a Python diagnostic.
This context manager is the main entry point for most Python diagnostics.
Example
See esmvaltool/diag_scripts/examples/diagnostic.py for an extensive example of how to start your diagnostic.
Basic usage is as follows, add these lines at the bottom of your script:
def main(cfg):
    # Your diagnostic code goes here.
    print(cfg)

if __name__ == '__main__':
    with run_diagnostic() as cfg:
        main(cfg)
To prevent the diagnostic script from using the Dask Distributed scheduler, set
no_distributed: true
in the diagnostic script definition in the recipe or in the resulting settings.yml file.
The cfg dict passed to main contains the script configuration that can be used with the other functions in this module.
- esmvaltool.diag_scripts.shared.save_data(basename, provenance, cfg, cube, **kwargs)[source]#
Save the data used to create a plot to file.
- Parameters:
basename (str) – The basename of the file.
provenance (dict) – The provenance record for the data.
cfg (dict) – Dictionary with diagnostic configuration.
cube (iris.cube.Cube) – Data cube to save.
**kwargs – Extra keyword arguments to pass to iris.save.
See also
ProvenanceLogger
For an example provenance record that can be used with this function.
- esmvaltool.diag_scripts.shared.save_figure(basename, provenance, cfg, figure=None, close=True, **kwargs)[source]#
Save a figure to file.
- Parameters:
basename (str) – The basename of the file.
provenance (dict) – The provenance record for the figure.
cfg (dict) – Dictionary with diagnostic configuration.
figure (matplotlib.figure.Figure) – Figure to save.
close (bool) – Close the figure after saving.
**kwargs – Keyword arguments to pass to matplotlib.figure.Figure.savefig.
See also
ProvenanceLogger
For an example provenance record that can be used with this function.
- esmvaltool.diag_scripts.shared.select_metadata(metadata, **attributes)[source]#
Select specific metadata describing preprocessed data.
- Parameters:
metadata (list of dict) – A list of metadata describing preprocessed data.
**attributes (optional) – Attribute/value pairs that the selected metadata must match.
- Returns:
A list of matching metadata.
- Return type:
list of dict
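The selection semantics can be sketched as keeping every metadata dict whose entries match all given attribute=value pairs (select_metadata_sketch is an invented stand-in; the real implementation may support richer matching):

```python
# Sketch: keep every metadata dictionary whose entries match all of the
# given attribute=value pairs.
def select_metadata_sketch(metadata, **attributes):
    return [
        item for item in metadata
        if all(item.get(key) == value for key, value in attributes.items())
    ]

metadata = [
    {"dataset": "CanESM2", "exp": "piControl", "short_name": "tas"},
    {"dataset": "CanESM2", "exp": "historical", "short_name": "tas"},
]
print(len(select_metadata_sketch(metadata, exp="piControl")))    # 1
print(len(select_metadata_sketch(metadata, dataset="CanESM2")))  # 2
```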
- esmvaltool.diag_scripts.shared.sorted_group_metadata(metadata_groups, sort)[source]#
Sort grouped metadata.
Sorting is done on strings and is not case sensitive.
- Parameters:
metadata_groups (dict) – Dictionary containing the groups of metadata.
sort (bool or str or list of str) – One or more attributes to sort by or True to just sort the groups.
- Returns:
A dictionary containing the requested groups.
- Return type:
dict
- esmvaltool.diag_scripts.shared.sorted_metadata(metadata, sort)[source]#
Sort a list of metadata describing preprocessed data.
Sorting is done on strings and is not case sensitive.
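The case-insensitive sorting by one or more keys can be sketched as follows (sorted_metadata_sketch is an invented stand-in for the real function):

```python
# Sketch: sort metadata case-insensitively by one or more keys,
# converting every value to str first.
def sorted_metadata_sketch(metadata, sort):
    if isinstance(sort, str):
        sort = [sort]
    return sorted(
        metadata,
        key=lambda item: tuple(str(item.get(key, "")).lower() for key in sort),
    )

metadata = [{"dataset": "mpi-esm-lr"}, {"dataset": "CanESM2"}]
ordered = sorted_metadata_sketch(metadata, "dataset")
print([item["dataset"] for item in ordered])  # ['CanESM2', 'mpi-esm-lr']
```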
- esmvaltool.diag_scripts.shared.variables_available(cfg, short_names)[source]#
Check if data from certain variables is available.
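The availability check can be sketched as requiring every requested short_name to appear among the input datasets (variables_available_sketch is an invented stand-in; the cfg content is illustrative):

```python
# Sketch: a variable counts as available if any input dataset in the
# configuration provides its short_name.
def variables_available_sketch(cfg, short_names):
    available = {
        attrs.get("short_name") for attrs in cfg.get("input_data", {}).values()
    }
    return all(name in available for name in short_names)

cfg = {"input_data": {"/path/to/tas.nc": {"short_name": "tas"}}}
print(variables_available_sketch(cfg, ["tas"]))        # True
print(variables_available_sketch(cfg, ["tas", "pr"]))  # False
```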
Iris helper functions#
Convenience functions for iris objects.
Functions:
check_coordinate – Compare coordinate of cubes and raise error if not identical.
convert_to_iris – Change all appearances of short_name to var_name.
get_mean_cube – Get mean cube of a list of datasets.
intersect_dataset_coordinates – Compare dataset coordinates of cubes and match them if necessary.
iris_project_constraint – Create iris.Constraint to select specific projects from data.
prepare_cube_for_merging – Prepare single iris.cube.Cube in order to merge it later.
unify_1d_cubes – Unify 1D cubes by transforming them to identical coordinates.
unify_time_coord – Unify time coordinate of cube in-place.
- esmvaltool.diag_scripts.shared.iris_helpers.check_coordinate(cubes, coord_name)[source]#
Compare coordinate of cubes and raise error if not identical.
- Parameters:
cubes (iris.cube.CubeList) – Cubes to be compared.
coord_name (str) – Name of the coordinate.
- Returns:
Points of the coordinate.
- Return type:
numpy.array
- Raises:
iris.exceptions.CoordinateNotFoundError – Coordinate coord_name is not a coordinate of one of the cubes.
ValueError – Given coordinate differs for the input cubes.
- esmvaltool.diag_scripts.shared.iris_helpers.convert_to_iris(dict_)[source]#
Change all appearances of short_name to var_name.
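The renaming can be sketched on a plain metadata dictionary (convert_to_iris_sketch is an invented stand-in for the real function):

```python
# Sketch: return a copy with the 'short_name' key renamed to 'var_name',
# which is the attribute name iris uses.
def convert_to_iris_sketch(dict_):
    converted = dict(dict_)
    if "short_name" in converted:
        converted["var_name"] = converted.pop("short_name")
    return converted

meta = {"short_name": "tas", "units": "K"}
converted = convert_to_iris_sketch(meta)
print(converted["var_name"])      # tas
print("short_name" in converted)  # False
```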
- esmvaltool.diag_scripts.shared.iris_helpers.get_mean_cube(datasets)[source]#
Get mean cube of a list of datasets.
- esmvaltool.diag_scripts.shared.iris_helpers.intersect_dataset_coordinates(cubes)[source]#
Compare dataset coordinates of cubes and match them if necessary.
Use intersection of coordinate ‘dataset’ of all given cubes and remove elements which are not given in all cubes.
- Parameters:
cubes (iris.cube.CubeList) – Cubes to be compared.
- Returns:
Transformed cubes.
- Return type:
iris.cube.CubeList
- Raises:
iris.exceptions.CoordinateNotFoundError – Coordinate dataset is not a coordinate of one of the cubes.
ValueError – At least one of the cubes contains a dataset coordinate with duplicate elements or the cubes do not share common elements.
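The intersection logic can be sketched with plain label lists standing in for each cube's dataset coordinate (intersect_datasets_sketch is an invented stand-in):

```python
# Sketch: keep only the dataset labels common to all cubes and raise if
# there is no overlap, mirroring the ValueError described above.
def intersect_datasets_sketch(dataset_lists):
    common = set(dataset_lists[0]).intersection(*dataset_lists[1:])
    if not common:
        raise ValueError("Cubes do not share common 'dataset' elements")
    return [
        [label for label in labels if label in common]
        for labels in dataset_lists
    ]

cubes = [["CanESM2", "MPI-ESM-LR", "NorESM1-M"], ["CanESM2", "MPI-ESM-LR"]]
print(intersect_datasets_sketch(cubes))
# [['CanESM2', 'MPI-ESM-LR'], ['CanESM2', 'MPI-ESM-LR']]
```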
- esmvaltool.diag_scripts.shared.iris_helpers.iris_project_constraint(projects, input_data, negate=False)[source]#
Create iris.Constraint to select specific projects from data.
- Parameters:
projects (list of str) – Projects to select.
input_data – List of metadata describing preprocessed data.
negate (bool, optional) – Negate the constraint, i.e. exclude the given projects.
- Returns:
Constraint for the coordinate dataset.
- Return type:
iris.Constraint
- esmvaltool.diag_scripts.shared.iris_helpers.prepare_cube_for_merging(cube, cube_label)[source]#
Prepare single iris.cube.Cube in order to merge it later.
- Parameters:
cube (iris.cube.Cube) – Cube to be pre-processed.
cube_label (str) – Label for the new scalar coordinate cube_label.
- esmvaltool.diag_scripts.shared.iris_helpers.unify_1d_cubes(cubes, coord_name)[source]#
Unify 1D cubes by transforming them to identical coordinates.
Use union of all coordinates as reference and transform other cubes to it by adding missing values.
- Parameters:
cubes (iris.cube.CubeList) – Cubes to be processed.
coord_name (str) – Name of the coordinate.
- Returns:
Transformed cubes.
- Return type:
iris.cube.CubeList
- Raises:
ValueError – Cubes are not 1D, coordinate name differs or not all cube coordinates are subsets of longest coordinate.
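The union-and-fill idea can be sketched with dictionaries mapping coordinate points to values, with None standing in for masked values (unify_1d_sketch is an invented stand-in):

```python
# Sketch: use the union of all coordinate points as the reference and
# fill points missing from a dataset with None (a stand-in for masked).
def unify_1d_sketch(datasets):
    reference = sorted(set().union(*(d.keys() for d in datasets)))
    return [[d.get(point) for point in reference] for d in datasets]

a = {1850: 0.1, 1851: 0.2}
b = {1851: 0.3, 1852: 0.4}
print(unify_1d_sketch([a, b]))  # [[0.1, 0.2, None], [None, 0.3, 0.4]]
```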
- esmvaltool.diag_scripts.shared.iris_helpers.unify_time_coord(cube, target_units='days since 1850-01-01 00:00:00')[source]#
Unify time coordinate of cube in-place.
- Parameters:
cube (iris.cube.Cube) – Cube whose time coordinate is transformed in-place.
target_units (str or cf_units.Unit, optional) – Target time units.
- Raises:
iris.exceptions.CoordinateNotFoundError – Cube does not contain coordinate time.
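The unit shift behind this can be sketched with datetime arithmetic: re-express points given as days since some origin as days since 1850-01-01. The real function uses cf_units and handles non-standard calendars, which this sketch ignores:

```python
from datetime import datetime

# Sketch: shift time points from "days since <origin>" to
# "days since 1850-01-01" (proleptic Gregorian calendar only).
def to_days_since_1850_sketch(points, origin="1900-01-01"):
    offset = (datetime.fromisoformat(origin) - datetime(1850, 1, 1)).days
    return [point + offset for point in points]

# 1900-01-01 is 18262 days after 1850-01-01.
print(to_days_since_1850_sketch([0.0, 365.0]))  # [18262.0, 18627.0]
```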
Plotting#
Module that provides common plot functions.
Functions:
get_dataset_style – Retrieve the style information for the given dataset.
get_path_to_mpl_style – Get path to matplotlib style file.
global_contourf – Plot global filled contour plot.
global_pcolormesh – Plot global color mesh.
multi_dataset_scatterplot – Plot a multi dataset scatterplot.
quickplot – Plot a cube using one of the iris.quickplot functions.
scatterplot – Plot a scatterplot.
- esmvaltool.diag_scripts.shared.plot.get_dataset_style(dataset, style_file=None)[source]#
Retrieve the style information for the given dataset.
- esmvaltool.diag_scripts.shared.plot.get_path_to_mpl_style(style_file=None)[source]#
Get path to matplotlib style file.
- esmvaltool.diag_scripts.shared.plot.global_contourf(cube, cbar_center=None, cbar_label=None, cbar_range=None, cbar_ticks=None, **kwargs)[source]#
Plot global filled contour plot.
Note
This is only possible if the cube is 2D with dimensional coordinates latitude and longitude.
- Parameters:
cube (iris.cube.Cube) – Cube to plot.
cbar_center (float, optional) – Central value for the colormap, useful for diverging colormaps. Can only be used if cbar_range is given.
cbar_label (str, optional) – Label for the colorbar.
cbar_range (list of float, optional) – Range of the colorbar (first and second list element) and number of distinct colors (third element). See numpy.linspace.
cbar_ticks (list, optional) – Ticks for the colorbar.
**kwargs – Keyword arguments for iris.plot.contourf().
- Returns:
Plot object.
- Return type:
- Raises:
iris.exceptions.CoordinateNotFoundError – Input iris.cube.Cube does not contain the necessary dimensional coordinates 'latitude' and 'longitude'.
ValueError – Input iris.cube.Cube is not 2D.
- esmvaltool.diag_scripts.shared.plot.global_pcolormesh(cube, cbar_center=None, cbar_label=None, cbar_ticks=None, **kwargs)[source]#
Plot global color mesh.
Note
This is only possible if the cube is 2D with dimensional coordinates latitude and longitude.
- Parameters:
cube (iris.cube.Cube) – Cube to plot.
cbar_center (float, optional) – Central value for the colormap, useful for diverging colormaps. Can only be used if vmin and vmax are given.
cbar_label (str, optional) – Label for the colorbar.
cbar_ticks (list, optional) – Ticks for the colorbar.
**kwargs – Keyword arguments for iris.plot.pcolormesh().
- Returns:
Plot object.
- Return type:
- Raises:
iris.exceptions.CoordinateNotFoundError – Input iris.cube.Cube does not contain the necessary dimensional coordinates 'latitude' and 'longitude'.
ValueError – Input iris.cube.Cube is not 2D.
- esmvaltool.diag_scripts.shared.plot.multi_dataset_scatterplot(x_data, y_data, datasets, filepath, **kwargs)[source]#
Plot a multi dataset scatterplot.
Notes
Allowed keyword arguments:
mpl_style_file (str): Path to the matplotlib style file.
dataset_style_file (str): Path to the dataset style file.
plot_kwargs (array-like): Keyword arguments for the plot (e.g. label, markersize, etc.).
save_kwargs (dict): Keyword arguments for saving the plot.
axes_functions (dict): Arbitrary functions for axes, e.g. axes.set_title('title').
- Parameters:
x_data (array-like) – x data of each dataset.
y_data (array-like) – y data of each dataset.
datasets (array-like) – Names of the datasets.
filepath (str) – Path to which plot is written.
**kwargs – Keyword arguments.
- Raises:
TypeError – A non-valid keyword argument is given or x_data, y_data, datasets or (if given) plot_kwargs is not array-like.
ValueError – x_data, y_data, datasets or plot_kwargs do not have the same size.
- esmvaltool.diag_scripts.shared.plot.quickplot(cube, plot_type, filename=None, **kwargs)[source]#
Plot a cube using one of the iris.quickplot functions.
- esmvaltool.diag_scripts.shared.plot.scatterplot(x_data, y_data, filepath, **kwargs)[source]#
Plot a scatterplot.
Notes
Allowed keyword arguments:
mpl_style_file (str): Path to the matplotlib style file.
plot_kwargs (array-like): Keyword arguments for the plot (e.g. label, markersize, etc.).
save_kwargs (dict): Keyword arguments for saving the plot.
axes_functions (dict): Arbitrary functions for axes, e.g. axes.set_title('title').
- Parameters:
x_data (array-like) – x data of each dataset.
y_data (array-like) – y data of each dataset.
filepath (str) – Path to which plot is written.
**kwargs – Keyword arguments.
- Raises:
TypeError – A non-valid keyword argument is given or x_data, y_data or (if given) plot_kwargs is not array-like.
ValueError – x_data, y_data or plot_kwargs do not have the same size.