Preprocessor functions¶
- esmvalcore.preprocessor.DEFAULT_ORDER = ('fix_file', 'load', 'derive', 'fix_metadata', 'concatenate', 'cmor_check_metadata', 'clip_timerange', 'fix_data', 'cmor_check_data', 'add_fx_variables', 'extract_time', 'extract_season', 'extract_month', 'resample_hours', 'resample_time', 'extract_levels', 'weighting_landsea_fraction', 'mask_landsea', 'mask_glaciated', 'mask_landseaice', 'regrid', 'extract_point', 'extract_location', 'mask_multimodel', 'mask_fillvalues', 'mask_above_threshold', 'mask_below_threshold', 'mask_inside_range', 'mask_outside_range', 'clip', 'extract_region', 'extract_shape', 'extract_volume', 'extract_trajectory', 'extract_transect', 'detrend', 'extract_named_regions', 'depth_integration', 'area_statistics', 'volume_statistics', 'amplitude', 'zonal_statistics', 'meridional_statistics', 'hourly_statistics', 'daily_statistics', 'monthly_statistics', 'seasonal_statistics', 'annual_statistics', 'decadal_statistics', 'climate_statistics', 'anomalies', 'regrid_time', 'timeseries_filter', 'linear_trend', 'linear_trend_stderr', 'convert_units', 'ensemble_statistics', 'multi_model_statistics', 'bias', 'remove_fx_variables', 'save', 'cleanup')¶
By default, preprocessor functions are applied in this order.
Preprocessor module.
Functions:
|
Load requested fx files, check with CMOR standards and add the fx variables as cell measures or ancillary variables in the cube containing the data. |
|
Calculate amplitude of cycles by aggregating over coordinates. |
|
Compute annual statistics. |
|
Compute anomalies using a mean with the specified granularity. |
|
Apply a statistical operator in the horizontal direction. |
|
Calculate biases. |
|
Clean up after running the preprocessor. |
|
Compute climate statistics with the specified granularity. |
|
Clip values at a specified minimum and/or maximum value. |
|
Extract time range with a resolution up to seconds. |
|
Check if data conforms to variable's CMOR definition. |
|
Check if metadata conforms to variable's CMOR definition. |
|
Concatenate all cubes after fixing metadata. |
|
Convert the units of a cube to new ones. |
|
Compute daily statistics. |
|
Compute decadal statistics. |
|
Determine the total sum over the vertical component. |
|
Derive variable. |
|
Detrend data along a given dimension. |
|
Entry point for ensemble statistics. |
|
Perform vertical interpolation. |
|
Extract a point using a location name, with interpolation. |
|
Slice cube to get only the data belonging to a specific month. |
|
Extract a specific named region. |
|
Extract a point, with interpolation. |
|
Extract a region from a cube. |
|
Slice cube to get only the data belonging to a specific season. |
|
Extract a region defined by a shapefile. |
|
Extract a time range from a cube. |
|
Extract data along a trajectory. |
|
Extract data along a line of constant latitude or longitude. |
|
Subset a cube based on a range of values in the z-coordinate. |
|
Fix cube data if fixes add present and check it anyway. |
|
Fix files before ESMValTool can load them. |
|
Fix cube metadata if fixes are required and check it anyway. |
|
Compute hourly statistics. |
|
Calculate linear trend of data along a given coordinate. |
|
Calculate standard error of linear trend along a given coordinate. |
|
Load iris cubes from files. |
|
Mask above a specific threshold value. |
|
Mask below a specific threshold value. |
|
Compute and apply a multi-dataset fillvalues mask. |
|
Mask out glaciated areas. |
|
Mask inside a specific threshold range. |
|
Mask out either land mass or sea (oceans, seas and lakes). |
|
Mask out either landsea (combined) or ice. |
|
Apply common mask to all datasets (using logical OR). |
|
Mask outside a specific threshold range. |
|
Compute meridional statistics. |
|
Compute monthly statistics. |
|
Compute multi-model statistics. |
|
Perform horizontal regridding. |
|
Align time axis for cubes so they can be subtracted. |
|
Remove fx variables present as cell measures or ancillary variables in the cube containing the data. |
|
Convert x-hourly data to y-hourly by eliminating extra timesteps. |
|
Change frequency of data by resampling it. |
|
Save iris cubes to file. |
|
Compute seasonal statistics. |
|
Apply a timeseries filter. |
|
Apply a statistical operation over a volume. |
|
Weight fields using land or sea fraction. |
|
Compute zonal statistics. |
- esmvalcore.preprocessor.add_fx_variables(cube, fx_variables, check_level)[source]¶
Load requested fx files, check with CMOR standards and add the fx variables as cell measures or ancillary variables in the cube containing the data.
- Parameters
cube (iris.cube.Cube) – Iris cube with input data.
fx_variables (dict) – Dictionary with fx_variable information.
check_level (CheckLevels) – Level of strictness of the checks.
- Returns
Cube with added cell measures or ancillary variables.
- Return type
- esmvalcore.preprocessor.amplitude(cube, coords)[source]¶
Calculate amplitude of cycles by aggregating over coordinates.
Note
The amplitude is calculated as peak-to-peak amplitude (difference between maximum and minimum value of the signal). Other amplitude types are currently not supported.
- Parameters
cube (iris.cube.Cube) – Input data.
coords (str or list of str) – Coordinates over which is aggregated. For example, use
'year'
to extract the annual cycle amplitude for each year in the data or['day_of_year', 'year']
to extract the diurnal cycle amplitude for each individual day in the data. If the coordinates are not found incube
, try to add it viairis.coord_categorisation
(at the moment, this only works for the temporal coordinatesday_of_month
,day_of_year
,hour
,month
,month_fullname
,month_number
,season
,season_number
,season_year
,weekday
,weekday_fullname
,weekday_number
oryear
.
- Returns
Amplitudes.
- Return type
- Raises
iris.exceptions.CoordinateNotFoundError – A coordinate is not found in
cube
and cannot be added viairis.coord_categorisation
.
- esmvalcore.preprocessor.annual_statistics(cube, operator='mean')[source]¶
Compute annual statistics.
Note that this function does not weight the annual mean if uneven time periods are present. Ie, all data inside the year are treated equally.
- Parameters
cube (iris.cube.Cube) – input cube.
operator (str, optional) – Select operator to apply. Available operators: ‘mean’, ‘median’, ‘std_dev’, ‘sum’, ‘min’, ‘max’, ‘rms’
- Returns
Annual statistics cube
- Return type
- esmvalcore.preprocessor.anomalies(cube, period, reference=None, standardize=False, seasons=('DJF', 'MAM', 'JJA', 'SON'))[source]¶
Compute anomalies using a mean with the specified granularity.
Computes anomalies based on daily, monthly, seasonal or yearly means for the full available period
- Parameters
cube (iris.cube.Cube) – input cube.
period (str) – Period to compute the statistic over. Available periods: ‘full’, ‘season’, ‘seasonal’, ‘monthly’, ‘month’, ‘mon’, ‘daily’, ‘day’
reference (list int, optional, default: None) – Period of time to use a reference, as needed for the ‘extract_time’ preprocessor function If None, all available data is used as a reference
standardize (bool, optional) – If True standardized anomalies are calculated
seasons (list or tuple of str, optional) – Seasons to use if needed. Defaults to (‘DJF’, ‘MAM’, ‘JJA’, ‘SON’)
- Returns
Anomalies cube
- Return type
- esmvalcore.preprocessor.area_statistics(cube, operator)[source]¶
Apply a statistical operator in the horizontal direction.
The average in the horizontal direction. We assume that the horizontal directions are [‘longitude’, ‘latutude’].
This function can be used to apply several different operations in the horizontal plane: mean, standard deviation, median variance, minimum and maximum. These options are specified using the operator argument and the following key word arguments:
mean
Area weighted mean.
median
Median (not area weighted)
std_dev
Standard Deviation (not area weighted)
sum
Area weighted sum.
variance
Variance (not area weighted)
min:
Minimum value
max
Maximum value
rms
Area weighted root mean square.
- Parameters
cube (iris.cube.Cube) – Input cube.
operator (str) – The operation, options: mean, median, min, max, std_dev, sum, variance, rms.
- Returns
collapsed cube.
- Return type
- Raises
iris.exceptions.CoordinateMultiDimError – Exception for latitude axis with dim > 2.
ValueError – if input data cube has different shape than grid area weights
- esmvalcore.preprocessor.bias(products, bias_type='absolute', denominator_mask_threshold=0.001, keep_reference_dataset=False)[source]¶
Calculate biases.
Notes
This preprocessor requires a reference dataset. For this, exactly one input dataset needs to have the facet
reference_for_bias: true
defined in the recipe. In addition, all input datasets need to have identical dimensional coordinates. This can for example be ensured with the preprocessorsesmvalcore.preprocessor.regrid()
and/oresmvalcore.preprocessor.regrid_time()
.- Parameters
products (set of esmvalcore.preprocessor.PreprocessorFile) – Input datasets. Exactly one datasets needs the facet
reference_for_bias: true
.bias_type (str, optional (default: 'absolute')) – Bias type that is calculated. Must be one of
'absolute'
(dataset - ref) or'relative'
((dataset - ref) / ref).denominator_mask_threshold (float, optional (default: 1e-3)) – Threshold to mask values close to zero in the denominator (i.e., the reference dataset) during the calculation of relative biases. All values in the reference dataset with absolute value less than the given threshold are masked out. This setting is ignored when
bias_type
is set to'absolute'
. Please note that for some variables with very small absolute values (e.g., carbon cycle fluxes, which are usually \(< 10^{-6}\) kg m \(^{-2}\) s \(^{-1}\)) it is absolutely essential to change the default value in order to get reasonable results.keep_reference_dataset (bool, optional (default: False)) – If
True
, keep the reference dataset in the output. IfFalse
, drop the reference dataset.
- Returns
Output datasets.
- Return type
set of esmvalcore.preprocessor.PreprocessorFile
- Raises
ValueError – Not exactly one input datasets contains the facet
reference_for_bias: true
;bias_type
is not one of'absolute'
or'relative'
.
- esmvalcore.preprocessor.cleanup(files, remove=None)[source]¶
Clean up after running the preprocessor.
- esmvalcore.preprocessor.climate_statistics(cube, operator='mean', period='full', seasons=('DJF', 'MAM', 'JJA', 'SON'))[source]¶
Compute climate statistics with the specified granularity.
Computes statistics for the whole dataset. It is possible to get them for the full period or with the data grouped by day, month or season
- Parameters
cube (iris.cube.Cube) – input cube.
operator (str, optional) – Select operator to apply. Available operators: ‘mean’, ‘median’, ‘std_dev’, ‘sum’, ‘min’, ‘max’, ‘rms’
period (str, optional) – Period to compute the statistic over. Available periods: ‘full’, ‘season’, ‘seasonal’, ‘monthly’, ‘month’, ‘mon’, ‘daily’, ‘day’
seasons (list or tuple of str, optional) – Seasons to use if needed. Defaults to (‘DJF’, ‘MAM’, ‘JJA’, ‘SON’)
- Returns
Monthly statistics cube
- Return type
- esmvalcore.preprocessor.clip(cube, minimum=None, maximum=None)[source]¶
Clip values at a specified minimum and/or maximum value.
Values lower than minimum are set to minimum and values higher than maximum are set to maximum.
- Parameters
cube (iris.cube.Cube) – iris cube to be clipped
minimum (float) – lower threshold to be applied on input cube data.
maximum (float) – upper threshold to be applied on input cube data.
- Returns
clipped cube.
- Return type
- esmvalcore.preprocessor.clip_timerange(cube, timerange)[source]¶
Extract time range with a resolution up to seconds.
- Parameters
cube (iris.cube.Cube) – Input cube.
timerange (str) – Time range in ISO 8601 format.
- Returns
Sliced cube.
- Return type
- Raises
ValueError – Time ranges are outside the cube’s time limits.
- esmvalcore.preprocessor.cmor_check_data(cube, cmor_table, mip, short_name, frequency, check_level=CheckLevels.DEFAULT)[source]¶
Check if data conforms to variable’s CMOR definition.
The checks performed at this step require the data in memory.
- Parameters
cube (iris.cube.Cube) – Data cube to check.
cmor_table (str) – CMOR definitions to use.
mip – Variable’s mip.
short_name (str) – Variable’s short name
frequency (str) – Data frequency
check_level (CheckLevels) – Level of strictness of the checks.
- esmvalcore.preprocessor.cmor_check_metadata(cube, cmor_table, mip, short_name, frequency, check_level=CheckLevels.DEFAULT)[source]¶
Check if metadata conforms to variable’s CMOR definition.
None of the checks at this step will force the cube to load the data.
- Parameters
cube (iris.cube.Cube) – Data cube to check.
cmor_table (str) – CMOR definitions to use.
mip – Variable’s mip.
short_name (str) – Variable’s short name.
frequency (str) – Data frequency.
check_level (CheckLevels) – Level of strictness of the checks.
- esmvalcore.preprocessor.convert_units(cube, units)[source]¶
Convert the units of a cube to new ones.
This converts units of a cube.
- Parameters
cube (iris.cube.Cube) – input cube
units (str) – new units in udunits form
- Returns
converted cube.
- Return type
- esmvalcore.preprocessor.daily_statistics(cube, operator='mean')[source]¶
Compute daily statistics.
Chunks time in daily periods and computes statistics over them;
- Parameters
cube (iris.cube.Cube) – input cube.
operator (str, optional) – Select operator to apply. Available operators: ‘mean’, ‘median’, ‘std_dev’, ‘sum’, ‘min’, ‘max’, ‘rms’
- Returns
Daily statistics cube
- Return type
- esmvalcore.preprocessor.decadal_statistics(cube, operator='mean')[source]¶
Compute decadal statistics.
Note that this function does not weight the decadal mean if uneven time periods are present. Ie, all data inside the decade are treated equally.
- Parameters
cube (iris.cube.Cube) – input cube.
operator (str, optional) – Select operator to apply. Available operators: ‘mean’, ‘median’, ‘std_dev’, ‘sum’, ‘min’, ‘max’, ‘rms’
- Returns
Decadal statistics cube
- Return type
- esmvalcore.preprocessor.depth_integration(cube)[source]¶
Determine the total sum over the vertical component.
Requires a 3D cube. The z-coordinate integration is calculated by taking the sum in the z direction of the cell contents multiplied by the cell thickness.
- Parameters
cube (iris.cube.Cube) – input cube.
- Returns
collapsed cube.
- Return type
- esmvalcore.preprocessor.derive(cubes, short_name, long_name, units, standard_name=None)[source]¶
Derive variable.
- Parameters
cubes (iris.cube.CubeList) – Includes all the needed variables for derivation defined in
get_required()
.short_name (str) – short_name
long_name (str) – long_name
units (str) – units
standard_name (str, optional) – standard_name
- Returns
The new derived variable.
- Return type
- esmvalcore.preprocessor.detrend(cube, dimension='time', method='linear')[source]¶
Detrend data along a given dimension.
- Parameters
cube (iris.cube.Cube) – input cube.
dimension (str) – Dimension to detrend
method (str) – Method to detrend. Available: linear, constant. See documentation of ‘scipy.signal.detrend’ for details
- Returns
Detrended cube
- Return type
- esmvalcore.preprocessor.ensemble_statistics(products, statistics, output_products, span='overlap')[source]¶
Entry point for ensemble statistics.
An ensemble grouping is performed on the input products. The statistics are then computed calling the
esmvalcore.preprocessor.multi_model_statistics()
module, taking the grouped products as an input.- Parameters
products (list) – Cubes (or products) over which the statistics will be computed.
statistics (list) – Statistical metrics to be computed, e.g. [
mean
,max
]. Choose from the operators listed in the iris.analysis package. Percentiles can be specified likepXX.YY
.output_products (dict) – For internal use only. A dict with statistics names as keys and preprocessorfiles as values. If products are passed as input, the statistics cubes will be assigned to these output products.
span (str (default: 'overlap')) – Overlap or full; if overlap, statitstics are computed on common time- span; if full, statistics are computed on full time spans, ignoring missing data.
- Returns
A set of output_products with the resulting ensemble statistics.
- Return type
See also
- esmvalcore.preprocessor.extract_levels(cube, levels, scheme, coordinate=None, rtol=1e-07, atol=None)[source]¶
Perform vertical interpolation.
- Parameters
cube (iris.cube.Cube) – The source cube to be vertically interpolated.
levels (ArrayLike) – One or more target levels for the vertical interpolation. Assumed to be in the same S.I. units of the source cube vertical dimension coordinate. If the requested levels are sufficiently close to the levels of the cube, cube slicing will take place instead of interpolation.
scheme (str) – The vertical interpolation scheme to use. Choose from ‘linear’, ‘nearest’, ‘linear_extrapolate’, ‘nearest_extrapolate’.
coordinate (optional str) – The coordinate to interpolate. If specified, pressure levels (if present) can be converted to height levels and vice versa using the US standard atmosphere. E.g. ‘coordinate = altitude’ will convert existing pressure levels (air_pressure) to height levels (altitude); ‘coordinate = air_pressure’ will convert existing height levels (altitude) to pressure levels (air_pressure).
rtol (float) – Relative tolerance for comparing the levels in cube to the requested levels. If the levels are sufficiently close, the requested levels will be assigned to the cube and no interpolation will take place.
atol (float) – Absolute tolerance for comparing the levels in cube to the requested levels. If the levels are sufficiently close, the requested levels will be assigned to the cube and no interpolation will take place. By default, atol will be set to 10^-7 times the mean value of the levels on the cube.
- Returns
A cube with the requested vertical levels.
- Return type
See also
regrid
Perform horizontal regridding.
- esmvalcore.preprocessor.extract_location(cube, location, scheme)[source]¶
Extract a point using a location name, with interpolation.
Extracts a single location point from a cube, according to the interpolation scheme
scheme
.The function just retrieves the coordinates of the location and then calls the
extract_point
preprocessor.It can be used to locate cities and villages, but also mountains or other geographical locations.
Note
The geolocator needs a working internet connection.
- Parameters
- Returns
Returns a cube with the extracted point, and with adjusted
latitude and longitude coordinates.
- Raises
ValueError: – If location is not supplied as a preprocessor parameter.
ValueError: – If scheme is not supplied as a preprocessor parameter.
ValueError: – If given location cannot be found by the geolocator.
- esmvalcore.preprocessor.extract_month(cube, month)[source]¶
Slice cube to get only the data belonging to a specific month.
- Parameters
cube (iris.cube.Cube) – Original data
month (int) – Month to extract as a number from 1 to 12
- Returns
data cube for specified month.
- Return type
- Raises
ValueError – if requested month is not present in the cube
- esmvalcore.preprocessor.extract_named_regions(cube, regions)[source]¶
Extract a specific named region.
The region coordinate exist in certain CMIP datasets. This preprocessor allows a specific named regions to be extracted.
- Parameters
cube (iris.cube.Cube) – input cube.
regions (str, list) – A region or list of regions to extract.
- Returns
collapsed cube.
- Return type
- Raises
ValueError – regions is not list or tuple or set.
ValueError – region not included in cube.
- esmvalcore.preprocessor.extract_point(cube, latitude, longitude, scheme)[source]¶
Extract a point, with interpolation.
Extracts a single latitude/longitude point from a cube, according to the interpolation scheme scheme.
Multiple points can also be extracted, by supplying an array of latitude and/or longitude coordinates. The resulting point cube will match the respective latitude and longitude coordinate to those of the input coordinates. If the input coordinate is a scalar, the dimension will be missing in the output cube (that is, it will be a scalar).
- Parameters
- Returns
Returns a cube with the extracted point(s), and with adjusted
latitude and longitude coordinates (see above).
- Raises
ValueError: – If the interpolation scheme is None or unrecognized.
Examples
With a cube that has the coordinates
latitude: [1, 2, 3, 4]
longitude: [1, 2, 3, 4]
- data values: [[[1, 2, 3, 4], [5, 6, …], […], […],
… ]]]
>>> point = extract_point(cube, 2.5, 2.5, 'linear') >>> point.data array([ 8.5, 24.5, 40.5, 56.5])
Extraction of multiple points at once, with a nearest matching scheme. The values for 0.1 will result in masked values, since this lies outside the cube grid.
>>> point = extract_point(cube, [1.4, 2.1], [0.1, 1.1], ... 'nearest') >>> point.data.shape (4, 2, 2) >>> # x, y, z indices of masked values >>> np.where(~point.data.mask) (array([0, 0, 1, 1, 2, 2, 3, 3]), array([0, 1, 0, 1, 0, 1, 0, 1]), array([1, 1, 1, 1, 1, 1, 1, 1])) >>> point.data[~point.data.mask].data array([ 1, 5, 17, 21, 33, 37, 49, 53])
- esmvalcore.preprocessor.extract_region(cube, start_longitude, end_longitude, start_latitude, end_latitude)[source]¶
Extract a region from a cube.
Function that subsets a cube on a box (start_longitude, end_longitude, start_latitude, end_latitude)
- Parameters
cube (iris.cube.Cube) – input data cube.
start_longitude (float) – Western boundary longitude.
end_longitude (float) – Eastern boundary longitude.
start_latitude (float) – Southern Boundary latitude.
end_latitude (float) – Northern Boundary Latitude.
- Returns
smaller cube.
- Return type
- esmvalcore.preprocessor.extract_season(cube, season)[source]¶
Slice cube to get only the data belonging to a specific season.
- Parameters
cube (iris.cube.Cube) – Original data
season (str) – Season to extract. Available: DJF, MAM, JJA, SON and all sequentially correct combinations: e.g. JJAS
- Returns
data cube for specified season.
- Return type
- Raises
ValueError – if requested season is not present in the cube
- esmvalcore.preprocessor.extract_shape(cube, shapefile, method='contains', crop=True, decomposed=False, ids=None)[source]¶
Extract a region defined by a shapefile.
Note that this function does not work for shapes crossing the prime meridian or poles.
- Parameters
cube (iris.cube.Cube) – input cube.
shapefile (str) – A shapefile defining the region(s) to extract.
method (str, optional) – Select all points contained by the shape or select a single representative point. Choose either ‘contains’ or ‘representative’. If ‘contains’ is used, but not a single grid point is contained by the shape, a representative point will selected.
crop (bool, optional) – Crop the resulting cube using extract_region(). Note that data on irregular grids will not be cropped.
decomposed (bool, optional) – Whether or not to retain the sub shapes of the shapefile in the output. If this is set to True, the output cube has a dimension for the sub shapes.
ids (list(str), optional) – List of shapes to be read from the file. The ids are assigned from the attributes ‘name’ or ‘id’ (in that priority order) if present in the file or correspond to the reading order if not.
- Returns
Cube containing the extracted region.
- Return type
See also
extract_region
Extract a region from a cube.
- esmvalcore.preprocessor.extract_time(cube, start_year, start_month, start_day, end_year, end_month, end_day)[source]¶
Extract a time range from a cube.
Given a time range passed in as a series of years, months and days, it returns a time-extracted cube with data only within the specified time range.
- Parameters
- Returns
Sliced cube.
- Return type
- Raises
ValueError – if time ranges are outside the cube time limits
- esmvalcore.preprocessor.extract_trajectory(cube, latitudes, longitudes, number_points=2)[source]¶
Extract data along a trajectory.
latitudes and longitudes are the pairs of coordinates for two points. number_points is the number of points between the two points.
This version uses the expensive interpolate method, but it may be necceasiry for irregular grids.
If only two latitude and longitude coordinates are given, extract_trajectory will produce a cube will extrapolate along a line between those two points, and will add number_points points between the two corners.
If more than two points are provided, then extract_trajectory will produce a cube which has extrapolated the data of the cube to those points, and number_points is not needed.
- Parameters
cube (iris.cube.Cube) – input cube.
latitudes (list) – list of latitude coordinates (floats).
longitudes (list) – list of longitude coordinates (floats).
number_points (int) – number of points to extrapolate (optional).
- Returns
collapsed cube.
- Return type
- Raises
ValueError – if latitude and longitude have different dimensions.
- esmvalcore.preprocessor.extract_transect(cube, latitude=None, longitude=None)[source]¶
Extract data along a line of constant latitude or longitude.
Both arguments, latitude and longitude, are treated identically. Either argument can be a single float, or a pair of floats, or can be left empty. The single float indicates the latitude or longitude along which the transect should be extracted. A pair of floats indicate the range that the transect should be extracted along the secondairy axis.
For instance ‘extract_transect(cube, longitude=-28)’ will produce a transect along 28 West.
Also, ‘extract_transect(cube, longitude=-28, latitude=[-50, 50])’ will produce a transect along 28 West between 50 south and 50 North.
This function is not yet implemented for irregular arrays - instead try the extract_trajectory function, but note that it is currently very slow. Alternatively, use the regrid preprocessor to regrid along a regular grid and then extract the transect.
- Parameters
- Returns
collapsed cube.
- Return type
- Raises
ValueError – slice extraction not implemented for irregular grids.
ValueError – latitude and longitude are both floats or lists; not allowed to slice on both axes at the same time.
- esmvalcore.preprocessor.extract_volume(cube, z_min, z_max)[source]¶
Subset a cube based on a range of values in the z-coordinate.
Function that subsets a cube on a box (z_min, z_max) This function is a restriction of masked_cube_lonlat(); Note that this requires the requested z-coordinate range to be the same sign as the iris cube. ie, if the cube has z-coordinate as negative, then z_min and z_max need to be negative numbers.
- Parameters
cube (iris.cube.Cube) – input cube.
z_min (float) – minimum depth to extract.
z_max (float) – maximum depth to extract.
- Returns
z-coord extracted cube.
- Return type
- esmvalcore.preprocessor.fix_data(cube, short_name, project, dataset, mip, frequency=None, check_level=CheckLevels.DEFAULT, **extra_facets)[source]¶
Fix cube data if fixes add present and check it anyway.
This method assumes that metadata is already fixed and checked.
This method collects all the relevant fixes for a given variable, applies them and checks resulting cube (or the original if no fixes were needed) metadata to ensure that it complies with the standards of its project CMOR tables.
- Parameters
cube (iris.cube.Cube) – Cube to fix
short_name (str) – Variable’s short name
project (str) –
dataset (str) –
mip (str) – Variable’s MIP
frequency (str, optional) – Variable’s data frequency, if available
check_level (CheckLevels) – Level of strictness of the checks. Set to default.
**extra_facets (dict, optional) – Extra facets are mainly used for data outside of the big projects like CMIP, CORDEX, obs4MIPs. For details, see Extra Facets.
- Returns
Fixed and checked cube
- Return type
- Raises
CMORCheckError – If the checker detects errors in the data that it can not fix.
- esmvalcore.preprocessor.fix_file(file, short_name, project, dataset, mip, output_dir, **extra_facets)[source]¶
Fix files before ESMValTool can load them.
This fixes are only for issues that prevent iris from loading the cube or that cannot be fixed after the cube is loaded.
Original files are not overwritten.
- Parameters
file (str) – Path to the original file
short_name (str) – Variable’s short name
project (str) –
dataset (str) –
output_dir (str) – Output directory for fixed files
**extra_facets (dict, optional) – Extra facets are mainly used for data outside of the big projects like CMIP, CORDEX, obs4MIPs. For details, see Extra Facets.
- Returns
Path to the fixed file
- Return type
- esmvalcore.preprocessor.fix_metadata(cubes, short_name, project, dataset, mip, frequency=None, check_level=CheckLevels.DEFAULT, **extra_facets)[source]¶
Fix cube metadata if fixes are required and check it anyway.
This method collects all the relevant fixes for a given variable, applies them and checks the resulting cube (or the original if no fixes were needed) metadata to ensure that it complies with the standards of its project CMOR tables.
- Parameters
cubes (iris.cube.CubeList) – Cubes to fix
short_name (str) – Variable’s short name
project (str) –
dataset (str) –
mip (str) – Variable’s MIP
frequency (str, optional) – Variable’s data frequency, if available
check_level (CheckLevels) – Level of strictness of the checks. Set to default.
**extra_facets (dict, optional) – Extra facets are mainly used for data outside of the big projects like CMIP, CORDEX, obs4MIPs. For details, see Extra Facets.
- Returns
Fixed and checked cube
- Return type
- Raises
CMORCheckError – If the checker detects errors in the metadata that it can not fix.
- esmvalcore.preprocessor.hourly_statistics(cube, hours, operator='mean')[source]¶
Compute hourly statistics.
Chunks time in x hours periods and computes statistics over them.
- Parameters
cube (iris.cube.Cube) – input cube.
hours (int) – Number of hours per period. Must be a divisor of 24 (1, 2, 3, 4, 6, 8, 12)
operator (str, optional) – Select operator to apply. Available operators: ‘mean’, ‘median’, ‘std_dev’, ‘sum’, ‘min’, ‘max’
- Returns
Hourly statistics cube
- Return type
- esmvalcore.preprocessor.linear_trend(cube, coordinate='time')[source]¶
Calculate linear trend of data along a given coordinate.
The linear trend is defined as the slope of an ordinary linear regression.
- Parameters
cube (iris.cube.Cube) – Input data.
coordinate (str, optional (default: 'time')) – Dimensional coordinate over which the trend is calculated.
- Returns
Trends.
- Return type
- Raises
iris.exceptions.CoordinateNotFoundError – The dimensional coordinate with the name
coordinate
is not found incube
.
- esmvalcore.preprocessor.linear_trend_stderr(cube, coordinate='time')[source]¶
Calculate standard error of linear trend along a given coordinate.
This gives the standard error (not confidence intervals!) of the trend defined as the standard error of the estimated slope of an ordinary linear regression.
- Parameters
cube (iris.cube.Cube) – Input data.
coordinate (str, optional (default: 'time')) – Dimensional coordinate over which the standard error of the trend is calculated.
- Returns
Standard errors of trends.
- Return type
- Raises
iris.exceptions.CoordinateNotFoundError – The dimensional coordinate with the name
coordinate
is not found incube
.
- esmvalcore.preprocessor.mask_above_threshold(cube, threshold)[source]¶
Mask above a specific threshold value.
Takes a value ‘threshold’ and masks off anything that is above it in the cube data. Values equal to the threshold are not masked.
- Parameters
cube (iris.cube.Cube) – iris cube to be thresholded.
threshold (float) – threshold to be applied on input cube data.
- Returns
thresholded cube.
- Return type
- esmvalcore.preprocessor.mask_below_threshold(cube, threshold)[source]¶
Mask below a specific threshold value.
Takes a value ‘threshold’ and masks off anything that is below it in the cube data. Values equal to the threshold are not masked.
- Parameters
cube (iris.cube.Cube) – iris cube to be thresholded
threshold (float) – threshold to be applied on input cube data.
- Returns
thresholded cube.
- Return type
- esmvalcore.preprocessor.mask_fillvalues(products, threshold_fraction, min_value=None, time_window=1)[source]¶
Compute and apply a multi-dataset fillvalues mask.
Construct the mask that fills a certain time window with missing values if the number of values in that specific window is less than a given fractional threshold. This function is the extension of _get_fillvalues_mask and performs the combination of missing values masks from each model (of multimodels) into a single fillvalues mask to be applied to each model.
- Parameters
products (iris.cube.Cube) – data products to be masked.
threshold_fraction (float) – fractional threshold to be used as argument for Aggregator. Must be between 0 and 1.
min_value (float) – minimum value threshold; default None If default, no thresholding applied so the full mask will be selected.
time_window (float) – time window to compute missing data counts; default set to 1.
- Returns
Masked iris cubes.
- Return type
- Raises
NotImplementedError – Implementation missing for data with higher dimensionality than 4.
- esmvalcore.preprocessor.mask_glaciated(cube, mask_out)[source]¶
Mask out glaciated areas.
It applies a Natural Earth mask. Note that for computational reasons only the 10 largest polygons are used for masking.
- Parameters
cube (iris.cube.Cube) – data cube to be masked.
mask_out (str) – “glaciated” to mask out glaciated areas
- Returns
Returns the masked iris cube.
- Return type
- Raises
ValueError – Error raised if masking on irregular grids is attempted or if mask_out has a wrong value.
- esmvalcore.preprocessor.mask_inside_range(cube, minimum, maximum)[source]¶
Mask inside a specific threshold range.
Takes a MINIMUM and a MAXIMUM value for the range, and masks off anything that’s between the two in the cube data.
- Parameters
cube (iris.cube.Cube) – iris cube to be thresholded
minimum (float) – lower threshold to be applied on input cube data.
maximum (float) – upper threshold to be applied on input cube data.
- Returns
thresholded cube.
- Return type
- esmvalcore.preprocessor.mask_landsea(cube, mask_out, always_use_ne_mask=False)[source]¶
Mask out either land mass or sea (oceans, seas and lakes).
It uses dedicated ancillary variables (sftlf or sftof) or, in their absence, it applies a Natural Earth mask (land or ocean contours). Note that the Natural Earth masks have different resolutions: 10m for land, and 50m for seas. These are more than enough for ESMValTool purposes.
- Parameters
cube (iris.cube.Cube) – data cube to be masked.
mask_out (str) – either “land” to mask out land mass or “sea” to mask out seas.
always_use_ne_mask (bool, optional (default: False)) –
always apply Natural Earths mask, regardless if fx files are available or not.
Warning
This option has been deprecated in ESMValCore version 2.5. To always use Natural Earth masks, either explicitly remove all
ancillary_variables
from the input cube (when this function is used directly) or specifyfx_variables: null
as option for this preprocessor in the recipe (when this function is used as part of ESMValTool).
- Returns
Returns the masked iris cube.
- Return type
- Raises
ValueError – Error raised if masking on irregular grids is attempted. Irregular grids are not currently supported for masking with Natural Earth shapefile masks.
- esmvalcore.preprocessor.mask_landseaice(cube, mask_out)[source]¶
Mask out either landsea (combined) or ice.
Function that masks out either landsea (land and seas) or ice (Antarctica and Greenland and some wee glaciers).
It uses dedicated ancillary variables (sftgif).
- Parameters
cube (iris.cube.Cube) – data cube to be masked.
mask_out (str) – either “landsea” to mask out landsea or “ice” to mask out ice.
- Returns
Returns the masked iris cube with either land or ice masked out.
- Return type
- Raises
ValueError – Error raised if landsea-ice mask not found as an ancillary variable.
- esmvalcore.preprocessor.mask_multimodel(products)[source]¶
Apply common mask to all datasets (using logical OR).
- Parameters
products (iris.cube.CubeList or list of PreprocessorFile) – Data products/cubes to be masked.
- Returns
Masked data products/cubes.
- Return type
iris.cube.CubeList or list of PreprocessorFile
- Raises
ValueError – Datasets have different shapes.
TypeError – Invalid input data.
- esmvalcore.preprocessor.mask_outside_range(cube, minimum, maximum)[source]¶
Mask outside a specific threshold range.
Takes a MINIMUM and a MAXIMUM value for the range, and masks off anything that’s outside the two in the cube data.
- Parameters
cube (iris.cube.Cube) – iris cube to be thresholded
minimum (float) – lower threshold to be applied on input cube data.
maximum (float) – upper threshold to be applied on input cube data.
- Returns
thresholded cube.
- Return type
- esmvalcore.preprocessor.meridional_statistics(cube, operator)[source]¶
Compute meridional statistics.
- Parameters
cube (iris.cube.Cube) – input cube.
operator (str, optional) – Select operator to apply. Available operators: ‘mean’, ‘median’, ‘std_dev’, ‘sum’, ‘min’, ‘max’, ‘rms’.
- Returns
Meridional statistics cube.
- Return type
- Raises
ValueError – Error raised if computation on irregular grids is attempted. Zonal statistics not yet implemented for irregular grids.
- esmvalcore.preprocessor.monthly_statistics(cube, operator='mean')[source]¶
Compute monthly statistics.
Chunks time in monthly periods and computes statistics over them;
- Parameters
cube (iris.cube.Cube) – input cube.
operator (str, optional) – Select operator to apply. Available operators: ‘mean’, ‘median’, ‘std_dev’, ‘sum’, ‘min’, ‘max’, ‘rms’
- Returns
Monthly statistics cube
- Return type
- esmvalcore.preprocessor.multi_model_statistics(products, span, statistics, output_products=None, groupby=None, keep_input_datasets=True)[source]¶
Compute multi-model statistics.
This function computes multi-model statistics on a list of
products
, which can be instances ofCube
orPreprocessorFile
. The latter is used internally by ESMValCore to store workflow and provenance information, and this option should typically be ignored.Apart from the time coordinate, cubes must have consistent shapes. There are two options to combine time coordinates of different lengths, see the
span
argument.Uses the statistical operators in
iris.analysis
, includingmean
,median
,min
,max
, andstd
. Percentiles are also supported and can be specified likepXX.YY
(for percentileXX.YY
; decimal part optional).Notes
Some of the operators in
iris.analysis
require additional arguments. Except for percentiles, these operators are currently not supported.- Parameters
products (list) – Cubes (or products) over which the statistics will be computed.
span (str) – Overlap or full; if overlap, statitstics are computed on common time- span; if full, statistics are computed on full time spans, ignoring missing data.
statistics (list) – Statistical metrics to be computed, e.g. [
mean
,max
]. Choose from the operators listed in the iris.analysis package. Percentiles can be specified likepXX.YY
.output_products (dict) – For internal use only. A dict with statistics names as keys and preprocessorfiles as values. If products are passed as input, the statistics cubes will be assigned to these output products.
groupby (tuple) – Group products by a given tag or attribute, e.g. (‘project’, ‘dataset’, ‘tag1’).
keep_input_datasets (bool) – If True, the output will include the input datasets. If False, only the computed statistics will be returned.
- Returns
A dictionary of statistics cubes with statistics’ names as keys. (If input type is products, then it will return a set of output_products.)
- Return type
- Raises
ValueError – If span is neither overlap nor full, or if input type is neither cubes nor products.
- esmvalcore.preprocessor.regrid(cube, target_grid, scheme, lat_offset=True, lon_offset=True)[source]¶
Perform horizontal regridding.
Note that the target grid can be a cube (
Cube
), path to a cube (str
), a grid spec (str
) in the form of MxN, or adict
specifying the target grid.For the latter, the
target_grid
should be adict
with the following keys:start_longitude
: longitude at the center of the first grid cell.end_longitude
: longitude at the center of the last grid cell.step_longitude
: constant longitude distance between grid cellcenters.
start_latitude
: latitude at the center of the first grid cell.end_latitude
: longitude at the center of the last grid cell.step_latitude
: constant latitude distance between grid cell centers.
- Parameters
cube (
Cube
) – The source cube to be regridded.target_grid (Cube or str or dict) –
The (location of a) cube that specifies the target or reference grid for the regridding operation.
Alternatively, a string cell specification may be provided, of the form
MxN
, which specifies the extent of the cell, longitude by latitude (degrees) for a global, regular target grid.Alternatively, a dictionary with a regional target grid may be specified (see above).
scheme (str) – The regridding scheme to perform, choose from
linear
,linear_extrapolate
,nearest
,area_weighted
,unstructured_nearest
.lat_offset (bool) – Offset the grid centers of the latitude coordinate w.r.t. the pole by half a grid step. This argument is ignored if
target_grid
is a cube or file.lon_offset (bool) – Offset the grid centers of the longitude coordinate w.r.t. Greenwich meridian by half a grid step. This argument is ignored if
target_grid
is a cube or file.
- Returns
Regridded cube.
- Return type
See also
extract_levels
Perform vertical regridding.
- esmvalcore.preprocessor.regrid_time(cube, frequency)[source]¶
Align time axis for cubes so they can be subtracted.
Operations on time units, time points and auxiliary coordinates so that any cube from cubes can be subtracted from any other cube from cubes. Currently this function supports yearly (frequency=yr), monthly (frequency=mon), daily (frequency=day), 6-hourly (frequency=6hr), 3-hourly (frequency=3hr) and hourly (frequency=1hr) data time frequencies.
- Parameters
cube (iris.cube.Cube) – input cube.
frequency (str) – data frequency: mon, day, 1hr, 3hr or 6hr
- Returns
cube with converted time axis and units.
- Return type
- esmvalcore.preprocessor.remove_fx_variables(cube)[source]¶
Remove fx variables present as cell measures or ancillary variables in the cube containing the data.
- Parameters
cube (iris.cube.Cube) – Iris cube with data and cell measures or ancillary variables.
- Returns
Cube without cell measures or ancillary variables.
- Return type
- esmvalcore.preprocessor.resample_hours(cube, interval, offset=0)[source]¶
Convert x-hourly data to y-hourly by eliminating extra timesteps.
Convert x-hourly data to y-hourly (y > x) by eliminating the extra timesteps. This is intended to be used only with instantaneous values.
For example:
resample_hours(cube, interval=6): Six-hourly intervals at 0:00, 6:00, 12:00, 18:00.
resample_hours(cube, interval=6, offset=3): Six-hourly intervals at 3:00, 9:00, 15:00, 21:00.
resample_hours(cube, interval=12, offset=6): Twelve-hourly intervals at 6:00, 18:00.
- Parameters
cube (iris.cube.Cube) – Input cube.
interval (int) – The period (hours) of the desired data.
offset (int, optional) – The firs hour (hours) of the desired data.
- Returns
Cube with the new frequency.
- Return type
- Raises
ValueError: – The specified frequency is not a divisor of 24.
- esmvalcore.preprocessor.resample_time(cube, month=None, day=None, hour=None)[source]¶
Change frequency of data by resampling it.
Converts data from one frequency to another by extracting the timesteps that match the provided month, day and/or hour. This is meant to be used with instantaneous values when computing statistics is not desired.
For example:
resample_time(cube, hour=6): Daily values taken at 6:00.
resample_time(cube, day=15, hour=6): Monthly values taken at 15th 6:00.
resample_time(cube, month=6): Yearly values, taking in June
resample_time(cube, month=6, day=1): Yearly values, taking 1st June
The condition must yield only one value per interval: the last two samples above will produce yearly data, but the first one is meant to be used to sample from monthly output and the second one will work better with daily.
- Parameters
cube (iris.cube.Cube) – Input cube.
month (int, optional) – Month to extract
day (int, optional) – Day to extract
hour (int, optional) – Hour to extract
- Returns
Cube with the new frequency.
- Return type
- esmvalcore.preprocessor.save(cubes, filename, optimize_access='', compress=False, alias='', **kwargs)[source]¶
Save iris cubes to file.
- Parameters
cubes (iterable of iris.cube.Cube) – Data cubes to be saved
filename (str) – Name of target file
optimize_access (str) –
Set internal NetCDF chunking to favour a reading scheme
Values can be map or timeseries, which improve performance when reading the file one map or time series at a time. Users can also provide a coordinate or a list of coordinates. In that case the better performance will be avhieved by loading all the values in that coordinate at a time
compress (bool, optional) – Use NetCDF internal compression.
alias (str, optional) – Var name to use when saving instead of the one in the cube.
- Returns
filename
- Return type
- Raises
ValueError – cubes is empty.
- esmvalcore.preprocessor.seasonal_statistics(cube, operator='mean', seasons=('DJF', 'MAM', 'JJA', 'SON'))[source]¶
Compute seasonal statistics.
Chunks time seasons and computes statistics over them.
- Parameters
cube (iris.cube.Cube) – input cube.
operator (str, optional) – Select operator to apply. Available operators: ‘mean’, ‘median’, ‘std_dev’, ‘sum’, ‘min’, ‘max’, ‘rms’
seasons (list or tuple of str, optional) – Seasons to build. Available: (‘DJF’, ‘MAM’, ‘JJA’, SON’) (default) and all sequentially correct combinations holding every month of a year: e.g. (‘JJAS’,’ONDJFMAM’), or less in case of prior season extraction.
- Returns
Seasonal statistic cube
- Return type
- esmvalcore.preprocessor.timeseries_filter(cube, window, span, filter_type='lowpass', filter_stats='sum')[source]¶
Apply a timeseries filter.
Method borrowed from iris example
Apply each filter using the rolling_window method used with the weights keyword argument. A weighted sum is required because the magnitude of the weights are just as important as their relative sizes.
See also the iris rolling window
iris.cube.Cube.rolling_window
.- Parameters
cube (iris.cube.Cube) – input cube.
window (int) – The length of the filter window (in units of cube time coordinate).
span (int) – Number of months/days (depending on data frequency) on which weights should be computed e.g. 2-yearly: span = 24 (2 x 12 months). Span should have same units as cube time coordinate.
filter_type (str, optional) – Type of filter to be applied; default ‘lowpass’. Available types: ‘lowpass’.
filter_stats (str, optional) – Type of statistic to aggregate on the rolling window; default ‘sum’. Available operators: ‘mean’, ‘median’, ‘std_dev’, ‘sum’, ‘min’, ‘max’, ‘rms’
- Returns
cube time-filtered using ‘rolling_window’.
- Return type
- Raises
iris.exceptions.CoordinateNotFoundError: – Cube does not have time coordinate.
NotImplementedError: – If filter_type is not implemented.
- esmvalcore.preprocessor.volume_statistics(cube, operator)[source]¶
Apply a statistical operation over a volume.
The volume average is weighted according to the cell volume. Cell volume is calculated from iris’s cartography tool multiplied by the cell thickness.
- Parameters
cube (iris.cube.Cube) – Input cube.
operator (str) – The operation to apply to the cube, options are: ‘mean’.
- Returns
collapsed cube.
- Return type
- Raises
ValueError – if input cube shape differs from grid volume cube shape.
- esmvalcore.preprocessor.weighting_landsea_fraction(cube, area_type)[source]¶
Weight fields using land or sea fraction.
This preprocessor function weights a field with its corresponding land or sea area fraction (value between 0 and 1). The application of this is important for most carbon cycle variables (and other land-surface outputs), which are e.g. reported in units of kgC m-2. This actually refers to ‘per square meter of land/sea’ and NOT ‘per square meter of gridbox’. So in order to integrate these globally or regionally one has to both area-weight the quantity but also weight by the land/sea fraction.
- Parameters
cube (iris.cube.Cube) – Data cube to be weighted.
area_type (str) – Use land (
'land'
) or sea ('sea'
) fraction for weighting.
- Returns
Land/sea fraction weighted cube.
- Return type
- Raises
TypeError –
area_type
is not'land'
or'sea'
.ValueError – Land/sea fraction variables
sftlf
orsftof
not found.
- esmvalcore.preprocessor.zonal_statistics(cube, operator)[source]¶
Compute zonal statistics.
- Parameters
cube (iris.cube.Cube) – input cube.
operator (str, optional) – Select operator to apply. Available operators: ‘mean’, ‘median’, ‘std_dev’, ‘sum’, ‘min’, ‘max’, ‘rms’.
- Returns
Zonal statistics cube.
- Return type
- Raises
ValueError – Error raised if computation on irregular grids is attempted. Zonal statistics not yet implemented for irregular grids.