Release: recipes runs and comparison
The release procedure for ESMValTool is currently a fairly involved process, so it is important to be well organized and to document each procedural step, so that the next release manager can follow these steps and finalize the release without delays.
The workflow below assumes that an ESMValCore release candidate, or a completed stable release, has been released and deployed on conda-forge and PyPI; it also assumes that the release manager has access to accounts on DKRZ/Levante.
Below is a list of steps that the release manager, together with the previous release manager, should go through before the actual release; these include testing the new code by running all available recipes in the main branch, and comparing the output against the previous release.
Open an issue on GitHub
First, open an issue on GitHub where the testing workflow before the release is documented (see example https://github.com/ESMValGroup/ESMValTool/issues/2881). Name it something relevant like “Recipe testing and comparison for release 2.x.x”, and populate the issue description with information about where the testing is taking place, which tools are used, and in which versions. Here are some suggestions:
path to the output directories on DKRZ/Levante
We should document various utilities’ versions so that the work can be reproduced in case there is an issue, or release work needs to be picked up mid-release by another release manager:
documenting conda/mamba versions:
documenting git branch and its state:
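As an illustration of the version bookkeeping described above, here is a minimal Python sketch that gathers tool versions and the git state into a bullet list that can be pasted into the release issue. The command list, the function names and the output layout are assumptions made for this example; they are not part of ESMValTool, and the commands should be adapted to whatever was actually used for the release.

```python
import subprocess

# Hypothetical command list: adjust it to the tools actually used for the release.
VERSION_COMMANDS = {
    "conda": ["conda", "--version"],
    "mamba": ["mamba", "--version"],
    "git branch": ["git", "rev-parse", "--abbrev-ref", "HEAD"],
    "git commit": ["git", "rev-parse", "HEAD"],
    "git state": ["git", "status", "--short"],
}


def run_command(command):
    """Run a command and return its stdout on one line, or a note if it fails."""
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError) as exc:
        return f"unavailable ({exc.__class__.__name__})"
    # Join multi-line output (e.g. several modified files) onto one line.
    return "; ".join(result.stdout.split("\n")).strip("; ").strip()


def collect_versions(commands=VERSION_COMMANDS, run=run_command):
    """Return one bullet per command, ready to paste into the release issue."""
    lines = []
    for label, command in commands.items():
        output = run(command) or "(no output)"
        lines.append(f"- {label}: `{' '.join(command)}` -> {output}")
    return "\n".join(lines)
```

Calling `print(collect_versions())` from the root of the ESMValTool checkout, inside the test environment, prints the list. An empty "git state" entry means the working tree has no uncommitted changes.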
Furthermore, the runtime environment needs to be documented: make a copy of the environment file, and attach it in the release testing issue; to record the environment in a yaml file use e.g.
conda env export > ToolEnv_v2xx_Test.txt
Modifications to configuration files need to be documented as well.
To test recipes, it is recommended to only use the default options and DKRZ data directories, simply by uncommenting
the DKRZ-Levante block of a newly generated
Submit run scripts - test recipe runs
We are now ready to start running all the available recipes, to compare their output against the previous release. Running is currently done via batch scripts submitted to a scheduler (SLURM). Generate the submission scripts using the generate.py utility Python script.
You will have to set the name of your environment, your email address (if you want to get email notifications for successful/failed jobs) and the name of the directory in which you want to store the log files of the jobs. A compute project from which resources are billed needs to be set, and the default partition is set to interactive. More information on running jobs with SLURM on DKRZ/Levante can be found in the DKRZ documentation.
You can also specify the path to your config-user.yml file where max_parallel_tasks can be set. The script was found to work well with max_parallel_tasks=8. Some recipes need to be run with max_parallel_tasks=1 (large memory requirements, CMIP3 data, diagnostic issues, …). These recipes are listed in ONE_TASK_RECIPES.
Some recipes need other job requirements; you can add their headers in the SPECIAL_RECIPES dictionary. Otherwise, the header will be written following the template that is written in the lines below. If you want to exclude recipes, you can do so by uncommenting the exclude lines.
Before submitting all jobs, it is recommended to try the batch script generation with
submit = False and check the generated files. If recipes with special runtime requirements have been added to ESMValTool since the previous release, these may need to be added to SPECIAL_RECIPES and/or to ONE_TASK_RECIPES.
Other recipes should run successfully with the default SLURM settings set in this script.
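The selection logic described above can be pictured with the following Python sketch. It is not the code of generate.py: the dictionary layout, the recipe names, the default wall time and the function names are all assumptions chosen for illustration. Only the names SPECIAL_RECIPES and ONE_TASK_RECIPES, the interactive default partition and the value max_parallel_tasks=8 come from the text.

```python
# Illustrative values only: none of these entries are copied from generate.py.
DEFAULT_HEADER = {"partition": "interactive", "time": "04:00:00"}  # time is assumed
SPECIAL_RECIPES = {
    # hypothetical recipe name with heavier job requirements
    "recipe_heavy_example": {
        "partition": "compute",
        "time": "08:00:00",
        "mem": "0",
        "constraint": "512G",
    },
}
ONE_TASK_RECIPES = ["recipe_serial_example"]  # hypothetical recipe name


def job_settings(recipe, default_tasks=8):
    """Return (header options, max_parallel_tasks) for one recipe."""
    header = dict(DEFAULT_HEADER)
    # Recipes with special requirements override the default header options.
    header.update(SPECIAL_RECIPES.get(recipe, {}))
    max_parallel_tasks = 1 if recipe in ONE_TASK_RECIPES else default_tasks
    return header, max_parallel_tasks


def render_header(header):
    """Format header options as #SBATCH lines for a launch script."""
    return "\n".join(f"#SBATCH --{key}={value}" for key, value in header.items())
```

For example, `render_header(job_settings("recipe_heavy_example")[0])` yields the four `#SBATCH` lines for the compute partition, while any recipe not listed in either collection gets the default header and eight parallel tasks.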
The launch scripts will be saved in the same directory you execute the script from. These are named like
To submit these scripts to the SLURM scheduler, use the
sbatch launch_recipe_<name>.sh command. You can check the status of your BATCH queue by invoking:
squeue -u $USER
Also, for computationally-heavy recipes, you can require more memory and/or time, see e.g. edited batch header below (note the compute partition which is used for such heavy runs):
#SBATCH --partition=compute
#SBATCH --time=08:00:00
#SBATCH --mem=0
#SBATCH --constraint=512G
On DKRZ/Levante, a user can’t have more than 20 SLURM jobs running at a time.
As soon as a job finishes, the next one should start. More information on job handling at DKRZ is available in the DKRZ documentation.
Also note that the --mem=0 argument needs to be specified if any of the --constraint arguments are used for memory requests, so that the node’s full memory is allocated.
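A quick way to catch a forgotten --mem=0 before submitting is to scan the generated launch scripts. The Python sketch below is one possible check; the function name is hypothetical, and it deliberately flags any --constraint option, which is stricter than the rule above (it does not try to tell memory constraints apart from other constraints).

```python
def missing_full_memory_request(header_text):
    """Return True if a --constraint option is set without --mem=0."""
    options = []
    for line in header_text.splitlines():
        tokens = line.split()
        # Keep the first option following each "#SBATCH" directive.
        if len(tokens) > 1 and tokens[0] == "#SBATCH":
            options.append(tokens[1])
    has_constraint = any(opt.startswith("--constraint") for opt in options)
    has_full_memory = "--mem=0" in options
    return has_constraint and not has_full_memory
```

Reading each launch script with `open(path).read()` and passing the text to this function is enough to list the scripts that need editing.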
Analyse the results
Once all jobs are completed, assemble some statistics so that issues with certain recipes can be followed up, and document this information in the release issue, such as:
number of successfully run recipes
number of failed recipes with preprocessor errors (can they be fixed? Can the fixes be included in the release?)
number of failed recipes with diagnostic errors (can they be fixed? Can the fixes be included in the release?)
number of recipes that are missing data
number of recipes that have various other issues (and document them)
To parse the output of all these runs, use the
parse_recipes_output.py utility Python script.
It is recommended to run the recipes with log_level: info in your config file to enable the parsing script to run fast.
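Once each recipe has been assigned an outcome, the statistics listed above can be tallied in a few lines. The sketch below is independent of parse_recipes_output.py and does not reproduce its output format; the category labels, the function name and the input mapping are assumptions made for this example.

```python
from collections import Counter

# Hypothetical labels mirroring the statistics listed above.
CATEGORIES = (
    "success",
    "preprocessor error",
    "diagnostic error",
    "missing data",
    "other",
)


def summarise(results):
    """Count recipes per category; unknown statuses are counted as 'other'."""
    counts = Counter(
        status if status in CATEGORIES else "other"
        for status in results.values()
    )
    return "\n".join(f"- {category}: {counts[category]}" for category in CATEGORIES)
```

Here `results` maps a recipe name to its status, for instance `{"recipe_a": "success", "recipe_b": "missing data"}`, and the returned bullet list can be pasted into the release issue.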
Running the comparison
To compare the newly produced output from running all recipes, follow the steps below.
Access the DKRZ esmvaltool VM, then install Miniconda on the VM. If you already have a Miniconda installer downloaded in your Levante $HOME, copy it to the VM with:
scp Miniconda3-py39_4.12.0-Linux-x86_64.sh email@example.com:/mnt/esmvaltool_disk2/work/<username>
Conda environments should not be created in the home directory, because it is on a very small disk; create them instead in a directory with your username under /mnt/esmvaltool_disk2/work/<username>
Next, we need to set up the input files
The /work partition is visible from the VM, so you can run the compare tool straight on the VM.
The steps to running the compare tool on the VM are the following:
run date: log the run date here
conda env: log the name of the conda environment you are using
ESMValTool branch: log the name of the code branch you are using (e.g. v2.8.x)
prerequisite - install imagehash: pip install imagehash
reference run (v2.7.0; previous stable release): export reference_dir=/work/bd0854/b382109/v270 (contains preproc/ dirs too, 122 recipes)
current run (v2.8.0): export current_dir=path_to_current_run
run the comparison script with:
nohup python ESMValTool/esmvaltool/utils/testing/regression/compare.py --reference $reference_dir --current $current_dir > compare_v280_output.txt
Copy the comparison txt file to the release issue. Some of the recipes will appear as having output identical to that of the previous release. However, others will need human inspection. Ask the recipe maintainers (@ESMValGroup/esmvaltool-recipe-maintainers) and the ESMValTool Development Team (@ESMValGroup/esmvaltool-developmentteam) to provide assistance in checking the results. Here are some guidelines on how to perform the human inspection:
look at plots from current run vs previous release run: most of them will be identical, but if Matplotlib has changed some plotting feature, images may look slightly different so the comparison script may report them if the difference is larger than the threshold - but Mark I eyeball inspection will show they are identical
other plots will differ because the diagnostic developers have updated the plot settings (different colours, axes, etc.): if they look similar enough, then it’s fine
report (and subsequently open issues) if you notice major differences in plots; most times, a simple comment on the release issue in which you tag the diagnostic developers leads to them having a look at the plots and OK-ing them; if that’s not the case, then open a separate issue. You can find examples of release issues containing overview lists and tables of failures and problems in 2881 and 3076.
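To see why a small rendering change can push an image over a comparison threshold while a larger visual change is obvious to the eye, consider the toy perceptual hash below. It is not the algorithm used by compare.py: the text only states that the imagehash package is required and that a threshold is applied, so the average-hash scheme, the threshold value and all function names here are assumptions for illustration. Images are represented as plain lists of grayscale rows to keep the sketch dependency-free.

```python
def average_hash(pixels):
    """Return a tuple of bits: 1 where a pixel is brighter than the image mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(int(value > mean) for value in flat)


def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equally long hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


def images_differ(pixels_a, pixels_b, threshold=1):
    """Flag a pair of images whose hash distance exceeds the (assumed) threshold."""
    distance = hamming_distance(average_hash(pixels_a), average_hash(pixels_b))
    return distance > threshold
```

With this scheme, two images whose pixel values shift slightly but keep the same bright and dark regions produce the same hash, whereas swapping the bright and dark regions flips every bit. Flagged pairs still need the visual check described above.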
Here you can find a list of utility scripts used to run recipes and analyse the results:
Python scripts that create slurm submission scripts and parse slurm log files.
Python script that compares one or more recipe runs to known good previous run(s).
Python script that creates the