{ "cells": [ { "cell_type": "markdown", "id": "63635ce7-0c74-422a-963d-a92e06ffa3bf", "metadata": {}, "source": [ "# Composing recipes\n", "\n", "This notebook shows how to fill the `datasets` section of a [recipe](https://docs.esmvaltool.org/projects/esmvalcore/en/latest/recipe/overview.html)." ] }, { "cell_type": "code", "execution_count": 1, "id": "2c4dfa8a-b1d2-4d3f-9dc1-4aa930501ed5", "metadata": {}, "outputs": [], "source": [ "from esmvalcore.config import CFG\n", "from esmvalcore.dataset import Dataset, datasets_to_recipe\n", "import yaml" ] }, { "attachments": {}, "cell_type": "markdown", "id": "ad877bb4-1ca7-4819-852d-46d462890b32", "metadata": {}, "source": [ "Configure ESMValCore so that it always searches ESGF for data:" ] }, { "cell_type": "code", "execution_count": 2, "id": "0d0a8ce7-6f42-4956-8fee-82238f5ace85", "metadata": {}, "outputs": [], "source": [ "CFG['search_esgf'] = 'always'" ] }, { "cell_type": "markdown", "id": "4a210f70-89d4-4cee-86db-3e0e353a36ab", "metadata": {}, "source": [ "Here is a small example that uses the `datasets_to_recipe` function to convert a list of datasets to a recipe:" ] }, { "cell_type": "code", "execution_count": 3, "id": "90d738a2-9934-4fd9-aeeb-edf6ba68a50c", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "datasets:\n", "- dataset: CanESM5-1\n", "diagnostics:\n", " diagnostic_name:\n", " variables:\n", " pr:\n", " ensemble: r1i1p1f1\n", " exp: historical\n", " grid: gn\n", " mip: Amon\n", " project: CMIP6\n", " timerange: 2000/2002\n", " tas:\n", " ensemble: r1i1p1f1\n", " exp: historical\n", " grid: gn\n", " mip: Amon\n", " project: CMIP6\n", " timerange: 2000/2002\n", "\n" ] } ], "source": [ "tas = Dataset(\n", "    short_name='tas',\n", "    mip='Amon',\n", "    project='CMIP6',\n", "    dataset='CanESM5-1',\n", "    ensemble='r1i1p1f1',\n", "    exp='historical',\n", "    grid='gn',\n", "    timerange='2000/2002',\n", ")\n", "tas['diagnostic'] = 'diagnostic_name'\n", "\n", "pr = 
tas.copy(short_name='pr')\n", "\n", "print(yaml.safe_dump(datasets_to_recipe([tas, pr])))" ] }, { "cell_type": "markdown", "id": "392a9235-606d-4270-b0d6-f6895dce4cde", "metadata": {}, "source": [ "A more ambitious recipe might use all data that is available on ESGF. To do this, define a dataset template that uses the wildcard `*` for every facet that may take any value. The `from_files()` method then expands the template into a list of the datasets for which files are available." ] }, { "cell_type": "code", "execution_count": 4, "id": "7764bbee-83e2-4061-a3a4-103b266d55e9", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "778" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dataset_template = Dataset(\n", "    short_name='tas',\n", "    mip='Amon',\n", "    project='CMIP6',\n", "    exp='historical',\n", "    dataset='*',\n", "    institute='*',\n", "    ensemble='*',\n", "    grid='*',\n", ")\n", "datasets = list(dataset_template.from_files())\n", "len(datasets)" ] }, { "cell_type": "markdown", "id": "ba556a90-ba20-49f3-a103-5bbfd9c332a8", "metadata": {}, "source": [ "Converting this list with `datasets_to_recipe` results in the following recipe:" ] }, { "cell_type": "code", "execution_count": 5, "id": "025952e4-fc31-4193-ad3a-9cef20f93449", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "datasets:\n", "- dataset: TaiESM1\n", " ensemble: r(1:2)i1p1f1\n", " grid: gn\n", " institute: AS-RCEC\n", "- dataset: AWI-CM-1-1-MR\n", " ensemble: r(1:5)i1p1f1\n", " grid: gn\n", " institute: AWI\n", "- dataset: AWI-ESM-1-1-LR\n", " ensemble: r1i1p1f1\n", " grid: gn\n", " institute: AWI\n", "- dataset: BCC-CSM2-MR\n", " ensemble: r(1:3)i1p1f1\n", " grid: gn\n", " institute: BCC\n", "- dataset: BCC-ESM1\n", " ensemble: r(1:3)i1p1f1\n", " grid: gn\n", " institute: BCC\n", "- dataset: CAMS-CSM1-0\n", " ensemble: r1i1p1f2\n", " grid: gn\n", " institute: CAMS\n", "- dataset: CAMS-CSM1-0\n", " ensemble: r(1:2)i1p1f1\n", " grid: gn\n", " institute: CAMS\n", "- dataset: CAS-ESM2-0\n", " ensemble: 
r(1:4)i1p1f1\n", " grid: gn\n", " institute: CAS\n", "- dataset: FGOALS-f3-L\n", " ensemble: r(1:3)i1p1f1\n", " grid: gr\n", " institute: CAS\n", "- dataset: FGOALS-g3\n", " ensemble: r(1:6)i1p1f1\n", " grid: gn\n", " institute: CAS\n", "- dataset: IITM-ESM\n", " ensemble: r1i1p1f1\n", " grid: gn\n", " institute: CCCR-IITM\n", "- dataset: CanESM5-1\n", " ensemble: r(1:20)i1p1f1\n", " grid: gn\n", " institute: CCCma\n", "- dataset: CanESM5-1\n", " ensemble: r(1:25)i1p2f1\n", " grid: gn\n", " institute: CCCma\n", "- dataset: CanESM5-1\n", " ensemble: r22i1p1f1\n", " grid: gn\n", " institute: CCCma\n", "- dataset: CanESM5-1\n", " ensemble: r(24:39)i1p1f1\n", " grid: gn\n", " institute: CCCma\n", "- dataset: CanESM5-1\n", " ensemble: r(41:50)i1p1f1\n", " grid: gn\n", " institute: CCCma\n", "- dataset: CanESM5-CanOE\n", " ensemble: r(1:3)i1p2f1\n", " grid: gn\n", " institute: CCCma\n", "- dataset: CanESM5\n", " ensemble: r(1:25)i1p1f1\n", " grid: gn\n", " institute: CCCma\n", "- dataset: CanESM5\n", " ensemble: r(1:40)i1p2f1\n", " grid: gn\n", " institute: CCCma\n", "- dataset: CMCC-CM2-HR4\n", " ensemble: r1i1p1f1\n", " grid: gn\n", " institute: CMCC\n", "- dataset: CMCC-CM2-SR5\n", " ensemble: r1i1p1f1\n", " grid: gn\n", " institute: CMCC\n", "- dataset: CMCC-CM2-SR5\n", " ensemble: r(2:11)i1p2f1\n", " grid: gn\n", " institute: CMCC\n", "- dataset: CMCC-ESM2\n", " ensemble: r1i1p1f1\n", " grid: gn\n", " institute: CMCC\n", "- dataset: CNRM-CM6-1-HR\n", " ensemble: r1i1p1f2\n", " grid: gr\n", " institute: CNRM-CERFACS\n", "- dataset: CNRM-CM6-1\n", " ensemble: r(1:30)i1p1f2\n", " grid: gr\n", " institute: CNRM-CERFACS\n", "- dataset: CNRM-ESM2-1\n", " ensemble: r(1:11)i1p1f2\n", " grid: gr\n", " institute: CNRM-CERFACS\n", "- dataset: ACCESS-CM2\n", " ensemble: r(1:10)i1p1f1\n", " grid: gn\n", " institute: CSIRO-ARCCSS\n", "- dataset: ACCESS-ESM1-5\n", " ensemble: r(1:40)i1p1f1\n", " grid: gn\n", " institute: CSIRO\n", "- dataset: E3SM-1-0\n", " ensemble: 
r(1:5)i1p1f1\n", " grid: gr\n", " institute: E3SM-Project\n", "- dataset: E3SM-1-1-ECA\n", " ensemble: r1i1p1f1\n", " grid: gr\n", " institute: E3SM-Project\n", "- dataset: E3SM-1-1\n", " ensemble: r1i1p1f1\n", " grid: gr\n", " institute: E3SM-Project\n", "- dataset: E3SM-2-0\n", " ensemble: r(1:5)i1p1f1\n", " grid: gr\n", " institute: E3SM-Project\n", "- dataset: EC-Earth3-AerChem\n", " ensemble: r1i1p1f1\n", " grid: gr\n", " institute: EC-Earth-Consortium\n", "- dataset: EC-Earth3-AerChem\n", " ensemble: r(3:4)i1p1f1\n", " grid: gr\n", " institute: EC-Earth-Consortium\n", "- dataset: EC-Earth3-CC\n", " ensemble: r1i1p1f1\n", " grid: gr\n", " institute: EC-Earth-Consortium\n", "- dataset: EC-Earth3-CC\n", " ensemble: r4i1p1f1\n", " grid: gr\n", " institute: EC-Earth-Consortium\n", "- dataset: EC-Earth3-CC\n", " ensemble: r(6:13)i1p1f1\n", " grid: gr\n", " institute: EC-Earth-Consortium\n", "- dataset: EC-Earth3-Veg-LR\n", " ensemble: r(1:3)i1p1f1\n", " grid: gr\n", " institute: EC-Earth-Consortium\n", "- dataset: EC-Earth3-Veg\n", " ensemble: r(1:6)i1p1f1\n", " grid: gr\n", " institute: EC-Earth-Consortium\n", "- dataset: EC-Earth3-Veg\n", " ensemble: r10i1p1f1\n", " grid: gr\n", " institute: EC-Earth-Consortium\n", "- dataset: EC-Earth3-Veg\n", " ensemble: r12i1p1f1\n", " grid: gr\n", " institute: EC-Earth-Consortium\n", "- dataset: EC-Earth3-Veg\n", " ensemble: r14i1p1f1\n", " grid: gr\n", " institute: EC-Earth-Consortium\n", "- dataset: EC-Earth3\n", " ensemble: r(1:7)i1p1f1\n", " grid: gr\n", " institute: EC-Earth-Consortium\n", "- dataset: EC-Earth3\n", " ensemble: r(9:25)i1p1f1\n", " grid: gr\n", " institute: EC-Earth-Consortium\n", "- dataset: EC-Earth3\n", " ensemble: r(101:150)i1p1f1\n", " grid: gr\n", " institute: EC-Earth-Consortium\n", "- dataset: FIO-ESM-2-0\n", " ensemble: r(1:3)i1p1f1\n", " grid: gn\n", " institute: FIO-QLNM\n", "- dataset: MPI-ESM-1-2-HAM\n", " ensemble: r(1:3)i1p1f1\n", " grid: gn\n", " institute: HAMMOZ-Consortium\n", "- dataset: 
INM-CM4-8\n", " ensemble: r1i1p1f1\n", " grid: gr1\n", " institute: INM\n", "- dataset: INM-CM5-0\n", " ensemble: r(1:10)i1p1f1\n", " grid: gr1\n", " institute: INM\n", "- dataset: IPSL-CM5A2-INCA\n", " ensemble: r1i1p1f1\n", " grid: gr\n", " institute: IPSL\n", "- dataset: IPSL-CM6A-LR-INCA\n", " ensemble: r1i1p1f1\n", " grid: gr\n", " institute: IPSL\n", "- dataset: IPSL-CM6A-LR\n", " ensemble: r(1:33)i1p1f1\n", " grid: gr\n", " institute: IPSL\n", "- dataset: KIOST-ESM\n", " ensemble: r1i1p1f1\n", " grid: gr1\n", " institute: KIOST\n", "- dataset: MIROC-ES2H\n", " ensemble: r1i1p(1:3)f2\n", " grid: gn\n", " institute: MIROC\n", "- dataset: MIROC-ES2H\n", " ensemble: r(1:3)i1p4f2\n", " grid: gn\n", " institute: MIROC\n", "- dataset: MIROC-ES2L\n", " ensemble: r1i1000p1f2\n", " grid: gn\n", " institute: MIROC\n", "- dataset: MIROC-ES2L\n", " ensemble: r(1:30)i1p1f2\n", " grid: gn\n", " institute: MIROC\n", "- dataset: MIROC6\n", " ensemble: r(1:50)i1p1f1\n", " grid: gn\n", " institute: MIROC\n", "- dataset: HadGEM3-GC31-LL\n", " ensemble: r(1:5)i1p1f3\n", " grid: gn\n", " institute: MOHC\n", "- dataset: HadGEM3-GC31-MM\n", " ensemble: r(1:4)i1p1f3\n", " grid: gn\n", " institute: MOHC\n", "- dataset: UKESM1-0-LL\n", " ensemble: r(1:4)i1p1f2\n", " grid: gn\n", " institute: MOHC\n", "- dataset: UKESM1-0-LL\n", " ensemble: r(5:7)i1p1f3\n", " grid: gn\n", " institute: MOHC\n", "- dataset: UKESM1-0-LL\n", " ensemble: r(8:12)i1p1f2\n", " grid: gn\n", " institute: MOHC\n", "- dataset: UKESM1-0-LL\n", " ensemble: r(16:19)i1p1f2\n", " grid: gn\n", " institute: MOHC\n", "- dataset: UKESM1-1-LL\n", " ensemble: r1i1p1f2\n", " grid: gn\n", " institute: MOHC\n", "- dataset: ICON-ESM-LR\n", " ensemble: r(1:5)i1p1f1\n", " grid: gn\n", " institute: MPI-M\n", "- dataset: MPI-ESM1-2-HR\n", " ensemble: r(1:10)i1p1f1\n", " grid: gn\n", " institute: MPI-M\n", "- dataset: MPI-ESM1-2-LR\n", " ensemble: r1i2000p1f1\n", " grid: gn\n", " institute: MPI-M\n", "- dataset: MPI-ESM1-2-LR\n", " 
ensemble: r(1:30)i1p1f1\n", " grid: gn\n", " institute: MPI-M\n", "- dataset: MRI-ESM2-0\n", " ensemble: r1i2p1f1\n", " grid: gn\n", " institute: MRI\n", "- dataset: MRI-ESM2-0\n", " ensemble: r1i1000p1f1\n", " grid: gn\n", " institute: MRI\n", "- dataset: MRI-ESM2-0\n", " ensemble: r(1:10)i1p1f1\n", " grid: gn\n", " institute: MRI\n", "- dataset: GISS-E2-1-G-CC\n", " ensemble: r1i1p1f1\n", " grid: gn\n", " institute: NASA-GISS\n", "- dataset: GISS-E2-1-G\n", " ensemble: r(1:4)i1p5f1\n", " grid: gn\n", " institute: NASA-GISS\n", "- dataset: GISS-E2-1-G\n", " ensemble: r(1:5)i1p1f3\n", " grid: gn\n", " institute: NASA-GISS\n", "- dataset: GISS-E2-1-G\n", " ensemble: r(1:10)i1p1f1\n", " grid: gn\n", " institute: NASA-GISS\n", "- dataset: GISS-E2-1-G\n", " ensemble: r(1:10)i1p3f1\n", " grid: gn\n", " institute: NASA-GISS\n", "- dataset: GISS-E2-1-G\n", " ensemble: r(1:11)i1p1f2\n", " grid: gn\n", " institute: NASA-GISS\n", "- dataset: GISS-E2-1-G\n", " ensemble: r(6:10)i1p5f1\n", " grid: gn\n", " institute: NASA-GISS\n", "- dataset: GISS-E2-1-G\n", " ensemble: r(101:102)i1p1f1\n", " grid: gn\n", " institute: NASA-GISS\n", "- dataset: GISS-E2-1-H\n", " ensemble: r(1:5)i1p1f2\n", " grid: gn\n", " institute: NASA-GISS\n", "- dataset: GISS-E2-1-H\n", " ensemble: r(1:5)i1p3f1\n", " grid: gn\n", " institute: NASA-GISS\n", "- dataset: GISS-E2-1-H\n", " ensemble: r(1:5)i1p5f1\n", " grid: gn\n", " institute: NASA-GISS\n", "- dataset: GISS-E2-1-H\n", " ensemble: r(1:10)i1p1f1\n", " grid: gn\n", " institute: NASA-GISS\n", "- dataset: GISS-E2-2-G\n", " ensemble: r(1:5)i1p3f1\n", " grid: gn\n", " institute: NASA-GISS\n", "- dataset: GISS-E2-2-G\n", " ensemble: r(1:6)i1p1f1\n", " grid: gn\n", " institute: NASA-GISS\n", "- dataset: GISS-E2-2-H\n", " ensemble: r(1:5)i1p1f1\n", " grid: gn\n", " institute: NASA-GISS\n", "- dataset: CESM2-FV2\n", " ensemble: r1i2p2f1\n", " grid: gn\n", " institute: NCAR\n", "- dataset: CESM2-FV2\n", " ensemble: r(1:3)i1p1f1\n", " grid: gn\n", " 
institute: NCAR\n", "- dataset: CESM2-WACCM-FV2\n", " ensemble: r(1:3)i1p1f1\n", " grid: gn\n", " institute: NCAR\n", "- dataset: CESM2-WACCM\n", " ensemble: r(1:3)i1p1f1\n", " grid: gn\n", " institute: NCAR\n", "- dataset: CESM2\n", " ensemble: r(1:11)i1p1f1\n", " grid: gn\n", " institute: NCAR\n", "- dataset: NorCPM1\n", " ensemble: r(1:30)i1p1f1\n", " grid: gn\n", " institute: NCC\n", "- dataset: NorESM2-LM\n", " ensemble: r(1:3)i1p1f1\n", " grid: gn\n", " institute: NCC\n", "- dataset: NorESM2-MM\n", " ensemble: r(1:3)i1p1f1\n", " grid: gn\n", " institute: NCC\n", "- dataset: KACE-1-0-G\n", " ensemble: r(1:3)i1p1f1\n", " grid: gr\n", " institute: NIMS-KMA\n", "- dataset: UKESM1-0-LL\n", " ensemble: r(13:15)i1p1f2\n", " grid: gn\n", " institute: NIMS-KMA\n", "- dataset: GFDL-CM4\n", " ensemble: r1i1p1f1\n", " grid: gr1\n", " institute: NOAA-GFDL\n", "- dataset: GFDL-ESM4\n", " ensemble: r(1:3)i1p1f1\n", " grid: gr1\n", " institute: NOAA-GFDL\n", "- dataset: NESM3\n", " ensemble: r(1:5)i1p1f1\n", " grid: gn\n", " institute: NUIST\n", "- dataset: SAM0-UNICON\n", " ensemble: r1i1p1f1\n", " grid: gn\n", " institute: SNU\n", "- dataset: CIESM\n", " ensemble: r(1:3)i1p1f1\n", " grid: gr\n", " institute: THU\n", "- dataset: MCM-UA-1-0\n", " ensemble: r1i1p1f(1:2)\n", " grid: gn\n", " institute: UA\n", "diagnostics:\n", " diagnostic_name:\n", " variables:\n", " tas:\n", " exp: historical\n", " mip: Amon\n", " project: CMIP6\n", "\n" ] } ], "source": [ "for dataset in datasets:\n", " dataset.facets['diagnostic'] = 'diagnostic_name'\n", "print(yaml.safe_dump(datasets_to_recipe(datasets)))" ] } ], "metadata": { "kernelspec": { "display_name": "esm", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.9" }, "vscode": { "interpreter": { "hash": 
"17e81e49408864327be43d3caebcb8eca32ff92a01becb15aa27be73c37f0517" } } }, "nbformat": 4, "nbformat_minor": 5 }