Find and download files from ESGF

This module provides the function esmvalcore.esgf.find_files() for searching for files on ESGF using the ESMValTool vocabulary. It returns esmvalcore.esgf.ESGFFile objects, which have a convenient esmvalcore.esgf.ESGFFile.download() method for downloading the files.

See ESGF configuration for instructions on configuring this module.

esmvalcore.esgf

esmvalcore.esgf.find_files(*, project, short_name, dataset, **facets)[source]

Search for files on ESGF.

Parameters
  • project (str) – Choose from CMIP3, CMIP5, CMIP6, CORDEX, or obs4MIPs.

  • short_name (str) – The name of the variable.

  • dataset (str) – The name of the dataset.

  • **facets – Any other search facets. Values can be strings or lists of strings. The special facets ‘start_year’ and ‘end_year’ take values of type int and restrict the results to files containing data for those years.

Examples

Examples of how to use the search function for all supported projects.

Search for a CMIP3 dataset:

>>> find_files(
...     project='CMIP3',
...     frequency='mon',
...     short_name='tas',
...     dataset='cccma_cgcm3_1',
...     exp='historical',
...     ensemble='run1',
... )  
[ESGFFile:cmip3/CCCma/cccma_cgcm3_1/historical/mon/atmos/run1/tas/v1/tas_a1_20c3m_1_cgcm3.1_t47_1850_2000.nc]

Search for a CMIP5 dataset:

>>> find_files(
...     project='CMIP5',
...     mip='Amon',
...     short_name='tas',
...     dataset='inmcm4',
...     exp='historical',
...     ensemble='r1i1p1',
... )  
[ESGFFile:cmip5/output1/INM/inmcm4/historical/mon/atmos/Amon/r1i1p1/v20130207/tas_Amon_inmcm4_historical_r1i1p1_185001-200512.nc]

Search for a CMIP6 dataset:

>>> find_files(
...     project='CMIP6',
...     mip='Amon',
...     short_name='tas',
...     dataset='CanESM5',
...     exp='historical',
...     ensemble='r1i1p1f1',
... )  
[ESGFFile:CMIP6/CMIP/CCCma/CanESM5/historical/r1i1p1f1/Amon/tas/gn/v20190429/tas_Amon_CanESM5_historical_r1i1p1f1_gn_185001-201412.nc]

Search for a CORDEX dataset and limit the search results to files containing data for the years 1990-2000:

>>> find_files(
...     project='CORDEX',
...     frequency='mon',
...     dataset='COSMO-crCLIM-v1-1',
...     short_name='tas',
...     exp='historical',
...     ensemble='r1i1p1',
...     domain='EUR-11',
...     driver='MPI-M-MPI-ESM-LR',
...     start_year=1990,
...     end_year=2000,
... )  
[ESGFFile:cordex/output/EUR-11/CLMcom-ETH/MPI-M-MPI-ESM-LR/historical/r1i1p1/COSMO-crCLIM-v1-1/v1/mon/tas/v20191219/tas_EUR-11_MPI-M-MPI-ESM-LR_historical_r1i1p1_CLMcom-ETH-COSMO-crCLIM-v1-1_v1_mon_198101-199012.nc,
ESGFFile:cordex/output/EUR-11/CLMcom-ETH/MPI-M-MPI-ESM-LR/historical/r1i1p1/COSMO-crCLIM-v1-1/v1/mon/tas/v20191219/tas_EUR-11_MPI-M-MPI-ESM-LR_historical_r1i1p1_CLMcom-ETH-COSMO-crCLIM-v1-1_v1_mon_199101-200012.nc]

Search for an obs4MIPs dataset:

>>> find_files(
...     project='obs4MIPs',
...     frequency='mon',
...     dataset='CERES-EBAF',
...     short_name='rsutcs',
... )  
[ESGFFile:obs4MIPs/NASA-LaRC/CERES-EBAF/atmos/mon/v20160610/rsutcs_CERES-EBAF_L3B_Ed2-8_200003-201404.nc]

Returns

A list of files that have been found.

Return type

list of ESGFFile
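In the CORDEX example above, a request for 1990-2000 returns a file covering 198101-199012, which suggests that a file is kept whenever its time range overlaps the requested years. The sketch below illustrates that overlap test on file names. It is not the library's implementation: the helpers years_in_filename and overlaps_years are hypothetical, and the assumption that the date range can be parsed from a trailing `_YYYYMM-YYYYMM.nc` suffix is ours.

```python
import re


def years_in_filename(filename):
    """Return (first_year, last_year) parsed from a trailing date range.

    Hypothetical helper: assumes names ending in '_<start>-<end>.nc',
    where <start> and <end> begin with a four-digit year.
    """
    match = re.search(r'_(\d{4})\d*-(\d{4})\d*\.nc$', filename)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def overlaps_years(filename, start_year, end_year):
    """Return True if the file's years overlap [start_year, end_year]."""
    years = years_in_filename(filename)
    if years is None:
        # Assumption: keep files whose time range cannot be determined.
        return True
    first_year, last_year = years
    return first_year <= end_year and last_year >= start_year
```

With this test, both CORDEX files listed above overlap 1990-2000, while the CMIP6 file ending in 185001-201412 would not overlap a request for 2015-2020.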

esmvalcore.esgf.download(files, dest_folder, n_jobs=4)[source]

Download multiple ESGFFiles in parallel.

Parameters
  • files (list of ESGFFile) – The files to download.

  • dest_folder (Path) – The destination folder.

  • n_jobs (int) – The number of files to download in parallel.

Raises

DownloadError – Raised if one or more files failed to download.
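A minimal sketch of combining the search and download functions, using only the signatures and attributes documented on this page. The wrapper name fetch_canesm5_tas is hypothetical, the facets are taken from the CMIP6 example above, and the import is done inside the function so that the sketch can be loaded without esmvalcore installed. Running it requires network access and a working ESGF configuration.

```python
from pathlib import Path


def fetch_canesm5_tas(dest_folder):
    """Search ESGF for one CMIP6 variable and download the files found.

    Hypothetical wrapper; returns the local paths of the downloaded files.
    """
    # Lazy import: only needed when the function is actually called.
    from esmvalcore.esgf import download, find_files

    files = find_files(
        project='CMIP6',
        mip='Amon',
        short_name='tas',
        dataset='CanESM5',
        exp='historical',
        ensemble='r1i1p1f1',
    )
    # ESGFFile.size is documented as the file size in bytes.
    total_bytes = sum(file.size for file in files)
    print(f"Found {len(files)} file(s), {total_bytes} bytes in total")

    # A DownloadError raised by download() is left to propagate to the caller.
    download(files, Path(dest_folder), n_jobs=2)
    return [file.local_file(Path(dest_folder)) for file in files]
```

Lowering n_jobs below the default of 4 is only an illustration of the parameter, not a recommendation.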

class esmvalcore.esgf.ESGFFile(results)[source]

Bases: object

File on the ESGF.

This is the object returned by the function esmvalcore.esgf.find_files().

urls

The URLs where the file can be downloaded.

Type

list of str

dataset

The name of the dataset that the file is part of.

Type

str

name

The name of the file.

Type

str

size

The size of the file in bytes.

Type

int

Methods:

download(dest_folder)

Download the file.

local_file(dest_folder)

Return the path to the local file after download.

download(dest_folder)[source]

Download the file.

Parameters

dest_folder (Path) – The destination folder.

Raises

DownloadError – Raised if downloading the file failed.

Returns

The path where the file will be located after download.

Return type

Path

local_file(dest_folder)[source]

Return the path to the local file after download.

Parameters

dest_folder (Path) – The destination folder.

Returns

The path where the file will be located after download.

Return type

Path
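Because local_file() reports where a file will be stored, it can be used to check which files are still missing before starting a download. The sketch below assumes only the documented local_file(dest_folder) method; the helper name files_not_yet_downloaded is hypothetical, and checking for existence alone does not verify that an earlier download completed.

```python
from pathlib import Path


def files_not_yet_downloaded(files, dest_folder):
    """Return the files that have no local copy in dest_folder yet.

    Hypothetical helper; `files` is any iterable of objects providing
    a local_file(dest_folder) method, such as ESGFFile.
    """
    dest_folder = Path(dest_folder)
    return [
        file for file in files
        if not Path(file.local_file(dest_folder)).exists()
    ]
```

The resulting list could then be passed on for downloading, so that files already present are skipped.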

esmvalcore.esgf.facets

Module containing mappings from our names to ESGF names.

Data:

DATASET_MAP

Cache for the mapping between recipe/filesystem and ESGF dataset names.

FACETS

Mapping between the recipe and ESGF facet names.

Functions:

create_dataset_map()

Create the DATASET_MAP from recipe datasets to ESGF dataset names.

esmvalcore.esgf.facets.DATASET_MAP = {'CMIP3': {}, 'CMIP5': {'ACCESS1-0': 'ACCESS1.0', 'ACCESS1-3': 'ACCESS1.3', 'CESM1-BGC': 'CESM1(BGC)', 'CESM1-CAM5': 'CESM1(CAM5)', 'CESM1-CAM5-1-FV2': 'CESM1(CAM5.1,FV2)', 'CESM1-FASTCHEM': 'CESM1(FASTCHEM)', 'CESM1-WACCM': 'CESM1(WACCM)', 'CSIRO-Mk3-6-0': 'CSIRO-Mk3.6.0', 'GFDL-CM2p1': 'GFDL-CM2.1', 'MRI-AGCM3-2H': 'MRI-AGCM3.2H', 'MRI-AGCM3-2S': 'MRI-AGCM3.2S', 'bcc-csm1-1': 'BCC-CSM1.1', 'bcc-csm1-1-m': 'BCC-CSM1.1(m)', 'fio-esm': 'FIO-ESM', 'inmcm4': 'INM-CM4'}, 'CMIP6': {}, 'CORDEX': {}, 'obs4MIPs': {}}

Cache for the mapping between recipe/filesystem and ESGF dataset names.
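As an illustration of how such a mapping can be used, the sketch below looks up the ESGF name for a recipe dataset name and falls back to the recipe name when no entry exists. The dictionary is a small excerpt copied from the CMIP5 entry shown above; the helper esgf_dataset_name and the fallback behaviour are assumptions, not the module's own code.

```python
# Excerpt of the CMIP5 entry of DATASET_MAP as shown above.
CMIP5_DATASET_MAP = {
    'ACCESS1-0': 'ACCESS1.0',
    'bcc-csm1-1': 'BCC-CSM1.1',
    'inmcm4': 'INM-CM4',
}


def esgf_dataset_name(dataset, dataset_map):
    """Translate a recipe dataset name to its ESGF name.

    Hypothetical helper. Assumption: names without an entry in the
    mapping are identical on ESGF and are returned unchanged.
    """
    return dataset_map.get(dataset, dataset)


print(esgf_dataset_name('inmcm4', CMIP5_DATASET_MAP))
print(esgf_dataset_name('CanESM5', {}))
```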

esmvalcore.esgf.facets.FACETS = {'CMIP3': {'dataset': 'model', 'ensemble': 'ensemble', 'exp': 'experiment', 'frequency': 'time_frequency', 'short_name': 'variable'}, 'CMIP5': {'dataset': 'model', 'ensemble': 'ensemble', 'exp': 'experiment', 'mip': 'cmor_table', 'product': 'product', 'short_name': 'variable'}, 'CMIP6': {'dataset': 'source_id', 'ensemble': 'variant_label', 'exp': 'experiment_id', 'grid': 'grid_label', 'mip': 'table_id', 'short_name': 'variable'}, 'CORDEX': {'dataset': 'rcm_name', 'domain': 'domain', 'driver': 'driving_model', 'ensemble': 'ensemble', 'exp': 'experiment', 'frequency': 'time_frequency', 'short_name': 'variable'}, 'obs4MIPs': {'dataset': 'source_id', 'frequency': 'time_frequency', 'short_name': 'variable'}}

Mapping between the recipe and ESGF facet names.
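The following sketch shows how a per-project entry of this mapping could be applied to rename recipe facets to ESGF facet names. It copies the CMIP6 entry from the value shown above; the helper to_esgf_facets is hypothetical, and silently dropping facets that have no entry in the mapping is an assumption made for brevity.

```python
# CMIP6 entry of FACETS as shown above.
CMIP6_FACETS = {
    'dataset': 'source_id',
    'ensemble': 'variant_label',
    'exp': 'experiment_id',
    'grid': 'grid_label',
    'mip': 'table_id',
    'short_name': 'variable',
}


def to_esgf_facets(facets, facet_map):
    """Rename recipe facet names to ESGF facet names.

    Hypothetical helper. Assumption: facets without an entry in
    facet_map are dropped.
    """
    return {
        facet_map[name]: value
        for name, value in facets.items()
        if name in facet_map
    }


recipe_facets = {
    'dataset': 'CanESM5',
    'mip': 'Amon',
    'short_name': 'tas',
    'exp': 'historical',
    'ensemble': 'r1i1p1f1',
}
print(to_esgf_facets(recipe_facets, CMIP6_FACETS))
```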

esmvalcore.esgf.facets.create_dataset_map()[source]

Create the DATASET_MAP from recipe datasets to ESGF dataset names.

Run python -m esmvalcore.esgf.facets to print an up-to-date map.