Finding CMIP6 data not stored locally without the "Clef" package

Hi folks

Since the conda environment moved from hh5 to xp65, “Clef”, the package we formerly used for selecting CMIP6 datasets, no longer works. Most data selection now goes through “intake”. With “intake”, however, we can't tell whether the dataset we selected is the latest version, or whether more datasets exist for analysis that have not been downloaded to Gadi. I came up with a solution (I named it Ztake :slight_smile: ) for finding this CMIP6 data, which I have included here for your reference.

This is very much draft code, but it currently works well for me for selecting CMIP6 data. I am only focusing on one member_id per model, so it may not suit other needs.

The package code is here (hope this link works):

First, open the CMIP6 catalogue with intake:

import intake

cmip6 = intake.open_esm_datastore("/g/data/dk92/catalog/v2/esm/cmip6-oi10/catalog.json")

Next, set the constraints and search the data:

constraints = dict(
    experiment_id="historical",
    variable_id="fgco2nat",
    member_id="r1i1p1f1",
    table_id="Omon"
)

# Build your Ztake (the Ztake class comes from the package code linked above)
zt = Ztake(
    cmip6_catalog=cmip6,
    constraints=constraints,
    prefer_members=("r1i1p1f1", "r1i1p1f2"),
    prefer_grids=("gn", "gr", "gr1"),
)
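For anyone wondering what prefer_members and prefer_grids are meant to do, here is a minimal sketch of one way such a preference order could work: keep one entry per model, taking the first member in the preference list and then the first grid. This is not the actual Ztake code. The function name pick_preferred and the sample rows are hypothetical, and I am assuming the catalogue rows carry the usual CMIP6 facets source_id, member_id and grid_label.

```python
def pick_preferred(rows, prefer_members, prefer_grids):
    """Keep one row per model, ranked by member preference, then grid preference.

    Hypothetical helper for illustration only; not part of Ztake or intake.
    """
    member_rank = {m: i for i, m in enumerate(prefer_members)}
    grid_rank = {g: i for i, g in enumerate(prefer_grids)}
    best = {}
    for row in rows:
        # Skip rows whose member or grid is not in the preference lists.
        if row["member_id"] not in member_rank or row["grid_label"] not in grid_rank:
            continue
        key = (member_rank[row["member_id"]], grid_rank[row["grid_label"]])
        model = row["source_id"]
        # A lower key means a more preferred (member, grid) combination.
        if model not in best or key < best[model][0]:
            best[model] = (key, row)
    return {model: row for model, (key, row) in best.items()}


# Illustrative rows only, not real catalogue output.
rows = [
    {"source_id": "CESM2", "member_id": "r1i1p1f1", "grid_label": "gr"},
    {"source_id": "CESM2", "member_id": "r1i1p1f1", "grid_label": "gn"},
    {"source_id": "UKESM1-0-LL", "member_id": "r1i1p1f2", "grid_label": "gn"},
    {"source_id": "CanESM5", "member_id": "r2i1p1f1", "grid_label": "gn"},
]
selected = pick_preferred(rows, ("r1i1p1f1", "r1i1p1f2"), ("gn", "gr", "gr1"))
for model, row in sorted(selected.items()):
    print(model, row["member_id"], row["grid_label"])
```

In this sketch, a model with no member in the preference list is simply dropped, which matches the "one member_id per model" focus mentioned above but may not be what you want for ensemble work.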

Finally, compare with the results from the ESGF node. This returns the link of the node that was compared against; if request_ids=True, it also returns a .txt file listing the datasets that are only available online.

dif_result = zt.compare_with_esgf(mode="latest", request_ids=True)
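To make the idea of the comparison concrete, here is a rough sketch of how a local list and an ESGF list of dataset IDs could be compared. It is not how compare_with_esgf is implemented. The function names are hypothetical, the IDs and version dates are made up for illustration, and I am assuming IDs in the usual CMIP6 form ending in a .vYYYYMMDD version.

```python
def split_version(dataset_id):
    """Split 'a.b.c.v20190308' into ('a.b.c', 20190308)."""
    stem, _, version = dataset_id.rpartition(".v")
    return stem, int(version)


def latest_versions(dataset_ids):
    """Map each dataset (without its version) to its newest version number."""
    latest = {}
    for dataset_id in dataset_ids:
        stem, version = split_version(dataset_id)
        latest[stem] = max(version, latest.get(stem, 0))
    return latest


def compare(local_ids, esgf_ids):
    """Return (missing, outdated): datasets absent locally, and datasets
    whose newest local version is older than the newest ESGF version."""
    local = latest_versions(local_ids)
    remote = latest_versions(esgf_ids)
    missing = sorted(s for s in remote if s not in local)
    outdated = sorted(s for s in remote if s in local and remote[s] > local[s])
    return missing, outdated


# Made-up IDs and version dates, for illustration only.
prefix = "CMIP6.CMIP"
suffix = "historical.r1i1p1f1.Omon.fgco2nat.gn"
local_ids = [
    f"{prefix}.NCAR.CESM2.{suffix}.v20190101",
    f"{prefix}.CCCma.CanESM5.{suffix}.v20190101",
]
esgf_ids = [
    f"{prefix}.NCAR.CESM2.{suffix}.v20190101",
    f"{prefix}.CCCma.CanESM5.{suffix}.v20200101",
    f"{prefix}.MIROC.MIROC-ES2L.{suffix}.v20190101",
]
missing, outdated = compare(local_ids, esgf_ids)
print("only online:", missing)
print("newer version online:", outdated)
```

Comparing on the ID without its version is what lets the "latest" check and the "only online" check come out of the same pass.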
