Issue using xESMF to conservatively regrid CMIP6 data

I’m using the xESMF package to conservatively regrid CMIP6 data, but I’m running into a problem where new (and often wildly inaccurate) minimum and maximum values are being created in the regridded dataset (I thought that conservative regridding wasn’t supposed to do this). My example notebook is here: example_xesmf_interp_issue.ipynb.

Confusingly, the regridding seems to work OK for some variables, but for surface air temperature (tas), the new min and max values are absurd (see attached figure - e.g. models ACCESS-ESM1-5, CAS-ESM-0, CMCC-CM2-SR5, CMCC-ESM2, IITM-ESM, IPSL-CM6A-LR). The extreme values are located near 90S so, certainly for some models, I think the problem lies in not specifying the correct lat and lon bounds (see section on untangling corners here: https://pavics-sdi.readthedocs.io/en/latest/notebooks/regridding.html ), but I’ve played around with this and not had any improvement with the regridding. More often than not, my attempts to specify the lat/lon bounds just cause the xESMF regridding to fail.

Has anyone come across this before? Or can anyone spot something I’m missing? Quite puzzled at the moment. My current workaround is to use cdo to do the conservative regridding, but I’d really like to understand why this is happening.

Hey Hannah! I have come across this as well, but couldn’t manage to fix it. In my case it was for surface winds. I think it is quite an important issue to address because xesmf seems to be the most popular choice for CMIP6 regridding…

Interested in what other people have to say!

There’s a lot going on in your notebook - could you try simplifying it? A conservative regrid of ACCESS-ESM ssp585 tas to the JRA55 grid looks fine to me - regrid_cmip_to_jra.ipynb · GitHub

Sure, simplified notebook here: example_xesmf_interp_issue.ipynb .

I’ve run your script. Interestingly, I get a different result from yours (the incorrect min/max values) when I use the CMIP6 dataset that I downloaded with intake (see screenshot below) than when I use the input file path you provided. So perhaps something has gone wrong with the way I’ve used intake to collect the CMIP6 data on gadi, such as dropping important file attributes or bounds information? Is there another way to efficiently collect the CMIP6 data that I’m after without using intake?

After some testing, it looks like this happens when the lat_bnds and lon_bnds variables from the original file are missing. These are used to compute the area of each grid cell for the conservative algorithm.

If you wish to use conservative regridding, make sure these attributes are preserved when you load the file from intake. Cell boundaries only matter for conservative regridding, so you should be able to use bilinear etc. on the files you have without any problem.
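
To illustrate why the bounds matter, here is a minimal sketch (not taken from either notebook) of how cell edges translate into area weights. It assumes a regular lat/lon grid with CF-style bounds of shape (N, 2); the helper names `bounds_to_edges` and `lat_band_weights` are made up for this example. As far as I know, xESMF expects the N + 1 edge form (`lat_b`, `lon_b`) for conservative regridding, so the first helper shows that conversion.

```python
import math


def bounds_to_edges(bnds):
    """Collapse CF-style (N, 2) cell bounds into N + 1 contiguous edges.

    Assumes the cells are contiguous, i.e. each upper bound equals the
    next cell's lower bound.
    """
    return [bnds[0][0]] + [upper for _, upper in bnds]


def lat_band_weights(lat_edges):
    """Relative area of each latitude band on a sphere.

    The area between two latitudes is proportional to
    sin(north) - sin(south), so without edges there is no way to
    compute these weights.
    """
    sines = [math.sin(math.radians(edge)) for edge in lat_edges]
    return [abs(north - south) for south, north in zip(sines[:-1], sines[1:])]


# Illustrative bounds only, not from any CMIP6 model.
lat_bnds = [[-90.0, -60.0], [-60.0, 0.0], [0.0, 60.0], [60.0, 90.0]]
lat_edges = bounds_to_edges(lat_bnds)
weights = lat_band_weights(lat_edges)
```

The polar bands get much smaller weights than the equatorial ones, which is why missing or wrong bounds tend to show up as bad values near the poles.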

Thanks Scott. With intake, how do I make sure the attributes are preserved? Is this an argument I feed in?

It should do so by default, maybe inspect the data before you save it to make sure that the lat_bnds and lon_bnds are still present - regrid_cmip_to_jra(1).ipynb · GitHub
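
A quick way to do that inspection is sketched below. The helper name `missing_bounds` is hypothetical; it only relies on the `in` operator, which I believe an xarray Dataset supports for variable names, so a plain dict stands in for the dataset here.

```python
def missing_bounds(ds, required=("lat_bnds", "lon_bnds")):
    """Return the names in `required` that are absent from `ds`.

    `ds` can be any container that supports `name in ds`, such as a dict
    or (as an assumption here) an xarray Dataset.
    """
    return [name for name in required if name not in ds]


# Stand-in for a dataset loaded via intake; the values are placeholders.
ds = {"tas": None, "lat_bnds": None}
missing = missing_bounds(ds)
if missing:
    print("Bounds variables missing, conservative regridding may misbehave:", missing)
```

Running this check right after loading and again just before saving should show at which step the bounds get dropped.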

Thanks Scott! Retaining the bounds works well for most models. However, I’m still having a problem with some, e.g. AWI-CM-1-1-MR. The regridded dataset contains different minimum values even when the bounds are retained - example: regrid_cmip_to_jra_example.ipynb . Any ideas why in this case?

The two grids cover different areas: JRA starts at 90S, while AWI starts at 89.75S. Here’s a zoom into the grid boundaries at the lower left corner:

[image: zoom into the grid boundaries at the lower left corner]
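
As a rough sketch of the effect, the snippet below estimates how much of a destination cell touching 90S is actually covered by a source grid that only starts at 89.75S. The destination cell's northern edge (89S) is an assumed value for illustration, not the real JRA cell edge, and `covered_fraction` is a made-up helper. If the regridder normalizes by the full destination cell area, the regridded value in that cell would be scaled down by roughly this fraction, which would produce a new, lower minimum. I believe xESMF also offers a `conservative_normed` method that normalizes by the overlapping area instead, which may be worth trying.

```python
import math


def covered_fraction(dst_south, dst_north, src_south, src_north):
    """Area fraction of a destination latitude band overlapped by the source band."""

    def band_area(south, north):
        # Area between two latitudes is proportional to the difference of sines.
        return math.sin(math.radians(north)) - math.sin(math.radians(south))

    lo = max(dst_south, src_south)
    hi = min(dst_north, src_north)
    if hi <= lo:
        return 0.0
    return band_area(lo, hi) / band_area(dst_south, dst_north)


# Destination cell from 90S to an assumed 89S; source grid covers 89.75S northwards.
frac = covered_fraction(-90.0, -89.0, -89.75, 90.0)

# A uniform 250 K field (illustrative value) would come out scaled by `frac`
# if the weights are normalized by the whole destination cell area.
scaled_value = 250.0 * frac
```

With these assumed edges the covered fraction comes out a little under 1, so the southernmost row would read several percent too low even though the bounds are present.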