Regrid from an irregular grid by summing

I have an array of Australia’s population density, pop. The data have spatial point dimensions (x and y) in units of metres (GDA94 / Australian Albers - EPSG:3577). Using pyproj, I transformed the projection to a regular lat/lon grid and added these as new coordinates to my array. The two projections look like this:
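For readers who want to reproduce the coordinate step, here is a minimal sketch of converting EPSG:3577 x/y cell centres to lon/lat with pyproj. The grid extent and spacing below are made up for illustration and are not the real ABS grid; only the two EPSG codes come from the post.

```python
import numpy as np
from pyproj import Transformer

# Hypothetical x/y cell centres in metres (EPSG:3577); the real grid will differ.
x = np.linspace(-1_000_000, 1_000_000, 5)
y = np.linspace(-2_000_000, -4_000_000, 4)
xx, yy = np.meshgrid(x, y)  # both have shape (len(y), len(x))

# always_xy=True fixes the axis order to (x, y) in and (lon, lat) out.
transformer = Transformer.from_crs("EPSG:3577", "EPSG:4326", always_xy=True)
lon, lat = transformer.transform(xx, yy)

# lon and lat are 2D arrays: each x/y cell gets its own lon/lat pair,
# so the spacing in degrees varies from row to row.
print(lon.shape, lat.shape)
```

With xarray, arrays like these could then be attached using something like `pop.assign_coords(lon=(("y", "x"), lon), lat=(("y", "x"), lat))`, assuming the dimensions are named `y` and `x`.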

import xarray as xr
import matplotlib.pyplot as plt

# Population density on the EPSG:3577 x/y grid, with lat/lon added as coordinates
pop = xr.open_dataset("/g/data/w42/dr6273/work/data/ABS/")["population_density"]

# Left: native x/y grid; right: the same data drawn against the lat/lon coordinates
fig, axes = plt.subplots(1, 2, figsize=(7, 3))
pop.plot.imshow(ax=axes[0], add_colorbar=False, vmax=5)
pop.plot.pcolormesh(x="lon", y="lat", ax=axes[1], add_colorbar=False, vmax=5)
axes[0].set_title("Point grid")
axes[1].set_title("Regular lat/lon")


I want to regrid this to the coarser ERA5 grid. Ideally, this would be by summing all the higher resolution cells that lie within each ERA5 grid cell. It doesn’t seem like xesmf offers this option.

If my array had regular lat/lon dimensions I think I could solve this (using groupby_bins or something). But as lat/lon are irregularly spaced coordinates, it's not clear how to approach this. By irregular, I mean the difference in lat/lon between consecutive x/y cells is not uniform. This figure shows the difference in lon values for a given y:

If I didn’t care about summing, I could probably use xesmf with conservative regridding, but that raises another problem in that ERA5/NCI does not seem to supply the lat/lon bounds, and I’m unsure how I could manually add my own bounds to the irregular lat/lon in pop.
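On the missing bounds: for a regular 1D axis such as ERA5's, cell edges can be estimated from the cell centres. The sketch below is a hand-rolled helper (`guess_bounds_1d` is a hypothetical name, not a library function) and assumes each edge sits halfway between neighbouring centres. The 0.25° longitude range is illustrative only.

```python
import numpy as np

def guess_bounds_1d(centres):
    """Estimate n+1 cell edges from n cell centres (hypothetical helper).

    Interior edges are midpoints between neighbouring centres; the two
    outer edges are extrapolated by half a cell width.
    """
    centres = np.asarray(centres, dtype=float)
    mid = 0.5 * (centres[:-1] + centres[1:])
    first = centres[0] - (mid[0] - centres[0])
    last = centres[-1] + (centres[-1] - mid[-1])
    return np.concatenate([[first], mid, [last]])

# Illustrative 0.25-degree longitude centres over Australia (assumed values).
lon_centres = np.arange(112.0, 155.0, 0.25)
lon_b = guess_bounds_1d(lon_centres)
print(lon_b[:3], lon_b[-1])
```

xesmf's conservative method looks for corner coordinates named `lon_b` and `lat_b`. For the irregular lat/lon in pop, one option (untested here) would be to build the cell corners in x/y space, where the grid is regular, and transform those corners to lon/lat with pyproj.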

Does anyone have any suggestions? I’m a bit stuck with it!

Can you bin in x,y space and then convert the grid coordinates with pyproj?

Thanks for the suggestion @Scott. I’m not sure about this - to obtain the desired output grid in x,y space would presumably require converting the ERA5 lat/lon grid to x,y space, which would be irregular, and the problem would be the same. I need the population per lat/lon cell…

In a different topic @dale.roberts provided a Jupyter notebook that utilised an Iris function guess_bounds that might be useful in your case:

According to the ESMF docs:

  • First-order conservative: Preserves the integral of the source field across the regridding. For this method, weight calculation is based on the ratio of source cell area overlapped with the corresponding destination cell area. If the user areas option (see below) is not used, then the areas used in this calculation are those calculated by ESMF and thus the ones for which the conservation holds. The user areas option allows the user to adjust the interpolation weights so that conservation is based on user-supplied areas.

Which sounds like it should sum when regridding. Easy enough to check.
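A quick toy check that needs no ESMF at all: the sketch below assumes that first-order conservative regridding onto exactly nested, equal-area cells reduces to an area-weighted mean of the overlapped source cells, and compares that with a plain block sum. The 4x4 field is made up.

```python
import numpy as np

# Made-up fine field on a 4x4 grid; each coarse cell covers a 2x2 block.
fine = np.arange(16, dtype=float).reshape(4, 4)
fine_area, coarse_area = 1.0, 4.0

# Axes after reshape: (coarse_y, y_within_block, coarse_x, x_within_block)
blocks = fine.reshape(2, 2, 2, 2)
block_mean = blocks.mean(axis=(1, 3))  # what an area-weighted mean gives here
block_sum = blocks.sum(axis=(1, 3))    # what "sum the cells inside" gives

# The integral (value x area) is preserved by the mean...
print((fine * fine_area).sum(), (block_mean * coarse_area).sum())
# ...but the cell values themselves are averages, not totals.
print(block_mean[0, 0], block_sum[0, 0])
```

Under that assumption, conservation of the integral does not mean the destination cell holds the sum of the source values: the two differ by the ratio of the cell areas.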

Thanks a lot @Aidan. I tried using iris.guess_bounds with conservative remapping, but that method does not sum. However, your comment prompted me to read about the other regridding methods properly, and it seems as though the following method will do what I want:

  • Nearest destination to source: Each source point is mapped to the closest destination point. A destination point can be mapped to multiple source points, in which case the destination is the sum of the source values. Some destination points may not be mapped.

In short, the solution was to use xesmf with the method nearest_d2s.
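To illustrate what nearest destination to source does, here is a small NumPy emulation. It is not xesmf or ESMF code. It assumes a rectilinear destination grid, measures "nearest" in plain degree space rather than true distance, and treats NaN as zero. The function name and all values are hypothetical.

```python
import numpy as np

def nearest_d2s_sum(src_lat, src_lon, src_val, dst_lat, dst_lon):
    """Sum each source value into its nearest destination cell (emulation).

    src_* are arrays of any matching shape; dst_lat and dst_lon are the 1D
    centres of a rectilinear destination grid.
    """
    # Nearest destination index along each axis, per source point.
    i = np.abs(src_lat.ravel()[:, None] - dst_lat[None, :]).argmin(axis=1)
    j = np.abs(src_lon.ravel()[:, None] - dst_lon[None, :]).argmin(axis=1)
    out = np.zeros((dst_lat.size, dst_lon.size))
    # np.add.at accumulates correctly when several sources hit one cell.
    np.add.at(out, (i, j), np.nan_to_num(src_val.ravel()))
    return out

# Made-up 2x2 source grid with irregular lat/lon, and a 2x2 destination grid.
src_lat = np.array([[-30.1, -30.2], [-30.9, -31.0]])
src_lon = np.array([[140.1, 140.2], [140.9, 141.1]])
src_val = np.array([[1.0, 2.0], [3.0, 4.0]])
dst_lat = np.array([-30.0, -31.0])
dst_lon = np.array([140.0, 141.0])

coarse = nearest_d2s_sum(src_lat, src_lon, src_val, dst_lat, dst_lon)
print(coarse)
```

The total is preserved because every source value is added to exactly one destination cell. Destination cells with no source points stay at zero in this sketch, whereas in ESMF they may be left unmapped. In xesmf itself the method is selected by passing "nearest_d2s" as the method when building the regridder.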

In case others come across this, I’ve put my code on my GitHub, including the steps on how to convert the projection: