I am posting here a question that I originally asked in slack.

In short, I need to calculate the distance from each grid cell to the sea ice edge (where sea ice concentration is < 10%). My map would look something like the figure below.

Does anyone have any suggestions on how to do this calculation efficiently? @navidcy suggested that I calculate the Haversine distance for every pixel. Any suggestions would be appreciated.

How are you doing this calculation? Are you using some sort of edge detection for the ice-edge points? Is it time consuming? Do you have to do it repeatedly for multiple time steps?

If you have a regular grid, you can probably use: scipy.ndimage.distance_transform_edt â€” SciPy v1.10.1 Manual
Otherwise, you may need to multiply this mask by the grid scale factors, but in that case it may be more complicated, and Iâ€™m not sure you will get the same results

@lidefi87, should we change the title to â€ścalculating pair-wise distances between points on the globeâ€ť or something that summarizes the heart of the matter and doesnâ€™t sound sea-ice specific since the actual subject we discuss is more generic than that

Hello everyone and thanks for your input. I decided to go with @edoddridgeâ€™s suggestion for the distance calculation.

So far, I have been able to calculate the sea ice edge, and I can also calculate the distance to the closest sea ice edge pixel. The problem is that I am stuck with parallelising the calculation of the distance to the sea ice edge for every pixel in my data frame.

I always get stuck when using the xarray.apply_ufunc function. I can never seem to get this right. So, I decided to use the np.apply_along_axis function, which is also failing. I tried mapping the function along axis 0 and 1, and neither work.

Ideally, I will only calculate the distance from pixels containing sea ice to the sea ice edge, but at this stage, I do not care if distances are calculated for the entire data array. I will apply a mask to keep only the values I need.

Also, since this exercise has been more complicated than I initially thought (it is way easier to do this in ArcGIS), I will contribute this as an example in the cookbook to save others time.

I wonder if maybe some kind of nearest-neighbour tree would help you do this more efficiently? If you apply the distance calculation + sorting to every pixel in your input, your complexity is something like O(nx \times ny \times nx \times log(nx)). A tree would shave a factor of nx out of that, which is a few orders of magnitude.

Unfortunately, I donâ€™t know of a way to efficiently use the GeoPy distance calculation in the construction of a tree, but you can use the haversine (great-circle) distance to construct a BallTree. Using ice_10 to define the sea ice edge as in your notebook:

from sklearn.neighbors import BallTree
ice_yt = ice_10.yt_ocean[ice_10.isel(time=0).argmax(dim="yt_ocean")]
ice_coords = np.vstack([ice_yt, ice_10.xt_ocean]).T
ball_tree = BallTree(np.radians(ice_coords), metric="haversine")
x, y = np.meshgrid(
np.radians(var_ice.xt_ocean.values),
np.radians(var_ice.yt_ocean.values)
)
distances, indices = ball_tree.query(np.vstack([y.flat, x.flat]).T, return_distance=True)

This took about 30s on ARE. It seems to follow the ice edge:

Thanks for your reply @angus-g. You are amazing! I have been stuck with this for a couple of days and I was getting no where with xarray.apply_ufunc. I tried it and it gives me what I need. I just had to convert the results to kilometers and it is exactly what I was after. Do you mind if I share this in a notebook in the COSIMA repo? I will add your name as a contributor in the notebook.