Calculating pairwise distances between points

Hello everyone,

I am posting here a question that I originally asked in slack.

In short, I need to calculate the distance from each grid cell to the sea ice edge (where sea ice concentration is < 10%). My map would look something like the figure below.
image

Does anyone have any suggestions on how to do this calculation efficiently? @navidcy suggested that I calculate the Haversine distance for every pixel. Any suggestions would be appreciated.

I’d also suggest you use a different colormap :wink:

I’d second @edoddridge’s suggestion to use the geopy.distance function.

How are you doing this calculation? Are you using some sort of edge detection for the ice-edge points? Is it time consuming? Do you have to do it repeatedly for multiple time steps?

from @Josue Martinez Moreno

If you have a regular grid, you can probably use:
scipy.ndimage.distance_transform_edt — SciPy v1.11.4 Manual
Otherwise, you may need to multiply this mask by the grid scale factors, but in that case it may be more complicated, and I’m not sure you will get the same results

An example of that is here:
Image transforms — Introduction to Bioimage Analysis

@lidefi87, should we change the title to “calculating pair-wise distances between points on the globe” or something that summarizes the heart of the matter and doesn’t sound sea-ice specific since the actual subject we discuss is more generic than that :wink:

1 Like

I did it :wink:

So geopy.distance should have worked (did it?). But were you able to parallelize this via xarray @lidefi87?

1 Like

Hello everyone and thanks for your input. I decided to go with @edoddridge’s suggestion for the distance calculation.

So far, I have been able to calculate the sea ice edge, and I can also calculate the distance to the closest sea ice edge pixel. The problem is that I am stuck with parallelising the calculation of the distance to the sea ice edge for every pixel in my data frame.

I always get stuck when using the xarray.apply_ufunc function. I can never seem to get this right. So, I decided to use the np.apply_along_axis function, which is also failing. I tried mapping the function along axis 0 and 1, and neither work.

I have a reproducible notebook here: Crabeaters_data_cleaning/IceEdge.ipynb at main · lidefi87/Crabeaters_data_cleaning · GitHub. If anyone has any ideas on how to apply the calculation of distance to nearest ice edge pixel, it would be amazing.

Ideally, I will only calculate the distance from pixels containing sea ice to the sea ice edge, but at this stage, I do not care if distances are calculated for the entire data array. I will apply a mask to keep only the values I need.

Also, since this exercise has been more complicated than I initially thought (it is way easier to do this in ArcGIS), I will contribute this as an example in the cookbook to save others time.

Thanks again for your help!

Perhaps someone can help with xarray.apply_ufunc? @angus-g?

I wonder if maybe some kind of nearest-neighbour tree would help you do this more efficiently? If you apply the distance calculation + sorting to every pixel in your input, your complexity is something like O(nx \times ny \times nx \times log(nx)). A tree would shave a factor of nx out of that, which is a few orders of magnitude.

Unfortunately, I don’t know of a way to efficiently use the GeoPy distance calculation in the construction of a tree, but you can use the haversine (great-circle) distance to construct a BallTree. Using ice_10 to define the sea ice edge as in your notebook:

from sklearn.neighbors import BallTree

ice_yt = ice_10.yt_ocean[ice_10.isel(time=0).argmax(dim="yt_ocean")]
ice_coords = np.vstack([ice_yt, ice_10.xt_ocean]).T

ball_tree = BallTree(np.radians(ice_coords), metric="haversine")

x, y = np.meshgrid(
    np.radians(var_ice.xt_ocean.values),
    np.radians(var_ice.yt_ocean.values)
)

distances, indices = ball_tree.query(np.vstack([y.flat, x.flat]).T, return_distance=True)

This took about 30s on ARE. It seems to follow the ice edge:
image

3 Likes

Thanks for your reply @angus-g. You are amazing! I have been stuck with this for a couple of days and I was getting no where with xarray.apply_ufunc. I tried it and it gives me what I need. I just had to convert the results to kilometers and it is exactly what I was after. Do you mind if I share this in a notebook in the COSIMA repo? I will add your name as a contributor in the notebook.

Thanks again! :grinning:

1 Like

Glad it worked for you!

Not at all, go for it :slight_smile:

1 Like

Thanks again everyone! I just submitted a pull request with the new notebook :blush:

3 Likes

for reference; Adding notebook calculating distance to nearest neighbour by lidefi87 · Pull Request #258 · COSIMA/cosima-recipes · GitHub

1 Like