I am planning to model the habitat range of a seal in the Southern Ocean, but to do this I need to extract data for some model outputs for each location where a seal has been recorded.
Initially, I decided to assign a reference system (EPSG 4326) to the model outputs and save the results as a georeferenced image (tif file). The problem is that because the model grid is uneven (the size/area of grid cells changes with latitude), I get some large differences between model outputs and observations.
The image below shows this issue. The data is bathymetry (from GEBCO, already regridded to match ACCESS-OM2-01). The areas in grey are the land mask in the bathymetry data. The dots are all the observations for which I need to extract data, and the purple polygons are the locations of the continents (i.e., areas that the land mask should overlap).
I have a couple of questions:
- Has anyone been able to extract data for the grid cell closest to a coordinate pair (i.e., a seal location in my case) directly from the netCDF files? OR
- Does anyone know how I can save the uneven grid in the netCDF files as a georeferenced raster? Once I have this, I can extract the data I need really easily.
Any ideas are appreciated. I feel a little stuck with this.
Hi @lidefi87. I possibly haven’t properly grokked what you’re wanting to do here, but xoak might be helpful. E.g. see the example in the introduction.
Note that the sort of functionality that xoak provides is/will be available natively in xarray with the ongoing flexible indexes refactor.
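For reference, the core of what xoak provides here (nearest-neighbour lookup on 2D coordinate arrays) can be sketched directly with a scipy KD-tree over the flattened grid coordinates. This is only an illustrative sketch: the toy curvilinear grid and the observation positions below are made up, not real ACCESS-OM2-01 data.

```python
import numpy as np
from scipy.spatial import cKDTree

# Toy curvilinear grid: 2D lat/lon arrays (as in ACCESS-OM2 output)
j, i = np.meshgrid(np.arange(5), np.arange(5), indexing="ij")
grid_lat = -70.0 + j * 1.0 + i * 0.05   # lat varies slightly along each row
grid_lon = 0.0 + i * 1.0
values = (j * 5 + i).astype(float)       # one data value per grid cell

# Build a KD-tree on the flattened grid coordinates.
# (For real lon/lat you may want to convert to 3D Cartesian coordinates
# first, to avoid problems near the dateline and poles.)
tree = cKDTree(np.column_stack([grid_lat.ravel(), grid_lon.ravel()]))

# Hypothetical seal observation positions as (lat, lon) pairs
obs = np.array([[-68.1, 2.2], [-66.9, 3.8]])
_, idx = tree.query(obs)

# Value of the nearest grid cell for each observation
nearest_values = values.ravel()[idx]
print(nearest_values)
```

Masked land cells can simply be left out of the flattened arrays before building the tree, so every query returns an ocean cell.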
Alternatively, you could interpolate the data at the seal locations using something like xesmf?
Thanks @dougiesquire. I have not managed to test this option because xoak is not currently available in the conda environments on gadi, and I am having trouble installing it. I think I will have to stick to using da.sel() and providing the coordinates for each point.
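A minimal sketch of the da.sel() approach with vectorised point-wise indexers, which works when lat/lon are 1D coordinates. The grid, observation positions, and values here are all made up for illustration.

```python
import numpy as np
import xarray as xr

# Toy data array on a regular 1-degree lat/lon grid
# (stand-in for a model output field)
lat = np.arange(-70.0, -49.0, 1.0)
lon = np.arange(0.0, 21.0, 1.0)
data = xr.DataArray(
    np.arange(lat.size * lon.size, dtype=float).reshape(lat.size, lon.size),
    coords={"lat": lat, "lon": lon},
    dims=("lat", "lon"),
)

# Hypothetical seal observation positions, sharing a common "obs" dim
obs_lat = xr.DataArray([-65.3, -58.9, -52.1], dims="obs")
obs_lon = xr.DataArray([2.7, 10.2, 19.6], dims="obs")

# Vectorised point-wise selection: because the indexers share the "obs"
# dimension, this returns one value per observation (nearest grid cell),
# not a 2D lat x lon subset.
extracted = data.sel(lat=obs_lat, lon=obs_lon, method="nearest")
print(extracted.values)
```

This avoids looping over points one at a time, which matters once there are thousands of seal positions.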
For the interpolation solution, you can also use LinearNDInterpolator, which performs Delaunay-triangulation-based interpolation using Qhull. The inputs are given as vectors (lon/lat pairs plus the data values at those locations), so the original grid no longer matters (it does not have to be regular, and can even be unstructured). You can also drop the land points and keep only the ocean points, which solves any mask issues. The outputs are given at a list of lon/lat pairs as well, so it works well for observation locations.
from scipy.interpolate import LinearNDInterpolator
Note that because it is based on triangulation, extrapolation outside of the convex hull is not possible.
If you have observations that sit on the edge of a land cell and end up outside the convex hull of the ocean points, this will not work well.
When I need to extrapolate, I use the xesmf regridder with the same origin and destination grid, to “fill” the land points with ocean values.
I’m pretty sure you can do lookups for entire vectors of positions with xarray advanced interpolation.
See this stackoverflow answer for an example of advanced interpolation.
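A minimal form of that pattern: passing DataArray indexers that share a new dimension makes interp return one interpolated value per observation, rather than an outer-product grid. The grid and positions below are made up.

```python
import numpy as np
import xarray as xr

# Toy field on a regular 1-degree grid
da = xr.DataArray(
    np.arange(16.0).reshape(4, 4),
    coords={"lat": [-70.0, -69.0, -68.0, -67.0], "lon": [0.0, 1.0, 2.0, 3.0]},
    dims=("lat", "lon"),
)

# Hypothetical observation positions sharing a common "obs" dimension:
# this triggers pointwise ("advanced") interpolation.
obs_lat = xr.DataArray([-69.5, -67.2], dims="obs")
obs_lon = xr.DataArray([0.5, 2.8], dims="obs")

# Bilinear interpolation at each observation position
vals = da.interp(lat=obs_lat, lon=obs_lon)
print(vals.values)
```

Unlike `sel(..., method="nearest")`, this blends the surrounding grid cells, so values are smooth across cell boundaries.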
I thought this is what is used for the Lagrangian particle tracking work, but @hrsdawson would know better.
Good point @Aidan - no need to worry about the curvilinear grid down there
You should be able to interpolate to points with xesmf using the methods that it supports
Hi @Scott, will your notebook work with a little adapting to go the other way, i.e. from a .tif to an xarray object in lat/lon coords?
If the .tif is a list of lat, lon, value data then yeah, I guess you could do it in a similar way, though I’ve not tried it. You may want to turn on extrapolation to fill in the space between the input data points.