'NetCDF: Not a valid ID' errors

@dougiesquire I think I found it: “open_mfdataset parallel=True failing with netcdf4 >= 1.6.1” (Issue #7079 in pydata/xarray on GitHub). I missed it earlier because I didn’t look back far enough.

Nice find! To summarise in this thread: it looks like a workaround in netcdf4-python for netcdf-c not being thread-safe was removed in 1.6.1. The solution (for now) is to make sure your cluster uses only 1 thread per worker.
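
For anyone wanting to try the one-thread-per-worker setup, here is a minimal sketch. The helper names and the `n_cores` argument are made up for illustration, and the file pattern is whatever you would normally pass to `open_mfdataset`. It assumes a local cluster started through `dask.distributed.Client`; other cluster types will have their own way of setting threads per worker.

```python
import os


def single_threaded_cluster_kwargs(n_cores=None):
    """Return keyword arguments for a Dask cluster with one thread per worker.

    Parallelism then comes from several single-threaded worker processes,
    so no two threads in the same process call into netcdf-c at once.
    (Hypothetical helper, not part of dask or xarray.)
    """
    if n_cores is None:
        n_cores = os.cpu_count() or 1
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    return {"n_workers": n_cores, "threads_per_worker": 1, "processes": True}


def open_with_single_threaded_workers(pattern, n_cores=None):
    """Start a local cluster with 1 thread per worker, then open the files.

    Imports are done here so the helper above works without Dask installed.
    (Hypothetical helper; `pattern` is your own glob of NetCDF files.)
    """
    from dask.distributed import Client
    import xarray as xr

    client = Client(**single_threaded_cluster_kwargs(n_cores))
    dataset = xr.open_mfdataset(pattern, parallel=True)
    return client, dataset
```

If you already create your own `Client` or `LocalCluster`, the only change needed is `threads_per_worker=1`, usually with `n_workers` raised to keep the same number of cores busy.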

Yep. Given that it’s impractical to advise all hh5 users to update all of their dask cluster initialisations, I’ll pin netcdf4-python in our analysis3-unstable environment until the issue is resolved.
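
If you are unsure whether your own environment has the pinned version, a rough check along these lines may help. This is only a sketch: the function names are made up, and the `1.6.1` threshold is taken from the issue title above, so it says nothing about whether a later release has since fixed the problem.

```python
import re
from importlib import metadata


def parse_version(version):
    """Turn '1.6.1' or '1.6.1.post2' into a tuple of its leading integers."""
    numbers = re.match(r"\d+(?:\.\d+)*", version)
    if numbers is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in numbers.group().split("."))


def in_reported_range(version, first_affected="1.6.1"):
    """True if `version` is at or above the first release named in the issue.

    The default threshold is an assumption based on the issue title only.
    """
    return parse_version(version) >= parse_version(first_affected)


def installed_netcdf4_version():
    """Return the installed netCDF4 version string, or None if not installed."""
    try:
        return metadata.version("netCDF4")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_netcdf4_version()
    if found is None:
        print("netCDF4 is not installed in this environment")
    elif in_reported_range(found):
        print(f"netCDF4 {found} is in the range reported in the issue")
    else:
        print(f"netCDF4 {found} is older than the range reported in the issue")
```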

I’m just here to say WOO HOO! Nice teamwork @dale.roberts and @dougiesquire.

(Also, I marked @dougiesquire’s answer summarising @dale.roberts’ sleuthing as the solution so it shows up at the top of the topic. Hope that is OK.)