@dougiesquire I think I found it: "open_mfdataset parallel=True failing with netcdf4 >= 1.6.1" (Issue #7079, pydata/xarray on GitHub). I didn’t find it earlier because I didn’t look back far enough.
Nice find! To summarise in this thread, it looks like a workaround in netcdf4-python to deal with netcdf-c not being thread-safe was removed in 1.6.1. The solution (for now) is to make sure your cluster only uses 1 thread per worker.
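For anyone who wants to try the workaround, here is a minimal sketch of a local dask cluster with one thread per worker. The helper names (`single_thread_cluster_kwargs`, `open_with_single_threaded_workers`), the default of 4 workers, and the `paths` argument are made up for illustration; only `LocalCluster`, `Client` and `xr.open_mfdataset(..., parallel=True)` are real API. Adjust the worker count to your own allocation.

```python
def single_thread_cluster_kwargs(n_workers):
    """Keyword arguments for dask.distributed.LocalCluster with 1 thread per worker.

    Hypothetical helper: it only builds the arguments, so it can be reused
    wherever a cluster is initialised.
    """
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    return {
        "n_workers": n_workers,     # parallelism comes from processes...
        "threads_per_worker": 1,    # ...not from threads inside a worker
        "processes": True,
    }


def open_with_single_threaded_workers(paths, n_workers=4):
    """Start a single-threaded-worker cluster, then open the files in parallel."""
    # Imported here so the helper above can be used without dask installed.
    import xarray as xr
    from dask.distributed import Client, LocalCluster

    cluster = LocalCluster(**single_thread_cluster_kwargs(n_workers))
    client = Client(cluster)
    ds = xr.open_mfdataset(paths, parallel=True)
    return client, ds
```

Because each worker is a separate process with a single thread, no two threads in the same process call into netcdf-c at once, which is the situation the removed workaround used to guard against.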
Yep, given that it’s impractical to advise all hh5 users to update all of their dask cluster initialisations, I’ll pin netcdf4-python in our analysis3-unstable environment until the issue is resolved.
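If you run your own environment rather than analysis3-unstable, a rough way to check whether you are on an affected release is sketched below. It assumes, based only on the issue title, that every netcdf4-python version from 1.6.1 onwards is affected; a later release may fix this, so treat the threshold as an assumption. The function names are hypothetical; `netCDF4.__version__` is the real attribute.

```python
import re

# Assumption from the issue title: versions >= 1.6.1 are affected.
FIRST_AFFECTED = (1, 6, 1)


def parse_version(version):
    """Return the leading major.minor[.patch] of a version string as ints."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def needs_single_threaded_workers(version):
    """True if this netcdf4-python version is at or above the first affected one."""
    return parse_version(version) >= FIRST_AFFECTED


def installed_netcdf4_version():
    """Version string of the installed netcdf4-python package."""
    import netCDF4  # imported lazily so the checks above work without it

    return netCDF4.__version__
```

Calling `needs_single_threaded_workers(installed_netcdf4_version())` tells you whether to use 1 thread per worker or to pin to an earlier release.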
I’m just here to say WOO HOO! Nice teamwork @dale.roberts and @dougiesquire.
(Also I marked @dougiesquire’s answer summarising @dale.roberts’ sleuthing as the solution so it shows up at the top of the topic, hope that is ok)