I have been trying to use `intake` to do some analysis with @KZCurtin using the 1/20th degree PanAntarctic model. I don’t normally use `intake` myself (I find that `xr.open_mfdataset` with a preprocessing function works smoothly 99.9% of the time), but `intake` seems like a cool tool with potential.
Anyway, we were trying to compute a temperature average over the top 500 m on the Antarctic shelf. Using `intake` (with and without preprocessing, and with the kwargs suggested in the cosima-recipes), the kernel crashes. Doing the same thing with `mfdataset` and preprocessing works just fine.
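For anyone who has not used the preprocessing approach, here is a minimal sketch of the idea, not the code from our notebook. The variable and dimension names (`thetao`, `z_l`, `yh`, `xh`), the helper `keep_top_500m`, and the synthetic dataset are all assumptions for illustration, and the depth mean is unweighted (a real calculation would weight by layer thickness and mask to the shelf).

```python
import numpy as np
import xarray as xr


def keep_top_500m(ds, var="thetao", depth_dim="z_l", max_depth=500.0):
    """Preprocess hook: keep one variable and only levels down to max_depth.

    Names and defaults are hypothetical; adjust them to the model output.
    """
    return ds[[var]].sel({depth_dim: slice(0.0, max_depth)})


# Tiny synthetic stand-in for a single output file (made-up values).
shape = (2, 4, 3, 3)
demo = xr.Dataset(
    {
        "thetao": (("time", "z_l", "yh", "xh"), np.ones(shape)),
        "so": (("time", "z_l", "yh", "xh"), np.zeros(shape)),
    },
    coords={
        "time": [0, 1],
        "z_l": [5.0, 100.0, 450.0, 900.0],
        "yh": [-75.0, -70.0, -65.0],
        "xh": [0.0, 1.0, 2.0],
    },
)

# Trim first, then average over the remaining depth levels (unweighted).
trimmed = keep_top_500m(demo)
top500_mean = trimmed["thetao"].mean("z_l")

# With real files, the same hook is applied to each file before combining:
# ds = xr.open_mfdataset(paths, preprocess=keep_top_500m,
#                        combine="by_coords", parallel=True)
```

The point of the hook is that each file is cut down to one variable and a few depth levels before anything is combined, so the task graph that dask has to handle stays small.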
I am not looking for a specific solution; I am personally very happy to continue using `mfdataset`. But I did spend some time yesterday trying to make `intake` work. I am no dask wizard, and perhaps this could all be solved with some smart chunking magic, but I don’t think it is realistic to expect every `intake` user to be a dask expert, especially since I understand `intake` is aiming to provide a high-level way of opening datasets.
I have made a notebook showing this issue (Intake_vs_mfdatarray.ipynb · GitHub). I ran it with 28 cores on the normalbw queue.