A reference for making dask work (faster)

Also see my post from just last week here, indicating the following resources:

A few other bits and pieces:

A key point here is that if chunks is not specified when opening a dataset with xarray, no chunking is done and the variables are not dask arrays (if open_mfdataset is used, each file becomes a single chunk).
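To make the chunks point concrete, here is a minimal sketch using xarray. The variable name temp, the dimension names, the file names and the chunk size of 25 are all made up for illustration, and it assumes a netCDF backend (netCDF4 or scipy) and dask are installed.

```python
import os
import tempfile

import numpy as np
import xarray as xr

# Small synthetic dataset; "temp", "time" and "x" are hypothetical names.
ds = xr.Dataset(
    {"temp": (("time", "x"), np.random.rand(100, 50))},
    coords={"time": np.arange(100), "x": np.arange(50)},
)

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "example.nc")
ds.to_netcdf(path)

# No `chunks` argument: the variable is not dask-backed, so .chunks is None.
with xr.open_dataset(path) as plain:
    plain_chunks = plain["temp"].chunks

# With `chunks`: the variable is a dask array split along "time".
with xr.open_dataset(path, chunks={"time": 25}) as chunked:
    dask_chunks = chunked["temp"].chunks

# open_mfdataset without `chunks`: each file ends up as one chunk.
part_paths = []
for i, start in enumerate(range(0, 100, 50)):
    part_path = os.path.join(tmpdir, f"part_{i}.nc")
    ds.isel(time=slice(start, start + 50)).to_netcdf(part_path)
    part_paths.append(part_path)

with xr.open_mfdataset(part_paths, combine="nested", concat_dim="time") as multi:
    mf_chunks = multi["temp"].chunks

print(plain_chunks, dask_chunks, mf_chunks)
```

If the per-file chunks from open_mfdataset are too big or too small for your workflow, passing chunks explicitly (as in the second call) is the usual way to override them.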

You can see the native chunking of variables in a netCDF file using ncdump -hs <filename>; the chunk sizes are listed in each variable's _ChunkSizes attribute.

I’ve found it useful and efficient to save intermediate results that are expensive to calculate to .zarr stores: loop over sections of the dataset (selected with .isel) and append each section to the store, as described here.
