Description of request: rechunker.rechunk failing in current analysis3 environment
Environment: conda/analysis3-26.03
Running in ARE and dask PBSClusters
What executed:
The part of my code where I use rechunker fails in the conda/analysis3-26.03 environment.
Specifically, this environment contains rechunker v0.5.2, but the current version of xarray requires rechunker >= v0.5.4 so that the zarr_format argument can be passed. Specifying zarr v2 or v3 is now required, but the older rechunker library does not support this argument.
import dask
import xarray as xr
import zarr
from rechunker import rechunk

# Step 2: Set up target_options with encoding
array_plan = rechunk(
    sorted_data,
    target_chunks,
    max_mem,
    target_store_path,
    target_options={
        'overwrite': True,
        'encoding': encoding,  # Pass the encoding to preserve data types and other options
    },
    temp_store=tmp_store_path,
)
Actual results:
The zarr store that is created is empty, and the final task of the dask job shows the error TypeError("extract_zarr_variable_encoding() missing 1 required keyword-only argument: 'zarr_format'")
Expected results:
Zarr is created containing data
Additional info:
Note: it’s a problem specifically with the new environment(s). I was able to roll back to conda/analysis3-25.08 and run my code successfully. I realise that in the future the code may have to be refactored to not use rechunker; I’m just reporting the error.
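For anyone wanting to check an environment before running the rechunk step, a minimal sketch along these lines prints the installed versions of the relevant packages. The helper names (`installed_versions`, `release_tuple`) are made up for illustration, and the 0.5.4 threshold is simply the one quoted in this report, not something verified independently.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def release_tuple(version_string):
    """Turn '0.5.2' into (0, 5, 2); stops at the first non-numeric part."""
    parts = []
    for piece in version_string.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


# Assumption: threshold taken from the report above (rechunker >= 0.5.4).
MIN_RECHUNKER = (0, 5, 4)

versions = installed_versions(["rechunker", "xarray", "zarr", "dask"])
for name, found in versions.items():
    print(f"{name}: {found or 'not installed'}")

rechunker_version = versions["rechunker"]
if rechunker_version is not None and release_tuple(rechunker_version) < MIN_RECHUNKER:
    print("rechunker is older than 0.5.4; rechunk() may fail as reported")
```

Running this in both conda/analysis3-25.08 and conda/analysis3-26.03 should make the difference between the two environments easy to compare.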
Yeah, I think that’s right - in due course rechunker won’t be required. At the moment we are still using it though, so maybe it should be removed from the current env??
And relatedly, can we please not delete at least one of the older envs until the workflows can be seamlessly ported to modern dask - dask is notoriously flaky for these tasks.
Yeah, that sounds sensible. I’ll have a look and figure out when we migrated to zarr 3. If I remember rightly, it was kerchunk related and happened in September - looks like that roughly jibes with what you’ve found?
I think parcels is also currently pinned to zarr 2, so we should be keeping an old zarr 2 environment around for some time yet. Heads up @rbeucher on this front.
Seems like there’s a bit of a fracture in the zarr ecosystem right now, annoyingly, with the v2 => v3 transition.
I’ll come back and crosslink the relevant threads when I get the chance.
Bit of a mess at the moment unfortunately! It also appears I was the trigger-happy fool that updated analysis3 to zarr v3, so the blame is unfortunately squarely with me.