I’m trying to load the variable aice_m from the 01deg_jra55v140_iaf_cycle4 experiment using intake, but intake is returning two separate datasets for this one diagnostic. This is the code I’m using to read the data:
dictionary = catalog['01deg_jra55v140_iaf_cycle4'].search(variable='aice_m').to_dataset_dict(
    xarray_open_kwargs={"chunks": -1, "decode_coords": False},
    xarray_combine_by_coords_kwargs={
        "compat": "override",
        "data_vars": "minimal",
        "coords": "minimal",
    },
)
And this is the output:
{'seaIce.1mon.d2:2.nc:5.ni:3600.nj:2700.nkice:4.mean': <xarray.Dataset> Size: 2GB
Dimensions: (time: 59, nj: 2700, ni: 3600)
Coordinates:
* time (time) datetime64[ns] 472B 1958-02-01 1958-03-01 ... 2017-01-01
Dimensions without coordinates: nj, ni
Data variables:
aice_m (time, nj, ni) float32 2GB dask.array<chunksize=(1, 2700, 3600), meta=np.ndarray>
Attributes: (12/14)
title: sea ice model output for CICE
contents: Diagnostic and Prognostic Variables
source: Los Alamos Sea Ice Model (CICE) Version 5
time_period_freq: month_1
comment3: seconds elapsed into model date: 0
conventions: CF-1.0
... ...
intake_esm_attrs:file_id: seaIce.1mon.d2:2.nc:5.ni:3600.nj:2700.n...
intake_esm_attrs:frequency: 1mon
intake_esm_attrs:realm: seaIce
intake_esm_attrs:temporal_label: mean
intake_esm_attrs:_data_format_: netcdf
intake_esm_dataset_key: seaIce.1mon.d2:2.nc:5.ni:3600.nj:2700.n...,
'seaIce.1mon.d2:2.nc:5.ni:3600.nj:2700.mean': <xarray.Dataset> Size: 26GB
Dimensions: (time: 673, nj: 2700, ni: 3600)
Coordinates:
* time (time) datetime64[ns] 5kB 1960-01-01 1960-02-01 ... 2019-01-01
Dimensions without coordinates: nj, ni
Data variables:
aice_m (time, nj, ni) float32 26GB dask.array<chunksize=(1, 2700, 3600), meta=np.ndarray>
Attributes: (12/14)
title: sea ice model output for CICE
contents: Diagnostic and Prognostic Variables
source: Los Alamos Sea Ice Model (CICE) Version 5
time_period_freq: month_1
comment3: seconds elapsed into model date: 0
conventions: CF-1.0
... ...
intake_esm_attrs:file_id: seaIce.1mon.d2:2.nc:5.ni:3600.nj:2700
intake_esm_attrs:frequency: 1mon
intake_esm_attrs:realm: seaIce
intake_esm_attrs:temporal_label: mean
intake_esm_attrs:_data_format_: netcdf
intake_esm_dataset_key: seaIce.1mon.d2:2.nc:5.ni:3600.nj:2700.mean}
The first dataset in this dictionary contains aice_m data from 1958 to 1959 and 2014 to 2017, while the second dataset contains all other time steps. @aekiss and I think this is because, during the 1958-1959 and 2014-2017 years of the simulation, certain sea ice diagnostics were saved that use a grid coordinate called nkice. Even though aice_m doesn’t use this coordinate, intake doesn’t seem to recognise this, and instead groups the files into separate datasets. Is there a way to make intake recognise this as one single dataset, @CharlesTurner?
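
As a stopgap, the two datasets could presumably be stitched back together with xarray by concatenating them along time and then sorting, since the first dataset's time steps fall on either side of the second's. This is only a sketch of that idea, not a fix for the grouping itself: `merge_split_datasets` and `fake_dataset` are hypothetical helper names, and the tiny synthetic datasets stand in for the real `dictionary` returned by `to_dataset_dict`.

```python
import numpy as np
import pandas as pd
import xarray as xr


def merge_split_datasets(dataset_dict, variable, dim="time"):
    """Concatenate every dataset that holds `variable` along `dim`, then sort by `dim`.

    Hypothetical helper: attributes are taken from the first dataset
    (xarray's default for concat), so per-file intake attributes are not merged.
    """
    pieces = [ds[[variable]] for ds in dataset_dict.values() if variable in ds]
    combined = xr.concat(pieces, dim=dim)
    return combined.sortby(dim)


def fake_dataset(times):
    """Build a tiny synthetic dataset shaped like the real one (time, nj, ni)."""
    data = np.zeros((len(times), 2, 3), dtype="float32")
    return xr.Dataset(
        {"aice_m": (("time", "nj", "ni"), data)},
        coords={"time": times},
    )


# Synthetic stand-ins for the two entries intake returned (made-up keys and sizes).
dictionary = {
    "with_nkice": fake_dataset(pd.to_datetime(["1958-02-01", "1958-03-01", "2017-01-01"])),
    "without_nkice": fake_dataset(pd.to_datetime(["1960-01-01", "1960-02-01"])),
}

aice_m = merge_split_datasets(dictionary, "aice_m")["aice_m"]
```

With the real data, the same call would take the `dictionary` from the code above in place of the synthetic one. The concatenation should stay lazy for dask-backed arrays, but I have not checked how it performs on the full 0.1° output.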