Intake vs mfdataset

Sorry for taking so long to get to this @JuliaN - I was sick last week.

I think this difference is due to the files for dates prior to 2000, which you are including when using Intake-ESM, but not when you open directly with xr.open_mfdataset. Are these files formatted differently that the later dates?

If I restrict the Intake-esm approach to files post 2000, I can run your analysis without issue:

esmds = intake.open_esm_datastore("~/catalog/panant.json", columns_with_iterables=["variable"])

shelf_mask = xr.open_dataset('/g/data/ik11/grids/Antarctic_slope_contour_1000m_MOM6_005deg.nc')['contour_masked_above']
area = esmds.search(variable='areacello', path=[".*output021.*"]).to_dask()['areacello']

def preprocess(ds):
    return ds.sel(yh=shelf_mask['yh'], z_l=slice(None,500))


ds = esmds.search(
    variable=["thetao", "volcello"],
    frequency="1mon",
    filename=".*ocean_month_z",
    start_date="^2.*" # 2000s+ - this will be easier/nicer when date-based searching is implemented
).to_dask(
    xarray_open_kwargs=dict(
        chunks="auto"
    ),
    preprocess=preprocess
)

temp = ds['thetao']
volc = ds['volcello']
thik = ds['volcello']/area

temp.isel(z_l=0, time=0).where(shelf_mask==0).plot()