Why is ACCESS Upside Down?

Specifically, the ACCESS CMIP6 SSP585 data in fs38 has the plev inverted.
Is it supposed to be or is it an error?

I thought it was worth mentioning because it you are looking at future changes in a vertical integral (e.g. total column moisture), you’ll end up with a negative integral and strong drying trend in the ACCESS models but not the other models (and then spend half the day trying to figure out why).


It looks like an error, the levels should be the same as the other simulations. I’m actually surprised this went through CMOR as it is so picky it wouldn’t have allowed to write the levels in that order. Could you post which files are you looking at? I checked a few of the monthly hus ssp585 ACCESS-CM2 files and the plev looks fine in them.

Hmm strange, I’m looking at Amon SSP585.
These are the file paths from the intake catalogue:
[‘/g/data/fs38/publications/CMIP6/ScenarioMIP/CSIRO-ARCCSS/ACCESS-CM2/ssp585/r1i1p1f1/Amon/hus/gn/v20210317/hus_Amon_ACCESS-CM2_ssp585_r1i1p1f1_gn_201501-210012.nc’,
‘/g/data/fs38/publications/CMIP6/ScenarioMIP/CSIRO-ARCCSS/ACCESS-CM2/ssp585/r1i1p1f1/Amon/hus/gn/v20210317/hus_Amon_ACCESS-CM2_ssp585_r1i1p1f1_gn_210101-220012.nc’,
‘/g/data/fs38/publications/CMIP6/ScenarioMIP/CSIRO-ARCCSS/ACCESS-CM2/ssp585/r1i1p1f1/Amon/hus/gn/v20210317/hus_Amon_ACCESS-CM2_ssp585_r1i1p1f1_gn_220101-230012.nc’]

And

[‘/g/data/fs38/publications/CMIP6/ScenarioMIP/CSIRO/ACCESS-ESM1-5/ssp585/r1i1p1f1/Amon/hus/gn/v20210318/hus_Amon_ACCESS-ESM1-5_ssp585_r1i1p1f1_gn_201501-210012.nc’,
‘/g/data/fs38/publications/CMIP6/ScenarioMIP/CSIRO/ACCESS-ESM1-5/ssp585/r1i1p1f1/Amon/hus/gn/v20210318/hus_Amon_ACCESS-ESM1-5_ssp585_r1i1p1f1_gn_210101-220012.nc’,
‘/g/data/fs38/publications/CMIP6/ScenarioMIP/CSIRO/ACCESS-ESM1-5/ssp585/r1i1p1f1/Amon/hus/gn/v20210318/hus_Amon_ACCESS-ESM1-5_ssp585_r1i1p1f1_gn_220101-230012.nc’]

Hi Kym,

again, I didn’t check all of these but just for one of them:

ncdump -v plev /g/data/fs38/publications/CMIP6/ScenarioMIP/CSIRO-ARCCSS/ACCESS-CM2/ssp585/r1i1p1f1/Amon/hus/gn/v20210317/hus_Amon_ACCESS-CM2_ssp585_r1i1p1f1_gn_201501-210012.nc

… plev = 100000.00000001, 92500.00000001, 85000.00000001, 70000.00000001,
60000.00000001, 50000.00000001, 40000.00000001, 30000.00000001,
25000.00000001, 20000.00000001, 15000.00000001, 10000.00000001,
7000.00000001, 5000.00000001, 3000.00000001, 2000.00000001,
1000.00000001, 500.00000001, 100.00000001 ;

Hmm okay, I got the same as you with ncdump but I’m still getting it upside down and with too many levels in Jupyter:
Here is my code (which didn’t produce this error for other models or CMIP scenarios):

subset_ssp585 = cmip6.search(experiment_id='ssp585',source_id=["ACCESS-CM2","ACCESS-ESM1-5"],member_id=["r1i1p1f1","r1i1p1f2"],table_id='Amon', variable_id='hus')

model='ACCESS-ESM1-5'
hus_file_paths = subset_ssp585.df.path[
    (subset_ssp585.df.source_id == model) & 
    (subset_ssp585.df.variable_id == 'hus')
].dropna().tolist()

def _preprocess_hus(ds):
  return ds['hus'].sel(lat=slice(-50,0), lon=slice(90,180), time=slice("2070-01-01","2099-12-31"))
hus=xr.open_mfdataset(hus_file_paths,combine='nested',concat_dim='time',parallel=True,preprocess=_preprocess_hus)

Which results in…
Screenshot 2024-07-23 at 3.17.10 PM

Hey @Kim_Reid,

Can you provide the file path for one of the models that does what you expect? I’m wondering if there is meta-data that xarray is using to interpret things differently between models.

Hi Aiden,

Yes, SSP245 is normal: g/data/fs38/publications/CMIP6/ScenarioMIP/CSIRO/ACCESS-ESM1-5/ssp245/r1i1p1f1/Amon/hus/gn/v20191115/hus_Amon_ACCESS-ESM1-5_ssp245_r1i1p1f1_gn_201501-210012.nc

1 Like

For the extra levels, I wonder if it’s a question of precision in the plev coordinate. When you do a ncdump on the files, the values for plev sometimes come as:

 plev = 100000, 92500, 85000, 70000, 60000, 50000, 40000, 30000, 25000, 
    20000, 15000, 10000, 7000, 5000, 3000, 2000, 1000, 500, 100 ;

and sometimes as:

 plev = 100000.00000001, 92500.00000001, 85000.00000001, 70000.00000001, 
    60000.00000001, 50000.00000001, 40000.00000001, 30000.00000001, 
    25000.00000001, 20000.00000001, 15000.00000001, 10000.00000001, 
    7000.00000001, 5000.00000001, 3000.00000001, 2000.00000001, 
    1000.00000001, 500.00000001, 100.00000001 ;

Which xarray is likely to consider different, hence the extra levels when concatenating the data from several files.
@Aidan , did you have a solution for that? I can’t recall.

I’d noticed that different precision. The ESM and CM2 examples both have the “spurious precision” and yet yield different results for @Kim_Reid .

Interestingly if the data is loaded directly the axes are consistent in ordering

In [2]: esm = xr.open_dataset('/g/data/fs38/publications/CMIP6/ScenarioMIP/CSIRO/ACCESS-ESM1-5/ssp245/r1i1p1f1/Amon/hus/gn/v20191115/hus_Amon_ACCESS-ESM1-5_ssp245_r1i1p1f1_gn_201501-210012.nc')
                            
In [3]: cm2 = xr.open_dataset('/g/data/fs38/publications/CMIP6/ScenarioMIP/CSIRO-ARCCSS/ACCESS-CM2/ssp585/r1i1p1f1/Amon/hus/gn/v20210317/hus_Amon_ACCESS-CM2_ssp585_r1i1p1f1_gn_201501-210012.nc')
In [5]: esm.plev
Out[5]:                   
<xarray.DataArray 'plev' (plev: 19)> Size: 152B                   
array([100000.,  92500.,  85000.,  70000.,  60000.,  50000.,  40000.,  30000.,
        25000.,  20000.,  15000.,  10000.,   7000.,   5000.,   3000.,   2000.,  
         1000.,    500.,    100.])                                             
Coordinates:                                                               
  * plev     (plev) float64 152B 1e+05 9.25e+04 8.5e+04 ... 1e+03 500.0 100.0  
Attributes:                         
    bounds:         plev_bnds
    units:          Pa                             
    axis:           Z                       
    positive:       down                  
    long_name:      pressure              
    standard_name:  air_pressure                     
                   
In [6]: cm2.plev                           
Out[6]:                                
<xarray.DataArray 'plev' (plev: 19)> Size: 152B
array([100000.,  92500.,  85000.,  70000.,  60000.,  50000.,  40000.,  30000.,
        25000.,  20000.,  15000.,  10000.,   7000.,   5000.,   3000.,   2000.,
         1000.,    500.,    100.])              
Coordinates:                   
  * plev     (plev) float64 152B 1e+05 9.25e+04 8.5e+04 ... 1e+03 500.0 100.0
Attributes:                         
    bounds:         plev_bnds        
    units:          Pa           
    axis:           Z                                                        
    positive:       down                                                        
    long_name:      pressure
    standard_name:  air_pressure

In [23]: esm.plev.values[0]
Out[23]: 100000.00000001001

In [24]: cm2.plev.values[0]
Out[24]: 100000.00000001001

So I think the “spurious precision” isn’t involved in the observed difference.

The fact that the ordering of the plev axis is “correct” (unchanged) when these individual data files is loaded directly by open_dataset suggests to me that there is something happening in the open_mfdataset call.

Perhaps one of the files is reversed and open_mfdataset is doing something to make sure they’re all consistent and in doing so flips the axis ordering.

@Kim_Reid you could try looping through all the files in the example that is flipped and check if there are any files with different ordering of the plev axis.

Since the faulty cases come up with too many pressure levels in addition of swapping the direction, it might still be good to correct it. Would you just recast the plev coordinate to integer?

Ah I hadn’t noticed that! Right, so that makes sense. The files have inconsistent axes so mf_dataset is smooshing them together and then reordering because that makes sense to do so when combining two different axes.

I’d be tracking down which files are the culprits, but as for solutions, yeah some sort of truncation/rounding in the preprocess step would work I should think.

I checked the files individually and none of them seem odd individually when I open them with open_dataset. But when I open them together with mf_dataset then the plev axis gets flipped and expanded. I’ve just been selecting every 2nd level as a workaround.

There are minor differences

fa = netCDF4.Dataset('/g/data/fs38/publications/CMIP6/ScenarioMIP/CSIRO/ACCESS-ESM1-5/ssp585/r1i1p1f1/Amon/hus/gn/v20210318/hus_Amon_ACCESS-ESM1-5_ssp585_r1i1p1f1_gn_201501-210012.nc')
fb = netCDF4.Dataset('/g/data/fs38/publications/CMIP6/ScenarioMIP/CSIRO/ACCESS-ESM1-5/ssp585/r1i1p1f1/Amon/hus/gn/v20210318/hus_Amon_ACCESS-ESM1-5_ssp585_r1i1p1f1_gn_210101-220012.nc')

fa.variables['plev'][:] - fb.variables['plev'][:] 
masked_array(data=[1.00117177e-08, 1.00117177e-08, 1.00117177e-08,
                   1.00117177e-08, 1.00044417e-08, 9.99716576e-09,
                   9.99716576e-09, 9.99716576e-09, 9.99716576e-09,
                   9.99716576e-09, 9.99898475e-09, 1.00008037e-08,
                   9.99989425e-09, 9.99989425e-09, 9.99989425e-09,
                   9.99989425e-09, 1.00000079e-08, 1.00000079e-08,
                   1.00000079e-08],
             mask=False,
       fill_value=1e+20)
1 Like

Nice find @Scott

This is what I was trying to explain @Kim_Reid. If there are slight differences in the plev axis between files when xarray tries to concatenate the data it will create a new plev axis which contains all the values from the the plev variables are joining together. Because xarray is creating a new axis it also sorts it, which is a sensible thing to do in most cases, but that is why it is then “upside down”.

This is why @clairecarouge suggested adding a step to the preprocess function to remove spurious precision in the plev variable so they are always consistent.