Vegetation ancillary creation

Hi,

We are trying to run an experiment in the N48 slab-ocean suite (job id: valse) with a modified land-ocean distribution (we plan to repeat this experiment later in the O-A coupled N48 suite).
I have already altered a few other ancillary files but am unable to change the vegetation files using xancil.

The ACCESS N48 suite reads two vegetation ancillary files: …veg.func_igbp and veg.func_seas. However, I couldn’t find an option to generate them in xancil, which expects a different set of input variables from those in the existing vegetation functional type files. My altered vegetation files are available on NCI:

/scratch/public/sza565/n48.veg.func_seas.shiftedAusNZ.nc

/scratch/public/sza565/n48.veg.func_igbp.shiftedAusNZ.nc

Any advice on this would be greatly appreciated.

Thanks,
Abhik


Using mule to modify the data in an existing ancillary file (or even create a new one from scratch) is easier than using xancil (which I’ve never much liked).

Davide is having a look at this.

Hi Martin,

We have converted the …func_igbp file into an ancillary file using mule. Thanks to Dale from cws_help for helping me out.

However, we did not succeed in converting the …func_seas file, as its variables use two different time dimensions. iris/mule can only handle a single time dimension in the output, and it raises a runtime error when attempting to save multiple cubes:

RuntimeError: Currently, only support writing cubes with identical coordinates. time is not identical in all cubes.

Regards,
Abhik

The func_seas file has a t dimension of size 60 which is the cross product of the true time dimension of length 12 and a PFT pseudo-dimension of length 5. It should work if you have a file with separate dimensions.
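
A minimal sketch of what separating the combined axis could look like, using NumPy on synthetic data. It assumes the PFT index varies fastest along t (so t = month * 5 + pft); that ordering is an assumption and should be checked against the real file. The grid size is borrowed from the N48 grid and the values are made up.

```python
import numpy as np

N_MONTHS, N_PFT = 12, 5   # 12 x 5 = 60 combined "t" steps
NLAT, NLON = 73, 96       # N48 grid size, used here only for illustration

# Synthetic stand-in for a field stored on a combined t axis of length 60.
# Every 2-D slice is filled with its own t index so the layout is easy to follow.
t_index = np.arange(N_MONTHS * N_PFT, dtype=float)
combined = t_index[:, None, None] * np.ones((NLAT, NLON))

# Assumption: PFT varies fastest, i.e. t = month * N_PFT + pft.
separated = combined.reshape(N_MONTHS, N_PFT, NLAT, NLON)

# Seasonal cycle of the PFT with index 1 at one grid point: t = 1, 6, 11, ...
pft1_cycle = separated[:, 1, 0, 0]
print(separated.shape)
print(pft1_cycle)
```

If the months varied fastest instead, the reshape would be `combined.reshape(N_PFT, N_MONTHS, NLAT, NLON)`.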

It would be great if you could include the commands you used to do this, in case someone else needs to do something similar in the future.

I have a potential solution to this. The approach with ANTS didn’t work as it did not set the pseudo dimension in the ancil file correctly, so the resulting fields were interpreted as time steps instead of levels. In the help ticket, @abhik suggested that the values in the canopy conductance field be repeated across the longer dimension of the other two fields.
I’ve put the code to do this on the coecms GitHub: coecms/convert_vegetation_ancil, a Python script to convert netCDF-format vegetation forcing to UM ancil files.
I’m sure there is an easier way to do a lot of this, but this is where I’m at.
If the t dimension is somewhat mislabelled, as @MartinDix has suggested, it won’t take much extra work to modify that code to turn it into 12 time values x 5 pseudo levels.

The ‘leaf area index’ and ‘canopy height of plant’ fields have 60 timesteps and they have unique values at every time step. Therefore, they can’t be repeated across the time dimension.
I have yet to check Dale’s code, but the above description doesn’t sound right.
The timesteps of the vegetation file can be quickly checked using the code below.

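import xarray as xr

# Plot the difference between two time steps of field1392;
# a non-zero map means the two steps hold different values
diri = '/scratch/public/sza565/'
f = xr.open_dataset(diri + 'n48.veg.func_seas.shiftedAusNZ.nc')
x1 = f.field1392.isel(t=1)
x2 = f.field1392.isel(t=13)
xdiff = x1 - x2
xdiff.plot()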

Regards,
Abhik

Hi @abhik, I’d trust @MartinDix’s analysis here. I did as you suggested with xarray, except I went a little further and produced some animations. An animation displaying all 60 timesteps sequentially looks quite disjointed, whereas starting at timestep 2 and taking every 5th step looks more like what you’d expect leaf area index to look like, with quite a lot of coverage in the northern hemisphere around the middle of the year (i.e. summer). Unfortunately I can’t upload the animations here, so I’ve put them on my Google Drive.
Here is the ‘all time steps’ animation: field1392_allts.mp4 - Google Drive
and the ‘every 5 time steps’ animation: field1392_every5ts.mp4 - Google Drive

Hi Abhik,

I had a look at the n48.veg.func_seas.shiftedAusNZ.nc vegetation file.

It looks like the non-null values in the conductance field (field1384) are all identical and equal to 0.01, so I am not really sure what the different times are needed for here.

In this case, I think you could simply extend the conductance field to have the same time dimension as the other two fields by repeating its values over the time axis.

I wrote a simple script using xarray to do this:

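import os

import numpy as np
import xarray as xr

file = "/g/data/tm70/dm5220/scripts/abhik/n48.veg.func_seas.shiftedAusNZ.nc"

d = xr.open_dataset(file)
var = "field1384"  # canopy conductance, stored on the shorter time axis t_1
cond = d[var]

# Target dimension order: same as the original, with t_1 replaced by t
dims = list(cond.dims)
dims[0] = "t"

# Number of repeats needed to go from len(t_1) to len(t)
factor = len(d.t) // len(d.t_1)

# Repeat each value along the time axis, then relabel that axis as t
newcond = xr.apply_ufunc(
    lambda x: np.repeat(x, factor, axis=-1),
    cond,
    exclude_dims={"t_1"},
    input_core_dims=[["t_1"]],
    output_core_dims=[["t_1"]],
).assign_coords(t_1=d.t.values).rename({"t_1": "t"}).transpose(*dims)

# Swap in the extended variable and drop the now-unused t_1 coordinate
# (drop_vars replaces the deprecated Dataset.drop)
newd = d.drop_vars([var, "t_1"]).assign({var: newcond})
newfile = f"{os.path.splitext(file)[0]}_modified.nc"
newd.to_netcdf(newfile)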

You should not have problems converting the output file to UM ancillary using mule, similarly to what you already did with the other vegetation file.
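
One detail worth checking in the script above is the order in which values are repeated. `np.repeat` duplicates each monthly value `factor` times in a row, which lines up with the other fields only if the PFT index varies fastest along t; if the months varied fastest, `np.tile` would be the matching choice. A small sketch with made-up numbers (since all non-null conductance values in this file are 0.01, the distinction makes no practical difference here):

```python
import numpy as np

monthly = np.array([10, 20, 30])   # three made-up "monthly" values
factor = 2                         # stand-in for the 5 functional types

# Each value repeated consecutively: matches a layout where PFT varies fastest
pft_fastest = np.repeat(monthly, factor)

# Whole sequence repeated: matches a layout where the month varies fastest
month_fastest = np.tile(monthly, factor)

print(pft_fastest)
print(month_fastest)
```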

Davide

Hi Davide,

Thanks, we followed the approach you suggested above, i.e., repeating the 12 canopy conductance values 5 times to match with 60 timesteps of the other two variables in the file. However, the model doesn’t accept the extended data length and the recon job failed with the following error:

ERROR Reading field no 0

FIELD NO. 0 CANOPY CONDUCTANCE AFTER TIMESTEP

VALID AT: 0000Z 15/12/0004 DAY 0 DATA TIME: 2359Z 30/12/0004 DAY 0

LBTIM LBFT LBLREC LBCODE LBHEM LBROW LBNPT LBEXT LBPACK

0 0 7008 1 0 73 96 0 0

LBREL LBFC LBCFC LBPROC LBVC LBRVC LBEXP LBBEGIN LBNREC

3 1384 -99 0 129 -99 -99 1807360 7168

LBPROJ LBTYP LBLEV LBRSVD LBRSVD LBRSVD LBRSVD LBSRCE

-99 -99 8888 -99 -99 -99 -99 -99

DATA_TYPE NADDR LBUSER ITEM_CODE LBPLEV LBUSER MODEL_CODE

1 -99 0 213 0 0 1

BULEV BHULEV BRSVD(3) BRSVD(4) BDATUM

0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00

BACC BLEV BRLEV BHLEV BHRLEV

0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00

BPLAT BPLON BGOR BZY BDY

9.0000E+01 0.0000E+00 0.0000E+00 -9.2500E+01 2.5000E+00

BZX BDX BMDI BMKS

-3.7500E+00 3.7500E+00 -1.0737E+09 0.0000E+00

FATAL ERROR WHEN READING/WRITING MODEL DUMP

buffer in of real data

Error code = 0.00

Length requested = 7008

Length actually transferred = 0

Fatal error codes are as follows:

-1.0 Mismatch between actual and requested data length

0.0 End-of-file was read

1.0 Error occurred during read

2.0 Other disk malfunction

3.0 File does not exist
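
The numbers in the header dump above are at least self-consistent, which can be checked with a little arithmetic. The sketch below only uses values printed in the log; reading 7168 as the data length padded up to a multiple of 512 words is an assumption about the file layout, not something the log states.

```python
# Values copied from the header dump above.
lbrow, lbnpt = 73, 96   # rows and points per row
lblrec = 7008           # LBLREC: length of the field data, in words
lbnrec = 7168           # LBNREC: length of the record on disk, in words

points = lbrow * lbnpt  # one value per grid point

# Assumed sector size in words; 7168 is exactly 14 * 512.
SECTOR = 512
padded = -(-lblrec // SECTOR) * SECTOR   # round up to whole sectors

print(points == lblrec, padded == lbnrec)
```

So the header describes a complete 73 x 96 field. The error code of 0.0 with zero words transferred corresponds, per the table in the log, to end-of-file being reached before any of that data could be read.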

Hi Abhik,

The problem might be that the model expects a time coordinate of length 12 instead of 60.
Checking the “valse” run, I can see that the conductance values are currently not updated from the ancillary file but are instead “configured”.

Can you please go to

umuix > valse (edit) > Atmosphere > Ancillary and input data files > Climatologies and potential climatologies > Vegetation Distribution: Area and structure

and, under “Grid-box-mean canopy conductance to be:”, tick “Updated from ancil” every 5 Days, like the others. (In this case it doesn’t really matter, since all the conductance values are the same, so you could just as well update every 5 years, which is the length of the conductance field.)

This should work.

Davide

Hi @atteggiani

I’ve been helping @abhik with this as well, and I thought I’d respond over here so we’re all on the same page. Did you end up using the solution in the last post to solve this problem? How did you go about creating the ancils? Did you end up converting from @abhik’s original netcdf files or shifting the control ancils using e.g. xarray? I’ve not done much work with UM ancils before, so it’d be good to share your solution so I can see where I went wrong.

Thanks.

Dale

Hello @dale.roberts and @abhik

I will also post here for anyone else having similar issues.
We managed to solve Abhik’s problem, and the experiment with the shifted ancillary files seems to run smoothly.

There were a few issues with the ancillary files:

  1. The internal structure of many of the ancillary files used for the control experiment is not consistent. This often means that, even though the model can still read them, if you try to modify them with a tool like mule you will not be able to write out the modified ancil without first fixing the internal issues of the original file.
    So, in general, the suggestion is to start from ancillary files that are structurally consistent and go from there. At ACCESS-NRI we are currently working to provide a well-documented and structurally proper set of UM ancillary files to be used for different simulations, as well as a tool (which uses mule) to easily modify an existing UM ancillary file with the data from a netCDF file. These will be available in the near(ish) future.

  2. Dale’s Github script works properly, but there were other problems with the ancillary files that Abhik was trying to modify:

    • Related to the point above, the original ‘veg.func_seas’ file did not have the vegetation functional types set properly. As a consequence, if you inspect the file on gadi (using xconv, for example) it shows a time dimension of 60 instead of 12, this being the product of 12 timesteps and 5 functional types. As Dale also did in his code, you have to modify the LBUSER(5) metadata in the PP header to account for the different functional types.

    • Another issue was related to the other vegetation file that the UM needs: the ‘veg.frac_igbp’ file (which Abhik mistakenly called ‘veg.func_igbp’ above). The shifted version of this file, in the experiment attempted by Abhik, did not contain what it was supposed to contain. In fact, it had the same data that was stored in the shifted ‘veg.func_seas’ file. We therefore had to replace it with the proper shifted data.

    • Something similar happened with the LAND FRACTION file (in Abhik’s shifted run attempt it was replaced with the LANDMASK file). Even though the two files are almost identical and can contain the same geographical data, they cannot be used in place of each other. This is because the model will complain when it looks for a specific STASH code (e.g. ‘505’ for LAND FRACTION) but finds another one (in this case ‘30’, used for the LANDMASK). The STASH information is found in the LBUSER(4) metadata in the ancil file PP header.
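
The header checks described in the list above can be sketched as a small helper. This is only an illustration: `check_headers` is a hypothetical function, and the stand-in objects below simply mimic fields carrying `lbuser4` (STASH code) and `lbuser5` (pseudo level) attributes. It assumes the pseudo level varies fastest within each time step, and the STASH code 217 is used purely as an example value. With mule, the list of fields read from an ancillary file could be passed in instead, assuming its field objects expose the same attribute names.

```python
from types import SimpleNamespace


def check_headers(fields, expected_stash, n_pseudo=None):
    """Return (index, problem) tuples for fields whose headers look wrong.

    If n_pseudo is given, pseudo levels 1..n_pseudo are assumed to repeat
    in order for every time step (an assumption about the field ordering).
    """
    problems = []
    for i, field in enumerate(fields):
        if field.lbuser4 != expected_stash:
            problems.append((i, f"STASH {field.lbuser4}, expected {expected_stash}"))
        if n_pseudo is not None:
            expected_level = i % n_pseudo + 1
            if field.lbuser5 != expected_level:
                problems.append(
                    (i, f"pseudo level {field.lbuser5}, expected {expected_level}")
                )
    return problems


# Stand-in headers: 2 time steps x 5 functional types, levels set correctly
good = [SimpleNamespace(lbuser4=217, lbuser5=i % 5 + 1) for i in range(10)]

# Same fields with the pseudo level left unset, as in the original func_seas file
unset_levels = [SimpleNamespace(lbuser4=217, lbuser5=0) for _ in range(10)]

# A land mask field (STASH 30) supplied where land fraction (505) is expected
wrong_file = [SimpleNamespace(lbuser4=30, lbuser5=0)]

print(check_headers(good, 217, n_pseudo=5))
print(len(check_headers(unset_levels, 217, n_pseudo=5)))
print(check_headers(wrong_file, 505))
```

Running a check like this before the reconfiguration step would flag both the missing functional-type levels and the swapped land mask file without needing a model run.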

I believe these were the major issues why the model was failing to run.
Hope this makes things a bit clearer.

Cheers
Davide

Hi @atteggiani

Thanks for that, very informative! I’m glad I can stop tweaking that script now. A standard set of ancils sounds like a great asset. Having not worked with ancils before, the hardest part for me was checking and re-checking the file against known-good ancils and F03 to derive all of that metadata from scratch. Something that does the “load standard ancil → load netcdf field → swap ancil field for netcdf field → write new ancil” workflow would be a fantastic tool in cases like this.

Thanks for your work on this!

Dale
