AUS2200 vegetation fraction ancil creation issues

Hi All

@cchambers and I are attempting to run a shifted-domain AUS2200 configuration, and we’re having issues with the creation of the vegetation fraction ancils. What seems to be happening is that one of the plant functional types has a huge number of unresolved points (~27k) out to a search radius of nearly 700 grid cells. As the cost of the search scales as npoints*radius**2, we’re finding that even with the maximum walltime of 48 hours on Gadi, the job won’t finish if we continue to request seasonal ancils. A quick and dirty way to get this done would be to limit it to a single month, as that’s all we need for this simulation, but that isn’t a viable solution in the long term. For comparison, this job takes about 10 minutes for the standard AUS2200 domain. What would cause such a drastic change in runtime, and what are some potential ways to speed this up?
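To put rough numbers on that scaling, here is a back-of-envelope sketch. It assumes the cost is simply proportional to npoints*radius**2 as stated above; the function name `relative_search_cost` and the baseline figures are made up for illustration and are not taken from the ancil code.

```python
def relative_search_cost(npoints: int, radius: int) -> int:
    """Work units for a search whose cost scales as npoints * radius**2."""
    return npoints * radius ** 2


# Figures quoted in the post: ~27k unresolved points, radius ~700 grid cells.
shifted = relative_search_cost(27_000, 700)

# Hypothetical baseline for comparison only (not a measured value):
# 100 unresolved points that are all filled within 20 grid cells.
baseline = relative_search_cost(100, 20)

print(f"shifted domain      : {shifted:.2e} work units")
print(f"baseline (made up)  : {baseline:.2e} work units")
print(f"ratio               : {shifted / baseline:,.0f}x")
```

Because the radius enters squared, a handful of points that need a very wide search can dominate the runtime even when the rest of the domain resolves quickly.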


Are you generating the ancillaries with the regional ancillary suite?

Yep, the AUS2200 version thereof, u-cp145. The only change we’ve made from the version in the repo is shifting the centre longitude to 145E.

Just a guess: potentially islands that are resolved in the land mask but not in the vegetation master data. Do you have coordinates of any of the misbehaving points?


That seems likely: the MODIS dataset is at 4km and we’re working with a 2.2km land mask derived from SRTM orography. Unfortunately the ancil code linearises the data, so all I know is that some points around 3,039,384 seem to take the longest to resolve. If my maths is right, that’s about 26.1S, 156.7E, which I first thought was outside the original AUS2200 domain. Nope, that’s inside the original domain. I’m having a look now to see what is (or isn’t) there in the ancils.

My maths was completely wrong: point 3,039,384 corresponds to 29.0S, 167.9E, and there is a little island there, Norfolk Island, which is completely missing from the MODIS dataset.
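For anyone repeating the index-to-coordinate conversion, a minimal sketch follows. It assumes the flattened index is row-major (the index runs along longitude first, then latitude) on a regular grid. I don't know which convention the ancil code actually uses, and the function name `index_to_latlon`, its parameters, and the toy grid values are all hypothetical, not the AUS2200 grid definition.

```python
from typing import Tuple


def index_to_latlon(idx: int, nx: int, lat0: float, lon0: float,
                    dlat: float, dlon: float,
                    one_based: bool = False) -> Tuple[float, float]:
    """Convert a flattened grid index to (lat, lon).

    Assumes row-major storage: idx = row * nx + col, where nx is the number
    of points along longitude. Set one_based=True for Fortran-style indices.
    """
    if one_based:
        idx -= 1
    row, col = divmod(idx, nx)
    return lat0 + row * dlat, lon0 + col * dlon


# Toy grid, NOT the real domain: 4 columns, origin at 10S 100E, 0.5 degree spacing.
print(index_to_latlon(6, nx=4, lat0=-10.0, lon0=100.0, dlat=0.5, dlon=0.5))
# -> (-9.5, 101.0)
```

The usual ways to get this wrong are swapping the row and column counts, or forgetting that a Fortran index starts at 1, so it is worth checking a point with a known location first.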


A quick update before the long weekend: I’ve made a test land mask without Norfolk Island and it did not solve the issue. In the interest of expedience, @cchambers has shifted the domain centre to 140E, which appears to remove the area causing the problem, so that gives us an idea of where the issues lie. There is something more to it than just missing vegetation data, as this only happens on the second plant functional type. In fact, I can see exactly when it manages to resolve Norfolk Island, at a search radius of 321 grid points, corresponding to about the distance between it and New Caledonia (~750km). At that point, however, there are still 27,463 points remaining unresolved.

I looked at this a bit further on Thursday evening, and it turns out that the issue is New Zealand. This is the shifted domain @cchambers has been attempting to use.
I’ve verified that the 27,463 unresolved points correspond to the bit of New Zealand visible in the bottom right of the image. As far as I know, this only happens with the second plant functional type. The first completes quite quickly, as only the Norfolk Island points are unresolved after a few steps. I’ve been trying to figure out exactly where the plant functional types data comes from. I think it’s this dataset here: /g/data/access/TIDS/UM/ancil/atmos/master/vegetation/cover/igbp/v2/qrdata.igbp, but it’s not in a format I’m familiar with. Any clues on how to work with this data would be very much appreciated.

From decoding the CAP source:

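from typing import BinaryIO
import numpy

def read_igbp(f: BinaryIO):
    """
    Reads UKMO IGBP master data file
    """
    # Assumption: each header value sits on its own text line, and the grid
    # follows as one byte (class code) per point. The f.readline()/f.read()
    # calls are a best guess; check them against the CAP source.
    igbp = {}
    igbp['points_lambda_srce'] = int(f.readline())
    igbp['points_phi_srce'] = int(f.readline())
    igbp['phi_origin_srce'] = float(f.readline())
    igbp['lambda_origin_srce'] = float(f.readline())
    igbp['delta_phi_srce'] = float(f.readline())
    igbp['delta_lambda_srce'] = float(f.readline())
    npoints = igbp['points_lambda_srce'] * igbp['points_phi_srce']
    igbp['data'] = numpy.frombuffer(f.read(npoints), dtype='b').reshape(
        (igbp['points_phi_srce'], igbp['points_lambda_srce']))

    return igbp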

Class codes:


      CASE('A')                      ! EN forest
      CASE('B')                      ! EB forest
      CASE('C')                      ! DN forest
      CASE('D')                      ! DB forest
      CASE('E')                      ! mixed forest
      CASE('F')                      ! closed shrub
      CASE('G')                      ! open shrub
      CASE('H')                      ! woody savannah
      CASE('I')                      ! savannah
      CASE('J')                      ! grassland
      CASE('K')                      ! wetland
      CASE('L')                      ! cropland
      CASE('M')                      ! urban
      CASE('N')                      ! mosaic
      CASE('O')                      ! snow and ice
      CASE('P')                      ! barren
      CASE('Q')                      ! water
      CASE('R')                      ! open sea
      CASE('Z')                      ! missing data
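With the grid decoded, one way to check whether a region such as Norfolk Island or New Zealand has any usable vegetation classes is to tabulate the class codes inside a lat/lon box. This is a minimal sketch that takes a dict with the same keys as `read_igbp` returns. It assumes row j sits at phi_origin_srce + j * delta_phi_srce and column i at lambda_origin_srce + i * delta_lambda_srce, which I have not confirmed against CAP (the sign of the latitude step and the longitude convention may differ). The helper name `class_counts` and the demo grid are made up.

```python
import numpy


def class_counts(igbp: dict, lat_min: float, lat_max: float,
                 lon_min: float, lon_max: float) -> dict:
    """Count IGBP class codes (as letters) inside a lat/lon box."""
    # Assumed coordinate layout; verify against the CAP source.
    lats = igbp['phi_origin_srce'] + igbp['delta_phi_srce'] * numpy.arange(igbp['points_phi_srce'])
    lons = igbp['lambda_origin_srce'] + igbp['delta_lambda_srce'] * numpy.arange(igbp['points_lambda_srce'])
    rows = (lats >= lat_min) & (lats <= lat_max)
    cols = (lons >= lon_min) & (lons <= lon_max)
    box = igbp['data'][numpy.ix_(rows, cols)]
    codes, counts = numpy.unique(box, return_counts=True)
    return {chr(int(c)): int(n) for c, n in zip(codes, counts)}


# Synthetic 3x4 grid for demonstration only: open sea ('R') with one
# missing-data point ('Z') and one EB forest point ('B').
demo = {
    'points_lambda_srce': 4, 'points_phi_srce': 3,
    'phi_origin_srce': -30.0, 'lambda_origin_srce': 166.0,
    'delta_phi_srce': 1.0, 'delta_lambda_srce': 1.0,
    'data': numpy.frombuffer(b'RRRR' b'RZRR' b'RRRB', dtype='b').reshape((3, 4)),
}
print(class_counts(demo, -29.5, -28.5, 166.5, 167.5))  # -> {'Z': 1}
```

A box that comes back containing only 'Q', 'R' or 'Z' where the land mask has land would be consistent with the "island missing from the master data" explanation.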

So after further digging, what @MartinDix said in our meeting a couple of weeks ago was correct: ANTS is used to create the veg_frac ancil, but CAP is still used to create the soil ancil. CAP’s workflow is veg_frac/veg_func → land ice correction → soil, so this whole thing is getting hung up on an ancil that isn’t going to be used. The only dependency I can see between the vegetation and soil calculations is the land ice mask, which has to be computed after all of the vegetation functional types are resolved and after any additional glaciers are added in. However, looking at the namelist values for this calculation, no land ice processing is ever done, so I think it’s safe to leave points in the vegetation ancils unresolved and skip straight to the soil ancil. I’ve tested this out on the original AUS2200 domain, which doesn’t hang, and it created an identical qrparm.soil ancil. It still feels like a workaround rather than a solution, as CAP needs to be modified and it will stop working as soon as glaciers become a factor. As far as I can tell, soil parameters are still generated by CAP in RAL3.2, so perhaps workarounds are all we have until ANTS fully replaces CAP, which is either a work in progress or fully completed depending on which ticket you look at in the RMED space on MOSRS.
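If anyone wants to repeat the "identical ancil" check, a minimal sketch of a byte-for-byte comparison is below. It is not necessarily how the check above was done, and the helper names `sha256_of` and `same_file_contents` are hypothetical. Note that a byte-level match is stricter than comparing field values: if the file headers carry anything run-specific, a field-by-field comparison would be needed instead.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def same_file_contents(path_a: str, path_b: str) -> bool:
    """True if both files have byte-identical contents."""
    return sha256_of(path_a) == sha256_of(path_b)
```

Usage would be along the lines of `same_file_contents('reference/qrparm.soil', 'test/qrparm.soil')`, with both paths being placeholders for wherever the two suite runs wrote their output.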