AUS2200 vegetation fraction ancil creation issues

Hi All

@cchambers and I are attempting to run a shifted domain AUS2200 configuration, and we’re having issues around the creation of the vegetation fraction ancils. What seems to be happening is that one of the plant functional types has a huge number of unresolved points (~27k) out to a search radius of nearly 700 grid cells. As the cost of the search goes by npoints*radius**2, we’re finding that even if we set this job to the maximum walltime of 48 hours on Gadi, it won’t finish if we continue to request seasonal ancils. A quick and dirty way to get this done would be to limit it to a single month as that’s all we need for this simulation, but that isn’t a viable solution in the long term. This job takes about 10 minutes for the standard AUS2200 domain. What would the reason for such a drastic change in runtime be, and what are some potential ways to speed this up?

2 Likes

Are you generating the ancillaries with the regional ancillary suite?

Yep, the AUS2200 version thereof, u-cp145. The only change we’ve made from the version in the repo is shifting the centre longitude to 145E.

Just a guess - potentially islands that are resolved in the land mask but not in the vegetation master data. do you have coordinates of any of the misbehaving points?

2 Likes

That seems likely, the MODIS dataset is at 4km and we’re working with a 2.2km land mask derived from SRTM orography. Unfortunately the ancil code linearises the data, so all I know is that some points around 3,039,384 seem to take the longest to resolve. If my maths is right, that’s about 26.1S, 156.7E, which was outside of the original AUS2200 domain Nope, that’s inside the original domain. I’m having a look now to see what is (or isn’t) there in the ancils.

My maths was completely wrong, point 3,039,384 corresponds to 29.0S, 167.9E, and there is a little island Norfolk Island there which is completely missing from the MODIS dataset.

3 Likes

A quick update before the long weekend, I’ve made a test land mask without Norfolk Island and it did not solve the issue. In the interest of expedience @cchambers has shifted the domain centre to 140E which appears to remove the area causing the problem, so that gives us an idea where the issues lie. There is something more to it than just missing vegetation data, as this only happens on the second plant functional type. In fact, I can see exactly when it manages to resolve Norfolk Island at a search radius of 321 grid points, corresponding to about the distance between it and New Caledonia (~750km). At that point, however there are still 27,463 points remaining unresolved.

I looked at this a bit further on Thursday evening, and it turns out that the issue is New Zealand. This is the shifted domain @cchambers has been attempting to use.
Aus2200_shifted
I’ve verified that the 27,463 unresolved points correspond to the bit of New Zealand visible in the bottom right of the image. As far as I know, this only happens with the second plant functional type. The first completes quite quickly as only the Norfolk island points are unresolved after a few steps. I’ve been trying to figure out exactly where the plant functional types data comes from, I think its this dataset here: /g/data/access/TIDS/UM/ancil/atmos/master/vegetation/cover/igbp/v2/qrdata.igbp, but its not in a format I’m familiar with. Any clues on how to work with this data would be very much appreciated.

From decoding the cap source:

from typing import BinaryIO
import numpy

def read_igbp(f: BinaryIO):
    """
    Reads UKMO IGBP master data file
    """
    igbp = {}
    igbp['points_lambda_srce'] = int(f.read(5))
    f.read(1)
    igbp['points_phi_srce'] = int(f.read(5))
    igbp['phi_origin_srce'] = float(f.read(6))
    f.read(1)
    igbp['lambda_origin_srce'] = float(f.read(6))
    f.read(1)
    igbp['delta_phi_srce'] = float(f.read(10))
    f.read(1)
    igbp['delta_lambda_srce'] = float(f.read(10))
    igbp['data'] = numpy.frombuffer(f.read(igbp['points_lambda_srce'] * igbp['points_phi_srce']), dtype='b').reshape((igbp['points_phi_srce'], igbp['points_lambda_srce']))

    return igbp

Class codes:

      SELECT CASE (CIGBP_CLASS)

      CASE('A')                      ! EN forest
        IGBP_CLASS=1
      CASE('B')                      ! EB forest
        IGBP_CLASS=2
      CASE('C')                      ! DN forest
        IGBP_CLASS=3
      CASE('D')                      ! DB forest
        IGBP_CLASS=4
      CASE('E')                      ! mixed forest
        IGBP_CLASS=5
      CASE('F')                      ! closed shrub
        IGBP_CLASS=6
      CASE('G')                      ! open shrub
        IGBP_CLASS=7
      CASE('H')                      ! woody savannah
        IGBP_CLASS=8
      CASE('I')                      ! savannah
        IGBP_CLASS=9
      CASE('J')                      ! grassland
        IGBP_CLASS=10
      CASE('K')                      ! wetland
        IGBP_CLASS=11
      CASE('L')                      ! cropland
        IGBP_CLASS=12
      CASE('M')                      ! urban
        IGBP_CLASS=13
      CASE('N')                      ! mosaic
        IGBP_CLASS=14
      CASE('O')                      ! snow and ice
        IGBP_CLASS=15
      CASE('P')                      ! barren
        IGBP_CLASS=16
      CASE('Q')                      ! water
        IGBP_CLASS=17
      CASE('R')                      ! open sea
        IGBP_CLASS=18
      CASE('Z')                      ! missing data
        IGBP_CLASS=19
      CASE DEFAULT
        IGBP_CLASS=19
1 Like