UM RNS scaling on NCI

Hi Team,
It is that time of year when many of us are putting in our merit applications on the NCI. I was wondering whether any statistics have been produced on the scalability of the UM RNS on NCI, perhaps for AUS2200 or another ‘default’ suite, that could help us with our proposals? I know this probably varies a lot depending on the specific application, but I thought I’d throw the question out there.
Thanks! Sonya

Hi @sonyafiddes - just letting you know that we’ve seen the question and are getting it to the right people :slight_smile:


Hi Sonya,

We have talked offline but I am posting some statistics here for others who may be interested.

The ACCESS-NRI regional nesting suite configuration with ERA5 land/surface variables replaced with data from either ERA5-land or BARRA-R2 has the following cost on Gadi.

For a double-level nest with ERA5 + ERA5-land:

  • 0.1 degree resolution for nest 1
  • 0.0198 degree resolution for nest 2
  • 450x450 grid-points in both levels
  • 18x16 CPUs in both levels
  • 24 hour cycle length and 6 hour chunk length (again for both levels)

The total cost of a +24 hour run, summed over all jobs (including the reconfigurations, boundary conditions, forecasts, etc.), is 867 SU.

If you are more interested in the forecast jobs alone, each +6 hour forecast job costs roughly 25 SU (about two and a half minutes of wall time) for the coarser nest and 110 SU (about 11 minutes of wall time) for the higher-resolution nest.
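For anyone turning these figures into a proposal estimate, the arithmetic can be sketched as below. This is only a rough sketch: the core count (one core per task in the 18x16 decomposition) and the charge rate of 2 SU per core-hour are assumptions that are not stated in the post, and the linear extrapolation to longer runs assumes every 24 hour cycle costs the same. The names `su_from_wall_time` and `estimate_run_su` are made up for illustration.

```python
# Figures quoted in the post.
TOTAL_SU_PER_CYCLE = 867                              # all jobs, one +24 hour run
CYCLE_HOURS = 24
CHUNK_HOURS = 6
FORECAST_SU = {"nest 1": 25, "nest 2": 110}           # per +6 hour forecast job
FORECAST_MINUTES = {"nest 1": 2.5, "nest 2": 11.0}    # wall time per forecast job

# Assumptions (NOT from the post): adjust for your own queue and layout.
CORES = 18 * 16            # assumes one core per task in the 18x16 decomposition
SU_PER_CORE_HOUR = 2.0     # assumed charge rate

# Split the 24 hour total into forecast jobs and everything else.
chunks_per_cycle = CYCLE_HOURS // CHUNK_HOURS
forecast_su_per_cycle = chunks_per_cycle * sum(FORECAST_SU.values())
other_su_per_cycle = TOTAL_SU_PER_CYCLE - forecast_su_per_cycle


def su_from_wall_time(minutes, cores=CORES, rate=SU_PER_CORE_HOUR):
    """SU implied by a wall time under the assumed core count and charge rate."""
    return cores * (minutes / 60.0) * rate


def estimate_run_su(n_days, su_per_cycle=TOTAL_SU_PER_CYCLE):
    """Linear extrapolation: assumes each 24 hour cycle costs the same."""
    return n_days * su_per_cycle


print(f"forecast jobs per nest per cycle: {chunks_per_cycle}")
print(f"forecast SU per cycle: {forecast_su_per_cycle}")
print(f"remaining SU per cycle (reconfiguration, boundary conditions, etc.): {other_su_per_cycle}")
for nest, minutes in FORECAST_MINUTES.items():
    implied = su_from_wall_time(minutes)
    print(f"{nest}: quoted {FORECAST_SU[nest]} SU, implied by wall time {implied:.1f} SU")
```

Under these assumptions the wall times imply about 24 SU and 106 SU per forecast job, close to the quoted 25 SU and 110 SU, and the forecast jobs account for 540 SU of the 867 SU total. As a usage example, `estimate_run_su(30)` gives 26010 SU for a 30 day simulation.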

I understand that this is not enough detail to produce scaling graphs but hopefully it is some indication.

For scaling information on AUS2200 please contact @dale.roberts, @Yi_Huang or @Claire_Vincent.

@Heidi will comment soon on the ACCESS-NRI plans regarding the scaling information that will accompany the regional nesting suite supported configuration.

Best regards,

Chermelle


Hi Sonya,

Supplementing @cbengel’s reply: ACCESS-NRI is planning to release a supported configuration of the Regional Nesting Suite. As we plan to optimise the configuration before releasing it, scaling information will not be available in time for this year’s NCMAS submission.
There is, however, scaling information for the Aus2200 configuration, which was optimised by @dale.roberts:

Aus2200 CPU Scaling

Cascade Lake: best efficiency at 6,384 cores
→ 30.8 forecast days/day @ 9.9kSU/day

Sapphire Rapids: best efficiency at 10,192 cores
→ 54.5 forecast days/day @ 8.9kSU/day
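
One way to read these numbers, sketched below, is that "kSU/day" means kSU per forecast (simulated) day. That reading is an interpretation rather than something stated above, but it reproduces both quoted figures to within rounding if a charge rate of 2 SU per core-hour is assumed (also an assumption, so check the rate for your queue). The function names are hypothetical.

```python
HOURS_PER_DAY = 24
SU_PER_CORE_HOUR = 2.0   # assumption, not stated in the post

# Figures quoted in the post.
RUNS = {
    "Cascade Lake": {"cores": 6384, "forecast_days_per_day": 30.8, "quoted_ksu": 9.9},
    "Sapphire Rapids": {"cores": 10192, "forecast_days_per_day": 54.5, "quoted_ksu": 8.9},
}


def su_per_forecast_day(cores, forecast_days_per_day, rate=SU_PER_CORE_HOUR):
    """SU charged per simulated day, given the throughput per wall-clock day."""
    su_per_wall_clock_day = cores * HOURS_PER_DAY * rate
    return su_per_wall_clock_day / forecast_days_per_day


def wall_clock_days(forecast_days, forecast_days_per_day):
    """Wall-clock days needed to simulate the requested number of forecast days."""
    return forecast_days / forecast_days_per_day


for name, run in RUNS.items():
    derived = su_per_forecast_day(run["cores"], run["forecast_days_per_day"])
    print(f"{name}: derived {derived / 1000:.2f} kSU per forecast day, "
          f"quoted {run['quoted_ksu']} kSU/day")
```

If that reading holds, a proposal estimate is simply the number of forecast days multiplied by the quoted kSU/day, and `wall_clock_days` gives the corresponding turnaround at the quoted throughput.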
