A thread to document large rAM3 experiments used as ‘flagships’ for the 21st Century Weather Centre of Excellence.
Here is the domain for the first experiment. We are using BARRA-R2 to initialise the land-surface, which constrains the outer nest extents to lie within the BARRA-R2 domain.
I had to make the following changes to the rAM3 defaults to cycle for two days.
- Change the rose-suite metadata to allow `rose edit` to view four resolutions: `python2 setup_metadata 4 4` (note: this will erase your existing `rose-suite.conf` file)
- Doubled `NPROC` to `36,32` for the 5 km resolution (see the sketch after this list)
- Increased `NPROC` to `54,48` for the 1 km resolution
- Doubled `WALL_CLOCK_LIMIT` from 1800 (30 mins) to 3600 (60 mins) so the 6-hour forecast tasks at 1 km resolution can finish (they take about 40 minutes using 54x48 processors).
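For reference, here is a minimal sketch of how the `NPROC` and `WALL_CLOCK_LIMIT` overrides might look in `rose-suite.conf`. The option names (and the mapping of `rs0N` indices to the 5 km and 1 km nests) are assumptions based on the `rg01_rs0N_*` naming used later in this post, not values copied from the suite, so confirm the real names in `rose edit` before applying them:

```
# Illustrative sketch only -- option names and rs0N indices are assumed;
# confirm the actual variable names via rose edit.
rg01_rs02_nproc='36,32'            # 5 km nest: doubled decomposition
rg01_rs03_nproc='54,48'            # 1 km nest: increased decomposition
rg01_rs03_wall_clock_limit=3600    # 1 km forecast tasks: 30 min -> 60 min
```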
The current suite takes about 5.5 hours of wall time to run 24 hours of forecasts; the forecast tasks themselves take just over 4 hours. Running 24 hours of forecasts costs about 15 kSU and consumes about 750 GB of disk space on /scratch.
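For planning purposes, those figures extrapolate to roughly 450 kSU and about 22 TB of /scratch per 30 days of simulation, before any output thinning or archiving.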
Here are the details of this configuration:
I repeated this configuration with `rg01_rs0[1-3]_ioproc=48`, which showed no difference.
Further work:
- Try OpenMP threads (see the sketch after this list). What is the UM namelist control to activate this? And will this require recompilation of the UM?
- Incorporate @m.lipson's world-cover surface data for the 1 km nest.
- Run some scaling tests for the 5 km and 1 km domains.
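On the OpenMP question above: my understanding (an assumption I have not verified against this suite) is that the thread count is normally set in the task's runtime environment rather than through a UM namelist, and it only takes effect if the UM executable was built with OpenMP. Something along these lines in the forecast task's environment:

```
# Sketch only: assumes the executable was compiled with OpenMP.
# The total CPU request (nproc_x * nproc_y * threads) would also
# need to be updated to match.
export OMP_NUM_THREADS=2
```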
I’ve encountered some `NPROC` restrictions with the outer resolution. Applying `NPROC` `36,32` to the 12 km nest generated this message:

> ? Error message: Too many processors in the North-South direction. The maximum permitted is 32

What is causing this error? Is this a restriction of the GAL9 model? (Because I can run the RAL3P2 model with more CPUs.)