An update.
With @mlipson’s help we have incorporated the World Cover ancillaries into the flagship experiment. The differences in the urban vegetation fields around Sydney and across the full 1-km domain can be seen in notebooks/Check_Flagship_worldcover.ipynb in the 21centuryweather/UM_configuration_tools repository on GitHub.
There are issues with the suite when increasing the number of processors beyond 52x48 for the 1-km domain. Every decomposition I have tried above that (80x72, 72x70, 56x54) fails: the UM forecast tasks consume increasing amounts of walltime and fail intermittently.
I checked a 56x54 task which initially failed but then succeeded after resubmission. The stdout (containing the solver residuals for every timestep) and stderr from the failed and successful attempts are identical up to the point of failure. When using more processors, the UM forecast completes, but the job then hangs, and there is no error message explaining why it fails in the UM write-dump and STASH field-gathering functions:
```
forrtl: error (78): process killed (SIGTERM)
Image PC Routine Line Source
libifcoremt.so.5 000014D48FFF2555 for__signal_handl Unknown Unknown
libpthread-2.28.s 000014D48D45F990 Unknown Unknown Unknown
libucp.so.0.0.0 000014D488BDB7D0 ucp_worker_progre Unknown Unknown
libmpi.so.40.30.5 000014D490608BFF mca_pml_ucx_recv Unknown Unknown
libmpi.so.40.30.5 000014D4906E00BB mca_coll_basic_ga Unknown Unknown
libmpi.so.40.30.5 000014D490771D30 PMPI_Gatherv Unknown Unknown
libmpi_mpifh.so 000014D490ADF5FC Unknown Unknown Unknown
um-atmos.exe 0000000003389BFE mpl_gatherv_ 98 mpl_gatherv.F90
um-atmos.exe 0000000000663EAB gather_field_mpl_ 222 gather_field_mpl.F90
um-atmos.exe 0000000000663606 gather_field_mod_ 127 gather_field.F90
um-atmos.exe 0000000000658848 stash_gather_fiel 399 stash_gather_field.F90
um-atmos.exe 000000000076F73F general_gather_fi 413 general_gather_field.F90
um-atmos.exe 0000000000BDF0BB um_writdump_mod_m 522 um_writdump.F90
um-atmos.exe 0000000000BDD65C dumpctl_mod_mp_du 207 dumpctl.F90
um-atmos.exe 00000000004F065D u_model_4a_mod_mp 452 u_model_4A.F90
um-atmos.exe 000000000040CA38 um_shell_mod_mp_u 748 um_shell.F90
um-atmos.exe 00000000004093F8 MAIN__ 60 um_main.F90
um-atmos.exe 00000000004093A2 Unknown Unknown Unknown
libc-2.28.so 000014D48D0B17E5 __libc_start_main Unknown Unknown
um-atmos.exe 00000000004092AE Unknown Unknown Unknown
```
I assume the job fails because I have not correctly configured an I/O server. @srennie from BoM kindly pointed me to this page on the UKMO Trac site containing UM hints for running on gadi (it requires a MOSRS account): https://code.metoffice.gov.uk/trac/nwpscience/wiki/bomnwpscience/AccessNWP_SuiteOptimisations
It also contains links on configuring the I/O server, which I’ll implement in the coming weeks.
In addition to improving the I/O, another option is to replicate the AUS2200 suite. Dale Roberts wrote a lot of documentation about the optimisations he achieved with AUS2200 when running on the gadi Sapphire Rapids nodes, see here:
These optimisations are visible in the AUS2200 suite at:
https://code.metoffice.gov.uk/trac/roses-u/browser/c/s/1/4/2/trunk/site/nci-gadi/suite-adds.rc
```
{% macro atmos_resources(nx, ny, omp, mem) %}
{% set ios = 48 %}
{% if ( 'normal' == UM_ATM_NCI_QUEUE ) or ( 'express' == UM_ATM_NCI_QUEUE ) %}
{% set cores_per_node = 48 %}
{% elif ( 'normalbw' == UM_ATM_NCI_QUEUE ) or ( 'expressbw' == UM_ATM_NCI_QUEUE ) %}
{% set cores_per_node = 28 %}
{% elif ( 'normalsl' == UM_ATM_NCI_QUEUE ) %}
{% set cores_per_node = 32 %}
{% elif ( 'normalsr' == UM_ATM_NCI_QUEUE ) or ( 'expresssr' == UM_ATM_NCI_QUEUE ) %}
{% set cores_per_node = 104 %}
{% endif %}
{% set threads_per_node = 96 %}
{% set nnodes = (( nx * ny + ios ) * omp / threads_per_node)|round(0,'ceil')|int %}
    [[[ directives ]]]
        -q = {{UM_ATM_NCI_QUEUE}}
        -l ncpus = {{ nnodes * cores_per_node }}
        -l mem = {{mem * (nx * ny + ios) * omp}}mb
        -l jobfs = {{mem * (nx * ny + ios) * omp}}mb
    [[[ environment ]]]
        UM_ATM_NPROCY = {{ny}}
        UM_ATM_NPROCX = {{nx}}
        OMP_NUM_THREADS = {{omp}}
        FLUME_IOS_NPROC = {{ios}}
        #ATMOS_LAUNCHER = mpirun -n {{nx * ny + ios}} --map-by node:PE={{omp}} --rank-by core
{# spr needs special binding options thanks to many NUMA nodes #}
{% if ( ( 'normalsr' == UM_ATM_NCI_QUEUE ) or ( 'expresssr' == UM_ATM_NCI_QUEUE ) ) and ( threads_per_node < cores_per_node ) %}
        ATMOS_LAUNCHER = mpirun -n {{nx * ny + ios}} --map-by ppr:$(( {{threads_per_node}} / $PBS_NCI_NUMA_PER_NODE / {{omp}} )):numa:PE={{omp}} --rank-by core
{% else %}
        ATMOS_LAUNCHER = mpirun -n {{nx * ny + ios}} --map-by ppr:{{(threads_per_node/omp/2)|round|int}}:socket:PE={{omp}} --rank-by core
{% endif %}
        ROMIO_HINTS = /home/563/dr4292/hints.txt
        OMPI_MCA_io = romio321
    [[[job]]]
        execution time limit = PT{{WALL_CLOCK_LIMIT}}S
{% endmacro %}
```
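To get a feel for what that macro actually requests, here is a minimal Python sketch that mirrors its arithmetic for the Sapphire Rapids (normalsr) queue. The 52x48 decomposition is the one currently working for our 1-km domain; the OpenMP thread count (omp=2) and per-task memory (mem=1000 MB) are placeholders I picked for illustration, not values taken from AUS2200.

```python
from math import ceil

# Placeholder inputs: 52x48 is our working 1-km decomposition; omp and mem are assumed.
nx, ny, omp, mem = 52, 48, 2, 1000   # mem is MB per MPI task in the macro
ios = 48                             # extra MPI ranks reserved for the I/O server
cores_per_node = 104                 # gadi normalsr / expresssr
threads_per_node = 96                # the macro places 96 threads per 104-core node

ntasks = nx * ny + ios                          # total MPI ranks, I/O servers included
nnodes = ceil(ntasks * omp / threads_per_node)  # jinja: |round(0,'ceil')|int
ncpus = nnodes * cores_per_node                 # PBS -l ncpus
mem_mb = mem * ntasks * omp                     # PBS -l mem and -l jobfs

print(f"MPI ranks: {ntasks}, nodes: {nnodes}, ncpus: {ncpus}, mem: {mem_mb}mb")
# -> MPI ranks: 2544, nodes: 53, ncpus: 5512, mem: 5088000mb
```

Note that on normalsr the macro books whole 104-core nodes but only places 96 threads’ worth of work on each, and the spr branch of ATMOS_LAUNCHER maps ranks per NUMA domain rather than per socket as on the other queues.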
The parameters in that jinja macro that I’m unfamiliar with are ROMIO_HINTS = /home/563/dr4292/hints.txt and OMPI_MCA_io = romio321, which are MPI-related environment variables; see https://wordpress.cels.anl.gov/romio/2008/09/26/system-hints-hints-via-config-file/
I might ask Dale for a copy of his ROMIO_HINTS file, as the current location doesn’t have read access.
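For anyone following along, my current understanding (from the ANL post above, not from Dale’s setup, which I can’t read) is that OMPI_MCA_io=romio321 tells Open MPI to use its ROMIO-based MPI-IO component instead of the default OMPIO, and ROMIO_HINTS points at a plain-text file of MPI-IO hints, one “hint value” pair per line. A made-up example of what such a file might look like, with standard ROMIO/Lustre hint names but placeholder values:

```
romio_cb_write enable
romio_cb_read enable
cb_buffer_size 16777216
striping_factor 16
striping_unit 4194304
```

The collective-buffering (cb_*) and striping hints are presumably the ones that matter for the large dump writes, but I’d treat those numbers as nothing more than a starting point.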
Is anyone else familiar with ROMIO_HINTS and OMPI_MCA_io?