Issues with configuring the UM I/O server

Hello.

@qinggangg and I have been unable to scale the rAM3 suite beyond about 50x50 processors.

The UM completes its time integration successfully. For example, on 72x70 processors it computes a 6 hour forecast on the 1 km domain (21112 x 2000 points at 0.009 degree resolution) in about 13.3 minutes. The task then hangs and exceeds the available wall time. The default wall time is 30 minutes, but I’ve tried up to 2 hours with no luck. A task split across more than 50x50 processors will sometimes work, but 90% of the time it fails due to wall time exceedance.

There is no error message (beyond the wall time exceedance). The UM is executing

stash_gather_field.F90

when PBS terminates it. The final output from umnsa.fort6.pe0 (the rank 0 MPI process) is:

	********************************************************************************
	Atm_Step: Info: timestep      360 took     23.210 seconds
	********************************************************************************
	GET_FILENAME: Generated filename:/home/548/pag548/cylc-run/u-dq126/share/cycle/20220226T0000Z/Flagship_ERA5to1km/1km/RAL3P2/um//../ics/umnsaa_da006
	GET_FILENAME:             (From): /home/548/pag548/cylc-run/u-dq126/share/cycle/20220226T0000Z/Flagship_ERA5to1km/1km/RAL3P2/um//../ics/umnsaa_d%z%N
	FILE_MANAGER: Assigned : /home/548/pag548/cylc-run/u-dq126/share/cycle/20220226T0000Z/Flagship_ERA5to1km/1km/RAL3P2/um//../ics/umnsaa_da006
	FILE_MANAGER:          : Unit :  12 (portio)
	DUMPCTL: Opening new file /home/548/pag548/cylc-run/u-dq126/share/cycle/20220226T0000Z/Flagship_ERA5to1km/1km/RAL3P2/um//../ics/umnsaa_da006 on unit  12
	OPEN: File /home/548/pag548/cylc-run/u-dq126/share/cycle/20220226T0000Z/Flagship_ERA5to1km/1km/RAL3P2/um//../ics/umnsaa_da006 to be Opened on Unit 12 Exists
	OPEN:  Claimed 4194304 Bytes for Buffering
	OPEN: Buffer Address is 0x18a65a10
	IO: Open: /home/548/pag548/cylc-run/u-dq126/share/cycle/20220226T0000Z/Flagship_ERA5to1km/1km/RAL3P2/um//../ics/umnsaa_da006 on unit  12
	
	WRITING UNIFIED MODEL DUMP ON UNIT 12
	#####################################

This 83 GB file assigned to unit 12 exists and I can read it, but the task still doesn’t exit correctly.

So given that I/O appears to be the bottleneck, I activated a small I/O server using 12 dedicated processors. The first UM task in my suite is a 12 km outer nest (580 x 780 points at 0.11 degrees, with 30x26 processors). The UM completes 20 timesteps and then the I/O server fails with this error:

	-------- IOS ERROR REPORT ---------------
	
	????????????????????????????????????????????????????????????????????????????????
	???!!!???!!!???!!!???!!!???!!!       ERROR        ???!!!???!!!???!!!???!!!???!!!
	?  Error code: 61
	?  Error from routine: IOS_INIT_MD
	?  Error message: A target object of    10 is not allowed
	?  Error from processor: 0
	?  Error number: 23
	????????????????????????????????????????????????????????????????????????????????

This error is thrown in this section of src/io_services/client/ios_client_queue.F90:


    ELSE IF (targetObject == file_op_pseudo_unit) THEN
        !$OMP CRITICAL(internal_write)
        WRITE(IOS_clq_message,'(A,I5,A)') 'A target object of ',                     &
          targetObject,' is not allowed'
        !$OMP END CRITICAL(internal_write)
        errorFlag=61
        CALL IOS_ereport( RoutineName, errorFlag, IOS_clq_message )

file_op_pseudo_unit is the Fortran unit number computed in

SUBROUTINE assign_file_unit(filename, f_unit, handler, id, force)

in src/io_services/model_api/file_manager.F90
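
For reference, based on the interface quoted above, a call to this routine would look something like the sketch below. The keyword usage, the 'portio' handler string and the 'usr1' id are my assumptions, inferred from the umnsa.fort6.pe0 log rather than verified against the UM source:

    ! Hypothetical call sketch, not verbatim UM code: request a free unit
    ! from the portio list and associate it with the file id 'usr1'
    CALL assign_file_unit(fname, f_unit, handler='portio', id='usr1')
    ! f_unit returns the unit number assigned from that list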

Here are the pertinent sections of umnsa.fort6.pe0 that reference unit 10:

FILE_MANAGER: Assigned : pseudo-file for UNIX operations
FILE_MANAGER:          : id   : io_reserved_unit
FILE_MANAGER:          : Unit :  10 (portio)
Running Atmospheric code as pe 0
MPPIO_File_Utils: Initialised file utils using unit  10
FILE_MANAGER: Assigned : Reserved unit for re-initialised stream (/home/548/pag548/cylc-run/u-dq126/share/cycle/20220226T0000Z/Flagship_ERA5to1km/12km/GAL9/um//umnsaa_pb%N.nc)
FILE_MANAGER:          : id   : usr1
FILE_MANAGER:          : Unit :  10 (netcdf)
NCFILE_INIT: Opening new file /home/548/pag548/cylc-run/u-dq126/share/cycle/20220226T0000Z/Flagship_ERA5to1km/12km/GAL9/um//umnsaa_pb000.nc on unit  10
Creating netCDF4 classic model file /home/548/pag548/cylc-run/u-dq126/share/cycle/20220226T0000Z/Flagship_ERA5to1km/12km/GAL9/um//umnsaa_pb000.nc on unit   10

It appears the I/O server is trying to use a unit number that has already been reserved by the netCDF output functions.

I’ve tried to see if these unit numbers can be overridden with a namelist, but no luck.

So the question is, who here has run the UM with an I/O server? And have you ever had to deal with conflicting unit numbers?

The AUS2000 suite used an I/O server, and the BoM operational suites

  • ACCESS-S
  • BARRA-R2
  • NAS

all use it too.

I’ve tried the UM task with ios_unit_alloc_policy = 1, 2, 3 in the IOSCNTL namelist, i.e.

  1. Static allocation based on unit number
  2. Static allocation based on usage order
  3. Round robin dynamic allocation

And the error remains the same for all three options.
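
For the record, the setting was applied along these lines (a minimal sketch of the namelist entry; any other IOSCNTL contents are omitted here):

    &ioscntl
      ios_unit_alloc_policy = 3   ! also tried 1 and 2; same error each time
    /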

Cheers

Do you get the same error if you use the standard fieldsfile outputs rather than netcdf?

No. In this suite, if I disable netCDF outputs with

app/um/rose-app.conf:
l_netcdf=.false.

the UM task immediately fails with:

????????????????????????????????????????????????????????????????????????????????
???!!!???!!!???!!!???!!!???!!!       ERROR        ???!!!???!!!???!!!???!!!???!!!
?  Error code: 14
?  Error from routine: rdbasis
?  Error message: Unable to find file id (usr1) in portio or netcdf file lists
?  Error from processor: 0
?  Error number: 13
????????????????????????????????????????????????????????????????????????????????

It fails in

um-atmos.exe       0000000002B9DBB3  io_server_listene         301  io_server_listener.F90
um-atmos.exe       0000000002B70521  ios_init_mp_ios_r        1025  ios_init.F90

I’m tracking down the source of that error message.

I’ll turn the I/O server off and then try running with l_netcdf=.false.

I may need to insert additional logic so it outputs .pp fields. I’m checking the suite graph and tasks now.

@Paul.Gregory, let me know if you want to fold back the netcdf outputs to fields file outputs and we can provide a different suite on a branch …

Hi Chermelle

A branch using standard fields file outputs would be awesome. Thanks.


Thanks to @cbengel, who created a branch of this suite without netCDF outputs, I got the I/O server running.

The relevant parts of umnsa.fort6.pe0 are:

FILE_MANAGER: Assigned : pseudo-file for UNIX operations
FILE_MANAGER:          : id   : io_reserved_unit
FILE_MANAGER:          : Unit :  10 (portio)
....
792 Processors initialised.
I am PE 0 on gadi-cpu-clx-1621.gadi.nci.org.au
[0] I am running with  2 thread(s).
[0] OpenMP Specification: 201611
FILE_MANAGER: Assigned : ATMOSCNTL
FILE_MANAGER:          : id   : atmoscntl
FILE_MANAGER:          : Unit :  11 (fortran)
FILE_MANAGER: Assigned : SHARED
FILE_MANAGER:          : id   : shared
FILE_MANAGER:          : Unit :  12 (fortran)
...
MPPIO_File_Utils: Initialised file utils using unit  10
...
FILE_MANAGER: Assigned : Reserved unit for re-initialised stream (/home/548/pag548/cylc-run/u-dq126/share/cycle/20220226T0000Z/Flagship_ERA5to1km/12km/GAL9/um//umnsaa_pb%N)
FILE_MANAGER:          : id   : pp1
FILE_MANAGER:          : Unit :  11 (portio)
FILE_MANAGER: Assigned : Reserved unit for re-initialised stream (/home/548/pag548/cylc-run/u-dq126/share/cycle/20220226T0000Z/Flagship_ERA5to1km/12km/GAL9/um//umnsaa_pa%N)
FILE_MANAGER:          : id   : pp0
FILE_MANAGER:          : Unit :  12 (portio)
FILE_MANAGER: Assigned : Reserved unit for re-initialised stream (/home/548/pag548/cylc-run/u-dq126/share/cycle/20220226T0000Z/Flagship_ERA5to1km/12km/GAL9/um//umnsaa_cb%N)
FILE_MANAGER:          : id   : ppmbc
FILE_MANAGER:          : Unit :  13 (portio)
FILE_MANAGER: Assigned : Reserved unit for re-initialised stream (/home/548/pag548/cylc-run/u-dq126/share/cycle/20220226T0000Z/Flagship_ERA5to1km/12km/GAL9/um//umnsaa_pvera%N)
FILE_MANAGER:          : id   : verpp1
FILE_MANAGER:          : Unit :  14 (portio)
FILE_MANAGER: Assigned : Reserved unit for re-initialised stream (/home/548/pag548/cylc-run/u-dq126/share/cycle/20220226T0000Z/Flagship_ERA5to1km/12km/GAL9/um//umnsaa_pverb%N)
FILE_MANAGER:          : id   : verpp2
FILE_MANAGER:          : Unit :  15 (portio)
FILE_MANAGER: Assigned : Reserved unit for re-initialised stream (/home/548/pag548/cylc-run/u-dq126/share/cycle/20220226T0000Z/Flagship_ERA5to1km/12km/GAL9/um//umnsaa_pverc%N)
FILE_MANAGER:          : id   : verpp3
FILE_MANAGER:          : Unit :  16 (portio)
FILE_MANAGER: Assigned : Reserved unit for re-initialised stream (/home/548/pag548/cylc-run/u-dq126/share/cycle/20220226T0000Z/Flagship_ERA5to1km/12km/GAL9/um//umnsaa_pverd%N)
FILE_MANAGER:          : id   : verpp4
FILE_MANAGER:          : Unit :  17 (portio)

So when using fields files, unit numbers 11 to 17 are used for outputs.

I’ll now proceed to test a larger I/O server on the larger domains to see what the speed-up is.

Re: unit number allocations for netCDF outputs.

The file_manager module (https://code.metoffice.gov.uk/trac/um/browser/main/trunk/src/io_services/model_api/file_manager.F90) suggests the unit numbers for file I/O are read from the Fortran derived-type components

um_file_list % file_unit_min
um_file_list % file_unit_max
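
From those component names, I infer that each list instance carries the unit range it is allowed to assign from; a rough sketch of the type (only the two components above appear in the source, the rest is my guesswork):

    TYPE um_file_list_type
      INTEGER :: file_unit_min   ! lowest unit this list may assign
      INTEGER :: file_unit_max   ! highest unit this list may assign
      ! ... plus the list of file objects currently managed
    END TYPE um_file_list_type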

There are three instances of this type, one each for the fortran, portio and netcdf file types:

147	! The instances of the type used as the master list objects by this module,
148	! there are two distinct file lists depending on the type of file
149	!-------------------------------------------------------------------------------
150	TYPE(um_file_list_type), SAVE, TARGET :: um_file_list_fortran
151	TYPE(um_file_list_type), SAVE, TARGET :: um_file_list_portio
152	TYPE(um_file_list_type), SAVE, TARGET :: um_file_list_netcdf

And the unit ranges for each list are apparently defined here:

160	INTEGER, PARAMETER :: start_unit_fortran = 10
161	INTEGER, PARAMETER :: end_unit_fortran = 300
162	
163	! Portio
164	INTEGER, PARAMETER :: start_unit_portio = 10
165	INTEGER, PARAMETER :: end_unit_portio = 300
166	
167	! NetCDF
168	INTEGER, PARAMETER :: start_unit_netcdf = 10
169	INTEGER, PARAMETER :: end_unit_netcdf = 300

This suggests you can’t, by default, use the I/O server with netCDF outputs, as both will want to use unit number 10.
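
To illustrate the clash (this is a toy sketch, not the UM code): if two independent file lists both hand out units starting from the same minimum, their first assignments collide.

    PROGRAM unit_clash_demo
      ! Toy illustration of the overlap: two independent file lists that
      ! both start assigning at unit 10 will both return 10 first.
      IMPLICIT NONE
      INTEGER :: next_portio = 10   ! mirrors start_unit_portio
      INTEGER :: next_netcdf = 10   ! mirrors start_unit_netcdf

      ! The portio list claims its pseudo-unit for UNIX file operations
      PRINT *, 'portio pseudo-unit :', next_portio
      next_portio = next_portio + 1

      ! The netcdf list independently assigns its first output stream
      PRINT *, 'netcdf first unit  :', next_netcdf   ! also 10 -> clash
    END PROGRAM unit_clash_demo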

I can’t see any way in the source code to override these values via a namelist input.

One fix is to set

168	INTEGER, PARAMETER :: start_unit_netcdf = 11

or to set

! A flag which forces the portio list to use unique unit numbers
155	LOGICAL, PUBLIC :: portio_unique_units = .FALSE.

to .TRUE. (why would this be set to .FALSE. by default? :thinking: ), which suggests you have to recompile the code to use the I/O server with netCDF outputs?

I’m very surprised this is the case, but happy to proceed with this if required.

What’s the best way to ask the UKMO about this?
A Trac ticket?
A post to https://cms-helpdesk.ncas.ac.uk ?

Hi @Paul.Gregory,

The default netCDF output is not fit for purpose.

I think it will make better sense to output to fields files and then do a bespoke conversion to netCDF, carefully designed by the research software engineers and scientists (with possibly many flavours). I think this is what @mlipson is going to drive.

Given that, I would suggest postponing talking to the Met Office about the netCDF pipeline, because the support for netCDF output is limited (and not fit for purpose).
