Hi all,
I’m running the u-cy339 tutorial test case for ACCESS-CM2 for the first time, and the coupled task failed with an error I’m not familiar with (from job.err):
????????????????????????????????????????????????????????????????????????????????
???!!!???!!!???!!!???!!!???!!! ERROR ???!!!???!!!???!!!???!!!???!!!
? Error code: 19
? Error from routine: CHECK_IOSTAT
? Error message:
? Error reading namelist domain
? IoMsg: invalid reference to variable in NAMELIST input, unit 14, file /scratch/ng72/jt0319/cylc-run/u-cy339/work/09500101/coupled/ATM_RUNDIR/STASHC, line 2565, position 10
? Please check input list against code.
? Error from processor: 0
? Error number: 10
????????????????????????????????????????????????????????????????????????????????
[0] exceptions: An non-exception application exit occured.
[0] exceptions: whilst in a serial region
[0] exceptions: Task had pid=2950149 on host gadi-cpu-clx-1947.gadi.nci.org.au
[0] exceptions: Program is "/home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/toyatm"
Warning in umPrintMgr: umPrintExceptionHandler : Handler Invoked
gc_abort (Processor 0): Job aborted from ereport.
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 9.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
Does this mean there is an invalid entry in the STASHC namelist file, or in the UM coupling configuration more generally?
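Since the message names a specific line and position in STASHC, one way to narrow it down is to print the lines around that spot. Below is a minimal sketch in plain Python, not part of the UM or Rose tooling. `show_context` is a hypothetical helper name, and I am assuming the reported line and position are both 1-based.

```python
from pathlib import Path


def show_context(path, line_no, column, radius=3):
    """Return the lines around line_no with a caret under the given column.

    Assumption: line_no and column are 1-based, as they appear to be in the
    Fortran IoMsg text. show_context is a hypothetical helper, not a UM tool.
    """
    lines = Path(path).read_text(errors="replace").splitlines()
    start = max(1, line_no - radius)
    end = min(len(lines), line_no + radius)
    out = []
    for n in range(start, end + 1):
        marker = ">" if n == line_no else " "
        # The prefix is 9 characters wide: marker + 6-digit line number + ": "
        out.append(f"{marker}{n:6d}: {lines[n - 1]}")
        if n == line_no:
            out.append(" " * (9 + column - 1) + "^")
    return "\n".join(out)


# Example call, using the file, line and position from the error message:
# print(show_context(
#     "/scratch/ng72/jt0319/cylc-run/u-cy339/work/09500101/coupled/ATM_RUNDIR/STASHC",
#     2565, 10))
```

The caret should land on or near the variable name that the namelist reader rejected, which can then be compared with what the `domain` namelist in the UM build expects.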
Here is the start of the job.err output if that helps:
Using the cylc session localhost
Loading cylc7/24.03
Loading requirement: mosrs-setup/2.0.1
- Package -----------------------------.- Versions --------.- Last mod. -------
Currently Loaded Modulefiles:
mosrs-setup/2.0.1 default 2024/05/19 23:31:13
cylc7/24.03 default 2024/05/06 03:30:21
python2-as-python 2019/11/04 03:02:35
openmpi/4.0.2 2022/02/14 19:20:11
fcm/2019.09.0 2020/12/14 04:10:00
pythonlib/f90nml/1.0.2 2020/12/14 04:15:56
[WARN] file:/home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/ATM_RUNDIR/IDEALISE: skip missing optional source: namelist:idealise
[WARN] file:/home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/ATM_RUNDIR/STASHC: skip missing optional source: namelist:exclude_package(:)
[WARN] file:/home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/ATM_RUNDIR/RECONA: skip missing optional source: namelist:trans(:)
[WARN] file:/home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/ATM_RUNDIR/IOSCNTL: skip missing optional source: namelist:lustre_control
[WARN] file:/home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/ATM_RUNDIR/IOSCNTL: skip missing optional source: namelist:lustre_control_custom_files
+ echo 'ACCESS COUPLED MODEL DRIVER,' /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled
+ [[ -z oasis3_mct ]]
+ [[ -z /home/272/jt0319/cylc-run/u-cy339/share/mom/exec/access-cm2/ACCESS-CM/fms_ACCESS-CM.x ]]
+ ATMOS_LINK=toyatm
+ OCEAN_LINK=mom5xx
+ ICE_LINK=cicexx
+ ln -sf /home/272/jt0319/cylc-run/u-cy339/share/fcm_make_um/build-atmos/bin/um-atmos.exe toyatm
+ ln -sf /home/272/jt0319/cylc-run/u-cy339/share/mom/exec/access-cm2/ACCESS-CM/fms_ACCESS-CM.x mom5xx
+ ln -sf /home/272/jt0319/cylc-run/u-cy339/share/cice/bin/cice5.exe cicexx
+ ATMOS_EXEC=toyatm
+ OCEAN_EXEC=mom5xx
+ export UM_NPES=576
+ UM_NPES=576
+ NPROC_MAX=576
+ export OCN_NPES=80
+ OCN_NPES=80
+ TOT_NPES=672
+ HIST_FILE=cy339.xhist
+ fix_cice_namelist.py /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/ICE_RUNDIR/cice_in.nml /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/ICE_RUNDIR/input_ice.nml
+ [[ 0 != 0 ]]
+ fix_mom_namelist.py /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/OCN_RUNDIR/input.nml
+ [[ 0 != 0 ]]
+ chmod u+w /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/CPL_RUNDIR/namcouple
+ fix_namcouple.py /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/CPL_RUNDIR/namcouple
+ [[ 0 != 0 ]]
+ chmod u+w /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/OCN_RUNDIR/INPUT/diag_table
+ fix_diag_table.py /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/OCN_RUNDIR/INPUT/diag_table
+ [[ 0 != 0 ]]
+ mkdir -p /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/OCN_RUNDIR/RESTART /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/OCN_RUNDIR/HISTORY
+ mkdir -p /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/ICE_RUNDIR/RESTART /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/ICE_RUNDIR/HISTORY
+ [[ 09500101 != 09500101 ]]
+ [[ true == \t\r\u\e ]]
+ [[ true == \f\a\l\s\e ]]
+ [[ true == \t\r\u\e ]]
+ [[ 09500101 == 09500101 ]]
+ echo 'Setting CONTINUE=false for WARM_RESTART_RUN'
+ export CONTINUE=false
+ CONTINUE=false
+ create_rankfile.py
+ export 'ACCESSRUNCMD=--rankfile /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/rankfile -wd /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/ATM_RUNDIR -n 576 /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/toyatm : -wd /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/OCN_RUNDIR -n 80 /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/mom5xx : -wd /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/ICE_RUNDIR -n 16 /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/cicexx'
+ ACCESSRUNCMD='--rankfile /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/rankfile -wd /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/ATM_RUNDIR -n 576 /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/toyatm : -wd /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/OCN_RUNDIR -n 80 /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/mom5xx : -wd /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/ICE_RUNDIR -n 16 /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/cicexx'
+ echo RUNCOMMAND, --rankfile /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/rankfile -wd /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/ATM_RUNDIR -n 576 /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/toyatm : -wd /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/OCN_RUNDIR -n 80 /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/mom5xx : -wd /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/ICE_RUNDIR -n 16 /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/cicexx
+ cd /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/ATM_RUNDIR
+ [[ -z 10.6 ]]
+ [[ ! 10.6 =~ ^([0-9])+\.([0-9])+$ ]]
+ export DR_HOOK=false
+ DR_HOOK=false
+ export DR_HOOK_OPT=noself
+ DR_HOOK_OPT=noself
+ export PRINT_STATUS=PrStatus_Normal
+ PRINT_STATUS=PrStatus_Normal
+ export UM_THREAD_LEVEL=MULTIPLE
+ UM_THREAD_LEVEL=MULTIPLE
+ export HISTORY=/home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/ATM_RUNDIR/History_Data/cy339.xhist
+ HISTORY=/home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/ATM_RUNDIR/History_Data/cy339.xhist
+ [[ false == \f\a\l\s\e ]]
+ rm -f /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/ATM_RUNDIR/History_Data/cy339.xhist
+ export HISTORY_TEMP=thist
+ HISTORY_TEMP=thist
+ export UM_NPES=576
+ UM_NPES=576
+ export NPROC=576
+ NPROC=576
+ export HOUSEKEEP=hkfile
+ HOUSEKEEP=hkfile
+ export STASHC=STASHC
+ STASHC=STASHC
+ export ATMOSCNTL=ATMOSCNTL
+ ATMOSCNTL=ATMOSCNTL
+ export SHARED_NLIST=SHARED
+ SHARED_NLIST=SHARED
+ export ERROR_FLAG=errflag
+ ERROR_FLAG=errflag
+ export STASHMASTER=..
+ STASHMASTER=..
+ export IDEALISE=IDEALISE
+ IDEALISE=IDEALISE
+ export IOSCNTL=IOSCNTL
+ IOSCNTL=IOSCNTL
+ export STDOUT_FILE=pe_output/cy339.fort6.pe
+ STDOUT_FILE=pe_output/cy339.fort6.pe
++ dirname pe_output/cy339.fort6.pe
+ mkdir -p pe_output
+ rm -f pe_output/cy339.fort6.pe0 pe_output/cy339.fort6.pe000
+ SIGNALS=EXIT
+ for S in $SIGNALS
+ trap FINALLY EXIT
+ export OMPI_MCA_hwloc_base_mem_alloc_policy=local_only
+ OMPI_MCA_hwloc_base_mem_alloc_policy=local_only
+ export OMPI_MCA_rmaps_base_mapping_policy=
+ OMPI_MCA_rmaps_base_mapping_policy=
+ mpirun --rankfile /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/rankfile -wd /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/ATM_RUNDIR -n 576 /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/toyatm : -wd /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/OCN_RUNDIR -n 80 /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/mom5xx : -wd /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/ICE_RUNDIR -n 16 /home/272/jt0319/cylc-run/u-cy339/work/09500101/coupled/cicexx
oasis_init_comp: Calling MPI_Init
Cheers,
Joel