Hi there,
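I recently started a new experiment using ACCESS-CM2 (suite u-dv376), but the run failed during the coupled step (the access-coupled task exited with return code 134). I’ve gone through the job.err log to find the cause, but I haven’t been able to resolve it. The failure appears to be a floating-point exception (signal 8, invalid operation) raised in the MOM5 ocean component, in wscale at ocean_vert_kpp_mom4p1.F90:2114, as shown in the messages below.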
Has anyone else encountered a similar issue? I’ve also copied the full job.err log to a public path (/scratch/public/hl1052/job.err) in case anyone would like to take a closer look.
Any help would be greatly appreciated.
Thanks,
Huazhen
oasis_init_comp: Calling MPI_Init
oasis_init_comp: Not Calling MPI_Init
[30] exceptions: feenableexcept() mask [0x00000000] enabled. (mask [0x00000000] requested)
[141] exceptions: feenableexcept() mask [0x00000000] enabled. (mask [0x00000000] requested)
[36] exceptions: feenableexcept() mask [0x00000000] enabled. (mask [0x00000000] requested)
...
????????????????????????????????????????????????????????????????????????????????
?????????????????????????????? WARNING ??????????????????????????????
? Warning code: -10
? Warning from routine: check_configid
? Warning message:
? itab is set to missing data, pp_head(LBEXP) will be set to zero
? Warning from processor: 0
? Warning number: 0
????????????????????????????????????????????????????????????????????????????????
????????????????????????????????????????????????????????????????????????????????
?????????????????????????????? WARNING ??????????????????????????????
? Warning code: -100
? Warning from routine: warn_temp_fixes
? Warning message:
? Model run excludes ticket #646 as l_fix_nh4no3_equilibrium=.FALSE.
? This will affect any model runs which include the formation of
? ammonium nitrate within the CLASSIC aerosol scheme.
? Warning from processor: 0
? Warning number: 1
????????????????????????????????????????????????????????????????????????????????
????????????????????????????????????????????????????????????????????????????????
?????????????????????????????? WARNING ??????????????????????????????
? Warning code: -100
? Warning from routine: warn_temp_fixes
? Warning message: Model run excludes ticket #1017 as
? l_fix_ctile_orog=.FALSE.
? This will affect runs with coastal tiling.
? Warning from processor: 0
? Warning number: 2
????????????????????????????????????????????????????????????????????????????????
????????????????????????????????????????????????????????????????????????????????
?????????????????????????????? WARNING ??????????????????????????????
? Warning code: -100
? Warning from routine: warn_temp_fixes
? Warning message: Model run excludes ticket #1167 as
? l_fix_conv_precip_evap=.FALSE.
? This will affect all runs with parameterised
? precipitating convection; evaporation rates for
? convective precipitation will be underestimated.
? Warning from processor: 0
? Warning number: 3
????????????????????????????????????????????????????????????????????????????????
????????????????????????????????????????????????????????????????????????????????
?????????????????????????????? WARNING ??????????????????????????????
? Warning code: -100
? Warning from routine: warn_temp_fixes
? Warning message:
? Model run excludes ticket #1421 as l_fix_ukca_impscav=.FALSE.
? This will affect any model runs which include UKCA.
? Warning from processor: 0
? Warning number: 4
????????????????????????????????????????????????????????????????????????????????
????????????????????????????????????????????????????????????????????????????????
?????????????????????????????? WARNING ??????????????????????????????
? Warning code: -100
? Warning from routine: warn_temp_fixes
? Warning message:
? Model run excludes ticket #1638 as l_fix_rp_shock_amp=.FALSE.
? This will affect any model using the Random Parameters 2b Scheme
? by doubling the size of the shock amplitude.
? Warning from processor: 0
? Warning number: 5
????????????????????????????????????????????????????????????????????????????????
????????????????????????????????????????????????????????????????????????????????
?????????????????????????????? WARNING ??????????????????????????????
? Warning code: -100
? Warning from routine: warn_temp_fixes
? Warning message:
? Model run excludes ticket #1729 as l_fix_ustar_dust=.FALSE.
? This will affect any model runs which include interactive dust.
? Warning from processor: 0
? Warning number: 6
????????????????????????????????????????????????????????????????????????????????
????????????????????????????????????????????????????????????????????????????????
?????????????????????????????? WARNING ??????????????????????????????
? Warning code: -100
? Warning from routine: warn_temp_fixes
? Warning message:
? Model run excludes ticket #2077 as l_fix_dyndiag=.FALSE.
? This will mess up shear-dominated PBLs slightly
? Warning from processor: 0
? Warning number: 7
????????????????????????????????????????????????????????????????????????????????
????????????????????????????????????????????????????????????????????????????????
?????????????????????????????? WARNING ??????????????????????????????
? Warning code: -100
? Warning from routine: warn_temp_fixes
? Warning message:
? Model run excludes a change from ticket #1958 as
? l_fix_ukca_deriv_init=.FALSE.
? This will affect any model runs which include UKCA Strat+trop scheme.
? Warning from processor: 0
? Warning number: 8
????????????????????????????????????????????????????????????????????????????????
????????????????????????????????????????????????????????????????????????????????
?????????????????????????????? WARNING ??????????????????????????????
? Warning code: -100
? Warning from routine: CHECK_RUN_DIFFUSION
? Warning message:
? cldbase_opt_sh set to sh_wstar_closure since
? Smagorinsky diffusion not chosen.
? Warning from processor: 0
? Warning number: 9
????????????????????????????????????????????????????????????????????????????????
????????????????????????????????????????????????????????????????????????????????
?????????????????????????????? WARNING ??????????????????????????????
? Warning code: -30
? Warning from routine: PRELIM
? Warning message:
? Field - Section:0, Item:252 request denied.
? Unavailable to this model version.
? Warning from processor: 0
? Warning number: 10
????????????????????????????????????????????????????????????????????????????????
WARNING from PE 3: set_date_c: Year zero is invalid. Resetting year to 1
...
????????????????????????????????????????????????????????????????????????????????
?????????????????????????????? WARNING ??????????????????????????????
? Warning code: -1
? Warning from routine: eg_SISL_setcon
? Warning message: Constant gravity enforced
? Warning from processor: 0
? Warning number: 11
????????????????????????????????????????????????????????????????????????????????
????????????????????????????????????????????????????????????????????????????????
?????????????????????????????? WARNING ??????????????????????????????
? Warning code: -1
? Warning from routine: OASIS_INITA2O
? Warning message: Coupling field not enabled - 3D CO2 STASH code:0,252
? Warning from processor: 0
? Warning number: 12
????????????????????????????????????????????????????????????????????????????????
[gadi-cpu-bdw-0258:653656:0:653656] Caught signal 8 (Floating point exception: floating-point invalid operation)
[gadi-cpu-bdw-0258:653648:0:653648] Caught signal 8 (Floating point exception: floating-point invalid operation)
[gadi-cpu-bdw-0258:653637:0:653637] Caught signal 8 (Floating point exception: floating-point invalid operation)
==== backtrace (tid: 653637) ====
0 0x0000000000012990 __funlockfile() :0
1 0x00000000008fd36a ocean_vert_kpp_mom4p1_mod_mp_wscale_() /home/561/hl1052/cylc-run/u-dv376/share/mom/src/mom5/ocean_param/vertical/ocean_vert_kpp_mom4p1.F90:2114
2 0x00000000008f64fb ocean_vert_kpp_mom4p1_mod_mp_bldepth_() /home/561/hl1052/cylc-run/u-dv376/share/mom/src/mom5/ocean_param/vertical/ocean_vert_kpp_mom4p1.F90:1719
3 0x00000000008e09ff ocean_vert_kpp_mom4p1_mod_mp_vert_mix_kpp_mom4p1_() /home/561/hl1052/cylc-run/u-dv376/share/mom/src/mom5/ocean_param/vertical/ocean_vert_kpp_mom4p1.F90:1278
4 0x000000000078307d ocean_vert_mix_mod_mp_vert_mix_coeff_() /home/561/hl1052/cylc-run/u-dv376/share/mom/src/mom5/ocean_param/vertical/ocean_vert_mix.F90:3028
5 0x000000000044e237 ocean_model_mod_mp_update_ocean_model_() /home/561/hl1052/cylc-run/u-dv376/share/mom/src/mom5/ocean_core/ocean_model.F90:1626
6 0x0000000000416592 MAIN__() /home/561/hl1052/cylc-run/u-dv376/share/mom/src/accesscm_coupler/ocean_solo.F90:471
7 0x000000000040f722 main() ???:0
8 0x000000000003a7e5 __libc_start_main() ???:0
9 0x000000000040f62e _start() ???:0
=================================
forrtl: error (75): floating point exception
Image PC Routine Line Source
fms_ACCESS-CM.x 0000000001D7BB94 Unknown Unknown Unknown
libpthread-2.28.s 000014BFD28A4990 Unknown Unknown Unknown
fms_ACCESS-CM.x 00000000008FD36A ocean_vert_kpp_mo 2114 ocean_vert_kpp_mom4p1.F90
fms_ACCESS-CM.x 00000000008F64FB ocean_vert_kpp_mo 1719 ocean_vert_kpp_mom4p1.F90
fms_ACCESS-CM.x 00000000008E09FF ocean_vert_kpp_mo 1278 ocean_vert_kpp_mom4p1.F90
fms_ACCESS-CM.x 000000000078307D ocean_vert_mix_mo 3028 ocean_vert_mix.F90
fms_ACCESS-CM.x 000000000044E237 ocean_model_mod_m 1626 ocean_model.F90
fms_ACCESS-CM.x 0000000000416592 MAIN__ 471 ocean_solo.F90
fms_ACCESS-CM.x 000000000040F722 Unknown Unknown Unknown
libc-2.28.so 000014BFD24F57E5 __libc_start_main Unknown Unknown
fms_ACCESS-CM.x 000000000040F62E Unknown Unknown Unknown
==== backtrace (tid: 653648) ====
==== backtrace (tid: 653656) ====
0 0x0000000000012990 __funlockfile() :0
1 0x00000000008fd36a ocean_vert_kpp_mom4p1_mod_mp_wscale_() /home/561/hl1052/cylc-run/u-dv376/share/mom/src/mom5/ocean_param/vertical/ocean_vert_kpp_mom4p1.F90:2114
2 0x00000000008f64fb ocean_vert_kpp_mom4p1_mod_mp_bldepth_() /home/561/hl1052/cylc-run/u-dv376/share/mom/src/mom5/ocean_param/vertical/ocean_vert_kpp_mom4p1.F90:1719
3 0x00000000008e09ff ocean_vert_kpp_mom4p1_mod_mp_vert_mix_kpp_mom4p1_() /home/561/hl1052/cylc-run/u-dv376/share/mom/src/mom5/ocean_param/vertical/ocean_vert_kpp_mom4p1.F90:1278
4 0x000000000078307d ocean_vert_mix_mod_mp_vert_mix_coeff_() /home/561/hl1052/cylc-run/u-dv376/share/mom/src/mom5/ocean_param/vertical/ocean_vert_mix.F90:3028
5 0x000000000044e237 ocean_model_mod_mp_update_ocean_model_() /home/561/hl1052/cylc-run/u-dv376/share/mom/src/mom5/ocean_core/ocean_model.F90:1626
6 0x0000000000416592 MAIN__() /home/561/hl1052/cylc-run/u-dv376/share/mom/src/accesscm_coupler/ocean_solo.F90:471
7 0x000000000040f722 main() ???:0
8 0x000000000003a7e5 __libc_start_main() ???:0
9 0x000000000040f62e _start() ???:0
=================================
0 0x0000000000012990 __funlockfile() :0
1 0x00000000008fd36a ocean_vert_kpp_mom4p1_mod_mp_wscale_() /home/561/hl1052/cylc-run/u-dv376/share/mom/src/mom5/ocean_param/vertical/ocean_vert_kpp_mom4p1.F90:2114
2 0x00000000008f64fb ocean_vert_kpp_mom4p1_mod_mp_bldepth_() /home/561/hl1052/cylc-run/u-dv376/share/mom/src/mom5/ocean_param/vertical/ocean_vert_kpp_mom4p1.F90:1719
3 0x00000000008e09ff ocean_vert_kpp_mom4p1_mod_mp_vert_mix_kpp_mom4p1_() /home/561/hl1052/cylc-run/u-dv376/share/mom/src/mom5/ocean_param/vertical/ocean_vert_kpp_mom4p1.F90:1278
4 0x000000000078307d ocean_vert_mix_mod_mp_vert_mix_coeff_() /home/561/hl1052/cylc-run/u-dv376/share/mom/src/mom5/ocean_param/vertical/ocean_vert_mix.F90:3028
5 0x000000000044e237 ocean_model_mod_mp_update_ocean_model_() /home/561/hl1052/cylc-run/u-dv376/share/mom/src/mom5/ocean_core/ocean_model.F90:1626
6 0x0000000000416592 MAIN__() /home/561/hl1052/cylc-run/u-dv376/share/mom/src/accesscm_coupler/ocean_solo.F90:471
7 0x000000000040f722 main() ???:0
8 0x000000000003a7e5 __libc_start_main() ???:0
9 0x000000000040f62e _start() ???:0
=================================
forrtl: error (75): floating point exception
Image PC Routine Line Source
fms_ACCESS-CM.x 0000000001D7BB94 Unknown Unknown Unknown
libpthread-2.28.s 000015058B64F990 Unknown Unknown Unknown
fms_ACCESS-CM.x 00000000008FD36A ocean_vert_kpp_mo 2114 ocean_vert_kpp_mom4p1.F90
fms_ACCESS-CM.x 00000000008F64FB ocean_vert_kpp_mo 1719 ocean_vert_kpp_mom4p1.F90
fms_ACCESS-CM.x 00000000008E09FF ocean_vert_kpp_mo 1278 ocean_vert_kpp_mom4p1.F90
fms_ACCESS-CM.x 000000000078307D ocean_vert_mix_mo 3028 ocean_vert_mix.F90
fms_ACCESS-CM.x 000000000044E237 ocean_model_mod_m 1626 ocean_model.F90
fms_ACCESS-CM.x 0000000000416592 MAIN__ 471 ocean_solo.F90
fms_ACCESS-CM.x 000000000040F722 Unknown Unknown Unknown
libc-2.28.so 000015058B2A07E5 __libc_start_main Unknown Unknown
fms_ACCESS-CM.x 000000000040F62E Unknown Unknown Unknown
forrtl: error (75): floating point exception
Image PC Routine Line Source
fms_ACCESS-CM.x 0000000001D7BB94 Unknown Unknown Unknown
libpthread-2.28.s 000014A429D07990 Unknown Unknown Unknown
fms_ACCESS-CM.x 00000000008FD36A ocean_vert_kpp_mo 2114 ocean_vert_kpp_mom4p1.F90
fms_ACCESS-CM.x 00000000008F64FB ocean_vert_kpp_mo 1719 ocean_vert_kpp_mom4p1.F90
fms_ACCESS-CM.x 00000000008E09FF ocean_vert_kpp_mo 1278 ocean_vert_kpp_mom4p1.F90
fms_ACCESS-CM.x 000000000078307D ocean_vert_mix_mo 3028 ocean_vert_mix.F90
fms_ACCESS-CM.x 000000000044E237 ocean_model_mod_m 1626 ocean_model.F90
fms_ACCESS-CM.x 0000000000416592 MAIN__ 471 ocean_solo.F90
fms_ACCESS-CM.x 000000000040F722 Unknown Unknown Unknown
libc-2.28.so 000014A4299587E5 __libc_start_main Unknown Unknown
fms_ACCESS-CM.x 000000000040F62E Unknown Unknown Unknown
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
forrtl: error (78): process killed (SIGTERM)
Image PC Routine Line Source
libifcoremt.so.5 0000149C7ADBB555 for__signal_handl Unknown Unknown
libpthread-2.28.s 0000149C7805B990 Unknown Unknown Unknown
um-atmos.exe 0000000000C7F792 emiss_io_mod_mp_e 2402 emiss_io_mod.F90
um-atmos.exe 0000000001595718 easyaerosol_read_ 1505 easyaerosol_read_input_mod.F90
um-atmos.exe 000000000159454F easyaerosol_read_ 162 easyaerosol_read_input_mod.F90
um-atmos.exe 00000000009F99B3 atm_step_4a_ 1907 atm_step_4A.F90
um-atmos.exe 0000000000497A5E u_model_4a_ 375 u_model_4A.F90
um-atmos.exe 000000000040B861 um_shell_ 653 um_shell.F90
um-atmos.exe 000000000040865A MAIN__ 28 um_main.F90
um-atmos.exe 0000000000408602 Unknown Unknown Unknown
libc-2.28.so 0000149C77CAC7E5 __libc_start_main Unknown Unknown
um-atmos.exe 000000000040850E Unknown Unknown Unknown
...
--------------------------------------------------------------------------
mpirun noticed that process rank 620 with PID 653648 on node gadi-cpu-bdw-0258 exited on signal 6 (Aborted).
--------------------------------------------------------------------------
+ FINALLY
+ for S in $SIGNALS
+ trap '' EXIT
++ echo 575
++ sed s/./0/g
+ PE0_SUFFIX=000
+ UM_PE0_STDOUT_FILE=pe_output/dv376.fort6.pe000
+ [[ -s pe_output/dv376.fort6.pe000 ]]
+ echo '%PE0 OUTPUT%'
+ cat pe_output/dv376.fort6.pe000
+ [[ true == \f\a\l\s\e ]]
+ [[ -f pe_output/dv376.fort6.pe000 ]]
+ [[ pe_output/dv376.fort6.pe000 != pe_output/dv376.fort6.pe0 ]]
++ basename pe_output/dv376.fort6.pe
+ ln -sf dv376.fort6.pe000 pe_output/dv376.fort6.pe0
[FAIL] access-coupled # return-code=134
2025-12-11T05:11:41Z CRITICAL - failed/EXIT