ACCESS-CM2 run failed in coupled step

Hi there,

I recently started a new experiment using ACCESS-CM2, but the run failed during the coupled step. I’ve checked the job.err log to identify the source of the problem, but I still haven’t been able to resolve it. The error appears to be related to the messages shown below.

Has anyone else encountered a similar issue? I’ve also copied the full job.err log to a public path (/scratch/public/hl1052/job.err) in case anyone would like to take a closer look.

Any help would be greatly appreciated.

Thanks,

Huazhen

oasis_init_comp: Calling MPI_Init
oasis_init_comp: Not Calling MPI_Init
[30] exceptions: feenableexcept() mask [0x00000000] enabled. (mask [0x00000000] requested)
[141] exceptions: feenableexcept() mask [0x00000000] enabled. (mask [0x00000000] requested)
[36] exceptions: feenableexcept() mask [0x00000000] enabled. (mask [0x00000000] requested)
...
????????????????????????????????????????????????????????????????????????????????
??????????????????????????????      WARNING       ??????????????????????????????
?  Warning code: -10
?  Warning from routine: check_configid
?  Warning message:
?          itab is set to missing data, pp_head(LBEXP) will be set to zero
?  Warning from processor: 0
?  Warning number: 0
????????????????????????????????????????????????????????????????????????????????


????????????????????????????????????????????????????????????????????????????????
??????????????????????????????      WARNING       ??????????????????????????????
?  Warning code: -100
?  Warning from routine: warn_temp_fixes
?  Warning message:
?          Model run excludes ticket #646 as l_fix_nh4no3_equilibrium=.FALSE.
?          This will affect any model runs which include the formation of
?          ammonium nitrate within the CLASSIC aerosol scheme.
?  Warning from processor: 0
?  Warning number: 1
????????????????????????????????????????????????????????????????????????????????


????????????????????????????????????????????????????????????????????????????????
??????????????????????????????      WARNING       ??????????????????????????????
?  Warning code: -100
?  Warning from routine: warn_temp_fixes
?  Warning message: Model run excludes ticket #1017 as
?          l_fix_ctile_orog=.FALSE.
?          This will affect runs with coastal tiling.
?  Warning from processor: 0
?  Warning number: 2
????????????????????????????????????????????????????????????????????????????????


????????????????????????????????????????????????????????????????????????????????
??????????????????????????????      WARNING       ??????????????????????????????
?  Warning code: -100
?  Warning from routine: warn_temp_fixes
?  Warning message: Model run excludes ticket #1167 as
?          l_fix_conv_precip_evap=.FALSE.
?          This will affect all runs with parameterised
?          precipitating convection; evaporation rates for
?          convective precipitation will be underestimated.
?  Warning from processor: 0
?  Warning number: 3
????????????????????????????????????????????????????????????????????????????????


????????????????????????????????????????????????????????????????????????????????
??????????????????????????????      WARNING       ??????????????????????????????
?  Warning code: -100
?  Warning from routine: warn_temp_fixes
?  Warning message:
?          Model run excludes ticket #1421 as l_fix_ukca_impscav=.FALSE.
?          This will affect any model runs which include UKCA.
?  Warning from processor: 0
?  Warning number: 4
????????????????????????????????????????????????????????????????????????????????


????????????????????????????????????????????????????????????????????????????????
??????????????????????????????      WARNING       ??????????????????????????????
?  Warning code: -100
?  Warning from routine: warn_temp_fixes
?  Warning message:
?          Model run excludes ticket #1638 as l_fix_rp_shock_amp=.FALSE.
?          This will affect any model using the Random Parameters 2b Scheme
?          by doubling the size of the shock amplitude.
?  Warning from processor: 0
?  Warning number: 5
????????????????????????????????????????????????????????????????????????????????


????????????????????????????????????????????????????????????????????????????????
??????????????????????????????      WARNING       ??????????????????????????????
?  Warning code: -100
?  Warning from routine: warn_temp_fixes
?  Warning message:
?          Model run excludes ticket #1729 as l_fix_ustar_dust=.FALSE.
?          This will affect any model runs which include interactive dust.
?  Warning from processor: 0
?  Warning number: 6
????????????????????????????????????????????????????????????????????????????????


????????????????????????????????????????????????????????????????????????????????
??????????????????????????????      WARNING       ??????????????????????????????
?  Warning code: -100
?  Warning from routine: warn_temp_fixes
?  Warning message:
?          Model run excludes ticket #2077 as l_fix_dyndiag=.FALSE.
?          This will mess up shear-dominated PBLs slightly
?  Warning from processor: 0
?  Warning number: 7
????????????????????????????????????????????????????????????????????????????????


????????????????????????????????????????????????????????????????????????????????
??????????????????????????????      WARNING       ??????????????????????????????
?  Warning code: -100
?  Warning from routine: warn_temp_fixes
?  Warning message:
?          Model run excludes a change from ticket #1958 as
?          l_fix_ukca_deriv_init=.FALSE.
?          This will affect any model runs which include UKCA Strat+trop scheme.
?  Warning from processor: 0
?  Warning number: 8
????????????????????????????????????????????????????????????????????????????????


????????????????????????????????????????????????????????????????????????????????
??????????????????????????????      WARNING       ??????????????????????????????
?  Warning code: -100
?  Warning from routine: CHECK_RUN_DIFFUSION
?  Warning message:
?          cldbase_opt_sh set to sh_wstar_closure since
?          Smagorinsky diffusion not chosen.
?  Warning from processor: 0
?  Warning number: 9
????????????????????????????????????????????????????????????????????????????????


????????????????????????????????????????????????????????????????????????????????
??????????????????????????????      WARNING       ??????????????????????????????
?  Warning code: -30
?  Warning from routine: PRELIM
?  Warning message:
?          Field - Section:0, Item:252 request denied.
?          Unavailable to this model version.
?  Warning from processor: 0
?  Warning number: 10
????????????????????????????????????????????????????????????????????????????????


WARNING from PE     3: set_date_c: Year zero is invalid. Resetting year to 1
...
????????????????????????????????????????????????????????????????????????????????
??????????????????????????????      WARNING       ??????????????????????????????
?  Warning code: -1
?  Warning from routine: eg_SISL_setcon
?  Warning message:  Constant gravity enforced
?  Warning from processor: 0
?  Warning number: 11
????????????????????????????????????????????????????????????????????????????????


????????????????????????????????????????????????????????????????????????????????
??????????????????????????????      WARNING       ??????????????????????????????
?  Warning code: -1
?  Warning from routine: OASIS_INITA2O
?  Warning message: Coupling field not enabled - 3D CO2 STASH code:0,252
?  Warning from processor: 0
?  Warning number: 12
????????????????????????????????????????????????????????????????????????????????

[gadi-cpu-bdw-0258:653656:0:653656] Caught signal 8 (Floating point exception: floating-point invalid operation)
[gadi-cpu-bdw-0258:653648:0:653648] Caught signal 8 (Floating point exception: floating-point invalid operation)
[gadi-cpu-bdw-0258:653637:0:653637] Caught signal 8 (Floating point exception: floating-point invalid operation)
==== backtrace (tid: 653637) ====
 0 0x0000000000012990 __funlockfile()  :0
 1 0x00000000008fd36a ocean_vert_kpp_mom4p1_mod_mp_wscale_()  /home/561/hl1052/cylc-run/u-dv376/share/mom/src/mom5/ocean_param/vertical/ocean_vert_kpp_mom4p1.F90:2114
 2 0x00000000008f64fb ocean_vert_kpp_mom4p1_mod_mp_bldepth_()  /home/561/hl1052/cylc-run/u-dv376/share/mom/src/mom5/ocean_param/vertical/ocean_vert_kpp_mom4p1.F90:1719
 3 0x00000000008e09ff ocean_vert_kpp_mom4p1_mod_mp_vert_mix_kpp_mom4p1_()  /home/561/hl1052/cylc-run/u-dv376/share/mom/src/mom5/ocean_param/vertical/ocean_vert_kpp_mom4p1.F90:1278
 4 0x000000000078307d ocean_vert_mix_mod_mp_vert_mix_coeff_()  /home/561/hl1052/cylc-run/u-dv376/share/mom/src/mom5/ocean_param/vertical/ocean_vert_mix.F90:3028
 5 0x000000000044e237 ocean_model_mod_mp_update_ocean_model_()  /home/561/hl1052/cylc-run/u-dv376/share/mom/src/mom5/ocean_core/ocean_model.F90:1626
 6 0x0000000000416592 MAIN__()  /home/561/hl1052/cylc-run/u-dv376/share/mom/src/accesscm_coupler/ocean_solo.F90:471
 7 0x000000000040f722 main()  ???:0
 8 0x000000000003a7e5 __libc_start_main()  ???:0
 9 0x000000000040f62e _start()  ???:0
=================================
forrtl: error (75): floating point exception
Image              PC                Routine            Line        Source             
fms_ACCESS-CM.x    0000000001D7BB94  Unknown               Unknown  Unknown
libpthread-2.28.s  000014BFD28A4990  Unknown               Unknown  Unknown
fms_ACCESS-CM.x    00000000008FD36A  ocean_vert_kpp_mo        2114  ocean_vert_kpp_mom4p1.F90
fms_ACCESS-CM.x    00000000008F64FB  ocean_vert_kpp_mo        1719  ocean_vert_kpp_mom4p1.F90
fms_ACCESS-CM.x    00000000008E09FF  ocean_vert_kpp_mo        1278  ocean_vert_kpp_mom4p1.F90
fms_ACCESS-CM.x    000000000078307D  ocean_vert_mix_mo        3028  ocean_vert_mix.F90
fms_ACCESS-CM.x    000000000044E237  ocean_model_mod_m        1626  ocean_model.F90
fms_ACCESS-CM.x    0000000000416592  MAIN__                    471  ocean_solo.F90
fms_ACCESS-CM.x    000000000040F722  Unknown               Unknown  Unknown
libc-2.28.so       000014BFD24F57E5  __libc_start_main     Unknown  Unknown
fms_ACCESS-CM.x    000000000040F62E  Unknown               Unknown  Unknown
==== backtrace (tid: 653648) ====
==== backtrace (tid: 653656) ====
 0 0x0000000000012990 __funlockfile()  :0
 1 0x00000000008fd36a ocean_vert_kpp_mom4p1_mod_mp_wscale_()  /home/561/hl1052/cylc-run/u-dv376/share/mom/src/mom5/ocean_param/vertical/ocean_vert_kpp_mom4p1.F90:2114
 2 0x00000000008f64fb ocean_vert_kpp_mom4p1_mod_mp_bldepth_()  /home/561/hl1052/cylc-run/u-dv376/share/mom/src/mom5/ocean_param/vertical/ocean_vert_kpp_mom4p1.F90:1719
 3 0x00000000008e09ff ocean_vert_kpp_mom4p1_mod_mp_vert_mix_kpp_mom4p1_()  /home/561/hl1052/cylc-run/u-dv376/share/mom/src/mom5/ocean_param/vertical/ocean_vert_kpp_mom4p1.F90:1278
 4 0x000000000078307d ocean_vert_mix_mod_mp_vert_mix_coeff_()  /home/561/hl1052/cylc-run/u-dv376/share/mom/src/mom5/ocean_param/vertical/ocean_vert_mix.F90:3028
 5 0x000000000044e237 ocean_model_mod_mp_update_ocean_model_()  /home/561/hl1052/cylc-run/u-dv376/share/mom/src/mom5/ocean_core/ocean_model.F90:1626
 6 0x0000000000416592 MAIN__()  /home/561/hl1052/cylc-run/u-dv376/share/mom/src/accesscm_coupler/ocean_solo.F90:471
 7 0x000000000040f722 main()  ???:0
 8 0x000000000003a7e5 __libc_start_main()  ???:0
 9 0x000000000040f62e _start()  ???:0
=================================
 0 0x0000000000012990 __funlockfile()  :0
 1 0x00000000008fd36a ocean_vert_kpp_mom4p1_mod_mp_wscale_()  /home/561/hl1052/cylc-run/u-dv376/share/mom/src/mom5/ocean_param/vertical/ocean_vert_kpp_mom4p1.F90:2114
 2 0x00000000008f64fb ocean_vert_kpp_mom4p1_mod_mp_bldepth_()  /home/561/hl1052/cylc-run/u-dv376/share/mom/src/mom5/ocean_param/vertical/ocean_vert_kpp_mom4p1.F90:1719
 3 0x00000000008e09ff ocean_vert_kpp_mom4p1_mod_mp_vert_mix_kpp_mom4p1_()  /home/561/hl1052/cylc-run/u-dv376/share/mom/src/mom5/ocean_param/vertical/ocean_vert_kpp_mom4p1.F90:1278
 4 0x000000000078307d ocean_vert_mix_mod_mp_vert_mix_coeff_()  /home/561/hl1052/cylc-run/u-dv376/share/mom/src/mom5/ocean_param/vertical/ocean_vert_mix.F90:3028
 5 0x000000000044e237 ocean_model_mod_mp_update_ocean_model_()  /home/561/hl1052/cylc-run/u-dv376/share/mom/src/mom5/ocean_core/ocean_model.F90:1626
 6 0x0000000000416592 MAIN__()  /home/561/hl1052/cylc-run/u-dv376/share/mom/src/accesscm_coupler/ocean_solo.F90:471
 7 0x000000000040f722 main()  ???:0
 8 0x000000000003a7e5 __libc_start_main()  ???:0
 9 0x000000000040f62e _start()  ???:0
=================================
forrtl: error (75): floating point exception
Image              PC                Routine            Line        Source             
fms_ACCESS-CM.x    0000000001D7BB94  Unknown               Unknown  Unknown
libpthread-2.28.s  000015058B64F990  Unknown               Unknown  Unknown
fms_ACCESS-CM.x    00000000008FD36A  ocean_vert_kpp_mo        2114  ocean_vert_kpp_mom4p1.F90
fms_ACCESS-CM.x    00000000008F64FB  ocean_vert_kpp_mo        1719  ocean_vert_kpp_mom4p1.F90
fms_ACCESS-CM.x    00000000008E09FF  ocean_vert_kpp_mo        1278  ocean_vert_kpp_mom4p1.F90
fms_ACCESS-CM.x    000000000078307D  ocean_vert_mix_mo        3028  ocean_vert_mix.F90
fms_ACCESS-CM.x    000000000044E237  ocean_model_mod_m        1626  ocean_model.F90
fms_ACCESS-CM.x    0000000000416592  MAIN__                    471  ocean_solo.F90
fms_ACCESS-CM.x    000000000040F722  Unknown               Unknown  Unknown
libc-2.28.so       000015058B2A07E5  __libc_start_main     Unknown  Unknown
fms_ACCESS-CM.x    000000000040F62E  Unknown               Unknown  Unknown
forrtl: error (75): floating point exception
Image              PC                Routine            Line        Source             
fms_ACCESS-CM.x    0000000001D7BB94  Unknown               Unknown  Unknown
libpthread-2.28.s  000014A429D07990  Unknown               Unknown  Unknown
fms_ACCESS-CM.x    00000000008FD36A  ocean_vert_kpp_mo        2114  ocean_vert_kpp_mom4p1.F90
fms_ACCESS-CM.x    00000000008F64FB  ocean_vert_kpp_mo        1719  ocean_vert_kpp_mom4p1.F90
fms_ACCESS-CM.x    00000000008E09FF  ocean_vert_kpp_mo        1278  ocean_vert_kpp_mom4p1.F90
fms_ACCESS-CM.x    000000000078307D  ocean_vert_mix_mo        3028  ocean_vert_mix.F90
fms_ACCESS-CM.x    000000000044E237  ocean_model_mod_m        1626  ocean_model.F90
fms_ACCESS-CM.x    0000000000416592  MAIN__                    471  ocean_solo.F90
fms_ACCESS-CM.x    000000000040F722  Unknown               Unknown  Unknown
libc-2.28.so       000014A4299587E5  __libc_start_main     Unknown  Unknown
fms_ACCESS-CM.x    000000000040F62E  Unknown               Unknown  Unknown
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
forrtl: error (78): process killed (SIGTERM)
Image              PC                Routine            Line        Source             
libifcoremt.so.5   0000149C7ADBB555  for__signal_handl     Unknown  Unknown
libpthread-2.28.s  0000149C7805B990  Unknown               Unknown  Unknown
um-atmos.exe       0000000000C7F792  emiss_io_mod_mp_e        2402  emiss_io_mod.F90
um-atmos.exe       0000000001595718  easyaerosol_read_        1505  easyaerosol_read_input_mod.F90
um-atmos.exe       000000000159454F  easyaerosol_read_         162  easyaerosol_read_input_mod.F90
um-atmos.exe       00000000009F99B3  atm_step_4a_             1907  atm_step_4A.F90
um-atmos.exe       0000000000497A5E  u_model_4a_               375  u_model_4A.F90
um-atmos.exe       000000000040B861  um_shell_                 653  um_shell.F90
um-atmos.exe       000000000040865A  MAIN__                     28  um_main.F90
um-atmos.exe       0000000000408602  Unknown               Unknown  Unknown
libc-2.28.so       0000149C77CAC7E5  __libc_start_main     Unknown  Unknown
um-atmos.exe       000000000040850E  Unknown               Unknown  Unknown
...
--------------------------------------------------------------------------
mpirun noticed that process rank 620 with PID 653648 on node gadi-cpu-bdw-0258 exited on signal 6 (Aborted).
--------------------------------------------------------------------------
+ FINALLY
+ for S in $SIGNALS
+ trap '' EXIT
++ echo 575
++ sed s/./0/g
+ PE0_SUFFIX=000
+ UM_PE0_STDOUT_FILE=pe_output/dv376.fort6.pe000
+ [[ -s pe_output/dv376.fort6.pe000 ]]
+ echo '%PE0 OUTPUT%'
+ cat pe_output/dv376.fort6.pe000
+ [[ true == \f\a\l\s\e ]]
+ [[ -f pe_output/dv376.fort6.pe000 ]]
+ [[ pe_output/dv376.fort6.pe000 != pe_output/dv376.fort6.pe0 ]]
++ basename pe_output/dv376.fort6.pe
+ ln -sf dv376.fort6.pe000 pe_output/dv376.fort6.pe0
[FAIL] access-coupled # return-code=134
2025-12-11T05:11:41Z CRITICAL - failed/EXIT
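
For anyone reading a similar job.err, the first "Caught signal" line and the first backtrace frame that carries a source path are usually the quickest pointer to where a component crashed. Below is a minimal sketch of that lookup, not part of any ACCESS-CM2 tooling. The function name `first_crash_frame` and the regular expressions are my own assumptions, based only on the line layout visible in the log above.

```python
import re

# Assumed layout of a "Caught signal" line, as seen in the log above.
SIGNAL_RE = re.compile(r"Caught signal (\d+) \((.+)\)")
# Assumed layout of a backtrace frame that has a real source path:
#   " 1 0x00000000008fd36a routine_name()  /path/to/file.F90:2114"
FRAME_RE = re.compile(r"^\s*\d+\s+0x[0-9a-fA-F]+\s+(\S+)\(\)\s+(/\S+):(\d+)\s*$")


def first_crash_frame(lines):
    """Return (signal_text, routine, source_path, line_no) for the first
    backtrace frame with a source path after the first 'Caught signal'
    line, or None if no such frame is found."""
    signal_text = None
    for line in lines:
        if signal_text is None:
            match = SIGNAL_RE.search(line)
            if match:
                signal_text = match.group(2)
            continue
        match = FRAME_RE.match(line)
        if match:
            return signal_text, match.group(1), match.group(2), int(match.group(3))
    return None


if __name__ == "__main__":
    # Hypothetical local copy of the log; adjust the path as needed.
    with open("job.err", errors="replace") as handle:
        print(first_crash_frame(handle))
```

Applied to the log above, this should point at `ocean_vert_kpp_mom4p1_mod_mp_wscale_` in `ocean_vert_kpp_mom4p1.F90`, line 2114. That is, the floating-point exception is raised inside the MOM5 ocean KPP vertical mixing code, and the later SIGTERM in `um-atmos.exe` is the atmosphere being killed after the ocean had already aborted.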

No further assistance is needed. I figured out the issue: the problem was caused by some input files I had modified.
