anton
(Anton Steketee, ACCESS-NRI Oceans Team)
14 May 2025 22:44
21
Hi Paul
Yes, it sounds like you are on the right track.
access3 needs to be in BOTH the packages and develop sections of spack.yaml.
The configurations get set in the packages:access3:require: section, not the develop section.
So, don’t do this:
Instead use:
access3:
  require:
    - '@git.2025.03.0'
    - configurations=MOM6
    # - fflags="-O0 -g -check bounds -check pointers -fpe0 -check noarg_temp_created"
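The rule above (access3 present under both packages and develop, with variants set only under packages:access3:require) can be checked mechanically. This is a minimal Python sketch, assuming the spack.yaml has already been parsed into a dictionary. The helper name check_access3, the env literal and the layout of the develop entry are hypothetical illustrations, not a copy of the ACCESS-OM3 file.

```python
def check_access3(env, name="access3"):
    """Return a list of problems found for `name` in a parsed spack.yaml dict."""
    spack = env.get("spack", {})
    packages = spack.get("packages", {})
    develop = spack.get("develop", {})
    problems = []
    if name not in packages:
        problems.append(f"{name} missing from packages")
    if name not in develop:
        problems.append(f"{name} missing from develop")
    # Variants such as configurations=... belong in packages:<name>:require
    requires = packages.get(name, {}).get("require", [])
    if not any(str(r).startswith("configurations=") for r in requires):
        problems.append(f"no configurations= entry in packages:{name}:require")
    return problems


# Hypothetical, already-parsed environment (illustrative values only)
env = {
    "spack": {
        "packages": {
            "access3": {"require": ["@git.2025.03.0", "configurations=MOM6"]},
        },
        "develop": {"access3": {"spec": "access3@git.2025.03.0"}},
    }
}
print(check_access3(env))  # prints [] when nothing is missing
```

An empty list means both sections mention the package and the require list carries a configurations= entry.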
And revert this:
anton
(Anton Steketee, ACCESS-NRI Oceans Team)
16 May 2025 05:14
22
Hi Paul
Apologies - we need to remove -check uninit from the access-mom6 and access3-share packages as well. See example here: rOM3-MOM6 build test by anton-seaice · Pull Request #86 · ACCESS-NRI/ACCESS-OM3 · GitHub
The access3 and access3-share spack packages actually both reference the same git repository (GitHub - ACCESS-NRI/access3-share: Shared code for access 3 models using NUOPC + CMEPS), so the git ref can be the same for both packages (to use the commit without -check uninit). I added a commit to remove -check uninit from MOM6 as well; see the example above.
Anton
Hi Anton.
I had a good chat with @harshula this morning, and I'm watching all the work you're doing via the pull requests, issue comments and GitHub CI build logs on ACCESS-OM3.
Do you still want me working on building locally? I can see you and Harshula are trying a few options (including checking the old ifort debugger against oneapi).
anton
(Anton Steketee, ACCESS-NRI Oceans Team)
16 May 2025 05:47
24
You should be able to use the executable in:
module use /g/data/vk83/prerelease/modules
module load access-om3/pr86-31
for now, assuming the debugger works with oneapi?
That’s a very good question.
Gadi has padb and TotalView installed. I guess I'll find out next week.
Cheers
anton
(Anton Steketee, ACCESS-NRI Oceans Team)
16 May 2025 07:09
26
At some point Linaro DDT was mooted:
Linaro Forge HPC Tools... - NCI Help - Opus - NCI Confluence …
I don’t know of anyone who has got it set up though.
anton
(Anton Steketee, ACCESS-NRI Oceans Team)
22 May 2025 23:19
27
module use /g/data/vk83/prerelease/modules
module load access-om3/pr86-31
Hi Paul - I’ve closed this PR so that the previous 30 failed attempts at a build get cleaned up, but that also deletes the one build you could use. If you want to use it, just make a new draft PR from the same branch.
Hi @anton
I’ve finished with the ACCESS rAM3 flagship for a while, so now I have time to get back to this.
I’ve created a PR from GitHub - ACCESS-NRI/ACCESS-OM3 at om3-mom6-debug-example.
The CI failed though: Om3 mom6 debug executable by Rush74 · Pull Request #128 · ACCESS-NRI/ACCESS-OM3 · GitHub
I had to manually resolve config/versions.json and spack.yaml, and it’s probably the latter that is causing problems.
Hoping you can help.
OK, thanks for pointing out this version compiled earlier: MOM6 symmetric with updated MOM6 version by claireyung · Pull Request #120 · ACCESS-NRI/ACCESS-OM3 · GitHub
$ module use /g/data/vk83/prerelease/modules
$ module load access-om3/pr120-19
$ echo $PATH
/g/data/vk83/prerelease/apps/spack/0.22/release/linux-rocky8-x86_64_v4/oneapi-2025.2.0/access3-2025.03.1-lnyu77ayu4jfq6bwusmzwiewkf6vwxkv/bin
$ file /g/data/vk83/prerelease/apps/spack/0.22/release/linux-rocky8-x86_64_v4/oneapi-2025.2.0/access3-2025.03.1-lnyu77ayu4jfq6bwusmzwiewkf6vwxkv/bin/access-om3-MOM6-CICE6
/g/data/vk83/prerelease/apps/spack/0.22/release/linux-rocky8-x86_64_v4/oneapi-2025.2.0/access3-2025.03.1-lnyu77ayu4jfq6bwusmzwiewkf6vwxkv/bin/access-om3-MOM6-CICE6: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=abe9b2fdf2ff3a2b1d399b0b750f3e4d131b5cdf, with debug_info, not stripped, too many notes (256)
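The two fields that matter in that file output are "with debug_info" and "not stripped". As a rough sketch, the same check could be scripted in Python as below. The helper name has_debug_info is hypothetical, and the sample string is a shortened copy of the output above rather than a fresh run.

```python
def has_debug_info(file_output):
    """True if `file` reported debug info and an unstripped symbol table."""
    # `file` separates its findings with commas, so compare whole fields
    fields = [field.strip() for field in file_output.split(",")]
    return "with debug_info" in fields and "not stripped" in fields


# Shortened copy of the `file` output quoted in the post
sample = (
    "access-om3-MOM6-CICE6: ELF 64-bit LSB executable, x86-64, "
    "version 1 (SYSV), dynamically linked, with debug_info, not stripped"
)
print(has_debug_info(sample))  # prints True
```

Comparing whole comma-separated fields avoids treating a plain "stripped" field as a match for "not stripped".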
Now onto MPI debugging.
anton
(Anton Steketee, ACCESS-NRI Oceans Team)
30 July 2025 01:48
30
It will be interesting to hear if the debug exe runs for you. In general, we’ve just found new issues with it!
Just checking that I’m running this version. Here are the relevant parts of my config.yaml:
model: access-om3
exe: access-om3-MOM6-CICE6
modules:
  use:
    - /g/data/vk83/prerelease/modules
  load:
    - access-om3/pr120-19
From the job.stdout:
mod access3/2025.03.1-lnyu77a
mod access-om3/pr120-19
It seems to work, as mpirun specifies the directory:
mpirun -wdir /scratch/gb02/pag548/access-om3/work/access-rom3-PG-dev-MC_100km_jra_iaf -np 100 /scratch/gb02/pag548/access-om3/work/access-rom3-PG-dev-MC_100km_jra_iaf/access-om3-MOM6-CICE6
And that directory points to the desired executable:
$ ls -l /scratch/gb02/pag548/access-om3/work/access-rom3-PG-dev-MC_100km_jra_iaf/
total 22550228
lrwxrwxrwx 1 pag548 gb02 163 Jul 30 12:16 access-om3-MOM6-CICE6 -> /g/data/vk83/prerelease/apps/spack/0.22/release/linux-rocky8-x86_64_v4/oneapi-2025.2.0/access3-2025.03.1-lnyu77ayu4jfq6bwusmzwiewkf6vwxkv/bin/access-om3-MOM6-CICE6
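The same symlink check can be scripted rather than read off ls -l. This is a small Python sketch under stated assumptions: the helper name resolves_under is hypothetical, and the demo uses temporary directories as stand-ins for the work directory and the spack install prefix so that it runs anywhere.

```python
import os
import tempfile


def resolves_under(link_path, expected_prefix):
    """True if link_path resolves to a file located under expected_prefix."""
    target = os.path.realpath(link_path)
    prefix = os.path.realpath(expected_prefix)
    return os.path.commonpath([target, prefix]) == prefix


# Self-contained demo: temporary stand-ins for the install and work directories
with tempfile.TemporaryDirectory() as tmp:
    install_bin = os.path.join(tmp, "install", "bin")
    work = os.path.join(tmp, "work")
    os.makedirs(install_bin)
    os.makedirs(work)
    exe = os.path.join(install_bin, "access-om3-MOM6-CICE6")
    open(exe, "w").close()
    link = os.path.join(work, "access-om3-MOM6-CICE6")
    os.symlink(exe, link)
    demo_ok = resolves_under(link, os.path.join(tmp, "install"))

print(demo_ok)  # prints True
```

On Gadi the two arguments would be the executable in the work directory and the prerelease install path shown above.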
And it fails immediately. From the stderr:
$ more access-om3.err
(t_initf) Read in prof_inparm namelist from: drv_in
(t_initf) Using profile_disable= F
(t_initf) profile_timer= 4
(t_initf) profile_depth_limit= 4
(t_initf) profile_detail_limit= 2
(t_initf) profile_barrier= F
(t_initf) profile_outpe_num= 1
(t_initf) profile_outpe_stride= 0
(t_initf) profile_single_file= F
(t_initf) profile_global_stats= T
(t_initf) profile_ovhd_measurement= F
(t_initf) profile_add_detail= F
(t_initf) profile_papi_enable= F
WARNING from PE 0: MOM_file_parser : DT over-ridden. Line: 'DT = 50' in file MOM_override.
WARNING from PE 0: MOM_file_parser : DT_THERM over-ridden. Line: 'DT_THERM = 300' in file MOM_override.
forrtl: error (65): floating invalid
Image PC Routine Line Source
libpthread-2.28.s 0000149D19D07990 Unknown Unknown Unknown
access-om3-MOM6-C 00000000010E58EF limit_topography 422 MOM_shared_initialization.F90
access-om3-MOM6-C 00000000010999E6 mom_initialize_to 264 MOM_fixed_initialization.F90
access-om3-MOM6-C 000000000109680D mom_initialize_fi 87 MOM_fixed_initialization.F90
access-om3-MOM6-C 000000000066BB0B initialize_mom 2848 MOM.F90
access-om3-MOM6-C 00000000005EC89D ocean_model_init 283 mom_ocean_model_nuopc.F90
access-om3-MOM6-C 000000000052D43B initializeadverti 753 mom_cap.F90
libesmf.so 0000149D1D738798 _ZN5ESMCI6FTable1 Unknown Unknown
Do these match your issues?
anton
(Anton Steketee, ACCESS-NRI Oceans Team)
30 July 2025 02:42
32
I think so - @claireyung might confirm?
My error is a bit different… haven’t dug much into it though.
forrtl: error (73): floating divide by zero
Image PC Routine Line Source
libpthread-2.28.s 000014B52D919990 Unknown Unknown Unknown
libimf.so 000014B52F378247 __libm_log10_z0 Unknown Unknown
access-om3-MOM6-C 0000000004DD721F time_scalar_mult 778 time_manager.F90
access-om3-MOM6-C 0000000004DD73EE time_divide 806 time_manager.F90
access-om3-MOM6-C 0000000000690DB0 initialize_mom 3497 MOM.F90
access-om3-MOM6-C 00000000005EC89D ocean_model_init 283 mom_ocean_model_nuopc.F90
access-om3-MOM6-C 000000000052D43B initializeadverti 753 mom_cap.F90
Well I guess I can try and run it with a debugger and see what happens.
angus-g
(Angus Gibson)
30 July 2025 23:09
35
That error is happening in a pretty simple loop:
do j=G%jsd,G%jed ; do i=G%isd,G%ied
  D(i,j) = min( max( D(i,j), 0.5*min_depth ), max_depth )
enddo ; enddo
Given you’re getting a floating invalid, that suggests to me that your topography file possibly has NaN values, particularly the _FillValue. FPE traps will usually trigger on NaNs, even if they’re in non-computational points.
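One way to test that suggestion is to scan the depth field for NaN entries before running the model. The sketch below is illustrative only: it works on a plain nested list, the helper name find_nan_points and the sample values are hypothetical, and reading the real topography file would need a NetCDF library, which is left out here.

```python
import math


def find_nan_points(depth):
    """Return (j, i) indices of NaN entries in a 2-D list of depths."""
    return [
        (j, i)
        for j, row in enumerate(depth)
        for i, value in enumerate(row)
        if math.isnan(value)
    ]


# Hypothetical 2x3 depth field; NaN stands in for a NaN _FillValue on land
depth = [
    [10.0, float("nan"), 250.0],
    [float("nan"), 40.0, 1200.0],
]
print(find_nan_points(depth))  # prints [(0, 1), (1, 0)]
```

Any index reported here is a point where the clamp in the loop above would operate on a NaN.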
anton
(Anton Steketee, ACCESS-NRI Oceans Team)
30 July 2025 23:29
36
Thanks Angus! CICE6 has/had a failure also due to a missing _FillValue - should we be setting this explicitly in the bathymetry / grid / mask files?
angus-g
(Angus Gibson)
31 July 2025 00:05
37
I think for bathymetry and any fields that are read as an input (initial condition, forcing, etc.) it’s usually less confusing to have a defined non-NaN _FillValue, yeah.
Aidan
(Aidan Heerdegen, ACCESS-NRI Release Team Lead)
31 July 2025 05:47
38
The other option is to interpolate to all points so there is some wriggle room when masks might have small changes.
anton
(Anton Steketee, ACCESS-NRI Oceans Team)
31 July 2025 06:53
39
Oh, I was just thinking of the _FillValue global attribute. Anyway, it's a bit of a tangent; I'll make a new issue.
Hi Team. I just checked the build flags for this executable at: ACCESS-OM3/spack.yaml at b751c08461cc95eb330b8b504aae0bd1f24a3c7e · ACCESS-NRI/ACCESS-OM3 · GitHub
- 'fflags="-march=sapphirerapids -mtune=sapphirerapids -unroll -O0 -g -check bounds -check pointers -fpe0 -check noarg_temp_created"'
So this must be run on the normalsr queue, correct?
I tried to submit to the normalsr queue via payu 1.1.7, but that fails as it enforces a CPU multiple of 48, which applies to the normal queue.
Is there a version of payu that supports the normalsr queue?
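To make the CPU-multiple mismatch concrete, here is a small Python sketch of the rounding involved. The 48 comes from the error described above; the 104 cores per normalsr node is an assumption that should be checked against NCI's queue documentation, and the helper name round_up_to_nodes is hypothetical (it is not part of payu).

```python
import math


def round_up_to_nodes(ncpus, cores_per_node):
    """Smallest multiple of cores_per_node that is >= ncpus."""
    return math.ceil(ncpus / cores_per_node) * cores_per_node


# 48 is the multiple quoted above; 104 is an assumed normalsr node size
for cores_per_node in (48, 104):
    print(cores_per_node, round_up_to_nodes(100, cores_per_node))
# prints "48 144" then "104 104"
```

For the 100 ranks used in the mpirun line earlier in the thread, a request rounded to 48-core nodes would not be a whole number of 104-core nodes, which is consistent with the submission being rejected.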