Installing ACCESS-OM2 on NeSI (New Zealand supercomputer)

Hi @dkhutch , I was able to compile access-om2 with GCC on the NeSI HPC. Here are the instructions: How to build ACCESS-OM2 on NeSI HPC

I could not get access-om2 to build with intel-compiler/2023.2.1.


Thanks so much Harshula! I will check this out on Monday.

Oh, so it seems Spack 1.1 was needed, @harshula? Or perhaps you didn’t test the older version?

Hi @cbull , Good question! ACCESS-NRI is in the process of migrating to Spack v1.1. Once we do, supporting Spack v0.22 becomes difficult because the SPR (package.py) and MDR (spack.yaml) APIs have breaking changes.

However, if I have time, I’ll see if I can build access-om2 with intel-compiler/2023.2.1 using Spack v0.22. If that works, I can also provide that as a data point to Spack upstream.

Hi Harshula,
Sorry for the radio silence! There’s been a lot of end-of-term assessment to deal with.
I am now running through your steps to install the ACCESS-OM2 components on NeSI. Thank you so much for putting these custom steps in place.
I will look to test it out in the next few days. I’ll need to get a makeshift version of payu on there in order to run it. Once I have a test case I will let you know the outcome.
Thanks again,
David

Ok so I’m at the point where I’ve got a prototype payu that can at least set up a Slurm submission on NeSI. But I’m now running into MPI problems, which I find myself unable to deal with. Errors appear in a form like this:

--------------------------------------------------------------------------
PMI2_Init failed to intialize.  Return code: 14
--------------------------------------------------------------------------
--------------------------------------------------------------------------
PMI2_Init failed to intialize.  Return code: 14
--------------------------------------------------------------------------
--------------------------------------------------------------------------
The application appears to have been direct launched using "srun",
but OMPI was not built with SLURM's PMI support and therefore cannot
execute. There are several options for building PMI support under
SLURM, depending upon the SLURM version you are using:

  version 16.05 or later: you can use SLURM's PMIx support. This
  requires that you configure and build SLURM --with-pmix.

  Versions earlier than 16.05: you must use either SLURM's PMI-1 or
  PMI-2 support. SLURM builds PMI-1 by default, or you can manually
  install PMI-2. You must then build Open MPI using --with-pmi pointing
  to the SLURM PMI library location.

Please configure as appropriate and try again.
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[c005.hpc.nesi.org.nz:980464] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!

I should mention that separately to this (before I got in touch about spack and ACCESS-OM2), I installed my own version of MOM5 (using the fms_CM2M configuration). In that case I used NeSI-configured compilers, i.e. these ones:

module load impi/2021.5.1-intel-compilers-2022.0.2
module load netCDF-Fortran/4.6.0-iimpi-2022a

I was then able to compile MOM5 that way, and could successfully run the resulting executable (i.e. no MPI problems) on a previous test case.

I am tempted to take this same approach with ACCESS-OM2. What I would need to do then is figure out how to compile cice5 and yatm using these same compilers (i.e. not using spack).

I am hoping I can find a way to do that… (at this point of the year, I suspect getting help from ACCESS-NRI would have to wait until at least mid-January). Hence I’m posting this as an update on a tentative plan, rather than anything else.

Update: I hadn’t loaded my spack modules properly, so the error I quoted above is not the problem anymore. I’m on to the next crash issue haha…
Still trying with spack then!
will see how we go

I’m not recommending this, but FYI there’s an old and deprecated (but Spack-free) version of access-om2 here: GitHub - COSIMA/access-om2: Deprecated ACCESS-OM2 global ocean - sea ice coupled model code and configurations, with installation instructions here: Getting started · COSIMA/access-om2 Wiki · GitHub


Thanks Andrew. I’ll try not to go down that path just now, but good to know thanks!

Ok… now at the point of having this error:

ice: error reading namelist

So… this would suggest that the workflow is not seeing one of the essential input files for CICE5. The problem is I can’t see what’s missing, because the work directory looks fine.

I’m currently using a version of payu branched from payu/1.1.
I know this is a couple of years out of date; it’s just that I couldn’t make sense of the Singularity stuff, so I went back to payu/1.1, where I can understand how to find the actual payu and payu-run command-line scripts.

Did anything really massive change between payu/1.1 and the current versions with regard to where the cice input namelists live?

(sorry to be a pain)

I don’t think payu changed the path to cice_in.nml.

Rather than indicating cice_in.nml was not found, this may mean cice_in.nml contains invalid entries.

Is this a cold start or are you starting from an existing restart?

If the former, is cice_in.nml compatible with the cice version you’re running? cice will die with ice: error reading namelist if unknown names are present - e.g. see lcdf64 no longer recognised; existing cice_in.nml now cause crash on init · Issue #31 · COSIMA/cice5 · GitHub

If the latter, you also need the previous cice_in.nml - see step 8 in Tutorials · COSIMA/access-om2 Wiki · GitHub and also Restarting access-om2 requires outputNNN/ice/cice_in.nml · Issue #193 · payu-org/payu · GitHub

Thanks for the tips Andrew! It is a cold start. It should be compatible with the current cice, however I will investigate further.

Ok, so I think I got past the cice namelist issue in my latest test case, but I’m now running into a problem with the MOM5 executable producing an invalid memory reference.

[c004:1186587:0:1186587] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))

WARNING from PE     2: ocean_model_init called with an associated ocean_state_type structure. Model is already initialized.

[c004:1186585:0:1186585] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
==== backtrace (tid:1186587) ====
 0 0x000000000003e6f0 __GI___sigaction()  :0
 1 0x0000000001448d97 __mpp_domains_mod_MOD_mpp_get_compute_domain2d()  ???:0
 2 0x000000000124a1a9 __data_override_mod_MOD_data_override_init()  ???:0
 3 0x000000000040fd4a MAIN__()  /tmp/david.hutchin6926/spack-stage/spack-stage-mom5-git.2025.08.000_access-om2-4ny34dmjy4xafgra4wpzi7pz75cx6tdb/spack-src/src/access/accessom_coupler/ocean_solo.F90:394
 4 0x0000000000406ffd main()  /tmp/david.hutchin6926/spack-stage/spack-stage-mom5-git.2025.08.000_access-om2-4ny34dmjy4xafgra4wpzi7pz75cx6tdb/spack-src/src/access/accessom_coupler/ocean_solo.F90:81
 5 0x0000000000029590 __libc_start_call_main()  ???:0
 6 0x0000000000029640 __libc_start_main_alias_2()  :0
 7 0x0000000000407035 _start()  ???:0
=================================

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Note: I ran into pretty much the same error when I tried to run a GCC-compiled version of MOM5 recently. I think it’s to do with how MOM5 specifies BOZ literals in the MPP domains module.
There are several such BOZ literals in that module. I couldn’t easily compile them with gfortran, and when I did get it to compile, I got this same kind of invalid memory reference at run time with MOM5.
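To illustrate the kind of construct involved (a generic sketch, not the MOM5 source): standard Fortran restricts BOZ literals to DATA statements and a few intrinsics such as int(), so gfortran rejects the bare form unless given -fallow-invalid-boz, while the Intel compilers have historically accepted it as an extension.

```fortran
! Generic sketch, not the actual MOM5 code.
program boz_demo
  implicit none
  ! Non-standard: gfortran rejects this unless compiled with
  ! -fallow-invalid-boz; Intel Fortran accepts it as an extension.
  integer :: mask = Z'100'
  ! Standard-conforming (Fortran 2008) alternative: wrap the BOZ in int().
  integer :: mask2 = int(Z'100')
  print *, mask, mask2
end program boz_demo
```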

My solution in that case was to switch to the Intel Fortran compilers, which seem to handle the BOZ literals OK. Regrettably, that pathway means I can’t use the Spack configuration that @harshula has set up on NeSI for me; instead I’ll look to compile cice5 and yatm using the Intel compilers that are available on NeSI (but which don’t currently seem to play nicely with Spack).


Hi @dkhutch , Sounds like access-om2 has problems when built by GCC.

I was able to build it using the oneapi@2025.2.0 compiler on NeSI. I’ve updated the instructions at How to build ACCESS-OM2 on NeSI HPC. The biggest difference is that Spack will install the Intel oneAPI compiler instead of using system compilers. Note: I tried to do this with the old Intel Classic compilers, but libiconv failed to build.

Serendipitously, @dougiesquire has been doing some testing of access-om2 built with oneapi@2025.2.0 on Gadi: Update to oneapi compiler (and openmpi/5.0.8) · Issue #212 · ACCESS-NRI/access-om2-configs · GitHub


Amazing thank you again @harshula! I will try that out and let you know how it goes.

Ok so after some fiddling about, I have hit some new issues. I think I have managed to get past a problem with -xCORE-AVX2 vectorisation by manually editing MOM5’s CMake configuration to remove this problematic flag (which causes problems on the AMD hardware).
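In shell terms, that edit amounts to something like this sketch (the file here is a stand-in; in MOM5 the flag lives in the CMake build files):

```shell
# Sketch: strip the Intel-only -xCORE-AVX2 flag from a CMake file.
# The demo file below is a stand-in for MOM5's real CMake files.
f=$(mktemp)
printf 'set(CMAKE_Fortran_FLAGS "-O2 -xCORE-AVX2")\n' > "$f"
sed -i 's/ -xCORE-AVX2//g' "$f"
cat "$f"
rm -f "$f"
```

An alternative to dropping the flag entirely would be substituting a portable equivalent such as -march=core-avx2, which AMD hardware also supports.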

But now when I try to run it, I get the following errors:

--------------------------------------------------------------------------
PMI2_Init failed to intialize.  Return code: 14
--------------------------------------------------------------------------
--------------------------------------------------------------------------
The application appears to have been direct launched using "srun",
but OMPI was not built with SLURM's PMI support and therefore cannot
execute. There are several options for building PMI support under
SLURM, depending upon the SLURM version you are using:

  version 16.05 or later: you can use SLURM's PMIx support. This
  requires that you configure and build SLURM --with-pmix.

  Versions earlier than 16.05: you must use either SLURM's PMI-1 or
  PMI-2 support. SLURM builds PMI-1 by default, or you can manually
  install PMI-2. You must then build Open MPI using --with-pmi pointing
  to the SLURM PMI library location.

Please configure as appropriate and try again.
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[c028.hpc.nesi.org.nz:923286] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
[proxy:0@c027.hpc.nesi.org.nz] HYD_pmcd_pmip_control_cmd_cb (proxy/pmip_cb.c:487): assert (!closed) failed
[proxy:0@c027.hpc.nesi.org.nz] HYDT_dmxu_poll_wait_for_event (lib/tools/demux/demux_poll.c:76): callback returned error status
[proxy:0@c027.hpc.nesi.org.nz] main (proxy/pmip.c:122): demux engine error waiting for event
srun: error: c027: task 0: Exited with exit code 7
srun: Terminating StepId=4538385.0
[mpiexec@c027.hpc.nesi.org.nz] HYDT_bscu_wait_for_completion (lib/tools/bootstrap/utils/bscu_wait.c:109): one of the processes terminated badly; aborting
[mpiexec@c027.hpc.nesi.org.nz] HYDT_bsci_wait_for_completion (lib/tools/bootstrap/src/bsci_wait.c:21): launcher returned error waiting for completion
[mpiexec@c027.hpc.nesi.org.nz] HYD_pmci_wait_for_completion (mpiexec/pmiserv_pmci.c:196): launcher returned error waiting for completion
[mpiexec@c027.hpc.nesi.org.nz] main (mpiexec/mpiexec.c:260): process manager error waiting for completion
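(For the record, a first diagnostic in this situation is to ask Slurm which PMI interfaces it provides, then launch with one that the MPI build supports — a sketch, with the executable name illustrative:)

```
srun --mpi=list            # show the PMI plugins this Slurm offers
srun --mpi=pmi2 ./model.x  # or --mpi=pmix, matching what the MPI was built with
```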

I have made an issue for this; please feel free to add any missing info or correct anything.


Update on my spack failings. The errors I encountered above seem to be due to incompatibility of the compiler (which was installed by spack) with the NeSI SLURM system. I tried a few different ways to modify the spack build to use a NeSI system compiler, specifically this one:

impi/2021.5.1-intel-compilers-2022.0.2

Which I’ve had success with in a standalone MOM5 build (not ACCESS). Unfortunately, I couldn’t get Spack to install ACCESS-OM2 this way. It concretized OK, but then died trying to build c-blosc, one of the dependencies of netcdf-c.

I had also hoped to get Spack to recognise the NeSI-built netcdf-fortran package, which is available as a module:

netCDF-Fortran/4.6.0-iimpi-2022a

But I couldn’t get Spack to find this module (it doesn’t show up in spack external find).
Since I’ve been stuck in the weeds of this for some days now, I’m thinking the best path might in fact be to go back to the deprecated COSIMA build that Andrew Kiss linked to, because there I have more confidence that I can run and debug the necessary scripts and tweak the dependencies to things I can get working.

Trying to get my head around how cmake and spack work simultaneously is proving to be more difficult than I can reasonably manage at this point.
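(For reference, I believe the fallback when spack external find misses a module-provided library is a manual packages.yaml entry, something like this sketch — the version, prefix handling and module name are my guesses and would need checking against the NeSI install:)

```yaml
# Sketch only: a manually declared external for a module-provided library.
packages:
  netcdf-fortran:
    externals:
    - spec: netcdf-fortran@4.6.0
      modules:
      - netCDF-Fortran/4.6.0-iimpi-2022a
    buildable: false
```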


I get that, and it is not encouraging that it is so difficult when the promise of something like Spack is that it should make things easier. It is a shame, as it seems like you’re quite close to a good outcome.

If you wanted to give it one more go …

The blosc filter is only required for NCZarr support, so you could disable the variant by adding ~blosc to the netcdf-c package requirements.
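In the environment’s spack.yaml that would look something like this (a sketch; exact placement depends on how the ACCESS-NRI environment is laid out):

```yaml
# Sketch: disable the blosc variant of netcdf-c, which drops NCZarr blosc
# filter support and avoids building c-blosc.
packages:
  netcdf-c:
    require:
    - "~blosc"
```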

C’mon David. You can do this. I believe in you.