Extending ACCESS-OM2-025 RYF run present on ik11

Hi,

I’m trying to extend the ACCESS-OM2 0.25 deg RYF run, which has already been run for 600+ years. Its outputs and restarts are in /g/data/ik11/outputs/access-om2-025/025deg_jra55_ryf9091_gadi

I am using the configuration files in /g/data/ik11/configs/access-om2-025/025deg_jra55_ryf9091_gadi. I had to make one change in the config.yaml file: instead of /g/data4/ik11/inputs/access-om2/input_rc/, I am using /g/data4/ik11/inputs/access-om2/input_rc-DELETE/. Everything else is exactly the same.

This updated config version is kept in /home/156/db6174/access-om2/025deg_jra55_ryf_control-test for anyone to have a look. This config gives the following error:

Currently Loaded Modulefiles:

1) pbs 2) openmpi/4.1.4(default)

ERROR: Unable to locate a modulefile for 'openmpi-mofed4.7-pbs19.2/4.0.1'

ERROR: Unable to locate a modulefile for 'openmpi-mofed4.7-pbs19.2/4.0.1'

ERROR: Unable to locate a modulefile for 'openmpi-mofed4.7-pbs19.2/4.0.1'

payu: Model exited with error code 1; aborting.

My guess is that since this config is 4-5 years old, the module versions it tries to load have since been updated or removed. Has anyone encountered this error before, or does anyone have suggestions?

Thanks!

@Dhruv_Bhagtani I’m not sure the configuration files in /g/data/ik11/configs/access-om2-025/025deg_jra55_ryf9091_gadi are kept up to date.

On the ACCESS-OM2 control runs hive forum post it says the configuration used is this one. That configuration refers to input_20200530 rather than input_rc. Can you have a go with those configs?

That said, it might not fix your modules issue…

Thanks Ryan! This config seems to be working well, no more module version issues now!


Hi @Dhruv_Bhagtani, if you want to extend the run you should clone the final configuration of the previous run so that everything is the same. You could use the ryf9091_gadi branch of the rmholmes/025deg_jra55_ryf repository on GitHub, or @rmholmes’s control directory on Gadi if that’s still around. To be really sure, it’s a good idea to re-do the final run and check that the restarts in the manifests in your new run are the same as before. See the Tutorials page of the COSIMA/access-om2 wiki on GitHub.
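
The manifest check described above could be sketched in Python along these lines. This is only a minimal sketch, assuming each restart manifest has already been parsed into a dictionary that maps a work path to an entry containing a `hashes` dictionary (for example with an `md5` key). The function name `compare_manifests` and the sample paths and hash values are hypothetical, not taken from the actual run.

```python
def compare_manifests(old, new, hash_name="md5"):
    """Return the sorted work paths that differ between two parsed manifests.

    A path is reported if it is missing from either manifest or if the
    chosen hash differs. The entry layout (path -> {"hashes": {...}}) is an
    assumption about how the manifests look once parsed.
    """
    mismatched = []
    for path in sorted(set(old) | set(new)):
        if path not in old or path not in new:
            mismatched.append(path)
            continue
        old_hash = old[path].get("hashes", {}).get(hash_name)
        new_hash = new[path].get("hashes", {}).get(hash_name)
        if old_hash != new_hash:
            mismatched.append(path)
    return mismatched


# Hypothetical entries, for illustration only.
old_manifest = {
    "work/ocean/INPUT/restart_a.nc": {"hashes": {"md5": "aaa111"}},
    "work/ocean/INPUT/restart_b.nc": {"hashes": {"md5": "bbb222"}},
}
new_manifest = {
    "work/ocean/INPUT/restart_a.nc": {"hashes": {"md5": "aaa111"}},
    "work/ocean/INPUT/restart_b.nc": {"hashes": {"md5": "ccc333"}},
}

print(compare_manifests(old_manifest, new_manifest))
# -> ['work/ocean/INPUT/restart_b.nc']
```

An empty list would mean the re-done final run picked up the same restarts as the original. The manifests themselves are YAML files, so something like PyYAML's `yaml.safe_load_all` could be used to parse them before calling this function.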


Good idea!

@rmholmes I’m having a similar issue with this module error. In my case, I’m trying a warm start for some perturbation experiments. The intention is to start at restart250, and I’ve updated the configs to point at input_20200530 as per your suggestion. The manifests show the correct filepaths as far as I can tell, but the job is still returning the same module error that @Dhruv_Bhagtani originally found. What else can I try?

Hi Serena,

I’ve never encountered that module issue before. I wonder if there are other differences between your configuration files and the ones used for the original run? I would suggest doing a comparison. By working forwards from the original configuration files I’ve linked above (which seem to have worked for @Dhruv_Bhagtani ) you might be able to narrow down the issue? Let me know if that doesn’t help and I can look into it in more detail.
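
One way to do the comparison suggested above is with Python's standard `filecmp` module. This is a rough sketch only: the helper name `config_differences` is hypothetical, and it only looks at the top level of the two control directories (not subdirectories such as manifests).

```python
import filecmp


def config_differences(original_dir, modified_dir):
    """Summarise top-level differences between two control directories.

    Uses filecmp.dircmp, which by default ignores version-control
    directories such as .git.
    """
    comparison = filecmp.dircmp(original_dir, modified_dir)
    return {
        "changed": sorted(comparison.diff_files),
        "only_in_original": sorted(comparison.left_only),
        "only_in_modified": sorted(comparison.right_only),
    }
```

Calling `config_differences` with the path of a fresh clone of the original configuration and the path of your own control directory returns the file names that differ, which narrows down where to look (for example `config.yaml` or a namelist file).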


Hi Serena, I would also suggest starting from the original configuration files and making sure that all exe and input files are the same.


3 posts were merged into an existing topic: Extending ACCESS-OM2-025 RYF runs on ik11

The original error “Unable to locate a modulefile” is a bit of a red herring, and doesn’t stop the config from running.

As @aidan explains here, payu tries to find the module file for the openmpi version that the model was compiled against. However, this version no longer exists. Payu does find a newer openmpi (4.1.4) and loads that, so the model can run; it was failing for a different reason.
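
Since the modulefile message is not the real failure, it can help to filter it out of the job output and look at what remains. The snippet below is a small illustrative sketch, not part of payu; the helper name `lines_without_module_message` is hypothetical and the sample text is copied from the log earlier in this thread.

```python
MODULE_MESSAGE = "Unable to locate a modulefile"


def lines_without_module_message(log_text):
    """Return the non-empty log lines that are not the modulefile message."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if line.strip() and MODULE_MESSAGE not in line
    ]


sample_log = """Currently Loaded Modulefiles:
1) pbs 2) openmpi/4.1.4(default)
ERROR: Unable to locate a modulefile for 'openmpi-mofed4.7-pbs19.2/4.0.1'
ERROR: Unable to locate a modulefile for 'openmpi-mofed4.7-pbs19.2/4.0.1'
payu: Model exited with error code 1; aborting.
"""

for line in lines_without_module_message(sample_log):
    print(line)
```

For this sample only the loaded-module lines and the final payu message remain, which suggests the actual cause has to be found elsewhere in the job's output files.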