This is splitting a discussion off from this thread.
When running in MOM-only mode, @Paul.Gregory and a few others are getting this error:
Replies to this are:
I'd need to look into the `cice` issue, but @anton may have some idea.
This was also a reply to this issue:
To work around the non-fatal issues described in this post, I'd try for now just commenting out the entire `userscripts` section in the `config.yaml`.
OK, I got this working.
I was able to restart the run by purging my `access-om3/archive` directory `access-rom3-MR`.
This may be related to @mmr0 's restart issues.
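For reference, the purge was roughly this (a sketch; the archive path and experiment name `access-rom3-MR` are from my setup, so adjust to yours):

```sh
# Remove the experiment's archived outputs and restarts so the run
# starts cleanly again (this deletes data; check the path first!).
rm -rf /scratch/$PROJECT/$USER/access-om3/archive/access-rom3-MR

# Resubmit from the control directory.
cd ~/access-rom3-MR
payu run
```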
Doing this in `config.yaml`:
```yaml
#userscripts:
# setup: /usr/bin/bash /g/data/vk83/apps/om3-scripts/payu_config/setup.sh
# archive: /usr/bin/bash /g/data/vk83/apps/om3-scripts/payu_config/archive.sh
```
works. The job completes with stdout:
```
======================================================================================
Resource Usage on 2025-03-11 14:20:42:
Job Id: 136779491.gadi-pbs
Project: gb02
Exit Status: 0
Service Units: 8.17
NCPUs Requested: 16       NCPUs Used: 16
CPU Time Used: 02:32:58
Memory Requested: 100.0GB Memory Used: 74.37GB
Walltime requested: 01:00:00 Walltime Used: 00:09:48
JobFS requested: 10.0GB   JobFS used: 8.07MB
======================================================================================
```
and stderr:
```
Loading access-om3/pr30-5
  Loading requirement: access-om3-nuopc/0.3.1-fvr7qaw
Currently Loaded Modulefiles:
 1) access-om3-nuopc/0.3.1-fvr7qaw   3) nco/5.0.5                 5) pbs
 2) access-om3/pr30-5                4) openmpi/4.1.7(default)
```
It produces the following outputs in `access-rom3-MR/output000`:
```
$ ls -lht *.nc
42M Mar 11 14:20 access-om3.mom6.h.native_2013_01.nc
612K Mar 11 14:20 access-om3.mom6.h.sfc_2013_01.nc
130K Mar 11 14:20 access-om3.mom6.h.static.nc
20M Mar 11 14:20 access-om3.mom6.h.z_2013_01.nc
```
Note - if you don't do what @dougiesquire suggests and don't ENTIRELY remove the `userscripts` section in `config.yaml`, i.e. this:
```yaml
userscripts:
# setup: /usr/bin/bash /g/data/vk83/apps/om3-scripts/payu_config/setup.sh
# archive: /usr/bin/bash /g/data/vk83/apps/om3-scripts/payu_config/archive.sh
```
it will generate a `payu` error:
```
Traceback (most recent call last):
  File "/g/data/vk83/prerelease/apps/base_conda/envs/payu-dev-20250220T210827Z-39e4b9b/bin/payu-run", line 8, in <module>
    sys.exit(runscript())
  File "/g/data/vk83/prerelease/apps/base_conda/envs/payu-dev-20250220T210827Z-39e4b9b/lib/python3.10/site-packages/payu/subcommands/run_cmd.py", line 123, in runscript
    expt = Experiment(lab, reproduce=run_args.reproduce, force=run_args.force)
  File "/g/data/vk83/prerelease/apps/base_conda/envs/payu-dev-20250220T210827Z-39e4b9b/lib/python3.10/site-packages/payu/experiment.py", line 124, in __init__
    init_script = self.userscripts.get('init')
AttributeError: 'NoneType' object has no attribute 'get'
```
Evidently, if you place `userscripts` in the `config.yaml`, `payu` expects a value.
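The traceback makes sense if payu reads `config.yaml` with a standard YAML parser: a key whose entries are all commented out parses to `None`, not to an empty mapping. A minimal sketch of the mechanism (this uses PyYAML directly and is not payu's actual code):

```python
import yaml

# A `userscripts:` key with all of its entries commented out
# parses to None, not to an empty dict.
config = yaml.safe_load("""
userscripts:
#    setup: /usr/bin/bash setup.sh
""")
print(config["userscripts"])  # None

# payu then effectively does `self.userscripts.get('init')`,
# and None has no .get method, hence the AttributeError above.
config["userscripts"].get("init")  # raises AttributeError
```

Deleting the key entirely (or, presumably, giving it an empty mapping with `userscripts: {}`) avoids the `None`.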
I’ll now continue working with this configuration, changing the layouts, number of CPUs and resolution to try and figure out why I couldn’t get the 10x10 layout with 100 CPUs at 0.1 resolution working.
> 10x10 layout with 100 CPUs at 0.1 resolution working
Just noting that in general it's a good idea to use entire nodes, e.g. multiples of 48 cores, if you're using the `normal` queue.
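For example, rather than requesting 100 CPUs for a 10x10 layout, you could pad the PBS request up to whole nodes in `config.yaml` (illustrative values; Gadi's `normal` queue nodes have 48 cores, and the model layout itself is set separately in the model's namelists):

```yaml
# Pad the PBS request to whole nodes: 3 x 48 = 144 cores
# rather than 100, so no node is left partially used.
queue: normal
ncpus: 144
```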
Thanks @dougiesquire - you have answered a question I was about to ask! I need to update this in the instructions.
Is this issue really fatal?
It should just be a warning which could be ignored.
The CICE warning is not fatal and can be ignored. It does report itself as being fatal, though, which is causing some confusion.