ACCESS-CM2 persistent session problems

Thanks all for the help!

Here is what I did to get the suite working. It is likely what you will need to do too if you have old suites from accessdev to move over.

In suite.rc you need to modify a few lines in different places:

  1. In the [scheduling] block, change fcm_make_drivers and fcm_make2_drivers (if applicable) to make_drivers.
  2. In the [runtime] block, change the `{% if BUILD_DRIVERS %}` sub-block to the following:
{% if BUILD_DRIVERS %}
    [[make_drivers]]
        inherit = BUILD, NCI
        script = """
cd $CYLC_SUITE_SHARE_DIR
if [ -d access-cm2-drivers ] ; then
  rm -rf access-cm2-drivers
fi
git clone https://github.com/ACCESS-NRI/access-cm2-drivers.git
"""
        [[[directives]]]
            -q = copyq
{% endif %}
  3. So that the new driver scripts get read, go to this line in suite.rc:
[runtime]
...
      [[NCI]] 

and add this line at the top:

        env-script = "eval $(rose task-env --path=share/access-cm2-drivers/src)"

Still under [runtime], in [[NCI]], change the remote host to the following:

        [[[remote]]]
            host = localhost
  4. Under [scheduling], in the ‘if BUILD_UM’ block, remove `=> fcm_make2_um` so that the lines become:
{% if BUILD_UM %}
fcm_make_um
{% endif %}
  5. Delete references to fcm-make.cfg from the fcm_make_um processes in the [runtime] block. The block should now read:
{% if BUILD_UM %}
    [[fcm_make_um]]
        inherit = BUILD, NCI_BUILD, UMBUILD, NCI
        [[[ environment ]]]
            ROSE_TASK_OPTIONS = --archive
        [[[job]]]
            execution time limit = PT40M
        [[[directives]]]
            -l ncpus = 6
            -l mem = 24gb
            -l jobfs = 2gb
            -q = {{NCIEXQ}}
{% endif %}
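
The task renames above are easy to miss in a long suite.rc, so a quick scan for the old names can help. The following is only a minimal sketch: the list of "old" names is taken from this post rather than from any official source, and `find_leftovers` and the sample text are hypothetical.

```python
import re

# Task names that the steps above retire. Treating these as "leftovers" is an
# assumption drawn from this post, not an official list.
OLD_TASK_NAMES = ("fcm_make_drivers", "fcm_make2_drivers", "fcm_make2_um")


def find_leftovers(suite_rc_text, names=OLD_TASK_NAMES):
    """Return (line_number, name, line) for each line still mentioning an old task."""
    hits = []
    for lineno, line in enumerate(suite_rc_text.splitlines(), start=1):
        for name in names:
            # \b avoids matching longer identifiers that merely contain an old name.
            if re.search(r"\b" + re.escape(name) + r"\b", line):
                hits.append((lineno, name, line.strip()))
    return hits


# Hypothetical fragment of a partly ported suite.rc.
sample = """\
{% if BUILD_DRIVERS %}
make_drivers
{% endif %}
{% if BUILD_UM %}
fcm_make_um => fcm_make2_um
{% endif %}
"""

for lineno, name, line in find_leftovers(sample):
    print(f"line {lineno}: still references {name}: {line}")
```

For a real suite, pass in the contents of your own suite.rc, for example `find_leftovers(open("suite.rc").read())`.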
  6. There is also a change to make in app/fcm_make_um/rose-app.conf to ensure that the UM builds correctly:
mirror=preprocess-atmos build-atmos preprocess-recon build-recon
  7. Add the SHARE_NODES lines near the beginning of suite.rc, so that it changes from this:
{% set ICE_NPROCS = ((NXBLK_ICE*NYBLK_ICE)/CICE_MAXBK)|round(0,'ceil')|int %}
{% set NNODE_OCNICE = ((MOM_CPUS+ICE_NPROCS)/NSLOTS)|round(0,'ceil')|int %}
{% set NUM_NODES = NNODE_ATM + NNODE_OCNICE %}
# Should allow for undercommitted nodes here

to this:

... (lines above unchanged)
{% set ICE_NPROCS = ((NXBLK_ICE*NYBLK_ICE)/CICE_MAXBK)|round(0,'ceil')|int %}
{% if SHARE_NODES %}
# Allow ocean and ice models to share node
{% set NNODE_OCNICE = ((MOM_CPUS+ICE_NPROCS)/NSLOTS)|round(0,'ceil')|int %}
{% set NUM_NODES = NNODE_ATM + NNODE_OCNICE %}
{% else %}
{% set NUM_NODES = NNODE_ATM + NNODE_OCN + NNODE_ICE %}
{% endif %}
# Should allow for undercommitted nodes here
... (lines below unchanged)
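
To see what the SHARE_NODES switch does to the node count, the Jinja2 arithmetic can be mirrored in plain Python. This is a rough sketch only. All the numbers are hypothetical, and because the definitions of NNODE_OCN and NNODE_ICE are not shown here, the sketch assumes each one is rounded up separately from MOM_CPUS and ICE_NPROCS.

```python
import math


def ceil_div(numerator, denominator):
    """Mirror the Jinja2 idiom `(a / b) | round(0, 'ceil') | int`."""
    return int(math.ceil(numerator / denominator))


def num_nodes(nnode_atm, mom_cpus, ice_nprocs, nslots, share_nodes):
    """Total node count, following the SHARE_NODES branch in suite.rc."""
    if share_nodes:
        # Ocean and ice ranks are pooled before rounding up to whole nodes.
        nnode_ocnice = ceil_div(mom_cpus + ice_nprocs, nslots)
        return nnode_atm + nnode_ocnice
    # Assumption: NNODE_OCN and NNODE_ICE are each rounded up separately.
    # Their real definitions are not shown in the post.
    nnode_ocn = ceil_div(mom_cpus, nslots)
    nnode_ice = ceil_div(ice_nprocs, nslots)
    return nnode_atm + nnode_ocn + nnode_ice


# Hypothetical values, chosen only to make the arithmetic easy to follow.
NXBLK_ICE, NYBLK_ICE, CICE_MAXBK = 12, 8, 4
NSLOTS, MOM_CPUS, NNODE_ATM = 48, 120, 4

ICE_NPROCS = ceil_div(NXBLK_ICE * NYBLK_ICE, CICE_MAXBK)
shared = num_nodes(NNODE_ATM, MOM_CPUS, ICE_NPROCS, NSLOTS, share_nodes=True)
separate = num_nodes(NNODE_ATM, MOM_CPUS, ICE_NPROCS, NSLOTS, share_nodes=False)
print(f"ICE_NPROCS={ICE_NPROCS} shared={shared} separate={separate}")
```

With these made-up numbers, sharing saves one node because the ice ranks fit into the spare slots on the last ocean node.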

Next, add a line to the [coupled] block. Find `#For CICE` and add the SHARE_NODES line as the last line there:

            #For CICE
            ICE_NPROCS={{ICE_NPROCS}}
            NSLOTS = {{NSLOTS}}
            SHARE_NODES = {{SHARE_NODES}}

In rose-suite.conf, add:
SHARE_NODES=true
Lastly, change COMPUTE_HOST in rose-suite.conf to “localhost”, since the job is now submitted from gadi.
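
If you want to double-check the two rose-suite.conf settings before running, something like the following could do it. It is a minimal sketch with a deliberately simple line parser, not the Rose configuration API. The function names, the sample section name and the quoting of values are all assumptions.

```python
def read_settings(conf_text):
    """Collect KEY=value pairs from rose-suite.conf text, skipping sections and comments."""
    settings = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line.startswith("["):
            continue
        key, sep, value = line.partition("=")
        if sep:
            # Strip optional quotes so 'localhost' and "localhost" compare equal.
            settings[key.strip()] = value.strip().strip("'\"")
    return settings


def check_ported(conf_text):
    """Return a list of problems; an empty list means both settings look right."""
    settings = read_settings(conf_text)
    problems = []
    if settings.get("SHARE_NODES", "").lower() != "true":
        problems.append("SHARE_NODES should be true")
    if settings.get("COMPUTE_HOST") != "localhost":
        problems.append("COMPUTE_HOST should be localhost")
    return problems


# Hypothetical rose-suite.conf fragment; section name and quoting are assumptions.
sample = """\
[jinja2:suite.rc]
COMPUTE_HOST='some-remote-host'
SHARE_NODES=true
"""
print(check_ported(sample))
```

An empty list from `check_ported` means both settings match what is described above.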

Hopefully these changes help anyone trying to port a suite that was on accessdev and used fcm_make_drivers.

If you have any questions, let me know!
Cheers,
Sebastian
