UM build qsub error - help

Hi,

I’m hoping someone might be able to help me with this confusing error :slight_smile:

I am trying to run ACCESS-AM2, and at the UMBUILD/fcm_make2_um step I got the following error message:
2026-01-28T00:04:23Z [STDERR] qsub: Error: You are not a member of project project=gx60. You must be a member of a project to submit a job under that project. [(('event-mail', 'submission retry'), 1) ret_code] 0

However,

  • I am a member of gx60
  • After becoming a member of gx60 I created a new persistent session which the job is running under
  • I can create ARE sessions (using the normalbw and expressbw queues) using compute from gx60

For some more info on the job submission:

#!/bin/bash -l
#
# ++++ THIS IS A CYLC TASK JOB SCRIPT ++++
# Suite: u-dt947
# Task: fcm_make2_um.20190101T0000Z
# Job log directory: 20190101T0000Z/fcm_make2_um/02
# Job submit method: pbs

# DIRECTIVES:
#PBS -N fcm_make2_um.20
#PBS -o /home/563/ac9768/cylc-run/u-dt947/log/job/20190101T0000Z/fcm_make2_um/02/job.out
#PBS -e /home/563/ac9768/cylc-run/u-dt947/log/job/20190101T0000Z/fcm_make2_um/02/job.err
#PBS -q express
#PBS -l ncpus=8
#PBS -l walltime=1:00:00
#PBS -l mem=12gb
#PBS -l jobfs=12gb
#PBS -P project=gx60
#PBS -l storage=scratch/q90+scratch/gx60+scratch/access+gdata/access+gdata/hh5+gdata/hr22+gdata/ki32+gdata/q90+gdata/gx60
#PBS -l software=intel-compiler
export CYLC_DIR='/g/data/hr22/apps/cylc7/cylc_7.9.9'
export CYLC_VERSION='7.9.9'
CYLC_FAIL_SIGNALS='EXIT ERR TERM XCPU'
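One detail worth noting in the generated script above: the directive reads `#PBS -P project=gx60`, and the error message quotes the project name as `project=gx60` rather than `gx60`, which suggests PBS received the whole string after `-P` as the project code. A minimal shell sketch of the difference (illustrative only, not part of the suite):

```shell
# The failing job script passed "-P project=gx60". PBS treats everything
# after -P as the project name, so it looks up a project literally named
# "project=gx60". Stripping the accidental "project=" prefix recovers
# the intended code:
submitted="project=gx60"          # what the generated script contained
intended="${submitted#project=}"  # the bare project code PBS expects
echo "submitted as: $submitted"
echo "intended:     $intended"
```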

Hi @alanah.chapman,

Have you tried submitting the job soon after being accepted into gx60?
I’m asking because sometimes there is a delay between gaining access to a project’s resources and being able to consume compute SUs (service units) under that same project.
If that’s the case, I would suggest waiting an hour or so and trying again.

Cheers
Davide

That said, the fact that you can still create ARE sessions using gx60 compute is strange.

Have you checked what happens if you try running a dummy PBS job on broadwell and non-broadwell queues?

Something like:

  • qsub -P gx60 -q express <<< 'echo hello' (non-broadwell)
  • qsub -P gx60 -q expressbw <<< 'echo hello' (broadwell)
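The two one-liners above can be wrapped in a small loop over the queues. Since `qsub` only exists on Gadi, this sketch just prints the commands as a dry run; on Gadi, drop the `echo "would run:"` wrapper to actually submit:

```shell
# Dry-run sketch of the suggested membership test: a trivial job under
# gx60 on a non-Broadwell queue (express) and a Broadwell queue
# (expressbw).
for q in express expressbw; do
  cmd="qsub -P gx60 -q $q"
  echo "would run: $cmd <<< 'echo hello'"
done
```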

I was accepted into the project last week, so I would have thought it would be ok by now to access compute resources. I created the new persistent session yesterday and ran groups beforehand (which listed gx60).

I submitted both of those PBS jobs as you suggested, and both succeeded with an empty .e file and this output:

hello

======================================================================================
                  Resource Usage on 2026-01-28 13:33:54:
   Job Id:             159503912.gadi-pbs
   Project:            gx60
   Exit Status:        0
   Service Units:      0.00
   NCPUs Requested:    1                   CPU Time Used: 00:00:00        
   Memory Requested:   500.0MB               Memory Used: 7.11MB          
   Walltime Requested: 10:00:00            Walltime Used: 00:00:01        
   JobFS Requested:    100.0MB                JobFS Used: 0B              
======================================================================================

So it seems fine to me since I seem to be able to use compute resources :confused:

2 Likes

I think your next step is to email help@nci.org.au. If you provide them with PBS IDs for the failed jobs they can inspect relevant logs which might provide some more clues.

1 Like

I’ve got it working!
The problem was in the nci_gadi.rc site file. Because I am using compute from one project and storage from another, I think I made a syntax error by hard-coding gx60:

[[[ directives ]]]
            -q          = express
            -l ncpus    = 1
            -l walltime = 1:00:00
            -l mem      = 1gb
            -l jobfs    = 1gb
            -P          = gx60
            -l storage  = scratch/q90+scratch/{{PROJECT}}+scratch/access+gdata/access+gdata/hh5+gdata/hr22+gdata/ki32+gdata/q90+gdata/{{PROJECT}}

Changing gx60 to {{PROJECT}}, which is defined in my rose-suite.conf file, fixed it:

[[[ directives ]]]
            -q          = express
            -l ncpus    = 1
            -l walltime = 1:00:00
            -l mem      = 1gb
            -l jobfs    = 1gb
            -P          = {{PROJECT}}
            -l storage  = scratch/q90+scratch/{{PROJECT}}+scratch/access+gdata/access+gdata/hh5+gdata/hr22+gdata/ki32+gdata/q90+gdata/{{PROJECT}}

…now it works :grinning_cat:

Thank you both for your suggestions :)))

1 Like

I don’t love the complexity of many suites; I’d have thought putting the project code in directly would be functionally the same.

Does your solution also allow you to write to a different project?

In rose-suite.conf the compute project was defined as PROJECT="gx60", so I wondered whether it needed to be passed as a string (and whether it would also have worked in the directives section if I had written -P = "gx60"). But the storage lines don’t have quotation marks, so that doesn’t make much sense to me either. And looking at the AM3 configs, they have what I had originally written in nci_gadi.rc, i.e. -P = gx60 without quotes and without {{PROJECT}}??
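For what it’s worth, the quoting puzzle can be illustrated with a minimal sketch (plain Python string substitution standing in for Rose’s Jinja2 templating, not Rose itself): the quotes in PROJECT="gx60" are Jinja2 string delimiters, not part of the value, so {{PROJECT}} renders as the bare code, which is why the directives never need quotes.

```python
import re

# The quotes in rose-suite.conf's PROJECT="gx60" delimit a Jinja2 string;
# the value itself is the bare text gx60.
suite_conf_value = '"gx60"'            # as written in rose-suite.conf
project = suite_conf_value.strip('"')  # the actual value: gx60

# Substituting {{PROJECT}} into a directive therefore yields an unquoted
# project code, exactly what PBS expects after -P.
directive = "-P          = {{PROJECT}}"
rendered = re.sub(r"\{\{\s*PROJECT\s*\}\}", project, directive)
print(rendered)  # -P          = gx60
```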

Does your solution also allow you to write to a different project?

  • I think so (I managed to get it to write to gx60, which was not my intention); but I just had a look at the AM3 suite config and it looks like the setup is different, so I’m not sure how applicable this is to newer model versions.
  • By setting PROJECT="compute_project" in rose-suite.conf and changing the storage lines in nci_gadi.rc, I think I can control the compute and storage projects separately:
[[[ directives ]]]
            -q          = express
            -l ncpus    = 1
            -l walltime = 1:00:00
            -l mem      = 1gb
            -l jobfs    = 1gb
            -P          = {{PROJECT}}
            -l storage  = scratch/q90+scratch/{{PROJECT}}+scratch/access+gdata/access+gdata/hh5+gdata/hr22+gdata/ki32+gdata/q90+gdata/{{PROJECT}}
1 Like