UM build qsub error - help

Hi,

I’m hoping someone might be able to help me with this confusing error :slight_smile:

I am trying to run ACCESS-AM2, and at the UMBUILD/fcm_make2_um step I got the following error message:
2026-01-28T00:04:23Z [STDERR] qsub: Error: You are not a member of project project=gx60. You must be a member of a project to submit a job under that project. [(('event-mail', 'submission retry'), 1) ret_code] 0

However,

  • I am a member of gx60
  • After becoming a member of gx60 I created a new persistent session which the job is running under
  • I can create ARE sessions (using the normalbw and expressbw queues) using compute from gx60

For some more info on the job submission:

#!/bin/bash -l
#
# ++++ THIS IS A CYLC TASK JOB SCRIPT ++++
# Suite: u-dt947
# Task: fcm_make2_um.20190101T0000Z
# Job log directory: 20190101T0000Z/fcm_make2_um/02
# Job submit method: pbs

# DIRECTIVES:
#PBS -N fcm_make2_um.20
#PBS -o /home/563/ac9768/cylc-run/u-dt947/log/job/20190101T0000Z/fcm_make2_um/02/job.out
#PBS -e /home/563/ac9768/cylc-run/u-dt947/log/job/20190101T0000Z/fcm_make2_um/02/job.err
#PBS -q express
#PBS -l ncpus=8
#PBS -l walltime=1:00:00
#PBS -l mem=12gb
#PBS -l jobfs=12gb
#PBS -P project=gx60
#PBS -l storage=scratch/q90+scratch/gx60+scratch/access+gdata/access+gdata/hh5+gdata/hr22+gdata/ki32+gdata/q90+gdata/gx60
#PBS -l software=intel-compiler
export CYLC_DIR='/g/data/hr22/apps/cylc7/cylc_7.9.9'
export CYLC_VERSION='7.9.9'
CYLC_FAIL_SIGNALS='EXIT ERR TERM XCPU'
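One detail worth noting in the generated script above: the directive reads `#PBS -P project=gx60`, and the error message quotes the project name as `project=gx60` rather than `gx60`, which suggests PBS received the whole string after `-P` as the project code. A minimal shell sketch of the difference (illustrative only, not part of the suite):

```shell
# The failing job script passed "-P project=gx60". PBS treats everything
# after -P as the project name, so it looks up a project literally named
# "project=gx60". Stripping the accidental "project=" prefix recovers
# the intended code:
submitted="project=gx60"          # what the generated script contained
intended="${submitted#project=}"  # the bare project code PBS expects
echo "submitted as: $submitted"
echo "intended:     $intended"
```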

Hi @alanah.chapman,

Have you tried submitting the job soon after being accepted into gx60?
I’m asking because sometimes there is a delay between gaining access to a project’s resources and being able to consume compute SUs (service units) under that same project.
If that’s the case, I would suggest waiting an hour or so and trying again.

Cheers
Davide

That said, the fact that you can still create ARE sessions using gx60 compute is strange.

Have you checked what happens if you try running a dummy PBS job on broadwell and non-broadwell queues?

Something like:

  • qsub -P gx60 -q express <<< 'echo hello' (non-broadwell)
  • qsub -P gx60 -q expressbw <<< 'echo hello' (broadwell)
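The two one-liners above can be wrapped in a small loop over the queues. Since `qsub` only exists on Gadi, this sketch just prints the commands as a dry run; on Gadi, drop the `echo "would run:"` wrapper to actually submit:

```shell
# Dry-run sketch of the suggested membership test: a trivial job under
# gx60 on a non-Broadwell queue (express) and a Broadwell queue
# (expressbw).
for q in express expressbw; do
  cmd="qsub -P gx60 -q $q"
  echo "would run: $cmd <<< 'echo hello'"
done
```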

I was accepted into the project last week, so I would have thought it would be ok by now to access compute resources. I created the new persistent session yesterday and ran groups beforehand (which listed gx60).

I submitted both of those PBS jobs as you suggested, and both succeeded with an empty .e file and this output:

hello

======================================================================================
                  Resource Usage on 2026-01-28 13:33:54:
   Job Id:             159503912.gadi-pbs
   Project:            gx60
   Exit Status:        0
   Service Units:      0.00
   NCPUs Requested:    1                   CPU Time Used: 00:00:00        
   Memory Requested:   500.0MB               Memory Used: 7.11MB          
   Walltime Requested: 10:00:00            Walltime Used: 00:00:01        
   JobFS Requested:    100.0MB                JobFS Used: 0B              
======================================================================================

So it seems fine to me since I seem to be able to use compute resources :confused:

2 Likes

I think your next step is to email help@nci.org.au. If you provide them with PBS IDs for the failed jobs they can inspect relevant logs which might provide some more clues.

1 Like

I’ve got it working!
The problem was in the nci_gadi.rc site file. Because I am using compute from one project and storage from another, I think I made a syntax error by hard-coding gx60:

[[[ directives ]]]
            -q          = express
            -l ncpus    = 1
            -l walltime = 1:00:00
            -l mem      = 1gb
            -l jobfs    = 1gb
            -P          = gx60
            -l storage  = scratch/q90+scratch/{{PROJECT}}+scratch/access+gdata/access+gdata/hh5+gdata/hr22+gdata/ki32+gdata/q90+gdata/{{PROJECT}}

Changing gx60 to {{PROJECT}}, which is defined in my rose-suite.conf file, fixed it:

[[[ directives ]]]
            -q          = express
            -l ncpus    = 1
            -l walltime = 1:00:00
            -l mem      = 1gb
            -l jobfs    = 1gb
            -P          = {{PROJECT}}
            -l storage  = scratch/q90+scratch/{{PROJECT}}+scratch/access+gdata/access+gdata/hh5+gdata/hr22+gdata/ki32+gdata/q90+gdata/{{PROJECT}}

…now it works :grinning_cat:

Thank you both for your suggestions :)))

1 Like

I don’t love the complexity of many suites; I’d have thought putting the project code in directly would be functionally the same.

Does your solution also allow you to write to a different project?

In rose-suite.conf the compute project was defined as PROJECT="gx60", so I wondered whether it needed to be passed as a string (and whether it would also have worked in the directives section if I had written -P = "gx60"). But the storage lines don’t have quotation marks, so that doesn’t make much sense to me either. And looking at the AM3 configs, they have what I had originally written in nci_gadi.rc, i.e. -P = gx60 without quotes and without {{PROJECT}}??
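For what it’s worth, the quoting puzzle can be illustrated with a minimal sketch (plain Python string substitution standing in for Rose’s Jinja2 templating, not Rose itself): the quotes in PROJECT="gx60" are Jinja2 string delimiters, not part of the value, so {{PROJECT}} renders as the bare code, which is why the directives never need quotes.

```python
import re

# The quotes in rose-suite.conf's PROJECT="gx60" delimit a Jinja2 string;
# the value itself is the bare text gx60.
suite_conf_value = '"gx60"'            # as written in rose-suite.conf
project = suite_conf_value.strip('"')  # the actual value: gx60

# Substituting {{PROJECT}} into a directive therefore yields an unquoted
# project code, exactly what PBS expects after -P.
directive = "-P          = {{PROJECT}}"
rendered = re.sub(r"\{\{\s*PROJECT\s*\}\}", project, directive)
print(rendered)  # -P          = gx60
```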

Does your solution also allow you to write to a different project?

  • I think so (I managed to get it to write to gx60, which was not my intention); but I just had a look at the AM3 suite config and it looks like the setup is different, so I’m not sure how applicable this is to newer model versions.
  • By setting PROJECT="compute_project" in rose-suite.conf and changing the storage lines in nci_gadi.rc, I think I can control the compute and storage projects separately:
[[[ directives ]]]
            -q          = express
            -l ncpus    = 1
            -l walltime = 1:00:00
            -l mem      = 1gb
            -l jobfs    = 1gb
            -P          = {{PROJECT}}
            -l storage  = scratch/q90+scratch/{{PROJECT}}+scratch/access+gdata/access+gdata/hh5+gdata/hr22+gdata/ki32+gdata/q90+gdata/{{PROJECT}}
1 Like