Change project ID (CM2)

Hi there,

I assume I just need to change the project ID in rose-suite.conf and the model will then run under that project. Is that right? This is for a CM2 setup.

rose-suite.conf:PROJECT='lg87'

However, it seems I am still using p66:

[ars599@gadi-login-01:u-ds688 ] $ nqstat|grep ars599
149768271 R p66 ars599 normalbw sys-dashboard- 04:27:14 08:00:00 2 0
149790221 Q p66 ars599 normal atmos.19000101 08:00:00 576

Cheers,

Arnold

Hi @arnoldsu,

It looks like the nqstat command might only show jobs on your default project, and so by default won’t show jobs you’ve submitted to lg87. To see if there is anything running on lg87, try:

nqstat -P lg87

Cheers,
Spencer

Many thanks Spencer!!

I guess I need to change this part. I will give it a try. Many thanks!

session_name=cm2
USER=ars599
PROJECT=lg87

export CYLC_SESSION=${session_name}.${USER}.${PROJECT}.ps.gadi.nci.org.au

run: persistent-sessions start cm2

It seems this needs to be done first. Just changing the project in the job does not work.

(Run ACCESS-CM2 - ACCESS-Hive Docs)

Hi there,

I tried Arnold’s method to change the project, but it didn’t work as expected.

Here’s what I did:

I created a persistent-session under lg87:

$ persistent-sessions start -p lg87 amip-exp

Then, I revised the PROJECT in rose-suite.conf:

PROJECT='lg87'

However, the model still runs under the default project:

$ nqstat -P lg87
Job ID    S Proj User   Queue      Job Name         Used       Request   CPUs CPU%
----------------------------------------------------------------------------------

$ nqstat -P k10
Job ID    S Proj User   Queue      Job Name         Used       Request   CPUs CPU%
----------------------------------------------------------------------------------
150050296 R  k10 hw1226 normal     atmos.19000101   00:00:28   08:00:00   576

Do you know if there is another way to switch the project successfully? Thanks a lot.

Regards,

Hao

Hi all,

How to change the running project for a suite depends on the suite you are running, as each might have a different structure.

@arnoldsu

I tested running the CM2 suite in the Hive Docs documentation (u-cy339) specifying a custom project in the rose-suite.conf PROJECT field.
Everything seems to work as expected, and the jobs that run through PBS use the specified project instead of my default one (see nqstat output below):

$ nqstat -P k10 -u dm5220
Job ID    S Proj User   Queue      Job Name         Used       Request   CPUs CPU%
----------------------------------------------------------------------------------
150061525 Q  k10 dm5220 express    fcm_make_um.09              00:40:00      6
150061527 Q  k10 dm5220 normal     make_cice.0950              00:05:00      1
150061529 Q  k10 dm5220 copyq      make_drivers.0              00:05:00      1
150061531 Q  k10 dm5220 copyq      make_mom.09500              00:05:00      1

You shouldn’t need to change anything other than the PROJECT field. The persistent-session project doesn’t matter for PBS jobs; it only matters for tasks within the suite that run in the background (and these are usually very quick and use only a small amount of SU).
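
To confirm which project a suite is configured with, something like the following could read the PROJECT value back from rose-suite.conf. This is only a minimal sketch: the helper name `rose_project` is hypothetical, and it assumes PROJECT sits on a single line in the form PROJECT='lg87', as in the snippets earlier in this thread.

```shell
# Hypothetical helper: print the value of PROJECT from a rose-suite.conf file.
# Assumes a single-line entry such as PROJECT='lg87' (quotes are optional).
rose_project() {
    sed -nE "s/^PROJECT[[:space:]]*=[[:space:]]*['\"]?([^'\"]*)['\"]?[[:space:]]*$/\1/p" "$1"
}
```

For example, `rose_project rose-suite.conf` run from the suite directory should print the project name without its quotes, which can then be passed to `nqstat -P`.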

@Hao

It seems to me you are trying to run a different suite from the suggested CM2 one (u-cy339). In the nqstat output you shared, one Job Name is atmos.19000101, but the u-cy339 suite has no task with that name.
Which suite are you trying to run with a different project?

Cheers
Davide

Many thanks, Davide!! I was expecting to only need to change PROJECT in rose-suite.conf, but somehow it’s not working for @hao, which is very strange. @hao manually changed the job ID. Maybe start a new one and test, @hao? Thanks Davide!!

It is worth noting that the PROJECT used to submit a job to PBS can be specified in the Cylc PBS [[[directives]]], and that for some suites it may be hardcoded ({% set PROJECT = 'dx2' %}, for example). Although this is not good practice, it is worth checking that it has not been done in your implementation/suite.
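
As a rough sketch of that check, the shell function below lists the lines in a suite.rc that either set PROJECT through a Jinja2 `set` statement or pass a `-P` directive. The function name `find_project_settings` and the regular expression are illustrative assumptions, not part of any suite, and the pattern may miss other ways a project could be set.

```shell
# Hypothetical helper: list lines in a suite.rc that pin or pass a project.
# Matches Jinja2 "{% set PROJECT ..." statements and "-P = ..." directives,
# printing each match with its line number.
find_project_settings() {
    grep -nE -e '[{]%[[:space:]]*set[[:space:]]+PROJECT|^[[:space:]]*-P[[:space:]]*=' -- "$1"
}
```

Running `find_project_settings suite.rc` in the suite directory prints the matching lines. No output at all would suggest that no project is being passed to PBS from that file.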

Hi @atteggiani, @griff and @arnoldsu ,

Thanks so much.

I’m actually running the suite u-cz934. I just tested a new one but it still didn’t work, so the issue might be related to the structure of this suite. I’ll check the Cylc PBS [[[directives]]]. Thanks again.

Regards,

Hao

@Hao, the suite u-cz934 doesn’t seem to have any project directives set (it has a PROJECT field in the rose-suite.conf, but it doesn’t get used within the suite.rc). Therefore, it currently always uses your default project.

As Griffith suggested, in this case you might have to change the [[[directives]]] field within the [[atmos]] task in the suite.rc file.

Instead of hardcoding your project, I would suggest adding the line marked with + below:

[[[ directives ]]]
    -l ncpus = {{NUM_CPUS}}
    -l mem = {{PHYSICAL_MEMORY}}GB
    -q = {{QNORMAL}}
+   -P = {{PROJECT}}

so you can still change project by changing the PROJECT field within the rose-suite.conf.

Cheers
Davide

Thanks Davide @atteggiani , that makes sense. I’ll test it as you suggested.

Much appreciated!

Regards,

Hao

Hi Davide @atteggiani ,

Thanks again for your suggestions last week. I added the -P = {{PROJECT}} line to the [[[directives]]] section of u-cz934’s suite.rc, but the job stays in the “submitted” state and never actually runs.

I also tested u-cy339, setting PROJECT = 'lg87' in rose-suite.conf, but it was also stuck in the “submitted” state. Do you know what might cause this?

Thanks and regards,

Hao

Hi @Hao,

It is strange that the jobs stay as submitted and never run.
In many cases, this can be caused by transient Gadi connection problems.

Have you tried re-running the suite and checking whether the jobs actually run?

Cheers
Davide

If the job is submitted, there are many reasons why it may get stuck in the queue. First step: check with qstat that the job has been submitted and is in the correct project. Second step: check the comment on the job with qstat -f {JOB_ID}, as this will tell you things like “Not enough resources” or “Not enough CPUs”, which can help us determine the cause of the delay. Sometimes there is just a delay; other times, the comment will indicate that the job will never run. But if the job is in the queue and in the right project, then this ‘Solved’ ticket is complete.
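
To make those two checks quicker, a small filter along these lines could pull just the project and comment out of the full job listing. It is a sketch under assumptions: the helper name `qstat_project_and_comment` is made up, and it assumes qstat -f prints these attributes as "project = ..." and "comment = ..." on one line each, which may not hold if a long comment is wrapped.

```shell
# Hypothetical helper: keep only the project and comment attributes
# from `qstat -f` output read on standard input.
# Assumes each attribute appears as "    name = value" on a single line;
# wrapped continuation lines of a long comment are ignored by this sketch.
qstat_project_and_comment() {
    sed -nE 's/^[[:space:]]*((project|comment) = .*)$/\1/p'
}
```

It would be used as `qstat -f {JOB_ID} | qstat_project_and_comment`, with {JOB_ID} replaced by the ID reported by nqstat.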

Not sure if this helps, but I have had a similar problem before. Have you checked that the storage flags in the suite.rc file contain all the relevant paths/directories needed for your suite to run? This has been a problem for me in the past when changing the compute project that I run my suite from.
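
One way to sanity-check this is to compare the storage string against the areas the suite needs. The sketch below assumes the usual Gadi form of the flag, with entries joined by + (for example gdata/access+scratch/lg87). The function name `missing_storage` and the example entries are purely illustrative.

```shell
# Hypothetical helper: report which required storage areas are absent
# from a PBS storage string such as "gdata/access+scratch/lg87".
# Usage: missing_storage STORAGE_STRING REQUIRED_ENTRY...
missing_storage() {
    storage=$1
    shift
    for needed in "$@"; do
        # Wrap both sides in "+" so only whole entries match.
        case "+$storage+" in
            *"+$needed+"*) ;;                 # entry is present
            *) printf '%s\n' "$needed" ;;     # entry is missing
        esac
    done
}
```

For instance, `missing_storage "gdata/access+scratch/lg87" scratch/lg87 scratch/k10` prints scratch/k10, meaning that entry would still need to be added to the storage directive.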

Hi @atteggiani , @griff , @hrsdawson

Thanks a lot for all your suggestions! I checked again and realized the issue was indeed caused by the storage path setting in my suite.rc. I hadn’t included the scratch path of the previous project in the storage path. After adding it, the jobs are now running fine.

Really appreciate all your help!

Cheers,
Hao

Since the solutions were found, I have closed this topic.

If you still have issues with anything related to this, feel free to reply here.