Can we add uqstat to vk83 payu conda?

I am having trouble doing both:

module use /g/data/vk83/modules
module load payu

and

module use /g/data/xp65/public/modules
module load nci-scripts

at the same time, so that I can use payu run and uqstat in the same terminal.

@Aidan suggests perhaps the best solution to this is to add nci-scripts to the payu conda environment.

@adele-morrison can you paste in the text of the error you got when you tried to do this?

I loaded both and had no issue running either nqstat or payu

$ module use /g/data/vk83/modules
$ module load payu
Loading payu/1.1.7
  Loading requirement: singularity
$ module use /g/data/xp65/public/modules
$ module load nci-scripts
$ payu -h
usage: payu [-h] [--version]
            {archive,branch,build,checkout,clone,collate,ghsetup,init,list,profile,push,run,setup,sweep,sync} ...

positional arguments:
  {archive,branch,build,checkout,clone,collate,ghsetup,init,list,profile,push,run,setup,sweep,sync}

options:
  -h, --help            show this help message and exit
  --version             show program's version number and exit

$ nqstat
Job ID    S Proj User   Queue      Job Name         Used       Request   CPUs CPU%
----------------------------------------------------------------------------------
148497993 R xx99 xx9999 normalbw   sys-dashboard-   05:10:03   08:00:00      1   1
148499836 R xx99 xx9999 normalbw   jupyter-lab      04:50:29   08:00:00      8   0

Doh! I typed nqstat instead of uqstat.

So now I do get an error. Is it the same as yours @adele-morrison?

$ module use /g/data/vk83/modules
$ module load payu
Loading payu/1.1.7
  Loading requirement: singularity
$ module use /g/data/xp65/public/modules
$ module load nci-scripts
$ payu -h
usage: payu [-h] [--version]
            {archive,branch,build,checkout,clone,collate,ghsetup,init,list,profile,push,run,setup,sweep,sync} ...

positional arguments:
  {archive,branch,build,checkout,clone,collate,ghsetup,init,list,profile,push,run,setup,sweep,sync}

options:
  -h, --help            show this help message and exit
  --version             show program's version number and exit

$ uqstat
/g/data/xp65/public/apps/med_conda_scripts/analysis3.d/bin/python: line 121: /g/data/vk83/apps/base_conda/envs/analysis3-25.07/bin/python: No such file or directory
$ 

Yep, same error. payu seems to work fine, but uqstat does not work.

The short answer is that they’re both using conda python environments that are not playing well together.

This is a solved problem for the version of payu in payu/dev, and we’ll shortly be releasing a new version of payu that will also not have this problem.

$ module use /g/data/xp65/public/modules
$ module load nci-scripts
$ uqstat -h
usage: uqstat [-h] [--historical] [--format {table,csv,json}] [--project PROJECT] [--comment]

Print more detailed information from qstat

Returns the following columns for each job:
    project:    NCI project the job was submitted under
    job_name:   Name of the job
    queue:      Queue the job was submitted to
    state:      Current state - 'Q' in queue, 'R' running, 'H' held, 'E' finished
    ncpus:      Number of cpus requested
    walltime:   Walltime the job has run for so far
    su:         SU cost of the job so far
    mem_pct:    Percent of the memory request used
    cpu_pct:    Percent of time CPUs have been active
    qtime:      Time the job spent in the queue before starting

If 'mem_pct' is below 80% make sure you're not requesting too much memory (4GB
per CPU or less is fine)

If 'cpu_pct' is below 80% and you're requesting more than one CPU make sure
your job is making proper use of parallelisation

options:
  -h, --help            show this help message and exit
  --historical, -x      Show historical info
  --format {table,csv,json}, -f {table,csv,json}
                        Output format
  --project PROJECT, -P PROJECT
                        Show all jobs in a project
  --comment, -c         Show PBS queue comment
$ module use /g/data/vk83/prerelease/modules/
$ module load payu
Loading payu/dev-20250828T223708Z-0fd643d
  Loading requirement: singularity
$ uqstat -h
usage: uqstat [-h] [--historical] [--format {table,csv,json}] [--project PROJECT] [--comment]

Print more detailed information from qstat

Returns the following columns for each job:
    project:    NCI project the job was submitted under
    job_name:   Name of the job
    queue:      Queue the job was submitted to
    state:      Current state - 'Q' in queue, 'R' running, 'H' held, 'E' finished
    ncpus:      Number of cpus requested
    walltime:   Walltime the job has run for so far
    su:         SU cost of the job so far
    mem_pct:    Percent of the memory request used
    cpu_pct:    Percent of time CPUs have been active
    qtime:      Time the job spent in the queue before starting

If 'mem_pct' is below 80% make sure you're not requesting too much memory (4GB
per CPU or less is fine)

If 'cpu_pct' is below 80% and you're requesting more than one CPU make sure
your job is making proper use of parallelisation

options:
  -h, --help            show this help message and exit
  --historical, -x      Show historical info
  --format {table,csv,json}, -f {table,csv,json}
                        Output format
  --project PROJECT, -P PROJECT
                        Show all jobs in a project
  --comment, -c         Show PBS queue comment
$ payu -h
usage: payu [-h] [--version]
            {archive,branch,build,checkout,clone,collate,ghsetup,init,list,profile,push,run,setup,sweep,sync} ...

positional arguments:
  {archive,branch,build,checkout,clone,collate,ghsetup,init,list,profile,push,run,setup,sweep,sync}

options:
  -h, --help            show this help message and exit
  --version             show program's version number and exit

$

I will update this topic when there is a new released version of payu that works as well, but you should be fine to use payu/dev in the meantime if you’re feeling adventurous.

Just a bit more context on this.

The issue is not (or at least not only) related to the payu environment.

I get a similar error with no other modules loaded (I enabled debug logs with CMS_CONDA_DEBUG_SCRIPTS=1):

$ module purge
$ module use /g/data/xp65/public/modules
$ module load nci-scripts
$ CMS_CONDA_DEBUG_SCRIPTS=1 uqstat -h

wrapper_bin =  /g/data/xp65/public/apps/med_conda_scripts/analysis3-25.08.d/bin
conf_file =  /g/data/xp65/public/apps/med_conda_scripts/analysis3-25.08.d/bin/launcher_conf.sh
PROG_ARGS = /g/data/xp65/public/apps/nci_scripts/uqstat
cmd_to_run =  /g/data/xp65/public/apps/med_conda_scripts/analysis3.d/bin/python /g/data/xp65/public/apps/nci_scripts/uqstat
CONTAINER_OVERLAY_PATH after override check =
SINGULARITYENV_PREPEND_PATH=  /g/data/xp65/public/apps/nci_scripts/:/home/565/dm5220/.local/bin:/g/data3/tm70/dm5220/projects/bin:/scratch/tm70/dm5220/micromamba/condabin:/opt/pbs/default/bin:/opt/nci/bin:/opt/bin:/opt/Modules/v4.3.0/bin:/opt/pbs/default/bin
overlay_args=  --overlay=
binding args=  /etc,/half-root,/local,/ram,/run,/system,/usr,/var/lib/sss,/var/run/munge,/sys/fs/cgroup,/iointensive
Singularity invocation:  /opt/singularity/bin/singularity -s exec --bind /etc,/half-root,/local,/ram,/run,/system,/usr,/var/lib/sss,/var/run/munge,/sys/fs/cgroup,/iointensive --overlay= /g/data/xp65/public/apps/med_conda/etc/base.sif /g/data/xp65/public/apps/med_conda_scripts/analysis3.d/bin/python /g/data/xp65/public/apps/nci_scripts/uqstat
wrapper_bin =  /g/data/xp65/public/apps/med_conda_scripts/analysis3-25.08.d/bin
conf_file =  /g/data/xp65/public/apps/med_conda_scripts/analysis3-25.08.d/bin/launcher_conf.sh
PROG_ARGS = /g/data/xp65/public/apps/nci_scripts/uqstat
cmd_to_run =  /g/data/xp65/public/apps/med_conda_scripts/analysis3.d/bin/python /g/data/xp65/public/apps/nci_scripts/uqstat
CONTAINER_OVERLAY_PATH after override check =
Short circuit detected, running:  exec -a /g/data/xp65/public/apps/med_conda_scripts/analysis3.d/bin/python /home/565/dm5220/miniconda3/bin/conda/envs/analysis3-25.08/bin/python /g/data/xp65/public/apps/nci_scripts/uqstat
/g/data/xp65/public/apps/med_conda_scripts/analysis3.d/bin/python: line 121: /home/565/dm5220/miniconda3/bin/conda/envs/analysis3-25.08/bin/python: Not a directory
/g/data/xp65/public/apps/med_conda_scripts/analysis3.d/bin/python: line 121: exec: /home/565/dm5220/miniconda3/bin/conda/envs/analysis3-25.08/bin/python: cannot execute: Not a directory

The error is related to this issue.

The /g/data/xp65/public/apps/med_conda_scripts/analysis3.d/bin/python script at line 121 has:

121 exec -a "${0}" "${cmd_to_run[@]}"

which, from lines 109 and 119:

109 export CONDA_BASE="${CONDA_BASE_ENV_PATH}/envs/${myenv}"
119 cmd_to_run[0]="${CONDA_BASE}/bin/${cmd_to_run[0]##*/}"

can be expanded as:

121(expanded) exec -a "${0}" "${CONDA_BASE_ENV_PATH}/envs/${myenv}/bin/${cmd_to_run[0]##*/}" "${cmd_to_run[@]:1}"

The error messages show that CONDA_BASE_ENV_PATH evaluates to my custom conda directory (/home/565/dm5220/miniconda3/bin/conda) in my case, and to the payu conda directory (/g/data/vk83/apps/base_conda) when the payu module is loaded.
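To make the expansion concrete, here is a minimal sketch that mimics how lines 109 and 119 build the interpreter path. The helper name resolve_python is hypothetical, and a plain variable stands in for the cmd_to_run array used by the real wrapper; the input values are the ones that appear in the error messages in this thread.

```shell
# Sketch only: mimics lines 109 and 119 of the wrapper, it is not the wrapper itself.
# resolve_python is a hypothetical helper; a plain variable replaces the cmd_to_run array.
resolve_python() (
    conda_base_env_path="$1"   # value of CONDA_BASE_ENV_PATH seen by the wrapper
    myenv="$2"                 # environment name
    cmd="$3"                   # first element of cmd_to_run
    conda_base="${conda_base_env_path}/envs/${myenv}"   # as on line 109
    # ${cmd##*/} strips everything up to the last slash, leaving "python" (as on line 119)
    printf '%s\n' "${conda_base}/bin/${cmd##*/}"
)

wrapper=/g/data/xp65/public/apps/med_conda_scripts/analysis3.d/bin/python

# CONDA_BASE_ENV_PATH inherited from the payu module
from_payu=$(resolve_python /g/data/vk83/apps/base_conda analysis3-25.07 "$wrapper")
echo "$from_payu"

# CONDA_BASE_ENV_PATH inherited from a personal conda install
from_home=$(resolve_python /home/565/dm5220/miniconda3/bin/conda analysis3-25.08 "$wrapper")
echo "$from_home"
```

Both printed paths match the ones reported as missing in the errors above, which is consistent with the base path being taken from the caller's environment.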

This should not happen. Instead, CONDA_BASE_ENV_PATH should always evaluate to the conda/analysis3 conda env directory, /g/data/xp65/public/apps/med_conda/ (or whichever directory holds the current conda/analysis environment within the singularity container).
@rbeucher
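One possible way to get that behaviour, shown only as a sketch and not necessarily how the wrapper will be fixed, is for the wrapper to set the base path itself rather than trusting an inherited value. The names WRAPPER_CONDA_BASE and pinned_conda_base are hypothetical, and the base directory is the one mentioned above, assumed here to be the right one.

```shell
# Sketch only: pin the base path inside the wrapper instead of inheriting it.
# WRAPPER_CONDA_BASE and pinned_conda_base are hypothetical names.
WRAPPER_CONDA_BASE=/g/data/xp65/public/apps/med_conda   # assumed base directory

pinned_conda_base() (
    # Overwrite whatever CONDA_BASE_ENV_PATH the caller exported.
    CONDA_BASE_ENV_PATH="$WRAPPER_CONDA_BASE"
    printf '%s\n' "${CONDA_BASE_ENV_PATH}/envs/$1"
)

# Simulate an environment where another module has already exported the variable.
result=$(export CONDA_BASE_ENV_PATH=/g/data/vk83/apps/base_conda; pinned_conda_base analysis3-25.08)
echo "$result"
```

With the assignment made unconditionally, the value exported by another module or by a personal conda install no longer affects the resolved path.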

I am starting to wonder if it would be simpler to put it in the container…

@atteggiani has pointed out this is a broader problem. He is tracking that, so he will update the topic when there are relevant changes available.

This problem should be fixed now.
Could you please confirm @adele-morrison?

Yes, I can confirm there are no more errors. We can now use payu and uqstat simultaneously! Thanks @atteggiani!