Can we add uqstat to vk83 payu conda?

I am having trouble doing both:

module use /g/data/vk83/modules
module load payu

and

module use /g/data/xp65/public/modules
module load nci-scripts

at the same time, so that I can use payu run and uqstat in the same terminal.

@Aidan suggests perhaps the best solution to this is to add nci-scripts to the payu conda environment.

@adele-morrison can you paste in the text of the error you got when you tried to do this?

I loaded both and had no issue running either nqstat or payu:

$ module use /g/data/vk83/modules
$ module load payu
Loading payu/1.1.7
  Loading requirement: singularity
$ module use /g/data/xp65/public/modules
$ module load nci-scripts
$ payu -h
usage: payu [-h] [--version]
            {archive,branch,build,checkout,clone,collate,ghsetup,init,list,profile,push,run,setup,sweep,sync} ...

positional arguments:
  {archive,branch,build,checkout,clone,collate,ghsetup,init,list,profile,push,run,setup,sweep,sync}

options:
  -h, --help            show this help message and exit
  --version             show program's version number and exit

$ nqstat
Job ID    S Proj User   Queue      Job Name         Used       Request   CPUs CPU%
----------------------------------------------------------------------------------
148497993 R xx99 xx9999 normalbw   sys-dashboard-   05:10:03   08:00:00      1   1
148499836 R xx99 xx9999 normalbw   jupyter-lab      04:50:29   08:00:00      8   0

Doh! I typed nqstat instead of uqstat.

So now I do get an error. Is it the same as yours @adele-morrison?

$ module use /g/data/vk83/modules
$ module load payu
Loading payu/1.1.7
  Loading requirement: singularity
$ module use /g/data/xp65/public/modules
$ module load nci-scripts
$ payu -h
usage: payu [-h] [--version]
            {archive,branch,build,checkout,clone,collate,ghsetup,init,list,profile,push,run,setup,sweep,sync} ...

positional arguments:
  {archive,branch,build,checkout,clone,collate,ghsetup,init,list,profile,push,run,setup,sweep,sync}

options:
  -h, --help            show this help message and exit
  --version             show program's version number and exit

$ uqstat
/g/data/xp65/public/apps/med_conda_scripts/analysis3.d/bin/python: line 121: /g/data/vk83/apps/base_conda/envs/analysis3-25.07/bin/python: No such file or directory
$ 

Yep, same error. payu seems to work fine, but uqstat does not work.

The short answer is that they’re both using conda python environments that are not playing well together.

This is already solved in the version of payu in payu/dev, and we’ll shortly release a new version of payu that also avoids the problem:

$ module use /g/data/xp65/public/modules
$ module load nci-scripts
$ uqstat -h
usage: uqstat [-h] [--historical] [--format {table,csv,json}] [--project PROJECT] [--comment]

Print more detailed information from qstat

Returns the following columns for each job:
    project:    NCI project the job was submitted under
    job_name:   Name of the job
    queue:      Queue the job was submitted to
    state:      Current state - 'Q' in queue, 'R' running, 'H' held, 'E' finished
    ncpus:      Number of cpus requested
    walltime:   Walltime the job has run for so far
    su:         SU cost of the job so far
    mem_pct:    Percent of the memory request used
    cpu_pct:    Percent of time CPUs have been active
    qtime:      Time the job spent in the queue before starting

If 'mem_pct' is below 80% make sure you're not requesting too much memory (4GB
per CPU or less is fine)

If 'cpu_pct' is below 80% and you're requesting more than one CPU make sure
your job is making proper use of parallelisation

options:
  -h, --help            show this help message and exit
  --historical, -x      Show historical info
  --format {table,csv,json}, -f {table,csv,json}
                        Output format
  --project PROJECT, -P PROJECT
                        Show all jobs in a project
  --comment, -c         Show PBS queue comment
$ module use /g/data/vk83/prerelease/modules/
$ module load payu
Loading payu/dev-20250828T223708Z-0fd643d
  Loading requirement: singularity
$ uqstat -h
usage: uqstat [-h] [--historical] [--format {table,csv,json}] [--project PROJECT] [--comment]

Print more detailed information from qstat

Returns the following columns for each job:
    project:    NCI project the job was submitted under
    job_name:   Name of the job
    queue:      Queue the job was submitted to
    state:      Current state - 'Q' in queue, 'R' running, 'H' held, 'E' finished
    ncpus:      Number of cpus requested
    walltime:   Walltime the job has run for so far
    su:         SU cost of the job so far
    mem_pct:    Percent of the memory request used
    cpu_pct:    Percent of time CPUs have been active
    qtime:      Time the job spent in the queue before starting

If 'mem_pct' is below 80% make sure you're not requesting too much memory (4GB
per CPU or less is fine)

If 'cpu_pct' is below 80% and you're requesting more than one CPU make sure
your job is making proper use of parallelisation

options:
  -h, --help            show this help message and exit
  --historical, -x      Show historical info
  --format {table,csv,json}, -f {table,csv,json}
                        Output format
  --project PROJECT, -P PROJECT
                        Show all jobs in a project
  --comment, -c         Show PBS queue comment
$ payu -h
usage: payu [-h] [--version]
            {archive,branch,build,checkout,clone,collate,ghsetup,init,list,profile,push,run,setup,sweep,sync} ...

positional arguments:
  {archive,branch,build,checkout,clone,collate,ghsetup,init,list,profile,push,run,setup,sweep,sync}

options:
  -h, --help            show this help message and exit
  --version             show program's version number and exit

$

I will update this topic when a released version of payu works as well, but you should be fine to use payu/dev in the meantime if you’re feeling adventurous.

Just a bit more context on this.

The issue is not (or at least not only) related to the payu environment.

I get a similar error without any module loaded (I enabled debug logs with CMS_CONDA_DEBUG_SCRIPTS=1):

$ module purge
$ module use /g/data/xp65/public/modules
$ module load nci-scripts
$ CMS_CONDA_DEBUG_SCRIPTS=1 uqstat -h

wrapper_bin =  /g/data/xp65/public/apps/med_conda_scripts/analysis3-25.08.d/bin
conf_file =  /g/data/xp65/public/apps/med_conda_scripts/analysis3-25.08.d/bin/launcher_conf.sh
PROG_ARGS = /g/data/xp65/public/apps/nci_scripts/uqstat
cmd_to_run =  /g/data/xp65/public/apps/med_conda_scripts/analysis3.d/bin/python /g/data/xp65/public/apps/nci_scripts/uqstat
CONTAINER_OVERLAY_PATH after override check =
SINGULARITYENV_PREPEND_PATH=  /g/data/xp65/public/apps/nci_scripts/:/home/565/dm5220/.local/bin:/g/data3/tm70/dm5220/projects/bin:/scratch/tm70/dm5220/micromamba/condabin:/opt/pbs/default/bin:/opt/nci/bin:/opt/bin:/opt/Modules/v4.3.0/bin:/opt/pbs/default/bin
overlay_args=  --overlay=
binding args=  /etc,/half-root,/local,/ram,/run,/system,/usr,/var/lib/sss,/var/run/munge,/sys/fs/cgroup,/iointensive
Singularity invocation:  /opt/singularity/bin/singularity -s exec --bind /etc,/half-root,/local,/ram,/run,/system,/usr,/var/lib/sss,/var/run/munge,/sys/fs/cgroup,/iointensive --overlay= /g/data/xp65/public/apps/med_conda/etc/base.sif /g/data/xp65/public/apps/med_conda_scripts/analysis3.d/bin/python /g/data/xp65/public/apps/nci_scripts/uqstat
wrapper_bin =  /g/data/xp65/public/apps/med_conda_scripts/analysis3-25.08.d/bin
conf_file =  /g/data/xp65/public/apps/med_conda_scripts/analysis3-25.08.d/bin/launcher_conf.sh
PROG_ARGS = /g/data/xp65/public/apps/nci_scripts/uqstat
cmd_to_run =  /g/data/xp65/public/apps/med_conda_scripts/analysis3.d/bin/python /g/data/xp65/public/apps/nci_scripts/uqstat
CONTAINER_OVERLAY_PATH after override check =
Short circuit detected, running:  exec -a /g/data/xp65/public/apps/med_conda_scripts/analysis3.d/bin/python /home/565/dm5220/miniconda3/bin/conda/envs/analysis3-25.08/bin/python /g/data/xp65/public/apps/nci_scripts/uqstat
/g/data/xp65/public/apps/med_conda_scripts/analysis3.d/bin/python: line 121: /home/565/dm5220/miniconda3/bin/conda/envs/analysis3-25.08/bin/python: Not a directory
/g/data/xp65/public/apps/med_conda_scripts/analysis3.d/bin/python: line 121: exec: /home/565/dm5220/miniconda3/bin/conda/envs/analysis3-25.08/bin/python: cannot execute: Not a directory

The error is related to this issue.

The /g/data/xp65/public/apps/med_conda_scripts/analysis3.d/bin/python script at line 121 has:

121 exec -a "${0}" "${cmd_to_run[@]}"

which, from lines 109 and 119:

109 export CONDA_BASE="${CONDA_BASE_ENV_PATH}/envs/${myenv}"
119 cmd_to_run[0]="${CONDA_BASE}/bin/${cmd_to_run[0]##*/}"

can be expanded as:

121(expanded) exec -a "${0}" "${CONDA_BASE_ENV_PATH}/envs/${myenv}/bin/${cmd_to_run[0]##*/}" "${cmd_to_run[@]:1}"

From the output errors, it can be seen that CONDA_BASE_ENV_PATH gets evaluated to my custom conda directory (/home/565/dm5220/miniconda3/bin/conda) in my case, and to the payu conda directory (/g/data/vk83/apps/base_conda) when the payu module is loaded.
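
To make the expansion concrete, here is a minimal shell sketch that replays the path arithmetic of lines 109 and 119 with the two values seen in the errors above. `resolve_python` is a hypothetical helper written only for illustration, and it uses a plain variable where the real wrapper uses the `cmd_to_run` array.

```shell
# Illustrative only: mimics lines 109 and 119 of the wrapper script.
# resolve_python is a hypothetical name, not part of the real wrapper.
# The function body runs in a subshell so its variables do not leak out.
resolve_python() (
    CONDA_BASE_ENV_PATH="$1"   # base path the wrapper ends up using
    myenv="$2"                 # environment name, e.g. analysis3-25.08
    cmd="$3"                   # first element of cmd_to_run in the wrapper
    CONDA_BASE="${CONDA_BASE_ENV_PATH}/envs/${myenv}"   # line 109
    printf '%s\n' "${CONDA_BASE}/bin/${cmd##*/}"        # line 119
)

wrapper=/g/data/xp65/public/apps/med_conda_scripts/analysis3.d/bin/python

# Base path seen when the payu module is loaded
resolve_python /g/data/vk83/apps/base_conda analysis3-25.07 "$wrapper"

# Base path seen with a personal miniconda install
resolve_python /home/565/dm5220/miniconda3/bin/conda analysis3-25.08 "$wrapper"
```

The two printed paths are the ones quoted in the "No such file or directory" and "Not a directory" errors, which is consistent with the wrong base path being the only thing that differs between the two cases.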

This should not happen. Instead, CONDA_BASE_ENV_PATH should always evaluate to the conda/analysis3 conda env directory, /g/data/xp65/public/apps/med_conda/ (or whichever directory holds the current conda/analysis environment within the singularity container).
@rbeucher
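
For anyone who wants to check their own session, a rough diagnostic could look like the sketch below. It assumes that CONDA_BASE_ENV_PATH is visible as an environment variable in the calling shell, which I have not verified for every case. `check_conda_base` is a hypothetical helper, and the expected directory is passed in as an argument rather than hard-coded because the right value may differ.

```shell
# Hypothetical helper: compares CONDA_BASE_ENV_PATH in the current shell
# with the directory the wrapper is expected to use.
# Returns 1 only when the variable is set and points somewhere else.
check_conda_base() {
    expected="${1%/}"                    # drop a trailing slash, if any
    actual="${CONDA_BASE_ENV_PATH:-}"    # empty when the variable is unset
    case "$actual" in
        "")
            echo "CONDA_BASE_ENV_PATH is unset" ;;
        "$expected"|"$expected"/*)
            echo "ok: $actual" ;;
        *)
            echo "mismatch: $actual (expected under $expected)"
            return 1 ;;
    esac
}

# Example: simulate the value seen when the payu module is loaded
if ! ( export CONDA_BASE_ENV_PATH=/g/data/vk83/apps/base_conda
       check_conda_base /g/data/xp65/public/apps/med_conda/ ); then
    echo "the wrapper would look for python in the wrong place"
fi
```

A mismatch here would only indicate that the value in the shell differs from the expected one. It does not by itself show where the value was set.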

I am starting to wonder if it would be simpler to put it in the container…

@atteggiani has pointed out this is a broader problem. He is tracking that, so he will update the topic when there are relevant changes available.

This problem should be fixed now.
Could you please confirm @adele-morrison?

Yes, I can confirm there are no more errors: we can now use payu and uqstat simultaneously! Thanks @atteggiani!
