@dougrichardson am I correctly understanding that you’re spinning up the PBSCluster from within an ARE instance?
If so, can you try a couple of things just to see what happens:
- Try to launch the script you provided in your initial post from a login node
- Change the script to the following from within the ARE session:
from dask.distributed import Client, LocalCluster
from dask_jobqueue import PBSCluster
import sys

walltime = "00:05:00"
cores = 1
memory = str(4 * cores) + "GB"

cluster = PBSCluster(
    walltime=str(walltime),
    cores=cores,
    memory=str(memory),
    processes=cores,
    job_extra_directives=[
        "-q normal",
        "-P dt6",
        "-l ncpus=" + str(cores),
        "-l mem=" + str(memory),
        "-l storage=gdata/xp65+gdata/w42",
    ],
    local_directory="$TMPDIR",
    job_directives_skip=["select"],
    # The next two directives are deliberately commented out for this test:
    # python="/g/data/xp65/public/apps/med_conda_scripts/analysis3-25.08.d/bin/python",
    # job_script_prologue=['module load conda/analysis3-25.08'],
)
cluster.scale(jobs=1)
client = Client(cluster)
print(client)
- Try a combo of both, i.e. running the script above from a login node with conda/analysis3 loaded. The xp65 environment variables should configure it for you (I can’t spot anything, but maybe there’s a typo in there?).
A while back, we set the environment variables to handle this by default - I wonder if the added configuration is causing a conflict. See also forum post and git issue.
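To see which defaults are actually in play, it may help to list the Dask-related environment variables in both the ARE session and on the login node. Dask's configuration system reads variables prefixed with `DASK_`, with double underscores marking nesting, so a variable named like `DASK_JOBQUEUE__PBS__PYTHON` would map to the `jobqueue.pbs.python` setting. This is only a sketch: I'm assuming the xp65 defaults are delivered that way, and `dask_env_settings` is a helper name made up for illustration.

```python
import os


def dask_env_settings(environ=None):
    """Return every DASK_-prefixed environment variable, sorted by name.

    Assumption: the xp65 defaults arrive as DASK_* variables. If they are
    set some other way (e.g. a YAML config file), this will not show them.
    """
    environ = os.environ if environ is None else environ
    return {name: environ[name] for name in sorted(environ) if name.startswith("DASK_")}


# Print the variables visible to the current session.
for name, value in dask_env_settings().items():
    print(f"{name}={value}")
```

If the output differs between the ARE session and the login node, that difference is a good place to start looking.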
I have a mild suspicion that, for containerisation reasons, the additional config is confusing things - but I’m not entirely sure why yet. I thiiiink that even if some of my suggestions above don’t work, they should at least give us a different error & point us to the source of the issue.
My suspicion is that one of the two directives commented out in the script above (python and job_script_prologue) is messing up the python path somehow, and it’s sending dask looking for a python executable in the wrong place (N.B. I’ve split this error up into lines):
ERROR: Unable to locate a modulefile for 'conda/analysis3-25.08'
/g/data/xp65/public/apps/med_conda_scripts/analysis3-25.08.d/bin/python: line 121: /g/data/w42/dr6273/apps/conda/bin/conda/envs/analysis3-25.08/bin/python: Not a directory
/g/data/xp65/public/apps/med_conda_scripts/analysis3-25.08.d/bin/python: line 121: exec: /g/data/w42/dr6273/apps/conda/bin/conda/envs/analysis3-25.08/bin/python: cannot execute: Not a directory
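If you read the lines after the modulefile error, it looks like paths have gotten mangled together: the path runs through /g/data/w42/dr6273/apps/conda/bin/conda/envs/..., which treats bin/conda (presumably the conda executable from your own install under w42) as if it were a directory containing envs/. That would explain the "Not a directory" errors.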
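"Not a directory" is the ENOTDIR error, which the OS raises when something in the middle of a path exists but is a regular file rather than a directory. A quick way to find the offending component is to walk the path from the root downward. A minimal sketch using only the standard library; `first_non_directory` is a hypothetical helper, and the demo builds a throwaway layout that mimics the one in the error rather than touching the real paths:

```python
import tempfile
from pathlib import Path


def first_non_directory(path):
    """Return the first parent of `path` that exists but is not a directory.

    Returns None if every existing parent is a directory.
    """
    # Path.parents runs from nearest to root, so reverse it to walk downward.
    for prefix in list(Path(path).parents)[::-1]:
        if prefix.exists() and not prefix.is_dir():
            return prefix
    return None


# Demo: make conda/bin/conda a regular file, then use it as a directory.
with tempfile.TemporaryDirectory() as tmp:
    conda_exe = Path(tmp) / "conda" / "bin" / "conda"
    conda_exe.parent.mkdir(parents=True)
    conda_exe.write_text("#!/bin/sh\n")
    bad_path = conda_exe / "envs" / "analysis3-25.08" / "bin" / "python"
    culprit = first_non_directory(bad_path)
    print(culprit)
```

On Gadi you could pass the path from the error message instead of `bad_path`. If it reports `.../apps/conda/bin/conda`, that would support the idea that something is using the location of your own conda executable as a base directory for `envs/`.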