Python problem with RNS, task install_glm_startdata

I have been working with a copy of the Regional Nesting Suite by395 (my copy is de207). I had an initial attempt working cleanly earlier this week, but have had problems since the NCI issues yesterday (the timing might be a coincidence).

The task install_glm_startdata now fails, with the following from job.err:

Using the cylc session localhost

Loading cylc7/23.09
Loading requirement: mosrs-setup/1.0.1
Loading conda/analysis3-23.07
Module ERROR: CONDA_SHLVL=0
CONDA_EXE=/g/data/hh5/public/apps/miniconda3/bin/conda
_CE_M=
PWD=/home/599/mxw599
CONDA_PYTHON_EXE=/g/data/hh5/public/apps/miniconda3/bin/python
_CE_CONDA=
SHLVL=1
PATH=/g/data/hh5/public/apps/miniconda3/condabin:/usr/bin:/bin
_=/bin/env
Fatal Python error: init_import_site: Failed to import the site module
Python runtime state: initialized
Traceback (most recent call last):
  File "/g/data/hh5/public/apps/miniconda3/lib/python3.9/site.py", line 73, in <module>
    import os
  File "/g/data/hh5/public/apps/miniconda3/lib/python3.9/os.py", line 61, in <module>
    import posixpath as path
  File "/g/data/hh5/public/apps/miniconda3/lib/python3.9/posixpath.py", line 28, in <module>
    import genericpath
ModuleNotFoundError: No module named 'genericpath'
In '/g/data/hh5/public/modules/conda/analysis3-23.07'
Please contact root@localhost
RuntimeError: unable to get file status from '/g/data/hr22/apps/cylc7/lib/python2.7/site.py'
/local/spool/pbs/mom_priv/jobs/111766273.gadi-pbs.SC: line 93: ROSE_DATAC: unbound variable
/g/data/hr22/apps/cylc7/cylc_7.9.7/…/23.09/bin/cylc: line 8: /g/data/hr22/apps/cylc7/23.09/…/cylc_7.9.7/bin/cylc: Input/output error
/g/data/hr22/apps/cylc7/cylc_7.9.7/…/23.09/bin/cylc: line 8: /g/data/hr22/apps/cylc7/23.09/…/cylc_7.9.7/bin/cylc: Input/output error

Additionally, the failure is not being reported back to cylc, with the task status remaining as ‘submitted’ despite the failure.

Any help appreciated!

Hi Matt,

I usually run an “install_ec_startdata” RNS suite.

Last week I had suites that used to run suddenly stop working, and I got around it by rebuilding the executable. Do you want to try that?

The other question is: what commands do you run to set up your environment?

I use:

[image: environment setup commands]

Hope that helps somewhat.

We’ve been getting reports of strange Python failures today, specifically on copyq. I wonder if that’s related. Could be something weird with the disks.
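
If it helps narrow down the disk theory, here is a rough sketch of a check you could run from a job on the affected queue. It just tries to stat each file and read a few bytes from it, then reports the errno name if that fails (an EIO would match the "Input/output error" lines in the job.err). The function names are made up for illustration, and the genericpath.py path is my guess at where the missing module should live, based on the traceback. You would need to run it with a Python that still starts on that node (for example a system python3, assuming one is available).

```python
import errno
import os
import stat


def probe_path(path, nbytes=1024):
    """Stat one path and, if it is a regular file, read a few bytes.

    Returns (ok, detail), where detail is "ok" or "<ERRNO NAME>: <message>".
    """
    try:
        st = os.stat(path)
        if stat.S_ISREG(st.st_mode):
            with open(path, "rb") as fh:
                fh.read(nbytes)
    except OSError as exc:
        # Translate the numeric errno into its symbolic name, e.g. ENOENT or EIO.
        name = errno.errorcode.get(exc.errno, "UNKNOWN")
        return False, f"{name}: {exc.strerror}"
    return True, "ok"


def probe_paths(paths):
    """Probe several paths and return {path: (ok, detail)}."""
    return {p: probe_path(p) for p in paths}


if __name__ == "__main__":
    # Second path is copied from the job.err; the first is an assumed location
    # for the module that Python could not import.
    paths = [
        "/g/data/hh5/public/apps/miniconda3/lib/python3.9/genericpath.py",
        "/g/data/hr22/apps/cylc7/lib/python2.7/site.py",
    ]
    for path, (ok, detail) in probe_paths(paths).items():
        print("OK  " if ok else "FAIL", path, detail)
```

Running the same script on a login node and inside a copyq job, then comparing the output, would show whether the files are only unreadable from the copyq nodes.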

Hi Chermelle, Scott,

Thanks for the suggestions!

I tried using express rather than copyq, and the task ran cleanly and also reported the correct task status back to the cylc GUI (which wasn’t happening before). So I think we have our culprit: problems with copyq.

Cheers,
Matt