BUT then after the next job ancil_sst_seaice finishes, all files in the output directory have been deleted. The folder is empty. Others have been able to run the same suite and branch just fine, so this is very odd. @mlipson @cbengel
@reyhan.respati and I had the same experience trying to run the suite last week. I didn’t have much time to dig into the code, but it seems the creation of the .anc files is failing silently.
Yes, thanks @Jatreutlein I am looking into it already.
For some reason the ancillary conversion is not working for some people.
I made the jobs run in parallel in the scripts because Gadi job submissions were slowing down the suite completion, but errors from the parallel jobs are not being propagated.
Can one of you (whose suite is not producing the ancillary files) please do me a favour and check out u-dk517/trunk@334063 (I can privately message you the full command) to try out the suite as it was before the parallel wrapping was added? Then can you please send me the output of the ancil job?
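For anyone curious what "errors not being propagated" looks like in practice, here is a minimal shell sketch of running per-day conversions in parallel while still surfacing failures. It is not the suite's actual script: convert_day and run_all_days are hypothetical names, and convert_day is a stand-in that fails on purpose for day 2. The idea is to record each background PID and wait on each one individually, since a bare wait with no arguments returns 0 even if a background job failed.

```shell
# Hypothetical stand-in for the real per-day ancillary conversion.
# It deliberately fails for day "2" to show the failure being caught.
convert_day() {
    [ "$1" != "2" ]
}

# Launch one background job per day, then wait on each PID in turn so
# that a non-zero exit status from any job is recorded.
run_all_days() {
    pids=""
    for day in "$@"; do
        convert_day "$day" &
        pids="$pids $!"
    done

    failed=0
    for pid in $pids; do
        # "wait PID" returns that job's exit status.
        wait "$pid" || failed=1
    done
    return "$failed"
}
```

Calling run_all_days 1 2 3 returns a non-zero status because day 2 fails, so the calling task can be marked as failed instead of finishing silently.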
In that older version of the suite each day is submitted individually so we will be able to look at the ancillary job error.
(Please note that your output directory would need to exist already, as the creation of the directory was added later.)
Thanks @cbengel, I’m trying to check out that version but I’m running into an issue:
rosie checkout u-dk517/trunk@334063
[FAIL] svn checkout -q https://code.metoffice.gov.uk/svn/roses-u/d/k/5/1/7/trunk@334063 /home/272/jt0319/roses/u-dk517 # return-code=1, stderr=
[FAIL] svn: E170013: Unable to connect to a repository at URL 'https://code.metoffice.gov.uk/svn/roses-u/d/k/5/1/7/trunk'
[FAIL] svn: E215004: No more credentials or we tried too many times.
[FAIL] Authentication failed
clairecarouge
(Claire Carouge, ACCESS-NRI Land Modelling Team Lead)
We discussed the issues with running u-dk517 with Chermelle. Errors of the type “it works for me and not for them” usually point to a problem with the individual’s setup. For Python-based issues, the usual cause is Python mixing up the intended environment with packages installed locally under $HOME/.local.
When possible, deleting $HOME/.local and retrying the failing job is the surest way to test if that’s the cause. But it’s not always possible.
Otherwise, you can set PYTHONNOUSERSITE=1 as an environment variable before launching the Python script. This would have to be set in the suite in the same job that is running pp2anc.py. I haven’t read the suite so not sure where exactly, but it would likely be with the module loads.
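As a rough sketch of what that could look like in a job's shell environment (where exactly it belongs in the suite is an assumption, and the check below is optional and only for illustration):

```shell
# Tell Python to ignore the per-user site-packages under $HOME/.local
# for every Python process launched from this shell.
export PYTHONNOUSERSITE=1

# Optional sanity check: sys.flags.no_user_site is non-zero when the
# variable has been picked up by the interpreter.
if command -v python3 >/dev/null 2>&1; then
    python3 -c 'import sys; print("no_user_site =", sys.flags.no_user_site)'
fi
```

The export has to happen in the same job environment that runs pp2anc.py, before the script is launched, otherwise it has no effect on that process.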
Thanks for your suggestion. I tried adding export PYTHONNOUSERSITE=1 after line 90 in suite.rc of the released u-dk517 suite, and now I have the .anc files.