I'm using the newly released version of payu and finding the new output directory naming rather frustrating. Previously, the output folder had the same name as the control folder, which was convenient. Now a (random?) experiment ID gets appended to the end, which is not useful (quite the opposite!) for my use case. Is there a simple way (e.g. a flag I can set: legacy = true, maybe!) to return to the previous default behaviour? Reading the docs, I can see that setting
experiment: "control_folder"
in config.yaml will have the desired effect, but is there a way to do this without having to manually add the control folder name in config.yaml for every run…?
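One way to avoid typing the name by hand is to let a small script add the key before each new run. The following is only a minimal sketch: `ensure_experiment_name` is a hypothetical helper (not part of payu), it assumes the control folder name is a plain string that needs no YAML quoting, and it uses a simple text check rather than a real YAML parser.

```python
from pathlib import Path


def ensure_experiment_name(control_dir, config_name="config.yaml"):
    """Append 'experiment: <control folder name>' to the config file.

    Hypothetical helper, not part of payu. Does nothing if a top-level
    'experiment:' key already exists. Returns True if the file changed.
    """
    control_dir = Path(control_dir).resolve()
    config = control_dir / config_name
    text = config.read_text()

    # Plain-text check for a top-level key; this is not a YAML parser.
    if any(line.startswith("experiment:") for line in text.splitlines()):
        return False

    # Make sure the new key starts on its own line.
    if text and not text.endswith("\n"):
        text += "\n"
    config.write_text(text + f"experiment: {control_dir.name}\n")
    return True
```

Calling `ensure_experiment_name(".")` from inside a control folder would add the key once and leave an existing setting untouched on later calls.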
Hi @CallumJShakespeare, I second your opinion and prefer the old way, where the output folder had the same name as the control folder. So far I have dealt with this by using a legacy version of payu, e.g.
module load payu/1.1.6
perhaps not a long-term solution but it does allow you to make your archive folder the “old” way.
Aidan
(Aidan Heerdegen, ACCESS-NRI Release Team Lead)
payu creates a symlink to the archive folder where the outputs are stored, so they shouldn’t be difficult to find.
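For scripts that need the output location, one option is to follow that symlink instead of reconstructing the directory name. This is a minimal sketch, assuming the symlink is called `archive` and sits in the control folder (check this against your own setup); `resolve_archive_dir` is a hypothetical helper name, not a payu function.

```python
from pathlib import Path


def resolve_archive_dir(control_dir, link_name="archive"):
    """Return the real archive directory behind the symlink in control_dir.

    link_name="archive" is an assumption about what the symlink is called.
    """
    link = Path(control_dir) / link_name
    if not link.is_symlink():
        raise FileNotFoundError(
            f"No symlink named {link_name!r} in {control_dir}"
        )
    # resolve() follows the symlink to the actual archive path,
    # including whatever suffix was appended to the directory name.
    return link.resolve()
```

A script can then call `resolve_archive_dir("/path/to/control_folder")` and never needs to know the appended ID in advance.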
Adding part of the experiment_uuid means work and archive directories avoid namespace clashes. Previously, if you gave experiments the same name in different directories, payu found the archive directory by experiment name and assumed it was the same experiment.
Not all change is bad.
There is a work-around to achieve the behaviour you want, but I strongly advise against it.
Aidan
(Aidan Heerdegen, ACCESS-NRI Release Team Lead)
I’m curious why this strong preference?
Appending the UUID solves, with a small change to the automated naming of the archive directory, a long-standing problem that can trip up both experienced and new users without them realising it.
I don’t understand the “long standing problem” you’re referring to here? Edit: ahh okay, I see it comes about if one has multiple directories with experiments in them. I’ve never done that, so never had this problem.
My objection is that previously I knew in advance exactly how the output directory would be named, so it was simple to find programmatically. Now it's an extra step to find out what the appended exp-id is before I know what directory name to use in scripts… hence it's just an extra annoyance that seems entirely unnecessary, as it worked perfectly well before…
Don't worry, I am using the workaround… and creating a bunch of symlinks to deal with the runs I've already done that are oddly named
Aidan
(Aidan Heerdegen, ACCESS-NRI Release Team Lead)
It might be that it affected users who cloned shared model configurations, e.g. COSIMA experiment configurations, and used them as-is. If they did this in different directories and used the same PROJECT code to run them, payu would find the archive from the initial experiment and happily begin writing to it rather than beginning a new experiment.
You can get the previous behaviour by setting
metadata:
  enable: False
in your config.yaml.
When you do this payu will no longer create a metadata.yaml file or generate a unique experiment ID.
We’re in the process of adding the ability to add these unique IDs to the global attributes of the model output netCDF files to improve provenance and tracking. This will not be available if you disable metadata generation and would make it harder to ingest model data into an ACCESS-NRI intake catalogue.
Aidan
(Aidan Heerdegen, ACCESS-NRI Release Team Lead)
Thanks! I'll do that. I don't use intake and generally don't have a need to share my output from bespoke models, so provenance and tracking are not a priority…
Aidan
(Aidan Heerdegen, ACCESS-NRI Release Team Lead)
Good that we have a solution that works for you.
Fair enough; however, provenance can be very useful when a data file gets moved out of its payu context, e.g. copied to another directory.