Payu: a workflow manager for some ACCESS models

Aidan · 17 August 2023 06:07

What is this?

A topic that will be used to announce updates to payu at NCI.

What is payu?

payu is the tool used to run a number of ACCESS models on NCI hardware.

How is this topic used?

This topic will be the place where updates to payu at NCI will be announced, with details of what those updates entail.

What should I do to be notified of any updates?

You can watch this topic, and be notified by email of every update.

Where do I ask questions about releases?

Replies to this topic are disabled.

If you have specific questions about this release follow the guidelines for requesting help from ACCESS-NRI.

If you have questions about payu create a topic in a category that best matches the model you are using, or in the Technical category and tag it with payu. If you require assistance follow the guidelines for requesting help from ACCESS-NRI.

Aidan · 17 August 2023 06:35

17/08/2023

payu has been updated to v1.0.29 in the conda/analysis3-unstable conda environment maintained by the CLEX CMS team.

Updates:

Added support for module use to add module paths in config.yaml Issue #347
Removed redundant code intended to support MVAPICH Issue #350
Supports models built with spack Issue #341
Fixed bug introduced with spack support update (version 1.0.28) and cleaned up internal code logic Issue #354

Notes

Enhanced module support

payu already supported loading user-specified modules when a model is run.

payu now also supports module use paths in the config.yaml file. This is documented but in essence if you have an existing config.yaml that has something like this:

modules:
  - netcdf-c-4.9.0
  - parallel-netcdf-1.12.3
  - xerces-c-3.2.3

that only works if you first do

module use /path/to/module/directory

then the module use step is no longer required, because it can be incorporated into config.yaml by changing it to:

modules:
   use:
      - /path/to/module/directory
   load:
      - netcdf-c-4.9.0
      - parallel-netcdf-1.12.3
      - xerces-c-3.2.3

This has a couple of benefits:

It makes the configuration more portable. Someone else can clone your configuration and run it without any additional steps.
The paths specified in the use section are parsed for storage points, and /g/data/ and /scratch directories are automatically added to the -l storage options when payu submits the job to the PBS queue.

Note: the previous syntax is still supported.
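
As a rough illustration of the two points above, the Python sketch below normalises both the old and new modules syntax into one shape and derives a PBS-style storage string from the use paths. The function names normalise_modules and storage_flags are hypothetical and are not payu's internal API, and the example project codes are made up. This is a sketch of the idea, not payu's implementation.

```python
import re


def normalise_modules(modules):
    """Accept the old list syntax or the new use/load mapping (hypothetical helper)."""
    if isinstance(modules, list):
        # Old syntax: a plain list of modules to load, no module paths.
        return {"use": [], "load": list(modules)}
    return {
        "use": list(modules.get("use", [])),
        "load": list(modules.get("load", [])),
    }


def storage_flags(paths):
    """Build a PBS storage string from /g/data and /scratch paths (hypothetical helper)."""
    mounts = set()
    for path in paths:
        match = re.match(r"^/(g/data|scratch)/([^/]+)", path)
        if match:
            filesystem = match.group(1).replace("/", "")  # "g/data" -> "gdata"
            mounts.add(f"{filesystem}/{match.group(2)}")
    # Paths outside /g/data and /scratch contribute nothing.
    return "+".join(sorted(mounts))
```

For example, use paths under /g/data/vk83 and /scratch/ab12 would give gdata/vk83+scratch/ab12, the form accepted by -l storage at NCI.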

Support for spack built executables

For the most part spack uses a mechanism to embed within an executable information about the location of library dependencies that were used to compile the executable.

In these cases the executable does not rely on the module inspection payu performs to make sure the correct modules are loaded. For libraries outside the /apps hierarchy, payu no longer attempts this sort of inspection.

Credits

This work was all done by @jo-basevi. Thanks!

Aidan · 16 February 2024 06:53

18/02/2024

NOTE: ACCESS-NRI is now supporting payu in a dedicated conda environment in the vk83 project. vk83 is the project ACCESS-NRI will be using to release all climate models in the future. See below for more information.

This release adds support for experiment UUIDs and marks a major change in the way payu names the work and archive laboratory directories to support experiment UUIDs and git branching. The naming scheme incorporates a portion of the experiment UUID and the git branch name which prevents namespace clashes, and allows experiments with the same name to co-exist. More importantly it means git branches can be utilised seamlessly as independent experiments.

As a result the minor version has been incremented (from 1.0 to 1.1): the changes are backward compatible, but from now on by default new experiments will use the new naming scheme.

Updates:

Automatic generation of unique Experiment IDs (UUIDv4)
Automatic creation and population of metadata.yaml file (compatible with ACCESS-NRI Intake Catalogue)
Uniquely named work and archive directories, allows the same experiment repository to be used for multiple unique experiments in separate branches
New payu checkout command: wraps git checkout to facilitate changing between experiments stored in separate branches
New payu branch command: lists available branches, and their experiments, that exist in the repository
New payu clone command: wrapper for git clone that updates metadata and makes sure the experiment directory is correctly configured
New sync support: uses rsync to copy outputs and restarts to a specified location (can be a local long-term storage disk, or a completely remote machine). Incredibly useful functionality when model outputs are saved to short-term storage, e.g. /scratch at NCI
Date based restart pruning: payu now supports pandas style date/frequency syntax to specify what restarts should be retained

Notes

Experiment UUIDs

payu now automatically generates unique experiment UUIDs. These are in UUIDv4 format: a 128-bit number typically written as hexadecimal digits, e.g.

550e8400-e29b-41d4-a716-446655440000

They are unique for all practical purposes, and so can be confidently used to identify and track experiments. UUIDs are not human friendly, and are designed to be used by software, but the first 8 digits of the experiment UUID are used to uniquely name experiment laboratory archive and work directories.
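
For readers unfamiliar with UUIDv4, here is a minimal Python sketch of generating one and taking its first 8 digits. It uses only the standard library uuid module. The helper names are hypothetical, and this is not necessarily how payu generates or stores its UUIDs.

```python
import uuid


def new_experiment_uuid():
    """Return a random (version 4) UUID as a string (hypothetical helper)."""
    return str(uuid.uuid4())


def short_uuid(experiment_uuid, length=8):
    """Return the leading digits of the UUID, as used in directory names."""
    return experiment_uuid[:length]


experiment_uuid = new_experiment_uuid()
print(experiment_uuid, short_uuid(experiment_uuid))
```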

Branches

git branches are now explicitly supported by payu, and form a crucial part of the updated workflow. This means a single control directory (which is a git repository) can contain multiple independent experiments, and it is possible to switch between experiments, though only one experiment can be active and running at any one time.

See the payu tutorial for more information on branches.

Experiment naming

An experiment name is used to identify the experiment inside the work and archive sub-directories inside the laboratory.

The experiment name historically would default to the name of the control directory. This is still supported for experiments with pre-existing archived outputs. To support git branches and ensure uniqueness in shared archives, the new default behaviour is to add the branch name and a short version of the experiment UUID to the name of the control directory when creating experiment names.
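
A sketch of how such a name could be composed, in Python. The function name, the order of the parts and the hyphen separator are assumptions made for illustration; check the payu tutorial for the exact scheme.

```python
def experiment_name(control_dir, branch, experiment_uuid, uuid_length=8):
    """Combine control directory name, branch and short UUID (assumed layout)."""
    short_id = experiment_uuid[:uuid_length]
    return f"{control_dir}-{branch}-{short_id}"


# Two branches of the same control directory get distinct names.
print(experiment_name("my_expt", "control", "550e8400-e29b-41d4-a716-446655440000"))
print(experiment_name("my_expt", "perturb", "9f1c2d3e-0000-4000-8000-000000000000"))
```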

See the payu tutorial for more detail.

Syncing

payu now supports syncing of an experiment archive to another filesystem, either local or remote. There are a number of configuration options to customise what is sync’ed and when. See the payu tutorial for more detail.

Restart pruning

payu now supports specifying which restarts to retain using date-based frequencies. This allows restarts pruning based on time units. For example setting

restart_freq: 5YS

will only save the first restart of every fifth year, with the rest deleted.
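
To make the behaviour concrete, here is a simplified Python sketch of this kind of pruning for a whole-year frequency. It uses only the standard library rather than pandas, handles only year-start frequencies, and the function name restarts_to_keep is hypothetical. It is an approximation of the idea, not payu's code.

```python
from datetime import date


def restarts_to_keep(restart_dates, years=5):
    """Keep the first restart on or after each N-year boundary (simplified)."""
    kept = []
    next_keep = None
    for restart_date in sorted(restart_dates):
        if next_keep is None or restart_date >= next_keep:
            kept.append(restart_date)
            # Next boundary: start of the year N years after the kept restart.
            next_keep = date(restart_date.year + years, 1, 1)
    return kept


yearly = [date(year, 1, 1) for year in range(1900, 1912)]
print([d.year for d in restarts_to_keep(yearly)])
```

With yearly restarts from 1900 to 1911, this keeps the restarts for 1900, 1905 and 1910, and the rest would be candidates for deletion.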

See the payu tutorial for more detail.

How to access payu

All ACCESS-NRI models and critical supporting software such as payu are located in /g/data/vk83 on NCI. It is necessary to be a member of the vk83 project to use ACCESS-NRI supported versions of payu. See NCI Documentation for more information about how to join a project.

To access payu version 1.1:

module use /g/data/vk83/modules
module load payu/1.1

payu is installed on gadi using an automated deployment process. This ensures a consistent software environment and the design allows for multiple versions of payu to be maintained, so if there are changes which are incompatible with your experiment you can use an older compatible payu version.

Credits

All payu development was done by @jo-basevi. Deployment by @TommyGatti with assistance from @jo-basevi and @harshula. Thanks!

Aidan · 12 April 2024 04:49

12/04/2024

payu has been updated to v1.1.3

Updates

Bug fixes/improvements for the new branching capabilities (#423, #425, #435)
Added capability to deal with multiple restarts for ACCESS-OM3 experiments (#432)
Prevent UM restart files being archived to output (#415)
Packaging, CI and documentation updates. payu now correctly reports its own version (#403, #407, #408, #410, #411)
Metadata handling updates (#434, #427)

Notes

How to access payu

All ACCESS-NRI models and critical supporting software such as payu are located in /g/data/vk83 on NCI. It is necessary to be a member of the vk83 project to use ACCESS-NRI supported versions of payu. See NCI Documentation for more information about how to join a project.

To access payu version 1.1.3:

module use /g/data/vk83/modules
module load payu/1.1.3

payu is installed on gadi using an automated deployment process. This ensures a consistent software environment and the design allows for multiple versions of payu to be maintained, so if there are changes which are incompatible with your experiment you can use an older compatible payu version.

This new version has been deployed to vk83 and is now the default version. It contains a number of bug fixes and improvements and is the recommended version to use. The previously deployed version (1.1) is deprecated and will be removed when the next version of payu is deployed, or within 3 months, whichever is sooner.

Credits

payu development was primarily by @jo-basevi. Other contributions from @dougiesquire, @MartinDix and @Aidan. Deployment by @TommyGatti, @dougiesquire, @jo-basevi and @harshula. Thanks!

jo-basevi · 27 August 2024 23:12

23/08/2024

Release: 1.1.5

https://github.com/payu-org/payu/releases/tag/1.1.5

Getting Started:

This version of payu can now be accessed on gadi:

module use /g/data/vk83/modules
module load payu/1.1.5

Updates

Get module executables from paths added by loaded environment modules (#439, #482)
Allow generic tracers with cesm_cmeps driver (#433)
Remove unnecessary UM config files (#455)
Replace UM um_env.py configuration file with yaml file (#459)
User scripts and postscript updates to error handling, environment variables and shell features (#452, #467, #445)
Adding date based pruning for ACCESS-ESM1.5 (#465)
Changes to manifest logic (#475)
Enable separate ice_history.nml & cice_in.nml settings (#483)
Replace CICE start date calculations (#484)
Add command line flag to disable metadata generation and related commits (#447)

For a full list of pull requests that includes additional bug-fixes, see the Payu Github 1.1.5 release.

Notes

Python version 3.10

Released payu environments modules from payu/1.1.5 onwards are based on python version 3.10.

UM um_env.py files

UM um_env.py configuration files are no longer supported and should be replaced with um_env.yaml files. Existing um_env.py files can be converted to yaml files, using this script.

Loading model executables using model modules

Payu can now find model executables by searching paths added to $PATH by model environment modules. This simplifies the config.yaml, as only the name of the executable is required and changing model versions is simpler and less error prone. For example,

# Modules for loading model executables
modules:
  use:
      - /g/data/vk83/modules
  load:
      - access-esm1p5/2024.05.0

...
-  name: ocean
   model: mom
   exe: fms_ACCESS-CM.x

Previously, the executable had to be specified as a full path. For executables built by spack, these included a hash in the path, and were long and complicated. For example:

    exe: /g/data/vk83/apps/spack/0.22/restricted/ukmo/release/linux-rocky8-x86_64_v4/intel-19.0.3.199/mom5-git.access-esm1.5_2024.06.20_access-esm1.5-wxxrc3ivrjz76yx565ddkuuiwoqpalko/bin/fms_ACCESS-CM.x

Loaded modules in config.yaml must be unique to ensure the correct model executable is used. This means modules must be specified with a version, and modules of the same name and version cannot be present in multiple module directories.
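
The lookup by name is the same kind of search a shell performs on $PATH. A minimal Python sketch using the standard library shutil.which is shown here. The wrapper name find_executable is hypothetical, and payu's own search may differ in detail.

```python
import shutil


def find_executable(name, search_path=None):
    """Return the full path of an executable found on a PATH-style string.

    search_path is a string of directories separated by os.pathsep;
    None means use the current $PATH.
    """
    return shutil.which(name, path=search_path)


# With no module loaded this will usually print None.
print(find_executable("fms_ACCESS-CM.x"))
```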

Updates to user processing scripts

Payu now exports some current run information to environment variables so they can be accessed in post-processing scripts:

PAYU_CURRENT_RUN - The current run number, e.g. 0 for first run, 1 for the second run
PAYU_ARCHIVE_DIR - Full path to the archive directory - this contains all the outputs and restarts subdirectories
PAYU_CURRENT_OUTPUT_DIR, PAYU_CURRENT_RESTART_DIR - Full path to the current output and restart directories, e.g. for the first run, they would be /path/to/archive/output000 and /path/to/archive/restart000

Userscript and postscript command calls can now also include shell-specific values such as file re-directions, pipes and environment variables (which are expanded). So it’s now possible to run commands such as:

runscript:
    setup: echo "some_data" > input.txt
    archive: some_script.sh $PAYU_CURRENT_OUTPUT_DIR

If a user script exits with an error, payu run execution halts. If this is not desirable, error handling will need to be added to post-processing scripts. Previously only a warning was issued if a user script exited with an error.
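
As an illustration, a post-processing script written in Python could read these variables and handle its own errors so that a failure does not halt the run. This is a sketch only: the function names and the file-counting task are made up for the example, and only the four PAYU_ variable names come from the list above.

```python
import os
import sys


def run_info(environ):
    """Collect the run information payu exports to user scripts."""
    return {
        "run": int(environ["PAYU_CURRENT_RUN"]),
        "archive_dir": environ["PAYU_ARCHIVE_DIR"],
        "output_dir": environ["PAYU_CURRENT_OUTPUT_DIR"],
        "restart_dir": environ["PAYU_CURRENT_RESTART_DIR"],
    }


def main(environ=None):
    """Count files in the current output directory; never fail the payu run."""
    environ = os.environ if environ is None else environ
    try:
        info = run_info(environ)
        n_files = len(os.listdir(info["output_dir"]))
    except (KeyError, ValueError, OSError) as err:
        # Report the problem but return success so payu carries on.
        print(f"warning: post-processing skipped: {err!r}", file=sys.stderr)
        return 0
    print(f"run {info['run']}: {n_files} files in {info['output_dir']}")
    return 0
```

A real script would end with sys.exit(main()). Returning a non-zero code instead would stop the payu run, which may be what you want for critical steps.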

Changes to manifest logic

Payu manifests store information about files in the work directory of an experiment. They are used to track changes to files over an experiment for experiment provenance, and to ensure an experiment run can be reproduced. There are three manifest types: executable, input and restart files (exe.yaml, input.yaml and restart.yaml respectively).

The logic for updating manifests and enforcing reproducibility has been greatly simplified:

Stored manifests from previous runs are used as the source of truth for full (md5) hashes, for all manifest types. Previously this was the case for only input manifests.
Fast change sensitive hashes, by default binhash, are calculated at each payu setup. If a fast hash matches the value in the stored manifest the full hash from the stored manifest is used. This avoids re-calculating slow md5 hashes where possible.
All changes to config.yaml are now correctly detected. For example a different executable path or new input file paths.
The scaninputs option has been removed. It allowed existing file paths to change but did not scan for new inputs. As a result, only paths configured in config.yaml, or found through searching input and restart directories, are added to the work directory.
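
The fast-hash shortcut described above can be sketched in Python as follows. This is an illustration under stated assumptions: the fast hash here is a stand-in built from file size and modification time, not binhash itself, and the function names and the dictionary layout of a manifest entry are hypothetical.

```python
import hashlib
import os


def fast_hash(path):
    """Cheap change-sensitive fingerprint (stand-in for binhash)."""
    info = os.stat(path)
    return f"{info.st_size}:{info.st_mtime_ns}"


def full_hash(path):
    """Slow full-content md5 hash, read in 1 MiB chunks."""
    md5 = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            md5.update(chunk)
    return md5.hexdigest()


def update_entry(path, stored=None):
    """Reuse the stored md5 when the fast hash is unchanged."""
    current_fast = fast_hash(path)
    if stored is not None and stored["fast"] == current_fast:
        # Fast hash matches: trust the stored manifest, skip the md5.
        return {"fast": current_fast, "md5": stored["md5"]}
    return {"fast": current_fast, "md5": full_hash(path)}
```

The design point is that the expensive hash is only recomputed for files whose cheap fingerprint has changed, which matters for large input and restart files.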

Enforcing reproducibility

Setting reproduce to true in config.yaml or via a command-line option will check and make sure files have not changed since the previous run. It is possible to set reproduce for each manifest type separately.

When reproduce for a manifest type is set to true, payu will refuse to run if:

Full hash changes: calculated md5 hashes differ from the full hashes in the stored manifest
New files: files are found in the work directory that were not in the stored manifest
Missing files: files in the stored manifest are not present in the work directory

If a full path to a file has changed or a fast hash has been changed, but there’s a match with the stored full hash (so it is effectively the same file), the manifest will be updated.
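
The three failure conditions can be pictured as a comparison between two mappings from work-directory path to full hash. The Python sketch below is illustrative only: the function name and the plain-dictionary representation of a manifest are assumptions, not payu's data structures.

```python
def reproduce_errors(stored, current):
    """Compare stored and current manifests (path -> md5) and list problems."""
    errors = []
    for path in sorted(set(stored) | set(current)):
        if path not in stored:
            errors.append(f"new file: {path}")
        elif path not in current:
            errors.append(f"missing file: {path}")
        elif stored[path] != current[path]:
            errors.append(f"full hash changed: {path}")
    return errors


stored = {"work/INPUT/grid.nc": "aaa", "work/INPUT/topog.nc": "bbb"}
current = {"work/INPUT/grid.nc": "aaa", "work/INPUT/topog.nc": "ccc"}
print(reproduce_errors(stored, current))
```

An empty list means the run may proceed. Any entry corresponds to one of the conditions under which payu would refuse to run.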

Changes to config.yaml are now correctly picked up. For example a different executable path or new input file paths. Previously specifying reproducibility would only add paths in the manifests to the work directory and raise errors if those files were modified.

For more information on manifests, see the payu documentation for configuring your experiment and manifests content and tracking.

Disabling metadata + UUID generation and commits

To update manifests without auto-updating metadata:

payu setup --metadata-off

The --metadata-off/-m command line flag was added to make it more convenient to update released configurations which do not include a UUID.

Previously the only way to disable generating a new UUID and updated metadata.yaml file and related git commits was via config.yaml:

metadata:
    enable: false

This option is only available with the payu setup and payu sweep commands. Disabling metadata for payu run still needs to be done via config.yaml.

Support

Replies to this topic are disabled.

If you have specific questions about this release follow the guidelines for requesting help from ACCESS-NRI.

If you have questions about payu create a topic in a category that best matches the model you are using, or in the Technical category and tag it with payu. If you require assistance follow the guidelines for requesting help from ACCESS-NRI.

Credits

Development was by @jo-basevi, @spencerwong, @dougiesquire, @anton, @Aidan and @TommyGatti.

jo-basevi · 4 February 2025 22:27

03/02/2025

Release: 1.1.6

https://github.com/payu-org/payu/releases/tag/1.1.6

Getting Started

This version of payu can now be accessed on gadi:

module use /g/data/vk83/modules
module load payu/1.1.6

Updates

Fix default shortpath project bug (#504)
Add minimum payu version config.yaml option (#511)
Support symbolic links in input files (#518)
Add command line option for payu clone to start from commit hash/tag (#515)
Support relative restart paths with payu clone command (#516)
ACCESS-OM3: Add checks of input/output parallel processor layout to cmeps driver used for ACCESS-OM3 (#496)
CABLE: Add staged driver for CABLE configurations (#461)
CICE5: Don’t require cice_in.nml namelist in restarts - this fixes errors when starting from ACCESS-OM2 restarts (#507)
CICE5: Fix restart pointers (#535)
CICE4: Add log file compression (#532 and #542)
CICE4: Check restart file dates (#539)

For a full list of pull requests that includes additional bug fixes, see the Payu Github 1.1.6 release.

Notes

Containerised conda environment

The conda environment for payu/1.1.6 now runs inside a singularity container. The deployment is based on the work done by CLEX CMS on Containerised Conda environments. This dramatically reduces the inode usage of each conda environment as the environment is compressed into a SquashFS file that is added as an overlay to a singularity container at runtime.

If there are problems with running commands once the payu module is loaded, please create an issue on the ACCESS-NRI/model-release-condaenv repository.

Model log compression option for CICE4

CICE4 model log files can take up a significant amount of storage. By default payu now compresses some log files into a tarball during archival.

No changes are necessary to enable this option, but it can be configured in config.yaml under the archive section, e.g.

archive:
  compress_logs: false # Default is true

Currently log file compression is only implemented for the CICE4 model.
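
For a sense of what compressing logs into a tarball involves, here is a small Python sketch using the standard library tarfile module. The function name, the choice of gzip compression and the file names are assumptions for illustration. Which log files payu selects, and how it names the tarball, are not described here.

```python
import os
import tarfile


def compress_logs(log_paths, tar_path):
    """Write the given log files into a gzip-compressed tarball."""
    with tarfile.open(tar_path, "w:gz") as tar:
        for log_path in log_paths:
            # Store each file under its base name, without directories.
            tar.add(log_path, arcname=os.path.basename(log_path))
    return tar_path
```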

Added Beta CABLE driver

Added a driver for running multi-stage CABLE configurations. Supports arbitrary spin-up configurations through the staged_cable driver. See the cable-offline-configs for instructions on using the driver. Driver internals are still being iterated, but user experience should remain consistent.

Payu clone/checkout updates

With the payu clone command, to create a new git branch starting from a specific git tag or commit, use -s/--start-point flag:

payu clone -b <NEW_BRANCH> -s <COMMIT_HASH|TAG> <REPOSITORY_URL> <EXPERIMENT_NAME>

When checking out a new branch starting from a restart directory, a relative path can now be used, e.g.:

payu checkout -b expt -r archive/restart002

Previously the full path to the restart directory was required.

Payu minimum version

As some configurations might require features in later versions of payu to run, there’s a config.yaml option to set a minimum version of payu:

payu_minimum_version: 1.1.6

This will run a check at the start of payu setup to compare this version with the currently running payu version. Note that this check will only run with payu versions >=1.1.6 so it will be more useful in the future once there are more versions of payu available that support this feature.
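
The comparison itself is numeric on each component, so that for example 1.1.10 counts as newer than 1.1.6. A minimal Python sketch of such a check follows. The function names and error message are hypothetical, and it only handles plain dotted versions, not development or pre-release suffixes.

```python
def parse_version(version):
    """Turn '1.1.6' into (1, 1, 6) so components compare numerically."""
    return tuple(int(part) for part in str(version).split("."))


def check_minimum_version(current, minimum):
    """Raise an error if the running version is older than the minimum."""
    if parse_version(current) < parse_version(minimum):
        raise RuntimeError(
            f"payu {current} is older than the required minimum {minimum}"
        )


check_minimum_version("1.1.7", "1.1.6")  # passes silently
```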

Support

Replies to this topic are disabled.

If you have specific questions about this release follow the guidelines for requesting help from ACCESS-NRI.

If you have questions about payu create a topic in a category that best matches the model you are using, or in the Technical category and tag it with payu. If you require assistance follow the guidelines for requesting help from ACCESS-NRI.

Credits

Development was by @jo-basevi, @anton, @spencerwong and @lachlanswhyborn.

jo-basevi · 23 July 2025 05:04

23/07/2025

Release: 1.1.7

https://github.com/payu-org/payu/releases/tag/1.1.7

Getting Started

This version of payu can now be accessed on gadi:

module use /g/data/vk83/modules
module load payu/1.1.7

Updates

UM: Add date-based restart pruning. This supports date-based restart pruning for ESM1.5 and ESM1.6 AMIP configurations as previously it was only supported if there was a MOM subcomponent (#553)
ACCESS-OM3: Remove overriding user settings for restart output frequency in model driver (#556)
ACCESS-OM3: Add date-based restart pruning (#562)
ACCESS-OM3: Commit MOM output documentation into the docs folder of the configuration (#567)
ACCESS-ESM1.6: Add new model driver to run the ACCESS-ESM1.6 configurations that are under development (#563)
ACCESS-ESM1.6: Add cable namelists as optional configuration files (#570)
ACCESS-ESM1.6: Incorporate CICE5 into driver (#585)
ACCESS-ESM1.6: Don’t write start date to CICE5 namelist (#596)
ACCESS-ESM1.6: Refactor parsing restart date-times and test methods to be consistent between ESM1.6 components (#599)
Deprecate MATM model driver (#573)
Add more error messaging when cloning from a tag/commit (#575)
Add exe_prefix configuration option to documentation (#582)
Fix bug that would run archive user scripts after the collation job was submitted (#584)
Allow repeat config option to be used with previous restart defined in config.yaml (#580)
Always map-by node when npernode is set in config.yaml (#594)
Allow starting from a restart in config.yaml with a non-zero counter which is used for naming output directories (#604)
Prepend a launcher script to the Payu generated PBS commands, if it exists (#607)
List loaded environment modules in error logs after all modules are loaded (#608)

For a full list of pull requests that includes additional bug fixes, see the Payu Github 1.1.7 release.

Notes

Automatically committing MOM6 Parameter output files for ACCESS-OM3

MOM output documentation files, e.g. MOM_parameter_doc.* files, are committed back into the docs/ folder of the configuration. This happens when model outputs are archived by payu, and the commit message is payu archive: documentation of MOM6 run-time configuration. Setting runlog to False also disables these commits.

Customising the MPI run command

The exe_prefix option adds a string immediately before the model executable in the MPI run command. This can be useful for configuring profilers or valgrind. For example, given the following in config.yaml:

name: atmosphere
model: um
exe: um_hg3.exe
exe_prefix: aps -r <aps_output_dir>

The generated run command would then look like mpirun <mpi_flags> aps -r <aps_output_dir> <path/to/um_hg3.exe>.
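
The ordering of the pieces can be sketched in Python as below. The function name build_run_command and its arguments are hypothetical, and the real command payu generates contains more than is shown. The point is only where the prefix lands relative to the MPI flags and the executable.

```python
import shlex


def build_run_command(mpi_flags, exe, exe_prefix=None):
    """Assemble an mpirun argument list with an optional executable prefix."""
    command = ["mpirun", *mpi_flags]
    if exe_prefix:
        # Split the prefix like a shell would, respecting quotes.
        command.extend(shlex.split(exe_prefix))
    command.append(exe)
    return command


print(build_run_command(["-np", "4"], "/path/to/um_hg3.exe", "aps -r /tmp/aps_out"))
```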

Support

Replies to this topic are disabled.

If you have specific questions about this release follow the guidelines for requesting help from ACCESS-NRI.

If you have questions about payu create a topic in a category that best matches the model you are using, or in the Technical category and tag it with payu. If you require assistance follow the guidelines for requesting help from ACCESS-NRI.

Credits

Development was by @spencerwong, @anton, @jo-basevi and @manodeep
Reviewers included @dougiesquire, @cbull, @Aidan, @TommyGatti and @tmcadam
Documentation update by @Benoit
