Payu: a workflow manager for some ACCESS models

What is this?

A topic that will be used to announce updates to payu at NCI.

What is payu?

payu is the tool used to run a number of ACCESS models on NCI hardware.

How is this topic used?

This topic will be the place where updates to payu at NCI will be announced, with details of what those updates entail.

What should I do to be notified of any updates?

You can watch this topic, and be notified by email of every update.

Where do I ask questions about releases?

Replies to this topic are disabled.

If you have specific questions about a release, follow the guidelines for requesting help from ACCESS-NRI.

If you have questions about payu, create a topic in the category that best matches the model you are using, or in the Technical category, and tag it with payu. If you require assistance, follow the guidelines for requesting help from ACCESS-NRI.

17/08/2023

payu has been updated to v1.0.29 in the conda/analysis3-unstable conda environment maintained by the CLEX CMS team.

Updates:

  • Added support for module use to add module paths in config.yaml Issue #347
  • Removed redundant code intended to support MVAPICH Issue #350
  • Supports models built with spack Issue #341
  • Fixed bug introduced with spack support update (version 1.0.28) and cleaned up internal code logic Issue #354

Notes

Enhanced module support

payu already supported loading user-specified modules when a model is run.

payu now also supports module use paths in the config.yaml file. This is documented, but in essence, if you have an existing config.yaml that has something like this:

modules:
  - netcdf-c-4.9.0
  - parallel-netcdf-1.12.3
  - xerces-c-3.2.3

that only works if you first do

module use /path/to/module/directory

then the module use command is no longer required; it can be incorporated in the config.yaml by changing to:

modules:
   use:
      - /path/to/module/directory
   load:
      - netcdf-c-4.9.0
      - parallel-netcdf-1.12.3
      - xerces-c-3.2.3

This has a couple of benefits:

  1. It makes the configuration more portable. Someone else can clone your configuration and run it without any additional steps.
  2. The paths specified in the use section are parsed for storage points, and /g/data and /scratch directories are automatically added to the -l storage options when payu submits the job to the PBS queue.

Note: the previous syntax is still supported.
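As a rough illustration of the second benefit, the sketch below shows one way storage points could be derived from module use paths. It is not payu's actual implementation: the storage_points helper, the regular expression and the ab12 project code are all hypothetical, and only the gdata/<project> and scratch/<project> form of the PBS storage option on Gadi is assumed.

```python
import re


def storage_points(paths):
    """Collect PBS storage points such as 'gdata/ab12' from file paths.

    Hypothetical helper for illustration only, not part of payu.
    """
    points = set()
    for path in paths:
        # Match /g/data/<project> or /scratch/<project> at the start of the path
        match = re.match(r"^/(g/data|scratch)/([^/]+)", path)
        if match:
            filesystem = match.group(1).replace("/", "")  # 'g/data' -> 'gdata'
            points.add(f"{filesystem}/{match.group(2)}")
    return sorted(points)


# Example paths; 'ab12' is a made-up project code
use_paths = [
    "/g/data/vk83/modules",
    "/scratch/ab12/user/modules",
    "/apps/Modules/modulefiles",  # outside /g/data and /scratch, so ignored
]
storage_flag = "-l storage=" + "+".join(storage_points(use_paths))
print(storage_flag)
```

Paths outside /g/data and /scratch contribute nothing, so only the project directories that the job needs mounted end up in the option.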

Support for spack built executables

For the most part, spack embeds within an executable the location of the library dependencies that were used to compile it.

In these cases the executable doesn’t rely on the module inspection payu does to make sure the correct modules are loaded. For libraries outside the /apps hierarchy, payu no longer attempts this sort of inspection.

Credits

This work was all done by @jo-basevi. Thanks!


18/02/2024

NOTE: ACCESS-NRI is now supporting payu in a dedicated conda environment in the vk83 project. vk83 is the project ACCESS-NRI will be using to release all climate models in the future. See below for more information.

This release adds support for experiment UUIDs and git branching, and with it a major change in the way payu names the work and archive directories in the laboratory. The naming scheme incorporates a portion of the experiment UUID and the git branch name, which prevents namespace clashes and allows experiments with the same name to co-exist. More importantly, it means git branches can be used seamlessly as independent experiments.

As a result the minor version has been incremented (from 1.0 to 1.1): the changes are backward compatible, but from now on by default new experiments will use the new naming scheme.

Updates:

  • Automatic generation of unique Experiment IDs (UUIDv4)
  • Automatic creation and population of metadata.yaml file (compatible with ACCESS-NRI Intake Catalogue)
  • Uniquely named work and archive directories, which allow the same experiment repository to be used for multiple unique experiments in separate branches
  • New payu checkout command: wraps git checkout to facilitate changing between experiments stored in separate branches
  • New payu branch command: lists available branches, and their experiments, that exist in the repository
  • New payu clone command: wrapper for git clone that updates metadata and makes sure the experiment directory is correctly configured
  • New sync support: uses rsync to copy outputs and restarts to a specified location (can be a local long-term storage disk, or a completely remote machine). Particularly useful when model outputs are saved to short-term storage, e.g. /scratch at NCI
  • Date based restart pruning: payu now supports pandas style date/frequency syntax to specify what restarts should be retained

Notes

Experiment UUIDs

payu now automatically generates unique experiment UUIDs. These are in UUIDv4 format: a 128-bit number, typically written as 32 hexadecimal digits in five hyphen-separated groups, e.g.

550e8400-e29b-41d4-a716-446655440000

For all practical purposes they are unique, as the chance of two randomly generated UUIDs colliding is vanishingly small, and so they can be confidently used to identify and track experiments. UUIDs are not human friendly, and are designed to be used by software, but the first 8 digits of the experiment UUID are used to uniquely name the experiment's archive and work directories in the laboratory.
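For readers unfamiliar with UUIDs, here is a minimal sketch using Python's standard uuid module. It only illustrates what a UUIDv4 and its first 8 digits look like; how payu itself generates and stores the UUID is not shown here.

```python
import uuid

# Generate a random (version 4) UUID
experiment_uuid = uuid.uuid4()

# The first 8 hexadecimal digits, the portion used in directory names
short_id = str(experiment_uuid)[:8]

print(experiment_uuid)  # 36 characters: 32 hex digits plus 4 hyphens
print(short_id)
```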

Branches

git branches are now explicitly supported by payu, and form a crucial part of the updated workflow. This means a single control directory (which is a git repository) can contain multiple independent experiments, and it is possible to switch between experiments, though only one experiment can be active and running at any one time.

See the payu tutorial for more information on branches.

Experiment naming

An experiment name is used to identify the experiment inside the work and archive sub-directories inside the laboratory.

The experiment name historically would default to the name of the control directory. This is still supported for experiments with pre-existing archived outputs. To support git branches and ensure uniqueness in shared archives, the new default behaviour is to add the branch name and a short version of the experiment UUID to the name of the control directory when creating experiment names.
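The naming rule above could be sketched as follows. The experiment_name function, the hyphen separator and the order of the parts are assumptions made for illustration; see the payu tutorial for the exact format payu uses.

```python
import uuid


def experiment_name(control_dir, branch, experiment_uuid, legacy=False):
    """Build an experiment name (hypothetical helper, not payu's API).

    legacy=True mimics the historical default: the control directory name.
    """
    if legacy:
        return control_dir
    short_uuid = str(experiment_uuid)[:8]  # first 8 digits of the UUID
    # Assumed layout: <control dir>-<branch>-<short uuid>
    return f"{control_dir}-{branch}-{short_uuid}"


example_uuid = uuid.UUID("550e8400-e29b-41d4-a716-446655440000")
print(experiment_name("my_expt", "perturb", example_uuid))
print(experiment_name("my_expt", "perturb", example_uuid, legacy=True))
```

Because the branch name and UUID prefix are part of the name, two branches of the same control directory map to different work and archive directories.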

See the payu tutorial for more detail.

Syncing

payu now supports syncing of an experiment archive to another filesystem, either local or remote. There are a number of configuration options to customise what is sync’ed and when. See the payu tutorial for more detail.

Restart pruning

payu now supports specifying which restarts to retain using date-based frequencies. This allows restart pruning based on time units. For example setting

restart_freq: 5YS

will keep only the first restart of every fifth year and delete the rest.
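The effect of a five-year frequency can be sketched with the standard library alone. This is an illustration under stated assumptions, not payu's code: the prune_restarts helper is hypothetical, it handles only whole-year frequencies, and it assumes the first restart is always kept and each later retention point is the start of the year N years after the last kept restart.

```python
from datetime import date


def prune_restarts(restart_dates, every_years):
    """Split restart dates into (kept, pruned) lists.

    Hypothetical helper: keeps the first restart, then the first restart
    on or after the start of the year `every_years` after the last kept one.
    """
    kept, pruned = [], []
    next_keep = None
    for restart in sorted(restart_dates):
        if next_keep is None or restart >= next_keep:
            kept.append(restart)
            # Next retention point: 1 January, `every_years` years later
            next_keep = date(restart.year + every_years, 1, 1)
        else:
            pruned.append(restart)
    return kept, pruned


# Twelve yearly restarts, 1900 to 1911
restarts = [date(year, 1, 1) for year in range(1900, 1912)]
kept, pruned = prune_restarts(restarts, every_years=5)
print([d.year for d in kept])    # [1900, 1905, 1910]
print(len(pruned))               # 9
```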

See the payu tutorial for more detail.

How to access payu

All ACCESS-NRI models and critical supporting software such as payu are located in /g/data/vk83 on NCI. It is necessary to be a member of the vk83 project to use ACCESS-NRI supported versions of payu. See NCI Documentation for more information about how to join a project.

To access payu version 1.1:

module use /g/data/vk83/modules
module load payu/1.1

payu is installed on Gadi using an automated deployment process. This ensures a consistent software environment, and the design allows multiple versions of payu to be maintained, so if a change is incompatible with your experiment you can use an older compatible payu version.

Credits

All payu development was done by @jo-basevi. Deployment by @TommyGatti with assistance from @jo-basevi and @harshula. Thanks!

12/04/2024

payu has been updated to v1.1.3

Updates

  • Bug fixes/improvements for the new branching capabilities (#423, #425, #435)
  • Added capability to deal with multiple restarts for ACCESS-OM3 experiments (#432)
  • Prevent UM restart files being archived to output (#415)
  • Packaging, CI and documentation updates. payu now correctly reports its own version (#403, #407, #408, #410, #411)
  • Metadata handling updates (#434, #427)

Notes

How to access payu

All ACCESS-NRI models and critical supporting software such as payu are located in /g/data/vk83 on NCI. It is necessary to be a member of the vk83 project to use ACCESS-NRI supported versions of payu. See NCI Documentation for more information about how to join a project.

To access payu version 1.1.3:

module use /g/data/vk83/modules
module load payu/1.1.3

payu is installed on Gadi using an automated deployment process. This ensures a consistent software environment, and the design allows multiple versions of payu to be maintained, so if a change is incompatible with your experiment you can use an older compatible payu version.

This new version has been deployed to vk83 and is now the default version. It contains a number of bug fixes and improvements and is the recommended version to use. The previously deployed version (1.1) is deprecated and will be removed when the next version of payu is deployed, or within 3 months, whichever is sooner.

Credits

payu development was primarily by @jo-basevi. Other contributions from @dougiesquire, @MartinDix and @Aidan. Deployment by @TommyGatti, @dougiesquire, @jo-basevi and @harshula. Thanks!
