Ocean Team Workshop (12-15th November, 2024)

Tuesday 12th November: 10.45am - 11.30am. CI Demonstration
People present: @Aidan, @TommyGatti, @helen, @aekiss, @cbull, @anton, @ezhilsabareesh8, @minghangli
Topics covered (briefly): Demo of the source code CI, the configuration CI, and model deployment.
Next actions:
@anton: Reference the model deployment template in the README, or add the reference at the top of spack.yaml.
Does it make sense to compare manifests as a reproducibility check?

To be followed up:
Summary note written by: @minghangli

Anton gave a quick demo of the source code build CI and the configuration CI using Spack.

The CI runs the Spack build, so users do not need their own Spack installation.

1. Source code build CI:

spack.yaml contains the environment, which defines the version of each individual source code component.

Under specs:, the list shows the packages that would be installed if the PR is merged. The digits represent a pre-release version of the form @git.(year.month.version) rather than a git tag.

Under packages:, update the git reference for the main dependencies, e.g., access-om3-nuopc from COSIMA access-om3. The reference can be either a tag or a full git commit hash.

At the bottom, under projections:, the version of the overall deployment needs to be updated too. In this example, both access-om3 and access-om3-nuopc need updating; those numbers are the release numbers. Some more documentation may be referred to here. One example (i.e., a PR) is given below.

 # configuration settings.
 spack:
   specs:
-    - access-om3@git.2024.09.0
+    - access-om3@git.2024.10.0
   packages:
     # Main Dependencies
     access-om3-nuopc:
       require:
-        - '@git.0.3.1'
+        - '@git.1f36419a1f4ba2d0fd5136e4eb43d3b4a52a162c'

     # Other Dependencies
     esmf:
@@ -64,5 +64,5 @@ spack:
               'SPACK_{name}_ROOT': '{prefix}'
         projections:
           all: '{name}/{version}'
-          access-om3: '{name}/2024.09.0'
-          access-om3-nuopc: '{name}/0.3.1'
+          access-om3: '{name}/2024.10.0'
+          access-om3-nuopc: '{name}/1f36419a1f4ba2d0fd5136e4eb43d3b4a52a162c'
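
Because the same version has to be edited in two places (the '@git.<ref>' requirement and the projection), it is easy to update one and forget the other. The sketch below is one possible consistency check, not part of the actual CI. It assumes the convention visible in the diff above, where '@git.<ref>' maps to '{name}/<ref>'; the helper names projection_for and check_projections are made up for illustration, and the dictionaries are typed in by hand rather than read from spack.yaml.

```python
def projection_for(name: str, version_spec: str) -> str:
    """Return the projection expected for a '@git.<ref>' requirement.

    Hypothetical helper: mirrors the convention seen in the example PR,
    where '@git.<ref>' corresponds to the projection '{name}/<ref>'.
    """
    prefix = "@git."
    if not version_spec.startswith(prefix):
        raise ValueError(f"{name}: expected '@git.<ref>', got {version_spec!r}")
    return "{name}/" + version_spec[len(prefix):]


def check_projections(requirements: dict, projections: dict) -> list:
    """List the packages whose projection does not match their requirement."""
    return [
        name
        for name, spec in requirements.items()
        if projections.get(name) != projection_for(name, spec)
    ]


# Values copied from the example PR, except that the second projection
# is deliberately left at the old tag to show what a mismatch looks like.
requirements = {
    "access-om3": "@git.2024.10.0",
    "access-om3-nuopc": "@git.1f36419a1f4ba2d0fd5136e4eb43d3b4a52a162c",
}
projections = {
    "access-om3": "{name}/2024.10.0",
    "access-om3-nuopc": "{name}/0.3.1",  # stale entry
}
print(check_projections(requirements, projections))
```

With the values above, only access-om3-nuopc is reported, since its projection still points at the old 0.3.1 tag.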

While a PR is open, each open or update event creates a pre-release. When the PR is closed, whether merged or not, all of its pre-releases are cleaned up. If it was merged successfully, the CI then attempts a proper release. A closed or merged PR can still be recreated: for example, one can still use the spack.yaml to recreate the environment as it was at a particular point. When the deployment runs, it takes ~1 hour to build all dependencies on Gadi, but it reuses anything that has not been modified. For the time being, there is a bug related to parallel IO…
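
The pre-release lifecycle described above can be summarised as a small decision function. This is only a sketch of the behaviour as noted during the demo: the event names ('opened', 'updated', 'closed') and the action labels are illustrative assumptions, not the identifiers used by the real workflows.

```python
def deployment_actions(event: str, merged: bool = False) -> list:
    """Map a pull-request event to the deployment steps from the demo.

    Event names and action labels are hypothetical placeholders.
    """
    if event in ("opened", "updated"):
        # Every open/update event builds a fresh pre-release.
        return ["build-prerelease"]
    if event == "closed":
        # Pre-releases are removed whether or not the PR was merged.
        actions = ["remove-prereleases"]
        if merged:
            # Only a merged PR goes on to attempt a proper release.
            actions.append("attempt-release")
        return actions
    raise ValueError(f"unhandled event: {event!r}")


print(deployment_actions("updated"))
print(deployment_actions("closed", merged=False))
print(deployment_actions("closed", merged=True))
```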

Chris asked whether this deployment is expensive. Tommy replied: as much as you want.

Each package is built in parallel on a login node with 4 tasks, but the overall process is not parallel. For now, the parallel builds are suppressed because of memory errors specific to the ESMF build.

2. Configuration build CI:

In the demo, a channel config in MOM_input of the 1 deg configuration was updated. One may refer to this PR.

To trigger the bitwise reproducibility check, write !test repro in a new comment on the PR. This triggers a 3-hour run of the model and compares the answers with the ones saved in the repo.

The reproducibility check compares the result against a checksum saved in the configuration repository. If a PR changes the checksum (and is expected to), use !test repro commit to update the saved checksum (the bot will commit the new checksum).
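
The difference between the two comment commands can be sketched as follows. This is a simplified model of the behaviour described above, not the bot's implementation: the function name handle_comment, the way the comment is parsed, and the plain-string checksums are all assumptions made for illustration.

```python
def handle_comment(comment: str, saved: str, computed: str):
    """Return (matches, checksum_to_keep) for a '!test repro' comment.

    Hypothetical sketch: 'saved' stands for the checksum stored in the
    configuration repository, 'computed' for the one from the new run.
    Returns None when the comment is not a repro command.
    """
    words = comment.split()
    if words[:2] != ["!test", "repro"]:
        return None  # not a repro command, nothing to do
    update = words[2:] == ["commit"]
    matches = computed == saved
    # With 'commit', the newly computed checksum replaces the saved one.
    return matches, (computed if update else saved)


print(handle_comment("!test repro", saved="abc", computed="abc"))
print(handle_comment("!test repro", saved="abc", computed="xyz"))
print(handle_comment("!test repro commit", saved="abc", computed="xyz"))
```

In this sketch a plain !test repro never changes the stored checksum; only the commit variant does, which matches the case of a PR that is expected to change answers.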

Post-note:

We can save the MOM_docs output using payu, rather than CI: