Technical requirements for MOM6 node testing

Aidan · 2 August 2023 05:57

@angus-g has put out a call for MOM6 model configurations to include in a testing suite when there is an Australian MOM6 “node”:

https://forum.access-hive.org.au/t/a-call-for-community-model-configurations-that-use-mom6/1049/5

This is a related discussion about the technical aspects of any such testing.

aekiss · 2 August 2023 05:23

Here are a couple of relevant papers discussing reproducibility and testing. The first one defines four categories of reproducibility, and statistical tests to automate categorising the non-bit-for-bit cases:

Changes, additions and updates to CICE fall into four categories: (I) BFB [bit-for-bit] with no further assessment required; (II) non-BFB but unlikely to be climate changing; (III) non-BFB and climate changing; and (IV) a new model configuration option requiring separate scientific assessment. This section describes the automated methods used to flag the first three categories.

http://dx.doi.org/10.1098/rsta.2017.0344
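As a rough illustration of the first step in that scheme, here is a minimal sketch of a bit-for-bit check. It is not taken from the paper or from CICE: the function names `bits` and `classify` are hypothetical, and it only separates category I (BFB) from everything else. Separating categories II and III would need the statistical tests the paper describes, which are not sketched here.

```python
import struct


def bits(values):
    """Pack floats as raw IEEE 754 doubles so the comparison is on bit patterns."""
    return struct.pack(f"<{len(values)}d", *values)


def classify(baseline, candidate):
    """Return 'BFB' if every value matches bit-for-bit, otherwise 'non-BFB'.

    A 'non-BFB' result only means further assessment is needed; it says
    nothing about whether the change is climate changing.
    """
    if len(baseline) == len(candidate) and bits(baseline) == bits(candidate):
        return "BFB"
    return "non-BFB"
```

Comparing packed bytes rather than using `==` on the floats means that values such as `0.0` and `-0.0`, which compare equal numerically, are still reported as a bit-level difference.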

Aidan · 2 August 2023 05:24

Are payu configurations desirable? Preferred? Required?

How do other sites run their testing? Do they version control their inputs? If so, how?

Is this even the right topic to discuss this? Happy to move to another topic if not.

angus-g · 2 August 2023 05:42

I suppose most of the preexisting configurations around would probably be using payu, which is why I suggested the config.yaml (also gives an idea of resource requirements). But this probably ends up being a question for the technical implementation of the actual running of the tests. There’d probably be a little bit of modification required to give a testing-suitable run anyway. I think a lower barrier to entry by not requiring payu is fine?

I know that GFDL runs their tests through a pipeline on an internal GitLab instance. I wouldn’t be surprised if there are a range of solutions from manual running, to Makefiles handed down from a supreme being (some of the developers use this for their own tests), to modern pipelines. I can try to dig around for a bit more info there.

I think the control inputs are often version controlled. GFDL has MOM6-examples, ESMG has an equivalent with their configurations, etc. There are probably private configurations, but they’d be within version control on the inside of the firewall. As for other (binary) inputs, I’m not sure! That is probably an issue we’ll have to think about too, particularly for full-chain reproducibility and provenance.
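One possible way to track binary inputs without putting them in version control directly is to record a checksum manifest alongside the configuration. The following is only a sketch under that assumption: the helper names (`file_sha256`, `build_manifest`, `changed_inputs`) are hypothetical and do not describe how any existing site or tool handles this.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large binary inputs need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(input_dir):
    """Map each file under input_dir (by relative path) to its SHA-256 digest."""
    root = Path(input_dir)
    return {
        str(p.relative_to(root)): file_sha256(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_inputs(old, new):
    """List paths that were added, removed, or whose contents differ."""
    return sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))
```

The manifest itself is small text, so it could be committed next to the control inputs, and a test run could refuse to start if `changed_inputs` reports a difference from the recorded manifest.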

Sure. But we might want to spin out a discussion on the technical implementation (running tests, validating tests, how to organise configurations, etc.). Ideally it can all be authoritative, so we don’t get a desync between what we’re testing and what’s actually being run. But also the testing is only as valuable as the tests capturing codepaths in the model that people are actually interested in.

Aidan · 2 August 2023 06:00

I was worried I had derailed your topic @angus-g, so I’ve moved the discussion to this topic. Hope you don’t mind being scooped up and moved too, @aekiss, but your post seemed to fit here quite well. I can move it back if you want me to.

Aidan · 2 August 2023 06:13

Sorry, I didn’t notice the config.yaml reference.

Yes, I don’t think there is a problem with a lower barrier to entry, but if payu is the preferred way to go (and I think it is), then non-payu configs will have to be converted to run with payu in any case.

I’m wondering aloud about a few things:

@MartinDix was enquiring a while ago about ways to version inputs for the rose+cylc experiments. It got me thinking about IPFS.
Are there modifications we might usefully make to payu to facilitate running test cases programmatically like this? The ACCESS-OM2 testing writes to config.yaml files. We could make it more seamless than that I reckon.
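To make the second point a bit more concrete, here is a minimal sketch of applying test-specific overrides to a configuration in memory rather than editing config.yaml on disk. It assumes the configuration has already been loaded into a plain dict, and `make_test_config` is a hypothetical helper, not part of payu. The keys shown in the usage note are illustrative only.

```python
import copy


def make_test_config(base_config, overrides):
    """Return a deep copy of base_config with overrides merged in recursively.

    Nested dicts are merged key by key; any other value in overrides
    replaces the corresponding value. base_config is left untouched.
    """
    merged = copy.deepcopy(base_config)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = make_test_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged
```

For example, `make_test_config({"ncpus": 240, "walltime": "2:00:00"}, {"walltime": "0:10:00"})` would give a short-walltime variant for testing while the original configuration stays as it is actually run, which helps avoid a desync between what is tested and what is used.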

True. The Intel compiler can also generate codecov data, which might be worthwhile thinking about to quantify coverage.

angus-g · 2 August 2023 06:17

GFDL actually uses codecov on their fork, e.g. NOAA-GFDL/MOM6 pull request #427, “Fix a bug in the OMP directive for plume_flux” by Hallberg-NOAA. Although that applies to the smaller regression tests that are run through GitHub Actions. You can also see an example of the GitLab pipeline link in that PR.
