COSIMA TWG Meeting Minutes 2024

Summary of today’s TWG - I didn’t catch everything so please edit to add/correct as needed.

Date: 2024-04-03

Attendees:

Offline BGC

DS, AK: MOM6 offline tracers to be explored - potentially very useful capability for BGC dev, parameter tuning, spinup (esp. CMIP7) and science - see Offline tracer transport for BGC · Issue #123 · COSIMA/access-om3 · GitHub

ACCESS-OM3 component update

MO: updating model components, following CESM

  • updating spack env
  • CESM has its own fork of FMS but it is not easy to mix, so updated to the latest stable FMS release from GFDL instead - seems to work, but FMS now needs to be compiled with special config options to activate the old API
    • DS: MOM abstracts the version (FMS1 vs 2) - is this FMS3?
    • MO: not sure
  • Should we build CESM with these updated components, now that we’ve diverged with our new configs?
    • AK: only do that if we need to run CESM configs for debugging one of our OM3 configs
    • AS: field dict has changed and would need updating

MOM6 version choice and MOM6 node

DS: are we still happy with tracking CESM? What about when we have our own MOM6 node?

  • AG: GFDL say we need to nominate somebody to sign off on PRs. Then up to us to set up test infrastructure to approve PRs.
  • AM: Would be good to do - should it be by AG or somebody at NRI?
  • AK: wait until we have adopted NRI’s test framework from OM2
  • AH: this is underway
  • AHogg: will our system meet requirements?
  • AG: no formal requirement - we just need to be happy with it
  • AH: using pytest - can be controlled via workflow dispatch, very flexible
  • MO: Need 1 example to run, and a way to run it, then can expand to other examples
  • AHogg: would be good to have AG as one of the approvers, but also to have NRI. Get Tommy’s testing/deployment running, then tell GFDL we’re ready.
  • AS: once test infrastructure is established we can incrementally add tests to suit what we care about
  • MO: CESM is currently using nearly the latest MOM6 so currently no big motivation to use GFDL - but might not always be the case
  • DS: will there ever be things in the NCAR CESM fork that we need but that are not in main?
  • AH: are MOM6 nodes obliged to run MOM6-main?
  • AG: not necessarily - GFDL use a much newer dev branch but there are periodic PRs to main for everyone to approve

Profiling & benchmarking

MO:

  • one region of MOM6 code (surface forcing) was not scaling with more cores - narrowed down to reproducible sum
  • but the config set CICE6 max_blocks to a very large number, which allocates a lot of memory - the huge CICE6 memory footprint then apparently affected MOM6, because the cache could not hold both MOM6 and CICE6 data
  • resolved by setting a more reasonable max_blocks: parallel scaling improved, but still not great
  • profiling paused for now, as MOM6 has a new feature to mask land tiles automatically at runtime to match the number of cores - want to use this for profiling. The MOM6 land proc mask is not relevant to CICE, which uses a very different approach. Newer CICE6 can also determine max_blocks automatically.
  • AH: NetCDF chunk size in output is auto-determined - mppnccombine-fast assumes the same proc land mask for all files
  • MO: but auto-masking sets io cores (io_layout) to 1 - not what we want for production but useful for profiling, and we can read in previously auto-generated proc land mask in production configs
  • AH: have we looked into parallel io for MOM6?
  • MO: not sure, and might not be performant to gather to one core and then redistribute for parallel io

Documentation

  • MO, AK: 2 main options (1 is AK’s preference) - see Documentation · COSIMA/access-om3 · Discussion #120 · GitHub
  • AH: can it be defined in a data structure? say, doc.yaml in each branch - also makes it easier to systematically extract data
  • DS: or no specificity at all - just have one doc with a common section followed by a section for each config which is free-form text extracted from each repo branch
  • AH: maintainability a problem if free-form, and unclear to doc writer what is needed
  • AH: use submodules?
  • MO: not simple to use sphinx or mkdocs with submodules
  • AHogg: try something and see how it works - how it looks to the user and how much work to update continuously
  • AHogg: and will this scale to the other ACCESS models? eg ACCESS-CM3 and ACCESS-ESM3
  • MD: no discussion of documentation for CM/ESM yet - would likely follow OM3’s lead
  • AH: may be fewer configs in climate models since they don’t have choice of forcing?
  • AHogg: but in future there will be multiple resolutions
  • MD: ESM will have a lot more configs than CM
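
Sketch of the structured-documentation idea (doc.yaml per branch) discussed above: a fixed set of required fields tells the doc writer what is needed and lets a script extract and render the data systematically. This is only an illustration under assumptions - the field names, validate_doc and render_section are hypothetical and were not agreed at the meeting, and the file is assumed to have already been parsed into a dict by a YAML library.

```python
# Hypothetical required fields for a per-config doc file, with their types.
REQUIRED_FIELDS = {
    "name": str,
    "resolution": str,
    "forcing": str,
    "description": str,
}


def validate_doc(doc: dict) -> list[str]:
    """Return a list of problems found in a parsed doc file (empty if valid)."""
    problems = []
    for key, expected_type in REQUIRED_FIELDS.items():
        if key not in doc:
            problems.append(f"missing field: {key}")
        elif not isinstance(doc[key], expected_type):
            problems.append(f"wrong type for field: {key}")
    return problems


def render_section(doc: dict) -> str:
    """Render one config's documentation section as plain text."""
    lines = [
        doc["name"],
        f"Resolution: {doc['resolution']}",
        f"Forcing: {doc['forcing']}",
        "",
        doc["description"],
    ]
    return "\n".join(lines)


# Example with placeholder values (not a real configuration).
example = {
    "name": "example-config",
    "resolution": "example-resolution",
    "forcing": "example-forcing",
    "description": "Free-form text describing this configuration.",
}
if not validate_doc(example):
    print(render_section(example))
```

A check like validate_doc could run in each branch's tests, so missing documentation is caught when a config changes rather than when the docs are built.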

Licensing

AH: what licence to release OM2 under? Software licence · Issue #264 · COSIMA/access-om2 · GitHub

  • AH: MOM5 is GPL3
  • MO: so no choice - code must be distributed under terms of GPL3. So need to check that all component licences are compatible - and they are.
  • AH: what about configs?
  • MO: doesn’t matter - these are input, not code - might have IP but not something we need to deal with. Are we distributing code? Licences are about distribution, not the use it is put to.
  • AH: were there custom licences that weren’t being adhered to? CICE? OASIS?
  • AS: does the new CICE licence not matter, since the code didn’t have one back when we forked it? Licence was added around 2017.
  • MO: there’s an issue about this - Not complying with licensing · Issue #67 · COSIMA/cice5 · GitHub
  • AH: so just need a GPL-compatible licence for OM2?
  • MO: might not strictly need a licence for OM2 but it’s easy to add one
  • AH: yes clearer just to have one
  • DS: there is a little code in the configs, eg shell scripts
  • AH: so use Apache? CMIP - data has different licence (eg cc-by) from the code

Next meeting

17 April, usual time