COSIMA TWG Meeting Minutes 2025

Minutes for 2025. Please set new posts to ‘Wiki’ for easier editing.

TWG meeting - 15 Jan

Attendees: @helen, @anton, @cbull, @aekiss, @Minghangli, @KieranRicardo, @Martindix, @Angus-g, @PaulLeopardi

Landmask error:
Kieran has been running the CM3 prototype with the new 0.25 grids

Some points in Africa look like they should be land and are currently ocean

@aekiss will follow up, and mask out those cells.

Interesting that it only came up in CM3 and OM3 didn’t give an issue.

Kieran will try smoothing within the mediator for runoff.

Mom symmetric:

For MOM6 regional modelling, the recommendation is to use MOM “symmetric memory”, but for ACCESS-OM3 global modelling we haven’t been using this (NCAR is doing the same). In non-symmetric mode all arrays are the same size, but in symmetric mode the velocity arrays are one larger because they include all four edges for all cells (rather than just the typical north and east edges). The mom-ocean tests check that symmetric and non-symmetric results are the same, and that rotating domains give the same results. The halos for each “rank” are large enough that the missing cells (default 3 cells) for velocity won’t impact the results of any calculation.
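
A minimal sketch of the array-size difference described above, assuming an Arakawa C-grid tile with halos ignored. The function name `cgrid_shapes` and the 4x3 tile are illustrative only; this is not MOM6 code.

```python
def cgrid_shapes(ni, nj, symmetric):
    """Return computational-domain array shapes for one C-grid tile.

    ni, nj: number of cell centres in x and y (halos ignored).
    symmetric: if True, velocity arrays hold faces on all four edges
    of the tile, so each is one point larger in its own direction.
    """
    extra = 1 if symmetric else 0
    return {
        "h": (ni, nj),          # cell-centre (tracer/thickness) points
        "u": (ni + extra, nj),  # zonal velocity on east (+ west) faces
        "v": (ni, nj + extra),  # meridional velocity on north (+ south) faces
    }


nonsym = cgrid_shapes(4, 3, symmetric=False)
sym = cgrid_shapes(4, 3, symmetric=True)
print("non-symmetric:", nonsym)
print("symmetric:    ", sym)
```

In the non-symmetric case every array has the same shape; in the symmetric case only the velocity arrays grow.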

Angus suggested that we could check the mom-ocean tests are testing the same parameters we are using for access-om3. In an overall sense, symmetric and non-symmetric should be bitwise identical.

@anton: will do a bfb test using mom_symmetric vs current for om3, with plan to turn on mom_symmetric.
@helen has raised an issue for the change

OM3-0.25 Project Board - check in on this. No updates/blockers from anyone

Run-off

Anton has some fixes to the OM3 configuration to conserve run-off. See Runoff not conserved · Issue #231 · COSIMA/access-om3 · GitHub

Will use first-order conservative remapping for moving from the JRA grid to the ocean grid. Any new grids need to go to at least 80 degrees south to capture all runoff.
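
As an illustration of what first-order conservative remapping does, here is a one-dimensional sketch in which each destination cell receives the overlap-weighted average of the source cells. The function names and the toy grids are assumptions made for this example; the actual remapping is done on the sphere by the regridding tools, not by this code.

```python
def overlap(a0, a1, b0, b1):
    """Length of the intersection of intervals [a0, a1] and [b0, b1]."""
    return max(0.0, min(a1, b1) - max(a0, b0))


def conservative_remap_1d(src_edges, src_flux, dst_edges):
    """Remap a per-unit-length flux from source cells to destination cells.

    Each destination value is the overlap-weighted mean of the source
    values, so the integral is preserved wherever the destination grid
    covers the source grid.
    """
    dst_flux = []
    for j in range(len(dst_edges) - 1):
        d0, d1 = dst_edges[j], dst_edges[j + 1]
        total = 0.0
        for i in range(len(src_edges) - 1):
            total += src_flux[i] * overlap(src_edges[i], src_edges[i + 1], d0, d1)
        dst_flux.append(total / (d1 - d0))
    return dst_flux


def integral(edges, flux):
    """Integrate a per-unit-length flux over all cells."""
    return sum(f * (edges[k + 1] - edges[k]) for k, f in enumerate(flux))


src_edges = [0.0, 1.0, 2.0, 3.0, 4.0]
src_flux = [1.0, 0.0, 2.0, 0.0]

# Destination grid covering the whole source grid: integral is conserved.
full_edges = [0.0, 2.0, 4.0]
full = conservative_remap_1d(src_edges, src_flux, full_edges)

# Destination grid that stops short of the source grid: flux is lost.
short_edges = [1.0, 4.0]
short = conservative_remap_1d(src_edges, src_flux, short_edges)

print(integral(src_edges, src_flux), integral(full_edges, full), integral(short_edges, short))
```

The second case is the one-dimensional analogue of an ocean grid that does not extend far enough south: any runoff outside the destination grid is simply dropped.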

To account for differing landmasks, Anton suggested just mapping any water in land cells to the nearest ocean cell. We noted the limitations of this: runoff isn’t moved to the mouth of any bay that is land-masked, it is just moved into the nearest ocean cell. There is also no spreading of runoff, so we need to test whether ePBL + vertical river mix is stable and sensible with this configuration. We will go ahead with the simple approach of mapping to the nearest ocean cell, and see how the results look.
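
A minimal sketch of the nearest-ocean-cell idea, under loud assumptions: runoff is held as a total per cell (not per unit area), distance is measured in index space rather than as great-circle distance, and the grid is a small non-periodic 2D list. The function name `move_runoff_to_ocean` is hypothetical and this is not the OM3 implementation.

```python
def move_runoff_to_ocean(runoff, is_ocean):
    """Move runoff found on land cells to the nearest ocean cell.

    runoff:   2D list of per-cell runoff totals.
    is_ocean: 2D list of booleans, True for ocean cells.
    Returns a new 2D list with zero runoff on land and the same global sum.
    """
    nrows, ncols = len(runoff), len(runoff[0])
    ocean_cells = [(j, i) for j in range(nrows) for i in range(ncols) if is_ocean[j][i]]
    if not ocean_cells:
        raise ValueError("mask contains no ocean cells")

    # Start from the runoff that is already on ocean cells.
    out = [[runoff[j][i] if is_ocean[j][i] else 0.0 for i in range(ncols)]
           for j in range(nrows)]

    for j in range(nrows):
        for i in range(ncols):
            if not is_ocean[j][i] and runoff[j][i] != 0.0:
                # Brute-force nearest ocean cell in index space (assumption).
                jj, ii = min(ocean_cells,
                             key=lambda c: (c[0] - j) ** 2 + (c[1] - i) ** 2)
                out[jj][ii] += runoff[j][i]
    return out


runoff = [[0.0, 3.0, 0.0],
          [0.0, 0.0, 1.0],
          [0.0, 0.0, 0.0]]
is_ocean = [[True, False, False],
            [True, False, False],
            [True, True, True]]

moved = move_runoff_to_ocean(runoff, is_ocean)
print(moved)
```

All runoff from a land cell lands in a single ocean cell with no spreading, which is the limitation noted above.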

Pedro at UNSW has been inserting frozen runoff at depth in OM2 iceberg spreading projects

Speaker for the Community Symposium on Sea Ice Modelling

ACCESS has been invited to present. Andrew could speak about the last seven years, or maybe someone from CSIRO has more historical perspective?

@anton to ask how long the talk is and follow up

GitHub Desktop and GitKraken - @cbull

GitHub Desktop

  • Provides a clicky interface to cherry-pick commits, reorder commits, squash, create branches, commits etc

  • Also shows local changes and other git stuff

GitKraken does similar + shows history graphs

MOM6 testing on the access-nri branch

The automated testing errors are hard to follow for folks who are not the authors. In future, we should check that tests pass as they are committed.

The tests failing on the 2025.01 branch can be resolved when @dougiesquire / @ezhilsabareesh8 are back. Further details of outstanding issues here.

ACCESS-OM3 0.4.0 Release

New component versions - updated MOM6/CICE6 and CMEPS versions.

Andrew & Chris to review (details)

MOM6

Chris is to start attending the MOM6 developer meetings, which are held fortnightly. Dougie will resume attending as well.

The ACCESS-NRI model transformation team will start working on a MOM6 GPU implementation, led by @edoyango and @micael

@claireyung has put her work to track the MOM6 panAn/regional ice shelf model development (see issues for current status)

COSIMA training program

Starting soon, program is here:

https://anu365-my.sharepoint.com/:x:/g/personal/u1164007_anu_edu_au/ESNybi3AdkFHrQJGyE3phtIBVhLEcqCw8craAhP4FBL8OA?e=WWsP69

Please let Chris know if there’s a particular session/topic you’d like to present on.

COSIMA talk upcoming (Thursday 13th Feb)

  • @minghangli to present a brief introduction to the expt manager tool & save a more detailed tutorial for the training program (23 May)

  • Need some content for the rest of the session (payu ?)

Andrew is Chair and Minghang on minutes for next TWG (29th Jan)

TWG meeting - 29 Jan

Attendees: @micael, @CharlesTurner, @helen, @anton, @cbull, @aekiss, @Minghangli, @Martindix, @dougiesquire

Agenda can be found in this link COSIMA TWG Announce - #55 by aekiss.

  1. ACCESS-NRI COSIMA training program and upcoming cosima ‘main’ training session - @CharlesTurner

For the Cosima meeting:

  • @minghangli will present a 5 minute overview of the parameter manager tool and reference his training session in May.
  • A 20-30 minute talk on new Payu features will be delivered by @anton, @Aidan, or @jo-basevi.

Training program outline:

  • create users’ own datastore
  • some basics using intake
  • Q&A
  • searchable coordinates for the ocean files (current issues where grid-related searches are not indexed by Intake).

@CharlesTurner has almost finished the CLI for making intake datastores. He’s currently writing the tests. He’s also working on the coordinate search within intake ESM, allowing the catalog to support searchable coordinates.

For the training program, it is good to have breakout rooms to help people retain knowledge better and to allow informal discussion. People may also jump in with other intake-related questions, so it is better to leave some time for that as well.

@CharlesTurner is also working on model autodetection through the CLI when building a catalog. @dougiesquire raised concerns about validation risks, as model names are not strictly prescribed.

  2. Choice of compiler for ACCESS-OM3, moving to ifx - @micael
  • IFX provides better optimisation for Sapphire Rapids compared to classic Intel compilers.
  • For spack builds, when the release team has made available configuration to use IFX on Gadi, it is easy to make a switch by simply changing the name of the compiler used in the spack environment.
  • @MartinDix has tested UM with IFX a while ago and the performance is pretty much the same as with IFORT, and is considering moving CM3 to IFX.
  • All model components need to be compiled with IFX due to Fortran module compatibility constraints.
  3. OM3 0.25deg project board
    3.1 0.25deg ocean mask error
  • @MartinDix found differences on the tripolar edge between the @aekiss and @ezhilsabareesh8 versions. @aekiss suggested waiting for @ezhilsabareesh8 to return for further discussion.

    3.2 diag_table

  • Discussions on diag_table daily / monthly output frequency and potential performance concerns when switching to daily output.

  • Ongoing discussions on managing diag_table and diag_table_source.yaml, where the yaml configuration file along with GitHub - COSIMA/make_diag_table: Python script to generate MOM diag_table is used to generate the diag_table. An issue has been created for further discussion: Diagnostic table files management · Issue #259 · COSIMA/access-om3 · GitHub.

    3.3 KPP and epbl with fixed runoff

  • @minghangli analysed salinity comparisons at the Amazon river mouth over time and across z* coordinates. Results show similar trends with KPP. The rivermixdepth parameter (40m or 20m) has minimal impact. Seasonal cycles are stronger, but salinity remains low (~0.02 PSU).

  • epbl with/without fixed runoff has velocity truncation errors, but KPP does not. Before the runoff fix, truncation errors appeared around Antarctica. Since runoff is constant in time, errors may be circulation-related. @minghangli will provide a spatial plot of the new truncation errors with the runoff fix. One thing to note: before the runoff fix, truncations happened for 10 years in a row and seemed persistent.

  • @MartinDix mentioned that Kieran has been testing the horizontal spreading without epbl.

Next TWG is assigned: Chair: Dougie. Minutes: Andrew K. Andrew will give summary of this TWG at next COSIMA meeting.


TWG meeting 12 Feb

Present: @aekiss, @anton, @MartinDix, @dougiesquire, @cbull, @minghangli, @manodeep

ACCESS-OM3-025 project board

ePBL

ML: Using latest Reichl et al. (2024) ePBL parameters but we get truncation errors. To maintain runtime performance comparable to OM2, we’ve set a tracer timestep of 3 hours, which is longer than GFDL’s 2-hour timestep. He’ll test the alpha release (0.4.0) since it includes fixes with more numerically stable schemes that might help address this issue.
CB: ePBL preferred over KPP in COSIMA meeting
CB: what are other groups using?
ML: GFDL are using ePBL with same params
ML: currently tuning params but still not working

other items (project board)

Many items will be resolved with config updates

Repro issue

also on zulip

ML: OM3 build & config passed CI repro test but aren’t actually reproducing - CI test now fixed
but still have no repro between 0.3.0 and 0.4.0
also no repro in MOM6 standalone using MOM6 driver (not NUOPC, no sea ice)
default MOM6 parameters changed but have been fixed.
DS: only one of the long list of parameters is different, and this change won’t affect our model
ML: still no repro even with parameter issue fixed
DS: how did this get through the MOM6 repro testing?
DS: unclear whether we expect MOM6 repro given the version change - would have to check through all the
repro tests don’t even run on 0.4.0 config due to truncation
AK: would be good to have a way to tell if we expect repro between any 2 commits
MS: is run deterministic?
ML: yes
DS: do we want to dig into this to find out why it all changed
CB: would want to know whether we expect reproducibility. If repro not expected, do longer run to see if results are plausible. But gap between the 2 MOM6 version, so not a good use of our time to dig into all those commits.
DS: but how to know if repro is expected without digging into commits
CB: did Marshall mention this a month ago?
DS: might have been discussed at a MOM6 dev meeting we didn’t attend - ask Angus?
AS: ask Marshall
also go back to 0.3.0, add one PR and check whether it’s something in our process that is breaking things
CB: ok will ask Marshall
DS: also check ML’s standalone runs to see if they reach the same conclusions.
This problem will crop up with other components
AS: we could have CI run 1deg for 20yr on every update

MS: for every CI, make it fail to check it works

DS: there are many bugfixes in MOM6 that are turned off by default for repro but which we should turn on, breaking repro

CB: could any of the patches on patches involved in 0.3.0 → 0.4.0 be a problem?
DS: possibly, but unlikely, and not in

MOM6 dev meeting on Tuesday

CB: notes on zulip

bug discovered (doesn’t affect us)

update to MARBL, will change cap, may affect WOMBAT - asked us to check

Marshall gave presentation on GPU work - see link to his notes. Impressively fast progress, eg pressure solve running on GPU. Targeting momentum solver first.
Our software team also contributing
Ed has long todo list
Some things in specified GPU coding style are unsupported by hardware; hard to get vendor support; NVIDIA won’t look at code due to license use (LGPL) - looking to move to more commercially-friendly Apache, which does not oblige disclosure of code changes (see table here). Asking ~90 contributors to approve license change.

DS: Generic tracer: code moving out of mom into ???, may affect us.

DS: Next set of changes will alter defaults - need to keep an eye on this for repro.
AK: good reason to store MOM_parameter_doc.* in repo
DS: release team suggest doing via payu
MS: do via pre-commit hook to run model for 1 timestep?
AS: then add repro CI test to fail on diff between these files
MS: set up cron to regularly check?
DS: but want to know immediately that defaults have changed
MS: belt and braces - cron job to pick it up in case repro test was forgotten in commit
AS: on release there are more stringent tests than commit, so that would pick it up too
AK: would be nice to also be able to do this with CICE
DS: are we talking about just committing the docs (easy, we can do in payu) or CI repro test against branch to merge in (harder, involves release team)
CB: would be happy to have a go at this but AS is probably better positioned, AS will write something and CB can help review/have a chat.
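
One way the "fail on diff between these files" check could look, as a rough sketch only. It assumes each parameter line in a MOM_parameter_doc file has the form `NAME = value ! comment`, with continuation and header lines starting with `!`; it ignores parameter blocks, so repeated names would overwrite each other. The function names and the sample values are hypothetical, not taken from any real configuration.

```python
def parse_parameter_doc(text):
    """Collect NAME -> value pairs from parameter-doc style text.

    Anything after the first '!' on a line is treated as a comment.
    """
    params = {}
    for line in text.splitlines():
        setting = line.split("!", 1)[0].strip()
        if "=" in setting:
            name, value = setting.split("=", 1)
            params[name.strip()] = value.strip()
    return params


def diff_parameters(old_text, new_text):
    """Return {name: (old_value, new_value)} for added, removed or changed parameters."""
    old = parse_parameter_doc(old_text)
    new = parse_parameter_doc(new_text)
    changes = {}
    for name in sorted(set(old) | set(new)):
        if old.get(name) != new.get(name):
            changes[name] = (old.get(name), new.get(name))
    return changes


# Illustrative file contents only.
OLD_DOC = """\
! === module MOM ===
DT = 1800.0                     !   [s]
                                ! The (baroclinic) dynamics time step.
DT_THERM = 7200.0               !   [s] default = 1800.0
"""

NEW_DOC = """\
! === module MOM ===
DT = 1800.0                     !   [s]
                                ! The (baroclinic) dynamics time step.
DT_THERM = 10800.0              !   [s] default = 1800.0
"""

changes = diff_parameters(OLD_DOC, NEW_DOC)
for name, (old_value, new_value) in changes.items():
    print(f"{name}: {old_value} -> {new_value}")
```

A CI job or pre-commit hook could run a comparison like this against the committed files and fail when `changes` is non-empty.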

COSIMA twg update tomorrow

CB: Dougie to give TWG update to Thurs COSIMA meeting
ML: increased tracer timestep may be part of the problem with ePBL; reducing from 3 to 2hr (matching GFDL) fixes truncation errors, but performance is worse. Truncation occurs at particular places around Antarctica.
DS: truncations in 0.3.0 and 0.4.0?
ML: haven’t tested in 0.4
DS: one of the new MOM6 changes (off by default) helps improve model stability
ML: should I discuss this at COSIMA meeting tomorrow, or spend some time working on it first?
CB, AK: might be better to discuss ePBL in offline meeting with the few people who can give good input
ML: Wilton found ePBL performed similarly to KPP but without vertical resolution dependence

Timestep at 1 deg

DS: mom dynamic timestep is quite short (30 min) and also differs from coupling, unlike 0.25
DS: probably inherited from CESM

Release team meeting update (Tommy, Aidan, Lachlan, Spencer, Jo, Dougie and Chris)

DS: CI to automatically create diffs (like the ones in README) between PR branches and all config branches. DS is coordinating. Some interest from Cable/land teams.

Bluelink invite

Invitation to present at Blue Link in late March on OM3 development and/or high-resolution development.

To be discussed offline

NCRIS/board update

CB: due Friday. @minghangli please write an update on the OM2 new control experiments.

ACCESS-NRI COSIMA 2025 training program

CB: starting next week on Fri 21st Feb (discussion link, draft program). What’s the status AS?
AS: may need to use xp65 env. Will have some slides. Possibly a notebook

Next time…

@dougiesquire on for the agenda for next Wednesday’s OSIT and for the next TWG: Chair: Anton. Minutes: Dougie.

Date: February 26, 2025
Present: @AndyHoggANU, @aekiss, @MartinDix, @sofarrell, @ezhilsabareesh8, @minghangli, @anton, @helen
Apologies: @dougiesquire
Chair: @anton
Minutes: @cbull

Key points

non nuopc MOM6-standalone

Some existing executables are about a year old and broken (built with SIS and FMS but being used as standalone, with other components turned off at run time). Helen is currently suggesting people try executables whose build and origin are unknown (possibly one from Angus, one from Micael, but Andy H/Andrew K unsure).

Question: what is the best way for the community to proceed? Anton: can we ask the community to just use NUOPC? Helen: yes, long term, but the current notebook workflow does not. Next actions: ask Angus to make a new one that works and has known provenance, point towards NUOPC instructions, ask NCAR if instructions can be more generalised (consider abstracting access-nri/ncar parts).

025 minimum depths (Kieran)

Was discussed at the CM meeting this afternoon, related issue

Bathymetry at the moment is based on the top 4 layers, which gives a minimum depth of 5m. The runtime option is just a configuration change, but it would be applied everywhere and would create a disconnect between the input bathymetry and what the model does.

Kieran asked Andrew about the hand edits discussed here. Andrew: thought that many were unlikely to be needed here. Andy H: integrated salinity restoring over the Persian Gulf/Red Sea in the 0.25 might tell you that the sill depth isn’t deep enough (concept: can be inferred from ocean model run and then avoid one having to run the CM). Andrew: there are differences in the two setups that could make it more complicated (e.g. rainfall and run-off fluxes being in different locations).

Minghang re-showed the “salinity restoring flux” plots to consider Andy’s suggestion. Minghang: note OM2 had a restoring cap (not present in om3). Andy/Andrew/Siobhan: next action, run the CM model further to see how the restoring we are seeing in 025 OM plays out. (i.e. current restoring plots are inconclusive and it’s likely faster to just run CM)

OM3-025 project board & efficiency blockers (Chris)
Re: efficiency blockers for OM3-025 release.
Anton: too many side projects (e.g. ice shelves, training program, wombat etc)
Andy: would like to see the sensitivity tests run out and shared with the community asap (sensed this was imminent a few months ago?)
Chris: thinks there has been an over emphasis at this stage on bitwise repro’.

Side follow ups:
Minghang: stuck with epbl but is running parallel sensitivity tests. Andrew: are these runs in a shared place with scripts that the community could access? Minghang: not currently but will look into it.
Andrew: bitwise repro is important as you say in some contexts and there’s a time for that. However, restart repro’ is still broken? Chris: still being looked into.
Ezhil: will other parameter tests require changes in the bathymetry? Andy: it’s iterative (helps to get community input early!).

CICE+WW3 regular international meetings ? (Anton + )
Ezhil: met with Luke, Alessandro, Alberto, Noah, Siobhan at Uni Melb last week and presented the status of the current model. There is community interest in using/developing the model! Chris suggested we could have a semi-regular meeting with the core group and a wider (less regular) meeting with the wider community. Anton: do you imagine targeting the development community or science-focused outcomes? Ezhil’s interest: focus first on the communities that can benefit our WW3 development.

MOM consortium update (Chris)
See here for long summary. Chris gave brief update on:

  • change MOM6 to use the Apache license? (Such a decision must be unanimous.)
  • Changes to the NUOPC cap to permit the use of coupled generic tracers and coordination with other generic tracer interface changes
  • Sharing the ACCESS Cmake build system for MOM6

OM3 evaluation paper in terms of OM3 025 release? (Chris)
Chris: what’s the plan? How do people cite OM3 025 once it’s released? Do we wait till high-res om3 before writing eval paper?
Andy: historical perspective re: om2 was that the grant funded the high-res model so all configs were developed at the same time. In this instance, we should consider writing up the configs as they become available. E.g. om3 025 could have its own evaluation paper.
Andrew: last time tech doc’ was used as a starting point.

“How to do a perturbation experiment in ACCESS om2” and Aidan’s guinea pig (Chris) link
Chris: would think this session will have more community value if co-presented by the community. Andrew: what is being perturbed? Perturbation to the forcings are common (run-off, atmosphere) but involved. Chris: imagining a brief overview to some of the more common kinds of perturbations (perhaps om2 architecture session would come beforehand?). Andrew and Chris can follow up offline. Anton: consider asking community folks e.g. @hrsdawson?

International Travel in 2025
Chris: exec want to know if anyone has any planned? Please let Chris know.
No responses.

Shifting some dates on the training program re: Anzac day. (Chris)
No complaints.

@Andrew. Bluelink meeting, do Helen and Chris start booking travel?
Andrew: yes can book. Would perhaps be good to coordinate. Chris: Yes, let’s do that on Zulip.

Next week is an OSIT, Chris is rostered. (Dougie will take minutes in two TWG’s time.)

Date: March 12, 2025

Attendees:

@anton
@dougiesquire
@minghangli
@ezhilsabareesh8
@MartinDix
@aekiss
@AndyHoggANU
@helen
@cbull

Meeting Schedule:
Most attendees will be at Bluelink meeting next week.

Apologies - we forgot to send an invite this week.

OM3-025 Project Board Priorities:
Discussion on the priorities for the OM3-025 release.
Addition of a Docs issue to the project board.
We are close to doing an alpha release and there was some discussion about what (if any) changes are needed before this point. Anton is advocating github should reflect what we have been testing with - initial conditions / grids / bathymetry / run-off fix. We will also allow extra truncations with this.
We talked about changing MOM parameters - e.g. dt_therm, KPP etc. - but decided to only change dt_therm (MOM tracer timestep) in the dev-025deg_jra55do_ryf config and leave KPP as is. Previous similar configs have been run for 30 years with current KPP and dt_therm.

  • Users - Andy would advocate a group of testers for the release. Chris suggested maybe there will be an OM3 group at how to run a model training sessions next week. Minghang has a list of sensitivity testing that maybe users could engage with.
  • Need to stress to users what alpha release means re support and not-for-science use
  • @Cbull will make docs for access-hive for OM3

OneAPI Compiler:
Anton is close to having a build ready with the OneAPI compiler.
Discussion on the need for performance checks and the potential impact on stability. But ultimately we were happy to proceed to minimum sanity testing only as we don’t have a current baseline set of answers / climate to compare to and at this point in development changing answers is ok.

Mom Symmetric Issue:
Turning on Mom Symmetric (in 2025.01.0) breaks restart reproducibility in the main configs and breaks determinism in BGC configs. We have reverted this change for now, but it is needed for regional work. We will provide this as a pre-release in the shorter term, with a plan to investigate and fix the issue, with Dougie Squire leading the investigation.
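
For readers unfamiliar with the term, a restart reproducibility check compares one continuous run against the same run split in two with a restart in between; the final states should be bitwise identical. The sketch below shows the idea on a toy deterministic model. The `step` function, the JSON "restart file" and the checksum are all assumptions for illustration and have nothing to do with the actual MOM6 or CI tests.

```python
import hashlib
import json


def step(state):
    """Advance the toy model one step (deterministic, periodic neighbour coupling)."""
    return [0.5 * x + 0.25 * state[i - 1] + 1.0 for i, x in enumerate(state)]


def run(state, nsteps):
    """Run the toy model for nsteps and return the final state."""
    for _ in range(nsteps):
        state = step(state)
    return state


def checksum(state):
    """Hash the exact textual representation of the state."""
    return hashlib.sha256(json.dumps(state).encode()).hexdigest()


initial = [0.0, 1.0, 2.0]

# One continuous run of 4 steps.
straight = run(initial, 4)

# Two runs of 2 steps with a "restart file" written and read in between.
restart_file = json.dumps(run(initial, 2))
restarted = run(json.loads(restart_file), 2)

reproduces = checksum(straight) == checksum(restarted)
print("restart reproducible:", reproduces)
```

If anything needed to continue the run is missing from, or altered in, the restart state, the two checksums differ.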

Upstream Model Component Updates:
We reviewed changes to upstream model components since 2025.01.0 release, and these appear fairly small. We have recently built the model without trouble with latest CMEPS/CDEPS/share. We can just update this now, but probably wouldn’t do a release.

Blue Link Meeting Preparation:
Andrew Kiss and Helen will present at the Blue Link meeting.
Discussion on the content of the presentation, including grid and topography updates, dynamic tracer time step, and potential collaboration with Blue Link.

Action Items:

@dougiesquire: Investigate the Mom Symmetric issue.
@cbull: access-hive style docs for ACCESS-OM3 (025 ryf)


Meeting Summary & Minutes

Date: May 26, 2025
Attendees: @cbull, @dougiesquire, @anton, @aekiss, @AndyHoggAnu, @pearseb, @helen, @MartinDix, @spencerwong

ACCESS OM3 Config Docs
Chris demonstrated the config documentation system. The docs are versioned, written in Markdown for easy transferability, and will support citability with an automatically updated DOI upon each config release. Chris also proposed generating a full PDF version. The implementation involves a repository with a structured skeleton where the actual documentation is stored in Markdown files.

Current ACCESS OM3 config docs are located here: access-om3-configs/docs at davide/docs-setup · ACCESS-NRI/access-om3-configs · GitHub

Related issues:

New Website Structuring Approach
Chris proposed a new website structure where docs live alongside the code, with the main branch holding general info and config details in separate branches, to minimise redundancy and support community edits.

Repo Search & Configs Discussion
Andrew explained the use of repo search links in the docs. Chris and Andy discussed dynamic linking, MathJax for equations, and a standardized config subdomain.

Chris emphasized proper work attribution and linking commits to work.

Versioning & Updates
Chris and Andy highlighted the need for flexible versioning to handle updates across releases while ensuring accuracy for older configurations.

Restart Reproducibility
Dougie identified that the Mom symmetric restart reproducibility issue comes from changes ACCESS-NRI have made to the MOM NUOPC cap. Hopefully an easy fix.

Andy inquired about patches, and Dougie confirmed they are in a PR candidate branch, queued for upstreaming.

Incoming MOM6 PR from NCAR

  • There’s an open PR to MOM6 main that impacts OM3
  • Dougie has suggested changes to that PR. One accepted already, one hopefully accepted soon.

OM3 configs consolidation

One API Release:
Anton plans to skip the old COSIMA build system and transition to the new one, as test results were identical.

Action Items

  • Chris: Create a release checklist template for the ocean team on the forum.
  • Dougie: Fix the restart reproducibility issue in MOM symmetric.

Note: I might have missed something, please feel free to edit.