Feedback on ACCESS-NRI Work Plan FY2025-2026

Hi, many people said they didn’t have time to digest the ACCESS-NRI work plan and give feedback at the COSIMA meeting this morning. Below are the parts relevant to COSIMA.

We still have a couple of weeks to give feedback on whether this aligns with COSIMA’s priorities.

I suggest we put together a list of suggested changes and vote on their order below.


I have two comments:

  1. Why is the 25km model a higher priority than the 8km one? I understand that it may make more sense to start at low resolution and move to higher, but maybe the 8km model is of more interest to the community? Speaking without knowing here.
  2. The COSIMA recipes are not ready to be ported. There’s heaps of intake conversion/bug fixing to do. And even if they reach the stage where they can be ported, they are quite organic, in the sense that bugs and fixes are found constantly, as well as new recipes coming up, etc. How would the porting work? Every time there is a change, would it be immediately incorporated into the ESMtools, or would the latter lag behind?

Hi, some additional points from me:

  • instructions/support for common perturbation experiments performed in our community (freshwater fluxes, atmospheric forcing), plus experimenting with the bathymetry, grid, and different vertical coordinates (given that MOM6 offers several options)

  • communicating with and informing the community to help with the transition from OM2 to OM3: what is different? e.g. model configuration (MOM5 vs MOM6) and workflows for running/analysing the model

  • regional modelling: are there any plans to develop nesting capabilities?


I’m with Julia and Adele on that one: it is only logical to add the OM3-8 km alpha release as one of the priorities, since that is what people were most interested in, according to the polls.


The Jupyter instructions for setting up a regional model domain on NUOPC would be very helpful!


I second this and reiterate (for posterity) my comment from this morning: it would be fantastic to have ACCESS-NRI support for the COSIMA Recipes written into the work plan. In particular:

  1. Help COSIMA convert all existing Recipes from cookbook to intake. New students have been really struggling with this.
  2. Help COSIMA make all recipes work with both MOM5 and MOM6 output.
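One lightweight pattern for supporting both output streams is a small alias layer that resolves a canonical variable name against whatever the dataset actually contains. This is only a sketch of the idea, not the Recipes' actual design, and the variable names below are illustrative (real code would operate on `xarray.Dataset` objects rather than plain dicts):

```python
# Illustrative alias table: MOM5-style names first, MOM6/CMOR-style
# names second. The real names in ACCESS-OM2/OM3 output may differ;
# this is a sketch of the pattern, not the Recipes' code.
VARIABLE_ALIASES = {
    "potential_temperature": ["temp", "thetao"],
    "salinity": ["salt", "so"],
}

def get_variable(ds, canonical_name):
    """Return the first alias of `canonical_name` present in `ds`.

    `ds` is anything that maps variable names to arrays (a dict here;
    an xarray.Dataset in a real recipe).
    """
    for alias in VARIABLE_ALIASES[canonical_name]:
        if alias in ds:
            return ds[alias]
    raise KeyError(f"no known alias of {canonical_name!r} in dataset")

# Synthetic stand-ins for MOM5-style and MOM6-style output
mom5_like = {"temp": [10.0, 11.0], "salt": [35.0, 35.1]}
mom6_like = {"thetao": [10.0, 11.0], "so": [35.0, 35.1]}
```

A recipe written against `get_variable(ds, "potential_temperature")` then runs unchanged against either model's output, which keeps the MOM5/MOM6 divergence in one table rather than scattered through every notebook.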

This is our plan for the September COSIMA hackathon, but help before, or in Melbourne, or after would also be amazing.

Thanks @adele-morrison, I’ve highlighted this internally again. You said you could send a specific list of recipes that students/community folks are having a hard time with?

Thanks @cbull! Main ones I’m aware of that have been problematic for students recently due to the lack of intake conversion are:

  • Meridional_Overturning_Circulation.ipynb
  • Surface_Water_Mass_Transformation.ipynb
> The COSIMA recipes are not ready to be ported. There’s heaps of intake conversion/bug fixing to do. And even if they reach the stage where they can be ported, they are quite organic, in the sense that bugs and fixes are found constantly, as well as new recipes coming up, etc. How would the porting work? Every time there is a change, would it be immediately incorporated into the ESMtools, or would the latter lag behind?

We had a meeting with @CharlesTurner and @marc.white yesterday and discussed a few ongoing challenges. The conversion to intake is mostly complete, but getting the pull requests reviewed has been tougher than expected. The intake conversion itself is largely a technical step — and that part is done for most recipes.

However, during the conversion process, people uncovered bugs or realised that some workflows could be improved. That’s slowed things down significantly, and it’s become harder to distinguish between what’s a technical task and what’s a scientific one. Ideally, the porting process should preserve the original outputs when moving from the COSIMA cookbook to intake.
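That preservation requirement can be checked mechanically with a regression test: compute the same diagnostic from data loaded via the old and new pathways and assert the results agree. Here is a minimal sketch, using synthetic data and a toy diagnostic standing in for a real recipe (names and numbers are illustrative only):

```python
def overturning_streamfunction(v, dz, dx):
    """Toy stand-in for a recipe diagnostic: cumulative sum over depth
    of the zonally integrated meridional transport v * dx * dz."""
    psi, total = [], 0.0
    for row in v:  # each row is one depth level
        total += sum(val * dx * dz for val in row)
        psi.append(total)
    return psi

# The same synthetic velocity field as if loaded via the cookbook path
# and via the intake path; a faithful port must yield identical results.
v_cookbook = [[0.1, -0.2], [0.05, 0.0]]
v_intake = [row[:] for row in v_cookbook]

psi_old = overturning_streamfunction(v_cookbook, dz=10.0, dx=1000.0)
psi_new = overturning_streamfunction(v_intake, dz=10.0, dx=1000.0)
assert psi_old == psi_new  # regression check: outputs preserved
```

In practice the "old" and "new" inputs would come from a cookbook session and an intake catalogue respectively, and the comparison would use a floating-point tolerance, but the principle is the same: the conversion step should be invisible in the numbers.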

Another issue we’re facing is the incomplete transition away from hh5. Some recipes are still relying on data located in temporary or legacy folders on hh5, which is far from ideal and adds further friction to the process.

Because of the overlap between conversion and content fixes, it’s been unclear what to prioritise. We’ve made a few pragmatic decisions and have started merging converted recipes, even if some scientific issues are still unresolved. We’re happy to help and make sure the recipes remain functional for both MOM5 and MOM6.

There’s also been some confusion around what it means to bring COSIMA-recipes into ESMValTool. As you know, ESMValTool is primarily aimed at CMIP evaluation. Its strengths lie in semi-standardised outputs, a wide range of observational products, and a powerful pre-processor that streamlines CMIP model evaluation. The broader ESMValTool community — largely from the ESM world — has been calling for more ocean diagnostics, and the plan at ACCESS-NRI is to help fill that gap using the insights from COSIMA-recipes.

We’ve been careful in our work plan to use language that acknowledges the foundational work of the COSIMA community. Our goal isn’t just to port recipes, but to build on that legacy within a more standardised framework. The hope is that this transition leads to more robust and reusable ocean diagnostics.

That said, we’re not yet in a position to directly use raw model outputs (e.g. OM2, OM3) in ESMValTool. We’re actively working on this, but it depends on some upstream progress — particularly around standardising outputs and improving the flexibility of CF/CMOR conversions.


I see here plans to port the MOM6 code to GPUs? How would this work? I know that just porting the code to a GPU architecture doesn’t do the job, or at least doesn’t get you anywhere near the speedup that GPUs can give.

I’m intrigued and asking out of curiosity.

Perhaps related, I read:

> Optimise GPU code to have similar or better performance than CPU version

So this sorta admits that just porting won’t give you much. You seem to be implying here that “similar performance to CPU” is a good goal? GPUs offer 50x to 100x speedups. How do you measure performance, and why put effort into getting something merely similar to the CPU?

Hi @navidcy ,

Thanks for the feedback, and sorry if the description of the project is not very clear. As you can imagine, the technicalities of porting code to GPUs, or of comparing CPU and GPU performance, are not something that can be easily explained in a couple of sentences.

All our CPU-GPU comparisons will be done on a per-dollar basis, which, in my opinion, is the most useful way to do it. If your code runs 100x faster on one GPU than on one CPU core, but you can buy and operate 1000 CPU cores for the price of that GPU, then it’s probably not a good idea to buy a GPU cluster to run your code. Of course, one can’t really buy a single CPU core, and it’s often challenging to run a realistic benchmark on a single core or even a single GPU, so in practice one usually compares performance on one HPC GPU node with the performance on an equivalent (money-wise) number of CPU nodes.
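The arithmetic behind that per-dollar comparison can be made concrete. The helper below is purely illustrative (not ACCESS-NRI's benchmarking code, and the prices are made up); it normalises each run's throughput by hardware cost before comparing:

```python
def cost_normalised_speedup(cpu_time, cpu_node_cost, gpu_time, gpu_node_cost):
    """Speedup per dollar: ratio of GPU to CPU throughput after each is
    normalised by hardware cost. A value > 1 means the GPU wins on this
    metric. Illustrative helper only, with made-up cost units."""
    cpu_throughput_per_dollar = 1.0 / (cpu_time * cpu_node_cost)
    gpu_throughput_per_dollar = 1.0 / (gpu_time * gpu_node_cost)
    return gpu_throughput_per_dollar / cpu_throughput_per_dollar

# A GPU node costing 4x a CPU node must run the same job more than
# 4x faster just to break even on a per-dollar basis:
break_even = cost_normalised_speedup(100.0, 1.0, 25.0, 4.0)  # exactly 1.0
clear_win = cost_normalised_speedup(100.0, 1.0, 2.0, 4.0)    # 12.5x per dollar
```

The same ratio applies at the node-count level: raw wall-clock speedup alone says little until it is weighed against what the hardware costs to buy and operate.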

So that’s what we mean by making MOM6 run as fast on GPUs as it currently runs on CPUs, and that would be the main goal for 2025-26. We already have most of the dynamical core ported, and the performance before doing any refactoring is actually not as bad as I expected, so I’m quite optimistic that we’ll be able to do better than that. In any case, we will certainly keep working on improving the performance and won’t be satisfied with merely matching the CPU.

Hope this clarifies our goals. I can also provide more details about the strategy we are following in collaboration with the MOM6 developers in case you’re interested.
