Summary of my notes from the COSIMA TWG meeting today. Please add whatever I’ve missed/misrepresented (this is a wiki post so can be directly edited, or add a comment below to expand or add additional points to topics that were raised)
Date : 2023-03-08
Attendees : Micael Oliveira @micael, Andrew Kiss @aekiss, Aidan Heerdegen @aidanheerdegen, Siobhan O’Farrell @sofarrell, Andy Hogg @AndyHoggANU, Adele Morrison @adele157, Kieran Ricardo @kieranricardo, Paul Leopardi @paulleopardi, Matt Chamberlain @matthew.chamberlain, Martin Dix @MartinDix, Dougie Squire @dougiesquire
@micael described scaling tests of the regional pan-Antarctic simulation
- Scaling tests show how many cores can be used without wasting too many resources. Runs were kept close to real production runs, so that the time spent in initialisation and finalisation is realistic
- The ocean component scales better than the ice component and the atmospheric forcing. Not a major issue, as most of the runtime is spent in the ocean component.
- Might be worth changing the IO layout to improve IO performance.
- Rui Yang performed a more detailed analysis of what is happening in the code during one run (1/10th degree with 962 cores). Main findings:
- High MPI overhead: a lot of waiting due to load imbalance. This seems to come from the open boundaries
- Little vectorisation
- Is the boundary forcing read in every timestep? Files are daily; not sure if they are read in once per day
- Maybe optimise the layout
- Could try testing global 0.1 deg scaling to see what difference the open boundaries make
- Could ask for advice on a MOM6 forum
- Could we make the boundary cells smaller to give them less work to do, or change affinity and assign them more CPUs?
- Are the files compressed and chunked in time? Angus made sure the chunk size in time is 1
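To make the scaling discussion above concrete, here is a minimal sketch of how speedup and parallel efficiency can be computed from wall-time measurements at different core counts. The timings below are invented placeholders for illustration, not the actual ACCESS-OM3 results.

```python
# Minimal scaling-analysis sketch. The wall times below are invented
# placeholders for illustration, NOT the actual ACCESS-OM3 timings.

# core count -> wall time (seconds) for a fixed-length run
timings = {240: 4000.0, 480: 2200.0, 962: 1300.0, 1920: 900.0}

base_cores = min(timings)     # use the smallest run as the baseline
base_time = timings[base_cores]

for cores in sorted(timings):
    speedup = base_time / timings[cores]   # observed speedup vs baseline
    ideal = cores / base_cores             # perfect linear speedup
    efficiency = speedup / ideal           # fraction of ideal achieved
    print(f"{cores:5d} cores: speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

Efficiency dropping well below ideal at a given core count is the usual sign that adding more cores is wasting resources, which is what the scaling tests above are designed to detect.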
- A lot depends on CMIP7 post meeting
- Interest in ACCESS-CM3 and ACCESS-ESM3 based on ACCESS-OM3. Tight timelines for IPCC reporting. Spin-up is time-consuming, and hard to do in parallel while hitting deadlines.
- A two-pronged approach is possible: the above, plus a fallback approach adding ESM components to CM2
- No progress on a decision yet. @AndyHoggANU is working on notes to send out to people to get a decision. Chicken-and-egg problem
- COSIMA can benefit from ACCESS-CM3; getting OM3 up and running would show it is possible, and it would be logical to use it. Timelines are still quite uncertain
- Hitting the IPCC timeline means impact, but also international scrutiny of the model, which is beneficial
- COSIMA could shift priorities to global configurations, perhaps a 1 deg global demonstrator which could be picked up by the coupled model => implies a shared codebase.
- There is also the issue of porting WOMBAT to MOM6.
- CM work can start with the NCAR 1 degree configuration @dougiesquire got working. This can be used as a test-bed to work out the coupler, land-masks, fractional land cells etc. UM coupling is still under development. There is a working AMIP config with a data model, but not proper two-way coupling yet. No technical challenges to it working.
- UM work can be done in parallel with OM3 and bolted on later
- Yes, COSIMA wants to go down this path to facilitate CMIP plan
- Initial conditions are OK. Forcing will use the CDEPS datm and driver models to provide JRA or ERA5 forcing to the model. Not planning to port YATM. This should mean more forcing products can be used, with flexibility in blending forcing products.
- The low-resolution configs @dougiesquire has set up use CESM input files and parameters. Do we want to use these, and the CESM topography? Or harmonise with the previous OM2 configs?
- Start with CESM inputs and gradually change?
- Benefits: OM3 being part of CM3/ESM3 would be a big tick on the COSIMA grant proposal, and an important achievement. If we know it is possible to get it ready in time, that will have a leveraging effect on the community; if we can't manage it, the impetus will fade. Needs thought about the plan, who is available, etc. If delayed too much it will not be an option. The risks in delaying are large; the risks in starting are small.
- Planning 1/4, 1/10 and 1/25 deg global configurations. Not sure if 1 degree is necessary? 1 degree is useful from a technical point of view, for testing WOMBAT etc.
- @paulleopardi: if 1 degree is useful for optimisation it might be worthwhile; it uses fewer cores, and the vectors might end up a similar length.
- For climate/ESM, 1 degree is still the main workhorse. 0.25 deg is used, but is as high a resolution as is currently used
- Look at points of failure, technical challenges and performance bottlenecks. Identify pitfalls and how quickly they can be dealt with.
- Some questions about the maturity of C-grid CICE6? Coupling is via the A-grid, even though the model is on a C-grid. Might be dealt with at the Bluelink meeting.
- Should ACCESS-NRI formally be part of the CICE6 consortium? There are some resourcing requirements (FTEs per year)
- Focus on CMIP7 potentially puts WAVEWATCH III on the back burner. Work for CM3 doesn't include a wave component. We already have a coupled version with WAVEWATCH III, but will be focussing on coupled model parameterisations.
- WAVEWATCH III is important for the atmospheric boundary layer, but don't put it in immediately. Could have it there and turn it on later? The plan was always MOM6/CICE6 first, so the sooner that is ready the sooner we can add in WAVEWATCH III.
- Close to distributing a technically working ACCESS-OM3 with WAVEWATCH III that interested parties could test and give feedback on
- The vision is to be able to switch out components for testing, e.g. a data ocean for testing WAVEWATCH III
- What are the next steps for the MOM6/CICE6 configuration? @dougiesquire has a “thrown together” config, with a low level of confidence in the outputs. Happy to share, but very likely to have issues. Made them work with:
- Initial WW3 coupled MOM6-SIS2 config (GitHub: shuoli-code/MOM6_WW3_SIS2_coupled)
- Configs use a development version of WW3 for coupling to CICE6. Parameter settings seemed to be missing/confusing.
- Put @dougiesquire’s config on GitHub? It will work out of the box with the new build system executable
- Build system is 90% done.
- Merge payu driver.
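Once the payu driver is merged, running one of these configs should look like any other payu experiment. A hypothetical `config.yaml` sketch follows; the model name, executable path and input paths are placeholders, not the real merged-driver values:

```yaml
# Hypothetical payu config for an ACCESS-OM3 experiment.
# Model name, paths and PBS settings are illustrative placeholders.
queue: normal
walltime: 3:00:00
ncpus: 240
jobname: 1deg_om3_test

model: access-om3         # placeholder: name registered by the merged driver
exe: /path/to/access-om3  # placeholder: executable from the new build system
input:
  - /path/to/input/files  # placeholder input directory
```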
ParallelIO in CICE6
- Same as that in CICE5? Does that need to be worked on?
- There is ParallelIO in CICE6. Not sure how performant it is.
- Parallel scaling at high resolution is a later concern; focussing on climate initially.