Payu: Model exited with error code 1; aborting. Stuck at a leap year

Hi all,

This is a follow-up to a problem mentioned in one of the technical topics (Payu potentially miscalculating a leap year).

I am having trouble running one particular year in ACCESS-ESM1.5. I have completed 299 years of a simulation, and it just won’t run the 300th year. I tried starting again from a different restart (the 298th): it runs and writes the output of the 299th year, then runs another year (the 300th), but crashes before writing the output of the 300th year.

Did anyone encounter anything similar or have any suggestions?

Thank you.

Below I will post some of the suggestions from @holger, who has been working diligently on this.

From @holger:

Looking into the configuration, the model crashes in the year 400.
The crash comes from OASIS, with the error message:
oasis_advance_run at 31536000 31536000 ERROR: t_surf
oasis_advance_run ERROR model time beyond namcouple maxtime 31536000 31536000
oasis_advance_run abort by model : 2 proc : 0
This suggests that at least one of the submodels wants to keep going longer than the OASIS coupler.
But the configuration hasn’t been changed; it’s still a single year per run.
The model year when it crashes is the year 400, which (according to the Gregorian rules) is a leap year. The maxtime of 31536000 seconds is exactly 365 days, so if OASIS doesn’t realise that this is a leap year, it might want to quit a single day early.
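
To make the arithmetic concrete, here is a minimal Python sketch of the Gregorian leap-year rule and the resulting run length in seconds. The function names are purely illustrative and are not taken from OASIS, payu or any of the submodels.

```python
SECONDS_PER_DAY = 86400


def is_gregorian_leap(year):
    """Gregorian rule: every 4th year, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def seconds_in_year(year):
    """Length of a one-year run in seconds under the Gregorian rule."""
    days = 366 if is_gregorian_leap(year) else 365
    return days * SECONDS_PER_DAY


print(seconds_in_year(300))  # 31536000 (365 days, matches the maxtime in the error)
print(seconds_in_year(400))  # 31622400 (366 days)
```

A component that believes the year is 300 would therefore stop 86400 seconds before one that believes it is 400.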

It looks like OASIS is waiting for the data for December 31st from ice/ocean, but those two submodels might have terminated because they also didn’t realise that it was a leap year.

I notice that work/cice/input_ice.nml seems to think that it’s the year 300, not 400.

So I modified the hack script to change the date for the ice model, but this will only work for the year 400, so the script has to be disabled after this year.
Even if this works (I still don’t know what’s up with the ocean), the issue is likely to reappear in the year 500, when the ice submodel believes it is in the year 400 (a leap year) and the atmosphere submodel is in the year 500 (a non-leap year).
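
As a rough check of which years are affected, the sketch below lists the model years whose leap status differs from that of the year 100 earlier. It assumes the ice model's year lags by exactly 100 (based on input_ice.nml showing 300 instead of 400) and that every component applies the Gregorian rule to the year it believes it is in; both are assumptions, not something confirmed from the model code.

```python
def is_gregorian_leap(year):
    """Gregorian rule: every 4th year, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# Assumption: the ice model's calendar year is 100 behind the other submodels.
ASSUMED_OFFSET = 100


def mismatched_years(start, end, offset=ASSUMED_OFFSET):
    """Years in [start, end] where year and year - offset disagree on leap status."""
    return [
        year
        for year in range(start, end + 1)
        if is_gregorian_leap(year) != is_gregorian_leap(year - offset)
    ]


print(mismatched_years(301, 1000))  # [400, 500, 800, 900]
```

Under these assumptions only century years cause trouble, which is consistent with the crash at year 400 and the expected problem at year 500.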

I’m still working on it. It seems as if MOM knows the year is 400, at least from the file time_stamp.out.

Made any progress with this @holger? Is there anything we can do to assist?

Hello,

I tried one suggestion from @tiloz, which was successful: my model has now crossed the 400th year and is at the 404th year right now.

Below is the suggestion that worked (thanks a lot, @tiloz):

I set up a new run as a copy of the existing one, but taking the last restart from the current run (the 399th model year) and restarting the simulation year counter (so, counting again from year 101). So now I have new output and restart folders as output/restart001,002…, but the model year continues as 400, 401, etc.

I will keep checking my outputs.

In case it is useful in the future, you can tell payu what number to start the run from.

In your case, adding a restart option to your config.yaml file that points to your restart directory, and then using

payu run -i 400

would start the run counter at 400 to match up to your previous run.

Thanks. I will keep this in mind.

I had another thought: ACCESS-OM2 had issues with MOM5 reporting a gregorian calendar while actually implementing proleptic_gregorian.

This causes offsets relative to software that implements gregorian correctly (which is actually a mix of the Gregorian and Julian calendars).

The fix was to use a base time after 1582, e.g. 1900-01-01 works.
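
To illustrate why the base time matters, here is a small Python sketch comparing "days since base" in a proleptic Gregorian calendar with a mixed Julian/Gregorian calendar that switches on 1582-10-15. It uses the standard integer Julian Day Number formulas; the function names are illustrative only and none of this is MOM5 code.

```python
def jdn_gregorian(year, month, day):
    """Julian Day Number of a date in the proleptic Gregorian calendar."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - y // 100 + y // 400 - 32045


def jdn_julian(year, month, day):
    """Julian Day Number of a date in the Julian calendar."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


def jdn_mixed(year, month, day):
    """Mixed calendar: Julian before 1582-10-15, Gregorian from then on."""
    if (year, month, day) >= (1582, 10, 15):
        return jdn_gregorian(year, month, day)
    return jdn_julian(year, month, day)


def days_since(base, date, to_jdn):
    """Number of days from base to date under the given calendar function."""
    return to_jdn(*date) - to_jdn(*base)


date = (2000, 1, 1)
for base in [(1, 1, 1), (1900, 1, 1)]:
    proleptic = days_since(base, date, jdn_gregorian)
    mixed = days_since(base, date, jdn_mixed)
    print(base, mixed - proleptic)
# (1, 1, 1) 2
# (1900, 1, 1) 0
```

With a base time of 0001-01-01 the two calendars disagree by two days, while with a base of 1900-01-01 both the base and the date fall after the 1582 switch and the offsets agree.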
