ACCESS-CM2 ice: vertical thermo error

Hi,

My ACCESS-CM2 0.25 simulation stopped with an error for wfimelt. Does anyone know how to fix this issue so I can keep running the simulation?

WARNING from PE   452: diag_manager_mod::send_data_3d: A value for ocean_model in field wfimelt (Min:   4.02948E-005, Max:   1.25775E-001) is outside the range [ -1.00000E-001,  1.00000E-001], and not equal to the missing value.
ice: Vertical thermo error

I copied the error file to /scratch/public/wgh581/job.err

Thanks,
Wilma

Hi Wilma

I can’t open this; could you change the permissions, please? It would probably be useful to include the ice log too.

What is the setup of the experiment? Have you made changes to the forcings, model, parameterisations, etc.?

The warning is from the ocean model, and the error is from the ice model, but they are likely related as the warning is about wfimelt i.e. the water flux due to ice melting.
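For anyone scanning a long job.err for these range warnings, here is a minimal Python sketch that pulls the field name, the reported min/max, and the allowed range out of a line like the one quoted above. The regular expression is only an assumption based on that single example line, and `parse_range_warning` is a hypothetical helper, not part of any model tooling.

```python
import re

# Warning line copied from the job.err excerpt quoted in this thread.
WARNING = (
    "WARNING from PE   452: diag_manager_mod::send_data_3d: A value for "
    "ocean_model in field wfimelt (Min:   4.02948E-005, Max:   1.25775E-001) "
    "is outside the range [ -1.00000E-001,  1.00000E-001], "
    "and not equal to the missing value."
)

# Fortran-style floating point number such as 4.02948E-005 (assumed format).
NUM = r"([-+]?\d+\.\d+E[-+]?\d+)"
PATTERN = re.compile(
    r"in field (\w+) \(Min:\s*" + NUM + r", Max:\s*" + NUM + r"\)"
    r" is outside the range \[\s*" + NUM + r",\s*" + NUM + r"\]"
)


def parse_range_warning(line):
    """Return the field name, reported min/max and allowed range, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    field = match.group(1)
    vmin, vmax, lo, hi = (float(g) for g in match.groups()[1:])
    return {
        "field": field,
        "min": vmin,
        "max": vmax,
        "range": (lo, hi),
        "below_range": vmin < lo,  # True if the minimum violates the lower bound
        "above_range": vmax > hi,  # True if the maximum violates the upper bound
    }


info = parse_range_warning(WARNING)
print(info)
```

For the line above, this shows that only the maximum of wfimelt exceeds the allowed range; the minimum is inside it.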

Hi Anton,

Thanks for getting back. I changed the permissions for the log/job/03240701/coupled/NN/job.err file. I also added the log/job/03250101/failcheck/NN/job.err (renamed as failcheck_job.err).

Where can I find the ice log? I can only see the coupled and failcheck folders under my cylc-run directory.

The experiment: I am running a copy of @MartinDix cz789; it’s ACCESS-CM2 with a 0.25deg ocean. I’ve run the simulation for > 300 yr already. The error is not because I made any changes. To me, it looks like wfimelt has reached some unrealistically large value that the model can’t deal with. Not sure if I somehow need to tweak the latest restart file?

Many thanks,
Wilma

Hi Wilma

The debug log is in the ICE_RUNDIR in the cylc work directory, i.e. /scratch/x77/wgh581/cylc-run/u-cz861/work/03250101/coupled/ICE_RUNDIR/ice_diag.d

It doesn’t show anything strange though, which I don’t understand as there should be some diagnostics written (per cice5/source/ice_step_mod.F90 at e8e94d59503110d491f5dc7c35776980c6617de5 · ACCESS-NRI/cice5 · GitHub)

Hi Wilma

I’ve seen this error before. My recollection is that ice: Vertical thermo error is due to instabilities in the ice model and can be fixed by smaller time steps. (This then creates gridpoint noise which causes wfimelt to go out of range, hence the error being seen in the ocean.)

You can test by rerunning the experiment with a higher number of ice time steps per ocean time step (I think this is ntdt, but you should check). I can’t explain why there are no diagnostics written out of the ice here…

Thanks Andy

ndtd would increase the number of dynamics timesteps, but not change the number of thermodynamic cycles. Shortening the dynamic timestep would improve model stability and reduce noisy fields etc., but it seems unusual (although possible, due to feedbacks) that this would show up as an error in the thermodynamics?

Yes, that sounds right, ndtd – the point is that it creates an instability which creates anomalous values of temperature – and so the first errors are seen in thermo. At least, that commonly happens and that’s my theory here. It’s testable, so @wghuneke will be able to tell us if it works!

Note that it’s probably just an anomalous storm or something, so you may be able to wind that parameter back after another few years of running.

Thanks @anton and @AndyHoggANU. I’m currently trying to rerun with an increased coupling time step between ocean and ice. Still running, but hasn’t reached the point when it failed previously.

ndtd is set to 1 at the moment. What would be a reasonable value to change it to?

ndtd = 2
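If you want to script the change rather than edit the file by hand, here is a minimal sketch using only the Python standard library. The namelist excerpt is hypothetical: the group name `setup_nml` and the layout are assumptions, so check your own CICE namelist file first. `set_namelist_int` is an illustrative helper, not an existing tool.

```python
import re

# Hypothetical excerpt of a CICE namelist. The group name is an assumption;
# only the current value ndtd = 1 comes from this thread.
SAMPLE = """&setup_nml
    ndtd = 1
/
"""


def set_namelist_int(text, key, value):
    """Replace the integer value of `key` in namelist text, exactly once."""
    pattern = re.compile(r"\b(" + re.escape(key) + r"\s*=\s*)\d+")
    new_text, count = pattern.subn(lambda m: m.group(1) + str(value), text)
    if count != 1:
        # Refuse to guess if the key is missing or appears more than once.
        raise ValueError(f"expected exactly one '{key}' entry, found {count}")
    return new_text


updated = set_namelist_int(SAMPLE, "ndtd", 2)
print(updated)
```

Raising an error when the key is not found exactly once is deliberate, so that a typo in the parameter name does not silently leave the run unchanged.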

Hey @wghuneke, increasing ndtd to 2 may not solve your issue, because the error hinted at above (“ice: Vertical thermo error”) is a thermodynamic error, and ndtd is the number of mechanical-dynamic iterations. If it doesn’t, you might consider reviewing this article (see FAQ: CICE Thermodynamic convergence errors. | DiscussCESM Forums). When Dave Bailey refers to nitermax he is referring to “ndte”, as it sets the number of thermodynamic iterations. That article, and increasing ndte, has helped me get past “ice: Vertical thermo error” in CICE6. Also, if you can find “cice.runlog…”, which has more detailed information about when CICE falls over, then this might further benefit you. Hope that is helpful.

Thanks Dan - this is interesting. I hadn’t realised the thermodynamics had a separate iterative solver.

ndte is different again. It’s the number of EVP subcycles per dynamic cycle. (e.g. setting ndtd to 2 also doubles the number of EVP subcycles per normal/thermodynamic timestep.)
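To make the relationship between these parameters concrete, here is a small arithmetic sketch. It assumes the usual CICE convention that the dynamics timestep is dt/ndtd and the EVP subcycle length is that dynamics timestep divided by ndte. The values dt = 1800 s and ndte = 120 are purely illustrative, not taken from this configuration.

```python
def cice_timesteps(dt, ndtd, ndte):
    """Derived step lengths, assuming dt_dyn = dt / ndtd and dte = dt_dyn / ndte."""
    dt_dyn = dt / ndtd  # length of one dynamics cycle (s)
    dte = dt_dyn / ndte  # length of one EVP subcycle (s)
    return {
        "dt_dyn": dt_dyn,
        "dte": dte,
        # EVP subcycles performed per thermodynamic timestep
        "evp_subcycles_per_dt": ndtd * ndte,
    }


# Illustrative values only: dt and ndte are assumptions, not this run's settings.
for ndtd in (1, 2):
    print(ndtd, cice_timesteps(dt=1800.0, ndtd=ndtd, ndte=120))
```

Under these assumptions, going from ndtd = 1 to ndtd = 2 halves both the dynamics timestep and the EVP subcycle length, and doubles the number of EVP subcycles per thermodynamic timestep, while the thermodynamic timestep itself is unchanged.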

I think he’s referring to nitermax in the bl99 thermodynamic solver. Here

I just noticed OM2 has nitermax for bl99 set to 500, and CM2 has it set to 100, so that could be the issue.

I am moderately sure that’s the file called ice_diag.d in this case.

Nice! Yes, I see this now that he is referring to the number of iterations in the temperature solver.

Changing ndtd=2 worked. I could change it back for the next resubmission. Seems like Andy was correct and there was a storm that the model had to make it through…

That’s good!

If it doesn’t slow down the overall model, then leaving it at 2 is good - other centres are encouraging us to configure OM3 with more dynamics cycles than we did for OM2/CM2, and setting ndtd=2 is one way to do this.

For consistency with the rest of the model run, you might prefer to change it back to 1.

Interesting. Setting ndtd=2 did slow down the run quite dramatically though (walltime went up from around 2:50 hr to 4:10 hr).
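To put a number on that slowdown, here is a quick calculation using the two walltimes quoted above (read as hours:minutes).

```python
def to_minutes(hhmm):
    """Convert an 'H:MM' walltime string to minutes."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


before = to_minutes("2:50")  # walltime with ndtd = 1
after = to_minutes("4:10")   # walltime with ndtd = 2

slowdown = after / before
extra_percent = (slowdown - 1.0) * 100.0

print(f"{before} min -> {after} min: x{slowdown:.2f} ({extra_percent:.0f}% longer)")
```

That is 170 minutes against 250 minutes, so the run takes roughly 47% longer with ndtd=2.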

Probably go back to 1 then and see how it goes!

Yes, I only ran one 6-month block with 2 to avoid the model crashing.
