I am attempting to change the land-sea mask in ACCESS-ESM1.5. I have started from the standard pre-industrial configuration and am attempting to close Drake Passage.
What I have done so far:
I edited the ocean depth input in grid_spec.nc, so that the Drake Passage is closed. I then ran the model from a cold start, i.e. zero velocity, without changing anything in the atmosphere, and it completed 1 year successfully.
I interpolated my new land-sea mask onto the N96 atmosphere grid, and created new variables of the binary land sea mask (LSM) and fractional land-sea mask (landfrac).
I edited the pre-industrial restart files to have "reasonable" values in Drake Passage consistent with my new LSM variable. I did this systematically to all of the atmosphere restart variables. To do this, I made use of scripts in this folder: /g/data/access/projects/access/apps/pythonlib/umfile_utils/
(These enable editing of UM binary files...)
I edited all the input ancillary files I could find in the pre-industrial restart folder so that they are also consistent with my new LSM.
I edited the "masks.nc" and "kmt.nc" variables in the coupler input directory, in line with the new LSM.
I edited the namelist for the atmosphere to have an increased number of land grid cells, consistent with my LSM.
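A quick sanity check on the steps above is to confirm that the binary mask, the fractional mask and the land-point count written into the namelist all agree. The sketch below is illustrative only: the function name is hypothetical, the arrays are toy data rather than the real N96 fields, and it assumes the convention that a cell counts as a land point wherever landfrac > 0.

```python
import numpy as np

def check_mask_consistency(lsm, landfrac):
    """Return (n_land, n_mismatch) for a binary mask and a land-fraction field."""
    lsm = np.asarray(lsm).astype(bool)
    landfrac = np.asarray(landfrac, dtype=float)
    if lsm.shape != landfrac.shape:
        raise ValueError("lsm and landfrac must be on the same grid")
    if landfrac.min() < 0.0 or landfrac.max() > 1.0:
        raise ValueError("landfrac must lie in [0, 1]")
    # Assumed convention: a cell is a land point wherever landfrac > 0.
    mismatch = lsm != (landfrac > 0.0)
    return int(lsm.sum()), int(mismatch.sum())

# Toy 3x4 grid standing in for the real atmosphere fields.
landfrac = np.array([[0.0, 0.0, 0.3, 1.0],
                     [0.0, 0.5, 1.0, 1.0],
                     [0.0, 0.0, 0.0, 0.2]])
lsm = landfrac > 0.0

n_land, n_mismatch = check_mask_consistency(lsm, landfrac)
print("land points:", n_land, "mismatching cells:", n_mismatch)
```

The first number is the value the namelist land-point count would need to match; a non-zero second number points at cells where the two masks disagree.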
I tried running a new simulation with this new configuration, and it appeared to crash after time step 3.
I am seeking assistance on how to debug the error outputs. Specifically:
Is it possible to run UM7.3 in "debug mode"?
Is there a standard way to check consistency of restart variables with the land-sea mask?
Does anyone with experience of changing the land-sea mask know how to spot typical errors / ways of crashing the model?
Try outputting a field like surface temperature every timestep (you may need to set up the file streams so that it uses a new file each step) and look for outlier points along the coasts.
Make sure any new land points have reasonable values for orography, vegetation etc.
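Once a per-timestep surface temperature field has been read into an array, the search for coastal outliers can be automated. This is a minimal sketch on toy data: the function name, the definition of "coastal" as cells with fractional land, and the temperature bounds (in kelvin) are all assumptions chosen for illustration, not model settings.

```python
import numpy as np

def coastal_outliers(tsurf, landfrac, t_min=180.0, t_max=340.0):
    """Return (row, col) indices of coastal cells with implausible temperatures (K)."""
    tsurf = np.asarray(tsurf, dtype=float)
    landfrac = np.asarray(landfrac, dtype=float)
    # Assumption: "coastal" means a cell that is partly land and partly sea.
    coastal = (landfrac > 0.0) & (landfrac < 1.0)
    # Flag NaN/inf as well as values outside the illustrative bounds.
    bad = ~np.isfinite(tsurf) | (tsurf < t_min) | (tsurf > t_max)
    return [(int(r), int(c)) for r, c in np.argwhere(coastal & bad)]

# Toy 2x3 fields: the middle column is coastal.
landfrac = np.array([[0.0, 0.4, 1.0],
                     [0.0, 0.6, 1.0]])
tsurf = np.array([[285.0, 120.0, 270.0],
                  [284.0, 279.0, np.nan]])

suspects = coastal_outliers(tsurf, landfrac)
print(suspects)
```

Running this on successive timesteps shows where a bad value first appears, which is usually more informative than the state at the crash itself.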
Dear David,
I have run a number of land-sea change experiments in the ACCESS-N48 coupled model over the past few days. When I introduce land, the simulation usually crashes, either quickly (within a few time steps?) or sometimes after a few months. I noticed that the crashed runs have very cold Tsurf (< -100 °C) at locations that are partial land points, but most likely have no ocean in the ocean mask.
So I manually changed the UM land-sea mask/fraction to make these points 100% land. These runs then all go for 5 years without crashing. So I think there is something wrong with the mask interpolations.
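The manual fix described above can be sketched as follows: promote to 100% land any fractional land cell that has no ocean underneath it. The function name and the `ocean_frac` input (the fraction of each atmosphere cell covered by wet ocean-model cells) are hypothetical; how that field is derived from the ocean mask is not shown here.

```python
import numpy as np

def promote_dry_coastal_cells(landfrac, ocean_frac, tol=1e-6):
    """Set partial-land cells with no ocean coverage to landfrac = 1."""
    landfrac = np.array(landfrac, dtype=float)  # work on a copy
    ocean_frac = np.asarray(ocean_frac, dtype=float)
    dry_partial = (landfrac > 0.0) & (landfrac < 1.0) & (ocean_frac <= tol)
    landfrac[dry_partial] = 1.0
    return landfrac, int(dry_partial.sum())

# Toy 1-D strip of cells: the third cell is 70% land but sees no ocean.
landfrac = np.array([0.0, 0.3, 0.7, 1.0])
ocean_frac = np.array([1.0, 0.7, 0.0, 0.0])

fixed, n_promoted = promote_dry_coastal_cells(landfrac, ocean_frac)
print(fixed, n_promoted)
```

The binary land-sea mask and any land-point count would need to be updated to match the promoted cells.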
I also did runs in which I removed land. These usually do not crash.
Maybe we can discuss on Thursday in our weekly ACCESS-N48 coupled model development meeting.
I can certainly try making all of the new land points have 100% land fraction to see if that also helps. I will also try Scott's suggestion to output surface temperature every timestep.
Another question: in the coupler input directory, there are these remapping files, labelled:
rmp_cice_to_um1t_CONSERV_FRACNNEI.nc
rmp_cice_to_um1u_CONSERV_FRACNNEI.nc
rmp_cice_to_um1v_CONSERV_FRACNNEI.nc
rmp_um1t_to_cice_CONSERV_DESTAREA.nc
rmp_um1t_to_cice_CONSERV_FRACNNEI.nc
rmp_um1u_to_cice_CONSERV_FRACNNEI.nc
rmp_um1v_to_cice_CONSERV_FRACNNEI.nc
They'll be regenerated if they don't exist when you start the run. You can also manually generate them with ESMF_regrid_weight_gen, but variable names need to be changed to match what OASIS expects.
As Scott said, the rmp* files get automatically generated by the OASIS coupler if they don't exist.
Aidan
(Aidan Heerdegen, ACCESS-NRI Release Team Lead)
Note that COSIMA manually generate the regridding files with ESMF_regrid_weight_gen because doing so on the fly with OASIS is not feasible for the very high resolution ocean model; from memory, it even struggles a bit at 0.25 degree.
For low resolution (1 degree and above) regridding on the fly works well, and only has to be done once. @Scott do the generated files have to be moved to an input directory, or are they generated in the correct location?
Thanks Aidan. Continuing my theme of really basic questions:
ACCESS-ESM1.5 has ice restart files called iced.01010101 and ice.restart_file. These appear to be in a binary format. Do you or COSIMA people know how to read/write these? @aekiss maybe?
Some updates:
I checked what the model was generating for the remapping files. Some of the mask variables that were auto-generated by the coupler looked wrong. So I went through and recreated these rmp*nc files using ESMF_regrid_weight_gen as suggested.
I double checked that I could successfully run the pre-industrial control using new rmp* files that I created.
For the Drake Passage closed configuration, my new rmp* files made no difference. The model still crashes at the same place, for unknown reasons.
With help from @atteggiani I also adjusted the STASHC file to try to get outputs from every time step. There is a "time domain" specified in STASHC called "TALLTS", which stands for output at every time step. I tried to get one of the diagnostic files (which contains surface temperature) to use that time domain instead of monthly output. While this succeeded in generating a large output file, I couldn't get anything useful out of that file. xconv fails to show anything, and conv2nc.tcl gets a segmentation fault.
My only lead at this point is that when I inspect the processor log atm.fort6.pe0, it shows that atmosphere Tracer 01 and Tracer 02 are going to NaN values after the first couple of time steps. I also get the following warning:
WARNING q_POS : UNABLE TO RESET VALUES CONSERVATIVELY
(This q_POS thing doesn't occur in the pre-industrial run.) So I'd like to try to figure out what this means.
Otherwise, I would like to lower the timestep. Does anybody know how to do that? I know it's currently 1800 seconds from the log file, but I have no idea where this value is specified.
For those interested: I managed to halve the timestep to 900 seconds. This is implemented through the namelists file:
STEPS_PER_PERIODim=48,0,0,0,
and
A_ENERGYSTEPS=48,
I changed the 48 to 96 in both cases.
These settings also occur in the CONTCNTL file, so I changed them there too.
(This did not solve my problem, but I got the timestep down to 900 seconds.)
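The arithmetic behind the change is simple: 48 steps giving 1800 seconds implies the period is one day (48 × 1800 = 86400 s), so 96 steps gives 900 seconds. A small helper makes it easy to find the step count for another target timestep; the function name is hypothetical and the one-day period is inferred from the numbers above, not read from the model documentation.

```python
SECONDS_PER_DAY = 86400  # inferred period: 48 steps x 1800 s

def timestep_seconds(steps_per_period, period_seconds=SECONDS_PER_DAY):
    """Timestep length in seconds for a given number of steps per period."""
    if steps_per_period <= 0:
        raise ValueError("steps_per_period must be positive")
    if period_seconds % steps_per_period != 0:
        raise ValueError("period is not a whole multiple of the timestep")
    return period_seconds // steps_per_period

for steps in (48, 96):
    print(steps, "steps per period ->", timestep_seconds(steps), "s")
```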
The problem was: I was interpolating across the Drake Passage by averaging across multiple grid boxes of the land-based atmosphere grid cells. I did it systematically to all restart variables, so did not have time to check every single one. I did a comparison of all max/min values and some funny values popped up.
So I re-did it by replicating a single grid box from South America across the Drake Passage, i.e. replication of an existing cell, no averaging. This worked. I'm not even sure which variable was breaking it. But, now the model runs for a month, with no crash and no apparent crazy values.
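The two ideas above, filling new land by replicating one donor cell and then comparing min/max values before and after the edit, can be sketched on toy arrays. The function names, the field name `soil_temp` and the 0.0 placeholder at former ocean cells are all hypothetical; reading and writing the actual UM restart fields is not shown.

```python
import numpy as np

def fill_by_replication(field, new_cells, source):
    """Copy the value at index `source` into every cell selected by `new_cells`."""
    out = np.array(field, dtype=float)  # work on a copy
    out[np.asarray(new_cells, dtype=bool)] = out[source]
    return out

def fields_outside_range(before, after, land_before, land_after):
    """Names of fields whose land values after editing leave the original land range."""
    flagged = []
    for name in before:
        old = np.asarray(before[name], dtype=float)[land_before]
        new = np.asarray(after[name], dtype=float)[land_after]
        ok = np.all(np.isfinite(new)) and new.min() >= old.min() and new.max() <= old.max()
        if not ok:
            flagged.append(name)
    return flagged

# Toy 2x3 grid: column 1 is the newly created land, cell (0, 0) is the donor.
land_before = np.array([[True, False, True], [True, False, True]])
new_cells = ~land_before
land_after = np.ones_like(land_before)
before = {"soil_temp": np.array([[280.0, 0.0, 270.0], [282.0, 0.0, 272.0]])}

replicated = {"soil_temp": fill_by_replication(before["soil_temp"], new_cells, (0, 0))}
unfilled = {"soil_temp": before["soil_temp"].copy()}  # new land left at the placeholder

flag_replicated = fields_outside_range(before, replicated, land_before, land_after)
flag_unfilled = fields_outside_range(before, unfilled, land_before, land_after)
print(flag_replicated, flag_unfilled)
```

Replication cannot introduce a value that was not already present on land, so a range check like this mainly catches cells that were missed or filled from the wrong source.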
I also realised that the sea ice restart variable (within the UM restart file) had to be treated differently, because sea ice exists for fractional land cells (0 < landfrac < 1), but not for complete land cells (landfrac==1).
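A minimal sketch of that special case, on toy data: sea ice values are kept wherever the cell is at least partly sea and reset only where the cell is entirely land. The function name is hypothetical, and the `fill_value` of 0.0 is an assumption; the value the restart file actually expects at full-land cells should be checked against an unmodified restart.

```python
import numpy as np

def mask_sea_ice_on_full_land(ice, landfrac, fill_value=0.0):
    """Reset the sea ice field only where the cell is 100% land."""
    ice = np.array(ice, dtype=float)  # work on a copy
    full_land = np.asarray(landfrac, dtype=float) >= 1.0
    ice[full_land] = fill_value       # assumed fill value, see note above
    return ice

# Toy strip: open ocean, fractional land, complete land.
landfrac = np.array([0.0, 0.4, 1.0])
ice = np.array([0.9, 0.5, 0.5])

ice_fixed = mask_sea_ice_on_full_land(ice, landfrac)
print(ice_fixed)
```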
The ACCESS-N48 case that Dietmar is working on crashes in a different way. We don't know why yet.
It's also running using the rose-cylc workflow, whereas my case, ACCESS-ESM1.5, uses the payu workflow. We will look into it some more.