Tool for modifying ACCESS-ESM restart files

lachlanswhyborn · 30 January 2025 03:05

Following discussion during the CABLE4 meeting and Rachel’s work around matching ACCESS-ESM vegetation maps to the LUH3 dataset, there is a need for a tool for making modifications to ACCESS-ESM restart files. I want to scope out exactly what people are looking for in this tool.

Purpose

Modify ACCESS-ESM restart files (for the land surface) to accommodate modified vegetation maps. This is non-trivial as the existing restart file doesn’t necessarily contain sensible values in unused vegetation tiles.

Should we also accommodate other changes? Say improved soil property maps are produced: should the tool also be able to make these changes? Are other ancillary updates likely to require derived changes to other parameters?

Requirements

Take a user-specified vegetation map and a reference ACCESS-ESM restart file, and create a new restart file in which the state properties within all tiles (or just the required vegetation tiles?) have physically sensible values.

It should be possible for the user to specify the remapping method for each property, BUT there should be good defaults to fall back on if no method is specified. For example, area averaging would be applied by default to properties relating to the short time scale physics, like soil moisture or temperature, while some nearest neighbour process would be applied to properties relating to the carbon cycle.
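As a rough sketch of the "user choice with good defaults" idea, something like the following could resolve a method for each property. The property names, method names and the `resolve_methods` helper are all hypothetical, made up for illustration; they are not taken from any existing tool.

```python
# Hypothetical defaults: short time scale physics -> area averaging,
# carbon cycle -> nearest neighbour. Names are illustrative only.
DEFAULT_METHODS = {
    "soil_moisture": "area_average",
    "soil_temperature": "area_average",
    "carbon_pool": "nearest_neighbour",
}


def resolve_methods(properties, overrides=None):
    """Return {property: method}, preferring user overrides over defaults."""
    overrides = overrides or {}
    resolved = {}
    for name in properties:
        if name in overrides:
            resolved[name] = overrides[name]
        elif name in DEFAULT_METHODS:
            resolved[name] = DEFAULT_METHODS[name]
        else:
            # No silent guess for a property the tool knows nothing about.
            raise ValueError(f"no remapping method known for {name!r}")
    return resolved
```

A call such as `resolve_methods(["soil_moisture", "carbon_pool"], {"carbon_pool": "area_average"})` would keep the default for soil moisture and apply the user's choice for the carbon pool. Raising an error for an unknown property, rather than guessing, is one possible design choice.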

What does nearest neighbour actually mean in this scenario? Say there is a grid cell which was originally 100% Evergreen Broadleaf, but the new vegetation map specifies 50% Evergreen Broadleaf and 50% C3 grasses. How do we fill the phenology variables for the C3 grasses?

Comments/Feedback

I plan to build a bit of a prototype to demonstrate over the next few weeks. If people involved in this effort (@RachelLaw, @tiloz, @inh599, @tammasloughran, @clairecarouge) have clearer ideas of what this tool must achieve, or comments about details that I might be missing, please let me know.

Scott · 30 January 2025 04:09

There may be useful ideas in ANTS, which is the tool used by the Met Office to pre-process external datasets for use by the model.
There is a contrib repository with a variety of scripts using the tool.

RachelLaw · 30 January 2025 05:00

It’s probably useful to distinguish between those fields in the restart file that are initial conditions and those that are brought in from ancillary (forcing) data. In the case of the ancillary information, it may be better that we write new ancillary files (this would presumably include the vegetation distribution) and then use the ‘reconfiguration’ step of the UM to bring these into the restart file.

For the new ESM1.6 vegetation distribution I’ve been working on, my ‘nearest neighbour’ for the C/N/P pools had a number of options. Except for veg type 10 (c4 crops) which was new, it only used tiles of the same pft as the one we needed to fill. It did a local fill if it could - I defined this as an average of any tile of that type within +/-2 grid-cells, ignoring area-weighting and whether you hit a boundary of the dataset (i.e. ignored 0=360 longitude). If no tiles existed within this region, I averaged tiles within all longitudes within +/-10 degrees (+/- 8 grid-cells). This covered every case I needed but I wrote in a global average if the local or regional cases didn’t work. If useful, my fortran code is /g/data/p66/rml599/luh2/luh3/restart-fields/modifyCpools.F90.
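The local, regional, global fallback described above could be sketched in Python roughly as below. This is not a translation of modifyCpools.F90; the function name, arguments and array layout (one 2D lat/lon array per PFT) are assumptions for illustration. Means are unweighted and the local box is clipped at the array edges without wrapping across 0=360 longitude, as in the description.

```python
import numpy as np


def fill_tile(pool, valid, j, i, local=2, regional=8):
    """Return (value, level) to fill grid cell (j, i) for one PFT.

    pool  : 2D array (lat, lon) of a pool for a single PFT
    valid : 2D bool array, True where the reference restart has that PFT
    """
    nlat, nlon = pool.shape

    # 1. Local: unweighted mean of valid tiles within +/- `local` grid cells,
    #    clipped at the array edges (no longitude wrap).
    j0, j1 = max(j - local, 0), min(j + local + 1, nlat)
    i0, i1 = max(i - local, 0), min(i + local + 1, nlon)
    box = valid[j0:j1, i0:i1]
    if box.any():
        return pool[j0:j1, i0:i1][box].mean(), "local"

    # 2. Regional: all longitudes within +/- `regional` rows of latitude.
    j0, j1 = max(j - regional, 0), min(j + regional + 1, nlat)
    band = valid[j0:j1, :]
    if band.any():
        return pool[j0:j1, :][band].mean(), "regional"

    # 3. Global: mean over every valid tile of this PFT.
    if valid.any():
        return pool[valid].mean(), "global"

    raise ValueError("no tiles of this PFT exist in the reference data")
```

Returning the level alongside the value makes it easy to count how many tiles were solved at each level, which is the kind of check used to pick the averaging regions.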

I guess the main challenge with a tool is whether there is a generic enough solution or whether there are always/often going to be special cases that need to be accommodated. I picked my averaging regions based on a check of how many tiles would be solved at various levels of local or regional averaging.

spencerwong · 31 January 2025 00:04

I think a tool like this would be really valuable for researchers in the paleo community, who often create vegetation distributions for different time periods.

It might be outside of the scope of the initial version for CMIP7, but I think a popular use case would be working with modified land sea masks. For example, if you have a vegetation map defined on a modified land sea mask, and a reference restart file on the original mask, a method to fill the cable state variables with sensible values on the new land points based on the new vegetation map and original restart data could be really useful.

Tagging @dkhutch who’s done this previously and might be able to clarify!

lachlanswhyborn · 31 January 2025 06:34

I think it’s possible to write such a tool that is generic enough to allow tuning of the search → average process, that works for both scenarios (remapping vegetation on either the same or different land grids). The cascading search → average process would work in both instances, which is:

Attempt to retrieve from the same grid tile
Attempt to retrieve from a specified radius around the grid tile
Attempt to retrieve within a specified range of latitudes around the grid tile
Retrieve globally
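A minimal sketch of that cascade, working on 1D lists of source points so that it does not care whether the source and target land masks match. Everything here (the function name, the default distances, the use of a simple lat/lon box as the "radius") is an assumption for illustration, not a settled design.

```python
import numpy as np


def cascade_fill(target_lat, target_lon, src_lat, src_lon, src_val,
                 radius_deg=2.5, band_deg=5.0):
    """Return (value, stage) for one target point.

    src_lat, src_lon, src_val are 1D arrays over source tiles of the
    required vegetation type (coordinates in degrees).
    """
    src_lat = np.asarray(src_lat, dtype=float)
    src_lon = np.asarray(src_lon, dtype=float)
    src_val = np.asarray(src_val, dtype=float)

    dlat = np.abs(src_lat - target_lat)
    dlon = np.abs(src_lon - target_lon)
    dlon = np.minimum(dlon, 360.0 - dlon)  # shortest way around in longitude

    # Stages are tried in order; the first one with any candidates wins.
    stages = [
        ("same_cell", np.isclose(dlat, 0.0) & np.isclose(dlon, 0.0)),
        ("radius", (dlat <= radius_deg) & (dlon <= radius_deg)),  # box, not a true circle
        ("lat_band", dlat <= band_deg),  # agnostic to longitude
        ("global", np.ones(src_val.shape, dtype=bool)),
    ]
    for stage, mask in stages:
        if mask.any():
            return float(src_val[mask].mean()), stage

    raise ValueError("no source tiles of this vegetation type")
```

Tuning the search then amounts to changing `radius_deg` and `band_deg`, or editing the list of stages.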

It could easily accommodate defined mappings to new vegetation types as well, as was done with the C4 crops (veg type 10) in @RachelLaw’s script. This would simplify the process of adding new vegetation types in the future.

clairecarouge · 3 February 2025 22:11

Just to note that @RachelLaw took an average for a range of longitudes not latitudes. This is because vegetation tends to vary more with latitude than longitude. Think, tundra up in the north, then evergreen needleleaf south of it, then deciduous broadleaf trees for example.

lachlanswhyborn · 3 February 2025 22:51

Yea this is what I mean- average over everything within a given range of latitudes, i.e. within ±5 degrees of the original point, and be agnostic to the longitude.

cbengel · 3 February 2025 23:35

Just flagging here that we do modification of restart files in the regional nesting suite. Not recalculation of fields but replacement of data.

It might be simpler to keep the ESM and RNS tools completely separate but the “modify restart files” capability is relevant to both suites at the same time (most likely for different purposes).

lachlanswhyborn · 3 February 2025 23:53

Is this done via ANTS for the regional nesting suite? How powerful/easy to use is the existing framework you have? Something non-trivial which is relevant to this use case is creating a restart that contains a new vegetation type; could it handle that?

cbengel · 4 February 2025 00:54

Modifying restart files was done via python/mule.

It was to modify fields existing in the start dump not add new ones - sorry.

It is quite simple.

In our use-case ANTS would be used to prepare a field with the corrected field written to an ancillary file. The corrected field would then be added to the start-dump/restart file in a two-step process. i.e. create proper ancillary then modify the start dump using ancillary (read in via python/mule). We don’t currently need to do that though because the data exists in a suitable form to use it in our replacement scripts directly.

lachlanswhyborn · 4 February 2025 01:07

I don’t think we’d be modifying the restart file in place- more using it as a reference to create a new restart file. Thinking about it, I might be wrong in saying that it’s a “new” vegetation type (and therefore new field); I think it was that a particular vegetation type was unused in the ESM1.5 runs, so therefore had no immediately obvious reference data to initialise from.

paulleopardi · 4 February 2025 01:27

From what I have seen so far, I would be in favour of using Python scripts in ANTS Contrib to generate new ancillary files and using these to create the restart file, as @RachelLaw and @Scott have described. I think that this may be cleaner and better documented in the long run, as well as preventing a proliferation of possibly redundant tools. The exact situation here may be different and may require an extra tool, but I think that it is at least worth investigating.

cbengel · 4 February 2025 01:41

There are too many details to go into here but there are parallels for what you are doing in the ESM in the RNS. In the RNS the heavy lifting is done via the UM. Please reach out if you want to know what they do.

lachlanswhyborn · 4 February 2025 02:40

We’ll discuss this further at our CABLE4 meeting on Thursday to work out which direction we want to go with this. Might get back to you on this.

paulleopardi · 4 February 2025 04:00

The UM already has a tool for modifying restart (dump) files. It is the reconfiguration program. See the current reconfiguration documentation for UM 13.7. Even at UM 7.3 there is a corresponding qxreconf executable.

lachlanswhyborn · 4 February 2025 04:27

It’s entirely unclear how that tool works, and we definitely don’t want to be adding more legacy Fortran code to our workflow, there’s enough of that as is I think. I would be surprised if that tool achieves what we want it to achieve.

lachlanswhyborn · 11 February 2025 23:44

I’m at the stage where I now have corrected fields contained in a NetCDF file, with variable names matching the names of the relevant fields in the original dump file (with the caveat that any “/” characters are replaced, as they are not allowed in NetCDF names). I would like to place these fields back into the original dump file, and write the results to a new fields file. @cbengel Do the tools you use cover this use case?

cbengel · 12 February 2025 00:04

For a direct replacement run by you outside of a model suite, see the example code in the ACCESS-NRI/replace_landsurface repository on GitHub (src/replace_landsurface on the main branch).
This is a simple way especially if you are in the mode of testing.

For long-term folding in of the replacement to the suite it may be more appropriate to use the UM to read in the data from the netCDF.

And for a non-global target data that requires changes of grids/regions the solution is different again.

For now though, hopefully the first option can help you test out the inputs to see if they work.

cbengel · 12 February 2025 00:17

Please note, our use-case was replacing regional fields with data cut-out from global netcdf files.

Your use-case is a direct read and replace of fields, so you can simplify your version of the code.
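For the direct read-and-replace case, a minimal sketch with python/mule might look like the following. It is not taken from replace_landsurface and has not been run against an ESM1.5 dump. The helper names, the "_" replacement character, and the idea of keying the corrected arrays by (STASH code, pseudo-level) are assumptions for illustration; reading the arrays out of the NetCDF file is left to the caller.

```python
def netcdf_safe_name(field_name, replacement="_"):
    """NetCDF names cannot contain '/', so swap it for another character.

    The replacement character is an assumption; use whatever was used
    when the NetCDF file was written.
    """
    return field_name.replace("/", replacement)


def replace_fields(dump_in, dump_out, new_data):
    """Write a copy of dump_in to dump_out with corrected fields swapped in.

    new_data maps (stash_code, pseudo_level) -> 2D numpy array with the
    same shape as the field it replaces. Returns the set of keys replaced.
    """
    import mule  # imported here so the name helper works without mule installed

    dump = mule.DumpFile.from_file(dump_in)
    replaced = set()
    for field in dump.fields:
        # lbuser4 holds the STASH code, lbuser5 the pseudo-level (tile index).
        key = (field.lbuser4, field.lbuser5)
        if key in new_data:
            field.set_data_provider(mule.ArrayDataProvider(new_data[key]))
            replaced.add(key)

    # The original file is untouched; results go to a new file.
    dump.to_file(dump_out)
    return replaced
```

Comparing the returned set against the keys of `new_data` gives a quick check that every corrected field actually found a match in the dump.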
