Preset outputs for paleoclimate simulations

Hi everyone, for the next version of the ESM1.5 configurations, we’re hoping to add preset output profiles for each model component tailored towards longer paleoclimate simulations. The rough plan is to start with the current Standard profiles, and remove variables that aren’t frequently analysed in paleo simulations, especially those saved at higher frequencies and with more dimensions.

Based on previous discussions, some proposals for the Reduced profiles include:

  • Remove all daily atmospheric output.
  • Remove monthly 3D ocean output, saving instead at yearly frequency.

It would be great to get feedback on whether these changes would be suitable for different projects, and to also hear if there are other variables that could be removed.

A table of the output variables from the Standard preset is available here. If there is anything you would be happy to see removed in the Reduced preset, let me know!

One specific question is whether it’s important to keep all of the land variables on the CABLE tiles. As these have 17 levels, they tend to take up quite a bit of space in the output.

@LaurieM, @dkhutch, @HIMADRI_SAINI, @gpontes, @jbrown, I’m just tagging you as it would be great to get your suggestions. Do tag anyone else too that might have ideas. Thank you!

Hi Spencer,
In general, yes, I would support removing daily atmospheric output and reducing ocean output primarily to 3D yearly and a small amount of 2D monthly data.
Unfortunately Gadi is down today and tomorrow, but on Monday I can dig up some examples of what I tend to save.
Regards,
David

I am not super interested in saving the land variables in most cases, so yes culling that would be helpful to me.
I could imagine scenarios where we only switch that on for the last ~50 years of a simulation, so that we’re not accumulating many hundreds of years of spin-up with detailed CABLE data that we never look at.
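
As a rough illustration of that idea, the sketch below decides per model year whether detailed CABLE output would be switched on. The function name, its arguments, and the 1000-year run length are all hypothetical; this is not how payu or the ESM1.5 configurations actually select outputs.

```python
def detailed_cable_output(year: int, total_years: int, detail_years: int = 50) -> bool:
    """Return True if `year` (1-based) falls within the final `detail_years` of the run."""
    if not 1 <= year <= total_years:
        raise ValueError("year must lie within the simulation")
    return year > total_years - detail_years


# Hypothetical 1000-year run: collect the years that would keep detailed CABLE output.
years_with_detail = [y for y in range(1, 1001) if detailed_cable_output(y, 1000)]
print(years_with_detail[0], years_with_detail[-1], len(years_with_detail))
```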

Thanks @dkhutch! It would be great to see examples of the variables you save once Gadi is back online!

Hi Spencer,
I agree with David that it would be best not to have daily atm. outputs in the standard version. We just need to have the possibility to turn it back on to save a few decades of outputs.
The standard ocean outputs in your link seem to be all we need.
One output that would be useful to have is the yearly total carbon inventory in the ocean and on land. That could allow for a reduced CABLE output list. The CABLE outputs are the ones we have used least so far and the ones we are struggling with the most.
Note that for paleo work we have two different types of simulation: 1) spin up, 2) generation of data. In phase 1, which can last 1000 yrs!, we have only used CABLE outputs to make sure that our vegetation/ice forcing was correct. In phase 2, we have probably used LAI and evapotranspiration…
If you had a list of outputs, I could probably tell you what we might use.
Thanks a lot for looking into this Spencer!

Hello Spencer,

Thanks for your effort on this.

I agree with @LaurieM and @dkhutch that ideally we would want two presets: one for spin up and one for data generation (equilibrium climate).

I believe that for the spin up most variables can be saved at yearly resolution, except for 3D temperature (both atmos and ocean), salinity, sea-ice extent and mixed layer depth (further suggestions are welcome). We might need these at monthly resolution for seasonality checks during the spin up. Here, we could keep CABLE variables to the minimum necessary for vegetation and carbon checks.
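
To make the suggestion concrete, here is a minimal sketch of that split as a lookup. The variable labels and the function name are made up for illustration and do not correspond to actual STASH codes or MOM/CICE diagnostic names.

```python
# Fields suggested above for monthly output during spin up (labels are illustrative only).
MONTHLY_DURING_SPINUP = {
    "atmosphere_temperature_3d",
    "ocean_temperature_3d",
    "salinity",
    "sea_ice_extent",
    "mixed_layer_depth",
}


def spinup_frequency(variable: str) -> str:
    """Return the suggested spin-up output frequency for a (hypothetical) variable label."""
    return "monthly" if variable in MONTHLY_DURING_SPINUP else "yearly"


for name in ("salinity", "ocean_velocity_3d"):
    print(name, spinup_frequency(name))
```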

For data generation, I would like to have back all oceanic and atmospheric variables that the current set up has been saving at monthly resolution. Note that, after running the ACCESS-Archiver script, we still end up with many variables in the atmosphere (or CABLE?) labelled as an “unknown field”, which is not really useful because we don’t know what they are.

Lastly, I rarely use CABLE variables in my data analyses. @LaurieM should have a better idea of what to keep. We should keep in mind that some of our paleo simulations will be part of PMIP (Paleoclimate Modelling Intercomparison Project) and others might be interested in these simulations.

Thanks,
Gabriel

I would think there should be specific profiles tailored for CMIP/PMIP submission.

Part of the problem with saving excessive model diagnostic fields has been a result of the output being tailored for CMIP submission.

Thank you everyone for your suggestions!

It sounds like a great idea to have a very pared-down Spinup profile. One thing I’d like to understand better is whether the current Standard presets would be suitable for the shorter data generation stage after the spin up, or whether it is worth having an additional paleo-specific profile here.

I think it would be useful to have an option to select either full PMIP/CMIP output or a smaller set of outputs for runs that are not planned to be part of CMIP and do not require the full set. We are struggling to find space for output from longer non-PMIP palaeo simulations, and it would be nice to have a “lean output” option to save space/time. But if we are running a standard PMIP/CMIP experiment then we would want to “switch on” the full set of outputs after the spin-up stage.

Hi @dkhutch, I just wanted to check whether you would be happy to share some examples of the variables you usually save during spin up.

Many thanks,
Spencer

Hi Spencer,
Apologies for dropping this. An example I’ve got going now is:
/home/157/dkh157/ACCESS/esm_b21_c3_p4

(let me know if you can’t see that directory and I’ll enable access…)

As you can see, my ocean outputs are pretty lean, and I’ve gotten rid of all daily atmosphere output. A typical output directory has the following sizes:
2.7G Total
1.4G ./atmosphere
232M ./ocean
1.1G ./ice
24K ./coupler
40K ./manifests

The atmosphere could shrink further if I deleted the original binary files and just kept the netcdfs. The ice is a bit wasteful as I look at only a couple of fields there (concentration and thickness usually).
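
For anyone wanting a similar per-component breakdown of their own output directory, a minimal Python sketch is below. The function names are hypothetical, and it sums apparent file sizes, so the numbers can differ somewhat from the disk usage reported by `du`.

```python
from pathlib import Path


def directory_size_bytes(path: Path) -> int:
    """Sum the sizes of all regular files below `path`, skipping symlinked files."""
    return sum(
        p.stat().st_size
        for p in path.rglob("*")
        if p.is_file() and not p.is_symlink()
    )


def component_sizes(output_dir: Path) -> dict[str, int]:
    """Map each immediate subdirectory (e.g. atmosphere, ocean, ice) to its size in bytes."""
    return {
        child.name: directory_size_bytes(child)
        for child in sorted(output_dir.iterdir())
        if child.is_dir()
    }
```

Calling `component_sizes(Path("path/to/one/output/directory"))` (a placeholder path) returns a dictionary that can be printed or compared between runs.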

Thanks @dkhutch, I can access that without any issues!

Hi everyone, apologies for the delay in updates.

I’ve attached a proposal for a spin-up preset based on a mix of @dkhutch’s example and the existing presets, with additional variables cut out (thanks @dkhutch for answering several questions about his outputs too). The spreadsheet shows the proposed “spin up” preset compared to the existing Standard preset.
spinup_preset.xls (97 KB)

Saving the variables in the spreadsheet produces ~2.4G of output per year (~900M is uncompressed CICE logs, which can hopefully be reduced through a future payu update).
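
For a sense of scale, here is a back-of-envelope sketch of what those figures imply over a long spin up. The 1000-year length is an assumption taken from the spin-up duration mentioned earlier in the thread, the sizes are treated as round decimal numbers, and the function name is made up for illustration.

```python
MB_PER_YEAR_TOTAL = 2400      # "~2.4G of output per year"
MB_PER_YEAR_CICE_LOGS = 900   # "~900M is uncompressed CICE logs"
SPINUP_YEARS = 1000           # assumption: spin-up length mentioned earlier in the thread


def storage_tb(mb_per_year: int, years: int) -> float:
    """Total storage in (decimal) terabytes for a run of `years` years."""
    return mb_per_year * years / 1_000_000


with_logs = storage_tb(MB_PER_YEAR_TOTAL, SPINUP_YEARS)
without_logs = storage_tb(MB_PER_YEAR_TOTAL - MB_PER_YEAR_CICE_LOGS, SPINUP_YEARS)
print(f"with CICE logs:    {with_logs:.1f} TB")
print(f"without CICE logs: {without_logs:.1f} TB")
```

Under these assumptions the CICE logs account for a bit over a third of the total (900 of 2400 MB per year).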

It would be great to hear whether the proposal would be suitable for your spin up simulations, and in particular: are there any variables being removed that should instead be kept, and are there any variables being kept that you would never look at during spin up?

A few more specific questions:

  • Based on discussion, I’ve removed most of the sea ice outputs other than area, thickness and velocities. Are there any other sea ice variables that would be important to keep?

  • ~640M of the output is the unprocessed timeseries output from the UM (the aiihca.pg* files), which um2nc isn’t able to handle. My understanding is that the timeseries are CO2 fluxes at a preselected set of land and ocean points. Is there any need to keep these during spin up? (@RachelLaw and @tiloz, @MartinDix mentioned that you might have used these variables in the past, so any feedback would be welcome!)

Thanks again for your feedback and ideas!

Hi @spencerwong
.pg timeseries files definitely aren’t needed for spin-up or standard experiments. They can be useful for specific applications and in model development, so it would be helpful to have an example and/or some instructions on how to set them up, but in general they should be removed. Notes on an example here: CABLE site runs with ACCESS forcing - Land Surface / CABLE - ACCESS Hive Community Forum (access-hive.org.au)

I’ve been making good use of timeseries output for CABLE3 work preparing for ESM1.6 (though I have to admit that I’m not running with payu). I’ve been using convsh to convert to netcdf.

Hi @RachelLaw,

Thank you for clarifying about the timeseries. That’s great to know that they generally wouldn’t be needed during spin up. I’ve edited the spreadsheet to remove them. Thanks for the example about setting up the timeseries – I think it would be great to eventually put together some form of documentation of working with the STASHC file covering these types of things!