WW3 post-processor for CF-compliance

Hi @Aidan,
Following up on our chat just now on the side of the RSE community meeting.

  1. Yes, xp65 does contain the CF checker! /g/data/xp65/public/apps/med_conda_scripts/analysis3-25.08.d/bin/cchecker.py (a minimal invocation sketch follows this list).
  2. The WHACS (wave hindcast for ACS) data presented at eResearch (CSIRO Data Access Portal) is run on a “spherical multi-cell” (“SMC”) grid, which is a type of unstructured grid. For this dataset we went via Zarr: first we created Zarr stores in which the points were re-ordered to be more spatially contiguous and the data chunked appropriately, and then those chunked, re-ordered Zarrs were converted back to netCDF to follow the ACS DRS and meet metadata publishing standards (more tools can read netCDF than Zarr, and a lot of our users are non-experts who would frankly prefer we provided the data as CSV :smiley:). A rough sketch of that intermediate-Zarr step is below. There are also structured regrids of the dataset; these we didn’t pass through the intermediate Zarr step, partly because it wasn’t needed and partly because we ourselves work almost exclusively with the SMC data, so we don’t care so much about Zarring the regrids. Our benchmarking suggests that well-structured netCDFs aren’t much slower than the Zarrs once you run appropriate cataloguing over the top via Icechunk/kerchunk etc. That last part is a work in progress, mostly done by Ben Leighton and Blake Seers as needed. (CC @Thomas-Moore)
  3. Regarding access to the WW3 post-processing code: it’s on GitHub, but the repo itself is set to private (mostly because we think the commit history probably has some hardcoded paths in it) - https://github.com/AusClimateService/WHACS - very happy to add people on an individual basis though, so let me know GitHub usernames to add. If you’re not working with SMC data, the only thing you need to clean the output (e.g. fix numerical overflows) and get it standards compliant is this script, adjusted for your dataset obviously: https://github.com/AusClimateService/WHACS/blob/main/postproc/hindcast_reg_to_nc.py (the last sketch below gives the general flavour).
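
For point 1, here’s a minimal sketch of invoking that checker on a single file from Python. The `--test cf:1.7` flag and the filename are my assumptions (the script looks like the IOOS compliance-checker entry point), so adjust for whatever checks you actually want:

```python
# Minimal sketch: run the xp65 checker over a single file.
# Assumptions: the xp65 analysis3-25.08 environment is active (so "python"
# resolves to it), the CF 1.7 test name follows the IOOS compliance-checker
# CLI, and the input filename is hypothetical.
import subprocess

CHECKER = ("/g/data/xp65/public/apps/med_conda_scripts/"
           "analysis3-25.08.d/bin/cchecker.py")

result = subprocess.run(
    ["python", CHECKER, "--test", "cf:1.7", "my_wave_hindcast.nc"],
    capture_output=True,
    text=True,
)
print(result.stdout)
print("compliant" if result.returncode == 0 else "NOT compliant")
```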
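For point 2, this is roughly what the intermediate-Zarr step looks like; every name here (the seapoint dimension, the ordering, the chunk sizes, the filenames) is a made-up placeholder, since the real scripts live in the private repo:

```python
# Rough sketch of the intermediate-Zarr step: re-order the unstructured SMC
# sea points so spatial neighbours sit next to each other, re-chunk for the
# intended access pattern, write a Zarr store, then convert back to netCDF
# for publication. The "seapoint" dimension, "longitude" ordering, chunk
# sizes and filenames are all hypothetical placeholders.
import numpy as np
import xarray as xr

ds = xr.open_mfdataset("ww3_smc_raw_*.nc", combine="by_coords")

# Placeholder ordering; in practice you'd use something like a space-filling
# curve over the cell centres so nearby sea points become contiguous.
order = np.argsort(ds["longitude"].values)
ds = ds.isel(seapoint=order)

# Chunk for the access pattern you care about, then write the intermediate Zarr.
ds = ds.chunk({"time": 720, "seapoint": 4096})
ds.to_zarr("whacs_intermediate.zarr", mode="w")

# Re-open the chunked, re-ordered store and write the publication netCDF.
xr.open_zarr("whacs_intermediate.zarr").to_netcdf("whacs_reordered.nc")
```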
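And for point 3, a hedged sketch of the kind of clean-up that script does; the variable names, thresholds and attributes here are placeholders rather than anything copied from the actual code:

```python
# Hedged sketch of the kind of clean-up the post-processing script does:
# mask implausible (overflowed) values and attach CF metadata before writing.
# The "hs" variable, the 1e3 threshold and the attributes are placeholders,
# not taken from the actual WHACS code.
import xarray as xr

ds = xr.open_dataset("ww3_regridded_raw.nc")

# Treat implausibly large magnitudes as numerical overflow and mask them out.
ds["hs"] = ds["hs"].where(abs(ds["hs"]) < 1.0e3)

# Per-variable CF metadata a checker will look for.
ds["hs"].attrs.update({
    "standard_name": "sea_surface_wave_significant_height",
    "units": "m",
    "long_name": "significant height of wind and swell waves",
})

# Minimal global attributes for CF/ACDD compliance.
ds.attrs.update({
    "Conventions": "CF-1.8",
    "title": "WW3 wave hindcast (regridded), cleaned for publication",
})

ds.to_netcdf(
    "ww3_regridded_clean.nc",
    encoding={"hs": {"dtype": "float32", "_FillValue": 1.0e20}},
)
```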

Thanks @ClaireT!

I’m interested in standards compliance for global metadata and I’m working with @joshuatorrance to improve and standardise the global metadata of ACCESS model outputs.

Some related GitHub issues:

I need a tool to use in CI pipelines to check standards compliance.

The last release of cf-checker was four and a half years ago. Maybe it’s feature complete, but it seems there isn’t much resourcing for support.

Maybe compliance-checker is a better option, but it is quite verbose, and I get the impression from their docs that they see it as a tool to be run interactively by a person rather than in a CI pipeline.
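
That said, it does expose a Python entry point that can be scripted, so it could still be wrapped for CI. A minimal sketch, with the dataset path and checker list as placeholders (and the return-value semantics worth double-checking against the compliance-checker docs):

```python
# Sketch of wrapping the IOOS compliance-checker in a CI job: run CF and ACDD
# suites, write a JSON report as a build artifact, and exit nonzero so the
# pipeline fails if the dataset doesn't meet the criteria. Paths are placeholders.
import sys

from compliance_checker.runner import CheckSuite, ComplianceChecker

# Discover all installed checker classes (CF, ACDD, ...).
check_suite = CheckSuite()
check_suite.load_all_available_checkers()

return_value, errors = ComplianceChecker.run_checker(
    "outputs/access_om3_monthly.nc",  # hypothetical dataset path
    ["cf:1.7", "acdd"],
    0,         # verbosity
    "normal",  # criteria ("lenient", "normal" or "strict")
    output_filename="compliance_report.json",
    output_format="json",
)

# Fail the CI job if the run hit errors or the dataset didn't pass.
if errors or not return_value:
    sys.exit(1)
```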

Yep, right, that is an issue. The CEDA checker does have some movement on GitHub, but the IOOS one is much more up to date (though still a version or two behind where we might like).
We manually checked a selection of files rather than checking every one.
Kelsey (@kdruken) had a tool at NCI to run checkers over whole datasets and produce summaries of the issues found. I don’t know if she still has that code or whether it’s disappeared into the NCI archival ether, but I understand PaolaP has been working on similar functionality at NCI. She’s on leave at the moment, but @Hannes may know more?
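
A crude stand-in for that sort of whole-dataset sweep isn’t hard to knock together, though; something along these lines (the checker path is the xp65 one from above, the dataset root is a placeholder):

```python
# Throwaway sketch of a whole-dataset sweep: run the checker over every netCDF
# under a root directory and print a per-file status plus a final tally.
# The dataset root is a placeholder; the checker path is the xp65 one above.
import subprocess
from pathlib import Path

CHECKER = ("/g/data/xp65/public/apps/med_conda_scripts/"
           "analysis3-25.08.d/bin/cchecker.py")
ROOT = Path("/g/data/xx99/my_dataset")  # hypothetical dataset root

files = sorted(ROOT.rglob("*.nc"))
failures = []
for nc in files:
    result = subprocess.run(
        ["python", CHECKER, "--test", "cf:1.7", str(nc)],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        failures.append(nc)
    print(f"{'FAIL' if result.returncode else 'ok  '}  {nc}")

print(f"\n{len(files) - len(failures)} of {len(files)} files passed")
```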


Yeah Paola’s tool is referenced in the wiki.

Sorry to have hijacked your topic!
