A centralised location for ACCESS forcing and ancil files

Hi folks,

At the moment, many of the forcing and ancillary files that ACCESS requires to run are scattered throughout various personal folders on gadi. This is a problem for payu, for example, because configurations then require users to be members of particular projects in order to run, which we can’t assume (see “I had to make some changes to get this to work on Gadi.” by tammasloughran · Pull Request #2 · coecms/esm-ssp585 · GitHub). A centralised location for all ACCESS forcing and ancillary files is needed. /g/data/access is one suggestion. The README file there also points to /projects/access, and it’s unclear to me what the difference is. Whatever is chosen, it should be well organised and documented so that it’s easy to find the files that everyone needs.

Cheers,
Tam

I’m also wondering how this relates to @clairecarouge’s management plan for CABLE extraneous files as it transitions to git.

That README is ludicrously out of date:

$ cat /g/data/access/README
================================================================================
Michael Naughton
13/12/2013

/g/data/access should be mostly obsolete now. It was created on vayu, with a view to being used on raijin, but was superseded by /projects/access. To avoid confusion, directories copied across from vayu /g/data/access have been moved to gdata_from_vayu to avoid confusion.

Exception is the newly created AccessModelExperimentLibrary (AMEL), which contains input and output files for several standard ACCESS model experiments for the NeCTAR Climate and Weather Science Laboratory.

Please contact me if this causes problems for anyone.

I agree that all such data should be in /g/data/access. @MartinDix will have some ideas I am sure.

@tammasloughran It is related in some ways. I would like to manage the input files for CABLE outside the source code repository as I don’t think it works well within the repository. That would mean having a data collection centrally managed. There could also be experiment repositories (similar to payu’s way) managed via GitHub. I need to gather more information on what people are storing in the source code repository to be sure of the details though.

I don’t think the data collection would live under /g/data/access, as not all CABLE users are members of the “access” project at NCI. And the format of the input files for CABLE offline is very different from the ancil file format for ACCESS. But the location isn’t set yet.

So a similar idea but probably different collections, different task.

And if I remember the history of it all, now /g/data/access and /projects/access point to the same location. Long story there that can be summarised by saying NCI changed the way they organise their filesystems and discontinued the /projects/ area.

/projects/access was a special case from the vayu days. It was still separate from /g/data/access on raijin (actually on scratch). It moved to /g/data/access on gadi, but the old /projects path was kept so as not to break any existing jobs. Hence the ugly
/projects/access -> /g/data/access/projects/access
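
For anyone who wants to confirm this from a login node, here is a minimal sketch that resolves both paths and compares them. The helper name `same_location` is only an illustration, and the paths are the ones quoted above; I am assuming nothing about how they are set up beyond what is said in this thread.

```python
import os


def same_location(path_a, path_b):
    """Return True if both paths resolve to the same real path once symlinks are followed."""
    return os.path.realpath(path_a) == os.path.realpath(path_b)


if __name__ == "__main__":
    # Paths as quoted in this thread; on a machine without them,
    # realpath simply returns the normalised path unchanged.
    for path in ("/projects/access", "/g/data/access/projects/access"):
        print(path, "->", os.path.realpath(path))
    print("same location:",
          same_location("/projects/access", "/g/data/access/projects/access"))
```

If the two resolved paths match, configurations that still reference /projects/access are reading the same files as those that use the /g/data path.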

Some files there may be subject to the UM licence (it’s not all that clear for data files). It’s less critical than before because code mirrors have moved to /g/data/ki32.

It should be possible to use file access control lists to make, say, /g/data/access/CABLE usable by a wider group than members of the ACCESS project.
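
As a rough sketch of what that could look like, the snippet below only builds and prints a standard `setfacl` command rather than running it. The group name `cable_users` is hypothetical, the helper `acl_command` is just for illustration, and I have not checked whether ACLs are enabled on that filesystem, so treat all of it as an assumption.

```python
import shlex


def acl_command(path, group, perms="rX", recursive=True):
    """Build a setfacl command that grants `group` the given permissions on `path`.

    "rX" means read, plus execute only on directories (and on files
    that are already executable), so users can traverse the tree.
    """
    cmd = ["setfacl"]
    if recursive:
        cmd.append("-R")  # apply to everything under path
    cmd += ["-m", f"g:{group}:{perms}", path]  # -m modifies the existing ACL
    return cmd


# "cable_users" is a hypothetical group name; print the command instead of running it.
print(shlex.join(acl_command("/g/data/access/CABLE", "cable_users")))
```

One caveat: an ACL on the CABLE directory alone would not be enough, since those users would also need traverse (execute) permission on each parent directory, including /g/data/access itself.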

Thanks Martin for clarifying. Can someone with write permissions to /g/data/access/README (access.admin) please update the README to say that /projects/access is for preservation purposes and that /g/data/access/ is the future for ACCESS ancil and forcing files?

I’ve done a basic update of the README. It would probably also be useful to have this information on access-hive.