The Model Evaluation and Diagnostics team is working to improve how we find and load data on NCI Gadi. We would like to start a discussion on how the ACCESS model could write its output to disk in a way that removes some of the constraints we hit during the analysis phase. We believe this is crucial for evaluating model outputs more effectively.
Current Structure and Issues in the p73 Project
We’re encountering an issue with the output of the ACCESS-ESM model, specifically with the atmospheric data. The data is stored as follows:
./atm/netCDF/:
HI-CN-05.pa-185001_mon.nc
HI-CN-05.pa-185002_mon.nc
HI-CN-05.pa-185003_mon.nc
HI-CN-05.pa-185004_mon.nc
HI-CN-05.pa-185005_mon.nc
HI-CN-05.pa-185006_mon.nc
HI-CN-05.pa-185007_mon.nc
HI-CN-05.pa-185008_mon.nc
HI-CN-05.pa-185009_mon.nc
HI-CN-05.pa-185010_mon.nc
To analyze data across the entire historical period, we need to create an xarray DataArray or Iris cube by piecing together all these files. Each file contains every variable along with a large amount of metadata, which makes this process slow and resource-intensive.
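As a rough sketch of what this looks like in practice, the snippet below builds the list of monthly files and concatenates them with `xarray.open_mfdataset`. The helper names `monthly_paths` and `open_all`, the `variable` argument, and the 1850 to 2014 range (the CMIP6 historical period) are assumptions for illustration; only the first ten months appear in the listing above.

```python
from pathlib import Path


def monthly_paths(root, experiment, first_year, last_year):
    """List the expected monthly atmosphere files, in time order."""
    return [
        Path(root) / f"{experiment}.pa-{year:04d}{month:02d}_mon.nc"
        for year in range(first_year, last_year + 1)
        for month in range(1, 13)
    ]


def open_all(paths, variable):
    """Concatenate every monthly file along time and keep one variable."""
    # Imported here so the path helper above works without xarray installed.
    import xarray as xr

    ds = xr.open_mfdataset(
        [str(p) for p in paths],
        combine="by_coords",
        data_vars="minimal",
        coords="minimal",
        compat="override",
        # Every file still has to be opened and its metadata read,
        # even though all but one variable is dropped here.
        preprocess=lambda d: d[[variable]],
    )
    return ds[variable]


# Assumed range: 1850-2014, i.e. 165 years of monthly files.
paths = monthly_paths("./atm/netCDF", "HI-CN-05", 1850, 2014)
print(len(paths))  # 1980
```

Even with `preprocess` discarding the unwanted variables, each of the 1980 files must be opened once, which is where most of the cost comes from.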
How to Improve
- One Variable per File:
  - Store each variable in its own file.
  - Example:
    ACCESS-ESM1-5_atm_historical_r1i1p1f1_185001_pr.nc
    ACCESS-ESM1-5_atm_historical_r1i1p1f1_185001_tas.nc
- Use Clear and Consistent File Names:
  - Include key details like the variable name, time period, and model configuration in the file names.
  - Example:
    ACCESS-ESM1-5_atm_historical_r1i1p1f1_1850-1859_pr.nc
- Time Segments:
  - Break down data into monthly or yearly files.
  - Example:
    ACCESS-ESM1-5_atm_historical_r1i1p1f1_185001_pr.nc
- Easy Metadata Extraction:
  - Ensure the naming convention supports easy metadata extraction, aiding quick indexing and retrieval by tools like Intake.
- Borrow from CMIP Standards:
  - Although CMIP standards can be extensive, using elements such as experiment IDs, realization numbers, and variable short names can improve file organization and usability.
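To illustrate why a consistent naming convention makes metadata extraction easy, here is a minimal sketch that parses the example file names above with a regular expression. The field names (`model`, `realm`, `experiment`, `member`, `variable`, `start`, `end`) and the function `parse_filename` are assumptions for illustration, not an agreed convention.

```python
import re

# Assumed layout: <model>_<realm>_<experiment>_<member>_<time>_<variable>.nc
FILENAME_RE = re.compile(
    r"^(?P<model>[^_]+)_(?P<realm>[^_]+)_(?P<experiment>[^_]+)_"
    r"(?P<member>r\d+i\d+p\d+f\d+)_"
    r"(?P<time>\d{4,6}(?:-\d{4,6})?)_"
    r"(?P<variable>[^_.]+)\.nc$"
)


def parse_filename(name):
    """Return the metadata encoded in a file name as a dict of strings."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognised file name: {name}")
    fields = match.groupdict()
    # A single time stamp (185001) is treated as a range of length one.
    start, _, end = fields.pop("time").partition("-")
    fields["start"] = start
    fields["end"] = end or start
    return fields


record = parse_filename("ACCESS-ESM1-5_atm_historical_r1i1p1f1_1850-1859_pr.nc")
print(record["variable"], record["start"], record["end"])  # pr 1850 1859
```

Because everything a catalogue needs is in the name, an indexing tool could build its table from a directory listing alone, without opening any file. The current names (for example `HI-CN-05.pa-185001_mon.nc`) do not match this pattern and raise `ValueError`.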
Model Evaluation Tools and CF Compliance
Most model evaluation tools are built for CMIP analysis and expect data to be compliant with the CF (Climate and Forecast) conventions and, in some cases, CMORised. CMORisation aligns data with the CF conventions and CMIP standards, which simplifies analysis and comparison. However, CMORisation is resource-intensive and should be avoided if possible.