COSIMA Working Group Reference Datasets FY23-24


ACCESS-NRI would like to know the needs of the Working Groups for Reference Datasets. We would like to know what datasets of significance for the working group you would like to have hosted in a published data collection at NCI within the financial year 23-24. We need to create a list of likely datasets that will be ready within this timeframe so we can start the data management process for these datasets with you and NCI.

What is a Reference Dataset

A Reference Dataset is data that is of significance to the community and will be used by a range of users. Possible examples:

  • Datasets for input in climate models (forcings, ancillary datasets etc.)
  • Datasets for evaluation of model outputs
  • Reference model outputs that are used widely

What we need from you

Proposing datasets

If you have a dataset that you think

  1. can be considered a Reference Dataset, and
  2. will be ready for publication before June 2024,

please reply to this topic with:

  • Dataset name
  • Contact to liaise with publishing data (name and organisation)
  • Dataset details to show its importance to the community
  • Required storage in GB

Voting on datasets

This topic has post voting enabled. You can make short comments on replies but you can also vote on replies. This is an experiment to crowd-source information about which datasets you value.

Up-vote datasets that are important to your work, or you think are important for ocean and sea-ice modelling at NCI. You can down-vote datasets you think should not be included. Feel free to comment as well to provide context on why you voted a certain way.

Keep up to date

If you are a member of the COSIMA Working Group you should have received this topic as an email, as we have changed the defaults so Working Group members always get notified of new topics in the Working Group category. This is to ensure you don’t miss important information.

Consider watching this topic if you want to stay updated on what is happening with datasets for the working group.

Datasets list

Below is a table that I will update summarising the most voted datasets proposed by the working group:

NSIDC Sea Ice Concentration

Climate Data Record, Daily 25km.

Contact is Walt Meier, NSIDC.

This is the most relevant long(ish) term dataset of sea ice for model evaluation. Data storage ~5GB for both poles.

Version 4 of the CRU TS monthly high-resolution gridded multivariate climate dataset will soon be available via the ACCESS-NRI replica dataset collection for model evaluation.

Ocean Reanalysis System 5 (ORAS5). Global coverage, 0.25 degrees, monthly resolution.

ECMWF: Ocean Reanalysis System 5
Available from the Copernicus Climate Data Store (CDS)

There have been discussions about adding support for ocean reanalyses in ESMValTool. This will likely start with ORAS5.

OSI-SAF / NMO Sea-Ice Drift

OSI-405C is the best of the global long-term sea ice drift / motion products to compare results against. It's a daily product, but the drift is averaged over 48 hours. 62.5 km resolution.

Size is <10 GB

(Suggestion from @adfraser)

Remote Sensing Ocean Productivity Products.

Various remote sensing ocean productivity products exist, derived from different satellites and different empirical models. The products can be used for BGC model validation, forcing ecological models, and general BGC/satellite oceanography.

This data is all downloadable from the OSU ocean productivity page as individual .tar files for each (8-day) time step. I suspect hosting gridded yearly or climatological files would be useful.
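As a rough illustration of what producing yearly files from the 8-day composites could involve, here is a minimal sketch in Python with NumPy. It assumes the 8-day grids have already been extracted and read into a single array with missing pixels set to NaN; the function name `yearly_mean` and the array layout are hypothetical and not part of any OSU tooling, and steps that straddle a year boundary would need handling that is not shown here.

```python
import numpy as np

def yearly_mean(fields, years):
    """Average 8-day composite grids into one mean grid per calendar year.

    fields: array of shape (n_steps, n_lat, n_lon); missing pixels are NaN.
    years:  sequence of length n_steps giving the calendar year of each step.
    Returns a dict mapping year -> (n_lat, n_lon) mean grid.
    """
    fields = np.asarray(fields, dtype=float)
    years = np.asarray(years)
    out = {}
    for year in np.unique(years):
        subset = fields[years == year]
        valid = ~np.isnan(subset)
        # Number of valid 8-day observations per pixel in this year
        count = valid.sum(axis=0)
        total = np.where(valid, subset, 0.0).sum(axis=0)
        # Pixels with no valid observation in the year stay NaN
        out[int(year)] = np.where(count > 0, total / np.maximum(count, 1), np.nan)
    return out
```

A climatology could be built the same way by grouping on the position of each step within the year rather than on the year itself.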

Core products include NPP from VgPM and NPP from CbPM (two fundamentally different algorithms for estimating NPP), chlorophyll (from different satellites if desired), phytoplankton biomass, phytoplankton-specific division rates, and various associated physical properties needed to estimate the rate terms.

Diagnostic products that I could contribute include grazing rate estimates from the carbon budget, which will be used to help validate WOMBAT v2 and could be useful to the ecology community.

More information on the data sets available is here:

More information on the empirical models is here:


JRA55-do v1.5.0.1 near-real-time model forcing data

  • v1.5.0.1 : temporal coverage is 01Jan2020 - latest
  • updated daily, but with 4 days delay
  • data for the past 30 days are preliminary, and replaced day-by-day by final, higher-quality data
  • dataset will cease being updated sometime in 2023 - see here

Download/update script: COSIMA/JRA55-do-1-5-0-1 on GitHub (scripts to download and update JRA55-do v1.5.0.1), which downloads any new data from here

Download location: /g/data/ik11/inputs/JRA-55/JRA55-do-1-5-0-1/, currently 188GB.

Needed for near-real-time ACCESS-OM2 runs, e.g. 01deg_jra55v140_iaf_cycle4_jra55v150_extension


I think ACCESS-OM2-01 IAF simulations definitely fall under this category. Much of this data is already publicly published on cj50 (@aekiss can perhaps clarify how much is where?), so I’m not sure how that would work moving it. But it would be great to have ACCESS-NRI look after these (and future ACCESS-OM3) model outputs that are very widely used within the COSIMA community.
