COSIMA Working Group Reference Datasets FY23-24

Introduction

ACCESS-NRI would like to understand the Working Groups' needs for Reference Datasets. In particular, we would like to know which datasets of significance to the working group you would like hosted in a published data collection at NCI within the 2023-24 financial year. We need a list of datasets likely to be ready within this timeframe so that we can start the data management process for them with you and NCI.

What is a Reference Dataset?

A Reference Dataset is data that is of significance to the community and will be used by a range of users. Possible examples:

  • Datasets for input in climate models (forcings, ancillary datasets etc.)
  • Datasets for evaluation of model outputs
  • Reference model outputs that are used widely

What we need from you

Proposing datasets

If you have a dataset that you think

  1. can be considered a Reference Dataset, and
  2. will be ready for publication before June 2024,

please reply to this topic with:

  • Dataset name
  • Contact to liaise with publishing data (name and organisation)
  • Dataset details to show its importance to the community
  • Required storage in GB
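If you are unsure of the storage figure, the sketch below is one way to estimate it for a dataset you already hold on disk. It is only an illustration: the function name `dataset_size_gb` is made up for this example, it counts 1 GB as 10**9 bytes, and it reports the summed file sizes rather than the disk usage NCI would account against a project.

```python
import os


def dataset_size_gb(root):
    """Return the total size of all regular files under root, in GB (10**9 bytes)."""
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so that linked data is not counted as part of the dataset.
            if not os.path.islink(path):
                total_bytes += os.path.getsize(path)
    return total_bytes / 1e9


# Example call (the path is a placeholder, not a real location):
# print(f"{dataset_size_gb('/path/to/your/dataset'):.1f} GB")
```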

Voting on datasets

This topic has post voting enabled. You can make short comments on replies but you can also vote on replies. This is an experiment to crowd-source information about which datasets you value.

Up-vote datasets that are important to your work, or you think are important for ocean and sea-ice modelling at NCI. You can down-vote datasets you think should not be included. Feel free to comment as well to provide context on why you voted a certain way.

Keep up to date

If you are a member of the COSIMA Working Group you should have received this topic as an email, as we have changed the defaults so Working Group members always get notified of new topics in the COSIMA > Working Group category. This is to ensure you don’t miss important information.

Consider watching this topic if you want to stay updated on what is happening with datasets for the working group.

Datasets list

Below is a table that I will update summarising the most voted datasets proposed by the working group:

15 Answers

World Ocean Atlas, temperature and salinity. This is used for ACCESS-OM initial conditions and salinity restoring.

I'll also just add that the individual profiles (i.e., not gridded data) from WOD would be very useful to have for ocean observations analyses.

–

FYI I've downloaded WOA23 to /g/data/ik11/observations/woa23

–

NSIDC Sea Ice Concentration

Climate Data Record, Daily 25km.

Contact is Walt Meier, NSIDC.

This is the most relevant long(ish) term dataset of sea ice for model evaluation. Data storage ~5GB for both poles.

This is available as a Tier 3 (Licensing restrictions) dataset for ESMValTool and ILAMB. I need to figure out how to make it available to the community. I am planning to create a CF-compliant version that can be ingested by ESMValTool and ILAMB.

–

This request is now being handled as part of the request for evaluation datasets.

–

Version 4 of the CRU TS monthly high-resolution gridded multivariate climate dataset

https://www.nature.com/articles/s41597-020-0453-3
Will be available via the ACCESS-NRI replica dataset collection for model evaluation soon.

JRA55-do v1.5.0.1 near-real-time model forcing data

  • v1.5.0.1 : temporal coverage is 01Jan2020 - latest
  • updated daily, but with 4 days delay
  • data for the past 30 days are preliminary, and replaced day-by-day by final, higher-quality data
  • dataset will cease being updated sometime in 2023 - see here

Download/update script: GitHub, COSIMA/JRA55-do-1-5-0-1 (scripts to download and update JRA55-do v1.5.0.1). It downloads any new data from here.

Download location: /g/data/ik11/inputs/JRA-55/JRA55-do-1-5-0-1/, currently 188GB.

Needed for near-real-time ACCESS-OM2 runs, e.g. 01deg_jra55v140_iaf_cycle4_jra55v150_extension

Does anyone use any ocean reanalysis products that they would like ACCESS-NRI to curate? @taimoorsohail @Thomas-Moore ? Or are these already managed by NCI?

BRAN2020 (now updated through 2023) is already on NCI (https://my.nci.org.au/mancini/project/gb6) at 16TB on disk and 50TB+ unpacked as float32. It offers 30 years of daily, global data at 1/10th degree. Apologies if I misunderstood what "reference datasets" meant. (I should read above!) One suggestion would be to simply have this data in the ACCESS-NRI catalog. I do have working code to build an intake sub-catalogue for BRAN2020.

–

Thanks for the ping! I use ERA5 (already managed and available on NCI: https://opus.nci.org.au/display/ERA5/Data+Access), COREv2 (which only has data until 2006 and is no longer being updated, so may not be as relevant) and OAFlux: https://climatedataguide.ucar.edu/climate-data/oaflux-objectively-analyzed-air-sea-fluxes-global-oceans. Not sure if these constitute "reference datasets" but are good reanalysis products to compare in-situ ocean properties and model surface fluxes with.

–

I’m not sure if “reference datasets” includes gridded observations or in-situ profiles, as they can be used to compare with models, but I’ll put them down here just in case:

  • Dataset name: EN4 - Met Office Hadley Centre observations datasets
  • Contact to liaise with publishing data (name and organisation): Rachel Killick, Met Office UK
  • Dataset details to show its importance to the community: Gridded 1x1-degree monthly QC’d T and S dataset (corrected, QC’d profiles are also available) from ~1900s to present. EN4 corrects, where possible, errors in T and S observations that are flagged in the WOD profile database (or removes them), resulting in a high(er?) quality gridded product for comparison with CMIP/ocean models.

Ocean Reanalysis System 5 (ORAS5). Global coverage, 0.25 degrees. Monthly resolution.

ECMWF: Ocean Reanalysis System 5
Available from the CDS https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-oras5?tab=overview

There have been discussions on adding support for Ocean reanalysis in ESMValTool. This will likely start with ORAS5.

OSI-SAF / NMO Sea-Ice Drift

OSI-405C is the best of the global long-term sea ice drift / motion products to compare results against. It's a daily product, but the drift is averaged over 48 hours. Resolution is 62.5 km.

Size is <10 GB

(Suggestion from @adfraser)

https://osi-saf.eumetsat.int/products/osi-405-c

Remote Sensing Ocean Productivity Products.

Various remote sensing ocean productivity products exist, derived from different satellites and different empirical models. The products can be used for BGC model validation, forcing ecological models, and general BGC/satellite oceanography.

This data is all downloadable from the OSU ocean productivity page as individual .tar files for each (8-day) time step. I suspect hosting gridded yearly or climatological files would be useful.

Core products include NPP from VgPM and NPP from CbPM (which are two fundamentally different algorithms for estimating NPP), Chlorophyll (from different satellites if desired), phytoplankton biomass, phytoplankton specific division rates, and various associated physical properties needed to estimate the rate terms.

Diagnostic products that I could contribute include grazing rate estimates from the carbon budget, which will be used to help validate WOMBAT v2 and could be useful to the ecology community.

More information on the data sets available is here:
http://sites.science.oregonstate.edu/ocean.productivity/site.php

More information on the empirical models is here:
http://sites.science.oregonstate.edu/ocean.productivity/vgpm.model.php

I think ACCESS-OM2-01 IAF simulations definitely fall under this category. Much of this data is already publicly published on cj50 (@aekiss can perhaps clarify how much is where?), so I’m not sure how that would work moving it. But it would be great to have ACCESS-NRI look after these (and future ACCESS-OM3) model outputs that are very widely used within the COSIMA community.

Details on the breakdown between ik11 and cj50 are here. Nearly everything of interest is in cj50. ik11 just has some 3D daily data south of 60S and passive tracers in cycle 3, and the near-real-time data 01deg_jra55v140_iaf_cycle4_jra55v150_extension. Most of the cj50 data is published via THREDDS.

–

Ok, looks like we are fine for the ACCESS-OM2-01 IAF simulations, so scratch that last suggestion.

However, it was clear from the COSIMA meeting today that many people have published with and are still using the ACCESS-OM2-01 RYF simulation (/g/data/ik11/outputs/access-om2-01/01deg_jra55v13_ryf9091/) and we cannot delete any of it. This is currently on ik11 and we don’t have space to publicly publish this on cj50. Could we please turn this into a reference dataset? It is 125 TB.

Another observational dataset:

  • Dataset name: IAPv4 - Temperature at: CAS Oceanographic Data Center;
    Salinity at: CASEarth Data Sharing and Service Portal
  • Contact to liaise with publishing data (name and organisation): Lijing Cheng, Institute of Atmospheric Physics, Chinese Academy of Sciences
  • Dataset details to show its importance to the community: Gridded 1-degree monthly global T and S dataset from 1940 to 2022. The IAPv4 dataset is important as it uses a different interpolation technique from that of optimally interpolated datasets like WOA or EN4. This yields different T and S estimates, particularly in more poorly observed polar regions.

Hi! I was wondering if there have been any updates on this? There are a couple of these datasets (NSIDC and WOA) that I’d like to use but can’t find on Gadi, so I wanted to check the status.

Have a look in /g/data/ik11/observations for NSIDC and WOA.
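If it helps, here is a minimal sketch for checking which matching directories exist there. The helper `find_datasets` is hypothetical, and it assumes the directory names contain the keywords you search for. That holds for woa23 (mentioned earlier in this thread), but the name of the NSIDC directory is a guess.

```python
from pathlib import Path


def find_datasets(root, keywords):
    """Return sorted names of subdirectories of root whose names contain any keyword.

    Matching is case-insensitive. Returns an empty list if root is not a directory.
    """
    root = Path(root)
    if not root.is_dir():
        return []
    wanted = [k.lower() for k in keywords]
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir() and any(k in entry.name.lower() for k in wanted)
    )


# Example call on Gadi (keywords are assumptions about the directory names):
# print(find_datasets("/g/data/ik11/observations", ["nsidc", "woa"]))
```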

–

Hi @JuliaN, apologies for the delay on this. I’ve recently joined ACCESS-NRI and will be working through these requests with the WGs. It will take a while for us to get through them all but I’ll get back to you (and the COSIMA WG) with more information as soon as possible. Thanks.

Southern Ocean Monthly Climatology (Yamazaki et al. 2024)