vn19
Investigating analysis-ready data (ARD) strategies to increase impact of ocean and climate model archives at NCI
The purpose of this NCI project is to provide resources to develop and test Analysis-Ready Data (ARD) workflows for climate and ocean modelling on High-Performance Computing (HPC) systems at NCI. This project, supported by the CSIRO, aims to bring together members of the COSIMA community, the Australian Climate Service*, and other interested parties to explore and develop ARD workflows.
*NB: recent ARD discussions have already spawned ACS efforts on the Coupled Coastal Hazard Prediction System (CCHaPS) SCHISM-WWMIII model used in ACS WP3
Project Goals
The ARD project focuses on:
- Developing and testing ARD workflows using dedicated storage and compute resources at NCI
- Crowdsourcing use-cases and solutions from various organisations / projects including COSIMA and the Australian Climate Service (ACS)
- Fostering community learning and collaboration
- Exploring the value of using these approaches more systematically for a range of projects and organisations.
Stretch Goals
- Develop a small, simple package focused on NCI HPC Python workflows
- Enable new scientific publications (example: Chapman et al. 2024 (submitted), Extreme Ocean Conditions in a Western Boundary Current through ‘Oceanic Blocking’)
- Publish data science workflows
Resources
The project has been allocated the following resources by NCI:
- 100 kSU of compute
- 5TB gdata storage
- 50TB scratch storage
These resources, while modest, provide a foundation for trialling real-life workflows and developing solutions.
Code of Conduct
To ensure a productive and collaborative environment, those involved will be guided by the following principles:
- Be welcoming and kind to each other
- Open source only (MIT license)
- A safe space to discuss the data science that enables science
- Be generous with knowledge and resources
- Follow research integrity guidelines (Australian Code for the Responsible Conduct of Research 2018)
  - Authorship criteria
    - acquisition of research data where the acquisition has required significant intellectual judgement, planning, design, or input
    - analysis or interpretation of research data
  - Communicate early and often about planned publications
    - As a project evolves, it is important to continue to discuss authorship, especially if new people become involved in the research and make a significant intellectual or scholarly contribution
- Resources are limited:
  - vn19 is a sandbox: back up anything important elsewhere and don’t rely completely on vn19 for your actual deadlines and deliverables
  - vn19 is ultimately supported by the CSIRO share at NCI, and this may influence future priorities for use of resources
- Resourcing decisions (compute and storage) will be brought openly to the community, but the project lead reserves the right to be a benevolent dictator to maintain institutional support for the project.
Get Involved
We encourage community participation. Here’s how you can get involved:
- Say hello and present your use-case or problem here on the ACCESS-Hive Forum
- [If you are ready] write an issue in any open GitHub repository and make it known here on the ACCESS-Hive Forum
  - an open COSIMA repo
  - best-practice example workflow for loading ensemble ACCESS-ESM1.5 data (Issue #8, shared-climate-data-problems/CMIP-data-problems), via @jemmajeffree's issue "Using intake-esm to load an ensemble: real-world problem that might lead to a tutorial/example" (Issue #444, COSIMA/cosima-recipes)
  - any other open repo
- Join the vn19 NCI project (via MyNCI)