How ACCESS-NRI could support ML/AI

MartinDix · 10 April 2024 23:21

The FPWG welcomes your feedback on this document. Please provide comments or input by using the reply function, or by editing the Google Doc version directly (in track changes).

How ACCESS-NRI could support ML/AI

Machine learning (ML) infrastructure is the software and hardware foundation required for the development and deployment of ML models.

Although ML models and infrastructure implementations differ widely depending on the application, a set of core components can be supported effectively across the stages of any ML workflow:

Model selection and model construction
Data ingestion and preparation
Visualisation and monitoring
Model testing
Model deployment
Evaluation and analysis
Model/experiment versioning and traceability
ML pipeline automation (pipelines can encapsulate part or all of the model development, training and inference process, from data preparation, to model training and inference, to data post-processing, monitoring, and evaluation)
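
As a rough illustration of the last item, a pipeline can be thought of as an ordered list of named stages that each take and return a shared state. The sketch below is a toy, not a proposed ACCESS-NRI design: the stage functions (`prepare`, `train`, `infer`, `evaluate`), the `run_pipeline` helper and the one-parameter linear "model" are all hypothetical names and choices made up for this example.

```python
import math
from typing import Callable, List, Tuple

# A stage is a (name, function) pair; each function maps a state dict to a new one.
Stage = Tuple[str, Callable[[dict], dict]]


def prepare(state: dict) -> dict:
    # Data preparation: centre inputs and targets on their means.
    x_mean = sum(state["x"]) / len(state["x"])
    y_mean = sum(state["y"]) / len(state["y"])
    return {
        **state,
        "y_mean": y_mean,
        "xc": [v - x_mean for v in state["x"]],
        "yc": [v - y_mean for v in state["y"]],
    }


def train(state: dict) -> dict:
    # "Training": least-squares slope of a straight line (toy stand-in for a model).
    num = sum(a * b for a, b in zip(state["xc"], state["yc"]))
    den = sum(a * a for a in state["xc"])
    return {**state, "slope": num / den}


def infer(state: dict) -> dict:
    # Inference: apply the fitted slope to the centred inputs.
    pred = [state["slope"] * a + state["y_mean"] for a in state["xc"]]
    return {**state, "pred": pred}


def evaluate(state: dict) -> dict:
    # Evaluation: root-mean-square error of predictions against targets.
    sq = [(p - t) ** 2 for p, t in zip(state["pred"], state["y"])]
    return {**state, "rmse": math.sqrt(sum(sq) / len(sq))}


def run_pipeline(stages: List[Stage], state: dict) -> dict:
    # Run the stages in order and record which ones ran, for traceability.
    history = []
    for name, fn in stages:
        state = fn(state)
        history.append(name)
    return {**state, "history": history}


PIPELINE: List[Stage] = [
    ("prepare", prepare),
    ("train", train),
    ("infer", infer),
    ("evaluate", evaluate),
]

# Illustrative inputs only; these numbers are not real data.
result = run_pipeline(PIPELINE, {"x": [0.0, 1.0, 2.0, 3.0], "y": [1.0, 3.0, 5.0, 7.0]})
print(result["history"], result["slope"], result["rmse"])
```

Keeping each stage as a separate function is what allows a pipeline to encapsulate only part of the workflow: dropping or swapping entries in `PIPELINE` changes what is automated without touching the other stages.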

Below we outline how the ACCESS-NRI could support ML infrastructure for the benefit of the Australian research community. The NRI is well positioned to take a leading role in building a connected Australian community that advances ML for weather and climate applications and stimulates collaboration across organisations. The timing is opportune given how rapidly the field is advancing. It is clear that ML will be a key part of the future of our weather and climate systems.

Technical Infrastructure

Support, extend and develop tools for use at every stage of the ML workflow to enable the community to:
- Use training pipelines and lower the effort for model training
- Ingest, prepare and manipulate data required for training
- Visualise and monitor at every stage of the ML workflow
- Perform integration tests and error-checking at various stages of the workflow
- Access evaluation tools to, for example, lower the effort to produce first-glance scorecards of performance
- Ensure appropriate version control and traceability of model pipelines, weights, and data pre- and post-processing
Maintain reference implementations of pre-trained ML models, i.e., neural earth system models (pure ML weather/climate models) and other ML models (e.g., for downscaling, climate driver index prediction, image segmentation, and object identification and tracking). Potentially also maintain user-trainable reference ML model implementations.
Support infrastructure frameworks required for implementing ML emulators into dynamical model parameterisations.
Provide support and advice to scientists and project teams on planning the HPC hardware needed for ML (e.g., use of GPUs), which can have a huge impact on performance and cost.
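
To make the "first-glance scorecard" idea above concrete, here is a minimal sketch, assuming a scorecard is simply a small table of summary error statistics for each model against a common reference. The functions `scorecard` and `format_scorecard`, the model names and all of the numbers are hypothetical and purely illustrative; a real evaluation tool would work on gridded fields and use domain-appropriate metrics.

```python
import math
from typing import Dict, List


def scorecard(reference: List[float], forecasts: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Compute bias, MAE and RMSE of each forecast against the reference."""
    rows = {}
    for name, values in forecasts.items():
        if len(values) != len(reference):
            raise ValueError(f"{name}: length does not match the reference")
        errors = [f - r for f, r in zip(values, reference)]
        n = len(errors)
        rows[name] = {
            "bias": sum(errors) / n,
            "mae": sum(abs(e) for e in errors) / n,
            "rmse": math.sqrt(sum(e * e for e in errors) / n),
        }
    return rows


def format_scorecard(rows: Dict[str, Dict[str, float]]) -> str:
    """Render the scores as a plain-text table, best (lowest) RMSE first."""
    lines = [f"{'model':<12}{'bias':>8}{'mae':>8}{'rmse':>8}"]
    for name, s in sorted(rows.items(), key=lambda kv: kv[1]["rmse"]):
        lines.append(f"{name:<12}{s['bias']:>8.2f}{s['mae']:>8.2f}{s['rmse']:>8.2f}")
    return "\n".join(lines)


# Made-up values for illustration only (e.g., temperatures at four points).
reference = [20.0, 22.0, 19.0, 21.0]
forecasts = {
    "model_a": [21.0, 23.0, 20.0, 22.0],
    "model_b": [20.0, 21.0, 19.0, 23.0],
}

rows = scorecard(reference, forecasts)
print(format_scorecard(rows))
```

Even a table this simple shows why more than one metric is useful: in the toy data, `model_a` has the lower RMSE but a consistent positive bias, while `model_b` has a smaller bias and a larger RMSE.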

Documentation

Documentation and user guides for reference model implementations
Documentation and user guides for supported tools and infrastructure
“Getting started” primer information and various guides for using ML
Collections of ML use-case examples
Collections of applicable papers
Documentation for computational and data scientists getting started with scientific ML, and for physical scientists getting started with ML

Data

Support and promote key datasets used for ML training and application
Where appropriate, make input and output data transparent, open and accessible
Provide advice on appropriate use of data storage (both input and output)

Research support and leadership

Provide advice and guidance, e.g., on the application and use of various ML architectures for different applications, on common pitfalls or misuse of ML tools and methods, and on methods for the interpretation and improvement/refinement of ML models
Maintain a benchmarking web page for popular and high-profile ML models in Australia on scores and use cases of particular interest to the Australian region and community
Support the development of community ML weather and/or climate models

Community support and development

Support and instigate community events (e.g., hackathons, workshops)
Facilitate regular working group meetings to share information and results related to ML architectures, experimental design, evaluation, challenges and opportunities
Support training for new science and data science graduates/ECRs to enable them to blend ML and geosciences in their work (build the future workforce)
Promote career pathways for physical scientists adopting ML, and for computational and data scientists considering a science pathway
Provide a platform (Hive) for communication and collaboration
Provide an HPC hardware environment (NCI project) to perform experiments of interest to the ACCESS-NRI working group community

Terry · 1 May 2024 06:19

While I applaud the ambitious nature of the document, I am concerned that the scope is so broad and all-encompassing as to be of little use in developing a framework.

For example, the generic use of the term ML is pretty unhelpful. ML encompasses everything from Neural Networks, Random Forests, Gaussian state space models, entropic AI, Bayesian inference, Neural ODEs, … the list goes on.

Applications range from subgrid modelling to parameter estimation, prediction, etc.; however, the application will determine the appropriate methodology and training data requirements.

The document seems to suggest that the ACCESS-NRI will train people in data science, which, I would argue, is not its role, and it doesn’t have the capacity or expertise to do so. While there are certainly fairly simple, straightforward-to-learn ML methods (random forests, for example), many require considerable background understanding if one is to do more than simply run a pre-canned tool.

I would suggest a more carefully planned and staged approach to:
1: Identify and coordinate a community of practice. The data science community is large and diverse (UDASH, CSIRO Environment & Data61, and universities are all active in this space), with many of us already actively engaged in problems in the earth sciences. The data assimilation community is a natural ally in this space and is actively engaging with the ML community.
2: Make training data readily searchable and available, which would be a huge step for the wider community.
3: Deliver the bullet points around infrastructure support and HPC for ML practitioners, which would be highly useful, especially regarding GPU implementation.

The wider aspirational goals would naturally follow from having an engaged community, with a code base etc. built over time.

lachlanswhyborn · 10 September 2024 03:46

This working group is being closed. The topics will remain visible while we decide what to archive for posterity and how.
