Plenary Speaker: Michael Groom, Terence J. O’Kane: Interpretable forecasts of ENSO phase at multi-year lead times using entropic learning

Interpretable forecasts of ENSO phase at multi-year lead times using entropic learning
Michael Groom, Terence J. O’Kane

Machine learning, in particular deep learning, has shown great potential in outperforming conventional GCMs at predicting ENSO, providing useful forecast skill beyond the boreal spring predictability barrier and enabling the possibility of issuing ENSO forecasts at multi-year lead times. However, despite these advancements in forecast skill, much less progress has been made on understanding and interpreting why these models are able to make such accurate predictions. In this work, we show that the recently proposed entropy-optimal Sparse Probabilistic Approximation (eSPA) machine learning algorithm is able to accurately forecast the phase of ENSO (i.e. La Niña, Neutral or El Niño) at lead times that are competitive with state-of-the-art deep learning methods (e.g. up to 24 months), while also being substantially more parsimonious in its formulation. This parsimony makes it much easier to obtain insights into the ENSO dynamics being captured when making successful forecasts at these lead times than would be possible with a “black-box” deep learning method; some preliminary findings will be presented. This talk will primarily be of interest to the Machine Learning for Climate and Weather working group.

Keywords: ENSO, machine learning, interpretability.

Please use this thread for discussion about this plenary talk.

Thanks for a great talk! One pretty basic ‘curiosity’ question - do you account in any way for broader tropical SST warming wrt the cutoffs you’re using to classify El Nino/La Nina/neutral years? Sorry if I missed something and you’re already classifying events using RSST or something(!)

Thanks for the interesting talk @mrgroom.

For your El Niño example there seemed to be a signal appearing in Hudson Bay and I assume the Great Lakes.

I assume it isn’t significant, just being affected by whatever atmospheric process is also influencing El Niño, but still seemed a bit odd.

Thanks! I forgot to mention this, but we use a sliding 30-year climatology (updated every year) and also linearly detrend the anomalies for each field. It would be straightforward to switch to a relative Nino3.4 index; the main reason we went with the standard Nino3.4 is that we wanted to be able to compare directly with operational forecasts.
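For readers who want to see the preprocessing concretely, here is a minimal NumPy sketch of a sliding 30-year climatology followed by linear detrending. It is not the authors' code. The function names, the `(n_years, 12)` input layout, and in particular the choice of a trailing window that includes the current year are all assumptions made for illustration; the thread does not say how the window is positioned.

```python
import numpy as np

def sliding_climatology_anomalies(monthly, window_years=30):
    """Anomalies relative to a climatology that is recomputed every year.

    monthly: array of shape (n_years, 12), one row per year.
    Assumption: the climatology for year y is the per-calendar-month mean
    over the `window_years` years ending at y (inclusive); fewer years are
    used at the start of the record.
    """
    monthly = np.asarray(monthly, dtype=float)
    anomalies = np.empty_like(monthly)
    for y in range(monthly.shape[0]):
        start = max(0, y - window_years + 1)
        climatology = monthly[start:y + 1].mean(axis=0)  # shape (12,)
        anomalies[y] = monthly[y] - climatology
    return anomalies

def linear_detrend(series):
    """Remove the least-squares straight-line fit from a 1-D series."""
    series = np.asarray(series, dtype=float)
    t = np.arange(series.size)
    slope, intercept = np.polyfit(t, series, 1)
    return series - (slope * t + intercept)
```

A typical use would be to compute the anomalies for one grid point or index, flatten them to a single monthly time series with `anomalies.ravel()`, and pass that to `linear_detrend`.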


Yes, I also believe that is spurious. Most likely it is a by-product of using a linear, global basis for the encoding, i.e. EOF modes containing important signals in the relevant regions also happen to have large signals elsewhere that covary, and these don’t always “cancel out” when doing the sum of squared contributions from each mode.
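A toy example may help show why remote signals can survive a sum of squared contributions. The sketch below is purely illustrative and is not the authors' method: the three-point "grid", the two made-up EOF patterns, the per-mode `weights`, and the function name `squared_contribution_map` are all hypothetical.

```python
import numpy as np

def squared_contribution_map(eofs, weights):
    """Sum over modes of the squared weighted EOF pattern.

    eofs: array of shape (n_modes, n_grid), one spatial pattern per row.
    weights: array of shape (n_modes,), hypothetical per-mode importance.
    """
    eofs = np.asarray(eofs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return np.sum((weights[:, None] * eofs) ** 2, axis=0)

# Grid point 0 stands in for the region of interest, grid point 2 for a
# remote region where the two modes have loadings of opposite sign.
eofs = np.array([[1.0, 0.0, 0.5],
                 [1.0, 0.0, -0.5]])
weights = np.array([1.0, 1.0])

signed_sum = (weights[:, None] * eofs).sum(axis=0)     # remote loadings cancel
squared_map = squared_contribution_map(eofs, weights)  # remote loadings add up
```

In the signed sum the remote point cancels to zero, whereas in the squared map it keeps a positive value, because squaring discards the sign that would otherwise let the two modes offset each other.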
