Machine learning and data analysis in oceanography: 54th International Liège Colloquium on Ocean Dynamics
Data-driven approaches to understanding and predicting ocean dynamics have gained significant traction in recent years thanks to the increased quantity, coverage and quality of ocean observations, improved numerical ocean modeling, the availability of massively parallel computing devices (such as graphics processing units), and advances in optimization schemes and machine learning (ML) models able to constrain high-dimensional, non-linear systems.
Many potential applications of such statistical or ML models (sometimes in combination with dynamical modeling derived from first principles) have been developed. These applications include identifying patterns and features as well as regions with common dynamics, deriving related quantities from easily observed ones (such as the estimation of subsurface dynamics from partial observations, or of chlorophyll concentration from radiances), filling in missing data in satellite observations, and adaptive sampling methods guided by machine learning. The spatial and temporal scales resolved by different observation systems are increasingly diverse, which requires new techniques and approaches to combine them in an optimal way for estimating the state of the ocean. Statistical or ML models can also be used to identify bad or questionable observations and can help in the quality control and quality assessment of observational data products.
Data-driven techniques also have high potential to complement classical numerical modeling, as they enable new ways to formulate subgrid parameterizations, surrogate models and optimal ensemble predictions, as well as new and efficient approaches to stochastic modeling and data assimilation.
While many recent advances are in machine learning applications, progress in more traditional data-driven analysis techniques, such as optimal interpolation, variational analysis and Kalman filtering, is also within the scope of the colloquium.