ECMWF Releases Software Strategy

The European Centre for Medium-Range Weather Forecasts (ECMWF) has released its software strategy, which runs to 2027. From the highlights reel:

Following a decade of adopting an open source policy for part of our software stack, we are now further engaging with the community. We are adopting a full open development policy and moving our software development to an open GitHub based workflow. Here, external contributors will be able to see any challenges we face, updates we make, and how we test.

This makes our internal development process transparent, helping to establish true partnerships with external contributors, thus strengthening our work with our Member States and the community in general.

This is great news, and something we’d hope other large modelling centres would emulate.

From the technical report:

ECMWF has had an open-source policy for all non-IFS software for many years and has also been making much of its open-source software available on GitHub. However, in most cases our developers primarily interact with the code on our internal BitBucket servers, with the GitHub copy available for external users to use and contribute to. For the most part, Continuous Integration (CI) and Continuous Deployment (CD), issue handling and documentation are handled using our internal Atlassian systems.

Whilst we are aware that there is potentially the extra burden of managing the interactions with external contributors, experience in recent years has shown that the cost-benefit is very favourable, with the community quite often improving the code and, moreover, increasing the interaction with Member and Co-operating States (e.g., with the Met Office in odc and eckit, and Météo-France within Atlas library).

We aim to build on the progress already made in the further adoption of GitHub and other open platforms, including for CI/CD and software documentation. This will bring ECMWF’s software practices into line with much of the rest of the scientific software community and encourage greater collaboration.

This is also the approach ACCESS-NRI is taking with its development processes: to be as open as possible to you, the community, and to encourage engagement in software development. We also aim to reduce technical debt (all the “quick” fixes made over the years to “just make it work”) and to embrace CI/CD (Continuous Integration and Continuous Deployment) for testing and for deploying software and documentation.

Some other interesting quotes:

We will restructure the software into smaller components that can seamlessly integrate with each other and with packages that the community already provides. This will make it more accessible to the community, and compatible with commonly used software packages.

This is also a laudable goal. Hopefully this means our community can make use of the excellent work ECMWF does, and we can also strive for the same goal.

As our data grow in volume and throughput, it is imperative that we focus on data-centric workflows. This means preparing our software and services to use data at source, minimising the movement of data, which is energy intensive.

To increase efficiency, for example, we allow users to bring their workflows into our data centre by running their analyses within the European Weather Cloud.

This is an interesting development, somewhat bucking the international trend of outsourcing data provision to the big commercial players (Amazon, Google, Microsoft). ECMWF is saying the data is too big to keep moving around, so instead it is welcoming others inside its tent. The big gotcha is: who gets access to this tent? If anyone knows about ECMWF’s plans in this area, it would be fascinating to hear about them.

We are updating our software with the user community and open standards in mind to enable higher interoperability of the systems that we manage.

An example will be the provision of our data via the new Open Geospatial Consortium APIs that are currently being developed.

Again, this is a great initiative. At ACCESS-NRI we’ve only got as far as acknowledging the technical gulf that too often separates the climate data and geospatial communities, and wanting to address this for ACCESS. Any comments @rbeucher?