Benchcab feedback, including ideas for analysis plots

CABLE-POP: eventually, but it will take time. The main issue is what to do about the spin-up: it is too expensive and takes too long to run every time, for example during development.

Specific site: not currently. There are several reasons for this. First, there is no point in evaluating CABLE at irrigated sites for now: we know it will perform badly. Second, sites with too much missing data or short time series are poor choices for running statistics-based diagnostics.
The third reason is how me.org organises the analysis. For the moment, the analysis is linked to a given set of observations and expects to have data for all of these sites. I have suggested developing an analysis script that has access to “all site data” (except the “problematic” sites identified above) and only picks the sites we provide model outputs for. This way we could get benchcab to run any subset of site simulations. The change needed in benchcab to allow this is minimal.
One additional reason: benchcab's main purpose is to give us a reproducible, standard evaluation so we can easily compare results between versions. Additional flexibility in the tool is therefore not a priority at this point, especially because it can make it harder to determine whether two evaluations are comparable.
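
To make the site-selection idea above more concrete, here is a minimal sketch of the logic such an analysis script might use. It is not existing benchcab or me.org code: the function name `select_sites`, its arguments and the site names are all hypothetical. The script would know every site with observations, drop the "problematic" ones, and keep only the sites for which model outputs were actually provided.

```python
def select_sites(all_sites, excluded_sites, sites_with_outputs):
    """Return the sites to analyse, sorted for a reproducible order.

    all_sites:          every site the analysis has observations for
    excluded_sites:     "problematic" sites (e.g. irrigated, too much missing data)
    sites_with_outputs: sites for which model outputs were provided
    All names here are hypothetical; this is an illustration only.
    """
    candidates = set(all_sites) - set(excluded_sites)
    return sorted(candidates & set(sites_with_outputs))


# Illustrative placeholder site names, not real sites.
sites = select_sites(
    all_sites=["site_A", "site_B", "site_C"],
    excluded_sites=["site_C"],
    sites_with_outputs=["site_B", "site_C", "site_X"],
)
print(sites)
```

Sorting the result keeps the order of the analysed sites stable between runs, which fits the reproducibility goal described above. A real script would probably also want to report which provided outputs were ignored, so that it stays clear whether two evaluations cover the same sites.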