Mapping CCI Land Cover to CABLE PFTs

We have a first working configuration of the ACCESS-AM3 model with land ancillaries generated from the CCI 300m resolution land cover dataset. While there are still some small technical barriers to get over (see this thread for my updates on vegfunc ancillaries), we need to think about the primary science question: Mapping from the CCI land cover classifications to CABLE PFTs.

I plan to bring this up at one of the upcoming Land Working Group meetings. I’ve collected a series of plots in this zipfile (feel free to message me directly for a copy as well) that describe a quick first pass mapping from the CCI land cover classes to CABLE PFTs, one for each CCI class bar open water (see example below). I’d like to put this on your radar @alexnorton since you recently did something similar with the NVIS data (although I know you’re still very busy with ESM1.6).

What metrics should we be using to judge how good (scientifically) the conversions are? Do we need to think about land use change as well? Given this will eventually feed into ACCESS-ESM3, I think we will have to at some point.

@lachlanswhyborn @mlipson Quick response here - I suspect that we need to split the thinking up a bit. But certainly worth canvassing the broader community. I can see at least 2 use cases:

  1. weather/climate studies based around current/known land cover - in which case the mapping we need is directly from CCI to CABLE tiles.
  2. climate/ESM in CMIP-like applications - in which case the mapping also needs to incorporate land use change**

Our first goal is really use case 1 only. The two use cases likely need separating in the code, since different science is involved**.

**In effect, as well as the CCI->CABLE mapping, use case 2 needs a CABLE pre-industrial->CABLE present-day step that picks up what CCI/CABLE thinks the natural vegetation mix would have been, and how much LUH2/3 thinks land cover has changed. This also brings in questions around vegetation mix (tree:grass) and the detail of how to distribute the LUH and CCI forest/woodland between our 4 CABLE woody PFTs, which also links to the technical limitations of POP/POPLUC.

Do you see much value in devoting time to this tuning without CASA active? I suspect that the mapping that gives the best results for the biogeophysics may not produce optimal biogeochemistry.

Do you see the future having multiple supported land cover maps for LUC/non-LUC applications? Would that also extend to weather/climate applications? I could see there being potentially 3 land cover maps:

  1. For short-term/weather focused applications
  2. For longer term climate applications where the Carbon cycle is important
  3. For CMIP style applications with LUC

Thanks @lachlanswhyborn for this, great work.

Keep in mind this work was driven by our desire to “fairly” compare JULES and CABLE. Also, to compare how this affects AM3-n512 compared with the current NCAR ancillaries. So hopefully we’ll be able to show CCI is a good input dataset for CABLE, but we haven’t shown that yet. In any case, the workflow that has been developed can now be tailored much more easily to any input dataset using cross-walking approaches, if required.

For broader discussion, as you say this is a “quick first pass” mapping of CCI → CABLE, and will need to be checked and refined. After a very quick look, I notice that “16 Tree Needleleaved Deciduous Open” appears on all land points on the map and “17 Tree Mixed” has 10% urban (these errors are mine, as I did the initial cross-walking transcription).

Otherwise I suggest we:

  1. read through CCI documentation to understand each designation
  2. double check which Met Office science configuration we have based our initial pass on, and that it’s accurately transcribed
  3. where we want to differ from JULES cross-walking, have clearly documented reasons for this (e.g. our urban module incorporates vegetation, while the Met Office module does not, therefore we should adapt our urban mapping to account for this… I’m sure there will be other land cover classes like this)
  4. understand why the Met Office has different cross-walking tables, and the basis for these
  5. see if others (e.g. Martin Best at Met Office) have comments on our use of CCI for CABLE
  6. invite @siyuan to comment on whether we can incorporate the BOM landcover data for Australia into the workflow for our JULES/CABLE comparisons

Hi @lachlanswhyborn, nice work. Can you provide more details on the cross-walking table?

I agree with pretty much everything Mat and Ian have suggested. I think Mat’s steps provide a solid way forward.

As for evaluation of the resulting PFT classification map: it is best practice to evaluate the impact of the generated PFT map on simulated land surface variables, e.g. evapotranspiration, gross primary productivity, and albedo. There are benchmark datasets for these in ILAMB (you may be less interested in GPP for weather/climate applications, but should include it if interested in climate/ESM applications). However, as a first pass sanity check, when I created the global PFT maps from the LUH3 (global) and NVIS (for Australia only) datasets for ACCESS-ESM1.6, I used a benchmark dataset called HYBMAP to evaluate the present-day forest/tree area and cropland area. This is a rough guide only, as the HYBMAP dataset is a combination of multiple land cover products (including ESA CCI), so it’s not a strictly independent dataset, making it a bit of a circular problem.
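For the forest/cropland area sanity check, the quantity being compared is an area-weighted sum of cover fractions. A rough sketch of that calculation on a regular latitude-longitude grid is below; the function names and the spherical-Earth radius are assumptions for illustration, and this is not the code used for the ESM1.6 maps.

```python
import math

EARTH_RADIUS_M = 6.371e6  # assumed mean radius of a spherical Earth


def cell_area_m2(lat_south_deg, lat_north_deg, dlon_deg):
    """Area of a latitude-longitude cell on a sphere."""
    return (
        EARTH_RADIUS_M ** 2
        * math.radians(dlon_deg)
        * (math.sin(math.radians(lat_north_deg)) - math.sin(math.radians(lat_south_deg)))
    )


def total_cover_area_m2(cover_fraction, lat_edges_deg, dlon_deg):
    """Sum the area covered by one land cover type.

    cover_fraction[j][i] is the fraction of the cell in latitude row j and
    longitude column i that is covered; lat_edges_deg has one more entry
    than there are rows.
    """
    total = 0.0
    for j, row in enumerate(cover_fraction):
        area = cell_area_m2(lat_edges_deg[j], lat_edges_deg[j + 1], dlon_deg)
        total += area * sum(row)
    return total


# Toy 2x2 global grid: tree cover only in the northern row.
tree_fraction = [[0.0, 0.0], [0.5, 0.25]]
tree_area = total_cover_area_m2(tree_fraction, [-90.0, 0.0, 90.0], 180.0)
```

The same total computed from the generated PFT map and from the benchmark can then be compared globally or per region.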

The crosswalking table is a map from the 37 CCI land cover types to the 17 CABLE tiles (here is the archive). I can’t find a publication describing how they categorised them, i.e. what properties to attribute to each class.

It’s been a while since I have used CABLE, but we have been translating NVIS to MODIS categories for use with Noah-MP in WRF. One of the key processes that has not been explicitly mentioned is vegetation height/roughness length effects. I’d also suggest comparing the resulting vegetation height after the translation to a benchmark vegetation height dataset, e.g. the Simard et al. dataset: https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2011JG001708. Hope this is helpful.
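One simple way to frame that height comparison is to compute a fraction-weighted mean canopy height per grid cell from the translated tiles and subtract the observed height. The sketch below assumes per-PFT prescribed heights; the tile names and height values are placeholders for illustration, not CABLE's actual parameters.

```python
# Placeholder per-PFT canopy heights in metres (NOT CABLE's real values).
PFT_HEIGHT_M = {
    "evergreen_broadleaf": 35.0,
    "shrub": 1.0,
    "c3_grass": 0.5,
    "barren": 0.0,
}


def mean_canopy_height(tile_fractions, pft_height_m):
    """Fraction-weighted mean canopy height for one grid cell."""
    return sum(frac * pft_height_m[tile] for tile, frac in tile_fractions.items())


def height_bias(tile_fractions, observed_height_m, pft_height_m=PFT_HEIGHT_M):
    """Translated-map height minus observed height (positive means too tall)."""
    return mean_canopy_height(tile_fractions, pft_height_m) - observed_height_m


cell = {"evergreen_broadleaf": 0.5, "shrub": 0.25, "c3_grass": 0.25}
bias = height_bias(cell, observed_height_m=20.0)
```

Whether the mean should be taken over the whole cell or only the vegetated part is a choice that would need to match how the benchmark dataset defines canopy height.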

We’re actually using that specific canopy height dataset (or at least some version of it, at /g/data/access/TIDS/UM/ancil/atmos/master/vegetation/canopy/simard/v1/Simard_Pinto_3D_GlobalVeg_JGR.nc) in the ancillary suite. I don’t think the option of spatially variant canopy heights has been exercised in CABLE in quite a while. We briefly tested it in ESM1.6, but we weren’t happy with the results, so we went back to PFT prescribed canopy heights there.

Personally, I would prefer to keep the spatially variant canopy heights and tune the things we don’t have good observations for, but it will be getting into somewhat new territory for CABLE.

Hi @lachlanswhyborn, sorry for the slow response here. Just to clarify my question: can you tell me how the 37 CCI land cover types are translated into the 17 CABLE tile types? The CCI archive doesn’t provide that info. This is an important step that is fraught with uncertainty and arbitrary choices. For example, in the NVIS data, there are categories like “Mallee Woodlands and Shrublands” which is not simply one CABLE tile type. In reality, it is probably best represented by a mix of CABLE types (e.g. Shrub, C3Grass or C4Grass, a tree PFT of some kind, and perhaps Bare Ground too). So there need to be some rules for how to allocate that NVIS category to multiple different CABLE tiles. I have done this for ESM1.6 but the BIOS team does it slightly differently. Both will be imperfect. The same issue will be present for CCI.

At the moment, it’s a crosswalking table of dimension (N CCI classes)x(M CABLE tiles) that describes what fraction of CCI class n should be assigned to CABLE tile m. So that fractional split you mentioned is available.
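To make the (N CCI classes)x(M CABLE tiles) idea concrete, here is a minimal sketch of applying such a table to one grid cell. All names are hypothetical; the rainfed cropland row uses the 5% shrub / 90% C3 crop / 5% barren split discussed in this thread, and the broadleaf row is purely illustrative.

```python
# Hypothetical subset of CABLE tiles (columns of the crosswalk).
CABLE_TILES = ["shrub", "c3_crop", "barren", "evergreen_broadleaf"]

# Rows are CCI classes; each row gives the fraction assigned to each tile.
CROSSWALK = {
    "cropland_rainfed": [0.05, 0.90, 0.05, 0.00],
    "tree_broadleaved_evergreen": [0.00, 0.00, 0.00, 1.00],  # illustrative only
}


def cci_to_cable(cci_fractions, crosswalk, n_tiles):
    """Return CABLE tile fractions for one grid cell.

    cci_fractions maps a CCI class name to the fraction of the cell it covers.
    """
    tiles = [0.0] * n_tiles
    for cci_class, cell_fraction in cci_fractions.items():
        row = crosswalk[cci_class]
        for m in range(n_tiles):
            tiles[m] += cell_fraction * row[m]
    return tiles


# A cell that is 60% rainfed cropland and 40% broadleaved evergreen tree.
cell = {"cropland_rainfed": 0.6, "tree_broadleaved_evergreen": 0.4}
tile_fractions = cci_to_cable(cell, CROSSWALK, len(CABLE_TILES))
```

With array libraries this is just a matrix product of the per-cell CCI fractions with the crosswalk matrix; the loop form is shown only to keep the example dependency-free.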

I’m also not quite following where the fractions come from (I’m looking at the figures in the zip archive). For example, how does the code translate “Cropland rainfed” to 5% shrub, 90% C3 crop and 5% barren?

I also think we can’t do this conversion without other information, in particular on C3/C4 grass. I don’t see information on this in the CCI classes but this will be important both globally and over Australia. In the past we’ve run with just C3 grass which is obviously wrong.

I also agree with Jatin’s suggestion to try and make use of tree height data, and potentially tree fraction data where we need to decide on the CABLE fractions. But it depends on what the ultimate aim is: to create the best possible land cover dataset, or something reasonable to enable a JULES/CABLE comparison. Either way, we will definitely need to deal with the C3/C4 issue. This dataset is an option over Australia.

I would also be a bit cautious evaluating the runs. Most existing CABLE land cover maps are a bit questionable. If we are comparing to those runs, I wouldn’t be confident that a “worse” result is due to the land cover map rather than some other biases. But hopefully all we see is improvements :wink:

@aukkola Those fractions come from a JSON crosswalking table (we have an old example, which the current one is based on, that came from Mat and Siyuan’s work at the 2024 workshop). I intend to talk about the process in some detail at an upcoming LWG meeting, though unsure which one yet as we have a few things scheduled already.
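For readers who have not seen the table, a JSON crosswalk could look something like the sketch below. The schema shown here (CCI class name mapping to tile percentages) is an assumption made for illustration and may not match the actual file; the only values taken from this thread are the 5/90/5 split for “Cropland rainfed”.

```python
import json

# Hypothetical subset of CABLE tile names, in the order used for the columns.
CABLE_TILES = ["shrub", "c3_crop", "barren"]

# Assumed schema: {CCI class: {tile name: percent}}; tiles not listed are 0.
JSON_TEXT = """
{
  "Cropland rainfed": {"shrub": 5, "c3_crop": 90, "barren": 5}
}
"""


def load_crosswalk(text, tile_names):
    """Parse the JSON text into {CCI class: [fraction per tile]}."""
    raw = json.loads(text)
    table = {}
    for cci_class, percents in raw.items():
        unknown = set(percents) - set(tile_names)
        if unknown:
            raise ValueError(f"{cci_class}: unknown tiles {sorted(unknown)}")
        table[cci_class] = [percents.get(tile, 0) / 100.0 for tile in tile_names]
    return table


crosswalk = load_crosswalk(JSON_TEXT, CABLE_TILES)
```

Keeping the table in a plain text format like this makes the fractions easy to review and to swap out for a different mapping.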

I forgot to mention that the UKMO’s version of the CCI ancillary suite uses a supplementary dataset for the C4 grasses, currently /g/data/access/TIDS/UM/ancil/atmos/master/vegetation/cover/cci/v3/c4_percent_1d.nc (from ISLSCP II according to the metadata), which is at 1 degree resolution, so nowhere near as resolved as we would like. So suggestions for improved C3/C4 datasets are certainly of interest to us.
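As an illustration of how a C4 percentage field could be used, the sketch below splits a cell's total grass fraction into C3 and C4 parts. This is an assumed, simplified approach with hypothetical names, not a description of what the UKMO suite actually does with `c4_percent_1d.nc`.

```python
def split_grass(grass_fraction, c4_percent):
    """Split a grass tile fraction into (C3, C4) using a C4 percentage.

    grass_fraction is the fraction of the cell that is grass (0 to 1);
    c4_percent is the share of that grass that is C4 (0 to 100).
    """
    if not 0.0 <= c4_percent <= 100.0:
        raise ValueError(f"c4_percent out of range: {c4_percent}")
    c4 = grass_fraction * c4_percent / 100.0
    c3 = grass_fraction - c4
    return c3, c4


# A cell that is 40% grass, of which a quarter is C4.
c3_fraction, c4_fraction = split_grass(0.4, 25.0)
```

Because the C4 field is much coarser than the land cover, every 300 m cell within a 1 degree box would receive the same C3/C4 ratio under this approach.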

I think ideally we would compare to ILAMB, as suggested by @alexnorton? But like you said, it depends on what the precise point of the exercise is. I intend to keep the ancillary suite configurable where possible, so it’s easy to generate ancillaries for different applications.