Daily total rainfall from AUS2200 output

Does this look more sensible: AUS2200_1day_rain.ipynb | notebooksharing.space

Hi @dale.roberts, yeah this looks more like it. I'm going to check that it is the same as calculated with CDO.

I've checked it and it is identical to the CDO-calculated daily rainfall. Thanks a lot! So is the critical difference in these lines for importing all the files?

flist=sorted(glob('/g/data/v45/cc6171/experiments/ECL-2016_smallshift/netcdf/u-cs142/2016060*T*Z/aus2200/d0198/RA3/um/umnsa_spec_*.nc'))

and/or

ds = xr.open_mfdataset(flist, parallel=True, preprocess=lambda x: x['fld_s04i201'])

Hi @cchambers

It's both of those lines together. The glob function follows shell rules when expanding filenames, so that will return a list of all of the umnsa_spec files belonging to the experiment. There is no guarantee of any order to that list, and I'm not sure it's strictly necessary to pass it through sorted, but doing so makes it easy to spot if the pattern didn't match all the files we want. It probably also helps xarray merge across the time dimension when not using dask, as the files will be opened in time order, which should simplify that operation. This is the easiest way to get xarray to construct a single dataset out of all of the data we're interested in. You can combine across time after loading the data, but I find the fewer calls into xarray the better.
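As a minimal sketch of that first step (same path pattern as above, with the import it needs):

from glob import glob

# Shell-style wildcard expansion; the returned order is not guaranteed,
# so sort to put the files in time order (the timestamp is in the filename)
flist = sorted(glob('/g/data/v45/cc6171/experiments/ECL-2016_smallshift/netcdf/u-cs142/2016060*T*Z/aus2200/d0198/RA3/um/umnsa_spec_*.nc'))

# Quick sanity check that the pattern matched everything we expect
print(len(flist))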

The key argument in xr.open_mfdataset is preprocess=lambda x: x['fld_s04i201']. This makes xarray filter out everything except the field we're interested in (and its dimensions) before it tries to create the full dataset. Doing this saves a huge amount of resources: my notebook uses under 8 GB of memory all up, so it runs in a couple of minutes on a medium ARE session. parallel=True just tells it to use the dask cluster I opened with client=Client() to run the open and preprocess steps.
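Putting the two quoted lines together with their imports and the dask cluster, a minimal sketch of the whole open step looks like this (standard xarray/dask usage, following the lines above):

import xarray as xr
from dask.distributed import Client

client = Client()  # local dask cluster; parallel=True below runs on it

# Drop everything except the rain field (and its coordinates) from each
# file before xarray combines them - this keeps the memory footprint small
ds = xr.open_mfdataset(flist, parallel=True,
                       preprocess=lambda x: x['fld_s04i201'])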

Hi @cchambers - I've just been alerted to this discussion by a Hive Forum Summary email. Hope this response is not too late to be useful :slight_smile: It sounds like you have your technical issues sorted, but I thought I would join the discussion as someone with a lot of experience of working with rainfall fields from models, as well as being a Bureau person.

When you run a model with explicit convection, the (cloud) microphysics scheme will generate all the precipitation, rather than a (large-scale) cloud scheme and a convection scheme.

You calculate the total precipitation amount by summing the amounts from all of the precipitating hydrometeor classes in the microphysics scheme. The exact number and type of these classes varies from scheme to scheme (hence I would need to know more about the science settings for AUS2200 to tell you exactly which fields you need), but they will always include rain and snow, and sometimes also hail and/or graupel. To check this: if you can find out the science configuration for AUS2200 (so we know which microphysics scheme is used), I can confirm with regional modelling colleagues here. Alternatively, if you grab the output field headers from all your output files with ncdump -h, pipe them to text files, and put them on /scratch/public, I can take a look and check that you are using the correct variables. (It sounds like rain + snow is correct, but we should check that you don't also have a hail/graupel class that you should be including as well.)
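As a sketch of what that sum would look like once the fields are confirmed (the snow field name fld_s04i202 is an assumption here, not confirmed; check it against the actual headers and extend the sum if there is a hail/graupel class):

# Keep all the precipitating hydrometeor classes when opening, e.g.
#   preprocess=lambda x: x[['fld_s04i201', 'fld_s04i202']]
# fld_s04i202 (assumed to be the large-scale snow amount) is unconfirmed.
total_precip = ds['fld_s04i201'] + ds['fld_s04i202']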

Finally, the amount written at each output timestep is the accumulation over that output timestep, so to get the total for a 24-hour period you add up the amounts for all the output timesteps that fall within that 24 hours.
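In xarray that per-step accumulation can be summed into daily totals with a resample, e.g. (a sketch, assuming the time coordinate is named time_0):

# Each output step holds the accumulation over that step, so the 24-hour
# total is just the sum of all the steps that fall within each day
daily_rain = total_precip.resample(time_0='1D').sum()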

Hope that’s of some help - happy to help answer any questions you have.

Hi @bethanwhite, thanks a lot for the information! Yes, it would be good to get any thoughts you have on the model microphysics setup and total precipitation plotting. At the moment I have just been working with the variable:

fld_s04i201(time_0, lat, lon) ;
        fld_s04i201:standard_name = "stratiform_rainfall_amount" ;
        fld_s04i201:long_name = "LARGE SCALE RAIN AMOUNT KG/M2/TS" ;
        fld_s04i201:units = "kg m-2" ;

Which I think is the explicitly resolved 10-minute accumulated rain.

I was going to add the snow to that in a bit. I believe these units are equivalent to mm: 1 kg of water over 1 m2 makes a layer 1 mm deep (given a density of 1000 kg m-3), so kg m-2 and mm are numerically equal for rainfall.

For the accumulated rainfall calculation I think we are sorted now, with a couple of methods using CDO or Python.

I have put the ncdump -h header text files in /scratch/public/cc6171

The rain and snow are in umnsa_spec_20160603T0600.txt

The ncdump header text files for all the output files are:
umnsa_cldrad_20160603T0600.txt
umnsaa_pvera000.txt
umnsa_mdl_20160603T0600.txt
umnsa_spec_20160603T0600.txt
umnsa_slv_20160603T0600.txt
umnsaa_pa000.txt