Does this look more sensible: AUS2200_1day_rain.ipynb | notebooksharing.space
Hi @dale.roberts, yeah, this looks more like it. I’m going to check that it is the same as calculated with CDO.
I’ve checked it and it is identical to the CDO-calculated daily rainfall. Thanks a lot! So is the critical difference in the lines when importing all the files?
flist=sorted(glob('/g/data/v45/cc6171/experiments/ECL-2016_smallshift/netcdf/u-cs142/2016060*T*Z/aus2200/d0198/RA3/um/umnsa_spec_*.nc'))
and/or
ds = xr.open_mfdataset(flist,parallel=True,preprocess=lambda x: x['fld_s04i201'])
Hi @cchambers
It’s both of those lines together. The `glob` function follows shell rules when expanding filenames, so that’ll return a list of all of the `umnsa_spec` files belonging to the experiment. There is no guarantee of any order to the list of files, but even then I’m not sure it’s necessary to pass the list through `sorted`. It does make it easy to spot if the pattern didn’t match all the files we want, though. It probably also helps xarray merge across the time dimension when not using dask, as it’ll end up opening the files in time order, which should simplify that operation. This is the easiest way to get xarray to construct a single dataset out of all of the data we’re interested in. You can combine across time after loading the data, but I find the fewer calls into xarray the better.
The key argument in `xr.open_mfdataset` is `preprocess=lambda x: x['fld_s04i201']`. This makes it so that xarray filters out everything except the field we’re interested in (and its dimensions) before it tries to create the full dataset. Doing this saves a huge amount of resources: my notebook uses under 8GB of memory all up, so it runs in a couple of minutes on a medium ARE session. `parallel=True` just tells it to use the dask cluster I opened with `client = Client()` to run the open and preprocess steps.
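As a minimal, self-contained sketch of the globbing pattern described above (the directory and files here are temporary stand-ins for the real experiment output, whose timestamps are zero-padded so a lexical sort is a chronological sort):

```python
import tempfile
from glob import glob
from pathlib import Path

# Create a few dummy files with the same naming pattern as the umnsa_spec
# output, deliberately out of chronological order.
tmp = Path(tempfile.mkdtemp())
for stamp in ["20160603T0600", "20160601T0000", "20160602T1200"]:
    (tmp / f"umnsa_spec_{stamp}.nc").touch()

# glob() makes no ordering guarantee; sorted() fixes that and also makes it
# easy to eyeball whether the pattern matched everything expected.
flist = sorted(glob(str(tmp / "umnsa_spec_*.nc")))
print([Path(f).name for f in flist])
```

The sorted list would then be passed straight to `xr.open_mfdataset(flist, parallel=True, preprocess=...)` as in the notebook above.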
Hi @cchambers - I’ve just been alerted to this discussion by a Hive Forum Summary email; hope this response is not too late to be useful. Sounds like you have got your technical issues sorted, but I thought I would join the discussion as someone who has a lot of experience working with rainfall fields from models, as well as being a Bureau person.
When you run a model with explicit convection, the (cloud) microphysics scheme will generate all the precipitation, rather than a (large-scale) cloud scheme and a convection scheme.
You calculate the total precipitation amount by summing the amounts from all of the precipitating hydrometeor classes in the microphysics scheme. The exact number and type of these classes vary from scheme to scheme (hence I would need to know more about the science settings for AUS2200 to tell you exactly which fields you need), but they will always include rain and snow, and sometimes also hail and/or graupel. To check this, if you can find out the science configuration for AUS2200 (so we know which microphysics scheme is used), I can check with regional modelling colleagues here. Alternatively, if you grab the output field headers from all your output files with `ncdump -h`, pipe them to a text file and put it on `/scratch/public`, I can take a look and check that you are using the correct variables. (It sounds like rain + snow is correct, but we should check that you don’t also have a hail/graupel class that you should be including as well.)
Finally, the amount written in each file’s output timestep is the accumulation over that output timestep, so to get the total for a 24 hour period you would add up the amounts for all the output timesteps that fall in that 24 hours.
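That summing step can be sketched with pandas on synthetic 10-minute accumulations (the values here are made up; the real data would of course come from the model output):

```python
import numpy as np
import pandas as pd

# Synthetic 10-minute accumulations (mm per output timestep) for two days.
times = pd.date_range("2016-06-01 00:00", periods=2 * 144, freq="10min")
accum = pd.Series(np.full(len(times), 0.1), index=times)

# Each value is already an accumulation over its output timestep, so the
# 24-hour total is simply the sum of the 144 timesteps in each day.
daily = accum.resample("1D").sum()
print(daily)
```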
Hope that’s of some help - happy to help answer any questions you have.
Hi @bethanwhite , thanks a lot for the information! Yes it would be good to get any thoughts you have on the model microphysics setup and total precipitation plotting. At the moment I have just been working with the variable:
fld_s04i201(time_0, lat, lon) ; "stratiform_rainfall_amount" ; "LARGE SCALE RAIN AMOUNT KG/M2/TS" ; "kg m-2" ;
Which I think is the explicitly resolved 10-minute accumulated rain.
I was going to add the snow to that in a bit. I believe these units are equivalent to mm(?).
For the accumulated rainfall calculation I think we are sorted now, with a couple of methods using CDO or Python.
I have put the ncdump -h header text files in /scratch/public/cc6171
The rain and snow are in umnsa_spec_20160603T0600.txt
The ncdump text files for all the output files are:
umnsa_cldrad_20160603T0600.txt
umnsaa_pvera000.txt
umnsa_mdl_20160603T0600.txt
umnsa_spec_20160603T0600.txt
umnsa_slv_20160603T0600.txt
umnsaa_pa000.txt
Hi @cchambers, I’ve just had a look through your netcdf headers. You have identified the correct variable(s) for calculating surface precip: `fld_s04i201:long_name = "LARGE SCALE RAIN AMOUNT KG/M2/TS"` and `fld_s04i202:long_name = "LARGE SCALE SNOW AMOUNT KG/M2/TS"`.
You are also correct that `KG/M2` is equivalent to `mm` (with the accumulation period being the output timestep, i.e. 10 minutes in your case here).
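The equivalence is just the density of liquid water: a quick arithmetic sketch (assuming a density of 1000 kg m-3):

```python
# 1 kg of water spread over 1 m^2, divided by the density of liquid water
# (1000 kg m^-3), gives a depth in metres; converting to mm shows the
# one-to-one equivalence.
mass_per_area = 1.0      # kg m^-2 (accumulated over one output timestep)
density_water = 1000.0   # kg m^-3
depth_mm = mass_per_area / density_water * 1000.0  # -> 1.0 mm
print(depth_mm)
```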
Hi @cchambers, thought I’d provide a bit of info regarding moist physics schemes and precipitation output fields from the UM.
Firstly, there are 3 moist physics schemes in the model, which in the code are called: the convection scheme, the large-scale cloud scheme, and the large-scale precipitation (microphysics) scheme.
When you run the regional model at km-scale resolutions or higher, we explicitly represent convection and use the RAL science configuration that does not use the convection scheme. This leaves us with 2 moist physics schemes in use: the cloud scheme that calculates the fractional cloud cover in grid boxes (noting that this scheme is used in the regional model as well as the global, and that it does not produce precipitation), and the microphysics scheme (which in the code is called the large-scale precipitation scheme).
When you want the total precipitation from regional model runs there are only 2 output fields you need: the “large-scale” rain amount (produced by the microphysics scheme) and the same for the snow. The snow field (STASH 4/202, i.e. fld_s04i202) is the frozen precipitation and includes BOTH snow and graupel. Do not add graupel to this field, otherwise you’ll be counting graupel amounts twice.
FYI, regional runs using RAL3.2, which AUS2200 uses, run with the following hydrometeor species: cloud water, rain, snow, ice and graupel.
Thanks for posting this, Chris.
We were also trying to figure out why our rainfall plots were weirdly low for the Feb-Mar 2022 SST experiment.
Hi @Kim_Reid. Following on from our meeting this afternoon, here is how to change the above notebook to include large scale snow and to sum from 9am-9am instead of midnight to midnight:
Change cell 5 to:
ds = xr.open_mfdataset(flist,parallel=True,preprocess=lambda x: x['fld_s04i201'] + x['fld_s04i202'])
and cell 6 to:
da_fin=ds.resample(time_0='1D',offset='9h').sum()
Note that when using `offset`, the first sum will be from 0000 UTC to 0900 UTC on the first model day, and the last will be 0900 UTC to 0000 UTC on the final model day. The rest of the time stamps in the output will be the sum from 0900 UTC to 0900 UTC the next day.
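Those partial first and last bins can be seen in a small pandas sketch, with synthetic hourly data standing in for the model output:

```python
import pandas as pd

# 48 hourly values of 1.0 spanning two model days starting at 0000 UTC.
idx = pd.date_range("2016-06-01 00:00", periods=48, freq="h")
s = pd.Series(1.0, index=idx)

# With offset='9h' the bin edges fall at 0900 UTC, so the first and last
# bins are partial (9 and 15 hours here) and the middle bin is a full
# 0900-to-0900 day (24 hours).
daily = s.resample("1D", offset="9h").sum()
print(daily)
```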
Hi Charmaine, all,
Does anyone know what STASH code is used for snow mixing ratio? I could find the codes for the other four hydrometeors below, but not snow.
CLD LIQ MIXING RATIO (mcl) AFTER TS - m01s00i392
CLD ICE MIXING RATIO (mcf) AFTER TS - m01s00i393
RAIN MIXING RATIO (mr) AFTER TS - m01s00i394
GRAUPEL MIXING RATIO (mg) AFTER TS - m01s00i395
There is a variable called ‘ICE CRY MIXING RAT. (mcf2) AFTER TS - m01s00i396’. Is this the equivalent of ‘snow mixing ratio’? So CRY = CRYSTAL = SNOW?
Thanks very much!
Cheers, Yi
Hi Yi,
CASIM, the cloud microphysics scheme used in RAL3.2, includes 3 ice hydrometeor species: ice aggregates/snow, ice crystals and graupel.
Snow (or cloud ice aggregate) is “CLD ICE MIXING RATIO (mcf) AFTER TS - m01s00i393” and the ice crystals is the “ICE CRY MIXING RAT. (mcf2) AFTER TS - m01s00i396”.
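If you then want a total frozen-hydrometeor mixing ratio, it’s just the sum of the three ice-species fields; here is a sketch with made-up numpy arrays (the variable names are illustrative shorthand for the STASH fields, not the actual netCDF names):

```python
import numpy as np

# Made-up mixing-ratio values (kg/kg) standing in for the three CASIM ice
# species: snow/aggregates (m01s00i393), ice crystals (m01s00i396) and
# graupel (m01s00i395).
mcf = np.array([1e-4, 2e-4])    # snow / cloud ice aggregates
mcf2 = np.array([5e-5, 1e-5])   # ice crystals
mg = np.array([0.0, 3e-5])      # graupel

total_frozen = mcf + mcf2 + mg
print(total_frozen)
```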
Got it! Thanks so much.
Hi,
I missed this post completely. I’ve been working on a post-processing tool which we tested first on the AUS2200 simulations.
I bumped into the two time axes and other issues before. You might also have knowledge that’s important for us in getting the mapping/calculation of variables correct, so if you could please tag me next time too, that would be great!
Thanks
Thanks for all your help, Dale!