Daily total rainfall from AUS2200 output

Hi AUS2200/ACCESS/Unified Model experts,

I’m trying to plot daily total accumulated rainfall from the AUS2200 model output. I think I need to sum the variable fld_s04i201 “LARGE SCALE RAIN AMOUNT KG/M2/TS” over a day but I’m having a bit of trouble doing this and I’m not sure if this is the correct variable to be summing. Unlike most other variables in the output, fld_s04i201 has a “time_0 = 6” dimension which I think causes issues for me when working with the variable.

I’m wondering if anyone has made total rainfall calculations and if you would be willing to share your method here or any tips at all would be much appreciated. I am quite new at working with Python.

Thanks.
Chris

Documentation for s4i201 (/g/data/access/projects/access/umdir/vn13.5/ctldata/STASHmaster/STASHmaster-meta.conf):

[stashmaster:code(4201)]
description=LARGE SCALE RAIN AMOUNT     KG/M2/TS
help=Large scale rain amount kg/m2/ts
    =This is the mass per metre squared of rain (liquid precipitation) which
    =falls on the surface during a single timestep as diagnosed by the large
    =scale precipitation scheme. It does not include a convective
    =contribution (see diagnostic 1 5 201).

You’ll need to look at what time processing of this field has already been done by the model - it’s probably already been accumulated over some interval.

Also consider field s5i214

[stashmaster:code(5214)]
description=TOTAL RAINFALL RATE: LS+CONV KG/M2/S
help=Total rainfall rate at the surface in kg/m2/s.
    =This is the sum of the large scale and convective rainfall at the surface.

Thanks for the suggestions Scott!

I’ve taken a look for the second rain variable (s5i214) but I can’t seem to find it in any of our output files.

Perhaps this is because it has not been output; I think our simulations resolve the precipitation explicitly and do not use a cumulus scheme.

For finding the period of rainfall accumulation for variable s4i201,
do you know where I would look to determine what time processing of this field has been done by the model?
Is there some documentation that details this?

An example of our output files is here if anyone wants to take a look:
/g/data/v45/cc6171/experiments/ECL-2016_smallshift/netcdf/u-cs142/20160603T0000Z/aus2200/d0198/RA3/um

If anyone has ever plotted any kind of accumulated precipitation using ACCESS/UM/AUS2200, any further tips would be much appreciated.

The three-hourly ACCESS accumulated precipitation is plotted by the BOM here http://www.bom.gov.au/australia/charts/viewer/index.shtml
so I’m wondering if there might be a contact at the BOM who has done the processing for producing this.

Guessing from your output path your suite ID is u-cs142. You can get the details in the ‘STASH requests’ pages in rose edit.

LS rain is being output using time profile 10mA0-E.

This is a time accumulation sampled every timestep, accumulated over 10 minutes and output every 10 minutes.

In that case, just sum the values over a model day to get the accumulated daily value.
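To make "just sum the values" concrete, here is a minimal sketch with made-up numbers at a single point (the real field is 2-D over lat/lon): the daily total is simply the sum of the per-interval accumulations.

```python
import numpy as np
import xarray as xr

# Hypothetical example: six 10-minute rainfall accumulations (kg/m2) at one
# point. The total over the period is just their sum along the time axis.
accum = xr.DataArray(
    np.array([1.0, 0.0, 0.5, 1.25, 0.25, 0.0]),
    dims=["time_0"],
    attrs={"units": "kg m-2"},
)
total = accum.sum("time_0")
print(float(total))  # 3.0
```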


That is good to confirm that it is the 10-minute accumulated rainfall and I do just have to sum it up. So my problem is really just a technical and probably very simple one, to do with the “time_0” dimension that this variable has. Most other variables simply use a “time” dimension but this one uses a “time_0” dimension as below

float fld_s04i201(time_0, lat, lon) ;
	fld_s04i201:_FillValue = 1.e+20f ;
	fld_s04i201:standard_name = "stratiform_rainfall_amount" ;
	fld_s04i201:long_name = "LARGE SCALE RAIN AMOUNT     KG/M2/TS" ;
	fld_s04i201:units = "kg m-2" ;
	fld_s04i201:um_stash_source = "m01s04i201" ;
	fld_s04i201:missing_value = 1.e+20f ;
	fld_s04i201:cell_methods = "time_0: mean" ;
	fld_s04i201:grid_mapping = "latitude_longitude" ;

I’m just wondering why does this variable have a “time_0” dimension rather than just a “time” dimension? Because of this dimension, if I merge the files together, for example below using cdo, it only outputs the first 6 time steps. This command works fine for other variables such as radar, sea level pressure etc that have the “time” dimension.

cdo -L -O -mergetime -apply,-selname,fld_s04i201 [ umnsa/*.nc ] thisone.nc

I’m sure the solution is very simple but I’m having trouble thinking it through.

Probably there’s some difference between time and time_0, perhaps one has a timestep 0 output and not the other, or one is indicating the middle of the time interval instead of the end (e.g. [1200Z, 1210Z, 1220Z] vs. [1205Z, 1215Z, 1225Z] for your 10-minute field). The metadata of the time dimensions may help clarify.
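To illustrate the kind of check meant here, a sketch with synthetic time values (made up to mimic the 10-minute field, not read from the real files): an axis labelled at the middle of each interval sits a constant 5 minutes before one labelled at the end.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-ins for the two axes: "time" labelled at the end of each
# 10-minute interval, "time_0" labelled at the middle of the interval.
time = pd.date_range("2016-06-03T00:10", periods=3, freq="10min")
time_0 = pd.date_range("2016-06-03T00:05", periods=3, freq="10min")
ds = xr.Dataset(coords={"time": time, "time_0": time_0})

# A constant 5-minute offset between the axes suggests mid- vs end-of-interval
# labelling rather than genuinely different sampling times.
offset = ds["time"].values - ds["time_0"].values
print(offset)
```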

I won’t be able to help with the CDO command sorry, hopefully someone else can chip in.


Some tips for getting assistance:

  • Provide more information. You’ve said you’re interested in the fld_s04i201 variable, but not which file contains it. The more work you ask someone to do, the less likely you are to get help.

  • You can’t assume everyone can access your data files. Not everyone is a member of the same projects, in this case v45. If you want assistance from a wider pool of people then you can copy data files to a directory in /scratch/public with permissions for anyone to access it.

That said, I am a member of v45.

It appears the variable you’re interested in is in the umnsa_spec_*.nc files.

The cdo command you tried is accessing all the files in the directory. Is there a reason for that?

I’m not a cdo expert, and I am sure it can be done with that, or with nco and ncrcat. What you want can also be accomplished with python and xarray:

In [4]: import xarray as xr                                             
                                                                                                                                             
In [5]: ds = xr.open_mfdataset('umnsa_spec_*.nc')                                                                                            
                                                                        
In [6]: ds                                                              
Out[6]:                                                                 
<xarray.Dataset> Size: 1TB
Dimensions:                   (time: 36, lat: 2120, lon: 2600, pseudo_level: 5,
                               lon_0: 2600, lat_0: 2121, time_0: 36, bnds: 2)
Coordinates:                                                            
  * time                      (time) datetime64[ns] 288B 2016-06-03T00:10:00 ...
  * lat                       (lat) float64 17kB -48.79 -48.77 ... -6.852 -6.832
  * lon                       (lon) float64 21kB 114.3 114.3 ... 165.7 165.7
  * pseudo_level              (pseudo_level) int32 20B 1 2 3 4 5
  * lon_0                     (lon_0) float64 21kB 114.3 114.3 ... 165.7 165.7  
  * lat_0                     (lat_0) float64 17kB -48.8 -48.78 ... -6.822
  * time_0                    (time_0) datetime64[ns] 288B 2016-06-03T00:05:0...
    theta_level_height        float64 8B 5.0   
    model_theta_level_number  int32 4B 1     
    sigma_theta               float64 8B 0.9994
    height                    float64 8B 10.0
    height_0                  float64 8B 1.5
Dimensions without coordinates: bnds
Data variables: (12/39)
    fld_s00i010               (time_0, time, lat, lon) float32 29GB dask.array<chunksize=(6, 1, 1060, 1300), meta=np.ndarray>
    latitude_longitude        (time_0, time) float64 10kB -2.147e+09 ... -2.1...
    lat_bnds                  (time_0, time, lat, bnds) float64 44MB dask.array<chunksize=(6, 36, 2120, 2), meta=np.ndarray>
    lon_bnds                  (time_0, time, lon, bnds) float64 54MB dask.array<chunksize=(6, 36, 2600, 2), meta=np.ndarray>
    theta_level_height_bnds   (time_0, time, bnds) float64 21kB dask.array<chunksize=(6, 36, 2), meta=np.ndarray>
    sigma_theta_bnds          (time_0, time, bnds) float64 21kB dask.array<chunksize=(6, 36, 2), meta=np.ndarray>
    ...                        ...
    fld_s21i101               (time_0, time, lat, lon) float64 57GB dask.array<chunksize=(6, 1, 1060, 1300), meta=np.ndarray>
    fld_s21i104               (time, time_0, lat, lon) float32 29GB dask.array<chunksize=(36, 2, 1060, 1300), meta=np.ndarray>
    fld_s30i403               (time_0, time, lat, lon) float32 29GB dask.array<chunksize=(6, 1, 1060, 1300), meta=np.ndarray>
    fld_s30i404               (time_0, time, lat, lon) float32 29GB dask.array<chunksize=(6, 1, 1060, 1300), meta=np.ndarray>
    fld_s30i405               (time_0, time, lat, lon) float32 29GB dask.array<chunksize=(6, 1, 1060, 1300), meta=np.ndarray>
    fld_s30i406               (time_0, time, lat, lon) float32 29GB dask.array<chunksize=(6, 1, 1060, 1300), meta=np.ndarray>
Attributes:
    history:      File /home/563/cc6171/cylc-run/u-cs142/share/cycle/20160603...
    Conventions:  CF-1.6
    source:       Data from Met Office Unified Model
    um_version:   12.2
In [7]: ds['fld_s04i201']                                                                                                                     
Out[7]:                                                                                                                                       
<xarray.DataArray 'fld_s04i201' (time: 36, time_0: 36, lat: 2120, lon: 2600)> Size: 29GB                                                      
dask.array<concatenate, shape=(36, 36, 2120, 2600), dtype=float32, chunksize=(36, 2, 1060, 1300), chunktype=numpy.ndarray>                    
Coordinates:                                                                                                                                  
  * time                      (time) datetime64[ns] 288B 2016-06-03T00:10:00 ...                                                              
  * lat                       (lat) float64 17kB -48.79 -48.77 ... -6.852 -6.832                                                              
  * lon                       (lon) float64 21kB 114.3 114.3 ... 165.7 165.7                                                                  
  * time_0                    (time_0) datetime64[ns] 288B 2016-06-03T00:05:0...                                                              
    theta_level_height        float64 8B 5.0                                                                                                  
    model_theta_level_number  int32 4B 1                                                                                                      
    sigma_theta               float64 8B 0.9994                                                                                               
    height                    float64 8B 10.0                                                                                                 
    height_0                  float64 8B 1.5                                                                                                  
Attributes:                                                                                                                                   
    standard_name:    stratiform_rainfall_amount                                                                                              
    long_name:        LARGE SCALE RAIN AMOUNT     KG/M2/TS                                                                                    
    units:            kg m-2                                                                                                                  
    um_stash_source:  m01s04i201                                                                                                              
    cell_methods:     time_0: mean                                                                                                            
    grid_mapping:     latitude_longitude                                                                                                      
                                                                                                                                              
In [8]: ds['time_0']                                                                                                                          
Out[8]:                                                                                                                                       
<xarray.DataArray 'time_0' (time_0: 36)> Size: 288B                                                                                           
array(['2016-06-03T00:05:00.000000000', '2016-06-03T00:15:00.000000000',
       '2016-06-03T00:24:59.999999744', '2016-06-03T00:35:00.000000256',
       '2016-06-03T00:45:00.000000000', '2016-06-03T00:54:59.999999744',
       '2016-06-03T01:05:00.000000256', '2016-06-03T01:15:00.000000000',
       '2016-06-03T01:25:00.000000000', '2016-06-03T01:35:00.000000000',
       '2016-06-03T01:45:00.000000000', '2016-06-03T01:54:59.999999744',
       '2016-06-03T02:05:00.000000256', '2016-06-03T02:15:00.000000000',
       '2016-06-03T02:24:59.999999744', '2016-06-03T02:35:00.000000256',
       '2016-06-03T02:45:00.000000000', '2016-06-03T02:55:00.000000000',
       '2016-06-03T03:05:00.000000000', '2016-06-03T03:15:00.000000000',
       '2016-06-03T03:24:59.999999744', '2016-06-03T03:35:00.000000256',
       '2016-06-03T03:45:00.000000000', '2016-06-03T03:54:59.999999744',
       '2016-06-03T04:05:00.000000256', '2016-06-03T04:15:00.000000000',
       '2016-06-03T04:25:00.000000000', '2016-06-03T04:35:00.000000000',
       '2016-06-03T04:45:00.000000000', '2016-06-03T04:54:59.999999744',
       '2016-06-03T05:05:00.000000256', '2016-06-03T05:15:00.000000000',
       '2016-06-03T05:24:59.999999744', '2016-06-03T05:35:00.000000256',
       '2016-06-03T05:45:00.000000000', '2016-06-03T05:55:00.000000000'],
      dtype='datetime64[ns]')
Coordinates:
  * time_0                    (time_0) datetime64[ns] 288B 2016-06-03T00:05:0...
    theta_level_height        float64 8B 5.0
    model_theta_level_number  int32 4B 1
    sigma_theta               float64 8B 0.9994
    height                    float64 8B 10.0
    height_0                  float64 8B 1.5
Attributes:
    axis:           T
    bounds:         time_0_bnds
    standard_name:  time

In [11]: ds['fld_s04i201'].resample(time_0='1D').sum()
Out[11]: 
<xarray.DataArray 'fld_s04i201' (time: 36, time_0: 1, lat: 2120, lon: 2600)> Size: 794MB
dask.array<transpose, shape=(36, 1, 2120, 2600), dtype=float32, chunksize=(36, 1, 1060, 1300), chunktype=numpy.ndarray>
Coordinates:
  * time                      (time) datetime64[ns] 288B 2016-06-03T00:10:00 ...
  * lat                       (lat) float64 17kB -48.79 -48.77 ... -6.852 -6.832
  * lon                       (lon) float64 21kB 114.3 114.3 ... 165.7 165.7
    theta_level_height        float64 8B 5.0
    model_theta_level_number  int32 4B 1
    sigma_theta               float64 8B 0.9994
    height                    float64 8B 10.0
    height_0                  float64 8B 1.5
  * time_0                    (time_0) datetime64[ns] 8B 2016-06-03
Attributes:
    standard_name:    stratiform_rainfall_amount
    long_name:        LARGE SCALE RAIN AMOUNT     KG/M2/TS
    units:            kg m-2
    um_stash_source:  m01s04i201
    cell_methods:     time_0: mean
    grid_mapping:     latitude_longitude

Note that the final object is a lazy dask array, and you need to add .compute() on the end, or evaluate it in some other way to force the computation. When I tried this on a login node it was killed, probably exceeded memory, so you’ll need to do this on an ARE session or similar.


Thanks a lot for this python code Aidan! Could you tell me what this is calculating, as I’m not too familiar with python (currently learning)? Is this grabbing the 10-minute rainfall, summing it over the 6 times in each hourly file, and then adding up to create a 24-hour total accumulated rain?

On the cdo command the reason for me accessing all the files in the directory is that I have temporarily copied the spec files to this other directory (for other reasons), so this command is only accessing those files.

As an update to the earlier cdo command: to merge variables with time_0 dimensions into one file, “merge” rather than “mergetime” seems to work, e.g.:

cdo -L -O -merge -apply,-selname,fld_s04i201 [ umnsa/umnsa_spec*.nc ] merge.nc

Using this merged file you can calculate the daily total rainfall using:

cdo daysum merge.nc daily.nc


In this case I’m using xarray to read in all the data into a single virtual dataset (called ds for convenience), so the time_0 dimension has length 36, which was all the data that was available. If they’re 10 minutes apart then that is 6 hours, not 24.

So the python command below accesses just the fld_s04i201 variable directly as a key of the ds object, then calls the .resample method to convert to daily (1D) time intervals, and then calls .sum() as the aggregating function for the resample operation:

ds['fld_s04i201'].resample(time_0='1D').sum()

Here are some resources about the resample operation
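To show what resample is doing here, a toy example with synthetic values: 36 ten-minute accumulations of 1 kg/m2 spanning 6 hours of a single day collapse into one daily bin containing their sum.

```python
import numpy as np
import pandas as pd
import xarray as xr

# 36 ten-minute accumulations, all 1 kg/m2, within a single calendar day.
t = pd.date_range("2016-06-03T00:05", periods=36, freq="10min")
da = xr.DataArray(np.ones(36), dims=["time_0"], coords={"time_0": t})

# Group into daily (1D) bins and sum within each bin.
daily = da.resample(time_0="1D").sum()
print(daily.values)  # [36.]
```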

Glad you got the cdo version working. The advantage of python is that it doesn’t have to create intermediate files, which waste space and can rapidly fill directories with derivative data.

Also, using python in jupyter notebooks provides a nicer data exploration interface and some record of your work, which can be useful to refer to later on.


Thanks so much! This is hugely helpful for both this total rainfall calculation and for my future data processing with python. I’ll be taking a thorough look through this next week.

A follow up note:

For calculating the total precipitation it is our understanding that the sum of the LARGE SCALE RAINFALL and the LARGE SCALE SNOW variables is appropriate, particularly for our wintertime case.

As far as I can tell the python method using resample does not correctly calculate the accumulated rainfall. In the script below I am bringing in four datasets for four 6-hour periods and summing them to create a 24-hour total. The reason I am doing it this way is that it prevents a kernel crash that occurs (even on XXLarge) if I bring them all in as one virtual dataset. Perhaps this could be solved by using dask in some way, but I’m not familiar with that yet.

import xarray as xr                                                                                                                                                                                         
#for a 24 hour segment
ds1 = xr.open_mfdataset('/g/data/v45/cc6171/experiments/ECL-2016_smallshift/netcdf/u-cs142/20160605T0000Z/aus2200/d0198/RA3/um/umnsa_spec_*.nc')
ds2 = xr.open_mfdataset('/g/data/v45/cc6171/experiments/ECL-2016_smallshift/netcdf/u-cs142/20160605T0600Z/aus2200/d0198/RA3/um/umnsa_spec_*.nc')
ds3 = xr.open_mfdataset('/g/data/v45/cc6171/experiments/ECL-2016_smallshift/netcdf/u-cs142/20160605T1200Z/aus2200/d0198/RA3/um/umnsa_spec_*.nc')
ds4 = xr.open_mfdataset('/g/data/v45/cc6171/experiments/ECL-2016_smallshift/netcdf/u-cs142/20160605T1800Z/aus2200/d0198/RA3/um/umnsa_spec_*.nc')
rain = (ds1['fld_s04i201'].resample(time_0='1D').sum().values
        + ds2['fld_s04i201'].resample(time_0='1D').sum().values
        + ds3['fld_s04i201'].resample(time_0='1D').sum().values
        + ds4['fld_s04i201'].resample(time_0='1D').sum().values)

The plot below is of this “rain” variable calculated using resample.

[image: python_daily]

and this plot is the same day’s accumulated rainfall calculated using CDO.

[image: cdo_daily]

So it looks to me like the resample method may be summing a 1-hourly rainfall total from each 6-hourly period and so missing 5 hours of rainfall in each period. This is similar to what happened when I tried using nco. Probably this comes back to the time_0 dimension used by this variable. I still have not found out why the rainfall uses this dimension rather than the “time” dimension used by other variables in the same file, so if anyone knows why this is please share here. For the moment I will carry on using the CDO method.

As a warning to others: if you are doing a similar calculation for total rainfall, based on my experience it is possible with python and nco to mistakenly produce a plot that could be misinterpreted as the total rainfall but is not.

My full script is below; the final section produces a plot from the CDO-calculated total rainfall in a file “daily.nc”. This file was calculated using the CDO commands:
cdo -L -O -merge -apply,-selname,fld_s04i201 [ umnsa/*.nc ] merge.nc
cdo daysum merge.nc daily.nc

Finally I will say that this type of work is at the limit of my abilities so I apologize in advance for any mistakes.

import xarray as xr
import numpy as np
import pandas as pd
import cartopy.crs as ccrs
import cartopy.feature as cfeature
from cartopy.util import add_cyclic_point
import xesmf as xe
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import cmocean.cm as cmo
import os, datetime
from sys import exit, argv
import netCDF4 as nc4
import geopandas as gpd
import regionmask
import cartopy.io.shapereader as shpreader
from cdo import Cdo
cdo = Cdo()
%matplotlib inline
#for a 24 hour segment
ds1 = xr.open_mfdataset('/g/data/v45/cc6171/experiments/ECL-2016_smallshift/netcdf/u-cs142/20160605T0000Z/aus2200/d0198/RA3/um/umnsa_spec_*.nc')
ds2 = xr.open_mfdataset('/g/data/v45/cc6171/experiments/ECL-2016_smallshift/netcdf/u-cs142/20160605T0600Z/aus2200/d0198/RA3/um/umnsa_spec_*.nc')
ds3 = xr.open_mfdataset('/g/data/v45/cc6171/experiments/ECL-2016_smallshift/netcdf/u-cs142/20160605T1200Z/aus2200/d0198/RA3/um/umnsa_spec_*.nc')
ds4 = xr.open_mfdataset('/g/data/v45/cc6171/experiments/ECL-2016_smallshift/netcdf/u-cs142/20160605T1800Z/aus2200/d0198/RA3/um/umnsa_spec_*.nc')
#sum the four 6 hourly periods
rain = (ds1['fld_s04i201'].resample(time_0='1D').sum().values
        + ds2['fld_s04i201'].resample(time_0='1D').sum().values
        + ds3['fld_s04i201'].resample(time_0='1D').sum().values
        + ds4['fld_s04i201'].resample(time_0='1D').sum().values)

x = ds1.lat
y = ds1.lon
vmn = 1
vmx = 100
# interval for color bands
interval = 10     
tcks = np.arange(vmn,vmx+interval,interval)
lvl =  np.arange(vmn,vmx+interval,interval)
colormap = cm.hsv_r 
fig,ax=plt.subplots(1,1)
cp = ax.contourf(y,x,rain[0,0,:,:],cmap=colormap,levels=lvl,vmin=vmn,vmax=vmx,extend='max')
cbar_ax = fig.add_axes([0.25, 0.01, 0.5, 0.04])
cbar = fig.colorbar(cp, cax=cbar_ax, orientation='horizontal', label= 'rain', ticks=tcks, shrink=0.8)
# plot it
plt.show()

and finally for the CDO calculated rainfall plot

#trying the same plot but using the cdo calculated 24 hour rainfall dataset
ds = xr.open_dataset('/g/data/v45/cc6171/experiments/ECL-2016_smallshift/spec/daily.nc')
rain = ds.fld_s04i201
x = ds.lat
y = ds.lon
vmn = 1
vmx = 100
# interval for color bands
interval = 10     
tcks = np.arange(vmn,vmx+interval,interval)
lvl =  np.arange(vmn,vmx+interval,interval)
colormap = cm.hsv_r 
# combines this with the terrain above
fig,ax=plt.subplots(1,1)
cp = ax.contourf(y,x,rain[2,:,:],cmap=colormap,levels=lvl,vmin=vmn,vmax=vmx,extend='max')
cbar_ax = fig.add_axes([0.25, 0.01, 0.5, 0.04])
cbar = fig.colorbar(cp, cax=cbar_ax, orientation='horizontal', label= 'rain', ticks=tcks, shrink=0.8)
# plot it
plt.show()

That is curious that the fields seem to have two time axes. Could you copy one of the netcdf files to /scratch/public so I can take a look?

I’ve put one of the 1-hourly files that contains the 10-minute accumulated rainfall variable fld_s04i201 in
/scratch/public/cc6171
it’s file:
umnsa_spec_20160603T0600.nc

Looking at the metadata for the times, we have

dimensions:
        time = UNLIMITED ; // (6 currently)
        bnds = 2 ;
        time_0 = 6 ;
        #...
variables:
        double time(time) ;
                time:axis = "T" ;
                time:units = "days since 1970-01-01 00:00" ;
                time:standard_name = "time" ;
                time:calendar = "proleptic_gregorian" ;
        double time_0(time_0) ;
                time_0:axis = "T" ;
                time_0:bounds = "time_0_bnds" ;
                time_0:units = "days since 1970-01-01 00:00" ;
                time_0:standard_name = "time" ;
                time_0:calendar = "proleptic_gregorian" ;
        double time_0_bnds(time_0, bnds) ;

so time is an instantaneous time, while time_0 covers some time interval, the boundaries of which are in time_0_bnds.
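A sketch of recovering the accumulation interval from a bounds variable (the values here are synthetic, mimicking the 10-minute accumulations; a real file supplies time_0_bnds directly):

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic time_0 values at interval midpoints, with bounds 5 minutes either
# side, mimicking the structure described above.
t0 = pd.date_range("2016-06-03T00:05", periods=3, freq="10min")
bnds = np.stack([(t0 - pd.Timedelta("5min")).values,
                 (t0 + pd.Timedelta("5min")).values], axis=1)
ds = xr.Dataset({"time_0_bnds": (("time_0", "bnds"), bnds)},
                coords={"time_0": t0})

# The accumulation interval is the width of each bounds pair.
widths = ds["time_0_bnds"].values[:, 1] - ds["time_0_bnds"].values[:, 0]
print(widths)  # three 10-minute intervals
```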

With a single file open it looks like fields are just 3-dimensional, e.g.

        float fld_s04i201(time_0, lat, lon) ;
                fld_s04i201:_FillValue = 1.e+20f ;
                fld_s04i201:standard_name = "stratiform_rainfall_amount" ;
                fld_s04i201:long_name = "LARGE SCALE RAIN AMOUNT     KG/M2/TS" ;
                fld_s04i201:units = "kg m-2" ;
                fld_s04i201:um_stash_source = "m01s04i201" ;
                fld_s04i201:missing_value = 1.e+20f ;
                fld_s04i201:cell_methods = "time_0: mean" ;
                fld_s04i201:grid_mapping = "latitude_longitude" ;

you can see in cell_methods that it’s claiming there’s been a mean along time (potentially an error in the post-processing script; this is supposed to be a sum).

Comparing against a field using the time dimension you can see that there’s no time processing

        float fld_s09i203(time, lat, lon) ;
                fld_s09i203:_FillValue = 1.e+20f ;
                fld_s09i203:standard_name = "low_type_cloud_area_fraction" ;
                fld_s09i203:long_name = "LOW CLOUD AMOUNT" ;
                fld_s09i203:units = "1" ;
                fld_s09i203:um_stash_source = "m01s09i203" ;
                fld_s09i203:missing_value = 1.e+20f ;
                fld_s09i203:grid_mapping = "latitude_longitude" ;

If I open just that file with Xarray then it’s getting the correct axes. When Aidan opened all the files he was getting (time_0, time, lat, lon) as the axes:

In [2]: xarray.open_dataset('umnsa_spec_20160603T0600.nc').fld_s04i201
Out[2]:<xarray.DataArray 'fld_s04i201' (time_0: 6, lat: 2120, lon: 2600)>

I wonder if the time axis names are getting swapped in some of the files? Some files might have the averaged time marked as time_0 and others have it marked as time; that would confuse a lot of tools.
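One way to test that hypothesis, sketched here on synthetic in-memory datasets standing in for two files (variable and dimension names follow this thread): check the field’s dimension names per file before merging anything.

```python
import numpy as np
import xarray as xr

# Two synthetic "files" with the same field on differently named time axes.
file_a = xr.Dataset({"fld_s04i201": (("time", "lat"), np.ones((2, 3)))})
file_b = xr.Dataset({"fld_s04i201": (("time_0", "lat"), np.ones((2, 3)))})

# Checking dims per file reveals the mismatch before any merge is attempted.
dims = [file_a["fld_s04i201"].dims, file_b["fld_s04i201"].dims]
print(dims)  # [('time', 'lat'), ('time_0', 'lat')]
```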


Hi Scott, Chris and Aidan,

I have noticed strange header information in fields that have accumulation time-profiles. I got around it in simpler experiments by outputting the stratiform rainfall at each timestep and manually calculating the rainfall totals that I want.

This gives a bit more control. I know that the AUS2200 grids are very large, but given that it is a single layer this approach may work for you too.


Are you also converting the output using um2netcdf_iris? Any idea where that script is coming from? I’ve not been able to locate it.

Scott, the strange header behaviour for skipped-accumulation-time-profiles actually begins in the fields files. Yes, I wonder what a netcdf conversion tool would do with those …


Interesting, both means and accumulations are marked in the um files with LBPROC=128


The assigned timing has issues as well.

I requested 10-minute accumulations over 6 hours, and in the fields file the timing indicated every 5 minutes for 3 hours.

It was not clear if the header information was off or the time profile …

Hence the swap to the per-timestep information.

Hi @cchambers @Aidan @Scott I’m back from leave. The ‘fields with two time axes’ in Aidan’s example is an artifact of xarray.open_mfdataset(). When it does its concatenation, it seems to try its best to put all fields on consistent axes, even when that makes no sense, like in this case. These files are massive, so you really want to use preprocess (e.g. More efficent use of xarray.open_mfdataset() — CLEX CMS Blog) to pick out the fld_s04i201 variable before xarray gets the chance to merge anything. That’ll make the rest of the job (the resample and sum) a whole lot easier.
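A sketch of that preprocess approach (variable and dimension names are from this thread; the synthetic dataset stands in for one file, and the commented open_mfdataset call is an untested sketch):

```python
import numpy as np
import xarray as xr

# preprocess runs on each file's dataset before concatenation, so dropping
# everything but the rainfall variable stops unrelated fields from forcing
# a spurious second time axis onto it.
def keep_rain(ds):
    return ds[["fld_s04i201"]]

# Demonstrated on a synthetic dataset standing in for one file:
one_file = xr.Dataset({
    "fld_s04i201": (("time_0", "lat"), np.ones((6, 3))),
    "fld_s09i203": (("time", "lat"), np.zeros((6, 3))),
})
print(list(keep_rain(one_file).data_vars))  # ['fld_s04i201']

# With the real files it would be used like (untested sketch):
# ds = xr.open_mfdataset("umnsa_spec_*.nc", preprocess=keep_rain)
# daily = ds["fld_s04i201"].resample(time_0="1D").sum().compute()
```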
