Hi,
I’m new to running ACCESS models and would like to report a couple of issues I’ve experienced with the recent beta release of ACCESS-rAM3. These came up after moving the domain eastward to cover parts of the Tasman Sea and New Zealand. After working around these issues I’ve managed to successfully run a three-day case over the domain shown below, where the inner domain covers parts of New Caledonia:
Issue 1: RAS (u-dg767) crashing over a purely ocean domain
Moving the inner nest southwards so that it lies purely over the ocean results in errors with the cap_vegfrac, land and soils_hydr ancil tasks.
The relevant parts of the log files seem to be:
job.out for ancil_cap_vegfrac:
Calculating bi-linear interpolation coeffs
Finding coastal points
Setting coastal values
WARNING - No source data is available in target domain
UNRESOLVED GRID POINTS IN SOIL DATASET
Number of points unresolved is 9
POINT 78674 LAT -29.0100 LONG 167.9304
POINT 78675 LAT -29.0100 LONG 167.9502
POINT 79124 LAT -29.0298 LONG 167.9304
POINT 79125 LAT -29.0298 LONG 167.9502
POINT 79126 LAT -29.0298 LONG 167.9700
POINT 79127 LAT -29.0298 LONG 167.9898
POINT 79574 LAT -29.0496 LONG 167.9304
POINT 79575 LAT -29.0496 LONG 167.9502
POINT 79576 LAT -29.0496 LONG 167.9700
Search radius 1
NO DATA FROM WHICH TO SET UNRESOLVED POINTS
***ERROR: No source data available in target domain
job.err for ancil_land, with ancil_soils_hydr having essentially the same issue:
Loading cylc7/23.09
Loading requirement: mosrs-setup/1.0.1
Traceback (most recent call last):
File "/home/565/cr7888/cylc-run/u-dg767/src/ants/bin/ancil_general_regrid.py", line 165, in <module>
_run_app()
File "/home/565/cr7888/cylc-run/u-dg767/src/ants/bin/ancil_general_regrid.py", line 152, in _run_app
main(
File "/home/565/cr7888/cylc-run/u-dg767/src/ants/bin/ancil_general_regrid.py", line 123, in main
ants.analysis.make_consistent_with_lsm(
File "/home/565/cr7888/cylc-run/u-dg767/src/ants/lib/ants/analysis/__init__.py", line 508, in make_consistent_with_lsm
filler = Filler(cube, target_mask=mask)
File "/home/565/cr7888/cylc-run/u-dg767/src/ants/lib/ants/analysis/_merge.py", line 835, in __init__
self._call_spiral_search(source)
File "/home/565/cr7888/cylc-run/u-dg767/src/ants/lib/ants/analysis/_merge.py", line 890, in _call_spiral_search
raise ValueError(msg)
ValueError: The provided source doesn't appear to have any valid data.
[FAIL] python_env ancil_general_regrid.py --ants-config ${ANTS_CONFIG} \
[FAIL] ${source} --target-lsm ${target_lsm} -o ${output} # return-code=1
2025-04-11T06:27:13Z CRITICAL - failed/EXIT
This seems somewhat similar to the issues discussed in AUS2200 vegetation fraction ancil creation issues, except that there the problems were associated with land regions such as New Zealand rather than with the absence of land. My guess is that the suite is looking for land source data that simply doesn’t exist over a pure-ocean domain?
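In case it helps with diagnosis, one way I thought of sanity-checking that guess is to count how many land points the generated land-sea mask ancillary actually contains. Below is a minimal sketch using the mule library; the file path and the assumption that the mask is the usual STASH item 30 are mine, not taken from the suite:

```python
# Sketch: count land points in a UM land-sea mask ancillary.
# Assumes mule is available; the path below is a placeholder only.
import mule
import numpy as np

ancil_path = "qrparm.mask"  # hypothetical path to the generated LSM ancil

ff = mule.AncilFile.from_file(ancil_path)
for field in ff.fields:
    if field.lbuser4 == 30:  # STASH m01s00i030 = land-sea mask
        data = field.get_data()
        n_land = int(np.sum(data > 0))
        print(f"Land points: {n_land} of {data.size}")
```

If that reports zero land points, it would be consistent with the ancil tasks having no valid source data to regrid onto the target domain.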
Issue 2: RNS (u-dg768) crashes depending on start date
When I change the start date of the simulation from 2018-01-03 to 2018-01-02, the model crashes during the first forecast cycle at d1000 resolution. No other changes were made to either suite, so I’m really not sure why one works fine and the other doesn’t.
In the job.out log file I see the following error output:
????????????????????????????????????????????????????????????????????????????????
???!!!???!!!???!!!???!!!???!!! ERROR ???!!!???!!!???!!!???!!!???!!!
? Error code: 1
? Error from routine: EG_BICGSTAB
? Error message: NaNs in error term in BiCGstab after 1 iterations
? This is a common point for the model to fail if it
? has ingested or developed NaNs or infinities
? elsewhere in the code.
? See the following URL for more information:
? https://code.metoffice.gov.uk/trac/um/wiki/KnownUMFailurePoints
? Error from processor: 216
? Error number: 22
????????????????????????????????????????????????????????????????????????????????
The UM wiki page linked in the error message says the following about this error:
NaNs in error term in BiCGstab
Why?: This is usually a catch all failure point where a NaN has been generated in a physics scheme (or read in from a corrupt input file) and has subsequently been passed to the dynamics.
How to investigate?: Run the model with output diagnostics set to high ([env]PRINT_STATUS=PrStatus_Diag) as this switches on the summary information for physics increments. This will identify if a NaN has been generated by a physics scheme and allows you to narrow down where the problem is.
I’ve tried following this advice by going to um -> env -> Runtime Controls -> Atmosphere only in the rose GUI and changing PRINT_STATUS to “Extra diagnostic messages”, but I’m just getting the same message in the log files (job.err is a complete mess, with the same message repeated countless times).
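Since the wiki also mentions that the NaNs can be read in from a corrupt input file, I’ve been thinking about scanning the start dump / LBC files for non-finite values directly. A rough sketch of what I mean (the mule usage and the file name are my assumptions, just as an illustration):

```python
# Sketch: scan a UM dump/fieldsfile for NaN or Inf values, field by field.
# The path is a placeholder; mule.load_umfile auto-detects the file type.
import mule
import numpy as np

path = "umnsaa_da000"  # hypothetical start dump / LBC file name

ff = mule.load_umfile(path)
for field in ff.fields:
    if field.lbrel == -99:  # skip empty/padding lookup entries
        continue
    try:
        data = field.get_data()
    except Exception:
        # e.g. land-packed fields that need the land-sea mask to unpack
        continue
    if not np.all(np.isfinite(data)):
        print(f"Non-finite values in STASH {field.lbuser4}, "
              f"validity time {field.lbyr}-{field.lbmon:02d}-{field.lbdat:02d}")
```

If nothing shows up there, I assume the NaNs are being generated in a physics scheme during the run rather than ingested, which is what the PrStatus_Diag output should hopefully narrow down.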
Any help for either of these issues would be much appreciated. Thanks!
N.B. I will be overseas for most of May so nothing is super urgent – will be spending more time on this after getting back.