ACCESS-CM2 error codes (error 22)

Hi,

I was wondering if there is somewhere I can see what different error codes mean?

I had a job stop over the weekend due to an error, but I don’t really know what caused it. This is from a pacemaker-style simulation using Holger’s setup to change the SST restoring file each year. I’m not sure why the job would fail now, as it had been running fine for 24 years before this.

Here is the error message from job.out:

????????????????????????????????????????????????????????????????????????????????
???!!!???!!!???!!!???!!!???!!!       ERROR        ???!!!???!!!???!!!???!!!???!!!
?  Error code: 22
?  Error from routine: io:buffin
?  Error message: Error in buffin errorCode= 1.00 len=0/28672
?  Error from processor: 0
?  Error number: 14
????????????????????????????????????????????????????????????????????????????????

Let me know how I can fix this, or if I should just restart the cycle and see if it works.

Here are some of the lines printed just before the error message, if that helps:

MPPIO: file op completed
REPLANCA: UPDATE REQUIRED FOR FIELD                     1
Information used in checking ancillary data set: position of lookup table in dataset:                   681
Position of first lookup table referring to data type                      1
Interval between lookup tables referring to data type                     85  Number of steps                     8
STASH code in dataset                     60   STASH code requested                     60
'start' position of lookup tables for dataset                      1 in overall lookup array                      1
BUFFIN: C I/O Error - Return code = 1
****************** IO Error Report ***********************************
Unit Generating error=   15
---File States --------------------------
Unit  11 open on filename /home/561/sm2435/cylc-run/u-cy286/work/09760101/coupled/ATM_RUNDIR/History_Data/cy286a.p70976aug
--> File Type:   2 , Read Only: F , Write Only: F
--> Local: T AllLocal: F Remote: F Broadcast: T
--> Local: T AllLocal: F Remote: F Broadcast: T
Unit  12 open on filename /home/561/sm2435/cylc-run/u-cy286/work/09760101/coupled/ATM_RUNDIR/History_Data/cy286a.p80976aug
--> File Type:   2 , Read Only: F , Write Only: F
--> Local: T AllLocal: F Remote: F Broadcast: T
--> Local: T AllLocal: F Remote: F Broadcast: T
Unit  13 open on filename /home/561/sm2435/cylc-run/u-cy286/work/09760101/coupled/ATM_RUNDIR/History_Data/cy286a.pd0976aug
--> File Type:   2 , Read Only: F , Write Only: F
--> Local: T AllLocal: F Remote: F Broadcast: T
--> Local: T AllLocal: F Remote: F Broadcast: T
Unit  15 open on filename /g/data/access/TIDS/CMIP6_ANCIL/data/ancils/n96e/timeslice_1850/OzoneConc/v1/mmro3_monthly_CMIP6_1850_N96_edited-ancil_2anc
--> File Type:   1 , Read Only: T , Write Only: F
--> Local: T AllLocal: F Remote: F Broadcast: F
--> Local: T AllLocal: F Remote: F Broadcast: F
Unit  16 open on filename /g/data/access/projects/access/umdir/ancil/atmos/n96e/orca1/general_sea/GlobColour/v0/qrclim.sea
--> File Type:   1 , Read Only: T , Write Only: F
--> Local: T AllLocal: F Remote: F Broadcast: T
--> Local: T AllLocal: F Remote: F Broadcast: T
Unit  17 open on filename /projects/access/data/ancil/access_cm2_n96e/O1/cable_vegfunc_ncarlai.anc
--> File Type:   1 , Read Only: T , Write Only: F
--> Local: T AllLocal: F Remote: F Broadcast: T
--> Local: T AllLocal: F Remote: F Broadcast: T
Unit  18 open on filename /g/data/access/projects/access/data/ancil/access_cm2_n96e/O1/qrclim.sulpdms
--> File Type:   1 , Read Only: T , Write Only: F
--> Local: T AllLocal: F Remote: F Broadcast: T
--> Local: T AllLocal: F Remote: F Broadcast: T
---End File States ----------------------

It’s maybe not directly relevant, but I searched for that term and came up with this thread from the NCAS Modelling Support forum:

which might provide some clues on how to debug this.


Possibly the current date of the model is beyond the data in the ancillary file. Historical experiment ancillaries run to 2015ish.

STASH 60 is ozone, which matches up with unit 15 being the file that generated the error:

STASH code in dataset                     60   STASH code requested                     60
# ...
Unit Generating error=   15
# ...
Unit  15 open on filename /g/data/access/TIDS/CMIP6_ANCIL/data/ancils/n96e/timeslice_1850/OzoneConc/v1/mmro3_monthly_CMIP6_1850_N96_edited-ancil_2anc

though this is a climatology (periodic 12 months of data) so it won’t be that dates have gone wrong.

I’d say retry and see if it reproduces at the same time.
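The cross-referencing above (matching the unit number on the "Unit Generating error" line against the "File States" list) can be done by hand, but here is a minimal sketch of automating it. The function name `find_error_file` is hypothetical, and the regular expressions assume the report is laid out exactly as in the log pasted in this thread; other UM versions may format it differently.

```python
import re

def find_error_file(log_text):
    """Return (unit, filename) for the unit named in the IO Error Report.

    Returns None if the log has no 'Unit Generating error=' line, and
    (unit, None) if that unit is not listed in the file states.
    """
    match = re.search(r"Unit Generating error=\s*(\d+)", log_text)
    if match is None:
        return None
    unit = int(match.group(1))
    # Scan the 'File States' entries for the matching unit number.
    for entry in re.finditer(r"Unit\s+(\d+) open on filename (\S+)", log_text):
        if int(entry.group(1)) == unit:
            return unit, entry.group(2)
    return unit, None

# Lines copied from the report in this thread.
sample = """\
Unit Generating error=   15
Unit  13 open on filename /home/561/sm2435/cylc-run/u-cy286/work/09760101/coupled/ATM_RUNDIR/History_Data/cy286a.pd0976aug
Unit  15 open on filename /g/data/access/TIDS/CMIP6_ANCIL/data/ancils/n96e/timeslice_1850/OzoneConc/v1/mmro3_monthly_CMIP6_1850_N96_edited-ancil_2anc
"""
print(find_error_file(sample))
```

In practice you would pass in the contents of job.out instead of `sample`.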

Thanks @Aidan and @Scott
I have restarted and will see what happens.

Yeah, the error suggests it is not reading that file properly, but the file does exist, so I’m not sure why it fails now. I will let you know whether restarting fixes the problem.

Thanks,
Sebastian
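For anyone wanting to go one step beyond "the file does exist", here is a rough sketch of checking that a file can actually be opened and read from a login node. The helper `check_readable` and the 4096-byte chunk size are illustrative choices, not anything taken from the UM or the suite. Note that a check like this only shows the state of the file system at the moment you run it, so it cannot rule out a transient glitch at the time the job failed.

```python
import os

def check_readable(path, nbytes=4096):
    """Try to read the first nbytes of path and describe what happened."""
    if not os.path.isfile(path):
        return "missing"
    if not os.access(path, os.R_OK):
        return "not readable"
    try:
        with open(path, "rb") as handle:
            data = handle.read(nbytes)
    except OSError as err:
        # The file exists but the read itself failed (e.g. a file system problem).
        return f"read failed: {err}"
    return f"read {len(data)} bytes"

# Path of the ozone ancillary reported against unit 15 in this thread.
ancil = ("/g/data/access/TIDS/CMIP6_ANCIL/data/ancils/n96e/timeslice_1850/"
         "OzoneConc/v1/mmro3_monthly_CMIP6_1850_N96_edited-ancil_2anc")
print(check_readable(ancil))
```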

I note that this reply talks about “switching on extra diagnostics messages”

Is that something that is available in this suite? If so, is this a standard measure we should always suggest when debugging?

Restarting the job seems to have fixed it and the model is continuing to run with no issue now.

Unsure what caused this to happen, but it appears to have fixed itself in this case.

@Aidan I couldn’t find any options for extra diagnostics when I had a look.