Netcdf_conversion often fails since switching to xp65

Hi all,

I’ve noticed that the netcdf_conversion job fails regularly (every few model years) in CM2. I believe I first noticed this after moving to the xp65 environment. I’m running a copy of suite u-db130/meltwater.

Has anyone else experienced something similar?

This is the error message I get:

Using the cylc session localhost

Loading cylc7/23.09
  Loading requirement: mosrs-setup/1.0.1
Loading pythonlib/um2netcdf4/xp65
  Loading requirement: singularity conda/analysis3-25.08
Traceback (most recent call last):
  File "/home/581/wgh581/cylc-run/u-ds210_ensemble3/share/fcm_make_drivers/build-drivers/bin/run_um2netcdf.py", line 7, in <module>
    import os, datetime, collections, um2netcdf4, shutil, re, f90nml
  File "/g/data/access/projects/access/apps/pythonlib/um2netcdf4/2.1/um2netcdf4.py", line 8, in <module>
    from iris.fileformats.pp import PPField
  File "/g/data/xp65/public/apps/med_conda/envs/analysis3-25.08/lib/python3.11/site-packages/iris/fileformats/__init__.py", line 17, in <module>
    from . import name, netcdf, nimrod, pp, um
  File "/g/data/xp65/public/apps/med_conda/envs/analysis3-25.08/lib/python3.11/site-packages/iris/fileformats/netcdf/__init__.py", line 22, in <module>
    from .._nc_load_rules.helpers import UnknownCellMethodWarning, parse_cell_methods
  File "/g/data/xp65/public/apps/med_conda/envs/analysis3-25.08/lib/python3.11/site-packages/iris/fileformats/_nc_load_rules/helpers.py", line 28, in <module>
    import pyproj
  File "/g/data/xp65/public/apps/med_conda/envs/analysis3-25.08/lib/python3.11/site-packages/pyproj/__init__.py", line 33, in <module>
    import pyproj.network
  File "/g/data/xp65/public/apps/med_conda/envs/analysis3-25.08/lib/python3.11/site-packages/pyproj/network.py", line 10, in <module>
    from pyproj._network import (  # noqa: F401 pylint: disable=unused-import
  File "pyproj/_network.pyx", line 1, in init pyproj._network
ImportError: /g/data/xp65/public/apps/med_conda/envs/analysis3-25.08/lib/python3.11/site-packages/pyproj/_datadir.cpython-311-x86_64-linux-gnu.so: cannot read file data: Input/output error
[FAIL] run_um2netcdf.py # return-code=1
2025-11-18T04:46:07Z CRITICAL - failed/EXIT

@wghuneke, we have seen this issue with ESM1.6 as well. After some testing, emails and conversations with various people at NCI and elsewhere, it all comes down to how the Lustre filesystem, Python and containers interact with each other. Unfortunately, we haven’t found a fix yet, and we still have to test various ideas to see what works.

This wasn’t a priority because the failure occurs so rarely in ESM1.6. It seems to be happening more often for your CM2 suite. We’ll take that into consideration and let you know if we find a fix.

A suggestion by @MartinDix to circumvent the issue is to add a line to the netcdf_conversion section in suite.rc:

    [[netcdf_conversion]]
        inherit = POSTPROC, NCI, SHARE
        pre-script = """
            module use ~access/modules
            module unload python
            module load pythonlib/um2netcdf4/xp65
        """
        [[[job]]]
            execution time limit = PT30M
            execution retry delays = 3*PT5M
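
For anyone unfamiliar with the notation: as I understand the cylc setting, `execution retry delays = 3*PT5M` means a failed job is retried up to three times, waiting five minutes (ISO 8601 duration `PT5M`) before each retry. The Python sketch below only illustrates how such a value expands. The helper names `parse_duration` and `expand_retry_delays` are made up for this example and are not part of cylc, and the sketch handles only simple `PT..H..M..S` durations, not everything cylc accepts.

```python
import re
from datetime import timedelta

# Matches simple time-only ISO 8601 durations, e.g. PT5M, PT30M, PT1H30M.
_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$")


def parse_duration(text):
    """Convert a time-only ISO 8601 duration such as 'PT5M' to a timedelta."""
    match = _DURATION.match(text.strip())
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def expand_retry_delays(setting):
    """Expand a value like '3*PT5M' or 'PT1M, 2*PT10M' into a list of delays."""
    delays = []
    for item in setting.split(","):
        item = item.strip()
        if not item:
            continue
        count, sep, duration = item.partition("*")
        if sep:
            # 'N*DURATION' repeats the same delay N times.
            delays.extend([parse_duration(duration)] * int(count))
        else:
            delays.append(parse_duration(item))
    return delays


delays = expand_retry_delays("3*PT5M")
print(len(delays), "retries, total extra wait:", sum(delays, timedelta()))
# prints: 3 retries, total extra wait: 0:15:00
```

So in the worst case this setting adds 15 minutes of waiting before the task is finally marked as failed. Each attempt is still subject to the `PT30M` execution time limit.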
