Job fails trying to access /scratch while the database is on /g/data

I am running some scripts as PBS jobs to calculate annual means of passive_adelie_xflux_adv from the access-om2-01 01deg_jra55v140_iaf_cycle3_antarctic_tracers experiment and save them as netCDF. I created my own database when the outputs were moved from /scratch to /g/data/ik11, but the job fails, and it looks like it is still trying to access /scratch.

I am trying to save data for 10 years, 1985–1994. The script worked for 1985 and 1986 but failed for 1987–1994.

Since the outputs have since been indexed in the cosima_master database, I also ran the job using the master database, but got the same error.

Here is the error:

2022-12-23 13:03:58,009 - distributed.worker - WARNING - Compute Failed
Key:       open_dataset-4f39abe6-ca81-4889-bead-0c3bddfbda46
Function:  execute_task
args:      ((<function apply at 0x14746f0ff3a0>, <function open_dataset at 0x147468bf8310>, ['/scratch/v45/akm157/access-om2/archive/01deg_jra55v140_iaf_cycle3_antarctic_tracers/output607/ocean/ross_xflux_adv.nc'], (<class 'dict'>, [['engine', None], ['chunks', (<class 'dict'>, [['time', 1], ['st_ocean', 75], ['yt_ocean_sub01', 27], ['xu_ocean_sub01', 24]])]])))
kwargs:    {}
Exception: "FileNotFoundError(2, 'No such file or directory')"

Traceback (most recent call last):
  File "/g/data/x77/ps7863/python_scripts/01deg_jra55v140_iaf_cycle3_antarctic_tracers/save_ross_fluxes.py", line 41, in <module>
    ross_xflux = cc.querying.getvar(expt,
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/cosima_cookbook/querying.py", line 368, in getvar
    ds = xr.open_mfdataset(
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/api.py", line 1004, in open_mfdataset
    datasets, closers = dask.compute(datasets, closers)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/dask/base.py", line 600, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/distributed/client.py", line 3122, in get
    results = self.gather(packed, asynchronous=asynchronous, direct=direct)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/distributed/client.py", line 2291, in gather
    return self.sync(
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/distributed/utils.py", line 339, in sync
    return sync(
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/distributed/utils.py", line 406, in sync
    raise exc.with_traceback(tb)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/distributed/utils.py", line 379, in f
    result = yield future
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/tornado/gen.py", line 762, in run
    value = future.result()
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/distributed/client.py", line 2154, in _gather
    raise exception.with_traceback(traceback)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/dask/utils.py", line 71, in apply
    return func(*args, **kwargs)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/api.py", line 539, in open_dataset
    backend_ds = backend.open_dataset(
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/netCDF4_.py", line 572, in open_dataset
    store = NetCDF4DataStore.open(
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/netCDF4_.py", line 376, in open
    return cls(manager, group=group, mode=mode, lock=lock, autoclose=autoclose)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/netCDF4_.py", line 323, in __init__
    self.format = self.ds.data_model
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/netCDF4_.py", line 385, in ds
    return self._acquire()
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/netCDF4_.py", line 379, in _acquire
    with self._manager.acquire_context(needs_lock) as root:
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/file_manager.py", line 197, in acquire_context
    file, cached = self._acquire_with_cache_info(needs_lock)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/file_manager.py", line 215, in _acquire_with_cache_info
    file = self._opener(*self._args, **kwargs)
  File "src/netCDF4/_netCDF4.pyx", line 2353, in netCDF4._netCDF4.Dataset.__init__
  File "src/netCDF4/_netCDF4.pyx", line 1963, in netCDF4._netCDF4._ensure_nc_success
FileNotFoundError: [Errno 2] No such file or directory: b'/scratch/v45/akm157/access-om2/archive/01deg_jra55v140_iaf_cycle3_antarctic_tracers/output607/ocean/ross_xflux_adv.nc'
2022-12-23 17:33:05,833 - distributed.worker - WARNING - Compute Failed
Key:       open_dataset-0ee6fa3b-50ca-43d3-ac90-44ea11cfc974
Function:  execute_task
args:      ((<function apply at 0x1455d563e3a0>, <function open_dataset at 0x1455cb130310>, ['/scratch/v45/akm157/access-om2/archive/01deg_jra55v140_iaf_cycle3_antarctic_tracers/output607/ocean/ross_xflux_adv.nc'], (<class 'dict'>, [['engine', None], ['chunks', (<class 'dict'>, [['time', 1], ['st_ocean', 75], ['yt_ocean_sub01', 27], ['xu_ocean_sub01', 24]])]])))
kwargs:    {}
Exception: "FileNotFoundError(2, 'No such file or directory')"

Traceback (most recent call last):
  File "/g/data/x77/ps7863/python_scripts/01deg_jra55v140_iaf_cycle3_antarctic_tracers/save_ross_fluxes.py", line 41, in <module>
    ross_xflux = cc.querying.getvar(expt,
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/cosima_cookbook/querying.py", line 368, in getvar
    ds = xr.open_mfdataset(
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/api.py", line 1004, in open_mfdataset
    datasets, closers = dask.compute(datasets, closers)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/dask/base.py", line 600, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/distributed/client.py", line 3122, in get
    results = self.gather(packed, asynchronous=asynchronous, direct=direct)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/distributed/client.py", line 2291, in gather
    return self.sync(
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/distributed/utils.py", line 339, in sync
    return sync(
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/distributed/utils.py", line 406, in sync
    raise exc.with_traceback(tb)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/distributed/utils.py", line 379, in f
    result = yield future
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/tornado/gen.py", line 762, in run
    value = future.result()
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/distributed/client.py", line 2154, in _gather
    raise exception.with_traceback(traceback)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/dask/utils.py", line 71, in apply
    return func(*args, **kwargs)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/api.py", line 539, in open_dataset
    backend_ds = backend.open_dataset(
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/netCDF4_.py", line 572, in open_dataset
    store = NetCDF4DataStore.open(
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/netCDF4_.py", line 376, in open
    return cls(manager, group=group, mode=mode, lock=lock, autoclose=autoclose)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/netCDF4_.py", line 323, in __init__
    self.format = self.ds.data_model
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/netCDF4_.py", line 385, in ds
    return self._acquire()
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/netCDF4_.py", line 379, in _acquire
    with self._manager.acquire_context(needs_lock) as root:
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/file_manager.py", line 197, in acquire_context
    file, cached = self._acquire_with_cache_info(needs_lock)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/file_manager.py", line 215, in _acquire_with_cache_info
    file = self._opener(*self._args, **kwargs)
  File "src/netCDF4/_netCDF4.pyx", line 2353, in netCDF4._netCDF4.Dataset.__init__
  File "src/netCDF4/_netCDF4.pyx", line 1963, in netCDF4._netCDF4._ensure_nc_success
FileNotFoundError: [Errno 2] No such file or directory: b'/scratch/v45/akm157/access-om2/archive/01deg_jra55v140_iaf_cycle3_antarctic_tracers/output607/ocean/ross_xflux_adv.nc'
  File "/g/data/x77/ps7863/python_scripts/01deg_jra55v140_iaf_cycle3_antarctic_tracers/save_ross_fluxes.py", line 55
    ross_xflux_mean = (ross_xflux.sel(time=time_slice)*month_length).sum('time')/365.load()
                                                                                     ^
SyntaxError: invalid syntax
2023-01-10 11:39:08,658 - distributed.worker - WARNING - Compute Failed
Key:       open_dataset-c3fa9279-6b8c-4617-a759-943c1b47daba
Function:  execute_task
args:      ((<function apply at 0x1480755143a0>, <function open_dataset at 0x14806b004310>, ['/scratch/v45/akm157/access-om2/archive/01deg_jra55v140_iaf_cycle3_antarctic_tracers/output607/ocean/ross_xflux_adv.nc'], (<class 'dict'>, [['engine', None], ['chunks', (<class 'dict'>, [['time', 1], ['st_ocean', 75], ['yt_ocean_sub01', 27], ['xu_ocean_sub01', 24]])]])))
kwargs:    {}
Exception: "FileNotFoundError(2, 'No such file or directory')"

Traceback (most recent call last):
  File "/g/data/x77/ps7863/python_scripts/01deg_jra55v140_iaf_cycle3_antarctic_tracers/save_ross_fluxes.py", line 42, in <module>
    ross_xflux = cc.querying.getvar(expt,
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/cosima_cookbook/querying.py", line 368, in getvar
    ds = xr.open_mfdataset(
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/api.py", line 1004, in open_mfdataset
    datasets, closers = dask.compute(datasets, closers)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/dask/base.py", line 600, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/distributed/client.py", line 3122, in get
    results = self.gather(packed, asynchronous=asynchronous, direct=direct)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/distributed/client.py", line 2291, in gather
    return self.sync(
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/distributed/utils.py", line 339, in sync
    return sync(
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/distributed/utils.py", line 406, in sync
    raise exc.with_traceback(tb)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/distributed/utils.py", line 379, in f
    result = yield future
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/tornado/gen.py", line 762, in run
    value = future.result()
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/distributed/client.py", line 2154, in _gather
    raise exception.with_traceback(traceback)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/dask/utils.py", line 71, in apply
    return func(*args, **kwargs)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/api.py", line 539, in open_dataset
    backend_ds = backend.open_dataset(
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/netCDF4_.py", line 572, in open_dataset
    store = NetCDF4DataStore.open(
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/netCDF4_.py", line 376, in open
    return cls(manager, group=group, mode=mode, lock=lock, autoclose=autoclose)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/netCDF4_.py", line 323, in __init__
    self.format = self.ds.data_model
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/netCDF4_.py", line 385, in ds
    return self._acquire()
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/netCDF4_.py", line 379, in _acquire
    with self._manager.acquire_context(needs_lock) as root:
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/file_manager.py", line 197, in acquire_context
    file, cached = self._acquire_with_cache_info(needs_lock)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/file_manager.py", line 215, in _acquire_with_cache_info
    file = self._opener(*self._args, **kwargs)
  File "src/netCDF4/_netCDF4.pyx", line 2353, in netCDF4._netCDF4.Dataset.__init__
  File "src/netCDF4/_netCDF4.pyx", line 1963, in netCDF4._netCDF4._ensure_nc_success
FileNotFoundError: [Errno 2] No such file or directory: b'/scratch/v45/akm157/access-om2/archive/01deg_jra55v140_iaf_cycle3_antarctic_tracers/output607/ocean/ross_xflux_adv.nc'
/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 168 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 168 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '


Can anyone point out what’s wrong?


You’re correct, it is trying to open a file that doesn’t exist:

$ ls -l /scratch/v45/akm157/access-om2/archive/01deg_jra55v140_iaf_cycle3_antarctic_tracers/output607/ocean/ross_xflux_adv.nc
ls: cannot access '/scratch/v45/akm157/access-om2/archive/01deg_jra55v140_iaf_cycle3_antarctic_tracers/output607/ocean/ross_xflux_adv.nc': No such file or directory

Did you recreate your database after the files were moved to /g/data?

Regarding the files that are now in /g/data, I’m not sure, but it’s possible that they are not in the master database because the top-level folder belongs to the wrong group:

$ ls -ld /g/data/ik11/outputs/access-om2-01/01deg_jra55v140_iaf_cycle3_antarctic_tracers/
drwxr-sr-x+ 108 akm157 oz91 12288 Dec  9 09:47 /g/data/ik11/outputs/access-om2-01/01deg_jra55v140_iaf_cycle3_antarctic_tracers/

As the owner of the files, this is something that only @adele157 can fix.

PS: In case it’s useful, there are some instructions and a couple of scripts to fix this kind of permission issue here: Home · COSIMA/master_index Wiki · GitHub. The scripts take care of finding and fixing all files with incorrect permissions/ownership :wink:

Permissions are fixed now. I was previously trying to free up space on /g/data/ik11 by changing the group ownership to a group other than ik11.

But I’m still not clear why @polinash’s database and the master database are looking for a file on /scratch that doesn’t exist. Isn’t the database indexing supposed to prune non-existent files? And why are the databases even looking at files on /scratch when they’re only given /g/data/ paths to index?
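One way to check what a cookbook database actually holds is to query the SQLite file directly for stale paths. The schema used below (an `ncfiles` table with an `ncfile` path column) is only my assumption about the cookbook's layout — verify the real table and column names with `sqlite3 your.db '.schema'` first. To keep the sketch runnable it builds a toy in-memory database with illustrative paths:

```python
import sqlite3

# Toy stand-in for a cookbook database; the `ncfiles`/`ncfile` names are
# assumptions about the real schema, not confirmed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ncfiles (id INTEGER PRIMARY KEY, ncfile TEXT)")
conn.executemany(
    "INSERT INTO ncfiles (ncfile) VALUES (?)",
    [
        ("/g/data/ik11/outputs/access-om2-01/expt/output606/ocean/ross_xflux_adv.nc",),
        ("/scratch/v45/akm157/access-om2/archive/expt/output607/ocean/ross_xflux_adv.nc",),
    ],
)

# After a move to /g/data, any row still under /scratch is stale:
stale = [row[0] for row in conn.execute(
    "SELECT ncfile FROM ncfiles WHERE ncfile LIKE '/scratch/%'"
)]
print(stale)
```

Running the same `LIKE '/scratch/%'` query against the real database file would show immediately whether the stale entries live in the user database, the master database, or both.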

Yes, I recreated the database. Also, I’ve used cosima_master recently, but it gives the same error.

I tried to load the variable I need from the master database and it worked, which made me think the files I need are indexed there. But when it comes to running the job, it fails.

@micael @Aidan also, it works fine for some years and fails for others.
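That some years work and others fail is consistent with only some files resolving to valid paths. Whichever database is used, a cheap pre-flight check is to resolve the file list for the failing years and test each path for existence before submitting the PBS job. A minimal sketch — in the real workflow the `paths` list would come from a cookbook query, not the throwaway files created here:

```python
import os
import tempfile
from pathlib import Path

def stale_paths(paths):
    """Return the subset of paths that no longer exist on disk."""
    return [p for p in paths if not Path(p).exists()]

# Throwaway files standing in for netCDF outputs:
with tempfile.TemporaryDirectory() as d:
    present = os.path.join(d, "ross_xflux_adv_1986.nc")
    Path(present).touch()
    missing = os.path.join(d, "ross_xflux_adv_1987.nc")  # never created
    print(stale_paths([present, missing]))  # only the 1987 file is reported
```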

Could you share the script or let us know where to find it? That will make it easier to understand what is happening.

import cosima_cookbook as cc
from cosima_cookbook import explore

import matplotlib.pyplot as plt
import matplotlib.patches as patches
from matplotlib import ticker, cm
import numpy as np
import netCDF4 as nc
import cartopy.crs as ccrs
import xarray as xr
import cmocean.cm as cmocean
import matplotlib.colors as col
from mpl_toolkits.axes_grid1 import make_axes_locatable
import glob,os
import gsw
import climtas.nci
import timeit
import sys
from time import process_time

import logging
logging.captureWarnings(True)
logging.getLogger('py.warnings').setLevel(logging.ERROR)

from dask.distributed import Client

if __name__ == '__main__':

        climtas.nci.GadiClient()

        db = '/g/data/x77/ps7863/database/iaf_cycle3_daily_tracers_new.db'
        session = cc.database.create_session(db)
        secondary_session = cc.database.create_session('/g/data/ik11/databases/cosima_master.db')
        expt ='01deg_jra55v140_iaf_cycle3_antarctic_tracers'

        first_year = str(int(sys.argv[1]))
        start_time = first_year + '-01-01'
        end_time = first_year + '-12-31'
        time_slice = slice(start_time, end_time)

        t1_start = process_time() 

        ross_xflux = cc.querying.getvar(expt,
                                    'passive_ross_xflux_adv',
                                    secondary_session,
                                    start_time=start_time,
                                    end_time=end_time)

        ross_yflux = cc.querying.getvar(expt,
                                    'passive_ross_yflux_adv',
                                    secondary_session,
                                    start_time=start_time,
                                    end_time=end_time)

        month_length = ross_yflux.time.dt.days_in_month
        ross_xflux_mean = ((ross_xflux.sel(time=time_slice)*month_length).sum('time')/365).load()
        ross_yflux_mean = ((ross_yflux.sel(time=time_slice)*month_length).sum('time')/365).load()
        
        t1_stop = process_time()
        print('Mean done in ', (t1_stop-t1_start)/60, ' minutes') 

        enc_x = {'ross_xflux_mean':
                {'shuffle': True,
                'chunksizes': [75, 101, 100],
                'zlib': True,
                'complevel': 5}}

        enc_y = {'ross_yflux_mean':
                {'shuffle': True,
                'chunksizes': [75, 101, 100],
                'zlib': True,
                'complevel': 5}}

        save_dir = '/g/data/x77/ps7863/data/ross_tracer_adv_fluxes/'


        ross_xflux_mean.to_dataset(name='ross_xflux_mean').to_netcdf(save_dir+'mean_ross_xflux_adv_' + first_year + '.nc', encoding=enc_x)
        ross_yflux_mean.to_dataset(name='ross_yflux_mean').to_netcdf(save_dir+'mean_ross_yflux_adv_' + first_year + '.nc', encoding=enc_y)

        t2_stop = process_time()

        print('Saving finished in ', (t2_stop-t1_stop)/60, ' minutes')
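Two small asides on the script. The SyntaxError in the earlier job log came from writing `/365.load()`: Python tokenizes `365.` as a float literal, so the following name `load` is a stray token, and the parenthesised form `(.../365).load()` used above is the fix. Separately, dividing by a hard-coded 365 slightly biases leap years; dividing by the sum of the month-length weights handles both cases. A runnable sketch, with `x`, `t`, `w` as placeholder names and plain Python lists standing in for the xarray objects:

```python
import ast

# 1) Why `.../365.load()` fails at parse time: `365.` is consumed as a
#    float literal, leaving `load` with nothing to attach to.
try:
    ast.parse("m = (x.sel(time=t) * w).sum('time') / 365.load()")
    parsed_bad = True
except SyntaxError:
    parsed_bad = False
ast.parse("m = ((x.sel(time=t) * w).sum('time') / 365).load()")  # parses fine
print(parsed_bad)  # False

# 2) Month-length weighting: dividing by the sum of the weights is exact
#    in leap years too, unlike a hard-coded 365.
days = [31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]  # a leap year
monthly = [float(i) for i in range(1, 13)]  # stand-in for monthly-mean fluxes
weighted_sum = sum(m * d for m, d in zip(monthly, days))
annual_hardcoded = weighted_sum / 365
annual_weighted = weighted_sum / sum(days)  # sum(days) == 366 here
print(annual_hardcoded > annual_weighted)  # True: /365 overshoots slightly
```

In the script above this would mean replacing `/365` with `/month_length.sum('time')` (keeping the outer parentheses before `.load()`).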

Not-a-positive update: I recreated the database AGAIN and still got the same error. @micael @Aidan

Hmm. I can’t find any reference to files on scratch in the master database. Did the error message you posted above happen when you were using the master database or your own database? If it was the latter, do you still have the error message you got when using the master database?

@angus-g do you have any ideas?

The error message I originally posted happened when I used cosima_master. Here is the error that occurred when I tried my re-created database:

2022-12-23 13:03:58,009 - distributed.worker - WARNING - Compute Failed
Key:       open_dataset-4f39abe6-ca81-4889-bead-0c3bddfbda46
Exception: "FileNotFoundError(2, 'No such file or directory')"
FileNotFoundError: [Errno 2] No such file or directory: b'/scratch/v45/akm157/access-om2/archive/01deg_jra55v140_iaf_cycle3_antarctic_tracers/output607/ocean/ross_xflux_adv.nc'

[remainder of the log, including the full traceback, is identical to the one posted above]
    backend_ds = backend.open_dataset(
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/netCDF4_.py", line 572, in open_dataset
    store = NetCDF4DataStore.open(
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/netCDF4_.py", line 376, in open
    return cls(manager, group=group, mode=mode, lock=lock, autoclose=autoclose)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/netCDF4_.py", line 323, in __init__
    self.format = self.ds.data_model
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/netCDF4_.py", line 385, in ds
    return self._acquire()
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/netCDF4_.py", line 379, in _acquire
    with self._manager.acquire_context(needs_lock) as root:
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/file_manager.py", line 197, in acquire_context
    file, cached = self._acquire_with_cache_info(needs_lock)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/site-packages/xarray/backends/file_manager.py", line 215, in _acquire_with_cache_info
    file = self._opener(*self._args, **kwargs)
  File "src/netCDF4/_netCDF4.pyx", line 2353, in netCDF4._netCDF4.Dataset.__init__
  File "src/netCDF4/_netCDF4.pyx", line 1963, in netCDF4._netCDF4._ensure_nc_success
FileNotFoundError: [Errno 2] No such file or directory: b'/scratch/v45/akm157/access-om2/archive/01deg_jra55v140_iaf_cycle3_antarctic_tracers/output607/ocean/ross_xflux_adv.nc'
/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.07/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 168 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
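(Side note: the log above also contains a SyntaxError that is unrelated to the missing file. Python tokenises `365.load()` as the float literal `365.` followed by a bare name, which cannot parse; the fix is presumably to parenthesise the division before calling `.load()`, i.e. `(... / 365).load()`. A minimal stdlib-only illustration of why the original line fails:)

```python
# Demonstrate why "/365.load()" is a SyntaxError: the tokenizer reads "365."
# as a float literal, so ".load" never becomes a method call on the result.
import ast

def parses(expr):
    """Return True if expr is syntactically valid Python."""
    try:
        ast.parse(expr)
        return True
    except SyntaxError:
        return False

print(parses("x / 365.load()"))    # False -- "365." swallows the dot
print(parses("(x / 365).load()"))  # True  -- division grouped first
```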

UPD: I loaded passive_ross_xflux_adv (just loaded it, without calculating the mean) from both my own database and the cosima_master database in a Jupyter notebook on Gadi, and it worked. Why does it load in a Jupyter notebook but fail when the same .py script is run as a PBS job?

I managed to run your script from a notebook for one of the problematic years (1987), so this seems to confirm it works as expected from a Jupyter Notebook.

@polinash I had a closer look at your scripts. It seems that in the PBS script you are redirecting the output and error logs of the Python script with >>, which appends the logs to the existing file rather than overwriting it.

If you look at outputs/output_save_ross_fluxes_1987.txt you will see errors with time-stamps from several different days, but none from yesterday. So maybe the script ran fine once you switched to the master database, or the error is not related to loading the files at all?
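For anyone following along, this behaviour is easy to reproduce with the standard library alone: the shell's `>>` corresponds to opening a file in append mode (`"a"`), while `>` corresponds to truncate mode (`"w"`). A small sketch (the file name and log text are illustrative):

```python
# Shell ">>" behaves like Python mode "a" (append): old lines survive across
# runs, so stale error messages linger in the log. ">" behaves like mode "w".
import os
import tempfile

log = os.path.join(tempfile.mkdtemp(), "output_save_ross_fluxes_1987.txt")

for run in ("old error from 2022-12-23", "clean run 2023-01-10"):
    with open(log, "a") as f:        # ">>": appends, old lines kept
        f.write(run + "\n")

print(open(log).read().splitlines())  # both lines present -> stale errors linger

with open(log, "w") as f:            # ">": file truncated first
    f.write("clean run 2023-01-10\n")

print(open(log).read().splitlines())  # only the latest run remains
```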

This is indeed quite strange…

So I tried to reproduce the error with @polinash yesterday in a Jupyter notebook and everything seemed fine.

@polinash perhaps you can boil this down to a minimal working example (MWE), ideally using the master database.

In the meantime, if you know where the files are located, you can open them directly with xarray.open_mfdataset(), bypassing the cosima-cookbook. That way you don’t need to wait for this issue to be resolved to continue analysing whatever you need to analyse.

Thanks Micael, you were right: the error message was from previous attempts. I re-created my database and the master database worked fine too. I kept thinking the problem was the database trying to access /scratch because I was looking at the old error message, and the data wasn’t being saved because my jobs also exceeded their wall time. As soon as I cleared the old error logs and increased the wall time, I managed to save the data.
Fixed!
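For reference, re-creating a personal database boils down to pointing the cookbook's indexer at the new location of the outputs. A hedged sketch (the paths are hypothetical, `create_session`/`build_index` are used as documented for cosima-cookbook, and it needs the analysis3 environment to run):

```python
# Rebuild a personal cosima-cookbook database after outputs moved from
# /scratch to /g/data, so stale /scratch paths no longer appear in the index.
def rebuild_database(db_path, experiment_dirs):
    import cosima_cookbook as cc  # deferred import; requires analysis3 env
    session = cc.database.create_session(db_path)   # creates db if missing
    cc.database.build_index(experiment_dirs, session)
    return session

# Usage (hypothetical paths):
# session = rebuild_database(
#     "/g/data/x77/ps7863/databases/tracers.db",
#     ["/g/data/ik11/outputs/access-om2-01/"
#      "01deg_jra55v140_iaf_cycle3_antarctic_tracers"],
# )
```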

Great that this is solved @polinash.

It would be great to mark @micael’s answer as the solution. See this best practice guide for how to do this:
