xr.open_mfdataset OSError [Errno 22] on load

I’m trying to read ORAS5 sea surface height data with xarray, and it seems to be struggling at the concatenation step. All the files can be read individually with xr.open_dataset, and the ones I’ve checked look fine, but when I try to read them all at once with xr.open_mfdataset and then load, I get OSError: [Errno 22] Invalid argument

It seems to come down to this: opening 362 files is fine, but 363 is not, and I can’t see why. Has anyone seen anything similar before?

import glob
import xarray as xr
files = glob.glob('/Users/u6955431/Large_Datasets/ORAS5/oras5/ORCA025/sossheig/opa0/*/*.nc')

ds = xr.open_mfdataset(files[:363], compat='override', coords='minimal') # all good
ds.load() # NOT good (error message below)
xr.open_mfdataset(files[:362], compat='override', coords='minimal').load() # all good with one fewer file
xr.open_mfdataset(files[1:363], compat='override', coords='minimal').load() # all good with one fewer file
xr.open_mfdataset(files[1:364], compat='override', coords='minimal').load() # same number of files, and we have problems

Technical details

  • I’m running this on my laptop so can’t share an example notebook easily.
  • I’m using dask, but get the same problem without it
  • As a workaround, I’m loading each half of the dataset separately and concatenating them, but it would be nice to know what’s going on and be able to do it properly
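For reference, the two-halves workaround can be sketched with small in-memory stand-ins (synthetic datasets here; in practice each half would come from xr.open_mfdataset over a slice of the file list, and the "time" dimension name is an assumption):

```python
import numpy as np
import xarray as xr

# Stand-ins for the two eagerly loaded halves; in practice these would be
# xr.open_mfdataset(files[:362], ...).load() and the remainder.
first = xr.Dataset(
    {"sossheig": (("time", "y"), np.zeros((3, 4)))},
    coords={"time": np.arange(3)},
)
second = xr.Dataset(
    {"sossheig": (("time", "y"), np.ones((2, 4)))},
    coords={"time": np.arange(3, 5)},
)

# Concatenate the already-loaded halves along the record dimension.
ds = xr.concat([first, second], dim="time")
print(ds.sizes["time"])  # 5
```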

Error message:

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
Cell In[5], line 5
      2 files = glob.glob('/Users/u6955431/Large_Datasets/ORAS5/oras5/ORCA025/sossheig/opa0/*/*.nc')
      4 ds = xr.open_mfdataset(files[:363], compat='override',coords='minimal') # all good
----> 5 ds.load() # not good (error message below)

File ~/mambaforge/lib/python3.10/site-packages/xarray/core/dataset.py:853, in Dataset.load(self, **kwargs)
    850 chunkmanager = get_chunked_array_type(*lazy_data.values())
    852 # evaluate all the chunked arrays simultaneously
--> 853 evaluated_data = chunkmanager.compute(*lazy_data.values(), **kwargs)
    855 for k, data in zip(lazy_data, evaluated_data):
    856     self.variables[k].data = data

File ~/mambaforge/lib/python3.10/site-packages/xarray/core/daskmanager.py:70, in DaskManager.compute(self, *data, **kwargs)
     67 def compute(self, *data: DaskArray, **kwargs) -> tuple[np.ndarray, ...]:
     68     from dask.array import compute
---> 70     return compute(*data, **kwargs)

File ~/mambaforge/lib/python3.10/site-packages/distributed/utils_comm.py:434, in retry_operation(coro, operation, *args, **kwargs)
    428 retry_delay_min = parse_timedelta(
    429     dask.config.get("distributed.comm.retry.delay.min"), default="s"
    430 )
    431 retry_delay_max = parse_timedelta(
    432     dask.config.get("distributed.comm.retry.delay.max"), default="s"
    433 )
--> 434 return await retry(
    435     partial(coro, *args, **kwargs),
    436     count=retry_count,
    437     delay_min=retry_delay_min,
    438     delay_max=retry_delay_max,
    439     operation=operation,
    440 )

File ~/mambaforge/lib/python3.10/site-packages/distributed/utils_comm.py:413, in retry(coro, count, delay_min, delay_max, jitter_fraction, retry_on_exceptions, operation)
    411             delay *= 1 + random.random() * jitter_fraction
    412         await asyncio.sleep(delay)
--> 413 return await coro()

File ~/mambaforge/lib/python3.10/site-packages/distributed/core.py:1377, in PooledRPCCall.__getattr__.<locals>.send_recv_from_rpc(**kwargs)
   1375 prev_name, comm.name = comm.name, "ConnectionPool." + key
   1376 try:
-> 1377     return await send_recv(comm=comm, op=key, **kwargs)
   1378 finally:
   1379     self.pool.reuse(self.addr, comm)

File ~/mambaforge/lib/python3.10/site-packages/distributed/core.py:1136, in send_recv(comm, reply, serializers, deserializers, **kwargs)
   1134 await comm.write(msg, serializers=serializers, on_error="raise")
   1135 if reply:
-> 1136     response = await comm.read(deserializers=deserializers)
   1137 else:
   1138     response = None

File ~/mambaforge/lib/python3.10/site-packages/distributed/comm/tcp.py:235, in TCP.read(self, deserializers)
    233         chunk = frames[i:j]
    234         chunk_nbytes = chunk.nbytes
--> 235         n = await stream.read_into(chunk)
    236         assert n == chunk_nbytes, (n, chunk_nbytes)
    237 except StreamClosedError as e:

File ~/mambaforge/lib/python3.10/site-packages/tornado/iostream.py:467, in BaseIOStream.read_into(self, buf, partial)
    464 self._read_partial = partial
    466 try:
--> 467     self._try_inline_read()
    468 except:
    469     future.add_done_callback(lambda f: f.exception())

File ~/mambaforge/lib/python3.10/site-packages/tornado/iostream.py:836, in BaseIOStream._try_inline_read(self)
    834     return
    835 self._check_closed()
--> 836 pos = self._read_to_buffer_loop()
    837 if pos is not None:
    838     self._read_from_buffer(pos)

File ~/mambaforge/lib/python3.10/site-packages/tornado/iostream.py:750, in BaseIOStream._read_to_buffer_loop(self)
    743 next_find_pos = 0
    744 while not self.closed():
    745     # Read from the socket until we get EWOULDBLOCK or equivalent.
    746     # SSL sockets do some internal buffering, and if the data is
    747     # sitting in the SSL object's buffer select() and friends
    748     # can't see it; the only way to find out if it's there is to
    749     # try to read it.
--> 750     if self._read_to_buffer() == 0:
    751         break
    753     # If we've read all the bytes we can use, break out of
    754     # this loop.
    755 
    756     # If we've reached target_bytes, we know we're done.

File ~/mambaforge/lib/python3.10/site-packages/tornado/iostream.py:861, in BaseIOStream._read_to_buffer(self)
    859     else:
    860         buf = bytearray(self.read_chunk_size)
--> 861     bytes_read = self.read_from_fd(buf)
    862 except (socket.error, IOError, OSError) as e:
    863     # ssl.SSLError is a subclass of socket.error
    864     if self._is_connreset(e):
    865         # Treat ECONNRESET as a connection close rather than
    866         # an error to minimize log spam  (the exception will
    867         # be available on self.error for apps that care).

File ~/mambaforge/lib/python3.10/site-packages/tornado/iostream.py:1116, in IOStream.read_from_fd(***failed resolving arguments***)
   1114 def read_from_fd(self, buf: Union[bytearray, memoryview]) -> Optional[int]:
   1115     try:
-> 1116         return self.socket.recv_into(buf, len(buf))
   1117     except BlockingIOError:
   1118         return None

OSError: [Errno 22] Invalid argument

I wonder if you’re hitting a file descriptor limit. You could see if increasing the limit changes anything. The default is probably 1024, so you could double it by running

ulimit -n 2048

in the shell from which you launch your notebook. If that doesn’t help, it could be hitting a kernel limit in the asynchronous file polling…
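One way to check (and raise) the descriptor limit from inside the notebook itself, rather than the launching shell, is the stdlib resource module (Unix only; the target of 4096 here is an arbitrary choice for illustration):

```python
import resource

# Soft/hard limits on open file descriptors for this process; each file
# opened by open_mfdataset holds a descriptor until the dataset is closed.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft: {soft}, hard: {hard}")

# The soft limit can be raised up to the hard limit without sudo.
target = 4096 if hard == resource.RLIM_INFINITY else min(4096, hard)
resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
```

If the new soft limit is still not enough, only the hard limit (a privileged setting) stands in the way, which points back at the OS-level knobs discussed below.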

Thanks Angus.

Adjusting the ulimit didn’t do anything. How do I fix the asynchronous file polling? I’m not sure I understand that, but it seems plausible, because each file is a single month of monthly-resolution data, so there are lots of small files

On macOS, it might also be that you need to use

sysctl -w kern.maxfilesperproc=20000

or something similar? There appear to be a lot of different ways to tackle this problem, depending on OS version and whether you want it to persist.

Still no luck.
Though I don’t think that raised the limit; it was already higher than 20000.

bash-3.2$ sysctl -w kern.maxfilesperproc=20000
kern.maxfilesperproc: 122880
sysctl: kern.maxfilesperproc=20000: Operation not permitted
bash-3.2$ sudo sysctl -w kern.maxfilesperproc=20000
Password:
kern.maxfilesperproc: 122880 -> 20000

Since that lowered the limit rather than raising it, I’m putting it back to where it was

Hi @jemmajeffree ,

Your question is likely to be out-of-scope for ACCESS-NRI support. However, it’s useful for us to see the issues that are faced by users.

When debugging this type of problem, it can help to isolate it further, e.g. to determine whether it’s a file limit or a path-length limit. Can you copy the files to a directory where the full path has fewer characters than /Users/$USER/Large_Datasets/ORAS5/oras5/ORCA025/sossheig/opa0/*/*.nc and then use the shorter path? Does using a shorter path change the number of files that can be opened?

If the error is coming from sockets, it’s more likely to be due to an OS resource limit. Have you tried something like: macos - How to increase limits on sockets on osx for load testing? - Stack Overflow ?

Hi @jemmajeffree. This is a really interesting error, and I think it’s worth documenting the cause. It’s coming from the tornado communications backend used by dask/xarray while it’s attempting to distribute chunks. I believe the root cause is a low-level system error specific to the socket implementation on macOS. According to this ancient documentation, there are two cases where the recv syscall (the OS implementation of the function call in your error) can return EINVAL (see here). The first is when the MSG_OOB flag is set (not relevant here), and the second is when the length of the requested/returned data would overflow a C ssize_t.

It turns out that the modified BSD kernel Apple uses and its libc implementation are open source, so digging through all of that seems to indicate that on anything running macOS, an ssize_t is a signed 32-bit integer, meaning it’s limited to 2^31 − 1. I’m not a kernel programmer, but I think it’s this if statement throwing the error. Given the sizes of the datasets you’re likely dealing with, it may actually be attempting to create a chunk greater than 2 GB. This probably explains why you don’t see it on Gadi, and why it works when you halve the dataset. To know for sure, though, we’d need to know the size in bytes of buf at tornado/iostream.py:1116.
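As a rough sanity check on the 2 GB hypothesis (1442 × 1021 is the standard ORCA025 horizontal grid size, and the float64 in-memory dtype is an assumption about these files):

```python
# Could a single transferred buffer plausibly exceed 2**31 - 1 bytes?
# Grid size 1442 x 1021 (standard ORCA025) and float64 are both assumptions.
nx, ny, months, itemsize = 1442, 1021, 363, 8
nbytes = nx * ny * months * itemsize
print(f"{nbytes / 2**30:.2f} GiB")  # 3.98 GiB for the full SSH array
print(nbytes > 2**31 - 1)           # True
```

Whether tornado actually packed that much into one recv call is exactly what the size of buf at iostream.py:1116 would tell us, but the magnitude at least makes the overflow plausible.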

If anything, this would be a bug in tornado; the correct way to do this would be to receive into smaller, fixed-size buffers and append to the larger final buffer as messages arrive until there are no more. I guess the tornado devs made the (valid) assumption that there isn’t a practical limit on the receive buffer size (on Linux, it’s 2^63 − 1 bytes, or 8 exabytes). Anyway, I don’t check this forum often now that I’ve left CLEX, but this was a fascinating rabbit hole to go down while some tests were running, and I think it’s valuable to the community. Many people are going to be working with larger and larger datasets on macOS in future, and I’m sure this won’t be the last time we see assumptions about type sizes causing problems. In terms of practical workarounds, maybe setting chunking from the start or preprocessing out some of the data could help?
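The "receive into smaller, fixed-size buffers" fix can be sketched with plain sockets (a minimal illustration only, not tornado’s actual code; recv_all and the 64 KiB chunk size are made up for the example):

```python
import socket
import threading

def recv_all(sock, nbytes, chunk_size=64 * 1024):
    """Receive exactly nbytes, never asking the kernel for more than
    chunk_size per recv call, so no single syscall can hit a size limit."""
    parts = []
    remaining = nbytes
    while remaining:
        part = sock.recv(min(remaining, chunk_size))
        if not part:
            raise ConnectionError("socket closed before message completed")
        parts.append(part)
        remaining -= len(part)
    return b"".join(parts)

# Demo over a local socket pair: send 3 MB from a helper thread and
# reassemble it on the other end in 64 KiB pieces.
a, b = socket.socketpair()
payload = b"x" * 3_000_000
sender = threading.Thread(target=b.sendall, args=(payload,))
sender.start()
received = recv_all(a, len(payload))
sender.join()
print(received == payload)  # True
```

On the xarray side, the chunking workaround mentioned above would mean passing something like chunks={'time_counter': 12} to open_mfdataset (the dimension name is an assumption about the ORAS5 files) so that no single chunk approaches 2 GB.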
