I am currently using the edge environment analysis3_edge-25.11, which works well with torch + GPU, but I’m running into issues with xarray, specifically where the failure seems to come from importing netCDF4. For example, `ds = xr.open_dataset(nc_path)` gives an import error.
The same happens when I simply run import netCDF4, which gives the same import error.
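Until the environment itself is fixed, one way to see which netCDF backend is actually importable — and whether xarray could fall back to h5netcdf instead — is a small probe like the following. This is just a sketch: `pick_xarray_engine` is an illustrative helper name, and it assumes h5netcdf may or may not be present in the environment.

```python
import importlib
import importlib.util

def pick_xarray_engine():
    """Return the first netCDF-capable xarray engine that imports cleanly, or None."""
    # Probe each backend and confirm it actually imports — a broken
    # netCDF4 install (the failure described above) raises on import
    # even though the package is present.
    for module, engine in (("netCDF4", "netcdf4"), ("h5netcdf", "h5netcdf")):
        if importlib.util.find_spec(module) is None:
            continue
        try:
            importlib.import_module(module)
        except ImportError:
            continue
        return engine
    return None

print(pick_xarray_engine())
```

If this returns `"h5netcdf"`, then `xr.open_dataset(nc_path, engine="h5netcdf")` may work as a stopgap for netCDF-4/HDF5 files, though it won't read classic netCDF-3 files.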
Is it possible to fix the import netCDF4 issue in the edge environments? I have no issues when I switch to analysis3; I am currently using analysis3-26.02.
We were planning on deprecating the -edge environments soon anyway - is there a particular reason you’re needing to use the edge environment over a regular, non-edge one?
I am running this script to test whether PyTorch can see CUDA. In analysis3-26.02 (and earlier versions), PyTorch can’t see CUDA. The latest environment where PyTorch can see CUDA is analysis3_edge-25.11, but I run into an issue: import netCDF4 throws errors, which then causes import xarray to fail. I’m trying to find an analysis3 environment that has a PyTorch build with CUDA support and also has xarray.
Here’s the simple script:
```python
# torch on GPU?
import torch

# Let's check if CUDA (i.e. a GPU) is supported
if torch.cuda.is_available():
    print("CUDA GPU is available.")   # analysis3_edge-25.11
else:
    print("CUDA GPU not available.")  # analysis3-26.02
```
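Since the goal is an environment with all three pieces at once, a slightly broader probe can report on everything in one pass. This is a sketch: `env_report` is a hypothetical helper name, and it only touches torch's CUDA check if torch itself imports.

```python
import importlib

def env_report():
    """Try importing each package this thread cares about and report the results."""
    report = {}
    for name in ("torch", "xarray", "netCDF4"):
        try:
            importlib.import_module(name)
            report[name] = True
        except Exception:
            # A broken backend can surface as various errors, not only ImportError.
            report[name] = False
    # Only ask torch about CUDA if torch is importable at all.
    report["cuda"] = report["torch"] and importlib.import_module("torch").cuda.is_available()
    return report

print(env_report())
```

Running this in each candidate environment makes it easy to compare them side by side.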
I’ve been able to solve for an environment which works with netCDF4 and for which `torch.cuda.is_available()` returns True.
Unfortunately I had to resort to using a Pixi solver (not the standard conda way of doing things), so extracting the solution and porting it into analysis3 is not going to be super straightforward, but we are at least in a situation where we have a working environment that allows you to use PyTorch with CUDA support alongside xarray.
I’ll update once we’ve got it into analysis3 - hopefully shouldn’t be more than a couple of days work.
I’ve managed to get an environment built, but due to some solver issues I’ve had to remove tensorflow. It’s not that the solver can’t actually find a solution; the extra CUDA library dependencies that tensorflow requires blow out the solve time, so the solve job times out.
AFAIK, tensorflow has largely been superseded by PyTorch nowadays - is this right, & would the ML community generally be happy with us replacing tensorflow with torch?
If people are going to need to keep it hanging around, we definitely can get them both into the same environment, but it’ll take a bit longer as we’ll need to figure out how to prime the solver to speed things up sufficiently.
Many researchers I know, including myself, use PyTorch. How about keeping tensorflow out for now, and if others find it necessary for their work and request it, you could look into getting both in?