The nccompress package has been written to facilitate compressing netCDF files. Although nccompress can work on single files, it is particularly useful for compressing all uncompressed files under whole directory trees. This allows users to compress files regularly using the same script each time.
The nccompress package consists of three Python programs: ncfind, nc2nc and nccompress. nc2nc can copy netCDF files with compression and an optimised chunking strategy that has reasonable performance for many datasets. It has two main limitations: it is slower than some other programs, and it can only compress netCDF3 or netCDF4 classic format. There is more detail in the following sections.
The convenience utility ncvarinfo is also included, and though it has no direct relevance to compression, it is a convenient way to get a summary of the contents of a netCDF file.
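As a rough illustration of what locating netCDF files under a directory tree involves, here is a minimal sketch that walks a tree and classifies files by their leading magic bytes. It is not ncfind's implementation, and the names `netcdf_kind` and `find_netcdf` are hypothetical. It only identifies the on-disk format: classic (netCDF3) files are never compressed, while deciding whether an HDF5-based netCDF4 file is already deflated needs a netCDF or HDF5 library.

```python
import os

# Leading bytes of the common netCDF on-disk formats.
CLASSIC_MAGIC = (b"CDF\x01", b"CDF\x02", b"CDF\x05")  # classic, 64-bit offset, CDF-5
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"  # netCDF4 files are HDF5 files


def netcdf_kind(path):
    """Return 'classic', 'hdf5' or None based on the first bytes of the file.

    Assumption: the signature sits at offset 0, which is the usual case.
    Any HDF5 file matches 'hdf5', not only netCDF4 ones.
    """
    with open(path, "rb") as f:
        head = f.read(8)
    if head[:4] in CLASSIC_MAGIC:
        return "classic"
    if head == HDF5_MAGIC:
        return "hdf5"
    return None


def find_netcdf(root):
    """Yield (path, kind) for every file under root that looks like netCDF."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            kind = netcdf_kind(path)
            if kind is not None:
                yield path, kind
```

Something like `for path, kind in find_netcdf("."): print(kind, path)` would then list the candidates for compression.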
For those following along at home: can I find this on gadi (and if so, where?) or do I need to install it myself?
Aidan
(Aidan Heerdegen, ACCESS-NRI Release Team Lead)
I think an install into a virtual environment is your best bet. The code is not currently maintained, so I can’t see a use case to put it into a shared conda environment until that status changes.
Is there a NetCDF compression tool equivalent to nccompress in the new xp65 space? I was routinely running it on my files but I can see it is no longer available. I have no idea what installing to a virtual environment means. I know Python has compression options, but I also use other languages to work with NetCDF files. Thanks
Aidan
(Aidan Heerdegen, ACCESS-NRI Release Team Lead)
I am glad you found nccompress useful @aukkola. Assuming that is what you’re referring to?
It hasn’t had substantial changes in over 6 years, so I can’t guarantee it is still fit for purpose.
There may well now be better options available, e.g. NCO or CDO (though the CDO documentation seems to be publicly inaccessible currently).
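For a single file, both tools can write a deflated copy. A minimal sketch of driving them from Python follows, assuming ncks (NCO) or cdo is on the PATH. The helper names `compress_command` and `compress` and the file names are hypothetical; the flags are the tools' own (`-4` and `-L` for ncks, `-f nc4` and `-z zip_N` for cdo). Neither does the directory-tree handling that nccompress provides.

```python
import shutil
import subprocess


def compress_command(tool, src, dst, level=5):
    """Build a command line that copies src to dst with deflate compression."""
    if tool == "ncks":
        # NCO: -4 writes netCDF4 output, -L sets the deflate level.
        return ["ncks", "-4", "-L", str(level), src, dst]
    if tool == "cdo":
        # CDO: -f nc4 selects netCDF4 output, -z zip_N sets the deflate level.
        return ["cdo", "-f", "nc4", "-z", f"zip_{level}", "copy", src, dst]
    raise ValueError(f"unsupported tool: {tool}")


def compress(tool, src, dst, level=5):
    """Run the command if the tool is on PATH; return True if it was run."""
    if shutil.which(tool) is None:
        return False
    subprocess.run(compress_command(tool, src, dst, level), check=True)
    return True
```

For example, `compress("ncks", "in.nc", "out.nc")` writes a compressed copy to `out.nc` and leaves `in.nc` untouched.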
Thanks Aidan. It’s a shame nccompress is no longer maintained; it was such a no-brainer to run it on files to save storage space. I’ll see if I can find something similar using NCO/CDO.
@Aidan and @aukkola I also regularly used nccompress until the hh5 environment was decommissioned. If there’s any chance it could be added to xp65, that would be great. I’ve got a bunch of files I need to compress (should’ve done it properly when I created them…), and nccompress would’ve worked really well.
Aidan
(Aidan Heerdegen, ACCESS-NRI Release Team Lead)
I think there is very little likelihood of it being added to the conda/analysis environments. However, you can load that environment and then install nccompress into a virtual environment.
Following the most excellent instructions from the CLEX CMS blog, this worked for me:
$ module use /g/data/xp65/public/modules/
$ module load conda/analysis3
Loading conda/analysis3-26.03
Loading requirement: singularity
$ python3 -m venv nccompress --system-site-packages
$ source nccompress/bin/activate
((nccompress) ) $ pip install git+https://github.com/coecms/nccompress.git@master
Collecting git+https://github.com/coecms/nccompress.git@master
Cloning https://github.com/coecms/nccompress.git (to revision master) to /scratch/tm70/aph502/tmp/pip-req-build-v7l7uii7
Running command git clone --filter=blob:none --quiet https://github.com/coecms/nccompress.git /scratch/tm70/aph502/tmp/pip-req-build-v7l7uii7
Resolved https://github.com/coecms/nccompress.git to commit ce3f3b67be30b6a6838ff1b3c936f9cb537b45ec
Preparing metadata (setup.py) ... done
Building wheels for collected packages: nccompress
Building wheel for nccompress (setup.py) ... done
Created wheel for nccompress: filename=nccompress-0.2.2.dev1-py3-none-any.whl size=27807 sha256=70f60700638be8e60682d189b385e296d8483af8a72976d0a2444fb7f323588c
Stored in directory: /scratch/tm70/aph502/tmp/pip-ephem-wheel-cache-mbmghn2e/wheels/8b/c8/7a/9ce85d0a8ece792f8348bd0aa343903e30f23359bf5692a890
Successfully built nccompress
Installing collected packages: nccompress
Successfully installed nccompress-0.2.2.dev1
[notice] A new release of pip is available: 25.0.1 -> 26.0.1
[notice] To update, run: pip3.12 install --upgrade pip
((nccompress) ) $ which nccompress
~/nccompress/bin/nccompress
((nccompress) ) $ nccompress -h
usage: nccompress [-h] [-d {1-9}] [-n] [-s CHUNKSIZE] [-b BUFFERSIZE] [-t TMPDIR] [-v] [-r] [-o] [-m MAXCOMPRESS] [-p] [-f] [-c] [-pa] [-np NUMPROC]
[-ff FROMFILE] [--nccopy] [--timing]
[inputs ...]
Run nc2nc (or nccopy) on a number of netCDF files
positional arguments:
inputs netCDF files or directories (-r must be specified to recursively descend directories). Can accept piped arguments.
options:
-h, --help show this help message and exit
-d {1-9}, --dlevel {1-9}
Set deflate level. Valid values 0-9 (default=5)
-n, --noshuffle Don't shuffle on deflation (default is to shuffle)
-s CHUNKSIZE, --chunksize CHUNKSIZE
Set chunksize - total size of one chunk in KiB (default=64), nc2nc only
-b BUFFERSIZE, --buffersize BUFFERSIZE
Set size of copy buffer in MiB (default=500), nc2nc only
-t TMPDIR, --tmpdir TMPDIR
Specify temporary directory to save compressed files
-v, --verbose Verbose output
-r, --recursive Recursively descend directories compressing all netCDF files (default False)
-o, --overwrite Overwrite original files with compressed versions (default is to not overwrite)
-m MAXCOMPRESS, --maxcompress MAXCOMPRESS
Set a maximum compression as a paranoid check on success of nccopy (default is 10, set to zero for no check)
-p, --paranoid Paranoid check : run nco ndiff on the resulting file ensure no data has been altered
-f, --force Force compression, even if input file is already compressed (default False)
-c, --clean Clean tmpdir by removing existing compressed files before starting (default False)
-pa, --parallel Compress files in parallel
-np NUMPROC, --numproc NUMPROC
Specify the number of processes to use in parallel operation
-ff FROMFILE, --fromfile FROMFILE
Read files to be compressed from a text file
--nccopy Use nccopy instead of nc2nc (default False)
--timing Collect timing statistics when compressing each file (default False)
((nccompress) ) $ deactivate
$
DO THIS AT YOUR OWN RISK!
There are absolutely no guarantees this still works, so definitely do a test before you point it at your precious research outputs.
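One way to do such a test is to run nccompress on copies of a few files first. The sketch below stages copies in a fresh scratch directory and builds an nccompress command line from flags listed in the help output above (`-r`, `-v`, `-p`, `-d`). The helper names `stage_trial` and `trial_command` are hypothetical, and `-p` is described in the help as running an NCO comparison, so it presumably needs NCO to be available.

```python
import shutil
import tempfile
from pathlib import Path


def stage_trial(files, scratch=None):
    """Copy files into a fresh scratch directory; return (directory, copies).

    Files that share a base name would overwrite each other, so pick
    samples with distinct names.
    """
    trial_dir = Path(tempfile.mkdtemp(prefix="nccompress_trial_", dir=scratch))
    copies = [Path(shutil.copy2(f, trial_dir)) for f in files]
    return trial_dir, copies


def trial_command(trial_dir, level=5):
    """nccompress invocation for the trial directory.

    -r recurse into the directory, -v verbose, -p paranoid check,
    -d deflate level. Without -o the originals are not overwritten,
    according to the help text.
    """
    return ["nccompress", "-r", "-v", "-p", "-d", str(level), str(trial_dir)]
```

The returned list can be passed to `subprocess.run(..., check=True)` inside the activated virtual environment, and the compressed copies inspected before anything is run against the real data.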