In a PBS job script, I run a number of binary comparisons between netcdf files using the nccmp command (see below). The sizes of the netcdf files being compared range from 15M to 92M. The script executes the binary comparisons successfully, but the peak memory usage reported by PBS for the job far exceeds the typical memory usage of running nccmp on any of the netcdf files alone.
For example, running nccmp on two files of size 92M uses about 230M of memory. Running the script with 52 comparisons uses about 5G of memory, and with 332 comparisons about 30G. (Note: the memory usage quoted here is the peak memory usage reported by PBS, i.e. resources_used.mem.)
Does anyone have any idea why the peak memory usage scales with the number of nccmp comparisons executed in the script?
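For reference, the per-comparison figure can be checked in isolation with something like the sketch below (this assumes GNU time is available at /usr/bin/time; the file names are placeholders, not the actual files from the run):

# Report the peak resident memory of a single nccmp invocation (sketch only).
# example_R0_file.nc and example_R1_file.nc are placeholder names.
/usr/bin/time -v nccmp -df more_outputs/example_R0_file.nc more_outputs/example_R1_file.nc 2>&1 | grep "Maximum resident set size"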
Here is the script used to execute binary comparisons:
#!/bin/bash
#PBS -l wd
#PBS -l ncpus=1
#PBS -l mem=32GB
#PBS -l walltime=1:00:00
#PBS -P tm70
#PBS -j oe
#PBS -m e
#PBS -l storage=gdata/hh5+scratch/tm70

module purge
module use /g/data/hh5/public/modules
module load conda

output_dir=more_outputs # path to netcdf files

# Collect the paired R0 and R1 files.
R0_files=($output_dir/*_R0_*)
R1_files=($output_dir/*_R1_*)

if [ ${#R0_files[@]} -ne ${#R1_files[@]} ]; then
    echo "Error: number of R0 files unequal to number of R1 files."
    exit 1
fi

# Run nccmp on each R0/R1 pair of files.
for ((i=0; i<${#R0_files[@]}; i++)); do
    echo "nccmp -df ${R0_files[i]} ${R1_files[i]}"
    nccmp -df "${R0_files[i]}" "${R1_files[i]}"
done
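One way to narrow down where the growth comes from would be to log the job's reported usage after each comparison, along the lines of the sketch below (this assumes qstat can be run from inside the job and that resources_used.mem is refreshed while the job is running):

# Hypothetical diagnostic variant of the loop: print the job's reported
# memory usage after each comparison.
for ((i=0; i<${#R0_files[@]}; i++)); do
    nccmp -df "${R0_files[i]}" "${R1_files[i]}"
    qstat -f "$PBS_JOBID" | grep -E 'resources_used\.(mem|vmem)'
done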
To reproduce:
The script and netcdf files used are accessible on NCI: /scratch/public/sb8430/memory-issue.
To run the script, simply submit it to the PBS scheduler:
qsub compare_files_serial.pbs
There are three directories containing varying numbers of netcdf files: outputs, more_outputs and many_outputs. To run the largest number of comparisons (332), set output_dir=many_outputs in the script.
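As an aside (not how the script above is written, just a possible convenience), the directory could instead be passed at submission time with qsub -v and given a default inside the script:

# Hypothetical alternative: choose the output directory at submission time.
qsub -v output_dir=many_outputs compare_files_serial.pbs

# ...and in the script, fall back to more_outputs if the variable is not set:
output_dir=${output_dir:-more_outputs}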