NCI quarantine: recover several files with a specific name pattern

The problem

Someone had simulation files that ended up in quarantine on NCI and wanted to recover them. All the files that were needed matched the pattern *.ice_daily.nc. The nci-file-expiry tool has a batch recovery option, but it expects a text file that lists every file to recover in a specific format.

Solution

@dale.roberts suggested feeding the output of nci-file-expiry back into itself: the tool's own listing of quarantined files, filtered by a user-supplied pattern, produces the file list that batch recovery needs. You can put this function in your ~/.bashrc file:

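function recover_pattern () {
    # List quarantined files, keep the lines matching the regex in $1,
    # and hand "uuid path" pairs to batch-recover as if they were a file.
    nci-file-expiry batch-recover <( while read -r uuid a b c d path; do echo "$uuid $path"; done < <( nci-file-expiry list-quarantined | grep -- "${1}" ) )
}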

Then to use it in the current example:

$ recover_pattern .ice_daily.nc

Hi all. I’ve written up a quick CMS blog post detailing some of the bash tricks this function uses: “Quick tip - Feed nci-file-expiry output back into itself” on the CLEX CMS Blog.
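
The field-splitting part of the function can be tried on its own, without touching the quarantine. This is only a sketch: to_batch_lines is a hypothetical helper name, and the sample line is made up, so the real list-quarantined columns may look different. It assumes, as the function above does, that the uuid is the first column and the path is the sixth and last.

```shell
# Hypothetical helper: reads listing lines on stdin, prints "uuid path".
# Columns 2-5 go into throwaway variables; the last variable (path)
# receives the rest of the line, so a path containing spaces stays whole.
to_batch_lines () {
    while read -r uuid a b c d path; do
        echo "$uuid $path"
    done
}

# Made-up sample line, not real nci-file-expiry output.
sample='1234-abcd 2024-01-01 user proj 10G /scratch/xx00/run1/ocean.ice_daily.nc'
result=$(printf '%s\n' "$sample" | to_batch_lines)
echo "$result"
```

In recover_pattern the same loop sits inside <( ... ), which bash presents to batch-recover as a readable file name, so no temporary file is needed.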


Just a minor thing.
To avoid matching files that you don’t want (a . means “any character” in a regex), I would either use grep -F, which stops grep from treating the pattern as a regex, or add a note to the CMS blog post so that users know to escape the right characters in the function argument. In the case above, if someone only wants the *.ice_daily.nc pattern, it should be:

$ recover_pattern '\.ice_daily\.nc'
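
A quick way to see the difference, sketched with made-up file names (only the first one really contains .ice_daily.nc):

```shell
# Made-up paths for illustration only.
files='/scratch/xx00/run1/ocean.ice_daily.nc
/scratch/xx00/run1/ocean_ice_daily_nc.txt'

# Unescaped: each . matches any character, so both lines match.
regex_hits=$(printf '%s\n' "$files" | grep -c '.ice_daily.nc')
# -F treats the pattern as a fixed string, so only the first line matches.
fixed_hits=$(printf '%s\n' "$files" | grep -cF '.ice_daily.nc')
# Escaping the dots gives the same result as -F here.
escaped_hits=$(printf '%s\n' "$files" | grep -c '\.ice_daily\.nc')

echo "$regex_hits $fixed_hits $escaped_hits"
```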

@atteggiani, yes of course. Strictly speaking, even with the .'s escaped it’d over-match, as the $ required to match EOL isn’t included either. I don’t think that’s a big deal in this case: accidentally recalling too many files isn’t really a problem. Not escaping the .'s was a deliberate decision to make the command example more human-readable and so keep the blog post accessible to a wider audience.
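
For anyone who does want the strict match, here is a small sketch of what the trailing $ changes. The file names are made up, and it relies on the path being the last thing on each line, which is what the function assumes about the list-quarantined output.

```shell
# Made-up paths: the second one has an extra suffix after .ice_daily.nc
files='/scratch/xx00/run1/ocean.ice_daily.nc
/scratch/xx00/run1/ocean.ice_daily.nc.bak'

# Without the anchor, the pattern may match anywhere in the line: both hit.
unanchored=$(printf '%s\n' "$files" | grep -c '\.ice_daily\.nc')
# With $, the pattern must sit at the end of the line: only the first hits.
anchored=$(printf '%s\n' "$files" | grep -c '\.ice_daily\.nc$')

echo "$unanchored $anchored"
```

Note that the $ anchor is regex syntax, so it cannot be combined with grep -F.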
