Command-line utility: pbs-workbench; your own personal ARE

PS: I will be presenting this today at the Research Software Community Meeting: RSE Announce - #8 by paocorrales

After having a bunch of problems with ARE sessions I decided to stop using them. But because I still need to spin up a job on a node to run analysis code interactively, I created this small bash tool.

The command

job start

submits a job to the PBS queue that just sits there doing nothing, but then you can ssh into the node and work. It also has a cute monitoring system that tells you whether the job is running and how much time you’ve got left, and gives you quick copy-paste commands to interact with it.
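
For the curious, here is a minimal sketch of what such a “do nothing” jobscript could look like (the resource values and job name are placeholders, not the tool’s actual defaults):

    #!/bin/bash
    #PBS -N workbench
    #PBS -q normal
    #PBS -l ncpus=4
    #PBS -l mem=16GB
    #PBS -l walltime=04:00:00
    #PBS -j oe

    # Keep the allocation alive so you can ssh into the node and work;
    # PBS kills the job once the walltime runs out.
    echo "Workbench running on $(hostname)"
    sleep infinity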

It works well with VS Code and its forks, and also with Jupyter notebooks.
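
For VS Code and friends, one way to make the connection work is plain Remote-SSH with a ProxyJump through the login node. The ~/.ssh/config below is only illustrative; the username and host patterns are placeholders for whatever your setup uses:

    # Example only: adjust the username and host patterns to your setup.
    Host gadi
        HostName gadi.nci.org.au
        User ab1234

    # Reach compute nodes by jumping through the login node.
    Host gadi-cpu-* gadi-gpu-*
        User ab1234
        ProxyJump gadi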

The script supports creating “profiles”, so you can define your own “big”, “small”, “bigmem” or whatever other profile you want, and then spin up your job with job start <name of profile>.
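
Conceptually, a profile is just a named bundle of PBS resource requests. The file below is illustrative only and not the exact format the script uses:

    # Illustrative "bigmem" profile: a named bundle of PBS resource requests.
    queue=hugemem
    ncpus=12
    mem=190GB
    walltime=08:00:00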

I tried to make installation as straightforward as possible: you clone my repo and then run the install script.
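
Something along these lines (the repository URL and script name here are assumptions; check the repo README for the exact steps):

    git clone https://github.com/eliocamp/pbs-workbench.git   # assumed URL
    cd pbs-workbench
    ./install.sh                                               # assumed script name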

Let me know if anyone finds this useful and what other features you would like implemented. Right now the tool only supports one running job that is “global”, but I was thinking that maybe it would be better to use one job per project.


Thank you for sharing this tool @eliocamp!

I think being able to quickly spin up a job on a compute node and connect to it through VS Code would be very useful to many people in the community.

I created a similar script years ago following this (fairly old) discussion as I was getting frustrated with ARE, and @abhaasgoyal might also have done something similar based on my script.

In general, I see such a feature being highly requested by people in the community, as multiple forum topics confirm: VSCode Extension to run ARE session, Opening the ARE jupyterlab session on VS code, Working with Jupyter notebooks on gadi/ARE via VS Code.

What are people’s thoughts on creating a VS Code extension (some old details in the link above) to spin up a compute job and connect to it?
We might need to coordinate with NCI on this too, but I would be happy to help.

Cheers
Davide

We talked about this a bit, but I think the idea of having a standard “API” that other tools can hook into could be a good solution. Then anyone could create VS Code extensions, local bash commands, or whatever else (does Jupyter have extensions?) using those standard endpoints.

One of the reasons I like the script running on gadi is that I don’t need to worry about supporting multiple operating systems and IDEs.

I don’t know exactly what that would look like. Maybe it’s a standard module that you can load and all the interaction is done with ssh commands.
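
Purely as a hypothetical sketch (the module name, commands and flags below are all invented for illustration), a local front end (an editor extension, a script, anything) would then only ever need to run ssh commands like:

    # Hypothetical: the "workbench" module and its subcommands are invented for illustration.
    ssh gadi "module load workbench && workbench start --profile big"
    ssh gadi "module load workbench && workbench list"
    ssh gadi "module load workbench && workbench info --field node-address"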


I think the idea of a standard API is very good. As you mentioned, the API could be used by any IDE extension, or as a CLI from a terminal.

I would say the first step here is to get in contact with NCI to understand whether there is scope for them to collaborate, and to see how we can help with that, because I think this is something that should be managed (or at least approved) by NCI in the first place.

A good start would be to compile a list of all these custom solutions and requests, to show that the community really needs this.

  1. pbs-workbench (this tool)
  2. The previous Centre of Excellence for extreme weather used to have a tool for this.
  3. Discussion here, including manual steps.
  4. Requests from researchers:
    1. Working with Jupyter notebooks on gadi/ARE via VS Code.
    2. https://forum.access-hive.org.au/t/opening-the-are-jupyterlab-session-on-vs-code/22

I believe @sam.green creates an interactive job and sshes into it. We could gather “testimonials” of other researchers’ custom solutions.
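
For reference, the fully manual version of that workflow looks roughly like this (resource values are examples only):

    # Request an interactive PBS job; you get a shell on the allocated node.
    qsub -I -q normal -l ncpus=4 -l mem=16GB -l walltime=04:00:00
    # From another terminal you can then ssh into the node the job landed on.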

Secondly, describe exactly what is missing from the ARE workflow. Some ideas are:

  1. IDEs are limited to RStudio and Jupyter notebooks.
  2. The virtual desktop is very slow compared with native/local GUIs.
  3. Hardcoded modules are loaded and cannot be changed after spin-up (that is true for RStudio and Jupyter; not sure about the VDI).
    1. If in the middle of an analysis you realise you need a new module, you have to close everything, add the module and spin the whole thing up again.
    2. This also makes it hard to run the same code interactively in ARE and via the terminal or in queued jobs.
  4. It relies on an external service (what happens if ARE is down?).

Third, specify what “endpoints” we would need (a rough command-line sketch follows the list).

  1. Start a workbench with particular resources/profile for a particular amount of time.
  2. List running workbenches.
  3. Get information from a workbench:
    1. Resources
    2. Time spent
    3. Time left
    4. Node address
    5. SU cost?
    6. … Something else?
  4. Stop a workbench
  5. Save “profiles” for particular configurations (similar to ARE’s ability to save settings).
  6. Modify existing profiles.
  7. …Something else?

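As a rough sketch of that command-line surface (every command name and flag below is invented purely for illustration):

    workbench start --profile bigmem --walltime 04:00:00   # 1. start with a profile
    workbench list                                         # 2. list running workbenches
    workbench info <job-id>                                # 3. resources, time spent/left, node address, SU cost
    workbench stop <job-id>                                # 4. stop a workbench
    workbench profile save bigmem --ncpus 12 --mem 190GB   # 5. save a profile
    workbench profile edit bigmem --walltime 08:00:00      # 6. modify an existing profile
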
For troubleshooting, it might be useful to get easy access to the logs.


Nice, Positron here we come!
