Stats capture should run as a service user

The scripts for stats capture at NCI are contained in these two repos

and are run via the Jenkins instance:

dump_stats

queue_stats

queue_stats captures PBS job information hourly.

dumpstats captures lquota, nci_account and nci-files-report output daily. The projects and data captured are governed by the configuration file

NCI only allows members of a project to report statistics for that project. Currently the Jenkins jobs run on my (@Aidan) account. This means I have to be a member of every project that wants to track its analytics in this way.

We should change this and make it run under a service user account. It’s not long-term tenable for me to be a member of all these groups, and handing over to another personal account every time someone shifts role is not tenable either.

There are already some access related service accounts:

  • access
  • access_nfs
  • accesstester

A service user would have to join all the requisite accounts, but adding a heap of accounts to a service user has potential to be problematic if we don’t know exactly who has access to them and what they’re using them for.

Adding new accounts might be as simple as inviting the service user to join and then making a PR against the config file above. If we wanted this to be as simple as possible, the config file should probably be generated from a simpler list of projects, as the correct resources to request in the config file are discoverable.
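As a rough illustration of generating the config from a plain project list, here is a minimal Python sketch. The list format (one project code per line, `#` comments), the output schema and the `discover` callback are all assumptions made for illustration; they are not the format of the existing config file.

```python
import json


def parse_project_list(text):
    """Return unique project codes from a one-per-line list, ignoring blanks and # comments."""
    projects = []
    for line in text.splitlines():
        code = line.split("#", 1)[0].strip()
        if code and code not in projects:
            projects.append(code)
    return projects


def build_config(projects, discover):
    """Build a config mapping of project -> settings.

    `discover(project)` is a hypothetical stand-in for whatever lookup
    finds the resources that should be requested for a project.
    """
    return {project: discover(project) for project in projects}


if __name__ == "__main__":
    example = "p66  # common project\n\np66\n"
    # Placeholder discovery: a real version would query the system instead.
    config = build_config(parse_project_list(example), lambda project: {"resources": []})
    print(json.dumps(config, indent=2))
```

With something like this, adding a project would be a one-line change to the list, and the generated file would never need editing by hand.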

Part of the original motivation for accesstester was to have an account that wasn’t a member of common projects like p66 so that we could test that jobs ran successfully for a general user. Being a member of multiple projects would prevent this catching problems.

Agree with this 100%. The biggest issue with running a service as a user account is that, should the service be compromised, the bad actor will have the same level of privilege (e.g. sudo/root access) as the user account. Best practice for service accounts is full separation of concerns: every service gets its own service account. Ideally a new account would be created for this purpose that has access to only the projects it needs.

Access to service users is usually controlled by ‘sudo’. A properly configured service user generally does not have a password set, nor does it have any SSH authorised keys. So you would have some group (e.g. tg8 on accessdev) whose members can access the service user by virtue of having admin access to the host.

You can also place restrictions on the SSH keys such that the service user can only run certain commands, or can’t launch interactive sessions and things like that.
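To make the SSH key restriction concrete, here is a minimal sketch of a forced-command wrapper in Python. It assumes the key is configured in `authorized_keys` with OpenSSH's `restrict` and `command="..."` options pointing at this script, in which case sshd passes the command the client asked for in the `SSH_ORIGINAL_COMMAND` environment variable. The allow-list just reuses the three tools mentioned above and is an assumption, not an existing setup.

```python
import os
import shlex
import sys

# Assumed allow-list: the only programs this key may run.
ALLOWED = {"lquota", "nci_account", "nci-files-report"}


def is_allowed(command, allowed=ALLOWED):
    """True if the requested command line starts with an allow-listed program."""
    try:
        argv = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar malformed input are rejected.
        return False
    return bool(argv) and argv[0] in allowed


def main():
    # sshd sets SSH_ORIGINAL_COMMAND when a forced command is in effect.
    requested = os.environ.get("SSH_ORIGINAL_COMMAND", "")
    if not is_allowed(requested):
        print("command not permitted", file=sys.stderr)
        return 1
    argv = shlex.split(requested)
    # Replace this process with the requested program; no shell is involved.
    os.execvp(argv[0], argv)


if __name__ == "__main__":
    sys.exit(main())
```

Because the command is split with `shlex` and executed directly rather than through a shell, something like `lquota; rm` is rejected instead of being interpreted as two commands. An interactive login sends no command at all, so it is refused too.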

I think your idea of adding new accounts is good, however, if we follow separation of concerns, the stats capture service user will only be a member of the projects it needs to get data from, so you could have the getter commands enumerate the service user’s supplementary groups and gather stats for all of them. That way adding a new project requires no config changes, just adding the service user to a group.
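A minimal sketch of that group enumeration in Python, assuming each project the service user belongs to shows up as a Unix group on the host. The `skip` parameter is an assumption, there to filter out any groups that are not projects to report on.

```python
import grp
import os


def project_names(gids, lookup=grp.getgrgid, skip=()):
    """Map numeric group IDs to sorted group names, dropping unresolvable or skipped ones."""
    names = set()
    for gid in gids:
        try:
            name = lookup(gid).gr_name
        except KeyError:
            # The group ID has no entry in the group database.
            continue
        if name not in skip:
            names.add(name)
    return sorted(names)


def current_projects(skip=()):
    """Group names for the current process: supplementary groups plus the primary group."""
    return project_names(set(os.getgroups()) | {os.getgid()}, skip=skip)


if __name__ == "__main__":
    for project in current_projects():
        print(project)
```

Run as the service user, the getter commands could loop over this output instead of reading a project list from the config file.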

Thanks for the background @MartinDix, that is good to know. In the same vein, out of interest, what is the access service intended to be/actually used for?

Agreed. I had the same thought, but assumed we couldn’t be sure the groups the service user belonged to would only be for this purpose. It seems that this is actually the desired outcome, to limit service users to a single service, so yeah, absolutely, sounds great.

So, I guess that means we should ask for an access_stats user to be created?

Another option would be to ask NCI to make the statistics from nci_account etc. visible by a web api, e.g. through mancini. That way you don’t need a service user with filesystem access to a whole bunch of projects.

We enquired about this a couple of months ago and NCI were interested but didn’t have the resources at the time. This would be worth following up.


Yeah, given recent departures I don’t think that will be happening any time soon. nci_account is already a WSGI web API that can only be authenticated to from inside Gadi. There are two pain points for NCI in making that an external service. First, authentication: they’d need some self-serve JWT generating service to allow non-interactive usage. Second, they’d need to figure out where to put it. The database that the accounting data lives on and the nci_account WSGI server are both internal to Gadi, and exposing truly internal things like that in a safe way is not something they’ve solved yet. I tried for a while to get them to think about it while I worked there so we could publish the web-facing apps stuff I developed, but it never progressed. I think that if this is something that you want to get done in the next 6-12 months, it’ll have to be with the tools that already exist.


Way back on the BOM/CSIRO shared system we had a shared access account for various admin things, and we probably requested the same when we moved to NCI. However, I don’t think it’s used for anything now. The access and access.admin groups really cover the original uses.