Looks like accessdev is down again. We were trying to add your umami package to conda and I tried to connect to Jenkins to see if it succeeded.
Which reminds me that we haven’t discussed a long-term plan for Jenkins once accessdev is gone.
httpd is not responding. It appears to be stuck trying to access some files on gdata that I’m not sure it should have access to. /etc/httpd/conf.d/25-accessdev.conf was modified at 3:29am this morning, probably by puppet. There are a huge number of errors in /var/log/httpd/error_log starting at 3:29am about processes not being able to be killed.
I think this part of the config is the problem:
<Directory "/home/*/*/public_html" >
Order Deny,Allow
Deny from all
Allow from 134.178 124.47.137.7 124.47.159.2
AuthName "accessdev"
Options Indexes SymlinksIfOwnerMatch IncludesNOEXEC
Satisfy Any
AuthType Basic
AuthBasicProvider ldap
AuthLDAPURL "ldaps://ldap.nci.org.au/ou=People,dc=apac,dc=edu,dc=au?uid"
AuthLDAPGroupAttribute "memberUid"
AuthLDAPGroupAttributeIsDN off
Require ldap-group cn=access,ou=Group,dc=apac,dc=edu,dc=au
</Directory>
This is matching with /home/548/sjr548/public_html/, which has symlinks into /g/data. root cannot read from /g/data due to the NFS root-squash option. Can whoever did this please revert the config?
I take that back, I think the config above has been there for a while. Might be a gdata NFS problem.
I put in a change yesterday for the auth of rose-bush, I’ll investigate
httpd failed to restart properly. The old processes are still present and not killable with sudo kill -9. They’re hanging on to port 443, so a new process can’t start.
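A process that ignores kill -9 is usually in uninterruptible sleep (state D), blocked inside the kernel, and the signal is not acted on until that call returns. Below is a minimal Python sketch for listing such processes, assuming a Linux /proc layout and that the process name is exactly httpd. The helper names parse_stat_line and uninterruptible are made up for illustration and are not something that was run on accessdev.

```python
import os

def parse_stat_line(line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split around the first '(' and the last ')'.
    lparen = line.index("(")
    rparen = line.rindex(")")
    pid = int(line[:lparen].strip())
    comm = line[lparen + 1:rparen]
    state = line[rparen + 1:].split()[0]
    return pid, comm, state

def uninterruptible(name="httpd", proc="/proc"):
    """List PIDs of processes called `name` that are in state D."""
    stuck = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state = parse_stat_line(f.read())
        except OSError:
            continue  # process exited while we were scanning
        if comm == name and state == "D":
            stuck.append(pid)
    return sorted(stuck)

if __name__ == "__main__":
    print(uninterruptible())
```

If this prints the old httpd PIDs, that would be consistent with them being blocked in the kernel rather than ignoring the signal.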
Yep, can see that. Tried remounting the /g/data directories containing files opened by stuck processes, no luck there unfortunately.
I think our culprit is pid 3082, it has a different stack than the rest of the httpd processes. It’s stuck in nfs_idmap_id, and the rest are waiting for it to finish. Config change is a red herring, these processes seem to have been stuck since 9:44am.
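For reference, one way to spot the odd process out is to group the stuck PIDs by the innermost frame of their kernel stack. This is a rough Python sketch, assuming /proc/<pid>/stack is readable (normally root only) and that each line has the form "[<address>] function+offset/size". The helper names are hypothetical, and only nfs_idmap_id comes from the actual investigation.

```python
import os
from collections import defaultdict

def top_frame(stack_text):
    """Return the function name of the innermost frame, or None if empty."""
    for line in stack_text.splitlines():
        parts = line.split()
        # Assumed line shape: "[<address>] function+offset/size"
        if len(parts) >= 2:
            return parts[1].split("+")[0]
    return None

def group_by_top_frame(stacks):
    """Map innermost function name -> sorted list of PIDs."""
    groups = defaultdict(list)
    for pid, text in stacks.items():
        groups[top_frame(text)].append(pid)
    return {fn: sorted(pids) for fn, pids in groups.items()}

def read_stacks(pids, proc="/proc"):
    """Read the kernel stack text for each PID, skipping unreadable ones."""
    stacks = {}
    for pid in pids:
        try:
            with open(os.path.join(proc, str(pid), "stack")) as f:
                stacks[pid] = f.read()
        except OSError:
            continue  # process gone, or not running as root
    return stacks
```

Calling group_by_top_frame(read_stacks(pids)) on the stuck httpd PIDs should leave any process blocked somewhere different from the rest in a group of its own, which makes it the first one to look at.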
Is there anything we can do here, or do we need to restart the server?
Nope, needs a reboot.
Ok, I’ll send out a notice and schedule the reboot
It’s being slow to come back online
Closed for archival.