from what I can tell, the middleware bug has something to do with the contents of /dev changing while a periodic cleanup script is running, which would explain why it's such a rare edge case.
looking through the logs, it might have been an HBA hiccup, since it did complain about something on /dev/da1, but it's hard to line up the timing because I don't know exactly when the script started.
Comments
Graham Sutherland / Polynomial
I just found the actual answer to this. /etc/periodic/security/ has two periodic scripts: 100.chksetuid and 110.neggrpperm
by default (/etc/defaults/periodic.conf) these are enabled and configured to run daily. they use `find` to scan your system for insecure setuid files and for files with negative group permissions.
the problem is that this gets run *per jail* and if the jails mount large datasets it eats a ton of CPU time for several hours at a time.
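for a rough feel of what each jail is doing, here's a simplified stand-in for the setuid scan. this is only a sketch, not the actual contents of 100.chksetuid (the real script does more, like comparing against the previous run), and `scan_setuid` is a made-up helper name:

```shell
# Simplified stand-in for the setuid scan; NOT the real 100.chksetuid.
# scan_setuid is a hypothetical helper name used only for illustration.
scan_setuid() {
    # -xdev: stay on the filesystem of the starting directory
    # -perm -u+s / -perm -g+s: match files with the setuid or setgid bit set
    find "$1" -xdev -type f \( -perm -u+s -o -perm -g+s \) -print
}
```

pointing something like `scan_setuid /mnt/some-dataset` at a big dataset walks every file in it, which is why the cost scales with dataset size and gets multiplied by the number of jails that mount it.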
the bigger problem is that if one of these scans ever takes over 24h, you'll end up with multiple scans overlapping and competing for CPU/IO, which slows them all down and spirals into resource exhaustion.
these can be turned off system-wide by setting the security_status_chksetuid_enable and security_status_neggrpperm_enable rc.conf vars to NO in the Tunables tab. if you only want to turn this off for specific jails, you can manually add those overrides to /etc/periodic.conf inside each of those jails instead.
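as a sketch, the per-jail override would look something like this, assuming the variable names above match your release (worth checking against /etc/defaults/periodic.conf inside the jail before relying on it):

```shell
# /etc/periodic.conf inside the jail (create the file if it doesn't exist)
# Overrides the defaults from /etc/defaults/periodic.conf for this jail only.
security_status_chksetuid_enable="NO"
security_status_neggrpperm_enable="NO"
```

the file uses plain sh variable assignments, so anything set here takes precedence over the shipped defaults without having to edit /etc/defaults/periodic.conf itself.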
by Graham Sutherland / Polynomial
@gsuberland There are numerous periodic jobs like that. IMO the defaults need updating in this modern world of jails and large ZFS datasets. My favourite solution would be proper OS/FS indexing, so you could trivially query an index for the answer, but that is a lot of work.
by Daniel O'Connor