File system hiccups on /home/hpc and /home/vault


On Friday afternoon at around 5:15 pm, the file system serving /home/hpc and /home/vault had a hiccup and could no longer write metadata. As a result, some of our users received “No space left on device” error messages. Our admins quickly identified the underlying problem: once again, severe file system abuse by some users, who occupied far too many inodes and changed data too rapidly, thereby polluting the file system snapshots.

Everything has been resolved by now, but please check your results. Jobs that tried to write data during the hiccup may have crashed unexpectedly.
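
Since the root cause was excessive inode usage, it may help to see which of your own directories hold the most entries. The following is only a minimal sketch, not an official tool of ours: the function name count_entries is hypothetical, and it approximates inode usage by counting directory entries (every file, subdirectory, and symlink counts once, so hard links are counted more than once).

```python
import os
from collections import Counter


def count_entries(root):
    """Map each directory below root to the number of entries directly inside it.

    Assumption: one directory entry roughly corresponds to one inode.
    Symlinked directories are counted as entries but not descended into.
    """
    counts = Counter()
    for dirpath, dirnames, filenames in os.walk(root):
        counts[dirpath] = len(dirnames) + len(filenames)
    return counts
```

For example, count_entries(os.path.expanduser("~")).most_common(10) would list the ten directories in your home with the most entries. Note that walking a very large tree itself puts metadata load on the file system, so run such a scan sparingly.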