Had a minor emergency today … my main Linux machine hung completely (for no apparent reason at the time).
My first indication that a problem was afoot was when Ginny asked me if there were any problems with our internet connection.
I didn’t think so, but I checked … I could get to DSLReports without a problem … I was even able to run a speed test (with respectable results).
Ginny told me she wasn’t able to get to her blog. So I went over to the actual machines and checked out TheShire to see if something was wrong. Something was. I wasn’t able to log on as root.
Obviously this isn’t good … so I checked out Rivendell and found I couldn’t log on to that system either.
Since TheShire is heavily dependent on Rivendell, I tried to figure out what was happening on Rivendell first.
The disk indicator was pegged … so I knew it was trying to do something … and syslog was working somewhat (it was simply telling me that sendmail was rejecting connections due to system load).
I tried to do a three-finger salute (Ctrl-Alt-Del) … but that didn’t work either. Time to ‘red switch’ it (cycle power, aka shut the damn thing off and turn it back on … that phrase, btw, is aging me … it’s been a long time since systems had red switches on their power supplies).
System rebooted OK … but some processes wouldn’t start.
I checked my available disk space and had none. Zip, zero, nada, nuthing, a big fat ZERO.
Took me a little while to figure out the exact syntax … but I did a search for any files larger than about 100 MB.
find / -type f -size +100000k
This turned up a number of files … but one stood out: a file named ‘z’ in the Mailman logs directory.
Well, earlier I had tried to find out why a person’s email address wasn’t subscribed to some lists … and did a grep, piping the output into ‘z’. I checked the file out and found it was about 11 GB in size. More than enough to fill up the disk.
I deleted the file and everything returned to normal.
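For next time, here is a rough sketch of how the hunt above could be wrapped up so the biggest files float to the top. The function name biggest_files is made up for illustration, and the 20-line cutoff is an arbitrary choice. It assumes a find that accepts the k size suffix (GNU and BSD find do, as in the command I used) and the usual ls -l layout with the size in the fifth column.

```shell
#!/bin/sh
# biggest_files DIR MIN_KB
# Hypothetical helper (not a standard command): list regular files under DIR
# that are larger than MIN_KB kibibytes, largest first.
biggest_files() {
    dir=${1:-/}
    min_kb=${2:-100000}
    # -xdev keeps find on the filesystem DIR lives on, so other mounts
    # are not scanned by accident.
    # ls -l prints the size in column 5; sort on it numerically, descending.
    find "$dir" -xdev -type f -size +"${min_kb}"k -exec ls -l {} + 2>/dev/null |
        sort -k5,5nr |
        head -n 20
}

# Example call (scanning / can take a while, so it is left commented out):
# df -h && biggest_files / 100000
```

Running df -h first shows which filesystem is actually full, which tells you where to point the search.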
Here’s what I learned from today’s little excitement:
- I guess there was something wrong with my grep command.
- Maybe I need to investigate activating quotas.
- Perhaps there is some wisdom in putting different directories on different partitions. If I had sent my grep workfile to a directory on a different partition, this would not have caused a problem.
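
The last lesson can be sketched in a few lines of shell. This is only an illustration under assumptions: search_to_workfile is a made-up name, the paths in the usage note are placeholders, and I don't actually know what was wrong with my original grep. The idea is simply to warn when the work file would land on the same filesystem as the directory being searched.

```shell
#!/bin/sh
# search_to_workfile PATTERN SRC_DIR OUT_FILE
# Hypothetical helper: grep SRC_DIR recursively for PATTERN and write the
# matches to OUT_FILE, warning if OUT_FILE sits on the same filesystem.
search_to_workfile() {
    pattern=$1
    src=$2
    out=$3
    # df -P prints one header line, then one line per filesystem;
    # the first column of the second line identifies the filesystem.
    src_fs=$(df -P "$src" | awk 'NR==2 {print $1}')
    out_fs=$(df -P "$(dirname "$out")" | awk 'NR==2 {print $1}')
    if [ "$src_fs" = "$out_fs" ]; then
        echo "warning: $out is on the same filesystem as $src" >&2
    fi
    # -r (recursive) is a common extension, available in GNU and BSD grep.
    grep -r -- "$pattern" "$src" > "$out"
}
```

Something like search_to_workfile 'user@example.com' /path/to/mailman/logs /tmp/z would then keep the work file out of the logs directory, assuming /tmp really is a separate partition on the machine in question.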