[Maintenance] monit on rshg054

Luke T. Shumaker lukeshu at sbcglobal.net
Fri Oct 4 17:50:47 GMT 2013


Earlier today I noticed that labs.parabola.nu was returning a Bad
Gateway error (HTTP 502).

So, I hopped on the box to figure out what was going on.  First of
all, everything but transmissiond had been unmonitored by monit.

I started with a few general troubleshooting steps: I added myself to
the monit group and upgraded monit.  The newer monit package doesn't
ship with an rc script, so I manually re-installed the old one.

In most cases, the problem was a simple mistake in the monit
configuration--looking for a PID or socket file in the wrong place.
In the case of parabolaweb, monit was checking that the socket was
talking HTTP--in reality, the socket talks FastCGI.
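
To illustrate the parabolaweb case, here is a minimal Python sketch of
a probe that speaks FastCGI instead of HTTP.  It sends an
FCGI_GET_VALUES management record and checks whether the reply header
is an FCGI_GET_VALUES_RESULT.  The function names and the socket path
in the usage note are hypothetical, and it assumes the backend answers
management records, which not every FastCGI server does.

```python
import socket
import struct

FCGI_VERSION_1 = 1
FCGI_GET_VALUES = 9
FCGI_GET_VALUES_RESULT = 10

# FastCGI record header: version, type, requestId, contentLength,
# paddingLength, and one reserved byte (8 bytes in total).
HEADER = struct.Struct("!BBHHBx")


def build_get_values(names=("FCGI_MAX_CONNS",)):
    """Build an FCGI_GET_VALUES record asking for the given variables."""
    body = b""
    for name in names:
        raw = name.encode("ascii")
        # Names shorter than 128 bytes use one-byte lengths; the value
        # is empty in a query, so its length is 0.
        body += bytes([len(raw), 0]) + raw
    # Management records always use request ID 0.
    return HEADER.pack(FCGI_VERSION_1, FCGI_GET_VALUES, 0, len(body), 0) + body


def looks_like_fastcgi(reply):
    """Return True if reply starts with an FCGI_GET_VALUES_RESULT header."""
    if len(reply) < HEADER.size:
        return False
    version, rtype, _req_id, _length, _padding = HEADER.unpack(reply[:HEADER.size])
    return version == FCGI_VERSION_1 and rtype == FCGI_GET_VALUES_RESULT


def probe(socket_path, timeout=5.0):
    """Connect to a Unix socket and report whether it answers in FastCGI."""
    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            s.connect(socket_path)
            s.sendall(build_get_values())
            return looks_like_fastcgi(s.recv(HEADER.size))
    except OSError:
        # Missing socket, refused connection, or timeout.
        return False
```

Calling probe() with the path of the application's socket (for
example a hypothetical /run/example/fastcgi.sock) returns False for a
socket that is missing, unresponsive, or answering in some other
protocol.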

After a while, I got an email saying that monit had timed out Nginx
(it had restarted the service on 3/3 checks).  I disabled the HTTPS
check, which seems to have been problematic, but Nginx really had
locked up.  I had to kill -9 it and remove the lock file.  It seems to
be fine now, but I will keep an eye on it.
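
As a rough illustration of checking for leftovers after a hard kill,
the Python sketch below tests whether a file that holds a PID still
points at a live process.  It assumes the file contains a single PID
in decimal, which may not be true of the Nginx lock file here; the
function names are hypothetical.

```python
import os


def pid_is_alive(pid):
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence; nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def pidfile_is_stale(path):
    """Return True if path holds a PID that no longer maps to a process."""
    try:
        with open(path) as f:
            pid = int(f.read().strip())
    except (FileNotFoundError, ValueError):
        # Missing or unparseable file: report "not stale" so that a
        # caller never removes something based on a guess.
        return False
    return not pid_is_alive(pid)
```

A file should only be removed by hand when pidfile_is_stale() returns
True for it.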

Happy hacking,
~ Luke Shumaker

PS: I have a few questions about monit CLI usage, if anyone is
willing to discuss that with me.


