Jason Antman's Blog

Pre-Authorized AWS Console URLs for Notifications

Pre-authorized AWS Console login URLs with limited permissions allow immediate investigation from notifications.
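The excerpt doesn't show how the URLs are generated, so here is a minimal sketch of one way it could be done, using the AWS sign-in federation endpoint. It assumes you already hold temporary credentials scoped to a limited policy (for example from STS `AssumeRole` or `GetFederationToken`). The function names, the issuer value, and the example destination are hypothetical. The HTTP call that exchanges the first URL for a sign-in token is left out so the sketch runs offline.

```python
import json
from urllib.parse import urlencode

# AWS sign-in federation endpoint (commercial partitions).
FEDERATION_ENDPOINT = "https://signin.aws.amazon.com/federation"


def signin_token_request_url(access_key_id, secret_access_key, session_token):
    """Build the URL that exchanges temporary credentials for a sign-in token.

    Fetching this URL returns JSON containing a "SigninToken" field.
    """
    session = json.dumps({
        "sessionId": access_key_id,
        "sessionKey": secret_access_key,
        "sessionToken": session_token,
    })
    query = urlencode({"Action": "getSigninToken", "Session": session})
    return FEDERATION_ENDPOINT + "?" + query


def console_login_url(signin_token, destination, issuer):
    """Build the pre-authorized console URL to embed in a notification.

    `destination` is the console page to land on; `issuer` is a label
    identifying whoever generated the link (hypothetical value here).
    """
    query = urlencode({
        "Action": "login",
        "Issuer": issuer,
        "Destination": destination,
        "SigninToken": signin_token,
    })
    return FEDERATION_ENDPOINT + "?" + query
```

In use, a notification sender would fetch the first URL, read `SigninToken` from the JSON response, and pass it to `console_login_url()` along with the console page relevant to the alert. Because the link works for whoever holds it until the token and session expire, the permissions attached to the temporary credentials are what keep it safe to send.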

more ...

dashsnap.py - A Script to Snapshot a Graphite Dashboard

dashsnap.py is a script that snapshots a Graphite dashboard at various intervals.

more ...

Nagios Check Plugin for Rsnapshot Backups

In a previous post, I described how I do Secure rsnapshot backups over the WAN via SSH. While my layout of rsnapshot configuration files, data, and log files is a bit esoteric, I monitor all this with a Nagios check plugin that runs on my backup host. It assumes that …

more ...

Script to Chart Intervals Between Problem and Recovery from Nagios/Icinga Log Files

At work, we use Icinga (a fork of Nagios) for monitoring. We have a few services which are restarted or otherwise poked by event handlers, but the recovery takes a while - so we often get paged for problems which recover in a few minutes. I wrote a small Perl script …

more ...
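The original script is written in Perl and isn't reproduced in this excerpt. As a rough illustration of the idea only, here is a Python sketch that pairs each hard problem state with the next hard recovery for the same host and service, and reports the interval in seconds. It assumes the standard `SERVICE ALERT` log line layout (`[epoch] SERVICE ALERT: host;service;STATE;TYPE;attempt;output`). The function name and the sample host and service are hypothetical, and the charting step is left out.

```python
import re

# Matches e.g. "[1000] SERVICE ALERT: web1;HTTP;CRITICAL;HARD;3;Connection refused"
ALERT_RE = re.compile(
    r"^\[(?P<ts>\d+)\] SERVICE ALERT: "
    r"(?P<host>[^;]+);(?P<service>[^;]+);(?P<state>[A-Z]+);(?P<type>[A-Z]+);"
)


def recovery_intervals(lines):
    """Return (host, service, seconds) for each problem-to-recovery pair."""
    open_problems = {}  # (host, service) -> timestamp of first hard problem
    intervals = []
    for line in lines:
        match = ALERT_RE.match(line)
        if not match or match.group("type") != "HARD":
            continue  # skip other log entries and soft states
        key = (match.group("host"), match.group("service"))
        timestamp = int(match.group("ts"))
        if match.group("state") == "OK":
            started = open_problems.pop(key, None)
            if started is not None:
                intervals.append((key[0], key[1], timestamp - started))
        else:
            # Keep the earliest problem time if the state repeats or changes.
            open_problems.setdefault(key, timestamp)
    return intervals
```

Feeding it the lines of a log file, for example `recovery_intervals(open(path))` with `path` pointing at your own Nagios or Icinga log, gives a list of intervals that can then be bucketed or plotted to see how many pages would have cleared on their own.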

A Collection of Great Links on Monitoring, SysAdmin, Scaling, etc.

I’ve had a bunch of tabs open in my browser for a while - stuff that I read, thought was wonderful, and wanted to comment on. At risk of letting it pile up forever, here’s a collection of links that I thought were really interesting or insightful…

MongoDB is …

more ...

World of Warcraft Realm Status Check Plugin for Nagios

My wife Jackie (Syrilia) is an avid World of Warcraft player (it’s an MMORPG with over 10 million players). They have weekly server maintenance/update windows every Tuesday morning - total downtime. The length is never really fixed, so I looked around to see if there was a logical way …

more ...

Nagios Check Plugin for Linode Monthly Bandwidth Usage

Since I have most of my public-facing stuff hosted with Linode, and I have a monthly bandwidth cap (albeit one that I’ll probably never come close to), I decided that it would be a good idea to add my monthly bandwidth usage to my monitoring system. Luckily, Linode offers …

more ...

Nagios check_by_ssh and NAT

At a remote location, I have a number of machines to monitor but only one IP (dynamic on a residential connection). Most of my remote monitoring with Nagios uses check_by_ssh. Previously, I’d used one host for Nagios to SSH to, and then chained together another check_by_ssh to reach the …

more ...

Cable Management, Power Measurements, Major Outage, Cacti

So, once again, still really busy. But a few new things.

First, my racks both at home and at the apartment are atrocious. They have no cable management at all. Both started with 1-3 machines, and no real plans for upgrades (since they’re just my personal/development machines). Unfortunately …

more ...

Update, Eventum/MySQLTicketing Integration

Well, I know I haven’t updated in a while. I have a whole bunch of links that I’d like to comment on, but things have been horribly busy. You can find the links in my “1-toblog” folder on del.icio.us (prefixed with “1-” so it shows up …

more ...