Information, thoughts, tips and tricks from a professional system administrator / monitoring and automation nut, open source junkie, troubleshooter, sometime software developer and ceaseless tinkerer. And some occasional commentary on my hobbies and non-tech interests.
Category Archives: Interesting Links and Resources
This is a really cool company, doing some really cool stuff, at a really large scale, and growing fast.
On another note, I’m continuing my attempt to read all of the excellent Puppet articles on Brice Figureau’s (aka masterzen) blog. It’s taking a while, as it’s really good, in-depth information that I want to remember, but I’d highly recommend it for anyone working with Puppet.
I’ve had a bunch of tabs open in my browser for a while – stuff that I read, thought was wonderful, and wanted to comment on. At risk of letting it pile up forever, here’s a collection of links that I thought were really interesting or insightful…
MongoDB is Fantastic for Logging – I was looking into some log storage ideas, and came by this post (on the MongoDB blog) about why Mongo is well-suited to storing logs.
Sensu – a Ruby-based cloud-oriented monitoring system. It uses AMQP/RabbitMQ to communicate between the clients and server, which is a really big part of what I think monitoring should be.
High Scalability – this is one of the few blogs I follow on a regular basis. Some really wonderful stuff, and great food for thought.
Ars Technica – Exclusive: a behind-the-scenes look at Facebook release engineering – Ars Technica is more or less “mainstream media” to me, but this is a really interesting writeup on Facebook’s release engineering process, albeit at a higher level. Specifically, it talks about their automation, phased rollouts, rollbacks, and how they release the Facebook codebase as a single giant binary, sent out via BitTorrent.
Monitoring Sucks blog posts (github) – The “monitoring sucks” movement really speaks to me, having worked extensively with Nagios, Cacti, and similar technologies. Specifically, having rolled out monitoring in a variety of “weird” scenarios (a lot of monitoring devices or whole networks behind NAT, on dynamic IP connections, or otherwise unreachable from a central server), I’ve felt a lot of pain in the current way of doing things. There are a lot of really good thoughts linked here, especially the “wonderland” series by Patrick Debois and the “Latency sucks” series by Lindsay Holmwood. This really got me thinking about my ideal monitoring system, which among other things, would integrate the “alerting” functions of Nagios with graphing/trending and correlation, would be based on some sort of message queue architecture (one that supports multiple levels of proxies, so it can gracefully handle NAT and multiple hops), and would be configured almost totally on the originating “client” (unlike the pain of distributed Nagios/Icinga).
Mike Brittain – Metrics Driven Engineering at Etsy (3.2MB PDF) – presentation slides. I’d love to see the video. Some really good ideas about putting the science back into being a SysAdmin. Also mentions a few tools I really want to play around with (including ganglia, graphite, logster and StatsD), as well as adding PHP memory usage and time to Apache logs, which I can’t believe I never thought of.
I recently discovered the petit program for log analysis. It’s a simple tool to pull out useful information from syslog logs in a variety of ways. I’ve only used it a few times so far, mainly on logs from problems I’ve already solved but didn’t know the cause of at first. So far, it’s proven quite useful. Here are a few examples:
petit --wordcount /var/log/messages – displays ordered count of words appearing in the log. My first step, especially if “warning”, “error” or “fatal” shows up near the top…
petit --hash --fingerprint /var/log/messages – hashes the log, filtering out the variable parts of each line (such as numerics and datestamps), and displays a count of matching lines. Absolutely wonderful for web error logs, as it removes client IP addresses, line numbers, etc.
petit --mgraph /var/log/messages – graph messages per minute for the first hour of the log (ASCII of course)
petit --hgraph /var/log/messages – same as above, but messages per hour for the first day
Petit will also read from stdin with the --Xgraph options (--mgraph, --hgraph), so you can cat logfile | grep word | petit --mgraph
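To make the hashing idea concrete, here is a minimal Python sketch of the same general technique: strip the datestamp and host, mask the numerics, and count the lines that end up identical. This is not petit's actual implementation. The regular expressions, the `#` placeholder and the function names `fingerprint` and `count_fingerprints` are all my own assumptions for illustration.

```python
import re
from collections import Counter

# Assumption: lines start with a traditional syslog prefix,
# e.g. "Jan  5 10:00:01 hostname ".
SYSLOG_PREFIX = re.compile(
    r"^[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s+\S+\s+"
)
NUMERIC = re.compile(r"\d+")


def fingerprint(line):
    """Drop the datestamp and host, then mask every run of digits with '#'."""
    body = SYSLOG_PREFIX.sub("", line.rstrip("\n"))
    return NUMERIC.sub("#", body)


def count_fingerprints(lines):
    """Return (count, fingerprint) pairs, most frequent first."""
    counts = Counter(fingerprint(line) for line in lines if line.strip())
    return [(n, fp) for fp, n in counts.most_common()]
```

Passing an open log file (or `sys.stdin`) to `count_fingerprints` gives a list you can print one pair per line. Two failed SSH logins from different addresses and PIDs collapse into a single entry with a count of 2, which is what makes this kind of summary useful on noisy logs.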
Just one note – this tool appears to work only on standard syslog formatted logs. If some non-datestamped lines managed to work their way into the log (i.e. someone used echo >> logfile instead of logger), it will choke.
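One way to work around this is to separate out the non-datestamped lines before handing the log to petit. The Python sketch below assumes the traditional "Mmm dd hh:mm:ss" syslog datestamp. I haven't checked exactly which formats petit accepts, so treat the pattern and the helper name `split_by_datestamp` as assumptions to adjust for your own logs.

```python
import re

# Assumption: the traditional syslog datestamp, e.g. "Jan  5 10:00:01 ".
SYSLOG_STAMP = re.compile(
    r"^(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)"
    r"\s+\d{1,2}\s\d{2}:\d{2}:\d{2}\s"
)


def split_by_datestamp(lines):
    """Split lines into datestamped ones and (line number, text) rejects."""
    good, bad = [], []
    for number, line in enumerate(lines, start=1):
        if SYSLOG_STAMP.match(line):
            good.append(line)
        else:
            # Blank lines and stray "echo >> logfile" output both land here.
            bad.append((number, line))
    return good, bad
```

Writing the `good` lines to a temporary file (or to stdout, for the graph options) gives a clean input, and the `bad` list tells you which line numbers to go look at.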
Many thanks to Scott McCarty for this wonderful tool!
Just a quick little tip: I happened upon the Mobile Barcoder Firefox add-on the other day. It generates QR Code barcodes for text or links, right in your browser. While my Droid3 has a full keyboard, sometimes I still want to quickly send links from my desktop browser session to my phone. Firefox Sync helps a lot but is a bit slow on the phone (since I usually have 100+ tabs open between all of my desktop Firefox sessions), and email is an option but even slower.
There are two caveats about this add-on though:
The feature to generate a QR code for the URL of the current page shows up in the status bar, which isn’t shown in modern versions of Firefox. You’ll need to enable the Add-on bar.
I guess it’s pretty strange that I just found out about it now, three years later, but there’s an article on TheInquirer.net about Citibank’s website not allowing Linux, that heavily quotes a blog post of mine about the issue. Cool.