Logging OpenSSH SFTP Transactions

I just came across a really handy post on David Busby’s blog: Enable logging in the SFTP subsystem – Oneiroi. From OpenSSH 4.4 on, you can pass arguments to Subsystem calls, and the sftp subsystem supports logging to an arbitrary syslog facility and priority. Simply adding a line like the following to your sshd_config:

Subsystem       sftp    /usr/libexec/openssh/sftp-server -f LOCAL5 -l INFO

and the appropriate lines to your syslog config will give you a handy transfer log like:

Jul 16 09:22:25 hostname sftp-server[2058]: session opened for local user jantman from [A.B.C.D]
Jul 16 09:22:26 hostname sftp-server[2058]: open "/home/jantman/temp/sftp_test" flags WRITE,CREATE,TRUNCATE mode 0666
Jul 16 09:22:45 hostname sftp-server[2058]: close "/home/jantman/temp/sftp_test" bytes read 0 written 1464813
Jul 16 09:23:08 hostname sftp-server[2058]: session closed for local user jantman from [A.B.C.D]
Jul 16 09:27:50 hostname sftp-server[2309]: session opened for local user jantman from [A.B.C.D]
Jul 16 09:27:50 hostname sftp-server[2309]: open "/home/jantman/temp/sftp_test" flags READ mode 0666
Jul 16 09:27:54 hostname sftp-server[2309]: close "/home/jantman/temp/sftp_test" bytes read 1464813 written 0
Jul 16 09:27:54 hostname sftp-server[2309]: session closed for local user jantman from [A.B.C.D]

If you have syslog write these logs to their own file, remember to set up log rotation for them.
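Once the log exists, per-file transfer records are easy to pull out of it. Here is a minimal Python sketch, assuming the lines look exactly like the sample above. The name `parse_transfers` and the record layout are made up for illustration and are not part of OpenSSH.

```python
import re

# Both patterns are derived from the sample log above; other OpenSSH
# versions may word these messages differently (assumption).
SESSION_RE = re.compile(
    r'sftp-server\[(?P<pid>\d+)\]: session opened for local user (?P<user>\S+) from'
)
CLOSE_RE = re.compile(
    r'sftp-server\[(?P<pid>\d+)\]: close "(?P<path>[^"]*)" '
    r'bytes read (?P<read>\d+) written (?P<written>\d+)'
)


def parse_transfers(lines):
    """Return one dict per closed file: user, path, direction and byte count."""
    users = {}      # sftp-server PID -> user name of that session
    transfers = []
    for line in lines:
        m = SESSION_RE.search(line)
        if m:
            users[m.group("pid")] = m.group("user")
            continue
        m = CLOSE_RE.search(line)
        if m:
            read, written = int(m.group("read")), int(m.group("written"))
            transfers.append({
                "user": users.get(m.group("pid"), "unknown"),
                "path": m.group("path"),
                # bytes written on the server side means the client uploaded;
                # a zero-byte file is reported as a download (simplification)
                "direction": "upload" if written > 0 else "download",
                "bytes": max(read, written),
            })
    return transfers
```

Feeding it the lines of the log file (for example `parse_transfers(open(path))`, where `path` is wherever your syslog config sends LOCAL5) yields one record per file, tied to a user through the sftp-server PID.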

Unfortunately, I’m not aware of any way to log SCP file transfers.

Nagios Check Plugin for Rsnapshot Backups

In a previous post, I described how I do Secure rsnapshot backups over the WAN via SSH. While my layout of rsnapshot configuration files, data, and log files is a bit esoteric, I monitor all this with a Nagios check plugin that runs on my backup host. It assumes that the output of rsnapshot is written to a text log file, one file per host, at a path that matches /path_to_log_directory/log_HOSTNAME_YYYYMMDD-HHMMSS.log, where HOSTNAME is the name of the host and YYYYMMDD-HHMMSS is a datestamp (actually, the script just finds the newest file matching log_HOSTNAME_*.log in that directory). To obtain correct timing of the runs, which rsnapshot doesn’t offer, it also assumes that you trigger rsnapshot through a wrapper script that runs it once per host (in a loop, for example), writes per-host log files, and adds start and finish timestamps, like:

for h in <LIST OF HOSTNAMES>
do
    # one log file per host and run; the name must match log_HOSTNAME_*.log
    LOGFILE="/mnt/backup/rsnapshot/logs/log_${h}_$(date +%Y%m%d-%H%M%S).log"
    # human-readable date plus epoch seconds, for timing the run
    echo "# Starting backup at $(date) ($(date +%s))" >> "$LOGFILE"
    /usr/bin/rsnapshot -c "/etc/rsnapshot-$h.conf" daily &>> "$LOGFILE"
    echo "# Finished backup at $(date) ($(date +%s))" >> "$LOGFILE"
done

The check_rsnapshot.pl plugin uses utils.pm from Nagios, as well as Getopt::Long, File::stat, File::Basename, File::Spec and Number::Bytes::Human. This was one of my first Perl plugins, but seems to be rather acceptable. It makes the following checks based on the rsnapshot log:

  1. Backup run in the last X seconds (warning and crit thresholds)
  2. Maximum time from start to finish (warning and crit thresholds)
  3. Minimum size of backup (warning and crit thresholds)
  4. Minimum number of files in backup (warning and crit thresholds)
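
The first two checks can be sketched in a few lines. This is a rough Python illustration, not the Perl plugin itself: the function names and threshold parameters are hypothetical, and it assumes the "# Starting" / "# Finished" marker lines written by the wrapper script shown earlier.

```python
import glob
import os
import re

# Exit codes defined by the Nagios plugin guidelines.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# The wrapper writes "# Starting backup at <date> (<epoch>)" and a matching
# "# Finished" line; the epoch in parentheses is what gets parsed here.
MARKER_RE = re.compile(r"^# (Starting|Finished) backup at .* \((\d+)\)\s*$")


def newest_log(log_dir, host):
    """Return the most recently modified log_HOST_*.log file, or None."""
    matches = glob.glob(os.path.join(log_dir, "log_%s_*.log" % host))
    return max(matches, key=os.path.getmtime) if matches else None


def run_times(lines):
    """Extract (start_epoch, finish_epoch) from log lines; either may be None."""
    times = {"Starting": None, "Finished": None}
    for line in lines:
        m = MARKER_RE.match(line)
        if m:
            times[m.group(1)] = int(m.group(2))
    return times["Starting"], times["Finished"]


def check_run(start, finish, now, age_warn, age_crit, dur_warn, dur_crit):
    """Checks 1 and 2: age of the last run and its duration, in seconds."""
    if start is None or finish is None:
        return UNKNOWN, "start/finish markers not found"
    age, duration = now - finish, finish - start
    if age > age_crit or duration > dur_crit:
        status = CRITICAL
    elif age > age_warn or duration > dur_warn:
        status = WARNING
    else:
        status = OK
    return status, "last run finished %ds ago, took %ds" % (age, duration)
```

A real plugin would print the message and exit with the status code; the size and file-count checks would need rsnapshot's own output, which is not shown here.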

Combined with check_file_age checks on a number of backed-up files that I know are modified before each backup run, this seems to handle monitoring quite well for me. I certainly preferred running Bacula and using my MySQL-based check_bacula_job.php, but as I’m now backing up 4 machines to my desktop, I no longer have a need for Bacula (or tapes).

The script itself can be found at github.

What I Look For When Interviewing SysAdmin Candidates

I recently came across a question on ServerFault, Listing side projects in a jr. sysadmin resume, asking whether people (hiring managers) think it’s appropriate to put “side projects” (running your own web and mail servers, freelance web work, etc.) on your resume. Since I’ve been interviewing candidates for a few SysAdmin positions lately, I thought I’d take the time to write down a few of my ideas on this. Two disclaimers first, though. 1) I tend to be pretty geeky, progressive, and very open source/DevOps focused at heart. Not everyone I work with will agree with what I say here. As a candidate, remember that you’ll probably interview with all types, and what I say here won’t be the best advice with Enterprise types. I’m very open source centric, and have always held SA jobs where the majority of the software I run is open source and not vendor supported. 2) If you happen to actually interview with me, don’t make the mistake of reading this and tailoring your resume/responses to fit if that’s not accurate. I’m not a manager, I’m a line SA.

First, my response to the ServerFault question:

Not a hiring manager, but an SA doing technical interviews and hiring recommendations, and also have only been with my current employer for 7 months (so I’ve been on both sides of the table recently). My current employer is a pretty big company and pays well, so we’re quite selective.

SA candidates with 5-10 years experience and a laundry list of certifications, software and hardware names, protocols, etc. are a dime a dozen. I’m looking for people who really love what they do. I have an instant bias against resumes that don’t have either a personal website/URL, or some personal projects/experience other than 9-5 job on them. There are lots of people who meet the technical qualifications. I want someone truly passionate, and that means learning and experimenting outside of work.

Personally, on my resume, I have a few personal projects listed (mainly programming projects and volunteer IT work I did for non-profits), and I also have a link to my personal resume site that has links to my SVN repo, and a bunch of other projects.

Some things I look for:

  • Not in all cases, but I like to see a website or blog listed on a resume. It’s a big plus. I have resume.jasonantman.com with copies of my resume in various formats, as well as a bunch of links I’d like employers to see.
  • If you’re a working SA, I should be able to find you on Google. Either by name or email address, I expect to google the contact information I find on your resume and find at least some mailing list/forum posts, bug reports, or software projects.
  • I can’t stress this enough, do not overstate your experience. I’ve been an SA for 5 years, a hobbyist for much longer, and I’ve never used the word “expert”. I list software, protocols, languages on my resume as beginner/basic, intermediate, and “strongest”. If you list something as “advanced” or “expert”, be prepared to answer expert-level questions. If you can’t explain a 3-way handshake, don’t list TCP/IP on your resume. If you list “strong knowledge of Linux internals”, you should be able to at least explain open() and close(). If you list advanced RADIUS experience, I will ask you to explain CSID, WPA key exchange, and what attributes are valid in an Access-Reject. In short, don’t say you’re a genius in something unless you are; you never know when your interviewer may have spent the last 6 months immersed in it.
  • All SAs should have some programming skills. If you’re a recent graduate (let’s say any time in the last 5-8 years) I’d expect at the very least a vague memory of C++, VB or Java. If you’re a working SA, I expect to see either strong Bash skills, or at least a functional knowledge of Perl, Python, PHP or Ruby; preferably both. If you’re a “senior” Linux SA, you should know enough C to be able to make sense of strace output.
  • As stated above, non-full-time-job projects are a big plus. When I took my first SA job, the majority of my experience had been doing volunteer work for a non-profit ambulance corps (which I was also a volunteer EMT on). If I said that I did 40 hours a week for them, it would be an understatement. I wrote a few 10,000+ line PHP applications for them, and designed the infrastructure to run them 24x7x365. Small shop? Sure. But I learned a LOT, especially about how to make things resilient enough that I didn’t get paged often.

I’m sure I’ll update this over time as I distill more of my ideas.

Tools for watching apache httpd and memcached

Recently I was working on a code release on a site running PHP on Apache httpd, and using memcached. Without getting into specifics, we had a number of issues that were both Apache and memcached problems, and little visibility into them as it was running on an older server without much monitoring in place. I started looking around for simple tools that could provide a bit more insight, without many dependencies (as the machine is a relatively minimalist install). Here are some of the options I found:

  • memcache-top – A top-like script that pulls stats from memcached instances and can show per-instance, total, and average usage %, hit rate, number of connections, time to run the stats query, evictions, gets, sets, and read and write amounts. Best of all, it’s a very small Perl script that requires only IO::Socket and Time::HiRes. Here’s a small example of the output:
    memcache-top v0.6       (default port: 11211, color: on, refresh: 3 seconds)
    
    INSTANCE                USAGE   HIT %   CONN    TIME    EVICT   GETS    SETS    READ    WRITE
    127.0.0.1:11211         86.6%   99.4%   115     0.6ms   0.0     4114    1669    1.3M    24.2M
    127.0.0.1:11212         85.5%   59.9%   2       0.4ms   0.0     0       0       90      8055
    
    AVERAGE:                86.0%   79.6%   58      0.5ms   0.0     2057    834     682.4K  12.1M
    
    TOTAL:          0.9GB/  1.0GB           117     1.0ms   0.0     4114    1669    1.3M    24.2M
    
  • damemtop is also a nice top-like memcached tool. On the positive side, you can specify any column from “stats”, “stats items” or “stats slabs” in the configuration file, and can choose between average or one-second snapshots for each column. On the down side, it requires the YAML and AnyEvent Perl modules, so it has some uncommon dependencies.
    damemtop: Tue Jun 26 14:02:24 2012 [sort: hostname asc] [delay: 3s]
    hostname           all_version  all_fill_rate  hit_rate  evictions  curr_items  curr_connections   cmd_get  cmd_set  bytes_written  bytes_read  get_hits  get_misses
    TOTAL:
    NA                 NA           NA             NA        NA         NA          NA                 87       32       491,735        30,894      86        1
    AVERAGE:
    NA                 NA           86.00%         99.00%    NA         NA          NA                 43       16       122,933        7,723       43        1
    10.200.1.78:11211  1.2.6        86.63%         98.04%    0          0           -1.00204024880524  51       19       386,492        21,613      50        1
    10.200.1.78:11212  1.2.6        85.46%         NA        0          0           0                  0        0        11,373         31          0         0
    10.200.1.79:11211  1.2.6        87.31%         100.00%   0          0           -1.00204024880524  36       13       82,479         9,219       36        0
    10.200.1.79:11212  1.2.6        85.08%         NA        0          0           0                  0        0        11,389         31          0         0
    loop took: 0.305617094039917
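
Both tools get their numbers from memcached’s plain-text "stats" command. If neither can be installed, the basics can be approximated with a short Python sketch like the one below. The function names are made up for illustration, and I have not checked how either tool derives its columns, so the formulas are an assumption: the hit rate here is a lifetime figure since the instance started, not a per-interval one.

```python
import socket


def parse_stats(raw):
    """Turn the text reply to memcached's "stats" command into a dict of strings."""
    stats = {}
    for line in raw.splitlines():
        parts = line.split(" ", 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def summarize(stats):
    """Return (hit rate %, memory usage %); each is None when undefined."""
    hits = int(stats.get("get_hits", 0))
    misses = int(stats.get("get_misses", 0))
    used = int(stats.get("bytes", 0))
    limit = int(stats.get("limit_maxbytes", 0))
    hit_pct = 100.0 * hits / (hits + misses) if hits + misses else None
    usage_pct = 100.0 * used / limit if limit else None
    return hit_pct, usage_pct


def fetch_stats(host="127.0.0.1", port=11211, timeout=2.0):
    """Send "stats" to one memcached instance and return the parsed reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"stats\r\n")
        data = b""
        # the reply is a series of "STAT name value" lines terminated by "END"
        while not data.endswith(b"END\r\n"):
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    return parse_stats(data.decode("ascii", "replace"))
```

Calling `summarize(fetch_stats())` in a loop with a short sleep gives a very crude top-like view of a single instance.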
    

I’m still looking around for something for apache that uses mod_status and isn’t too verbose; ideally I’d like to be able to watch memcached, apache response codes/times, and apache mod_status all in the same terminal window.

Emacs Mode Variable for HTML

Unfortunately, I often find myself editing files that are mixed PHP and HTML, and ending with a “.php” extension. For most smaller projects/tasks, I use emacs at the command line (nox) and my .emacs settings for php-mode will latch onto the “.php” extension and open it with PHP mode. Unfortunately, PHP mode really doesn’t like embedded HTML (let alone mostly HTML with some inline PHP), and the indentation gets very messy, among other problems.

The simple solution is to add the following (XHTML 1.0 Transitional-compliant) comment to the first line of the file, which tells emacs to load html-mode:

<!-- -*- mode: html; -*- -->

You can also get emacs to do this for you, as per the Specifying File Variables documentation page. Once in html-mode, simply run M-x add-file-local-variable-prop-line, enter "mode" for the variable name and use the default of the current mode.

Script to Chart Intervals Between Problem and Recovery from Nagios/Icinga Log Files

At work, we use Icinga (a fork of Nagios) for monitoring. We have a few services which are restarted or otherwise poked by event handlers, but the recovery takes a while – so we often get paged for problems which recover in a few minutes. I wrote a small perl script that greps through the archived log files for a given regex (service and/or host name) and then calculates the time from problem to recovery and graphs those times.

The script is called nagios_log_problem_interval.pl and can be downloaded from my github. Below is some sample output; the number of minutes from problem to recovery runs along the Y axis and the count along the X axis:


> nagios_log_problem_interval.pl --archivedir=/var/icinga/archive --match=myhost --backtrack=10
myhost;HTTP
Count
1:########(8)
2:##(2)
3:#(1)
4:##(2)
5:#######(7)
6:(0)
7:(0)
8:#(1)
9:(0)
10:(0)
11:#(1)
12:(0)
13:#(1)
14:(0)
15:(0)
16-29:(0)
30-59:(0)
60+:(0)
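
The core of the calculation is small enough to sketch. The Python below is an illustration rather than a port of the Perl script: the function names are hypothetical, it only looks at SERVICE ALERT lines in the standard Nagios/Icinga log format, it treats the first non-OK alert as the start of a problem, and it rounds intervals up to whole minutes (the real script may differ on all of these points).

```python
import math
import re
from collections import Counter

# Log line: [epoch] SERVICE ALERT: host;service;STATE;TYPE;attempt;output
ALERT_RE = re.compile(r"^\[(\d+)\] SERVICE ALERT: ([^;]+);([^;]+);([A-Z]+);")


def problem_intervals(lines, match):
    """Return seconds from the first non-OK alert to the next OK, per service."""
    pattern = re.compile(match)
    started = {}    # (host, service) -> epoch of the first problem alert
    intervals = []
    for line in lines:
        m = ALERT_RE.match(line)
        if not m or not pattern.search(line):
            continue
        when, key, state = int(m.group(1)), (m.group(2), m.group(3)), m.group(4)
        if state == "OK":
            if key in started:
                intervals.append(when - started.pop(key))
        else:
            started.setdefault(key, when)
    return intervals


def bucket(seconds):
    """Map an interval to a row label like those in the sample output above."""
    minutes = max(1, math.ceil(seconds / 60))  # rounding up is an assumption
    if minutes <= 15:
        return str(minutes)
    if minutes < 30:
        return "16-29"
    if minutes < 60:
        return "30-59"
    return "60+"


def histogram(intervals):
    """Render one "label:###(count)" row per bucket."""
    counts = Counter(bucket(s) for s in intervals)
    labels = [str(n) for n in range(1, 16)] + ["16-29", "30-59", "60+"]
    return ["%s:%s(%d)" % (label, "#" * counts[label], counts[label])
            for label in labels]
```

Printing `"\n".join(histogram(problem_intervals(lines, "myhost")))` over the concatenated archive files produces rows in the same shape as the sample output.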

Apache httpd – logging for sites with and without load balancing

There are a few unfortunate places where I have an Apache httpd server serving multiple vhosts, some behind an F5 BigIp load balancer and some with direct traffic. For sites behind the LB, the remote IP/host will always show up as the LB’s IP/host, not that of the actual client. Using the default configuration with LogFormat directives in httpd.conf, this means that either we need to define log formats per-vhost or lose the client IP in one of our scenarios (LB or no LB).

I came across a simple solution to this on Emmanuel Chantréau’s blog, and here is my condensed version of it. It sets an environment variable (“bigip-request”) if the BIOrigClientAddr request header is set. This header holds the client’s IP; it’s the BigIp proprietary version of the X-Forwarded-For header, and you could easily substitute that more standard header in the following snippet. It then defines two LogFormats, one using BIOrigClientAddr and one using the normal “%h” remote host, and picks between them based on that variable.

In httpd.conf:

# set the "bigip-request" env variable to "1" if there is a BIOrigClientAddr header in the request
SetEnvIf BIOrigClientAddr . bigip-request
# "combined_lb" (BIOrigClientAddr in place of remote host) is used IF the bigip-request env variable is set
LogFormat "%{BIOrigClientAddr}i %l %u %t %v \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined_lb
# "combined" (remote host IP address) is used IF the bigip-request env variable is NOT set
LogFormat "%h %l %u %t %v \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

And then in our vhost configuration:

# use this log format if we're NOT behind an LB (no BIOrigClientAddr header)
CustomLog logs/<%= domain %>_access_log combined env=!bigip-request
# or this format if we are behind the LB
CustomLog logs/<%= domain %>_access_log combined_lb env=bigip-request
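
To make the selection logic explicit, here is the same decision written as a few lines of Python. It is only a model of what the directives do, with a made-up function name; unlike Apache, the dict lookup is case-sensitive on the header name.

```python
import re


def logged_client_field(headers, remote_host):
    """Return the value that ends up in the first field of the access log."""
    value = headers.get("BIOrigClientAddr", "")
    # "SetEnvIf BIOrigClientAddr . bigip-request": the regex "." matches as
    # soon as the header contains at least one character
    bigip_request = re.search(r".", value) is not None
    if bigip_request:
        return value        # combined_lb: %{BIOrigClientAddr}i
    return remote_host      # combined: %h
```

Exactly one of the two CustomLog lines fires for any request, so each request is written to the shared access log once.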

Creating RPMs from Perl CPAN Modules

I try my absolute best to always install software on my Linux boxes as RPMs, installed through Yum (yes, I use CentOS on servers and Fedora on my desktops/laptops). Not only is this more-or-less required to sanely manage configuration through Puppet, but it also lets me recreate a machine, or install dependencies for something, in one simple command line. Unfortunately, I run quite a bit of Perl code, and there are a lot of CPAN Perl modules that aren’t in any of the usual CentOS/Fedora repositories.

Enter cpan2rpm: a Perl script that, in its simplest invocation, downloads a specified CPAN module and automatically builds RPMs and SRPMs for it. The original version by Erick Calder hasn’t been touched since 2005, but there’s a newer version from Mediaburst, cpan2rpmmb, that seems to incorporate some nice improvements and worked quite well for me.

Perl script to convert F5 BigIp VIP address to list of internal pool member addresses

I often find myself logging in to the web UI of F5 BigIp load balancers and tracing down a VIP address to the servers that actually back it. This is an arduous, repetitive task of tracing from the VIP list to the VIP details page to find the default pool, then matching up that in the pool list and checking the pool members page. Luckily, the F5 boxes have a web service API that can be used for tasks like this. They have GPL sample code in Perl that uses only SOAP::Lite (as well as Getopt::Long and Pod::Usage) to interact with an F5 BigIp. I wrote a simple script to trace a VIP to the appropriate internal pool member addresses, assuming you have a simple configuration of VIP -> Single default pool -> pool members.

Usage is quite simple:

> ./VipToInternalHosts.pl --host=prod-lb1.example.com --user=myname --pass=mypassword --vip=128.6.30.130:80
VIP 128.6.30.130:80 (f5_vip_name) -> Pool 'pool_name'
Members of Pool 'pool_name':
	10.145.15.10:80
	10.145.15.11:80

The code can be found at http://svn.jasonantman.com/misc-scripts/VipToInternalHosts.pl via either HTTP or SVN. I hope it’s of use to someone else as well.

SysAdmin Links of The Day

A few links that I’ve had in my “mention in a blog post” category for a while: