Git Cheat Sheet

I use git quite a bit these days, both with an internal server at work and with a bunch of my projects and random code that now live on my github account. The transition from SVN hasn’t always been easy. Here’s a quick cheat sheet of some of the things that I usually forget.

  • Show diff of the last commit:
    git diff HEAD^..HEAD
  • Roll back to version xyz of a specific file (where xyz is a SHA1 commit ref):
    git checkout xyz path/to/file
  • Undo any uncommitted changes to tracked files on your branch (a forced checkout of HEAD also discards staged changes):
    git checkout -f
  • Undo any staged and working directory changes:
    git reset --hard
  • Update submodules after cloning a repository:
    git submodule update --init
  • Rebase on current master to pull in new changes:
    git rebase master
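  • Rebase on current master, but where changes conflict, take our version (during a rebase the sides are swapped, so -Xtheirs picks the changes from the branch being rebased; for some reason, a plain rebase seems to sometimes show conflicts on files that haven’t changed in ages on master):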
    git rebase -s recursive -Xtheirs master
  • Delete a local branch:
    git branch -d BranchName
  • Delete a remote branch from origin:
    git push origin --delete BranchName

Search for a small-scale but automated RPM build system

This post is part of a series of older draft posts from a few months ago that I’m just getting around to publishing. Unfortunately, I have yet to find a build system that meets my requirements (see the last paragraph).

At work, we have a handful – currently a really small number – of RPM packages that we need to build and deploy internally for our CentOS server infrastructure. A number of them are just pulled down from specific third-party repositories and rebuilt to have the vendor set as us, and some are internally patched or developed software. We run websites, and on the product side, we’re a Python/Django shop (in fact, probably one of the largest Django apps out there). We don’t deploy our Django apps via RPM, so building and distributing RPMs is definitely not one of our core competencies. In fact, we really only want to do it when we’re testing/deploying a new distro, or when an upstream package is updated.

Last week I pulled a ticket to deploy node.js to one of our build hosts, and we’ve got a few things in the pipeline that also rely on it. I found the puppetlabs-nodejs module on Github that’s supposed to install it on RHEL/CentOS, but it pulls packages from http://patches.fedorapeople.org/oldnode/stable/, and the newest version of nodejs there is 0.6.18, which is quite old. I can’t find any actively maintained sources of newer nodejs packages for RHEL/CentOS (yeah, I know, that’s one downside to the distributions…). However, I did find that nodejs 0.9.5 is being built for Fedora 18/19 in the Fedora build system and is already in the Fedora 18 Testing and Fedora Rawhide repos, but is failing its EL6 builds in their system. The decision I’ve come to is to use the puppetlabs-nodejs module to install it, but try to rebuild the Fedora 18 RPMs under CentOS 5 and 6.

So that’s the background. Now, my current task: to search for an RPM build system for my current job. My core requirements, in no specific order, are:

  • Be relatively easy and quick to use for people who have a specfile or SRPM and want to be able to “ensure => present” the finished RPM on a system. i.e., require as little per-package configuration as possible.
  • Be able to handle rebuilding “all” of our RPMs when we roll out a new distro version. Doesn’t necessarily need to be automatic, but should be relatively simple.
  • Ideally, not need to be running constantly – i.e. something that will cope well with build hosts being VMs that are shut down when they’re not needed.
  • Handle automatically putting successfully built packages into a repository, ideally with some sort of (manual) promotion process from staging to stable.
  • Have minimal external (infrastructure) dependencies that we can’t satisfy with existing systems.

So, the first step was to research existing RPM build systems and how others do this. Here’s a list of what I could find online, though most of these are from distributions and software vendors/projects, not end-user companies that are only building for internal use.

  • Koji is the build system used by Fedora and RedHat. It’s about as full-featured as any can be, and I’m familiar with it from my time at Rutgers University, as it’s used to maintain their CentOS/RHEL packages. It’s based largely on Mock. However, setting up the build server is no trivial task; there are few installations outside of Fedora/RedHat, and it relies on either Kerberos or an SSL CA infrastructure to authenticate machines and clients. So, it’s designed for too large a scale and too much infrastructure for me.
  • PLD Linux has a builder script that seems to automate rpmbuild as well as fetching sources and resolving/building dependencies. I haven’t looked at the script yet, but apparently it’s in PLD’s “rpm-build-tools” package.
  • PLD Linux also has a CVS repository for something called pld-builder.new. The README and ARCHITECTURE files make it sound like a relatively simple mainly-Python system that builds SRPMS and binary packages when requested, and most importantly, seems like a simple system that uses little more than shared filesystem access for communication and coordination.
  • ALT Linux has Sisyphus, which combines repository management and web interface tools, package building and testing tools, and more.
  • The Dries RPM repository uses (or at least used… my reference is quite old) pydar2, “a distributed client/server program which allows you to build multiple spec files on multiple distribution/architecture combinations automatically.” That sounds like it could be what I need, but the last update says that it isn’t finished yet, and that was in 2005.
  • Mandriva Linux has pretty extensive information on their build system on their wiki and a build system theory page, but it seems to be largely a hodgepodge of shell scripts and cronjobs, and is likely not a candidate for use by anyone other than its designers.
  • Argeo provides the SLC framework which has a “RPM Factory” component, but I can’t seem to find much more than a wiki page, and can’t tell if it’s a build automation system or just handles mocking packages and putting them in a repo on a single host.
  • Dag Wieers’ repositories use (or used) a set of python scripts called DAR, “Dynamic Apt Repository builder”. They’re on github but are listed as “old” and haven’t been updated in at least 2 years. The features sound quite interesting, and though it’s based on the Apt repo format, it might provide some good ideas for implementing a similar system.

Update four months later: I’ve yet to find a build system that meets my requirements above. For the moment I’m only managing ~20 packages, so my “build system” is a single shell script that reads in some environment variables and runs through using mock to build them in the correct order (including pushing the finished RPMs back into the local repository that mock reads from) and then pushing the finished packages to our internal repository. Maybe when I have some spare time, I’ll consider a project to either make a slightly better (but simple) RPM build system based on Python, or get our Jenkins install to handle this for me.
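The build loop described above can be sketched roughly as follows. This is only an illustration of the idea, not the actual shell script: the function name, the SRPM file names, the mock config name and the repository path are all assumptions made for the example. It builds a list of commands (one mock rebuild per SRPM, followed by a createrepo run so that later packages can resolve the freshly built ones) rather than running anything.

```python
def mock_build_commands(srpms, mock_config, local_repo):
    """Return the commands to rebuild SRPMs, in the given order, with mock.

    srpms       -- SRPM paths, already sorted so dependencies come first
    mock_config -- name of a mock chroot config (assumed, e.g. "epel-6-x86_64")
    local_repo  -- directory that mock's config is assumed to read as a repo
    """
    commands = []
    for srpm in srpms:
        # Rebuild one SRPM in a clean chroot and drop the results in the repo dir.
        commands.append(
            ["mock", "-r", mock_config, "--resultdir=" + local_repo, "--rebuild", srpm]
        )
        # Refresh repo metadata so the next build can install this package.
        commands.append(["createrepo", "--update", local_repo])
    return commands


if __name__ == "__main__":
    # Hypothetical package list; v8 comes first because nodejs needs v8-devel.
    plan = mock_build_commands(
        ["v8-3.13.7.5-5.src.rpm", "nodejs-0.9.5-10.src.rpm"],
        "epel-6-x86_64",
        "/srv/repo/staging",
    )
    for cmd in plan:
        print(" ".join(cmd))
```

Each entry could be passed to subprocess.check_call to actually run the build; keeping the plan as data makes it easy to print as a dry run first.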

Environment Variable Substitution in Apache httpd Configs

I’ve been configuring Apache httpd for over a decade, from a single personal web server to web farms running thousands of vhosts. In most of the “real” environments I’ve worked in, we’ve had some variation of production, stage/test/QA and development hosts, and usually some method of managing configurations between them, whether it’s source control or generating them from templates. And in all of these environments, there has invariably been drift between the configurations in the various environments, whether it’s because of poor tools to maintain a unified configuration or because of those emergency redirect requests that make it into production but are never backported. This is made all the worse because everywhere I’ve worked, the real difference between production and the other environments is really just a string replacement in Apache configurations – /prod/ to /test/ or www.example.com to www.dev.example.com or something along those lines.

Well, a few days ago I was having a discussion with some co-workers that dovetailed into this topic, and when I started some research, I found (finally, after using httpd for years) that the Apache httpd 2.2 configuration file syntax documentation states that httpd supports environment variable interpolation anywhere in the config files (and httpd 2.4 supports it with Defines as well).

Yup, that’s right. All those different Apache configs I’ve worked with for years that define separate vhosts, document roots, rewrite targets, ServerAliases, etc. for www.example.com and www.qa.example.com and www.dev.example.com really only had to be www.${ENV_URL_PART}example.com, and set ENV_URL_PART in the init script or sysconfig file. (Of course this all assumes that you have your different environments served by different httpd instances, which you do, of course…)

For me, this is a very big deal. It means that finally, instead of maintaining separate sets of configs for different environments which are (theoretically, except for those emergencies) kept identical by hand, or updating templates and then re-generating each environment’s configs, we can finally follow the same commit/merge/promotion-between-environments workflow that we use for other production code and Puppet configuration. It also means that those pesky little rewrites and other minor tweaks will make it all the way back to development environments.

So, here’s a little example of how this would work in reality. Let’s assume that we have 3 main environments, prod, qa and dev (though this should work for N environments) and that domains are prefixed with “qa.” or “dev.” for the respective internal environments. We set environment variables before httpd is started, on a per-host basis, depending on what environment that host is in. On RedHat based systems, we’d add the variables to /etc/sysconfig/httpd for production:

HTTPD_ENV_NAME="prod"
HTTPD_ENV_URL_PART=""

or for QA:

HTTPD_ENV_NAME="qa"
HTTPD_ENV_URL_PART="qa."

Those variables will now be available to httpd within the configurations (and also to any applications or scripts that have access to the web server’s environment variables).

Now let’s look at an example vhost configuration file that uses the environment variables:

<VirtualHost *:80>
ServerName example.com
ServerAlias www.example.com
# Aliases including proper environment name
ServerAlias www.${HTTPD_ENV_NAME}.example.com ${HTTPD_ENV_NAME}.example.com
 
ErrorLog /var/log/httpd/example.com-error_log
CustomLog /var/log/httpd/example.com-access_log combined
 
DocumentRoot /sites/example.com/${HTTPD_ENV_NAME}/
 
# Environment-specific configuration, if we absolutely need it:
Include /etc/httpd/sites/${HTTPD_ENV_NAME}/env.conf
 
<Location "/testrewrite">
RewriteEngine on
RewriteRule /foobar/.* http://www.${HTTPD_ENV_URL_PART}example.com/baz/ [R=302,L]
</Location>
 
</VirtualHost>

Every instance of ${HTTPD_ENV_NAME} will be replaced with the value set in the sysconfig file, and likewise with every instance of ${HTTPD_ENV_URL_PART}. This way, we can have one set of configurations and use our normal source control branch/promotion process to both test and promote changes through the environments along with application code, and ensure that any straight-to-production emergency changes (everyone has customer-ordered rewrites like that, right?) make it back to development and qa.

One caveat is that, if the environment variable is not defined, the ${VAR_NAME} will be left as a literal string in the configuration file. There doesn’t seem to be any way to protect against this in httpd 2.2, other than making sure the variables are set before the server starts (and maybe setting logical default values, like an empty string, in your init script which should be overridden by the sysconfig file).
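One way to guard against that caveat is to check the configs before httpd starts. The following is a minimal sketch under stated assumptions: the function name is made up for this example, and it only recognizes the ${NAME} form used in the vhost above. It lists every variable referenced in a config text that is missing from a given environment, skipping comment lines.

```python
import os
import re

# Matches ${NAME} references, the form used in the vhost example.
VAR_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def undefined_vars(config_text, env=None):
    """Return the sorted names referenced as ${NAME} but not defined in env."""
    if env is None:
        env = os.environ
    missing = set()
    for line in config_text.splitlines():
        # Ignore comment lines; httpd does not act on them.
        if line.strip().startswith("#"):
            continue
        for name in VAR_REF.findall(line):
            # A variable set to an empty string still counts as defined.
            if name not in env:
                missing.add(name)
    return sorted(missing)


if __name__ == "__main__":
    sample = (
        "ServerAlias www.${HTTPD_ENV_NAME}.example.com\n"
        "DocumentRoot /sites/example.com/${HTTPD_ENV_NAME}/\n"
        "RewriteRule /foobar/.* http://www.${HTTPD_ENV_URL_PART}example.com/baz/ [R=302,L]\n"
    )
    print(undefined_vars(sample, {"HTTPD_ENV_NAME": "qa"}))
```

Run from an init script or a pre-deploy check, a non-empty result would be a reason to refuse to start or promote the configuration.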

If you’re running httpd 2.4+, you can turn on mod_info and browse to http://servername/server-info?config to dump the current configuration, which will show the variable substitution.

RPM Spec Files for nodejs 0.9.5 and v8 on CentOS 5

The latest version of nodejs that I could find as an RPM for CentOS was 0.6.16, from http://patches.fedorapeople.org/oldnode/stable/. That’s the one that puppetlabs currently uses in their puppetlabs-nodejs module. There is, however, a nodejs 0.9.5 RPM in the Fedora Rawhide (19) repository. Below are some patches to that specfile, and the specfile for its v8 dependency, to get them to build on CentOS 6. You can also find the full specfiles on my github specfile repository. I had originally wanted to get them built on CentOS 5 as well, but after following the dependency tree from nodejs to http-parser to gyp, and then finding issues in the gyp source that are incompatible with CentOS 5’s python 2.4, I gave up on that target.

nodejs.spec, diff from Fedora Rawhide nodejs-0.9.5-9.fc18.src.rpm, buildID=377755 (full specfile)

diff --git a/nodejs.spec b/nodejs.spec
index 050ed86..86c0f4b 100644
--- a/nodejs.spec
+++ b/nodejs.spec
@@ -1,6 +1,6 @@
 Name: nodejs
 Version: 0.9.5
-Release: 9%{?dist}
+Release: 10%{?dist}
 Summary: JavaScript runtime
 License: MIT and ASL 2.0 and ISC and BSD
 Group: Development/Languages
@@ -25,7 +25,7 @@ Source6: nodejs-fixdep
 BuildRequires: v8-devel >= %{v8_ge}
 BuildRequires: http-parser-devel >= 2.0
 BuildRequires: libuv-devel
-BuildRequires: c-ares-devel
+BuildRequires: c-ares-devel >= 1.9.0
 BuildRequires: zlib-devel
 # Node.js requires some features from openssl 1.0.1 for SPDY support
 BuildRequires: openssl-devel >= 1:1.0.1
@@ -165,9 +165,13 @@ cp -p common.gypi %{buildroot}%{_datadir}/node
 
 %files docs
 %{_defaultdocdir}/%{name}-docs-%{version}
-%doc LICENSE
 
 %changelog
+* Thu Jan 31 2013 Jason Antman <Jason.Antman@cmgdigital.com> - 0.9.5-10
+- specify build requirement of c-ares-devel >= 1.9.0
+- specify build requirement of libuv-devel 0.9.4
+- remove duplicate %doc LICENSE that was causing cpio 'Bad magic' error on CentOS6
+
 * Sat Jan 12 2013 T.C. Hollingsworth <tchollingsworth@gmail.com> - 0.9.5-9
 - fix brown paper bag bug in requires generation script

v8.spec, diff from Fedora Rawhide 3.13.7.5-2 (full specfile)

--- v8.spec.orig       2013-01-26 16:03:18.000000000 -0500
+++ v8.spec     2013-01-31 09:04:51.068029459 -0500
@@ -21,9 +21,11 @@
 
 # %%global svnver 20110721svn8716
 
+%{!?python_sitelib: %define python_sitelib %(%{__python} -c "import distutils.sysconfig as d; print d.get_python_lib()")}
+
 Name:          v8
 Version:       %{somajor}.%{sominor}.%{sobuild}.%{sotiny}
-Release:       2%{?dist}
+Release:       5%{?dist}
 Epoch:         1
 Summary:       JavaScript Engine
 Group:         System Environment/Libraries
@@ -32,7 +34,7 @@
 Source0:       http://commondatastorage.googleapis.com/chromium-browser-official/v8-%{version}.tar.bz2
 BuildRoot:     %{_tmppath}/%{name}-%{version}-%{release}-root-%(%{__id_u} -n)
 ExclusiveArch: %{ix86} x86_64 %{arm}
-BuildRequires: scons, readline-devel, libicu-devel
+BuildRequires: scons, readline-devel, libicu-devel, ncurses-devel
 
 %description
 V8 is Google's open source JavaScript engine. V8 is written in C++ and is used 
@@ -51,8 +53,13 @@
 %setup -q -n %{name}-%{version}
 
 # -fno-strict-aliasing is needed with gcc 4.4 to get past some ugly code
-PARSED_OPT_FLAGS=`echo \'$RPM_OPT_FLAGS -fPIC -fno-strict-aliasing -Wno-unused-parameter -Wno-error=strict-overflow -Wno-error=unused-local-typedefs -Wno-unused-but-set-variable\'| sed "s/ /',/g" | sed "s/',/', '/g"`
+%if 0%{?el5}
+PARSED_OPT_FLAGS=`echo \'$RPM_OPT_FLAGS -fPIC -fno-strict-aliasing -Wno-unused-parameter -lncurses\'| sed "s/ /',/g" | sed "s/',/', '/g"`
+sed -i "s|'-O3',|$PARSED_OPT_FLAGS,|g" SConstruct
+%else
+PARSED_OPT_FLAGS=`echo \'$RPM_OPT_FLAGS -fPIC -fno-strict-aliasing -Wno-unused-parameter -Wno-error=strict-overflow -Wno-unused-but-set-variable\'| sed "s/ /',/g" | sed "s/',/', '/g"`
 sed -i "s|'-O3',|$PARSED_OPT_FLAGS,|g" SConstruct
+%endif
 
 # clear spurious executable bits
 find . \( -name \*.cc -o -name \*.h -o -name \*.py \) -a -executable \
@@ -198,6 +205,17 @@
 %{python_sitelib}/j*.py*
 
 %changelog
+* Thu Jan 31 2013 Jason Antman <Jason.Antman@cmgdigital.com> - 1:3.13.7.5-5
+- remove -Werror=unused-local-typedefs on cent6
+
+* Wed Jan 30 2013 Jason Antman <Jason.Antman@cmgdigital.com> - 1:3.13.7.5-4
+- define python_sitelib if it isn't already (CentOS 5)
+
+* Wed Jan 30 2013 Jason Antman <Jason.Antman@cmgdigital.com> - 1:3.13.7.5-3
+- pull 3.13.7.5-2 SRPM from Fedora 19 Koji most recent build
+- add ncurses-devel BuildRequires
+- modify PARSED_OPT_FLAGS to work with g++ 4.1.2 on CentOS 5
+ 
 * Sat Jan 26 2013 T.C. Hollingsworth <tchollingsworth@gmail.com> - 1:3.13.7.5-2
 - rebuild for icu-50
 - ignore new GCC 4.8 warning

Fedora Linux and OSX Dual Boot on Mid-2010 (6,2) 15″ MacBook Pro Laptop

As part of the transition from a contractor to a full-time employee of Cox Media Group Digital & Strategy (check out our github), I’ve been issued a Mid-2010 (6,2) 15″ MacBook Pro laptop, to replace my current Early-2008 (3,1) MacPro desktop. The desktop is currently running Fedora 17, dual-booting with Mac OS X (left in place for firmware updates and emergencies) using the rEFInd boot manager to choose between the two OSes. It took me two days to get this working right on my desktop, but it had been my plan to duplicate this setup on my laptop. I found a lot of conflicting information online, but I decided to give it a try.

Well, I have Fedora 18 and OS X 10.8 dual-booting on the laptop, but not as planned. After a day and a half of research, troubleshooting and re-installs, here’s what I found to actually work, in the hope that nobody else will go through the ordeal I went through. Following that are some notes about the new Fedora 18 installer (Anaconda 18), especially important for anyone who’s used Linux for a while. To those who are new to Linux, don’t be dissuaded by the above. Most of the frustration I experienced is because I’ve been using Linux for a relatively long time (about 10 years), had my own ideas about exactly how I wanted things set up (which are decidedly not supported by Fedora), and had some assumptions about the installation process based on earlier versions.

How to get it working:

Forget about rEFInd. This had been the original advice from Matthew Garrett, @mjg59, kernel coder, contributor to the Anaconda project, and all-around authority on booting Linux on EFI/UEFI hardware. My advice, and the method that worked for me:

  1. Shrink your Mac partitions and leave as much free space as you want for Fedora, using the Disk Utility tool in OS X (I also created an 8GB VFAT partition that both OSes can read/write to).
  2. Download the Fedora 18 64-bit DVD image; I chose the KDE version. Verify the sha256 sum if you want (they don’t have a readily visible link to the checksum file, so copy the download link, paste it into your address bar and remove the filename; you should get a directory index that includes a -CHECKSUM file).
  3. Per the Installation Guide’s Making Fedora USB Media page, use liveusb-creator to set up the installation image on the USB flash drive (I needed to start it with the --reset-mbr option). You can also use other tools (dd if you’re not on a Fedora-based distro), or a DVD, but this is the method I chose.
  4. Due to a bug in liveusb-creator, you may need to manually edit /EFI/boot/grub.cfg on the created USB stick if grub gives you a file not found error. If that happens, please see my bug report above for the action to take (in short, you need to mount the USB stick, chmod u+w /EFI/boot/grub.cfg then edit that file and replace every occurrence of “isolinux” with “syslinux” and every occurrence of “root=live:LABEL=Fedora-18-x86_64-Live-KDE.iso” with “root=live:LABEL=LIVE”).
  5. Boot the USB drive (use the alt key when you turn on the laptop to select the USB drive) and just install Fedora normally, letting it do its thing. Select a boot disk and let it put GRUB2 on the EFI partition.
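The grub.cfg edit from step 4 can also be scripted instead of done by hand. Here is a minimal sketch, assuming the two replacements described in that step are all that is needed; the function name and the sample line in the usage example are made up for illustration, and the file still has to be made writable (chmod u+w) first.

```python
def fix_grub_cfg(text, iso_label="Fedora-18-x86_64-Live-KDE.iso"):
    """Apply the two string replacements from step 4 to the grub.cfg contents."""
    # liveusb-creator puts the boot files under syslinux, not isolinux.
    text = text.replace("isolinux", "syslinux")
    # The USB stick's filesystem label is LIVE, not the ISO's label.
    return text.replace("root=live:LABEL=" + iso_label, "root=live:LABEL=LIVE")


if __name__ == "__main__":
    # Hypothetical grub.cfg line, only to show the effect of the replacements.
    before = "linuxefi /isolinux/vmlinuz0 root=live:LABEL=Fedora-18-x86_64-Live-KDE.iso ro"
    print(fix_grub_cfg(before))
```

To apply it for real, read /EFI/boot/grub.cfg from wherever the USB stick is mounted, pass the contents through fix_grub_cfg, and write the result back.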

When you boot, it will boot to GRUB. There will be some options for Mac OS there, but they don’t work (more on that below). If you want to boot Mac, hold down the alt/option key when you power on the laptop, which will bring you to the boot disk selector and you can pick the Mac disk. I know it’s not pretty or ideal, but it’s the best option right now.

Making it Better:

GRUB2 tries to automatically detect other OSes and configure them in the boot loader (this is done through /etc/grub.d/30_os-prober, commonly just referred to as os-prober). It tries to boot Mac directly through the xnu_kernel64 module, which not only isn’t installed on the boot partition by default, but just doesn’t work with at least Mountain Lion (10.8). So getting GRUB to boot Mac means either having the bugs in the xnu module fixed, or figuring out how to set up a chainloader to boot from GRUB to Mac. The latter is probably the method I’ll investigate, but for now, since I rarely use Mac, I’m happy having to use the alt key at boot to get there. To remove the annoying, broken Mac OS options from the grub screen, run the following commands as root (they assume you have your EFI partition mounted at /boot/efi, which I believe Fedora should do by default):

cp /boot/efi/EFI/fedora/grub.cfg /boot/efi/EFI/fedora/grub.cfg.bak
echo 'GRUB_DISABLE_OS_PROBER="true"' >> /etc/default/grub
grub2-mkconfig > /boot/efi/EFI/fedora/grub.cfg

Thoughts on the Fedora 18 Anaconda Installer

I found a couple of issues with the new Anaconda 18 installer that were either unwieldy or confusing for someone who’s been installing Linux for a long time. Overall, the new installer is very nice. It has a clean, even elegant UI, a relatively nice flow from start to completion, and is certainly beginner-friendly. It has fewer options than any Linux installer I’ve ever used before – not even options for package selection, firewall or SELinux configuration, etc. – but I guess this is in line with the goal of making Fedora a desktop OS for the masses. I would have appreciated an “advanced mode” installer that was more like Fedora 17 (or even much older versions), but I guess I’m an edge case, at least in the Fedora community. However, I did find two things especially difficult, both related to the fact that my laptop has two main drives (a 500GB hard drive and a 120GB SSD):

First, the installer prompted me to select a “boot disk”. I guess I should have read the installation guide, but I assumed that nomenclature translated to either “which disk should the automatic partitioning put your /boot partition on” or “which disk should I set the bootable flag on in the partition table”. In fact, it means “which disk should I put GRUB on the EFI partition of”. I installed, rebooted, and was shocked – and somewhat distressed – to boot directly to GRUB2 instead of the rEFInd installation I had set up. The installer didn’t have any of the previously-customary “warning: this will overwrite your MBR/EFI boot partition” notices, so I felt safe letting it continue. It turned out that this was the way I ended up going, and it also turns out that there’s a bug in Anaconda that makes it fail installation if you tell it not to write a bootloader to disk (though it’s patched by one line of Python code). But I was deeply distressed that – contrary to the experience of every, admittedly more complicated, Linux installer I’d used before – the Fedora 18 installer overwrote my EFI bootloader (analogous to overwriting the MBR on a BIOS boot machine) without ever warning me or asking for a confirmation.

Secondly, the partitioning tool is clearly designed for only one destination disk. The overview screen lists configured partitions by label and mount point, but not by physical device, so figuring out which partitions are on which physical disks takes a click on each and every partition to view that information in the detail panel. When you create a new partition, it’s automatically put in an LVM volume group spanning all disks. Changing the target of the automatically created volume group requires a few clicks, as does changing the physical disks backing any new volume groups. To assign a newly created partition to a specific disk, you have to click on an unlabeled “tool” icon under the list of partitions, far away from the information on the partition in question. It’s a nice interface for someone who clicks the “partition automatically” button, or who just knows they want to add “an extra partition”, but for anyone who has a specific layout in mind (like having /, /boot and /var, specifically sized, on the SSD and /home on the rotating disk) it takes about 4-5 more clicks and dialogs to add a partition than the last Fedora installer did. Mainly, it’s lacking any sort of Advanced Mode for partitioning that allows the user to quickly and accurately lay out a more complex partitioning scheme.

Below are some screenshots from the Fedora 17 and Fedora 18 Installation Guides, which contrast both the overview of all partitions and the individual partition settings:

Fedora 18 Overview, from 9.13. Creating a Custom Partition Layout:

Fedora 17 Overview, from 9.14. Creating a Custom Layout or Modifying the Default Layout:

Fedora 18 Partition Creation/Editing, from 9.13.3. Create LVM Logical Volume:

Fedora 17 Partition Creation/Editing, from 9.14.2. Adding Partitions:

Pretty-Print a JSON response at the command line

I’ve been doing some work with RabbitMQ lately, and have been doing some testing against its HTTP-based API, which returns results in JSON. If you’re looking to pretty-print a JSON response for easier viewing, here’s a nice way to do it at the command line using Python and json.tool:

curl http://username:pass@hostname:55672/api/overview | python -m json.tool
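If the response is already in hand inside a Python script, the standard json module can do the same job directly. A minimal sketch follows; the function name and the sample payload are made up for the example, and sort_keys is a choice made here for stable output rather than something the command above is guaranteed to do on every Python version.

```python
import json


def pretty(raw):
    """Parse a JSON string and return it re-serialized with 4-space indentation."""
    return json.dumps(json.loads(raw), indent=4, sort_keys=True)


if __name__ == "__main__":
    # Hypothetical payload, not an actual RabbitMQ API response.
    print(pretty('{"node": "rabbit@host", "queues": [1, 2]}'))
```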

Nagstamon on Fedora 17

Since I started my last job, I’ve been using Nagstamon on my workstation; it’s a really handy little system tray application that monitors a Nagios/Icinga instance and shows status updates/summary in a handy fashion, including flashing and (optionally) a sound alert when something changes. Unfortunately, there doesn’t seem to be a Fedora 17 package for it, though there is an entry on the Fedora package maintainers wishlist. The closest I was able to find is a repoforge/RPMforge package of Nagstamon 0.9.7.1, along with a source RPM.

Here are the steps to build that package on F17:

  1. Download and install rpm-macros-rpmforge.
  2. As root, edit /etc/rpm/macros.rpmforge and comment out the %dist macro, so we’ll still have the default “fc17” dist tag.
  3. wget http://apt.sw.be/source/nagstamon-0.9.7.1-2.rf.src.rpm
  4. rpmbuild --rebuild nagstamon-0.9.7.1-2.rf.src.rpm

Hopefully this will help someone else as well. At the moment, Nagstamon is actually up to version 0.9.9, so hopefully I’ll build a newer package sometime soon.

Getting oVirt up and running

The bulk of this post was written way back in April 2012. If you’re just coming here, and looking to setup oVirt, you should probably skip down to the postscript for an update, and ignore most of the content here (as it’s applicable to an older oVirt version).

I recently started setting up oVirt, the community version of Red Hat Enterprise Virtualization, at work for some testing (mainly a “sandbox” VM environment, and because Foreman supports it). To start with, I had two nodes, each with two dual-core Xeon processors (VT-x capable) with 20GB RAM, one with 600GB internal storage and one with 140GB internal. While oVirt’s documentation isn’t exactly wonderful, I found a blog post by Jason Brooks, How to Get Up and Running with oVirt, which gives a great walkthrough of getting the oVirt Engine set up on a machine, and also setting up that same machine as a VM host. As oVirt is still fairly young, this is all done on Fedora. I performed my installation via Cobbler, though I’m afraid to admit it was an entirely manual, interactive install.

I did run into a few bumps during Jason’s tutorial. In step 15, adding the data NFS export as a Storage Domain, I was unable to add the NFS export. I found the Troubleshooting NFS Storage Issues page on the oVirt wiki, ensured that SELinux was disabled and that the export had the correct permissions, confirmed that /etc/nfsmount.conf specified Nfsvers=3, rebooted, and then ran the nfs-check.py script. At this point, I was able to add the other storage domains in steps 15 and 16.

My second issue was that even on Fedora 16, I simply can’t get the spice client (through the spice-xpi browser plugin) to work. As far as I can tell from the logs, it looks like spicec is being sent a value of “None” for the secured port parameter, instead of the correct port number. I assume this is a bug in oVirt, but I’ll revisit this problem when I have time. In the meantime, I changed my test VM to use VNC, which is launched by installing the ovirt-engine-cli package (see below) on your client computer, connecting to the oVirt API with ovirt-shell:

ovirt-shell --connect --url=https://ovirt-engine.example.com:8443/api --user=admin@internal --password adminpassword

and then running console vm_name. This launches the vncviewer binary, which is in the “tigervnc” package on Fedora.

Installing ovirt-engine-cli

To run ovirt-shell on your workstation (Fedora 16, of course…) you’ll need the ovirt-engine-cli and ovirt-engine-sdk packages. I manually downloaded them from http://www.ovirt.org/releases/nightly/fedora/16/, versions 2.1.3 and 1.6.2, respectively. The SDK and CLI are Python-based, so there are a few Python dependencies, all of which were automatically solved by yum. I know there are SDK and CLI packages out there for other distros, but haven’t tried them yet.

Installing Linux Guests

Installing a CentOS 6.2 x86_64 guest was relatively straightforward, and my usual kickstart infrastructure worked fine. The only catch was the VirtIO storage interface, which shows up as /dev/vdx instead of /dev/sdx; I just added another kickstart metadata option in Cobbler that allows me to use sdx by specifying “virtual=yes” (for our VMWare hosts), or vdx by specifying “virtual=ovirt”.

Setting up Authentication

As installed, oVirt only has one user, “admin@internal”; it requires an external directory service for user authentication. Currently, it supports IPA, Red Hat’s Enterprise Identity Management tool (combines RHEL, oVirt Directory Server, Kerberos and NTP; perhaps FreeIPA would work as well?) and Microsoft Active Directory. As much as I’d like to give IPA or FreeIPA a try, my company already has an AD infrastructure, so I opted to go that route. Documentation is given in the oVirt 3.0 Installation Guide, starting on page 96. Unfortunately, I was never able to get AD auth working correctly, so I just worked with the one admin user.

Adding a Node

The biggest issue I had was adding the second node to oVirt. I attempted to use the DVD Import feature of Cobbler on the oVirt Node Image ISO, but that failed. I then found the image’s LiveOS/livecd-iso-to-pxeboot script and used that to make a kernel and initrd, and kernel parameters, for Cobbler. PXE works fine.

Postscript: I ended up blowing away my oVirt installation in favor of testing other things. At some point, the engine install got corrupted in a way that I just couldn’t fix; even though I spent all day one Saturday working on it, it took more time than I could allocate to a personal project. So this post is really semi-complete at best. However, there is some good news. Jason Brooks’ original post, How to Get Up and Running with oVirt, was written for oVirt 3.0, as was this post. Since then, there has been a new release, oVirt 3.1, which apparently has a better UI and a better installer. Jason Brooks has a new post, Up and Running with oVirt, 3.1 Edition, which covers installation and configuration of both an all-in-one machine and a separate node. If you’re looking to try oVirt, I’d recommend you give that a shot. Unfortunately (and strangely, given that this is supposed to be the “upstream” of RedHat’s proprietary RHEV) it’s still all based on Fedora.

WordPress – Automatically publish a pending post each weekday morning from a PHP script

In an earlier post, Piwik Web Analytics, and some unfortunate stats about my blog, I mentioned that the Feedburner stats for this blog show a relatively high subscribe/unsubscribe rate. I think a large part of that is my tendency to blog in spurts, and even worse, my tendency to write drafts and not publish them. In an effort to combat this, I’ve been trying to finish blog posts and then set them to “Pending” status, and go back and publish one every day (well, every day that I have some still sitting unpublished). Of course, that counts on me logging in to WordPress every day, which isn’t something I do. The following script is, at least for now, the answer for me.

This standalone PHP script uses wp-load.php to load the WordPress environment, then finds the oldest post with a given status (“pending” in my case) and attempts to publish it. It only does this if no other post has been published in the last 24 hours. The script can be found in subversion at http://svn.jasonantman.com/misc-scripts/wordpress_daily_post.php:

#!/usr/bin/php
<?php
/**
 * wordpress_daily_post.php
 * Script to publish the oldest post with a given status, if no
 * other post has been published in 24 hours. Intended to be run
 * via cron on weekdays.
 *
 * Copyright 2012 Jason Antman 
 *
 * Licensed under the Apache License, Version 2.0 
 *
 * use it anywhere you want, however you want, provided that this header is left intact,
 * and that if redistributed, credit is given to me.
 *
 * It is strongly requested, but not technically required, that any changes/improvements
 * be emailed to the above address.
 *
 * The latest version of this script will always be available at:
 * $HeadURL: http://svn.jasonantman.com/misc-scripts/wordpress_daily_post.php $
 * $LastChangedRevision: 40 $
 *
 * Changelog:
 * 2012-09-03 Jason Antman  - 1.0
 *  - first version
 */
 
# BEGIN CONFIGURATION
define('WP_LOAD_LOC', '/var/www/vhosts/blog.jasonantman.com/wp-load.php'); // Configure this to the full path of your Wordpress wp-load.php
define('SOURCE_POST_STATUS', 'pending'); // post status to publish
# END CONFIGURATION

$VERBOSE = false;
$DRY_RUN = false;
array_shift($argv);
while(count($argv) > 0) {
  if(isset($argv[0]) && ($argv[0] == "-d" || $argv[0] == "--dry-run")){
    $DRY_RUN = true;
    fwrite(STDERR, "DRY RUN ONLY - NOT ACTUALLY PUBLISHING.\n");
  }
  if(isset($argv[0]) && ($argv[0] == "-v" || $argv[0] == "--verbose")){
    $VERBOSE = true;
    fwrite(STDERR, "WP_LOAD_LOC=".WP_LOAD_LOC."\n");
    fwrite(STDERR, "SOURCE_POST_STATUS=".SOURCE_POST_STATUS."\n");
  }
  array_shift($argv);
}
 
$_SERVER['HTTP_HOST'] = 'localhost'; // needed for wp-includes/ms-settings.php:100
require_once(WP_LOAD_LOC);
 
# check that we're running on a weekday
if(date('N') >= 6) {
#  if($VERBOSE){ fwrite(STDERR, "today is a saturday or sunday, dieing.\n"); }
#  exit(1);
}
 
# find the publish date/time of the last published post
$published = get_posts(array('numberposts' => 1, 'orderby' => 'post_date', 'order' => 'DESC', 'post_status' => 'publish'));
$post = $published[0];
$pub_date = $post->post_date;
$pub_id = $post->ID;
 
if(strtotime($pub_date) >= (time() - 86400)) {
  if($VERBOSE){ fwrite(STDERR, "last post (ID $pub_id) within last day ($pub_date). Nothing to do. Exiting.\n"); }
  exit(0);
} else {
  if($VERBOSE){ fwrite(STDERR, "Found last post (ID $pub_id) with post date $pub_date.\n"); }
}
 
 
# find the earliest post of status SOURCE_POST_STATUS, if there is one.
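$to_post = get_posts(array('numberposts' => 1, 'orderby' => 'post_date', 'order' => 'ASC', 'post_status' => SOURCE_POST_STATUS));
// NOTE: the empty-result check below was garbled in this copy and is reconstructed from context
if(count($to_post) < 1) {
  if($VERBOSE){ fwrite(STDERR, "No posts found with status ".SOURCE_POST_STATUS.". Exiting.\n"); }
  exit(0);
}
$post = $to_post[0];
$to_pub_id = $post->ID;
$to_pub_date = $post->post_date;
$to_pub_title = $post->post_title;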
$now = time();
$new_date = date("Y-m-d H:i:s", $now);
$new_date_gmt = gmdate("Y-m-d H:i:s", $now);
 
if($VERBOSE){ fwrite(STDERR, "Post to publish: ID=$to_pub_id DATE=$to_pub_date NEW_DATE=$new_date TITLE=$to_pub_title\n"); }
 
# actually publish it
if(! $DRY_RUN){
  $arr = array('ID' => $to_pub_id, 'post_status' => 'publish', 'post_date' => $new_date, 'post_date_gmt' => $new_date_gmt);
  $ret = wp_update_post($arr); // publish the post
  if($ret == 0) {
    fwrite(STDERR, "ERROR: Post $to_pub_id was not successfully published.\n");
    exit(1);
  }
  if($VERBOSE){ fwrite(STDERR, "Published post. New ID: $ret\n"); }
}
else {
  fwrite(STDERR, "Dry run only, not publishing post.\n");
  exit(0); // nothing was published, so skip the verification below
}
 
# check that the post really was published
$published = get_posts(array('numberposts' => 1, 'orderby' => 'post_date', 'order' => 'DESC', 'post_status' => 'publish'));
$post = $published[0];
$pub_date = $post->post_date;
$pub_id = $post->ID;
$pub_title = $post->post_title;
$pub_guid = $post->guid;
 
if($pub_title != $to_pub_title) {
  fwrite(STDERR, "ERROR: title of most recent post does not match title of what we wanted to post.\n");
  exit(1);
}
 
fwrite(STDOUT, "Published post $pub_id at $pub_date\n");
fwrite(STDOUT, "Title: $pub_title\n");
fwrite(STDOUT, "\n\n\n GUID/Link: $pub_guid\n");
fwrite(STDOUT, "\n\n".__FILE__." on ".trim(shell_exec('hostname --fqdn'))." running as ".get_current_user()."\n");
 
?>

You’ll need to set WP_LOAD_LOC (line 29) to the full path of your WordPress installation’s wp-load.php (it should be in the top-level directory of your WordPress installation). I run this script from cron like:

0 6 * * 1-5 /home/jantman/bin/wordpress_daily_post.php --verbose # publish WP pending posts daily

so that it runs at 6 AM (local time) each weekday. Assuming you have cron set up to send you mail, you’ll get a daily message saying what was (or wasn’t) done.

RVM and Ruby 1.9 to test logstash grok patterns on Fedora/CentOS

I’ve been working on a personal project with Logstash lately, and it relies relatively heavily on grok filters for matching text and extracting matched parts. Today, I’ve been parsing syslog from Puppet to extract various metrics and timings, which will then be passed on from Logstash to Etsy’s statsd and then to graphite for display. Unfortunately, a few of my patterns are showing the “_grokparsefailure” tag and I just can’t seem to find the problem.

The logstash wiki provides a page on Testing your Grok patterns, as does Sean Laurent on his blog: Testing Logstash grok filters. Unfortunately, I work in a CentOS/RHEL shop, and we’re decidedly not a Ruby shop. Our Logstash install is using the monolithic/standalone Java JAR. We run Puppet, which is currently under ruby 1.8.7, and the jls-grok rubygem requires ruby 1.9. There’s no way I’d feel safe installing 1.9 on any of our machines, as they all run (and require) Puppet. So, I found out about RVM, the Ruby Version Manager, which allows you to run and switch between multiple ruby versions, and all of it is installed on a per-user basis. So, I created a new user on my Fedora 16 desktop called “rvmtest” and went about the process of setting up what’s needed to test grok patterns in the user’s local environment. I imagine this would work similarly under CentOS or RHEL, but the following is only tested on Fedora 16. If you have any issues, you should probably refer back to the RVM documentation.

  1. Create the isolated user, just to be extra careful. Log in as that user.
  2. As per Installing RVM: curl https://raw.github.com/wayneeseguin/rvm/master/binscripts/rvm-installer | bash -s stable
  3. edit your ~/.bashrc and add:
    [[ -s "$HOME/.rvm/scripts/rvm" ]] && . "$HOME/.rvm/scripts/rvm"
    [[ -r $rvm_path/scripts/completion ]] && . $rvm_path/scripts/completion

    The first line sets up RVM for your sessions, and the second sources in tab-completion for the rvm command.

  4. source .bashrc
  5. If you’re interested, you can see a list of all known rubies with: rvm list known
  6. Install Ruby (MRI) 1.9.2: rvm install 1.9.2
  7. “switch” to that ruby: rvm use 1.9.2 and confirm it by running ruby -v
  8. Make it the default ruby for us: rvm use 1.9.2 --default
  9. Create a “gemset” (set of rubygems for our environment): rvm gemset create groktest
  10. Use it, and set it as default: rvm use 1.9.2@groktest --default
  11. for grok testing, gem install jls-grok
  12. check that it’s there: gem list
  13. Download Logstash’s default grok patterns from github
  14. You should now be ready to test some grok patterns.

While the two howtos linked above use irb to interactively test the patterns, I prefer something easier to move to production, more reliable, and more repeatable. The following quick little ruby script takes text to match against on STDIN (log files, messages, etc.) and prints the matches to STDOUT. The script is based on test.rb from jordansissel’s ruby-grok. One important note: I couldn’t get the shebang (#!) to work with anything other than the explicit path to my RVM ruby install (the output of which ruby), so you’ll need to update it manually for your own setup.

#!.rvm/rubies/ruby-1.9.2-p320/bin/ruby
 
require 'rubygems'
require 'grok-pure'
require 'pp'
 
grok = Grok.new
grok.add_patterns_from_file("grok-patterns")
 
pattern = 'your_grok_pattern_here'
grok.compile(pattern)
puts "PATTERN: #{pattern}"
 
while a = gets
  puts "IN: #{a}"
  match = grok.match(a)
  if match
    puts "MATCH:"
    pp match.captures
  else
    puts "No Match."
  end
end
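
For repeatable runs, it can help to tally matches and failures instead of reading the output by eye. The following is only a minimal sketch of that idea, not part of the original script: tally_matches is a hypothetical helper name, and it assumes the matcher is any object whose match method returns something truthy on success. A compiled Grok object, used the same way as in the script above, should qualify; a plain Regexp stands in for it here so the example runs without the jls-grok gem.

```ruby
# Hypothetical helper (not part of jls-grok): counts how many lines a
# matcher accepts and collects the lines it rejects.
# `matcher` is anything with a #match method that returns a truthy value
# on success and nil/false on failure.
def tally_matches(matcher, lines)
  matched = 0
  failures = []
  lines.each do |line|
    text = line.chomp
    if matcher.match(text)
      matched += 1
    else
      failures << text
    end
  end
  { :matched => matched, :failed => failures.size, :failures => failures }
end

# Illustration only: a Regexp stands in for the compiled grok object.
sample = [
  "Updated 2 files in puppet svn (environment prod) to revision 754\n",
  "this line should not match\n"
]
result = tally_matches(/\AUpdated \d+ files/, sample)
puts "matched=#{result[:matched]} failed=#{result[:failed]}"
result[:failures].each { |l| puts "NO MATCH: #{l}" }
```

Passing $stdin.each_line as the second argument would process piped input the same way the script above does, and a non-zero failure count could be turned into a non-zero exit status for use in automated checks.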

Here’s an example using a pattern to capture information from custom syslog messages triggered by updating puppet configs. Here are some sample messages:

[rvmtest@jantmanwork ~]$ cat puppet.log
Updated 2 files in puppet svn (environment prod) to revision 754
Updated 3 files in puppet svn (environment prod) to revision 756
Updated 1 files in puppet svn (environment prod) to revision 757

And the pattern that I use:

Updated%{SPACE}%{NUMBER:puppet_svn_num_files}%{SPACE}files%{SPACE}in%{SPACE}puppet%{SPACE}svn%{SPACE}\(environment%{SPACE}%{WORD:puppet_svn_env}\)%{SPACE}to%{SPACE}revision%{SPACE}%{NUMBER:puppet_svn_revision}

And the output of the script:

[rvmtest@jantmanwork ~]$ cat puppet.log | ./puppet-update-test.rb 
PATTERN: Updated%{SPACE}%{NUMBER:puppet_svn_num_files}%{SPACE}files%{SPACE}in%{SPACE}puppet%{SPACE}svn%{SPACE}\(environment%{SPACE}%{WORD:puppet_svn_env}\)%{SPACE}to%{SPACE}revision%{SPACE}%{NUMBER:puppet_svn_revision}
IN: Updated 2 files in puppet svn (environment prod) to revision 754
MATCH:
{"SPACE"=&gt;[" ", " ", " ", " ", " ", " ", " ", " ", " ", " "],
 "NUMBER:puppet_svn_num_files"=&gt;["2"],
 "BASE10NUM"=&gt;["2", "754"],
 "WORD:puppet_svn_env"=&gt;["prod"],
 "NUMBER:puppet_svn_revision"=&gt;["754"]}
IN: Updated 3 files in puppet svn (environment prod) to revision 756
MATCH:
{"SPACE"=&gt;[" ", " ", " ", " ", " ", " ", " ", " ", " ", " "],
 "NUMBER:puppet_svn_num_files"=&gt;["3"],
 "BASE10NUM"=&gt;["3", "756"],
 "WORD:puppet_svn_env"=&gt;["prod"],
 "NUMBER:puppet_svn_revision"=&gt;["756"]}
IN: Updated 1 files in puppet svn (environment prod) to revision 757
MATCH:
{"SPACE"=&gt;[" ", " ", " ", " ", " ", " ", " ", " ", " ", " "],
 "NUMBER:puppet_svn_num_files"=&gt;["1"],
 "BASE10NUM"=&gt;["1", "757"],
 "WORD:puppet_svn_env"=&gt;["prod"],
 "NUMBER:puppet_svn_revision"=&gt;["757"]}

Hopefully this will make the process a bit simpler for someone else…