Search for a small-scale but automated RPM build system

This post is part of a series of older draft posts from a few months ago that I’m just getting around to publishing. Unfortunately, I have yet to find a build system that meets my requirements (see the last paragraph).

At work, we have a handful of RPM packages (currently a really small number) that we need to build and deploy internally for our CentOS server infrastructure. A number of them are just pulled down from specific third-party repositories and rebuilt to have the vendor set as us, and some are internally patched or developed software. We run websites, and on the product side, we’re a Python/Django shop (in fact, probably running one of the largest Django apps out there). We don’t deploy our Django apps via RPM, so building and distributing RPMs is definitely not one of our core competencies. In fact, we really only want to do it when we’re testing/deploying a new distro, or when an upstream package is updated.

Last week I pulled a ticket to deploy node.js to one of our build hosts, and we’ve got a few things in the pipeline that also rely on it. I found the puppetlabs-nodejs module on GitHub that’s supposed to install it on RHEL/CentOS, but it pulls packages from http://patches.fedorapeople.org/oldnode/stable/, and the newest version of nodejs there is 0.6.18, which is quite old. I can’t find any actively maintained sources of newer nodejs packages for RHEL/CentOS (yeah, I know, that’s one downside to the distributions…). However, I did find that nodejs 0.9.5 is being built for Fedora 18/19 in the Fedora build system and is already in the Fedora 18 Testing and Fedora Rawhide repos, but is failing its EL6 builds in their system. The decision I’ve come to is to use the puppetlabs-nodejs module to install it, but to try to rebuild the Fedora 18 RPMs under CentOS 5 and 6.

So that’s the background. Now, my current task: to find an RPM build system for work. My core requirements, in no particular order, are:

  • Be relatively easy and quick to use for people who have a specfile or SRPM and want to be able to “ensure => present” the finished RPM on a system. i.e., require as little per-package configuration as possible.
  • Be able to handle rebuilding “all” of our RPMs when we roll out a new distro version. Doesn’t necessarily need to be automatic, but should be relatively simple.
  • Ideally, not need to be running constantly – i.e. something that will cope well with build hosts being VMs that are shut down when they’re not needed.
  • Handle automatically putting successfully built packages into a repository, ideally with some sort of (manual) promotion process from staging to stable.
  • Have minimal external (infrastructure) dependencies that we can’t satisfy with existing systems.

So, the first step was to research existing RPM build systems and how others do this. Here’s a list of what I could find online, though most of these are from distributions and software vendors/projects, not end-user companies that are only building for internal use.

  • Koji is the build system used by Fedora and Red Hat. It’s about as full-featured as any can be, and I’m familiar with it from my time at Rutgers University, as it’s used to maintain their CentOS/RHEL packages. It’s based largely on Mock. However, setting up the build server is no trivial task; there are few installations outside of Fedora/Red Hat, and it relies on either Kerberos or an SSL CA infrastructure to authenticate machines and clients. So, it’s designed for too large a scale and too much infrastructure for me.
  • PLD Linux has a builder script that seems to automate rpmbuild as well as fetching sources and resolving/building dependencies. I haven’t looked at the script yet, but apparently it’s in PLD’s “rpm-build-tools” package.
  • PLD Linux also has a CVS repository for something called pld-builder.new. The README and ARCHITECTURE files make it sound like a relatively simple, mainly-Python system that builds SRPMs and binary packages when requested and, most importantly, uses little more than shared filesystem access for communication and coordination.
  • ALT Linux has Sisyphus, which combines repository management and web interface tools, package building and testing tools, and more.
  • The Dries RPM repository uses (or at least used… my reference is quite old) pydar2, “a distributed client/server program which allows you to build multiple spec files on multiple distribution/architecture combinations automatically.” That sounds like it could be what I need, but the last update says that it isn’t finished yet, and that was in 2005.
  • Mandriva Linux has pretty extensive information on their build system on their wiki and a build system theory page, but it seems to be largely a hodgepodge of shell scripts and cronjobs, and is likely not a candidate for use by anyone other than its designers.
  • Argeo provides the SLC framework which has an “RPM Factory” component, but I can’t seem to find much more than a wiki page, and can’t tell if it’s a build automation system or just handles mocking packages and putting them in a repo on a single host.
  • Dag Wieers’ repositories use (or used) a set of Python scripts called DAR, “Dynamic Apt Repository builder”. They’re on GitHub but are listed as “old” and haven’t been updated in at least 2 years. The features sound quite interesting, and though it’s based on the Apt repo format, it might provide some good ideas for implementing a similar system.

Update four months later: I’ve yet to find a build system that meets my requirements above. For the moment I’m only managing ~20 packages, so my “build system” is a single shell script that reads in some environment variables and runs through using mock to build them in the correct order (including pushing the finished RPMs back into the local repository that mock reads from) and then pushing the finished packages to our internal repository. Maybe when I have some spare time, I’ll consider a project to either make a slightly better (but simple) RPM build system based on Python, or get our Jenkins install to handle this for me.
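
For anyone weighing the Python route, here is a minimal sketch of the "build in the correct order" part of such a script. It is not my actual shell script: the function name, its arguments and the one-repo-directory layout are assumptions made for illustration, and it needs Python 3.9+ for graphlib. It only produces the mock and createrepo command lines rather than running them.

```python
from graphlib import TopologicalSorter


def build_plan(deps, srpms, mock_config, repo_dir):
    """Return the commands (as argument lists) for one full rebuild.

    deps maps a package name to the set of packages that must be built
    before it; srpms maps a package name to the path of its SRPM.
    Nothing is executed here, so the caller decides how to run the plan.
    """
    commands = []
    # static_order() yields each package only after all of its
    # prerequisites, and raises CycleError on circular dependencies.
    for pkg in TopologicalSorter(deps).static_order():
        commands.append([
            "mock", "-r", mock_config,
            "--resultdir", repo_dir,
            "--rebuild", srpms[pkg],
        ])
        # Refresh the local repo metadata so the next mock run can
        # resolve BuildRequires against the package just built.
        commands.append(["createrepo", "--update", repo_dir])
    return commands
```

Each returned list could be handed to subprocess.run(cmd, check=True) so that the first failed build stops the run. This assumes the mock config already lists repo_dir as a repository, which is how my shell script feeds finished RPMs back into later builds.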

RPM Spec Files for nodejs 0.9.5 and v8 on CentOS 6

The latest version of nodejs that I could find as an RPM for CentOS was 0.6.16, from http://patches.fedorapeople.org/oldnode/stable/. That’s the one that puppetlabs currently uses in their puppetlabs-nodejs module. There is, however, a nodejs 0.9.5 RPM in the Fedora Rawhide (19) repository. Below are some patches to that specfile, and the specfile for its v8 dependency, to get them to build on CentOS 6. You can also find the full specfiles in my GitHub specfile repository. I had originally wanted to get them built on CentOS 5 as well, but after following the dependency tree from nodejs to http-parser to gyp, and then finding issues in the gyp source that are incompatible with CentOS 5’s Python 2.4, I gave up on that target.

nodejs.spec, diff from Fedora Rawhide nodejs-0.9.5-9.fc18.src.rpm, buildID=377755 (full specfile)

diff --git a/nodejs.spec b/nodejs.spec
index 050ed86..86c0f4b 100644
--- a/nodejs.spec
+++ b/nodejs.spec
@@ -1,6 +1,6 @@
 Name: nodejs
 Version: 0.9.5
-Release: 9%{?dist}
+Release: 10%{?dist}
 Summary: JavaScript runtime
 License: MIT and ASL 2.0 and ISC and BSD
 Group: Development/Languages
@@ -25,7 +25,7 @@ Source6: nodejs-fixdep
 BuildRequires: v8-devel >= %{v8_ge}
 BuildRequires: http-parser-devel >= 2.0
 BuildRequires: libuv-devel
-BuildRequires: c-ares-devel
+BuildRequires: c-ares-devel >= 1.9.0
 BuildRequires: zlib-devel
 # Node.js requires some features from openssl 1.0.1 for SPDY support
 BuildRequires: openssl-devel >= 1:1.0.1
@@ -165,9 +165,13 @@ cp -p common.gypi %{buildroot}%{_datadir}/node
 
 %files docs
 %{_defaultdocdir}/%{name}-docs-%{version}
-%doc LICENSE
 
 %changelog
+* Thu Jan 31 2013 Jason Antman <Jason.Antman@cmgdigital.com> - 0.9.5-10
+- specify build requirement of c-ares-devel >= 1.9.0
+- specify build requirement of libuv-devel 0.9.4
+- remove duplicate %doc LICENSE that was causing cpio 'Bad magic' error on CentOS6
+
 * Sat Jan 12 2013 T.C. Hollingsworth <tchollingsworth@gmail.com> - 0.9.5-9
 - fix brown paper bag bug in requires generation script

v8.spec, diff from Fedora Rawhide 3.13.7.5-2 (full specfile)

--- v8.spec.orig       2013-01-26 16:03:18.000000000 -0500
+++ v8.spec     2013-01-31 09:04:51.068029459 -0500
@@ -21,9 +21,11 @@
 
 # %%global svnver 20110721svn8716
 
+%{!?python_sitelib: %define python_sitelib %(%{__python} -c "import distutils.sysconfig as d; print d.get_python_lib()")}
+
 Name:          v8
 Version:       %{somajor}.%{sominor}.%{sobuild}.%{sotiny}
-Release:       2%{?dist}
+Release:       5%{?dist}
 Epoch:         1
 Summary:       JavaScript Engine
 Group:         System Environment/Libraries
@@ -32,7 +34,7 @@
 Source0:       http://commondatastorage.googleapis.com/chromium-browser-official/v8-%{version}.tar.bz2
 BuildRoot:     %{_tmppath}/%{name}-%{version}-%{release}-root-%(%{__id_u} -n)
 ExclusiveArch: %{ix86} x86_64 %{arm}
-BuildRequires: scons, readline-devel, libicu-devel
+BuildRequires: scons, readline-devel, libicu-devel, ncurses-devel
 
 %description
 V8 is Google's open source JavaScript engine. V8 is written in C++ and is used 
@@ -51,8 +53,13 @@
 %setup -q -n %{name}-%{version}
 
 # -fno-strict-aliasing is needed with gcc 4.4 to get past some ugly code
-PARSED_OPT_FLAGS=`echo \'$RPM_OPT_FLAGS -fPIC -fno-strict-aliasing -Wno-unused-parameter -Wno-error=strict-overflow -Wno-error=unused-local-typedefs -Wno-unused-but-set-variable\'| sed "s/ /',/g" | sed "s/',/', '/g"`
+%if 0%{?el5}
+PARSED_OPT_FLAGS=`echo \'$RPM_OPT_FLAGS -fPIC -fno-strict-aliasing -Wno-unused-parameter -lncurses\'| sed "s/ /',/g" | sed "s/',/', '/g"`
+sed -i "s|'-O3',|$PARSED_OPT_FLAGS,|g" SConstruct
+%else
+PARSED_OPT_FLAGS=`echo \'$RPM_OPT_FLAGS -fPIC -fno-strict-aliasing -Wno-unused-parameter -Wno-error=strict-overflow -Wno-unused-but-set-variable\'| sed "s/ /',/g" | sed "s/',/', '/g"`
 sed -i "s|'-O3',|$PARSED_OPT_FLAGS,|g" SConstruct
+%endif
 
 # clear spurious executable bits
 find . \( -name \*.cc -o -name \*.h -o -name \*.py \) -a -executable \
@@ -198,6 +205,17 @@
 %{python_sitelib}/j*.py*
 
 %changelog
+* Thu Jan 31 2013 Jason Antman <Jason.Antman@cmgdigital.com> - 1:3.13.7.5-5
+- remove -Werror=unused-local-typedefs on cent6
+
+* Wed Jan 30 2013 Jason Antman <Jason.Antman@cmgdigital.com> - 1:3.13.7.5-4
+- define python_sitelib if it isn't already (CentOS 5)
+
+* Wed Jan 30 2013 Jason Antman <Jason.Antman@cmgdigital.com> - 1:3.13.7.5-3
+- pull 3.13.7.5-2 SRPM from Fedora 19 Koji most recent build
+- add ncurses-devel BuildRequires
+- modify PARSED_OPT_FLAGS to work with g++ 4.1.2 on CentOS 5
+ 
 * Sat Jan 26 2013 T.C. Hollingsworth <tchollingsworth@gmail.com> - 1:3.13.7.5-2
 - rebuild for icu-50
 - ignore new GCC 4.8 warning

Dumping all Macros from an RPM Spec File

I’ve been doing a lot of RPM packaging lately, and on different (and very old) distros and versions. Sometimes I lose track of all of the macros used in specfiles (_bindir, _sbindir, dist, _localstatedir, etc.). There’s no terribly easy way to dump a list of all of the available macros. There is, however, a bit of a kludge. Insert the following code in your specfile before the %prep or %setup lines:

%dump
exit 1

The %dump macro will dump all defined macros to STDERR. The exit 1 will prevent rpmbuild from going on and trying to build the package. If you want to view the output nicely, you can pipe it through a pager like less: rpmbuild -ba filename.spec 2>&1 | less.

Just make sure to remove those two lines when you want to actually build the package.
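
If you would rather search the dump than page through it, a small parser helps. The sketch below is an illustration only. It assumes each macro appears as a level number, a ':' or '=' marker, the macro name, optional options in parentheses, then a tab and the body, with the dump bracketed by lines of '=' characters. Check that against the output of your rpm version before relying on it. The function name is made up for this example.

```python
import re

# Assumed line shape: "<level><':' or '='> <name>[(opts)]<TAB><body>"
MACRO_LINE = re.compile(r"^\s*(-?\d+)[:=] (\w+)(\([^)]*\))?(?:\t(.*))?$")


def parse_macro_dump(text):
    """Map macro names to their bodies from captured %dump output."""
    macros = {}
    current = None
    for line in text.splitlines():
        match = MACRO_LINE.match(line)
        if match:
            current = match.group(2)
            macros[current] = match.group(4) or ""
        elif line.startswith("====="):
            # Separator lines bracket the dump; stop collecting here.
            current = None
        elif current is not None:
            # Anything else is treated as a continuation of a
            # multi-line macro body.
            macros[current] += "\n" + line
    return macros
```

Save the rpmbuild output to a file (rpmbuild -ba filename.spec > dump.txt 2>&1), read it in, and look up a single macro with parse_macro_dump(text).get("_bindir").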

How to make software distribution secure

We were seeing some strange behavior with Mac client machines on the network lately, specifically with DNS queries (I’d guess that a lot of it has to do with Bonjour), and the discussion touched on the DNS Changer trojan for Mac. I’d really never heard about it before, and after some basic reading, it really got me thinking about the state of software packaging, updates, and distribution. Granted, some of my observations would require sweeping changes to how packaging is handled (even on the *nixes), and would require buy-in from more than just the vendor and distributor (well, I guess MS can probably pressure ISVs to do whatever they want), but they seem to be the only way to keep appliancization from becoming the solution to security issues. I’ve written about this before, a while ago, with respect to Linux, but here’s my current take on what needs to be done to software packaging to allow our machines to stay secure, no matter what OS they run.

  1. Allow packages to be installed as a user. This is a mammoth task under Windows or Mac, but still an issue under Linux. The DNS Changer trojan is a case in point – there’s no reason a “video codec” would need to be installed system-wide, and if that were simply installed user-specific, the malicious installer would never have the privileges to change system-wide DNS settings. This is also a big issue under Linux. Yum, apt, rpm, etc. should (if run as a non-root user) install packages in a user-local path under /home by default. Of course, this would mean many things would need to change in order to cope – perhaps even a change to the LSB spec.
  2. Warn about inconsistencies on package installation. The package installation program should warn a user (whether installing packages system-wide or local to a user) if the package is going to modify system-wide files, i.e. files not specifically placed by that package and that package only.
  3. Real package management for Windows and Mac. It’s about time that Apple and Microsoft admit that people without billions in funding can come up with good ideas. Get rid of these Installer programs (the many many different ones). Each OS should pick a package format and develop a yum-like (or, even better, zypper-like) package management program that understands repositories. I don’t know how they’d cope with the pervasive license keys and DRM in the non-nix world, but I’m sure they could figure out a way that still allowed sane package management. The idea here is that vendors run repositories and are responsible for their GPG keys, so trojans claiming to be an update to a given vendor’s software would be rejected. Also, isn’t it about time that you can update all your software on Windows or Mac through one tool?
  4. Filesystem-based IDS for Windows and Mac. Assuming it will take a while to get everyone onboard with the packaging idea, and noting that users of these OSes like installing applications from arbitrary sources, there should be an OS-level feature to audit all filesystem changes made by untrusted/unsigned applications, and a way to alert the user to these changes if they appear suspicious (essentially what Spybot Search & Destroy / TeaTimer do, but built into the OS).
  5. Vendor support of packaging/repositories – Along with the idea of repositories, vendors should have a trust or signing system for ISVs’ signing keys. If users are installing arbitrary software, making them trust an arbitrary key won’t do anything to improve security. Microsoft and Apple need to run a CA that signs the package signing keys of their ISVs. They also – and here’s the big one – need to have a parallel framework for “independent developers”. I.e. something that doesn’t cost any money for the packagers, and allows them to at least give a “this person is who they say they are” message.
  6. Finally, make package management pervasive – Have a real push to apply the packaging and signing keys standard to all software for the OS.
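
To make the check in point 2 concrete, here is a minimal sketch of how an installer could decide which paths deserve a warning. Everything in it is hypothetical: no existing package manager exposes this function, and the owners mapping stands in for whatever ownership database the package manager keeps, with an empty set meaning a file that exists but was placed by no package.

```python
def paths_needing_warning(package_name, package_paths, owners):
    """Return the paths an installer should warn about.

    owners maps each path that already exists on the system to the set
    of package names that placed it. A path is flagged when it already
    exists and is not owned by package_name alone.
    """
    flagged = []
    for path in sorted(package_paths):
        if path in owners and owners[path] != {package_name}:
            flagged.append(path)
    return flagged
```

With this rule, a "video codec" that ships a replacement for an existing system-wide DNS configuration file gets flagged, while files it owns exclusively or creates fresh pass silently.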

On a final note, applicable to both the current state of Linux packaging and my ideas about Mac and Windows… DNS is the ideal method of key distribution (granted, yes, this just means that the security of the packager’s DNS records, and their servers and signing key, is just more of an issue). But even with Yum and Zypper, it seems to me to be logical that the packager’s public key should be stored in a DNS record (or at a URL stored in a DNS TXT record). That way, it wouldn’t be up to an end user to import and trust a key, they’d just have to trust the repository (i.e. software.adobe.com) and the package manager would pull down the key and verify that package X in software.adobe.com is, in fact, signed by the software.adobe.com key.
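
As a rough sketch of the "URL stored in a DNS TXT record" variant: the "pkgkey=" prefix below is a convention invented for this example, not an existing standard, and the key path in the usage note is made up. The TXT lookup itself is left out, since it needs a resolver library, as is the actual signature check.

```python
from urllib.parse import urlparse

KEY_PREFIX = "pkgkey="  # hypothetical TXT record convention, not a standard


def key_url_from_txt(txt_records):
    """Pick the signing-key URL out of a repository host's TXT records.

    Returns None unless exactly one record advertises an HTTPS URL, so an
    ambiguous or downgraded answer is never silently trusted.
    """
    urls = [rec[len(KEY_PREFIX):] for rec in txt_records
            if rec.startswith(KEY_PREFIX)]
    if len(urls) != 1:
        return None
    parsed = urlparse(urls[0])
    if parsed.scheme != "https" or not parsed.hostname:
        return None
    return urls[0]
```

A package manager would call this with the TXT records it fetched for software.adobe.com, for example ["pkgkey=https://software.adobe.com/RPM-GPG-KEY"], download the key from the returned URL, and then check package signatures against it. As noted above, this is only as trustworthy as the DNS answer itself.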