Getting oVirt up and running

The bulk of this post was written way back in April 2012. If you’re just coming here and looking to set up oVirt, you should probably skip down to the postscript for an update, and ignore most of the content here (as it applies to an older oVirt version).

I recently started setting up oVirt, the community version of Red Hat Enterprise Virtualization, at work for some testing (mainly a “sandbox” VM environment, and because Foreman supports it). To start with, I had two nodes, each with two dual-core Xeon processors (VT-x capable) and 20GB RAM, one with 600GB internal storage and one with 140GB internal. While oVirt’s documentation isn’t exactly wonderful, I found a blog post by Jason Brooks, How to Get Up and Running with oVirt, which gives a great walkthrough of getting the oVirt Engine set up on a machine, and also setting up that same machine as a VM host. As oVirt is still fairly young, this is all done on Fedora. I performed my installation via Cobbler, though I’m afraid to admit it was an entirely manual, interactive install.

I did run into a few bumps during Jason’s tutorial. In step 15, adding the data NFS export as a Storage Domain, I was unable to add the NFS export. I found the Troubleshooting NFS Storage Issues page on the oVirt wiki, ensured that SELinux was disabled and that the export had the correct permissions, confirmed that /etc/nfsmount.conf specified Nfsvers=3, rebooted, and then ran the nfs-check.py script. At this point, I was able to add the other storage domains in steps 15 and 16.
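The checks above can be collected into a small shell sketch. This is only an illustration: the export path passed to check_nfs_prereqs is hypothetical, and the expectation that the export be owned by UID/GID 36 (vdsm:kvm) is my reading of oVirt’s NFS guidance rather than something spelled out in this post, so treat it as an assumption.

```shell
#!/bin/sh
# Pre-flight checks before adding an NFS export as an oVirt storage domain.

# Print the NFS version pinned in an nfsmount.conf-style file (last match wins).
# Commented-out lines are ignored because the pattern is anchored at line start.
nfs_version_in() {
    sed -n 's/^[[:space:]]*[Nn]fsvers[[:space:]]*=[[:space:]]*\([0-9.]*\).*/\1/p' "$1" | tail -n 1
}

# Usage: check_nfs_prereqs /path/to/export   (the path is a placeholder)
check_nfs_prereqs() {
    export_dir=$1
    echo "SELinux mode:       $(getenforce 2>/dev/null || echo unknown)"
    echo "Pinned NFS version: $(nfs_version_in /etc/nfsmount.conf)"
    # Assumption: oVirt expects 36:36 (vdsm:kvm) here.
    echo "Export owner:       $(stat -c '%u:%g' "$export_dir")"
}
```

Nothing here changes the system; it only prints what to look at before retrying the storage domain.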

My second issue was that even on Fedora 16, I simply can’t get the SPICE client (through the spice-xpi browser plugin) to work. As far as I can tell from the logs, it looks like spicec is being sent a value of “None” for the secured port parameter, instead of the correct port number. I assume this is a bug in oVirt, but I’ll revisit this problem when I have time. In the meantime, I changed my test VM to use VNC, which is launched by installing the ovirt-engine-cli package (see below) on your client computer, connecting to the oVirt API with ovirt-shell:

ovirt-shell --connect --url=https://ovirt-engine.example.com:8443/api --user=admin@internal --password adminpassword

and then running console vm_name. This launches the vncviewer binary, which is in the “tigervnc” package on Fedora.

Installing ovirt-engine-cli

To run ovirt-shell on your workstation (Fedora 16, of course…) you’ll need the ovirt-engine-cli and ovirt-engine-sdk packages. I manually downloaded them from http://www.ovirt.org/releases/nightly/fedora/16/, versions 2.1.3 and 1.6.2, respectively. The SDK and CLI are Python-based, so there are a few Python dependencies, all of which were automatically resolved by yum. I know there are SDK and CLI packages out there for other distros, but I haven’t tried them yet.

Installing Linux Guests

Installing a CentOS 6.2 x86_64 guest was relatively straightforward, and my usual kickstart infrastructure worked fine. The only catch was the VirtIO storage interface, which shows up as /dev/vdx instead of /dev/sdx; I just added another kickstart metadata option in Cobbler that allows me to use sdx by specifying “virtual=yes” (for our VMware hosts), or vdx by specifying “virtual=ovirt”.
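My actual Cobbler template isn’t reproduced here, but the mapping it performs is simple enough to sketch in shell, for example as it might run in a kickstart %pre section. The function names disk_for and ks_disk_lines are made up for this illustration, and the two kickstart lines it emits are just examples of where the device name ends up.

```shell
#!/bin/sh
# Map the "virtual" metadata value to the disk device name the guest will see.
disk_for() {
    case "$1" in
        ovirt) echo "vda" ;;   # VirtIO block device under oVirt/KVM
        *)     echo "sda" ;;   # "yes" (VMware) or anything else
    esac
}

# Emit example kickstart storage lines for the chosen disk.
ks_disk_lines() {
    disk=$(disk_for "$1")
    echo "clearpart --all --initlabel --drives=$disk"
    echo "bootloader --location=mbr --driveorder=$disk"
}
```

Calling ks_disk_lines with the value of the metadata option keeps every other part of the kickstart identical between VMware and oVirt guests.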

Setting up Authentication

As installed, oVirt only has one user, “admin@internal”; it requires an external directory service for user authentication. Currently, it supports IPA, Red Hat’s Enterprise Identity Management tool (combines RHEL, oVirt Directory Server, Kerberos and NTP; perhaps FreeIPA would work as well?) and Microsoft Active Directory. As much as I’d like to give IPA or FreeIPA a try, my company already has an AD infrastructure, so I opted to go that route. Documentation is given in the oVirt 3.0 Installation Guide, starting on page 96. Unfortunately, I was never able to get AD auth working correctly, so I just worked with the one admin user.

Adding a Node

The biggest issue I had was adding the second node to oVirt. I attempted to use the DVD Import feature of Cobbler on the oVirt Node Image ISO, but that failed. I then found the image’s LiveOS/livecd-iso-to-pxeboot script and used that to make a kernel and initrd, along with the kernel parameters, for Cobbler. PXE works fine.
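A rough sketch of that workflow follows. It assumes the script behaves the way I remember, writing a tftpboot/ directory with vmlinuz0, initrd0.img and pxelinux.cfg/default into the current directory; the ISO path, the PXE_SCRIPT variable and the “ovirt-node” distro and profile names are all placeholders, not what I actually used.

```shell
#!/bin/sh
# Sketch: build PXE files from the oVirt Node ISO and register them in Cobbler.

# Print the kernel arguments from a pxelinux config's APPEND line,
# dropping the initrd= entry (Cobbler adds its own initrd argument).
pxe_kopts() {
    sed -n 's/^[[:space:]]*APPEND[[:space:]]*//p' "$1" | head -n 1 |
        tr ' ' '\n' | grep -v -e '^initrd=' -e '^$' | tr '\n' ' ' | sed 's/ $//'
}

# Usage (as root): import_node_iso /path/to/ovirt-node-image.iso
import_node_iso() {
    iso=$1
    # PXE_SCRIPT may point at the copy found under LiveOS/ on the ISO.
    "${PXE_SCRIPT:-livecd-iso-to-pxeboot}" "$iso" || return 1
    kopts=$(pxe_kopts tftpboot/pxelinux.cfg/default)
    cobbler distro add --name=ovirt-node \
        --kernel="$PWD/tftpboot/vmlinuz0" \
        --initrd="$PWD/tftpboot/initrd0.img" \
        --kopts="$kopts"
    cobbler profile add --name=ovirt-node --distro=ovirt-node
}
```

Pulling the kernel parameters out of the generated pxelinux config avoids retyping them by hand, which is where I would expect most mistakes to creep in.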

Postscript: I ended up blowing away my oVirt installation in favor of testing other things. At some point, the engine install got corrupted in a way that I just couldn’t fix; even though I spent all day one Saturday working on it, it took more time than I could allocate to a personal project. So this post is really semi-complete at best. However, there is some good news. Jason Brooks’ original post, How to Get Up and Running with oVirt, was written for oVirt 3.0, as was this post. Since then, there has been a new release, oVirt 3.1, which apparently has a better UI and a better installer. Jason Brooks has a new post, Up and Running with oVirt, 3.1 Edition, which covers installation and configuration of both an all-in-one machine and a separate node. If you’re looking to try oVirt, I’d recommend you give that a shot. Unfortunately (and strangely, given that this is supposed to be the “upstream” of Red Hat’s proprietary RHEV) it’s still all based on Fedora.

Adjusting the VirtualBox F12 BIOS Boot Prompt Timeout

I’m working from home today, connected by VPN. I’m in the process of testing a bunch of Puppet stuff, and needed to re-image a bunch of VirtualBox VMs on my desktop at work, using PXE boot to Cobbler. I’m only connected to the desktop by SSH, and running the VMs with VBoxHeadless and connecting to them via RDP (well, VRDP). The problem with this is that if I start a VM on my console window, then switch to my RDP client and connect, by the time the VM gets keyboard focus, it’s already past the VBox “Press F12 to select boot device” prompt and booting from disk. I could modify the boot order on the VM, but then that becomes a pain when it reboots after the install.

Thanks to some of the guys on the VirtualBox IRC channel, I found out about the --bioslogodisplaytime option for VMs, which controls the length of time (in milliseconds) that the boot splash screen is shown (the default value seems to be 0). It’s included in the reference guide to VBoxManage in the modifyvm section. Setting this to a value of 10 seconds or so, as shown below, is more than enough for me to start the VM, Alt-Tab to my RDP client, connect to the VM, and hit ‘F12’ to select a one-time network boot:

VBoxManage modifyvm VMNAME --bioslogodisplaytime 10000
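Since the option takes milliseconds, a tiny wrapper can save a mental conversion when applying it to several VMs. The helper names secs_to_ms and set_boot_delay are invented for this sketch.

```shell
#!/bin/sh
# Convert whole seconds to the milliseconds that --bioslogodisplaytime expects.
secs_to_ms() {
    echo $(( $1 * 1000 ))
}

# Usage: set_boot_delay "VM name" 10
# The VM should be powered off when modifyvm is run.
set_boot_delay() {
    VBoxManage modifyvm "$1" --bioslogodisplaytime "$(secs_to_ms "$2")"
}
```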

Using VirtualBox Remotely

At work, I have a pretty beefy workstation (a Dell OptiPlex 990 with a 3.4GHz Intel Core i7-2600 and 8GB RAM running Fedora 16) that I usually run a few VMs on as my test/development environment. I usually reboot my machine every other week or so, and start VirtualBox and my VMs once the system boots. All of the VMs are Linux boxes, running test-only, so I never really cared about RDP or anything like that. Today I’m working from home and need to set up a new development environment, so here’s how to get VirtualBox working nicely, assuming you’ve never set it up for VRDP (its Virtual Remote Desktop Protocol) before and have a network connection (LAN or VPN or something) to the machine running VirtualBox. I currently have VirtualBox OSE 4.1.8 installed from the RPM Fusion RPM. Most of this can be found in Chapter 7 of the VirtualBox manual, but here’s a step-by-step method.

  1. First, download the (non-free) Oracle VM VirtualBox Extension Pack from the VirtualBox Downloads Page, which provides VRDP support (as well as support for the virtual USB 2.0 device, Intel PXE Boot ROM support for the E1000 NIC driver, and experimental Linux host PCI passthrough support). Then install it using:
    sudo VBoxManage extpack install Oracle_VM_VirtualBox_Extension_Pack-4.1.8-75467.vbox-extpack
  2. Assuming you have an existing VM (you can list them using VBoxManage list vms), enable VRDP support on it:
    VBoxManage modifyvm "VM name" --vrde on
  3. I like to assign specific ports to VRDP on each VM so I can “bookmark” them in my KRDC client by VM name. I generally start with 10011, as the 10011-10049 range is both unassigned and doesn’t appear in my /etc/services:
    VBoxManage modifyvm "VM name" --vrdeport 10011
  4. Start the VM, using VBoxHeadless (shows more debugging/errors, but also stays in the foreground, so you’ll want to use screen or something like it):
    VBoxHeadless --startvm "VM name"

    If all went well, it should show some output including a confirmation that the VRDE server is running on the correct port:

    Oracle VM VirtualBox Headless Interface 4.1.8_OSE
    (C) 2008-2012 Oracle Corporation
    All rights reserved.
     
    VRDE server is listening on port 3389.
  5. That’s it. Assuming you’re using something like screen, you can start a whole bunch of new VMs, and still keep the VBoxHeadless output in case of an error.
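Steps 2 through 4 can be scripted for every registered VM at once. The sketch below assumes VBoxManage list vms prints one line per VM in the form "name" {uuid}, that all VMs are powered off when modifyvm runs, and that screen is installed; the function names and the “vbox-” screen session prefix are my own inventions. It does not check that the ports stay inside the 10011-10049 range.

```shell
#!/bin/sh
# Extract VM names from `VBoxManage list vms` output read on stdin.
vm_names() {
    sed -n 's/^"\(.*\)" {[-0-9a-fA-F]*}$/\1/p'
}

# Enable VRDE on every VM, assigning sequential ports from $1 (default 10011).
enable_vrde_all() {
    port=${1:-10011}
    VBoxManage list vms | vm_names | while IFS= read -r vm; do
        VBoxManage modifyvm "$vm" --vrde on --vrdeport "$port"
        echo "$vm -> $port"
        port=$((port + 1))
    done
}

# Start one VM headless inside its own detached screen session,
# so the VBoxHeadless output stays available for debugging.
start_headless() {
    screen -dmS "vbox-$1" VBoxHeadless --startvm "$1"
}
```

After enable_vrde_all, the printed name-to-port list is what I would bookmark in the RDP client; any client that accepts host:port should work the same way.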

Virtualization Options

As I mentioned in “Downtime past few days, coping with storms”, a recent power outage made me notice a few things, and as a result I’ve decided to take the leap to virtualization. Given the cost of current hardware that supports HVM (Intel VT-x or AMD-V), I immediately decided that I might as well give up on any thoughts of doing full virtualization or getting new-ish hardware. So I settled on the next step up from what I have now – a set of HP Proliant DL360 G3 servers. I got them with a 90 day warranty from a reputable dealer, with dual 2.8GHz Xeons (512K cache), 2GB RAM, dual 36.4GB U320 15k RPM SCSI disks and dual power supplies, for $99 each. My next step is to decide what virtualization software to use.

My main goals for the project are:

  • Lower power consumption through consolidation of servers.
  • Possibility to add capacity or resources by remotely powering up an idle server and migrating VMs to it.
  • Limited fault tolerance – ability to manually restore a VM that was running on failed hardware, onto an idle server.

I originally thought Xen, just out of reflex. However, given that all of my servers have the same base – the same distribution and, ideally, the same kernel and patch level – it seemed like a lot of overhead to duplicate that for multiple VMs. So I started looking into OS-level virtualization. There are relatively few options, and I’ll admit that aside from Solaris Containers (which I learned about while working at Sun) I don’t know much about it. But OpenVZ seems to be the front runner in that area. My initial impression was that it made a lot of sense – keep one common kernel, but allow containers/virtual environments (CTs/VEs) to have, essentially, their own userland. Unfortunately, it doesn’t seem to be as hyped as Xen, and I haven’t heard very much about it in the enterprise context. And it requires running a kernel from the OpenVZ project, which means I can’t just script updates through yum as easily as normal.

On the up side, OpenVZ would allow me to eliminate the duplication of the kernel, and seems to have much less overhead than Xen (and logically so). On the down side, I lose the ability to virtualize other OSes or kernel versions, or to make pre-packaged VMs. I’ve decided that if I wanted to do that, I could dedicate a single machine.

I’ve spent the last day or so doing a lot of research, and have come up with the following questions and concerns about OpenVZ which I hope to be able to answer (I’ll post the answers in a follow-up).

  • How do I handle distribution and kernel upgrades? The logical solution would be to migrate the CT to another host while I upgrade CT0 (the hardware OS/host/dom0 in Xen speak). But if the guest and host kernels must match, how does this work?
  • Can I do package upgrades within the guest/CT easily? Will this play well with Puppet?
  • How will I handle backups? Is it logical to run bacula within each CT, or just on CT0? If just on CT0, how do I easily verify that a particular CT was backed up?
  • Will everything play well with Puppet? (see below)
  • Am I willing to throw away my KickStart-based installs? And, similarly, am I willing to give up the possibility of migrating from a container to a Xen host or a physical host (easily)?
  • OpenVZ live migration relies on rsync. This means that there’s a significant delay (compared to shared storage) and also that I can’t migrate off of a host that’s down. Is there a way around this?
  • Similarly, live migration requires root SSH key exchange (passwordless) between the hosts. This seems about equivalent to using hosts.equiv. Do I really want root on one box to mean root on another box (and all of the containers on that box)?
  • Can I still firewall CT0? How will this work?

It seems to me that OpenVZ may be significantly less enterprise-class than Xen. Sure, this is just my home setup, but I hold it to the same standards I use for my work systems. In fact, I usually test new technologies at home before I suggest them at work. A lot of the writing on the OpenVZ wiki seems to be riddled with spelling errors. They claim “zero downtime” live migration, but if they have to rsync 2GB of MySQL tables, that sounds like a lot more than “zero”. And, most shockingly, the Hardware testing wiki page talks about making sure your hosts aren’t overclocked or undercooled, and running cpuburn to test your system under high load. Sorry, but the engineers at HP, Sun, IBM, etc. handle that for me and most people I know. So, I’m a bit worried about the seriousness of the OpenVZ project.

Most worrisome is a post I found in the OpenVZ forum, “Stopping puppet on hn stops it in all VE”. It seems that, since CT0 is aware of all of the guest container processes, they show up in ps lists. Most, if not all, Red Hat init scripts use killproc to stop and restart services. This means that a service syslog stop on CT0 (the host) will stop all syslog processes, including all of those in the CTs. This seems like a major issue. Sure, I could replace killproc on CT0 with a script that parses the process list, isolates the PIDs for those running on CT0, and kills them. But what else needs to be fixed? Nagios check scripts would need to be adjusted. Is there anything else that would come back and bite me?
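To make the killproc idea concrete, here is a minimal sketch of the filtering step. It rests on an assumption I have not verified on my own hardware: that the OpenVZ kernel exposes each process’s container ID as an “envID:” line in /proc/PID/status, with 0 meaning the host. The function name host_pids is made up, and the optional second argument exists only so the logic can be tried against a fake /proc tree.

```shell
#!/bin/sh
# List PIDs of processes named $1 that belong to the host (CT0) only.
# $2 is the proc root to scan (default /proc).
# Note: the Name field in /proc/PID/status is truncated to 15 characters.
host_pids() {
    name=$1
    proc=${2:-/proc}
    for status in "$proc"/[0-9]*/status; do
        [ -r "$status" ] || continue
        pname=$(sed -n 's/^Name:[[:space:]]*//p' "$status")
        envid=$(sed -n 's/^envID:[[:space:]]*//p' "$status")   # assumed OpenVZ field
        if [ "$pname" = "$name" ] && [ "$envid" = "0" ]; then
            pid=${status%/status}
            echo "${pid##*/}"
        fi
    done
}
```

A replacement for killproc on CT0 could then pass the output of host_pids syslogd to kill, leaving the containers’ own syslog processes alone. On a kernel without the envID field the function prints nothing, which at least fails safe.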

The bottom line (and I guess this is logical) is that containers in OpenVZ will look – and act – a lot less like a logical host than they would under Xen.

Microsoft and Novell Deliver Joint Virtualization Solution – or do they?

From PRNewsWire: Microsoft and Novell Deliver Joint Virtualization Solution Through Partners. The headline of the press release: “Supported by Dell and other channel partners, solution includes SUSE Linux Enterprise Server running as optimized guest on Windows Server 2008 Hyper-V.”

Now, maybe I’m not up on the news regarding my favorite distribution, but it seems to me that a deal allowing SuSE to be virtualized as a guest under Windows is not only “joint”, but plain moronic. Despite the marketing efforts of Microsoft, Unix-based systems (including Linux) have always had the upper hand in availability, reliability, and performance.

I must say, from what I’ve heard, Windows Server is getting *much* better in these areas – and I’ve even heard that the latest version includes an option to install without a graphical environment, and even includes a command-line that’s useful. It’s about time.

However, it seems to me, that any virtualization deal between Microsoft and a Linux distributor can provide only one logical solution: Windows Server virtualized as a guest in a high-availability Linux host. More importantly, without the insane per-processor licensing – a per-VM instance license that’s hardware-agnostic and allows VMs to be migrated across hardware as the admin sees fit.

Oh, and one more insight. If Microsoft wants to be a serious player in the virtualization arena, here are a few “simple” steps:

  1. Get Windows Server to work correctly under Xen, VirtualBox, etc. Certify it. Provide the correct guest OS tool packages.
  2. Provide simple management of Windows in a virtualized environment – minimally, a standard SSH server that’s compatible with OpenSSH, a GUI-less environment, and a serial console.
  3. Get rid of per-processor licenses. Provide a basic license that allows for, say, 10 VMs to be running at once, and allows as many installs as needed – the only licensing is based on the number of VMs actually running, i.e., if you have 10 VMs and one gets corrupted, you can bring that one down and bring a back-up image online without violating the license.
  4. Make licensing processor-agnostic. Want to migrate a Xen VM (Windows guest) from a dual-core Pentium to an 8-core Xeon, or even a 16 processor SPARC? Sure, no problem.

Links for 2008-02-23

Some links for today:

Microsoft’s new promises on interoperability, open standards, etc. – somewhat ironic given the Office Open XML debacle on “standards”. And Red Hat’s worries about it. (Ars Technica)

Groklaw’s lengthy analysis of the promises.

Pakistan removed from the Internet, causes global YouTube outage.

A Guardian article on the WikiLeaks debacle – perhaps the biggest affront to the First Amendment this year.

An InformationWeek article about some guys from BlackHat D.C. who said that they will be able to crack GSM encryption in under 30 minutes with $1,000 of technology, or 30 seconds with $100,000 (FPGAs – maybe a cluster of PS3s?).

A Princeton University blog about cold boots possibly being able to crack the Windows BitLocker system.

Yay! Firefox has hit its 500 millionth download!!! And there was much rejoicing…

An ArsTechnica article on Internet Explorer, what should be done to fix it, and how there can still be a non-standards-compliant browser.

Jeremy’s Blog – the mind behind LinuxQuestions.org provides a recap of the 2007 LQ Members’ Choice awards. Some interesting winners were VirtualBox for virtualization package, Debian for server distro, Knoppix for Live Distro, Eclipse for IDE/Web Development Environment, Python for language of the year, and – much to my chagrin – vi/vim for editor.

A LinuxJournal article on What’s Next for Open Source and Public Media.

LinuxInsider – EU taking Microsoft’s promises with a grain of salt, noting that MS has made “at least four similar statements” in the past.

Chris Siebenmann – Where the risk is with virtualization (and iSCSI) and Wireless, machine rooms, and the Asus eeePC.

IBM DeveloperWorks – OOXML: What’s the big deal? – outlining the technical objections to OOXML as a standard. Linked from a rootprompt.org article mentioning that “OOXML is essentially a complete replication of every chunk of data that a Microsoft Office application might possibly save in a file”.

Slashdot YRO – a guy who got his stock photos stolen, entered into a long legal battle, and won.

Microsoft’s Windows Vista Capable lawsuit granted class-action status.

A Washington Post article on Hans Reiser’s Geek Defense strategy.

A Slashdot post linking to news that Apple sent a cease-and-desist order to the Hymn Project, which produces software to remove DRM from iTunes songs. Apple had their ISP remove all download links. (I guess the only solution is for us all to buy bandwidth right from an NSP…)

Yahoo’s shareholders are suing it for not gobbling up the Microsoft deal.

Comcast getting sued AGAIN for P2P filtering.

A leaked RIAA training video for prosecutors, going so far as to say that IP piracy can lead to arrests for drugs, weapons, or terrorism. It also includes instructions on how to get a RIAA investigator certified as a court expert.

A New York Times article on – gasp – women using the Internet. Linked from Tom Limoncelli’s blog.