Archive for the ‘Reviews’ Category

Book Comments: The Future of the Internet and How to Stop It, by Jonathan Zittrain

December 3rd, 2009

Last week I happened to find a Barnes & Noble gift card in my wallet, with $75 left on it. What a wonderful discovery! One of the pile of books that I ordered was The Future of the Internet–And How to Stop It by Jonathan Zittrain. I’d fully intended to read the book cover-to-cover, perhaps even digest the content a little, before throwing my thoughts out there (presumably to get lost in the vast sea of crap that makes up the “blogosphere”). But I just have to get some thoughts down on paper…err…LCD.

First off, when I found out that Zittrain is a professor of Internet Law at Harvard, it immediately told me two things. First, that he probably sides with content producers and/or Big ‘Net a bit too much. Second, that he probably doesn’t really understand what the hell he’s talking about, or why people made the choices they have. The fact that the first chapter of the book, which talks about history, doesn’t seem to mention ARPAnet once only confirms this. But the B&N summary sounded like the book has a healthy dash of iPhone bashing, so I figured it’d be a good read. It was also written in 2008, so I figured that the ideas would be relatively current.

Well, I’m just under a quarter of the way into the book, and given the vast mass of notes I’ve penned in the margins, I think Mr. Zittrain and I wouldn’t get along too well on a desert island. But I’ll try to contain my commentary – and attacks upon the author – until I’m done with the book. The thought currently in my mind is a very specific one, prompted by this passage from the book:

Many technologically savvy people think that bad code is simply a Microsoft Windows issue. They believe that the Windows OS and the Internet Explorer browser are particularly poorly designed, and that “better” counterparts (Linux and MacOS, or the Firefox and Opera browsers) can help protect a user. This is not much added protection. Not only do these alternative OSes and browsers have their own vulnerabilities, but the fundamental problem is that the point of a PC – regardless of its OS – is that its users can easily reconfigure it to run new software from anywhere.

To be sure, Microsoft Windows has been the target of malware infections for years, but this in part reflects Microsoft’s dominant market share.

Oh, wow, is it 2004 again? I thought we’d given up on the “market share” argument. When Apache had 10% of the market share of web servers, people said it wasn’t attacked as often because of low market share. Well, Apache currently has a 47% share of the market, compared to Microsoft’s 21%, and it’s still more secure, more stable, and has fewer critical vulnerabilities*. The same market share argument was made about Firefox when it had 5% market share. Now, the share is projected between 31.85% and 47%, and it still has fewer serious vulnerabilities (ones that can actually damage your computer) than Windows. I thought this “market share” argument was done with.

Most important is the thing that most Microsoft-biased pundits (and, of course, Microsoft themselves) don’t ever talk about: an amazingly large number of servers run Linux. Especially e-commerce servers which house loads of personal information and credit card numbers. Estimates for big e-commerce sites put non-Windows OSes at 30-50%, and they’re quite popular among small sites that probably don’t have well-trained SysAdmins. So, if Windows wasn’t really less secure, wouldn’t we see e-commerce servers getting compromised left and right?

But there’s a more important point here. It’s about curtailing the stupidity of users. I know, in Microsoft’s defense, that Windows Vista and Windows 7 are supposed to be better with this. But, at least in the past, Windows had virtually no privilege separation. With a little code, you could affect the whole system from an arbitrary binary – or worse, with ActiveX, through the browser. I was dumbfounded that any user could install a system-wide application. The real issue here, at least with older Windows (I don’t know much about the new ones), is that Windows, from the beginning, wasn’t written to be secure. Heck, it wasn’t even designed to be attached to a real network.

Linux does have real security advantages over Windows, and not just because it has low market share. First is an actual, true implementation of privilege separation. No matter what I do in my desktop web browser, no matter what I run, even if I installed a Firefox plugin that wanted to destroy my machine, it couldn’t happen. No matter what I let some random code do, it simply can’t escape the confines of my user account.

Ok, ok, I know what you’re all saying right now. I can hear it from here: “but what if the moron does everything as root? what if they just sudo anything that they’re asked about?” Well, I have answers to that, too. My own distro of choice, OpenSuSE, greatly upset me when I went to install 11.1, and the installer showed a default of one user account, automatic login, and the same password for the user and root. That’s just stupid. In fact, it’s braindead, plain and simple. I don’t care how wonderful it would be to get Linux on every desktop in the world; if we have to destroy every advantage that Linux has over other OSes, it will be worthless.

I digress. In the end, it boils down to user education. And, in some respects, I think that Linux has become too dumbed-down. There are certain things that simply shouldn’t be put in a GUI. Excuse my elitism, but if you can’t figure out how to configure Apache correctly from the command line, you have no business running an Apache installation. The same goes for countless other services and applications. So, what’s my solution? Well, here’s what I do when I install Linux for non-technical friends. Some of these things are training items, others are things that I do in terms of configuration and, IMHO, should be OS/distro defaults (unless you know some esoteric hidden switch to change them).

  • Disable graphical login as root. This enforces proper use of sudo, and also prevents a user from becoming lazy and operating as root on a regular basis.
  • Pick a good, strong root password. Write it down on a post-it note and keep it somewhere near the computer. (Yes, I know what you’re thinking. But if it’s a home computer, anyone already in the house either is trusted, or will own the computer one way or another. I’d rather have everyone in the house have access to the box, than a password that a remote attacker can easily brute force.)
  • Disable caching of sudo passwords in the desktop manager, if it isn’t already done. Caching is a *very* bad idea, IMHO, and effectively defeats privilege separation. If someone needs to use sudo *that* often, they’re either a knowledgeable user, or they’re doing something wrong.
  • Set the package manager to use the strictest key verification settings.
  • Provide the user with extensive documentation (can be a list of links to helpful sites) that includes – this is of paramount importance – a list of common Windows (or whatever OS they’re coming from) programs and their closest Linux equivalents. This is another measure to try and dissuade the user from searching for and installing arbitrary code.
  • Give the user a good, simple explanation of what sudo is, what root is, and why they should be worried. One of my analogies – if I have time to explain it – is to think of the computer’s security like a jewelry store. Your user account is the front door; only people who look honest are buzzed in, but they still can’t do much damage. The root password is the combination to the vault; only very trusted people can get in, and they only open it when they absolutely have to.
  • Enable a wide range of trusted repositories by default. The more likely the user is to find a package in the repos already cached, the less likely they are to download arbitrary code.
  • Explain to the user that when you install software (as root), you’re essentially giving the developer access to your system. Software should be screened by someone who knows what they’re doing (i.e. the community) before you install it.
  • I always tell people to *only* install software from the repositories I enable. If there’s something they need and it isn’t available, ask me (or ask the community) and I’ll make a package and upload it to a suitable repository. The key here – and the most difficult part – is to conquer the Windows habit of installing software from disparate sources, and train the user that only software from their repositories, or other community-standard repositories, can be trusted.
  • Show the user the correct patch/update procedure for their system. Depending on skill level and the level of attention you’re willing to give them, it might be advisable to enable automatic updates (if the OS doesn’t have a way to do it, then via cron).
  • If the user is a developer or needs to run any services, even just for development – i.e. Apache, MySQL, Postfix, etc. – properly secure them and give an overview and links to the proper security procedures.
  • Set up a second user account. Explain to the user that this is only to be used for banking and other sensitive activities. Lock it down, make sure it’s in a different group from the main user, and don’t install any Firefox plugins.
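The sudo caching item is the easiest one on this list to check mechanically. Here is a minimal sketch, assuming the caching in question is sudo’s own credential timestamp (controlled by the `timestamp_timeout` option in sudoers) rather than something a desktop front-end does on its own. The function names are made up for illustration, and the parser only looks at plain global `Defaults` lines, so treat it as a quick sanity check rather than a real sudoers parser.

```python
import re


def sudo_timestamp_timeout(sudoers_text):
    """Return the last global timestamp_timeout value found, or None if unset.

    Simplification: everything after a '#' is treated as a comment, and only
    plain 'Defaults' lines (not 'Defaults:user' or 'Defaults@host') are read.
    """
    value = None
    for raw in sudoers_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not re.match(r"Defaults(\s|$)", line):
            continue
        # A Defaults line holds comma-separated settings, e.g.
        # "Defaults env_reset, timestamp_timeout=0"
        for setting in line[len("Defaults"):].split(","):
            name, sep, val = setting.strip().partition("=")
            if name.strip() == "timestamp_timeout" and sep:
                value = float(val.strip())  # later settings override earlier ones
    return value


def caching_disabled(sudoers_text):
    """True when sudo is told to ask for a password on every invocation."""
    return sudo_timestamp_timeout(sudoers_text) == 0
```

Feeding it the text of the sudoers file (read as root) and getting `None` back means the option isn’t set at all, in which case sudo falls back to its compiled-in default of several minutes.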

Unfortunately, a lot of this is just breaking the bad administration and security habits shared by most Windows users.

While we’re on the topic, a word about package managers. I’m a Linux sysadmin, and I believe in ‘eating your own dog food’. I’ve used Linux on all of my servers, desktops, and laptops for over 4 years now. I haven’t used Windows on a regular basis in ages. I’d say I touch a Windows box for about 5 minutes a month, and usually just to use a browser. A few weeks ago, I was asked to install Windows on a desktop for someone. I did. I then attempted to install Firefox. Using what I remembered of Windows, I navigated to the “Control Panel” and clicked (err… double clicked) on “Add and Remove Programs”. Seems logical enough. I then stared at the screen for about 30 seconds, trying to find the Search box, where I could type in “Firefox”. Finally, I literally began laughing out loud, when I remembered that Windows doesn’t have unified package management, and I’d need to manually find the Firefox binary on their web site, download it, and run whatever installer program Firefox chooses to use. Same issue with updating software. I’m utterly perplexed, being a Linux user, that Windows and Mac people still search through Google or multiple web sites just to find new software. I’m even more perplexed that the OS update/patch program doesn’t also update all of the software on the system. It seems like the stone ages.

In my opinion, one of the biggest failings of modern Linux package management is the assumption (derived from multi-user systems) that all software should be installed system-wide. Granted, it doesn’t do a whole lot to actually protect a single user if they install malicious software available to just themselves (especially since most desktop installs these days are probably used as single-user systems), but I really feel that distros (especially desktop-oriented distros) should have an option to easily install packages for just the current user, and possibly do this by default.

* I can’t find the link right now, but I did find an interesting article on Microsoft’s old anti-Linux campaign (“Get the Facts”). One of the things mentioned was that when Microsoft compared “vulnerability counts”, they were actually: 1) comparing entire Linux distros vs just the core Windows OS, and 2) counting individual patches in Linux versus patch sets released by MS. So, not only was MS literally comparing apples and oranges, but they were totally ignoring unfixed vulnerabilities. Given Microsoft’s habit of not fixing vulnerabilities – especially in “unsupported” products – it’s no wonder they got the numbers to look so good.

So, here’s a thought. People are used to paying for an OS and for software. Start a Linux vendor that sells a desktop, newbie-oriented Linux distro. Charge a per-user flat rate for the distro and a bunch of base packages, that includes X hours of telephone support. Charge per hour/minute/whatever for additional support. Bundle in secure VNC, secure remote access, etc. in a way that will allow support to remotely access the computer, but preserve the privacy and security of the user (perhaps an app that allows the user to initiate a reverse VNC or SSH session to support). Lock down root access – allow the user to do it, but remind them every time that, outside of a specified set of commands, their actions will be logged and won’t get full support. Then figure out a way for support to write a shell script that’s sent to the user to perform administrative actions, which will all be listed in relatively simple terms for the user to examine and approve. Finally, have a *giant* package repo, all of which is free or comes with paid support. Any F/OSS packages that aren’t already in the repo can be requested by a customer, and for a flat fee for the first requesting customer (say, $10) will be examined, approved, packaged, and added to the repo.

Reviews

Acer X233Hbid Review

May 18th, 2009

I just bought myself a new monitor for my MythTV box, as I’ve moved my beautiful Acer AL2416W 24″er to my new desktop. The chosen monitor, based on price, reviews and features, is the Acer X233Hbid. It’s a 23″ 16:9 (not 16:10) monitor that runs at 1920×1080, provides true 1080p, and has an HDMI input (not that I’d ever use a restricted connection). After a few minutes of having it turned on and running, the picture quality is quite nice, even with quite a bit of glare.

However, I have two major complaints within the first ten minutes of unboxing it:

  1. No real manual, nor an online copy. The monitor comes only with a Quick Start Guide. There’s no printed full manual. More distressingly, it isn’t even listed in their list of monitor models on their Support site. There’s no manual copy online either. There was a CD provided with the user’s manual on it. However, for a company that sells netbooks with no CD drive, this seems like quite a bad decision. But why, you ask, would I need a manual for my monitor?
  2. No VESA mounting instructions. One of my main criteria in choosing a monitor was that it allow VESA mounting, as I have my MythTV monitor on a monitor arm (easily adjustable angle so others in the room can see). The Acer X233Hbid has a 100×100mm VESA mounting space on the back. However, in a rare design mistake (unlike my 24″ Acer AL2416W), the monitor stand is two parts – one rectangular column about 4″ long attached to the back of the monitor, and a base with a column which mates with the one on the back of the monitor. Unfortunately, the column part on the back of the monitor came pre-attached, and there was no mention in the manual of VESA mounting or how to remove the column.

Column removal: The part of the monitor base which ships attached to the monitor is a fairly easy removal. Though I was originally worried about breaking something on my beautiful new screen, I found two plastic pieces on either side of the pre-attached part of the base which appeared to be snap-in trim pieces. Prying them off with a screwdriver revealed four screws which hold this piece to the monitor. Not only was removal easy, but the trim pieces snapped back into place for a nice clean look.

Reviews

Brother HL2170W DHCP Problems

April 11th, 2009

Two weeks ago, I wrote about the Brother HL-2170W that I got for my mother. It seemed absolutely wonderful. Until the night of March 31st, just before Conficker was supposed to strike. Being that mom’s computer is the only one at home running Windows, I finished up a long-standing project – and moved her desktop, printer, and AppleTV over to a separate VLAN that’s not routable to anything else internal (i.e. anything important, or anything of mine). I’d already had the “client” VLAN set up for a while, so it was just a matter of tweaking the firewall rules and moving the static DHCP assignments from one subnet to the other.

Well, that’s where the problems started. While her Windows XP desktop and AppleTV coped nicely, and got their new addresses in DHCP as they should, the HL2170W did not. As a matter of fact, after two hours, I hadn’t seen a single DHCP request, even though I had the lease time set to 10 minutes for both subnets. So, I tried administratively downing the switch port a few times, to no avail. After a day of waiting, I came back to the problem – and still found nothing in the DHCP logs from that printer. I emailed mom and asked her to power-cycle it a few times… still nothing! It wasn’t even requesting DHCP when rolled, let alone at a regular interval!

Fast forward a week or so, to today. I’m ready to call Brother Support, as my mother hasn’t had use of her new printer in a week and a half. I’m infuriated – I’ve rolled the printer dozens of times, and not a single event in the DHCP log. I know it’s sending traffic from the port – I’ve reset the counters and they’re changing. I tried moving it back to the original VLAN and confirmed that it still has its original IP. I could get into the web interface via lynx and *tell it* to refresh DHCP, but this seemed quite pointless – there’s no way it’s physically possible to send the web request and then switch the port to the new VLAN before it gets DHCP.

So, I’m ready to call Brother Support. I then notice that the printer is turned off at the moment. From the switch log, it looks as though it’s been powered off for six days. So, I turn it on. And then go about starting my prep for the Brother call, first opening up a tail on the DHCP server log, grepped for the proper interface. And, wouldn’t you know, as I get up to let the dog out, the printer starts spitting out pages!

As far as I can tell, there’s something seriously wrong with the Brother HL-2170W DHCP implementation. Specifically, it didn’t get an address on the new VLAN until it was powered off for a *long* time. Even reboots wouldn’t trigger a request, until the box had been powered off for days. More importantly, though, it seems that it only gets DHCP once when it boots, and totally disregards the lease time!
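To put a number on how wrong this is: per RFC 2131, a DHCP client should try to renew its lease at T1 (by default half the lease time) and fall back to rebinding at T2 (by default 0.875 of the lease time). The sketch below works out those timers for a given lease and estimates how many renewals a well-behaved client would make in an observation window. The function names are mine, and the estimate assumes every renewal succeeds at T1 and that the server keeps handing out the same lease time.

```python
def dhcp_timers(lease_seconds):
    """Return the default (T1, T2) renewal and rebinding times in seconds."""
    t1 = lease_seconds * 0.5    # client starts renewing with its own server
    t2 = lease_seconds * 0.875  # client starts rebinding via broadcast
    return t1, t2


def expected_renewals(lease_seconds, window_seconds):
    """Rough count of renewals a conforming client makes during the window.

    Assumes each renewal succeeds right at T1 and the lease time never changes.
    """
    t1, _t2 = dhcp_timers(lease_seconds)
    return int(window_seconds // t1)


if __name__ == "__main__":
    lease = 10 * 60        # the 10-minute lease from the post
    window = 2 * 60 * 60   # the two hours spent watching the log
    print(dhcp_timers(lease))                # (300.0, 525.0)
    print(expected_renewals(lease, window))  # 24
```

So with a 10-minute lease, two hours of watching the log should have shown something like two dozen renewals from a conforming client, not zero.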

Reviews

Brother HL-2170W – great features from a personal laser printer

March 18th, 2009

I’ve posted an update about serious DHCP problems with this unit.

Last week my mother’s printer died, and she asked me to find a new one for her. After a quick look on NewEgg (sort by ratings is a wonderful thing) I found the Brother HL-2170W. Aside from having a wireless interface (only a security hole, as far as I’m concerned) it seemed pretty cool – tiny B&W laser, Ethernet, PCL6, 23ppm, 32 MB RAM, 250 sheet capacity and 2400×600 dpi. So, for a mere $99 USD, I bought it for her.

When the printer showed up, I was a bit let down to find no sticker bearing the MAC address on either the box or the printer itself – and given the one-button hard control, there wasn’t a way to manually print a config sheet. So, after plugging it into the network and using the DHCP logs to give it a static assignment, a quick reboot of the printer had everything working. As usual, I skipped to the last few pages in the installation manual, and found the ½ page section on the web interface. Configuration was pretty simple – change the admin password, disable a bunch of unneeded services, etc. And then, when playing around with the admin interface, I found a bit of a holy grail – there in the enable/disable services screen were two options that I found unusual for a “personal” printer: Telnet and SNMP. I immediately tried both. An snmpwalk revealed the usual (RFC1213, HOST-RESOURCES, and Printer-MIB) including information on status and consumables. Though the Telnet login process wasn’t terribly intuitive, “help” revealed familiar set/show/clear functionality as well as an option to zero out counters. While I was a bit let down to see that Telnet didn’t offer a way to view consumable status or printer status, it did allow access to every conceivable configuration parameter, including a few that weren’t mentioned on the web interface.

All in all, while I can’t comment about reliability or quality yet, this cute little printer seems to have quite a feature set, especially when it comes to manageability and remote troubleshooting (a good thing for any printer that’s used by a family member who you support). And best of all, it supports IPP and LPR.

Reviews

ROUThost DNS problems; GoDaddy and Security through Obscurity

February 25th, 2009

The external-facing web site and (internal use) mailing list for the ambulance corps is hosted by ROUThost. Not my choice, it was inherited. ROUThost, first off, appears to be a fly-by-night hosting provider that just buys a few boxes in a colo facility. I should have known to raise a stink when they said you need to fax a copy of your driver’s license to get SSH turned on, and that you have to agree – in legalese – not to mess with anyone else’s configs. Well, last night, DNS for the site went down. As in nothing, wouldn’t resolve at all. I submitted a ticket online for ROUThost’s “24×7” support – by the way, they don’t have a phone number, only an online ticket form. After 2h 34m 40s of downtime, the issue resolved itself and I downgraded the ticket from “critical” to “medium”. Now, 11 hours later, it still hasn’t been replied to. And my emails to support and management – 2 hours ago – are unanswered.

Once the problem started, I knew the yearly contract with ROUThost was a bad idea – even at $35/year USD. So, given the great experience I’ve had with GoDaddy as registrar for my myriad domains, I took a look at their site. They offer shared hosting at around $4/month (for shared on a Linux box) and are currently offering some deals, so I figured it would be a good idea. I know and trust GoDaddy’s support, and have had an account with them for quite some time.

The ambulance corps’ web site, hosted through ROUThost, provides essentially three things: a minimal web presence (the whole web root is probably < 1Mb minus the photo albums), five e-mail forwarders for the officers, and a GNU MailMan mailing list for internal business. Unfortunately, I couldn’t find anything in GoDaddy’s “features” list mentioning MailMan or any other listserv, or even what MTA/MDA they run.

I put a call in to GoDaddy “Sales/Support”. The poor guy had never heard of MailMan, but asked “one of the hosting guys” and was told it would only be supported on dedicated hosting accounts. Not exactly financially feasible for a mailing list with 30 subscribers, maybe 2 messages a day, and a monthly HTTP transfer of under 20Mb. I was told their shared hosting packages don’t include any mailing list/listserv software, though they include every CMS and language known to man. Hell-bent to get away from ROUThost, I then asked if they ran an MDA that supported piping mail to a command, as can be done with .procmailrc. After a brief hold (not to sound cynical, but I’m sure the gentleman was looking up “MDA”) he came back on the line and told me they didn’t. I then switched to problem-solving mode and asked what MTA and MDA they were running. Another brief hold, and I was told “I can’t tell you that”. Speechless for a moment, I asked what that meant; “we don’t give out that information”. Just about ready to begin explaining SMTP headers, I gave up and thanked him for his time.

Ok, so Sales probably doesn’t understand SMTP headers. I’d considered trying to find mail from a GoDaddy Linux hosted box and check the headers, but I figured I couldn’t do that before the call ended. So, now I’m left with a dilemma. ROUThost is not, in my opinion, reliable, and their support is flat-out nonexistent. 11 hours is far too long to wait for a reply to a “critical” ticket when someone claims 24×7 support. However, by previous experience, GoDaddy would be my next choice – but not only do they not support mailing lists – arguably the most used feature of our current hosted account – but they won’t even tell a customer what MTA they’re running. I’m too let down by this to telnet 25 on one of their boxes and see what happens.
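For what it’s worth, the header check I had in mind is simple enough to sketch. Many MTAs name themselves in the Received: headers they add, so scanning a message that passed through the host for familiar names gives a decent first guess. The list of names, the function name, and the sample message below are all hypothetical illustrations (example.org/example.net and 192.0.2.10 are documentation placeholders), and headers can be rewritten or stripped, so a match is a hint, not proof.

```python
from email import message_from_string

# Names that commonly show up in Received: headers; far from exhaustive.
KNOWN_MTAS = ("Postfix", "Exim", "Sendmail", "qmail", "Microsoft ESMTP")


def guess_mtas(raw_message):
    """Return the known MTA names mentioned in a message's Received: headers."""
    msg = message_from_string(raw_message)
    found = []
    for header in msg.get_all("Received", []):
        for name in KNOWN_MTAS:
            if name.lower() in header.lower() and name not in found:
                found.append(name)
    return found


# Hypothetical message, made up purely to exercise the function.
SAMPLE = (
    "Received: from mail.example.org (mail.example.org [192.0.2.10])\n"
    "\tby mx.example.net (Postfix) with ESMTP id ABC123\n"
    "\tfor <list@example.net>; Wed, 25 Feb 2009 10:00:00 -0500\n"
    "Subject: test\n"
    "\n"
    "body\n"
)

if __name__ == "__main__":
    print(guess_mtas(SAMPLE))  # ['Postfix']
```

An empty list just means none of the listed names appeared; plenty of servers are configured not to announce what they run.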

So what’s left? I guess waiting until (hopefully some time within the next few weeks) I upgrade to Optimum static IP at home, and consider running it all there (and hope mains power never goes out for more than 30 minutes?)

Ideas and Rants, Projects, Reviews

Practical PHP and MySQL

June 26th, 2008

I’m taking a summer course in Building Data Driven Websites – not that I thought I’d learn much in such a course at SCILS, but I’d like to graduate on time, and need the credits, and Bill Crosbie is just the type of rare teacher that can keep even me awake and interested. Our book is Practical PHP and MySQL: Building Eight Dynamic Web Applications (Amazon) by Jono Bacon. Now, I know it’s not a real book like, say, ESA3 by Frisch, which has a healthy web presence. But this thing is all code and doesn’t even have a web site, let alone easy code downloads!

The book does come with a heavily customized Ubuntu LiveCD. However, when I popped it in my OpenSuSE workstation, I couldn’t really make much out of the CD – there was certainly no easy-to-find “this is the code” directory. Well, after some exploring, I mounted the SquashFS filesystem and poked around a bit. Strange… seems to only have one real user (root) and, though they claim this is a fully-functional LAMP server, no Apache or MySQL. Really weird. Well, after poking for a few minutes, I found the holy grail – /root/.bash_history was intact! Just a quick look through it with less and I found what I was looking for: /opt/lampp. It appears that the install is actually ApacheFriends’ LAMPP, or XAMPP for Linux (gotta wonder if the guy writing this book doesn’t even know how to install Apache… I’m sure XAMPP for Linux is more bloated than a customized build of Apache/MySQL/PHP from source, especially since it’s only being used to host 8 sample projects, so a lot could be left out).

Anyway, it appears that LAMPP is running in a chroot’ed environment. The actual sample code is rooted at /opt/lampp/htdocs/sites. It seems that all of the PHP files are also owned by root and chmod’ed 777! And the top-level index.php file makes use of absolute links, so obviously he never thought that someone may want to copy the sample code and use it on a real box.
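If you do copy the sample code onto a real box, it’s worth finding the world-writable files before a web server ever sees them. Below is a minimal sketch using only the Python standard library; the function name is mine, and it assumes a POSIX filesystem where permission bits mean what you expect. It walks a directory tree and lists every regular file that anyone on the system can write to.

```python
import os
import stat


def world_writable_files(root):
    """Return a sorted list of regular files under root writable by 'other'."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            mode = os.lstat(path).st_mode  # lstat: don't follow symlinks
            if stat.S_ISREG(mode) and mode & stat.S_IWOTH:
                hits.append(path)
    return sorted(hits)


if __name__ == "__main__":
    # Path taken from the post; point it at wherever the code was copied.
    for path in world_writable_files("/opt/lampp/htdocs/sites"):
        print(path)
```

Anything it prints is a candidate for something like 644, owned by a regular user rather than root.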

I just can’t imagine someone who’s a beginner with Linux, let alone a Windows person, trying to get this source code onto a machine where they can actually play with it. And… to make the situation worse… the LiveCD has vi and vim, but no Emacs!!!! Eeeek!!

For anyone who needs it, I have the archive available on my site. For non-*nix people, you’ll need Gzip or an equivalent program to extract it.

Reviews

F/OSS Monitoring Comparison – Hyperic Part I

February 8th, 2008

So, I’ve made some headway on the comparison. I have Hyperic installed and partly configured, albeit without email alerts yet. I’ve found some serious features that I need missing, but I’m going to give it a full run before I move on to another.

The full text, updated a few times a day, is available on my wiki. Here’s a bit of an excerpt:

Part I – Installation

  1. Set up a Xen virtual machine running OpenSuSE 10.3 base packages. (3 hours, some server problems, some Xen problems, and some time learning Xen administration from the CLI)
  2. Download hyperic-hq-installer-3.2.0-607-x86-linux.tgz from Hyperic and extract.
  3. Browse to http://support.hyperic.com/confluence/display/DOC/Full+Installation+Guide
  4. cd into hyperic-hq-installer and run ./setup.sh -full
    1. The installation can’t be run as root (though I assumed it would need root privileges).
    2. I selected to install all 3 components – Server, Shell, and Agent.
    3. Well, whoops! Sorta stupid to not allow installation as root, when the default location to install to is /home/hyperic. How do they expect an arbitrary user to install there? Even worse, it appears that the default OpenSuSE 10.3 installation doesn’t come with sudo (!!!!) so I can’t try that.
    4. As root, create /home/hyperic and chown to my user.
    5. Repeat the above steps (well, hopefully not all of them).
    6. Default ports for everything – web GUI on 7080, HTTPS web GUI on 7443, jnp service on 2099, mbean server on 9093.
    7. Change domain names in default URLs to logical ones for my test environment (no real DNS, just IPcop hosts, so devel-hyperic1.localdomain). I hope that I can change these later, or even better that absolute paths aren’t used too much, as this will screw with my idea of using SSH port forwarding for remote access.
    8. Leave the default SMTP server alone and change it later – I don’t even have mail running here at the apartment.
    9. Use the built-in PostgreSQL database with default port of 9432.
    10. Go with the defaults for everything after this.
    11. Everything runs nicely, and then it tells you to login to another terminal as root and run a script. I’m not sure I like this method, but I guess it works. Login and do it.
    12. How will it start the builtin database? As my user???? Yup. postgres is running as my user. Wonderful. Nothing in the install document mentioned user creation. Was this just assumed? Because in the naive world I live in, most installer scripts (think Nagios) create a user for you, or tell you to.
    13. Setup script complete. A few instructions follow…
  5. Run /home/hyperic/server-3.2.0/bin/hq-server.sh start… as my user. Note to self: set up a user for Postgres and Hyperic. Believe it or not, it booted – but followed with the message, “Login to HQ at: http://127.0.0.1:7080/”
  6. Browsed to http://devel-hyperic1:7080 and was greeted by a startup page, saying that the server was 18% finished booting. My, I yearn for little C binaries and a PHP frontend.
  7. Page turns blank and stops there. I refresh, and get a login page. I enter my username and password, and get a little message box where the “invalid password” box usually is – says “Server is still booting”. This is over a minute later. I’m happy to see Apache/Coyote1.1, but would like to be able to get into Hyperic in less time than it takes the machine to boot to a graphical login screen (ok, granted, I’m running XFCE). In SuSE’s YaST Xen Monitor, I see that the VM is at 45% of its 464MB RAM, and 90% CPU – with 8.5% consumed by dom0.
  8. CPU usage for the VM drops to 1% and I login again. BAM! Hyperic HQ. Aside from the fact that it shows NO resources… oh… start the Agent.
  9. Start the Agent on the VM running Hyperic. It asks me for the server IP address. What, no DNS? I enter the IP as it is… for now. I keep everything at defaults, including using the hqadmin username and password. Successfully started.
  10. BAM! In Dashboard, I see the auto-discovered host with the right hostname, as well as Tomcat, Agent, JBoss, and PostgreSQL. Amazing! Click “Add to Inventory”.
  11. Check out the “Resources” -> “Browse” screen. It knows this machine is OpenSuSE 10.3, and I see my four services (listed above). Of course, no metrics yet, but I see the correct IP, gateway, DNS, vendor (SuSE), kernel version, RAM, architecture, and CPU speed.
  12. Looking through the “Inventory” screen, I see everything – NICs and MACs, running servers and one service (a CPU resource). What more could a man want in…let’s see.. just over an hour!
  13. I really *love* the “Views” screen which, even out-of-the-box, allows “Live Exec” information from cpuinfo, df, ifconfig, netstat, top, who, and more.
  14. Well, it’s 03:35, and I have work and class tomorrow. I think it’s time to give Part I a rest. But first…
  15. Go to the “Platform” page for my one machine and… YES! Graphs are starting to appear!
  16. Following the suggestion here, I enable log and config tracking on the platform for /var/log/warn and /etc/hosts, respectively.
  17. Before I call it a night (now 03:42), I stop back at the downloads page and grab the Linux x86 Agent for the dom0 machine, hoping to get some physical information as well. While I’m at it, I grab the Linux AMD64 Agent to try on my laptop. I create “hyperic” users on each system. On the base Xen server, I give it a shot and get “Unable to register agent: Error communicating with agent: Unauthorized”. Same thing on the laptop.
  18. Did a little reading here. As to keeping all of the defaults, it turns out that both clients had firewalls blocking TCP port 2144. I opened it up on both, and also set the IP address (that the server uses to contact the client) to the correct ones. Voilà! Now I have 3 clients connected, and gathering data for the next ~16 hours until I have time to check it out again.
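
A quick way to rule out this kind of firewall problem before registering an Agent is to test whether the port accepts TCP connections at all. The sketch below is only an illustration in Python, not anything Hyperic ships: `tcp_port_open` is a made-up helper name, and 2144 is simply the port number from the last step.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; a refused connection, a timeout
    (e.g. a firewall silently dropping packets), or a name that does not
    resolve all count as "not open".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run from the Hyperic server, something like `tcp_port_open("xenmaster1", 2144)` should come back False while a firewall is still in the way, and True once the port is opened and the Agent is listening.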

More to come in Part II tomorrow – actually doing something with Hyperic. For
now (04:08), time to sleep.


Part II – Configuration

Unfortunately, I haven’t had much time to play with Hyperic in the two days
since installation. The most I’ve really done is set up Agents on my laptop,
desktop, and the host machine (both dom0 and domU for Hyperic), so that they
start to collect data.

While I found a lot of upsetting stuff in the features list (see below), I
decided to go ahead and add some other devices. On the network at the
apartment, I have two manageable switches (a Linksys and a 3Com) – which pretty
much make up the sum of non-host equipment. I also have an IPcop box, though I
assume the standard Linux Agent will handle that. The one item missing that I
have at home is my set of APC SmartUPS UPSs with SNMP cards, but I guess I’ll
just have to skip them for this review.

First, I went in and added a platform (Resources->Browse, Tools Menu->Add
Platform) for the 3Com switch (a SuperStack II Switch 3300). It showed
successful creation – but nothing else. I went in and entered the SNMP
community string, IP, and version (1). In about a minute or so, I started to
see metrics – Availability, IP Forwards, IP In Receives, and IP In Receives per
Second. While it’s quite basic, that’s good for a starting point. While the
[http://support.hyperic.com/confluence/display/DOCSHQ30/Network+Device+platform
Network Device Platform] documentation lists lots of metrics that can be
enabled, I’d also like telnet availability and – my big one, since I use a
“cute” (crappy) IPcop installation for local DNS – a dig on DNS to make sure
the entry is there. In the Monitor screen, I was able to enable a bunch of
additional metrics (by clicking on the “Show All Metrics” link), though
there’s also no way (that I can find) to monitor the status of individual
ports.
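
The DNS check I have in mind could be sketched roughly like this in Python. It is an illustration only and not a Hyperic plugin: the function names are made up, and it asks the system resolver rather than running dig against a specific server, so it only tests whatever DNS the monitoring machine itself is configured to use.

```python
import socket


def resolve_ipv4(name: str) -> list[str]:
    """Return the IPv4 addresses the system resolver gives for name.

    Returns an empty list if the lookup fails for any reason.
    """
    try:
        _, _, addresses = socket.gethostbyname_ex(name)
        return addresses
    except OSError:
        return []


def dns_entry_ok(name: str, expected_ip: str, resolve=resolve_ipv4) -> bool:
    """Return True if name currently resolves to expected_ip.

    The resolve argument exists only so the lookup can be swapped out,
    for example when trying the function without a working resolver.
    """
    return expected_ip in resolve(name)
```

A monitoring script could call `dns_entry_ok` with a hostname and the address it is supposed to have, and raise an alert when the result is False.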

Next, I browsed through the “Administration” pages, set up a few users, and
started setting *way* more default metrics for various platforms, services,
and servers. While I don’t have mail running yet, that will come this
weekend. While I added a lot of things as “Default On”, I still need to go
back and add more things in the templates as Indicators.

I also added some escalations, though they’re quite simple – you can notify HQ
users or “other users” by email or SMS, write to SysLog, or suppress alerts
for 0 minutes to 24 hours. Hopefully I’ll also find a plugin for Asterisk
integration. One striking omission is user groups. Also, the concept of
“Roles” (maybe their idea of groups?) is only available in the Enterprise
version.

At this point, I also notice one other major issue, though perhaps I’ll find a
solution in my experimentation – there doesn’t seem to be a way to set up default
alerts for metrics. If they have all of this platform, server, and service
information defined as default templates, why not just have a way to assign
default users (and groups) to these objects, and have default alerts
generated?

In terms of Apache 2.2 monitoring, out-of-the-box, nothing worked. No metrics
at all. Firstly, Hyperic requires the mod_status module. Personally, I’d
rather handle all of that through a backend, like Nagios. Secondly, it got the
pidfile and apache2ctl paths wrong. Furthermore, it has no “smart” checking for resources – while my Apache 2.2 resource config was clearly wrong (wrong PID file path, no mod_status), Hyperic didn’t detect this and was showing the resource as “Down”.
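
For context on what mod_status provides: when the status page is requested with `?auto` appended, Apache returns a plain-text report of “Key: value” lines. The Python sketch below shows one way such a report could be read. It is not how Hyperic’s plugin works; the function names are made up, the URL depends entirely on how the status handler is configured, and the sample text in the note that follows is illustrative rather than captured output.

```python
from urllib.request import urlopen


def parse_status(text: str) -> dict[str, str]:
    """Parse 'Key: value' lines from mod_status's machine-readable output."""
    metrics = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:  # skip any line that is not in 'Key: value' form
            metrics[key.strip()] = value.strip()
    return metrics


def fetch_status(url: str, timeout: float = 5.0) -> dict[str, str]:
    """Fetch a status URL (assumed to end in '?auto') and parse it."""
    with urlopen(url, timeout=timeout) as response:
        body = response.read().decode("ascii", errors="replace")
    return parse_status(body)
```

Given a report containing the lines `BusyWorkers: 1` and `IdleWorkers: 9`, `parse_status` returns a dictionary with those keys mapped to the strings "1" and "9". If the module isn’t loaded, the fetch fails outright, which is exactly the sort of misconfiguration a smarter check could report.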

After that, I set up a bunch of alerts for things that I thought would be off-kilter a lot (like WARN log entries on my laptop, high memory usage on some stressed machines, etc.) as well as log and config file monitoring and alerts for them. While I didn’t have mail working yet, I figured I might as well get that stuff running.

On the Xen dom0 host that runs the Hyperic VM (box called xenmaster1), I wasn’t able to add config file tracking for any of the /etc/xen/ files. At this point I notice some serious shortcomings – not only is it not possible to define a template of alerts for a given platform/server/service, it’s also impossible to define a template for alerts. I also noticed that it’s not possible to define groups of contacts. This wasn’t much of a problem for my test installation – the alerts are only going to my roommate and me – but it would surely be an issue in any larger setting.

At this point in configuration, I come to a make-or-break point. With some of these shortcomings, I really need a way to call a script with alert information when an alert is generated – whether it’s to dial out through Asterisk or just automatically create a ticket for the problem.

Adding alerts is a cumbersome process. You have to browse to a page for a specific metric – which means going to the page for a specific platform, server, or service – and then opening the page for that metric. The actual alert creation takes up two pages – one for the metric, threshold, and time-based criteria, and a second for who to alert. This means that to add alerts for a machine, you need to view the platform page as well as the services and servers pages, and each metric therein.

I’ll be posting some more in the days to come. From a post at the Hyperic Forums, I was able to find out that a Xen plugin is in the works, but for the Open Source version, the only way to trigger a script is to send an email and have it handled by a filter such as Procmail.
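
The email workaround could look something like the following Python sketch: the mail filter hands the alert message to a script on standard input, and the script pulls out the subject and body and passes them on. Everything here is an assumption for illustration. The function names are made up, I haven’t looked at what Hyperic’s alert emails actually contain, and `handle_alert` just prints where a real version would open a ticket or place a call through Asterisk.

```python
import sys
from email import message_from_string


def summarize_alert(raw_message: str) -> dict[str, str]:
    """Pull the subject and the plain-text body out of a raw email."""
    msg = message_from_string(raw_message)
    if msg.is_multipart():
        # Use the first text/plain part, if the message has one.
        parts = [p for p in msg.walk() if p.get_content_type() == "text/plain"]
        payload = parts[0].get_payload(decode=True) if parts else b""
    else:
        payload = msg.get_payload(decode=True) or b""
    return {
        "subject": msg.get("Subject", ""),
        "body": payload.decode("utf-8", errors="replace").strip(),
    }


def handle_alert(alert: dict[str, str]) -> None:
    """Placeholder action: a real handler might create a ticket instead."""
    print(f"ALERT: {alert['subject']}")


def main() -> None:
    """Entry point when a mail filter pipes a message to this script."""
    handle_alert(summarize_alert(sys.stdin.read()))
```

To use it as a filter, the script would end with a call to `main()` and the mail filter would be set to pipe matching messages to it.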

Projects, Reviews

Network Monitoring Trials – Part I

February 5th, 2008

After dinner tonight, I’m going to start setting up my Xen VMs and installing some monitoring software. I’ve decided that Hyperic and OpenNMS will be the first round – mainly because their free versions seem to be the most heavy-weight, and will probably take more installation time. GroundWork Open Source will come sometime later this week.

While the research that I did today has led me to start formulating some opinions on each of the contestants, I’m going to withhold comment until I get all three up and running, and have done some real work with them.

Projects, Reviews

Asus eeePC Update

February 4th, 2008

More to come sometime this week, when I have enough sanity and time to write.

In the mean time…

So it’s been two weeks since classes started again, and that means two weeks using my now-beloved eeePC 4G Surf (details in a previous post). Granted, I have my “desktop replacement” laptop (a Linux Certified LC2464) to use at my desk at home or at my apartment – though the “desktop replacement” really means that it’s easy to move from one desk to another.

So far, I really love it, but I have a few issues:

  1. I should have bought an 8GB SDHC card instead of the 4GB that I got – especially with a full install of OpenSuSE with Sun’s JDK and OpenOffice, the 2GB root partition is 99% full!
  2. After using a desktop for hours, it takes a few lines of text for my fingers to re-adjust to the small keyboard. Hopefully it’ll get easier with time.
  3. Unfortunately, I don’t have space for kernel headers or source, so I can’t compile the customized version of asus_acpi. I can’t find any binary packages, or binary kernel modules for my kernel version. That prevents me from using sleep/hibernate/suspend, and also means I don’t get accurate battery calculations. I’ve found from usage that the battery lasts 2-3 hours with wireless on and minimal screen brightness. Also, unfortunately, (maybe because of the ACPI issue?) if I dim the screen and then the screensaver comes on, when I log back in it resets to full brightness.
  4. As of this week, there’s still no MadWifi driver for the Atheros card in the 4G. I have to run it under Ndiswrapper. As a result, I can’t get monitor mode, so the eee is effectively useless for wireless site surveys and security work. There’s talk of a forthcoming MadWifi, but if nothing shows up, I may have to go with a USB adapter (I don’t want to void the warranty by swapping out the internal Mini-PCI adapter).
  5. Not a problem with the eeePC, but it seems like quite a few web sites that I’ve visited are horribly coded – with static screen sizes assumed. On the small screen on the eee, the biggest issue is when the first few characters of every line on some sites are cut off, thereby rendering the content illegible. This is an issue out of Asus’ control, but can be a hindrance to full use. I have, however, found that for many sites, switching FF to “full screen” mode (F11) helps.

Stay tuned for more, and some new scripts to automate wireless surveys, rogue AP detection, etc. And maybe even some work with autonomics and/or configuration tools.

Reviews

Network Monitoring

January 29th, 2008

I was reading Ben Rockwood’s blog before (as I do every day, thanks to the magic of Google Reader), where he had an article praising up.time network monitoring software. Now, up.time (have I ever mentioned that I *hate* names with punctuation in them?) is proprietary software. And, given their support options, level of integration, and fancy web site, I assume it’s probably not cheap. They bill it as turnkey monitoring – they claim to be “up and monitoring” within 15 minutes “including the download”. They also have an impressive list of clients, including JP Morgan, Merrill Lynch, Cingular Wireless, Verizon Wireless, T-Mobile, Wyeth Medica, Hewlett-Packard (how ironic – did the OpenView team hear about this?), and a whole slew of other clients including major hospitals. Aside from the irony of HP using their product, I wonder to what extent these clients use up.time. Surely the likes of Merrill Lynch can afford more than a turnkey solution. And I’d bet that Verizon Wireless doesn’t use anything 100% off-the-shelf to monitor their communications systems.

Anyway, this got me thinking about network monitoring systems. Well, open-source ones, since I like the idea of having control over infrastructure. The forerunners seem to be Nagios (my personal choice), GroundWork Monitor (available in both open-source and proprietary versions), Zenoss (“Core” free version and a paid-for Enterprise version), Zabbix, OpenNMS, and Munin, Cacti, or one of the other MRTG-/RRD-based applications for graphing/trending.

As stated, I’ve always been a Nagios man. I’ve been running it for 3+ years, and it’s always worked well for me. Once you spend days learning to cope with the config files, it’s a breeze. Until they go and change them in Nagios 3. The one thing that I always missed was built-in graphing and trending. And some sort of *good* log analysis. So, Nagios 3 is coming out (the Nagios site claims you can get Nagios 3 up and running in 15 minutes, as well), and I guess I should upgrade. However, after looking around a bit, I came to a frightening realization – Zabbix, Zenoss, and OpenNMS look a heck of a lot better than Nagios. Their interfaces appear much nicer (personally I think OpenNMS wins) and they seem to have a lot more features, too – like Zenoss’s inventory and configuration management. So, this got me thinking that there might be a change in the future – even though I’ve put hundreds of hours of painstaking customization into Nagios.

We’ll see where it goes. My main concern is that whatever I pick can handle integrating with my soon-to-be-implemented barcoded hardware inventory and tracking system. Integration with a good log parser, configuration file management system, and reporting system would be good too. We’ll see if the other offerings can stand up to testing (the concept of device detection is especially intriguing) or whether I’ll just end up building myself a Nagios front-end that pulls various bits of data (text, graphs, pictures, HTML, etc.) from various other sources such as Munin, Cacti, an inventory system, log parsing, etc.

Reviews