Archive

Posts Tagged ‘solaris’

My biggest problem with Linux

October 27th, 2008

For one of my wonderful classes, Internet Security, I’m doing a presentation on “patch management”. While I’m obligated to cover Windows – and, of course, will talk about MacOS – I’ll obviously be spending a good deal of time on the Unix/Linux side of things. This has gotten me thinking about one of my biggest problems with Linux (and specifically OpenSuSE, my usual default distro): patch management is utterly awful.

Here’s the problem: I have about a dozen machines under my control. I need to keep them all up-to-date. Currently, I manually do patches and upgrades via YaST or zypper. I thought about scripting this through zypper, but that doesn’t make any sense – the packages on the machines are far from homogeneous, so there’s no clear way to make one script that updates them all. I considered using Puppet or CFengine or something of that sort, but that’s too heavy-weight for me – for only a dozen machines, many of which are personal or development only, that’s a lot to keep track of by hand, and a lot of work defining which patches should be applied and which machines shouldn’t be changed.

My other peeve is distribution upgrades. About three of my machines are still running OpenSuSE 10.0 or 10.1, both of which are unsupported, and no longer even have downloads available. Why? Because I’ve done major OpenSuSE upgrades before, broken a LOT of stuff, and I simply can’t risk that on machines that can’t stand extended downtime. This process *needs* to be made easier. Bottom line – it should be made no more difficult or unreliable than a kernel upgrade. IMHO, the biggest selling point for Solaris is its ability to do a total upgrade to a second partition, and switch over at runtime. Why doesn’t Linux (or SuSE) have this yet?

What’s my ideal solution? A curses application that uses text-file backends (curses so I can run it over SSH even if I have a slow link or high latency, like from an SSH session on my cell phone, if need be). The app would allow me to list all of the machines I want managed. It would connect to the machines over standard SSH, running as a dedicated user, and would leave an extensive audit trail of what’s done, both on the management console and on the machines. The application would maintain an inventory of all of the packages on every machine. It would check daily for new patches/updates to any of those packages, and e-mail me a daily summary of what’s new, including all dependency changes, and which machines need the update. It would also allow me to define, on a per-machine (or per-group-of-machines) basis, rules for packages that must stay at their current version – e.g. I have a bunch of PHP4 apps, so machine X needs to stay at PHP4. The e-mail summary would include any packages that aren’t going to be updated for a specific machine because of dependency/version rules, as well as warnings about any new packages that have a dependency that has a rule set. I could then run the main curses app on my admin machine and, starting from NO selections, select which updates I want to apply and whether I want to ignore or create new rules to keep something at its current version, on a per-machine or per-group basis. This curses app would generate a file (XML?) of what to do (which could also be generated or edited by hand, easily). The XML file would then be fed into a script that downloads all of the needed packages to a central (local) mirror (or, optionally, for remote machines, has them download locally on the machine), checksums them, and then installs them (running commands over SSH) on all applicable machines. It would then keep a log of all changes, both on each machine changed (in a master changelog file) and on the central administrative machine.
Most importantly, the curses interface would have a simple, quick way to back out any specific update or group of updates for all machines, a group of machines, or one machine. All data needed to back out a change would be kept on each machine (say, cleaned up at the next update of that package and all of its dependencies) with machine-readable instructions kept in a central file, allowing local rollbacks – e.g. a machine goes down, I realize that it was because of an update to package X, and on the local machine I can check the changelog, see an entry like “Package X updated 1.0.0 to 1.0.1 on yyyy-mm-dd, Change ID 1234” and then, to roll back, simply issue a command like “patchmgt rollback 1234” on the affected machine.
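To make the moving parts a bit more concrete, here is a minimal in-memory sketch of the three core ideas: a per-machine package inventory, “keep at current version” rules, and a changelog with change IDs that can be rolled back. Everything in it is hypothetical: the names PatchManager and Change, the machine names, and the version numbers are invented for illustration, and nothing here talks to zypper or SSH. It only models the bookkeeping.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Change:
    """One applied update, identified by a change ID (hypothetical model)."""
    change_id: int
    machine: str
    package: str
    old_version: str
    new_version: str
    applied_on: date

    def changelog_line(self) -> str:
        # Mirrors the changelog entry format described in the post.
        return (f"Package {self.package} updated {self.old_version} to "
                f"{self.new_version} on {self.applied_on.isoformat()}, "
                f"Change ID {self.change_id}")


@dataclass
class PatchManager:
    # machine name -> {package name: installed version}
    inventory: dict
    # machine name -> set of packages that must stay at their current version
    pins: dict = field(default_factory=dict)
    changelog: list = field(default_factory=list)
    next_id: int = 1

    def plan(self, available: dict):
        """Split available updates into (to_apply, held_back) entries."""
        to_apply, held = [], []
        for machine, packages in sorted(self.inventory.items()):
            for pkg, new in sorted(available.items()):
                old = packages.get(pkg)
                if old is None or old == new:
                    continue  # not installed on this machine, or already current
                entry = (machine, pkg, old, new)
                if pkg in self.pins.get(machine, set()):
                    held.append(entry)      # reported in the summary, not applied
                else:
                    to_apply.append(entry)
        return to_apply, held

    def apply(self, machine: str, pkg: str, new: str, when: date) -> Change:
        """Record an update in the inventory and the changelog."""
        old = self.inventory[machine][pkg]
        change = Change(self.next_id, machine, pkg, old, new, when)
        self.next_id += 1
        self.inventory[machine][pkg] = new
        self.changelog.append(change)
        return change

    def rollback(self, change_id: int) -> Change:
        """Restore the version recorded before the given change."""
        for change in self.changelog:
            if change.change_id == change_id:
                self.inventory[change.machine][change.package] = change.old_version
                return change
        raise KeyError(f"no such change: {change_id}")


# Made-up machines and versions, purely for illustration.
pm = PatchManager(
    inventory={"web1": {"php": "4.4.8", "openssl": "0.9.8g"},
               "dev1": {"php": "4.4.8"}},
    pins={"web1": {"php"}},  # web1 must stay on PHP4
)
to_apply, held = pm.plan({"php": "5.2.6", "openssl": "0.9.8h"})
```

In this toy run, the PHP update is held back on web1 because of the rule, while dev1 and the openssl update on web1 land in the to-apply list. A real tool would still need the hard parts the sketch skips: dependency resolution, keeping the old packages around for rollback, and the actual remote execution.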

Just some ideas, and a little rant.

Ideas and Rants

Rutgers Student Linux Group and Sun

March 26th, 2008

The next week will be quite busy. For one, I’m going to attempt an OpenSolaris installation on my Asus eeePC. Last I heard, there were some compatibility issues, and this will be my first attempt at running Solaris on a laptop. Then again, it took quite an effort to get my favorite mainstream Linux distro installed on the eeePC, even though I’ve been running that on laptops for 6+ years.

This Sunday, March 30th, the Rutgers University Student Linux Users’ Group (RUSLUG) will be hosting our annual Installfest in the EIT Lab from 10 AM until 6 PM. While the event is generally marketed towards Linux newbies, it’s usually attended by a diverse range of students (and staff) from first-timers to Solaris sysadmins. I’ll be in attendance, as always, and will also be conducting some demos of new Sun technology (mainly OpenSolaris and NetBeans). For anyone in the New Brunswick/Piscataway area, I’ll also be armed with some door prizes and a whole plethora of CDs and DVDs. For anyone interested, I’ll be providing information and pointers on NetBeans and Solaris, as well as installation assistance (and maybe some prizes) for anyone looking to give OpenSolaris a whirl on their system.

Following up the Installfest, on Tuesday, April 1st at 9 PM (also in the EIT lab) will be the RUSLUG Newbie Night. It’s generally a fun-filled evening with Ubuntu LiveCDs and a general Q&A session about Linux. Generally this includes one-on-one assistance for new Linux users. In an effort to raise the level of content (and provide a diversion for more experienced users if there aren’t many new faces), I’ll be once again demo’ing some Sun technology, and specifically providing an overview of my recently completed personal mailserver migration from SuSE Linux to OpenSolaris. Once again, visitors can expect some door prizes and lots of fun CDs to take home.

In other RUSLUG news:

  • It looks like I’ll be running to become an officer next year. Anyone else at Rutgers can feel free to contact me with ideas, etc.
  • RUSLUG’s current box, ruslug.rutgers.edu, is a Dell desktop thrown on a shelf in a closet. I’d like to find someone willing to help out with procuring a new box. It doesn’t need to be anything fancy – just pretty simple, though I’d like to look into high capacity storage for mirroring distros. FYI, the current box is a Dell desktop with a 1.7GHz P-4 (256KB cache), 512 MB RAM, and about 250GB of IDE storage (150GB + 100GB, no RAID). We don’t need a big upgrade in processor power, but more RAM and RAID for the system and user disks would be nice (distro mirrors can be a big IDE/SATA or an external disk).

Higher Education

Sun Blade 150 working!

December 14th, 2007

Yes, it’s 3 AM here, and I’ve been working since about 6 PM on this. But I finally got one of my two surplus (and fully locked down in NVRAM) Sun Blade 150 workstations up and running. I encountered a few problems along the way, but managed to solve them – more or less.

Problem 1 – NVRAM password set, impossible to install an OS.
I did a *lot* of googling, and asking for advice. Eventually, I came by a forum post expressing success with a procedure of pulling out and then re-inserting the NVRAM *while* the system is powered on. This left my system un-bootable. I pulled the chip again, and found two pins bent. I straightened them, re-inserted the chip, and rebooted using the Stop+N Equivalent Functionality (after powering on the system, once you hear the POST beep, click the power button twice quickly). This temporarily resets the NVRAM to default settings. I found that the password was gone, and was able to issue the “set-defaults” command at the “ok>” prompt. I then popped in the Solaris 10 install CD, issued the “reset-all” command to reboot, used Stop+A at boot to get back to the prompt, and told it to boot from CDROM (“boot cdrom”). Installation then started.

Problem 2 – Invalid NVRAM
After the above procedure, when booting, I got a message following the Sun banner stating that there was a problem with the IDprom checksum. When the install CD booted, I also got messages stating “Invalid format code in IDprom”, “Warning: IDprom checksum error”, and “os-io Invalid format code in IDprom”. After another five hours of work, I found that it’s essentially something I have to live with. While OpenBoot versions before 4 allowed use of the “mkp” and “mkpl” commands to directly write the IDprom, version 4 and above allow no access to this.* The IDprom was reporting an Ethernet (MAC) address of all zeros. Unfortunately, there seems to be no way to correct this as far as I’ve found. However, it didn’t affect my OS installation… much.

*There may be a way to access the IDprom through OpenBoot 4.x, but I couldn’t find any reference to it online, and couldn’t figure out the FORTH commands from the reference docs.
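The checksum complaint is easier to picture with the IDprom layout in hand. The sketch below assumes the 16-byte layout that the Sun NVRAM FAQ describes for older machines: a format byte (0x01), a machine-type byte, the 6-byte Ethernet address, a 4-byte manufacture date, a 3-byte serial number, and a final checksum byte that is the XOR of the first 15 bytes. Whether the Blade 150 with OpenBoot 4.x uses exactly this layout is an assumption, the function names are made up for illustration, and the sample values are invented.

```python
from functools import reduce


def idprom_checksum(first15: bytes) -> int:
    """XOR of the first 15 IDprom bytes (assumed checksum rule)."""
    if len(first15) != 15:
        raise ValueError("expected exactly 15 bytes")
    return reduce(lambda acc, b: acc ^ b, first15, 0)


def build_idprom(machine_type: int, mac: str,
                 date_bytes: bytes, serial: bytes) -> bytes:
    """Assemble a 16-byte IDprom image under the assumed layout."""
    mac_bytes = bytes(int(part, 16) for part in mac.split(":"))
    if len(mac_bytes) != 6 or len(date_bytes) != 4 or len(serial) != 3:
        raise ValueError("need a 6-byte MAC, 4-byte date and 3-byte serial")
    body = bytes([0x01, machine_type]) + mac_bytes + date_bytes + serial
    return body + bytes([idprom_checksum(body)])


def looks_valid(idprom: bytes) -> bool:
    """Check the format byte and the checksum of a 16-byte image."""
    return (len(idprom) == 16
            and idprom[0] == 0x01
            and idprom_checksum(idprom[:15]) == idprom[15])


# Invented sample values, not taken from a real machine.
image = build_idprom(0x80, "08:00:20:c0:ff:ee", bytes(4), bytes([0xC0, 0xFF, 0xEE]))
print(image.hex(" "))
```

Under this layout, an image with a zeroed format byte fails the check even if its checksum byte happens to match, which fits the separate “Invalid format code” and “checksum error” messages, though that reading is a guess rather than something confirmed for this hardware.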

Some helpful links for the above problems include the OpenBoot 4.x Command Reference Manual, currently found here, as well as the Sun Blade 150 Service Manual (from docs.sun.com), document 816-4379-10, currently indexed with the Sun Blade 150 docs here. It was also interesting, in my search for help, to look at the OpenBoot 3.x Command Manual, and see how easy it was to re-write the IDprom on older Sun Blade workstations.

Please note that the advice given in the Unofficial SunBlade 100 FAQ and the squirrel.com Sun NVRAM FAQ doesn’t seem to work on the 150 with OpenBoot 4.x. From what I can tell, all of that advice applies only to OpenBoot 3.x!

Some other helpful links included an ITworld.com article on Sun NVRAM passwords, this Sun Developer Forum post, and this post on password recovery.

Problem 3 – MAC / Ethernet address is 00:00:00:00:00:00
When booting Solaris, I found that I couldn’t get DHCP. When I finally got the OS running and logged in as root, I realized something interesting – I couldn’t access or ping anything past the one switch I was connected to. But everything on that switch was fine, pinging both from and to the Solaris box. I pinged from my laptop, and then thought to run “arp -a”. It showed a MAC address of 00:00:00:00:00:00! Running “ifconfig -a” on the Solaris box confirmed this. Luckily, the first time I booted this box, I wrote down the Ethernet address and hostID as shown on the banner. I ran a quick ifconfig to set up the correct MAC, like “ifconfig eri0 ether xx:xx:xx:xx:xx:xx”. Networking now worked perfectly, and I could get to everything on the LAN as well as browse the web. It would be good to somehow reset this in NVRAM, but for now I’m just going to add it to the startup scripts somewhere. One forum post that I found suggested adding the previous ifconfig command to the top of /etc/rc.c/init.d/network, which I’ve done, and I’ll see how it works at the next boot.
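Since the fix comes down to re-typing a hand-copied address at every boot, a small sanity check before building the command seems worthwhile. The following is only a sketch: the helper names are hypothetical, and it just validates the address and assembles the same “ifconfig eri0 ether …” command quoted above without running it. It accepts addresses with or without leading zeros, since Solaris tends to print them in the short form (e.g. 0:3:ba:…).

```python
import re

# Six colon-separated groups of one or two hex digits.
MAC_RE = re.compile(r"[0-9a-fA-F]{1,2}(?::[0-9a-fA-F]{1,2}){5}")
UNSET_MAC = "00:00:00:00:00:00"


def normalize_mac(mac: str) -> str:
    """Return the address as lowercase, zero-padded xx:xx:xx:xx:xx:xx."""
    mac = mac.strip()
    if not MAC_RE.fullmatch(mac):
        raise ValueError(f"not a MAC address: {mac!r}")
    return ":".join(f"{int(part, 16):02x}" for part in mac.split(":"))


def is_unset(mac: str) -> bool:
    """True for the all-zeros address reported by the broken IDprom."""
    return normalize_mac(mac) == UNSET_MAC


def ifconfig_ether_command(interface: str, mac: str) -> list:
    """Build (but do not run) the ifconfig command as an argument list."""
    mac = normalize_mac(mac)
    if mac == UNSET_MAC:
        raise ValueError("refusing to set an all-zeros MAC address")
    return ["ifconfig", interface, "ether", mac]


# Made-up address for illustration; use the one written down from the banner.
print(" ".join(ifconfig_ether_command("eri0", "0:3:ba:1:2:3")))
```

The argument list could be handed to subprocess.run from a root-owned startup script, but how that hooks into the boot sequence is left out here.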

Problem 4 – Can’t login to SMC (Solaris Management Console)
My next task after getting the system up and running, and getting networking working, was to give myself a user account. I was logged in using the Java Desktop System, so I opened a terminal and ran “smc &”. After the usual initialization wait, I loaded the toolboxes for the local machine, and connected. When I clicked on the “users” module and entered my root password, I got an invalid password / login failed message. I tried again and again, even checking against the post-it that I wrote the password on until I memorize it. Nothing. Searching the forums, I came by this post, but the value in /etc/security/policy.conf was correctly set to “CRYPT_DEFAULT=__unix__”. So, on a wild hunch, I used “passwd” to reset my password to a shorter one, which I use on a few other (unimportant) workstations. Magic!

I now have, after two days of work, a working Solaris box. Now that I have a good OS install, in order to get the other box working, I *should* just be able to swap HDDs, boot, log in as root, and use the “eeprom” command to set “security-mode” to “none”, bypassing all of this bull****.

Projects, Tech HowTos

*nix

September 28th, 2007

First off, my Sun blog should be coming sometime this weekend/early next week. If I post anything interesting there, I’ll be sure to cross-post it.
This morning at work, while reading Digg, I came by two interesting links that got me thinking:
5 Reasons Your Parents Should Use Linux and Ten Things Linux Distros Get Right (That MS Doesn’t).
Now, I’ll admit, my *nix experience is pretty much limited to Linux. I’ve used BSD a few times, but only as pre-built images for embedded systems like my Soekris boxen. I’ve used Solaris mainly just as a user/web developer in SSH at work. And while I now have a work computer running Solaris 10 and an SXDE image on my laptop, I’m still relatively new – and, given that I’m now doing hardware support and wireless work, I don’t even know what I need another machine in the office for.
That being said, the second link got me thinking. Specifically, about something I read in The Art of Unix Programming [Wikipedia] by Eric S. Raymond (available online here) with regard to interface design. One quote that I was able to find in the online version comes from Chapter 11, under the subtitle “Tradeoffs between CLI and Visual Interfaces”:
“Resistance to CLI interfaces tends to decrease as users become more expert. In many problem domains, users (especially frequent users) reach a crossover point at which the concision and expressiveness of CLI becomes more valuable than avoiding its mnemonic load. Thus, for example, computing novices prefer the ease of GUI desktops, but experienced users often gradually discover that they prefer typing commands to a shell.”
There is another similar quote in the book, mentioning how resistance to the CLI drops as typing speed increases.
Unfortunately, in some areas I’m still bound to Windows. Though my only personal use for it is to control an ancient Umax Mirage IIse SCSI scanner (with only Windows and Mac drivers), I ultimately need to touch it now and then – whether on my mother’s box (she claims she has to have Windows and MS Office because “that’s what businesses use”) or as admin of the four boxes at the Ambulance Corps where I volunteer.
However, whenever I am (unfortunately) pushed into the task of working on a Windows box, I always feel something lacking. To be blunt, I don’t see how experienced users can deal with it. And this isn’t just an issue of multiple desktops, or reliability (I expect my desktop to have months of uptime, and my servers to have years). This isn’t just pro-Linux, it’s anti-Windows. Linux is great. Solaris seems wonderful, and I can’t wait to move my servers over. And, believe it or not, due to playing around with the Solaris Management Console, for the first time in 5 years, I plan on running X on my servers.
What this is, is a talk about total workflow. Years ago, I reached the point where I am more comfortable at the command line, or in an Ncurses-style GUI, than in X.
I can attribute this to two factors – verbosity and speed. The CLI is as verbose as anything can get. I remember setting a static IP on a Windows box. I had to navigate the Start menu, open up the control panel, the network thingy, click on the network card, and work through a series of dialogs. In Linux, I clicked on the terminal icon, typed “sudo ifconfig eth0 up 192.168.0.211” and then a password. Done. Likewise, refreshing a DHCP lease on Windows requires a whole bunch of “repair connection” nonsense, whereas in Linux all it requires is “dhclient eth0”. The bottom line: I know what I’m doing. Windows should have an option to let me quickly do it.
Speed is a related issue. Click, click-click, drag, click, click…. what about just typing? Even for people who aren’t CLI-friendly, there’s Ncurses. YaST2, the SuSE/openSuSE administration tool, has both GUI and Ncurses interfaces. I always use the Ncurses interface. Why? Because I’ve been using it for years. I know that if I want to add a user through YaST, I hit the down arrow 7 times, tab once, down 5, enter. Tab once more to bring up the add user dialog. I can do this in well under a second. What’s the bottom line? Well, first of all, my hands are already on the keyboard. That’s where they like to stay. That’s where they’re comfortable. My fingers need to move a *lot* less to navigate with the arrow keys, tab, and enter than they do to use a mouse. If you know what you’re doing, if you already know what you’re looking for, then a mouse is slower than the speed of thought (or reaction).
So where’s the Windows bashing? Simple. How do people at Microsoft deal with this? How does the guy who *wrote* that network settings dialog deal with navigating the GUI every time, even though he already knows exactly what he wants to do – and probably the system calls to do it?
The bottom line is that every time I sit down at a Windows machine, I wonder how the most popular OS is one that doesn’t give any thought to advanced users. I know that I can type faster than I can move a mouse, so why don’t you let me use that? More importantly, why didn’t Microsoft ever think that people would use computers on a network? When I installed Solaris, I wanted to edit a config file. I hadn’t customized *anything* yet, hadn’t installed any other software, nothing. Yet, I was able to open up a terminal and grab my .emacs file from my laptop in one line (scp).
To be totally honest, the question running through my mind is something like “everything is so much quicker on Linux. How do experts deal with Windows?”

Miscellaneous Geek Stuff

Sun, Solaris, other fun stuff

September 26th, 2007

A few bits of news, none of them too important:

I’ve just been hired as the Sun Microsystems Campus Ambassador for Rutgers University. Essentially, my job is to tell students, faculty, staff, and researchers about Sun’s open source technologies and what they can do. This involves giving some tech demos and talks, some networking, and completing a lot of training on my part. Hopefully I’ll also get to hand out some neat Sun swag around campus. I have to admit, I’ve never been a real fan of GUI IDEs – I do pretty much all of my development (admittedly, very little in Java; most of my work is simple PHP stuff) in Emacs. That being said, I’m really psyched about trying some new stuff like J2EE and maybe some Ruby. However, I also looked over some of our training materials on the upcoming NetBeans 6.0, and truthfully, I’m damn interested. I haven’t used a Java IDE since three years ago in high school, and it looks as though they’ve come a very long way.

I’m also really getting into Solaris. The new openSolaris / Solaris Express releases have a lot of great features – and I can’t wait to get my hands on a machine running Zones, Containers, and ZFS, just to mention a few technologies. I set up Solaris 10 on one of my work computers, and I’m hooked. It’s fine for a desktop, but I can honestly say that after a few minutes playing around with the Solaris Management Console, I’m seriously considering ditching SuSE (now OpenSuSE), which I’ve been using as a server OS for some 7 years, and switching over to Solaris.

Unfortunately, Solaris is really an enterprise OS. That means it has loads of wonderful features that are also rock-solid, and was designed with the idea of centrally-administered servers in mind – something that SuSE only caught on to recently. However, that also means that it is intended to run on what I would call “new” hardware. In other words, don’t look to put Solaris 10 or Solaris Express (openSolaris / Solaris 11) on your 386. And if you’re like me and running something along the lines of a Generation 1 Compaq Proliant ML370 with a 2nd-generation SmartArray RAID controller (a system that was made about 10 years ago), you might have some issues. It appears that the ‘smartii’ RAID driver (and EISA support) was removed from Solaris 10 in 6/06. The solution? I’m going to be looking into buying a bunch of new systems, possibly Dell PowerEdge servers and, if I can get the cash, a few Sun systems.

Lastly, within the week, I’ll be setting up a blog at blogs.sun.com. This will probably contain a lot of Rutgers-specific information, but will also most definitely include my notes on Sun products, and a healthy amount of Solaris-related information.

Miscellaneous Geek Stuff