Tuesday, December 1, 2009

How Linux (almost) Drove Me To Windows XP

Yep. Sunday afternoon, I was this close to nuking the Linux partition on my laptop. Monday, a lack of blank CD-ROM disks saved me.

I have … three linux machines in the house, not counting commodity items like a TiVo and such. One backup machine, one media machine, and my desktop machine. The backup machine is frozen at Debian unstable sometime four years ago — it works, and upgrades of working boxes scare me. The media machine I keep up to date, again tracking Debian unstable. I recently lost a drive on my last 1.3 kernel machine, a truly ancient and no-longer-used NAT box. My desktop machine has Ubuntu (I think Ibex) on it; but that machine mostly is used as a Windows box.

For the last nearly 10 years, however, my work laptop has been my primary development machine. I’ve been fortunate to be involved in Java development for the bulk of that time; non-GUI Java development no less. Telemetry systems, compilers, logic database servers, etc. For me, the environment is Eclipse and Java. The OS is somewhat less relevant than it is for others. (In the old days of Java 1.0, the environment was Emacs and Make. Make + Java == headaches.) (Some would say Emacs is an OS. Heh.)

So Why Run Linux At All?

Every one of my work laptops (a succession of Dells broken by a Thinkpad) has come with Windows installed; somewhere in 2003-2005, Blackdown’s JDK became available, before Sun was willing to support Linux. Prior to that, the Windows JVM was faster than the Solaris JVM for concurrent operations; for a long while Solaris’s JDK was mired in green_threads and such. But even the early Blackdown JDK delivered better performance on Linux than Sun’s own JDK did on Windows, so I went where the speed was. Plus, for server applications, Java really was ‘write-once-run-everywhere’. So I worked hard to keep that Linux partition running as laptops came and went.

What Went Right?

When I got my current laptop, a Dell D830, over two years ago, I decided to stray from my Debian roots. The last laptop had introduced SATA, and had been a real pain to get Debian installed on, as SATA support was new and wifty. This time around I decided to try a stock Ubuntu install; Gutsy Gibbon I think. It just flew in. Installation was trivial. I waded through the GCJ junk and got Sun’s JDK installed. Again, the same programs on the same machine under Java were faster under Linux than under Windows XP. Life was good.

Sure, suspend/resume was a fond hope, Flash lagged in terms of support, yatta-yatta. But for me, Eclipse + Linux + Java was fast, fast, fast.

What Went Wrong?

I kept up the install, moving from Gutsy to Hardy to Ibex. Things stayed good. I liked KDE3.

Then I updated to Jaunty and I noticed something. For the first time, Java under XP was nearly as fast as Java under Linux. It didn’t seem that Windows had gotten any better, rather it seemed like there was more gunk in the works in Ubuntu. Plus, either within the release or on purpose, I tried KDE 4.0. It wasn’t that I didn’t like KDE 4; but it just didn’t help me work any better, at a cost of familiarity (I was compiling KDE 2 for Solaris a lifetime ago; I liked KDE3). Suspend/Resume were still missing, and Nvidia seemed to hit a pothole with their drivers. Multihead was still a pain in the ass when you moved between multiple setups daily. I found myself sticking to Windows more often. In the last three months, I have used Windows exclusively.

Then Karmic came around. I thought “Hey, there has been lots of rumbling about laptop support; maybe they fixed a bunch of stuff”. So I dist-upgraded. That did not go well; lots of stuff was in odd states, partial packages, strange error messages, general weirdness. I figured, I’d steadily upgraded across multiple releases — moving my partitions across machines — maybe it was time for a clean install? So I burned a CD, tar’d up my home directory, and installed from scratch.

Karmic is neat! Lots of stuff that just works, lots of eye-candy, cool.

Except.

It was slow as molasses.

I mean, ridiculously slow. I have a test suite (all Java) that I run multiple times a day. Around a thousand tests, across the creation and deletion of twenty-plus database servers. Under Windows it takes around 2000 seconds. Left to run overnight, it took over 4.5 hours on the exact same machine under Karmic.

9X Slower? WTF?

Turkey Troubleshooting

My family had a stay-at-home Thanksgiving. My girls watched Mythbusters for hours. I spent most of it installing various Linux distros and trying things. It was painful, and ultimately fruitless, but for an accident.

First, I realized the CPU frequency scaling was keeping my processors at 800MHz, instead of 2.2GHz. Further rooting revealed that cpufreqd, when installed, had a rule to limit the frequency to 800MHz when the temp went above 55C. That was a bit low for my chip, so I removed that rule for the purposes of investigation. That was fun; I could force the system to run at 2.2GHz, but the temp shot up past 95C! I halted the tests before things got melty. (BTW, if you don’t bother to check your CPU speed, you can just be plagued by the feeling that your machine is slow, never realizing it is running slowly. The Gnome CPU Freq applet is a help here.)
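If you would rather check from a terminal than from an applet, here is a minimal Python sketch of the same idea. It is not what I used at the time. It assumes a kernel that exposes cpufreq under /sys/devices/system/cpu and temperatures under /sys/class/thermal; the function names are made up for illustration, and the "below half of max" flag is an arbitrary threshold, not a rule from any tool.

```python
from pathlib import Path


def read_int(path):
    """Return the integer stored in a sysfs-style file, or None if unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def cpu_frequencies_mhz(root="/sys/devices/system/cpu"):
    """Map each CPU name to (current MHz, hardware max MHz).

    Assumes cpufreq files that report values in kHz.
    """
    result = {}
    for cpufreq_dir in sorted(Path(root).glob("cpu[0-9]*/cpufreq")):
        cur = read_int(cpufreq_dir / "scaling_cur_freq")
        top = read_int(cpufreq_dir / "cpuinfo_max_freq")
        if cur is not None and top is not None:
            result[cpufreq_dir.parent.name] = (cur / 1000, top / 1000)
    return result


def temperatures_c(root="/sys/class/thermal"):
    """Map each thermal zone name to degrees Celsius.

    Assumes each zone's temp file reports millidegrees Celsius.
    """
    result = {}
    for zone in sorted(Path(root).glob("thermal_zone*")):
        milli = read_int(zone / "temp")
        if milli is not None:
            result[zone.name] = milli / 1000
    return result


def report():
    """Print one line per CPU and per thermal zone."""
    for cpu, (cur, top) in cpu_frequencies_mhz().items():
        # Arbitrary illustrative threshold: flag anything under half of max.
        flag = "  <-- low?" if cur < 0.5 * top else ""
        print(f"{cpu}: {cur:.0f} MHz of {top:.0f} MHz{flag}")
    for zone, temp in temperatures_c().items():
        print(f"{zone}: {temp:.1f} C")


if __name__ == "__main__":
    report()
```

A low reading on an idle machine is normal with on-demand scaling, so run it while the machine is under load (mid test suite, say). A CPU that stays pinned at its lowest speed while busy is the symptom described above.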

No ACPI fan registered at all. The I8K modules see the fan, and can even make it spin faster, but after 3-5 seconds the fan would slow down. Still the system would zoom up past 80C.

Switching back to Windows gave me my first clue — the fan came on, louder than it ever was under Linux, and spun madly for around five minutes. Running the tests showed the CPU temp never getting above 70C. Perhaps there were fan control issues? I found various comments to this effect, lots of random cpufreq comments in the kernel changelogs, and a number of really interesting bug reports in Launchpad. This forum post sums things up nicely. Lots of frustration, apparently since Jaunty, precious few answers.

I ran through Debian Stable, Fedora 12 (very nice), and Arch Linux (also nice if you are willing to spend a bit more time managing your system). Gentoo looked attractive, but I was too cranky by that point. All the systems performed the same, with minimal fan speed and lots of overheating. Only Fedora seemed to match the polish of Ubuntu; only Arch matched the package management of Debian/Ubuntu. FWIW.

Sunday night I bagged it; resigning myself to leaving Linux behind for at least a while. I don’t really have time for a distro I have to work to maintain. This laptop is a tool. My boss will happily fail to pay me for screwing around with Linux.

Eureka!

I went into the office Monday, and I had a problem. The last distro I had installed, Arch, never picked up my Windows XP partition as bootable, and it wasn’t in the Grub menu. I could noodle around in Grub, or I could just install Ubuntu again. Hey, they might fix it someday? Maybe in Lucid?

We couldn’t find a blank CD. What are the odds? Then we found an old one, but Brasero under Arch failed to burn it. Blech.

Then my Release Engineer/IT Guy/Resident Mad Scientist says he thinks he has one already made up. We run Ubuntu on a batch of Dell servers, so it is plausible. (any guesses where this is going?) But he doesn’t have Karmic Desktop — he has Karmic Server.

What the hell — I use the laptop as a server for all intents and purposes anyway, so what could it hurt?

I got it installed, then I installed X, gnome-core, sun-java6, and Eclipse. Ran my test. Blinked. Listened as the fans spun mightily.

Temp never got above 67C. Tests completed in 1799 seconds. Ha! Victory!

Dance of the happy coder!

Wha-at Happened?

Got me. I have some theories.

  • no acpid running. NO acpi modules installed. Yes, this means no suspend/resume — I don’t miss it. Boot is so fast I don’t care anymore. I suspect the kernel ACPI code of being … not quite happy with my D830.
  • different kernel — Ubuntu server installs the PAE kernel by default. What else might be different in that kernel?

Something is now keeping its grimy paws out of the BIOS’s way when it comes to fan control. Someone with more time than I have can work their way incrementally from an Ubuntu Server install to an Ubuntu Desktop install and figure out where the tripwire is.
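For anyone who takes up that hunt, one cheap first step is to compare which kernel modules are loaded on each install. The sketch below is a guess at a starting point, not something I ran: it assumes you have saved the contents of /proc/modules from both systems to text files, and the function names are invented for illustration. It only shows which module names differ; it says nothing about which difference matters.

```python
import sys


def module_names(proc_modules_text):
    """Extract module names from text in /proc/modules format.

    Each non-empty line begins with the module name, followed by
    whitespace-separated fields (size, reference count, and so on).
    """
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def compare_modules(server_text, desktop_text):
    """Return (only_on_server, only_on_desktop) as sorted lists of names."""
    server = module_names(server_text)
    desktop = module_names(desktop_text)
    return sorted(server - desktop), sorted(desktop - server)


if __name__ == "__main__":
    if len(sys.argv) == 3:
        with open(sys.argv[1]) as f1, open(sys.argv[2]) as f2:
            only_server, only_desktop = compare_modules(f1.read(), f2.read())
        print("Only on server: ", ", ".join(only_server) or "(none)")
        print("Only on desktop:", ", ".join(only_desktop) or "(none)")
    else:
        print("usage: compare_modules.py server-modules.txt desktop-modules.txt")
```

To use it, run `cat /proc/modules > server-modules.txt` on the Server install, do the same on a Desktop install, and pass both files to the script (the file and script names here are placeholders). Any ACPI-, fan-, or thermal-sounding module that appears only on the Desktop side would be where I'd start poking.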

But Ubuntu Server saved Linux for me.

[Via http://designbygravity.wordpress.com]
