Archive for May, 2009

Stylish Yet Functional

Friday, May 29th, 2009

I decided my desktop’s current appearance was worth sharing. Enjoy.

My MacBook Pro's desktop

The Mystery of the Horrible Boot Speed

Thursday, May 21st, 2009

My Mom recently complained to me that her Hewlett-Packard brand PC was booting awfully slowly. She blamed it on my actions, claiming that it had to be the copy of Safari that I installed. She’s terrified of installing anything on her computer for fear that it will “screw it up” or “slow it down”. In her case, it’s not a terribly unreasonable fear. If something does go wrong, she doesn’t know what to do to fix it. But blaming Safari for this problem did not make sense in the slightest, so I decided to take a peek under the hood and see why Windows Vista would boot so slowly on such a decent computer (besides the obvious, of course).

It took a while to find the Windows Vista equivalent to Windows XP’s BootVis tool, but apparently they made a new set of utilities for the task, called the Microsoft Windows Performance Toolkit. I won’t go into the usage of it, because you can Google it if you need that information. But I will show you how to interpret the results you get from it, and expose the culprit of this particular performance nightmare.

The first run showed a pretty horrendous sight:

Terrible Boot Graph

Unfortunately the first results were rather worthless, as they were absolutely littered with disk input/output caused by AVG:

Stupid AVG.

So I uninstalled AVG and ran the test a second time. This time, I got some conclusive results:

Okay, now we're talkin'

There are a few things to note about this graph. First of all, the core system boot sequence is shown to be taking 75 seconds (by “core system” I mean the time it takes between BIOS and loading a login screen). This is horrifying. Consider for a moment that Linux can finish the system boot in less than 20 seconds and reach a fully logged in graphical user interface in an additional 5-15 seconds. So in perspective, a total boot time of roughly 190 seconds (75 system + 115 userspace) is absolutely terrible.

It’s time to figure out the proximate cause. I noticed in the graph above that there is roughly a 20 second period during the system boot where there is minimal CPU usage, very few disk I/O requests, but the disk utilization is maxed out. This tells me that the few requests it does make are slow ones, and the system stalls while waiting on them. I decided to look at the disk access graph, which shows where the drive is physically reading/storing data at a point in time.

Good lord...

Good lord...

Criminy. That 20 second stall (highlighted in the second image above) is being caused by thrashing between some files that are physically very far apart on disk. Here’s a zoom on that section:

Good lord...
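To put a rough number on why that pattern hurts, here is a minimal sketch that compares how far the drive head has to travel when reads alternate between two distant files versus reading one file and then the other. The 50 GB gap, the 4 KB chunk size, and the read count are made-up values for illustration, not measurements from this trace.

```python
def head_travel(offsets):
    """Sum of the distances between consecutive read positions (in bytes)."""
    return sum(abs(b - a) for a, b in zip(offsets, offsets[1:]))

# Hypothetical layout: two files 50 GiB apart, each read in 1,000 small chunks.
CHUNK = 4096
FILE_A_START = 0
FILE_B_START = 50 * 1024**3
READS_PER_FILE = 1000

a_reads = [FILE_A_START + i * CHUNK for i in range(READS_PER_FILE)]
b_reads = [FILE_B_START + i * CHUNK for i in range(READS_PER_FILE)]

# Thrashing: A, B, A, B, ... so every single read crosses the gap.
interleaved = [pos for pair in zip(a_reads, b_reads) for pos in pair]

# Grouped: all of A, then all of B, so the gap is crossed exactly once.
grouped = a_reads + b_reads

ratio = head_travel(interleaved) / head_travel(grouped)
print(f"interleaved reads travel about {ratio:.0f}x farther")
```

This is only a distance model. A real drive also has rotational latency and caching, so treat the ratio as an illustration of the trend rather than a prediction of the stall time.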

Well, details are necessary now. We need to know what files it is thrashing between (and why) in order to be able to do anything about it.

Registry disk-thrash?

Aha! Success! Er, well, maybe not. The files are the “COMPONENTS” and “SOFTWARE” registry hives. This is pretty much bad news because the Windows defragmenter can’t resolve this problem for two reasons:

  • Microsoft’s defragmenter doesn’t optimize for the most part. It’s largely intended for quick-and-dirty defragmentation.
  • Even if it did optimize, it cannot move files that are open for reading or writing.

It’s very strange, though, that Windows didn’t simply read the two files in their entirety in two large reads instead of this silly thousands-of-tiny-reads crap. I think this could very well be a bug in Windows Vista. Come on, Microsoft. What were you thinking? A programming 101 student could tell you that a large disk read is faster than lots of smaller ones.
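If you want to see the large-read-versus-small-reads difference for yourself, something like the following sketch would do. It writes a throwaway 1 MB temp file, reads it once in a single request and once in 512-byte unbuffered requests, and times both. The file size and chunk size are arbitrary choices of mine, and because a freshly written file usually sits in the OS cache, the gap you measure reflects per-request overhead rather than the disk seeks that hurt in the trace above.

```python
import os
import tempfile
import time

def read_whole(path):
    """Read the file with one large request."""
    with open(path, "rb") as f:
        return f.read()

def read_in_chunks(path, chunk_size=512):
    """Read the same file with many small, unbuffered requests."""
    parts = []
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            parts.append(chunk)
    return b"".join(parts)

# Create a throwaway sample file filled with random bytes.
fd, sample_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(os.urandom(1024 * 1024))

start = time.perf_counter()
whole = read_whole(sample_path)
whole_seconds = time.perf_counter() - start

start = time.perf_counter()
chunked = read_in_chunks(sample_path)
chunked_seconds = time.perf_counter() - start

os.remove(sample_path)

print(f"one large read:   {whole_seconds:.6f} s")
print(f"many small reads: {chunked_seconds:.6f} s")
```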

So in order to work around the problem, I first needed to be able to defragment my Mom’s computer’s hard disk completely without running the installed operating system. It’s unfortunate she doesn’t have a Mac, because if she did, we could simply use Target Disk Mode which basically makes the computer behave like an external FireWire drive; all that would be necessary in such a case is hooking it up to another machine as you would with an external hard drive, and then defragment that way. Unfortunately for her, the easiest way to do this was to repartition the drive so that we could install a temporary copy of Windows in a second partition.

The second issue is that we needed to be able to optimize the drive, and not just defragment it. By “optimize”, I mean physically arranging files on disk in such a way that they’re quickly accessed during the system’s boot sequence. This is different from defragmentation, which simply consolidates split files. Thankfully, there’s an awesome open source defragmenter available called JkDefrag, which is capable of optimizing as well as defragmenting. Sweet!

So I installed Windows 7 RC1 onto another partition (created via Disk Management), and then ran JkDefrag with the command-line option ‘-a 7’, which tells JkDefrag to optimize by arranging the files alphabetically based on their path and file name. This should cluster the contents of folders together, making accesses much faster.
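Here is a small sketch of why sorting by full path tends to put a folder’s contents next to each other. The file list is invented for the example, and a plain case-insensitive string sort is my assumption; I don’t know exactly how JkDefrag compares names internally.

```python
from itertools import groupby
from ntpath import dirname  # Windows path rules, usable on any platform

# Hypothetical files, listed in the scattered order they might sit on disk.
paths = [
    r"C:\Windows\System32\config\SOFTWARE",
    r"C:\Users\Mom\Documents\letter.doc",
    r"C:\Windows\System32\config\COMPONENTS",
    r"C:\Program Files\Safari\Safari.exe",
    r"C:\Windows\System32\ntoskrnl.exe",
    r"C:\Users\Mom\Documents\budget.xls",
]

# Arrange alphabetically by full path, ignoring case.
ordered = sorted(paths, key=str.lower)

# Consecutive runs of files that share a parent folder.
folder_runs = [(folder, len(list(run))) for folder, run in groupby(ordered, key=dirname)]

for folder, count in folder_runs:
    print(f"{count} file(s) together in {folder}")
```

In this sample the two registry hives end up side by side. A folder that mixes files and subfolders can still get split into more than one run by this kind of sort, so it is a tendency rather than a guarantee.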

After a couple of hours, it finally became bored and finished defragmenting, so I rebooted to re-time Windows Vista’s boot sequence. It was noticeably faster already, but I needed numbers. And I was quite pleased with the results:

A beautiful sight

Note that there’s no obvious point at which a bottleneck is noticeable. Neither the CPU nor the hard drive is maxed out for too long at any one point. And then there’s the physical layout access graph:

More beauty.

Awesome. JkDefrag did its job and the files are much better clustered. Better yet, the raw numbers are even more impressive. The system boot has been reduced to 45 seconds, which is 60% of the time it took previously. Based on the disk and CPU activity, the machine’s boot is complete after about 90 seconds, which is 47% of the time it took previously. (Keep in mind that xbootmgr continues measuring for 120 seconds after the user logs in, so we can’t just use the last time marker on the boot graph as the time it takes to boot. You have to look at disk activity and CPU utilization, and figure out the point where the constant activity ceases and the machine begins to settle.)
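For anyone who wants to check the percentages, the arithmetic is short enough to spell out. The figures are the ones read off the graphs in this post; the variable names are just mine.

```python
# Before: 75 s system boot + 115 s userspace, as read from the first good graph.
before_system = 75
before_userspace = 115
before_total = before_system + before_userspace

# After: 45 s system boot, boot complete at about 90 s.
after_system = 45
after_total = 90

system_ratio = after_system / before_system
total_ratio = after_total / before_total

print(f"system boot: {system_ratio:.0%} of the previous time")
print(f"full boot:   {total_ratio:.0%} of the previous time")
```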

So now my mom’s machine boots pretty quickly. Now I need to move on to my next project: Operation: Get Mom to Stop Using Internet Explorer.

Firefox 3 OpenSSL Woes

Friday, May 15th, 2009

Well, that was frustrating. I stayed up late last night trying to figure out why my Apache 2.2 server was misbehaving, and it ended up being a Firefox 3 bug. Or more specifically, a bug in the old OpenSSL library that Firefox 3 is statically linked to.

I tried to view our cgit page, and only got this:

And if I refreshed a few times, I got this ‘ssl_error_rx_unexpected_change_cipher’ error:

But strangely, if I used Safari, I got exactly what I should have:

I finally figured out that Firefox was screwing up when using the TLS 1.0 protocol. So all I had to do was edit my Apache 2.2 httpd.conf to have this line:

SSLProtocol -SSLv2 +SSLv3 -TLSv1

I don’t like this solution though. The Mozilla Firefox team should release a build of Firefox that is statically linked to a newer version of the OpenSSL library which doesn’t suffer from this bug. I much prefer TLS because of the added security it provides, so I ended up with a line that opted for security over broken SSL library compatibility:

SSLProtocol -SSLv2 -SSLv3 +TLSv1

Note that I disable SSL v2 as well, because it’s widely known to be inherently flawed.
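To make the difference between those two SSLProtocol lines explicit, here is a toy model of how the plus and minus tokens combine. It is not mod_ssl’s actual parser: the function name is made up, it only knows the three protocols mentioned in this post, and the empty starting set is my assumption (both lines name every protocol explicitly, so the starting set doesn’t change the result here).

```python
KNOWN_PROTOCOLS = {"SSLv2", "SSLv3", "TLSv1"}

def enabled_protocols(directive_args, start=frozenset()):
    """Apply +Name / -Name tokens left to right and return the enabled set."""
    enabled = set(start)
    for token in directive_args.split():
        sign, name = token[0], token[1:]
        if sign not in ("+", "-") or name not in KNOWN_PROTOCOLS:
            raise ValueError(f"unsupported token in this toy model: {token!r}")
        if sign == "+":
            enabled.add(name)
        else:
            enabled.discard(name)
    return enabled

workaround = enabled_protocols("-SSLv2 +SSLv3 -TLSv1")
preferred = enabled_protocols("-SSLv2 -SSLv3 +TLSv1")

print("workaround line enables:", sorted(workaround))
print("preferred line enables: ", sorted(preferred))
```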