Archive for the 'Programming' Category

Acid 3 is out! IE 5.5 beats IE 7?

Thursday, March 6th, 2008

I am quite excited. The Acid 3 web standards-compliance test is now up, and there’s no browser in existence yet that fully succeeds at the test.

Let me explain what these results are telling us before I show them. The World Wide Web Consortium (W3C) develops standards for the Web, such as XHTML, HTML, and CSS. Acid 3 takes a lot of the W3C’s newer standards and tests that the browser supports the features it should and behaves as it should when using them. Acid 3 is a collection of 100 subtests that check whether these standards are being met. So when I say that browser X gets a score of 55%, this means that browser X passed 55 of the 100 subtests.

Updated March 8, 2008. Camino nightlies switched to Gecko 1.9 to bring it up to par with Firefox and Seamonkey nightlies. All Gecko 1.9 browsers are at 69% currently (were at 68%).

Updated March 9, 2008. Too many updates to record. I’ll start logging changes again after the Slashdot effect dissipates.

Updated March 10, 2008. I’ve received quite a few emails about different results people have gotten with their browser. I’m working on adding them all to the list.

Updated March 11, 2008. Split browser list into two sections (release and beta). Added Mobile Safari to release list. Updated Gecko nightlies (now at 70%). Added SeaMonkey 1.1.8 to release list. Removed Firefox 3.0b3 and added 3.0b4 to the beta list.

Updated March 13, 2008. I’ve received multiple requests to add Shiira to the list. Unfortunately, Shiira crashes after attaining a score of 26. I’m not going to add it until it makes it through the test (failing or not) without crashing.

Updated March 15, 2008. The latest WebKit nightly (r31078) is now at 91%.

Updated March 17, 2008. The latest WebKit nightly (r31114) is now at 93%.

Updated March 18, 2008. Safari 3.1 has been released. I’ve been informed that Acid 3 itself has been updated, so I will be rerunning the tests on all listed browsers. I don’t have an iPhone, so if someone could send me a result for Mobile Safari (assuming there’s an update for its version of Safari as well), that would be great. More importantly, if you see any inconsistencies here (browsers with incorrect scores), please let me know.

Updated March 22, 2008. The latest WebKit nightly (r31232) is now at 95%.

Updated March 25, 2008. The latest WebKit nightly (r31306) is now at 96%.

Updated March 26, 2008. The latest WebKit nightly (r31323) is now at 98%.

Updated March 26, 2008. The latest WebKit nightly (r31356) now passes Acid 3 (100%).

Updated March 27, 2008. People keep sending me Opera’s claim of victory on Acid 3, but since I cannot verify their results, I’m not listing it here. Once the build becomes public, I’ll add the result (don’t hesitate to email me when this happens).

Updated March 28, 2008. Opera has released a public build which passes Acid 3. I’m going to test it momentarily.

Updated March 31, 2008. Opera has released 9.50 beta (build 9864), which gets 79%.

Thanks to all who have submitted their results. I was going to try to maintain a list of people to thank individually, but it’s becoming too difficult to keep track of who has sent me results while also getting those results accurately depicted here.

These results are public domain, but I would very much appreciate a link back to this page if you copy this table somewhere else. I will keep updating this table as the race continues, and if everyone copies my table, they’ll have to maintain theirs as well. Mine’s not going anywhere, just link to it. :)

Beta Browsers
Browser | Version | Operating System | Acid 3 Score
Safari | WebKit Nightly (r31388) | Mac OS X 10.5.2 | 100% (0.44s)
Opera | 9.0 Beta (build 636) “WinGogi” | Windows Vista (32-bit) | 100% (1.31s)
Opera | 9.50 Beta (build 9864) | Windows XP Service Pack 2 | 79%
Camino | 2.0a1pre nightly (2008031801) | Mac OS X 10.5.2 | 71%
Firefox | 3.0b5pre nightly (2008031804) | Mac OS X 10.5.2 | 71%
SeaMonkey | 2.0a1pre nightly (2008031801) | Mac OS X 10.5.2 | 71%
Flock | 1.3pre (2008031604) | Mac OS X 10.5.2 | 70%
Firefox | 3.0b4 (2008030317) | Mac OS X 10.5.2 | 68%
Firefox | 3.0b4 (2008030714) | Windows XP Service Pack 2 | 68%
Firefox | 3.0b4 (2008030714) | Windows Vista (32-bit) | 68%
Internet Explorer | 8.0.6001.17184 (Beta) | Windows Vista (32-bit) | 18%
Internet Explorer | 8.0.6001.17184 (Beta) | Windows XP Service Pack 2 | 18%

 

Released Browsers
Browser | Version | Operating System | Acid 3 Score
Safari | 3.1 (5525.13) * | Mac OS X 10.5.2 | 75%
Safari | 3.1 (525.13) * | Windows Vista (32-bit) | 75%
Konqueror | 4.0.3 | Linux | 65%
Epiphany | 2.22 | Ubuntu 8.04 (beta) | 59%
Camino | 1.5.5 | Mac OS X 10.5.2 | 52%
Firefox | 2.0.0.12 (20080201) | Mac OS X 10.5.2 | 52%
Firefox | 2.0.0.12 (20080201) | Windows Vista (32-bit) | 52%
Firefox | 2.0.0.12 (20080201) | Windows XP Service Pack 2 | 52%
Flock | 1.1 (20080304) | Mac OS X 10.5.2 | 52%
SeaMonkey | 1.1.8 (20080201) | Mac OS X 10.5.2 | 52%
Firefox | 2.0.0.12 (20080201) | CentOS 5 | 51%
Firefox | 2.0.0.12 (20080201) | Windows Vista (64-bit) | 51%
Epiphany | 2.20.3 | Gentoo Linux | 51%
Konqueror | 3.5.9 | Gentoo Linux | 51%
Firefox | 1.5.0.12 (20070508) | Windows XP Service Pack 2 | 50%
Opera | 9.26 | Mac OS X 10.5.2 | 46%
Opera | 9.26 | Windows Vista (32-bit) | 46%
Safari | 3.0.4 (5523.15) * | Mac OS X 10.5.2 | 39%
Safari | 3.0.4 (523.15) * | Windows XP Service Pack 2 | 39%
Safari | 3.0.4 (523.15) * | Windows Vista (32-bit) | 39%
Mobile Safari | 3.0 (420.1) | iPhone | 39%
Internet Explorer | 5.50.4807.2300 (SP2) | Windows XP Service Pack 2 (Multiple IE) | 14%
Internet Explorer | 5.50.4134.0600 | Windows ME | 14%
Internet Explorer | 5.50.4807.2300 (SP2) | Windows 2000 Service Pack 4 | 13%
Internet Explorer | 7.0.5730.13 | Windows XP Service Pack 2 | 12%
Internet Explorer | 7.0.6000.16609 | Windows Vista (32-bit) | 12%
Internet Explorer | 6.0 | Windows XP Service Pack 2 | 12%
Internet Explorer | 6.0 | Windows 2000 Service Pack 4 | 11%

* I don’t know whether Apple meant to put 5xx or 5xxx. I highly doubt that they intended to use such an inconsistent versioning scheme, but I’m going to cite here whatever they put on the About window. Thanks to Mark Rowe from Apple for explaining this: “The leading digit of the build number signifies the platform version. 45xx.y.z is Tiger (10.4), 55xx.y.z is Leopard (10.5), and 5xx.y.z is Windows.”
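The rule in that quote can be written down as a small lookup. This is only a sketch of the scheme as Mark Rowe describes it; the function name `safari_platform` is made up for illustration, and any build number outside the three quoted patterns is reported as unknown rather than guessed.

```python
def safari_platform(build: str) -> str:
    """Guess the platform from a Safari build number such as '5525.13'.

    Follows the quoted rule only: 45xx is Tiger, 55xx is Leopard, and a
    three-digit 5xx leading component is Windows.
    """
    major = build.split(".")[0]          # '5525.13' -> '5525'
    if len(major) == 3 and major.startswith("5"):
        return "Windows"
    if len(major) == 4 and major[1] == "5":
        # The leading digit of a four-digit component names the OS release.
        return {"4": "Mac OS X 10.4 (Tiger)",
                "5": "Mac OS X 10.5 (Leopard)"}.get(major[0], "unknown")
    return "unknown"
```

With the build numbers from the table, `safari_platform("5525.13")` gives Leopard and `safari_platform("525.13")` gives Windows.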

I guess it really comes as no surprise that Internet Explorer is currently in last place. But really, how did Internet Explorer 7 and 6 do worse than v5.5?

Note: If you want to see another browser in this listing, let me know and I will try my best to test and add it here. Please include the browser’s name and screenshots of both the result and the About window.

CheckMark: A checksum algorithm benchmark

Friday, February 1st, 2008

I’ve been working on a new benchmark for the past several weeks (would have finished it much sooner too, had it not been for college). I think the results are very much worth sharing.

The first thing to know is that the source code to the new benchmark (I’ve named it CheckMark) is one of the CrissCross library’s example programs and is available in the CrissCross source repository. It will be more easily accessible when it goes into the CrissCross v0.7.0 release (probably in a few weeks).

CheckMark has several different checksum/hashing algorithms in it: Adler-32, CRC32, MD4, MD5, SHA-1, SHA-256, SHA-512, and Tiger. Tiger and SHA-512 were written to be especially fast on 64-bit machines (they use 64-bit word sizes), and the rest are optimized for 32-bit use.

The benchmarks I’ve done are fairly simple to reproduce. First, for consistency, it’s best to run in an environment as close to the one I used as possible. I run the Linux 2.6.24 kernel and use GCC 4.1.2. To compile and run CheckMark, first get a copy of the source code using Subversion, and once the source is downloaded, run ‘make check’ in the main folder. If your machine, compiler, and CPU pass the CrissCross test suite, go ahead and run ‘make example’. Then, assuming the build goes fine, run CheckMark with ‘examples/CheckMark/checkmark’. CheckMark generates 100 1024-byte strings and times how long it takes to hash all of them 500 times. The output of the program for each hash algorithm is the number of hashes calculated per second. Since each string is 1024 bytes long, the hashes-per-second number is equivalent to the number of kilobytes hashed per second.
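For readers who just want the shape of the measurement, here is a rough Python sketch of the same loop. It is not the CheckMark source: it uses Python's `hashlib` rather than the CrissCross implementations, fills the strings with random bytes (an assumption, since the input contents aren't specified above), and the function name `hashes_per_second` is invented for this example.

```python
import hashlib
import os
import time


def hashes_per_second(algorithm: str, strings: int = 100,
                      size: int = 1024, rounds: int = 500) -> float:
    """Hash `strings` buffers of `size` bytes `rounds` times; return hashes/sec."""
    # Assumption: random contents. Build the inputs before starting the clock.
    buffers = [os.urandom(size) for _ in range(strings)]

    start = time.perf_counter()
    for _ in range(rounds):
        for buf in buffers:
            hashlib.new(algorithm, buf).digest()
    elapsed = time.perf_counter() - start

    # With 1024-byte inputs this figure is also kilobytes hashed per second.
    return (strings * rounds) / elapsed


if __name__ == "__main__":
    for name in ("sha1", "sha256", "sha512"):
        print(f"{name}: {hashes_per_second(name):.0f} hashes/s")
```

The numbers this prints are dominated by Python call overhead, so they are not comparable to CheckMark's; only the structure of the timing loop carries over.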

The machines I tested with CheckMark used the following processors in 32-bit mode: ARM9 (specifically, the one used in my Nintendo DS), Intel Core Duo, Intel Core 2 Duo, Intel Xeon (with a Pentium 4 core, a.k.a. NetBurst), and a PowerPC G4. I also tested the Intel Core 2 Duo in 64-bit mode to show the speed difference in 32 vs 64-bit modes.

My results are pretty intriguing, and they can be found here. Please note that the graphs show clock cycles per hash, so a smaller bar is better: it means the processor is accomplishing more work per clock cycle. Also note that since the graphs are based on clock cycles rather than time, they are normalized for clock speed. Each processor’s clock speed is factored in when calculating the number of clock cycles per hash. This shows more of the raw performance of the processor and how efficient it truly is. This just goes to show yet again that clock speed is not everything!
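The conversion from hashes per second to clock cycles per hash is a single division. The sketch below uses made-up numbers purely to show the arithmetic; they are not measurements from any of the machines listed above.

```python
def cycles_per_hash(clock_hz: float, hashes_per_second: float) -> float:
    """Clock cycles spent on each hash: cycles per second / hashes per second."""
    return clock_hz / hashes_per_second


# Hypothetical figures, chosen only to illustrate the point.
fast_clock = cycles_per_hash(2.0e9, 100_000)   # 2 GHz, 100,000 hashes/s
slow_clock = cycles_per_hash(1.0e9, 80_000)    # 1 GHz, 80,000 hashes/s

print(fast_clock, slow_clock)
```

In this invented case the 2 GHz part finishes more hashes per second, yet it spends 20,000 cycles on each one while the 1 GHz part spends 12,500, so the slower-clocked processor gets the smaller (better) bar.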

Okay then, how are we to read these graphs? What do these results mean? Well, the most obvious thing I see when looking at the graphs is that you probably shouldn’t bother with SHA-512 or Tiger hashes unless you’re running on a 64-bit processor in 64-bit mode. It’s also plain to see that the Core 2’s 64-bit mode easily outperforms its 32-bit mode when doing 64-bit math. This makes perfect sense, as the SHA-512 and Tiger algorithms make heavy use of 64-bit math. In 32-bit mode, a lot of extra work has to be done to complete a single 64-bit math operation.
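To give a feel for that extra work, here is a sketch of a single 64-bit addition built only from 32-bit pieces, which is roughly what a compiler has to emit in 32-bit mode. It is an illustration in Python, not code from CheckMark, and the function name is invented.

```python
MASK32 = 0xFFFFFFFF


def add64_with_32bit_ops(a: int, b: int) -> int:
    """Add two 64-bit values using only 32-bit-wide steps (wraps on overflow)."""
    # Split each operand into low and high 32-bit words.
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32

    # Step 1: add the low words and remember the carry out of bit 31.
    lo = a_lo + b_lo
    carry = lo >> 32
    lo &= MASK32

    # Step 2: add the high words plus that carry, discarding overflow.
    hi = (a_hi + b_hi + carry) & MASK32

    return (hi << 32) | lo
```

One 64-bit add becomes two adds plus carry handling, and rotates or multiplies fare worse. In 64-bit mode the whole thing is one instruction.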

Another observation I could make about these graphs is that the PowerPC just isn’t doing well at any of them, except SHA-1, where it appears to do well because the NetBurst core takes a huge performance hit. I’m fairly certain that the NetBurst’s speed issues here are due to the fact that the NetBurst core lacks a barrel shifter, the component which makes shift and rotate operations blindingly fast. There are a large number of shifts and rotates that occur in the SHA-1 algorithm, so the performance hit is understandable. Not surprisingly, the barrel shifter has returned in the Core and Core 2 architectures.
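For reference, the rotate that SHA-1 leans on is a 32-bit left rotate, applied with counts of 1, 5, and 30 throughout its 80 rounds. A minimal sketch of the operation, with the name `rotl32` chosen for this example:

```python
def rotl32(value: int, count: int) -> int:
    """Rotate a 32-bit value left by `count` bits."""
    count %= 32
    value &= 0xFFFFFFFF
    # Bits shifted out of the top come back in at the bottom.
    return ((value << count) | (value >> (32 - count))) & 0xFFFFFFFF
```

On hardware that handles shifts by an arbitrary count quickly, this is a single cheap instruction. If shifts are slow, every one of those rotates costs extra, and they sit on the critical path of each round.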

I was very impressed with how efficient the ARM9 processor was at doing these hashing algorithms. It was accomplishing so much more work per clock, even though it is a RISC processor, like the PowerPC. The only case where the ARM9 does very badly is when 64-bit math is involved.

I would love to hear your thoughts about this, as well as your own results, if you’d like me to add them to the list. Email me directly at steven@uplinklabs.net and let me know what’s on your mind!

Something I’ve Said For Years

Thursday, January 10th, 2008

Check it out. Some PhDs are slamming Java. As I said in 2006, Java’s far too much theory, and not enough practical low-level knowledge.

Computer Engineering > Computer Science

Monday, December 18th, 2006

Why would a computer engineering degree be better than a computer science one? Computer science works in the theoretical plane: stuff that you have to run for months on end on a supercomputer, or that can’t even be solved for years. Engineering is about making things work with what you have to meet a set of constraints, and deals in milliseconds or microseconds. In game programming, you have a CPU, a GPU, memory that has a certain latency, a disk with a certain seek time, etc. You need to accomplish so much every 60th of a second given those hardware constraints or else the game will suck.

Put another way, computer science is about exploring new ideas without having to worry about the practical implementation, which is the engineering portion. Take prime numbers, for example. Computer scientists love to talk about ridiculously large prime numbers and all you can do with such things, like unbreakable encryption and such. Engineering is taking that abstract algorithm and making it work on, say, a 200MHz ARM processor running on your battery-powered MP3 player, where things like battery life and response time matter.

What I’ve found is that computer science majors write really bad code. Not code that doesn’t work, but code that’s slow. They don’t know how to optimize. They can’t tell you the difference between an AMD Athlon XP and an Intel Pentium 4. They can’t explain why the new Core 2 is such a good thing. In their little abstract world of trees and lists and Java, they don’t need to understand the low-level hardware. Many of them can’t even read an x86 disassembly or tell me the first thing about how many registers a Pentium processor has or what the registers are for.

Parallelization is the big thing right now. The Xbox 360, for example, has a 6-way processor. How does one write video games when you can have 6 concurrent pieces of code running at the same time? How does that change your rendering engine? Your game logic? Memory allocation? The PlayStation 3 has nine hardware threads. This totally changes the way one writes video games from now on.

Computer Science has a long-standing solution to concurrency: the concept of a lock. In Linux and Windows, these are synchronization objects known as locks, mutexes, semaphores, and other names. They’re used in every operating system and just about every shipping Windows and Linux application today. Even calling malloc() is a serializing operation on the memory heap that takes a lock. Have multiple threads call malloc() and they basically get to stand in line (a queue data structure) and execute serially. That’s not very parallel. So once again, Computer Science has given us something that doesn’t translate well into real-world performance.
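The standing-in-line effect is easy to model. The sketch below is a deliberately simplified stand-in, not how any particular malloc() is implemented: `LockedAllocator` is a hypothetical class that guards a pretend heap with one global lock and records how many threads were ever inside the allocation path at the same moment.

```python
import threading


class LockedAllocator:
    """Hypothetical heap guarded by a single global lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._inside = 0          # threads currently in the critical section
        self.max_inside = 0       # highest value _inside ever reached
        self.allocations = 0

    def allocate(self, size: int) -> bytearray:
        with self._lock:          # every caller queues up here
            self._inside += 1
            self.max_inside = max(self.max_inside, self._inside)
            block = bytearray(size)
            self.allocations += 1
            self._inside -= 1
        return block


def run(threads: int = 4, per_thread: int = 1000) -> LockedAllocator:
    heap = LockedAllocator()

    def worker() -> None:
        for _ in range(per_thread):
            heap.allocate(64)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return heap
```

However many threads you start, `max_inside` never rises above 1: the allocations all happen, but one at a time.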

My point is that you should NOT go into pure computer science. You’ll rot your head with abstract ideas and end up writing very poor code. Take either computer science with electrical (or computer) engineering electives, what’s known as “CS triple E”, or take computer engineering to get the best of both worlds. If you can’t visualize an algorithm or a piece of code and understand what it touches in the microprocessor, in the memory, and on the disk, and what the costs and delays of all the steps are, you’re going to write crappy code.