CheckMark: A checksum algorithm benchmark
by Steven Noonan
I’ve been working on a new benchmark for the past several weeks (would have finished it much sooner too, had it not been for college). I think the results are very much worth sharing.
The first thing to know is that the source code for the new benchmark, which I’ve named CheckMark, is one of the CrissCross library’s example programs and is available in the CrissCross source repository. It will be more easily accessible when it goes into the CrissCross v0.7.0 release (probably in a few weeks).
CheckMark has several different checksum/hashing algorithms in it: Adler-32, CRC32, MD4, MD5, SHA-1, SHA-256, SHA-512, and Tiger. Tiger and SHA-512 were written to be especially fast on 64-bit machines (they use 64-bit word sizes), and the rest are optimized for 32-bit use.
The benchmark is fairly simple to reproduce. For consistency, it’s best to run it in an environment as close to mine as possible: I run the Linux 2.6.24 kernel and use GCC 4.1.2. To compile and run CheckMark, first get a copy of the source code using Subversion, then run ‘make check’ in the main folder. If your machine, compiler, and CPU pass the CrissCross test suite, go ahead and run ‘make example’. Assuming the build goes fine, run CheckMark with ‘examples/CheckMark/checkmark’.
CheckMark generates 100 1024-byte strings and times how long it takes to hash all of them 500 times. For each hash algorithm, the program outputs the number of hashes calculated per second. Since each string is 1024 bytes long, the hashes-per-second figure is also the number of kilobytes hashed per second.
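For anyone who wants a feel for the measurement without building CrissCross, here is a minimal sketch of the same loop in Python. It is not CheckMark’s actual source: it leans on Python’s standard hashlib instead of the CrissCross hash classes, the function and constant names are my own, and the numbers it prints are not comparable to the graphs.

```python
import hashlib
import os
import time

STRING_COUNT = 100   # number of test strings, as described above
STRING_SIZE = 1024   # bytes per string
PASSES = 500         # how many times the whole set is hashed


def benchmark(algorithm, strings, passes=PASSES):
    """Return hashes per second for one hashlib algorithm name."""
    start = time.perf_counter()
    for _ in range(passes):
        for s in strings:
            hashlib.new(algorithm, s).digest()
    elapsed = time.perf_counter() - start
    return (passes * len(strings)) / elapsed


def main():
    # Random data stands in for whatever strings CheckMark generates.
    strings = [os.urandom(STRING_SIZE) for _ in range(STRING_COUNT)]
    for name in ("sha1", "sha256", "sha512"):
        rate = benchmark(name, strings)
        # Each string is 1024 bytes, so hashes/s doubles as kilobytes/s.
        print(f"{name:8s} {rate:12.0f} hashes/s")


if __name__ == "__main__":
    main()
```

Because the interpreter overhead is included in the timing, treat this as a picture of the method rather than a replacement for the real benchmark.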
The machines I tested with CheckMark used the following processors in 32-bit mode: ARM9 (specifically, the one used in my Nintendo DS), Intel Core Duo, Intel Core 2 Duo, Intel Xeon (with a Pentium 4 core, a.k.a. NetBurst), and a PowerPC G4. I also tested the Intel Core 2 Duo in 64-bit mode to show the speed difference between 32-bit and 64-bit modes.
My results are pretty intriguing, and they can be found here. Please note that in the graphs, a smaller bar is better, because it means the processor needs fewer clock cycles per hash and is therefore accomplishing more work per clock cycle. Also note that the graphs are based on clock cycles per hash rather than on time. The clock speed of each processor is taken into account when calculating the number of clock cycles per hash, so a higher clock speed alone doesn’t make a bar any shorter. This shows more of the raw performance of the processor and how efficient it truly is. This just goes to show yet again that clock speed is not everything!
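To make the conversion concrete, here is a small sketch of how a hashes-per-second figure can be turned into clock cycles per hash. The function name and the two sample machines are made up for illustration; they are not measurements from my graphs.

```python
def cycles_per_hash(clock_hz, hashes_per_second):
    """Clock cycles spent on each hash: cycles per second / hashes per second."""
    return clock_hz / hashes_per_second


# Hypothetical machine A: 2.0 GHz, 200,000 hashes per second.
machine_a = cycles_per_hash(2.0e9, 200_000)

# Hypothetical machine B: 1.0 GHz, 125,000 hashes per second.
machine_b = cycles_per_hash(1.0e9, 125_000)

print(machine_a)  # 10000.0 cycles per hash
print(machine_b)  # 8000.0 cycles per hash
```

In this made-up case machine A finishes more hashes per second, but machine B gets the shorter bar because it spends fewer cycles on each hash.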
Okay then, how are we to read these graphs? What do these results mean? Well, the most obvious thing I see when looking at the graphs is that you probably shouldn’t bother with SHA-512 or Tiger hashes unless you’re running on a 64-bit processor in 64-bit mode. It’s also plain to see that the Core 2’s 64-bit mode easily outperforms its 32-bit mode when doing 64-bit math. This makes perfect sense, as the SHA-512 and Tiger algorithms make heavy use of 64-bit math. In 32-bit mode, a lot of extra work has to be done to complete a single 64-bit math operation.
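As a rough illustration of that extra work, here is a sketch that performs a 64-bit add and a 64-bit rotate using only 32-bit halves, which is roughly what a compiler has to arrange on a 32-bit target. Python integers aren’t fixed-width, so the masking is explicit, and the function names are my own.

```python
MASK32 = 0xFFFFFFFF


def add64_with_32bit_ops(a, b):
    """Add two 64-bit values using 32-bit halves and an explicit carry."""
    a_lo, a_hi = a & MASK32, a >> 32
    b_lo, b_hi = b & MASK32, b >> 32
    lo = a_lo + b_lo
    carry = lo >> 32                      # 1 if the low halves overflowed
    lo &= MASK32
    hi = (a_hi + b_hi + carry) & MASK32   # wraps like a 64-bit register
    return (hi << 32) | lo


def rotl64_with_32bit_ops(x, n):
    """Rotate a 64-bit value left by n bits (0 < n < 32) using 32-bit halves."""
    lo, hi = x & MASK32, x >> 32
    new_hi = ((hi << n) | (lo >> (32 - n))) & MASK32
    new_lo = ((lo << n) | (hi >> (32 - n))) & MASK32
    return (new_hi << 32) | new_lo


print(hex(add64_with_32bit_ops(0x00000000FFFFFFFF, 1)))  # 0x100000000
```

One 64-bit add becomes two adds plus carry handling, and one rotate becomes four shifts and two ORs. In 64-bit mode each of these is a single operation.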
Another observation I can make about these graphs is that the PowerPC just isn’t doing well at any of them, except SHA-1, where it only appears to do well because the NetBurst core takes a huge performance hit. I’m fairly certain that the NetBurst’s speed issues here are because the NetBurst core lacks a barrel shifter, the component which makes shift and rotate operations blindingly fast. A large number of shifts and rotates occur in the SHA-1 algorithm, so the performance hit is understandable. Not surprisingly, the barrel shifter has returned in the Core and Core 2 architectures.
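To give a sense of how rotate-heavy SHA-1 is, here is a compact sketch of the algorithm in Python that counts every 32-bit rotate it performs. This is an illustration written for this post, not the CrissCross implementation, and the function names are my own.

```python
import hashlib
import struct

MASK32 = 0xFFFFFFFF


def rotl32(x, n):
    """Rotate a 32-bit value left by n bits (0 < n < 32)."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def sha1_counting_rotates(data):
    """Return (digest, number_of_rotates) for a bytes input."""
    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]
    rotates = 0
    # Standard padding: 0x80, zeros, then the message length in bits.
    padded = data + b"\x80"
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    padded += struct.pack(">Q", len(data) * 8)
    for off in range(0, len(padded), 64):
        w = list(struct.unpack(">16I", padded[off:off + 64]))
        # Message schedule: one rotate for each of the 64 expanded words.
        for t in range(16, 80):
            w.append(rotl32(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
            rotates += 1
        a, b, c, d, e = h
        # 80 rounds, two rotates per round.
        for t in range(80):
            if t < 20:
                f, k = (b & c) | (~b & d), 0x5A827999
            elif t < 40:
                f, k = b ^ c ^ d, 0x6ED9EBA1
            elif t < 60:
                f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
            else:
                f, k = b ^ c ^ d, 0xCA62C1D6
            temp = (rotl32(a, 5) + f + e + k + w[t]) & MASK32
            a, b, c, d, e = temp, a, rotl32(b, 30), c, d
            rotates += 2
        h = [(x + y) & MASK32 for x, y in zip(h, (a, b, c, d, e))]
    return struct.pack(">5I", *h), rotates


digest, rotates = sha1_counting_rotates(b"a" * 1024)
assert digest == hashlib.sha1(b"a" * 1024).digest()
print(rotates)  # 3808
```

Each 64-byte block costs 224 rotates (64 in the message schedule plus 2 in each of the 80 rounds), and a 1024-byte string pads out to 17 blocks, so every hash of one test string involves 3808 rotates. If each rotate is slow, that adds up quickly.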
I was very impressed with how efficient the ARM9 processor was at these hashing algorithms. It accomplishes a lot of work per clock cycle, even though it is a RISC processor like the PowerPC, which did poorly. The only case where the ARM9 does very badly is when 64-bit math is involved.
I would love to hear your thoughts about this, as well as your own results, if you’d like me to add them to the list. Email me directly at steven@uplinklabs.net and let me know what’s on your mind!



