Update For Java Benchmark

About half a year ago I published my first results for a C vs. JVMs benchmark. Several new versions have appeared since then, so I thought it was time for another run.

Some words about the benchmarks

Four of the five benchmarks stem from benchmarks found on The Computer Language Benchmarks Game. They have been modified to reveal the peak performance of the virtual machines, which means that each benchmark basically runs 10 times in a single process. The first run might be negatively influenced by the JIT compiler and isn’t counted; only the remaining 9 runs are used to compute the average duration.
The fifth benchmark is called “himeno benchmark” and was ported from C to Java. Himeno runs long enough that the warm-up phase doesn’t matter much.
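
The measuring scheme described above (10 runs in one process, first run discarded, average of the remaining 9) can be sketched roughly as follows. This is only a minimal illustration and not the harness used for the results in this post: the class name, the method names and the placeholder workload are all made up for the example.

```java
public class WarmupHarness {

    // Sink for the placeholder workload so the JIT cannot discard its result.
    static volatile double sink;

    /** Runs the task `runs` times in this process and returns each duration in milliseconds. */
    static double[] measure(Runnable task, int runs) {
        double[] millis = new double[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            millis[i] = (System.nanoTime() - start) / 1e6;
        }
        return millis;
    }

    /** Averages the durations after skipping the first `warmupRuns` entries. */
    static double meanAfterWarmup(double[] millis, int warmupRuns) {
        if (warmupRuns < 0 || warmupRuns >= millis.length) {
            throw new IllegalArgumentException("need at least one counted run");
        }
        double sum = 0.0;
        for (int i = warmupRuns; i < millis.length; i++) {
            sum += millis[i];
        }
        return sum / (millis.length - warmupRuns);
    }

    public static void main(String[] args) {
        // Hypothetical workload; a real benchmark (mandelbrot, nbody, ...) would go here.
        Runnable workload = () -> {
            double acc = 0.0;
            for (int i = 1; i <= 5_000_000; i++) {
                acc += Math.sqrt(i);
            }
            sink = acc;
        };
        double[] millis = measure(workload, 10);      // 10 runs in a single process
        double average = meanAfterWarmup(millis, 1);  // drop run 1, average runs 2..10
        System.out.printf("average over %d counted runs: %.2f ms%n",
                millis.length - 1, average);
    }
}
```

Keeping the timing loop and the averaging in separate methods makes it easy to look at the raw per-run durations as well, which shows how much slower the first (warm-up) run really is.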

Compilers and JVMs used in this comparison

  • GCC 4.2.3 was taken as a performance baseline. All programs were compiled with the options “-O3 -msse2 -march=native -mfpmath=387 -funroll-loops -fomit-frame-pointer”. Please note that using profile-guided optimizations might improve performance further.
  • LLVM has just released version 2.3 and is a very interesting project. It can compile C and C++ code to bytecode using a GCC frontend. It comes with a JIT that runs the bytecode on the target platform (additionally it even offers an ahead-of-time compiler). It is used by various interesting projects and companies, most notably Apple for an OpenGL JIT, and some people seem to be working on using LLVM as a JIT compiler for OpenJDK. I used the lli JIT compiler command with the options -mcpu=core2 -mattr=sse42.
  • IBM has released its JDK 6 with its usual incredibly bad marketing (of course there’s no Windows version yet). I’ve used JDK 6 SR1, but I couldn’t find a readable list of the changes it includes. The older IBM JDK 5 is also included to see if it was worth the wait.
  • Excelsior has released a new version of its ahead-of-time compiler JET. Both version 6.0 and 6.4 have been benchmarked. JET is particularly interesting because it combines fast startup time with high peak performance and is therefore just what you’d expect from a good desktop application compiler.
  • Apache Harmony is also included in the benchmark. It aims to become a full Apache-licensed Java runtime and is used for the Google Android platform. Recently the Apache Harmony Milestone 6 was released. I’ve taken a look at this version’s performance in this blog.
  • BEA JRockit is no longer included due to great uncertainty about its current and future availability. Shortly after Oracle bought BEA, all download links were removed from the web page. A few days ago it was announced that JRockit will no longer be available as a standalone download.
  • Sun’s JDK 6 Update 2 and Update 6 were put to the test with the HotSpot server compiler (i.e. the -server option).
  • All benchmarks were measured on my Dell Inspiron 9400 notebook with 2GB of RAM and an Intel Core 2 running at 2GHz under Ubuntu 8.04 (x86).


Results

All images below show the duration in milliseconds (i.e. smaller bars are better).

In mandelbrot GCC stays fastest by a good margin. The difference between Sun JDK 6U6 and 6U2 is negligible, and the same is true for JET 6.4 and 6.0. Harmony shows much improved performance since the last benchmark and is now close to the competitors.

mandelbrot benchmark
The spectralnorm benchmark is kind of interesting in that LLVM and most JVMs can beat GCC in this particular benchmark. Most notably LLVM comes out fastest. Harmony has improved a bit (but not enough) since the last comparison.
spectralnorm benchmark

The results for the fannkuch benchmark are much more interesting than the ones before. There are quite large improvements for both Sun and JET. JDK 6U2 performed really poorly in this benchmark and 6U6 is a small step in the right direction, but it’s still worse than any other JVM benchmarked. JET 6.4 on the other hand managed to improve so much that it runs even a little bit faster than GCC. This is really remarkable since JET runs with bounds checks enabled and fannkuch has quite a few indirect array access operations. And it’s the second benchmark where LLVM shines. (Please note that Harmony has been benchmarked with -Xem:opt. The -Xem:server option gives better results: the benchmark then runs in 7173 msecs.)
fannkuch benchmark

Nbody shows again that it’s possible for JVMs to reach the C++ performance level. There’s a very nice improvement for Sun’s JDK from Update 2 to Update 6 and a small improvement for JET 6.4, which runs the benchmark the fastest. IBM’s JDKs perform quite poorly, just like Harmony. LLVM also appears to have potential for further optimizations.
nbody benchmark
The himeno benchmark is new and really strange. It has a frightening cascade of inner loops with a great many array operations. I used the original C version and ported it to Java, manually inlining a macro used for the array access. The C version was slightly modified to back-port an optimization regarding that macro (thanks Dmitry!).
Just like fannkuch, this benchmark shows that there are some cases that Sun’s HotSpot can’t handle well. GCC runs in less than 1/4 of the time it takes for JDK 6U6 to finish! IBM and Harmony are better but still far from good. Only JET comes to the rescue for the Java world: it beats LLVM and gets very close to GCC. The improvement from JET 6.0 to 6.4 is once again astonishing. (Please note that Harmony has been benchmarked with -Xem:opt. The -Xem:server option gives much better results: the benchmark then runs in 24770 msecs, which means that Harmony beats Sun’s JDK and IBM’s VM!)
himeno benchmark (size m)
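
To give an idea of what manually inlining an array access macro means when porting such code to Java, here is a small hedged sketch. It is not taken from the himeno source or from my port: the helper `index`, the row-major layout and the dimension names are assumptions made purely for illustration. The helper plays the role of the C macro, and the second loop nest shows the same arithmetic written out by hand.

```java
public class FlatArrayIndexing {

    /** Stand-in for a C access macro: maps (n, r, c, d) to an offset in a flat array. */
    static int index(int n, int r, int c, int d, int rows, int cols, int deps) {
        return ((n * rows + r) * cols + c) * deps + d;
    }

    /** Sums all elements, computing the offset through the helper on every access. */
    static float sumWithHelper(float[] m, int nums, int rows, int cols, int deps) {
        float sum = 0f;
        for (int n = 0; n < nums; n++)
            for (int r = 0; r < rows; r++)
                for (int c = 0; c < cols; c++)
                    for (int d = 0; d < deps; d++)
                        sum += m[index(n, r, c, d, rows, cols, deps)];
        return sum;
    }

    /** Same traversal with the index arithmetic written out by hand in the loop nest. */
    static float sumInlined(float[] m, int nums, int rows, int cols, int deps) {
        float sum = 0f;
        for (int n = 0; n < nums; n++)
            for (int r = 0; r < rows; r++)
                for (int c = 0; c < cols; c++) {
                    // The part of the offset that does not depend on d is computed once.
                    int base = ((n * rows + r) * cols + c) * deps;
                    for (int d = 0; d < deps; d++)
                        sum += m[base + d];
                }
        return sum;
    }

    public static void main(String[] args) {
        int nums = 2, rows = 3, cols = 4, deps = 5;   // arbitrary small dimensions
        float[] m = new float[nums * rows * cols * deps];
        for (int i = 0; i < m.length; i++) {
            m[i] = i;
        }
        System.out.println("helper:  " + sumWithHelper(m, nums, rows, cols, deps));
        System.out.println("inlined: " + sumInlined(m, nums, rows, cols, deps));
    }
}
```

Both methods visit the elements in the same order, so they produce the same sum. A JIT may well inline the helper on its own, so the hand-written variant mainly removes the dependency on that optimization.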
The last diagram is an attempt to summarize the results. For each benchmark I computed the ratio of each compiler/JVM to the fastest competitor and then took the geometric mean of those ratios. The results match my impression of their performance quite nicely. (The geometric mean for Harmony was computed with -Xem:opt; -Xem:server yields 2.13.)

geometric mean
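
The summary score described above can be computed along these lines. This is a sketch under assumptions, not the spreadsheet or script behind the diagram: the class and method names are invented, and the timings in `main` are made-up placeholder numbers, not measurements from this post.

```java
public class GeometricMeanSummary {

    /** Geometric mean of positive values, computed via logarithms to avoid overflow. */
    static double geometricMean(double[] values) {
        double logSum = 0.0;
        for (double v : values) {
            logSum += Math.log(v);
        }
        return Math.exp(logSum / values.length);
    }

    /**
     * times[b][c] is the duration of competitor c in benchmark b.
     * Returns, per competitor, the geometric mean of its ratios to the
     * fastest competitor of each benchmark (1.0 means fastest everywhere).
     */
    static double[] summarize(double[][] times) {
        int competitors = times[0].length;
        double[][] ratios = new double[competitors][times.length];
        for (int b = 0; b < times.length; b++) {
            double fastest = Double.MAX_VALUE;
            for (double t : times[b]) {
                fastest = Math.min(fastest, t);
            }
            if (fastest <= 0.0) {
                throw new IllegalArgumentException("durations must be positive");
            }
            for (int c = 0; c < competitors; c++) {
                ratios[c][b] = times[b][c] / fastest;
            }
        }
        double[] scores = new double[competitors];
        for (int c = 0; c < competitors; c++) {
            scores[c] = geometricMean(ratios[c]);
        }
        return scores;
    }

    public static void main(String[] args) {
        // Placeholder durations in ms for two hypothetical competitors, A and B.
        String[] names = {"A", "B"};
        double[][] times = {
            {1000.0, 1500.0},   // hypothetical benchmark 1
            {2400.0, 2000.0},   // hypothetical benchmark 2
        };
        double[] scores = summarize(times);
        for (int c = 0; c < names.length; c++) {
            System.out.printf("%s: %.3f%n", names[c], scores[c]);
        }
    }
}
```

The geometric mean is used instead of the arithmetic mean because the inputs are ratios: a competitor that is twice as slow in one benchmark and twice as fast in another relative to some other entry ends up on equal footing with it.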
Conclusion

  • Unsurprisingly GCC is fastest.
  • Surprisingly it is followed very, very closely by JET 6.4, which delivers the best java performance.
  • LLVM does a good job, but contrary to some claims it does not seem to beat GCC’s performance yet (even without PGO). It’ll be interesting to see how much performance is lost to bounds and type checking if it’s used as a Java JIT.
  • Sun’s JDK has some weak points that can be seen in the fannkuch and himeno benchmarks. Without these two benchmarks the HotSpot server compiler would be almost competitive with GCC. Nevertheless JDK 6U6 has gained quite a bit of performance in comparison to JDK 6U2.
  • Harmony still has a long way to go, but at least some progress can be seen.
  • What’s causing me some headaches is that competition between Sun, IBM and BEA (Oracle) seems to be in danger. I’m not too sorry for IBM, since I enjoyed using their VM neither on an appserver (guess which one…) nor for these benchmarks, but JRockit has always been a very nice alternative and its performance was superior to Sun’s most of the time. So I’m very anxious about Oracle’s direction regarding JRockit. After that start they can’t do much worse.

Resources
