Best optimization questions in July 2011

Java 7 sorting "optimisation"

22 votes

In Java6 both quicksort and mergesort were used in Arrays#sort, for primitive and object arrays respectively. In Java7 these have both changed, to DualPivotQuicksort and Timsort.

In the source for the new quicksort, the following comment appears in a couple of places (eg line 354):

 /*
  * Here and below we use "a[i] = b; i++;" instead
  * of "a[i++] = b;" due to performance issue.
  */

How is this a performance issue? Won't the compiler will reduce these to the same thing?

More broadly, what is a good strategy for investigating this myself? I can run benchmarks, but I'd be more interested in analysing any differences in the compiled code. However, I don't know what tools to use etc.

This is only an answer to the general question.

You can look at the bytecode and try to understand the differences. I.e. you could write yourself a simple example using both a[i] = b; i++; and a[i++] = b; and see whats the difference.

The simplest way to show the bytecode is the javap program (should be included in you JDK). Compile the code with javac SomeFile.java and run javap on the code: javap -c SomeFile (the -c switch tells javap to output the bytecode for each method in the file).

If you are using eclipse you could also try this one.

What's the fastest way to build a string in Ruby?

5 votes

In Ternary operator, a person wanting to join ["foo", "bar", "baz"] with commas and and "and" cited The Ruby Cookbook as saying

If efficiency is important to you, don't build a new string when you can append items onto an existing string. [And so on]... Use str << var1 << ' ' << var2 instead.

But the book was written in 2006.

Is using appending (ie <<) still the fastest way to build a large string given an array of smaller strings, in all major implementations of Ruby?

Use Array#join when you can, and String#<< when you can't.

The problem with using String#+ is that it must create an intermediary (unwanted) string object, while String#<< mutates the original string. Here are the time results (in seconds) of joining 1,000 strings with ", " 1,000 times, via Array#join, String#+, and String#<<:

Ruby 1.9.2p180      user     system      total        real
Array#join      0.320000   0.000000   0.320000 (  0.330224)
String#+ 1      7.730000   0.200000   7.930000 (  8.373900)
String#+ 2      4.670000   0.600000   5.270000 (  5.546633)
String#<< 1     1.260000   0.010000   1.270000 (  1.315991)
String#<< 2     1.600000   0.020000   1.620000 (  1.793415)

JRuby 1.6.1         user     system      total        real
Array#join      0.185000   0.000000   0.185000 (  0.185000)
String#+ 1      9.118000   0.000000   9.118000 (  9.118000)
String#+ 2      4.544000   0.000000   4.544000 (  4.544000)
String#<< 1     0.865000   0.000000   0.865000 (  0.866000)
String#<< 2     0.852000   0.000000   0.852000 (  0.852000)

Ruby 1.8.7p334      user     system      total        real
Array#join      0.290000   0.010000   0.300000 (  0.305367)
String#+ 1      7.620000   0.060000   7.680000 (  7.682265)
String#+ 2      4.820000   0.130000   4.950000 (  4.957258)
String#<< 1     1.290000   0.010000   1.300000 (  1.304764)
String#<< 2     1.350000   0.010000   1.360000 (  1.347226)

Rubinius (head)     user     system      total        real
Array#join      0.864054   0.008001   0.872055 (  0.870757)
String#+ 1      9.636602   0.076005   9.712607 (  9.714820)
String#+ 2      6.456403   0.064004   6.520407 (  6.521633)
String#<< 1     2.196138   0.016001   2.212139 (  2.212564)
String#<< 2     2.176136   0.012001   2.188137 (  2.186298)

Here's the benchmarking code:

WORDS = (1..1000).map{ rand(10000).to_s }
N = 1000

require 'benchmark'
Benchmark.bmbm do |x|
  x.report("Array#join"){
    N.times{ s = WORDS.join(', ') }
  }
  x.report("String#+ 1"){
    N.times{
      s = WORDS.first
      WORDS[1..-1].each{ |w| s += ", "; s += w }
    }
  }
  x.report("String#+ 2"){
    N.times{
      s = WORDS.first
      WORDS[1..-1].each{ |w| s += ", " + w }
    }
  }
  x.report("String#<< 1"){
    N.times{
      s = WORDS.first.dup
      WORDS[1..-1].each{ |w| s << ", "; s << w }
    }
  }
  x.report("String#<< 2"){
    N.times{
      s = WORDS.first.dup
      WORDS[1..-1].each{ |w| s << ", " << w }
    }
  }
end

Results obtained on Ubuntu under RVM. Results from Ruby 1.9.2p180 from RubyInstaller on Windows are similar to the 1.9.2 shown above.