Best ruby questions in July 2012

Interpreting a benchmark in C, Clojure, Python, Ruby, Scala and others

57 votes

Disclaimer

I know that artificial benchmarks are evil. They can show results only for a very specific, narrow situation. I don't assume that one language is better than another because of some stupid benchmark. However, I wonder why the results are so different. Please see my questions at the bottom.

Math benchmark description

The benchmark is a simple math calculation to find pairs of prime numbers that differ by 6 (so-called sexy primes). E.g. sexy primes below 100 would be: (5 11) (7 13) (11 17) (13 19) (17 23) (23 29) (31 37) (37 43) (41 47) (47 53) (53 59) (61 67) (67 73) (73 79) (83 89) (97 103)

Results table

In the table: calculation time in seconds.

Running: everything except Factor was run in VirtualBox (Debian unstable amd64 guest, Windows 7 x64 host).

CPU: AMD A4-3305M

  Sexy primes up to:        10k      20k      30k      100k               

  Bash                    58.00   200.00     [*1]      [*1]

  C                        0.20     0.65     1.42     15.00

  Clojure1.4               4.12     8.32    16.00    137.93

  Clojure1.4 (optimized)   0.95     1.82     2.30     16.00

  Factor                    n/a      n/a    15.00    180.00

  Python2.7                1.49     5.20    11.00       119     

  Ruby1.8                  5.10    18.32    40.48    377.00

  Ruby1.9.3                1.36     5.73    10.48    106.00

  Scala2.9.2               0.93     1.41     2.73     20.84

  Scala2.9.2 (optimized)   0.32     0.79     1.46     12.01

[*1] - I'm afraid to imagine how much time it would take.

Code listings

C:

#include <stdio.h>

int isprime(int x) {
  int i;
  for (i = 2; i < x; ++i)
    if (x%i == 0) return 0;
  return 1;
}

void findprimes(int m) {
  int i;
  for ( i = 11; i < m; ++i)
    if (isprime(i) && isprime(i-6))
      printf("%d %d\n", i-6, i);
}

int main(void) {
    findprimes(10*1000);
}

Ruby:

def is_prime?(n)
  (2...n).all?{|m| n%m != 0 }
end

def sexy_primes(x)
  (9..x).map do |i|
    [i-6, i]
  end.select do |j|
    j.all?{|j| is_prime? j}
  end
end

a = Time.now
p sexy_primes(10*1000)
b = Time.now
puts "#{(b-a)*1000} mils"

Scala:

def isPrime(n: Int) =
  (2 until n) forall { n % _ != 0 }

def sexyPrimes(n: Int) = 
  (11 to n) map { i => List(i-6, i) } filter { _ forall(isPrime(_)) }

val a = System.currentTimeMillis()
println(sexyPrimes(100*1000))
val b = System.currentTimeMillis()
println((b-a).toString + " mils")

Scala optimized isPrime (the same idea as in the Clojure optimization):

import scala.annotation.tailrec

@tailrec // Not required, but will warn if optimization doesn't work
def isPrime(n: Int, i: Int = 2): Boolean = 
  if (i == n) true 
  else if (n % i != 0) isPrime(n, i + 1)
  else false

Clojure:

(defn is-prime? [n]
  (every? #(> (mod n %) 0)
    (range 2 n)))

(defn sexy-primes [m]
  (for [x (range 11 (inc m))
        :let [z (list (- x 6) x)]
        :when (every? #(is-prime? %) z)]
      z))

(let [a (System/currentTimeMillis)]
  (println (sexy-primes (* 10 1000)))
  (let [b (System/currentTimeMillis)]
    (println (- b a) "mils")))

Clojure optimized is-prime?:

(defn ^:static is-prime? [^long n]
  (loop [i (long 2)] 
    (if (= (rem n i) 0)
      false
      (if (>= (inc i) n) true (recur (inc i))))))

Python:

import time as time_

def is_prime(n):
  return all((n%j > 0) for j in xrange(2, n))

def primes_below(x):
  return [[j-6, j] for j in xrange(9, x+1) if is_prime(j) and is_prime(j-6)]

a = int(round(time_.time() * 1000))
print(primes_below(10*1000))
b = int(round(time_.time() * 1000))
print(str((b-a)) + " mils")

Factor:

MEMO:: prime? ( n -- ? )
n 1 - 2 [a,b] [ n swap mod 0 > ] all? ;

MEMO: sexyprimes ( n n -- r r )
[a,b] [ prime? ] filter [ 6 + ] map [ prime? ] filter dup [ 6 - ] map ;

5 10 1000 * sexyprimes . .

Bash(zsh):

#!/usr/bin/zsh
function prime {
  for (( i = 2; i < $1; i++ )); do
    if [[ $[$1%i] == 0 ]]; then
      echo 1
      exit
    fi
  done
  echo 0
}

function sexy-primes {
  for (( i = 9; i <= $1; i++ )); do
    j=$[i-6]
    if [[ $(prime $i) == 0 && $(prime $j) == 0 ]]; then
      echo $j $i
    fi
  done
}

sexy-primes 10000

Questions

  1. Why is Scala so fast? Is it because of static typing? Or is it just using the JVM very efficiently?
  2. Why is there such a huge difference between Ruby and Python? I thought these two were not totally different. Maybe my code is wrong. Please enlighten me! Thanks. UPD: Yes, that was an error in my code. Python and Ruby 1.9 are pretty much equal.
  3. There is a really impressive jump in performance between Ruby versions.
  4. Can I optimize the Clojure code by adding type declarations? Will it help?

Rough answers:

  1. Scala's static typing is helping it quite a bit here - this means that it uses the JVM pretty efficiently without too much extra effort.
  2. I'm not exactly sure about the Ruby/Python difference, but I suspect that (2...n).all? in the function is_prime? is likely to be quite well optimised in Ruby (EDIT: sounds like this is indeed the case, see Julian's answer for more detail...)
  3. Ruby 1.9.3 is just much better optimised
  4. Clojure code can certainly be accelerated a lot! While Clojure is dynamic by default, you can use type hints, primitive maths etc. to get close to Scala / pure Java speed in many cases when you need to.

The most important optimisation in the Clojure code would be to use typed primitive maths within is-prime?, something like:

(set! *unchecked-math* true) ;; at top of file to avoid using BigIntegers

(defn ^:static is-prime? [^long n]
  (loop [i (long 2)] 
    (if (zero? (mod n i))
      false
      (if (>= (inc i) n) true (recur (inc i))))))

With this improvement, I get Clojure completing 10k in 0.635 secs (i.e. the second fastest on your list, beating Scala)

P.S. note that you have printing code inside your benchmark in some cases - not a good idea as it will distort the results, especially if using a function like print for the first time causes initialisation of IO subsystems or something like that!
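A minimal Ruby sketch of that advice, assuming the standard-library Benchmark module is acceptable for timing: only the computation sits inside the measured block, and printing happens afterwards. The is_prime? and sexy_primes definitions follow the Ruby listing in the question; the names pairs and elapsed, and the smaller limit of 1000, are illustrative choices.

```ruby
require 'benchmark'

def is_prime?(n)
  (2...n).all? { |m| n % m != 0 }
end

def sexy_primes(x)
  (9..x).map { |i| [i - 6, i] }.select { |pair| pair.all? { |k| is_prime?(k) } }
end

pairs = nil
# Time only the calculation; no IO happens inside the measured block.
elapsed = Benchmark.realtime { pairs = sexy_primes(1000) }

# Printing happens after the clock has stopped.
p pairs.first(3)
puts "#{(elapsed * 1000).round} ms"
```

Keeping the result in a variable also makes it easy to repeat the measured block several times and compare runs.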

Does ruby have something like python's list comprehensions(example)?

8 votes

Python has a nice feature:

print([j**2 for j in [2, 3, 4, 5]]) # => [4, 9, 16, 25]

In ruby it's even simpler:

puts [2, 3, 4, 5].map{|j| j**2}

But when it comes to nested loops, Python looks more convenient...

In python we can do this:

digits = [1, 2, 3]
chars = ['a', 'b', 'c']    
print([str(d)+ch for d in digits for ch in chars if d >= 2 if ch == 'a'])    
# => ['2a', '3a']

Equivalent in ruby:

digits = [1, 2, 3]
chars = ['a', 'b', 'c']
list = []
digits.each do |d|
    chars.each do |ch|
        list.push d.to_s << ch if d >= 2 && ch == 'a'
    end
end
puts list

Q: Does Ruby have something similar?

The common way in Ruby is to properly combine Enumerable and Array methods to achieve the same:

digits.product(chars).select{ |d, ch| d >= 2 && ch == 'a' }.map(&:join)

This is only 4 or so characters longer than the list comprehension and just as expressive (IMHO of course, but since list comprehensions are just a special application of the list monad, one could argue that it's probably possible to adequately rebuild that using Ruby's collection methods), while not needing any special syntax.
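For comparison, here is a sketch of two other ways to write the same nested comprehension with standard Enumerable methods. The variable names with_flat_map and with_accumulator are made up for illustration; which form reads best is a matter of taste.

```ruby
digits = [1, 2, 3]
chars  = ['a', 'b', 'c']

# flat_map + select + map: filter at each level, similar to nested "for ... if" clauses.
with_flat_map = digits.select { |d| d >= 2 }.flat_map do |d|
  chars.select { |ch| ch == 'a' }.map { |ch| "#{d}#{ch}" }
end

# each_with_object: closest to the explicit nested loops in the question.
with_accumulator = digits.product(chars).each_with_object([]) do |(d, ch), list|
  list << "#{d}#{ch}" if d >= 2 && ch == 'a'
end

p with_flat_map     # ["2a", "3a"]
p with_accumulator  # ["2a", "3a"]
```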

How can I install Ruby 1.9.3 in Mac OS X Mountain Lion without Xcode?

7 votes

I would like to know about alternative ways to set up a development machine for Ruby 1.9.3 on OS X 10.8 Mountain Lion that do not require Xcode.

Mountain Lion is at Golden Master as I'm writing this question, so it can be considered the final version. That cannot be said about Xcode, which is still a preview release.

RVM recommends installing osx-gcc-installer instead of Xcode, but I would prefer not to mess up my system.

What is the cleanest way to install Ruby 1.9.3 on Mountain Lion without Xcode?

osx-gcc-installer turns out to be a very good replacement for Xcode when installing Ruby 1.9.3.

These are the steps I have followed:

  1. Download & install the latest version of osx-gcc-installer here (GCC-10.7-v2 is fine): https://github.com/kennethreitz/osx-gcc-installer/downloads
  2. Install RVM as usual and select 1.9.3-head as the default ruby installation: https://rvm.io/rvm/install/
  3. Install Homebrew: https://github.com/mxcl/homebrew/wiki/installation
  4. Install libksba to resolve some dependencies with Ruby 1.9.3: brew install libksba

That's it! You should now have Ruby 1.9.3 installed on Mountain Lion working perfectly.

If you need some other packages, install them now through Homebrew, such as Imagemagick for example: brew install imagemagick

It's possible that you need XQuartz for Homebrew to work properly, as Apple no longer ships X11 as of Mountain Lion. You can download it here: http://xquartz.macosforge.org/trac/wiki

EDIT:

Now (since 29th July) Command line tools for Xcode 4.4 are available.

So, the new steps are these:

  1. Download & install Command line tools for Xcode 4.4 (you don't need to download Xcode): https://developer.apple.com/downloads/index.action
  2. Install Homebrew: https://github.com/mxcl/homebrew/wiki/installation
  3. Install automake: brew install automake
  4. Install RVM as usual and select 1.9.3-head as the default ruby installation: https://rvm.io/rvm/install/

Optional step: You may need XQuartz for some components, for example for Imagemagick, so download & install XQuartz: http://xquartz.macosforge.org/trac/wiki

How to print a snowman using Ruby

7 votes

I want a placeholder for a single character string that I haven't implemented yet. How do I print a snowman in Ruby 1.9?

Currently, I can do

# coding: utf-8
puts "☃"

or

puts "\u2603"

but is it possible to use the "Index entries" field (mentioned here), with values such as "snowy weather", "SNOWMAN" or "weather, snowy", to get the character to print?

I am not using Rails.

You may download the Name Index from unicode.org and parse the character names into a Hash (or better, a DB or similar).

Then you can get it with normal data access functions.

Example:

# coding: utf-8
index = {}
File.readlines('Index.txt').each{|line|
  line =~ /(.*)\t(.*)$/
  index[$1] = $2.to_i(16).chr("UTF-8")
}

snowman = index['SNOWMAN']
p snowman #hope it works. My shell does not show the snowman
p "\u2603" == snowman #true

Edit:

There is a gem unicode_utils. With this gem you can use:

require "unicode_utils/grep"
p UnicodeUtils.grep(/^snowman$/) #=> [#<U+2603 "\u2603" SNOWMAN utf8:e2,98,83>]
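To try the Hash approach without downloading anything, here is a small self-contained sketch. It assumes the index uses a "NAME<TAB>HEX CODE POINT" layout; SAMPLE_INDEX is a made-up three-line sample rather than the real file, and build_index is a hypothetical helper name.

```ruby
# Sample lines in the assumed "NAME<TAB>HEX CODE POINT" layout; not the real file.
SAMPLE_INDEX = "SNOWMAN\t2603\nUMBRELLA\t2602\nweather, snowy\t2603\n"

def build_index(text)
  text.each_line.each_with_object({}) do |line, index|
    name, hex = line.chomp.split("\t")
    # Skip lines that do not look like "name<TAB>hex digits".
    next if name.nil? || hex.nil? || hex !~ /\A\h+\z/
    index[name] = [hex.to_i(16)].pack("U")  # code point -> UTF-8 string
  end
end

index = build_index(SAMPLE_INDEX)
puts index["SNOWMAN"]
p index["SNOWMAN"] == "\u2603"
```

Skipping malformed lines avoids the nil errors that an unchecked regex match would cause on header or comment lines.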

How to create object and its methods dynamically in Ruby as in Javascript?

6 votes

I recently found that dynamically creating objects and methods in Ruby is quite a lot of work; this might be because of my background in Javascript.

In Javascript you can dynamically create an object and its methods as follows:

function somewhere_inside_my_code() {
  foo = {};
  foo.bar = function() { /** do something **/ };
};

What is the equivalent of the above statements in Ruby (as simple as in Javascript)?

You can achieve this with Singleton Methods. Note that you can do this with all Objects, for example:

str = "I like cookies!"

def str.piratize!
  self + " Arrrr!"
end

puts str.piratize!

outputs:

I like cookies! Arrrr!

But these methods are really only defined on this single object (hence the name), so this code (executed after the above code):

str2 = "Cookies are great!"
puts str2.piratize!

really just throws an exception:

foo.rb:10:in `<main>': undefined method `piratize!' for "Cookies are great!":String (NoMethodError)
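A sketch that stays a little closer to the Javascript snippet: start from a blank object and attach a method from a block with define_singleton_method (available since Ruby 1.9). The names foo and bar mirror the question; the greeting is only an illustration.

```ruby
foo = Object.new

# define_singleton_method takes a block, much like assigning a function in Javascript.
foo.define_singleton_method(:bar) { |name| "Hello, #{name}!" }

puts foo.bar("world")            # Hello, world!
p foo.singleton_methods          # [:bar]
p Object.new.respond_to?(:bar)   # false
```

Because the method body is a block, it can close over local variables, which the def str.piratize! form cannot do.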

Should all ruby files have a module structure that matches the folder structure?

6 votes

Is it a Ruby convention for all files to be in a module that matches the folder structure (similar to Java packages)?

For example, if I have a file structure that looks like

lib/people/utils

would the files in here have the module structure such as:

module People
  module Utils
    # some functionality for People::Utils
  end
end      

The reason I ask is because I've been reading through some rails code, and there seem to be several files that are in a file structure like this, but don't have any module declarations.

I'm guessing this would be so you could use the utility function without having to include People::Utils.

Is there a convention in ruby as to when modules should be used and when they shouldn't?

It's generally a good idea to put classes and files in a structure like that, because it will make it easier for people to map the name of the class to its definition.

But it can make sense not to do this (ultimately you structure your code however you like). I've occasionally done so when there were lots of small classes all dealing with the same thing; I put them together.

And it can make sense to have a file which does not define a module or class, e.g. a configuration file, or a binary, or a bootstrapping file (file which loads up all the other ones).
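As a rough illustration of the mapping, the sketch below shows what a hypothetical lib/people/utils.rb might contain, plus a made-up helper (constant_name_for) that derives the expected constant name from a path. Neither is taken from Rails; they only show the naming convention.

```ruby
# Hypothetical contents of lib/people/utils.rb
module People
  module Utils
    def self.full_name(first, last)
      "#{first} #{last}"
    end
  end
end

# Hypothetical helper: turns a path under lib/ into the constant name
# a reader would expect to find defined in that file.
def constant_name_for(path)
  path.sub(/\.rb\z/, '')
      .split('/')
      .map { |part| part.split('_').map(&:capitalize).join }
      .join('::')
end

puts constant_name_for('people/utils.rb')
puts People::Utils.full_name('Ada', 'Lovelace')
```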

how to do number to string suffix

5 votes

As you know, in Ruby you can do

"%03d" % 5
#=>  "005"

"%03d" % 55
#=> "055"

"%03d" % 555
#=> "555"

So basically the number gets a "0" prefix to fill 3 string places.

I'm just wondering whether it's possible to add a suffix to a number string in a similarly nice way (please, no if statements):

something 5
#=> 500

something 55
#=> 550

something 555
# => 555

How about the ljust method?

"5".ljust(3, "0")

plus some to_s and to_i method calls if you want to do that with integers.

You could avoid the string conversion with a bit more math: use log10 to find the number of digits in the integer and then do i *= 10**x, where x is how many more 0's you need.

Like this:

def something(int, power=3)
  int * 10**([power - Math.log10(int).to_i - 1, 0].max)
end
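The ljust variant mentioned at the start of this answer can be wrapped for integers in the same way. This is a minimal sketch; zero_suffix is a hypothetical name, and non-negative integers are assumed.

```ruby
# Hypothetical helper name; pads on the right with zeros up to `width` digits.
def zero_suffix(int, width = 3)
  int.to_s.ljust(width, "0").to_i
end

p zero_suffix(5)     # 500
p zero_suffix(55)    # 550
p zero_suffix(555)   # 555
p zero_suffix(5555)  # 5555
```

Unlike the log10 version, the string-based one also accepts 0, since it never takes a logarithm.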

ruby module_function vs including module

5 votes

In ruby, I understand that module functions can be made available without mixing in the module by using module_function as shown here. I can see how this is useful so you can use the function without mixing in the module.

module MyModule
  def do_something
    puts "hello world"
  end
  module_function :do_something
end

My question, though, is why you might want to have the function defined in both of these ways.

Why not just have

def MyModule.do_something

OR

def do_something

In what kind of case would it be useful to have the function available both to be mixed in and to be used as a static method?

Think of Enumerable.

This is the perfect example of when you need to include a module in a class. If your class defines #each, you get a lot of goodness just by including a module (#map, #select, etc.). This is the only case when I use modules as mixins - when the module provides functionality in terms of a few methods defined in the class you include the module in. I can argue that this should be the only case in general.

As for defining "static" methods, a better approach would be:

module MyModule
  def self.do_something
  end
end

You don't really need to call #module_function. I think it is just weird legacy stuff.

You can even do this:

module MyModule
  extend self

  def do_something
  end
end

...but it won't work well if you also want to include the module somewhere. I suggest avoiding it until you learn the subtleties of Ruby metaprogramming.

Finally, if you just do:

def do_something
end

...it will not end up as a global function, but as a private method on Object (there are no functions in Ruby, just methods). There are a few downsides. First, you don't have namespacing - if you define another function with the same name, the one that gets evaluated later wins. Second, if you have functionality implemented in terms of #method_missing, having a private method in Object will shadow it. And finally, monkey patching Object is just evil business :)

EDIT:

module_function can be used in a way similar to private:

module Something
  def foo
    puts 'foo'
  end

  module_function

  def bar
    puts 'bar'
  end
end

That way, you can call Something.bar, but not Something.foo. If you define any other methods after this call to module_function, they would also be available without mixing in.

I don't like it for two reasons, though. First, modules that are both mixed in and have "static" methods sound a bit dodgy. There might be valid cases, but it won't be that often. As I said, I prefer either to use a module as a namespace or mix it in, but not both.

Second, in this example, bar would also be available to classes/modules that mix in Something. I'm not sure when this is desirable, since either the method uses self and it has to be mixed in, or doesn't and then it does not need to be mixed in.

I think module_function is used without a method name quite a bit more often than with one. The same goes for private and protected.
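The behaviour described in this edit can be checked with a short sketch. Something matches the example above, except that the methods return strings instead of printing; Host and call_bar are made-up names for a class that mixes the module in.

```ruby
module Something
  def foo
    'foo'
  end

  module_function

  def bar
    'bar'
  end
end

class Host
  include Something

  def call_bar
    bar  # the mixed-in copy is private, so it is called without a receiver
  end
end

p Something.bar                        # "bar"
p Something.respond_to?(:foo)          # false
p Host.new.foo                         # "foo"
p Host.new.call_bar                    # "bar"
p Host.private_method_defined?(:bar)   # true
```

So bar does reach classes that include Something, but only as a private instance method.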

Is there any expression in Python that is similar to Ruby's ||=

5 votes

I came across an interesting expression in Ruby:

a ||= "new"

It means that if a is not defined (or is nil or false), the "new" value will be assigned to a; otherwise, a stays as it is. It is useful when doing a DB query: if the value is already set, I don't want to fire another DB query.
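For reference, a short sketch of how ||= behaves on the Ruby side. The variable names are illustrative, and expensive_lookup is a stand-in lambda for the DB query mentioned above.

```ruby
a = nil
a ||= "new"      # nil is falsy, so the assignment happens
b = false
b ||= "new"      # false is falsy too
c = "existing"
c ||= "new"      # already truthy, left untouched
d ||= "new"      # never assigned before: treated like nil

p [a, b, c, d]   # ["new", "new", "existing", "new"]

# Typical memoization use; the lambda stands in for a DB query.
calls = 0
expensive_lookup = lambda { calls += 1; "row" }
cache = nil
2.times { cache ||= expensive_lookup.call }
p calls          # 1
```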

So I tried a similar approach in Python:

a = a if a is not None else "new"

It failed. I think that is because you cannot do "a = a" in Python if a is not defined.

So the solutions I can come up with are checking locals() and globals(), or using a try...except statement:

myVar = myVar if 'myVar' in locals() and 'myVar' in globals() else "new"

or

try:
    myVar
except NameError:
    myVar = None

myVar = myVar if myVar else "new"

As we can see, the solutions are not that elegant. So I'd like to ask, is there any better way of doing this?

How about?

try:
    a = a
except NameError:
    a = "new"

It's not very short but does clearly (at least to me) explain the intent of the code.

Why does Date exist in Ruby before it is required?

5 votes

In Ruby, I'd expect that a class which has not been required would raise an "uninitialized constant" error. This is the case with CSV, for instance.

However, Date behaves strangely: it is available, but apparently does not work, until it is required.

~: irb                                                                                             
>> Date.new(2012,7,24)
ArgumentError: wrong number of arguments(3 for 0)
>> require 'date'
=> true
>> Date.new(2012,7,24)
=> #<Date: 2012-07-24 ((2456133j,0s,0n),+0s,2299161j)>

What explains this behavior?

Similar to this question. irb loads a Date class by default, but Ruby itself doesn't (try e.g. puts Date.new in a file).

It seems that the Date class that irb loads is different to the distribution class, as you have pointed out. Furthermore this only seems to be the case in Ruby 1.9 -- if I try it in 1.8, I get the same class methods before and after the require.
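Whatever irb happens to preload, the practical takeaway is to require the library explicitly before using it. A minimal sketch, using the same date as in the question:

```ruby
require 'date'   # explicit require, so the full standard-library Date is loaded

d = Date.new(2012, 7, 24)
puts d           # 2012-07-24
p d.jd           # 2456133
```

The Julian day number printed here matches the 2456133j shown in the irb output above.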

Mountain Lion - LibXML & Nokogiri

5 votes

I've just updated to OS X Mountain Lion and I'm getting the following when working with rails and terminal.

WARNING: Nokogiri was built against LibXML version 2.8.0, but has dynamically loaded 2.7.8

I've had a look at other answers to a similar question, but they don't seem to stop the warning message from appearing.

So I ended up using the following command:

bundle config build.nokogiri --with-xml2-include=/usr/local/Cellar/libxml2/2.7.8/include/libxml2 --with-xml2-lib=/usr/local/Cellar/libxml2/2.7.8/lib --with-xslt-dir=/usr/local/Cellar/libxslt/1.1.26/

And then doing:

gem uninstall nokogiri
gem install nokogiri

And then ran

bundle install

Does Bundler Gem take in consideration your Ruby environment?

4 votes

My question is a simple one: does Bundler consider your Ruby environment (e.g. 1.8.7 | 1.9.2) before deciding which gem version to take based on the Gemfile?

Let's say your Gemfile contains

gem 'thor'
gem 'json'
gem 'grit'

When you run bundle install, will it take versions of the gems that are compatible with your current Ruby environment, or just the latest gems?

It depends! Bundler relies on the configuration of the Gemspecs that each Gem provides.

Gemspecs offer the possibility to provide different or additional dependencies based on the runtime environment, i.e. you can change the dependencies for JRuby or provide different binaries for i386 architectures.

As far as I know, it's not possible to declare a gem as 1.9 or 1.8 compatible (which would have made sense to me). I think that's partly because 1.9 is 99% backward compatible.

You are always forced to have a look at the gems themselves. Because of this, there are sites like http://isitruby19.com/

As you might see, it's not an issue of Bundler, but RubyGems.
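One related RubyGems mechanism worth knowing is the required_ruby_version field of a gemspec, which lets a gem author state a constraint on the interpreter version. The sketch below only shows how such a constraint is evaluated against a version number; the gem name example_gem and its metadata are made up, and how a particular Bundler release uses the field when resolving is not covered here.

```ruby
require 'rubygems'

# Hypothetical gem; only the required_ruby_version line matters here.
spec = Gem::Specification.new do |s|
  s.name    = 'example_gem'
  s.version = '0.1.0'
  s.summary = 'Illustration only'
  s.authors = ['nobody']
  s.required_ruby_version = '>= 1.9.2'
end

requirement = spec.required_ruby_version
p requirement.satisfied_by?(Gem::Version.new('1.8.7'))  # false
p requirement.satisfied_by?(Gem::Version.new('1.9.3'))  # true
```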