Best ruby questions in March 2012

API Versioning for Rails Routes

51 votes

I'm trying to version my API the way Stripe does. In the examples below, assume the latest API version is 2.

/api/users returns a 301 to /api/v2/users

/api/v1/users returns a 200 of users index at version 1

/api/v3/users returns a 301 to /api/v2/users

/api/asdf/users returns a 301 to /api/v2/users

So basically, any request that doesn't specify a version, or specifies one that doesn't exist, redirects to the latest version, while a request for an existing version is served as is.

This is what I have so far:

scope 'api', :format => :json do
  scope 'v:api_version', :api_version => /[12]/ do
    resources :users
  end

  match '/*path', :to => redirect { |params| "/api/v2/#{params[:path]}" }
end

The original form of this answer is wildly different, and can be found here. Just proof that there's more than one way to skin a cat.

I've updated the answer since to use namespaces and to use 301 redirects -- rather than the default of 302. Thanks to pixeltrix and Bo Jeanes for the prompting on those things.


You might want to wear a really strong helmet because this is going to blow your mind.

The Rails 3 routing API is super wicked. To write the routes for your API, as per your requirements above, you need just this:

namespace :api do
  namespace :v1 do
    resources :users
  end

  namespace :v2 do
    resources :users
  end
  match 'v:api/*path', :to => redirect("/api/v2/%{path}")
  match '*path', :to => redirect("/api/v2/%{path}")
end

If your mind is still intact after this point, let me explain.

First, we call namespace which is super handy for when you want a bunch of routes scoped to a specific path and module that are similarly named. In this case, we want all routes inside the block for our namespace to be scoped to controllers within the Api module and all requests to paths inside this route will be prefixed with api. Requests such as /api/v2/users, ya know?

Inside the namespace, we define two more namespaces (woah!). This time we're defining the "v1" namespace, so all routes for the controllers here will be inside the V1 module inside the Api module: Api::V1. By defining resources :users inside this route, the controller will be located at Api::V1::UsersController. This is version 1, and you get there by making requests like /api/v1/users.

Version 2 is only a tiny bit different. Instead of the controller serving it being at Api::V1::UsersController, it's now at Api::V2::UsersController. You get there by making requests like /api/v2/users.

Next, a match is used. This will match all API routes that go to things like /api/v3/users.

This is the part I had to look up. The :to => option allows you to specify that a specific request should be redirected somewhere else -- I knew that much -- but I didn't know how to get it to redirect to somewhere else and pass in a piece of the original request along with it.

To do this, we call the redirect method and pass it a string with a special-interpolated %{path} parameter. When a request comes in that matches this final match, it will interpolate the path parameter into the location of %{path} inside the string and redirect the user to where they need to go.

Finally, we use another match to route all remaining paths prefixed with /api and redirect them to /api/v2/%{path}. This means requests like /api/users will go to /api/v2/users.

I couldn't figure out how to get /api/asdf/users to match, because how do you determine if that is supposed to be a request to /api/<resource>/<identifier> or /api/<version>/<resource>?
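One way out of that ambiguity is to tell the code which resource names exist, so an unknown first segment can be treated as a bad version. What follows is only a plain-Ruby sketch of the decision table from the question, not Rails routing code. KNOWN_VERSIONS, KNOWN_RESOURCES, LATEST_VERSION and resolve_api_path are names made up for illustration.

```ruby
# Hypothetical lookup tables; in a real app these would mirror the routes file.
KNOWN_VERSIONS  = %w[v1 v2].freeze
KNOWN_RESOURCES = %w[users].freeze
LATEST_VERSION  = 'v2'

# Returns [status, path]: 200 when the requested version exists, otherwise a
# 301 pointing at the same resource under the latest version.
def resolve_api_path(path)
  segments = path.sub(%r{\A/api/?}, '').split('/')
  return [200, path] if KNOWN_VERSIONS.include?(segments.first)

  # An unknown first segment that is not a resource name is assumed to be
  # a bad version ("v3", "asdf") and is dropped.
  segments.shift unless KNOWN_RESOURCES.include?(segments.first)
  [301, "/api/#{LATEST_VERSION}/#{segments.join('/')}"]
end

p resolve_api_path('/api/users')       # => [301, "/api/v2/users"]
p resolve_api_path('/api/v1/users')    # => [200, "/api/v1/users"]
p resolve_api_path('/api/asdf/users')  # => [301, "/api/v2/users"]
```

The cost of this approach is that the list of resource names has to be kept in sync with the routes by hand.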

Anyway, this was fun to research and I hope it helps you!

Why 0 && 1 is 1 while 1 && 0 is 0 in ruby?

23 votes

In Ruby, why are the following lines true?

0 && 1 == 1
1 && 0 == 0

Why are they different and aren't both 0?

The boolean AND operator && returns its first operand if that operand is falsy, and its second operand otherwise. Both 0 and 1 are truthy in Ruby; only nil and false are falsy.

nil && 15 # => nil
15 && 17 # => 17
15 && nil # => nil
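The same rule applied to the values from the question, as a small sketch (truthy? is a helper name invented here):

```ruby
# Only false and nil are falsy in Ruby; every other value, including 0, is truthy.
def truthy?(value)
  value ? true : false
end

p truthy?(0)      # => true
p truthy?(nil)    # => false

# && returns its left operand when that is falsy, otherwise its right operand.
p(0 && 1)         # => 1
p(1 && 0)         # => 0
p(false && 1)     # => false

# == binds tighter than &&, so the question's first line parses as 0 && (1 == 1).
p(0 && 1 == 1)    # => true
p(1 && 0 == 0)    # => true
```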

Find unused code in a Rails app

20 votes

How do I find what code is and isn't being run in production?

The app is well tested, but a lot of the tests cover unused code, so that code shows up as covered when the tests run. I'd like to refactor and clean up this mess; it keeps wasting my time. I have a lot of background jobs, which is why I'd like the production environment to guide me. Since the app runs on Heroku, I can spin up dynos to compensate for any performance impact from the profiler.

The related question "How can I find unused methods in a Ruby app?" was not helpful.

Bonus: metrics to show how often a line of code is run. Don't know why I want it, but I do! :)

Under normal circumstances the approach would be to use your test data for code coverage, but as you say you have parts of your code that are tested but are not used on the production app, you could do something slightly different.

Just for clarity first: Don't trust automatic tools. They will only show you results for things you actively test, nothing more.

With the disclaimer behind us, I propose you use a code coverage tool (like rcov, or simplecov for Ruby 1.9) on your production app and measure the code paths that are actually used by your users. While these tools were originally designed for measuring test coverage, you could also use them for production coverage.

Under the assumption that during the test time-frame all relevant code paths are visited, you can remove the rest. Unfortunately, this assumption will most probably not fully hold. So you will still have to apply your knowledge of the app and its inner workings when removing parts. This is even more important when removing declarative parts (like model references) as those are often not directly run but only used for configuring other parts of the system.
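As a rough illustration of the idea, Ruby 1.9 and later ship a Coverage module in the standard library that counts how often each line of a loaded file runs. The sketch below is not how rcov or simplecov are wired into a production app; the sample file, its method names and the unexecuted_lines helper are all made up for the demo, and the block passed in merely stands in for real traffic.

```ruby
require 'coverage'
require 'tempfile'

# Loads the Ruby file at `path` with line coverage switched on, runs the
# given block and returns the 1-based numbers of executable lines that
# never ran.
def unexecuted_lines(path)
  Coverage.start
  load path
  yield
  _, counts = Coverage.result.find { |file, _| File.basename(file) == File.basename(path) }
  counts.each_index.select { |i| counts[i] == 0 }.map { |i| i + 1 }
end

# Hypothetical application code, written to a temp file for the demo.
SAMPLE = Tempfile.new(['sample_app', '.rb'])
SAMPLE.write(<<-RUBY)
def used_in_production
  'hello'.upcase
end

def never_called
  'bye'.upcase
end
RUBY
SAMPLE.close

p unexecuted_lines(SAMPLE.path) { used_in_production }
```

The per-line counters returned by Coverage.result are also what you would look at for the "how often is this line run" bonus question.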

Another approach, which could be combined with the above, is to try to refactor your app into distinct features that you can turn on and off. Then you can turn off the features that are suspected to be unused and check whether anybody complains :)

And as a final note: you won't find a magic tool to do your full analysis. That's because no tool can know whether a certain piece of code is used by actual users or not. The only thing that tools can do is create (more or less) static reachability graphs, telling you if your code is somehow called from a certain point. With a dynamic language like Ruby even this is rather hard to achieve, as static analysis doesn't bring much insight in the face of meta-programming or dynamic calls that are heavily used in a rails context. So some tools actually run your code or try to get insight from test coverage. But there is definitely no magic spell.

So given the high internal (mostly hidden) complexity of a Rails application, you will not get around doing most of the analysis by hand. The best advice would probably be to try to modularize your app and turn off certain modules to test if they are not used. This can be supported by proper integration tests.

Where and how is the _ (underscore) variable specified in Ruby?

12 votes

Most are aware of _'s special meaning in IRb as a holder for last return value, but that is not what I'm asking about here.

The _ can be used as a variable of sorts, with special connotations, akin to a "don't care variable". Here are some useful examples illustrating its unique behavior:

lambda { |x, x| 42 }            # duplicated argument name
lambda { |_, _| 42 }.call(4, 2) # => 42
lambda { |_, _| 42 }.call(_, _) # undefined local variable or method `_'
lambda { |_| _ + 1 }.call(42)   # => 43
lambda { |_, _| _ }.call(4, 2)  # 1.8.7: => 2
                                # 1.9.3: => 4
_ = 42
_ * 100         # => 4200
_, _ = 4, 2; _  # => 2

These were all run in Ruby directly (with puts calls added), not IRb, to avoid conflicting with its additional functionality.

This is all a result of my own experimentation though, as I can find no documentation on this behavior anywhere (admittedly it's not the easiest thing to search for). Ultimately, I'm curious how all of this works internally so I can better understand what it does in different circumstances. So I'm asking for references to documentation, and, preferably, the Ruby source code (and perhaps RubySpec) that reveal how _ behaves in Ruby.

Note: most of this arose out of this discussion with @Niklas B.

There is some special handling in the source to suppress the "duplicate argument name" error. The error message only appears in shadowing_lvar_gen inside parse.y, the 1.9.3 version looks like this:

static ID
shadowing_lvar_gen(struct parser_params *parser, ID name)
{
    if (idUScore == name) return name;
    /* ... */

and idUScore is defined in id.c like this:

REGISTER_SYMID(idUScore, "_");

You'll see similar special handling in warn_unused_var:

static void
warn_unused_var(struct parser_params *parser, struct local_vars *local)
{
    /* ... */
    for (i = 0; i < cnt; ++i) {
        if (!v[i] || (u[i] & LVAR_USED)) continue;
        if (idUScore == v[i]) continue;
        rb_compile_warn(ruby_sourcefile, (int)u[i], "assigned but unused variable - %s", rb_id2name(v[i]));
    }
}

You'll notice that the warning is suppressed on the second line of the for loop.

The only special handling of _ that I could find in the 1.9.3 source is above: the duplicate name error is suppressed and the unused variable warning is suppressed. Other than those two things, _ is just a plain old variable like any other. I don't know of any documentation about the (minor) specialness of _.
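In practice, the suppressed duplicate-name error means _ can appear more than once in a parameter list as a placeholder for values you don't need. A small sketch (third_elements is a name invented here):

```ruby
# Repeating `_` in a block's parameter list is accepted by the parser,
# whereas repeating an ordinary name such as `x` is a syntax error.
def third_elements(rows)
  rows.map { |_, _, third| third }
end

p third_elements([[1, 2, 3], [4, 5, 6]])   # => [3, 6]

# Outside parameter lists `_` behaves like any other local variable.
_ = 42
p _ * 100                                   # => 4200
```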

Strange "half to even" rounding in different languages

11 votes

GNU bash, version 4.2.24:

$> printf "%.0f, %.0f\n" 48.5 49.5
48, 50

Ruby 1.8.7

> printf( "%.0f, %.0f\n", 48.5, 49.5 )
48, 50

Perl 5.12.4

$> perl -e 'printf( "%.0f, %.0f\n", 48.5, 49.5 )'
48, 50

gcc 4.5.3:

> printf( "%.0f, %.0f\n", 48.5, 49.5 );
48, 50

GHC, version 7.0.4:

> printf "%.0f, %.0f\n" 48.5 49.5
49, 50

Wikipedia says that this kind of rounding is called round half to even:

This is the default rounding mode used in IEEE 754 computing functions and operators.

Why is this rounding used by default in C, Perl, Ruby and bash, but not in Haskell?

Is it some sort of tradition or standard? And if it is a standard, why is it used by those languages and not by Haskell? What is the point of rounding half to even?

GHCi> round 48.5
48
GHCi> round 49.5
50

The only difference is that printf isn't using round — presumably because it has to be able to round to more than just whole integers. I don't think IEEE 754 specifies anything about how to implement printf-style formatting functions, just rounding, which Haskell does correctly.

It would probably be best if printf was consistent with round and other languages' implementations, but I don't think it's really a big deal.
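As for what rounding half to even is good for: it avoids the upward drift that creeps in when ties are always rounded up and many values are summed. A small Ruby sketch, assuming Ruby 2.4 or later for Array#sum and the half: keyword of Float#round; the sample values and the sum_rounded helper are made up for illustration:

```ruby
TIES = [0.5, 1.5, 2.5, 3.5].freeze   # every value sits exactly on a tie

# Rounds each value with the given tie-breaking mode and sums the results.
def sum_rounded(values, half)
  values.map { |v| v.round(half: half) }.sum
end

p TIES.sum                    # => 8.0
p sum_rounded(TIES, :up)      # => 10
p sum_rounded(TIES, :even)    # => 8

# Kernel#format rounds these ties to even, matching the printf output in the question.
puts format('%.0f, %.0f', 48.5, 49.5)   # => 48, 50
```

With ties going to the even neighbour, half of them round down and half round up, so the rounded total stays close to the true total.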

How to setup a basic ruby project?

9 votes

I want to create a small ruby project with 10-20 classes / files. I need some gems and I want to use rspec as test framework.

I might want to build a gem later on, but that is not certain.

Is there a how-to or guide that shows me how to set up the basic structure of my project?

Questions that I have are:

  • Where do I put all my custom errors/exceptions?
  • Are there conventions for naming directories like lib, bin, src, etc.?
  • Where do I put test data or documents?
  • Where do I require all my files so I have access to them in my project?

I know I could do everything from scratch, but I would like some guidance. There are some good gems out there that I could copy, but I am not certain what I really need and what I can delete.

I looked at http://gembundler.com/, but it stops after setting up bundler.

You can run bundle gem my_lib and rspec --init to get a good start.

~/Desktop $ bundle gem my_lib
      create  my_lib/Gemfile
      create  my_lib/Rakefile
      create  my_lib/.gitignore
      create  my_lib/my_lib.gemspec
      create  my_lib/lib/my_lib.rb
      create  my_lib/lib/my_lib/version.rb
Initializating git repo in /Users/john/Desktop/my_lib
~/Desktop $ cd my_lib/
~/Desktop/my_lib $ rspec --init
The --configure option no longer needs any arguments, so true was ignored.
  create   spec/spec_helper.rb
  create   .rspec
~/Desktop/my_lib $ mkdir bin

Then, put your code in lib, your shell scripts in bin, specs in spec, etc...

Put your test data or documents in spec/fixtures/. Require all your files in lib/my_lib.rb. You can define your exceptions either in lib/my_lib.rb or in their own files -- according to your own preference.

C source files should go in ext/my_lib.

When in doubt, just look at how other gems are laid out.
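As a sketch of what lib/my_lib.rb could look like under that layout: the error classes and the file names mentioned in the comments are illustrative assumptions, not something bundle gem generates for you.

```ruby
# lib/my_lib.rb -- entry point; requiring this one file should load the library.
# In a real project the pieces would live in their own files and be pulled in here:
#   require 'my_lib/version'
#   require 'my_lib/errors'

module MyLib
  VERSION = '0.0.1' # normally defined in lib/my_lib/version.rb

  # Common base class, so callers can rescue MyLib::Error to catch anything
  # raised by this library.
  class Error < StandardError; end
  class ConfigurationError < Error; end
  class ParseError < Error; end
end

begin
  raise MyLib::ParseError, 'unexpected token'
rescue MyLib::Error => e
  puts "#{e.class}: #{e.message}"   # => MyLib::ParseError: unexpected token
end
```

Keeping a single base error class is a design choice: it lets users of the library rescue everything it raises without also swallowing unrelated exceptions.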

Why couldn't twitter scale by adding servers the way sites like facebook have?

9 votes

I have been looking for an explanation of why Twitter had to migrate part of its middleware from Rails to Scala. What prevented them from scaling the way Facebook has, by adding servers as its user base expanded? More specifically, what about the Ruby/Rails technology prevented the Twitter team from taking this approach?

It's not that Rails doesn't scale, but rather, requests for "live" data in Ruby (or any interpreted language) do not scale, as they are comparatively far more expensive both in terms of CPU & memory utilization than their compiled language counterparts.

Now, were Twitter a different type of service, one that had the same enormous user base, but served data that changed less frequently, Rails could be a viable option via caching; i.e. avoiding live requests to the Rails stack entirely and offloading to front end server and/or in-memory DB cache. An excellent article on this topic:

How Basecamp Next got to be so damn fast

However, Twitter did not ditch Rails for scaling issues alone; they made the switch because Scala, as a language, provides certain built-in guarantees about the state of your application that interpreted languages cannot provide: if it compiles, time-wasting bugs such as fat-fingered typos, incorrect method calls, incorrect type declarations, etc. simply cannot exist.

For Twitter, TDD was not enough. A quote from Dijkstra in Programming in Scala illustrates this point: "testing can only prove the presence of errors, never their absence". As their application grew, they ran into more and more hard-to-track-down bugs. The magical mystery tour was becoming a hindrance beyond performance, so they made the switch. By all accounts it was an overwhelming success: Twitter is to Scala what Facebook is to PHP (although Facebook uses their own ultra fast C++ preprocessor so cheating a bit ;-))

To sum up, Twitter made the switch for both performance and reliability. Of course, Rails tends to be on the innovation forefront, so the 99% non-Twitter level trafficked applications of the world can get by just fine with an interpreted language (although, I'm now solidly on the compiled language side of the fence, Scala is just too good!)

Good sound libraries?

8 votes

I need to take an audio signal, and extract overlapping audio frames from it. I then need to convert these to frequency data (FFT stuff / like a spectrogram) and analyze the frequency information.

For example, if I have a 1 minute mp3 file, I want to split the file into smaller files: from 00:00.000 to 00:03.000, from 00:00.010 to 00:03.010, and so on. Then I need to see the frequency breakdown of each sub-file.

Which programming languages have good audio tools that could help me do this? Are there linux command-line tools I could use? Bonus points for Node.js (yeah right) or Haskell, which I'm most familiar with.

Haskell:

http://hackage.haskell.org/package/hsndfile. Then it's mainly just math, I'd imagine, with hmatrix and so forth.

Searching and marking paired patterns on a line

8 votes

I need to search for and mark patterns which are split somewhere on a line. Here is a shortened list of sample patterns which are placed in a separate file, e.g.:

CAT,TREE
LION,FOREST
OWL,WATERFALL

A match appears if the item from column 2 ever appears after and on the same line as the item from column 1. E.g.:

THEREISACATINTHETREE. (matches)

No match appears if the item from column 2 appears first on the line, e.g.:

THETREEHASACAT. (does not match)

Furthermore, no match appears if the item from column 1 and 2 touch, e.g.:

THECATTREEHASMANYBIRDS. (does not match)

Once any match is found, I need to mark it with \start{n} (appearing after the column 1 item) and \end{n} (appearing before the column 2 item), where n is a simple counter which increases anytime any match is found. E.g.:

THEREISACAT\start{1}INTHE\end{1}TREE.

Here is a more complex example:

THECATANDLIONLEFTTHEFORESTANDMETANDOWLINTREENEARTHEWATERFALL.

This becomes:

THECAT\start{1}ANDLION\start{2}LEFTTHE\end{2}FORESTANDMETANDOWL\start{3}IN\end{1}TREENEARTHE\end{3}WATERFALL.

Sometimes there are multiple matches in the same place:

 THECATDOESNOTLIKETALLTREES,BUTINSTEADLIKESSHORTTREES.

This becomes:

 THECAT\start{1}\start{2}DOESNOTLIKETALL\end{1}TREES,BUTINSTEADLIKESSHORT\end{2}TREES.

  • There are no spaces in the file.
  • Many non-Latin characters appear in the file.
  • Pattern matches need only be found on the same line (e.g. "CAT" on line 1 does not ever match with a "TREE" found on line 2, as those are on different lines).

How can I find these matches and mark them in this way?

Check this out (Ruby):

#!/usr/bin/env ruby
patterns = [
  ['CAT', 'TREE'],
  ['LION', 'FOREST'],
  ['OWL', 'WATERFALL']
]

lines = [
  'THEREISACATINTHETREE.',
  'THETREEHASACAT.',
  'THECATTREEHASMANYBIRDS.',
  'THECATANDLIONLEFTTHEFORESTANDMETANDOWLINTREENEARTHEWATERFALL.',
  'THECATDOESNOTLIKETALLTREES,BUTINSTEADLIKESSHORTTREES.',
  'CAT...TREE...CAT...TREE'
]

lines.each do |line|
  puts line
  matches = Hash.new{|h,e| h[e] = [] }
  match_indices = []
  patterns.each do |first,second|
    offset = 0
    while new_offset = line.index(first,offset) do
      # map second element of the pattern to minimal position it might be matched
      matches[second] << new_offset + first.size + 1
      offset = new_offset + 1
    end
  end
  global_counter = 1
  matches.each do |second,offsets|
    offsets.each do |offset|
      second_offset = offset
      while new_offset = line.index(second,second_offset) do
        # register the end index of the first pattern and 
        # the start index of the second pattern with the global match count
        match_indices << [offset-1,new_offset,global_counter]
        second_offset = new_offset + 1
        global_counter += 1
      end
    end
  end
  indices = Hash.new{|h,e| h[e] = ""}
  match_indices.each do |first,second,global_counter|
    # build the insertion string for the string positions the 
    # start and end tags should be placed in
    indices[first] << "\\start{#{global_counter}}"
    indices[second] << "\\end{#{global_counter}}"
  end
  inserted_length = 0
  indices.sort_by{|k,v| k}.each do |position,insert|
    # insert the tags at their positions
    line.insert(position + inserted_length,insert)
    inserted_length += insert.size
  end
  puts line
end

Result

THEREISACATINTHETREE.
THEREISACAT\start{1}INTHE\end{1}TREE.
THETREEHASACAT.
THETREEHASACAT.
THECATTREEHASMANYBIRDS.
THECATTREEHASMANYBIRDS.
THECATANDLIONLEFTTHEFORESTANDMETANDOWLINTREENEARTHEWATERFALL.
THECAT\start{1}ANDLION\start{2}LEFTTHE\end{2}FORESTANDMETANDOWL\start{3}IN\end{1}TREENEARTHE\end{3}WATERFALL.
THECATDOESNOTLIKETALLTREES,BUTINSTEADLIKESSHORTTREES.
THECAT\start{1}\start{2}DOESNOTLIKETALL\end{1}TREES,BUTINSTEADLIKESSHORT\end{2}TREES.
CAT...TREE...CAT...TREE
CAT\start{1}\start{2}...\end{1}TREE...CAT\start{3}...\end{2}\end{3}TREE

EDIT

I inserted some comments and clarified some of the variables.

Space filling with circles of unequal size

8 votes

Here is my problem:

  • I have a bunch of circles that I need to display inside a canvas.
  • There are an arbitrary number of circles, each with a predefined radius.
  • The summed area of circles is always smaller than the area of the canvas.

I want to position the circles so that they take up the maximal space available inside the canvas, without touching each other. My goal is to achieve a visually pleasing effect where the circles appear well distributed inside the canvas. I don't know if this is really "space filling", as my goal is not to minimize the distance between elements, but rather to maximize it.

Here is an example of what I am trying to achieve:

[image: Circles]

My first "brute force" idea was the following:

  1. For each circle: calculate the shortest distance between its border and each other circle's border; sum all of these distances, call that X.
  2. Calculate the sum of all X's.
  3. Randomly change the distances between the circles.
  4. Redo 1-3 for a preset number of iterations and take the maximal value obtained at step (2).

However, this does not seem elegant; I'm sure there is a better way to do it. Is there any existing algorithm to achieve such a layout? Is there any existing library that I could use (JavaScript or Ruby) to achieve this?

Edit

Here is a Javascript version of the accepted answer, which uses Raphael to draw the circles.

I would try to insert sphere after sphere (largest first). Each one is added in the largest available space, with some random jitter.

One relatively easy way to find (more or less) the largest available space, is to imagine a grid of points on your view and store for each grid point (in a 2D array) the closest distance to any item: edge or sphere, whichever is closest. This array is updated as each new sphere is added.

To add a new sphere, just take the grid point with the highest distance and apply some random jitter (you actually know how much you can jitter, because you know the distance to the closest item). I would randomize by no more than (d-r)/2, where d is the distance in the array and r is the radius of the sphere to add.

Updating this array after adding another circle is no rocket science: for each grid point you calculate the distance to the border of the newly added sphere and replace the stored value if the new distance is smaller.

It is possible that your grid is too coarse and you can't add any more circles (when the 2D array contains no distances larger than the radius of the circle to add). Then you have to increase (e.g. double) the grid resolution before continuing.

Here are some results of this implementation (it took me about 100 lines of code):

  • 100 circles of varying size [image]
  • 500 circles of varying size [image]
  • 100 circles of the same size [image]

And here is some rough C++ code (just the algorithm, don't expect this to compile)

    // INITIALIZATION

    // Dimension of canvas
    float width = 768;
    float height = 1004;

    // The algorithm creates a grid on the canvas
    float gridSize=10;

    int gridColumns, gridRows;
    float *dist;

    void initDistances()
    {
      // Determine grid dimensions and allocate array
      gridColumns = width/gridSize;
      gridRows = height/gridSize;

      // We store a 2D array as a 1D array:
      dist = new float[ gridColumns * gridRows ];

      // Init dist array with shortest distances to the edges
      float y = gridSize/2.0;
      for (int row=0; row<gridRows; row++)
      {
        float distanceFromTop = y;
        float distanceFromBottom = height-y;
        for (int col=0; col<gridColumns; col++)
        {
          int i = row*gridColumns+col;
          dist[i]=(distanceFromTop<distanceFromBottom?distanceFromTop:distanceFromBottom);
        }
        y+=gridSize;
      }
      float x = gridSize/2.0;
      for (int col=0; col<gridColumns; col++)
      {
        float distanceFromLeft = x;
        float distanceFromRight = width-x;
        for (int row=0; row<gridRows; row++)
        {
          int i = row*gridColumns+col;
          if (dist[i]>distanceFromLeft) dist[i] = distanceFromLeft;
          if (dist[i]>distanceFromRight) dist[i] = distanceFromRight;
        }
        x+=gridSize;
      }
    }

    void drawCircles()
    {
      for (int circle = 0; circle<getNrOfCircles(); circle++)
      {
        // We assume circles are sorted large to small!
        float radius = getRadiusOfCircle( circle ); 

        // Find gridpoint with largest distance from anything
        int i=0;
        int maxR = 0;
        int maxC = 0;
        float maxDist = dist[0];

        for (int r=0; r<gridRows; r++) 
          for (int c=0; c<gridColumns; c++)
          {
            if (maxDist<dist[i]) {
              maxR= r; maxC= c; maxDist = dist[i];
            }
            i++;
          }

        // Calculate position of grid point
        float x = gridSize/2.0 + maxC*gridSize;
        float y = gridSize/2.0 + maxR*gridSize;

        // Apply some random Jitter
        float offset = (maxDist-radius)/2.0;
        x += (rand()/(float)RAND_MAX - 0.5) * 2 * offset;
        y += (rand()/(float)RAND_MAX - 0.5) * 2 * offset;


        drawCircle(x,y,radius);


        // Update Distance array with new circle;
        i=0;
        float yy = gridSize/2.0;
        for (int r=0; r<gridRows; r++)
        {
          float xx = gridSize/2.0;
          for (int c=0; c<gridColumns; c++)
          {
            float d2 = (xx-x)*(xx-x)+(yy-y)*(yy-y);

            // Naive implementation
            // float d = sqrt(d2) - radius;
            // if (dist[i]>d) dist[i] = d;

            // Optimized implementation (no unnecessary sqrt)
            float prev2 = dist[i]+radius;
            prev2 *= prev2;
            if (prev2 > d2)
            {
              float d = sqrt(d2) - radius;
              if (dist[i]>d) dist[i] = d;
            }



            xx += gridSize;
            i++;
          }
          yy += gridSize;
        }
      }
    }
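Since the question asked about Ruby, here is a rough Ruby rendering of the same grid idea. It is a sketch under a few assumptions rather than a port of the answerer's actual implementation: Circle, pack_circles, the default grid size and the seeded random generator are choices made here, and instead of refining the grid when it gets too coarse it simply raises an error.

```ruby
Circle = Struct.new(:x, :y, :r)

# Places circles with the given radii on a width x height canvas and returns
# an array of Circle structs, largest first. Raises when no grid point has
# enough free space left for the next circle.
def pack_circles(radii, width, height, grid_size = 10.0, rng = Random.new(42))
  cols = (width / grid_size).floor
  rows = (height / grid_size).floor
  # Coordinates of the grid points (cell centres)
  xs = Array.new(cols) { |c| grid_size / 2.0 + c * grid_size }
  ys = Array.new(rows) { |r| grid_size / 2.0 + r * grid_size }
  cells = (0...rows).to_a.product((0...cols).to_a)

  # dist[r][c]: distance from that grid point to the nearest edge or circle
  dist = ys.map { |y| xs.map { |x| [x, width - x, y, height - y].min } }

  radii.sort.reverse.map do |radius|
    # Grid point with the most free space around it
    row, col = cells.max_by { |r, c| dist[r][c] }
    max_dist = dist[row][col]
    raise ArgumentError, 'grid too coarse or canvas full' if max_dist < radius

    # Random jitter of at most (d - r) / 2 per axis
    offset = (max_dist - radius) / 2.0
    x = xs[col] + (rng.rand - 0.5) * 2 * offset
    y = ys[row] + (rng.rand - 0.5) * 2 * offset

    # Update the distance grid with the border of the new circle
    cells.each do |r, c|
      d = Math.hypot(xs[c] - x, ys[r] - y) - radius
      dist[r][c] = d if d < dist[r][c]
    end
    Circle.new(x, y, radius)
  end
end

circles = pack_circles([30, 20, 10, 10, 5, 5], 300, 300)
circles.each { |c| puts format('x=%.1f y=%.1f r=%d', c.x, c.y, c.r) }
```

This version favours readability over speed; it rescans the whole grid for every circle, just like the naive variant in the C++ code.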

How can I call a Proc that takes a block in a different context?

8 votes

Take this example Proc:

proc = Proc.new {|x,y,&block| block.call(x,y,self.instance_method)}

It takes two arguments, x and y, and also a block.

I want to execute that block using different values for self. Something like this nearly works:

some_object.instance_exec("x arg", "y arg", &proc)

However, that doesn't allow you to pass in a block. This also doesn't work:

some_object.instance_exec("x arg", "y arg", another_proc, &proc)

nor does

some_object.instance_exec("x arg", "y arg", &another_proc, &proc)

I'm not sure what else could work here. Is this possible, and if so how do you do it?

Edit: Basically if you can get this rspec file to pass by changing the change_scope_of_proc method, you have solved my problem.

require 'rspec'

class SomeClass
  def instance_method(x)
    "Hello #{x}"
  end
end

class AnotherClass
  def instance_method(x)
    "Goodbye #{x}"
  end

  def make_proc
    Proc.new do |x, &block|
      instance_method(block.call(x))
    end
  end
end

def change_scope_of_proc(new_self, proc)
  # TODO fix me!!!
  proc
end

describe "change_scope_of_proc" do
  it "should change the instance method that is called" do
    some_class = SomeClass.new
    another_class = AnotherClass.new
    proc = another_class.make_proc
    fixed_proc = change_scope_of_proc(some_class, proc)
    result = fixed_proc.call("Wor") do |x|
      "#{x}ld"
    end

    result.should == "Hello World"
  end
end

To solve this, you need to re-bind the Proc to the new class.

Here's your solution, leveraging some good code from Rails core_ext:

require 'rspec'

# Same as original post

class SomeClass
  def instance_method(x)
    "Hello #{x}"
  end
end

# Same as original post

class AnotherClass
  def instance_method(x)
    "Goodbye #{x}"
  end

  def make_proc
    Proc.new do |x, &block|
      instance_method(block.call(x))
    end
  end
end

### SOLUTION ###

# From activesupport lib/active_support/core_ext/kernel/singleton_class.rb

module Kernel
  # Returns the object's singleton class.
  def singleton_class
    class << self
      self
    end
  end unless respond_to?(:singleton_class) # exists in 1.9.2

  # class_eval on an object acts like singleton_class.class_eval.
  def class_eval(*args, &block)
    singleton_class.class_eval(*args, &block)
  end
end

# From activesupport lib/active_support/core_ext/proc.rb 

class Proc #:nodoc:
  def bind(object)
    block, time = self, Time.now
    object.class_eval do
      method_name = "__bind_#{time.to_i}_#{time.usec}"
      define_method(method_name, &block)
      method = instance_method(method_name)
      remove_method(method_name)
      method
    end.bind(object)
  end
end

# Here's the method you requested

def change_scope_of_proc(new_self, proc)
  return proc.bind(new_self)
end

# Same as original post

describe "change_scope_of_proc" do
  it "should change the instance method that is called" do
    some_class = SomeClass.new
    another_class = AnotherClass.new
    proc = another_class.make_proc
    fixed_proc = change_scope_of_proc(some_class, proc)
    result = fixed_proc.call("Wor") do |x|
      "#{x}ld"
    end
    result.should == "Hello World"
  end
end
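On Ruby 1.9.2 and later, a similar effect seems achievable without ActiveSupport or patching Kernel: a proc handed to define_singleton_method becomes a method body, so it runs with the new object as self and can still receive a block. A sketch under that assumption (Greeter, Farewell and rebind are illustrative names, not from the question):

```ruby
class Greeter
  def greet(x)
    "Hello #{x}"
  end
end

class Farewell
  def greet(x)
    "Goodbye #{x}"
  end

  def make_proc
    Proc.new { |x, &block| greet(block.call(x)) }
  end
end

# Turns `callable` into a temporary singleton method of `new_self`, keeps the
# resulting Method object, and removes the temporary method again.
def rebind(new_self, callable)
  name = :"__rebound_#{callable.object_id}"
  new_self.define_singleton_method(name, &callable)
  bound = new_self.method(name)
  new_self.singleton_class.send(:remove_method, name)
  bound
end

rebound = rebind(Greeter.new, Farewell.new.make_proc)
puts rebound.call('Wor') { |x| "#{x}ld" }   # => Hello World
```

Like the ActiveSupport version, this returns a Method object rather than a Proc, which is enough for callers that only use call.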

Duplicating class in the object space object_id

7 votes

I am having a strange issue where certain models in a Rails engine I am using are getting duplicated in the object space.

(rdb:1) ObjectSpace.each_object(::Class).each { |klass| puts klass.to_s + ": " + klass.object_id.to_s if klass.to_s.eql?("DynamicFieldsets::Field") }
DynamicFieldsets::Field: 66866100
DynamicFieldsets::Field: 71836380
2479

When this happens, I cannot use is_a? or equality checks to test that an object is an instance of the Field class. The problem only happens in development and it looks like it may be caused by cache_classes being off. I think the object from the previous request is still in the object space but I am not sure how to remove it.

This is easy to reproduce with remove_const:

class X
  def self.foo
    "hello"
  end
end
first_x = X.new

Object.send :remove_const, :X
class X
  def self.foo
    "world"
  end
end
second_x = X.new

p first_x.class, first_x.class.object_id, second_x.class, second_x.class.object_id
  # => X, <an_id>, X, <another_id>
p first_x.class.foo, second_x.class.foo
  # => "hello", "world"

As you stated, you get this symptom only in development. When Rails reloads the classes, it simply calls remove_const on the defined classes, to force them to be reloaded (using autoload). Here's the code. Rails will actually call DynamicFieldsets::Field.before_remove_const if it is defined, as explained here, how nice :-)

These should be garbage collected and you can trigger the GC with GC.start, but if you have instances of the old classes lying around (like first_x in my example), or subclasses, the old classes can not be garbage collected.

Note that is_a? should work fine, in the sense that new instances will be kind_of? and is_a? of the new class. In my example:

first_x.is_a? X  # => false
second_x.is_a? X # => true

This is the right behavior, as X refers to the new class, not the old class.

Issue updating Ruby on Mac with Xcode 4.3.1

7 votes

I'm using RVM to install it and it gives me this error:

The provided compiler '/usr/bin/gcc' is LLVM based, it is not yet fully supported by ruby and gems, please read `rvm requirements`.

I'm on Lion 10.7.3 and I have Xcode 4.3.1.

The short answer is that you can grab the RVM master branch (not stable) to build Ruby with LLVM (not gcc, I mistyped initially). It has the appropriate patches to make 1.9.3-p125 run (or at least run better) with Xcode 4.3.1 by default. I provided the patch. If you have already installed RVM, rvm get head will install the master branch. With the command line tools installed with Xcode 4.3.1, you can successfully install Ruby 1.9.3-p125.

Background

This happens because of a simple configuration issue in Ruby 1.9.3-p125 that prevents dynamically linked modules from working. It occurs if you're using Xcode 4.3.x (Ruby Issue#6080).

This issue has been fixed in changeset r34840.

RVM has a patch system that applies patches on a per-version basis. This patch is included in RVM (master branch for now) and is now applied by default in the configuration steps for p125.

Xcode 4.3.x Command Line Tool

First, with Xcode 4.3.x, you need to install the command line tools AFTER installing Xcode 4.3.x, by following these directions: 1) Launch Xcode. 2) Open “Preferences” from the “Xcode” item on the menu bar. 3) Select the “Downloads” tab (icon). 4) Click the “Install” button for “Command Line Tools” (directions borrowed from my friend's site here).

If Xcode 4.3.1 is correctly installed, then cc --version should emit:

% cc --version
Apple clang version 3.1 (tags/Apple/clang-318.0.54) (based on LLVM 3.1svn)
Target: x86_64-apple-darwin11.3.0
Thread model: posix

autoconf and automake

You need autoconf and automake, since Xcode 4.3.x doesn't include them. Install them with either brew or MacPorts. With MacPorts:

sudo port install autoconf automake

Recommended installation step with RVM

Then, to install a specific branch of RVM, you can run:

REPO=wayneeseguin
BRANCH=master # stable for the stable branch
curl -s https://raw.github.com/${REPO}/rvm/${BRANCH}/binscripts/rvm-installer > /tmp/rvm-installer.sh
bash /tmp/rvm-installer.sh --branch ${REPO}/${BRANCH}

Or if RVM is already installed:

rvm get head   # master branch, for stable branch "rvm get stable"

After that, install openssl and readline using the rvm pkg command for best results. This is what I do lately. Part of this might need to be included in RVM.

rvm pkg install openssl
rvm pkg install readline # if you prefer GNU readline

Then, finally, install Ruby.

rvm install 1.9.3-p125 --with-readline-dir=$rvm_path/usr --with-openssl-dir=$rvm_path/usr --with-tcl-lib=/usr --with-tk-lib=/usr

The help for rvm pkg recommends a different parameter, but the help is broken, so use the above for now. You need the tcl/tk parameters if you have them via MacPorts (like me).

By the way, it is possible to install an old Xcode and then run rvm with export CC="gcc-4.2" rvm install 1.9.3-p125, but I personally think clang (LLVM) is the way to go for the future, if possible.

Hope this helps.

Edit on 2012/3/31

iconv doesn't need to be installed; also added the autoconf/automake requirements for clarification.

What OCR options exist beyond Tesseract?

7 votes

I've used Tesseract a bit and its results leave much to be desired. I'm currently detecting very small images (35x15, without a border; I have tried adding one with imagemagick with no OCR advantage). They range from 2 to 5 characters and use a pretty reliable font, but the characters are variable enough that simply using an image size checksum or similar is not going to work.

I actually have tried: http://www.free-ocr.co.uk/ and surprisingly it has 100% accuracy. The problem I have with utilizing it is that I cannot rely on another outside service's reliability for this particular use case. I need to be able to control uptime to a higher degree.

What options exist for OCR besides sticking with Tesseract or doing a complete custom training of it? Also, it would be VERY helpful if this were compatible with Heroku style hosting (at least where I can compile the bins and shove them over).

I have successfully used GOCR in the past for small image OCR. I would say accuracy was around 85%, after getting the grayscale options set properly, on fairly regular fonts. It fails miserably when the fonts get complicated and has trouble with multiline layouts.

Also have a look at Ocropus, which is maintained by Google. It's related to Tesseract, but from what I understand, its OCR engine is different. With just the default models included, it achieves near 99% accuracy on high-quality images, handles layout pretty well and provides HTML output with information concerning formatting and lines. However, in my experience, its accuracy is very low when the image quality is not good enough. That being said, training is relatively simple and you might want to give it a try.

Both of them are easily callable from the command line. GOCR usage is very straightforward; just type gocr -h and you should have all the information you need. Ocropus is a bit more tricky; here's a usage example, in Ruby:

require 'fileutils'
tmp = 'directory'
file = 'file.png'

`ocropus book2pages #{tmp}/out #{file}`
`ocropus pages2lines #{tmp}/out`
`ocropus lines2fsts #{tmp}/out`
`ocropus buildhtml #{tmp}/out > #{tmp}/output.html`

text = File.read("#{tmp}/output.html")
FileUtils.rm_rf(tmp)

Why use procs instead of methods?

7 votes

I'm new to programming, and ruby is my first real run at it. I get blocks, but procs seem like a light method/function concept -- why use them? Why not just use a method?

Thanks for your help.

A Proc is a callable piece of code. You can store it in a variable, pass it as an argument and otherwise treat it as a first-class value.

Why not just use a method?

Depends on what you mean by "method" here.

class Foo
  def bar
    puts "hello"
  end
end

f = Foo.new

In this code snippet, the usage of the method bar is pretty limited. You can call it, and that's it. However, if you wanted to store a reference to it (to pass it somewhere else and call it there), you can do this:

f = Foo.new
bar_method = f.method(:bar)

Here bar_method is very similar to lambda (which is similar to Proc). bar_method is a first-class citizen, f.bar is not.

For more information, read the article mentioned by @minitech.
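
To make the "first-class value" point concrete, here is a minimal sketch. The names shout and call_twice are hypothetical, and bar returns its string here instead of printing it so the results are easy to inspect:

```ruby
class Foo
  def bar
    "hello" # returns the string (the snippet above prints it instead)
  end
end

# A proc stored in a variable, like any other value.
shout = Proc.new { |s| s.upcase }

# A Method object: a reference to Foo#bar bound to one instance.
bar_method = Foo.new.method(:bar)

# Hypothetical helper: accepts anything that responds to #call.
def call_twice(callable)
  [callable.call, callable.call]
end

from_method = call_twice(bar_method)         # => ["hello", "hello"]
from_proc   = call_twice(bar_method.to_proc) # => ["hello", "hello"]
shouted     = %w[a b].map(&shout)            # => ["A", "B"]
```

Both the Method object and the Proc can be handed to call_twice, while a bare f.bar would simply run the method on the spot and pass along its return value.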

Not understanding Classes, Modules, and the class << self method

7 votes

I have the following code:

class MyClass  
  module MyModule
    class << self

      attr_accessor :first_name

      def myfunction
        MyModule.first_name = "Nathan"
      end

    end
  end
end

When I call the method myfunction like so, it works fine:

> me = MyClass::MyModule.myfunction
=> "Nathan"
> me
=> "Nathan"

But if I remove the class << self and add a self. prefix to myfunction, it doesn't work.

For example:

class MyClass  
  module MyModule

    attr_accessor :first_name

    def self.myfunction
      MyModule.first_name = "Nathan"
    end

  end
end


> me = MyClass::MyModule.myfunction
NoMethodError: undefined method `first_name=' for MyClass::MyModule:Module

I'm trying to understand the class << self method. I thought it was a way to add the self. prefix to all the methods inside of it, but if that were true, why doesn't it work if I remove it and prefix each method with self. manually?

Thanks in advance for your help.

This is because your attr_accessor :first_name is also wrapped by the class << self.

To do it the way you suggest, you can use mattr_accessor like so:

# mattr_accessor is a core extension; require it explicitly
require 'active_support/core_ext/module/attribute_accessors'

class MyClass  
  module MyModule

    mattr_accessor :first_name

    def self.myfunction
      MyModule.first_name = "Nathan"
    end

  end
end
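
If you would rather not depend on ActiveSupport, the difference can be seen in plain Ruby. This is only a sketch, and the module and class names (WithSingleton, WithoutSingleton, Person) are made up for illustration:

```ruby
module WithSingleton
  class << self
    # Defined on the module's singleton class, so the module itself
    # responds to first_name and first_name=.
    attr_accessor :first_name
  end

  def self.set_name
    self.first_name = "Nathan"
  end
end

module WithoutSingleton
  # Defined as instance methods: only objects that include the module
  # get first_name and first_name=, not the module itself.
  attr_accessor :first_name
end

class Person
  include WithoutSingleton
end

WithSingleton.set_name
module_level = WithSingleton.first_name                      # => "Nathan"
responds     = WithoutSingleton.respond_to?(:first_name=)    # => false

person = Person.new
person.first_name = "Nathan"
instance_level = person.first_name                           # => "Nathan"
```

So class << self is not just a shorthand for prefixing def with self.; everything evaluated inside it, including attr_accessor, applies to the singleton class.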

How do you read this ternary condition in Ruby?

6 votes

I came across a ternary in some code and I am having trouble understanding the conditional:

str.split(/',\s*'/).map do |match|
  match[0] == ?, ?
    match : "some string"
end.join

I do understand that I am splitting a string at certain points and converting the total result to an array, and dealing with each element of the array in turn. Beyond that I have no idea what's going on.

A (slightly) less confusing way to write this is:

str.split(/',\s*'/).map do |match|
  if match[0] == ?,
    match
  else
    "some string"
  end
end.join

I think multiline ternary statements are horrible, especially since if expressions return a value in Ruby.

Probably the most confusing thing here is the ?, which is a character literal. In Ruby 1.8 this evaluates to the ASCII value of the character (in this case 44); in Ruby 1.9 it is just a one-character string (in this case ",").

The reason for using a character literal instead of just "," is that the return value of calling [] on a string changed in Ruby 1.9. In 1.8 it returned the ASCII value of the character at that position, in 1.9 it returns a single-character string. Using ?, here avoids having to worry about the differences in String#[] between Ruby 1.8 & 1.9.

Ultimately the conditional is just checking if the first character in match is ,, and if so it keeps the value the same, else it sets it to "some string".
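
A small sketch of that behaviour on Ruby 1.9 or later. The sample array parts is made up, and the one-line ternary compares against "," rather than ?, purely to keep it readable:

```ruby
# On Ruby 1.9+, the character literal ?, is a one-character string.
comma = ?,
same  = (comma == ",") # => true

parts = [",keep", "drop", ",also"]

# One-line ternary form of the conditional.
with_ternary = parts.map { |match| match[0] == "," ? match : "some string" }

# Equivalent if/else form; the if expression's value is the block's value.
with_if = parts.map do |match|
  if match[0] == ?,
    match
  else
    "some string"
  end
end

joined = with_if.join # => ",keepsome string,also"
```

Both forms produce the same array, which is then joined into a single string.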

Ruby: file encryption/decryption with private/public keys

6 votes

I am looking for a file encryption/decryption algorithm that satisfies the following requirements:

  • Algorithm must be reliable
  • Algorithm should be fast for rather big files
  • Private key can be generated by some parameter (for example, password)
  • Generated private key must be compatible with public key (public key is generated only once and stored in database)

Is there any Ruby implementation of the suggested algorithms? Any gem?

Note Well: As emboss mentions in the comments, this answer is a poor fit for an actual system. Firstly, file encryption should not be carried out using this method (The lib provides AES, for example.). Secondly, this answer does not address any of the wider issues that will also affect how you engineer your solution.

The original source also goes into more details.

Ruby can use openssl to do this:

#!/usr/bin/env ruby

# ENCRYPT

require 'openssl'
require 'base64'

public_key_file = 'public.pem'
string = 'Hello World!'

public_key = OpenSSL::PKey::RSA.new(File.read(public_key_file))
encrypted_string = Base64.encode64(public_key.public_encrypt(string))

And decrypt:

#!/usr/bin/env ruby

# DECRYPT

require 'openssl'
require 'base64'

private_key_file = 'private.pem'
password = 'boost facile'

encrypted_string = %Q{
...
}

private_key = OpenSSL::PKey::RSA.new(File.read(private_key_file),password)
string = private_key.private_decrypt(Base64.decode64(encrypted_string))

from here
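
Since the note at the top points out that files should be encrypted with something like AES rather than RSA directly, here is a rough sketch of one common arrangement: encrypt the data with a random AES key, then wrap only that key with RSA. The helper names hybrid_encrypt and hybrid_decrypt are hypothetical, the key pair is generated in memory instead of being read from public.pem and private.pem, and CBC mode provides no integrity check, so treat this as an illustration rather than a finished design:

```ruby
require 'openssl'

# Hypothetical helper: AES-256-CBC for the data, RSA (OAEP) for the AES key.
def hybrid_encrypt(data, public_key)
  cipher = OpenSSL::Cipher.new('aes-256-cbc')
  cipher.encrypt
  key = cipher.random_key # fresh random AES key for this message
  iv  = cipher.random_iv  # fresh random IV for this message
  {
    :wrapped_key => public_key.public_encrypt(key, OpenSSL::PKey::RSA::PKCS1_OAEP_PADDING),
    :iv          => iv,
    :ciphertext  => cipher.update(data) + cipher.final
  }
end

# Hypothetical helper: unwrap the AES key with RSA, then decrypt the data.
def hybrid_decrypt(bundle, private_key)
  key = private_key.private_decrypt(bundle[:wrapped_key], OpenSSL::PKey::RSA::PKCS1_OAEP_PADDING)
  cipher = OpenSSL::Cipher.new('aes-256-cbc')
  cipher.decrypt
  cipher.key = key
  cipher.iv  = bundle[:iv]
  cipher.update(bundle[:ciphertext]) + cipher.final
end

rsa      = OpenSSL::PKey::RSA.new(2048) # assumption: in-memory key pair for the demo
data     = "Hello World! " * 1000       # far larger than RSA alone could encrypt
bundle   = hybrid_encrypt(data, rsa.public_key)
restored = hybrid_decrypt(bundle, rsa)
```

Only the short AES key goes through RSA, so the size of the file no longer matters to the public-key step.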

Match a string against multiple patterns

5 votes

How can I match a string against multiple patterns using regular expressions in Ruby?

I am trying to see if a string is included in an array of prefixes. This is not working, but I think it demonstrates at least what I am trying to do.

# example:
# prefixes.include?("Mrs. Kirsten Hess")

prefixes.include?(name) # should return true / false

prefixes = [
  /Ms\.?/i,
  /Miss/i,
  /Mrs\.?/i,
  /Mr\.?/i,
  /Master/i,
  /Rev\.?/i,
  /Reverend/i,
  /Fr\.?/i,
  /Father/i,
  /Dr\.?/i,
  /Doctor/i,
  /Atty\.?/i,
  /Attorney/i,
  /Prof\.?/i,
  /Professor/i,
  /Hon\.?/i,
  /Honorable/i,
  /Pres\.?/i,
  /President/i,
  /Gov\.?/i,
  /Governor/i,
  /Coach/i,
  /Ofc\.?/i,
  /Officer/i,
  /Msgr\.?/i,
  /Monsignor/i,
  /Sr\.?/i,
  /Sister\.?/i,
  /Br\.?/i,
  /Brother/i,
  /Supt\.?/i,
  /Superintendent/i,
  /Rep\.?/i,
  /Representative/i,
  /Sen\.?/i,
  /Senator/i,
  /Amb\.?/i,
  /Ambassador/i,
  /Treas\.?/i,
  /Treasurer/i,
  /Sec\.?/i,
  /Secretary/i,
  /Pvt\.?/i,
  /Private/i,
  /Cpl\.?/i,
  /Corporal/i,
  /Sgt\.?/i,
  /Sargent/i,
  /Adm\.?/i,
  /Administrative/i,
  /Maj\.?/i,
  /Major/i,
  /Capt\.?/i,
  /Captain/i,
  /Cmdr\.?/i,
  /Commander/i,
  /Lt\.?/i,
  /Lieutenant/i,
  /^Lt Col\.?$/i,
  /^Lieutenant Col$/i,
  /Col\.?/i,
  /Colonel/i,
  /Gen\.?/i,
  /General/i
]

Use Regexp.union to combine them:

union(pats_ary) → new_regexp

Return a Regexp object that is the union of the given patterns, i.e., will match any of its parts.

So this will do:

re = Regexp.union(prefixes)

then you use re as your regex:

if name.match(re)
  # ...
end
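
As a rough sketch with a shortened prefix list (the names anchored and has_title are hypothetical): Regexp.union keeps each pattern's own flags, but the combined regex is unanchored, so it can also match in the middle of a name. Anchoring it to the start of the string is one possible way to tighten it:

```ruby
prefixes = [/Mrs\.?/i, /Mr\.?/i, /Dr\.?/i]

re = Regexp.union(prefixes)

# Unanchored: matches anywhere, so "Drew" counts because it contains "Dr".
loose_hit = !!("Drew Barrymore" =~ re)   # => true

# Assumption: a title only counts at the start and must end at a word boundary.
anchored = /\A(?:#{re})\b/

has_title = lambda { |name| !!(name =~ anchored) }

has_title.call("Mrs. Kirsten Hess") # => true
has_title.call("DR HOUSE")          # => true
has_title.call("Drew Barrymore")    # => false
has_title.call("Kirsten Hess")      # => false
```

Whether the anchored form is appropriate depends on how the names are formatted in your data.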

How to use modules in Rails application

5 votes

I just created a module location.rb inside the /lib folder with the following contents:

module Location
  def self.my_zipcode()
    zip_code = "11215"
  end
end

And now in my controller I am trying to call the "my_zipcode" method:

class DirectoryController < ApplicationController
  def search
    require 'location'
    zip_code = Location.my_zipcode()
  end
end

But it throws an error:

undefined method `my_zipcode' for Location:Module

You might have to restart the rails server for it to recognize stuff in the lib directory.