Best ruby questions in March 2011

Protect sensitive attributes w/ declarative_authorization

14 votes

What's a cool way to protect attributes by role using declarative_authorization? For example, a user can edit his contact information but not his role.

My first inclination was to create multiple controller actions for different scenarios. I quickly realized how unwieldy this could become as the number of protected attributes grows. Doing this for user role is one thing, but I can imagine multiple protected attributes. Adding a lot of controller actions and routes doesn't feel right.

My second inclination was to create permissions around specific sensitive attributes and then wrap the form elements with the view helpers provided by declarative_authorization. However, the model and controller aspect of this is a bit foggy in my mind. Suggestions would be awesome.

Please advise on the best way to protect attributes by role using declarative_authorization ... thanks

EDIT 2011-05-22
Something similar is now in Rails as of 3.1RC https://github.com/rails/rails/blob/master/activerecord/test/cases/mass_assignment_security_test.rb so I would suggest going that route now.

ORIGINAL ANSWER
I just had to port what I had been using previously to Rails 3. I've never used declarative authorization specifically, but this is pretty simple and straightforward enough that you should be able to adapt to it.

Rails 3 added mass_assignment_authorizer, which makes this all really simple. I used that linked tutorial as a basis and just made it fit my domain model better, with class inheritance and grouping the attributes into roles.

In model

acts_as_accessible :admin => :all, :moderator => [:is_spam, :is_featured]
attr_accessible :title, :body # :admin, :moderator, and anyone else can set these

In controller

post.accessed_by(current_user.roles.collect(&:code)) # or however yours works
post.attributes = params[:post]

lib/active_record/acts_as_accessible.rb

# A way to have different attr_accessible attributes based on a Role
# @see ActsAsAccessible::ActMethods#acts_as_accessible
module ActiveRecord
  module ActsAsAccessible
    module ActMethods
      # In model
      # acts_as_accessible :admin => :all, :moderator => [:is_spam]
      # attr_accessible :title, :body
      #
      # In controller
      # post.accessed_by(current_user.roles.collect(&:code))
      # post.attributes = params[:post]
      #
      # Warning: This frequently wouldn't be the concern of the model where this is declared in,
      # but it is so much more useful to have it in there with the attr_accessible declaration.
      # OHWELL.
      #
      # @param [Hash] roles Hash of { :role => [:attr, :attr] }
      # @see acts_as_accessible_attributes
      def acts_as_accessible(*roles)
        roles_attributes_hash = Hash.new {|h,k| h[k] ||= [] }
        roles_attributes_hash = roles_attributes_hash.merge(roles.extract_options!).symbolize_keys

        if !self.respond_to? :acts_as_accessible_attributes
          attr_accessible
          write_inheritable_attribute :acts_as_accessible_attributes, roles_attributes_hash.symbolize_keys
          class_inheritable_reader    :acts_as_accessible_attributes

          # extend ClassMethods unless (class << self; included_modules; end).include?(ClassMethods)
          include InstanceMethods unless included_modules.include?(InstanceMethods)
        else # subclass
          new_acts_as_accessible_attributes = self.acts_as_accessible_attributes.dup
          roles_attributes_hash.each do |role,attrs|
            new_acts_as_accessible_attributes[role] += attrs
          end
          write_inheritable_attribute :acts_as_accessible_attributes, new_acts_as_accessible_attributes.symbolize_keys
        end
      end
    end

    module InstanceMethods
      # @param [Array, NilClass] roles Array of Roles or nil to reset
      # @return [Array, NilClass]
      def accessed_by(*roles)
        if roles.any?
          case roles.first
          when NilClass
            @accessed_by = nil
          when Array
            @accessed_by = roles.first.flatten.collect(&:to_sym)
          else
            @accessed_by = roles.flatten.flatten.collect(&:to_sym)
          end
        end
        @accessed_by
      end

      private
      # This is what really does the work in attr_accessible/attr_protected.
      # This override adds the acts_as_accessible_attributes for the current accessed_by roles.
      # @see http://asciicasts.com/episodes/237-dynamic-attr-accessible
      def mass_assignment_authorizer
        attrs = []
        if self.accessed_by
          self.accessed_by.each do |role|
            if self.acts_as_accessible_attributes.include? role
              if self.acts_as_accessible_attributes[role] == :all
                return self.class.protected_attributes
              else
                attrs += self.acts_as_accessible_attributes[role]
              end
            end
          end
        end
        super + attrs
      end
    end
  end
end

ActiveRecord::Base.send(:extend, ActiveRecord::ActsAsAccessible::ActMethods)

spec/lib/active_record/acts_as_accessible.rb

require 'spec_helper'

class TestActsAsAccessible
  include ActiveModel::MassAssignmentSecurity
  extend ActiveRecord::ActsAsAccessible::ActMethods
  attr_accessor :foo, :bar, :baz, :qux
  acts_as_accessible :dude => [:bar], :bra => [:baz, :qux], :admin => :all
  attr_accessible :foo
  def attributes=(values)
    sanitize_for_mass_assignment(values).each do |k, v|
      send("#{k}=", v)
    end
  end
end

describe TestActsAsAccessible do
  it "should still allow mass assignment to accessible attributes by default" do
    subject.attributes = {:foo => 'fooo'}
    subject.foo.should == 'fooo'
  end
  it "should not allow mass assignment to non-accessible attributes by default" do
    subject.attributes = {:bar => 'baaar'}
    subject.bar.should be_nil
  end
  it "should allow mass assignment to acts_as_accessible attributes when passed appropriate accessed_by" do
    subject.accessed_by :dude
    subject.attributes = {:bar => 'baaar'}
    subject.bar.should == 'baaar'
  end
  it "should allow mass assignment to multiple acts_as_accessible attributes when passed appropriate accessed_by" do
    subject.accessed_by :bra
    subject.attributes = {:baz => 'baaaz', :qux => 'quuux'}
    subject.baz.should == 'baaaz'
    subject.qux.should == 'quuux'
  end
  it "should allow multiple accessed_by to be specified" do
    subject.accessed_by :dude, :bra
    subject.attributes = {:bar => 'baaar', :baz => 'baaaz', :qux => 'quuux'}
    subject.bar.should == 'baaar'
    subject.baz.should == 'baaaz'
    subject.qux.should == 'quuux'
  end
  it "should allow :all access" do
    subject.accessed_by :admin
    subject.attributes = {:bar => 'baaar', :baz => 'baaaz', :qux => 'quuux'}
    subject.bar.should == 'baaar'
    subject.baz.should == 'baaaz'
    subject.qux.should == 'quuux'
  end
end
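
Stripped of the Rails plumbing, the idea in the answer above is a base whitelist plus extra attributes unlocked per role. Here is a minimal plain-Ruby sketch of that lookup. The names BASE_ATTRIBUTES, ROLE_ATTRIBUTES and permitted_attributes are hypothetical and are not part of Rails or declarative_authorization:

```ruby
# Hypothetical, framework-free sketch of role-based attribute whitelisting.
BASE_ATTRIBUTES = [:title, :body]            # anyone may set these
ROLE_ATTRIBUTES = {
  :admin     => :all,                        # admins may set everything
  :moderator => [:is_spam, :is_featured]
}

# Returns only the key/value pairs the given roles are allowed to assign.
def permitted_attributes(params, roles)
  roles = roles.map(&:to_sym)
  return params.dup if roles.any? { |r| ROLE_ATTRIBUTES[r] == :all }

  allowed = BASE_ATTRIBUTES.dup
  roles.each { |r| allowed.concat(Array(ROLE_ATTRIBUTES[r])) }
  params.select { |key, _| allowed.include?(key.to_sym) }
end

params = { :title => "Hi", :is_spam => true, :role => "admin" }
permitted_attributes(params, [])            # only :title survives
permitted_attributes(params, [:moderator])  # :title and :is_spam survive
permitted_attributes(params, [:admin])      # everything survives
```

Unknown roles simply unlock nothing, so a typo in a role name fails closed rather than open.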

Web application admin generators

13 votes

Ever since I used Symfony 1.x's admin generator, I have found this kind of tool really useful for prototyping applications, showing something to customers very quickly, etc.

Now for Symfony2, an admin generator does not seem to be a priority (see here and here).

Django's admin generator seems very interesting...

Which web application admin generator (any language / technology) would you recommend (pros / cons) ?

Thanks.

Django's automatic admin app is excellent. Once you've written your models, it automatically creates a full-featured admin app around them where you can create, update and delete records. It's also extensible and customizable for just about whatever you need.

Here's a pretty good overview of it. Django (and Python) is intuitive and satisfying to work with -- I highly recommend that you set it up, play with it, and see how well it works.

What blocks Ruby, Python to get Javascript V8 speed?

11 votes

Are there any Ruby / Python features that are blocking implementation of optimizations (e.g. inline caching) V8 engine has?

Python is co-developed by Google guys so it shouldn't be blocked by software patents.

Or is this rather a matter of the resources Google has put into the V8 project?

What blocks Ruby, Python to get Javascript V8 speed?

Nothing.

Well, okay: money. (And time, people, resources, but if you have money, you can buy those.)

V8 has a team of brilliant, highly-specialized, highly-experienced (and thus highly-paid) engineers working on it, that have decades of experience (I'm talking individually – collectively it's more like centuries) in creating high-performance execution engines for dynamic OO languages. They are basically the same people who also created the Sun HotSpot JVM (among many others).

Lars Bak, the lead developer, has been literally working on V8 for 25 years, which is basically his entire (professional) life (and V8's, too). Some of the people writing Ruby VMs aren't even 25 years old.

Are there any Ruby / Python features that are blocking implementation of optimizations (e.g. inline caching) V8 engine has?

Given that at least IronRuby, JRuby, MagLev, MacRuby and Rubinius have either monomorphic (IronRuby) or polymorphic inline caching, the answer is obviously no.

Modern Ruby implementations already do a great deal of optimizations. For example, for certain operations, Rubinius's Hash class is faster than YARV's. Now, this doesn't sound terribly exciting until you realize that Rubinius's Hash class is implemented in 100% pure Ruby, while YARV's is implemented in 100% hand-optimized C.

So, at least in some cases, Rubinius can generate better code than GCC!

Or is this rather a matter of the resources Google has put into the V8 project?

Yes. Not just Google. V8 is 25 years old now. The people who are working on V8 also created the Self VM (to this day one of the fastest dynamic OO language execution engines ever created), the Animorphic Smalltalk VM (to this day one of the fastest Smalltalk execution engines ever created), the HotSpot JVM (the fastest JVM ever created, probably the fastest VM period) and OOVM (one of the most efficient Smalltalk VMs ever created).

In fact, Lars Bak, the lead developer of V8, worked on every single one of those, plus a few others.

Ruby vs Lua as scripting language for C++

11 votes

I am currently building a game server (not an engine), and I want it to be extendable, like a plugin system.
The solution I found is to use a scripting language. So far, so good.

I'm not sure if I should use Ruby or Lua. Lua is easier to embed, but Ruby has a larger library, and better syntax (in my opinion). The problem is, there is no easy way I found to use Ruby as scripting language with C++, whereas it's very easy with Lua.

Thoughts about this? Suggestions for using Ruby as a scripting language (I tried SWIG, but it isn't nearly as neat as using Lua)?

Thanks.

I've used Lua extensively in the past.

Luabind is really easy to use, there is no need for an external generator like SWIG, the doc is great. Compile times remain decent.

Biggest problem I've seen: Lua is mostly ... write-only. You don't really have classes, only associative arrays with a bit of syntactic sugar (object['key'] can be written object.key), so you easily end up adding a 'member' in an obscure function, completely forget about it, and have side effects later.

For this reason, and this reason only, I'd prefer Python. Boost::Python is the basis for Luabind so both have a similar API (Luabind used to be slightly easier to build but not anymore). In terms of functionality, they are quite equivalent.

Not directly related : None of these can be reliably used in a multithreaded environment (so this depends on the complexity of your server).

  • N Python threads: the GIL (Global Interpreter Lock) is in your way. Each and every time you use a variable in a thread, it's locked, so it kinda ruins the point, except for long I/O operations and calls to C functions.
  • lua has coroutines, but they aren't parallelisable.
  • Ruby threads aren't really threads, but similar to Lua's coroutines

Note that you can still create one environment for each thread, but they won't be able to communicate (except through some C++ machinery). This is especially easy in Lua.

Rails for Zombies Lab 4 > Exercise 3

9 votes

Hi!
I'm stuck on the third exercise of the fourth Rails for Zombies lab. This is my task: create an action that will create a new Zombie and then redirect to the created zombie's show page. I've got the following params hash:

params = { :zombie => { :name => "Greg", :graveyard => "TBA" } }

I wrote the following code as a solution:

def create
   @zombie = Zombie.create   
   @zombie.name = params[ :zombie [ :name ] ]   
   @zombie.graveyard = params[ :zombie [ :graveyard ] ]
   @zombie.save   

   redirect_to(create_zombie_path)
end

But when I submit it I get the following error:
#<TypeError: can't convert Symbol into Integer>

I know that I made a mistake but I cannot figure out where. Please help me.

def create
   @zombie = Zombie.create(params[:zombie])
   redirect_to @zombie
end
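
The TypeError in the question comes from the bracket nesting: params[ :zombie [ :name ] ] indexes the symbol :zombie with :name instead of indexing the hash twice. A small sketch of the difference, with a plain Hash standing in for the Rails params object (an assumption made so the snippet runs outside Rails):

```ruby
# A plain Hash stands in for the Rails params object here.
params = { :zombie => { :name => "Greg", :graveyard => "TBA" } }

# Wrong: this calls Symbol#[] on :zombie with :name as the index,
# which raises a TypeError before params is ever consulted.
begin
  params[:zombie[:name]]
rescue TypeError => e
  error = e
end

# Right: index the outer hash first, then the inner one.
name      = params[:zombie][:name]
graveyard = params[:zombie][:graveyard]
```

The answer above avoids the per-attribute lookups altogether by handing the whole nested hash, params[:zombie], to Zombie.create.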

Why is Net::HTTP's set_debug_output dangerous if used in production?

9 votes

There is a very useful method in Net::HTTP library that gives ability to debug HTTP requests.

Here is what documentation says about that:

set_debug_output(output)

WARNING This method causes serious security hole. Never use this method in production code.

Set an output stream for debugging.

http://ruby-doc.org/stdlib/libdoc/net/http/rdoc/classes/Net/HTTP.html#M001371

What security hole is mentioned here?

Looking at the code, there is no other security hole, except for the fact that everything in the HTTP protocol is passed to the stream you provide. If you don't take care and the output ends up somewhere you don't expect, this could expose the internal workings of your application.

IMHO, the statement in the documentation is pretty harsh and doesn't provide a good explanation of the security hole. I think the comment should read something along the lines of:

Be careful and sit on your hands before you type, since setting a debug_output will expose the complete HTTP protocol (including possible sensitive information) to the stream that is passed in.

Long story short: there is no "hidden" security hole.

Irrational number representation in any programming language?

9 votes

Does anyone know of an irrational number representation type/object/class/whatever in any programming language?

All suggestions welcome.

Simply put, if I have two irrational objects, both representing the square root of five, and I multiply those objects, I want to get back the integer five, not float 4 point lots o' 9s.

Specifically, I need the representation to be able to collect terms, not just resolve every time to an integer/float. For instance, if I want to add the square root of five to one, I don't want it to return some approximation integer/float, I want it to return an object that I can add/multiply with another irrational object, such that I can tell the object to resolve at the latest time possible to minimize the float approximation error.

Thanks much!

What you are looking for is called symbolic mathematics. You might want to try some computer algebra system like Maxima, Maple or Mathematica. There are also libraries for this purpose, for example the SymPy library for Python.
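
To make the idea concrete, here is a deliberately tiny sketch of exact arithmetic on numbers of the form a + b*sqrt(n). It is nowhere near a computer algebra system, and the Surd class and its methods are made up for this illustration. It also assumes both operands share the same n:

```ruby
# Illustrative only: exact arithmetic on numbers of the form a + b*sqrt(n).
# The class name and API are made up for this sketch.
class Surd
  attr_reader :a, :b, :n

  def initialize(a, b, n)
    @a, @b, @n = Rational(a), Rational(b), n
  end

  def +(other)
    other = coerce_operand(other)
    Surd.new(a + other.a, b + other.b, n)
  end

  # (a + b*sqrt(n)) * (c + d*sqrt(n)) = (ac + bdn) + (ad + bc)*sqrt(n)
  def *(other)
    other = coerce_operand(other)
    Surd.new(a * other.a + b * other.b * n, a * other.b + b * other.a, n)
  end

  def ==(other)
    other = coerce_operand(other)
    a == other.a && b == other.b
  end

  # Resolve to a Float only when asked, as late as possible.
  def to_f
    a.to_f + b.to_f * Math.sqrt(n)
  end

  private

  # Treat plain numbers as a + 0*sqrt(n).
  def coerce_operand(x)
    x.is_a?(Surd) ? x : Surd.new(x, 0, n)
  end
end

root5 = Surd.new(0, 1, 5)
root5 * root5 == 5     # true: exactly five, no rounding
sum = root5 + 1        # still exact: 1 + sqrt(5)
(sum * sum).to_f       # 6 + 2*sqrt(5), approximated only at the very end
```

The coefficients are kept as Rationals and the square root is never evaluated until to_f is called, which is the "resolve at the latest time possible" behaviour asked for in the question.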

What is the %w "thing" in ruby?

9 votes

I'm referring to the %w operator/constructor/whatever you may call it, used like this:

%w{ foo bar baz }
=> ["foo", "bar", "baz"]

I have several questions about it:

  • What is the proper name of that %w "thing"? Operator? Literal? Constructor?
  • Does it matter whether I use {}s instead of []s?
  • Are there any other things like this (for example, one that gives you an array of symbols instead)?
  • Can they be nested (one %w inside another %w, in order to create nested arrays)?
  • Where can I find documentation about it?

Unsure about the "official" documentation but this is pretty good : http://en.wikibooks.org/wiki/Ruby_Programming/Syntax/Literals#The_.25_Notation

Whether you use {} [] () or <> does not matter except if your string contains this character e.g.:

%q{a closing parenthesis: ")"}

The syntax is pretty complex so remembering every variant is not very useful, but it can come in handy when you are hacking quickly and want to avoid taking care of escape characters manually.
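
A few of the variants side by side, as a quick sketch. The variable names are arbitrary, and %i is only available in Ruby 2.0 and later:

```ruby
# Any matching bracket pair (or a repeated character) works as the delimiter.
a = %w{ foo bar baz }
b = %w[ foo bar baz ]
c = %w( foo bar baz )
d = %w| foo bar baz |      # all four are the same array of three strings

# %W is the interpolating variant of %w.
name = "baz"
e = %W[ foo bar #{name} ]

# %i builds an array of symbols (Ruby 2.0 and later).
f = %i[ foo bar baz ]

# No nesting: an inner %w does not create an inner array,
# its pieces are just more words in the outer one.
g = %w[ a %w[ b c ] ]
```

So the delimiter choice is cosmetic, there is a symbol-array sibling in newer Rubies, and nested arrays still have to be written with ordinary [ ] literals.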

SessionsHelper in railstutorial.org: Should helpers be general-purpose modules for code not needed in views?

9 votes

railstutorial.org has a suggestion which strikes me as a little odd.

It suggests this code:

class ApplicationController < ActionController::Base 
  protect_from_forgery 
  include SessionsHelper 
end 

The include SessionsHelper makes the methods available from ApplicationController, yes, but it makes them available in any view, as well. I understand that authentication/authorization is cross-cutting, but is this really the best place?

That seems to me to be potentially too broad of a scope. Putting code which implements, say, a before_filter which conditionally redirects (as the railstutorial.org example does) in a module which more commonly contains view helpers seems surprising.

Would functionality not strictly needed in views be better placed in ApplicationController or elsewhere?

Or am I just thinking too much about this?

Indeed, your feeling is correct imho.

I would implement this the other way round: add the functions sign_in and current_user to ApplicationController (or if you really want to: in a separate module defined in lib and include it), and then make sure that the current_user method is available in the view.

In short:

class ApplicationController < ActionController::Base

  # helper_method exposes a controller method to the views
  helper_method :current_user

  def sign_in

  end

  def current_user
    @current_user ||= user_from_remember_token
  end
end

Of course, if you have a lot of code to place into your ApplicationController it can get messy. In that case I would create a file lib/session_management.rb:

module SessionManagement
  def self.included(base)
    # expose current_user to the views of any controller that includes this module
    base.helper_method :current_user
  end

  def sign_in
    # ...
  end

  def current_user
    # ...
  end
end

and inside your controller you can then just write:

class ApplicationController
  include SessionManagement
end

Hope this helps.

What will the major/minor differences be between ruby 1.9.2 and ruby 2.0?

8 votes

I've been told that ruby 1.9.2 is ruby 2.0 but ruby 1.9.3 is slated to be released in the near future and it will contain some performance enhancements.

So what are they planning for 2.0? Will it be much different than ruby 1.9.x?

Two features that are already implemented in YARV, and which will most likely end up in Ruby 2.0, are traits (mix) and Module#prepend.

The mix method, unlike the current include method, takes a list of modules, and mixes all of them in at the same time, making sure that they have no conflicting methods. It also gives you a way to easily resolve conflicts, if e.g. two modules you want to mix in define the same method. So, basically, while the include method allows you to treat a module as a mixin, the mix method allows you to treat a module as a trait.

Module#prepend mixes a module into a class or module, again just like include does, but instead of inserting it into the inheritance chain just above the class, it inserts it just below the class. This means that methods in the module can override methods in the class, and they can delegate to the overridden methods with super, neither of which is possible when using include. This basically makes alias_method_chain obsolete.
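
A short sketch of the prepend behaviour described above, as it works in Ruby 2.0 and later. The Loud and Greeter names are made up for the example:

```ruby
module Loud
  def greet
    super.upcase + "!"     # super reaches the class's own greet
  end
end

class Greeter
  prepend Loud             # Loud sits in front of Greeter in the method lookup

  def greet
    "hello"
  end
end

Greeter.new.greet          # => "HELLO!"
Greeter.ancestors.first    # => Loud
```

With include Loud instead, Greeter#greet would be found first and the module's version would never run.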

One feature that has been discussed for a couple of months (or 10 years, depending on how you count) is Refinements. There has been discussion for over 10 years now about adding a way to do scoped, safe monkey patching in Ruby, i.e. a way where I can monkey patch a core class, but only my code sees that monkey patch and other code doesn't. For many years, the frontrunner for that kind of safe monkey patching was Selector Namespaces; however, more recently Classboxes have been getting a lot of attention, and even more recently a prototype implementation and specification of Refinements, a variant of Classboxes, was put forward.

Generally speaking, the big theme of Ruby 2.0 is scalability: scaling up to bigger teams, bigger codebases, bigger problem sizes, bigger machines, more cores. But also scaling down to smaller machines like embedded devices.

The three features I mentioned above are for scaling to bigger teams and bigger codebases. Some proposed features for scaling to bigger problem sizes and more cores are parallel collections and parallel implementations of Enumerable methods such as map, as well as better concurrency abstractions such as futures, promises, agents, actors, channels, join patterns or something like that.

STI, one controller

7 votes

Hi! I'm new to rails and I'm kind of stuck with this design problem, that might be easy to solve, but I don't get anywhere: I have two different kinds of advertisements: highlights and bargains. Both of them have the same attributes: title, description and one image (with paperclip). They also have the same kind of actions to apply on them: index, new, edit, create, update and destroy.

I set a STI like this:

Ad Model: ad.rb

class Ad < ActiveRecord::Base
end

Bargain Model: bargain.rb

class Bargain < Ad
end

Highlight Model: highlight.rb

class Highlight < Ad
end

The problem is that I'd like to have only one controller (AdsController) that executes the actions I said on bargains or highlights depending on the URL, say www.foo.com/bargains[/...] or www.foo.com/highlights[/...].

For example:

  • GET www.foo.com/highlights => a list of all the ads that are highlights.
  • GET www.foo.com/highlights/new => form to create a new highlight etc...

How can I do that?

Thanks!

Hi!

First, add some new routes:

resources :highlights, :controller => "ads", :type => "Highlight"
resources :bargains, :controller => "ads", :type => "Bargain"

And fix some actions in AdsController. For example:

def new
  @ad = Ad.new()
  @ad.type = params[:type]
end

For a better approach to all of this controller work, see this comment.

That's all. Now you can go to localhost:3000/highlights/new and a new Highlight will be initialized.

Index action can look like this:

def index
  @ads = Ad.where(:type => params[:type])
end

Go to localhost:3000/highlights and the list of highlights will appear.
The same goes for bargains: localhost:3000/bargains

etc

URLS

<%= link_to 'index', :highlights %>
<%= link_to 'new', [:new, :highlight] %>
<%= link_to 'edit', [:edit, @ad] %>
<%= link_to 'destroy', @ad, :method => :delete %>

for being polymorphic :)

<%= link_to 'index', @ad.class %>

ruby should I use self. or @

7 votes

Here is my ruby code

class Demo
  attr_accessor :lines

  def initialize(lines)
    self.lines = lines
  end
end

In the above code I could have used

    @lines = lines

Mostly I see people using @ in initialize method. Is there a preferred way of doing among these two and why?

When you use @lines, you are accessing the instance variable itself. self.lines actually goes through the lines method of the class; likewise, self.lines = x goes through the lines= method. So use @ when you want to access the variable directly, and self. when you want to access via the method.

To directly answer your question, normally you want to set the instance variables directly in your initialize method, but it depends on your use-case.
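
The difference only becomes visible when the writer does more than plain assignment. Here is a sketch with a custom lines= method. The stripping behaviour and the use_writer flag are invented purely to make the two paths distinguishable:

```ruby
class Demo
  attr_reader :lines

  # A custom writer, so going through it has a visible effect.
  def lines=(value)
    @lines = value.map(&:strip)
  end

  def initialize(lines, use_writer)
    if use_writer
      self.lines = lines   # calls lines=, so the values get stripped
    else
      @lines = lines       # sets the variable directly, the writer is bypassed
    end
  end
end

Demo.new([" a ", " b "], true).lines   # => ["a", "b"]
Demo.new([" a ", " b "], false).lines  # => [" a ", " b "]
```

Note that a bare lines = lines inside initialize would do neither: without self. Ruby treats it as a local variable assignment.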

What does %{} do in Ruby?

7 votes

In Matt's post about drying up cucumber tests, Aslak suggests the following.

When I have lots of quotes, I prefer this:

Given %{I enter “#{User.first.username}” in “username”}

What is the %{CONTENT} construct called? Will someone mind referencing it in some documentation? I'm not sure how to go about looking it up.

There's also the stuff about %Q. Is that equivalent to just %? What of the curly braces? Can you use square braces? Do they function differently?

Finally, what is the #{<ruby stuff to be evaluated>} construct called? Is there a reference to that in documentation somewhere, too?

  1. "Percent literals" is usually a good search term for finding information about this.

  2. #{} is called "string interpolation".
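
To cover the remaining parts of the question, here is a quick sketch of how the variants relate. The user variable is just a stand-in for User.first.username:

```ruby
user = "greg"

a = %{I enter "#{user}" in "username"}    # bare % behaves like %Q
b = %Q{I enter "#{user}" in "username"}   # double-quoted semantics: interpolates
c = %q{I enter "#{user}" in "username"}   # single-quoted semantics: no interpolation
d = %[I enter "#{user}" in "username"]    # square brackets work the same way

a == b   # => true
a == d   # => true
a == c   # => false, c still contains the literal text #{user}
```

The practical benefit is the one Aslak mentions: the double quotes inside the string need no escaping.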

How can I find open source projects to contribute to (Ruby, Rails)

7 votes

I'm a Ruby on Rails developer with a bit of time on my hands.

I would like to use this time to give back and learn by contributing to an open source project.

I'm not a top notch programmer and would like to start small.

Where can I find small open source projects in Ruby or Rails? And how can I contribute?

Alex

My advice is to look at the projects you use and really love, then get on their message boards and see what's needed. This is usually when people say "send me a pull request" or "send me a patch"

You can also look at a project's github "issues" tab. Any of these are generally something that can be worked on. You'll fork the project, make changes (and add tests), then send the maintainer a pull-request.

Anyway, long story short: work on something you love using.

Ruby: Self reference in hash

7 votes

Is it possible to reference one element in a hash within another element in the same hash?

# Pseudo code
foo = { :world => "World", :hello => "Hello #{foo[:world]}" }
foo[:hello] # => "Hello World"

Indirectly perhaps...

foo = { :world => 'World', :hello => lambda { "Hello #{foo[:world]}" }}

puts foo[:hello].call
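
The pseudo code in the question fails because foo is still nil while the hash literal is being built. The lambda works because its body only runs when called, after the assignment has finished. If a lazily evaluated value is not needed, building the hash in two steps is a simple alternative (a sketch, not part of the original answer):

```ruby
foo = { :world => "World" }
foo[:hello] = "Hello #{foo[:world]}"   # foo already exists at this point

foo[:hello]  # => "Hello World"
```

The trade-off is that the string is computed once: changing foo[:world] later does not update foo[:hello], whereas the lambda version re-reads it on every call.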

Is there a clean way to avoid calling a method on nil in a nested params hash?

7 votes

I'm interested in getting the nested 'name' parameter of a params hash. Calling something like

params[:subject][:name]

throws an error when params[:subject] is empty. To avoid this error I usually write something like this:

if params[:subject] && params[:subject][:name]

Is there a cleaner way to implement this?

IMHO the best solution by far is Ick's maybe. You don't need to significantly change your code, just intersperse maybe proxies when necessary, explicit yet compact:

params[:subject].maybe[:name]

The same author (raganwald) also wrote the (probably best known) andand, but while it works very similarly, I think that writing maybe is much nicer.
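
For comparison, here are two gem-free alternatives, sketched with plain Hashes standing in for the Rails params object. The first works on any Ruby version; Hash#dig needs Ruby 2.3 or later:

```ruby
params = { :subject => { :name => "Math" } }
empty  = {}

# Any Ruby version: fall back to an empty hash when the outer key is missing.
(params[:subject] || {})[:name]   # => "Math"
(empty[:subject]  || {})[:name]   # => nil

# Ruby 2.3 and later: Hash#dig walks nested keys and stops at the first nil.
params.dig(:subject, :name)       # => "Math"
empty.dig(:subject, :name)        # => nil
```

Neither form raises when the outer key is absent, which is the failure the question is trying to avoid.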

Rails article helper - "a" or "an"

6 votes

Does anyone know of a Rails Helper which can automatically prepend the appropriate article to a given string? For instance, if I pass in "apple" to the function it would turn out "an apple", whereas if I were to send in "banana" it would return "a banana"

I already checked the Rails TextHelper module but could not find anything. Apologies if this is a duplicate but it is admittedly a hard answer to search for...

None that I know of, but it seems simple enough to write a helper for this, right? Off the top of my head:

def indefinite_articlerize(params_word)
    %w(a e i o u).include?(params_word[0].downcase) ? "an #{params_word}" : "a #{params_word}"
end

hope that helps

edit 1: Also found this thread with a patch that might help you bulletproof this more https://rails.lighthouseapp.com/projects/8994/tickets/2566-add-aan-inflector-indefinitize

Sort values using a specific collation in Ruby/Rails

6 votes

Is it possible to sort an array of values using a specific collation in Ruby? I have a need to sort according to the da_DK collation.

Given the array %w(Aarhus Aalborg Assens) I would like to have ['Assens', 'Aalborg', 'Aarhus'] back which is the correct order in Danish.

The standard sort method

%w(Aarhus Aalborg Assens).sort

returns something that looks like ASCII order (at any rate, not the Danish order):

["Aalborg", "Aarhus", "Assens"]

The environment is both Snow Leopard and linux running ruby 1.9.2 and Rails 3.0.5.

I found the ffi-locale on Github and that solves my problem as far as I can see.

It allows the following code:

FFILocale::setlocale FFILocale::LC_COLLATE, 'da_DK.UTF-8'
%w(Aarhus Aalborg Assens).sort { |a,b| FFILocale::strcoll(a, b) }

Which returns the correct result:

=> ["Assens", "Aalborg", "Aarhus"]

I haven't investigated performance yet, but it calls out to native code so it ought to be faster than Ruby character replacement code...

Update
It is not perfect :( It does not work properly on Snow Leopard - it seems that the strcoll function is broken on OS X and has been for some time. It is annoying to me, but the main platform for deployment is Linux - where it works - so it is currently my preferred solution.

Merge array of hashes to get hash of arrays of values

6 votes

This is the opposite of Turning a Hash of Arrays into an Array of Hashes in Ruby.

Elegantly and/or efficiently turn an array of hashes into a hash where the values are arrays of all values:

hs = [
  { a:1, b:2 },
  { a:3, c:4 },
  { b:5, d:6 }
]
collect_values( hs )
#=> { :a=>[1,3], :b=>[2,5], :c=>[4], :d=>[6] }

This terse code almost works, but fails to create an array when there are no duplicates:

def collect_values( hashes )
  hashes.inject({}){ |a,b| a.merge(b){ |_,x,y| [*x,*y] } }
end
collect_values( hs )
#=> { :a=>[1,3], :b=>[2,5], :c=>4, :d=>6 }

This code works, but can you write a better version?

def collect_values( hashes )
  # Requires Ruby 1.8.7+ for Object#tap
  Hash.new{ |h,k| h[k]=[] }.tap do |result|
    hashes.each{ |h| h.each{ |k,v| result[k]<<v } }
  end
end

Solutions that only work in Ruby 1.9 are acceptable, but should be noted as such.

Update: Benchmarking Results

Here are the results of benchmarking the various answers below (and a few more of my own), using three different arrays of hashes:

  • one where each hash has distinct keys, so no merging ever occurs:
    [{:a=>1}, {:b=>2}, {:c=>3}, {:d=>4}, {:e=>5}, {:f=>6}, {:g=>7}, ...]

  • one where every hash has the same key, so maximum merging occurs:
    [{:a=>1}, {:a=>2}, {:a=>3}, {:a=>4}, {:a=>5}, {:a=>6}, {:a=>7}, ...]

  • and one that is a mix of unique and shared keys:
    [{:c=>1}, {:d=>1}, {:c=>2}, {:f=>1}, {:c=>1, :d=>1}, {:h=>1}, {:c=>3}, ...]

               user     system      total        real
Phrogz 2a  0.577000   0.000000   0.577000 (  0.576000)
Phrogz 2b  0.624000   0.000000   0.624000 (  0.620000)
Glenn 1    0.640000   0.000000   0.640000 (  0.641000)
Phrogz 1   0.671000   0.000000   0.671000 (  0.668000)
Michael 1  0.702000   0.000000   0.702000 (  0.700000)
Michael 2  0.717000   0.000000   0.717000 (  0.726000)
Glenn 2    0.765000   0.000000   0.765000 (  0.764000)
fl00r      0.827000   0.000000   0.827000 (  0.836000)
sawa       0.874000   0.000000   0.874000 (  0.868000)
Tokland 1  0.873000   0.000000   0.873000 (  0.876000)
Tokland 2  1.077000   0.000000   1.077000 (  1.073000)
Phrogz 3   2.106000   0.093000   2.199000 (  2.209000)

The fastest code is this method that I added:

def collect_values(hashes)
  {}.tap{ |r| hashes.each{ |h| h.each{ |k,v| (r[k]||=[]) << v } } }
end

I've accepted glenn mcdonald's answer as it was competitive in terms of speed, reasonably terse, but (most importantly) because it pointed out the danger of using a Hash with a self-modifying default proc for convenient construction, as this may introduce bad changes when the user is indexing it later on.

Finally, here's the benchmark code, in case you want to run your own comparisons:

require 'prime'   # To generate the third hash
require 'facets'  # For tokland1's map_by
AZSYMBOLS = (:a..:z).to_a
TESTS = {
  '26 Distinct Hashes'   => AZSYMBOLS.zip(1..26).map{|a| Hash[*a] },
  '26 Same-Key Hashes'   => ([:a]*26).zip(1..26).map{|a| Hash[*a] },
  '26 Mixed-Keys Hashes' => (2..27).map do |i|
    factors = i.prime_division.transpose
    Hash[AZSYMBOLS.values_at(*factors.first).zip(factors.last)]
  end
}

def phrogz1(hashes)
  Hash.new{ |h,k| h[k]=[] }.tap do |result|
    hashes.each{ |h| h.each{ |k,v| result[k]<<v } }
  end
end
def phrogz2a(hashes)
  {}.tap{ |r| hashes.each{ |h| h.each{ |k,v| (r[k]||=[]) << v } } }
end
def phrogz2b(hashes)
  hashes.each_with_object({}){ |h,r| h.each{ |k,v| (r[k]||=[]) << v } }
end
def phrogz3(hashes)
  result = hashes.inject({}){ |a,b| a.merge(b){ |_,x,y| [*x,*y] } }
  result.each{ |k,v| result[k] = [v] unless v.is_a? Array }
end
def glenn1(hs)
  hs.reduce({}) {|h,pairs| pairs.each {|k,v| (h[k] ||= []) << v}; h}
end
def glenn2(hs)
  hs.map(&:to_a).flatten(1).reduce({}) {|h,(k,v)| (h[k] ||= []) << v; h}
end
def fl00r(hs)
  h = Hash.new{|h,k| h[k]=[]}
  hs.map(&:to_a).flatten(1).each{|v| h[v[0]] << v[1]}
  h
end
def sawa(a)
  a.map(&:to_a).flatten(1).group_by{|k,v| k}.each_value{|v| v.map!{|k,v| v}}
end
def michael1(hashes)
  h = Hash.new{|h,k| h[k]=[]}
  hashes.each_with_object(h) do |h, result|
    h.each{ |k, v| result[k] << v }
  end
end
def michael2(hashes)
  h = Hash.new{|h,k| h[k]=[]}
  hashes.inject(h) do |result, h|
    h.each{ |k, v| result[k] << v }
    result
  end
end
def tokland1(hs)
  hs.map(&:to_a).flatten(1).map_by{ |k, v| [k, v] }
end
def tokland2(hs)
  Hash[hs.map(&:to_a).flatten(1).group_by(&:first).map{ |k, vs|
    [k, vs.map{|o|o[1]}]
  }]
end

require 'benchmark'
N = 10_000
Benchmark.bm do |x|
  x.report('Phrogz 2a'){ TESTS.each{ |n,h| N.times{ phrogz2a(h) } } }
  x.report('Phrogz 2b'){ TESTS.each{ |n,h| N.times{ phrogz2b(h) } } }
  x.report('Glenn 1  '){ TESTS.each{ |n,h| N.times{ glenn1(h)   } } }
  x.report('Phrogz 1 '){ TESTS.each{ |n,h| N.times{ phrogz1(h)  } } }
  x.report('Michael 1'){ TESTS.each{ |n,h| N.times{ michael1(h) } } }
  x.report('Michael 2'){ TESTS.each{ |n,h| N.times{ michael2(h) } } }
  x.report('Glenn 2  '){ TESTS.each{ |n,h| N.times{ glenn2(h)   } } }
  x.report('fl00r    '){ TESTS.each{ |n,h| N.times{ fl00r(h)    } } }
  x.report('sawa     '){ TESTS.each{ |n,h| N.times{ sawa(h)     } } }
  x.report('Tokland 1'){ TESTS.each{ |n,h| N.times{ tokland1(h) } } }
  x.report('Tokland 2'){ TESTS.each{ |n,h| N.times{ tokland2(h) } } }
  x.report('Phrogz 3 '){ TESTS.each{ |n,h| N.times{ phrogz3(h)  } } }

end

Take your pick:

hs.reduce({}) {|h,pairs| pairs.each {|k,v| (h[k] ||= []) << v}; h}

hs.map(&:to_a).flatten(1).reduce({}) {|h,(k,v)| (h[k] ||= []) << v; h}

I'm strongly against messing with the defaults for hashes, as the other suggestions do, because then checking for a value modifies the hash, which seems very wrong to me.
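
The side effect in question is easy to reproduce. A minimal sketch, with arbitrary variable names:

```ruby
with_default = Hash.new { |h, k| h[k] = [] }
with_default[:a] << 1

with_default[:missing]   # a mere lookup...
with_default.keys        # => [:a, :missing]  ...has added a key

plain = {}
(plain[:a] ||= []) << 1
plain[:missing]          # => nil
plain.keys               # => [:a]
```

Anyone who later receives the first hash and checks it with [] will silently grow it, which is why the ||= form is the safer thing to return from a helper like collect_values.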

What's a Rails plugin, or Ruby gem, to automatically fix English grammar?

5 votes

Facebook just re-launched Comments, with an automatic grammar fixing feature.

What does the grammar filter do?

Adds punctuation (e.g. periods at the end of sentences)
Trims extra whitespace
Auto cases words (e.g. capitalize the first word of a sentence)
Expands slang words (e.g. plz becomes please)
Adds a space after punctuation (e.g. Hi,Cat would become Hi, Cat)
Fix common grammar mistakes (e.g. convert 'dont' to 'don't')

What is an equivalent plugin or gem?

I don't know of anything with those particular features.

However, you might look at Ruby LinkParser, which is a Ruby wrapper for the Link Grammar parser developed by academics and used by the Abiword project for grammar checking. (Note that "link" in Link Grammar parser doesn't refer to HTML links, but rather to a structure that describes English syntax as a set of links between words.)

Here's another interesting checker, written in Ruby, which is designed to check LaTeX files for some of the problems you mention (plus others).