Best ruby-on-rails-3 questions in May 2011

How to prepare for data loss in a production website?

13 votes

I am building an app that is fast moving into production and I am concerned about the possibility that due to hacking, some silly personal error (like running rake db:schema:load or rake db:rollback) or other circumstance we may suffer data loss in one database table or even across the system.

While I don't find it likely that the above will happen, I would be remiss in not being prepared in case it ever does.

I am using Heroku's PG Backups (which is to be replaced with something else this month), and I also run automated daily backups to S3: http://trevorturk.com/2010/04/14/automated-heroku-backups/, successfully generating .dump files.

What is the correct way to deal with data loss on a production app?

  1. How would I restore the .dump file in case I need to? Can I do a selective restore if a small part of the system is hit?
  2. In case a selective restore is not possible: assume one table loses data 4 hours after the last backup. Would fixing the lost table then require rolling back 4 hours of users' activity? Is there any good solution to this?
  3. What is the best way to support users through the inconvenience if something like this happens?

A full DR (disaster recovery) solution requires the following:

  1. Multisite. If a fire, flood, Osama Bin Laden or what have you strikes the Amazon (or is it Salesforce?) data center that Heroku uses, you want to be sure that your data is safe elsewhere.
  2. Ongoing replication of the data to a separate site (or sites). That means every transaction written to your database on one site is replicated within seconds to the mirror database on the other site. Most RDBMSs have mechanisms that let you do master-slave replication like that.
  3. The same goes for anything you put on a filesystem outside of the database, such as images, XML configuration files etc. S3 is a good solution here - they replicate everything to multiple data centers for you.
  4. It won't hurt to create periodic (daily or so) dumps of the database and store them separately (e.g. on S3). This helps you recover from data corruption that propagates to the slave DBs.
  5. Automate the process of data recovery. You want this to just work when you need it.
  6. Test everything. Ideally, you want to automate the test process and run it periodically to ensure that your backups can restore. Netflix Chaos Monkey is an extreme example of this.

I'm not sure how you'd implement all this on Heroku. A complete solution is still priced out of reach for most companies - we're running this across our own data centers (one in the US, one in EU) and it costs many millions. Work according to the 80-20 rule - on-going backup to a separate site, plus a well tested recovery plan (continuously test your ability to recover from backups) covers 80% of what you need.
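On the restore questions themselves, here is a minimal sketch, assuming the .dump files are PostgreSQL custom-format archives (the format pg_dump -Fc writes) and that pg_restore is installed. The helper name pg_restore_command and the database, file and table names are hypothetical. It only builds the command line; it does not run anything.

```ruby
# Sketch only: builds a pg_restore command line, it does not run it.
# The database, file and table names used below are hypothetical.

# Returns the argv array for restoring a custom-format dump.
# Pass :tables to restore only selected tables (pg_restore --table).
def pg_restore_command(dump_file, database, options = {})
  tables = Array(options[:tables])
  cmd = ['pg_restore', '--verbose', '--no-owner', '--no-acl',
         "--dbname=#{database}"]
  # --data-only restores rows but not the table definition, for the case
  # where the table still exists and only its rows were lost.
  cmd << '--data-only' if options[:data_only]
  tables.each { |table| cmd << "--table=#{table}" }
  cmd << dump_file
end

# Full restore into a scratch database, to be inspected before touching production:
full = pg_restore_command('latest.dump', 'myapp_scratch')

# Selective restore of a single table's rows:
partial = pg_restore_command('latest.dump', 'myapp_scratch',
                             :tables => ['comments'], :data_only => true)

puts full.join(' ')
puts partial.join(' ')
# To actually run it: system(*partial) or raise 'pg_restore failed'
```

Restoring into a scratch database first and then copying only the missing rows back is one way to avoid rolling back the hours of activity recorded in the other tables. Whether that is safe depends on how the lost table relates to the rest of the schema.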

As for supporting users, the best solution is simply to communicate promptly and truthfully when trouble happens and make sure you don't lose any data. If your users are paying for your service (i.e. you're not ad-supported), then you should probably have an SLA in place.

has_one, :through => model VS simple method ?

7 votes

I have some issues using has_one :through. The easiest way to explain is to show you my case.

class Category
  has_many :articles
end

class Article
  has_many :comments
  belongs_to :category
end

class Comment
  belongs_to :article
  has_one :category, :through => :article
end

Everything works fine. I can do comment.category. The problem is that when I create a new comment and set its article, I have to save the comment and reload it before the association works. Example:

 >> comment = Comment.new
 >> comment.article = Article.last
 >> comment.category
     -> nil
 >> comment.article.category
     -> the category
 >> comment.save
 >> comment.category
     -> nil
 >> comment.reload
 >> comment.category
     -> the category

In any case, has_one :through does not set up the build constructor or the create method. So I want to replace my Comment model with:

class Comment
  belongs_to :article
  def category
    article.category
  end
end

Does that sound like a good idea?

Nothing wrong with your idea. I can't see many situations in which has_one :category, :through => :article would be the obviously better choice (unless you are eager-loading with Comment.all(:include => :category)).

A hint on delegate:

class Comment
  belongs_to :article
  delegate :category, :to => :article
end

A different approach:

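class Comment
  belongs_to :article
  has_one :category, :through => :article

  # Unsaved comments read through the in-memory article;
  # saved ones fall back to the has_one :through association.
  def category_with_delegation
    new_record? ? article.try(:category) : category_without_delegation
  end

  alias_method_chain :category, :delegation
end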
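
To see why the plain method (or delegate) sidesteps the save-and-reload problem, here is a minimal sketch using plain Ruby objects as stand-ins for the models. No ActiveRecord is involved; the Struct classes are illustrative assumptions, and the nil guard is an addition to the method from the question.

```ruby
# Plain-Ruby stand-ins for the models; no ActiveRecord or database involved.
Category = Struct.new(:name)
Article  = Struct.new(:category)

class Comment
  attr_accessor :article

  # Reads through the in-memory article, so it works before any save.
  # The nil guard (an addition to the original method) covers a comment
  # that has no article yet.
  def category
    article && article.category
  end
end

comment = Comment.new
p comment.category            # nil: no article assigned yet

comment.article = Article.new(Category.new('ruby'))
p comment.category.name       # "ruby", without save or reload
```

In Rails, delegate takes an :allow_nil => true option that gives a similar guard when the article is missing.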

Weird behaviour of ruby regex in rails with utf8 char.

7 votes

Hi all,

I have a problem with one of my validation regexes when using non-ASCII UTF-8 characters. I ran a few experiments, and it appears that Ruby regexes behave differently in the Rails environment than in plain Ruby.

Here is my experiment with a Chinese string.

In "pure" Ruby:

string = "運動會"
puts string[/\A[\w]*\z/]
=> match "運動會" - ok

In Rails:

# coding: utf-8
task :test => :environment do
  string = "運動會"
  puts string[/\A[\w]*\z/]
end
$ rake test
=> nothing - not ok

If I omit # coding: utf-8, it fails with invalid multibyte char (US-ASCII). Even with the comment in place, the regex doesn't match.

Of course, I have checked everything (Ruby version, script files encoded in UTF-8, ...).

I use:

  • Rails 3.0.7
  • Ruby 1.9.2 (ruby-1.9.2-p180)

So my conclusion is that Rails alters the way regexes behave, and I did not find a way to make them behave as in plain Ruby.

OK, I found an answer to my problem. In Ruby 1.9, \w matches only ASCII characters, whereas in Ruby 1.8 it matched all Unicode characters. In Ruby 1.9 we now have to use: [\w\P{ASCII}]

More info: http://www.ruby-forum.com/topic/210770
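
The character classes can be compared in isolation with a short script. This is a sketch for Ruby 1.9.2 or later run outside Rails. The \p{Word} variant is an alternative that the answer above does not mention, and the variable names are illustrative.

```ruby
# encoding: utf-8
# Behaviour shown is for Ruby 1.9.2 and later, where \w is ASCII-only.

chinese    = "運動會"
with_punct = "運動會！"   # ends with a full-width (non-ASCII) exclamation mark

ascii_word   = /\A\w*\z/              # \w: [a-zA-Z0-9_] only
non_ascii    = /\A[\w\P{ASCII}]*\z/   # the workaround from the answer
unicode_word = /\A\p{Word}*\z/        # Unicode letters, marks, numbers, connector punctuation

p chinese[ascii_word]       # nil
p chinese[non_ascii]        # "運動會"
p chinese[unicode_word]     # "運動會"

# \P{ASCII} accepts any non-ASCII character, punctuation included:
p with_punct[non_ascii]     # "運動會！"
p with_punct[unicode_word]  # nil
```

For a validation, that difference matters: [\w\P{ASCII}] lets non-ASCII punctuation and symbols through, while \p{Word} restricts the match to word characters.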