Natural Language Date & Time Parsing for ActiveRecord

Update: If you’re on Rails 2.1 or later, be sure to read the update to this post.

Chronic is a nice natural language parser for Ruby, but my first stab at adding it to a Rails application immediately felt wrong. I was adding special case code in a controller to re-parse the field if the initial parse failed. Of course, that needed to be duplicated for every field I wanted to support it.

A much better second idea: why not add this directly to ActiveRecord? Every model gets it for free, and it turns out the code is hardly any more complicated.

require 'active_record'
require 'chronic'
 
module ChronicParser
  def self.extended(object)
    class << object
      alias_method_chain :string_to_date, :chronic
      alias_method_chain :string_to_time, :chronic
    end
  end
 
  def string_to_date_with_chronic(string)
    value = string_to_date_without_chronic(string)
    if value.nil?
      now = TzTime.now rescue Time.now
      value = Chronic.parse(string, :now => now).to_date rescue nil
    end
 
    value
  end
 
  def string_to_time_with_chronic(string)
    value = string_to_time_without_chronic(string)
    if value.nil?
      now = TzTime.now rescue Time.now
      value = Chronic.parse(string, :now => now)
    end
 
    value
  end
end
 
ActiveRecord::ConnectionAdapters::Column.extend(ChronicParser)

Put this where it will be executed during Rails initialization and you’re done.

Note also that while Chronic doesn’t claim to support time zones, this code will do the right thing by supplying a local perspective of the user’s current time, using TzTime. Someone in Japan using your site (which is hosted in California) in the early morning will get the expected result for “tomorrow.”

If a zone hasn’t been set for TzTime, it’ll fall back on the server’s current time. If you aren’t using TzTime, modify appropriately.
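
To make the time zone point concrete, here is a minimal sketch that uses only Ruby’s standard library. It does not call Chronic or TzTime; the helper name tomorrow_for and the fixed UTC offsets are assumptions for illustration, and a real zone library would also account for DST. It shows how one instant yields a different “tomorrow” depending on whose clock supplies “now”:

```ruby
require 'date'

# Hypothetical helper (not part of Chronic or TzTime): the calendar date of
# "tomorrow" as seen by someone at a fixed UTC offset.
def tomorrow_for(instant, utc_offset)
  local = instant.getlocal(utc_offset)
  Date.new(local.year, local.month, local.day) + 1
end

# One instant: 22:00 UTC on October 15, 2007.
instant = Time.utc(2007, 10, 15, 22, 0, 0)

tokyo      = tomorrow_for(instant, "+09:00")  # local clock reads Oct 16, 07:00
california = tomorrow_for(instant, "-07:00")  # local clock reads Oct 15, 15:00

puts tokyo       # 2007-10-17
puts california  # 2007-10-16
```

Passing the user’s “now” to the parser, as the :now option does above, is what keeps the two answers apart.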

Rails 1.2.4, RESTful Routes and Custom Actions

If you use RESTful routes and define custom actions for your resources, you might suddenly find that your custom actions don’t work after upgrading to 1.2.4, whereas something like this used to work fine in 1.2.3:

map.resources :users
map.resources :users, :collection => { :signin => :any,
                                       :signout => :post,
                                       :forgot_password => :any,
                                       :welcome => :get }
map.resources :users, :member => { :email => :get,
                                   :prefs => :get,
                                   :close => :post }

This is actually wrong, and creates duplicate entries in the routing table. The symptom of this problem is that your custom actions (e.g. /users/signin) will run the show action with a bogus ID of signin.
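
The symptom is easier to see with a toy model. The sketch below is not Rails’ routing code; Route and recognize are hypothetical names, and the only behavior it models is that the first matching pattern wins. It also assumes that a single resources call registers collection routes ahead of the generic :id route:

```ruby
# A toy first-match route table. This is NOT Rails' routing code, only a
# model of "the first pattern that matches wins".
Route = Struct.new(:pattern, :action)

def recognize(routes, path)
  routes.each do |route|
    match = route.pattern.match(path)
    next unless match
    params = { :action => route.action }
    params[:id] = match[:id] if match.names.include?("id")
    return params
  end
  nil
end

show   = Route.new(%r{\A/users/(?<id>[^/]+)\z}, "show")
signin = Route.new(%r{\A/users/signin\z}, "signin")

# Separate calls: the generic member route is already in the table
# when the custom collection action is added.
split_definition = [show, signin]

# One call (assumed ordering): collection routes come before the :id route.
single_definition = [signin, show]

split_result  = recognize(split_definition, "/users/signin")
single_result = recognize(single_definition, "/users/signin")
```

With split_definition, /users/signin resolves to the show action with an id of "signin"; with single_definition it reaches the signin action.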

The proper way to define these resource routes is in one shot:

map.resources :users, :collection => { :signin => :any,
                                       :signout => :post,
                                       :forgot_password => :any,
                                       :welcome => :get },
                      :member => { :email => :get,
                                   :prefs => :get,
                                   :close => :post }

(Thanks to wolfmanjm on the rubyonrails-talk list for posting and answering his own question, as I was having the exact same problem.)

Rails 1.2.4 Fixes Google Analytics Goal Conversions

The Rails core team released a small update to the 1.2 stable branch last week, 1.2.4, which fixes a few potential security problems and offers some minor enhancements.

If you plan to move your application to 2.0 when it’s released, the 1.2.4 update turns on some additional deprecation warnings that will help point out potential trouble in your existing code. Another useful resource for this is Mislav Marohnić’s compatibility checker script, though be forewarned that it may return false positives, since it’s just doing regular expression matches.

rake routes, previously available on edge, is included. This will show you all of your named routes, in match order, which makes debugging route-related issues much easier. (You can do this prior to 1.2.4, too, but that requires installing sake and the routes sake recipe.)

For me, though, by far the most welcome enhancement in the 1.2.4 update fixes something I’ve had to live with since starting to use Google Analytics: Analytics doesn’t know how to handle a semicolon in a goal URL. If you use RESTful routing and display a “thank you” page when a resource (say, an order) is successfully created, chances are high that you won’t be able to track this as a goal in Analytics. This is because custom actions for resources used a semicolon to separate the action from the rest of the path. 1.2.4, just like the coming 2.0, changes this to a regular slash, to avoid incompatibilities with software, such as Analytics, that mistakes the semicolon for part of the query string.
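
As a rough illustration of the difference, the sketch below builds the same custom action URL with each separator. The helper member_action_path and the thanks action are made up for this example; Rails generates these paths internally:

```ruby
# Hypothetical helper that mimics how a custom member action is appended to
# a resource path; Rails builds these URLs internally.
def member_action_path(collection, id, action, separator)
  "/#{collection}/#{id}#{separator}#{action}"
end

old_style = member_action_path("orders", 42, "thanks", ";")
new_style = member_action_path("orders", 42, "thanks", "/")

puts old_style  # /orders/42;thanks
puts new_style  # /orders/42/thanks
```

Only the second form is a plain path that a goal URL matcher can treat like any other page.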

If for some reason you want the old behavior, you’re in luck, as the choice of separator is now a class-level accessor on ActionController::Base:

ActionController::Base.resource_action_separator = ';'  # or whatever

Customize the Error Reporting for ActiveRecord Forms

When a form in your Rails application is used to edit (or create) an ActiveRecord instance, ActionView uses the ActiveRecordHelper module to create the HTML input field tags. In the event of an error, your input field is wrapped inside a DIV with class="fieldWithErrors". This isn’t so bad, but you’re left with few options for customizing the look of your forms when there is an error. For example, what if you wanted to put the error text associated with a field right next to the field?

One way to do this is by putting additional markup in your forms. Something like:

<% if object.errors.on(:some_field) %>
  <span class="error"><%= object.errors.on(:some_field) %></span>
<% end %>
... rest of field markup ...

That works, but you have to repeat the fields at least twice: once for the potential error condition, and once for the field itself.

You could write up a new form helper that includes the boilerplate error markup, but that falls apart as soon as you want to put in a set of checkbox or radio button controls. Maybe you only want one error message for the entire group, which is almost certainly true for radio buttons.

Another alternative that I don’t see widely acknowledged is to override the standard Proc used by the ActiveRecordHelper for errors: ActionView::Base.field_error_proc. Here’s an example:

ActionView::Base.field_error_proc = Proc.new do |html_tag, instance|
  error_text = ""
  errors = instance.object.errors.on(instance.method_name)
  if errors
    errors.to_a.each do |error|
      error_text << "<p class=\"error\">#{error}</p>"
    end
  end
  "<div class=\"errors\">#{error_text}</div>#{html_tag}"
end

The Proc instance is passed the HTML tag (already built into a string) and an instance of InstanceTag. The latter is the class used by many of the ActionView helpers for building HTML tags that correspond to a particular piece of data in an object. InstanceTag keeps a reference back to that object and method, and you can make use of it in your custom field_error_proc, as shown above.
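
To see what such a Proc produces without booting Rails, here is a self-contained sketch. FakeErrors, FakeRecord and FakeInstanceTag are hypothetical stand-ins that implement only the methods the Proc touches (errors.on, object, method_name); they are not Rails classes. It uses Array() rather than to_a because errors.on may return nil, a single String, or an Array, and String#to_a exists only in Ruby 1.8:

```ruby
# Hypothetical stand-ins for the Rails objects; only the methods used by the
# Proc are implemented.
class FakeErrors
  def initialize(messages)
    @messages = messages
  end

  # Mimics errors.on: nil, a String, or an Array of Strings.
  def on(attribute)
    @messages[attribute.to_sym]
  end
end

FakeRecord      = Struct.new(:errors)
FakeInstanceTag = Struct.new(:object, :method_name)

field_error_proc = Proc.new do |html_tag, instance|
  messages   = Array(instance.object.errors.on(instance.method_name))
  paragraphs = messages.map { |message| "<p class=\"error\">#{message}</p>" }.join
  "<div class=\"errors\">#{paragraphs}</div>#{html_tag}"
end

record   = FakeRecord.new(FakeErrors.new(:email => ["can't be blank", "is invalid"]))
instance = FakeInstanceTag.new(record, "email")

html = field_error_proc.call('<input id="user_email" />', instance)
puts html
```

Each message ends up in its own paragraph inside the errors DIV, immediately ahead of the untouched input tag.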

One little annoyance you don’t see until you start down this path is that the method that builds checkboxes always includes a hidden field with the same name. This is done to ensure that some value for that field is always sent to the server when the form is submitted. The trouble is that the hidden field gets no special treatment from ActiveRecordHelper: field_error_proc is called twice, once for the checkbox field and once for the hidden field, and custom error messaging, as shown above, gets rendered twice. This can be fixed by overriding InstanceTag#tag:

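# Reopen InstanceTag and replace the tag method that ActiveRecordHelper
# defines, so hidden inputs are never passed to field_error_proc.
class ActionView::Helpers::InstanceTag
  def tag(name, options)
    if object.respond_to?(:errors) && object.errors.respond_to?(:on)
      # The hidden companion of a checkbox should not be wrapped a second time.
      suppress_errors = (name == "input" && options["type"] == "hidden")
      error_wrapping(tag_without_error_wrapping(name, options),
                     object.errors.on(@method_name) && !suppress_errors)
    else
      tag_without_error_wrapping(name, options)
    end
  end
end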

Sparkline graphs using data: URIs

A while back, I ran across sparkline graphs, created by Edward Tufte for displaying data visually in a very compact space, usually inline with text. Google uses these extensively in their Analytics product, and it seemed like a great fit for representing data on the ZingLists administrative dashboard.

There are two easily found implementations of sparkline graphs for Ruby: Sparklines and Bumpspark. Bumpspark is nice because the graphs use data: URIs and don’t require any extra parts server-side, but the Sparklines plugin offers far more drawing options.

Why not the best of both? Here is a modified sparkline_tag method for Sparklines, which accepts a new option, :inline_data. When true, it emits a data: URI instead of a reference back to your server. I like this for several reasons: it loads faster, it doesn’t require a new controller, and it doesn’t require a new route if you’ve done away with the generic /:controller/:action/:id and are using RESTful routes.

require 'base64'
 
  def sparkline_tag(results = [], options = {})
    tag_options = { :class => (options[:class] || 'sparkline'),
                    :alt => "Sparkline Graph" }
    if options.delete(:inline_data)
      tag_options.merge!(:src => "data:image/png;base64,#{Base64.encode64(Sparklines.plot(results, options)).gsub("\n", "")}")
    else
      url = { :controller => 'sparklines', :results => results.join(',') }
      tag_options.merge!(:src => url_for(url.merge(options)))
    end
    tag(:img, tag_options)
  end
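
The data: URI part can be tried on its own, without Sparklines or Rails. The sketch below is only an illustration: png_data_uri is a hypothetical helper, the 8-byte PNG file signature stands in for real image data, and it uses Array#pack("m0"), core Ruby’s Base64 encoding without line breaks, in place of the Base64.encode64 plus gsub combination above:

```ruby
# Build a data: URI from raw image bytes. pack("m0") Base64-encodes the
# string without inserting newlines.
def png_data_uri(png_bytes)
  "data:image/png;base64,#{[png_bytes].pack('m0')}"
end

# Stand-in for real image data: just the 8-byte PNG file signature.
signature = "\x89PNG\r\n\x1A\n".b

uri = png_data_uri(signature)
puts uri  # data:image/png;base64,iVBORw0KGgo=
```

The resulting string goes straight into the src attribute of the img tag, which is why no controller or route is needed.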

Update: When I emailed this to Geoff (who maintains Sparklines), he reminded me that IE doesn’t support data: URIs, so :inline_data isn’t useful if your pages need to display in IE.

Collecting Statistics from PostgreSQL in Rails

The PostgreSQL database includes a statistics collection facility that can give you information about your database, tables, indexes and sequences. I just posted a new Rails plug-in that makes it very easy to gather this information and display it in a Rails application.

[Image: pgsql_stats screenshot]

All of the counters described in the PostgreSQL manual are represented in the models in the plug-in. To name a few:

  • Number of scans over the table or index
  • Cache hit/miss counters (and cache hit rate, by a simple computation)
  • On-disk size

In the above screenshot (taken very soon after the server was started), it’s easy to see that the cron_runs table is by far the largest in the database, followed by its primary key index. Of the entities that have been touched, a large percentage of requests are being satisfied by the buffer cache. You can’t see it in that image, but I’ve defined ranges that turn the green bars red if the cache hit rate falls below 75%.
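
The “simple computation” for the cache hit rate is hits divided by hits plus disk reads, the pair of counters PostgreSQL exposes as heap_blks_hit and heap_blks_read in pg_statio_user_tables. The sketch below is not code from the plug-in; the method names, the gray color for untouched tables, and the sample numbers are all assumptions for illustration:

```ruby
# Hit rate = blocks found in the buffer cache / all blocks requested.
# Returns nil when the table has not been touched yet.
def cache_hit_rate(blocks_hit, blocks_read)
  total = blocks_hit + blocks_read
  return nil if total.zero?
  blocks_hit.to_f / total
end

# Hypothetical color rule mirroring the 75% cutoff described above.
def bar_color(rate, threshold = 0.75)
  return "gray" if rate.nil?
  rate < threshold ? "red" : "green"
end

# Made-up counter values.
warm_table = cache_hit_rate(900, 100)  # 0.9
cold_table = cache_hit_rate(60, 40)    # 0.6

puts bar_color(warm_table)  # green
puts bar_color(cold_table)  # red
```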

I’ve set up a Google Group forum for further discussion. Some additional information is available in the README, and the plug-in can be installed like any other:

$ script/plugin install \
    http://svn.lightyearsoftware.com/svn/plugins/pgsql_stats

All on one line, of course.

Update Sep. 10, 2007: There is now a usage example on the Google Group that shows how to get the results shown in the screenshot.

Update Jul. 23, 2008: Part of the fallout of Google disabling my account appears to be that the group I set up for discussion and support disappeared, too. I have moved discussion and support to my own forums: pgsql_stats forum.

Inheritance in ActiveRecord, without STI

One of the few places Rails let me down while developing ZingLists was its handling of class inheritance in ActiveRecord. Unless you want Single-Table Inheritance semantics, you may as well pretend inheritance doesn’t exist, because I couldn’t find a clean way to do it.

Here was my particular problem:

Out of the box, Rails likes to construct URLs that usually end with a numeric ID when you’re dealing with resources. There is a method, to_param, that the URL helpers call when they want to turn an object into an ID suitable for a URL.

Conventional wisdom says that Google likes URLs that are meaningful. A string of numbers at the end doesn’t mean anything, which is why you often see URLs ending with a snippet from the page title. Blogs often do this (just look at the top of this page, if you’re not reading from the index).

So, to get nice URLs, modify to_param. But what if you only want to do it some of the time? Inheritance usually solves this problem: customize in the derived class. But ActiveRecord forces STI on you if you do this, and maybe you don’t want that.

In ZingLists, I have a single table, lists, that contains all of the lists in the system, whether they are private lists for a member or lists that a member has published to the community. It pretty much has to be this way, since publishing a list does not fix it in time. The member may (and probably will) continue to use it for themselves, and if they add to it, I’d like those changes to be available in the public view immediately, with no extra work.

I want public list URLs to be nice for Google, but don’t have any desire to junk up private list URLs, too. How to solve this problem?

Duck Typing. The URL helpers don’t care what kind of class you hand them, as long as it responds to to_param:

class PublicList
  def initialize(list)
    @list = list
  end
 
  def to_param
    "#{@list.id}-#{@list.name[0..29].tr_s(" ", "-").gsub(/[^-a-z0-9]+/i, "")}"
  end
end

Now, when I want to create a pretty URL for Google’s benefit, I just have to instantiate a wrapper around the real object:

public_list_path(PublicList.new(list))
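
To try the wrapper outside Rails, the sketch below swaps the ActiveRecord model for a plain Struct (List here is a stand-in, and the sample name is made up). It also shows that String#to_i stops at the first non-digit, so the numeric ID can be recovered from the pretty parameter; calling to_i explicitly on the incoming parameter is the assumption made here, rather than relying on what a particular Rails version does with the raw string:

```ruby
# Stand-in for the ActiveRecord model; only id and name are needed here.
List = Struct.new(:id, :name)

class PublicList
  def initialize(list)
    @list = list
  end

  # ID, then up to 30 characters of the name with spaces turned into dashes
  # and everything except letters, digits and dashes removed.
  def to_param
    "#{@list.id}-#{@list.name[0..29].tr_s(" ", "-").gsub(/[^-a-z0-9]+/i, "")}"
  end
end

param = PublicList.new(List.new(7, "Top 10 Sci-Fi Movies")).to_param

puts param       # 7-Top-10-Sci-Fi-Movies
puts param.to_i  # 7
```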

(Wishful thinking: What would be really neat is a facility where you can get STI-like semantics, but provide your own definition of how to differentiate between the types of object, rather than have it hard-coded to a column called “type” that contains a string representation of the class to instantiate.)

Introducing ZingLists

Last week, we finally launched our first product: ZingLists. It’s a community-oriented site for making and sharing lists. What kind of lists?

  • Simple, itemized lists for things like what to put in a home emergency/disaster kit
  • To-do lists where tasks have a due date and even a schedule for repeating the task
  • Fun, lighthearted lists of your favorite CDs, movies, places to go…

ZingLists is not the first time I’ve built a real web site, but it is my first serious project that uses Ruby on Rails and I have to say that it has been a very enjoyable experience. There is only one thing I can think of where I wanted to do a little more than Rails offered out of the box, and there wasn’t already a hook somewhere. (More about that later.)

The site is quite young at the moment and so public content is a little thin, but it will get better over time. Feature-wise, the site is very useful. I have had my personal to-do lists running from it for a couple of months, and have been using it for more generic list keeping as well.

There is almost always room for improvement, though, so if anyone has any feedback, feel free to leave a comment here or drop us a note at the support page.

Easier time zone handling in Rails

Update: with the release of Rails 2.1, much of this article is now obsolete.

The application I’m working on at the moment deals with a lot of dates and times, and its userbase could span many different time zones. Dealing with time zones and converting from one zone to another is tedious work, but there are some things you can do to make your life simpler.

First, always think in UTC. If not for daylight saving time, you could probably ignore this rule, but thanks to DST, you can’t. DST takes a relatively simple add or subtract and turns it into a tangle of what-ifs. Some zones don’t observe any daylight saving time rules. The zones that do may change when DST starts and stops from year to year.

Thinking in UTC means storing your times in UTC in the database, without exception. Rails gives you only a little help in this area, with ActiveRecord::Base.default_timezone. It only helps you with created_at/on and updated_at/on. It won’t touch your other timestamp fields.

Bugs often come about by missing little details, and forgetting to get a Time instance in UTC instead of localtime is just the sort of thing I know I’d do eventually. Ruby makes fixing this once easy. Reopen the Time class, and change the behavior of Time.now. Put this in your environment.rb:

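class Time
  # Same as the original Time.now, but converted to UTC.
  def self.now_utc
    return now_local.utc
  end

  class << self
    # Keep the original implementation reachable as Time.now_local,
    # then point Time.now at the UTC version.
    alias_method :now_local, :now
    alias_method :now, :now_utc
  end
end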

I think this gets you 90% of the way home. The remaining 10% is handling display issues, and Jamis Buck’s TzTime helps immensely with this part. Set TzTime’s zone at the start of each request to the zone of the user making the request, then make life even easier by ensuring you always use a helper for displaying dates and times. Mine looks like this:

def datetime(object, options = {})
  return "argument is a #{object.class}, not a Date or Time" unless object.is_a?(Date) || object.is_a?(Time)
 
  format = if object.is_a?(Date) || options[:date_only] then "%b %d, %Y"
           elsif options[:time_only] then "%I:%M %p"
           else "%b %d, %Y %I:%M %p"
           end
 
  object = TzTime.zone.utc_to_local(object) if object.is_a?(Time) && object.utc?
  return object.strftime(format)
end
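
For readers without TzTime, the display step can be approximated with plain Ruby. The sketch below is a simplified variant, not a replacement: format_time is a hypothetical name, and a fixed UTC offset stands in for TzTime.zone, so DST is not handled:

```ruby
# Simplified variant of the helper above: a fixed UTC offset stands in for
# TzTime.zone, so DST is not handled.
def format_time(utc_time, utc_offset, options = {})
  format = if options[:date_only] then "%b %d, %Y"
           elsif options[:time_only] then "%I:%M %p"
           else "%b %d, %Y %I:%M %p"
           end
  utc_time.getlocal(utc_offset).strftime(format)
end

stored = Time.utc(2007, 9, 10, 17, 5, 0)  # what the database holds

puts format_time(stored, "-07:00")                      # Sep 10, 2007 10:05 AM
puts format_time(stored, "+09:00", :time_only => true)  # 02:05 AM
puts format_time(stored, "+09:00", :date_only => true)  # Sep 11, 2007
```

The stored value never changes; only the conversion at display time depends on the viewer.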

To date, I’ve only found one gotcha, and while it was aggravating to find, it was easy to fix. The TMail that ships in Rails 1.2 (as part of ActionMailer) has a nasty habit of ignoring the @sent_on instance variable when sending mail via SMTP. It always sets the Date: header itself, contrary to what ActionMailer’s RDoc tells you. Unfortunately, it sets the header using Time.now, which returns UTC with the above modification, but labels the time as being in the local zone. End result: if your local time is behind UTC, the mail looks like it was sent in the future. To fix, put this in environment.rb:

class TMail::Mail
  # Intentionally empty: never add a Date: header here.
  def add_date
  end
end

With this change, TMail will never add a Date: header, allowing the MTA to add it itself.