sql_logging Updated for Edge Rails

I updated the sql_logging plug-in today to work on Edge Rails.  The PostgreSQL adapter on Edge tries to load a different database driver, and explicitly loading it when the application isn’t using it causes a crash when the server starts.  Today’s change reworks things by asking ActiveRecord what adapter it’s using and only loading the extensions for it.

One other change includes making PostgreSQL query statistics work on Rails 2.0.  The adapter there prefers to call the execute method instead of query.

Things should still work on Rails 1.2, but I’m moving applications to 2.0 whenever possible, so I’ll rely on a bug report if something is broken.

Finally, I’ll leave with a little tip: statistics aren’t kept for queries that lack a name or look like Rails asking for schema information.  If you have a significant amount of query work done using the raw connection (select_all or select_one), make sure you give the query a name so it can be tracked properly and show up in the top 10 list.

Get the plug-in from svn at http://svn.lightyearsoftware.com/svn/plugins/sql_logging.

Working with ActiveRecord Callbacks

This post should be subtitled “And how not to waste one of YOUR afternoons because of them.”

As the documentation describes, there are a total of 18 “normal” callbacks defined by ActiveRecord that allow you to manage the lifecycle of your objects. For the purpose of this post, I define “normal” as callbacks that can be implemented in a way other than overriding the named method. There are two more (after_initialize and after_find), but they must be implemented as an overridden method, so my argument here does not apply to them.

The short of it: don’t define your callbacks as explicit methods. In other words, don’t do this:

def after_save
  # ...
end

OK, so why? The answer boils down to how ActiveRecord executes callbacks and why it may decide to stop running the chain.

Explicit method implementations like this come dead last in the calling order, and since callbacks can return false and stop the execution of further callbacks, your code may never be executed. This can come as a complete surprise, because associations might define callbacks. For example, has_one defines an after_save callback that will try to save the associated model if you just saved a new record or if the associated model is a new record. If for some reason that associated model isn’t valid, the return value is false and ActiveRecord kills the chain of callbacks. Perhaps you have an after_save method explicitly defined that would fix this problem? It won’t be called, no exception is raised and you’ll spend a couple of hours reading ActiveRecord code to piece things together.

The ActiveRecord::Callbacks documentation explains most of this, but in bits and pieces and never in a way that makes it clear to how to best avoid trouble. Here are my new rules of the road:

  1. Declare callbacks either as symbols to method names (not named after their callback), inline blocks, strings (for eval) or completely separate objects.
  2. Declare them very early in your model, but especially before associations. Associations may declare their own callbacks and you probably want yours to run first.
The sql_logging Debugging Plug-In

One of my current projects involves significantly improving the search function in a Rails application, a thorny problem that led me to come up with the method to quiet down the Rails logger in my SQL log silencer. It turns out, that wasn’t enough. I have expanded quite a bit on that idea by folding in some useful debugging aids from other developers, making them more useful, and adding a SQL query “hit list” after each request completes. I’m releasing it as the sql_logging plug-in.

Installation is the usual (on one line):

$ script/plugin install \
http://svn.lightyearsoftware.com/svn/plugins/sql_logging

I have set up a forum for discussion and support of the plug-in.

So, what does this give you?

Out of the box, you get these new things:

  • A count of the number of rows returned and a rough count of bytes for each SELECT query
  • A summary at the end of the request of the total number of queries, bytes returned and SQL executions
  • Each SQL execution includes a clean-up backtrace to show you exactly what part of your code triggered the query
  • A summary at the end of the request showing a “top 10” list of queries

The last one is particular useful. Queries are grouped by their backtrace, and then sorted by either number of executions, rows returned or bytes returned. This provides an excellent way to target your performance optimization work.

When sorted by executions, you very quickly see where the same code is triggering a query over and over, and can spend your optimization time trying to reduce this by adding the model to an earlier query (via :include) or by getting more rows at a time by batching up loads (WHERE id IN (...)). Here’s a sample:

Completed in 13.01586 (0 reqs/sec) | Rendering: 2.69928 (20%) | DB: 8.68216 (66%), 154 selects, 1728.1 KB, 166 execs
Top 10 SQL executions:
  Executed 95 times, returning 180 rows (230.5 KB):
    app/models/merge_keyword.rb:34:in `get_merges_by_keyword'
    app/models/merge_keyword.rb:30:in `each'
    app/models/merge_keyword.rb:30:in `get_merges_by_keyword'
    app/models/merge.rb:107:in `search'
    app/models/merge.rb:105:in `each'
    app/models/merge.rb:105:in `search'
    app/controllers/search_controller.rb:98:in `results'
  Executed 22 times, returning 30 rows (1.3 KB):
    app/models/merge.rb:80:in `attribute'
    app/views/merges/_result_line.rhtml:2:in `_run_rhtml_47app47views47merges47_result_line46rhtml'
    app/views/search/results.rhtml:172:in `_run_rhtml_47app47views47search47results46rhtml'
    app/views/search/results.rhtml:170:in `each'
    app/views/search/results.rhtml:170:in `each_with_index'
    app/views/search/results.rhtml:170:in `_run_rhtml_47app47views47search47results46rhtml'
  Executed 20 times, returning 30 rows (1.2 KB):
    app/models/merge.rb:80:in `attribute'
    app/helpers/application_helper.rb:40:in `merge_detail_link'
    app/views/merges/_result_line.rhtml:26:in `_run_rhtml_47app47views47merges47_result_line46rhtml'
    app/views/search/results.rhtml:172:in `_run_rhtml_47app47views47search47results46rhtml'
    app/views/search/results.rhtml:170:in `each'
    app/views/search/results.rhtml:170:in `each_with_index'
    app/views/search/results.rhtml:170:in `_run_rhtml_47app47views47search47results46rhtml'

By sorting by rows or bytes returned, you may be able to determine that you need to let the database do more of the filtering work. It may be inefficient to have the database return a pile of rows to your application, only to discard a bunch of them.

The silencer from my previous post is in there, too, along with a new silencer that quiets only the backtrace for the duration of the block:

silence_sql do
  # ...
end
 
silence_sql_backtrace do
  # ...
end

I must give a big hat tip to Adam Doppelt and Nathaniel Talbott. Adam’s SQL logging enhancements formed the basis for the logging enhancements here. I made them a little more generic so they worked with both MySQL and PostgreSQL, as well as tracking executions for things that don’t return data. Nathaniel’s QueryTrace plug-in has been around for a while, and I originally tried making this new plug-in simply work with it, but in the end decided to fold the functionality in so the “top 10” list didn’t duplicate the backtrace cleaning code. I also added a regexp to the cleaner so it excludes the plug-in’s code from the backtrace.

There are a few more goodies in the plug-in, so look at the README for more details.

Update: I fixed a few bugs and added a feature. The plug-in now ignores queries whose name ends with ” Column”. These are internal Rails queries and their inclusion skewed the results.

The new feature tracks the execution time of SQL, and lets you sort the top-10 list by either total time for all instances of the SQL or the median execution time. Total time is also now the default sorting method, as this is probably the best way to attack SQL optimization: whatever is taking up the biggest percentage of your time in the database is a good place to start.

Update 2: The plug-in works on Rails 2.0.2 now.

A SQL Log Silencer

Here’s something I just whipped up to quiet ActiveRecord’s logging in development mode. Normally, ActiveRecord’s AbstractAdapter logs every SQL query as it is made. Sometimes, it’s nice to quiet this down without changing your log level while tracking down a problem.

class ActiveRecord::ConnectionAdapters::AbstractAdapter
  def log_info_with_silencing(sql, name, runtime)
    unless ActiveRecord::Base.silence_sql_logging
      log_info_without_silencing(sql, name, runtime)
    end
  end
 
  alias_method_chain :log_info, :silencing
end
 
module SqlSilencer
  def silence_sql
    logging = ActiveRecord::Base.silence_sql_logging
    begin
      ActiveRecord::Base.silence_sql_logging = true
      yield
    ensure
      ActiveRecord::Base.silence_sql_logging = logging
    end
  end
end
 
class ActiveRecord::Base
  @@silence_sql_logging = false
  cattr_accessor :silence_sql_logging
 
  include SqlSilencer
end
 
class ActionController::Base
  include SqlSilencer
end

Drop this into a file in config/initializers if you’re on Rails 2.0 or require it from within environment.rb for pre-2.0.

To use inside a controller action or model method, simply wrap the code you want to silence inside a block:

def index
  silence_sql do
    # lots of work here...
  end

One interesting side effect of this is that you can easily see how much database work is done as a result of what’s in your view. Wrap the entire controller action inside a silence_sql block, then look at your log. Any queries you see are a result of things you didn’t load in the controller.

Natural Language Date & Time Parsing for ActiveRecord

Update: If you’re on Rails 2.1 or later, be sure to read the update to this post.

Chronic is a nice natural language parser for Ruby, but my first stab at adding it to a Rails application immediately felt wrong. I was adding special case code in a controller to re-parse the field if the initial parse failed. Of course, that needed to be duplicated for every field I wanted to support it.

A much better (second) idea was, why not add this directly to ActiveRecord? Every model gets it for free, and it turns out the code is hardly any more complicated.

require 'active_record'
require 'chronic'
 
module ChronicParser
  def self.extended(object)
    class << object
      alias_method_chain :string_to_date, :chronic
      alias_method_chain :string_to_time, :chronic
    end
  end
 
  def string_to_date_with_chronic(string)
    value = string_to_date_without_chronic(string)
    if value.nil?
      now = TzTime.now rescue Time.now
      value = Chronic.parse(string, :now => now).to_date rescue nil
    end
 
    value
  end
 
  def string_to_time_with_chronic(string)
    value = string_to_time_without_chronic(string)
    if value.nil?
      now = TzTime.now rescue Time.now
      value = Chronic.parse(string, :now => now)
    end
 
    value
  end
end
 
ActiveRecord::ConnectionAdapters::Column.extend(ChronicParser)

Put this where it will be executed during Rails initialization and you’re done.

Note also that while Chronic doesn’t claim to support time zones, this code will do the right thing by supplying a local perspective of the user’s current time, using TzTime. Someone in Japan using your site (which is hosted in California) in the early morning will get the expected result for “tomorrow.”

If a zone hasn’t been set for TzTime, it’ll fall back on the server’s current time. If you aren’t using TzTime, modify appropriately.

Rails 1.2.4, RESTful Routes and Custom Actions

If you use RESTful routes and define custom actions for your resources, you might suddenly find your custom action don’t work after upgrading to 1.2.4. Whereas this used to work fine in 1.2.3:

map.resources :users
map.resources :users, :collection => { :signin => :any,
                                       :signout => :post,
                                       :forgot_password => :any,
                                       :welcome => :get }
map.resources :member => { :email => :get,
                                     :prefs => :get,
                                     :close => :post }

This is actually wrong, and creates duplicate entries in the routing table. The symptom of this problem is that your custom actions (e.g. /users/signin) will run the show action with a bogus ID of signin.

The proper way to define these resource routes is in one shot:

map.resources :users, :collection => { :signin => :any,
                                       :signout => :post,
                                       :forgot_password => :any,
                                       :welcome => :get },
                      :member => { :email => :get,
                                   :prefs => :get,
                                   :close => :post }

(Thanks to wolfmanjm on the rubyonrails-talk list for posting and answering his own question, as I was having the exact same problem.)

Rails 1.2.4 Fixes Google Analytics Goal Conversions

The Rails core team released a small update to the 1.2 stable branch last week, 1.2.4, which fixes a few potential security problems and offers some minor enhancements.

If you plan to move your application to 2.0 when it’s released, the 1.2.4 update turns on some additional deprecation warnings that will help point out potential trouble in your existing code. Another useful resource for this is Mislav Marohnić’s compatibility checker script, though be forewarned that it may return false positives, since it’s just doing regular expression matches.

rake routes, previously available on edge, is included. This will show you all of your named routes, in match order, which makes debugging route-related issues much easier. (You can do this prior to 1.2.4, too, but that requires installing sake and the routes sake recipe.)

For me, though, by far the most welcome enhancement in the 1.2.4 update is something I’ve had to live with since starting to use Google Analytics: Analytics doesn’t know how to handle a semicolon in a goal URL. If you use RESTful routing and display a “thank you” page when a resource (say, an order) is successfully created, chances are high that you won’t be able to track this as a goal in Analytics. This is because custom actions for resources used a semicolon to separate the action from the rest of the path. 1.2.4, just like the coming 2.0, changes this to a regular slash, to avoid incompatibilities with software, such as Analytics, that mistake the semicolon as part of the query string.

If for some reason you want the old behavior, you’re in luck, as the choice of separator is now a class-level accessor on ActionController::Base:

ActionController::Base.resource_action_separator = ';'  # or whatever
Customize the Error Reporting for ActiveRecord Forms

When a form in your Rails application is used to edit (or create) an ActiveRecord instance, ActionView uses the ActiveRecordHelper module to create the HTML input field tags. In the event of an error, your input field is wrapped inside a DIV with class=”fieldWithErrors”. This isn’t so bad, but you’re left with few options for customizing the look of your forms when there is an error. For example, what if you wanted to put the error text associated with a field right next to the field?

One way to do this is by putting additional markup in your forms. Something like:

<% if object.errors.on(:some_field) %>
  <span class="error"><%= object.errors.on(:some_field) %></span>
<% end %>
... rest of field markup ...

That works, but you have to repeat the fields at least twice: once for the potential error condition, and once for the field itself.

You could write up a new form helper that includes the boilerplate error markup, but that falls apart as soon as you want to put in a set of checkbox or radio button controls. Maybe you only want one error message for the entire group, which is almost certainly true for radio buttons.

Another alternative that I don’t see widely acknowledged is to override the standard Proc used by the ActiveRecordHelper for errors: ActionView::Base.field_errro_proc. Here’s an example:

ActionView::Base.field_error_proc = Proc.new do |html_tag, instance|
  error_text = ""
  errors = instance.object.errors.on(instance.method_name)
  if errors
    errors.to_a.each do |error|
      error_text << "<p class=\"error\">#{error}</p>"
    end
  end
  "<div class=\"errors\">#{error_text}</div>#{html_tag}"
end

The Proc instance is passed the HTML tag (already built into a string) and an instance of InstanceTag. The latter is the class used by many of the ActionView helpers for building HTML tags that correspond to a particular piece of data in an object. InstanceTag keeps a reference back to that object and method, and you can make use of it in your custom field_error_proc, as shown above.

One little annoyance you don’t see until you start down this path is that the method that builds checkboxes always includes a hidden field with the same name. This is done to ensure that some value for that field is always sent to the server when the form is submitted. The trouble is that the hidden field gets no special treatment from ActiveRecordHelper: field_error_proc is called twice, once for the checkbox field and once for the hidden field, and custom error messaging, as shown above, gets rendered twice. This can be fixed by overriding InstanceTag#tag:

def tag(name, options)
  if object.respond_to?(:errors) && object.errors.respond_to?(:on)
    suppress_errors = (name == "input" && options["type"] == "hidden")
    error_wrapping(tag_without_error_wrapping(name, options),
                   object.errors.on(@method_name) && !suppress_errors)
  else
    tag_without_error_wrapping(name, options)
  end
end
Sparkline graphs using data: URIs

A while back, I ran across sparkline graphs, created by Edward Tufte for displaying data visually in a very compact space, usually inline with text. Google uses these extensively in their Analytics product, and it seemed like a great fit for representing data on the ZingLists administrative dashboard.

There are two easily found implementations of sparkline graphs for Ruby: Sparklines and Bumpspark. Bumpspark is nice because the graphs use data: URIs and don’t require any extra parts server-side, but the Sparklines plugin offers far more drawing options.

Why not the best of both? Here is a modified sparkline_tag method for Sparklines, which accepts a new option, :inline_data. When true, it emits a data: URI instead of a reference back to your server. I like this for several reasons: it loads faster, it doesn’t require a new controller, and it doesn’t require a new route if you’ve done away with the generic /:controller/:action/:id and are using RESTful routes.

require 'base64'
 
  def sparkline_tag(results = [], options = {})
    tag_options = { :class => (options[:class] || 'sparkline'),
                    :alt => "Sparkline Graph" }
    if options.delete(:inline_data)
      tag_options.merge!(:src => "data:image/png;base64,#{Base64.encode64(Sparklines.plot(results, options)).gsub("\n", "")}")
    else
      url = { :controller => 'sparklines', :results => results.join(',') }
      tag_options.merge!(:src => url_for(url.merge(options)))
    end
    tag(:img, tag_options)
  end

Update: When I emailed this to Geoff (who maintains Sparklines), he reminded me that IE doesn’t support data: URIs, so :inline_data isn’t useful if your pages need to display in IE.