Updating RubyGems to Recent 1.3.x
The RubyGems update process can be temperamental. If you fall more than a release or two behind, you might find yourself in a dependency cycle that stops an update cold. Recently, I tried to update a 1.2.0 install to the current 1.3.5.
# gem update --system
Updating RubyGems
Updating rubygems-update
Successfully installed rubygems-update-1.3.5
ERROR: While executing gem ... (NameError)
undefined local variable or method `remote_gemspecs' for
#<Gem::Commands::UpdateCommand:0xb7e26640>
This is a known issue. Running the command again results in “nothing to update,” also a known issue. The fix is to separately install rubygems-update and run update_rubygems.
# gem install rubygems-update Successfully installed rubygems-update-1.3.5 1 gem installed Installing ri documentation for rubygems-update-1.3.5... Installing RDoc documentation for rubygems-update-1.3.5... Could not find main page README Could not find main page README Could not find main page README Could not find main page README # update_rubygems /usr/lib/ruby/site_ruby/1.8/rubygems.rb:578:in `report_activate_error': Could not find RubyGem builder (>= 0) (Gem::LoadError) from /usr/lib/ruby/site_ruby/1.8/rubygems.rb:134:in `activate' from /usr/lib/ruby/site_ruby/1.8/rubygems.rb:158:in `activate' from /usr/lib/ruby/site_ruby/1.8/rubygems.rb:157:in `each' from /usr/lib/ruby/site_ruby/1.8/rubygems.rb:157:in `activate' from /usr/lib/ruby/site_ruby/1.8/rubygems.rb:49:in `gem' from /usr/bin/update_rubygems:18
To fix this “report_activate_error”, I installed builder. Tried the update again, same thing for session and hoe-seattlerb. This is where I found myself in a dependency cycle.
rubygems-update 1.3.5 requires (among other things) hoe-seattlerb
hoe-seattlerb requires hoe >= 2.3.0
hoe >= 2.3.0 requires rubygems >= 1.3.1
I might have the specific dependency chain a bit off, but it’s enough to say that something required by rubygems-update itself requires a semi-recent RubyGems.
The solution is to update in stages. First update to 1.3.0, then update the rest of the way:
# gem install rubygems-update -v 1.3.0 Successfully installed rubygems-update-1.3.0 1 gem installed # update_rubygems Installing RubyGems 1.3.1 ... # gem update --system Updating RubyGems Updating rubygems-update Successfully installed rubygems-update-1.3.5 :0:Warning: Gem::SourceIndex#search support for String patterns is deprecated Updating RubyGems to 1.3.5 Installing RubyGems 1.3.5 RubyGems 1.3.5 installed
Capistrano Deployments from GitHub
Late last week I spent a chunk of time helping a client troubleshoot a deployment problem with Capistrano. For several weeks, I’ve been working on the site, deploying without any trouble, but when he tried to do it himself, it failed:
$ cap deploy
* executing `deploy'
* executing `deploy:update'
** transaction: start
* executing `deploy:update_code'
updating the cached checkout on all servers
* executing (...long git command snipped...)
servers: ["example.com"]
[example.com] executing command
** [example.com :: err] Permission denied (publickey).
** [example.com :: err] fatal: The remote end hung up unexpectedly
command finished
*** [deploy:update_code] rolling back
The problem was that I’d created a new user on his server to separate the production site from the staging site, and the staging user did not have a SSH key pair to access GitHub. It worked for me because I did have my SSH key on GitHub and a SSH agent in place that allowed authentication to be done through me instead of the server.
Capistrano’s error message in this case is not very helpful. If you think you’re having a similar problem, try cap shell and run a command, such as uptime. If that works, it’s not a permissions problem to your server, it’s probably from your server to GitHub. Either install a SSH agent or create a new key pair specifically for deployments.
Testing HTTP Digest Authentication in Rails
Rails 2.3 introduced HTTP Digest authentication to go along with HTTP Basic as simple ways to authenticate access to your application. HTTP Digest is more secure than Basic for several reasons. First, no passwords are transmitted in cleartext (Basic only Base64 encodes them — there is no encryption). Second, the content of the HTTP_AUTHORIZATION header is tied to a request method and URI. Third, there is a limited window of time an attacker could re-use the header in a replay attack even against the same request method and URI (five minutes in 2.3.2).
These are compelling reasons to switch to Digest from Basic, but there is one problem. How do you test it? For good test coverage, clearly you want to verify that actions that should be protected are, and that only valid username/password combinations permit access.
Basic is easy (in test_helper.rb):
def authenticate_with_http_basic(user = 'one', password = 'one') @request.env['HTTP_AUTHORIZATION'] = "Basic #{Base64.encode64("#{user}:#{password}")}" end
Try this for Digest (outside of everything else in ActiveSupport::TestCase):
require 'digest/md5' class ActionController::TestCase def authenticate_with_http_digest(user = 'admin', password = 'admin', realm = 'Application') unless ActionController::Base < ActionController::ProcessWithTest ActionController::Base.class_eval { include ActionController::ProcessWithTest } end @controller.instance_eval %Q( alias real_process_with_test process_with_test def process_with_test(request, response) credentials = { :uri => request.env['REQUEST_URI'], :realm => "#{realm}", :username => "#{user}", :nonce => ActionController::HttpAuthentication::Digest.nonce, :opaque => ActionController::HttpAuthentication::Digest.opaque, } request.env['HTTP_AUTHORIZATION'] = ActionController::HttpAuthentication::Digest.encode_credentials( request.request_method, credentials, "#{password}", false ) real_process_with_test(request, response) end ) end end
Now, precede a call to get, post, put or delete with a call to authenticate_with_http_digest. The code above monkeypatches process_with_test to set up the proper HTTP_AUTHORIZATION header, but only for the current @controller, then runs the request normally.
Natural Language Date & Time Parsing for ActiveRecord (Rails 2.1+)
If you read my previous post on natural language date & time parsing for ActiveRecord and tried it on Rails 2.1 or later, you may have noticed it works fine for dates, but not dates with times.
As part of the time zones feature introduced in Rails 2.1, changes were made to how ActiveRecord parses date/time columns during attribute assignment. The overridden method I showed in my previous post is never called, and Chronic doesn’t get a chance to parse the string.
For Rails 2.1 and later, instead you need to provide a new version of ActiveSupport::TimeZone#parse:
class ActiveSupport::TimeZone def parse_with_chronic(string) time = parse_without_chronic(string) if time.nil? time = Chronic.parse(string, :now => self.now) time = self.local(time.year, time.month, time.day, time.hour, time.min, time.sec) end time end alias_method_chain :parse, :chronic end
Rails 2.2 Integration Tests Always Use the Cookie Store
If you’re upgrading an existing Rails app to 2.2 or later, and you aren’t already using the cookie session store, you will likely find that your integration tests are failing. The symptom will be a “500 Internal Server Error” mixed in with “E”s for each failing test. The stack trace at the end will start with:
NoMethodError: You have a nil object when you didn't expect it!
You might have expected an instance of ActiveRecord::Base.
The error occurred while evaluating nil.[]=
.../vendor/rails/actionpack/lib/action_controller/integration.rb:294:in `process'
If you dig into integration.rb, you’ll find it’s trying to access the @header hash.
The cause of this problem is that integration tests are now hard-coded to use the cookie session store. If you aren’t using it, there’s a good chance you haven’t defined a secret for it in your environment.rb. To solve, add this to your environment.rb:
config.action_controller.session = { :session_key => '_session_key', :secret => 'long unguessable string' }
Get your own “long unguessable string” by running rake secret.
PostgreSQL and Ruby on Rails
The current state state of PostgreSQL on Rails (early 2009) is a bit of a muddy mess. There are three different gems you can use to connect ActiveRecord to PostgreSQL and no guidance I can find about which of the two native adapters to use. I’m going to try and clear this up.
postgres-pr is a pure Ruby adapter. It doesn’t require native libraries and should work in most situations. However, because it isn’t native, it’s the low performer and likely won’t offer access to all of PostgreSQL’s features.
postgres is the old adapter. It appears to be maintained now by Jeff Davis, who forked it from Dave Lee. If you are compiling against PostgreSQL 8.3, you must use Jeff’s version (currently 0.7.9.2008.01.28), which includes build fixes for 8.3.
pg is the new adapter, also maintained by Jeff Davis. He says this one has a better design than postgres and offers more features. It does not work prior to Rails 2.1, however. Rails 2.1 and later will attempt to load this driver first and fall back to postgres. If you use any plug-ins that monkeypatch the database driver, you might have problems with pg (my own sql_logging, for one, is broken — I’ll fix this shortly).
So:
- If you can build native extensions, are on Rails 2.1 or later, use pg.
- If you can build native extensions, but are on Rails 2.0 or earlier, use postgres.
- If you cannot build native extensions, use postgres-pr.
Building native extensions on OS X can be tricky, though. Here’s what I use on an Intel Mac, using the Ruby and Rubygems that ship with OS X and PostgreSQL 8.3 from MacPorts. (Unlike Robby’s guide, I do not advocate moving the system’s Ruby out of the way. You’re likely to break other stuff if you do so.)
sudo env ARCHFLAGS='-arch i386' gem install pg --remote -- --with-pgsql-include=/opt/local/include/postgresql83 --with-pgsql-lib=/opt/local/lib/postgresql83
If you use a different version of PostgreSQL than 8.3, make the appropriate substitution. If you’re still on PowerPC, change the ARCHFLAGS to -arch ppc. If you want to use postgres instead of pg:
sudo env ARCHFLAGS='-arch i386' gem install postgres --remote -- --with-pgsql-include=/opt/local/include/postgresql83 --with-pgsql-lib=/opt/local/lib/postgresql83
sql_logging Updated for Edge Rails
I updated the sql_logging plug-in today to work on Edge Rails. The PostgreSQL adapter on Edge tries to load a different database driver, and explicitly loading it when the application isn’t using it causes a crash when the server starts. Today’s change reworks things by asking ActiveRecord what adapter it’s using and only loading the extensions for it.
One other change includes making PostgreSQL query statistics work on Rails 2.0. The adapter there prefers to call the execute method instead of query.
Things should still work on Rails 1.2, but I’m moving applications to 2.0 whenever possible, so I’ll rely on a bug report if something is broken.
Finally, I’ll leave with a little tip: statistics aren’t kept for queries that lack a name or look like Rails asking for schema information. If you have a significant amount of query work done using the raw connection (select_all or select_one), make sure you give the query a name so it can be tracked properly and show up in the top 10 list.
Get the plug-in from svn at http://svn.lightyearsoftware.com/svn/plugins/sql_logging.
Working with ActiveRecord Callbacks
This post should be subtitled “And how not to waste one of YOUR afternoons because of them.”
As the documentation describes, there are a total of 18 “normal” callbacks defined by ActiveRecord that allow you to manage the lifecycle of your objects. For the purpose of this post, I define “normal” as callbacks that can be implemented in a way other than overriding the named method. There are two more (after_initialize and after_find), but they must be implemented as an overridden method, so my argument here does not apply to them.
The short of it: don’t define your callbacks as explicit methods. In other words, don’t do this:
def after_save # ... end
OK, so why? The answer boils down to how ActiveRecord executes callbacks and why it may decide to stop running the chain.
Explicit method implementations like this come dead last in the calling order, and since callbacks can return false and stop the execution of further callbacks, your code may never be executed. This can come as a complete surprise, because associations might define callbacks. For example, has_one defines an after_save callback that will try to save the associated model if you just saved a new record or if the associated model is a new record. If for some reason that associated model isn’t valid, the return value is false and ActiveRecord kills the chain of callbacks. Perhaps you have an after_save method explicitly defined that would fix this problem? It won’t be called, no exception is raised and you’ll spend a couple of hours reading ActiveRecord code to piece things together.
The ActiveRecord::Callbacks documentation explains most of this, but in bits and pieces and never in a way that makes it clear to how to best avoid trouble. Here are my new rules of the road:
- Declare callbacks either as symbols to method names (not named after their callback), inline blocks, strings (for eval) or completely separate objects.
- Declare them very early in your model, but especially before associations. Associations may declare their own callbacks and you probably want yours to run first.
The sql_logging Debugging Plug-In
One of my current projects involves significantly improving the search function in a Rails application, a thorny problem that led me to come up with the method to quiet down the Rails logger in my SQL log silencer. It turns out, that wasn’t enough. I have expanded quite a bit on that idea by folding in some useful debugging aids from other developers, making them more useful, and adding a SQL query “hit list” after each request completes. I’m releasing it as the sql_logging plug-in.
Installation is the usual (on one line):
$ script/plugin install \
http://svn.lightyearsoftware.com/svn/plugins/sql_logging
I have set up a forum for discussion and support of the plug-in.
So, what does this give you?
Out of the box, you get these new things:
- A count of the number of rows returned and a rough count of bytes for each SELECT query
- A summary at the end of the request of the total number of queries, bytes returned and SQL executions
- Each SQL execution includes a clean-up backtrace to show you exactly what part of your code triggered the query
- A summary at the end of the request showing a “top 10″ list of queries
The last one is particular useful. Queries are grouped by their backtrace, and then sorted by either number of executions, rows returned or bytes returned. This provides an excellent way to target your performance optimization work.
When sorted by executions, you very quickly see where the same code is triggering a query over and over, and can spend your optimization time trying to reduce this by adding the model to an earlier query (via :include) or by getting more rows at a time by batching up loads (WHERE id IN (...)). Here’s a sample:
Completed in 13.01586 (0 reqs/sec) | Rendering: 2.69928 (20%) | DB: 8.68216 (66%), 154 selects, 1728.1 KB, 166 execs
Top 10 SQL executions:
Executed 95 times, returning 180 rows (230.5 KB):
app/models/merge_keyword.rb:34:in `get_merges_by_keyword'
app/models/merge_keyword.rb:30:in `each'
app/models/merge_keyword.rb:30:in `get_merges_by_keyword'
app/models/merge.rb:107:in `search'
app/models/merge.rb:105:in `each'
app/models/merge.rb:105:in `search'
app/controllers/search_controller.rb:98:in `results'
Executed 22 times, returning 30 rows (1.3 KB):
app/models/merge.rb:80:in `attribute'
app/views/merges/_result_line.rhtml:2:in `_run_rhtml_47app47views47merges47_result_line46rhtml'
app/views/search/results.rhtml:172:in `_run_rhtml_47app47views47search47results46rhtml'
app/views/search/results.rhtml:170:in `each'
app/views/search/results.rhtml:170:in `each_with_index'
app/views/search/results.rhtml:170:in `_run_rhtml_47app47views47search47results46rhtml'
Executed 20 times, returning 30 rows (1.2 KB):
app/models/merge.rb:80:in `attribute'
app/helpers/application_helper.rb:40:in `merge_detail_link'
app/views/merges/_result_line.rhtml:26:in `_run_rhtml_47app47views47merges47_result_line46rhtml'
app/views/search/results.rhtml:172:in `_run_rhtml_47app47views47search47results46rhtml'
app/views/search/results.rhtml:170:in `each'
app/views/search/results.rhtml:170:in `each_with_index'
app/views/search/results.rhtml:170:in `_run_rhtml_47app47views47search47results46rhtml'
By sorting by rows or bytes returned, you may be able to determine that you need to let the database do more of the filtering work. It may be inefficient to have the database return a pile of rows to your application, only to discard a bunch of them.
The silencer from my previous post is in there, too, along with a new silencer that quiets only the backtrace for the duration of the block:
silence_sql do # ... end silence_sql_backtrace do # ... end
I must give a big hat tip to Adam Doppelt and Nathaniel Talbott. Adam’s SQL logging enhancements formed the basis for the logging enhancements here. I made them a little more generic so they worked with both MySQL and PostgreSQL, as well as tracking executions for things that don’t return data. Nathaniel’s QueryTrace plug-in has been around for a while, and I originally tried making this new plug-in simply work with it, but in the end decided to fold the functionality in so the “top 10″ list didn’t duplicate the backtrace cleaning code. I also added a regexp to the cleaner so it excludes the plug-in’s code from the backtrace.
There are a few more goodies in the plug-in, so look at the README for more details.
Update: I fixed a few bugs and added a feature. The plug-in now ignores queries whose name ends with ” Column”. These are internal Rails queries and their inclusion skewed the results.
The new feature tracks the execution time of SQL, and lets you sort the top-10 list by either total time for all instances of the SQL or the median execution time. Total time is also now the default sorting method, as this is probably the best way to attack SQL optimization: whatever is taking up the biggest percentage of your time in the database is a good place to start.
Update 2: The plug-in works on Rails 2.0.2 now.
A SQL Log Silencer
Here’s something I just whipped up to quiet ActiveRecord’s logging in development mode. Normally, ActiveRecord’s AbstractAdapter logs every SQL query as it is made. Sometimes, it’s nice to quiet this down without changing your log level while tracking down a problem.
class ActiveRecord::ConnectionAdapters::AbstractAdapter def log_info_with_silencing(sql, name, runtime) unless ActiveRecord::Base.silence_sql_logging log_info_without_silencing(sql, name, runtime) end end alias_method_chain :log_info, :silencing end module SqlSilencer def silence_sql logging = ActiveRecord::Base.silence_sql_logging begin ActiveRecord::Base.silence_sql_logging = true yield ensure ActiveRecord::Base.silence_sql_logging = logging end end end class ActiveRecord::Base @@silence_sql_logging = false cattr_accessor :silence_sql_logging include SqlSilencer end class ActionController::Base include SqlSilencer end
Drop this into a file in config/initializers if you’re on Rails 2.0 or require it from within environment.rb for pre-2.0.
To use inside a controller action or model method, simply wrap the code you want to silence inside a block:
def index silence_sql do # lots of work here... end
One interesting side effect of this is that you can easily see how much database work is done as a result of what’s in your view. Wrap the entire controller action inside a silence_sql block, then look at your log. Any queries you see are a result of things you didn’t load in the controller.
