Safari’s Web Inspector in Nightly

Surfin’ Safari has a great post talking about some recent improvements to the Web Inspector in the latest WebKit nightly builds. Safari’s Web Inspector has long been a really great tool for examining the DOM and simple profiling of a requested page and its resources, but these new updates make it a serious tool for web developers.

If you’ve never tried a WebKit nightly build before, it is easy (on a Mac, anyway) to run it side-by-side with Safari. You don’t have to give up your stable, released browser to check out these new features. Still, it will be very nice when these improvements make it into an official release.

For Firefox users, YSlow is a very nice extension built on top of Firebug that provides a detailed report card on page load performance, including suggestions for improving it.

Announcing Pat Counter for iPhone

This evening I finally received an email from Apple informing me that our first application for the iPhone and iPod touch is ready for sale. I’m pleased to announce Pat Counter, a simple way to keep track of a running count without having to hold something in your hand or look at a screen to tap a button.

It’s a simple application, but it was an interesting experience in developing and shipping an application for a new platform. Rands was right about 1.0: it’s amazing how much work and time it takes to take care of all the little details, even in something as simple as Pat Counter. Valley start-ups really do go out of business because it can be so hard. (Aside: Rands’s book Managing Humans is quite good.)

I wasn’t able to find any data points out there about lead times, but for me it was five days between submission of the binary to Apple and approval for sale. That includes a weekend, but it seems like apps are still approved then, just at a slower rate.

Finally, a word of advice: make sure your binary is right the first time. I goofed my initial build and it took seven days before Apple told me about it, by which time I’d already noticed it myself. Replacing the binary was effectively the same as the initial submission; I went to the back of the line. Read the instructions for a distribution build closely and follow them exactly.

Update: It’s now available on the App Store with a direct link, but isn’t in the search results.

Update 2 (Sep. 15): It now shows up in search results, a mere five days after approval.

What else might be in Snow Leopard?

Apple Inc. announced Mac OS X 10.6 “Snow Leopard” at WWDC’08 on Monday, but other than a press release and a page each for 10.6 Client and Server, very little information was given that isn’t covered under the WWDC NDA.

Still, one “feature” is interesting, especially in the face of (mostly confirmed) rumors that 10.6 will drop support for the PowerPC:

Snow Leopard dramatically reduces the footprint of Mac OS X, making it even more efficient for users, and giving them back valuable hard drive space for their music and photos.

Dropping PowerPC will reduce the size of executable code by around 50%. Here is the output of otool -f for iTunes:

Fat headers
fat_magic 0xcafebabe
nfat_arch 2
architecture 0
    cputype 18
    cpusubtype 0
    capabilities 0x0
    offset 4096
    size 17345088
    align 2^12 (4096)
architecture 1
    cputype 7
    cpusubtype 3
    capabilities 0x0
    offset 17350656
    size 17053824
    align 2^12 (4096)

The PowerPC’s CPU type is 18 (look in /usr/include/mach/machine.h), so the PowerPC code in the main iTunes executable accounts for 50.4% of the total. I’m going to assume that this ratio will mostly hold true for frameworks, too.
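A quick sanity check of that figure, straight from the two size fields above:

```ruby
# Check the PowerPC share of the fat binary using the otool -f output above.
ppc = 17_345_088   # size of the cputype 18 (PowerPC) slice
x86 = 17_053_824   # size of the cputype 7 (x86) slice

pct = ppc * 100.0 / (ppc + x86)
puts "PowerPC share: %.1f%%" % pct   # => PowerPC share: 50.4%
```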

The entire iTunes bundle is 122 MB, though. Trimming the 16.5 MB of PowerPC code from the main executable, plus roughly 1 MB from the bundled frameworks, only saves about 14%; not what I’d call a “dramatic” footprint reduction. What else might they have planned?

Another couple of options are transparent compression at the filesystem layer and conversion of image resources to a vector format.

Apple’s own page says that read/write ZFS will be in 10.6 Server, so it’s reasonable to expect it will be in Client, too, perhaps with some of the more advanced features disabled. ZFS includes an option to transparently compress data. NTFS on Windows has had this for years, and third-party products did it on FAT years before that, so it’s entirely reasonable to expect that Macs will finally get this, too. 71% of the iTunes bundle is resources, mostly localizations, and inside those, the largest directories are for the in-app help, which is HTML.

Support for resolution independence has been rumored for years, and while Apple has supported it to some degree since 10.4, it hasn’t really caught on. Might they finally be converting all of the system image resources to a vector format? I don’t have any numbers on it, but a vector graphic is certain to take up much less space than a 512×512 PNG.

Advertising and Making Money Online

David Heinemeier Hansson gave an amusing talk at Startup School ’08 and it echoes something I’ve been thinking about for some time. A lot of web sites out there get very, very popular and yet there is no indication they plan to make money. Many more web sites fall back on advertising as their primary business model.

ZingLists isn’t terrifically popular, but I’ve given a lot of thought to how it might generate some revenue and at least pay the hosting bill. Up to now, I’ve taken the easy route and put Google ads on the list summary pages. They aren’t very intrusive and once in a while they’re actually useful, displaying an ad that helps you cross something off that list.

There are really two problems with relying on advertising as a business model. The first is my problem. Relying on someone else to sell the ads (Google) means you essentially have a single customer. If something happens to that customer, all of your revenue disappears in a puff of smoke. Google disabled my account about a week ago and I can’t get them to tell me why. It’s the whole account, too: AdSense, Analytics, Gmail, Calendar, everything. I have no idea what the perceived problem is, though my best guess is click fraud. That sucks, though, because I can’t control an end-user’s behavior, and if I don’t show them the ads, I have no chance of generating revenue. So Google AdSense (or any single provider of ads) is not a real business model.

The second big problem with advertising is this simple question: why are you in business? Is it to sell advertising or to write software / provide a service? If it’s the latter, I’m sorry to inform you that generating revenue by selling your own ads means you are in fact in the business of selling ads, not writing software. This is not a business I want to be in.

Another Example of How Not to Do Customer Relations

Mozy finally announced their 1.0 release for Mac OS X. I haven’t heard from these guys since I signed up for the notification, probably more than a year ago. Then today, out of the blue, I get an email that leads off with:

One of the nice things about being an ugly American is that I make all sorts of ignorant and culturally insensitive remarks about other countries. And as luck would have it, I had that opportunity last week when I visited Montreal. (That’s in Canada.)

The first thing I noticed about Montreal is that it’s a very clean city – like Toronto in that respect. Oh wait – actually, the first thing I noticed was the streets were on fire and full of police and rioting citizens because I guess they won an O-KAY (that’s hockey) game.

Admitting that you’re a jerk doesn’t give you a free pass to go ahead and act like a jerk. If anything, it’s worse because you’ve made it clear that you know it’s wrong.

And as a method of contacting a list of people who otherwise haven’t heard from your company before? This isn’t how I’d choose to establish a relationship with potential customers.

I’ll rest well knowing that Mozy won’t get any of my money, especially with so many other options available in their market.

sql_logging Updated for Edge Rails

I updated the sql_logging plug-in today to work on Edge Rails.  The PostgreSQL adapter on Edge tries to load a different database driver, and explicitly loading it when the application isn’t using it causes a crash when the server starts.  Today’s change reworks things by asking ActiveRecord what adapter it’s using and only loading the extensions for it.
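The adapter check amounts to something like this sketch (the helper and file names here are illustrative, not the plug-in’s literal code):

```ruby
# Hypothetical sketch: map ActiveRecord's reported adapter name to the one
# extension file worth loading; nil for adapters we don't extend.
def sql_logging_extension_for(adapter_name)
  case adapter_name
  when /mysql/i      then 'sql_logging/mysql_extensions'
  when /postgresql/i then 'sql_logging/postgresql_extensions'
  end
end

# In the plug-in's init, this would drive a conditional require, e.g.:
#   ext = sql_logging_extension_for(ActiveRecord::Base.connection.adapter_name)
#   require ext if ext
```

This way, the PostgreSQL driver is never touched unless the application is actually configured to use it.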

One other change includes making PostgreSQL query statistics work on Rails 2.0.  The adapter there prefers to call the execute method instead of query.

Things should still work on Rails 1.2, but I’m moving applications to 2.0 whenever possible, so I’ll rely on a bug report if something is broken.

Finally, I’ll leave with a little tip: statistics aren’t kept for queries that lack a name or look like Rails asking for schema information.  If you have a significant amount of query work done using the raw connection (select_all or select_one), make sure you give the query a name so it can be tracked properly and show up in the top 10 list.
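Concretely, the raw-connection methods accept a query name as their second argument. A minimal sketch with a stand-in connection object (in an app you’d call these on ActiveRecord::Base.connection directly):

```ruby
# Stand-in for the real adapter, just to show the calling convention.
class FakeConnection
  def select_all(sql, name = nil)
    name   # a real adapter returns the result rows; we return the name to inspect it
  end
end

conn = FakeConnection.new
conn.select_all("SELECT * FROM lists")                  # unnamed: skipped by the stats
conn.select_all("SELECT * FROM lists", "List summary")  # named: tracked in the top 10
```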

Get the plug-in from svn at http://svn.lightyearsoftware.com/svn/plugins/sql_logging.

Quality of Service

I clearly take my job a little too seriously.

I’ve been filling up some free hours these past couple of weeks putting some polish on ZingLists. Most of the changes involve tweaking a few things here and there, but the big one is that I’m adding a mobile interface for the most essential features, and that necessitated moving from Rails 1.2 to Rails 2.0.

Everything went smoothly during development, but late Friday, when I pushed the code to production, anything that touched the new code died with a scary-looking illegal instruction signal. My production server runs FreeBSD, and I had upgraded a few other packages to stay current with security issues and bug fixes, so this left me with the cold feeling of a completely dead site late on a Friday afternoon.

As it turns out, while I haven’t solved the problem yet, I got lucky through some over-engineering on my part when I first wrote my Capistrano deployment recipes. I do deployments in two stages. Stage 1: push the code, run migrations and background tasks (e.g generating a Google sitemap). Stage 2: move configuration files into place, switch the “current” symlink and restart processes. Since I hadn’t run stage 2 yet, everything was still working fine.

Twitter also pushed some new code on Friday, but they’re having problems big enough that the site may as well be down.

ZingLists is microscopic compared to Twitter in just about every way imaginable. I’m one guy with no VC; Twitter is a dozen-ish people with at least a few million in VC. Neither of us is making any money off our web sites. Shouldn’t I be the one that doesn’t care about my uptime?

I’m optimistic that I will solve my “illegal instruction” problem with Rails 2.0, so in the interest of priming this post for anyone else who runs into the same thing, here’s a summary of the issue. (I expect most of you will stop reading here.)

The “illegal instruction” seems to be a case of the Ruby interpreter not dealing well with a stack overflow. This could be from deep recursion, but I doubt it in my case since the same code works fine in development. Even something as simple as List.find(:all) from script/console causes the problem, and that’s a basic ActiveRecord query.

On FreeBSD 6.2, the Ruby port is always compiled with pthreads. Because of the way that this version of FreeBSD is written, if a dynamic shared object loaded into a running program needs a shared library (like pthreads), that library must be linked to by the executing program. In this case, that’s the Ruby interpreter. The default pthreads stack size on FreeBSD is relatively small compared to other operating systems, though I couldn’t track down an exact number. It’s in the ballpark of 64 KB or 128 KB. Linux usually uses something closer to 1 or 2 MB.

My theory at the moment is that Rails 2.0 nests method calls deeper than 1.2, just enough to exhaust the stack and trigger the illegal instruction signal. I’m guessing this is because the stack overflow results in some random bit of code being executed. I’m on the AMD64 platform, and I believe it marks non-code pages with the NX (no execute) bit, so a jump off to a random address is highly likely to be caught by the kernel.

I don’t really want to recompile or hack the Ruby interpreter to make pthreads stacks larger, since I’d have to make the same fix every time Ruby is updated. I’m going to try to replicate the problem in a 6.2 virtual machine, then upgrade it to 6.3 and possibly 7.0 to see if either of those solve it. Barring a response from the Ruby port maintainer, I may even look at switching to Ubuntu Linux, though I hate to do that since I’m otherwise very satisfied with FreeBSD.

Updated April 24: so as usual, admitting in public that I have a problem with some code virtually guarantees that the problem is in my code. I had an infinite recursion bug due to an :include on an association. FreeBSD simply died really fast, probably due to its small thread stack, and I was also missing an adequate test case to catch the problem.
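The crash-versus-exception difference is easy to sketch in plain Ruby: unbounded recursion eventually exhausts the thread stack, which MRI normally turns into a catchable SystemStackError. On my FreeBSD setup, the small pthread stack meant the process died with an illegal instruction before Ruby could do that.

```ruby
# Unbounded recursion: each call pushes a frame until the stack is exhausted.
def recurse(n)
  recurse(n + 1)
end

result = begin
  recurse(0)
rescue SystemStackError
  :caught   # what MRI usually does; my FreeBSD box crashed instead
end
```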

Working with ActiveRecord Callbacks

This post should be subtitled “And how not to waste one of YOUR afternoons because of them.”

As the documentation describes, there are a total of 18 “normal” callbacks defined by ActiveRecord that allow you to manage the lifecycle of your objects. For the purpose of this post, I define “normal” as callbacks that can be implemented in a way other than overriding the named method. There are two more (after_initialize and after_find), but they must be implemented as an overridden method, so my argument here does not apply to them.

The short of it: don’t define your callbacks as explicit methods. In other words, don’t do this:

def after_save
  # ...
end

OK, so why? The answer boils down to how ActiveRecord executes callbacks and why it may decide to stop running the chain.

Explicit method implementations like this come dead last in the calling order, and since callbacks can return false and stop the execution of further callbacks, your code may never be executed. This can come as a complete surprise, because associations might define callbacks. For example, has_one defines an after_save callback that will try to save the associated model if you just saved a new record or if the associated model is a new record. If for some reason that associated model isn’t valid, the return value is false and ActiveRecord kills the chain of callbacks. Perhaps you have an after_save method explicitly defined that would fix this problem? It won’t be called, no exception is raised and you’ll spend a couple of hours reading ActiveRecord code to piece things together.
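The halting rule is easy to model in plain Ruby (run_callbacks below is my stand-in for illustration, not ActiveRecord’s actual implementation):

```ruby
# Mimic of ActiveRecord's rule: run callbacks in order, stopping at the
# first one that returns false.
def run_callbacks(callbacks)
  ran = []
  callbacks.each do |name, cb|
    ran << name
    break if cb.call == false
  end
  ran
end

chain = [
  [:association_after_save, lambda { false }],  # e.g. has_one failing to save its model
  [:explicit_after_save,    lambda { true  }],  # your def after_save: last, never reached
]
run_callbacks(chain)   # => [:association_after_save]
```

The explicit method is in the chain, but the earlier false return means it never runs, and nothing tells you so.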

The ActiveRecord::Callbacks documentation explains most of this, but in bits and pieces and never in a way that makes it clear how best to avoid trouble. Here are my new rules of the road:

  1. Declare callbacks either as symbols to method names (not named after their callback), inline blocks, strings (for eval) or completely separate objects.
  2. Declare them very early in your model, but especially before associations. Associations may declare their own callbacks and you probably want yours to run first.

The sql_logging Debugging Plug-In

One of my current projects involves significantly improving the search function in a Rails application, a thorny problem that led me to come up with the method to quiet down the Rails logger in my SQL log silencer. It turns out that wasn’t enough. I have expanded quite a bit on that idea by folding in some debugging aids from other developers, making them more broadly useful, and adding a SQL query “hit list” after each request completes. I’m releasing it as the sql_logging plug-in.

Installation is the usual (on one line):

$ script/plugin install \
http://svn.lightyearsoftware.com/svn/plugins/sql_logging

I have set up a forum for discussion and support of the plug-in.

So, what does this give you?

Out of the box, you get these new things:

  • A count of the number of rows returned and a rough count of bytes for each SELECT query
  • A summary at the end of the request of the total number of queries, bytes returned and SQL executions
  • Each SQL execution includes a clean-up backtrace to show you exactly what part of your code triggered the query
  • A summary at the end of the request showing a “top 10” list of queries

The last one is particularly useful. Queries are grouped by their backtrace, then sorted by number of executions, rows returned or bytes returned. This provides an excellent way to target your performance optimization work.

When sorted by executions, you very quickly see where the same code is triggering a query over and over, and can spend your optimization time trying to reduce this by adding the model to an earlier query (via :include) or by getting more rows at a time by batching up loads (WHERE id IN (...)). Here’s a sample:

Completed in 13.01586 (0 reqs/sec) | Rendering: 2.69928 (20%) | DB: 8.68216 (66%), 154 selects, 1728.1 KB, 166 execs
Top 10 SQL executions:
  Executed 95 times, returning 180 rows (230.5 KB):
    app/models/merge_keyword.rb:34:in `get_merges_by_keyword'
    app/models/merge_keyword.rb:30:in `each'
    app/models/merge_keyword.rb:30:in `get_merges_by_keyword'
    app/models/merge.rb:107:in `search'
    app/models/merge.rb:105:in `each'
    app/models/merge.rb:105:in `search'
    app/controllers/search_controller.rb:98:in `results'
  Executed 22 times, returning 30 rows (1.3 KB):
    app/models/merge.rb:80:in `attribute'
    app/views/merges/_result_line.rhtml:2:in `_run_rhtml_47app47views47merges47_result_line46rhtml'
    app/views/search/results.rhtml:172:in `_run_rhtml_47app47views47search47results46rhtml'
    app/views/search/results.rhtml:170:in `each'
    app/views/search/results.rhtml:170:in `each_with_index'
    app/views/search/results.rhtml:170:in `_run_rhtml_47app47views47search47results46rhtml'
  Executed 20 times, returning 30 rows (1.2 KB):
    app/models/merge.rb:80:in `attribute'
    app/helpers/application_helper.rb:40:in `merge_detail_link'
    app/views/merges/_result_line.rhtml:26:in `_run_rhtml_47app47views47merges47_result_line46rhtml'
    app/views/search/results.rhtml:172:in `_run_rhtml_47app47views47search47results46rhtml'
    app/views/search/results.rhtml:170:in `each'
    app/views/search/results.rhtml:170:in `each_with_index'
    app/views/search/results.rhtml:170:in `_run_rhtml_47app47views47search47results46rhtml'

By sorting by rows or bytes returned, you may be able to determine that you need to let the database do more of the filtering work. It may be inefficient to have the database return a pile of rows to your application, only to discard a bunch of them.
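The batching fix mentioned above (WHERE id IN (...)) comes down to issuing fewer, bigger queries. A small plain-Ruby sketch of the idea, with an illustrative Rails 1.2/2.0-style call in the comments (batch size is arbitrary; Merge is the model from the sample output):

```ruby
# Split ids into batches so each batch becomes one WHERE id IN (...) query
# instead of one query per id.
def id_batches(ids, size = 50)
  batches = []
  ids.each_slice(size) { |slice| batches << slice }
  batches
end

# Each batch then loads in a single query, e.g.:
#   Merge.find(:all, :conditions => ["id IN (?)", batch])
id_batches((1..120).to_a).length   # => 3
```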

The silencer from my previous post is in there, too, along with a new silencer that quiets only the backtrace for the duration of the block:

silence_sql do
  # ...
end
 
silence_sql_backtrace do
  # ...
end

I must give a big hat tip to Adam Doppelt and Nathaniel Talbott. Adam’s SQL logging enhancements formed the basis for the logging enhancements here. I made them a little more generic so they worked with both MySQL and PostgreSQL, as well as tracking executions for things that don’t return data. Nathaniel’s QueryTrace plug-in has been around for a while, and I originally tried making this new plug-in simply work with it, but in the end decided to fold the functionality in so the “top 10” list didn’t duplicate the backtrace cleaning code. I also added a regexp to the cleaner so it excludes the plug-in’s code from the backtrace.

There are a few more goodies in the plug-in, so look at the README for more details.

Update: I fixed a few bugs and added a feature. The plug-in now ignores queries whose name ends with “ Column”. These are internal Rails queries and their inclusion skewed the results.

The new feature tracks the execution time of SQL, and lets you sort the top-10 list by either total time for all instances of the SQL or the median execution time. Total time is also now the default sorting method, as this is probably the best way to attack SQL optimization: whatever is taking up the biggest percentage of your time in the database is a good place to start.

Update 2: The plug-in works on Rails 2.0.2 now.

A SQL Log Silencer

Here’s something I just whipped up to quiet ActiveRecord’s logging in development mode. Normally, ActiveRecord’s AbstractAdapter logs every SQL query as it is made. Sometimes, it’s nice to quiet this down without changing your log level while tracking down a problem.

class ActiveRecord::ConnectionAdapters::AbstractAdapter
  def log_info_with_silencing(sql, name, runtime)
    unless ActiveRecord::Base.silence_sql_logging
      log_info_without_silencing(sql, name, runtime)
    end
  end
 
  alias_method_chain :log_info, :silencing
end
 
module SqlSilencer
  def silence_sql
    logging = ActiveRecord::Base.silence_sql_logging
    begin
      ActiveRecord::Base.silence_sql_logging = true
      yield
    ensure
      ActiveRecord::Base.silence_sql_logging = logging
    end
  end
end
 
class ActiveRecord::Base
  @@silence_sql_logging = false
  cattr_accessor :silence_sql_logging
 
  include SqlSilencer
end
 
class ActionController::Base
  include SqlSilencer
end

Drop this into a file in config/initializers if you’re on Rails 2.0 or require it from within environment.rb for pre-2.0.

To use inside a controller action or model method, simply wrap the code you want to silence inside a block:

def index
  silence_sql do
    # lots of work here...
  end
end

One interesting side effect of this is that you can easily see how much database work is done as a result of what’s in your view. Wrap the entire controller action inside a silence_sql block, then look at your log. Any queries you see are a result of things you didn’t load in the controller.