April 14, 2007
Twitter & Rails: Hiccups Today, Proven Scalability Tomorrow
Running on Rails has forced us to deal with scaling issues – issues that any growing site eventually contends with – far sooner than I think we would on another framework.
The common wisdom in the Rails community at this time is that scaling Rails is a matter of cost: just throw more CPUs at it. The problem is that more instances of Rails (running as part of a Mongrel cluster, in our case) means more requests to your database. At this point in time there’s no facility in Rails to talk to more than one database at a time. The solutions to this are caching the hell out of everything and setting up multiple read-only slave databases, neither of which are quick fixes to implement. So it’s not just cost, it’s time, and time is that much more precious when people can[’t] reach your site.
It was compounded this week by Mike Pence’s attempt to dramatize disagreements between the Rails and Seaside communities.
Then Coding Horror tried to break it down by each language’s runtime execution time.
I’ll be honest: I don’t know much beyond the basics about Rails. I cut my teeth on the Active Record design pattern, even if I find it lacking now. And what I know about Ruby, I like, because it feels like a much improved Perl.
I find it implausible that Twitter has reached the limits of a non-partitioned (some would call it “federated,” though that is, technically, incorrect) database schema. That comes later. It also doesn’t seem plausible that Rails forces all of your tables into a single database.
That leaves database replication. So, my understanding of his complaint is that Rails doesn’t support a one-master, many-slaves architecture. If it did, replication would be fairly quick and easy for a company in trouble to set up.
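To make the one-master, many-slaves idea concrete, here is a minimal sketch in Python of what read/write splitting looks like at the application level. It is not how Rails or Twitter does it. `RecordingConnection`, `ReadWriteRouter`, and the `updates` table are names I made up for illustration, and the "connections" only log the SQL they receive instead of talking to a real database.

```python
import itertools


class RecordingConnection:
    """Hypothetical stand-in for a real database connection; it only logs SQL."""

    def __init__(self, name):
        self.name = name
        self.statements = []

    def execute(self, sql):
        self.statements.append(sql)
        return self.name  # report which server handled the statement


class ReadWriteRouter:
    """Send SELECTs to read-only slaves (round-robin) and everything else to the master."""

    def __init__(self, master, slaves):
        self.master = master
        # With no slaves configured, every statement falls back to the master.
        self._slaves = itertools.cycle(slaves) if slaves else None

    def connection_for(self, sql):
        # Naive classification (an assumption of this sketch): a statement is a
        # read if it starts with SELECT. Real SQL needs more careful handling.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._slaves is not None:
            return next(self._slaves)
        return self.master

    def execute(self, sql):
        return self.connection_for(sql).execute(sql)


master = RecordingConnection("master")
slaves = [RecordingConnection("slave-1"), RecordingConnection("slave-2")]
router = ReadWriteRouter(master, slaves)

results = [
    router.execute("INSERT INTO updates (body) VALUES ('hello')"),
    router.execute("SELECT * FROM updates"),
    router.execute("SELECT * FROM updates"),
    router.execute("SELECT * FROM updates"),
]
print(results)
```

The write lands on the master and the three reads rotate across the two slaves. The sketch ignores replication lag: a read sent to a slave immediately after a write may not see that write yet, which is part of why this is less trivial in practice than it looks here.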
There are really three takeaways from this.
First, it doesn’t matter how “slow” Ruby is compared to C++ or Java. Their problem isn’t with the Ruby VM as an execution environment, it’s with Rails as a platform. And the only problem there is its lack of support for common, sensible data-level scaling strategies. Really, any language out there today is capable of scaling. (Sure, I’m simplifying. Platforms that don’t require spawning a new process for each request are “more scalable” and also may have shorter response times.)
Second, Twitter is a company with a lot of goodwill in the community. They’ll get through this. A lot of us only wish our employers were in the position of not losing members and positive press every minute our sites cannot withstand the traffic. I envy their developers right now: their team will gel in a way it probably hasn’t yet, and a year from now they’ll be sharing their scaling experiences at all the conferences, and swapping war stories with new team members. (Aren’t they still hiring?)
Third, Rails will get through this. Twitter is, as far as I know, the first Rails-based company to push Rails to its (well-known, well-criticized) limits. If the Twitter developers don’t contribute improvements back to the Rails community, other Rails developers will. This experience only makes Rails more attractive to new start-ups who, if they’re lucky, will have the same traffic issues in a year.