December 26, 2006
Scaling eBay: A 21st Century Web Story
While I’ve been on vacation, others have pointed out an SDForum presentation given by two senior eBay technologists, Randy Shoup and Dan Pritchett, on The eBay Architecture: Striking a balance between site stability, feature velocity, performance and cost.
They tell a scalability story that is becoming more common, one which many people still find counter-intuitive:
- partitioned database schemas, based on usage patterns; mostly partitioned by primary key, with mapping tables where necessary;
- running with database slaves with various lag times;
- no stored procedures;
- no joins, sorts, or referential integrity constraints performed at the database layer: all this is done at the application layer, where CPU availability is easier to scale;
- no client-side or distributed transactions;
- homebrew frameworks as necessary (in eBay’s case, they use their own Java ORM framework, and their own connection pooling).
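To make a few of those rules concrete, here is a rough sketch of what "partition by primary key, use a mapping table, and do the join and sort in the application" can look like. This is not eBay's code: the modulo routing, the shard count, and every name in it (`shard_for`, `seller_items`, `items_for_seller`, and so on) are made up for illustration, and plain in-memory dicts stand in for separate database hosts.

```python
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical shard count, chosen only for the example


def shard_for(key: int) -> int:
    """Route a row to a shard by its primary key (simple modulo scheme)."""
    return key % NUM_SHARDS


# Each dict stands in for a table on a separate database host.
user_shards = [dict() for _ in range(NUM_SHARDS)]  # user_id -> user row
item_shards = [dict() for _ in range(NUM_SHARDS)]  # item_id -> item row
# Mapping table: seller_id -> list of item_ids, partitioned by seller_id.
seller_items = [defaultdict(list) for _ in range(NUM_SHARDS)]


def put_user(user_id: int, name: str) -> None:
    user_shards[shard_for(user_id)][user_id] = {"user_id": user_id, "name": name}


def put_item(item_id: int, seller_id: int, title: str, price: int) -> None:
    # Two independent writes, no transaction spanning them.
    item_shards[shard_for(item_id)][item_id] = {
        "item_id": item_id,
        "seller_id": seller_id,
        "title": title,
        "price": price,
    }
    seller_items[shard_for(seller_id)][seller_id].append(item_id)


def items_for_seller(seller_id: int) -> list:
    """Join users to items and sort by price, all in application code."""
    seller = user_shards[shard_for(seller_id)].get(seller_id)
    if seller is None:
        return []
    item_ids = seller_items[shard_for(seller_id)].get(seller_id, [])
    rows = []
    for item_id in item_ids:
        item = item_shards[shard_for(item_id)].get(item_id)
        # Nothing enforces referential integrity, so skip dangling ids.
        if item is not None:
            rows.append(
                {"seller": seller["name"], "title": item["title"], "price": item["price"]}
            )
    # The sort happens here, on the application tier, not in the database.
    return sorted(rows, key=lambda row: row["price"])
```

Calling `items_for_seller(7)` touches one user shard, one mapping shard, and however many item shards the seller's items happen to live on; the database only ever answers single-key lookups.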
Aside from the tale of what they’re doing now, they provide an excellent history of eBay’s scaling problems through the years. It’s a great overview of what leads a company to these solutions.
If rules like “no joins at the database level” are becoming so common (or, as Joe Gregorio commented, it’s almost as if everybody is slowly implementing their own version of BigTable), why is it still counter-intuitive? I blame it on University Education. The approach to teaching databases at most universities is a lot like teaching multiplication tables to first graders: a lot of rule learning and regurgitation.
(There’s a very predictable 10-to-15 month cycle for a new hire at Webshots, which starts with blaming MySQL for all the world’s problems, moves through secretly inserting joins into production code, followed by resentment of the Establishment Authority, finally leading to enlightenment. Not that Webshots is anywhere near as good as the big guys when it comes to scaling.)
If you find this interesting, be sure to check out Scoble’s interview of Eric Billingsley, head of eBay Research, which I blogged about in October. Eric focused more on scaling search, and also goes into some of the history.
What I still find most fascinating about eBay is their approach to rolling out features: there’s a new release that contains 100K+ LOC (how much is auto-generated?) every two weeks, and they complete 300 new features per quarter. From what I hear from those inside eBay, this requires careful release management and putting in hours during night shifts, but it’s still awe-inspiring.
Finally, check out the summary and commentary by Frank Sommers over at Artima, which concludes with the following insight:
[T]he main message of the eBay presentation […] [is] the fact that eBay was able to meet the challenges of its growth with subsequent refinements to its system, all the while keeping the site operational.
The reason that’s interesting is because it suggests that you can start with almost any architecture—even with Perl or Rails or JSP pages—as long as you know how to migrate to the next step, and have the capability to do so, if and when you need to scale your app. That, in turn, suggests that the key test of scalability is not so much how each architecture stage scales, but how readily a company or an organization can move an application from one architecture step to the next. That indicates that scaling is as much an individual or organizational question as a technical one.