February 27, 2008

Who’s Depending on You?

Posted in Uncategorized at 10:33 pm by mj

Development practices are all about dependencies.

Usually, we think of things we depend on. “Crap, I forgot to upgrade to commons-logging 1.1.1.”

Sometimes, we think of others using our code. “I better make sure this unit test consistently passes, or that new guy I just made an offer to is going to hate me and post bad things about me online.”

But sometimes an architectural decision that is perfectly justifiable in most contexts is the wrong one when other teams are depending on you.

Case 1.
Let’s say you’re a central credit card processing gateway for various teams throughout your organization. For perfectly valid reasons, you’re going to simply perform an auth and return, and then try to capture later from a batch job. Maybe that batch job runs 5 seconds later, maybe 60, maybe it’s broken for 2 hours. Most of the time, it doesn’t really matter.

But here’s the gotcha: what if other teams want to run unit tests (OK, “automated integration tests using unit testing tools”)? What if there are actions that a user is forbidden from taking during this delay, which is usually OK since they’re not likely to take those actions anyway (e.g., cancelling 5 minutes after subscribing to a service)?

Suddenly, your scalable architectural decision makes everybody relying on you think you’re a flake, because their unit tests often fail for no apparent reason.

Or else they have to persist the last 24 hours’ worth of test accounts, and keep track of them.

Oops.

All the more so when you take the common lax approach to maintaining your development and staging environments. Who really cares if the batch job doesn’t run in dev for a few days? Well, that guy over there trying to meet a deadline who just wasted six hours because he thought the problem was in his code…
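One way out, sketched here in Python under loud assumptions: give the teams depending on you a way to flush the capture step on demand in dev and staging, so their tests don’t hinge on when the batch job happens to run. The `Gateway` class, its method names, and the `subscribe_for_test` helper are all hypothetical stand-ins, not a real payment API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Gateway:
    """Toy stand-in for the credit card gateway; not a real payment API."""

    status: Dict[str, str] = field(default_factory=dict)
    pending: List[str] = field(default_factory=list)

    def authorize(self, account_id: str) -> None:
        # Production path: auth now, leave the capture for the batch job.
        self.status[account_id] = "authorized"
        self.pending.append(account_id)

    def run_capture_batch(self) -> int:
        # What the batch job does whenever it eventually runs.
        captured = len(self.pending)
        for account_id in self.pending:
            self.status[account_id] = "captured"
        self.pending.clear()
        return captured

    def can_cancel(self, account_id: str) -> bool:
        # The action that is forbidden until the capture has gone through.
        return self.status.get(account_id) == "captured"


def subscribe_for_test(gateway: Gateway, account_id: str) -> None:
    # Hypothetical test-only helper: authorize, then flush the batch at once,
    # so the caller's test does not depend on batch timing.
    gateway.authorize(account_id)
    gateway.run_capture_batch()
```

A dependent team’s test would call `subscribe_for_test` instead of `authorize` and could then cancel immediately. Production keeps the deferred capture; only the test path changes.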

Case 2.
You’re an architect who has a neat new idea for how to deploy configuration changes to production. It’s completely different from what everybody else is doing, but it’s theoretically perfect, and requires less code than any alternative.

Of course, often what’s simplest from a coding perspective can waste many hours of your operations guys’ valuable time. Maybe it’s hard to understand. Maybe it’s hard to figure out what’s going wrong if the server crashes. Maybe it’s so different from the other deployments they have to manage that they spend thirty minutes relearning every time they have to deal with it. Maybe it decouples what ought not be decoupled, and it’s near impossible to get version dependencies just right between different components.

The difference from the first case is, the operations and database engineering teams usually have enough political clout (and, um, willfulness) to make you sit down and talk to them first.

Case 3.
You’re the kind of cowboy programmer who (rightly) believes that there’s no such thing as too much information when something’s wrong. So you start plopping down error messages and stack traces left and right, for every conceivable error.

Six months later, your teammate gets called at 3AM that the web site is on its knees. So, he starts digging into the logs…only to discover tens of thousands of lines of stack traces flashing by, and no easy way of distinguishing real problems from red herrings.

Sure, you’re right, that’s a team problem: there really ought to be good tools for examining all those logs in real time, and said tools should allow you to filter out common problems that aren’t occurring with greater than usual frequency. So it’s not your fault, right?

Wrong. Unless and until your team has such tools, your trusted teammates will only get ticked off at having to wade through the very muddy, very swift river.
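For what it’s worth, a crude version of that kind of filter doesn’t take much. Here is a minimal sketch using Python’s standard `logging.Filter`, assuming you keep a hand-maintained table of known-noisy messages and how often each usually shows up in a window. The class name, the `usual_counts` table, and the idea of resetting the window from outside are my assumptions, not a description of any existing tool.

```python
import logging
from collections import Counter


class UsualNoiseFilter(logging.Filter):
    """Drop known-noisy messages until they exceed their usual count."""

    def __init__(self, usual_counts):
        super().__init__()
        # message text -> how many times it usually appears per window
        self.usual_counts = dict(usual_counts)
        self.seen = Counter()

    def filter(self, record):
        message = record.getMessage()
        if message not in self.usual_counts:
            return True  # unknown problems always get through
        self.seen[message] += 1
        # Pass the record only once it shows up more often than usual.
        return self.seen[message] > self.usual_counts[message]

    def reset_window(self):
        # Call on whatever schedule defines a "window" (assumed external).
        self.seen.clear()
```

Attach it with `handler.addFilter(UsualNoiseFilter({"cache miss": 100}))` and the first hundred “cache miss” lines per window stay out of the way, while anything not in the table is logged as before. Matching on exact message text is the weakest part of the sketch; real logs would need something like a message template or error code as the key.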

So, who’s depending on you?
Not that I’ve been counting and cataloging all my mistakes (that would be too depressing), but most of my mistakes have been attributable to not thinking about everybody who depends on me. And many of my teammates’ mistakes are the same.

It’s a surprising thing, but consider this: a junior programmer with absolutely no experience in a fast-paced, scalable environment has two options. He can think of himself and do the most expedient thing, maybe use the most cutting-edge technology to grow his mad skills. Or he can think of the people depending on him, and, even in the absence of a good mentor or even specific advice, bring the project into a state that will help them when they need it most.

If he chooses the latter, 99% of his inexperience becomes mostly harmless.

And still, experienced people often choose the former. And 99% of our experience becomes mostly worthless, if only for a few hours a week.
