March 1, 2008
If you’re like me, the first time you heard the FizzBuzz programming interview meme, you tried it yourself in your favorite language.
“Reasonable solution, less than one minute, works. I’m not a loser, at least. w00t!”
But in “Oh, Go Ahead. Overthink FizzBuzz.”, chalain takes it to the next level, with a random-number-based solution…and a publicly available Ruby gem.
“Maybe I am a loser, after all.”
 Unless your first inclination is to try this problem in Java, in which case you’re seriously in need of an intervention.
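For reference, the sort of under-a-minute solution most of us banged out looks something like this (Python here; your favorite language may vary):

```python
def fizzbuzz(n):
    """The classic version: multiples of 3 become "Fizz", multiples
    of 5 become "Buzz", multiples of both become "FizzBuzz"."""
    out = []
    for i in range(1, n + 1):
        if i % 15 == 0:
            out.append("FizzBuzz")
        elif i % 3 == 0:
            out.append("Fizz")
        elif i % 5 == 0:
            out.append("Buzz")
        else:
            out.append(str(i))
    return out

print("\n".join(fizzbuzz(15)))
```

Nothing clever, which is exactly the point: the interview question is a filter, not a puzzle.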
February 27, 2008
Development practices are all about dependencies.
Usually, we think of things we depend on. “Crap, I forgot to upgrade to commons-logging 1.1.1.”
Sometimes, we think of others using our code. “I better make sure this unit test consistently passes, or that new guy I just made an offer to is going to hate me and post bad things about me online.”
But there are times when what are perfectly justifiable architectural decisions in most contexts are the wrong decision when other teams are depending on you.
Let’s say you’re a central credit card processing gateway for various teams throughout your organization. For perfectly valid reasons, you’re going to simply perform an auth and return, and then try to capture later from a batch job. Maybe that batch job runs 5 seconds later, maybe 60, maybe it’s broken for 2 hours. Most of the time, it doesn’t really matter.
But here’s the gotcha: what if other teams want to run unit tests (OK, “automated integration tests using unit testing tools”)? What if there are actions that a user is forbidden from taking during this delay, which is usually OK since they’re not likely to take those actions anyway (e.g., cancelling 5 minutes after subscribing to a service)?
Suddenly, your scalable architectural decision makes everybody relying on you think you’re a flake, because their unit tests often fail for no apparent reason.
Or else they have to persist the last 24 hours’ worth of test accounts, and keep track of them.
All the more so when you take the common lax approach to maintaining your development and staging environments. Who really cares if the batch job doesn’t run in dev for a few days? Well, that guy over there trying to meet a deadline who just wasted six hours because he thought the problem was in his code…
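One mitigation on the consuming team’s side, sketched here with hypothetical names (the real gateway client will differ), is to poll for the captured state with a timeout instead of asserting the instant the auth returns:

```python
import time

def wait_for_capture(get_status, timeout=120.0, poll_interval=2.0):
    """Poll a status-fetching callable until the charge reports
    "CAPTURED", instead of asserting immediately after the auth and
    flaking whenever the capture batch job is slow.

    Returns True if the capture was observed before the timeout,
    False otherwise. `get_status` is a stand-in for whatever call
    your gateway actually exposes.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if get_status() == "CAPTURED":
            return True
        time.sleep(poll_interval)
    return False
```

It’s not pretty, and it makes the test suite slower, which is rather the point of the complaint: the gateway’s scalable design pushed this cost onto everyone downstream.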
You’re an architect who has a neat new idea for how to deploy configuration changes to production. It’s completely different from what everybody else is doing, but it’s theoretically perfect, and requires less code than any alternative.
Of course, what’s simplest from a coding perspective can often waste many hours of your operations guys’ valuable time. Maybe it’s hard to understand. Maybe it’s hard to figure out what’s going wrong if the server crashes. Maybe it’s so different from the other deployments they have to manage that they spend thirty minutes relearning it every time they have to deal with it. Maybe it decouples what ought not be decoupled, and it’s nearly impossible to get version dependencies just right between different components.
The difference from the first case is, the operations and database engineering teams usually have enough political clout (and, um, willfulness) to make you sit down and talk to them first.
You’re the kind of cowboy programmer who (rightly) believes that there’s no such thing as too much information when something’s wrong. So you start plopping down error messages and stack traces left and right, for every conceivable error.
Six months later, your teammate gets a call at 3 AM because the web site is on its knees. So he starts digging into the logs…only to discover tens of thousands of lines of stack traces flashing by, with no easy way of distinguishing real problems from red herrings.
Sure, you’re right, that’s a team problem: there really ought to be good tools for examining all those logs in real time, and said tools should let you filter out common problems that aren’t occurring with greater-than-usual frequency. So it’s not your fault, right?
Wrong. Unless and until your team has such tools, your trusted teammates will only get ticked off at having to wade through the very muddy, very swift river.
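A sketch of the discipline that keeps the river wadeable (the exception class and names here are hypothetical): route expected, retryable noise to one-line warnings, and reserve the error level, with a full traceback, for the genuinely unexpected.

```python
import logging

class TransientGatewayError(Exception):
    """Hypothetical retryable error from an upstream service."""

log = logging.getLogger("payments")

def handle_failure(exc, item_id):
    """Log a failure at a severity matching how surprising it is.

    Expected, retryable problems get one WARNING line with no stack
    trace; anything else gets ERROR plus the full traceback. Returns
    the level used, mostly so the routing is easy to test.
    """
    if isinstance(exc, TransientGatewayError):
        log.warning("transient gateway error on %s, will retry: %s",
                    item_id, exc)
        return "warning"
    log.error("unhandled failure processing %s", item_id, exc_info=exc)
    return "error"
```

With that split in place, the guy paged at 3 AM can grep for ERROR and see only the stack traces that actually mean something.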
So, who’s depending on you?
Not that I’ve been counting and cataloging all my mistakes (that would be too depressing), but most of my mistakes have been attributable to not thinking about everybody who depends on me. And many of my teammates’ mistakes are the same.
It’s a surprising thing, but consider this: a junior programmer with absolutely no experience in a fast-paced, scalable environment has two options. He can think of himself and do the most expedient thing, maybe use the most cutting-edge technology to grow his mad skills. Or he can think of the people depending on him, and, even in the absence of a good mentor or even specific advice, bring the project into a state that will help them when they need it most.
If he chooses the latter, 99% of his inexperience becomes mostly harmless.
And still, experienced people often choose the former. And 99% of our experience becomes mostly worthless, if only for a few hours a week.
January 25, 2006
Six spam comments in six posts in a single session! Woohoo! I’m a star, baby!
In case you were wondering, yes, I am alive. Really. I have neglected my blog way too much since the holidays. I will get back on track this weekend.
October 31, 2005
Hmm. My last post from Sunday morning is no longer available. And here I was trying to get into the habit of writing every other day.
Such are the perils of relying on a (free) third-party service.
Update: Looks like it’s back. Not only was wp.com able to restore the old post, but after a short time of absence, this later post was also restored. Good for Matt. Usually when sites crash, you lose data from one of the three critical periods: pre-crash, during-crash, or post-crash.