September 27, 2008

Three subversion tips: svn:ignore, svn merge, svn move

Posted in Development, Software at 7:57 am by mj

Since I complained earlier this year about the state of Subversion tools, I’ve been thinking about a follow-up that’s a bit more positive.

This doesn’t exactly count, but I thought I’d share a few productivity lessons I’ve learned recently.

Using svn:ignore
svn:ignore is a special Subversion property, set on a directory, that tells Subversion to ignore any files (or subdirectories) in it that match a given pattern.

The common use case is to ignore build artifacts, which prevents accidental check-ins and eliminates clutter in svn status. For example, you can ignore all *.jar files in a particular directory, or ignore your build directory.
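As a rough sketch of that use case, the property can be set from the command line with svn propset. The helper name set_ignore_patterns and the patterns in the example are illustrative assumptions; the property takes one pattern per line and applies only to the directory it is set on.

```shell
# Hypothetical helper: set svn:ignore on a directory, one pattern per line.
# Usage: set_ignore_patterns DIR PATTERN...
# Note: propset replaces any value already stored in svn:ignore.
set_ignore_patterns() {
    dir=$1
    shift
    # Join the remaining arguments with newlines.
    patterns=$(printf '%s\n' "$@")
    svn propset svn:ignore "$patterns" "$dir"
}

# Example (assumed layout): ignore JARs and the build directory at the top level.
# set_ignore_patterns . '*.jar' build
```

The property change is itself versioned, so it has to be committed before your team-mates see it.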

Unfortunately, this can hide problems with your build artifacts. For a project I’m working on now, we have timestamped JAR files stuffed into a common directory. The JAR files themselves are svn:ignore’d, which means svn status will never display them.

And as I found recently, this can result in 8 GB of “hidden” files that only become apparent when you, say, try to copy a remote workspace into a local one for managing with Eclipse.

Shame on the developers for not deleting them as part of ant clean. But it happens, no getting around that.

Thankfully, the Subversion developers thought about this case, and introduced the --no-ignore flag to svn status. With this option, ignored files are displayed along with added, modified and deleted files, with an I in the first column.

Cleaning up your Subversion working copy is, therefore, as simple as:

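# Delete every file and directory that Subversion is ignoring
svn status --no-ignore |
perl -ne 'BEGIN { $| = 1 } next unless /^I\s+(.*)$/; my $f = $1; if (-d $f) { print "Deleting directory $f\n"; system("rm", "-rv", "--", $f); } else { print "Deleting file $f\n"; system("rm", "-v", "--", $f); }'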

That will remove all files and directories that Subversion is ignoring (but not files that just have not yet been added to source control). Stick that in a script in your path, and live happily ever after.
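Before deleting anything, it may be worth previewing what would go. Here is a minimal sketch, assuming the status output puts the I flag at the start of the line, followed by whitespace and then the path. The function name list_ignored is made up for illustration.

```shell
# Hypothetical helper: read `svn status --no-ignore` output on stdin and
# print only the paths that Subversion is ignoring (status flag "I").
# Assumes paths do not begin with whitespace.
list_ignored() {
    sed -n 's/^I[[:space:]][[:space:]]*//p'
}

# Example: preview the ignored files in the current working copy.
# svn status --no-ignore | list_ignored
```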


Merging back into trunk
The most common use case when merging is to provide a range of revisions in trunk to pull into your branch. For example:

svn merge -r 100:114

What happens is you tell Subversion, “I don’t care what happened before revision 100, because that’s already in my branch…so just apply the changes between revisions 100 and 114.”
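Spelled out with an explicit source, that kind of merge might look like the sketch below. TRUNK_URL and the helper name are placeholders for illustration, and --dry-run only reports what would change without touching the working copy.

```shell
# Placeholder repository location; substitute your own.
TRUNK_URL=${TRUNK_URL:-http://svn.example.com/repo/trunk}

# Hypothetical helper: preview merging trunk revisions FROM:TO into the
# branch working copy at WCPATH (default: current directory).
# Usage: preview_trunk_merge FROM TO [WCPATH]
preview_trunk_merge() {
    svn merge --dry-run -r "$1:$2" "$TRUNK_URL" "${3:-.}"
}

# Example: see what the trunk changes between r100 and r114 would do to the branch.
# preview_trunk_merge 100 114
```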

But what’s not obvious (nor, as far as I can tell, covered in the standard reference books) is how to merge back into trunk. It turns out the way to do this is to disregard everything you’ve learned about Subversion.

The problem is that you’ve been merging changes from trunk into your branch. So if you simply choose the naive approach of picking up all changes since your base branch revision until your final check-in, and try to apply those to trunk, you’ll get conflicts galore, even on files you never touched in your branch (except to pull from trunk).

The solution is to use a different form of the merge command, like so:

svn merge ./@115

where revision 115 represents your last merge from trunk.

This actually just compares the two trees at the specified revision, and pulls in the differences, all the differences, and nothing but the differences. So help me Knuth.
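For comparison, the two-URL form of svn merge takes two trees and applies the difference between them to a working copy. The sketch below is only an assumption about how that could be wired up: the URLs and the function name are placeholders, and by default it compares trunk as of the last sync revision against the tip of the branch.

```shell
# Placeholder repository locations; substitute your own.
TRUNK_URL=${TRUNK_URL:-http://svn.example.com/repo/trunk}
BRANCH_URL=${BRANCH_URL:-http://svn.example.com/repo/branches/my-branch}

# Hypothetical helper, run from a trunk working copy: preview applying the
# difference between trunk@SYNC_REV and the branch at BRANCH_REV (default HEAD).
# Usage: preview_branch_to_trunk SYNC_REV [BRANCH_REV]
preview_branch_to_trunk() {
    svn merge --dry-run "$TRUNK_URL@$1" "$BRANCH_URL@${2:-HEAD}" .
}

# Example: 115 is the trunk revision last merged into the branch.
# preview_branch_to_trunk 115
```

Dropping --dry-run performs the merge for real, and the result still has to be reviewed and committed.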


Beware the power of svn move
One of the much-touted benefits of Subversion (particularly as compared to CVS) is its support for moving files around. But, until 1.5, there was a glaring flaw that is often overlooked and can get you into trouble.

Because svn move is implemented as an svn copy followed by an svn delete, an update does not connect your working copy of the old file to the new one. Therefore, if you have local changes to foo, and your arch nemesis (er, co-worker) Randy moves it to bar, your changes will simply disappear!

Subversion 1.5 has partially addressed this, at least for single files. Under the new regime, your changes to foo will be merged with any changes to bar. However, you still need to be careful with moving directories.

This is more insidious than moving methods around inside the same file. While in that case Subversion will freak out and your merges will become difficult, at least you’ll see the conflict and your changes won’t disappear while you’re not looking.

The lesson, then, is to talk with your team-mates before any refactoring. (svn lock doesn’t seem to provide any help unless everybody’s in trunk.)
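One cheap check before updating is to ask the server which of your locally modified files have incoming changes, since a move should show up as a change to the old path. This is a rough sketch, assuming the svn status --show-updates layout in which a locally modified, out-of-date file has M as its first field and * as its second. The function name is invented, and paths containing spaces are not handled.

```shell
# Hypothetical helper: read `svn status --show-updates` output on stdin and
# print files that are locally modified (M) and out of date on the server (*).
# Assumes paths contain no spaces.
list_modified_and_stale() {
    awk '$1 ~ /^M/ && $2 == "*" { print $NF }'
}

# Example: run this before `svn update`.
# svn status --show-updates | list_modified_and_stale
```

Anything it prints is worth a closer look (or a backup copy) before you update.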

Rumor has it svn 1.6 will address this even more practically by introducing the svn fuck-you-and-your-dog command. But until then, you have to do it the old-fashioned way.


August 23, 2008

Wisdom of Crowds

Posted in Software at 9:33 am by mj

I am living without internet in my temporary condo in Seattle (the horror! my god! I’m dying! how did people live twenty years ago? no wonder there are so many wars!), and am working on deadlines at the office, so have few chances to write.

But I am trying to keep up with the news using Google Reader’s offline mode.

Which brings me to this bit on collective intelligence from Nat Torkington:

Systems that channel individual behaviours to create new and valuable data are showing up everywhere. We point to Amazon Recommendations as the canonical example, but it’s hard to find an area that isn’t using individual actions to produce collective wisdom.

Not that I disagree, but the thought just struck me. We always bring up recommendation engines on Amazon or Pandora or Netflix or Facebook…

Wisdom represents the ability to understand the world better and, through that understanding, improve it (or at least one’s standing in it). (See the Wikipedia entry on wisdom, which agrees with me.)

How is “finding your niche” (or even moving outside your niche) in books or music or movies or online friends…wisdom?

That seems more like plain ol’ socialization to me. On a much larger scale than ever before, granted, but can we call what we’re doing with these tools at this moment in history increasing our wisdom?

Coincidentally, this chart from New Scientist (via Paul Kedrosky; no direct link available) shows what happens when people “recommend” (in a generic sense) stocks to one another:

I bet if we plotted the popularity of artists and movies (at all points along the head and long tail) we’d find similar results.

I guess I’m still waiting for automated tools that increase my wisdom. Are there tools that will look for trends in people that live longer, which will help me live to 100? Are there tools that will look for trends in people that are successful, which will help me retire when I’m 45 and spend the next 55 years traveling the world (and, hopefully, the moon and Mars)? Are there tools that will help us reform our political structure so that it’s even worth living longer? Tools that make it harder to not have compassion? Tools that prevent us from foisting dictators and nanny-states upon ourselves?

Yes, some of these things are coming–but at the moment, we basically have simple data mining tools that help experts know where to focus their attention, then filter and draw conclusions and make suggestions to the rest of us.

I love my Pandora. I love recommendation engines in general. I even love my credit card company’s fraud algorithms. My life is much better–I am much happier–as a result.

But I don’t feel any more wise.

January 27, 2008

The State of Subversion Tools

Posted in Software at 4:06 pm by mj

My team moved to Subversion (from CVS) this past November. Because we were in the midst of the holidays at that time, we’re only now beginning to get a sense of whether our first impressions were accurate.

Our first impression was that Subversion’s tools are immature, and will increase the amount of project management/code maintenance that a team lead or release manager will have to perform.

For example, we’d gotten quite used to Eclipse’s merging capabilities with CVS. Merging was still a manual process and required some effort, and CVS certainly isn’t the right tool for a team that has to maintain many branches, but the tools alleviated much of the pain.

By contrast, the Subclipse plugin provides little value, requires a lot of additional labor to do even the simplest things, and nullifies some of Subversion’s strengths (e.g., moving files as would happen during a refactoring–Subclipse usually does not pick it up as a move, so history gets lost). It also occasionally corrupts one’s local SVN metadata, usually when switching between branches/tags.

I don’t want to dwell on the negatives, though. As far as I’m concerned, a lot of really smart people who’ve probably written more code than I have like Subversion. In such situations, my natural inclination is to assume I’m missing something and continue plodding along.

A co-worker of mine is ready to covertly switch back to CVS under the cover of darkness, and I really have no good answer for many of his complaints. Somehow arguments from authority or appeals to the greater intellect of others are unsatisfying.

What’s really curious to me is that so many common operations aren’t provided by either the SVN command line or, apparently, any major tool set. The Subversion book is a truly excellent source of information and contains procedures for just about everything you’d want to do, and all those good recipes are just screaming for automation!

The immaturity of the Subclipse plugin has driven me to the command-line for most things. But already I’ve had to write a number of wrappers to perform common operations:

  1. First, there was the inane requirement of passing in the full repository URL to every command that doesn’t operate on the local working copy. Even CVS read the CVSROOT environment variable. (How many times are you working with multiple repositories simultaneously, let alone applying changes from one repository to another?)
  2. Then, there was the inability to move multiple files at once, even if you’re working on your local copy and they’re in the same directory (file globs are not supported).
  3. Then, setting properties has trouble operating recursively unless you want to apply the same property to all files. (Granted, I was originally confused by propedit and used a wrapper that wouldn’t force me to fire up an editor. Then I discovered propset, whose limits are fewer. Why do we need two?)
  4. Then, there was the inability to pull in specific change sets (revisions) from a divergent branch, which I call a “pull” operation. (This is something that ought to be easy from within Subclipse.)
  5. Then, there was the need to do all the grunt-work necessary to undelete a previously deleted file.
  6. Finally (so far), there is the inability to see all tags applied to a given file. My wrapper is slow (it has to check every friggin tag), but it works. Unless the file has been moved.
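For a flavor of what such wrappers look like, here is a minimal sketch of two of them. The names svn_mv_all and svn_undelete and the SVN_REPO_URL variable are illustrative assumptions rather than the actual scripts. The undelete follows the Subversion book’s recipe of copying the file from a revision in which it still existed.

```shell
# Placeholder for the repository root URL; substitute your own.
SVN_REPO_URL=${SVN_REPO_URL:-http://svn.example.com/repo}

# Hypothetical wrapper: move several files into one directory.
# Usage: svn_mv_all DEST_DIR FILE...
svn_mv_all() {
    dest=$1
    shift
    for f in "$@"; do
        svn move "$f" "$dest/" || return 1
    done
}

# Hypothetical wrapper: restore a deleted file by copying it from a revision
# in which it still existed. PATH_IN_REPO is relative to the repository root.
# Usage: svn_undelete PATH_IN_REPO REV [LOCAL_PATH]
svn_undelete() {
    svn copy "$SVN_REPO_URL/$1@$2" "${3:-$(basename "$1")}"
}

# Examples (assumed paths and revision):
# svn_mv_all lib *.jar
# svn_undelete trunk/src/real.c 807
```

Because the shell expands the glob before svn_mv_all runs, the first example also works around the missing glob support.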

I still find myself liking SVN. I like being able to move files and directories around the hierarchy. I like that directories are versioned, so I can see a history of files that have been added and deleted. I like being able to easily recover a deleted file.

From a project management perspective, I also like being able to query the branch and see the whole history for that branch. I can even go to the very top and ask for the whole history of the entire friggin repository, including all branches and all tags. With a good filter, it’s even readable.

I don’t blame the SVN authors for creating basic building blocks and limiting the verbs and options available. The goal from their perspective is probably approachability. And they’ve built a great command-line help system, which surpasses just about every tool I’ve ever used.

So where are the mature tool sets that use these building blocks to add significant value?

SVN is now 7 years old, and has been a de facto open source standard for going on 3 years. Will SVN be supplanted by its successor before mature tools develop?

What is your favorite tool?