September 20, 2008
Match-making for Java Strings
(Inspired by Jeff Atwood’s recent ‘outing’ as a regex sympathizer, which got me thinking about the line between “too many” and “too few” regular expressions and how some languages make it a choice between “too few” and “none”.)
Java has a Pattern, which forces you to pre-declare your regex string. And it has a Matcher, which matches on a String.
It should be noted that a Pattern’s pattern turns most patterns into a mess of backslashes, since the pattern is wrapped in a plain-old Java String.
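To make the backslash mess concrete, here is a small sketch. The regex and the class and method names (BackslashDemo, isDecimal) are made up for illustration; the point is only that the regex \d+\.\d+ needs every backslash doubled once it lives inside a Java string literal.

```java
import java.util.regex.Pattern;

public class BackslashDemo {
    // The regex \d+\.\d+ (digits, a literal dot, digits).
    // Each regex backslash must be escaped again for the Java compiler.
    static final Pattern DECIMAL = Pattern.compile("\\d+\\.\\d+");

    // Hypothetical helper: true only if the whole input looks like a decimal.
    static boolean isDecimal(String s) {
        return DECIMAL.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isDecimal("3.14"));
        System.out.println(isDecimal("3x14"));
    }
}
```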
A Matcher can matches(), which matches using the Pattern’s pattern, but only if the pattern would otherwise match the Matcher’s whole string.
A Matcher can also find(), which matches a Pattern pattern even if the pattern would only match a substring of the String, which is what most patterns match and what most languages call matching on a string.
And a Matcher can lookingAt(), which matches on a Pattern pattern, which, like find(), can match a pattern on a substring of the string, but only if the String’s matching substring starts at the beginning of the string.
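For anyone keeping score, the three flavors of matching can be put side by side. This is a minimal sketch, assuming a throwaway digits-only pattern; the class and helper names (ThreeWaysToMatch, whole, prefix, anywhere) are invented for the example. Note that the method on java.util.regex.Matcher is spelled lookingAt().

```java
import java.util.regex.Pattern;

public class ThreeWaysToMatch {
    static final Pattern DIGITS = Pattern.compile("[0-9]+");

    // matches(): the pattern must cover the entire input.
    static boolean whole(String s) {
        return DIGITS.matcher(s).matches();
    }

    // lookingAt(): the pattern must match starting at the first character,
    // but need not reach the end.
    static boolean prefix(String s) {
        return DIGITS.matcher(s).lookingAt();
    }

    // find(): the pattern may match any substring.
    static boolean anywhere(String s) {
        return DIGITS.matcher(s).find();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"42", "42abc", "abc42"}) {
            System.out.println(s + ": matches=" + whole(s)
                    + " lookingAt=" + prefix(s)
                    + " find=" + anywhere(s));
        }
    }
}
```

Only "42" satisfies all three; "42abc" fails matches() alone, and "abc42" is rescued only by find().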
The String matched by the Matcher can be sliced by a call to region(start,end), which allows lookingAt() to interpret a substring of the String as being the whole string.
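A rough sketch of the slicing, with an invented input and invented helper names (RegionDemo, lookingAtRegion, matchesRegion). It assumes the same digits-only pattern as a stand-in; region(start, end) takes a start index (inclusive) and an end index (exclusive), and matches() honors the region as well.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegionDemo {
    static final Pattern DIGITS = Pattern.compile("[0-9]+");

    // lookingAt() restricted to input[start, end): the match must begin at 'start'.
    static boolean lookingAtRegion(String input, int start, int end) {
        Matcher m = DIGITS.matcher(input);
        m.region(start, end);
        return m.lookingAt();
    }

    // matches() restricted to input[start, end): the match must cover the whole region.
    static boolean matchesRegion(String input, int start, int end) {
        Matcher m = DIGITS.matcher(input);
        m.region(start, end);
        return m.matches();
    }

    public static void main(String[] args) {
        String input = "abc42def";
        System.out.println(lookingAtRegion(input, 0, 8)); // region starts at 'a'
        System.out.println(lookingAtRegion(input, 3, 5)); // region is "42"
        System.out.println(matchesRegion(input, 3, 6));   // region is "42d"
    }
}
```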
Now, after calling find() or any of its String-matching cousins, a consumer of a Matcher can call group(int) to get the String substring that the Pattern’s pattern captured when matching on the String.
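In practice the capture dance looks something like the following. The key=value pattern and the names (GroupDemo, keyOf, valueOf) are assumptions made up for this sketch; group(0) is the whole match, and group(1), group(2) and so on are the parenthesized captures in order.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupDemo {
    // Two capture groups: the key and the value of the first key=value pair.
    static final Pattern KEY_VALUE = Pattern.compile("(\\w+)=(\\w+)");

    // Returns the first captured key, or null if nothing matched.
    static String keyOf(String input) {
        Matcher m = KEY_VALUE.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    // Returns the first captured value, or null if nothing matched.
    static String valueOf(String input) {
        Matcher m = KEY_VALUE.matcher(input);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String input = "set color=blue; please";
        System.out.println(keyOf(input) + " -> " + valueOf(input));
    }
}
```

Calling group(int) before a successful find(), matches() or lookingAt() throws an IllegalStateException, which is why the helpers check the result of find() first.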
But if you’re lazy, and you have no groups in your pattern, and a matches() is sufficient, then String gives you matches(pattern), which is precisely equivalent to constructing a Pattern with your pattern and passing a new Matcher your existing String.
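The lazy way and the long way, side by side. The class and method names (LazyMatch, shortWay, longWay) are invented for this sketch; the two should always agree, since String.matches(regex) is documented as yielding the same result as Pattern.matches(regex, str).

```java
import java.util.regex.Pattern;

public class LazyMatch {
    // The shortcut: String does the Pattern and Matcher bookkeeping for you.
    static boolean shortWay(String s, String regex) {
        return s.matches(regex);
    }

    // The ceremony the shortcut hides.
    static boolean longWay(String s, String regex) {
        return Pattern.compile(regex).matcher(s).matches();
    }

    public static void main(String[] args) {
        String regex = "[0-9]+";
        System.out.println(shortWay("2008", regex) + " " + longWay("2008", regex));
        System.out.println(shortWay("Sept 2008", regex) + " " + longWay("Sept 2008", regex));
    }
}
```

The shortcut recompiles the pattern on every call, so a pre-compiled Pattern is still the better fit for anything in a loop.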
So with effective use of Java object syntax, you too can use regular expressions to make your matches on Java Strings almost as obscurely as other languages clearly make matches on their strings!
Is it any wonder Java programmers don’t realize that regular expressions are a beautiful… thing?