Thursday, November 17, 2011

Regular expressions simplify pattern-matching code - 13


·  java RegexDemo .*+end "This is the end": uses a possessive quantifier to try to match zero or more characters followed by end in This is the end. The following output results:
Regex = .*+end
Text = This is the end

The possessive quantifier produces no matches because it causes the matcher to consume the entire text and never give any characters back, leaving nothing to match end. In contrast, the greedy quantifier in java RegexDemo .*end "This is the end" produces a match because it causes the matcher to back off one character at a time until the rightmost end matches.
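
The difference can be reproduced without the command-line tool. The following is a minimal sketch, not the article's RegexDemo program: the class name QuantifierSketch and the helper firstMatch() are hypothetical names introduced only for illustration, and they rely on nothing beyond the standard java.util.regex classes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not the RegexDemo program from the article.
class QuantifierSketch {
    // Returns the first match of regex in text, or null when nothing matches.
    static String firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String text = "This is the end";

        // Possessive: .*+ consumes all of the text and never backs off,
        // so the literal "end" has nothing left to match.
        System.out.println("possessive: " + firstMatch(".*+end", text));

        // Greedy: .* consumes all of the text, then backs off one character
        // at a time until "end" matches.
        System.out.println("greedy: " + firstMatch(".*end", text));
    }
}
```

Running the sketch should print null for the possessive pattern and This is the end for the greedy one, which mirrors the two RegexDemo invocations described above.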

Boundary matchers

We sometimes want to match patterns at the beginning of lines, at word boundaries, at the end of text, and so on. Accomplish that task with a boundary matcher, a regex construct that identifies a match location. Table 2 presents Pattern's supported boundary matchers.

Table 2. Boundary matchers

Boundary Matcher    Description
^                   The beginning of a line
$                   The end of a line
\b                  A word boundary
\B                  A nonword boundary
\A                  The beginning of the text
\G                  The end of the previous match
\Z                  The end of the text (but for the final line terminator, if any)
\z                  The end of the text
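
To see how several of these boundary matchers behave, the following sketch collects the start index of every match for a handful of patterns. The class name BoundarySketch, the helper matchStarts(), and the sample strings are assumptions made for illustration; only standard java.util.regex calls are used. Note that in java.util.regex, ^ and $ match at every line boundary only when Pattern.MULTILINE is set; by default they match at the beginning and end of the whole text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; the sample text is made up for this sketch.
class BoundarySketch {
    // Collects the start index of every match of regex in text.
    static List<Integer> matchStarts(String regex, String text, int flags) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = Pattern.compile(regex, flags).matcher(text);
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        String text = "cat concat\ncat\n";

        // Whole word "cat" only: the "cat" inside "concat" is skipped.
        System.out.println(matchStarts("\\bcat\\b", text, 0));            // [0, 11]
        // "cat" that does not start at a word boundary.
        System.out.println(matchStarts("\\Bcat", text, 0));               // [7]
        // ^ without and with MULTILINE.
        System.out.println(matchStarts("^cat", text, 0));                 // [0]
        System.out.println(matchStarts("^cat", text, Pattern.MULTILINE)); // [0, 11]
        // \A ignores MULTILINE and matches only at the start of the text.
        System.out.println(matchStarts("\\Acat", text, Pattern.MULTILINE)); // [0]
        // \Z tolerates the final line terminator; \z does not.
        System.out.println(matchStarts("cat\\Z", text, 0));               // [11]
        System.out.println(matchStarts("cat\\z", text, 0));               // []
        // \G requires each match to begin where the previous one ended.
        System.out.println(matchStarts("\\Gcat", "catcat cat", 0));       // [0, 3]
    }
}
```

The \G case stops after the second match because the space at index 6 breaks the chain of adjacent matches, so the final cat is never reached.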
