Metacharacters
Although literal string regex constructs are useful, more powerful regex constructs combine literal characters with metacharacters. For example, in a.b, the period metacharacter (.) represents any character that appears between a and b. To see the period metacharacter in action, execute the following command line:

java RegexDemo .ox "The quick brown fox jumps over the lazy ox."

This command line specifies .ox as the regex and The quick brown fox jumps over the lazy ox. as the text command-line argument. RegexDemo searches the text for matches that begin with any character and end with ox, and produces the following output:

Regex = .ox
Text = The quick brown fox jumps over the lazy ox.
Found fox
starting at index 16 and ending at index 19
Found ox
starting at index 39 and ending at index 42
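
The search described above can be sketched directly against java.util.regex. RegexDemo's source is not shown in this section, so the class name PeriodDemo and the helper findAll below are hypothetical stand-ins. This is a minimal sketch that assumes RegexDemo is a thin wrapper around Pattern and Matcher.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for RegexDemo; not the article's actual source.
class PeriodDemo {
    // Returns one entry per match: the matched text in brackets,
    // followed by its start index and its (exclusive) end index.
    static List<String> findAll(String regex, String text) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            // end() is the index one past the last matched character.
            found.add("[" + m.group() + "] " + m.start() + "-" + m.end());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "The quick brown fox jumps over the lazy ox.";
        for (String match : findAll(".ox", text)) {
            System.out.println(match);
        }
    }
}
```

Running main should print [fox] 16-19 and [ ox] 39-42. These agree with the indexes in the output above, and the brackets make the leading space of the second match visible.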

The output reveals two matches: fox and ox (with a leading space character). The . metacharacter matches the f in the first match and the space character in the second match.

What happens if we replace .ox with the period metacharacter? That is, what is output when we specify java RegexDemo . "The quick brown fox jumps over the lazy ox."? Because the period metacharacter matches any character, RegexDemo outputs a match for each character in its text command-line argument, including the terminating period character.

Tip
To specify . or any metacharacter as a literal character in a regex construct, quote the metacharacter (that is, convert it from meta status to literal status) in one of two ways: either precede the metacharacter with a backslash character (as in \.) or place it between \Q and \E (as in \Q.\E). In either scenario, don't forget to double each backslash character (as in \\. or \\Q.\\E) that appears in a string literal (e.g., String regex = "\\.";). Do not double the backslash character when it appears as part of a command-line argument.
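
To make the difference between the unquoted and quoted period concrete, the following sketch counts matches against the same sentence. The class QuoteDemo and the helper countMatches are hypothetical names introduced for illustration. Pattern.quote is a standard library method that the tip does not mention; it is included only as a convenient alternative to writing \Q and \E by hand.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration; not part of the article's code.
class QuoteDemo {
    // Counts how many times the regex matches within the text.
    static int countMatches(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "The quick brown fox jumps over the lazy ox.";
        // Unquoted period: a metacharacter that matches every character.
        System.out.println(countMatches(".", text));
        // Backslash quoting; the backslash is doubled inside a string literal.
        System.out.println(countMatches("\\.", text));
        // \Q...\E quoting, again with doubled backslashes.
        System.out.println(countMatches("\\Q.\\E", text));
        // Pattern.quote builds a quoted literal pattern from its argument.
        System.out.println(countMatches(Pattern.quote("."), text));
    }
}
```

The text is 43 characters long, so the unquoted period should report 43 matches. Each of the three quoted forms should report a single match, for the terminating period.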