The following command-line example uses the ^ boundary matcher metacharacter to ensure that a line begins with The followed by zero or more word characters:

java RegexDemo ^The\w* Therefore

The ^ indicates that the first three text characters must match the pattern's subsequent T, h, and e characters. Any number of word characters may follow. The command line above produces the following output:

Regex = ^The\w*
Text = Therefore
Found Therefore
  starting at index 0 and ending at index 9

The ending index, 9, is the position just past the final e of Therefore, whose characters occupy indexes 0 through 8.

Now change the command line to java RegexDemo ^The\w* " Therefore". What happens? No match is found because a space character precedes Therefore.
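The same two cases can be reproduced directly against java.util.regex, without the command line. The following is a minimal sketch, not the article's RegexDemo program: the BoundaryDemo class and its firstMatch helper are hypothetical names introduced here, and the message format simply imitates the output shown above, on the assumption that the ending index is reported with Matcher.end().

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; it is not the article's RegexDemo.
class BoundaryDemo {
    // Describes the first match of regex in text, or reports that none exists.
    static String firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        if (m.find()) {
            // Matcher.end() is the index just past the last matched character.
            return "Found " + m.group() + " starting at index " + m.start()
                    + " and ending at index " + m.end();
        }
        return "No match";
    }

    public static void main(String[] args) {
        // ^ anchors the match to the start of the text.
        System.out.println(firstMatch("^The\\w*", "Therefore"));
        // prints: Found Therefore starting at index 0 and ending at index 9

        // A leading space means the text no longer begins with The.
        System.out.println(firstMatch("^The\\w*", " Therefore"));
        // prints: No match
    }
}
```

Note that the backslash in \w must be doubled inside a Java string literal, which is not necessary when the pattern is passed as a command-line argument.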

Embedded flag expressions

Matchers assume certain defaults, such as case-sensitive pattern matching. A program may override any default by using an embedded flag expression, that is, a regex construct specified as parentheses metacharacters surrounding a question mark metacharacter (?) followed by a specific lowercase letter. Pattern recognizes the following embedded flag expressions:

- (?i): enables case-insensitive pattern matching. Example: java RegexDemo (?i)tree Treehouse matches tree with Tree. Case-sensitive pattern matching is the default.
- (?x): permits whitespace and comments beginning with the # metacharacter to appear in a pattern. A matcher ignores both. Example: java RegexDemo ".at(?x)#match hat, cat, and so on" matter matches .at with mat. By default, whitespace and comments are not permitted; a matcher regards them as characters that contribute to a match.
- (?s): enables dotall mode. In that mode, the period metacharacter matches line terminators in addition to any other character. Example: java RegexDemo (?s). \n matches . with \n. Nondotall mode is the default: the period does not match line-terminator characters.
- (?m): enables multiline mode. In multiline mode, ^ matches at the beginning of the text and just after a line terminator, and $ matches just before a line terminator and at the text's end. Example: java RegexDemo (?m)^.ake make\rlake\n\rtake matches .ake with make, lake, and take. Non-multiline mode is the default: ^ and $ match only at the beginning and end of the entire text.
- (?u): enables Unicode-aware case folding. This flag works with (?i) to perform case-insensitive matching in a manner consistent with the Unicode Standard. By default, case-insensitive matching assumes that only characters in the US-ASCII character set are being matched.
- (?d): enables Unix lines mode. In that mode, a matcher recognizes only the \n line terminator in the context of the ., ^, and $ metacharacters. Non-Unix lines mode is the default: a matcher recognizes all line terminators in the context of the aforementioned metacharacters.
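
To see the flags listed above side by side, the following sketch runs each one through java.util.regex and compares it with the default behavior where that is informative. The FlagDemo class and its matches helper are hypothetical names introduced for illustration; they are not part of RegexDemo. The accented characters are written as Unicode escapes (\u00e9 for é, \u00c9 for É) so the example does not depend on the source file's encoding.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; it is not the article's RegexDemo.
class FlagDemo {
    // Collects every substring of text that regex matches, in order.
    static List<String> matches(String regex, String text) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // (?i): case-insensitive matching.
        System.out.println(matches("(?i)tree", "Treehouse"));          // [Tree]

        // (?x): whitespace and #-comments after the flag are ignored.
        System.out.println(
            matches(".at(?x)#match hat, cat, and so on", "matter"));   // [mat]

        // (?s): the period also matches a line terminator.
        System.out.println(matches("(?s).", "\n").size());             // 1
        System.out.println(matches(".", "\n").size());                 // 0

        // (?m): ^ also matches just after a line terminator.
        String lines = "make\rlake\n\rtake";
        System.out.println(matches("(?m)^.ake", lines));   // [make, lake, take]
        System.out.println(matches("^.ake", lines));       // [make]

        // (?u) combined with (?i): case folding beyond US-ASCII.
        System.out.println(matches("(?iu)\u00e9", "\u00c9").size());   // 1
        System.out.println(matches("(?i)\u00e9", "\u00c9").size());    // 0

        // (?d): only \n counts as a line terminator, so . matches \r.
        System.out.println(matches("(?d).", "\r").size());             // 1
        System.out.println(matches(".", "\r").size());                 // 0
    }
}
```

Match counts are printed for the cases involving \n, \r, and the accented characters so that the output stays readable on any console.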