Tuesday, November 22, 2011

Regular expressions simplify pattern-matching code - 18


Writing code to break text into its component parts (such as a text file's employee record into a set of fields) is a task many developers find tedious. Pattern relieves that tedium by providing a pair of text-splitting methods:
  • public String [] split(CharSequence text, int limit): splits text around matches of the current Pattern object's pattern. This method returns an array in which each entry is a text sequence separated from the next text sequence by a pattern match (or terminated by the text's end); the entries are stored in the same order as they appear in the text. The number of array entries depends on limit, which also controls how many matches are applied. A positive value means that at most limit-1 matches are applied, the array has no more than limit entries, and the last entry holds all text remaining after the last applied match. A negative value means all possible matches are applied and the array can have any length. A zero value means all possible matches are applied, the array can have any length, and trailing empty strings are discarded.
  • public String [] split(CharSequence text): invokes the previous method with zero as the limit and returns the method call's result.
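
The effect of limit is easiest to see by splitting the same text with different values. The following is a minimal sketch, not part of the original article: the class name SplitLimitDemo, the helper splitWithLimit, and the sample text "a, b, , " are made up for illustration. The sample text deliberately ends with two separators so that trailing empty strings appear.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitLimitDemo {
    // Hypothetical helper: splits text around a comma followed by one
    // whitespace character, passing limit through to Pattern.split().
    static String[] splitWithLimit(String text, int limit) {
        Pattern p = Pattern.compile(",\\s");
        return p.split(text, limit);
    }

    public static void main(String[] args) {
        String text = "a, b, , "; // three separators, the last two adjacent

        // limit = 0: all matches applied, trailing empty strings discarded.
        // Array contents: {"a", "b"}
        System.out.println(Arrays.toString(splitWithLimit(text, 0)));

        // limit < 0: all matches applied, trailing empty strings kept.
        // Array contents: {"a", "b", "", ""}
        System.out.println(Arrays.toString(splitWithLimit(text, -1)));

        // limit = 2: at most one match applied; the rest of the text
        // lands in the last entry. Array contents: {"a", "b, , "}
        System.out.println(Arrays.toString(splitWithLimit(text, 2)));
    }
}
```

A negative limit is the one to choose when empty trailing fields are meaningful (for example, an optional last column in a record), because the zero limit used by split(CharSequence text) silently drops them.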

Suppose you want to split an employee record, consisting of name, age, street address, and salary, into its field components. The following code fragment uses split(CharSequence text) to accomplish that task:
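import java.util.regex.Pattern;

// Fields are separated by a comma followed by one whitespace character.
Pattern p = Pattern.compile (",\\s");
String [] fields = p.split ("John Doe, 47, Hillsboro Road, 32000");
// Print each field on its own line.
for (int i = 0; i < fields.length; i++)
     System.out.println (fields [i]);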
