Wednesday, December 28, 2011

Java's character and assorted string classes support text-processing - 28


For a practical demonstration of StringTokenizer's methods, I created a PigLatin application that translates English text to its pig Latin equivalent. For those unfamiliar with the pig Latin game, this coded language moves a word's first letter to the end of the word and then adds ay. For example, computer becomes omputercay and Java becomes Avajay. Punctuation is not affected. Listing 6 presents PigLatin's source code:
Listing 6: PigLatin.java
// PigLatin.java
import java.util.StringTokenizer;
class PigLatin
{
   public static void main (String [] args)
   {
      if (args.length != 1)
      {
          System.err.println ("usage: java PigLatin phrase");
          return;
      }
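      // Split the phrase into words; whitespace and punctuation act as
      // delimiters and are discarded.
      StringTokenizer st = new StringTokenizer (args [0], " \t:;,.-?!");
      while (st.hasMoreTokens ())
      {
         StringBuffer sb = new StringBuffer (st.nextToken ());
         sb.append (sb.charAt (0));   // copy the first letter to the end
         sb.append ("ay");            // add the pig Latin suffix
         sb.deleteCharAt (0);         // remove the original first letter
         System.out.print (sb.toString () + " ");
      }
      System.out.print ("\r\n");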
   }
}

To see what Hello, World! looks like in pig Latin, execute java PigLatin "Hello, World!". You see the following output:
elloHay orldWay

According to pig Latin's rules, the output is not quite correct. First, the wrong letters are capitalized. Second, the punctuation is missing. The correct output is:
Ellohay, Orldway!

Use what you've learned in this article to fix those problems.
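
One possible fix is sketched below; treat it as an illustration under stated assumptions rather than the intended solution. It assumes that StringTokenizer's three-argument constructor (with returnDelims set to true) is used to keep the delimiters, and that a word's capital letter should move to the new first letter. The class name PigLatinFixed and the helper methods translate and translateWord are hypothetical names introduced only for this example.

```java
// PigLatinFixed.java
import java.util.StringTokenizer;

class PigLatinFixed
{
   // Same delimiter set as Listing 6.
   static final String DELIMS = " \t:;,.-?!";

   // Translates a single word and, if the word began with a capital
   // letter, capitalizes the translated word's first letter instead.
   static String translateWord (String word)
   {
      char first = word.charAt (0);
      StringBuffer sb = new StringBuffer (word.substring (1));
      sb.append (Character.toLowerCase (first));
      sb.append ("ay");
      if (Character.isUpperCase (first))
          sb.setCharAt (0, Character.toUpperCase (sb.charAt (0)));
      return sb.toString ();
   }

   // Translates a whole phrase, copying delimiters through unchanged.
   static String translate (String phrase)
   {
      // Passing true asks the tokenizer to return each delimiter
      // character as its own one-character token.
      StringTokenizer st = new StringTokenizer (phrase, DELIMS, true);
      StringBuffer out = new StringBuffer ();
      while (st.hasMoreTokens ())
      {
         String token = st.nextToken ();
         if (token.length () == 1 && DELIMS.indexOf (token.charAt (0)) != -1)
             out.append (token);                  // punctuation or whitespace
         else
             out.append (translateWord (token));  // an ordinary word
      }
      return out.toString ();
   }

   public static void main (String [] args)
   {
      if (args.length != 1)
      {
          System.err.println ("usage: java PigLatinFixed phrase");
          return;
      }
      System.out.println (translate (args [0]));
   }
}
```

Executing java PigLatinFixed "Hello, World!" with this sketch should print Ellohay, Orldway!. Words that contain capitals after the first letter are left as they are, which may or may not match the rules you want.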
Review
Java's Character, String, StringBuffer, and StringTokenizer classes support text-processing programs. Such programs use Character to indirectly store char variables in data structure objects and access a variety of character-oriented utility methods; use String to represent and manipulate immutable strings; use StringBuffer to represent and manipulate mutable strings; and use StringTokenizer to extract a string's tokens.
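
As a quick reminder of those four roles, here is a minimal sketch. The class name ReviewDemo and its helper methods are hypothetical, and the generic List<Character> is only one of several ways to hold Character objects in a data structure.

```java
// ReviewDemo.java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class ReviewDemo
{
   // Character: wraps each char so it can be stored in a collection,
   // and supplies utility methods such as isLetter.
   static List<Character> lettersOf (String s)
   {
      List<Character> letters = new ArrayList<Character> ();
      for (int i = 0; i < s.length (); i++)
         if (Character.isLetter (s.charAt (i)))
             letters.add (Character.valueOf (s.charAt (i)));
      return letters;
   }

   // StringBuffer: the buffer's contents are modified in place.
   static String reverse (String s)
   {
      return new StringBuffer (s).reverse ().toString ();
   }

   // StringTokenizer: counts tokens separated by default whitespace
   // delimiters.
   static int countTokens (String s)
   {
      return new StringTokenizer (s).countTokens ();
   }

   public static void main (String [] args)
   {
      // String: toUpperCase returns a new String; the original is unchanged.
      String original = "pig Latin";
      String upper = original.toUpperCase ();
      System.out.println (original + " / " + upper);

      System.out.println (lettersOf ("a1b2c"));
      System.out.println (reverse ("Java"));
      System.out.println (countTokens ("one two  three"));
   }
}
```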
This article also cleared up three mysteries about strings. First, you saw how the compiler and classloader allow you to treat string literals (at the source-code level) as if they were String objects. Thus, you can legally specify synchronized ("sync object") in a multithreaded program requiring synchronization. Second, you learned why Strings are immutable, and how immutability works with interning to save heap memory when a program requires many strings and to allow fast string searches. Finally, you learned what happens when you use the string concatenation operator to concatenate strings and how StringBuffer is involved in that task.
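
The three points above can be checked with a short sketch. The class name StringMysteries and its methods are hypothetical. The concat method is a hand-written approximation of what the article describes for the concatenation operator; newer compilers may use StringBuilder or other mechanisms instead of StringBuffer.

```java
// StringMysteries.java
class StringMysteries
{
   // Identical literals refer to the same interned String object.
   static boolean sameLiteral ()
   {
      String a = "sync object";
      String b = "sync object";
      return a == b;
   }

   // new String always creates a distinct object on the heap.
   static boolean sameAfterNew ()
   {
      String a = "sync object";
      String b = new String ("sync object");
      return a == b;
   }

   // intern returns the shared copy, so a fast reference comparison works.
   static boolean sameAfterIntern ()
   {
      String a = "sync object";
      String b = new String ("sync object").intern ();
      return a == b;
   }

   // Roughly equivalent to the expression s + n.
   static String concat (String s, int n)
   {
      return new StringBuffer ().append (s).append (n).toString ();
   }

   public static void main (String [] args)
   {
      // A string literal is a String object, so it can serve as a lock.
      synchronized ("sync object")
      {
         System.out.println (sameLiteral ());      // true
         System.out.println (sameAfterNew ());     // false
         System.out.println (sameAfterIntern ());  // true
      }
      System.out.println (concat ("Listing ", 6));
   }
}
```

Because identical literals share one object, locking on a literal also shares that lock with any other code that uses the same literal, which is worth keeping in mind before relying on it.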
