Monday, December 5, 2011

Java's character and assorted string classes support text-processing - 5


To demonstrate Character's isDigit(char c) and isLetter(char c) methods, I've created a CA (character analysis) application that counts a text file's digits, letters, and other characters. In addition to printing those counts, CA calculates and prints each count's percentage of the total count. Listing 1 presents CA's source code (don't worry about the file-reading logic: I'll explain FileInputStream and other file-related concepts in a future article):
Listing 1: CA.java
// CA.java
// Character Analysis
import java.io.*;
class CA
{
   public static void main (String [] args)
   {
      int ch, ndigits = 0, nletters = 0, nother = 0;
      if (args.length != 1)
      {
          System.err.println ("usage: java CA filename");
          return;
      }
      FileInputStream fis = null;
      try
      {
          fis = new FileInputStream (args [0]);
          // read() returns the next byte (0 through 255), or -1 at end of file.
          while ((ch = fis.read ()) != -1)
             if (Character.isLetter ((char) ch))
                 nletters++;
             else
             if (Character.isDigit ((char) ch))
                 ndigits++;
             else
                 nother++;
          System.out.println ("num letters = " + nletters);
          System.out.println ("num digits = " + ndigits);
          System.out.println ("num other = " + nother + "\r\n");
          int total = nletters + ndigits + nother;
          System.out.println ("% letters = " +
                              (double) (100.0 * nletters / total));
          System.out.println ("% digits = " +
                              (double) (100.0 * ndigits / total));
          System.out.println ("% other = " +
                              (double) (100.0 * nother / total));
      }
      catch (IOException e)
      {
          System.err.println (e);
      }
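      finally
      {
          // fis is still null if the FileInputStream constructor threw
          // (for example, when the file does not exist), so check it first
          // to avoid a NullPointerException.
          if (fis != null)
              try
              {
                  fis.close ();
              }
              catch (IOException e)
              {
                  // Nothing useful can be done if close() fails.
              }
      }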
   }
}

To perform a character analysis on CA's own source file, CA.java, execute java CA CA.java. You should see output similar to the following (the exact counts depend on the file's contents and line endings):
num letters = 609
num digits = 18
num other = 905
% letters = 39.75195822454308
% digits = 1.174934725848564
% other = 59.07310704960835
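
To experiment with the classification logic without a file, the following is a minimal sketch that applies the same isLetter/isDigit tests to a String. The CountDemo class and its count and percent helpers are hypothetical names introduced here for illustration; they are not part of Listing 1. The percent helper also assumes that an empty input should report 0.0 rather than NaN, which is a design choice of this sketch only.

```java
// CountDemo.java
// Hypothetical companion to CA.java: classifies the characters of a String
// instead of the bytes of a file.
class CountDemo
{
   // Returns a three-element array: {letters, digits, other}.
   static int [] count (String text)
   {
      int nletters = 0, ndigits = 0, nother = 0;
      for (int i = 0; i < text.length (); i++)
      {
         char c = text.charAt (i);
         if (Character.isLetter (c))
             nletters++;
         else
         if (Character.isDigit (c))
             ndigits++;
         else
             nother++;
      }
      return new int [] { nletters, ndigits, nother };
   }

   // Returns count as a percentage of total, or 0.0 when total is zero
   // (an assumption of this sketch; Listing 1 has no such guard).
   static double percent (int count, int total)
   {
      return (total == 0) ? 0.0 : 100.0 * count / total;
   }

   public static void main (String [] args)
   {
      int [] counts = count ("int x = 42;");
      int total = counts [0] + counts [1] + counts [2];
      System.out.println ("num letters = " + counts [0]); // prints 4
      System.out.println ("num digits = " + counts [1]);  // prints 2
      System.out.println ("num other = " + counts [2]);   // prints 5
      System.out.println ("% letters = " + percent (counts [0], total));
   }
}
```

The sample string contains the letters i, n, t, and x, the digits 4 and 2, and five other characters (three spaces, = and ;). Because isLetter and isDigit follow Unicode character categories rather than ASCII ranges, a character such as '\u00e9' (e with an acute accent) also counts as a letter.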
