Parsing a Comma-Separated String
String str = "tim,kerry,timmy,camden";
String[] results = str.split(",");
In its simplest form, the split() method on the String class accepts a regular expression as its only parameter and returns an array of String objects, split around each match of that regular expression. This makes parsing a comma-separated string an easy task. In this phrase, we simply pass a comma into the split() method, and we get back an array of strings containing the comma-separated data. The results array in our phrase would contain the following content:
results[0] = tim
results[1] = kerry
results[2] = timmy
results[3] = camden
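Because the argument to split() is a regular expression, the same call can also absorb stray whitespace around the commas. The following is a minimal sketch of that idea; the class name SplitWithRegexSketch, the method splitAndTrim(), and the pattern "\\s*,\\s*" are illustrative assumptions and not part of the phrase above.

```java
import java.util.Arrays;

class SplitWithRegexSketch {
    // Splits on a comma together with any whitespace on either side of it.
    // The pattern "\\s*,\\s*" is an illustrative choice for this sketch.
    static String[] splitAndTrim(String input) {
        // trim() removes leading and trailing whitespace from the whole string,
        // which the comma pattern alone would not catch.
        return input.trim().split("\\s*,\\s*");
    }

    public static void main(String[] args) {
        String str = "tim , kerry,  timmy ,camden";
        // prints [tim, kerry, timmy, camden]
        System.out.println(Arrays.toString(splitAndTrim(str)));
    }
}
```

A plain "," pattern would leave the surrounding spaces attached to each name, so a pattern like this one is worth considering when the input is typed by hand.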
Another useful class for taking apart strings is the StringTokenizer class. We will repeat the phrase using the StringTokenizer class instead of the split() method.
import java.util.StringTokenizer;

String str = "tim,kerry,timmy,camden";
StringTokenizer st = new StringTokenizer(str, ",");
while (st.hasMoreTokens()) {
    System.out.println(st.nextToken());
}
This code example will print each of the names contained in the original string, str, on a separate line, as follows:
tim
kerry
timmy
camden
Notice that the commas are discarded and not output.
The StringTokenizer class can be constructed with one, two, or three parameters. If called with just one parameter, that parameter is the string that you want to tokenize, or split up. In this case, the tokenizer uses the default delimiter set, which is " \t\n\r\f": the space character, the tab character, the newline character, the carriage-return character, and the form-feed character. In effect, the string is split on whitespace.
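As a rough illustration of the one-parameter constructor, the sketch below tokenizes a string whose names are separated by a space, a tab, and a newline. The class name DefaultDelimiterSketch and the helper tokenize() are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class DefaultDelimiterSketch {
    // Collects every token produced with the default delimiter set " \t\n\r\f".
    static List<String> tokenize(String input) {
        StringTokenizer st = new StringTokenizer(input);
        List<String> tokens = new ArrayList<>();
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // A space, a tab, and a newline all act as delimiters here.
        String str = "tim kerry\ttimmy\ncamden";
        // prints [tim, kerry, timmy, camden]
        System.out.println(tokenize(str));
    }
}
```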
The second way of constructing a StringTokenizer object is to pass two parameters to the constructor. The first parameter is the string to be tokenized, and the second parameter is a string containing the delimiters that you want to split the string on. Each character in that second string is treated as a separate delimiter. This overrides the default delimiters and sets them to whatever you pass in the second argument.
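One detail of the two-parameter form that is easy to miss is that the delimiter string is read as a set of single characters, not as one multi-character separator. The sketch below shows this with three different delimiters; the class name MultiDelimiterSketch, the helper tokenize(), and the sample input are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class MultiDelimiterSketch {
    // Tokenizes input, treating each character of delimiters as its own delimiter.
    static List<String> tokenize(String input, String delimiters) {
        StringTokenizer st = new StringTokenizer(input, delimiters);
        List<String> tokens = new ArrayList<>();
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The comma, the semicolon, and the pipe each end a token.
        String str = "tim,kerry;timmy|camden";
        // prints [tim, kerry, timmy, camden]
        System.out.println(tokenize(str, ",;|"));
    }
}
```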
Finally, you can pass a third argument to the StringTokenizer constructor that designates whether delimiters should be returned as tokens or discarded. This is a boolean parameter. A value of true causes the delimiters to be returned as tokens. The default value is false, which discards the delimiters and does not treat them as tokens.
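To see what the third argument changes, the following sketch passes true and collects everything the tokenizer returns, so the commas show up as tokens of their own. The class name ReturnDelimsSketch and the helper tokenizeKeepingDelims() are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class ReturnDelimsSketch {
    // Tokenizes on commas and keeps each comma as a separate one-character token.
    static List<String> tokenizeKeepingDelims(String input) {
        StringTokenizer st = new StringTokenizer(input, ",", true);
        List<String> tokens = new ArrayList<>();
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Five tokens come back: tim , kerry , timmy
        for (String token : tokenizeKeepingDelims("tim,kerry,timmy")) {
            System.out.println("[" + token + "]");
        }
    }
}
```

Keeping the delimiters can be useful when the delimiter itself carries meaning, for example when taking apart a simple arithmetic expression.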
You should also review the phrases in Chapter 6. With the addition of regular expression support to Java in JDK 1.4, many of the uses of the StringTokenizer class can be replaced with regular expressions. The official JavaDoc states that StringTokenizer is a legacy class whose use is discouraged in new code. Wherever possible, you should use the split() method of the String class or the regular expression package.
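When moving code from StringTokenizer to the regular expression package, it is worth knowing that the two do not treat empty fields the same way. The sketch below compares them on a string with two adjacent commas; the class name EmptyFieldSketch and its helper methods are illustrative assumptions, and a precompiled Pattern is just one of several reasonable ways to do the split.

```java
import java.util.StringTokenizer;
import java.util.regex.Pattern;

class EmptyFieldSketch {
    // Compiled once so the pattern can be reused across calls.
    private static final Pattern COMMA = Pattern.compile(",");

    // StringTokenizer skips over adjacent delimiters and never yields an empty token.
    static int countWithTokenizer(String input) {
        return new StringTokenizer(input, ",").countTokens();
    }

    // Pattern.split() keeps the empty string found between two adjacent commas.
    static String[] splitWithPattern(String input) {
        return COMMA.split(input);
    }

    public static void main(String[] args) {
        String str = "tim,,timmy";
        System.out.println(countWithTokenizer(str));      // prints 2
        System.out.println(splitWithPattern(str).length); // prints 3
    }
}
```

If the position of each field matters, as it usually does for comma-separated records, the behavior of split() is generally the one you want.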