Section 7.10 From the Java Library: java.util.StringTokenizer
One of the most widespread string-processing tasks is that of breaking up a string into its components, or tokens.
For example, when processing a sentence, you may need to break the sentence into its constituent words, which are considered the sentence tokens. When processing a name-password string, such as “boyd:14irXp”, you may need to break it into a name and a password.
Tokens are separated from each other by one or more characters which are known as delimiters. Thus, for a sentence, white space, including blank spaces, tabs, and line feeds, serve as the delimiters. For the password example, the colon character serves as a delimiter.
Java’s
java.util.StringTokenizer
class is specially designed for breaking strings into their tokens (Figure 7.10.1). When instantiated with a String
parameter, a StringTokenizer
breaks the string into tokens, using white space as delimiters. For example, if we instantiated a StringTokenizer
as in the codeStringTokenizer sTokenizer
= new StringTokenizer("This is an English sentence.");
it would break the string into the following tokens, which would be stored internally in the
StringTokenizer
in the order shown:This is an English sentence.
Note that the period is part of the last token (“sentence.”). This is because punctuation marks are not considered delimiters by default.
If you wanted to include punctuation symbols as delimiters, you could use the second
StringTokenizer()
constructor, which takes a second String
parameter (Figure 7.10.1). The second parameter specifies a string of those characters that should be used as delimiters. For example, in the instantiation,StringTokenizer sTokenizer
= new StringTokenizer("This is an English sentence.",
"\b\t\n,;.!");
various punctuation symbols (periods, commas, and so on) are included among the delimiters. Note that escape sequences (
\b\t\n
) are used to specify blanks, tabs, and newlines.The
hasMoreTokens()
and nextToken()
methods can be used to process a delimited string, one token at a time. The first method returns true
as long as more tokens remain; the second gets the next token in the list. For example, here’s a code segment that will break a standard URL string into its constituent parts:String url = "http://java.trincoll.edu/~jjj/index.html";
StringTokenizer sTokenizer = new StringTokenizer(url,":/");
while (sTokenizer.hasMoreTokens()) {
System.out.println(sTokenizer.nextToken());}
This code segment will produce the following output:
http java.trincoll.edu ~jjj index.html
The only delimiters used in this case were the “
:
” and “/
” symbols. And note that nextToken()
does not return the empty string between “:
” and “/
” as a token.You have attempted of activities on this page.