com.Ostermiller.util Java Utilities


com.Ostermiller.util
Class StringTokenizer

java.lang.Object
  extended by com.Ostermiller.util.StringTokenizer
All Implemented Interfaces:
Enumeration<String>, Iterator<String>

public class StringTokenizer
extends Object
implements Enumeration<String>, Iterator<String>

The string tokenizer class allows an application to break a string into tokens. More information about this class is available from ostermiller.org.

The tokenization method is much simpler than the one used by the StreamTokenizer class. The StringTokenizer methods do not distinguish among identifiers, numbers, and quoted strings, nor do they recognize and skip comments.

The set of delimiters (the characters that separate tokens) may be specified either at creation time or on a per-token basis.

There are two kinds of delimiters: token delimiters and non-token delimiters. A token is either one token delimiter character, or a maximal sequence of consecutive characters that are not delimiters.
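
The definition above can be illustrated with a small hand-written loop. This is only a sketch of the rule as stated, not the library's implementation; the class name DelimiterSketch and its tokenize method are hypothetical, and empty tokens are assumed not to be returned.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the token rule; not the library's code.
class DelimiterSketch {
    /** Splits text using both delimiter kinds; empty tokens are never produced. */
    static List<String> tokenize(String text, String nontokenDelims, String tokenDelims) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (nontokenDelims.indexOf(c) >= 0) {
                i++; // non-token delimiter: separates tokens, never returned
            } else if (tokenDelims.indexOf(c) >= 0) {
                tokens.add(String.valueOf(c)); // token delimiter: a one-character token
                i++;
            } else {
                int start = i; // maximal run of characters that are not delimiters
                while (i < text.length()
                        && nontokenDelims.indexOf(text.charAt(i)) < 0
                        && tokenDelims.indexOf(text.charAt(i)) < 0) {
                    i++;
                }
                tokens.add(text.substring(start, i));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a+b c", " ", "+")); // [a, +, b, c]
    }
}
```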

A StringTokenizer object internally maintains a current position within the string to be tokenized. Some operations advance this current position past the characters processed.

The implementation is not thread safe; if a StringTokenizer object is intended to be used in multiple threads, an appropriate wrapper must be provided.
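
One possible shape for such a wrapper is sketched below. The class SynchronizedTokens and its poll method are hypothetical names; the wrapper accepts any Iterator<String>, which this class implements, and a plain list iterator stands in for the tokenizer so that the sketch runs on the JDK alone. The test for a remaining token and the read of that token happen under one lock, because separate calls to hasNext() and next() from different threads could interleave.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical wrapper; not part of the library.
class SynchronizedTokens {
    private final Iterator<String> tokens;

    SynchronizedTokens(Iterator<String> tokens) {
        this.tokens = tokens;
    }

    /** Returns the next token, or null when none remain. */
    synchronized String poll() {
        return tokens.hasNext() ? tokens.next() : null;
    }

    public static void main(String[] args) {
        // Stand-in token source; a tokenizer instance could be passed instead.
        SynchronizedTokens shared = new SynchronizedTokens(Arrays.asList("a", "b").iterator());
        for (String t = shared.poll(); t != null; t = shared.poll()) {
            System.out.println(t);
        }
    }
}
```

A null return is used as the end marker here on the assumption that tokens themselves are never null (an empty token would be the empty string).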

The following is one example of the use of the tokenizer. It also demonstrates the usefulness of having both token and non-token delimiters in one StringTokenizer.

The code:

String s = "  (   aaa \t  * (b+c1 ))";
StringTokenizer tokenizer = new StringTokenizer(s, " \t\n\r\f", "()+*");
while (tokenizer.hasMoreTokens()) {
    System.out.println(tokenizer.nextToken());
}

prints the following output:

(
aaa
*
(
b
+
c1
)
)
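
For comparison, the sketch below obtains the same token list with java.util.StringTokenizer alone. With returnDelims set to true that class returns every delimiter, whitespace included, so the whitespace tokens have to be filtered out by hand. The class name JdkOnlyEquivalent is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class JdkOnlyEquivalent {
    static List<String> tokenize(String s) {
        String whitespace = " \t\n\r\f";
        // returnDelims=true: each delimiter character comes back as a token of length one.
        StringTokenizer tokenizer = new StringTokenizer(s, whitespace + "()+*", true);
        List<String> tokens = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            String token = tokenizer.nextToken();
            boolean isWhitespaceDelimiter = token.length() == 1 && whitespace.contains(token);
            if (!isWhitespaceDelimiter) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String token : tokenize("  (   aaa \t  * (b+c1 ))")) {
            System.out.println(token);
        }
    }
}
```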

Compatibility with java.util.StringTokenizer

In the original version of java.util.StringTokenizer, the method nextToken() left the current position after the returned token, and the method hasMoreTokens() moved (as a side effect) the current position before the beginning of the next token. Thus, the code:

String s = "x=a,b,c";
java.util.StringTokenizer tokenizer = new java.util.StringTokenizer(s,"=");
System.out.println(tokenizer.nextToken());
while (tokenizer.hasMoreTokens()) {
    System.out.println(tokenizer.nextToken(","));
}

prints the following output:

x
a
b
c

The Java SDK 1.3 implementation removed the undesired side effect of the hasMoreTokens() method: it no longer advances the current position. However, after this change the output of the above code became:

x
=a
b
c

and there was no good way to produce a second token without "=".
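
The behavior can be checked against the java.util.StringTokenizer shipped with a current JDK. The sketch below collects the tokens into a list instead of printing them one by one; the class name Jdk13Behavior is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class Jdk13Behavior {
    static List<String> run() {
        List<String> out = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer("x=a,b,c", "=");
        out.add(tokenizer.nextToken());
        // hasMoreTokens() no longer moves past "=", so the next token starts with it.
        while (tokenizer.hasMoreTokens()) {
            out.add(tokenizer.nextToken(","));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [x, =a, b, c]
    }
}
```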

To solve the problem, this implementation introduces a new method, skipDelimiters(). To produce the original output, the above code should be modified as follows:

String s = "x=a,b,c";
StringTokenizer tokenizer = new StringTokenizer(s,"=");
System.out.println(tokenizer.nextToken());
tokenizer.skipDelimiters();
while (tokenizer.hasMoreTokens()) {
    System.out.println(tokenizer.nextToken(","));
}

Since:
ostermillerutils 1.00.00
Author:
Stephen Ostermiller http://ostermiller.org/contact.pl?regarding=Java+Utilities

Constructor Summary
StringTokenizer(String text)
          Constructs a string tokenizer for the specified string.
StringTokenizer(String text, String nontokenDelims)
          Constructs a string tokenizer for the specified string.
StringTokenizer(String text, String delims, boolean delimsAreTokens)
          Constructs a string tokenizer for the specified string.
StringTokenizer(String text, String nontokenDelims, String tokenDelims)
          Constructs a string tokenizer for the specified string.
StringTokenizer(String text, String nontokenDelims, String tokenDelims, boolean returnEmptyTokens)
          Constructs a string tokenizer for the specified string.
 
Method Summary
 int countTokens()
          Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception.
 int countTokens(String delims)
          Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception using the given set of (non-token) delimiters.
 int countTokens(String delims, boolean delimsAreTokens)
          Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception using the given set of delimiters.
 int countTokens(String nontokenDelims, String tokenDelims)
          Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception using the given set of delimiters.
 int countTokens(String nontokenDelims, String tokenDelims, boolean returnEmptyTokens)
          Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception using the given set of delimiters.
 int getCurrentPosition()
          Get the index of the character immediately following the end of the last token.
 boolean hasMoreElements()
          Returns the same value as the hasMoreTokens() method.
 boolean hasMoreTokens()
          Tests if there are more tokens available from this tokenizer's string.
 boolean hasNext()
          Returns the same value as the hasMoreTokens() method.
 String next()
          Returns the same value as the nextToken() method.
 String nextElement()
          Returns the same value as the nextToken() method.
 String nextToken()
          Returns the next token from this string tokenizer.
 String nextToken(String nontokenDelims)
          Returns the next token in this string tokenizer's string.
 String nextToken(String delims, boolean delimsAreTokens)
          Returns the next token in this string tokenizer's string.
 String nextToken(String nontokenDelims, String tokenDelims)
          Returns the next token in this string tokenizer's string.
 String nextToken(String nontokenDelims, String tokenDelims, boolean returnEmptyTokens)
          Returns the next token in this string tokenizer's string.
 String peek()
          Returns the same value as nextToken() but does not alter the internal state of the Tokenizer.
 void remove()
          This implementation always throws UnsupportedOperationException.
 String restOfText()
          Retrieves the rest of the text as a single token.
 void setDelimiters(String delims)
          Set the delimiters used to this set of (non-token) delimiters.
 void setDelimiters(String delims, boolean delimsAreTokens)
          Set the delimiters used to this set of delimiters.
 void setDelimiters(String nontokenDelims, String tokenDelims)
          Set the delimiters used to this set of delimiters.
 void setDelimiters(String nontokenDelims, String tokenDelims, boolean returnEmptyTokens)
          Set the delimiters used to this set of delimiters.
 void setReturnEmptyTokens(boolean returnEmptyTokens)
          Set whether empty tokens should be returned from this point in the tokenizing process onward.
 void setText(String text)
          Set the text to be tokenized in this StringTokenizer.
 boolean skipDelimiters()
          Advances the current position so it is before the next token.
 String[] toArray()
          Retrieve all of the remaining tokens in a String array.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

StringTokenizer

public StringTokenizer(String text,
                       String nontokenDelims,
                       String tokenDelims)
Constructs a string tokenizer for the specified string. Both token and non-token delimiters are specified.

The current position is set at the beginning of the string.

Parameters:
text - a string to be parsed.
nontokenDelims - the non-token delimiters, i.e. the delimiters that only separate tokens and are not returned as separate tokens.
tokenDelims - the token delimiters, i.e. delimiters that both separate tokens, and are themselves returned as tokens.
Throws:
NullPointerException - if text is null.
Since:
ostermillerutils 1.00.00

StringTokenizer

public StringTokenizer(String text,
                       String nontokenDelims,
                       String tokenDelims,
                       boolean returnEmptyTokens)
Constructs a string tokenizer for the specified string. Both token and non-token delimiters are specified and whether or not empty tokens are returned is specified.

Empty tokens are tokens that are between consecutive delimiters.

It is a primary constructor (i.e. all other constructors are defined in terms of it.)

The current position is set at the beginning of the string.

Parameters:
text - a string to be parsed.
nontokenDelims - the non-token delimiters, i.e. the delimiters that only separate tokens and are not returned as separate tokens.
tokenDelims - the token delimiters, i.e. delimiters that both separate tokens, and are themselves returned as tokens.
returnEmptyTokens - true if empty tokens may be returned; false otherwise.
Throws:
NullPointerException - if text is null.
Since:
ostermillerutils 1.00.00

StringTokenizer

public StringTokenizer(String text,
                       String delims,
                       boolean delimsAreTokens)
Constructs a string tokenizer for the specified string. Either token or non-token delimiters are specified.

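Is equivalent to:

  • If the third parameter is false -- StringTokenizer(text, delims, null)
  • If the third parameter is true -- StringTokenizer(text, null, delims)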

Parameters:
text - a string to be parsed.
delims - the delimiters.
delimsAreTokens - flag indicating whether the second parameter specifies token or non-token delimiters: false -- the second parameter specifies non-token delimiters, the set of token delimiters is empty; true -- the second parameter specifies token delimiters, the set of non-token delimiters is empty.
Throws:
NullPointerException - if text is null.
Since:
ostermillerutils 1.00.00

StringTokenizer

public StringTokenizer(String text,
                       String nontokenDelims)
Constructs a string tokenizer for the specified string. The characters in the nontokenDelims argument are the delimiters for separating tokens. Delimiter characters themselves will not be treated as tokens.

Is equivalent to StringTokenizer(text, nontokenDelims, null).

Parameters:
text - a string to be parsed.
nontokenDelims - the non-token delimiters.
Throws:
NullPointerException - if text is null.
Since:
ostermillerutils 1.00.00

StringTokenizer

public StringTokenizer(String text)
Constructs a string tokenizer for the specified string. The tokenizer uses " \t\n\r\f" as a delimiter set of non-token delimiters, and an empty token delimiter set.

Is equivalent to StringTokenizer(text, " \t\n\r\f", null).

Parameters:
text - a string to be parsed.
Throws:
NullPointerException - if text is null.
Since:
ostermillerutils 1.00.00

Method Detail

setText

public void setText(String text)
Set the text to be tokenized in this StringTokenizer.

This is useful for StringTokenizer re-use, so that a new string tokenizer does not have to be created for each string you want to tokenize.

The string will be tokenized from the beginning of the string.

Parameters:
text - a string to be parsed.
Throws:
NullPointerException - if text is null.
Since:
ostermillerutils 1.00.00

hasMoreTokens

public boolean hasMoreTokens()
Tests if there are more tokens available from this tokenizer's string. If this method returns true, then a subsequent call to nextToken with no argument will successfully return a token.

The current position is not changed.

Returns:
true if and only if there is at least one token in the string after the current position; false otherwise.
Since:
ostermillerutils 1.00.00

nextToken

public String nextToken()
Returns the next token from this string tokenizer.

The current position is set after the token returned.

Returns:
the next token from this string tokenizer.
Throws:
NoSuchElementException - if there are no more tokens in this tokenizer's string.
Since:
ostermillerutils 1.00.00

skipDelimiters

public boolean skipDelimiters()
Advances the current position so it is before the next token.

This method skips non-token delimiters but does not skip token delimiters.

This method is useful when switching to a new delimiter set (see the second example in the class comment).

Returns:
true if there are more tokens, false otherwise.
Since:
ostermillerutils 1.00.00
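
The effect of calling skipDelimiters() before a delimiter switch can be sketched with a miniature tokenizer. The class SkipSketch below is hypothetical and models only non-token delimiters and a current position; it is not the library's implementation. Its run method returns x, a, b, c when the skip is performed and x, =a, b, c when it is not.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical miniature tokenizer; only non-token delimiters are modelled.
class SkipSketch {
    private final String text;
    private String delims;
    private int pos = 0;

    SkipSketch(String text, String delims) {
        this.text = text;
        this.delims = delims;
    }

    /** Moves the position past delimiters; returns true if a token follows. */
    boolean skipDelimiters() {
        while (pos < text.length() && delims.indexOf(text.charAt(pos)) >= 0) {
            pos++;
        }
        return pos < text.length();
    }

    /** Looks ahead for a token without moving the position. */
    boolean hasMoreTokens() {
        int p = pos;
        while (p < text.length() && delims.indexOf(text.charAt(p)) >= 0) {
            p++;
        }
        return p < text.length();
    }

    /** Switches to newDelims first, then returns the next token under the new set. */
    String nextToken(String newDelims) {
        delims = newDelims;
        if (!skipDelimiters()) {
            throw new NoSuchElementException();
        }
        int start = pos;
        while (pos < text.length() && delims.indexOf(text.charAt(pos)) < 0) {
            pos++;
        }
        return text.substring(start, pos);
    }

    static List<String> run(boolean skipFirst) {
        SkipSketch t = new SkipSketch("x=a,b,c", "=");
        List<String> out = new ArrayList<>();
        out.add(t.nextToken("="));
        if (skipFirst) {
            t.skipDelimiters(); // consume "=" while it is still a delimiter
        }
        while (t.hasMoreTokens()) {
            out.add(t.nextToken(","));
        }
        return out;
    }
}
```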

countTokens

public int countTokens()
Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception. The current position is not advanced.

Returns:
the number of tokens remaining in the string using the current delimiter set.
Since:
ostermillerutils 1.00.00
See Also:
nextToken()
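
A short illustration of this contract follows. It uses java.util.StringTokenizer as a stand-in, on the assumption that the no-argument countTokens() here behaves like its java.util counterpart, whose documentation uses the same wording. The class name CountDemo is hypothetical.

```java
import java.util.StringTokenizer;

class CountDemo {
    /** Returns the token count before and after consuming one token. */
    static int[] countBeforeAndAfter() {
        StringTokenizer tokenizer = new StringTokenizer("one two three");
        int before = tokenizer.countTokens(); // counting does not consume anything
        tokenizer.nextToken();                // consumes "one"
        int after = tokenizer.countTokens();
        return new int[] {before, after};
    }

    public static void main(String[] args) {
        int[] counts = countBeforeAndAfter();
        System.out.println(counts[0] + " then " + counts[1]); // 3 then 2
    }
}
```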

setDelimiters

public void setDelimiters(String delims)
Set the delimiters used to this set of (non-token) delimiters.

Parameters:
delims - the new set of non-token delimiters (the set of token delimiters will be empty).
Since:
ostermillerutils 1.00.00

setDelimiters

public void setDelimiters(String delims,
                          boolean delimsAreTokens)
Set the delimiters used to this set of delimiters.

Parameters:
delims - the new set of delimiters.
delimsAreTokens - flag indicating whether the first parameter specifies token or non-token delimiters: false -- the first parameter specifies non-token delimiters, the set of token delimiters is empty; true -- the first parameter specifies token delimiters, the set of non-token delimiters is empty.
Since:
ostermillerutils 1.00.00

setDelimiters

public void setDelimiters(String nontokenDelims,
                          String tokenDelims)
Set the delimiters used to this set of delimiters.

Parameters:
nontokenDelims - the new set of non-token delimiters.
tokenDelims - the new set of token delimiters.
Since:
ostermillerutils 1.00.00

setDelimiters

public void setDelimiters(String nontokenDelims,
                          String tokenDelims,
                          boolean returnEmptyTokens)
Set the delimiters used to this set of delimiters.

Parameters:
nontokenDelims - the new set of non-token delimiters.
tokenDelims - the new set of token delimiters.
returnEmptyTokens - true if empty tokens may be returned; false otherwise.
Since:
ostermillerutils 1.00.00

countTokens

public int countTokens(String delims)
Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception using the given set of (non-token) delimiters. The delimiters given will be used for future calls to nextToken() unless new delimiters are given. The current position is not advanced.

Parameters:
delims - the new set of non-token delimiters (the set of token delimiters will be empty).
Returns:
the number of tokens remaining in the string using the new delimiter set.
Since:
ostermillerutils 1.00.00
See Also:
countTokens()

countTokens

public int countTokens(String delims,
                       boolean delimsAreTokens)
Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception using the given set of delimiters. The delimiters given will be used for future calls to nextToken() unless new delimiters are given. The current position is not advanced.

Parameters:
delims - the new set of delimiters.
delimsAreTokens - flag indicating whether the first parameter specifies token or non-token delimiters: false -- the first parameter specifies non-token delimiters, the set of token delimiters is empty; true -- the first parameter specifies token delimiters, the set of non-token delimiters is empty.
Returns:
the number of tokens remaining in the string using the new delimiter set.
Since:
ostermillerutils 1.00.00
See Also:
countTokens()

countTokens

public int countTokens(String nontokenDelims,
                       String tokenDelims)
Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception using the given set of delimiters. The delimiters given will be used for future calls to nextToken() unless new delimiters are given. The current position is not advanced.

Parameters:
nontokenDelims - the new set of non-token delimiters.
tokenDelims - the new set of token delimiters.
Returns:
the number of tokens remaining in the string using the new delimiter set.
Since:
ostermillerutils 1.00.00
See Also:
countTokens()

countTokens

public int countTokens(String nontokenDelims,
                       String tokenDelims,
                       boolean returnEmptyTokens)
Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception using the given set of delimiters. The delimiters given will be used for future calls to nextToken() unless new delimiters are given. The current position is not advanced.

Parameters:
nontokenDelims - the new set of non-token delimiters.
tokenDelims - the new set of token delimiters.
returnEmptyTokens - true if empty tokens may be returned; false otherwise.
Returns:
the number of tokens remaining in the string using the new delimiter set.
Since:
ostermillerutils 1.00.00
See Also:
countTokens()

nextToken

public String nextToken(String nontokenDelims,
                        String tokenDelims)
Returns the next token in this string tokenizer's string.

First, the sets of token and non-token delimiters are changed to be the tokenDelims and nontokenDelims, respectively. Then the next token (with respect to new delimiters) in the string after the current position is returned.

The current position is set after the token returned.

The new delimiter sets remain in use after this call.

Parameters:
nontokenDelims - the new set of non-token delimiters.
tokenDelims - the new set of token delimiters.
Returns:
the next token, after switching to the new delimiter set.
Throws:
NoSuchElementException - if there are no more tokens in this tokenizer's string.
Since:
ostermillerutils 1.00.00
See Also:
nextToken()

nextToken

public String nextToken(String nontokenDelims,
                        String tokenDelims,
                        boolean returnEmptyTokens)
Returns the next token in this string tokenizer's string.

First, the sets of token and non-token delimiters are changed to be the tokenDelims and nontokenDelims, respectively; and whether or not to return empty tokens is set. Then the next token (with respect to new delimiters) in the string after the current position is returned.

The current position is set after the token returned.

The new delimiter sets and the empty-token setting remain in effect after this call.

Parameters:
nontokenDelims - the new set of non-token delimiters.
tokenDelims - the new set of token delimiters.
returnEmptyTokens - true if empty tokens may be returned; false otherwise.
Returns:
the next token, after switching to the new delimiter set.
Throws:
NoSuchElementException - if there are no more tokens in this tokenizer's string.
Since:
ostermillerutils 1.00.00
See Also:
nextToken()

nextToken

public String nextToken(String delims,
                        boolean delimsAreTokens)
Returns the next token in this string tokenizer's string.

Is equivalent to:

  • If the second parameter is false -- nextToken(delims, null)
  • If the second parameter is true -- nextToken(null, delims)

Parameters:
delims - the new set of token or non-token delimiters.
delimsAreTokens - flag indicating whether the first parameter specifies token or non-token delimiters: false -- the first parameter specifies non-token delimiters, the set of token delimiters is empty; true -- the first parameter specifies token delimiters, the set of non-token delimiters is empty.
Returns:
the next token, after switching to the new delimiter set.
Throws:
NoSuchElementException - if there are no more tokens in this tokenizer's string.
Since:
ostermillerutils 1.00.00
See Also:
nextToken(String,String)

nextToken

public String nextToken(String nontokenDelims)
Returns the next token in this string tokenizer's string.

Is equivalent to nextToken(nontokenDelims, null).

Parameters:
nontokenDelims - the new set of non-token delimiters (the set of token delimiters will be empty).
Returns:
the next token, after switching to the new delimiter set.
Throws:
NoSuchElementException - if there are no more tokens in this tokenizer's string.
Since:
ostermillerutils 1.00.00
See Also:
nextToken(String,String)

hasMoreElements

public boolean hasMoreElements()
Returns the same value as the hasMoreTokens() method. It exists so that this class can implement the Enumeration interface.

Specified by:
hasMoreElements in interface Enumeration<String>
Returns:
true if there are more tokens; false otherwise.
Since:
ostermillerutils 1.00.00
See Also:
Enumeration, hasMoreTokens()

nextElement

public String nextElement()
Returns the same value as the nextToken() method. It exists so that this class can implement the Enumeration interface.

Specified by:
nextElement in interface Enumeration<String>
Returns:
the next token in the string.
Throws:
NoSuchElementException - if there are no more tokens in this tokenizer's string.
Since:
ostermillerutils 1.00.00
See Also:
Enumeration, nextToken()
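
Because the class implements Enumeration<String>, it can be handed to code written against that interface. The sketch below shows one such consumer; the class name EnumerationDemo is hypothetical, and a list-backed enumeration stands in for the tokenizer so that the example runs on the JDK alone.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

class EnumerationDemo {
    /** Copies every remaining element of the enumeration into a list. */
    static List<String> drain(Enumeration<String> tokens) {
        return Collections.list(tokens);
    }

    public static void main(String[] args) {
        // Stand-in enumeration; a tokenizer instance could be passed instead.
        Enumeration<String> standIn = Collections.enumeration(Arrays.asList("a", "b", "c"));
        System.out.println(drain(standIn)); // [a, b, c]
    }
}
```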

hasNext

public boolean hasNext()
Returns the same value as the hasMoreTokens() method. It exists so that this class can implement the Iterator interface.

Specified by:
hasNext in interface Iterator<String>
Returns:
true if there are more tokens; false otherwise.
Since:
ostermillerutils 1.00.00
See Also:
Iterator, hasMoreTokens()

next

public String next()
Returns the same value as the nextToken() method. It exists so that this class can implement the Iterator interface.

Specified by:
next in interface Iterator<String>
Returns:
the next token in the string.
Throws:
NoSuchElementException - if there are no more tokens in this tokenizer's string.
Since:
ostermillerutils 1.00.00
See Also:
Iterator, nextToken()
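
Implementing Iterator<String> lets generic code consume the tokens. The sketch below is a minimal consumer; the class name IteratorDemo is hypothetical, and java.util.Scanner (which also implements Iterator<String>) stands in for the tokenizer so that the example runs on the JDK alone.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Scanner;

class IteratorDemo {
    /** Collects every remaining token of any Iterator<String> into a list. */
    static List<String> drain(Iterator<String> tokens) {
        List<String> out = new ArrayList<>();
        tokens.forEachRemaining(out::add);
        return out;
    }

    public static void main(String[] args) {
        // Scanner splits on whitespace by default; a tokenizer could be passed instead.
        System.out.println(drain(new Scanner("a b c"))); // [a, b, c]
    }
}
```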

remove

public void remove()
This implementation always throws UnsupportedOperationException. It exists so that this class can implement the Iterator interface.

Specified by:
remove in interface Iterator<String>
Throws:
UnsupportedOperationException - always thrown.
Since:
ostermillerutils 1.00.00
See Also:
Iterator

setReturnEmptyTokens

public void setReturnEmptyTokens(boolean returnEmptyTokens)
Set whether empty tokens should be returned from this point in the tokenizing process onward.

Empty tokens occur when two delimiters are next to each other or a delimiter occurs at the beginning or end of a string. If empty tokens are set to be returned, and a comma is the non-token delimiter, the following table shows how many tokens are in each string.

String          Number of tokens
"one,two"       2 - normal case with no empty tokens.
"one,,three"    3 including the empty token in the middle.
"one,"          2 including the empty token at the end.
",two"          2 including the empty token at the beginning.
","             2 including the empty tokens at the beginning and the end.
""              1 - all strings will have at least one token if empty tokens are returned.
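
The counts in the table can be cross-checked with the JDK's String.split, which keeps every empty string when it is given a negative limit. This is an analogy only and says nothing about how this class is implemented; the class name EmptyTokenCounts is hypothetical.

```java
class EmptyTokenCounts {
    /** Counts comma-separated fields, keeping leading, inner and trailing empty ones. */
    static int count(String s) {
        // A negative limit stops split from discarding trailing empty strings.
        return s.split(",", -1).length;
    }

    public static void main(String[] args) {
        String[] samples = {"one,two", "one,,three", "one,", ",two", ",", ""};
        for (String sample : samples) {
            System.out.println("\"" + sample + "\" has " + count(sample) + " token(s)");
        }
    }
}
```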

Parameters:
returnEmptyTokens - true iff empty tokens should be returned.
Since:
ostermillerutils 1.00.00

getCurrentPosition

public int getCurrentPosition()
Get the index of the character immediately following the end of the last token. This is the position at which this tokenizer will begin looking for the next token when a nextToken() method is invoked.

Returns:
the current position or -1 if the entire string has been tokenized.
Since:
ostermillerutils 1.00.00

toArray

public String[] toArray()
Retrieve all of the remaining tokens in a String array. This method uses the options that are currently set for the tokenizer and will advance the state of the tokenizer such that hasMoreTokens() will return false.

Returns:
an array of tokens from this tokenizer.
Since:
ostermillerutils 1.00.00

restOfText

public String restOfText()
Retrieves the rest of the text as a single token. After calling this method hasMoreTokens() will always return false.

Returns:
any part of the text that has not yet been tokenized.
Since:
ostermillerutils 1.00.00

peek

public String peek()
Returns the same value as nextToken() but does not alter the internal state of the Tokenizer. Subsequent calls to peek() or a call to nextToken() will return the same token again.

Returns:
the next token from this string tokenizer.
Throws:
NoSuchElementException - if there are no more tokens in this tokenizer's string.
Since:
ostermillerutils 1.00.00
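
The observable contract of peek() can be sketched with a one-token lookahead buffer over any Iterator<String>. The class PeekingTokens below is hypothetical; this class may well implement peek() differently (for example by saving and restoring its position), so the sketch shows only the behavior a caller sees.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical lookahead wrapper; not the library's implementation.
class PeekingTokens {
    private final Iterator<String> source;
    private String buffered;
    private boolean hasBuffered;

    PeekingTokens(Iterator<String> source) {
        this.source = source;
    }

    /** Returns the upcoming token without consuming it. */
    String peek() {
        if (!hasBuffered) {
            buffered = source.next(); // throws NoSuchElementException when exhausted
            hasBuffered = true;
        }
        return buffered;
    }

    /** Returns the upcoming token and consumes it. */
    String next() {
        String token = peek();
        hasBuffered = false;
        buffered = null;
        return token;
    }

    boolean hasNext() {
        return hasBuffered || source.hasNext();
    }

    public static void main(String[] args) {
        PeekingTokens tokens = new PeekingTokens(Arrays.asList("a", "b").iterator());
        System.out.println(tokens.peek()); // a
        System.out.println(tokens.next()); // a
        System.out.println(tokens.next()); // b
    }
}
```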


Copyright © 2001-2012 by Stephen Ostermiller