com.Ostermiller.Syntax.Lexer
Class Token

java.lang.Object
  extended by com.Ostermiller.Syntax.Lexer.Token
Direct Known Subclasses:
CToken, HTMLToken, HTMLToken1, JavaScriptToken, JavaToken, LatexToken, PlainToken, PropertiesToken, SQLToken

public abstract class Token
extends Object

A generic token class.


Field Summary
static int INITIAL_STATE
          The initial state of the tokenizer.
static int UNDEFINED_STATE
          The state of the tokenizer is undefined.
 
Constructor Summary
Token()
           
 
Method Summary
abstract  String errorString()
          Get a String that explains the error, if this token is an error.
abstract  int getCharBegin()
          Get the offset into the input, in characters, at which this token started.
abstract  int getCharEnd()
          Get the offset into the input, in characters, at which this token ended.
abstract  String getContents()
          The actual text of the token.
abstract  String getDescription()
          A description of this token.
abstract  int getID()
          A unique ID for this type of token.
abstract  int getLineNumber()
          Get the line number of the input on which this token started.
abstract  int getState()
          Get an integer representing the state the tokenizer is in after returning this token.
abstract  boolean isComment()
          Determine if this token is a comment.
abstract  boolean isError()
          Determine if this token is an error.
abstract  boolean isWhiteSpace()
          Determine if this token is whitespace.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

UNDEFINED_STATE

public static final int UNDEFINED_STATE
The state of the tokenizer is undefined.

See Also:
Constant Field Values

INITIAL_STATE

public static final int INITIAL_STATE
The initial state of the tokenizer. Any time the tokenizer returns to this state, it could be restarted from that point without side effects.

See Also:
Constant Field Values
Constructor Detail

Token

public Token()
Method Detail

getID

public abstract int getID()
A unique ID for this type of token. Typically, ID numbers for each type will be static variables of the Token class.

Returns:
an ID for this type of token.
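
As a rough illustration of the convention described above, the sketch below keeps its type IDs as static constants and maps each ID to a description. DemoToken, its constants, and their values are hypothetical and not part of this package; a real subclass would extend Token and implement the remaining abstract methods.

```java
// Hypothetical token type; the constant names and values are invented
// for illustration and do not come from this package.
final class DemoToken {
    static final int KEYWORD = 1;
    static final int IDENTIFIER = 2;
    static final int COMMENT = 3;

    private final int id;
    private final String contents;

    DemoToken(int id, String contents) {
        this.id = id;
        this.contents = contents;
    }

    // Mirrors Token.getID(): one ID per type of token.
    int getID() {
        return id;
    }

    // Mirrors Token.getContents().
    String getContents() {
        return contents;
    }

    // Mirrors Token.getDescription(): maps the type ID to a short name.
    String getDescription() {
        switch (id) {
            case KEYWORD:
                return "keyword";
            case IDENTIFIER:
                return "identifier";
            case COMMENT:
                return "comment";
            default:
                return "unknown";
        }
    }
}
```

Callers can then branch on getID() with a switch statement instead of comparing description strings.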

getDescription

public abstract String getDescription()
A description of this token. The description should be appropriate for syntax highlighting; for example, "comment" might be returned for a comment. This makes HTML syntax highlighting straightforward: define style sheet classes with the same names as the descriptions, and write each token to the HTML file with the matching CSS class name.

Returns:
a description of this token.
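
The highlighting approach described above might look roughly like the following sketch. HighlightToken, SimpleHighlightToken, and HtmlHighlighter are hypothetical names introduced only for this example; HighlightToken stands in for the two Token methods involved, and the HTML escaping shown is a minimal assumption rather than anything this package provides.

```java
import java.util.List;

// Hypothetical stand-in exposing only the Token methods used here.
interface HighlightToken {
    String getDescription();
    String getContents();
}

// Hypothetical concrete token so the sketch runs on its own.
final class SimpleHighlightToken implements HighlightToken {
    private final String description;
    private final String contents;

    SimpleHighlightToken(String description, String contents) {
        this.description = description;
        this.contents = contents;
    }

    public String getDescription() { return description; }
    public String getContents() { return contents; }
}

final class HtmlHighlighter {
    // Escape the characters that are special in HTML text and attributes.
    static String escape(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    // Wrap one token in a span whose CSS class is the token description.
    static String toSpan(HighlightToken token) {
        return "<span class=\"" + escape(token.getDescription()) + "\">"
                + escape(token.getContents()) + "</span>";
    }

    // Render a token sequence as a single preformatted block.
    static String highlight(List<? extends HighlightToken> tokens) {
        StringBuilder sb = new StringBuilder("<pre>");
        for (HighlightToken token : tokens) {
            sb.append(toSpan(token));
        }
        return sb.append("</pre>").toString();
    }
}
```

A style sheet rule such as `.comment { color: green; }` would then style every token whose description is "comment".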

getContents

public abstract String getContents()
The actual text of the token.

Returns:
a string representing the text of the token.

isComment

public abstract boolean isComment()
Determine if this token is a comment. Sometimes comments should be ignored (when compiling code); other times they should be kept (when syntax highlighting). This method lets callers that want to ignore comments check for them.

Returns:
true if this token represents a comment.

isWhiteSpace

public abstract boolean isWhiteSpace()
Determine if this token is whitespace. Sometimes whitespace should be ignored (when compiling code); other times it should be kept (when beautifying code). This method lets callers that want to ignore whitespace check for it.

Returns:
true if this token represents whitespace.
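
A consumer that wants to ignore both comments and whitespace, as a compiler front end might, could filter a token list along these lines. This is only a sketch: TriviaAwareToken, SimpleTriviaToken, and TokenFilter are hypothetical names, and TriviaAwareToken stands in for the relevant Token methods.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in exposing only the Token methods used here.
interface TriviaAwareToken {
    boolean isComment();
    boolean isWhiteSpace();
    String getContents();
}

// Hypothetical concrete token so the sketch runs on its own.
final class SimpleTriviaToken implements TriviaAwareToken {
    private final String contents;
    private final boolean comment;
    private final boolean whiteSpace;

    SimpleTriviaToken(String contents, boolean comment, boolean whiteSpace) {
        this.contents = contents;
        this.comment = comment;
        this.whiteSpace = whiteSpace;
    }

    public boolean isComment() { return comment; }
    public boolean isWhiteSpace() { return whiteSpace; }
    public String getContents() { return contents; }
}

final class TokenFilter {
    // Return the tokens that are neither comments nor whitespace,
    // preserving their original order.
    static <T extends TriviaAwareToken> List<T> significant(List<T> tokens) {
        List<T> kept = new ArrayList<>();
        for (T token : tokens) {
            if (!token.isComment() && !token.isWhiteSpace()) {
                kept.add(token);
            }
        }
        return kept;
    }
}
```

A syntax highlighter or code beautifier would skip this filter and process every token.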

isError

public abstract boolean isError()
Determine if this token is an error. Let's face it, not all code conforms to spec. For example, the lexer might know about an error if a string literal is not closed.

Returns:
true if this token is an error.

getLineNumber

public abstract int getLineNumber()
Get the line number of the input on which this token started.

Returns:
the line number of the input on which this token started

getCharBegin

public abstract int getCharBegin()
Get the offset into the input, in characters, at which this token started.

Returns:
the offset into the input in characters at which this token started

getCharEnd

public abstract int getCharEnd()
Get the offset into the input, in characters, at which this token ended.

Returns:
the offset into the input in characters at which this token ended

errorString

public abstract String errorString()
Get a String that explains the error, if this token is an error.

Returns:
a String that explains the error if this token is an error; null otherwise.
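
The error and position accessors can be combined into a simple diagnostic message, as in the sketch below. DiagnosticToken, SimpleDiagnosticToken, ErrorReporter, and the message format are all hypothetical; DiagnosticToken stands in for the relevant Token methods. The sketch prints the offsets exactly as the token reports them and makes no assumption about whether the end offset is inclusive.

```java
// Hypothetical stand-in exposing only the Token methods used here.
interface DiagnosticToken {
    boolean isError();
    String errorString();
    int getLineNumber();
    int getCharBegin();
    int getCharEnd();
}

// Hypothetical concrete token so the sketch runs on its own.
final class SimpleDiagnosticToken implements DiagnosticToken {
    private final String error; // null when the token is not an error
    private final int line;
    private final int begin;
    private final int end;

    SimpleDiagnosticToken(String error, int line, int begin, int end) {
        this.error = error;
        this.line = line;
        this.begin = begin;
        this.end = end;
    }

    public boolean isError() { return error != null; }
    public String errorString() { return error; }
    public int getLineNumber() { return line; }
    public int getCharBegin() { return begin; }
    public int getCharEnd() { return end; }
}

final class ErrorReporter {
    // Build a one-line message for an error token, or return null
    // when the token is not an error.
    static String describe(DiagnosticToken token) {
        if (!token.isError()) {
            return null;
        }
        String message = token.errorString();
        if (message == null) {
            message = "unknown error"; // defensive fallback
        }
        return "line " + token.getLineNumber()
                + ", chars " + token.getCharBegin() + "-" + token.getCharEnd()
                + ": " + message;
    }
}
```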

getState

public abstract int getState()
Get an integer representing the state the tokenizer is in after returning this token. Those who are interested in incremental tokenizing for performance reasons will want to use this method to figure out where the tokenizer may be restarted. The tokenizer starts in Token.INITIAL_STATE, so any time that it reports that it has returned to this state, the tokenizer may be restarted from there.
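
For incremental tokenizing, one possible way to pick a restart point after an edit is sketched below. It assumes a cached list of tokens in input order and treats the end offset of a token as the position where tokenizing could resume; both are assumptions of this example and are not stated by the API. StatefulToken, SimpleStatefulToken, and RestartPoint are hypothetical names, and the initial state is passed in as a parameter where real code would use Token.INITIAL_STATE.

```java
import java.util.List;

// Hypothetical stand-in exposing only the Token methods used here.
interface StatefulToken {
    int getState();
    int getCharEnd();
}

// Hypothetical concrete token so the sketch runs on its own.
final class SimpleStatefulToken implements StatefulToken {
    private final int state;
    private final int charEnd;

    SimpleStatefulToken(int state, int charEnd) {
        this.state = state;
        this.charEnd = charEnd;
    }

    public int getState() { return state; }
    public int getCharEnd() { return charEnd; }
}

final class RestartPoint {
    // Find the end offset of the last token that ends strictly before
    // editOffset and left the tokenizer in initialState. Returns 0
    // (restart from the beginning of the input) when there is none.
    static int find(List<? extends StatefulToken> tokens,
                    int editOffset, int initialState) {
        int restart = 0;
        for (StatefulToken token : tokens) {
            if (token.getCharEnd() >= editOffset) {
                break; // this token touches or follows the edit
            }
            if (token.getState() == initialState) {
                restart = token.getCharEnd();
            }
        }
        return restart;
    }
}
```

Tokens before the returned offset can be kept as they are; only the input from that offset onward needs to be tokenized again.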