com.Ostermiller.Syntax.Lexer
Class Token
java.lang.Object
com.Ostermiller.Syntax.Lexer.Token
- Direct Known Subclasses:
- CToken, HTMLToken, HTMLToken1, JavaScriptToken, JavaToken, LatexToken, PlainToken, PropertiesToken, SQLToken
public abstract class Token
- extends Object
A generic token class.
Field Summary
static int | INITIAL_STATE | The initial state of the tokenizer.
static int | UNDEFINED_STATE | The state of the tokenizer is undefined.
Constructor Summary
Token()
Method Summary
abstract String | errorString() | Get a String that explains the error, if this token is an error.
abstract int | getCharBegin() | Get the offset into the input, in characters, at which this token started.
abstract int | getCharEnd() | Get the offset into the input, in characters, at which this token ended.
abstract String | getContents() | The actual meat of the token.
abstract String | getDescription() | A description of this token.
abstract int | getID() | A unique ID for this type of token.
abstract int | getLineNumber() | Get the line number of the input on which this token started.
abstract int | getState() | Get an integer representing the state the tokenizer is in after returning this token.
abstract boolean | isComment() | Determine if this token is a comment.
abstract boolean | isError() | Determine if this token is an error.
abstract boolean | isWhiteSpace() | Determine if this token is whitespace.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
UNDEFINED_STATE
public static final int UNDEFINED_STATE
- The state of the tokenizer is undefined.
- See Also:
- Constant Field Values
INITIAL_STATE
public static final int INITIAL_STATE
- The initial state of the tokenizer.
Any time the tokenizer returns to this state,
the tokenizer could be restarted from that point
without side effects.
- See Also:
- Constant Field Values
Token
public Token()
getID
public abstract int getID()
- A unique ID for this type of token.
Typically, ID numbers for each type will
be static variables of the Token class.
- Returns:
- an ID for this type of token.
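The convention described above might look roughly like the following sketch. IdToken is a hypothetical stand-in for Token, reduced to getID() so the example compiles on its own, and TinyToken, its constant names, and their values are invented for illustration; none of them are part of this package.

```java
// Hypothetical stand-in for Token, reduced to getID().
abstract class IdToken {
    abstract int getID();
}

// Hypothetical subclass; the constant names and values are assumptions.
class TinyToken extends IdToken {
    static final int KEYWORD = 1;
    static final int IDENTIFIER = 2;
    static final int COMMENT = 3;

    private final int id;

    TinyToken(int id) {
        this.id = id;
    }

    @Override
    int getID() {
        return id;
    }

    // Map an ID back to a readable name by comparing against the constants.
    static String kindName(IdToken token) {
        switch (token.getID()) {
            case KEYWORD: return "keyword";
            case IDENTIFIER: return "identifier";
            case COMMENT: return "comment";
            default: return "unknown";
        }
    }
}
```

Callers can then compare getID() against the named constants instead of hard-coding numbers.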
getDescription
public abstract String getDescription()
- A description of this token. The description should
be appropriate for syntax highlighting. For example,
"comment" might be returned for a comment. This makes
HTML syntax highlighting easy: define style sheet
classes with the same names as the descriptions, and
write each token into the HTML file with the matching
CSS class name.
- Returns:
- a description of this token.
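A minimal sketch of the highlighting approach described above, assuming a hypothetical stand-in class (SpanToken) that mirrors only getDescription() and getContents(). The helper names HtmlHighlighter, escape, and toSpan are invented for this example and are not part of this package.

```java
// Hypothetical stand-in for Token, reduced to the two methods used here.
abstract class SpanToken {
    abstract String getDescription();
    abstract String getContents();
}

// Simple concrete token used only to exercise the sketch.
class FixedSpanToken extends SpanToken {
    private final String description;
    private final String contents;

    FixedSpanToken(String description, String contents) {
        this.description = description;
        this.contents = contents;
    }

    @Override
    String getDescription() {
        return description;
    }

    @Override
    String getContents() {
        return contents;
    }
}

class HtmlHighlighter {
    // Escape the characters that are significant in HTML text and attributes.
    static String escape(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    // Wrap the token text in a span whose CSS class is the token description.
    static String toSpan(SpanToken token) {
        return "<span class=\"" + escape(token.getDescription()) + "\">"
                + escape(token.getContents()) + "</span>";
    }
}
```

A style sheet rule such as `.comment { color: green; }` would then style every token whose description is "comment".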
getContents
public abstract String getContents()
- The actual meat of the token.
- Returns:
- a string representing the text of the token.
isComment
public abstract boolean isComment()
- Determine if this token is a comment. Sometimes comments should be
ignored (compiling code); other times they should be used
(syntax highlighting). Use this method to check
when comments should be ignored.
- Returns:
- true if this token represents a comment.
isWhiteSpace
public abstract boolean isWhiteSpace()
- Determine if this token is whitespace. Sometimes whitespace should be
ignored (compiling code); other times it should be used
(code beautification). Use this method to check
when whitespace should be ignored.
- Returns:
- true if this token represents whitespace.
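For a consumer that ignores both comments and whitespace, as a compiler front end might, the two checks can be combined into a filter. This is only a sketch: FilterToken is a hypothetical stand-in for Token that mirrors isComment(), isWhiteSpace(), and getContents(), and SimpleFilterToken and TokenFilter are names invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Token, reduced to the methods used here.
abstract class FilterToken {
    abstract String getContents();
    abstract boolean isComment();
    abstract boolean isWhiteSpace();
}

// Simple concrete token used only to exercise the sketch.
class SimpleFilterToken extends FilterToken {
    private final String contents;
    private final boolean comment;
    private final boolean whiteSpace;

    SimpleFilterToken(String contents, boolean comment, boolean whiteSpace) {
        this.contents = contents;
        this.comment = comment;
        this.whiteSpace = whiteSpace;
    }

    @Override
    String getContents() {
        return contents;
    }

    @Override
    boolean isComment() {
        return comment;
    }

    @Override
    boolean isWhiteSpace() {
        return whiteSpace;
    }
}

class TokenFilter {
    // Keep only tokens that are neither comments nor whitespace, in order.
    static List<FilterToken> significant(List<? extends FilterToken> tokens) {
        List<FilterToken> kept = new ArrayList<>();
        for (FilterToken token : tokens) {
            if (!token.isComment() && !token.isWhiteSpace()) {
                kept.add(token);
            }
        }
        return kept;
    }
}
```

A syntax highlighter or code beautifier would skip this filter and process every token.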
isError
public abstract boolean isError()
- Determine if this token is an error. Let's face it, not all code
conforms to spec. The lexer might know about an error,
for example, if a string literal is not closed.
- Returns:
- true if this token is an error.
getLineNumber
public abstract int getLineNumber()
- Get the line number of the input on which this token started.
- Returns:
- the line number of the input on which this token started
getCharBegin
public abstract int getCharBegin()
- Get the offset into the input, in characters, at which this token started.
- Returns:
- the offset into the input in characters at which this token started
getCharEnd
public abstract int getCharEnd()
- Get the offset into the input, in characters, at which this token ended.
- Returns:
- the offset into the input in characters at which this token ended
errorString
public abstract String errorString()
- Get a String that explains the error, if this token is an error.
- Returns:
- a String that explains the error, if this token is an error, null otherwise.
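One way to use isError(), errorString(), and getLineNumber() together is to build a short diagnostic message. The sketch below assumes a hypothetical stand-in class (DiagToken) that mirrors those three methods; SimpleDiagToken, Diagnostics, and the message format are invented for this example.

```java
// Hypothetical stand-in for Token, reduced to the methods used here.
abstract class DiagToken {
    abstract boolean isError();
    abstract String errorString();
    abstract int getLineNumber();
}

// Simple concrete token: a non-null message marks it as an error token.
class SimpleDiagToken extends DiagToken {
    private final int line;
    private final String error;

    SimpleDiagToken(int line, String error) {
        this.line = line;
        this.error = error;
    }

    @Override
    boolean isError() {
        return error != null;
    }

    @Override
    String errorString() {
        return error; // null when this token is not an error
    }

    @Override
    int getLineNumber() {
        return line;
    }
}

class Diagnostics {
    // Return a one-line message for error tokens, or null for other tokens.
    static String describe(DiagToken token) {
        if (!token.isError()) {
            return null;
        }
        return "line " + token.getLineNumber() + ": " + token.errorString();
    }
}
```

Checking isError() first avoids handling the null that errorString() returns for tokens that are not errors.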
getState
public abstract int getState()
- Get an integer representing the state the tokenizer is in after
returning this token.
Those who are interested in incremental tokenizing for performance
reasons will want to use this method to figure out where the tokenizer
may be restarted. The tokenizer starts in Token.INITIAL_STATE, so
any time that it reports that it has returned to this state, the
tokenizer may be restarted from there.
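As a sketch of how an editor might pick a restart point after a text change, the code below scans tokens from an earlier pass and returns the index of the last token that ends before the edit and left the tokenizer in the initial state. Several things here are assumptions: StateToken is a hypothetical stand-in for Token, the value given to INITIAL_STATE is a placeholder, the tokens are assumed to be in input order, and the strict comparison against getCharEnd() is a conservative choice because this page does not say whether the end offset is inclusive.

```java
import java.util.List;

// Hypothetical stand-in for Token, reduced to the members used here.
abstract class StateToken {
    // Placeholder value; the real constant's value is not given on this page.
    static final int INITIAL_STATE = 0;

    abstract int getState();
    abstract int getCharEnd();
}

// Simple concrete token used only to exercise the sketch.
class SimpleStateToken extends StateToken {
    private final int state;
    private final int charEnd;

    SimpleStateToken(int state, int charEnd) {
        this.state = state;
        this.charEnd = charEnd;
    }

    @Override
    int getState() {
        return state;
    }

    @Override
    int getCharEnd() {
        return charEnd;
    }
}

class RestartFinder {
    // Index of the last token that ends strictly before editOffset and after
    // which the tokenizer was in INITIAL_STATE, or -1 if there is none (in
    // which case tokenizing restarts from the beginning of the input).
    static int lastSafeIndex(List<? extends StateToken> tokens, int editOffset) {
        int safe = -1;
        for (int i = 0; i < tokens.size(); i++) {
            StateToken token = tokens.get(i);
            if (token.getCharEnd() >= editOffset) {
                break; // this token touches or follows the edited region
            }
            if (token.getState() == StateToken.INITIAL_STATE) {
                safe = i;
            }
        }
        return safe;
    }
}
```

Tokens up to and including the returned index can be kept, and only the input after that token needs to be tokenized again.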