com.Ostermiller.Syntax.Lexer
Class HTMLToken1

java.lang.Object
  extended by com.Ostermiller.Syntax.Lexer.Token
      extended by com.Ostermiller.Syntax.Lexer.HTMLToken1

public class HTMLToken1
extends Token

An HTMLToken1 is a token returned by a lexer that is lexing an HTML source file. It has several attributes describing the token: the type of token, the text of the token, the line number on which it occurred, the offset in characters into the input at which it started, and the offset in characters into the input at which it ended.


Field Summary
static int CHAR_REF
           
static int COMMENT
           
static int END_TAG_NAME
           
static int EQUAL
           
static int ERROR_MALFORMED_TAG
           
static int NAME
           
static int REFERENCE
           
static int SCRIPT
           
static int TAG_END
           
static int TAG_NAME
           
static int TAG_START
           
static int VALUE
           
static int WHITE_SPACE
           
static int WORD
           
 
Fields inherited from class com.Ostermiller.Syntax.Lexer.Token
INITIAL_STATE, UNDEFINED_STATE
 
Constructor Summary
HTMLToken1(int ID, String contents, int lineNumber, int charBegin, int charEnd)
          Create a new token.
HTMLToken1(int ID, String contents, int lineNumber, int charBegin, int charEnd, int state)
          Create a new token.
 
Method Summary
 String errorString()
          Get a String that explains the error, if this token is an error.
 int getCharBegin()
          Get the offset into the input in characters at which this token started
 int getCharEnd()
          Get the offset into the input in characters at which this token ended
 String getContents()
          Get the contents of this token
 String getDescription()
          A description of this token.
 int getID()
          Get the ID number of this token
 int getLineNumber()
          Get the line number of the input on which this token started
 int getState()
          Get an integer representing the state the tokenizer is in after returning this token.
 boolean isCharacterReference()
          Checks this token to see if it is a character reference.
 boolean isComment()
          Checks this token to see if it is a comment.
 boolean isError()
          Checks this token to see if it is an Error.
 boolean isName()
          Checks this token to see if it is a name of a name value pair.
 boolean isScript()
          Checks this token to see if it is a script.
 boolean isSeparator()
          Checks this token to see if it is a separator.
 boolean isTagName()
          Checks this token to see if it is a tag name.
 boolean isValue()
          Checks this token to see if it is a value of a name value pair.
 boolean isWhiteSpace()
          Checks this token to see if it is White Space.
 boolean isWord()
          Checks this token to see if it is a word.
 String toString()
          Get a representation of this token as a human-readable string.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

TAG_START

public static final int TAG_START
See Also:
Constant Field Values

TAG_END

public static final int TAG_END
See Also:
Constant Field Values

EQUAL

public static final int EQUAL
See Also:
Constant Field Values

WORD

public static final int WORD
See Also:
Constant Field Values

REFERENCE

public static final int REFERENCE
See Also:
Constant Field Values

TAG_NAME

public static final int TAG_NAME
See Also:
Constant Field Values

END_TAG_NAME

public static final int END_TAG_NAME
See Also:
Constant Field Values

NAME

public static final int NAME
See Also:
Constant Field Values

VALUE

public static final int VALUE
See Also:
Constant Field Values

CHAR_REF

public static final int CHAR_REF
See Also:
Constant Field Values

SCRIPT

public static final int SCRIPT
See Also:
Constant Field Values

COMMENT

public static final int COMMENT
See Also:
Constant Field Values

WHITE_SPACE

public static final int WHITE_SPACE
See Also:
Constant Field Values

ERROR_MALFORMED_TAG

public static final int ERROR_MALFORMED_TAG
See Also:
Constant Field Values

Constructor Detail

HTMLToken1

public HTMLToken1(int ID,
                  String contents,
                  int lineNumber,
                  int charBegin,
                  int charEnd)
Create a new token. The constructor is typically called by the lexer.

Parameters:
ID - the id number of the token
contents - A string representing the text of the token
lineNumber - the line number of the input on which this token started
charBegin - the offset into the input in characters at which this token started
charEnd - the offset into the input in characters at which this token ended

HTMLToken1

public HTMLToken1(int ID,
                  String contents,
                  int lineNumber,
                  int charBegin,
                  int charEnd,
                  int state)
Create a new token. The constructor is typically called by the lexer.

Parameters:
ID - the id number of the token
contents - A string representing the text of the token
lineNumber - the line number of the input on which this token started
charBegin - the offset into the input in characters at which this token started
charEnd - the offset into the input in characters at which this token ended
state - the state the tokenizer is in after returning this token.

Method Detail

getState

public int getState()
Get an integer representing the state the tokenizer is in after returning this token. Those who are interested in incremental tokenizing for performance reasons will want to use this method to figure out where the tokenizer may be restarted. The tokenizer starts in Token.INITIAL_STATE, so any time that it reports that it has returned to this state, the tokenizer may be restarted from there.

Specified by:
getState in class Token
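
The restart rule described for getState can be illustrated with a minimal sketch. It does not use the library itself: Tok is a hypothetical stand-in that exposes only getCharBegin, getCharEnd, and getState, the value 0 for INITIAL_STATE is an assumption (the real value of Token.INITIAL_STATE is not given here), and charEnd is assumed to be the offset just past the token's last character. With real tokens, the same loop would compare getState() against Token.INITIAL_STATE.

```java
import java.util.ArrayList;
import java.util.List;

class RestartPointSketch {
    // Assumed placeholder; real code would use Token.INITIAL_STATE instead.
    static final int INITIAL_STATE = 0;

    // Hypothetical stand-in for a token, with only the accessors used below.
    static final class Tok {
        private final int charBegin;
        private final int charEnd;
        private final int state;

        Tok(int charBegin, int charEnd, int state) {
            this.charBegin = charBegin;
            this.charEnd = charEnd;
            this.state = state;
        }

        int getCharBegin() { return charBegin; }
        int getCharEnd() { return charEnd; }
        int getState() { return state; }
    }

    // Returns the offset from which lexing may be restarted after an edit at
    // editOffset: the end of the last token that ends strictly before the edit
    // and left the tokenizer in INITIAL_STATE, or 0 if there is no such token.
    static int restartOffset(List<Tok> tokens, int editOffset) {
        int restart = 0;
        for (Tok t : tokens) {
            if (t.getCharEnd() >= editOffset) {
                break; // this token touches or follows the edit
            }
            if (t.getState() == INITIAL_STATE) {
                restart = t.getCharEnd();
            }
        }
        return restart;
    }

    public static void main(String[] args) {
        List<Tok> tokens = new ArrayList<>();
        tokens.add(new Tok(0, 5, INITIAL_STATE));  // text before a tag
        tokens.add(new Tok(5, 6, 1));              // tag start, tokenizer now inside a tag
        tokens.add(new Tok(6, 7, 1));              // tag name, still inside the tag
        tokens.add(new Tok(7, 8, INITIAL_STATE));  // tag end, back to the initial state
        tokens.add(new Tok(8, 13, INITIAL_STATE)); // text after the tag
        System.out.println(restartOffset(tokens, 7));  // prints 5
        System.out.println(restartOffset(tokens, 10)); // prints 8
    }
}
```

An edit inside the tag (offset 7) forces a restart from before the tag, because the tokens inside the tag did not leave the tokenizer in its initial state.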

getID

public int getID()
Get the ID number of this token

Specified by:
getID in class Token
Returns:
the id number of the token

getContents

public String getContents()
Get the contents of this token

Specified by:
getContents in class Token
Returns:
A string representing the text of the token

getLineNumber

public int getLineNumber()
Get the line number of the input on which this token started

Specified by:
getLineNumber in class Token
Returns:
the line number of the input on which this token started

getCharBegin

public int getCharBegin()
Get the offset into the input in characters at which this token started

Specified by:
getCharBegin in class Token
Returns:
the offset into the input in characters at which this token started

getCharEnd

public int getCharEnd()
Get the offset into the input in characters at which this token ended

Specified by:
getCharEnd in class Token
Returns:
the offset into the input in characters at which this token ended

isSeparator

public boolean isSeparator()
Checks this token to see if it is a separator.

Returns:
true if this token is a separator, false otherwise

isWord

public boolean isWord()
Checks this token to see if it is a word.

Returns:
true if this token is a word, false otherwise

isTagName

public boolean isTagName()
Checks this token to see if it is a tag name.

Returns:
true if this token is a tag name, false otherwise

isName

public boolean isName()
Checks this token to see if it is a name of a name value pair.

Returns:
true if this token is a name, false otherwise

isValue

public boolean isValue()
Checks this token to see if it is a value of a name value pair.

Returns:
true if this token is a value, false otherwise

isCharacterReference

public boolean isCharacterReference()
Checks this token to see if it is a character reference.

Returns:
true if this token is a character reference, false otherwise

isScript

public boolean isScript()
Checks this token to see if it is a script.

Returns:
true if this token is a script, false otherwise

isComment

public boolean isComment()
Checks this token to see if it is a comment.

Specified by:
isComment in class Token
Returns:
true if this token is a comment, false otherwise

isWhiteSpace

public boolean isWhiteSpace()
Checks this token to see if it is White Space. White space is usually tabs, line breaks, form feeds, spaces, etc.

Specified by:
isWhiteSpace in class Token
Returns:
true if this token is White Space, false otherwise

isError

public boolean isError()
Checks this token to see if it is an Error, such as a malformed tag (see ERROR_MALFORMED_TAG).

Specified by:
isError in class Token
Returns:
true if this token is an Error, false otherwise

getDescription

public String getDescription()
A description of this token. The description should be appropriate for syntax highlighting. For example, "comment" is returned for a comment.

Specified by:
getDescription in class Token
Returns:
a description of this token.
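
As a rough illustration of using the description for syntax highlighting, the sketch below maps description strings to colors. Only "comment" is a description named in this documentation; the "tag" key, the color values, and the class and method names are hypothetical. With real tokens, the argument would be the result of getDescription().

```java
import java.util.HashMap;
import java.util.Map;

class HighlightSketch {
    // Maps description strings to display colors.
    static final Map<String, String> COLORS = new HashMap<>();
    static {
        COLORS.put("comment", "#808080"); // "comment" is the documented example
        COLORS.put("tag", "#0000ff");     // hypothetical description string
    }

    // Returns the color for a description, or black if it is not mapped.
    static String colorFor(String description) {
        return COLORS.getOrDefault(description, "#000000");
    }

    public static void main(String[] args) {
        System.out.println(colorFor("comment"));      // prints #808080
        System.out.println(colorFor("unknown-kind")); // prints #000000
    }
}
```

Falling back to a default color keeps the highlighter working if a description is returned that the map does not yet cover.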

errorString

public String errorString()
Get a String that explains the error, if this token is an error.

Specified by:
errorString in class Token
Returns:
a String that explains the error, if this token is an error, null otherwise.
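
Because errorString returns null for tokens that are not errors, callers may prefer to check isError first. The following is a minimal sketch of collecting error messages from a token list. TokenView is a hypothetical stand-in interface that mirrors the three documented methods used here, and the message text "Malformed tag" is invented for illustration, not taken from the library.

```java
import java.util.ArrayList;
import java.util.List;

class ErrorReportSketch {
    // Hypothetical stand-in mirroring isError, errorString, and getLineNumber.
    interface TokenView {
        boolean isError();
        String errorString();
        int getLineNumber();
    }

    // Builds a stand-in token; a null error means the token is not an error.
    static TokenView token(final int line, final String error) {
        return new TokenView() {
            public boolean isError() { return error != null; }
            public String errorString() { return error; }
            public int getLineNumber() { return line; }
        };
    }

    // Collects one message per error token, prefixed with its line number.
    static List<String> collectErrors(List<TokenView> tokens) {
        List<String> messages = new ArrayList<>();
        for (TokenView t : tokens) {
            if (t.isError()) {
                messages.add("line " + t.getLineNumber() + ": " + t.errorString());
            }
        }
        return messages;
    }

    public static void main(String[] args) {
        List<TokenView> tokens = new ArrayList<>();
        tokens.add(token(1, null));
        tokens.add(token(2, "Malformed tag")); // invented message text
        for (String message : collectErrors(tokens)) {
            System.out.println(message); // prints line 2: Malformed tag
        }
    }
}
```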

toString

public String toString()
Get a representation of this token as a human-readable string. The format of this string is subject to change and should only be used for debugging purposes.

Overrides:
toString in class Object
Returns:
a string representation of this token