SFE.Compiler
Class Tokenizer

java.lang.Object
  extended by SFE.Compiler.Tokenizer

public class Tokenizer
extends java.lang.Object

The Tokenizer class takes an input stream and parses it into "tokens", allowing the tokens to be read one at a time. The tokenizer recognizes keywords, identifiers, numbers (int consts), quoted strings (string consts), and various symbols. It skips C and C++ style comments.
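The token categories described above can be illustrated with a minimal sketch. It does not use or reproduce the Tokenizer implementation, which is not shown here. It relies on the JDK's java.io.StreamTokenizer as a stand-in, and the class name TokenClassSketch, the method classify, and the keyword spellings (inferred from the KW_* constant names) are assumptions made for illustration only.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TokenClassSketch {
    // Assumption: keyword spellings inferred from the KW_* constant names.
    static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList(
        "program", "type", "function", "boolean", "int", "if", "else", "for",
        "true", "false", "const", "var", "struct", "enum", "to"));

    /** Labels each token in the source text with one of the documented categories. */
    static List<String> classify(String source) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(source));
        st.ordinaryChar('/');         // '/' is a line-comment character by default; undo that
        st.ordinaryChar('\'');        // treat a single quote as a plain symbol
        st.slashSlashComments(true);  // skip C++ style comments
        st.slashStarComments(true);   // skip C style comments
        st.wordChars('_', '_');       // assumption: identifiers may contain underscores
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                switch (st.ttype) {
                    case StreamTokenizer.TT_WORD:
                        String kind = KEYWORDS.contains(st.sval) ? "KEYWORD " : "IDENTIFIER ";
                        out.add(kind + st.sval);
                        break;
                    case StreamTokenizer.TT_NUMBER:
                        // StreamTokenizer parses numbers as double; truncate to int here.
                        out.add("INT_CONST " + (int) st.nval);
                        break;
                    case '"':
                        out.add("STRING_CONST " + st.sval);
                        break;
                    default:
                        out.add("SYMBOL " + (char) st.ttype);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(classify("var x = 42; // note\n/* block */ \"hi\""));
    }
}
```

StreamTokenizer differs from a purpose-built tokenizer in some details. For example, it treats a leading '-' or '.' as part of a number. The sketch is only meant to show how the categories relate, not how Tokenizer behaves on every input.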


Field Summary
static int EOF
          A constant indicating that the end of the stream has been read.
static int IDENTIFIER
          A constant indicating that an identifier token has been read.
static int INT_CONST
          A constant indicating that a constant number token has been read.
static int KEYWORD
          A constant indicating that a keyword token has been read.
static int KW_BOOLEAN
          A constant indicating that the boolean keyword has been read.
static int KW_CONST
          A constant indicating that the const keyword has been read.
static int KW_ELSE
          A constant indicating that the else keyword has been read.
static int KW_ENUM
          A constant indicating that the enum keyword has been read.
static int KW_FALSE
          A constant indicating that the false keyword has been read.
static int KW_FOR
          A constant indicating that the for keyword has been read.
static int KW_FUNCTION
          A constant indicating that the function keyword has been read.
static int KW_IF
          A constant indicating that the if keyword has been read.
static int KW_INT
          A constant indicating that the int keyword has been read.
static int KW_PROGRAM
          A constant indicating that the program keyword has been read.
static int KW_STRUCT
          A constant indicating that the struct keyword has been read.
static int KW_TO
          A constant indicating that the to keyword has been read.
static int KW_TRUE
          A constant indicating that the true keyword has been read.
static int KW_TYPE
          A constant indicating that the type keyword has been read.
static int KW_VAR
          A constant indicating that the var keyword has been read.
static int STRING_CONST
          A constant indicating that a constant string token has been read.
static int SYMBOL
          A constant indicating that a symbol token has been read.
 
Constructor Summary
Tokenizer(java.io.Reader input)
          Create a tokenizer that parses the given stream.
 
Method Summary
 void advance()
          Parses the next token from the input stream of this tokenizer.
 java.lang.String getIdentifier()
          Return the identifier in the current token.
 java.lang.String getKeyword()
          Return the current keyword token.
 boolean hasMoreTokens()
          Indicate if there are more tokens left in the input reader.
 int intVal()
          Return the integer in the current token.
 int keyword()
          Return the int representing the current keyword token.
 int lineNumber()
          Return the current line number.
 java.lang.String stringVal()
          Return the string in the current token.
 char symbol()
          Return the char representing the current symbol.
 int tokenType()
          Return the current token type.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

EOF

public static final int EOF
A constant indicating that the end of the stream has been read.

See Also:
Constant Field Values

KEYWORD

public static final int KEYWORD
A constant indicating that a keyword token has been read.

See Also:
Constant Field Values

SYMBOL

public static final int SYMBOL
A constant indicating that a symbol token has been read.

See Also:
Constant Field Values

IDENTIFIER

public static final int IDENTIFIER
A constant indicating that an identifier token has been read.

See Also:
Constant Field Values

INT_CONST

public static final int INT_CONST
A constant indicating that a constant number token has been read.

See Also:
Constant Field Values

STRING_CONST

public static final int STRING_CONST
A constant indicating that a constant string token has been read.

See Also:
Constant Field Values

KW_PROGRAM

public static final int KW_PROGRAM
A constant indicating that the program keyword has been read.

See Also:
Constant Field Values

KW_TYPE

public static final int KW_TYPE
A constant indicating that the type keyword has been read.

See Also:
Constant Field Values

KW_FUNCTION

public static final int KW_FUNCTION
A constant indicating that the function keyword has been read.

See Also:
Constant Field Values

KW_BOOLEAN

public static final int KW_BOOLEAN
A constant indicating that the boolean keyword has been read.

See Also:
Constant Field Values

KW_INT

public static final int KW_INT
A constant indicating that the int keyword has been read.

See Also:
Constant Field Values

KW_IF

public static final int KW_IF
A constant indicating that the if keyword has been read.

See Also:
Constant Field Values

KW_ELSE

public static final int KW_ELSE
A constant indicating that the else keyword has been read.

See Also:
Constant Field Values

KW_FOR

public static final int KW_FOR
A constant indicating that the for keyword has been read.

See Also:
Constant Field Values

KW_TRUE

public static final int KW_TRUE
A constant indicating that the true keyword has been read.

See Also:
Constant Field Values

KW_FALSE

public static final int KW_FALSE
A constant indicating that the false keyword has been read.

See Also:
Constant Field Values

KW_CONST

public static final int KW_CONST
A constant indicating that the const keyword has been read.

See Also:
Constant Field Values

KW_VAR

public static final int KW_VAR
A constant indicating that the var keyword has been read.

See Also:
Constant Field Values

KW_STRUCT

public static final int KW_STRUCT
A constant indicating that the struct keyword has been read.

See Also:
Constant Field Values

KW_ENUM

public static final int KW_ENUM
A constant indicating that the enum keyword has been read.

See Also:
Constant Field Values

KW_TO

public static final int KW_TO
A constant indicating that the to keyword has been read.

See Also:
Constant Field Values
Constructor Detail

Tokenizer

public Tokenizer(java.io.Reader input)
Create a tokenizer that parses the given stream.

Parameters:
input - a Reader object providing the input stream.
Method Detail

lineNumber

public int lineNumber()
Return the current line number.

Returns:
the current line number of this stream tokenizer.

hasMoreTokens

public boolean hasMoreTokens()
Indicate if there are more tokens left in the input reader.

Returns:
true if there are more tokens; false otherwise.

advance

public void advance()
             throws java.io.IOException
Parses the next token from the input stream of this tokenizer. This method should only be called if hasMoreTokens() is true. Initially there is no current token.

Throws:
java.io.IOException - if an I/O error occurs.

tokenType

public int tokenType()
Return the current token type. The int representing the token type is one of the following: EOF, KEYWORD, SYMBOL, IDENTIFIER, INT_CONST, STRING_CONST.

Returns:
an int value representing the token type.

keyword

public int keyword()
Return the int representing the current keyword token. This method should be called only if tokenType() is KEYWORD.

Returns:
an int value representing the current keyword token.

getKeyword

public java.lang.String getKeyword()
Return the current keyword token. This method should be called only if tokenType() is KEYWORD.

Returns:
the string of the current keyword.

symbol

public char symbol()
Return the char representing the current symbol. This method should be called only if tokenType() is SYMBOL.

Returns:
the current symbol.

getIdentifier

public java.lang.String getIdentifier()
Return the identifier in the current token. This method should be called only if tokenType() is IDENTIFIER.

Returns:
a string containing the identifier.

intVal

public int intVal()
Return the integer in the current token. This method should be called only if tokenType() is INT_CONST.

Returns:
the current token integer value.

stringVal

public java.lang.String stringVal()
Return the string in the current token. This method should be called only if tokenType() is STRING_CONST.

Returns:
the current token string value.
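
Taken together, the method preconditions above suggest a read loop: check hasMoreTokens(), call advance(), then dispatch on tokenType() and call only the matching accessor. The sketch below shows that calling pattern under stated assumptions. TokenSource is a hypothetical interface that copies the documented method signatures. CannedSource is a hypothetical stand-in so the example runs without the SFE classes. The numeric codes are placeholders, not the values listed under Constant Field Values. With the real class, the same loop would presumably take a Tokenizer and use Tokenizer.KEYWORD and the other constants directly.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class TokenLoopSketch {
    // Placeholder codes for illustration only; not the real constant values.
    static final int EOF = 0, KEYWORD = 1, SYMBOL = 2, IDENTIFIER = 3,
                     INT_CONST = 4, STRING_CONST = 5;

    /** Hypothetical interface copying the documented method signatures. */
    interface TokenSource {
        boolean hasMoreTokens();
        void advance() throws IOException;
        int tokenType();
        String getKeyword();
        char symbol();
        String getIdentifier();
        int intVal();
        String stringVal();
    }

    /** Reads every token, calling only the accessor that matches tokenType(). */
    static List<String> describeAll(TokenSource t) throws IOException {
        List<String> out = new ArrayList<>();
        while (t.hasMoreTokens()) {
            t.advance(); // there is no current token before the first advance()
            switch (t.tokenType()) {
                case KEYWORD:      out.add("keyword " + t.getKeyword());       break;
                case SYMBOL:       out.add("symbol " + t.symbol());            break;
                case IDENTIFIER:   out.add("identifier " + t.getIdentifier()); break;
                case INT_CONST:    out.add("int " + t.intVal());               break;
                case STRING_CONST: out.add("string " + t.stringVal());         break;
                default:           return out; // EOF or an unknown type: stop reading
            }
        }
        return out;
    }

    /** Hypothetical canned token list standing in for a real tokenizer. */
    static final class CannedSource implements TokenSource {
        private final int[] types;
        private final Object[] values;
        private int pos = -1; // no current token initially

        CannedSource(int[] types, Object[] values) {
            this.types = types;
            this.values = values;
        }
        public boolean hasMoreTokens() { return pos + 1 < types.length; }
        public void advance()          { pos++; }
        public int tokenType()         { return types[pos]; }
        public String getKeyword()     { return (String) values[pos]; }
        public char symbol()           { return (Character) values[pos]; }
        public String getIdentifier()  { return (String) values[pos]; }
        public int intVal()            { return (Integer) values[pos]; }
        public String stringVal()      { return (String) values[pos]; }
    }

    /** Runs the loop over the canned tokens for: var x = 42 ; */
    static List<String> demo() {
        TokenSource t = new CannedSource(
            new int[] { KEYWORD, IDENTIFIER, SYMBOL, INT_CONST, SYMBOL },
            new Object[] { "var", "x", '=', 42, ';' });
        try {
            return describeAll(t);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The default branch covers EOF, since tokenType() is documented as possibly returning EOF after an advance() call.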