ICPL Compiler Project
Phase 1
Lexical Analysis
Part 1 -- The lexical analysis module
A separately compiled module is to be prepared to
perform lexical analysis on a source file. This module will be the only
module in the compiler that has direct access to the ICPL source
code. The interface to this module will contain:
- A type definition for the lexical token types.
- A symbol table, which will be an array of strings.
- An array of error messages.
- An initialization procedure that will open the ICPL source
  file and create any necessary data structures.
- A termination procedure that will close the ICPL source file
  and deallocate any data structures that were created.
- A procedure GetToken that, each time it is called, returns
  the next token in the ICPL source file (as defined in part 1) and its value.
Token values are defined as follows:
- Keyword tokens and symbol tokens will have a null value.
- Operators will have a value that identifies which operator
  of that precedence level was found.
- Integer constants will have as their value the integer value
  of the string token. In case of overflow, an error token will be
  returned.
- String constants will have as their value the string without
  the opening and closing quotes.
- Identifiers will have as their value their index into the
  symbol table.
- An end-of-file token will be returned for each call to GetToken
  after the end of the source file has been reached.
- An error token will be returned any time no proper lexical
  unit is found. The value of an error token is a three-tuple consisting
  of the line number of the ICPL source file on which the error occurs, an
  index into the array of error messages, and a string containing the
  erroneous token found. Optionally, you may also include a character
  count within the line on which the error occurs.
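As an illustration of the value rules above, the following Python sketch builds the values for integer constants, string constants, identifiers, and error tokens. Several details are assumptions rather than part of the assignment: the limit MAX_INT, the message list and its index ERR_OVERFLOW, the helper names, the use of plain strings for token types, and the treatment of identifiers as case-sensitive.

```python
from typing import NamedTuple, Optional

MAX_INT = 2**31 - 1  # assumed limit; the assignment does not state one

ERROR_MESSAGES = ["unrecognized character", "integer constant overflow"]
ERR_OVERFLOW = 1  # index of the overflow message in ERROR_MESSAGES


class ErrorValue(NamedTuple):
    line: int                     # line number in the ICPL source file
    message_index: int            # index into the array of error messages
    text: str                     # the erroneous token as it appeared
    column: Optional[int] = None  # optional character count within the line


def integer_value(text, line):
    """Return ("INTEGER", n), or an error token if n exceeds MAX_INT."""
    n = int(text)  # Python integers never overflow, so the check is explicit
    if n > MAX_INT:
        return ("ERROR", ErrorValue(line, ERR_OVERFLOW, text))
    return ("INTEGER", n)


def string_value(text):
    """Return ("STRING", s) with the opening and closing quotes removed."""
    return ("STRING", text[1:-1])


def identifier_value(name, symbol_table):
    """Return ("IDENTIFIER", i), adding the name to the table on first use."""
    if name not in symbol_table:
        symbol_table.append(name)
    return ("IDENTIFIER", symbol_table.index(name))
```

For example, under these assumptions integer_value("99999999999", 3) yields an error token whose value records line 3, message index 1, and the text "99999999999".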
The lexical analysis module must be properly documented.
Part 2 -- The test program
The test program should be able to open an ICPL source file
of the user's choice and, by repeated calls to the GetToken procedure,
find all of the tokens within that file. It should print to a file
or display on the screen the following:
- A count of how many times each token is returned.
- A count of how many times each identifier is returned.
- Other output sufficient to test the lexical analyzer completely.
The output of the test program should be readable to someone
who has not read the code of the program.
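One possible shape for the counting part of the test program is sketched below in Python. It assumes a get_token callable that returns a (token type, value) pair and a symbol table given as a list of strings; the names count_tokens and format_report, the string token types, and the stand-in token stream are all hypothetical. A real run would call the lexical analysis module instead.

```python
from collections import Counter


def count_tokens(get_token, symbol_table):
    """Call get_token until end of file; tally token types and identifiers."""
    token_counts = Counter()
    identifier_counts = Counter()
    while True:
        kind, value = get_token()
        token_counts[kind] += 1
        if kind == "IDENTIFIER":
            # The value of an identifier token is its symbol table index.
            identifier_counts[symbol_table[value]] += 1
        if kind == "END_OF_FILE":
            break
    return token_counts, identifier_counts


def format_report(token_counts, identifier_counts):
    """Render both tallies as labeled, sorted lines of text."""
    lines = ["Token counts:"]
    lines += [f"  {kind}: {n}" for kind, n in sorted(token_counts.items())]
    lines.append("Identifier counts:")
    lines += [f"  {name}: {n}" for name, n in sorted(identifier_counts.items())]
    return "\n".join(lines)


# Stand-in token stream for demonstration only.
demo_table = ["x", "y"]
demo_tokens = iter([
    ("IDENTIFIER", 0), ("OPERATOR", 0), ("IDENTIFIER", 1),
    ("IDENTIFIER", 0), ("END_OF_FILE", None),
])
tokens, identifiers = count_tokens(lambda: next(demo_tokens), demo_table)
print(format_report(tokens, identifiers))
```

Labeling each section of the report and sorting the entries is one way to keep the output readable without reference to the code.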