CSCI 2041 Assignment 5: The Calculon Interpreter
Due Dates
Milestone Due: 11:59pm Wed 12/5 10% credit Final Due: 11:59pm Tue 12/11 90% credit - Approximately 6.66% of total grade
- Submit to Canvas
- Projects are individual work: no collaboration with other students is allowed. Seek help from course staff if you get stuck for too long.
CODE DISTRIBUTION: a5-code.zip
TESTING CODE:
- Milestone Tests: a5-milestone-tests.zip (Instructions)
- Required Tests: a5-required-tests.zip (Instructions)
- All Tests: a5-all-tests.zip (Includes Optional Problem Tests)
CHANGELOG:
- Mon Dec 10 20:38:55 CST 2018
Optional tests have been added linked to the assignment 5 specification and directly here: a5-all-tests.zip. Copy all files in the folder created by the zip including
Makefile
into the project directory.In the included file
OPTIONAL_PROBLEMS.txt
fill in ayes
for optional problems that you worked on.The new Makefile includes targets
make test-op1 # test optional problem 1 make test-op2 # test optional problem 2 make test-op3 # test optional problem 3 make test-op4 # test optional problem 4 make test-op5 # test optional problem 5 make test-op # test all optional problems
- Fri Dec 7 11:31:05 CST 2018
- Post 898 reported bugs in the
test_calculon.sh
script that cause errors when running on Mac OSX. These have been fixed in an updated version oftest_calculon.sh
. Download it if needed. - Thu Dec 6 16:19:13 CST 2018
- Tests for all required problems are now available, linked at the
top of the A5 specification. Instructions to install them are
here. Note that you must copy all files including
Makefile
from thea5-required-tests/
directory to the project directory. Tests for Optional Problems will be posted over the weekend. - Wed Dec 5 12:07:25 CST 2018
- Finished an incomplete sentence associated with the explanation for function application.
- Tue Dec 4 15:09:14 CST 2018
- The demonstration of running the required/optional solution
scripts had typos that have been corrected. The name of the
scripts are
calculon_required_solution
andcalculon_optional_solution
. - Tue Dec 4 14:14:59 CST 2018
- Points for automated tests for the final submission have been adjusted: the tests associated with Problem 2 are worth 10 points while those associated with Problem 3 are only worth 5 points. This was a mistake in the original specification as there is much less to test associated with Problem 3's closures.
- Sat Dec 1 17:56:50 CST 2018
Post 844 identified a minor problem with the
test_calculon.sh
script which will cause it to not print out a side-by-side diff for failed tests in some environments. The quick fix for this is to export theCOLUMNS
variable as in> export COLUMNS
which will ensure that the script has access to this variable. Alternatively, download an updated version of
test_calculon.sh
.- Fri Nov 30 18:25:37 CST 2018
- Milestone tests have been posted and linked to the to of the
spec. Download them there. They only test for completion of
Problem 1 which adds
if/then/else
to Calculon. Instructions for running the tests have been added to the Milestone Tests Section. - Thu Nov 29 10:53:14 CST 2018
- Post 836 identified a typo in the instructions for lexing
comparisons: the correct token name for
>
isGreatThan
rather thanGreaterThan
; the former appears in the codebase for Calculon. In retrospect, GreatThan sounds stupid but it's too late to change now.
1 Overview
Much can be learned about programming and programming languages by building them from the ground up. The purpose of this project is to do that just that: construct a small, interactive language interpreter dubbed Calculon. The language understood by the interpreter is an amalgam of constructs from other languages mainly chosen for ease of implementation. The features are demonstrated in the Demo Session later.
Calculon will compile out of the box via a make
invocation. This
will allow you to experiment with the interpreter and examine its
basic internal structure.
> make ocamlc -g -annot -c calclex.ml ocamlc -g -annot -c calcparse.ml ocamlc -g -annot -c calceval.ml ocamlc -g -annot -o demo_parse calclex.cmo calcparse.cmo demo_parse.ml ocamlc -g -annot -o demo_eval calclex.cmo calcparse.cmo calceval.cmo demo_eval.ml ocamlc -g -annot -o calculon calclex.cmo calcparse.cmo calceval.cmo calculon.ml > calculon calculon> 1+2*3; - : IntDat(7) calculon> quit; That was so terrible I think you gave me cancer! >
There are a number of features that must be implemented internally
before they will work in the interpreter such as if/then/else
and
less/greater comparisons. These are the subject of the required
problems.
To further assist with understanding what Calculon should do, reference implementations are provided on CSE Labs machines.
calculon_required_solution
: script that links to an executable which implements all the required problems but none of the optional problems.calculon_optional_solution
: script that links to an executable which implements all the required problems and all optional problems.make setup
which ensures that the two scripts are executable.
> make setup # ensure permissions are correct chmod u+x calculon_required_solution chmod u+x calculon_optional_solution > ./calculon_optional_solution # run solution with all optional problems completed Running kauffman's implementation calculon> if true then 1 else 2; # doesn't work until Problem 1 is completed - : IntDat(1) calculon> source small.calc; # must complete optional problem 2 Reading source file 'small.calc' calculon> def one = 1; one : IntDat(1) calculon> def double = @n n*2; double : Closure(n, <fun>) calculon> End of file 'small.calc'
Use these versions to explore what can and cannot be done with Calculon.
1.1 Calculon Features and Problems
The interpreter is broken into 4 source files
File | Purpose |
---|---|
calclex.ml |
Lexer / Tokenizer |
calcparse.ml |
Parser to construct an expression tree |
calceval.ml |
Evaluates an expression tree |
calculon.ml |
Top-level interactive loop |
Each of these must be edited to complete features of Calculon associated with required and optional problems.
Feature | Lex | Parse | Eval | Top | Problem |
---|---|---|---|---|---|
Language Constructs | |||||
Ident/Constants | ok | ok | ok | -- | Complete |
Add/Sub/Mul/Div | ok | ok | ok | -- | Complete |
let/in | ok | ok | ok | -- | Complete |
if/then/else | P1 | P1 | P1 | -- | P1: Add conditionals |
Less/Greater/Equal | P2 | P2 | P2 | -- | P2: Add comparisons |
Lambdas/Closures | ok | P3 | P3 | -- | P3: Evaluate lambdas as closures |
Function Application | ok | ok | P4 | -- | P4: Evaluate function applications |
Recursive let/in | -- | -- | P5 | -- | P5: Modify let/in to be recursive |
and/or booleans | O1 | O1 | O1 | -- | O1: Optional Problem |
Top-level Features | |||||
Top-level recursive def |
-- | -- | -- | ok | Complete |
Top-level built-ins | -- | -- | -- | ok | Complete |
Read source file |
-- | -- | -- | O2 | O2: Optional Problem |
Save/Load binary file | -- | -- | -- | O3 | O3: Optional Problem |
Informative Exceptions | -- | -- | -- | O4 | O4: Optional Problem |
Expression Optimization | -- | -- | O5 | O5 | O5: Optional Problem |
The above table lists the features of the programming language along with the problems associated with completing them. Middle columns correspond to the above source files with symbols meaning the following:
Symbol | Meaning |
---|---|
-- | Source file is not relevant to the feature |
ok | No change is needed the sourc file |
P# | An Required Problem requires the file to be edited |
O# | An Optional problem requires the file to be edited |
1.2 Assignment Files and Download
Download the code pack linked at the top of the page. Unzip this which will create a project folder. Create new files in this folder. Ultimately you will re-zip this folder to submit it.
NOTE: Tests Released Later: The tests for this assignment will not be released with the publication of the specification. They will be released sometime after once they have been completed. Further instructions on how to use them will be made available on release.
File | State | Prob# | Notes |
---|---|---|---|
Makefile |
Provided | All | Build file to compile all programs |
calclex.ml |
Edit | All | Lexer / Tokenizer |
calcparse.ml |
Edit | All | Parser to construct an expression tree |
calceval.ml |
Edit | All | Evaluates an expression tree |
calculon.ml |
Edit | All | Top-level interactive loop |
demo_parse.ml |
Provided | - | Stand-alone utility for testing parsing |
demo_eval.ml |
Provided | - | Stand-alone utility for testing evaluation |
small.calc |
Provided | - | Small input file with Calculon source code. |
optim.calc |
Provided | - | Larger input file with expression optimization opporutnities |
Automated Testing Instructions
Unzip a5-required-tests.zip
to create the directory a5-required-tests/
. Copy all
files in the a5-required-tests
directory to the project directory
a5-code
. The following unix command should do this if a5-required-tests
and
a5-code are in the same directory.
cp -r a5-required-tests/* a5-code/
The Makefile provided is a copy of Makefile.withtests. The original version is called Makefile.notests.
The new Makefile introduces the targets
make test-p1
: test problem 1make test-p2
: test problem 2make test-p3
: test problem 3make test-p4
: test problem 4make test-p5
: test problem 5make test
: test all problems
Single tests can be run using the test_calculon.sh
script with a
data input file appropriate to an individual problem and the test
number to run. Below is an example.
> ./test_calculon.sh test_p2_data.sh 3 Setting COLUMNS based on stty COLUMNS is 113 Loading tests from test_p2_data.sh 10 tests loaded Running single TEST 3 RUNNING TESTS TEST 3 tokens-comp-vars : OK ===================================== OVERALL: 1 / 1 tests correct
1.3 Milestone Tests grading 10
There is an midpoint submission deadline worth 10% of the credit for the assignment. Submit working code at this point that compiles and passes the tests associated with Problem 1 to receive 10% of the credit for the project. This is to motivate getting an early start on this rather large assignment.
To run the milestone tests download the zip of milestone tests and run the following commands.
> unzip a5-milestone-tetsts.zip # unzip the contents > cp a5-milestone-tests/* a5-code/* # copy test code into assignment directory > cd a5-code/ # change to project directory > make test-p1 # run the tests which only target problem 1
Weight | Criteria |
---|---|
MILESTONE TESTS | |
10 | make test-p1 correctly compiles and passes tests |
No manual inspection will be done during milestone evaluation |
1.4 General Grading Criteria grading 5
During final manual inspection the following will be examined for credit.
Weight | Criteria |
---|---|
5 | CORRECT SUBMISSION and STYLE |
All files submitted and properly zipped | |
make test compiles all code and runs tests |
|
Code is indented reasonably well to indicate scopes | |
Comments provide insight for intricate blocks |
1.5 Demo Session
Some of Calculon's features are demonstrated below.
> calculon calculon> help; CALCULON: a calculator/interpreter with an unholy... ACTING TALENT! --SUPPORTED SYNTAX-- <expr> ; : expression evaluation terminated by semicolon def x = <expr> ; in <expr> : top-level variable bindings 1 + 2*9 / (4-2) : basic arithmetic operators let x=<expr> in <expr> ; : let bindings for locals if <expr> then <expr> else <expr> : conditional execution @n n*2+4 : lambda abstraction with parameter n def double = @n n*2; : top-level function binding double 4; : function application def mul = @a @b a*b; : nested lambdas for multi-arg function --BUILTIN COMMANDS-- help; : print this help message quit; : quit calculon whos; : show all top-level bindings and their value types undef <ident>; : remove a global binding debug {true|false}; : enable/disabe debug printing of backtraces tokens <expr>; : show the results of lexing the given <expr> parsetree <expr>; : show the results of parsing the given <expr> show <expr>; : show the structure of the evaluated data in <expr> --OPTIONAL COMMANDS-- source <filename>; : read command from <filename> as if typed save <filename>; : save all definitions as binary data in <filename> load <filename>; : load binary variable defs from <filename> opparsetree <expr>; : show the results of parsing AND optimizing the given <expr> optimize {true|false}; : enable/disabe optimization of expressions in normal evaluation calculon> 1+2+3*4/2; # basic math - : IntDat(9) calculon> let a=2 in let b=3 in a*b; # let bindings - : IntDat(6) calculon> def x = 7; # top-level definitions x : IntDat(7) calculon> def nope = false; # booleans nope : BoolDat(false) calculon> def double = @n 2*n; # function definitions via @ lambdas double : Closure(n, <fun>) calculon> double 2; # function call - : IntDat(4) calculon> double 7; - : IntDat(14) calculon> whos; # listing of top-level defs double : Closure(n, <fun>) nope : BoolDat(false) x : IntDat(7) calculon> def mul = @a @b a*b; # functions of multiple params via nested lambdas mul : Closure(a, <fun>) calculon> mul 3 7; - : IntDat(21) calculon> def mul_two = mul 2; # partial application for curried function mul_two : Closure(b, <fun>) calculon> mul_two 5; # completed application - : IntDat(10) # recursive definition at top-level calculon> def fact = @n if n=1 then 1 else n*(fact (n-1)); fact : Closure(n, <fun>) calculon> fact 5; # recursive function call - : IntDat(120) calculon> tokens mul 3 7; # show tokenization of expression Tokens: [Ident(mul); IntTok(3); IntTok(7); Semicolon ] calculon> parsetree mul 3 7; # show parsetree of expression Parse tree: Apply .func_expr: Apply .func_expr: Varname(mul) .param_expr: IntExp(3) .param_expr: IntExp(7) calculon> show mul; # show structure of data Closure(param_name: a, varmap={...} code= Lambda( b ) Mul Varname(a) Varname(b) ) calculon> quit; # Noooooooooo! That was so terrible I think you gave me cancer! >
2 Problem 1: Adding Conditionals
Calculon should support simple conditional expressions of the familiar form
if <bool> then <expr> else <expr>
This will allow conditional execution like the following examples.
calculon> if true then 10 else 5; - : IntDat(10) calculon> if false then 10 else 5; - : IntDat(5) calculon> if true then if false then 1 else 2 else 3; - : IntDat(2)
While the use of Boolean constants may seem a bit limiting, the next
problem adds integer comparisons such as x < y
which will greatly
increase the power of conditionals.
Currently there are tokens and expression types present for conditionals but there is no lexer, parser, or evaluator support. Completing this problem will allow the above expressions to evaluate as indicated.
2.1 Lexing Conditionals
The starting state of Calculon does not recognize conditional keywords like
if
so will treat them as normal identifiers. This leads to odd errors in whichif
is misinterpreted as a function being applied to several arguments with errors like:calculon> if true then 1 else 5; ERROR: I'm not familiar with the type of thing I'm seeing... Calceval.EvalError("Function application is not yet implemented", "{}", "\nApply...
This can be seen using the
tokens
command which will listif
as an identifier rather than its own tokencalculon> tokens if true then 8*2 else 4*3; Tokens: [Ident(if); BoolTok(true); Ident(then); IntTok(8); Times; IntTok(2); Ident(else); IntTok(4); Times; IntTok(3); Semicolon ]
Once the lexer/parser create conditionals, this error will change.
In
calclex.ml
, edit the main lexing functionlex_string
so that it recognizes keyword associated with conditionals (if
etc.). Examine thetoken
type at the top of the file and return appropriate tokens for each keyword.Recompile Calculon and check the results of lexing by using the
tokens
command which should yield results like the below.calculon> tokens if true then 8*2 else 4*3; Tokens: [If; BoolTok(true); Then; IntTok(8); Times; IntTok(2); Else ; IntTok(4); Times; IntTok(3); Semicolon ]
This shows that
if
is no the tokenIf
rather than an identifier. which is a good start.
2.2 Parsing Conditionals
In
calcparse.ml
, edit the functionparse_cond
which is meant to handle conditionals. The correct sequence for a conditional isif <expr> then <expr> else <expr>
That means the a series of nested checks via
match/with
looking for the corresponding tokens. In each case, an arbitrary expression can appear in place of<expr>
such as the following:calculon> if let x=true in x then 1 else 2; - : IntDat(1) calculon> if if true then false else true then 1 else 2; - : IntDat(2)
Thus you will need to employ the top-level parsing function
parse_expr
to parse a whole expression between each of the keyword tokens.In the
expr
type defined at the top ofcalcparse.ml
, theCond
variant is associated with conditional execution. The fieldsif_expr then_expr else_expr
should be assigned the three associated expressions listed in the form above.While working, employ the
parsetree
command to show the parse tree of a given expression. For example, the following is a correct parse tree:calculon> parsetree if true then 8*2 else 4-3; Parse tree: Cond .if_expr: BoolExp(true) .then_expr: Mul IntExp(8) IntExp(2) .else_expr: Sub IntExp(4) IntExp(3)
Note that conditionals require all of
if/then/else
in Calculon so leaving any part off should lead to errors as shown below. Note that the error messages are required for missingthen
and missingelse
cases.calculon> if true then 1 else 2; - : IntDat(1) calculon> if true then 1 else; ERROR: I'm not familiar with the type of thing I'm seeing... Calcparse.ParseError("syntax error", _) calculon> if true then 1; ERROR: I'm not familiar with the type of thing I'm seeing... Calcparse.ParseError("Expected 'else' ", _) calculon> if true then; ERROR: I'm not familiar with the type of thing I'm seeing... Calcparse.ParseError("syntax error", _) calculon> if true; ERROR: I'm not familiar with the type of thing I'm seeing... Calcparse.ParseError("Expected 'then' ", _)
2.3 Evaluating Conditionals
- In
calceval.ml
, locate theeval_expr
function and the case associated withCond
. You will need to make use of the fields likeif_expr
in the record which are defined at the top ofcalcparse.ml
. Remove the exception raise and start coding. - Conditional execution is simple: evaluate the expression associated
with the
if
part, sometimes called the test. If it is true, then evaluate only thethen
expression, otherwise evaluate theelse
expression only. The structure of your code will flow according to the above. Evaluating nested expressions should be done recursively via calls to
eval_expr
. This function always returns adata_t
type which is defined at the top of thecalceval.ml
file. It represents the three kinds of data that exist in Calculon:IntDat
for integers,BoolDat
for booleans, andClosure
for functions of one parameter.On evaluating the result of an expression, pattern match the results against these three types. Often one is interested in only one kind of data as the others are in error.
For conditionals, the results of evaluating
if_expr
must be aBoolDat
which carries a true (evalthen_expr
) or false (evalelse_expr
). Other kinds likeIntDat
should generate errors with messages like the following where the utilitydata_string
has been used to stringify the erroneous data.calculon> if 1 then 2 else 3; ERROR: I'm not familiar with the type of thing I'm seeing... Calceval.EvalError("Expected Bool for if(), found 'IntDat(1)'", ... calculon> if 8+7 then 2 else 3; ERROR: I'm not familiar with the type of thing I'm seeing... Calceval.EvalError("Expected Bool for if(), found 'IntDat(15)'", ... calculon> def x = 21; x : IntDat(21) calculon> if x then 1 else 2; ERROR: I'm not familiar with the type of thing I'm seeing... Calceval.EvalError("Expected Bool for if(), found 'IntDat(21)'", "{x: IntDat(21)}", ... # Only works once lambdas can be evaluated calculon> if (@n n*2) then 2 else 3; ERROR: I'm not familiar with the type of thing I'm seeing... Calceval.EvalError("Expected Bool for if(), found 'Closure(n, <fun>)'", ...
- Once evaluation code is complete, Calculon should handle
if/then/else
constructs like any of its other constructs.
2.4 Grading Criteria for P1: Adding Conditionals grading 25
The following criteria will be checked in this problem.
Weight | Criteria |
---|---|
AUTOMATED TESTS for conditionals | |
10 | make test-p1 correctly compiles and passes tests |
MANUAL INSPECTION for conditionals | |
5 | calclex.ml |
Keywords if/then/else properly added into keyword recognition. |
|
5 | calcparse.ml |
Proper match/with against If token to recognize conditional |
|
Use of parse_expr to parse a whole expression associated with if |
|
Further match/with to look for subsequent Then/Else tokens |
|
Proper creation of a Cond record |
|
Exceptions thrown associated with missing then/else |
|
5 | calceval.ml |
Recursive evaluation of if_expr to determine true/false test |
|
match/with used to check for BoolDat result |
|
Evaluation of ONLY then_expr or else_expr based on test |
|
Raises an evaluation error if test does not give a BoolDat |
3 Problem 2: Adding Comparisons
Calculon should support integer comparison of the following three types:
x < y # less than x > y # greater than x = y # equal to
Each of these operates on two integers and produces a boolean result. This result can be used with conditionals to get more dynamic behavior.
calculon> def x = 5; x : IntDat(5) calculon> def y = 7; y : IntDat(7) calculon> if x > y then x else y; - : IntDat(7) calculon> def x = 10; x : IntDat(10) calculon> if x > y then x else y; - : IntDat(10)
Like conditionals, there are token and expression types established for comparisons but no lexer, parser, or evaluation support for the construct. Completing this problem will add comparisons to Calculon.
3.1 Lexing Comparisons
In the provided version of Calculon, the symbols
<
and>
are not recognized by the lexer while the=
sign is but not in comparisons. This leads to the following two kinds of errors.calculon> 5 < 7; ERROR: I'm not familiar with the type of thing I'm seeing... Calclex.LexError("char '<' not recognized, at position 2 in the input '5 < 7;'") calculon> 5 = 7; - : IntDat(5) WARNING: Great Shatner's ghost! WARNING: The following tokens remain and will NOT be evaluated [Equal; IntTok(7); Semicolon ]
- In
calclex.ml
, add support for the symbols<
and>
at an appropriate location. Generatetoken
typesLessThan
andGreatThan
when these symbols are recognized. No action is needed for the=
which is already tokenized. Use the
tokens
command to check that code changes result in proper tokenization.calculon> tokens 1 < 10+3; Tokens: [IntTok(1); LessThan; IntTok(10); Plus; IntTok(3); Semicolon ] calculon> tokens 1 > 10+3; Tokens: [IntTok(1); GreatThan; IntTok(10); Plus; IntTok(3); Semicolon ]
3.2 Parsing Comparisons
Comparisons are among the lowest precedence operations in most programming environments and Calculon follows this convention. This allows arithmetic to happen before comparisons are evaluated such as in the following.
calculon> 5*2 > 3+4; - : BoolDat(true) calculon> 5*2 = 3+4+3; - : BoolDat(true) calculon> 5*2 = 3+4+2; - : BoolDat(false) calculon> 1 < 10*10+2; - : BoolDat(true)
- Due to the low-precedence of comparisons, a new production
(parsing function) will need to be introduced in the parser that
precedes all other productions save for the top-level
parse_expr
parsing entry point. - In
calcparse.ml
, create a parsing functionparse_compare
betweenparse_expr
andparse_addsub
. Though this function appears visually between them, it will only be part of the parsing recursive descent if other function call it. Modifyparse_expr
to callparse_compare
and plan to make calls toparse_addsub
inparse_compare
. parse_compare
should look for a sequence like one of the following:<expr> < <expr> <expr> > <expr> <expr> = <expr>
<expr>
here are generated by deeper recursive calls starting with low-precedence arithmetic operations like addition and subtraction.Comparison has a simpler grammar than arithmetic like
+
as there is no chaining of comparisons.1 + 2 + 3 # allowed in Calculon 1 < 2 < 3 # not allowed in Calculon
Thus the structure of the code for parsing comparisons can be simpler than what is used for arithmetic operators: no iteration is needed.
- Parse a left-hand expression, then use
match/with
to look for appropriate tokens to indicate a comparisons is being done. If so, parse a right-hand expression. The left and right expressions being compared should be saved along with the operation and result in anIntop
variant ofexpr
. Examine how these are created in the arithmetic parsing to understand what is required. Like arithmetic operations, if no appropriate symbol is found, then comparisons should simply return a left-hand expression. As before, use the
parsetree
command to ensure code changes result in an appropriate parse tree. Ensure that the comparisons happen last, after other arithmetic operations.calculon> parsetree 10 > 10+3; Parse tree: Greater IntExp(10) Add IntExp(10) IntExp(3) calculon> parsetree 5*2+1 = 10+3; Parse tree: Equal Add Mul IntExp(5) IntExp(2) IntExp(1) Add IntExp(10) IntExp(3)
3.3 Evaluating Comparisons
- In
calceval.ml
, identify where comparisons should be handled ineval_expr
. Like arithmetic, comparisons are anIntop
operations. Add cases to an appropriatematch/with
statement to handle them. - Unlike other arithmetic operations which generate
IntDat
, comparison result inBoolDat
with true/false values associated. Make sure in to adhere to this convention. On completing the evaluation function, try out some expressions involving comparisons to ensure that they work properly.
calculon> 5*2+1 = 10+3; - : BoolDat(false) calculon> 5*2+1 = 10+1; - : BoolDat(true) calculon> 5*2+1 < 10+4; - : BoolDat(true) calculon> def x = 2; x : IntDat(2) calculon> if x < 3 then 5 else 10; - : IntDat(5) calculon> let y=4 in if y < x then 1 else 2; - : IntDat(2)
Ensure that the warnings or errors result if repeated comparison is attempted as in the following.
calculon> 1 < 2 < 3; - : BoolDat(true) WARNING: Great Shatner's ghost! WARNING: The following tokens remain and will NOT be evaluated [LessThan; IntTok(3); Semicolon ] calculon> if 1 < 2 < 3 then 10 else 20; ERROR: I'm not familiar with the type of thing I'm seeing... Calcparse.ParseError("Expected 'then' ", _)
3.4 Grading Criteria for P2: Adding Comparisons grading 25
The following criteria will be checked in this problem.
Weight | Criteria |
---|---|
AUTOMATED TESTS for adding comparisons | |
10 | make test-p2 correctly compiles and passes tests |
MANUAL INSPECTION for adding comparisons | |
5 | calclex.ml |
Added lexing spport for < and > at appropriate, single-character location |
|
5 | calcparse.ml |
New parse_compare function added and called by parse_eval |
|
parse_compare makes calls to arithmetic parsing functions. |
|
No iteration/recursion used which prevents repeated comparisons like 1 < 2 < 3 |
|
Clear cases handling each of less than, greater than, equal to comparisons. | |
Each case constructs an Intop expression with appropriate operator. |
|
Clear case for no matching comparison. | |
5 | calceval.ml |
Correct case of eval_expr used to handle comparisons as a kind of Intop . |
|
Clear cases handling each of less than, greater than, equal. | |
BoolDat data clearly produced as a result of the evaluation. |
|
Raise and evaluation error for comparison of non-Integer kinds. |
4 Problem 3: Lambda Evaluation
We have discussed the notion of lambda expressions and closures previously in the class and this problem will make that discussion concrete. In Calculon
- A lambda expression is syntactic construct. It is the
Lambda
variant of theexpr
type which is created during parsing. The intent of the expression is to create a function but this does not happen until it is evaluated just asAdd(2,3)
does not become5
until evaluated. - A closure is a data element. It is the
Closure
variant in thedata_t
type which represents the results of expressions. Being a datum, a closure can be assigned to variable names viadef
andlet/in
like integers and booleans allowing for named functions. Unlike integers and booleans, closures can be applied to an argument which will evaluate its associated expression. This function application is addressed in the next problem while the current problem allows for creation of closures.
Calculon supports lambda expressions via the @
symbol which is
followed by a parameter name and expression to evaluate. Lambdas can
be nested creating functions of multiple parameters though only the
first parameter name is visible. The show
command which shows the
internal structure of data is useful for analyzing closures as show
below.
calculon> @n n+1; - : Closure(n, <fun>) calculon> def inc = @n n+1; inc : Closure(n, <fun>) calculon> def mul = @first @second first*second; mul : Closure(first, <fun>) calculon> whos; inc : Closure(n, <fun>) mul : Closure(first, <fun>) calculon> show inc; Closure(param_name: n, varmap={...} code= Add Varname(n) IntExp(1) ) calculon> show mul; Closure(param_name: first, varmap={...} code= Lambda( second ) Mul Varname(first) Varname(second) )
4.1 Lexing/Parsing Lambdas
- The lexing for lambda expressions is already complete but it is
worthwhile to examine the associated portions of code. The lexer
simply needs to identify
@
characters as a specialAt
token. The parser will need to be modified to handle produce lambda expressions. In its initial state, errors will result from use of the
@
symbol:calculon> parsetree @n n+1; ERROR: I'm not familiar with the type of thing I'm seeing... Calcparse.ParseError("syntax error", _)
- In
calcparse.ml
find the existingparse_lambda
function which presently just falls through to the higher precedence function application. - Add cases that look for the token sequence
@ <ident> <expr>
which is a correctly specified lambda expression. Note that the second token should be anIdent
which carries a variable name. - Construct a
Lambda
variant ofexpr
using the appropriate data. The definition for this variant is at the top ofcalcparse.ml
. - Also match an error case which will help users identify mistakes
in their use of the
@
symbol. This would look for a sequence of tokens of the form@ _
where the second token is not anIdent
(anything else). Raise a parse error in this case with the message below which indicates that@
should be followed by an identifier. Test your parsing with the
parsetree
command as demonstrated below.calculon> parsetree @n n+1; Parse tree: Lambda( n ) Add Varname(n) IntExp(1) calculon> parsetree @first @second first*second; Parse tree: Lambda( first ) Lambda( second ) Mul Varname(first) Varname(second) calculon> parsetree @ 1+2*n; ERROR: I'm not familiar with the type of thing I'm seeing... Calcparse.ParseError("Expected identifier in lambda expression after @", _)
4.2 Evaluating Closures
- In
calceval.ml
, locate the case ineval_expr
which deals with lambda expressions. - Evaluating a lambda simply creates a closure. Thus, all that needs
to happen is creation of a
Closure
variant ofdata_t
. The correspondence between fields ofLambda
andClosure
should be obvious save for thevarmap
. Closures in
calculon
are lexically scoped which means that they retain all variable bindings present at their creation. For example, look closely at thevarmap
portion in the demo below.############################## # EXAMPLE 1 calculon> def x=2; # define x and y x : IntDat(2) calculon> def y=5; y : IntDat(5) calculon> def add_xy = @a a+x+y; # define a function using x,y add_xy : Closure(a, <fun>) calculon> show add_xy; # variable map includes x,y Closure(param_name: a, varmap={... x: IntDat(2), y: IntDat(5)} code= Add ^^^^^^^^^^^^^^^^^^^^^^^^^^ Add Varname(a) Varname(x) Varname(y) ) calculon> def x=199; # redefine x and y x : IntDat(199) calculon> def y=255; y : IntDat(255) calculon> show add_xy; # original x,y values are retained in map Closure(param_name: a, varmap={... x: IntDat(2), y: IntDat(5)} code= Add Add Varname(a) Varname(x) Varname(y) ) ############################## # EXAMPLE 2 calculon> def double = let two=2 in @n two*n; # local var 'two' to closure double : Closure(n, <fun>) calculon> two; # two is not defined at top level ERROR: I'm not familiar with the type of thing I'm seeing... Calceval.EvalError("No variable 'two' bound", ... calculon> show double; # double retains value for 'two' Closure(param_name: n, varmap={... two: IntDat(2), ...} code= Mul Varname(two) Varname(n) )
- To enable easy lexical closures, Calculon internally uses
OCaml's persistent maps. At the top of
calceval.ml
, a moduleVarmap
is defined along with the typevarmap_t
which mapsstring
todata_t
, variable names to variable values. - To allow a closure to retain the environment in which it is
created, construct it with the current
varmap
. Note that all calls toeval_expr
are passed avarmap
with the current bindings. Use this to construct the closure. - Test your results using Calculon's
show
command which shows the internal structure associated with adata_t
at the top level (see previous examples). Ensure that thevarmap
associated with each closure contains the bindings that were part of the lexical scope in which it was created. - Note that the approach taken here is a bit lazy as ALL variables in the scope of a closure are retained rather than those that are strictly needed for it to work. A more hygenic implementation would analyze the closure code to determine the free variables that need to be retained from the global scope and only retain those for better memory efficiency. However, the present approach of retaining a copy of the map is favored due to ease of implementation.
4.3 Grading Criteria for P3: Adding Closures grading 15
The following criteria will be checked in this problem.
Weight | Criteria |
---|---|
AUTOMATED TESTS for adding closures | |
5 | make test-p3 correctly compiles and passes tests |
MANUAL INSPECTION for adding closures | |
5 | calcparse.ml |
Clear case in parse_lambda identifying an @ <ident> token sequence |
|
Correct creation of a Lambda expression |
|
Inclusion of an error case for @ followed by not an Ident |
|
5 | calceval.ml |
Correct construction of Closure record | |
Inclusion of a variable map to facilitate lexical scoping |
5 Problem 4: Function Application
Function application in Calculon is identical to function application in OCaml: it is simply a sequence of expressions next to one another as in
<clos_expr> <arg1_expr> <arg2_expr> ... <argN_expr>
As the above indicates, the first expression should be a closure: other kinds of data such as integers cannot be applied, often a variable binding the function not always. Concretely, this looks likes the following:
calculon> (@n n+1) 5; # apply anonymous func - : IntDat(6) calculon> def double = @n 2*n; # doubling function double : Closure(n, <fun>) calculon> double 2; # apply function - : IntDat(4) calculon> double (1+5); # apply function - : IntDat(12) calculon> def mul = @a @b a*b; # func of 2 args mul : Closure(a, <fun>) calculon> mul 10 5; # apply function - : IntDat(50) calculon> mul 2 9; # apply function - : IntDat(18) calculon> def max = # max function @a @b if a > b then a else b; max : Closure(a, <fun>) calculon> max 10 5; # call on two args - : IntDat(10) calculon> max 2 9; # call on two args - : IntDat(9)
In the case of a single argument as in <clos_expr> <arg_expr>
, the
implementation of function application an obvious 3-step process
- Evaluate the argument expression to get the argument data
- Add the argument data to the closure's
varmap
bound to its parameter name - Evaluate the closure's code with the new variable map and return the resulting data.
What is trickier to understand is how functions of "multiple"
arguments work. Recall that Lambda
expressions have only a single
parameter. Multi-argument functions are really nested Lambda
expressions as in:
calculon> parsetree @a @b a*b; # func of 2 args Parse tree: Lambda( a ) # nested lambdas Lambda( b ) Mul Varname(a) Varname(b)
Evaluating the first lambda creates a closure but the second lambda is remains un-evaluated.
calculon> def mul = @a @b a*b; # func of 2 args mul : Closure(a, <fun>) calculon> show mul; Closure(param_name: a, varmap={...} code= # outer closure Lambda( b ) # inner lambda Mul Varname(a) Varname(b) )
On applying the closure, its first parameter becomes defined and its code is evaluated. In the case of a Lambda expression, this results in another closure.
calculon> def mul = @a @b a*b; # func of 2 args mul : Closure(a, <fun>) calculon> parsetree mul 2; Parse tree: Apply # function application of ... .func_expr: Varname(mul) # variable associated with closure .param_expr: IntExp(2) # argument calculon> mul 2; # apply func of 2 args to 1 arg - : Closure(b, <fun>) # gives a closure
This second closure may be immediately applied to more arguments are bound to a name for later use as a partially applied function.
calculon> mul 2 4; # full application - : IntDat(8) calculon> def doub = mul 2; # partial application doub : Closure(b, <fun>) calculon> show doub; # closure needs a param Closure(param_name: b, varmap={a: IntDat(2), ...} code= Mul ^^^^^^ a is rememberd though Varname(a) Varname(b) ) calculon> doub 4; # apply partial function - : IntDat(8) calculon> doub 9; # apply partial function - : IntDat(18)
Thus, functions in Calculon are curried: they can be partially applied.
Note that errors will result when applying a closure to too many arguments as the resulting data will no longer be a closure. This is the same error that results from trying to use data such as integers as a function.
calculon> def mul = @a @b a*b; # func of 2 args mul : Closure(a, <fun>) calculon> mul 2; # partial application - : Closure(b, <fun>) calculon> mul 2 4; # full application - : IntDat(8) calculon> mul 2 4 6; # too many arguments ERROR: I'm not familiar with the type of thing I'm seeing... Calceval.EvalError("Expected Closure for application, found 'IntDat(8)'", ... calculon> 8 6; # same error ERROR: I'm not familiar with the type of thing I'm seeing... Calceval.EvalError("Expected Closure for application, found 'IntDat(8)'", ...
5.1 Lexing/Parsing Function Application
- Lexing is complete for function application: there is no special syntax associated with it.
Parsing is provided for function application and requires no modification. In
calcparse.ml
, function application is a high-precedence operation which looks for two adjacent expressions. This creates anApply
variant ofexpr
. This is done in a left-associative fashion as arithmetic operators are parsed so that applications to multiple argumentsclos arg1 arg2 ... argN
are parsed as
((((clos arg1) arg2) ...) argN)
This allows the results of the first application to create another closure which is applied to the second argument and so forth.
5.2 Evaluating Function Application
- In
calceval.ml
, locate whereApply
expressions are handled and begin editing. Evaluate the
func_expr
associated with the application. In many cases this will be a simple variable name but be an arbitrarily complex expression:doub (2+1) # simple func_expr (@m 2*m) (2+1) # more complex func_expr
- The resulting data produce by the
func_expr
should be aClosure
. Do pattern matching to ensure this is the case. If it is a closure, evaluate the parameter expression associated with theApply
. - Refresh your understanding of the fields in the
Closure
variant. Access thevarmap
associated with the closure, and add a binding between its parameter name and the parameter data computed in the previous step. Importantly do not use the varmap available ineval_expr
as this may have different bindings than thevarmap
of the closure. - Finally, recursively evaluate the closures code in the presence of
the
varmap
created in the previous step and return the results. If the function expression is NOT a closure, application cannot proceed so raise an evaluation error. With a message like the one below. Use the
data_string
function to stringify the data that was not a closure.calculon> 8 6; ERROR: I'm not familiar with the type of thing I'm seeing... Calceval.EvalError("Expected Closure for application, found 'IntDat(8)'", ... calculon> true 9 5; ERROR: I'm not familiar with the type of thing I'm seeing... Calceval.EvalError("Expected Closure for application, found 'BoolDat(true)'", ...
Test out function application evaluation after recompiling using the examples below. Note the use of the global
undef <ident>
command to remove a binding from the top level. This is useful to ensure that closures properly retain their lexical scope and are evaluated in that scope.############################## # EXAMPLE 1 calculon> def double = @n 2*n; # single arg function double : Closure(n, <fun>) calculon> double 6; # application - : IntDat(12) ############################## # EXAMPLE 2 calculon> def mulall = @a @b @c a*b*c; # multiple args mulall : Closure(a, <fun>) calculon> mulall 1 2 3; # full application - : IntDat(6) calculon> def mullast = mulall 1 2; # partial application mullast : Closure(c, <fun>) calculon> mullast 3; # completed application - : IntDat(6) calculon> mullast 4; # completed application - : IntDat(8) ############################## # EXAMPLE 3 calculon> def x = 10; # test lexical scope x : IntDat(10) calculon> def addx = @a a+x; # uses x addx : Closure(a, <fun>) calculon> addx 5; # works fine - : IntDat(15) calculon> undef x; # undefine x calculon> x; ERROR: I'm not familiar with the type of thing I'm seeing... Calceval.EvalError("No variable 'x' bound", ... calculon> addx 5; # function still works - : IntDat(15) calculon> def x=40; # redefine x x : IntDat(40) calculon> addx 5; # original x=10 used - : IntDat(15)
5.3 Grading Criteria for P4: Function Application grading 10
The following criteria will be checked in this problem.
Weight | Criteria |
---|---|
AUTOMATED TESTS for function application | |
5 | make test-p4 correctly compiles and passes tests |
MANUAL INSPECTION for function application | |
5 | calceval.ml |
Recursive evaluation of the function expression | |
Use of match/with to determine if the result was a closure or not | |
Evaluation of the argument expression | |
Binding of argument data to parameter name in closure's varmap | |
Evaluation of closure code with the new varmap | |
Clear error case handling non-closure application |
6 Problem 5: Recursive Let/In
Top-level functions in Calculon can be defined recursively. There is
no special syntax associated with recursion for top level def
statements which is less like OCaml's special let rec...
and more
like the conventions of C/Java/Python/Scheme/every other language.
Examples of top-level recursive functions in Calculon are given below.
calculon> def fact = @n if n=1 then 1 else n*(fact (n-1)); fact : Closure(n, <fun>) calculon> fact 3; - : IntDat(6) calculon> fact 4; - : IntDat(24) calculon> fact 5; - : IntDat(120) calculon> def pow = @base @exp if exp=0 then 1 else let res=pow base (exp-1) in base*res; pow : Closure(base, <fun>) calculon> pow 2 4; - : IntDat(16) calculon> pow 2 5; - : IntDat(32) calculon> pow 3 5; - : IntDat(243)
This is quite handy to as it enables Calculon to compute most of what is computable, though not all.
The provided version of Calculon already has recursive top-level statements. No modifications are required to get them working though the previous problems must be completed for the above code to work.
There are some situations in which recursive helper functions are useful such as the following.
calculon> def sum_facts = # sums a sequence of factorials let fact = # local function fact @n if n=1 then 1 else n*(fact (n-1)) in @i @stop # main body of sum_facts if i > stop then 0 else let facti = fact i in let rest = sum_facts (i+1) stop in facti + rest; sum_facts : Closure(i, <fun>) calculon> sum_facts 10 2; - : IntDat(0) calculon> sum_facts 1 3; - : IntDat(9) calculon> sum_facts 1 5; - : IntDat(153) calculon> sum_facts 5 10; - : IntDat(4037880)
While a bit contrived as a top-level definition for fact
would
suffice, the let/in
binding for the local fact
definition allows
nested recursive functions to be defined. Recursive functions defined
via let/in
is not present in Calculon by default and is the
subject of this problem.
6.1 Adding Recursive Let
- Examine
calculon.ml
and locate the code responsible for setting up a top-leveldef
binding. Study this code carefully as it will provide a template for how to implement recursive let bindings. - In
calceval.ml
, locate the whereeval_expr
handlesLetin
expressions. This code is relatively simple. - Recursive let bindings only matters for functions in Calculon. That
means one only needs to look for closures. After the expression
associated with the let is evaluated, check it to see whether it is
a closure using
match/with
. - If not a closure, nothing needs to change. However, closures should
have their
varmap
modified to include the variable name which is being let-bound. This will allow them to refer back to their own function name in the event that recursion is required. - Be careful to directly modify the closure's map and NOT the
varmap
that the let binding is altering as these may have different bindings in them. - When finished, recompile and check your results with either the example above or the simple one below.
# ensure there are no alternative facts calculon> undef fact; # recursive let binding calculon> let fact = @n if n=1 then 1 else n*(fact (n-1)) in fact 5; - : IntDat(120)
6.2 Grading Criteria for P5: Recursive Let grading 10
The following criteria will be checked in this problem.
Weight | Criteria |
---|---|
AUTOMATED TESTS for recursive Let | |
5 | make test-p5 correctly compiles and passes tests |
MANUAL INSPECTION for recursive let | |
5 | calceval.ml |
In Letin evaluation, use of match/with to look for a closure result |
|
Mutation of to closure's varmap present | |
Binding of varname added to closures map to allow recursion |
7 Optional Problems / Makeup Credit
The following problems are optional and can be completed. Any additional points earned here will contribute the overall score on Assignments in the course. This will make up for credit lost on previous assignments but will not cause totals exceed 100% on assignments. With respect to grading, keep in mind the following:
- This assignment is worth 2X the credit of other assignments
- There are 50 total points of optional problems
- Completing all optional problems would make up for entirely missing an assignment
Note that the instructions for the optional problems are less thorough than for the required problems by intention as solving them will rely on your ability to work out solutions on your own.
7.1 OptProb 1: Boolean Operations
While not strictly necessary, boolean operators like and
and or
allow conditions to be stated somewhat more succinctly than would
otherwise be possible. This problem implements these two boolean
operators in Calculon. This allows code like the following to
execute.
calculon> true and false; - : BoolDat(false) calculon> true or false; - : BoolDat(true) calculon> if 1<2 and 2<3 then 10 else 20; - : IntDat(10) calculon> def between = @lo @hi @x (lo < x) and (x < hi); between : Closure(lo, <fun>) calculon> between 1 3 2; - : BoolDat(true) calculon> between 1 3 5; - : BoolDat(false) calculon> between 1 3 0; - : BoolDat(false)
Boolean operations are lower precedence than arithmetic and lower precedence than comparison. This allows combinations of these other operations to be done prior to boolean operations a shown below.
calculon> parsetree 1+2>4 or 2+3>4; Parse tree: Or Greater Add IntExp(1) IntExp(2) IntExp(4) Greater Add IntExp(2) IntExp(3) IntExp(4) calculon> 1+2>4 or 2+3>4; - : BoolDat(true)
One important convention that Calculon follows is that most languages
implement and
as a higher precedence operator than or
which allows
the following interpretations.
calculon> def x=5; x : IntDat(5) calculon> 1<x and x<3 or 3<x and x<7; # eval 'and' before 'or' - : BoolDat(true) calculon> (1<x and x<3) or (3<x and x<7); # explicit parens for the above - : BoolDat(true) calculon> parsetree 1<x and x<3 or 3<x and x<7; Parse tree: Or And Less IntExp(1) Varname(x) Less Varname(x) IntExp(3) And Less IntExp(3) Varname(x) Less Varname(x) IntExp(7)
Implementing Boolean Operations
- Modify the
calclex.ml
to recognize keywordsand
andor
and generate appropriate tokens. - Modify
calcparse.ml
to create parsing functions for and/or. Since these are at different precedence levels, write separate parse functions for each calledparse_or
andparse_and
. Model these functions after arithmetic parsing functions likeparse_addsub
as chains of left-associative and/or's are allowed. - Ensure the new parse functions are lower precedence than all other
productions. Make sure to call them from the top-level parsing
entry point
parse_expr
. - Boolean operations should create
Boolop
variants ofexpr
which are similar toIntop
expressions for integers. - In
calceval.ml
, implement evaluation ofBoolop
variants in the appropriate case ofeval_expr
. Model this after howIntop
expressions are evaluated.
Grading Criteria for OptProb 1: Boolean Operations grading 10
Weight | Criteria |
---|---|
AUTOMATED TESTS for boolean operations | |
5 | make test-op1 correctly compiles and passes tests |
5 | MANUAL INSPECTION for boolean operations |
calclex.ml |
|
Addition of and and or to keywords, tokens produced |
|
calcparse.ml |
|
Addition of parse_or and parse_and parsing functions |
|
Clearly two functions which give and/or different precedence levels | |
Creation of Boolop expressions associated with and/or |
|
calceval.ml |
|
Evaluation of Boolop which combine BoolDats to produce a BoolDat |
|
Error checking to ensure only BoolDat data are combined |
7.2 OptProb 2: Evaluate Source Files
A hand feature of OCaml's REPL is the #use "source.ml"
directive
which will read OCaml code from a source file and evaluate it in the
REPL. Calculon benefits from a similar functionality if the source
command is implemented.
> cat small.calc # unix command line def one = 1; # contents of file small.calc def double = @n n*2; > calculon # start calculon calculon> source small.calc; # read from source file Reading source file 'small.calc' calculon> def one = 1; # from file one : IntDat(1) calculon> def double = @n n*2; # from file double : Closure(n, <fun>) calculon> End of file 'small.calc' # end of file calculon> whos; # show definitions double : Closure(n, <fun>) one : IntDat(1) calculon> double one; # call loaded function - : IntDat(2) calculon> double 3; - : IntDat(6)
This allows source files to be created in text editors and loaded into Calculon.
Variable bindings in a Calculon session may be lost if a source file is loaded which redefines them:
calculon> def one = 999; # define variable 'one' one : IntDat(999) calculon> source small.calc; # source a file Reading source file 'small.calc' calculon> def one = 1; # re-def of 'one' one : IntDat(1) ... End of file 'small.calc' calculon> whos; double : Closure(n, <fun>) one : IntDat(1) # one redefined calculon> one; - : IntDat(1)
Implementing the source
command
- This is a top-level command so open
calculon.ml
where other top-level commands likewhos
andparsetree
are found. - Analyze the main interactive loop for Calculon which is near the
bottom of
calculon.ml
. This loop repeatedly calls a sequence of functions which read input from thestdin
channel (keyboard typing) and process statements associated with it. It is a good model for what needs to be done save that input should come from a file source instead ofstdin
. - Examine other top-level commands and establish a case which
corresponds to a
source
command. Copy relevant code from the main loop. Note that anIdent
carrying the filename should come after thesource
command. - Study the OCaml manual, previous projects, and internet sources to determine how to open an input file in OCaml. Open the named file and use it the source for functions that process statements.
- As in the main loop, when the end of file is reached, an exception is raised which needs to be caught. In this case it indicates there is no more input available in the file and control should be returned to the main loop. Make sure to close the open file.
As the file is opened and closed, print messages associated with this to indicate to a user what is being automatically read in.
calculon> source small.calc; # source a file Reading source file 'small.calc' ... End of file 'small.calc' calculon> whos; # back to interactive
Grading Criteria for OptProb 2: Evaluate Source Files grading 10
Weight | Criteria |
---|---|
AUTOMATED TESTS for Evaluate Source Files | |
5 | make test-op2 correctly compiles and passes tests |
5 | MANUAL INSPECTION for Evaluate Source Files |
calculon.ml |
|
Addition of top-level source command in appropriate case. |
|
Use of similar to processing loop to main Calculon loop | |
Use of existing statement processing functions | |
Clear opening and closing of an input file and use as input | |
Messages for begining/ending of the input file |
7.3 OptProb 3: Save/Load Sessions
Another common feature of REPL's is the ability to save and restore
sessions. Calculon can support this with the save <file>
and load
<file>
commands as follows.
> calculon # start calculon calculon> def pi=314; # define a few variables pi : IntDat(314) calculon> def mul = @a @b a*b; mul : Closure(a, <fun>) calculon> whos; mul : Closure(a, <fun>) pi : IntDat(314) calculon> save mydefs.dat; # save defs to a binary file Saved binary data in 'mydefs.dat' calculon> quit; That was so terrible I think you gave me cancer! > calculon # restart calculon calculon> whos; # no defs initially calculon> load mydefs.dat; # load binary data Loading binary data from 'mydefs.dat' calculon> whos; # defs are back mul : Closure(a, <fun>) pi : IntDat(314) calculon> mul pi 2; - : IntDat(628)
In Calculon, saving a session simply amounts to writing the global
variable map defined in calculon.ml
to disk while loading involves
reading the data back and merging it with the current variable map.
Writing and reading a complex data structure like a varmap_t
is somewhat difficult if users were required to do it by
hand. Fortunately, like many sensible languages, OCaml supports
marshalling of data, sometimes known as serialization. This is
the process of writing the in-memory binary representation of a data
structure to disk which allows it to be read back directly into
memory.
The module Marshal
provides this serialization of any data structure
in OCaml including user-defined data. No special knowledge of the
data structure is needed to marshal it. Knowledge of the functions in
Marshal
will make this problem relatively easy to complete.
Implementing Save/Load Sessions
- Both save and load are top-level commands so examine the code in
calculon.ml
to create an appropriatematch
case. Both save/load are followed by anIdent
which carries the name of the file to use. Examine the functions in the standard
Marshal
module. Read carefully the documentation associated with the functionsto_channel outchan
andfrom_channel inchan
. As a simple example of these, here is an OCaml REPL session which demonstrates their use.# let mylst = [1;2;3];; val mylst : int list = [1; 2; 3] # let mylist = [1;2;3];; val mylist : int list = [1; 2; 3] # let outchan = open_out "save.dat";; val outchan : out_channel = <abstr> # Marshal.to_channel outchan mylist [];; - : unit = () # close_out outchan;; - : unit = () # let inchan = open_in "save.dat";; val inchan : in_channel = <abstr> (* must indicate the type of data read via from_channel *) (* vvvvvvvv *) # let loaded : int list = Marshal.from_channel inchan;; val loaded : int list = [1; 2; 3] # close_in inchan;; - : unit = ()
- The data created by
to_channel
is binary and will not be easily understood in standard text editors so do not expect any great revelations associated with attempting to read it. Implement
save <file>;
according to the example above which usesto_channel
. Output the global variable map defined incalculon.ml
which contains all bindings. Make sure to open and close the output channel. Print a message indicating that data has been saved as in:calculon> save mydefs.dat; Saved binary data in 'mydefs.dat'
Implement
load <file>;
according to the example above. This will allow a binary data to read back in and converted to avarmap_t
. Print a message indicating that data is being loaded as in:calculon> load mydefs.dat; Loading binary data from 'mydefs.dat'
When loading, existing bindings should be preserved as much as possible though definitions in the loaded variable map should override conflicting bindings in the current global map. To be concrete, here is an example:
> calculon; calculon> def a=1; # define a,b a : IntDat(1) calculon> def b=2; b : IntDat(2) calculon> save mydefs.dat; # save to file Saved binary data in 'mydefs.dat' calculon> quit; # exit That was so terrible I think you gave me cancer! > calculon # restart calculon> def b=3; # define b,c b : IntDat(3) calculon> def c=4; c : IntDat(4) calculon> whos; b : IntDat(3) c : IntDat(4) calculon> load mydefs.dat; # load data Loading binary data from 'mydefs.dat' calculon> whos; a : IntDat(1) # loaded from file b : IntDat(2) # redefines b from file c : IntDat(4) # already present
To accomplish this, load the
varmap_t
from a file, and merge it into the global variable map using an appropriate function. SinceVarmap
is derived from theMap.Make
functor, it will have several functions that allow for merging. Examine functions available for merging and creating unions in the documentation forMap.Make
. Experiment with these in a REPL to determine the correct way to invoke one.
Grading Criteria for OptProb 3: Save/Load Sessions grading 10
Weight | Criteria |
---|---|
AUTOMATED TESTS for Save/Load Sessions | |
5 | make test-op3 correctly compiles and passes tests |
5 | MANUAL INSPECTION for Save/Load Sessions |
calculon.ml |
|
Addition of top-level save/load commands in appropriate cases |
|
Clear opening and closing of input/output channels during save/load | |
Messages printed indicated save/load to a file name | |
Use of Marshal module functions to write/read binary maps. |
7.4 OptProb 4: Informative Exceptions
While colorful, the error messages in Calculon are a bit general and hard on the eyes as indicated below.
calculon> 10!; ERROR: I'm not familiar with the type of thing I'm seeing... Calclex.LexError("char '!' not recognized, at position 2 in the input '10!;'") calculon> 10+; ERROR: I'm not familiar with the type of thing I'm seeing... Calcparse.ParseError("syntax error", _) calculon> def doub = @ * 2; ERROR: I'm not familiar with the type of thing I'm seeing... Calcparse.ParseError("Expected identifier in lambda expression after @", _) calculon> 10+true; ERROR: I'm not familiar with the type of thing I'm seeing... Calceval.EvalError("Expect Int for right arithmetic expression, found 'BoolDat(true)'", "{}", "\nAdd\n IntExp(10)\n BoolExp(true)\n")
One minor improvement to this situation would be to treat each exceptions typ slightly differently and print associated messages as shown below.
calculon> 10!; LEX ERROR: char '!' not recognized, at position 2 in the input '10!;' calculon> 10+ ; PARSE ERROR: syntax error tokens: [Semicolon ] PARSE ERROR: Expected identifier in lambda expression after @ tokens: [At; Times; IntTok(2); Semicolon ] calculon> 10+true; EVAL ERROR: Expect Int for right arithmetic expression, found 'BoolDat(true)' varmap: {} expression: Add IntExp(10) BoolExp(true)
While still not tremendously helpful, these messages are at least more specific to the kind of error (lex, parse, or evaluate).
To complete this problem, develop some knowledge of exception handling. A reasonable source is in one of our course texts: Practical OCaml Ch 10 on Exception Handling. Exceptions are similar to algebraic types and exception handling of various types is similar to pattern matching. After some research and practice, exception handling will feel quite familiar.
Implementing Informative Exceptions
- In
calculon.ml
locate where errors in statement processing are handled (exceptions are "caught"). There is one case here which handles all kinds of exceptions. - Add 3 cases to the exception handling for each of the exception types that can occur in the Lexing, Parsing, and Evaluation phases. Examine the associated source files if needed to determine the types associated with each of these exceptions.
- For each case, print a more specific error message as indicated in
the example above. Examine the fields available in the exception
types which carry data about the circumstances associated why a
problem arose. Some stringifying functions such as
tokenlist_string
are useful to format messages as indicated. - Use the examples provided previously as a guide to how the error messages should be printed for better clarity.
Grading Criteria for OptProb 4: Informative Exceptions grading 10
Weight | Criteria |
---|---|
AUTOMATED TESTS for Save/Load Sessions | |
5 | make test-op4 correctly compiles and passes tests |
5 | MANUAL INSPECTION for Save/Load Sessions |
calculon.ml |
|
Addition of cases to exception handling specfic to Lex/Parse/Eval errors | |
Use of data fields carried by caught exceptions | |
Printing of correctly formatted error messages specific to each exception type |
7.5 OptProb 5: Expression Optimization
As we have seen, parsing creates a data structure representing what a code should do while evaluation actually executes the code. In between these two steps exists a whole world of possible optimizations which transform the original expressions to produce equivalent results more efficiently. This problem addresses some possible optimization opportunities.
As an opening example, consider the following function in Calculon.
calculon> def five = @n 2+3; five : Closure(n, <fun>) calculon> show five; Closure(param_name: n, varmap={...} code= Add IntExp(2) IntExp(3) )
Whenever the function five
is called, its parameter is ignored and
instead 2 and 3 are added to return 5 every time. It is not hard to
see that this function would be better written
calculon> def five_better = @n 5; five_better : Closure(n, <fun>) calculon> show five_better; Closure(param_name: n, varmap={...} code= IntExp(5) )
where the adding of 2 and 3 is done ahead of time saving the need to do an addition at evaluation time.
An interesting question is whether this optimization can be done automatically, a technique called constant folding.
Similarly, the following two functions have the same effect, of adding 9 but the latter does so without needing to do any let binding.
calculon> def add_nine = @n let nine=9 in n+nine; add_nine : Closure(n, <fun>) calculon> show add_nine; Closure(param_name: n, varmap={...} code= Letin( nine ) # evaluate a let binding .var_expr: IntExp(9) .in_expr: Add Varname(n) Varname(nine) ) calculon> def add_nine_nolet = @n n+9; add_nine_nolet : Closure(n, <fun>) calculon> show add_nine_nolet; Closure(param_name: n, varmap={...} code= Add # no let binding to evaluate Varname(n) IntExp(9) )
The optimization in which the variable nine
is replaced by its
constant value 9
is often referred to as constant propagation and
in avoids the need to modify the variable map.
This problem adds several features to Calculon to perform limited versions of the constant folding and constant propagation optimizations.
- A function in
calceval.ml
calledoptimize_expr
which transforms expressions into equivalent, optimized expressions. Some constant folding and propagation are done for this. - A top-level command
optparsetree <expr>;
which shows the expression resulting fromoptimize_expr
in the same way thatparsetree <expr>
shows the results of parsing. - Automatic optimization of expressions via
optimize_expr
when theoptimize true;
top-level option is set.
These optimizations can be done by hand but allowing a coder to use informative variable names and then automatically optimizing the result for performance is a powerful combination.
Implementing Informative Exceptions
- Start by adding the
optparsetree <expr>;
top-level command as this will make it easier to see the results ofoptimize_expr
calls. - Locate similar commands in
calculon.ml
such asparsetree
and copy the pattern there to create the top-level command. After parsing and expression, it should calloptimize_expr
fromcalceval.ml
and then print the stringified results. - Currently
calceval.ml
has a version ofoptimize_expr
which throws an exception. Start editing this function to perform optimizations. The structure of
optimize_expr
is quite similar toeval_expr
: both take avarmap
and anexpr
as arguments and operate on them. Starting with that as the code template is not a bad idea. The central idea is to evaluate any constant expressions possible and replace them with an equivalent constant. For example, in the following parsetree, both left and right branches of theAdd
are constants.calculon> parsetree 1+2; Parse tree: Add IntExp(1) IntExp(2)
The optimization of this is to replace the add with the result of adding as in
calculon> optparsetree 1+2; Optimized Parse tree: IntExp(3)
The process of optimizing an expression should be done recursively: for any branching constructions such as arithmetic, optimize the left and right branches and then check if the results are both constant expressions. If so, evaluate them and return the computed answer as the optimized expression. For example, the following nested arithmetic would be optimized to only the result.
calculon> parsetree 1*3+2*4+5; Parse tree: Add Add Mul IntExp(1) IntExp(3) Mul IntExp(2) IntExp(4) IntExp(5) calculon> optparsetree 1*3+2*4+5; Optimized Parse tree: IntExp(16)
Variable names may break the ability to fold all constants. It is not expected that all possible folding be done, only that which is easy to reach via the left-associative parsing of expressions. For example:
calculon> parsetree n+2*4+5; Parse tree: Add Add Varname(n) Mul IntExp(2) # opportunity to fold constants IntExp(4) IntExp(5) calculon> optparsetree n+2*4+5; Optimized Parse tree: Add Add Varname(n) IntExp(8) # folded to 8 IntExp(5) # due to left-associativity, not folded with 8 to 13
If variable names correspond to constant integers or booleans, they can be propagated into an expression to enable further folding.
calculon> def x = 3; # define a constant x : IntDat(3) calculon> parsetree x; Parse tree: Varname(x) # variable lookup calculon> optparsetree x; Optimized Parse tree: IntExp(3) # var currently associated with 3 calculon> parsetree 5+x; Parse tree: Add IntExp(5) Varname(x) # variable name lookup calculon> optparsetree 5+x; Optimized Parse tree: IntExp(8) # x is known so becomes 3 enabling folding
During optimization, handle variables by looking them up in the current
varmap
. If they are associated with constant integers or booleans, return a constantIntExp
orBoolExp
with the associated value. This will enable further folding.Constant
let/in
bindings can be optimized away by adding to thevarmap
the given value and then optimizing the resulting expression. This constant propagation may further allow for folding to occur such as in the following.calculon> parsetree let y=2 in y*3; Parse tree: Letin( y ) .var_expr: IntExp(2) .in_expr: Mul Varname(y) IntExp(3) calculon> optparsetree let y=2 in y*3; Optimized Parse tree: IntExp(6) # 2 propogated to y, allows 2*3 to be folded calculon> undef z; # ensure no def for z calculon> parsetree let y=2 in y*z; Parse tree: Letin( y ) .var_expr: IntExp(2) .in_expr: Mul Varname(y) Varname(z) calculon> optparsetree let y=2 in y*z; Optimized Parse tree: Mul IntExp(2) # can't get rid of z but Varname(z) # can avoid let/in binding
Comparisons can similarly be optimized to constant booleans as in the following.
calculon> parsetree 3<4; Parse tree: Less IntExp(3) IntExp(4) calculon> optparsetree 3<4; Optimized Parse tree: BoolExp(true) calculon> parsetree 5+2 > 8; Parse tree: Greater Add IntExp(5) IntExp(2) IntExp(8) calculon> optparsetree 5+2 > 8; Optimized Parse tree: BoolExp(false)
Whole branches of conditionals may be optimized away by determining that the condition is always true or false such as in the following.
calculon> parsetree if true then 1 else 2; Parse tree: Cond .if_expr: BoolExp(true) .then_expr: IntExp(1) .else_expr: IntExp(2) calculon> optparsetree if true then 1 else 2; Optimized Parse tree: IntExp(1)
Even in the event that a branch cannot be eliminated, both branches of the conditional can be optimized for propagation and folding.
calculon> parsetree if z then 1+2 else let y=1 in y*4; Parse tree: Cond .if_expr: Varname(z) .then_expr: Add IntExp(1) IntExp(2) .else_expr: Letin( y ) .var_expr: IntExp(1) .in_expr: Mul Varname(y) IntExp(4) calculon> optparsetree if z then 1+2 else let y=1 in y*4; Optimized Parse tree: Cond .if_expr: Varname(z) .then_expr: IntExp(3) .else_expr: IntExp(3)
The code within a closure should be optimized so the original examples are optimized as in:
calculon> parsetree @n 2+3; Parse tree: Lambda( n ) Add IntExp(2) IntExp(3) calculon> optparsetree @n 2+3; Optimized Parse tree: Lambda( n ) IntExp(5) calculon> parsetree @n let nine=9 in n+nine; Parse tree: Lambda( n ) Letin( nine ) .var_expr: IntExp(9) .in_expr: Add Varname(n) Varname(nine) calculon> optparsetree @n let nine=9 in n+nine; Optimized Parse tree: Lambda( n ) Add Varname(n) IntExp(9)
- When
optimize_expr
is reasonably complete, alter the code in the top-level constructs fordef
and expression evaluation:- Check if the global
optimize
option is set to true - If so, call
optimize_expr
on the results of parsing Use the optimized results when evaluating. Automatic optimization will only be visible for closure data as all other code is immediately evaluated leaving only the resulting data.
calculon> def five_std = @n 2+3; # no optimizations five_std : Closure(n, <fun>) calculon> optimize true; # turn on automatic optimization optimization now 'true' calculon> def five_opt = @n 2+3; # same definition five_opt : Closure(n, <fun>) calculon> show five_std; Closure(param_name: n, varmap={...} code= Add # un-optimized IntExp(2) IntExp(3) ) calculon> show five_opt; Closure(param_name: n, varmap={...} code= IntExp(5) # optimized )
- Check if the global
Grading Criteria for 5: Expression Optimization grading 10
Weight | Criteria |
---|---|
AUTOMATED TESTS for Save/Load Sessions | |
5 | make test-op5 correctly compiles and passes tests |
5 | MANUAL INSPECTION for Save/Load Sessions |
calculon.ml |
|
Implementation of the optparsetree command |
|
Honoring the optimize option when evaluting defs and expressions |
|
calceval.ml |
|
Definition of the optimize_expr function |
|
Recursive optimization in all relevant branches | |
Correct use of varmap to propagate constants forward |
|
Correct folding of arithmetic constants | |
Correct optimization of conditionals to eliminate brancehs if possible |
8 Zip and Submit
8.1 Zipping Files
If creating a zip file is unfamiliar to you, refer to the following guide: Making Zips (tutorial)
8.2 Submit to Canvas
Once your are confident your code is working, you are ready to submit. Ensure your folder has all of the required files. Create a zip archive of your lab folder and submit it to Canvas.
On Canvas:
- Click on the Assignments section
- Click on the appropriate link for this lab/assignment
- Scroll down to "Attach a File"
- Click "Browse My Computer"
- Select you Zip file and press OK
8.3 Late Policies
You may wish to review the policy on late project submission which will cost you late tokens to submit late or credit if you run out of tokens. No projects will be accepted more than 48 hours after the deadline.
http://www-users.cs.umn.edu/~kauffman/2041/syllabus.html#late-projects