Last Updated: 2018-12-10 Mon 20:43

CSCI 2041 Assignment 5: The Calculon Interpreter

CODE DISTRIBUTION: a5-code.zip

TESTING CODE:

CHANGELOG:

Mon Dec 10 20:38:55 CST 2018

Optional tests have been added linked to the assignment 5 specification and directly here: a5-all-tests.zip. Copy all files in the folder created by the zip including Makefile into the project directory.

In the included file OPTIONAL_PROBLEMS.txt fill in a yes for optional problems that you worked on.

The new Makefile includes targets

make test-op1     # test optional problem 1 
make test-op2     # test optional problem 2 
make test-op3     # test optional problem 3 
make test-op4     # test optional problem 4 
make test-op5     # test optional problem 5 
make test-op      # test all optional problems
Fri Dec 7 11:31:05 CST 2018
Post 898 reported bugs in the test_calculon.sh script that cause errors when running on Mac OSX. These have been fixed in an updated version of test_calculon.sh. Download it if needed.
Thu Dec 6 16:19:13 CST 2018
Tests for all required problems are now available, linked at the top of the A5 specification. Instructions to install them are here. Note that you must copy all files including Makefile from the a5-required-tests/ directory to the project directory. Tests for Optional Problems will be posted over the weekend.
Wed Dec 5 12:07:25 CST 2018
Finished an incomplete sentence associated with the explanation for function application.
Tue Dec 4 15:09:14 CST 2018
The demonstration of running the required/optional solution scripts had typos that have been corrected. The name of the scripts are calculon_required_solution and calculon_optional_solution.
Tue Dec 4 14:14:59 CST 2018
Points for automated tests for the final submission have been adjusted: the tests associated with Problem 2 are worth 10 points while those associated with Problem 3 are only worth 5 points. This was a mistake in the original specification as there is much less to test associated with Problem 3's closures.
Sat Dec 1 17:56:50 CST 2018

Post 844 identified a minor problem with the test_calculon.sh script which will cause it to not print out a side-by-side diff for failed tests in some environments. The quick fix for this is to export the COLUMNS variable as in

> export COLUMNS

which will ensure that the script has access to this variable. Alternatively, download an updated version of test_calculon.sh.

Fri Nov 30 18:25:37 CST 2018
Milestone tests have been posted and linked to the to of the spec. Download them there. They only test for completion of Problem 1 which adds if/then/else to Calculon. Instructions for running the tests have been added to the Milestone Tests Section.
Thu Nov 29 10:53:14 CST 2018
Post 836 identified a typo in the instructions for lexing comparisons: the correct token name for > is GreatThan rather than GreaterThan; the former appears in the codebase for Calculon. In retrospect, GreatThan sounds stupid but it's too late to change now.

1 Overview

Much can be learned about programming and programming languages by building them from the ground up. The purpose of this project is to do that just that: construct a small, interactive language interpreter dubbed Calculon. The language understood by the interpreter is an amalgam of constructs from other languages mainly chosen for ease of implementation. The features are demonstrated in the Demo Session later.

Calculon will compile out of the box via a make invocation. This will allow you to experiment with the interpreter and examine its basic internal structure.

> make
ocamlc -g -annot -c calclex.ml
ocamlc -g -annot -c calcparse.ml
ocamlc -g -annot -c calceval.ml
ocamlc -g -annot -o demo_parse calclex.cmo calcparse.cmo demo_parse.ml
ocamlc -g -annot -o demo_eval calclex.cmo calcparse.cmo calceval.cmo demo_eval.ml
ocamlc -g -annot -o calculon calclex.cmo calcparse.cmo calceval.cmo calculon.ml

> calculon
calculon> 1+2*3;
- : IntDat(7)
calculon> quit;

That was so terrible I think you gave me cancer!
> 

There are a number of features that must be implemented internally before they will work in the interpreter such as if/then/else and less/greater comparisons. These are the subject of the required problems.

To further assist with understanding what Calculon should do, reference implementations are provided on CSE Labs machines.

  • calculon_required_solution: script that links to an executable which implements all the required problems but none of the optional problems.
  • calculon_optional_solution: script that links to an executable which implements all the required problems and all optional problems.
  • make setup which ensures that the two scripts are executable.
> make setup                     # ensure permissions are correct
chmod u+x calculon_required_solution
chmod u+x calculon_optional_solution

> ./calculon_optional_solution   # run solution with all optional problems completed
Running kauffman's implementation
calculon> if true then 1 else 2; # doesn't work until Problem 1 is completed
- : IntDat(1)

calculon> source small.calc;     # must complete optional problem 2
Reading source file 'small.calc'
calculon> def one = 1;
one : IntDat(1)
calculon> def double = @n n*2;
double : Closure(n, <fun>)
calculon> 
End of file 'small.calc'

Use these versions to explore what can and cannot be done with Calculon.

1.1 Calculon Features and Problems

The interpreter is broken into 4 source files

File Purpose
calclex.ml Lexer / Tokenizer
calcparse.ml Parser to construct an expression tree
calceval.ml Evaluates an expression tree
calculon.ml Top-level interactive loop

Each of these must be edited to complete features of Calculon associated with required and optional problems.

Table 1: Calculon features with status of source files and correspondce to required problems (P#) and optional problems (O#).
Feature Lex Parse Eval Top Problem
Language Constructs          
Ident/Constants ok ok ok -- Complete
Add/Sub/Mul/Div ok ok ok -- Complete
let/in ok ok ok -- Complete
if/then/else P1 P1 P1 -- P1: Add conditionals
Less/Greater/Equal P2 P2 P2 -- P2: Add comparisons
Lambdas/Closures ok P3 P3 -- P3: Evaluate lambdas as closures
Function Application ok ok P4 -- P4: Evaluate function applications
Recursive let/in -- -- P5 -- P5: Modify let/in to be recursive
and/or booleans O1 O1 O1 -- O1: Optional Problem
Top-level Features          
Top-level recursive def -- -- -- ok Complete
Top-level built-ins -- -- -- ok Complete
Read source file -- -- -- O2 O2: Optional Problem
Save/Load binary file -- -- -- O3 O3: Optional Problem
Informative Exceptions -- -- -- O4 O4: Optional Problem
Expression Optimization -- -- O5 O5 O5: Optional Problem

The above table lists the features of the programming language along with the problems associated with completing them. Middle columns correspond to the above source files with symbols meaning the following:

Symbol Meaning
-- Source file is not relevant to the feature
ok No change is needed the sourc file
P# An Required Problem requires the file to be edited
O# An Optional problem requires the file to be edited

1.2 Assignment Files and Download

Download the code pack linked at the top of the page. Unzip this which will create a project folder. Create new files in this folder. Ultimately you will re-zip this folder to submit it.

NOTE: Tests Released Later: The tests for this assignment will not be released with the publication of the specification. They will be released sometime after once they have been completed. Further instructions on how to use them will be made available on release.

File State Prob# Notes
Makefile Provided All Build file to compile all programs
calclex.ml Edit All Lexer / Tokenizer
calcparse.ml Edit All Parser to construct an expression tree
calceval.ml Edit All Evaluates an expression tree
calculon.ml Edit All Top-level interactive loop
demo_parse.ml Provided - Stand-alone utility for testing parsing
demo_eval.ml Provided - Stand-alone utility for testing evaluation
small.calc Provided - Small input file with Calculon source code.
optim.calc Provided - Larger input file with expression optimization opporutnities

Automated Testing Instructions

Unzip a5-required-tests.zip to create the directory a5-required-tests/. Copy all files in the a5-required-tests directory to the project directory a5-code. The following unix command should do this if a5-required-tests and a5-code are in the same directory.

cp -r a5-required-tests/* a5-code/

The Makefile provided is a copy of Makefile.withtests. The original version is called Makefile.notests.

The new Makefile introduces the targets

  • make test-p1 : test problem 1
  • make test-p2 : test problem 2
  • make test-p3 : test problem 3
  • make test-p4 : test problem 4
  • make test-p5 : test problem 5
  • make test : test all problems

Single tests can be run using the test_calculon.sh script with a data input file appropriate to an individual problem and the test number to run. Below is an example.

> ./test_calculon.sh test_p2_data.sh 3
Setting COLUMNS based on stty
COLUMNS is 113
Loading tests from test_p2_data.sh
10 tests loaded
Running single TEST 3

RUNNING TESTS
TEST  3 tokens-comp-vars   : OK

=====================================
OVERALL:
 1 /  1 tests correct

1.3 Milestone Tests   grading 10

There is an midpoint submission deadline worth 10% of the credit for the assignment. Submit working code at this point that compiles and passes the tests associated with Problem 1 to receive 10% of the credit for the project. This is to motivate getting an early start on this rather large assignment.

To run the milestone tests download the zip of milestone tests and run the following commands.

> unzip a5-milestone-tetsts.zip        # unzip the contents 
> cp a5-milestone-tests/* a5-code/*    # copy test code into assignment directory
> cd a5-code/                          # change to project directory
> make test-p1                         # run the tests which only target problem 1
Weight Criteria
  MILESTONE TESTS
10 make test-p1 correctly compiles and passes tests
  No manual inspection will be done during milestone evaluation

1.4 General Grading Criteria   grading 5

During final manual inspection the following will be examined for credit.

Weight Criteria
5 CORRECT SUBMISSION and STYLE
  All files submitted and properly zipped
  make test compiles all code and runs tests
  Code is indented reasonably well to indicate scopes
  Comments provide insight for intricate blocks

1.5 Demo Session

Some of Calculon's features are demonstrated below.

> calculon
calculon> help;
CALCULON: a calculator/interpreter with an unholy... ACTING TALENT!

--SUPPORTED SYNTAX--
  <expr> ;                          : expression evaluation terminated by semicolon 
  def x = <expr> ; in <expr>        : top-level variable bindings
  1 + 2*9 / (4-2)                   : basic arithmetic operators
  let x=<expr> in <expr> ;          : let bindings for locals
  if <expr> then <expr> else <expr> : conditional execution
  @n n*2+4                          : lambda abstraction with parameter n
  def double = @n n*2;              : top-level function binding
  double 4;                         : function application
  def mul = @a @b a*b;              : nested lambdas for multi-arg function

--BUILTIN COMMANDS--
  help;                             : print this help message
  quit;                             : quit calculon
  whos;                             : show all top-level bindings and their value types
  undef <ident>;                    : remove a global binding
  debug {true|false};               : enable/disabe debug printing of backtraces
  tokens <expr>;                    : show the results of lexing the given <expr>
  parsetree <expr>;                 : show the results of parsing the given <expr>
  show <expr>;                      : show the structure of the evaluated data in <expr> 

--OPTIONAL COMMANDS--
  source <filename>;                : read command from <filename> as if typed
  save <filename>;                  : save all definitions as binary data in <filename>
  load <filename>;                  : load binary variable defs from <filename>
  opparsetree <expr>;               : show the results of parsing AND optimizing the given <expr>
  optimize {true|false};            : enable/disabe optimization of expressions in normal evaluation

calculon> 1+2+3*4/2;            # basic math
- : IntDat(9)

calculon> let a=2 in let b=3 in a*b; # let bindings
- : IntDat(6)

calculon> def x = 7;            # top-level definitions
x : IntDat(7)

calculon> def nope = false;     # booleans
nope : BoolDat(false)

calculon> def double = @n 2*n;  # function definitions via @ lambdas
double : Closure(n, <fun>)

calculon> double 2;             # function call
- : IntDat(4)

calculon> double 7;
- : IntDat(14)

calculon> whos;                 # listing of top-level defs
    double : Closure(n, <fun>)
      nope : BoolDat(false)
         x : IntDat(7)

calculon> def mul = @a @b a*b;  # functions of multiple params via nested lambdas
mul : Closure(a, <fun>)

calculon> mul 3 7;
- : IntDat(21)

calculon> def mul_two = mul 2;  # partial application for curried function
mul_two : Closure(b, <fun>)

calculon> mul_two 5;            # completed application
- : IntDat(10)

                                # recursive definition at top-level
calculon> def fact = @n if n=1 then 1 else n*(fact (n-1));
fact : Closure(n, <fun>)

calculon> fact 5;               # recursive function call
- : IntDat(120)

calculon> tokens mul 3 7;       # show tokenization of expression
Tokens:
[Ident(mul); IntTok(3); IntTok(7); Semicolon ]

calculon> parsetree mul 3 7;    # show parsetree of expression
Parse tree:
Apply
  .func_expr:
    Apply
      .func_expr:
        Varname(mul)
      .param_expr:
        IntExp(3)
  .param_expr:
    IntExp(7)

calculon> show mul;             # show structure of data
Closure(param_name: a, varmap={...} code=
Lambda( b )
  Mul
    Varname(a)
    Varname(b)
)
calculon> quit;                 # Noooooooooo!

That was so terrible I think you gave me cancer!
> 

2 Problem 1: Adding Conditionals

Calculon should support simple conditional expressions of the familiar form

if <bool> then
  <expr>
else
  <expr>

This will allow conditional execution like the following examples.

calculon> if true then 10 else 5;
- : IntDat(10)

calculon> if false then 10 else 5;
- : IntDat(5)

calculon> if true then
            if false
              then 1
              else 2
          else 3;
- : IntDat(2)

While the use of Boolean constants may seem a bit limiting, the next problem adds integer comparisons such as x < y which will greatly increase the power of conditionals.

Currently there are tokens and expression types present for conditionals but there is no lexer, parser, or evaluator support. Completing this problem will allow the above expressions to evaluate as indicated.

2.1 Lexing Conditionals

  1. The starting state of Calculon does not recognize conditional keywords like if so will treat them as normal identifiers. This leads to odd errors in which if is misinterpreted as a function being applied to several arguments with errors like:

       calculon> if true then 1 else 5;
       ERROR: I'm not familiar with the type of thing I'm seeing...
       Calceval.EvalError("Function application is not yet implemented", "{}", "\nApply...
    

    This can be seen using the tokens command which will list if as an identifier rather than its own token

       calculon> tokens if true then 8*2 else 4*3;
       Tokens:
       [Ident(if); BoolTok(true); Ident(then); IntTok(8); Times; IntTok(2); Ident(else); IntTok(4); Times; IntTok(3); 
        Semicolon ]
    

    Once the lexer/parser create conditionals, this error will change.

  2. In calclex.ml, edit the main lexing function lex_string so that it recognizes keyword associated with conditionals (if etc.). Examine the token type at the top of the file and return appropriate tokens for each keyword.

    Recompile Calculon and check the results of lexing by using the tokens command which should yield results like the below.

       calculon> tokens if true then 8*2 else 4*3;
       Tokens:
       [If; BoolTok(true); Then; IntTok(8); Times; IntTok(2); Else ; IntTok(4); Times; IntTok(3); 
        Semicolon ]
    

    This shows that if is no the token If rather than an identifier. which is a good start.

2.2 Parsing Conditionals

  1. In calcparse.ml, edit the function parse_cond which is meant to handle conditionals. The correct sequence for a conditional is

       if <expr> then <expr> else <expr>
    

    That means the a series of nested checks via match/with looking for the corresponding tokens. In each case, an arbitrary expression can appear in place of <expr> such as the following:

       calculon> if  let x=true in x  then 1 else 2;
       - : IntDat(1)
    
       calculon> if  if true then false else true  then 1 else 2;
       - : IntDat(2)
    

    Thus you will need to employ the top-level parsing function parse_expr to parse a whole expression between each of the keyword tokens.

    In the expr type defined at the top of calcparse.ml, the Cond variant is associated with conditional execution. The fields if_expr then_expr else_expr should be assigned the three associated expressions listed in the form above.

    While working, employ the parsetree command to show the parse tree of a given expression. For example, the following is a correct parse tree:

       calculon> parsetree if true then 8*2 else 4-3;
       Parse tree:
       Cond
         .if_expr:
           BoolExp(true)
         .then_expr:
           Mul
             IntExp(8)
             IntExp(2)
         .else_expr:
           Sub
             IntExp(4)
             IntExp(3)
    
  2. Note that conditionals require all of if/then/else in Calculon so leaving any part off should lead to errors as shown below. Note that the error messages are required for missing then and missing else cases.

       calculon> if true then 1 else 2;
       - : IntDat(1)
    
       calculon> if true then 1 else;
       ERROR: I'm not familiar with the type of thing I'm seeing...
       Calcparse.ParseError("syntax error", _)
    
       calculon> if true then 1;
       ERROR: I'm not familiar with the type of thing I'm seeing...
       Calcparse.ParseError("Expected 'else' ", _)
    
       calculon> if true then;
       ERROR: I'm not familiar with the type of thing I'm seeing...
       Calcparse.ParseError("syntax error", _)
    
       calculon> if true;
       ERROR: I'm not familiar with the type of thing I'm seeing...
       Calcparse.ParseError("Expected 'then' ", _)
    

2.3 Evaluating Conditionals

  1. In calceval.ml, locate the eval_expr function and the case associated with Cond. You will need to make use of the fields like if_expr in the record which are defined at the top of calcparse.ml. Remove the exception raise and start coding.
  2. Conditional execution is simple: evaluate the expression associated with the if part, sometimes called the test. If it is true, then evaluate only the then expression, otherwise evaluate the else expression only. The structure of your code will flow according to the above.
  3. Evaluating nested expressions should be done recursively via calls to eval_expr. This function always returns a data_t type which is defined at the top of the calceval.ml file. It represents the three kinds of data that exist in Calculon: IntDat for integers, BoolDat for booleans, and Closure for functions of one parameter.

    On evaluating the result of an expression, pattern match the results against these three types. Often one is interested in only one kind of data as the others are in error.

    For conditionals, the results of evaluating if_expr must be a BoolDat which carries a true (eval then_expr) or false (eval else_expr). Other kinds like IntDat should generate errors with messages like the following where the utility data_string has been used to stringify the erroneous data.

       calculon> if 1 then 2 else 3;
       ERROR: I'm not familiar with the type of thing I'm seeing...
       Calceval.EvalError("Expected Bool for if(), found 'IntDat(1)'", ...
       
       calculon> if 8+7 then 2 else 3;
       ERROR: I'm not familiar with the type of thing I'm seeing...
       Calceval.EvalError("Expected Bool for if(), found 'IntDat(15)'", ...
       
       calculon> def x = 21;
       x : IntDat(21)
       calculon> if x then 1 else 2;
       ERROR: I'm not familiar with the type of thing I'm seeing...
       Calceval.EvalError("Expected Bool for if(), found 'IntDat(21)'", "{x: IntDat(21)}", ...
       
       # Only works once lambdas can be evaluated
       calculon> if (@n n*2) then 2 else 3;
       ERROR: I'm not familiar with the type of thing I'm seeing...
       Calceval.EvalError("Expected Bool for if(), found 'Closure(n, <fun>)'", ...
    
  4. Once evaluation code is complete, Calculon should handle if/then/else constructs like any of its other constructs.

2.4 Grading Criteria for P1: Adding Conditionals   grading 25

The following criteria will be checked in this problem.

Weight Criteria
  AUTOMATED TESTS for conditionals
10 make test-p1 correctly compiles and passes tests
  MANUAL INSPECTION for conditionals
5 calclex.ml
  Keywords if/then/else properly added into keyword recognition.
   
5 calcparse.ml
  Proper match/with against If token to recognize conditional
  Use of parse_expr to parse a whole expression associated with if
  Further match/with to look for subsequent Then/Else tokens
  Proper creation of a Cond record
  Exceptions thrown associated with missing then/else
   
5 calceval.ml
  Recursive evaluation of if_expr to determine true/false test
  match/with used to check for BoolDat result
  Evaluation of ONLY then_expr or else_expr based on test
  Raises an evaluation error if test does not give a BoolDat

3 Problem 2: Adding Comparisons

Calculon should support integer comparison of the following three types:

x < y   # less than
x > y   # greater than
x = y   # equal to

Each of these operates on two integers and produces a boolean result. This result can be used with conditionals to get more dynamic behavior.

calculon> def x = 5;
x : IntDat(5)
calculon> def y = 7;
y : IntDat(7)
calculon> if x > y then x else y;
- : IntDat(7)
calculon> def x = 10;
x : IntDat(10)
calculon> if x > y then x else y;
- : IntDat(10)

Like conditionals, there are token and expression types established for comparisons but no lexer, parser, or evaluation support for the construct. Completing this problem will add comparisons to Calculon.

3.1 Lexing Comparisons

  1. In the provided version of Calculon, the symbols < and > are not recognized by the lexer while the = sign is but not in comparisons. This leads to the following two kinds of errors.

       calculon> 5 < 7;
       ERROR: I'm not familiar with the type of thing I'm seeing...
       Calclex.LexError("char '<' not recognized, at position 2 in the input '5 < 7;'")
       
       calculon> 5 = 7;
       - : IntDat(5)
       WARNING: Great Shatner's ghost!
       WARNING: The following tokens remain and will NOT be evaluated
       [Equal; IntTok(7); Semicolon ]
    
  2. In calclex.ml, add support for the symbols < and > at an appropriate location. Generate token types LessThan and GreatThan when these symbols are recognized. No action is needed for the = which is already tokenized.
  3. Use the tokens command to check that code changes result in proper tokenization.

       calculon> tokens 1 < 10+3;
       Tokens:
       [IntTok(1); LessThan; IntTok(10); Plus; IntTok(3); Semicolon ]
       calculon> tokens 1 > 10+3;
       Tokens:
       [IntTok(1); GreatThan; IntTok(10); Plus; IntTok(3); Semicolon ]
    

3.2 Parsing Comparisons

  1. Comparisons are among the lowest precedence operations in most programming environments and Calculon follows this convention. This allows arithmetic to happen before comparisons are evaluated such as in the following.

       calculon> 5*2 > 3+4;
       - : BoolDat(true)
       calculon> 5*2 = 3+4+3;
       - : BoolDat(true)
       calculon> 5*2 = 3+4+2;
       - : BoolDat(false)
       calculon> 1 < 10*10+2;
       - : BoolDat(true)
    
  2. Due to the low-precedence of comparisons, a new production (parsing function) will need to be introduced in the parser that precedes all other productions save for the top-level parse_expr parsing entry point.
  3. In calcparse.ml, create a parsing function parse_compare between parse_expr and parse_addsub. Though this function appears visually between them, it will only be part of the parsing recursive descent if other function call it. Modify parse_expr to call parse_compare and plan to make calls to parse_addsub in parse_compare.
  4. parse_compare should look for a sequence like one of the following:

       <expr>  <  <expr>
       <expr>  >  <expr>
       <expr>  =  <expr>
    

    <expr> here are generated by deeper recursive calls starting with low-precedence arithmetic operations like addition and subtraction.

  5. Comparison has a simpler grammar than arithmetic like + as there is no chaining of comparisons.

       1 + 2 + 3   # allowed in Calculon
       1 < 2 < 3   # not allowed in Calculon
    

    Thus the structure of the code for parsing comparisons can be simpler than what is used for arithmetic operators: no iteration is needed.

  6. Parse a left-hand expression, then use match/with to look for appropriate tokens to indicate a comparisons is being done. If so, parse a right-hand expression. The left and right expressions being compared should be saved along with the operation and result in an Intop variant of expr. Examine how these are created in the arithmetic parsing to understand what is required. Like arithmetic operations, if no appropriate symbol is found, then comparisons should simply return a left-hand expression.
  7. As before, use the parsetree command to ensure code changes result in an appropriate parse tree. Ensure that the comparisons happen last, after other arithmetic operations.

       calculon> parsetree 10 > 10+3;
       Parse tree:
       Greater
         IntExp(10)
         Add
           IntExp(10)
           IntExp(3)
       
       calculon> parsetree 5*2+1 = 10+3;
       Parse tree:
       Equal
         Add
           Mul
             IntExp(5)
             IntExp(2)
           IntExp(1)
         Add
           IntExp(10)
           IntExp(3)
    

3.3 Evaluating Comparisons

  1. In calceval.ml, identify where comparisons should be handled in eval_expr. Like arithmetic, comparisons are an Intop operations. Add cases to an appropriate match/with statement to handle them.
  2. Unlike other arithmetic operations which generate IntDat, comparison result in BoolDat with true/false values associated. Make sure in to adhere to this convention.
  3. On completing the evaluation function, try out some expressions involving comparisons to ensure that they work properly.

       calculon> 5*2+1 = 10+3;
       - : BoolDat(false)
       calculon> 5*2+1 = 10+1;
       - : BoolDat(true)
       calculon> 5*2+1 < 10+4;
       - : BoolDat(true)
       calculon> def x = 2;
       x : IntDat(2)
       calculon> if x < 3 then 5 else 10;
       - : IntDat(5)
       calculon> let y=4 in if y < x then 1 else 2;
       - : IntDat(2)
    
  4. Ensure that the warnings or errors result if repeated comparison is attempted as in the following.

       calculon> 1 < 2 < 3;
       - : BoolDat(true)
       WARNING: Great Shatner's ghost!
       WARNING: The following tokens remain and will NOT be evaluated
       [LessThan; IntTok(3); Semicolon ]
       
       calculon> if 1 < 2 < 3 then 10 else 20;
       ERROR: I'm not familiar with the type of thing I'm seeing...
       Calcparse.ParseError("Expected 'then' ", _)
    

3.4 Grading Criteria for P2: Adding Comparisons   grading 25

The following criteria will be checked in this problem.

Weight Criteria
  AUTOMATED TESTS for adding comparisons
10 make test-p2 correctly compiles and passes tests
  MANUAL INSPECTION for adding comparisons
   
5 calclex.ml
  Added lexing spport for < and > at appropriate, single-character location
   
5 calcparse.ml
  New parse_compare function added and called by parse_eval
  parse_compare makes calls to arithmetic parsing functions.
  No iteration/recursion used which prevents repeated comparisons like 1 < 2 < 3
  Clear cases handling each of less than, greater than, equal to comparisons.
  Each case constructs an Intop expression with appropriate operator.
  Clear case for no matching comparison.
   
5 calceval.ml
  Correct case of eval_expr used to handle comparisons as a kind of Intop.
  Clear cases handling each of less than, greater than, equal.
  BoolDat data clearly produced as a result of the evaluation.
  Raise and evaluation error for comparison of non-Integer kinds.

4 Problem 3: Lambda Evaluation

We have discussed the notion of lambda expressions and closures previously in the class and this problem will make that discussion concrete. In Calculon

  • A lambda expression is syntactic construct. It is the Lambda variant of the expr type which is created during parsing. The intent of the expression is to create a function but this does not happen until it is evaluated just as Add(2,3) does not become 5 until evaluated.
  • A closure is a data element. It is the Closure variant in the data_t type which represents the results of expressions. Being a datum, a closure can be assigned to variable names via def and let/in like integers and booleans allowing for named functions. Unlike integers and booleans, closures can be applied to an argument which will evaluate its associated expression. This function application is addressed in the next problem while the current problem allows for creation of closures.

Calculon supports lambda expressions via the @ symbol which is followed by a parameter name and expression to evaluate. Lambdas can be nested creating functions of multiple parameters though only the first parameter name is visible. The show command which shows the internal structure of data is useful for analyzing closures as show below.

calculon> @n n+1;
- : Closure(n, <fun>)

calculon> def inc = @n n+1;
inc : Closure(n, <fun>)

calculon> def mul = @first @second first*second;
mul : Closure(first, <fun>)

calculon> whos;
             inc : Closure(n, <fun>)
             mul : Closure(first, <fun>)

calculon> show inc;
Closure(param_name: n, varmap={...} code=
Add
  Varname(n)
  IntExp(1)
)

calculon> show mul;
Closure(param_name: first, varmap={...} code=
Lambda( second )
  Mul
    Varname(first)
    Varname(second)
)

4.1 Lexing/Parsing Lambdas

  1. The lexing for lambda expressions is already complete but it is worthwhile to examine the associated portions of code. The lexer simply needs to identify @ characters as a special At token.
  2. The parser will need to be modified to handle produce lambda expressions. In its initial state, errors will result from use of the @ symbol:

    calculon> parsetree @n n+1;
    ERROR: I'm not familiar with the type of thing I'm seeing...
    Calcparse.ParseError("syntax error", _)
    
  3. In calcparse.ml find the existing parse_lambda function which presently just falls through to the higher precedence function application.
  4. Add cases that look for the token sequence @ <ident> <expr> which is a correctly specified lambda expression. Note that the second token should be an Ident which carries a variable name.
  5. Construct a Lambda variant of expr using the appropriate data. The definition for this variant is at the top of calcparse.ml.
  6. Also match an error case which will help users identify mistakes in their use of the @ symbol. This would look for a sequence of tokens of the form @ _ where the second token is not an Ident (anything else). Raise a parse error in this case with the message below which indicates that @ should be followed by an identifier.
  7. Test your parsing with the parsetree command as demonstrated below.

    calculon> parsetree @n n+1;
    Parse tree:
    Lambda( n )
      Add
        Varname(n)
        IntExp(1)
    
    calculon> parsetree @first @second first*second;
    Parse tree:
    Lambda( first )
      Lambda( second )
        Mul
          Varname(first)
          Varname(second)
    
    calculon> parsetree @ 1+2*n;
    ERROR: I'm not familiar with the type of thing I'm seeing...
    Calcparse.ParseError("Expected identifier in lambda expression after @", _)
    

4.2 Evaluating Closures

  1. In calceval.ml, locate the case in eval_expr which deals with lambda expressions.
  2. Evaluating a lambda simply creates a closure. Thus, all that needs to happen is creation of a Closure variant of data_t. The correspondence between fields of Lambda and Closure should be obvious save for the varmap.
  3. Closures in calculon are lexically scoped which means that they retain all variable bindings present at their creation. For example, look closely at the varmap portion in the demo below.

    ##############################
    # EXAMPLE 1
    calculon> def x=2;                 # define x and y
    x : IntDat(2)
    
    calculon> def y=5;
    y : IntDat(5)
    
    calculon> def add_xy = @a a+x+y;   # define a function using x,y
    add_xy : Closure(a, <fun>)
    
    
    calculon> show add_xy;             # variable map includes x,y
    Closure(param_name: a, varmap={... x: IntDat(2), y: IntDat(5)} code=
    Add                                ^^^^^^^^^^^^^^^^^^^^^^^^^^
      Add
        Varname(a)
        Varname(x)
      Varname(y)
    )
    
    calculon> def x=199;               # redefine x and y
    x : IntDat(199)
    
    calculon> def y=255;
    y : IntDat(255)
    
    
    calculon> show add_xy;             # original x,y values are retained in map
    Closure(param_name: a, varmap={... x: IntDat(2), y: IntDat(5)} code=
    Add
      Add
        Varname(a)
        Varname(x)
      Varname(y)
    )
    
    ##############################
    # EXAMPLE 2
    calculon> def double = let two=2 in @n two*n;   # local var 'two' to closure
    double : Closure(n, <fun>)
    
    calculon> two;                                  # two is not defined at top level
    ERROR: I'm not familiar with the type of thing I'm seeing...
    Calceval.EvalError("No variable 'two' bound", ...
    
    calculon> show double;                          # double retains value for 'two'
    Closure(param_name: n, varmap={... two: IntDat(2), ...} code=
    Mul
      Varname(two)
      Varname(n)
    )
    
  4. To enable easy lexical closures, Calculon internally uses OCaml's persistent maps. At the top of calceval.ml, a module Varmap is defined along with the type varmap_t which maps string to data_t, variable names to variable values.
  5. To allow a closure to retain the environment in which it is created, construct it with the current varmap. Note that all calls to eval_expr are passed a varmap with the current bindings. Use this to construct the closure.
  6. Test your results using Calculon's show command which shows the internal structure associated with a data_t at the top level (see previous examples). Ensure that the varmap associated with each closure contains the bindings that were part of the lexical scope in which it was created.
  7. Note that the approach taken here is a bit lazy as ALL variables in the scope of a closure are retained rather than those that are strictly needed for it to work. A more hygenic implementation would analyze the closure code to determine the free variables that need to be retained from the global scope and only retain those for better memory efficiency. However, the present approach of retaining a copy of the map is favored due to ease of implementation.

4.3 Grading Criteria for P3: Adding Closures   grading 15

The following criteria will be checked in this problem.

Weight Criteria
  AUTOMATED TESTS for adding closures
5 make test-p3 correctly compiles and passes tests
  MANUAL INSPECTION for adding closures
5 calcparse.ml
  Clear case in parse_lambda identifying an @ <ident> token sequence
  Correct creation of a Lambda expression
  Inclusion of an error case for @ followed by not an Ident
   
5 calceval.ml
  Correct construction of Closure record
  Inclusion of a variable map to facilitate lexical scoping

5 Problem 4: Function Application

Function application in Calculon is identical to function application in OCaml: it is simply a sequence of expressions next to one another as in

<clos_expr> <arg1_expr> <arg2_expr> ... <argN_expr>

As the above indicates, the first expression should be a closure: other kinds of data such as integers cannot be applied, often a variable binding the function not always. Concretely, this looks likes the following:

calculon> (@n n+1) 5;                   # apply anonymous func
- : IntDat(6)

calculon> def double = @n 2*n;          # doubling function
double : Closure(n, <fun>)

calculon> double 2;                     # apply function
- : IntDat(4)

calculon> double (1+5);                 # apply function
- : IntDat(12)

calculon> def mul = @a @b a*b;          # func of 2 args
mul : Closure(a, <fun>)

calculon> mul 10 5;                     # apply function
- : IntDat(50)

calculon> mul 2 9;                      # apply function
- : IntDat(18)

calculon> def max =                     # max function
  @a @b
    if a > b
    then a
    else b;
max : Closure(a, <fun>)

calculon> max 10 5;                     # call on two args
- : IntDat(10)

calculon> max 2 9;                      # call on two args
- : IntDat(9)

In the case of a single argument as in <clos_expr> <arg_expr>, the implementation of function application an obvious 3-step process

  1. Evaluate the argument expression to get the argument data
  2. Add the argument data to the closure's varmap bound to its parameter name
  3. Evaluate the closure's code with the new variable map and return the resulting data.

What is trickier to understand is how functions of "multiple" arguments work. Recall that Lambda expressions have only a single parameter. Multi-argument functions are really nested Lambda expressions as in:

calculon> parsetree @a @b a*b;  # func of 2 args
Parse tree:
Lambda( a )                     # nested lambdas
  Lambda( b )
    Mul
      Varname(a)
      Varname(b)

Evaluating the first lambda creates a closure but the second lambda is remains un-evaluated.

calculon> def mul = @a @b a*b;               # func of 2 args
mul : Closure(a, <fun>)

calculon> show mul;
Closure(param_name: a, varmap={...} code=    # outer closure
Lambda( b )                                  # inner lambda
  Mul
    Varname(a)
    Varname(b)
)

On applying the closure, its first parameter becomes defined and its code is evaluated. In the case of a Lambda expression, this results in another closure.

calculon> def mul = @a @b a*b;  # func of 2 args
mul : Closure(a, <fun>)

calculon> parsetree mul 2;
Parse tree:
Apply                           # function application of ...
  .func_expr:
    Varname(mul)                # variable associated with closure
  .param_expr:
    IntExp(2)                   # argument

calculon> mul 2;                # apply func of 2 args to 1 arg
- : Closure(b, <fun>)           # gives a closure

This second closure may be immediately applied to more arguments are bound to a name for later use as a partially applied function.

calculon> mul 2 4;              # full application
- : IntDat(8)

calculon> def doub = mul 2;     # partial application
doub : Closure(b, <fun>)

calculon> show doub;            # closure needs a param
Closure(param_name: b, varmap={a: IntDat(2), ...} code=
Mul                            ^^^^^^ a is rememberd though
  Varname(a)
  Varname(b)
)

calculon> doub 4;               # apply partial function
- : IntDat(8)

calculon> doub 9;               # apply partial function
- : IntDat(18)

Thus, functions in Calculon are curried: they can be partially applied.

Note that errors will result when applying a closure to too many arguments as the resulting data will no longer be a closure. This is the same error that results from trying to use data such as integers as a function.

calculon> def mul = @a @b a*b;  # func of 2 args
mul : Closure(a, <fun>)

calculon> mul 2;                # partial application
- : Closure(b, <fun>)

calculon> mul 2 4;              # full application
- : IntDat(8)

calculon> mul 2 4 6;            # too many arguments
ERROR: I'm not familiar with the type of thing I'm seeing...
Calceval.EvalError("Expected Closure for application, found 'IntDat(8)'", ...

calculon> 8 6;                  # same error
ERROR: I'm not familiar with the type of thing I'm seeing...
Calceval.EvalError("Expected Closure for application, found 'IntDat(8)'", ...

5.1 Lexing/Parsing Function Application

  1. Lexing is complete for function application: there is no special syntax associated with it.
  2. Parsing is provided for function application and requires no modification. In calcparse.ml, function application is a high-precedence operation which looks for two adjacent expressions. This creates an Apply variant of expr. This is done in a left-associative fashion as arithmetic operators are parsed so that applications to multiple arguments

    clos arg1 arg2 ... argN
    
    

    are parsed as

    ((((clos arg1) arg2) ...) argN)
    
    

    This allows the results of the first application to create another closure which is applied to the second argument and so forth.

5.2 Evaluating Function Application

  1. In calceval.ml, locate where Apply expressions are handled and begin editing.
  2. Evaluate the func_expr associated with the application. In many cases this will be a simple variable name but be an arbitrarily complex expression:

    doub (2+1)        # simple func_expr
    (@m 2*m) (2+1)    # more complex func_expr
    
  3. The resulting data produce by the func_expr should be a Closure. Do pattern matching to ensure this is the case. If it is a closure, evaluate the parameter expression associated with the Apply.
  4. Refresh your understanding of the fields in the Closure variant. Access the varmap associated with the closure, and add a binding between its parameter name and the parameter data computed in the previous step. Importantly do not use the varmap available in eval_expr as this may have different bindings than the varmap of the closure.
  5. Finally, recursively evaluate the closures code in the presence of the varmap created in the previous step and return the results.
  6. If the function expression is NOT a closure, application cannot proceed so raise an evaluation error. With a message like the one below. Use the data_string function to stringify the data that was not a closure.

    calculon> 8 6;                  
    ERROR: I'm not familiar with the type of thing I'm seeing...
    Calceval.EvalError("Expected Closure for application, found 'IntDat(8)'", ...
    
    calculon> true 9 5;
    ERROR: I'm not familiar with the type of thing I'm seeing...
    Calceval.EvalError("Expected Closure for application, found 'BoolDat(true)'", ...
    
  7. Test out function application evaluation after recompiling using the examples below. Note the use of the global undef <ident> command to remove a binding from the top level. This is useful to ensure that closures properly retain their lexical scope and are evaluated in that scope.

    ##############################
    # EXAMPLE 1
    calculon> def double = @n 2*n;        # single arg function
    double : Closure(n, <fun>)
    
    calculon> double 6;                   # application
    - : IntDat(12)
    
    ##############################
    # EXAMPLE 2
    calculon> def mulall = @a @b @c a*b*c; # multiple args
    mulall : Closure(a, <fun>)
    
    calculon> mulall 1 2 3;               # full application
    - : IntDat(6)
    
    calculon> def mullast = mulall 1 2;   # partial application
    mullast : Closure(c, <fun>)
    
    calculon> mullast 3;                  # completed application
    - : IntDat(6)
    
    calculon> mullast 4;                  # completed application
    - : IntDat(8)
    
    ##############################
    # EXAMPLE 3
    calculon> def x = 10;                 # test lexical scope
    x : IntDat(10)
    
    calculon> def addx = @a a+x;          # uses x
    addx : Closure(a, <fun>)
    
    calculon> addx 5;                     # works fine
    - : IntDat(15)
    
    calculon> undef x;                    # undefine x
    calculon> x;
    ERROR: I'm not familiar with the type of thing I'm seeing...
    Calceval.EvalError("No variable 'x' bound", ...
    
    calculon> addx 5;                     # function still works
    - : IntDat(15)
    
    calculon> def x=40;                   # redefine x
    x : IntDat(40)
    
    calculon> addx 5;                     # original x=10 used
    - : IntDat(15)
    

5.3 Grading Criteria for P4: Function Application   grading 10

The following criteria will be checked in this problem.

Weight Criteria
  AUTOMATED TESTS for function application
5 make test-p4 correctly compiles and passes tests
  MANUAL INSPECTION for function application
5 calceval.ml
  Recursive evaluation of the function expression
  Use of match/with to determine if the result was a closure or not
  Evaluation of the argument expression
  Binding of argument data to parameter name in closure's varmap
  Evaluation of closure code with the new varmap
  Clear error case handling non-closure application

6 Problem 5: Recursive Let/In

Top-level functions in Calculon can be defined recursively. There is no special syntax associated with recursion for top level def statements which is less like OCaml's special let rec... and more like the conventions of C/Java/Python/Scheme/every other language.

Examples of top-level recursive functions in Calculon are given below.

calculon> def fact =
@n 
  if n=1 then
    1 
  else 
    n*(fact (n-1));
fact : Closure(n, <fun>)

calculon> fact 3;
- : IntDat(6)

calculon> fact 4;
- : IntDat(24)

calculon> fact 5;
- : IntDat(120)

calculon> def pow = 
@base @exp 
  if exp=0 then 
    1 
  else 
    let res=pow base (exp-1) in 
    base*res;
pow : Closure(base, <fun>)

calculon> pow 2 4;
- : IntDat(16)

calculon> pow 2 5;
- : IntDat(32)

calculon> pow 3 5;
- : IntDat(243)

This is quite handy to as it enables Calculon to compute most of what is computable, though not all.

The provided version of Calculon already has recursive top-level statements. No modifications are required to get them working though the previous problems must be completed for the above code to work.

There are some situations in which recursive helper functions are useful such as the following.

calculon> def sum_facts =       # sums a sequence of factorials
  let fact =                    # local function fact
    @n 
      if n=1
      then 1 
      else n*(fact (n-1)) 
  in 
  @i @stop                      # main body of sum_facts
    if i > stop
    then 0 
    else let facti = fact i in
         let rest = sum_facts (i+1) stop in
         facti + rest;

sum_facts : Closure(i, <fun>)

calculon> sum_facts 10 2;
- : IntDat(0)

calculon> sum_facts 1 3;
- : IntDat(9)

calculon> sum_facts 1 5;
- : IntDat(153)

calculon> sum_facts 5 10;
- : IntDat(4037880)

While a bit contrived as a top-level definition for fact would suffice, the let/in binding for the local fact definition allows nested recursive functions to be defined. Recursive functions defined via let/in is not present in Calculon by default and is the subject of this problem.

6.1 Adding Recursive Let

  1. Examine calculon.ml and locate the code responsible for setting up a top-level def binding. Study this code carefully as it will provide a template for how to implement recursive let bindings.
  2. In calceval.ml, locate the where eval_expr handles Letin expressions. This code is relatively simple.
  3. Recursive let bindings only matters for functions in Calculon. That means one only needs to look for closures. After the expression associated with the let is evaluated, check it to see whether it is a closure using match/with.
  4. If not a closure, nothing needs to change. However, closures should have their varmap modified to include the variable name which is being let-bound. This will allow them to refer back to their own function name in the event that recursion is required.
  5. Be careful to directly modify the closure's map and NOT the varmap that the let binding is altering as these may have different bindings in them.
  6. When finished, recompile and check your results with either the example above or the simple one below.
# ensure there are no alternative facts
calculon> undef fact;  

# recursive let binding
calculon> let fact = @n if n=1 then 1 else n*(fact (n-1))  in fact 5;   
- : IntDat(120)

6.2 Grading Criteria for P5: Recursive Let   grading 10

The following criteria will be checked in this problem.

Weight Criteria
  AUTOMATED TESTS for recursive Let
5 make test-p5 correctly compiles and passes tests
  MANUAL INSPECTION for recursive let
5 calceval.ml
  In Letin evaluation, use of match/with to look for a closure result
  Mutation of to closure's varmap present
  Binding of varname added to closures map to allow recursion

7 Optional Problems / Makeup Credit

The following problems are optional and can be completed. Any additional points earned here will contribute the overall score on Assignments in the course. This will make up for credit lost on previous assignments but will not cause totals exceed 100% on assignments. With respect to grading, keep in mind the following:

  • This assignment is worth 2X the credit of other assignments
  • There are 50 total points of optional problems
  • Completing all optional problems would make up for entirely missing an assignment

Note that the instructions for the optional problems are less thorough than for the required problems by intention as solving them will rely on your ability to work out solutions on your own.

7.1 OptProb 1: Boolean Operations

While not strictly necessary, boolean operators like and and or allow conditions to be stated somewhat more succinctly than would otherwise be possible. This problem implements these two boolean operators in Calculon. This allows code like the following to execute.

calculon> true and false;
- : BoolDat(false)

calculon> true or false;
- : BoolDat(true)

calculon> if 1<2 and 2<3 then 10 else 20;
- : IntDat(10)

calculon> def between = @lo @hi @x  (lo < x) and (x < hi);
between : Closure(lo, <fun>)

calculon> between 1 3 2;
- : BoolDat(true)

calculon> between 1 3 5;
- : BoolDat(false)

calculon> between 1 3 0;
- : BoolDat(false)

Boolean operations are lower precedence than arithmetic and lower precedence than comparison. This allows combinations of these other operations to be done prior to boolean operations a shown below.

calculon> parsetree 1+2>4 or 2+3>4;
Parse tree:
Or
  Greater
    Add
      IntExp(1)
      IntExp(2)
    IntExp(4)
  Greater
    Add
      IntExp(2)
      IntExp(3)
    IntExp(4)

calculon> 1+2>4 or 2+3>4;
- : BoolDat(true)

One important convention that Calculon follows is that most languages implement and as a higher precedence operator than or which allows the following interpretations.

calculon> def x=5;
x : IntDat(5)

calculon> 1<x and x<3 or 3<x and x<7;       # eval 'and' before 'or' 
- : BoolDat(true)

calculon> (1<x and x<3) or (3<x and x<7);   # explicit parens for the above
- : BoolDat(true)

calculon> parsetree 1<x and x<3 or 3<x and x<7;
Parse tree:
Or
  And
    Less
      IntExp(1)
      Varname(x)
    Less
      Varname(x)
      IntExp(3)
  And
    Less
      IntExp(3)
      Varname(x)
    Less
      Varname(x)
      IntExp(7)

Implementing Boolean Operations

  1. Modify the calclex.ml to recognize keywords and and or and generate appropriate tokens.
  2. Modify calcparse.ml to create parsing functions for and/or. Since these are at different precedence levels, write separate parse functions for each called parse_or and parse_and. Model these functions after arithmetic parsing functions like parse_addsub as chains of left-associative and/or's are allowed.
  3. Ensure the new parse functions are lower precedence than all other productions. Make sure to call them from the top-level parsing entry point parse_expr.
  4. Boolean operations should create Boolop variants of expr which are similar to Intop expressions for integers.
  5. In calceval.ml, implement evaluation of Boolop variants in the appropriate case of eval_expr. Model this after how Intop expressions are evaluated.

Grading Criteria for OptProb 1: Boolean Operations   grading 10

Weight Criteria
  AUTOMATED TESTS for boolean operations
5 make test-op1 correctly compiles and passes tests
5 MANUAL INSPECTION for boolean operations
  calclex.ml
  Addition of and and or to keywords, tokens produced
   
  calcparse.ml
  Addition of parse_or and parse_and parsing functions
  Clearly two functions which give and/or different precedence levels
  Creation of Boolop expressions associated with and/or
   
  calceval.ml
  Evaluation of Boolop which combine BoolDats to produce a BoolDat
  Error checking to ensure only BoolDat data are combined

7.2 OptProb 2: Evaluate Source Files

A hand feature of OCaml's REPL is the #use "source.ml" directive which will read OCaml code from a source file and evaluate it in the REPL. Calculon benefits from a similar functionality if the source command is implemented.

> cat small.calc                # unix command line
def one = 1;                    # contents of file small.calc
def double = @n n*2;

> calculon                      # start calculon

calculon> source small.calc;    # read from source file
Reading source file 'small.calc'

calculon> def one = 1;          # from file
one : IntDat(1)

calculon> def double = @n n*2;  # from file
double : Closure(n, <fun>)

calculon> 
End of file 'small.calc'        # end of file

calculon> whos;                 # show definitions
    double : Closure(n, <fun>)
       one : IntDat(1)

calculon> double one;           # call loaded function
- : IntDat(2)

calculon> double 3;
- : IntDat(6)

This allows source files to be created in text editors and loaded into Calculon.

Variable bindings in a Calculon session may be lost if a source file is loaded which redefines them:

calculon> def one = 999;        # define variable 'one'
one : IntDat(999)

calculon> source small.calc;    # source a file
Reading source file 'small.calc'

calculon> def one = 1;          # re-def of 'one'
one : IntDat(1)

...
End of file 'small.calc'
calculon> whos;
    double : Closure(n, <fun>)
       one : IntDat(1)          # one redefined
calculon> one;
- : IntDat(1)

Implementing the source command

  1. This is a top-level command so open calculon.ml where other top-level commands like whos and parsetree are found.
  2. Analyze the main interactive loop for Calculon which is near the bottom of calculon.ml. This loop repeatedly calls a sequence of functions which read input from the stdin channel (keyboard typing) and process statements associated with it. It is a good model for what needs to be done save that input should come from a file source instead of stdin.
  3. Examine other top-level commands and establish a case which corresponds to a source command. Copy relevant code from the main loop. Note that an Ident carrying the filename should come after the source command.
  4. Study the OCaml manual, previous projects, and internet sources to determine how to open an input file in OCaml. Open the named file and use it the source for functions that process statements.
  5. As in the main loop, when the end of file is reached, an exception is raised which needs to be caught. In this case it indicates there is no more input available in the file and control should be returned to the main loop. Make sure to close the open file.
  6. As the file is opened and closed, print messages associated with this to indicate to a user what is being automatically read in.

    calculon> source small.calc;    # source a file
    Reading source file 'small.calc'
    
    ...
    
    End of file 'small.calc'
    calculon> whos;                 # back to interactive
    

Grading Criteria for OptProb 2: Evaluate Source Files   grading 10

Weight Criteria
  AUTOMATED TESTS for Evaluate Source Files
5 make test-op2 correctly compiles and passes tests
5 MANUAL INSPECTION for Evaluate Source Files
  calculon.ml
  Addition of top-level source command in appropriate case.
  Use of similar to processing loop to main Calculon loop
  Use of existing statement processing functions
  Clear opening and closing of an input file and use as input
  Messages for begining/ending of the input file

7.3 OptProb 3: Save/Load Sessions

Another common feature of REPL's is the ability to save and restore sessions. Calculon can support this with the save <file> and load <file> commands as follows.

> calculon                      # start calculon

calculon> def pi=314;           # define a few variables
pi : IntDat(314)

calculon> def mul = @a @b a*b;
mul : Closure(a, <fun>)

calculon> whos;
       mul : Closure(a, <fun>)
        pi : IntDat(314)

calculon> save mydefs.dat;      # save defs to a binary file
Saved binary data in 'mydefs.dat'

calculon> quit;

That was so terrible I think you gave me cancer!

> calculon                      # restart calculon

calculon> whos;                 # no defs initially

calculon> load mydefs.dat;      # load binary data
Loading binary data from 'mydefs.dat'

calculon> whos;                 # defs are back
       mul : Closure(a, <fun>)
        pi : IntDat(314)

calculon> mul pi 2;
- : IntDat(628)

In Calculon, saving a session simply amounts to writing the global variable map defined in calculon.ml to disk while loading involves reading the data back and merging it with the current variable map.

Writing and reading a complex data structure like a varmap_t is somewhat difficult if users were required to do it by hand. Fortunately, like many sensible languages, OCaml supports marshalling of data, sometimes known as serialization. This is the process of writing the in-memory binary representation of a data structure to disk which allows it to be read back directly into memory.

The module Marshal provides this serialization of any data structure in OCaml including user-defined data. No special knowledge of the data structure is needed to marshal it. Knowledge of the functions in Marshal will make this problem relatively easy to complete.

Implementing Save/Load Sessions

  1. Both save and load are top-level commands so examine the code in calculon.ml to create an appropriate match case. Both save/load are followed by an Ident which carries the name of the file to use.
  2. Examine the functions in the standard Marshal module. Read carefully the documentation associated with the functions to_channel outchan and from_channel inchan. As a simple example of these, here is an OCaml REPL session which demonstrates their use.

    # let mylst = [1;2;3];;
    val mylst : int list = [1; 2; 3]
    
    # let mylist = [1;2;3];;
    val mylist : int list = [1; 2; 3]
    
    # let outchan = open_out "save.dat";;
    val outchan : out_channel = <abstr>
    
    # Marshal.to_channel outchan mylist [];;
    - : unit = ()
    
    # close_out outchan;;
    - : unit = ()
    
    # let inchan = open_in "save.dat";;
    val inchan : in_channel = <abstr>
    
    (* must indicate the type of data read via from_channel *)
    (*             vvvvvvvv *)
    # let loaded : int list = Marshal.from_channel inchan;;
    val loaded : int list = [1; 2; 3]
    
    # close_in inchan;;
    - : unit = ()
    
  3. The data created by to_channel is binary and will not be easily understood in standard text editors so do not expect any great revelations associated with attempting to read it.
  4. Implement save <file>; according to the example above which uses to_channel. Output the global variable map defined in calculon.ml which contains all bindings. Make sure to open and close the output channel. Print a message indicating that data has been saved as in:

    calculon> save mydefs.dat;
    Saved binary data in 'mydefs.dat'
    
  5. Implement load <file>; according to the example above. This will allow a binary data to read back in and converted to a varmap_t. Print a message indicating that data is being loaded as in:

    calculon> load mydefs.dat;
    Loading binary data from 'mydefs.dat'
    
  6. When loading, existing bindings should be preserved as much as possible though definitions in the loaded variable map should override conflicting bindings in the current global map. To be concrete, here is an example:

    > calculon;
    calculon> def a=1;              # define a,b
    a : IntDat(1)
    
    calculon> def b=2;
    b : IntDat(2)
    
    calculon> save mydefs.dat;      # save to file
    Saved binary data in 'mydefs.dat'
    calculon> quit;                 # exit
    
    That was so terrible I think you gave me cancer!
    > calculon                      # restart
    
    calculon> def b=3;              # define b,c
    b : IntDat(3)
    
    calculon> def c=4;
    c : IntDat(4)
    
    calculon> whos;
             b : IntDat(3)
             c : IntDat(4)
    
    calculon> load mydefs.dat;      # load data
    Loading binary data from 'mydefs.dat'
    
    calculon> whos;
             a : IntDat(1)          # loaded from file
             b : IntDat(2)          # redefines b from file
             c : IntDat(4)          # already present
    

    To accomplish this, load the varmap_t from a file, and merge it into the global variable map using an appropriate function. Since Varmap is derived from the Map.Make functor, it will have several functions that allow for merging. Examine functions available for merging and creating unions in the documentation for Map.Make. Experiment with these in a REPL to determine the correct way to invoke one.

Grading Criteria for OptProb 3: Save/Load Sessions   grading 10

Weight Criteria
  AUTOMATED TESTS for Save/Load Sessions
5 make test-op3 correctly compiles and passes tests
5 MANUAL INSPECTION for Save/Load Sessions
  calculon.ml
  Addition of top-level save/load commands in appropriate cases
  Clear opening and closing of input/output channels during save/load
  Messages printed indicated save/load to a file name
  Use of Marshal module functions to write/read binary maps.

7.4 OptProb 4: Informative Exceptions

While colorful, the error messages in Calculon are a bit general and hard on the eyes as indicated below.

calculon> 10!;
ERROR: I'm not familiar with the type of thing I'm seeing...
Calclex.LexError("char '!' not recognized, at position 2 in the input '10!;'")

calculon> 10+;
ERROR: I'm not familiar with the type of thing I'm seeing...
Calcparse.ParseError("syntax error", _)

calculon> def doub = @ * 2;
ERROR: I'm not familiar with the type of thing I'm seeing...
Calcparse.ParseError("Expected identifier in lambda expression after @", _)

calculon> 10+true;
ERROR: I'm not familiar with the type of thing I'm seeing...
Calceval.EvalError("Expect Int for right arithmetic expression, found 'BoolDat(true)'", "{}", "\nAdd\n  IntExp(10)\n  BoolExp(true)\n")

One minor improvement to this situation would be to treat each exceptions typ slightly differently and print associated messages as shown below.

calculon> 10!;
LEX ERROR: char '!' not recognized, at position 2 in the input '10!;'

calculon> 10+ ;
PARSE ERROR: syntax error
tokens: [Semicolon ]

PARSE ERROR: Expected identifier in lambda expression after @
tokens: [At; Times; IntTok(2); Semicolon ]

calculon> 10+true;
EVAL ERROR: Expect Int for right arithmetic expression, found 'BoolDat(true)'
varmap: {}
expression:
Add
  IntExp(10)
  BoolExp(true)

While still not tremendously helpful, these messages are at least more specific to the kind of error (lex, parse, or evaluate).

To complete this problem, develop some knowledge of exception handling. A reasonable source is in one of our course texts: Practical OCaml Ch 10 on Exception Handling. Exceptions are similar to algebraic types and exception handling of various types is similar to pattern matching. After some research and practice, exception handling will feel quite familiar.

Implementing Informative Exceptions

  1. In calculon.ml locate where errors in statement processing are handled (exceptions are "caught"). There is one case here which handles all kinds of exceptions.
  2. Add 3 cases to the exception handling for each of the exception types that can occur in the Lexing, Parsing, and Evaluation phases. Examine the associated source files if needed to determine the types associated with each of these exceptions.
  3. For each case, print a more specific error message as indicated in the example above. Examine the fields available in the exception types which carry data about the circumstances associated why a problem arose. Some stringifying functions such as tokenlist_string are useful to format messages as indicated.
  4. Use the examples provided previously as a guide to how the error messages should be printed for better clarity.

Grading Criteria for OptProb 4: Informative Exceptions   grading 10

Weight Criteria
  AUTOMATED TESTS for Save/Load Sessions
5 make test-op4 correctly compiles and passes tests
5 MANUAL INSPECTION for Save/Load Sessions
  calculon.ml
  Addition of cases to exception handling specfic to Lex/Parse/Eval errors
  Use of data fields carried by caught exceptions
  Printing of correctly formatted error messages specific to each exception type

7.5 OptProb 5: Expression Optimization

As we have seen, parsing creates a data structure representing what a code should do while evaluation actually executes the code. In between these two steps exists a whole world of possible optimizations which transform the original expressions to produce equivalent results more efficiently. This problem addresses some possible optimization opportunities.

As an opening example, consider the following function in Calculon.

calculon> def five = @n 2+3;
five : Closure(n, <fun>)

calculon> show five;
Closure(param_name: n, varmap={...} code=
Add
  IntExp(2)
  IntExp(3)
)

Whenever the function five is called, its parameter is ignored and instead 2 and 3 are added to return 5 every time. It is not hard to see that this function would be better written

calculon> def five_better = @n 5;
five_better : Closure(n, <fun>)

calculon> show five_better;
Closure(param_name: n, varmap={...} code=
IntExp(5)
)

where the adding of 2 and 3 is done ahead of time saving the need to do an addition at evaluation time.

An interesting question is whether this optimization can be done automatically, a technique called constant folding.

Similarly, the following two functions have the same effect, of adding 9 but the latter does so without needing to do any let binding.

calculon> def add_nine = @n let nine=9 in n+nine;
add_nine : Closure(n, <fun>)

calculon> show add_nine;
Closure(param_name: n, varmap={...} code=
Letin( nine )                   # evaluate a let binding
  .var_expr:
    IntExp(9)
  .in_expr:
    Add
      Varname(n)
      Varname(nine)
)

calculon> def add_nine_nolet = @n n+9;
add_nine_nolet : Closure(n, <fun>)

calculon> show add_nine_nolet;
Closure(param_name: n, varmap={...} code=
Add                             # no let binding to evaluate
  Varname(n)
  IntExp(9)
)

The optimization in which the variable nine is replaced by its constant value 9 is often referred to as constant propagation and in avoids the need to modify the variable map.

This problem adds several features to Calculon to perform limited versions of the constant folding and constant propagation optimizations.

  • A function in calceval.ml called optimize_expr which transforms expressions into equivalent, optimized expressions. Some constant folding and propagation are done for this.
  • A top-level command optparsetree <expr>; which shows the expression resulting from optimize_expr in the same way that parsetree <expr> shows the results of parsing.
  • Automatic optimization of expressions via optimize_expr when the optimize true; top-level option is set.

These optimizations can be done by hand but allowing a coder to use informative variable names and then automatically optimizing the result for performance is a powerful combination.

Implementing Informative Exceptions

  1. Start by adding the optparsetree <expr>; top-level command as this will make it easier to see the results of optimize_expr calls.
  2. Locate similar commands in calculon.ml such as parsetree and copy the pattern there to create the top-level command. After parsing and expression, it should call optimize_expr from calceval.ml and then print the stringified results.
  3. Currently calceval.ml has a version of optimize_expr which throws an exception. Start editing this function to perform optimizations.
  4. The structure of optimize_expr is quite similar to eval_expr: both take a varmap and an expr as arguments and operate on them. Starting with that as the code template is not a bad idea. The central idea is to evaluate any constant expressions possible and replace them with an equivalent constant. For example, in the following parsetree, both left and right branches of the Add are constants.

    calculon> parsetree 1+2;
    Parse tree:
    Add
      IntExp(1)
      IntExp(2)
    

    The optimization of this is to replace the add with the result of adding as in

    calculon> optparsetree 1+2;
    Optimized Parse tree:
    IntExp(3)
    
  5. The process of optimizing an expression should be done recursively: for any branching constructions such as arithmetic, optimize the left and right branches and then check if the results are both constant expressions. If so, evaluate them and return the computed answer as the optimized expression. For example, the following nested arithmetic would be optimized to only the result.

    calculon> parsetree 1*3+2*4+5;
    Parse tree:
    Add
      Add
        Mul
          IntExp(1)
          IntExp(3)
        Mul
          IntExp(2)
          IntExp(4)
      IntExp(5)
    
    calculon> optparsetree 1*3+2*4+5;
    Optimized Parse tree:
    IntExp(16)
    
  6. Variable names may break the ability to fold all constants. It is not expected that all possible folding be done, only that which is easy to reach via the left-associative parsing of expressions. For example:

    calculon> parsetree n+2*4+5;
    Parse tree:
    Add
      Add
        Varname(n)
        Mul
          IntExp(2)                 # opportunity to fold constants
          IntExp(4)
      IntExp(5)
    
    calculon> optparsetree n+2*4+5;
    Optimized Parse tree:
    Add
      Add
        Varname(n)
        IntExp(8)                   # folded to 8
      IntExp(5)                     # due to left-associativity, not folded with 8 to 13
    
  7. If variable names correspond to constant integers or booleans, they can be propagated into an expression to enable further folding.

    calculon> def x = 3;            # define a constant
    x : IntDat(3)
    
    calculon> parsetree x;
    Parse tree:
    Varname(x)                      # variable lookup
    
    calculon> optparsetree x;
    Optimized Parse tree:
    IntExp(3)                       # var currently associated with 3
    
    calculon> parsetree 5+x;
    Parse tree:
    Add
      IntExp(5)
      Varname(x)                    # variable name lookup
    
    calculon> optparsetree 5+x;
    Optimized Parse tree:
    IntExp(8)                       # x is known so becomes 3 enabling folding
    

    During optimization, handle variables by looking them up in the current varmap. If they are associated with constant integers or booleans, return a constant IntExp or BoolExp with the associated value. This will enable further folding.

  8. Constant let/in bindings can be optimized away by adding to the varmap the given value and then optimizing the resulting expression. This constant propagation may further allow for folding to occur such as in the following.

    calculon> parsetree let y=2 in y*3;
    Parse tree:
    Letin( y )
      .var_expr:
        IntExp(2)
      .in_expr:
        Mul
          Varname(y)
          IntExp(3)
    
    calculon> optparsetree let y=2 in y*3;
    Optimized Parse tree:
    IntExp(6)                       # 2 propogated to y, allows 2*3 to be folded
    
    calculon> undef z;              # ensure no def for z
    
    calculon> parsetree let y=2 in y*z;
    Parse tree:
    Letin( y )
      .var_expr:
        IntExp(2)
      .in_expr:
        Mul
          Varname(y)
          Varname(z)
    
    calculon> optparsetree let y=2 in y*z;
    Optimized Parse tree:
    Mul
      IntExp(2)                     # can't get rid of z but
      Varname(z)                    # can avoid let/in binding
    
  9. Comparisons can similarly be optimized to constant booleans as in the following.

    calculon> parsetree 3<4;
    Parse tree:
    Less
      IntExp(3)
      IntExp(4)
    
    calculon> optparsetree 3<4;
    Optimized Parse tree:
    BoolExp(true)
    
    calculon> parsetree 5+2 > 8;
    Parse tree:
    Greater
      Add
        IntExp(5)
        IntExp(2)
      IntExp(8)
    
    calculon> optparsetree 5+2 > 8;
    Optimized Parse tree:
    BoolExp(false)
    
  10. Whole branches of conditionals may be optimized away by determining that the condition is always true or false such as in the following.

     calculon> parsetree if true then 1 else 2;
     Parse tree:
     Cond
       .if_expr:
         BoolExp(true)
       .then_expr:
         IntExp(1)
       .else_expr:
         IntExp(2)
    
     calculon> optparsetree if true then 1 else 2;
     Optimized Parse tree:
     IntExp(1)
    

    Even in the event that a branch cannot be eliminated, both branches of the conditional can be optimized for propagation and folding.

     calculon> parsetree if z then 1+2 else let y=1 in y*4;
     Parse tree:
     Cond
       .if_expr:
         Varname(z)
       .then_expr:
         Add
           IntExp(1)
           IntExp(2)
       .else_expr:
         Letin( y )
           .var_expr:
             IntExp(1)
           .in_expr:
             Mul
               Varname(y)
               IntExp(4)
    
     calculon> optparsetree if z then 1+2 else let y=1 in y*4;
     Optimized Parse tree:
     Cond
       .if_expr:
         Varname(z)
       .then_expr:
         IntExp(3)
       .else_expr:
         IntExp(3)
    
    
  11. The code within a closure should be optimized so the original examples are optimized as in:

     calculon> parsetree @n 2+3;
     Parse tree:
     Lambda( n )
       Add
         IntExp(2)
         IntExp(3)
    
     calculon> optparsetree @n 2+3;
     Optimized Parse tree:
     Lambda( n )
       IntExp(5)
    
     calculon> parsetree @n let nine=9 in n+nine;
     Parse tree:
     Lambda( n )
       Letin( nine )
         .var_expr:
           IntExp(9)
         .in_expr:
           Add
             Varname(n)
             Varname(nine)
    
     calculon> optparsetree @n let nine=9 in n+nine;
     Optimized Parse tree:
     Lambda( n )
       Add
         Varname(n)
         IntExp(9)
    
    
  12. When optimize_expr is reasonably complete, alter the code in the top-level constructs for def and expression evaluation:
    • Check if the global optimize option is set to true
    • If so, call optimize_expr on the results of parsing
    • Use the optimized results when evaluating. Automatic optimization will only be visible for closure data as all other code is immediately evaluated leaving only the resulting data.

          calculon> def five_std = @n 2+3;     # no optimizations
          five_std : Closure(n, <fun>)
      
          calculon> optimize true;             # turn on automatic optimization
          optimization now 'true'
      
          calculon> def five_opt = @n 2+3;     # same definition
          five_opt : Closure(n, <fun>)
      
          calculon> show five_std;
          Closure(param_name: n, varmap={...} code=
          Add                                  # un-optimized
            IntExp(2)
            IntExp(3)
          )
      
          calculon> show five_opt;
          Closure(param_name: n, varmap={...} code=
          IntExp(5)                            # optimized
          )
      

Grading Criteria for 5: Expression Optimization   grading 10

Weight Criteria
  AUTOMATED TESTS for Save/Load Sessions
5 make test-op5 correctly compiles and passes tests
5 MANUAL INSPECTION for Save/Load Sessions
  calculon.ml
  Implementation of the optparsetree command
  Honoring the optimize option when evaluting defs and expressions
   
  calceval.ml
  Definition of the optimize_expr function
  Recursive optimization in all relevant branches
  Correct use of varmap to propagate constants forward
  Correct folding of arithmetic constants
  Correct optimization of conditionals to eliminate brancehs if possible

8 Zip and Submit

8.1 Zipping Files

If creating a zip file is unfamiliar to you, refer to the following guide: Making Zips (tutorial)

8.2 Submit to Canvas

Once your are confident your code is working, you are ready to submit. Ensure your folder has all of the required files. Create a zip archive of your lab folder and submit it to Canvas.

On Canvas:

  • Click on the Assignments section
  • Click on the appropriate link for this lab/assignment
  • Scroll down to "Attach a File"
  • Click "Browse My Computer"
  • Select you Zip file and press OK

8.3 Late Policies

You may wish to review the policy on late project submission which will cost you late tokens to submit late or credit if you run out of tokens. No projects will be accepted more than 48 hours after the deadline.

http://www-users.cs.umn.edu/~kauffman/2041/syllabus.html#late-projects


Author: Chris Kauffman (kauffman@umn.edu)
Date: 2018-12-10 Mon 20:43