Last Updated: 2018-11-21 Wed 15:20

CSCI 2041 Assignment 4: Persistent Maps and Sets

CODE DISTRIBUTION: a4-code.zip

TESTING CODE: a4-tests.zip (Instructions)

CHANGELOG:

Wed Nov 21 15:18:52 CST 2018
The points for Problem 1's Ssmap.remove function in Manual Inspection was incorrectly listed as 10 points when it should have been 5 points. This has been corrected.
Thu Nov 8 09:17:43 CST 2018

Post 630 identified a bug in the tests that prevents older OCaml versions from compiling test_treeset.ml. This is fixed by either getting a fresh copy of the tests or replacing all occurrences of the following comparison in test_treeset.ml:

Replace: Float.compare
With:    Pervasives.compare

Wed Nov 7 21:44:43 CST 2018
Test cases are now linked to the top of the assignment 4 spec. Clarification of formatting strings in the map_modules.ml has been created. Minor update to the grading criteria for Problem 2: manual inspection criteria for required file map_modules.ml has been added.
Fri Nov 2 15:02:21 CDT 2018
Minor typo corrections such as replacing incorrect TREEMAP_ELEM with correct KEYVAL_SIG in documentation for treemap.ml.

1 Overview

Typically, standard programming libraries provide two major abstract data types for programmer convenience.

  • Maps from a key to a value, sometimes referred to as dictionaries
  • Sets of unique elements

Concrete implementations of maps and sets allow any kind of data to be stored and make use one of several familiar data structures.

  • Linked lists which give linear performance for most map/set operations
  • Arrays which also yield linear performance for many map/set operations
  • Binary Search Trees which, if balanced, yield logarithmic complexity
  • Hash Tables which generally have amortized constant time complexity

OCaml provides maps and sets and per its standard immutable data disposition, these are persistent data structures: inserting a new key/value pair to a map produces a new, distinct map with the additional pair. The favored data structure for this is the Binary Search Tree.

  • Balanced BST's provide logarithmic performance, a good balance between the linear and constant of other data structure choices
  • As we have seen Immutable Linked Lists sharing structure between multiple lists, so to can immutable BST's share structure. Doing so means that a "new" BST that is distinct from the original can be produced by creating one new path from root to leaf and sharing the remainder of the tree.

OCaml's standard library provides maps and sets which use balanced binary trees but we will favor vanilla, unbalanced binary trees. This may lead to poor performance in some cases but is easier to implement and thus does not require deep knowledge of data structures.

Since OCaml is a very strongly typed language, some measures must be taken to allow a single implementation of a map or set to work with multiple types. There are few mechanisms for this but the preferred method is through functors or parameterized modules. These allow a module template to be specialized by providing the missing pieces. It is the mechanism that is used in OCaml's library via the Map.Make and Set.Make functors.

Finally, a crafty coder will recognize that maps and sets share a great deal of structure.

  • Maps have keys which form a set of unique elements
  • Sets are maps where the keys are the elements and there are no values

To that end, a crafty coder will spend time carefully writing a data structure for a Map or Set and then build the other on top of the first, sparing the need to copy and paste much code. This will be our approach here, to build a Map implementation and then provide a Set which internally uses a Map.

1.1 Problem Synopsis

Problem 1
Create a module that implements a Map from a string key to a string key. A standard binary search tree is used for the mapping and functions for searching, removal, and stringification must be completed.
Problem 2
"Functorize" the module from problem 1 so that a Map from any kind of key to any kind of value may be created. This involves only a few code changes but also the introduction of a functor and module type to be taken as an argument for the functor.
Problem 3
Create a functor that provides a Set of any kind of unique values. Internally, the functor from Problem is called to create a Map which is used to fulfill all the operations required by the set.

1.2 Assignment Files and Download

Download the code pack linked at the top of the page. Unzip this which will create a project folder. Create new files in this folder. Ultimately you will re-zip this folder to submit it.

NOTE: Tests Released Later: The tests for this assignment will not be released with the publication of the specification. They will be released sometime after once they have been completed. Further instructions on how to use them will be made available on release.

File State Prob# Notes
Makefile Provided All Build file to compile all programs
ssmap.ml Edit 1 Problem 1 Module for mapping strings to strings
ssmap_demo.ml Provided 1 Demo for use of Ssmap module
treemap.ml Create 2 Problem 2 Module for mapping arbitrary keys to values
map_modules.ml Create 2 Applications of Treemap.Make to create various map modules
treemap_demo.ml Provided 2 Demo for use of Treemap module
treeset.ml Create 3 Problem 3 Module for tracking sets of unique elemetns
treeset_demo.ml Provided 3 Demo for use of Treeset module
test_ssmap.ml Provided 1 Unit tests for problem 1
test_treemap.ml Provided 2 Unit tests for problem 2
test_map_modules.ml Provided 2 Unit tests for problem 2
test_treeset.ml Provided 3 Unit tests for problem 3
mltest.ml Provided   Testing library
process-mltest.awk Provided   Test processing script

Automated Testing Instructions

Unzip a4-tests.zip to create the directory a4-tests/. Copy all files in the a4-tests directory to the project directory a4-code. The following unix command should do this if a4-tests and a4-code are in the same directory.

cp -r a4-tests/* a4-code/

The Makefile provided is a copy of Makefile.withtests. The original version is called Makefile.notests.

The new Makefile introduces the targets

  • make test-p1 : test problem 1
  • make test-p2 : test problem 2
  • make test-p3 : test problem 3
  • make test : test all problems

Test programs test_ssmap, test_treemap, test_map_modules, and test_treeset associated with problems 1-3 can run individual tests by passing a test number as a command line argument as in

./test_treeset 3    # run the 3rd test only

1.3 General Grading Criteria   grading

Weight Criteria
5 CORRECT SUBMISSION and STYLE
  All files submitted and properly zipped
  make test compiles all code and runs tests
  Code is indented reasonably well to indicate scopes
  Comments provide insight for intricate blocks

2 Problem 1: The Ssmap

To become familiar with building persistent trees, this problem implements a map with concrete key/value types of a string key mapped to a string value. Completing the ssmap.ml module will create a concrete module with a string-to-string map which can be tested. The next problem be extend this code to work provide a map from any key-to-value types.

2.1 ssmap.ml Top-Level Definitions

The provided file ssmap.ml has a the type for the map and two functions implemented in it. Complete functions that are marked IMPLEMENT in the provided skeleton.

(* ssmap.ml: Provides types and functions for a map from string keys
   to string values. Internally uses a binary search tree sorted on
   the keys. Some functions must be completed. *)

(* Algebraic type for a persistent binary search tree key/value
   mappings. 
*)
type ssmap =
  | Empty               (* no data: bottom of tree   *)
  | Node of {           (* node of anonymous record  *)
      key   : string;   (* key for this node         *)
      value : string;   (* value associated with key *)
      left  : ssmap;    (* left branch               *)
      right : ssmap;    (* right branch              *)
    }
;;

let empty = ... ;;
(* Empty map value. Could use Ssmap.Empty but it is standard libary
   convention to provide a binding called "empty" for this. *)

let rec add map key value = ... ;;
(* val add : ssmap -> string-> string -> ssmap

   USE: let newmap = add map key value in ...

   Returns a new map with key bound to value. If key is already
   present, its binding is re-assigned to the given value.  
*)

let tree_string map = ... ;;
(* val tree_string : ssmap -> string

   USE: let s = tree_string map in ...

   Use a Buffer (extensible string) and a right-to-left in-order
   traversal of the internal tree to construct a string version of the
   map. Nodes are indented according to their depth: root is left-most
   with children farther to the right. Each data element is preceded
   by its integer depth. Right-most elements appear at the top while
   left-most elements appear at the bottom. This means elements read
   in REVERSE order from top to bottom and that mentally rotating the
   tree clockwise gives appropriate left/right branch positions.  
*)

let rec getopt map key = ... ;;
(* IMPLEMENT

   val getopt : ssmap -> string -> string option

   USE: let opt = getopt map key in ...

   Search the map for given key. If found, return the Some of the
   associated value. Otherwise return None.
*)

let contains_key map str = ... ;;
(* IMPLEMENT

   val contains_key : ssmap -> string -> bool

   USE: let present = contains_key map key in ...

   Returns true if key is present in the map and false
   otherwise. Uses function getopt.
*)

let rec iter func map = ... ;;
(* IMPLEMENT

   val iter : (string -> string -> unit) -> ssmap -> unit

   USE: let func key value = ... in
        iter func map;

   Apply func to all elements key/value pairs in the map for
   side-effects. Keys are visited in sorted order from minimum to
   maximum. Works as other iter-like functions such as List.iter
   or Array.iter.  
*)

let rec fold func cur map = ... ;;
(* IMPLEMENT

   val fold : ('a -> string -> string -> 'a) -> 'a -> ssmap -> 'a

   USE: let func cur key value = ... in
        let reduction = fold func init map in ...

   Apply func to all elements key/value pairs in the map to produce a
   single value at the end. Keys are visited in sorted order from
   minimum to maximum. Works as other fold-like functions such as
   List.fold_left or Array.fold_left.  
*)

let to_string map = ... ;;
(* IMPLEMENT

   to_string : ssmap -> string

   USE: let displaystr = to_string map in ...

   Produces a string representation of the map. The format is

   "[{k1 -> v1}, {k2 -> v2}, {k3 -> v4}]"

   Key/vals appear from minimum key to maximum key in the output
   string. Functionality from the Buffer module is used with an
   in-order traversal to produce the string efficiently. May make use
   of a higher-order map function such as iter or fold.
*)

let rec findmin_keyval map = ... ;;
(* IMPLEMENT

   val findmin_keyval : ssmap -> (string * string)

   USE: let (minkey,minval) = findmin ssmap in ...

   Find the "minimum" key in the given map and return it and its value
   as a pair. If used with an empty map, fails with exception message
   "No minimum in an empty tree".  Since the map is based on a BST,
   the minimum key is at the left-most node.  
*)

let rec remove_key map key = ... ;;
(* IMPLEMENT

   val remove_key : ssmap -> string -> ssmap

   USE: let newmap = remove_key map key in ...

   Returns a new map without key in it.  The internal tree is
   re-arranged according to standard BST conventions: (1) If key is a
   leaf, it is replaced with Empty, (2) If key is an internal node
   with one child, it's left or right branch replaces it, (3) If key
   is an internal node with both left/right children, it successor
   replaces it. Makes use of findmin_keyval to locate a succesor as
   the minimum of on the right child branch.
*)

2.2 Implementing ssmap.ml Functions

Examine Implementation of add and tree_string

  • Examine closely how these two functions are constructed
  • This will give insight into the general structure of any code that must traverse a tree in OCaml
  • In add, notice that new Node records are created which create a new path from root to insertion point
  • In tree_string, notice the use of the Buffer module to append onto a growing character buffer. This is far more efficient that typical string concatenation.

getopt map key

  • The search code should be close to that used by add to determine whether there is any tree left or whether a left/right branch should be taken
  • Return a string option by returning None if the key is not found and Some of a node's data if the given search key is located.

contains_key map key

  • Through use of getopt, this function can be completed as a one-liner.

iter func map and fold func init map

  • These two higher-order functions accept a function to run key/value pairs
  • The code to implement both iter and fold is essentially an in-order traversal which should visit left branches, the current node's data, and then right branches
  • For iter, applications of the parameter func will return unit so no results need to be captured
  • For fold, each application of func produces a result that should be captured or returned. Visiting the left branch should produce a return value which is used in the application of the function to the current node's data. This should produce a result that is passed into a fold on the right branch which produces the ultimate answer.

to_string map

  • Make use of the Buffer module to in this function. Examples uses are in the provided tree_string function.
  • to_string can make use of either its own in-order traversal of the tree via a recursive helper function OR it can employ an appropriate higher-order function such as iter or fold to perform the traversal.
  • Regardless of the traversal mechanism, each Node with data should be added to the Buffer with the format

    "{%s -> %s}, "
    
    
  • Printf.sprintf is useful for producing formatted strings.
  • The last element tree should not contribute a comma-space (", ") in the resulting string. That is, the following are correct and incorrect results from to_string

    CORRECT:   [{11 -> Eleven}, {14 -> Fourteen}, {25 -> Twenty-five}]
    INCORRECT: [{11 -> Eleven}, {14 -> Fourteen}, {25 -> Twenty-five}, ]
                                                                     ^^
    
    

    The easiest way to accomplish this is to produce the buffer

    "[{11 -> Eleven}, {14 -> Fourteen}, {25 -> Twenty-five}, "
    
    

    and then delete the final two characters before adding the closing ]. Buffer.truncate and Buffer.length functions allow characters at the end a character buffer to clipped off. BE CAREFUL with completely Empty trees which require special care

findmin_keyval map

  • The minimum elements in a BST are always leftmost so this recursive function should traverse left until it cannot go any further. Return the key/value at that node with as a pair.
  • If an Empty map is passed as an argument, use failwith to raise an exception with the message indicated.

remove_key map key

  • This is the most complex function as it has a variety of cases to consider.
  • The first layer of cases deal with searching for the key to remove. Thus it will mirror add very closely with left/right/current cases. As there, create new nodes for the results.
  • The case of the current node.key equaling the search key, there are additionally several cases to consider. These are explained in some detail in Wikipedia's Article on BST Deletion and summarized in the documentation string for the function.
  • If the key is found, one will need to break removal into cases for
    1. No child branches
    2. Only left or right branch present
    3. Both left and right branch present
  • The most complex case of removal is if the key is at a node that has both left and right branches present. In that case, replace the node with its successor as indicated in the following picture. This can be located via an appropriate call to findmin_keyval then removed via a recursive call of remove.

    bst-remove-2child.png

    Figure 1: Removing a node from a BST with two children. D is to be removed, locate the successor to D which is the left-most node in D's right branch, in this case E. Remove E and install it at D's position. Being a left-most node, E is guaranteed to have at most 1 child making it "easy" to remove.

2.3 ssmap_demo.ml Program

The provided ssmap_demo program can help to verify that the Ssmap module is working correctly.

> make ssmap_demo
ocamlc -g -annot -c ssmap.ml
ocamlc -g -annot -o ssmap_demo ssmap.cmo ssmap_demo.ml
> ./ssmap_demo
map3:
   1: {99 -> Ninety-nine}
 0: {17 -> Seventeen}

map4:
     2: {99 -> Ninety-nine}
   1: {86 -> Eighty-six}
       3: {75 -> Seventy five}
     2: {61 -> Sixty-one}
 0: {50 -> Fifty}
     2: {42 -> Forty-two}
   1: {25 -> Twenty-five}
       3: {14 -> Fourteen}
     2: {11 -> Eleven}

map4 to_string: [{11 -> Eleven}, {14 -> Fourteen}, {25 -> Twenty-five}, {42 -> Forty-two}, {50 -> Fifty}, {61 -> Sixty-one}, {75 -> Seventy five}, {86 -> Eighty-six}, {99 -> Ninety-nine}]
Sum of keys is 463
All vals:  Eleven Fourteen Twenty-five Forty-two Fifty Sixty-one Seventy five Eighty-six Ninety-nine
Key: 11, Value: Eleven
Key: 14, Value: Fourteen
Key: 25, Value: Twenty-five
Key: 42, Value: Forty-two
Key: 50, Value: Fifty
Key: 61, Value: Sixty-one
Key: 75, Value: Seventy five
Key: 86, Value: Eighty-six
Key: 99, Value: Ninety-nine
> 

2.4 Grading Criteria for ssmap.ml   grading

The following criteria will be checked in this problem.

Weight Criteria
  AUTOMATED TESTS for ssmap.ml
30 make test-p1 correctly compiles and passes tests
  MANUAL INSPECTION for ssmap.ml
5 getopt
  Use of match/with to decompose algebraic data type
  Distinguishes cases of search key equal,less than, greater than current node's data
  Clearly DOES NOT search entire tree by exploiting BST left/right conventions
  Use of option type in return values
   
  contains_key
  Uses getopt to make function very short
   
5 iter / fold
  Use of match/with to decompose algebraic data type
  Visits all nodes of the tree through an in-order traversal
  iter applies parameter func to all key/values for side-effects
  fold captures return values for applications of func and passes those on to further applications
   
5 to_string
  Use of Buffer functions to efficiently build a string incrementally
  Clear use of either a recursive helper or a higher-order function to traverse the tree
  Special consideration given to completely empty trees to produce correct strings
   
5 findmin_keyval
  Use of failwith to raise an exception for Empty trees
  Use of match/with to decompose algebraic data type
  Recursion to traverse as far left as possible
  Clear "look ahead" to determine if a node has a left child before recursing left
   
5 remove
  Use of match/with to decompose algebraic data type
  Distinguishes cases of search key equal,less than, greater than current node's data
  Allocation of new Node records along to create a new path in the tree
  For key equal to node's data, clearly divide cases into 0, 1, 2 children
  Use of findmin_keyval in 2-child case to find successor key/value
  Use of recursive remove to remove successor in 2-child case

3 Problem 2: Functorize for Treemap

The major limitation of Problem 1's Ssmap is that it only works for string keys with string values. If one wanted to map string keys to integer values or integer values to some kind of record, Ssmap is not useful for this.

OCaml's standard library Map does not suffer from this problem as it provides a functor, Map.Make which takes the key type as an argument and creates a specialized module for that. This problem establishes a functor that generalizes the functionality in Ssmap to a Treemap.

3.1 treemap.ml Top-Level Definitions

Create a file treemap.ml. In it, you will define two major entities

  • A module signature KEYVAL_SIG
  • A functor called Make

The file below gives the overall signatures and ordering of elements in the file but specific instructions are in the next section as to how to do this effectively.

(* treemap.ml: Provides a functor to create a map from an arbitrary
   key type to an arbitrary value type. Functor Make requires a module
   input matching the KEYVAL_SIG signature.  The resulting module
   has the same functions as Ssmap but that works for maps with the
   specified key/value types.  As the name implies, a BST is the
   underlying data structure for the map.
*)

(* Type of key/vals that go into Treemaps. Includes key and value
   type, key comparison function, and a string function for easy
   display. *)
module type KEYVAL_SIG =
sig
  type key_t
  type value_t
  val compare_keys : key_t -> key_t -> int
  val keyval_string : key_t -> value_t -> string
end

(* Functor which creates a module for maps with key/value types
   specified by the parameter module KVMod. *)
module Make (KVMod : KEYVAL_SIG) =
sig
  type treemap =
      Empty
    | Node of { ... }

  val empty : treemap

  val add : treemap -> KVMod.key_t -> KVMod.value_t -> treemap

  val tree_string : treemap -> string

  val getopt : treemap -> KVMod.key_t -> KVMod.value_t option

  val contains_key : treemap -> KVMod.key_t -> bool

  val iter : (KVMod.key_t -> KVMod.value_t -> 'a) -> treemap -> unit

  val fold : ('a -> KVMod.key_t -> KVMod.value_t -> 'a) -> 'a -> treemap -> 'a

  val to_string : treemap -> string

  val findmin_keyval : treemap -> KVMod.key_t * KVMod.value_t

  val remove_key : treemap -> KVMod.key_t -> treemap
end

3.2 Implementation of the Treemap module

Follow these steps to complete your version of treemap.ml.

  1. From the listing above copy in the module type TREEMAP_ELEM from the top of the listing. This signature is needed to specify the pieces of information required to complete map modules in the succeeding functor.
  2. Copy the functor "setup" line for the Make module. Replace the colon at the end of the line with an equals sign and affix a struct keyword. This is because we won't specify a signature for the results of Make and instead proceed directly to its implementation as a structure.
  3. Paste your code from ssmap.ml into the file beneath the struct definition. This will serve as the bulk of the implementation for Treemaps and by design has all the required functions and types in it though changes will be required to get the code to compile.
  4. Change the name of the ssmap algebraic type to treemap.
  5. The pasted code for Ssmap is tied specifically to key/values that are both strings. To allow the key and value types to vary, in the treemap algebraic type Node record, replace the key and value types. Rather than a string key type, this will instead by drawn from the parameter module and can be referred to with a its qualified type name KeyMod.key_t. Make a similar change for the value record field.
  6. At a variety of locations in the Ssmap, calls to String.compare are used to compare two string keys. To generalize this in Treemap, use the comparison function provided in KVMod.compare_keys which takes two key_t values and produces and int. Using text editor functions that do "Find/Replace" is handy for this kind of source code change.
  7. Several spots in the Ssmap code create strings using the fact that keys and values are strings. To generalize this, function KVMod.keyval_string should be used instead passing both key and value to create a string representation suitable for use in functions like tree_string and to_string.
  8. Like all blocks that start with struct, make sure that a matching end is inserted after all the functions.
  9. Very few if any other code changes should be required. Due to OCaml's type inference, most of the rest of the code should compile naturally after making the mild adjustments above.

3.3 Complete map_modules.ml

Create a file called map_modules.ml which will have two map modules in it. Both of these are created via applications of the Treemap.Make functor to modules which adhere to the Treemap.KEVAL_SIG. The basic structure will look like this.

(* map_modules.ml: provides two modules for maps.
   1. StringStringMap which maps strings to strings
   2. IntpairBoolMap which maps pairs of ints to bools.
   Both modules are created by creating a short module adhering to the
   Treemap.KEYVAL_SIG signature and then invoking the Treemap.Make
   functor. *)

open Printf;;

(* Interface module for maps of string to string *)
module StringStringKV = ...;;

(* A map module from string keys to string values. *)
module StringStringMap = ...;;

(* Interface module for maps of int pairs to bool *)
module IntpairBoolKV = ...;;

(* A map module from int pair keys to bool values. *)
module IntpairBoolMap = ...;;

You can check that treemap.ml and map_modules.ml compile and run with the provided treemap_demo program.

> make treemap_demo
ocamlc -g -annot -c treemap.ml
ocamlc -g -annot -c map_modules.ml
ocamlc -g -annot -o treemap_demo treemap.cmo map_modules.cmo treemap_demo.ml
> ./treemap_demo
map3:
[{10 -> 20}, {15 -> 30}, {25 -> 50}]
map5:
     2: {25 -> 50}
   1: {15 -> 30}
 0: {10 -> 20}
     2: {7 -> 14}
   1: {5 -> 10}

valsum: 124
map3:
   1: {99 -> Ninety-nine}
 0: {17 -> Seventeen}

map:
     2: {12 > 7 : false}
   1: {10 > 5 : false}
     2: {7 > 12 : true}
 0: {5 > 10 : true}
   1: {3 > 5 : false}
     2: {3 > 2 : false}

> 


Notes on map_modules.ml Key-Val Format Strings

  • The keyval_string function in StringStringKV should format keys/values in a similar way to what Ssmap did: {k -> v}
  • The keyval_string function in IntpairBoolKV should format keys/values in somewhat differently, with the format {3 > 2 : false} or {7 > 12 : true}.

3.4 Grading Criteria for treemap.ml/map_modules.ml   grading

The following criteria will be checked in this problem.

Weight Criteria
  AUTOMATED TESTS for treemap.ml and map_modules.ml
  make test-p2 correctly compiles and passes tests
5 Tests for treemap.cmo
5 Tests for map_modules.cmo
  MANUAL INSPECTION for treemap.ml
2 Presence of the KEYVAL_SIG module type
   
  Make Functor
3 Correct definition of the treemap abstract data type
  Changes to the key and value field types to those from the parameter module KVMode
   
2 Use of the parameter module KVMod functions compare_keys and keyval_string in definition of functor
  Presence of all required functions (identical the functions present in Ssmap
  MANUAL INSPECTION for map_modules.ml
3 Correct definition of "interface" modules StringStringKV and IntPairBoolKV
  Correct invocation of Treemap.Make functor to create specialized map modules

4 Problem 3: Treeset from Treemap

Significant effort is typically put into data structures for maps and sets. To that end, it is worthwhile to leverage the relationship between them so that the other can built for a low effort cost. Since we have defined a Treemap explicitly, the next step is to define a Treeset which simply uses the map implementation to provide set operations.

4.1 treeset.ml Top-Level Definitions

Create a file treeset.ml. In it, you will define two major entities

  • A module signature TREEMAP_ELEM
  • A functor called Make

The code below gives the overall signatures and ordering of elements in the file but specific instructions are in the next section as to how to do this effectively.

(* treeset.ml: provides a Make functor to create a set of unique
   elments given a parameter module adhering to the ELEM_SIG
   signature.  Internally, uses a Treemap to facilitate set
   operations. *)

(* Type for elements that can go into Treesets. Comprised of an
   element type, a comparison function, and *)
module type ELEM_SIG = sig
  type element;;
  val compare : element -> element -> int;;
  val elem_string : element -> string;;
end;;

(* Functor to create a set module *)
module Make (ElMod : ELEM_SIG) = struct

  (* Internal module used to inteface with Treeset.Make *)
  module ElemKeyVal = ...;;

  (* Internal module providing map functions *)
  module ElTreemap = ...;;

  (* Empty set value *)
  let empty = ElTreemap.empty;;

  (* Return a set with the given element added. *)
  let add set elem =
    ...
  ;;

  (* Produce a string version of the set showing its tree structure. *)
  let tree_string set =
    ...
  ;;

  (* Return (Some elem) if elem is in the set and None otherwise. *)
  let getopt set elem =
    ...
  ;;

  (* Return true if elem is in the set and false otherwise. *)
  let contains set elem =
    ...
  ;;

  (* Higher order iterate function on a set for side-effects. func
     accepts one argument, an element from the set. *)
  let iter func set =
    ...
  ;;

  (* Higher order folding function on a set. func accepts two
     argument, the current result and an element from the set. *)
  let fold func init set =
    ...
  ;;

  (* Create a string version of the set. *)
  let to_string set =
    ...
  ;;

  (* Return a set with the given element removed. *)
  let remove set elem =
    ...
  ;;
end;;

4.2 Implementation of the Treeset module

Importantly, everything in Treeset is short. There is no redefinition of tree structures or recursive methods in it. The entire functionality is provided by leveraging Treemap.

Copy the ELEM_SIG module type into treeset.ml verbatim.

The ElemKeyVal Structure

Within the Treeset.Make functor, the first step is to create an interface module for a set element to the keys in a Treemap. This is because the keys of a map form a unique set: keys cannot be duplicated. Thus a map can act as a set with clever coding.

Create an ElemKeyVal module which adheres to the KEYVAL_SIG. It's key type should be the ElMod.element type a good choice for the value type is unit. Use the compare function in ElMod to specify the compare_keys function of ElemKeyVal.

The ElMod.elem_string is used in ElemKeyVal.keyval_string. Keep in mind that the latter takes a key and a value but in this setting, only the key matters for string output.

Invoke TreeMap.Make

Establish ElTreemap using the TreeMap.Make functor. This will give a specialized map for with the set's element type as the keys.

Define Functions using calls to ElTreeMap

Each of the remaining functions can be defined using 1-4 lines of code. This is possible by making calls to ElTreemap to add, stringify, remove, and get elements.

getopt in Treeset.Make

Keep in mind that ElTreemap.getopt will return values which are likely to be Some unit if an element is present. This require just a little adaptation so that Some elem is returned instead.

Higher Order Functions in Treeset.Make

The higher-order functions, that you will need to do a little adaptation.

  • In XSet.iter elfunc set, the elfunc argument is meant to take 1 argument, an element of the set.
  • Internally a call to ElTreemap.iter kvfunc map will be used where kvfunc expects 2 arguments, the key and value. However, only the key is of importance in this setting as keys are elements of the set.
  • This mismatch can be bridged by creating a small helper that accepts two arguments which is passed to ElTreemap.iter. The helper just calls elfunc on the key and ignores the value.
  • A similar adaptation of 1 to 2 arguments is required in fold.

4.3 treeset_demo.ml

Once treeset.ml is complete, the treeset_demo program will compile and run to produce the following output.

> make treeset_demo
ocamlc -g -annot -c treemap.ml
ocamlc -g -annot -c treeset.ml
ocamlc -g -annot -o treeset_demo treemap.cmo treeset.cmo treeset_demo.ml
> ./treeset_demo
inttree3:
   1: 99
 0: 17

inttree4:
     2: 99
   1: 86
       3: 75
     2: 61
 0: 50
     2: 42
   1: 25
       3: 14
     2: 11

sum: 463
strtree1:
     2: Z
         4: W
       3: T
   1: R
         4: Q
       3: N
     2: M
 0: K
     2: E
   1: D
     2: A

strtree3:
     2: Z
         4: W
       3: T
   1: R
     2: Q
 0: N
     2: E
   1: D
     2: A

citytree1:
     2: Tokyo, Japan: 13.83 million
       3: Tianjin, China: 12.78 million
   1: Shanghai, China: 24.25 million
           5: Sao Paulo, Brazil: 12.03 million
         4: Mumbai, India: 12.47 million
             6: Moscow, Russia: 12.19 million
           5: Lagos, Nigeria: 16.06 million
             6: Karachi, Pakistan: 14.91 million
                 8: Istanbul, Turkey: 14.02 million
               7: Guangzhou, China: 14.04 million
       3: Dhaka, Bangladesh: 14.39 million
     2: Delhi, India: 11.03 million
 0: Chongqing, China: 30.75 million
     2: Chengdu, China: 16.04 million
   1: Beijing, China: 21.51 million

citytree2:
     2: Tokyo, Japan: 13.83 million
       3: Tianjin, China: 12.78 million
   1: Shanghai, China: 24.25 million
         4: Sao Paulo, Brazil: 12.03 million
       3: Mumbai, India: 12.47 million
           5: Moscow, Russia: 12.19 million
         4: Lagos, Nigeria: 16.06 million
           5: Karachi, Pakistan: 14.91 million
               7: Istanbul, Turkey: 14.02 million
             6: Guangzhou, China: 14.04 million
     2: Dhaka, Bangladesh: 14.39 million
 0: Delhi, India: 11.03 million
     2: Chengdu, China: 16.04 million
   1: Beijing, China: 21.51 million

> 

4.4 Grading Criteria for treemap.ml   grading

The following criteria will be checked in this problem.

Weight Criteria
  AUTOMATED TESTS for treeset.ml
10 make test-p3 correctly compiles and passes tests
  MANUAL INSPECTION for treeset.ml
2 Presence of the ELEM_SIG module type
   
  Make Functor
3 Correct definition of the interface module ElemKeyVal
  Creation of the ElTreemap module via Treemap.Make
   
5 Use of ElTreemap throughout the definitions of all the output module functions
  Short definitions for all functions

5 Zip and Submit

5.1 Zipping Files

If creating a zip file is unfamiliar to you, refer to the following guide: Making Zips (tutorial)

5.2 Submit to Canvas

Once your are confident your code is working, you are ready to submit. Ensure your folder has all of the required files. Create a zip archive of your lab folder and submit it to Canvas.

On Canvas:

  • Click on the Assignments section
  • Click on the appropriate link for this lab/assignment
  • Scroll down to "Attach a File"
  • Click "Browse My Computer"
  • Select you Zip file and press OK

5.3 Late Policies

You may wish to review the policy on late project submission which will cost you late tokens to submit late or credit if you run out of tokens. No projects will be accepted more than 48 hours after the deadline.

http://www-users.cs.umn.edu/~kauffman/2041/syllabus.html#late-projects


Author: Chris Kauffman (kauffman@umn.edu)
Date: 2018-11-21 Wed 15:20