Assignment 3. CS330 Programming Languages, Spring 2004 

PLEASE READ CAREFULLY
MISSING
DETAILS MAY COST YOU

Details

A random 25% of students will be orally interviewed by me in person
regarding their submission. Make sure you have written everything you
submit. For any function you are trying to write, you can request
what is the desired output for a particular input by email. Solve the parts
of the assignment in the order they are given as they get progressively harder.
If you are stuck somewhere move to the next part and come back later.

Late assignment policy:  If the assignment is submitted
within three days from when it was due, you will get half the grade you
would get if you had submitted on time. You will not be able
to submit after three days. (exceptions to this rule only
if you have a VERY important reason).

You will be working in groups for this assignment. I understand
that in some cases one student will do more work than their partner.
If that's the case make sure both partners understand well all the code
written. Either of you might be orally interviewed and failure to understand the code will
result in lower grade for both of you. Learning to work with other programmers
of various skill leves is essential and it can be unfair but that's life. Try
to make this a learning experience for both of you.
Here are your group assignments:
as3groups.txt

IMPORTANT: The careful design and documentation (i.e comments) of
your code will be important factors in your grade. If there is a bug
it's better to report it than hide it. Be honest, precise and clear and
you shall be rewarded. NEVER (at least in this class) sacrifice
clarity for efficiency of execution.

(10% of final grade - 10 points)

Part 1. Assignment 2 in Scheme (2) 

Write parts 1 and 3 of assignment II in Scheme. Ignore the infix operators
for polynomial multiplication and polynomial addition. Submit two files
one file for each member of the group.

If you want to improve your grade for assignment I and II you can
implement parts 2 (1 point) and 4 (2 points) for assignment II in Scheme.

Part 2. Victoria Math Language (8) 

(Note: this assignment looks harder than it really is)

Victoria Math Language (VML) is a simple expression language for the manipulation
of symbolic expressions representing functions of one variable. The task
for this part is to write an interpreter for this language. This assignment
will serve as an introduction to lexical analysis, parsing, recursive functions
over abstract syntax trees and semantic interpretation of a mini programming language. In addition
it will expand your knowledge of SML and familiarize with the use of the module system
and the Compilation manager (something similar to make for SML). You will be
provided some skeleton code which you will need to understand, modify and extend.

The easiest way to explain VML is through some examples. A more formal
definition will be provided later. Here is a some code in VML:

# simple bindings of expression to name 
a is x + x;
print a;
# binding of name to another name
b is a + x;
print b;
# test the arithmetic operations c is a contant function but needs to be
# called with an argument to be evaluated
c is (5.0 + 2.0) * 2.0 - 10.0;
print c(0.0);
# test derivative operator
d is D(x*x*x + x + 2.0);
print d(~2.0);
# test function application
e is a(x+1.0);
print e;
# a more complex test - reuse of a as a name
a is d(2.0) + a(x+1.0) * 25.0;
print a;

Here is the output of the above VML program after being interpreted:
(x+x)
((x+x) + (x+x))
4.0
13.0
((x+1.0) + (x+1.0))
(13.0+(((x+1.0)+(x+1.0))*25.0))
Lines starting with the # characters are comments and are ignored.
There are only two types of statements in the language.
Binding statements have the form  :          identifier is expression;
and print statements have the form :          print expression;

Binding statements associate a name with a particular symbolic expression
representing a function of one variable. Once a particular name is bound
to a symbolic expression it can then subsequently be used to refer to that
expression. Print statements just output the expression. Boldface is used
to denote keywords of the language.

The expressions of VML, are interpreted using the usual
interpretation of mathematics. All numbers are reals with decimal
notation. Negative numbers are preceded by ~ as in ML. Powers of x
are written as x*x*x. The following functions of one variable
are also supported: cos, sin, ln, exp (with the appropriate symbolic derivatives).

The derivative operator has the form:        D(exp)
It computes the symbolic derivative of an expression.

Function application has the form:          identifier(exp)
and results in substitution of every instance of x in the expression
assosiated with the identifier with exp. For example
if a is x + x; then a(x+1.0) is (x+1.0) + (x+1.0).

When an identifier which has been bound to a name
is found as part of an expression it is replaced by it's
value in that expression.

More formal description of VML Syntax

Lexical units

Names:   names are composed by a letter for by zero or more letters and digits (a..z) (a..z 0..9)
Examples of valid names:
a 
hello 
h1 

Numbers:  numbers are composed by one more digits followed by a period and one more digits          
Examples of valid numbers:

5.0
~2.2
10.15

Special characters:
+ - * / ( ) ;

Keywords: is, print, x, cos, sin, exp, ln

These lexical units will be modeled using the following ML datatype:

datatype token = PLUSSIGN     | MINUSSIGN   | TIMESSIGN | DIVSIGN
               | RPAR        | LPAR     
               | ISKEYWORD    | DKEYWORD    | PRINTKEYWORD | XVAR
               | SINFUN       | COSFUN      | LNFUN     | EXPFUN 
               | NAME of string | NUMBER of real | SEMICOLON
The scanner part of your submission converts strings to list of tokens with the above datatype.

Grammars for Parsing

The parser part of your submission takes as input a list of tokens (produced by the scanner)
and creates an abstract syntax tree that represent the incoming program.

The following grammars are used for parsing:

(expression grammar)
exp       ->  term expopt
expopt  ->  + term expopt | - term expopt | nothing
term      ->  atom termpot
termpot -> *  atom export   | / atom expopt | nothing
atom     ->  number | name(exp) | name | x | sin atom | cos atom | ln atom | exp atom | d exp | (exp)

(statement grammar)
stm -> stm; stm                      
stm -> name is exp
stm -> print exp

   
The abstract syntax tree that is created by the parser is represented using the following SML datatypes:

datatype fexpr = CONST of real | X 
| PLUS of fexpr * fexpr | MINUS of fexpr * fexpr
| TIMES of fexpr * fexpr | DIV of fexpr * fexpr
| SIN of fexpr | COS of fexpr
| LN of fexpr | EXP of fexpr | NAME of string
| APPLY of string * fexpr | D of fexpr


datatype stm = COMPOUNDSTM of stm * stm
| BINDSTM of string * fexpr
| PRINTSTM of fexpr

VML semantics and interpretation 

The interpreter traverses the abstract syntax tree (AST) and generates the output of the program.
The main tasks it has to accomplish are: symbolic differentation, substitution of names,
function application, expression evaluation and printing. All these tasks will be accomplished
by appropriate use of recursive functions that use pattern matching to traverse the AST.

Symbolic differentiation of expressions is governed by the following rules  where D denotes
symbolic differentation:

D(constant) = 0.0
D(x)  = 1.0
D(e1 + e2) =  D(e1) + D(e2)
D(e1 -  e2) =  D(e1)  - D(e2)
D(e1 * e2) =  D(e1) * e2 + e1 * D(e2)
D(e1 /  e2) =  (D(e1) * e2 - e1 * D(e2)) / (e2 * e2)
D(sin(e))   =  cos(e) * D(e)
D(cos(e))    = -sin(e) * D(e)
D(ln(e))      = D(e) / e
D(exp(e))    = exp(e) * D(e)

You will write a function dr that takes as input an abstract syntax tree
and returns an abstract syntax tree which is the symbolic derivative of the input.

For printing the easiest thing to do is to surround everything with parentheses.
Even though the output is uglier that way it is guaranteed has to be correct
in terms of precedence etc. Write a printing function that outputs in way
that "looks good" to the human eye is a non-trivial and challenging task
which you are not required to undertake and if you do be very careful.

For the substitution of names and the application of function you will
need to create a symbol table that associates names with their
corresponding expressions.  The dictionary datastructure
supports the operations of insert, update and lookup.
A dictionary implementation based on binary search trees is provided but
you are required to provide an alternative implementation based on an
association list. Basically each entry is stored as a tupple where the
first element is a string, the symbol table is a list and for lookup you just linearly
traverse the list until you find the correct element.

Your functions for name substitution and function application
must have as argument the symbol table and return the modified
symbol table.



Supporting code

You are going to be provided with skeleton code for a subset of VML
in the following files:

Scanner     (.sig, .sml)    
Parser        (.sig, .sml)
Interpreter (.sig, .sml)
Dictionary (.sig)
DictionaryBST (.sml)
sources.cm     (the equivalent of a make file for sml)

plus some additional support files. All the code is documented and
comments about where you are suppose to make your modifications
are included.

The code implements simple scanning, parsing and interpretation
of arithmetic expressions. I call this simple language VML-. Only
single line input is supported and I have included two test cases
of vml- programs with their corresponding outputs. A readme
file describes the contents of the directory. It also uses the compilation
manager which is a version of make on steriods for ML. All you
need to know is that you put all your source files in sources.cm
and then you type CM.make("sources") and everything compiles.

Try to read and understand carefully how the example code
works before you attempt to start modifying it. Here it is:

vml-.tar.gz

Deliverables 


Modified Scanner.sml that scans properly full VML files                                                                                      (1 points)
Modified Parser.sml    that builds the full syntax tree                                                                                             (2 points)
Modified Interpreter.sml that correctly interpreters full VML problems                                                                 (2 points)
Set of test cases that you have designed to test your code     (minimum 20 that test drive most cases)                   (1 point)
(note: I will be using tests cases of other groups to test your code so make sure you have
thoroughly tested your code)
Dictionary based on association lists                                                                                                                      (1 point)
Documentation  + comments                                                                                                                                  (1 point)


A bit of history 

The problem of symbolic differentiation of mathematical expressions in addition to being interesting and
 important has some interesting history behind it. John McCarthy (the inventor of LISP) in 1958 spent a
summer working at the IBM information research department trying to write a program to perform symbolic
differentation of algebraic expressions. In order to solve this problem in a reasonable way it is necessary for a programming
language to have recursion and conditional expressions neither of which were available at the languages of that time (mainly
Fortran). This led McCarthy to the invention of LISP and the ideas of list processing, recursion and garbage collection
entered Computer Science.

This assignment can be seen as a very mini-version of symbolic mathematical packages such as Mapple and Mathematica which
are widely used in science and engineering.

Extra credit 

If you plan to work on these extra credit questions make sure you send
me an email with what you are planning to do so that I can confirm that
it counts as extra credit and can provide you with some guidelines
(basically I want to avoid the postscript confusion of assignment II)

0)    Make the interpreter of VML interactive (like ML or Scheme) (1 extra point)
1)    Add another statement to VML called plot that takes as arguments a filename,
a starting x-value, an ending x-value, number of points between them and an expression
and outputs to the filename postscript commands that create a drawing of the corresponding
function from starting x to ending x with number of points.
2)    Implement part 2 in another functional language (for example Scheme or Haskell) (2 extra points)
3)    Implement part 2 in another non-functional language of your choise and write
a short 2 page paper contrasting the two approaches (3 extra points)