Equational Reasoning in Standard ML
Table of Contents
1. Overview
The excitement with statically-typed functional programming is that we can reason about it. The technical term is "equational reasoning" and resembles Dijkstra's structured derivations. For purely functional languages, we can use equational reasoning for the entire program.
The critical property enabling us to reason about functional programs is referential transparency. The only two features of Standard ML which hinders referential transparency: IO, and references. (IO is a thorny issue in functional programming.)
Since Standard ML has some impure aspects, and has eager evaluation, there are some subtleties about how to reason about ML code.
Moral: Well-written functional code corresponds to its own correctness proof.
2. Definitions
We have an equivalence, in ascii written ==
(so e1 == e2 : T
is a
predicate on two closed expressions e1
and e2
, both have the same
type T
). In math, on paper, we write \(e_{1}\cong e_{2} : \tau\). We
usually suppress the type.
e value
means thate
is a value; not every expression has a value (e.g.,5 div 0
).- For now, values may be described inductively as:
- The basic types like numbers, strings, characters,
- applications of datatype constructors to values,
- More generally, "values" would be syntactical values or "non-expansive expressions" as found in chapter 4 of the definition of Standard ML (1997 revision).
- For now, values may be described inductively as:
e valuable
mean "There exists av
such thatv value
ande == v
"- If an expression raises an exception, then it is not valuable.
f : T1 -> T2, f total
means "For all valuesv1 : T1
, ifv1 value
, then(f v2) valuable
."- Examples:
(fn s => s)
,op +
,Int.toString
- Non-Examples:
div
,exp
- Examples:
- It's common to abuse language and write: "for all values
e : T
…" or even "for allv : T
…" to mean "for alle : T
, ife value
, then…". Specifically, this means a quantifier over arbitrary expressions must be explicitly indicates as, "For all expressionse : T
…". - We write
e'[e/x]
for substitution ofe
forx
ine'
. - Reduction of an expression is written
e ==> e'
(or on paper as \(e \Longrightarrow e'\)) and is defined using the operational semantics of Standard ML. It does not necessarily correspond to the steps the computer takes when evaluating a Standard ML expression (since optimizations may occur, etc.).- We write \(e\stackrel{k}{\Longrightarrow} e'\) for "\(e\) reduces to \(e'\) in \(k\) steps of reduction"
- If we have \(e\Longrightarrow e'\), then we mean "\(e\) reduces to \(e'\) in zero or more steps of reduction"
- Evaluation of an expression is denoted \(e\hookrightarrow v\)
- Evaluation and reduction are related in the sense that if \(e\hookrightarrow v\) then either \(e\) is \(v\) already or \(e\stackrel{1}{\Longrightarrow}e_{1}\Longrightarrow v\)
- We write
e |-> e'
for one step of calculation or reduction,e |=> e''
for multiple parallel steps of calculation or reduction, ande ==> e'''
for an arbitrary-but-finite number of steps of reduction.- Observe if
e |-> e'
, thene |=> e'
- If
e |=> e'
, thene ==> e'
- If
e1 ==> e2
ande2 ==> e3
, thene1 ==> e3
. - Example: evaluating
(1 + 2)*(3 + 4)
- We have
(1 + 2)*(3 + 4) |=> 3*7
since(1 + 2) |=> 3
and separately3 + 4 |=> 7
; and then we have3*7 |=> 21
. So|=>
does one step of evaluation on disjoint subexpressions. - But we also have
(1 + 2)*(3 + 4) ==> 21
done in one step. - The longest route is to do one step of evaluation at a time, which
could look something like:
(1 + 2)*(3 + 4) |-> 3*(3 + 4) |-> 3*7 |-> 21
, but there are other possibilities.
- We have
- Syntax on paper, we would write:
e ==> e'
as \(e \Longrightarrow e'\)e |-> e'
as \(e \mapsto e'\)e |=> e'
- \(e\hookrightarrow v\) for \(e\Longrightarrow v\) and \(v\) is a value, i.e., \(e\) is valuable and \(e\Longrightarrow v\)
- Observe if
3. Provisional Definition of Extensional Equivalence
There are several cases to the definition of extensional equivalence. Really, there is an abstract definition, and these several cases are theorems…but the abstract definition is too abstract.
3.1. For "base" types
Let \(e\), \(e'\) be two expressions of the same base type (e.g., int
,
char
, string
, bool
, word
, unit
, etc.). We say they are
Extensionally Equivalent, written \(e\cong e'\), if one of the
following is true:
- Evaluation of
e
produces the same value as the evaluation ofe'
, - Evaluation of
e
raises the same exception as the evaluation ofe'
, or - Both loop forever.
We can see that two values of the same base type are extensionally
equivalent iff they are identical (modulo silly encoding issues, like
0x10
and 16
are the same value but expressed in superficially
different literals).
Also notice, if we have the product of base types, extensional equivalence carries over component-wise.
3.2. For Function Values
We say two function values f : T -> T'
and g : T -> T'
of the same
type are extensionally equivalent whenever f(v) == g(w)
for all
extensionally equivalent argument values v : T
and w : T
.
A "function value" is an anonymous lambda expression fn x => e
together with a (possibly empty) environment of bindings for any
"nonlocal variables" appearing in the body of the function. This pairing
(of a lambda expression plus an environment) form a Closure.
3.3. For Functions
We will say two expressions of type T -> T'
are extensionally
equivalent if one of the following hold:
- they both evaluate to extensionally equivalent function values,
- they both raise extensionally equivalent exceptions when evaluates, or
- both loop forever when evaluated.
3.4. General Definition
Two expressions e
and e'
are extensionally equivalent provided (a)
they are of the same type, and (b) one of the following holds:
- they both evaluate to extensionally equivalent values, or
- they both raise extensionally equivalent exceptions when evaluates, or
- both loop forever when evaluated.
(Two exceptions are extensionally equivalent if they are the same kind of exception and their payloads are extensionally equivalent.)
When e
and e'
are extensionally equivalent, we write e == e'
in
code and \(e\cong e'\) on paper.
3.5. Using this Definition
For us to actually use this definition, we need to introduce the various base types and germane information, like:
- Values of the type
- Primitive Operations provided by the Standard Basis
- Typing rules associated with the primitive operations
- Evaluation rules
Since most of the primitive operations are functions and Standard ML
is call-by-value, the only evaluation rules we really need are for the
special functions like andalso
, orelse
, etc.
What exactly are "values"? They correspond to non-expansive expressions as found in §4.7 of the 1997 Definition of Standard ML. These are literal values ("special constants"), tuples of non-expansive expressions, function values, constructors applied to non-expansive expressions, etc.
4. Initial Specification of a Few Types
We need to specify a few of the "base types" for Standard ML before we can reason about things. This will include giving the operational semantics for a few of the functions found in the Standard basis library (i.e., "primitive functions").
We should also provide similar specifications for char
, string
, unit
.
The other interesting types are tuples, records, functions; we do not
specify these, but they should be understood.
4.1. Integers
- Type
int
- Values
- All the integers (this may or may not coincide with your implementation, we assume this is the case).
- Operations
- Let
e1
,e2
be integer expressions. Then we have the following additional operations:e1 + e2
,e1 - e2
,e1 * e2
,e1 div e2
,e1 mod e2
, among many others found in the Standard Basis library'sINTEGER
signature. - Typing Rules
- If
<op>
is a binary operator of integers, ife1 : int
ande2 : int
, thene1 <op> e2 : int
. - Evaluation
- These operations are evaluated "from left to right" by first evaluating the arguments, then producing the final value.
For an explicit example of the evaluation rules, we have for addition the following three rules:
\begin{equation} \frac{e_{1}\stackrel{1}{\Longrightarrow}e_{1}'}{e_{1}~\mbox{\texttt{+}}~e_{2} \stackrel{1}{\Longrightarrow} e_{1}'~\mbox{\texttt{+}}~e_{2}} \end{equation} \begin{equation} \frac{n_{1}~\mbox{value}\qquad e_{2}\stackrel{1}{\Longrightarrow}e_{2}'}{n_{1}~\mbox{\texttt{+}}~e_{2} \stackrel{1}{\Longrightarrow} n_{1}~\mbox{\texttt{+}}~e_{2}'} \end{equation} \begin{equation} \frac{n_{1}~\mbox{value}\qquad n_{2}~\mbox{value}}{n_{1}~\mbox{\texttt{+}}~n_{2} \stackrel{1}{\Longrightarrow} n} \end{equation}where \(n\) is the integer value representing the sum of the integer values \(n_{1}\) and \(n_{2}\).
The other evaluation rules are similarly defined.
4.2. Booleans
- Type
bool
- Values
true
andfalse
- Operations
- Let
e1
ande2
be Boolean expressions. Thene1 andalso e2
,e1 orelse e2
,not e1
are Boolean expressions. Ift1
,t2
are any two expressions of typeT
, thenif e1 then t1 else t2
is an expression of typeT
. - Typing Rules
- As we would expect.
- Evaluation
- The definition specifies
e1 andalso e2
is syntactic sugar forif e1 then e2 else false
, ande1 orelse e2
is syntactic sugar forif e1 then true else e2
. By "syntactic sugar", we mean we can replaceandalso
,orelse
with these expressions.
5. Basic Properties
For every syntactically valid Standard ML expression e
(i.e., e
is
well-typed and has type T
) that we can
evaluate, exactly one of the following holds:
e ==> v
for some valuev : T
- the evaluation of
e
raises some exception - the evaluation of
e
loops forever.
The dynamic semantics for expressions in Standard ML is given by rules (90) through (149) of the 1997 Definition. By inspection, we see that these are the only possibilities.
Two expressions e
and e'
are said to be Extensionally Equivalent
(written \(e\cong e'\) or e == e'
) if there is a type T
such that e : T
and e' : T
, and one of the following holds:
- there are values
v : T
andv' : T
such thate ==> v
ande' ==> v'
andv == v'
, or - the evaluation of
e
ande'
both raise extensionally equivalent exceptions, or - the evaluation of both
e
ande'
loop forever.
If e1 ==> e2
, then e1 == e2
.
If e1 ==> e
and e2 ==> e
, then e1 == e2
.
Equivalence is an equivalence relation:
- Reflexivity:
e == e
- Symmetry: if
e1 == e2
, thene2 == e1
- Transitivity: if
e1 == e2
ande2 == e3
, thene1 == e3
.
Proof of reflexivity: by theorem 5, there are three possibilities when
trying to evaluate e
:
- Consider
v : T
such thate ==> v
. Thene ==> v
ande ==> v
, andv == v
, which meanse == e
by definition of extensional equivalence. - The evaluation of
e
raises an exception, which is extensionally equivalent to itself. Hencee == e
by the second clause of the definition of extensional equivalence. - The evaluation of
e
loops forever. Thene == e
by the third clause of the definition of extensional equivalence.
Proof of symmetry is similar, working by cases given by the clauses of the definition of extensional equivalence.
Transitivity is similarly done by cases, using symmetry of extensional equivalence to prove the first case, and possibly some lemmas for the remaining two cases.
Furthermore, extensional equivalence is a congruence (in the sense of universal algebra), meaning we can replace one expression with an equivalent one "inside" any bigger expression. Let us formalize this notion using the idea of "referential transparency".
- We call a context \(E[-]\) (i.e., an "expression with a hole") Referentially Transparent if for any \(e\hookrightarrow v\) of appropriate type we have \(E[e]\cong E[v]\).
- We can call an expression \(e\) Referentially Transparent if for every well-typed sub-expression \(e_{1}\) there is a corresponding context \(E[-]\) such that (a) we have \(E[e_{1}]=e\) identically, and (b) \(E[-]\) is a referentially transparent context.
- We want to work with referentially transparent expressions.
The intuition is that referentially transparent expressions are "pure functions" or build out of "pure functions". There are no side-effects when evaluating a referentially transparent expression.
Let \(e\) be a referentially transparent expression. Let \(e_{1}\) be any subexpression of \(e\) and \(e = E[e_{1}]\). Then for any well-typed expression \(e_{2}\cong e_{1}\) we have \(E[e_{1}]\cong E[e_{2}]\).
For applications, we may generalize from the situation with a single argument:
- If
e2 == e2'
, thene1 e2 == e1 e2'
- If
e1 == e1'
, thene1 e2 == e1' e2
On tuples, the general pattern may be gleaned from the ordered pair:
- If
e1 == e1'
, then(e1, e2) == (e1', e2)
- If
e2 == e2'
, then(e1, e2) == (e1, e2')
If e1 == e1'
,
then (let val x = e1 in e2 end) == (let val x = e1' in e2 end)
.
If e1 == e1'
, then
case e of [] => e1 | x :: xs => e2 == case e of [] => e1' | x :: xs => e2
For subexpressions which "go under binders", we quantify over all values of the appropriate type. Consider the following examples:
(Function bodies)
If for all values v : T
, e[v/x] == e'[v/x]
, then
(fn x : T => e) == (fn x : T => e')
.
(Let bodies)
If for all values v : T
, we have e2[v/x] == e2'[v/x]
, then
(let val x : T = e1 in e2 end) == (let val x : T = e1 in e2' end)
.
(Destructuring lists)
If for all values v
and vs
we have e2[v/x][vs/xs] == e2'[v/x][vs/xs]
,
then
case e of [] => e1 | x::xs => e2 == case e of [] => e1 | x::xs => e2'
6. Reductions
Since Standard ML is call-by-value, we evaluate the arugment to a function before substituting into the body. We can express this by the following:
If e2 |-> e2'
,
then (fn x => e) e2 |-> (fn x => e) e2'
.
We write |->
for one step of calculation, and we write ==>
to mean
there have been an arbitrary number of steps.
For short-circuiting boolean operators, andalso
and orelse
, we have
different rules of evaluation:
- If
e1 ==> true
, thene1 andalso e2 ==> e2
ande1 orelse e2 ==> true
; - If
e1 ==> false
, thene1 andalso e2 ==> false
ande1 orelse e2 ==> e2
.
(This is because andalso
and orelse
are not functions)
If e' value
, then
(fn x => e) e' = e[e'/x]
.
The critical aspect of this second rule is the premise e' value
. We
have several other reduction rules, which "should be obvious".
(case [] of [] => e1 | x::xs => e2) == e1
and
If v::vs value
, then
(case v::vs of [] => e1 | x::xs => e2) == e2[v/x][vs/xs]
If (v1,v2) value
, then
(let val (x,y) = (v1,v2) in e end) == e[v1/x][v2/y]
.
If e'
is valuable, then ((fn x => e) e') == (let x = e' in e end)
.
7. Type-Directed Rules
Two functions are equivalent if they agree on all arguments; i.e.,
if for all values v : T1
we have f v == g v : T2
,
then f == g : T1 -> T2
.
Further, for all values v
and v'
,
if v == v' : T1
, then f v == g v' : T2
Constructors are injective and disjoint.
Denote by !=
the negation of our equivalence relation (so a != b
means not a==b
), then we have
x::xs != []
- If
y::ys == x::xs
, thenx==y
andxs==ys
.
8. Valuability
[] valuable
If x
and xs
are valuable (with values v
and vs
, respectively),
then x::xs valuable
.
If x
and y
are valuable, then (x,y) valuable
.
(fn x => e) valuable
If f total
and e valuable
,
then (f e) valuable
.
To prove f:T->T'
is total, we want to show f v
is valuable for all
values v:T
.
Let e
, e1
, e2
be expressions.
If:
e
is valuable,e==[]
impliese1 valuable
, and- for all values
v
andvs
,e==v::vs
impliese2[v/x][vs/xs]
is valuable;
…then we have
(case e of [] => e1 | x::xs => e2) valuable
In general, valuability is closed under equivalence:
If e==e'
and e'
is valuable, then e
is valuable.
(Valuable-Stepping Principle)
If e2
is a valuable expression, then
(fn x => e1) e2 == e1[e2/x]
.
This can be made more complicated if e1
or e2
is impure.
9. Inversions
For closed terms, e == v
implies e
evaluates to a value e ==> v'
where v' == v
. This
justifies the following inversion(s):
If e1 e2 == v
,
then there exists an expression e
and a value v
such that
e1 == (fn x => e)
e2 == v2
e[v2/x] == v
If case e of [...]
is valuable,
then e
is valuable.
10. Concluding Remarks
For the axiomatic behaviour and contracts for the standard basis library, I would defer to the Basis library of Hamlet:
HaMLet emphasizes adherence to the Standard ML definition, so this may be useful.
11. References
- Bob Harper, "Equational Reasoning Guide",
CMU CS15-150.
- 2012 Handout
- Notes on structural induction
- Notes on Evaluation
- Notes using
=>*
instead of==>
- David Walker, Princeton COS326 notes, 2012