Standard ML

1. Overview
2. Core Language
3. Modules
- 3.1. Signatures
  - 3.1.1. Usefulness of Opaque Ascription
4. Value Restriction
5. Lists: Map, Fold, Filter is all you need
6. References

This page has gotten a little unruly. I might refactor out the discussion of the Core and Module language into a separate tutorial page. For additional topics, I have separated out other discussions.

1. Overview

The primordial "statically typed functional programming language", Standard ML was invented to program theorem provers. The language consists of two parts:

The Core Language = the typed-lambda calculus equipped with algebraic data types
The module language = the signatures, structures, functors

This is incredibly minimalistic compared to, say, Haskell.

The "pure" fragment of the core language is called PCF (Programming Computable Functions) for historic reasons, which is itself an interesting little language.

2. Core Language

The top-level amounts to declaration of values, functions, and type definitions. Let's examine these one-by-one.

2.1. Values

These are the easiest to visualize and explain. Values are named constants.

val pi = 3.14159; (* A declaration without a type annotation *)
val tau : real = 6.28318; (* An annotated declaration *)

2.2. Functions

There are two styles to declaring functions, the curried approach and the uncurried approach. The uncurried approach declares the function to take a tuple and produce a value.

fun cylinder_volume(r,h) = pi*r*r*h;

We can optionally annotate each parameter with its type:

fun cyl_vol2(r : real, h : real) = pi*r*r*h;

We can optionally annotate the result

fun cyl_vol3(r,h) : real = pi*r*r*h;

We can use curried versions:

fun cyl_vol4 r h = pi*r*r*h;

To have a type annotated curried function definition, we must parenthetize each argument to declare its type:

fun cyl_vol5 (r : real) (h : real) = pi*r*r*h;

This makes the line really long when we have more than a couple arguments to a function; I suspect this is one reason people use signatures and modules. We stick the declaration in the signature and the definition in the module, but we'll tackle that later.

2.2.1. Semantics

Function calls are call-by-value (i.e., evaluate the arguments, then copy the value to be bound by the formal parameters).

Also, when a function is defined, it binds with the environment available at the time. For example:

val pi = 3.14159;

fun area(r) = pi*r*r;

val pi = 1.0; (* redefinitions are allowed *)

area(1.0);

What do you think will be produced? Well, when area was defined, pi was bound to 3.14159, so area(1.0) evaluates to 3.14159*1.0*1.0.

2.2.2. Documentation

Carnegie-Mellon's CS15-150 course recommends the following style for writing functions:

Write a comment which starts with the type annotation of the function, e.g., split : 'a list -> ('a list)*('a list)
The next line in the comment should be REQUIRES: <assumption>
- Each assumption lives in its own REQUIRES line
- If there are no assumptions, then write REQUIRES: true
The next line in the comment should be ENSURES: <property about result>; this ends the necessary parts of the comment.
- For case-based results, the ENSURES: statement will resemble something like the example below.
Then the code for the function implementation
Then add some unit tests for the function, to make sure it works as intended.

(* select : int -> int
   REQUIRES: true
   ENSURES: select(x) returns
                      0 if x = 1,
                      x if x < 1,
                      and x*x if x > 1.
*)
fun select (x:int):int =
    (case (square x, x > 0) of
         (1, true) => 0
       | (_, false) => x
       | (sqr, _) => sqr)

2.3. Expressions

Besides the usual arithmetic expressions, we have a few more worth mentioning:

let-in-end expressions for local definitions
if-then-else expressions for conditional flow
case expressions for pattern matching

2.4. Boolean Operators

Care must be taken, since in Standard ML the boolean operators provided are:

andalso for short-circuiting conjunction
orelse for short-circuiting disjunction
not for negation

The and keyword has a technical meaning for simultaneous definitions.

We test for equality using = and inequality using <>, but this only works on "equality types" (unit, int, word, char, string, array, vector, ref).

2.5. Types

The built-in types include: bool, int, real (for floating point), char, string, and unit (analogous to void in C-like languages). We can build tuples using the Cartesian product, written T1*T2 for the product of type T1 with type T2. Function types are built using arrows T1 -> T2, like in Haskell or OCaml.

There are two ways to introduce types: one is to make an alias of an existing type, the other is to construct an algebraic data type.

type Identifier = string;

datatype Expr = Var of Identifier
              | Abs of (Expr*Expr)
              | App of (Expr*Expr);

It's important to note the constructor is separated from the rest of the row by an "of" keyword.

CAUTION: Unlike Haskell, Standard ML writes parametrized types "backwards"; e.g., in Haskell we have a List int, but in Standard ML we have int list.

Unlike Haskell, if we want to have a parametrized type, we need to include the apostrophe, e.g., datatype 'a Foo = Bar of ('a * int).

2.5.1. Records

Records are key-value data structures, like tuples, but we access a field by name rather than by its position.

type person : {name:string,
               age:int,
               born:int,
               occupation:string};
val harry : person = {name = "Harry",
                      age = 17,
                      born = 1956,
                      occupation = "Master of the Universe"};

print (#occupation harry); (* prints "Master of the Universe" to screen *)

We can destructure records in functions, provided we give a type annotation (the ": person" in parentheses). Without it, we get an error alerting us about an unresolved flexible record type.

fun is_nice ({occupation,...} : person) =
    occupation="Barista" orelse occupation="Cook";

(* alternatively *)
fun is_really_nice(p : person) =
let
    val {occupation, ...} = p (* example of destructuring value declarations *)
in
    occupation="Barista" orelse occupation="Cook"
end;

Without the dots in either destructuring instance produces errors (mismatch between pattern and expression).

2.5.2. Pattern Matching in Functions

We can continue with our expression data structure, and compute the number of symbols involved "by cases":

type Identifier = string;

datatype Expr = Var of Identifier
              | Abs of (Expr*Expr)
              | App of (Expr*Expr);

fun expr_length(Var id) = 1
  | expr_length(Abs(e1, e2)) = expr_length e1 + expr_length e2 + 1
  | expr_length(App(e1, e2)) = expr_length e1 + expr_length e2;

We can have a "catch all" pattern which handles all other cases. For example, if we want to count the number of distinct Abs instances in an expression, we could write:

fun abs_count(Abs(e1, e2)) = 1 + abs_count e1 + abs_count e2
  | abs_count(App(e1, e2)) = abs_count e1 + abs_count e2
  | _ = 0;

2.6. References to Memory Locations

Standard ML provides the ability to have mutable variables, by making them references to memory locations. Basically, we dereference the ref using an exclamation point, and assign it a value using :=. For example:

val iref : int ref;

iref := 3; (* initialize the ref *)

iref := !iref - 1; (* update the ref *)

val foo : int = (!iref)*7; (* use the ref *)

We should think of ref as a parametrized type, 'a ref.

2.7. Primitive Functions and Types

We are given some initial functions and type declarations in the "initial basis". The types given:

eqtype unit which has a single value, ()
eqtype int
eqtype word
type real for floating point numbers
eqtype char, characters are indicated by #"a" for 'a', or #"\n" for a newline character, or #"\uxxxx" where xxxx are 4 hexadecimal constants
eqtype string for strings (double quoted, "foo")
type substring
type exn for exceptions
eqtype 'a array
eqtype 'a vector
eqtype 'a ref
datatype bool = false | true
datatype 'a option = NONE | SOME of 'a
datatype order = LESS | EQUAL | GREATER
datatype 'a list = nil | :: of ('a * 'a list)

There are about a dozen primitive exceptions defined:

exception Bind
exception Chr
exception Div
exception Domain
exception Empty
exception Fail of string
exception Match
exception Option
exception Overflow
exception Size
exception Span
exception Subscript

The non-overloaded functions provided at the top-level:

List functions
- val app : ('a -> unit) -> 'a list -> unit, so app f lst applies f to lst ("from left to right")
- val @ : ('a list * 'a list) -> 'a list for concatenating lists
- val foldl : ('a*'b->'b)-> 'b -> 'a list -> 'b, interpreted as foldl f init x1::xs = foldl f f(x1,init) xs, and foldl f init [] = init.
- val foldr : ('a*'b->'b)-> 'b -> 'a list -> 'b, interpreted as foldr f init x::xs = f(x, foldr f init xs), and foldr f init [] = init.
- val hd : 'a list -> 'a produces the first element in a list hd x::xs = x, raises error if operating on empty list hd [] = raise Empty
- val length : 'a list -> int for the length of a list
- val map : ('a -> 'b) -> 'a list -> 'b list produces a new list by applying a function to each element of the supplied list
- val null : 'a list -> bool, tests if the list is empty
- val rev : 'a list -> 'a list reverses the list
- val tl : 'a list -> 'a list gives the rest of the list tl x::xs = xs, raises error if operating on empty list tl [] = raise Empty
Characters
- val chr : int -> char produces a character from a "code point" integer
- val ord : char -> int produces the "code point" for a given integer
Options
- val getOpt : ('a option * 'a) -> 'a
- val isSome : 'a option -> bool
- val valOf : 'a option -> 'a gives the value of the option (i.e., valOf (SOME x) = x)
Reals
- val ceil : real -> int
- val floor : real -> int
- val real : int -> real produces a real for a given int
- val round : real -> int rounds a real to its nearest integer
- val trunc : real -> int truncates a real (i.e., "throws away the fractional part")
References
- val ref : 'a -> 'a ref primitive
- val ! : 'a ref -> 'a
- val := : 'a ref * 'a -> unit
Strings
- val ^ : string * string -> string, an inline string concatenation s ^ t
- val concat : string list -> string concatenates together a list of strings (concat s::strs = foldr ^ s strs | [] = "")
- val explode : string -> char list produces the list of characters in a string
- val implode : char list -> string creates a string from a list of characters, should be the same as concat o (map str).
- val size : string -> int gives the size of the string
- val str : char -> string turns a character into a string, e.g., str #"a" = a
- val substring : string * int * int -> string will produce the substring starting at the first number taking a specific number of characters, e.g., substring (s,i,j) returns s[i..i+(j-1)]; it raises an exception if i < 0 or j < 0 or |s| < i+j
Miscellaneous
- val not : bool -> bool, negation (not true = false, not false=true)
- val before : 'a * unit -> 'a, an infixed operation, a before b returns a. It provides a notational shorthand for evaluating a, then b, before returning the value of a.
- val exnMessage : exn -> string, get the message for an exception
- val exnName : exn -> string, get the name for an exception
- val ignore : 'a -> unit, ignore a returns (). The purpose of ignore is to discard the result of a computation, returning () instead. This is useful, for example, when a higher-order function, such as List.app, requires a function returning unit, but the function to be used returns values of some other type.
- val o : ('a->'b) * ('c->'a) -> 'c->'b composes functions, like Haskell's dot operator, (g o f)(x) = g(f(x)).
- val use : string -> unit will load a file (compiler/system dependent)
- val print : string -> unit, prints a given string to the screen
- val vector : 'a list -> 'a vector produces a vector from a list

There are also about a dozen "overloaded identifiers" which programmers may not overload:

val + : num * num -> num addition (defaults to num=int)
val - : num * num -> num, subtraction
val * : num * num -> num, multiplication
val div : wordint * wordint -> wordint, takes the quotient of integers (or word-sized integers)
val mod : wordint * wordint -> wordint, i mod j returns i-j*floor((i/j))
val / : real * real -> real divides two real numbers
val ~ : num -> num negation, equivalent to multiplying by -1
val abs : realint -> realint, the absolute value of either a real or an int
val < : numtext * numtext -> bool, compares two number (or two strings lexicographically)
val > : numtext * numtext -> bool, compares two number (or two strings lexicographically)
val <= : numtext * numtext -> bool, compares two number (or two strings lexicographically)
val >= : numtext * numtext -> bool, compares two number (or two strings lexicographically)

The top-level environment has the following infix identifiers and associated precedence levels:

infix 7 *, /, div, mod
infix 6 +, -, ^
infixr 5 ::, @
infix 4 =, <>, >, >=, <, <=
infix 3 :=, o
infix 0 before

So we would parse a*b + i/j as (a*b) + (i/j) since the precedence for multiplication is higher than addition, and division is higher than addition.

There's a lot more functionality provided by the Standard ML built-in library, but the rest seem to be contained in modules.

3. Modules

We can structure programs using modules. Standard ML's module system is extraordinarily sophisticated. It consists of structure, signature, and functor (analogous to expressions, types, and functions).

We can think of a structure as acting like a namespacing mechanism at minimum. For example,

structure Foo = struct
    val foo = [];
    fun spam q::qs = 1 + spam qs
      | spam    [] = 0;
end;

Foo.span Foo.foo; (* => 0 *)

3.1. Signatures

Signatures specify a minimum necessary for a structure. Each entry in a signature is interpreted as a specification.

signature QUEUE = sig
    type 'a t; (* type of queues *)
    exception E; (* for errors *)
    val empty : 'a t; (* the empty queue *)
    val enq : 'a t * 'a -> 'a t; (* add to end *)
    val null : 'a t -> bool; (* test for empty queue *)
    val hd : 'a t -> 'a; (* return the front of the queue *)
    val deq : 'a t -> 'a t; (* remove the front of the queue *)
end;

This seems to be the naming convention most Standard ML programmers use: signatures are ALL_UPPER_CASE, structures are PascalCased.

Also, for something like this QUEUE signature, where the signature specifies a single type, it seems to be convention to call it "t" (presumably short for "type").

We can "implement" a queue by making the types declared in the signature "transparent" (accessible outside the structure) or "opaque" (inaccessible outside the structure). For example:

structure Q1 : QUEUE = struct
    type 'a t = 'a list;
    exception E;

    val empty = [];

    fun enq (q, x) = q@[x];

    fun null(x::q) = false
      | null _     = true;

    fun head(x::q) = x
      | head _     = raise E;

    fun deq(x::q) = q
      | deq _     = raise E;
end;

structure Q2 :> QUEUE = struct
    type 'a t = 'a list;
    exception E;

    val empty = [];

    fun enq (q, x) = q@[x];

    fun null(x::q) = false
      | null _     = true;

    fun head(x::q) = x
      | head _     = raise E;

    fun deq(x::q) = q
      | deq _     = raise E;
end;

List.null Q1.empty; (* evaluates to "true" *)

List.null Q2.empty; (* ERROR!!! *)
(* In SML/NJ we get:

stdIn:7.1-7.19 Error: operator and operand do not agree [tycon mismatch]
  operator domain: 'Z list
  operand:         'Y Q3.t
  in expression:
    List.null Q3.empty

In HaMLet:

(input 2):1.0-1.18: type mismatch on application
*)

Since QUEUE had not defined its type 'a t, when we have structure Q2 opaquely ascribe to QUEUE, it hides the definition of type 'a t: its constructors are hidden from the user, and distinct from existing constructors.

3.1.1. Usefulness of Opaque Ascription

If we don't want the user to cheat and use a certain data structure, we use opaque ascription :>. Why on Earth would this be useful?

In the LCF style of theorem provers, we have a KERNEL signature which encodes the state of the theorem prover. We encode proven theorems using a thm data type. But we do not want the user to promote any arbitrary formula to be a theorem. How can we enforce using only certain specific axioms?

With opaque signature ascription! We have

signature KERNEL = sig
    type thm;
    type context;

    val axiom_trueR : context -> thm;
    (* other inference rules omitted *)
end;

structure Kernel :> KERNEL = struct
    datatype thm = datatype Syntax.formula;
    type context = Syntax.formula list;

    (* inference rules omitted *)
end;

If a user tries to do something like thm my_result = RiemannHypothesisFormula, then they'll get an error.

4. Value Restriction

Polymorphic ref types can cause problems when they store a function. The way around this is value restriction (or what the 1997 revised definition calls "non-expansive expressions").

5. Lists: Map, Fold, Filter is all you need

The folklore result that you only really need map, foldl, and filter for a list may be found in:

Erik Meijer, Maarten Fokkinga, and Ross Paterson,
"Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire".
In Conference on functional programming languages and computer architecture (pp. 124–144). Berlin, Heidelberg: Springer Berlin Heidelberg. PDF.

Meijer and friends work in the category CPO which results in a lazy functional language. Here a type is an \(\omega\) CPO, and morphisms are continuous functions between them.

Earlier work:

Tatsuya Hagino,
"Codatatypes in ML." Journal of symbolic computation 8, no. 6 (1989) 629–650.

(See also arXiv:2010.05167for Hagino's PhD thesis, and his paper "A Typed Lambda Calculus with Categorical Type Constructors" eprint for more details/fun.)

6. References

John Mitchell and Robert Harper,
"The Essence of ML".
Robert Harper and Christopher Stone,
"An Interpretation of Standard ML in Type Theory".
CMU-CS-97-147 Tech. Report, dated June 27, 1997, PDF.
Norman Ramsey's Learning ML is a good reading list, with good recommendations on what parts of books to read (and what to skip).

6.1. Standard Basis

Implementations
- SML/NJ (see also cvs)
- HaMLet
- PolyML

6.2. Tutorials

Lawrence Paulson,
ML for the Working Programmer. CUP, second ed., 1996; Online version
M.Tofte, "Tips for Computer Scientists on Standard ML (Revised)". Eprint.

6.3. Definition of Standard ML

There is a precise definition of Standard ML (first published in 1990, later revised in 1997).

The "initial basis" (top-level environment provided by Standard ML) is described here, and the Standard Basis documentation describes the libraries provided "out of the box".
- Standard ML Basis documentation
Robin Milner, Mads Tofte,
Commentary on Standard ML.
MIT Press, 1991; CiteSeerX
- Anyone trying to read the formal definition should start with this commentary and the 1990 definition; although the 1997 revision is now the standard, it seems to have only modified the module portion of the definition.
Daniel K. Lee, Karl Crary, Robert Harper,
"Mechanizing the Metatheory of Standard ML".
Tech. Report, PDF
- Harper and friends have produced a mechanization in Twelf of the definition of Standard ML (amazing!)

6.4. Classic ML

A "dialect", I suppose we could call the "primordial Ur ML" (or "pre-Standard ML").

Christoph Kreitz and Vincent Rahli, "Introduction to Classic ML". Eprint, dated 2011.

6.5. PCF

Martin Hotzel Escardo, Introduction to Real PCF.
John C Mitchell,
Foundations for Programming Languages.
MIT Press, 1996; Chapter 2 discusses PCF, using an ML-like syntax.
Gilles Dowek,
Introduction to the Theory of Programming Languages. Springer, 2011; chapter 2 introduces PCF.
Ingmar Dasseville, Marc Denecker,
"Transpiling Programmable Computable Functions to Answer Set Programs". arXiv:1808.07770, 15 pages.
Catuscia Palamidessi,
"The Language PCF".
Lecture Notes, 1998, CSE520

6.5.1. Full Abstraction of PCF

There are concerns about whether the operational semantics of PCF coincides with its denotational semantics; if so, then it's a property called "full abstraction" (Stanford encyclopedia of philosophy entry).

Hyland and Ong's "On Full Abstraction for PCF: I, II, and III" note on page 293:

Plotkin showed in [61] that the standard model is adequate but not fully abstract for PCF. He also pointed out the reason for the failure of full abstraction. The mismatch may be explained, in a nutshell, by the fact that while PCF-programs correspond to sequential algorithms, the standard Scott-continuous function space model contains parallel functions or, more precisely, functions which can only be implemented by parallel algorithms (e.g., parallel or). This point was made explicit by Plotkin in [61] (see also [65] and [67] where the relation between extensions of PCF by various parallel constructs is studied) as follows.

…

61. Plotkin, G. D. (1977),
"LCF as a programming language",
Theoret.Comput.Sci. 5, 223–255.

65. Sazonov, V. Yu. (1975),
"Sequentiality and parallely computable functionals",
in ``Proc.Symp.Lambda Calculus and Computer Science Theory,'' Lecture Notes in Computer Science, Vol.37, Springer-Verlag, Berlin/New York.

67. Sazonov, V. Yu. (1976),
"Expressibility of functions in Scott's LCF language",
Algebra i Logika 15, 308–330.

Gordon Plotkin, "LCF considered as a programming language". Theoretical Computer Science 5, 3 (1977) 223–255. PDF
Robin Milner, "Fully abstract models of typed λ-calculi". Theoretical Computer Science 4 (1977) 1–22.
Samson Abramsky, Radha Jagadeesan, Pasquale Malacaria, "Full Abstraction for PCF". arXiv:1311.6125, 50 pages.
JME Hyland, CHL Ong, "On Full Abstraction for PCF: I, II, and III". Eprint: pdf, 124 pages.
Vladimir Sazonov, "Inductive Definition and Domain Theoretic Properties of Fully Abstract Models for PCF and PCF+". arXiv:0707.3170, 50 pages.

6.6. Category Theory

One major unresolved issue, for me, is whether we can articulate Standard ML in terms of category theory. It's equally unclear to me if Haskell can be articulated in terms of genuine category theory (Hask is not a category).

G. Bellè, C. B. Jay, E. Moggi, "Functorial ML". PLILP 1996: Programming Languages: Implementations, Logics, and Programs pp 32-46 eprint
Edward Kmett, "On Hask". Youtube, 18 July 2014 (1 hr 2 min)
Moggi, "An Abstract View of Programming Languages" PDF

6.7. Pure Functional Programming

Usually people don't do pure functional programming in Standard ML (or any eager functional language). The reason stems from a series of papers:

Nicholas Pippenger,
"Pure versus impure Lisp".
Argues that purely functional Lisp is not as efficient as impure Lisp.
Richard Bird, Geraint Jones, Oege de Moor,
"More haste, less speed: lazy versus eager evaluation".
PDF
- A direct response to Pippenger, showing that lazy purely functional Lisp is competitive with the eager impure Lisp.
- The "punch line" seems to be: purely functional languages ought to be lazy, not eager.
Robin Milner,
"Fully abstract models of typed lambda-calculi". Theoretical Computer Science 4, no.1 (1977) 1–22, Eprint
- Arguably, any purely functional programming language should satisfy Milner's context lemmas