Idioms - SML
Table of Contents
These are idioms gleaned from reading some random Standard ML programs.
1. Using type t
in Signatures
When encoding a data type as a signature, for example a signature SET
,
use type t
to specify the data type.
Example:
signature SET = sig type 'a t; val empty : 'a t; val insert : 'a -> 'a t -> 'a t; val null : 'a t -> bool; val member : 'a -> 'a t -> bool; val map : ('a -> 'b) -> 'a t -> 'b t; end; structure TreeSet : SET = struct datatype 'a t = E | T of int*'a*('a t)*('a t); val empty = E; fun insert x E = (* ... *) | insert x (T (n, v, l, r)) = (* ... *); fun null E = true | null _ = false; fun member _ E = false | x (T (_, v, l, r)) = (* ... *); fun map _ E = E | map f (T (_, x, l, r)) = (* ... *); end; (* mySet : TreeSet.t *) mySet = TreeSet.empty;
2. Datatypes in Signatures
One caveat worth discussing: we can end up with duplicate-but-distinct types if we place datatype declarations in signatures. For example, consider the signature:
signature EXP = sig datatype t = Var of id | Abs of id * t | App of t * t end;
Suppose we have two different structures which implement this signature, e.g., one for a parse tree and another for an abstract syntax tree:
structure ParseTree : EXP = struct datatype t = Var of id | Abs of id * t | App of t * t; end; structure AST : EXP = struct datatype t = Var of id | Abs of id * t | App of t * t; end;
The surprise: AST.t
and ParseTree.t
are two distinct types.
This is because Standard ML has generative datatype declarations,
meaning: each time a datatype
declaration is encountered, a new type
will be created. We could avoid this by doing something like:
struct Exp = struct datatype t = Var of id | Abs of id * t | App of t * t; end; signature EXP = sig datatype t = datatype Exp.t; end;
This is specified by rule 18 of the 1997 definition of SML to not generate duplicate datatypes. We could also do the following:
datatype Exp = Var of id | Abs of id * Exp | App of Exp * Exp; signature EXP = sig datatype t = datatype Exp; end;
But it is unclear if we could do the following:
local datatype Exp = Var of id | Abs of id * Exp | App of Exp * Exp; in signature EXP = sig datatype t = datatype Exp; end; end;
This also has the problem/feature that the constructors Var
, Abs
,
App
are private.
(This is also probably why signatures should have just type t
instead
of a datatype declaration.)
See:
- Generative Datatypes in MLton's wiki
- Tips for writing concise SML in MLton's wiki
- Appendix G §§2–3 of the Revised Definition of Standard ML.
- §4.3.3 of "The History of Standard ML"
- Karl Crary, Robert Harper, Perry Cheng, Leaf Petersen, Chris Stone, Transparent and opaque interpretations of datatypes (1998).
3. Stringify Types
In Haskell, we tend to have a single typeclass Show
which has a "to string"
method. Standard ML does not have anything similar to type classes, so
we end up encoding them as signatures. Strictly speaking, we would have
signature SHOW = sig type t; val toString : t -> string; end;
3.1. Type Conversions
More broadly, if we have defined two types A
and B
, and it is
"obvious" we can convert any value of type A
to a value of type B
,
then we define a function of the form:
structure A = struct type t; val toB : t -> B; (* val fromB : B -> t; if we can convert the other way around *) (* etc. *) end;
Then we have A.toB foo
be an "obvious" conversion.
- Gansner and Reppy's Standard ML Basis Library, section 1.1.4, encourages this sort of naming convention for conversion functions.
4. Comparable Types
When defining a new type T
, we would need to implement a function
compare : T * T ->
order
.
This should be placed in a signature, so it's "public facing".
We would want various nice properties of the ordering, like:
- Transitivity: if
compare(x,y) = compare(y,z)
, then we wantcompare(x,z) = compare(x,y)
too. - Reflexivity: for any
x
, we wantcompare(x, x) = EQUAL
- Antisymmetry: if
not (compare(x, y) = GREATER)
andnot (compare(y, x) = GREATER)
, then we expectcompare(x, y) = EQUAL
.
A "nice to have" property which, I am not certain is absolutely necessary:
- Completeness: for any
x
andy
, we wantcompare(x,y) orelse compare(y,x)
to be true
Examples:
References:
- Gansner and Reppy's Standard ML Basis Library, sections 1.1.3 and 4.1, encourages this sort of practice with linear orders.
4.1. Caveat: Equality Testing
The only caveat which springs to mind is that it may be faster to
implement your own function testing for equality. Again, Haskell has
Data.Eq
typeclass, but Standard ML has equality type kludge.
For example, if you are implementing a proof assistant, and you have a
datatype encoding logical propositions, it would be somewhat faster to
implement your own eq
function than to use compare (x,y) = EQUALS
.
5. Principle: Similar Functions should have Similar Names
If we are building our own collection data structure (say, a set or hashmap or whatever), then we should name the functions similar to analogous functions on lists. For example,
signature SET = sig type 'a t; val app : ('a -> unit) -> 'a t -> unit; val map : ('a -> 'b) -> 'a t -> 'b t; (* And so on... *) end;
The intuition being that an instance of SET.t
is like a list and
should have analogous map
and app
functions (among others). In
retrospect, a better design choice would have been to have a hierarchy
of signatures for collections, and the structure would implement the
relevant one.
- Gansner and Reppy's Standard ML Basis Library, section 1.1.2, encourages this sort of practice.