SECD Machine
Table of Contents
1. Introduction
Landin introduced the SECD machine back in 1964, Plotkin used it in the mid 1970s, and Henderson employed it in his book on Lispkit in 1980.
We will discuss the description of the theoretical architecture of the SECD machine (namely, its registers, its instruction set, and how each operation transforms the SECD machine state). Once we have finished this, we will consider a compiler from Lisp to SECD instructions.
Why Lisp? Well, we don't need to consider a parser or grammar for a simple input language. It's nothing more than wishing to avoid difficulties.
2. Basic Data Structures
The five key data structures SECD machines will build, manipulate, and keep in memory:
- Arbitrary S-expressions for computed data
- Lists representing programs to be executed
- Stacks used by the program's instructions
- "Value lists" containing the arguments for uncompleted function applications
- Closures to represent unprocessed function applications
3. Machine Description
Like any abstract machine, we have some instruction set which manipulates stacks or registers.
3.1. The four "registers"
There are four "registers" (stacks):
s
stack- holds intermediate results, it's a list of expressions
e
environment- holds values bound to variables
c
control- contains the code being executed
d
dump- stores the contents of the other registers temporarily when a function-call occurs (and restores them upon return)
Everything in the environment e
is given a unique ID, which is
essentially a tuple (x, y)
where x
is the location of the list
on the stack, y
is the position in that list.
3.2. Operations
Following Henderson, we have the following instruction set for our SECD machine:
LD
- load
LDC
- load a constant onto the stack
LDF
- load a function onto the stack
AP
- apply a function
RTN
- return
DUM
- create a dummy environment
RAP
- recursively apply
SEL
- select subcontrol
JOIN
- rejoin main control
CAR
- take the
car
of the top of thes
stack CDR
- take the
cdr
of the top of thes
stack ATOM
- apply the
atom
predicate to the top stack item CONS
- form
cons
of top two stack items EQ
- apply
eq
predicate to top two stack items STOP
- stop the machine, we're done
And then about a half-dozen arithmetic operators (ADD
, SUB
,
MUL
, DIV
, REM
, LEQ
).
3.3. Operation Transition Rules
The semantics for our SECD machine is described using a transition
relation which maps s e c d -> s' e' c' d'
.
3.3.1. S-Expression related Rules
The CONS
, CDR
, CAR
instructions behave as we would hope and
expect:
CONS
(a b . s) e (CONS . c) d
becomes((a . b) . s) e c d
CAR
((a . b) . s) e (CAR . c) d
becomes(a . s) e c d
CDR
((a . b) . s) e (CDR . c) d
becomes(b . s) e c d
The ATOM
predicate tests if the top item in s
is an atomic
datum or not.
ATOM
(a . s) e (ATOM . c) d
becomes(t . s) e c d
wheret=T
ifa
is in fact an atom,t=F
ifa
is a list.
3.3.2. Arithmetic Rules
The arithmetic operations applied to the top 2 items of the stack, but do not modify the other registers:
ADD
takes(a b . s)
and produces(b + a . s)
SUB
takes(a b . s)
and produces(b - a . s)
MUL
takes(a b . s)
and produces(b * a . s)
DIV
takes(a b . s)
and produces(b / a . s)
REM
takes(a b . s)
and produces(b % a . s)
LEQ
takes(a b . s)
and produces(b <= a . s)
EQ
takes(a b . s)
and produces(b = a . s)
3.3.3. Loading Rules
LD
s e (LD i . c) d
becomes(x . s) e c d
wherex = locate(i, e)
is the current value for the indexi
, andlocate()
is a function in the meta-languageLDC
s e (LDC n . c) d
becomes(n . s) e c d
LDF
s e (LDF c' . c) d
becomes((c' . e) . s) e c d
where the closure is(c' . e)
.
The pseudocode for the locate()
function requires first
introducing a function which returns the nth
member of a list
(zero-indexed):
index :: Int -> List -> a index 0 s = car(s) index n s = if n < 0 then error "Must have non-negative index number" else index (n - 1) cdr(s)
We use this to define a function which determines, for an index
pair i = (b . n)
the corresponding value in the environment:
locate :: (Int, Int) -> Env -> a locate (b, n) e = index n (index b e)
3.3.4. Conditional Form Rules
There are two instructions for conditional forms, namely SEL
selects a sublist of the control based on the value at the top of
the stack, and JOIN
rejoins the main control. We could then
expect the control to look like
(... SEL c_TRUE c_FALSE c)
where both c_TRUE
and c_FALSE
look like a list whose last
element is JOIN
, i.e., c_T = (... JOIN)
.
SEL
(x . s) e (SEL c_T c_F . c) d
becomess e c_X (c . d)
wherec_X
isc_T
ifx = T
, andc_X
isc_F
ifx = F
JOIN
s e (JOIN) (c . d)
evaluates tos e c d
So SEL
is a jump instruction, JOIN
jumps back to the function
caller.
3.3.5. Function Application
We need to bear in mind how we encode closures, because they govern
how we implement AP
, RTN
, etc. A closure is just (c . e)
a
pair of instructions plus an environment for it.
AP
((c' . e') v . s) e (AP .c) d
evaluates toNIL (v . e') c' (s e c . d)
a state with:- a fresh
NIL
stack, - a new environment
(v . e')
, - the instructions to execute
c'
given by the function body, and - the saved state at the time of function call
(s e c . d)
- a fresh
RTN
(x) e' (RTN) (s e c . d)
becomes(x . s) e c d
, it restores the previous state of thee
,c
,d
registers, and pushesx
to the top of the previouss
register-state.DUM
- creating a new environment with
PENDING
as the first sublist, transformss e (DUM . c) d
intos (PENDING . e) c d
. RAP
- recursively apply,
((c' . e') v . s) (PENDING . e) (RAP . c) d
becomesNIL rplaca(e', v) c' (s e c . d)
whererplaca
is a function in the metalanguage (compare toAP
which instead had(v . e')
directly)
The recursive aspect of RAP
is due to rplaca
which amounts to
be a (setf (car e') v)
(thus replacing the PENDING
dummy
metasymbol introduced by DUM
). The mnemonic really is rplaca(x,y)
"replace car of x
by y
and return pointer to x
".
Remember when we noted e
is basically a stack of lists, and data
stored in the e
environment is given a unique identifier (x, y)
?
In a typical function call with AP
, a new list is pushed onto
e
. Any local variables now look like (len(e), y)
. This len(e)
is problematic for recursive functions, because the stack grows
with each recursive function call.
That's why RAP
, instead of pushing a new list onto the
environment stack, simply replaces whatever is at the head of e
with a new environment list.
For further consideration about recursive functions in SECD, consider also reading John Ramsdell's "The Tail-Recursive SECD Machine" (1999).
4. References
- Peter Henderson,
Functional Programming: Application and Implementation.
Prentice-Hall International Series in Computer Science,
- See especially chapter 6.
- Gordon Plotkin, "Call-byname, call-by-value and the lambda-calculus". Theoretical Computer Science 1 (1975) pp. 125-159.
4.1. SECD Machine
- StackOverflow thread
- Jia-Huai You, SECD Virtual Machine (archive, lecture notes/slides)
- John D. Ramsdell, "The Tail-Recursive SECD Machine". Manuscript dated 1999. Eprint, CiteSeerX.
- Revisiting SECD and the Power of Lisp
- P.M. Kogge, The Architecture of Symbolic Computers. McGraw–Hill, 1991, ch. 7.
- W.H. Burge, Recursive Programming Techniques. Addison-Wesley, 1975.
4.2. Lisp to SECD Compiler
- Peter Henderson's book is about this very topic
- Milos Radovanovic, Mirjana Ivanovic, "An Implementation of Lispkit Lisp in Java". In Proc. XV Conf. on Applied Math., 2002, pp.169-178, PDF, CiteSeerX.
- A Rational Deconstruction of Landin’s SECD Machine
- In the SECD Machine how does “rap” work?
- SECD: Design Issues