SECD Machine
Table of Contents
1. Introduction
Landin introduced the SECD machine back in 1964, Plotkin used it in the mid 1970s, and Henderson employed it in his book on Lispkit in 1980.
We will discuss the description of the theoretical architecture of the SECD machine (namely, its registers, its instruction set, and how each operation transforms the SECD machine state). Once we have finished this, we will consider a compiler from Lisp to SECD instructions.
Why Lisp? Well, we don't need to consider a parser or grammar for a simple input language. It's nothing more than wishing to avoid difficulties.
2. Basic Data Structures
The five key data structures SECD machines will build, manipulate, and keep in memory:
- Arbitrary S-expressions for computed data
- Lists representing programs to be executed
- Stacks used by the program's instructions
- "Value lists" containing the arguments for uncompleted function applications
- Closures to represent unprocessed function applications
3. Machine Description
Like any abstract machine, we have some instruction set which manipulates stacks or registers.
3.1. The four "registers"
There are four "registers" (stacks):
sstack- holds intermediate results, it's a list of expressions
eenvironment- holds values bound to variables
ccontrol- contains the code being executed
ddump- stores the contents of the other registers temporarily when a function-call occurs (and restores them upon return)
Everything in the environment e is given a unique ID, which is
essentially a tuple (x, y) where x is the location of the list
on the stack, y is the position in that list.
3.2. Operations
Following Henderson, we have the following instruction set for our SECD machine:
LD- load
LDC- load a constant onto the stack
LDF- load a function onto the stack
AP- apply a function
RTN- return
DUM- create a dummy environment
RAP- recursively apply
SEL- select subcontrol
JOIN- rejoin main control
CAR- take the
carof the top of thesstack CDR- take the
cdrof the top of thesstack ATOM- apply the
atompredicate to the top stack item CONS- form
consof top two stack items EQ- apply
eqpredicate to top two stack items STOP- stop the machine, we're done
And then about a half-dozen arithmetic operators (ADD, SUB,
MUL, DIV, REM, LEQ).
3.3. Operation Transition Rules
The semantics for our SECD machine is described using a transition
relation which maps s e c d -> s' e' c' d'.
3.3.1. S-Expression related Rules
The CONS, CDR, CAR instructions behave as we would hope and
expect:
CONS(a b . s) e (CONS . c) dbecomes((a . b) . s) e c dCAR((a . b) . s) e (CAR . c) dbecomes(a . s) e c dCDR((a . b) . s) e (CDR . c) dbecomes(b . s) e c d
The ATOM predicate tests if the top item in s is an atomic
datum or not.
ATOM(a . s) e (ATOM . c) dbecomes(t . s) e c dwheret=Tifais in fact an atom,t=Fifais a list.
3.3.2. Arithmetic Rules
The arithmetic operations applied to the top 2 items of the stack, but do not modify the other registers:
ADDtakes(a b . s)and produces(b + a . s)SUBtakes(a b . s)and produces(b - a . s)MULtakes(a b . s)and produces(b * a . s)DIVtakes(a b . s)and produces(b / a . s)REMtakes(a b . s)and produces(b % a . s)LEQtakes(a b . s)and produces(b <= a . s)EQtakes(a b . s)and produces(b = a . s)
3.3.3. Loading Rules
LDs e (LD i . c) dbecomes(x . s) e c dwherex = locate(i, e)is the current value for the indexi, andlocate()is a function in the meta-languageLDCs e (LDC n . c) dbecomes(n . s) e c dLDFs e (LDF c' . c) dbecomes((c' . e) . s) e c dwhere the closure is(c' . e).
The pseudocode for the locate() function requires first
introducing a function which returns the nth member of a list
(zero-indexed):
index :: Int -> List -> a
index 0 s = car(s)
index n s = if n < 0 then error "Must have non-negative index number"
else index (n - 1) cdr(s)
We use this to define a function which determines, for an index
pair i = (b . n) the corresponding value in the environment:
locate :: (Int, Int) -> Env -> a locate (b, n) e = index n (index b e)
3.3.4. Conditional Form Rules
There are two instructions for conditional forms, namely SEL
selects a sublist of the control based on the value at the top of
the stack, and JOIN rejoins the main control. We could then
expect the control to look like
(... SEL c_TRUE c_FALSE c)
where both c_TRUE and c_FALSE look like a list whose last
element is JOIN, i.e., c_T = (... JOIN).
SEL(x . s) e (SEL c_T c_F . c) dbecomess e c_X (c . d)wherec_Xisc_Tifx = T, andc_Xisc_Fifx = FJOINs e (JOIN) (c . d)evaluates tos e c d
So SEL is a jump instruction, JOIN jumps back to the function
caller.
3.3.5. Function Application
We need to bear in mind how we encode closures, because they govern
how we implement AP, RTN, etc. A closure is just (c . e) a
pair of instructions plus an environment for it.
AP((c' . e') v . s) e (AP .c) devaluates toNIL (v . e') c' (s e c . d)a state with:- a fresh
NILstack, - a new environment
(v . e'), - the instructions to execute
c'given by the function body, and - the saved state at the time of function call
(s e c . d)
- a fresh
RTN(x) e' (RTN) (s e c . d)becomes(x . s) e c d, it restores the previous state of thee,c,dregisters, and pushesxto the top of the previoussregister-state.DUM- creating a new environment with
PENDINGas the first sublist, transformss e (DUM . c) dintos (PENDING . e) c d. RAP- recursively apply,
((c' . e') v . s) (PENDING . e) (RAP . c) dbecomesNIL rplaca(e', v) c' (s e c . d)whererplacais a function in the metalanguage (compare toAPwhich instead had(v . e')directly)
The recursive aspect of RAP is due to rplaca which amounts to
be a (setf (car e') v) (thus replacing the PENDING dummy
metasymbol introduced by DUM). The mnemonic really is rplaca(x,y)
"replace car of x by y and return pointer to x".
Remember when we noted e is basically a stack of lists, and data
stored in the e environment is given a unique identifier (x, y) ?
In a typical function call with AP, a new list is pushed onto
e. Any local variables now look like (len(e), y). This len(e)
is problematic for recursive functions, because the stack grows
with each recursive function call.
That's why RAP, instead of pushing a new list onto the
environment stack, simply replaces whatever is at the head of e
with a new environment list.
For further consideration about recursive functions in SECD, consider also reading John Ramsdell's "The Tail-Recursive SECD Machine" (1999).
4. References
- Peter Henderson,
Functional Programming: Application and Implementation.
Prentice-Hall International Series in Computer Science,
- See especially chapter 6.
- Gordon Plotkin, "Call-byname, call-by-value and the lambda-calculus". Theoretical Computer Science 1 (1975) pp. 125-159.
4.1. SECD Machine
- StackOverflow thread
- Jia-Huai You, SECD Virtual Machine (archive, lecture notes/slides)
- John D. Ramsdell, "The Tail-Recursive SECD Machine". Manuscript dated 1999. Eprint, CiteSeerX.
- Revisiting SECD and the Power of Lisp
- P.M. Kogge, The Architecture of Symbolic Computers. McGraw–Hill, 1991, ch. 7.
- W.H. Burge, Recursive Programming Techniques. Addison-Wesley, 1975.
4.2. Lisp to SECD Compiler
- Peter Henderson's book is about this very topic
- Milos Radovanovic, Mirjana Ivanovic, "An Implementation of Lispkit Lisp in Java". In Proc. XV Conf. on Applied Math., 2002, pp.169-178, PDF, CiteSeerX.
- A Rational Deconstruction of Landin’s SECD Machine
- In the SECD Machine how does “rap” work?
- SECD: Design Issues