\( \newcommand\D{\mathrm{d}} \newcommand\E{\mathrm{e}} \newcommand\I{\mathrm{i}} \newcommand\bigOh{\mathcal{O}} \newcommand{\cat}[1]{\mathbf{#1}} \newcommand\curl{\vec{\nabla}\times} \newcommand{\CC}{\mathbb{C}} \newcommand{\NN}{\mathbb{N}} \newcommand{\QQ}{\mathbb{Q}} \newcommand{\RR}{\mathbb{R}} \newcommand{\ZZ}{\mathbb{Z}} \)
UP | HOME

Architecture - I-Machine

Table of Contents

(These are annotations on the paper (local) describing the architecture of the Symbolics I-Machine.)

1. Lisp-Machine Data Types

1.1. Introduction to Lisp-Machine Objects

1.1.1. Memory Words

1.1.1.1. Length and Format

A word was 40-bits and contained three fields:

Position Length Field Name
<39:38> 2 bits Cdr Code
<37:32> 6 bits Data Type
<31:0> 32 bits Address or Immediate Data

This is a minimum requirement. Possibly more bits could be alloted to a word, allowing for ECC, etc. ECC or parity bits are not required, but are a possible extension.

In C, this would be a packed struct:

#pragma pack(1)
struct Word {
    uint32_t cdr  :  2;
    uint32_t type :  6;
    uint32_t data : 32;
};
1.1.1.2. Fields

The type field indicated what kind of information is stored in the word. (I think the types are dtp- prefixed identifiers in the documentation.)

The address or immediate data field is interpreted according to the data type of the word.

The cdr code is used for various different purposes depending on the type.

1.1.2. Classes of Stored Object Representations

An object can be stored:

  1. as a list, which is built out of one or more conses (e.g., conses, compact lists, closures, big ratios, double-precision floating point numbers, complex numbers, generic functions),
  2. as an immediate object, which does not require any additional memory for its representation (e.g., numbers, physical addresses, primitive types, instructions), or
  3. as a structure, represented as a block of memory words, the first contains a header with a special data type code (e.g., compiled functions, instances, arrays, symbols, bignums).

Each word within the structure object may contain:

  • an object reference,
  • a header
  • a forwarding pointer, or
  • a special marker.

1.1.3. Components of Stored Representations

The components of stored representations found in Lisp Machine memory are:

  • object references,
  • headers,
  • forwarding (invisible) pointers, or
  • special markers.
1.1.3.1. Object References

Object references are the mechanism by which one refers to an object. There are three types of object references:

  1. object references by address (implemented by a memory word whose address field contains the virtual address of the stored representation of the object; e.g., symbols, lists, arrays)
  2. immediate object references (implemented by memory words that directly contain the entire representation of the object; e.g., small integers, single-precision floating point numbers)
  3. pointers (contain the virtual addresses of locations that do not contain objects: they point to locations within objects; e.g., pointer to the value cell of a symbol).
1.1.3.2. Headers

Headers separate objects in memory. This contains information about the type of object. There are two types of headers:

  • dtp-header-i when the header is immediate data
  • dtp-header-p when the header is the address containing some descriptive data

Object references usually use the address of an object's header as the address of the object. "Usually". The only exceptions are

  1. the object reference to a compiled function, and
  2. the object reference to an array "with a leader" (in which case the reference points to a specified location in the structure)

For dtp-header-p, the following table describes the cdr-code field's interpretation:

Header Type Symbolic Name Object Type
0 %header-type-symbol Symbol
1 %header-type-instance Instance
2 %header-type-leader Array Leader
3   Reserved

For dtp-header-i, the following table describes the cdr-code field's meaning:

Header Type Symbolic Name Object Type
0 %header-type-compiled-function Compiled Function
1 %header-type-array Array
2 %header-type-number Number
3   Reserved
  • If we want to change the memory location of an object represented by a structure, then
    • the object's header is moved to a new location and the object's old location is filled with a word of type dtp-header-forward, an invisible pointer that contains the address of the new location of the reference.
    • The object references in the locations of the old structure are all replaced with pointers of the type dtp-element-forward, which contains the addresses of the new locations of the objects.
1.1.3.3. Forwarding (Invisible) Pointers

The metaphor for a "forwarding pointer" is postal forwarding, specifying a reference to the location containing it should be redirected to another memory location. They're used for internal bookkeeping by the storage management system.

The data types of the forwarding pointers are:

  • dtp-external-value-cell-pointer, used to link a symbol's value cell to a closure or an instance value cell (it is not invisible to binding and unbinding, see the section Binding Stack)
  • dtp-one-q-forward forwards only the cell that contains it (i.e., it indicates the cell is contained at the address specified in the address field of the dtp-one-q-forward word and the cdr-code of the required data is the cdr-code of the dtp-one-q-forward word). This pointer is used to link a symbol value or function cell to a wired cell or a compiled-function's function cell (as well as for many other applications).
  • dtp-header-forward
  • dtp-element-forward

A header-forward pointer is also used with list representation (which is explained fulling in another section, see Representations of Lists). When a one-word cons must be expanded to a two-word cons by replacd, a new two-word cons is allocated and the old one-word cons is replaced by a header-forward pointer containing the address of the new cons.

1.1.3.4. Special Markers

The data types of the special markers are:

  • dtp-null (placed in the value cell or function cell of a symbol or in the instance-variable value cell in an instance, in those cases when no value has been assigned). The address field of the null marker contains the address of the name of the variable. This makes it possible for an error handler to report the name of the offending variable when an attempt to use the value of an unbound variable is detected.

    A null special marker is also used to initialize a freshly-created virtual memory page in case it is accidentally accessed before an object is created in it.

    The encoding of the null-special-marker data type is zero. Memory that is initialized to all bits zero thus contains all null words, which will cause a trap if dereferenced.

  • dtp-monitor-forward, intended for use with a debugging feature (see the section/chapter Exception Handling)
  • dtp-gc-forward, used by the garbage collector and may only appear in oldspace.

1.1.4. Operand-Reference Classification

Observe there are 64 type codes altogether: 55 from data, then the following 9 additional codes

  • Null: dtp-null
  • Immediate header: dtp-header-i
  • Pointer Header: dtp-header-p
  • HFWD: dtp-header-forward
  • EFWD: dtp-element-forward
  • 1FWD: dtp-one-q-forward
  • EVCP: dtp-external-value-cell-pointer
  • GC: dtp-gc-forward
  • Monitor: dtp-monitor-forward

1.2. Data-Type Descriptions

1.2.1. Representations of Symbols

The object reference to a symbol is a word of data type dtp-symbol or dtp-nil. The address field of this word contains the address of a header of type dtp-header-p. This is followed by 4 words. The header's header-type field equals %header-type-symbol and the address of the header contains the symbol's name, a string. The five words that constitute a symbol object are:

0 SYMBOL-NAME-CELL address of the symbol's name
1 SYMBOL-VALUE-CELL the value, or an unbound marker
2 SYMBOL-FUNCTION-CELL the definition, or an unbound marker
3 SYMBOL-PROPERTY-CELL the property list
4 SYMBOL-PACKAGE-CELL the home package, or nil

A straightforward implementation would be:

struct Symbol {
    Word name;
    Word value;
    Word function;
    Word property; /* property list */
    Word package;
};

struct Word* make_symbol(name, value, ...) {
    struct Symbol *symbol = malloc(sizeof(*symbol));
    struct Word *symbol_ptr = malloc(sizeof(symbol_ptr));
    symbol_ptr->type = DTP_SYMBOL;
    symbol_ptr->data = (uint32_t)(symbol);
    symbol->name = name;
    symbol->name.type = DTP_HEADER_P;
    symbol->name.cdr = HEADER_TYPE_SYMBOL; /* == %header-type-symbol */
    symbol->value = value;
    /* etc. */
    return symbol_ptr;
}

The I-Machine reserved locations for nil and t in fixed memory locations (vma=pma 1011000) and (vma=pma 1011005), respectively (see the section Wired Addresses).

1.2.2. Representation of Instances and Related Data Types

Before we had CLOS, we had Flavors.

1.2.2.1. Flavor Instances

The object reference to an instance is a word of data type dtp-instance whose address field points to the instance structure.

The stored representation of an instance consists of a header with type dtp-header-p whose header-type field equals %header-type-instance. The words following the header of the instance are the value cells of the instance variables. They constain either object references or an unbounded marker. The cdr codes are not used. The address field of the header contains the address of the hash-mask field of a flavor-description structure. This description is called a "flavor".

The fields of a flavor are:

  • the array header, part of the packaging of the structure
  • the named-structure symbol, part of the packaging of the structure
  • the size of an instance, used by the garbage collector and by the instance referencing instructions (%instance-ref and the like)
  • the hash mask, used by the hardware for method lookup
  • the handler hash table address, used by the hardware for method lookup
  • the name of the flavor, used by the type-of function
  • additional flavors known only to the flavor system
struct FlavorInstance {
    Word header;
    Word flavor_address;
    Word instance_fields[];
};
/* instance->flavor_address.data == &flavor->hash_mask */

struct Flavor {
    Word header;
    Word named_struct_symbol;
    Word instance_size;
    Word hash_mask;    /* hash_mask->type == DTP_FIXNUM */
    Word hash_address; /* hash_address->type == DTP_LOCATIVE */
    Word typename;     /* typename->type == DTP_SYMBOL */
};
/* flavor->hash_address.data = &handler_table */

struct Handler {
    Word key;       /* generic function, symbol, or nil */
    Word method;    /* even-pc or odd-pc */
    Word parameter; /* mapping table */
};
/* flavor->hash_address == &array(Handler) */

A handler table is a hash table which maps a generic function or message to the function to be invoked and a parameter to that function. Typically the function is a method, and the parameter is a mapping table used by that method to access instance variables. The mapping table is a simple, short-prefix ART-Q array ("ART" = "ARray Type").

For speed, the format of handler tables is architecturally defined and known by hardware. Handler hash tables are packaged inside arrays (but this is software dependent, not hardware or architecture dependent).

A handler table consists of a sequence of three-word elements. The address of first word of the first element is in the flavor. Each element consists of:

  • The key: This is a generic function (dtp-generic-function), a message name (dtp-symbol), or nil, which is a default that matches everything (dtp-nil)
  • The method: This is a program-counter value (dtp-even-pc or dtp-odd-pc) addressing the instruction at which the compiled function corresponding to the method is to be entered.
  • The parameter: This is a parameter that gets passed from the function or message to the method as an extra argument. If the parameter in the handler table is nil, the generic function or message is used as the parameter.

I honestly do not understand the idea of a mapping table so far.

There's a "fence object" at the end of the handler hash table, which is filled with nil entries for its key, method, and parameter. Unused methods are likewise filled with nil entries.

We should note that it seems, thus far reading the fine manual, that dtp-even-pc and dtp-odd-pc are pointers to functions — compiled, or lexical closures (or dynamical closures?).

1.2.2.2. List Instance

An object reference to a list instance is a word of data type dtp-list-instance whose address field points to an instance structure.

1.2.2.3. Array Instance

An object reference to an array instance is a word of data type dtp-array-instance whose address field points to an instance field.

1.2.2.4. String Instance

An object reference to a string instance is a word of data type dtp-string-instance whose address field points to an instance structure.\

1.2.3. Representation of Characters

An object reference to a character is an immediate object of type dtp-character. Characters seem to use the 32-bit data as follows:

Position Symbolic Name Description
<31:28> (4 bits) %%CHAR-BITS Control, Meta, Super, Hyper bit
<27:16> (12 bits) %%CHAR-STYLE Italic, large, bold, …
<15:8> (8 bits) %%CHAR-CHAR-SET Character set
<7:0> (8 bits) %%CHAR-SUBINDEX Index within this character set

As I can glean from this: the %%CHAR-SUBINDEX corresponds to what we would call the code point for the character, or perhaps the 16-bits from the character set & index correspond to the code point. (Arguably, the control, meta, super, hyper bits should be added, as well, giving a 20-bit character set.)

1.2.4. Representation of Numbers

1.2.4.1. Fixnum Representation

A 32-bit two's-complement integer. Its data type is dtp-fixnum.

1.2.4.2. Bignum Representation
1.2.4.3. Small-Ratio Representation

The 32-bit field is treated as an array of two 16-bit integers, the first is a signed two's-complement number treated as the numerator, the second is an unsigned number treated as the denominator.

If the numerator is zero, or the denominator is 1 or 0, then the number is treated as an illegal value.

1.2.4.4. Big-Ratio Representation

An object reference to a big-ratio is a word of data type dtp-big-ratio whose address field points to a cons pair. The car of the cons contains the numerator, the dcr contains the denominator. As with the small-ratio, if the numerator is 0 or a denominator of 0 or 1, or a negative number, is treated as an illegal value.

1.2.4.5. Single-Precision Floating-Point Representation

A single-precision floating-point number is an immediate object of data type dtp-single-float whose data field contains a 32-bit IEEE basic floating-point number. The following fields are defined:

Position Symbolic Name Description
<31> %%SINGLE-SIGN 0 for positive numbers, 1 for negative numbers
<30:23> %%SINGLE-EXPONENT excess-127 exponent
<22:0> %%SINGLE-FRACTION positive fraction, with hidden 1 on the left
1.2.4.6. Double-Precision Floating-Point Representation

The object reference to a double-precision floating point number is a word of data-type dtp-double-float. The address field of the double-float word contains the address of a cons pair. The data fields in the words of the cons hold two fixnums, containing the sign, exponent, and fraction as packed fields. The first contains:

Position Symbolic Name Description
<31> %%DOUBLE-SIGN 0 for a positive number, 1 for negative number
<30:20> %%DOUBLE-EXPONENT excess-1023 exponent
<19:0> %%DOUBLE-FRACTION-HIGH top 20 bits of fraction (excluding hidden bit)

The second fixnum contains one field:

Position Symbolic Name Description
<31:0> %%FRACTION-LOW bottom 32 bits of fraction

This conforms to the IEEE standard 64-bit representation of double-precision numbers.

1.2.4.7. Complex-Number Representation

The object reference to a complex number is a word of data type dtp-complex whose address points to a cons pair. The car of the cons contains the real part of the number, and the cdr contains the imaginary part.

Obviously, this is the Cartesian representation of the number. There are other representations (polar, etc.) which are also useful, but we realize this now, long after the fact. We didn't know any better back in the '80s.

1.2.4.8. The Spare-Number Type

This is reserved for future use.

1.2.5. Representations of Lists

An object reference to a list has its type be dtp-list. The cdr-code tag of a memory word which constitutes an element of a list specifies how to get the cdr of its associated cons according to whether the list is stored in normal linked-list form or in compact ("cdr-coding") form. The cdr-code tag works as follows:

Code Symbolic Name Description
0 cdr-next Increment the address to get a reference to the cdr, itself a cons
1 cdr-nil The cdr is nil. This is used for both kinds of list.
2 cdr-normal Fetch the next memory word; it contains a reference to the cdr.
3   (illegal)

Basically cdr-coding works by reserving cells sequentially.

/* representation of '(a b) using normal lists */
struct Symbol *a, *b;
/* initialize a, b */
struct Word list;
list.type = DTP_LIST;

struct Word cons[2];
cons[0].cdr_code = CDR_NORMAL;
cons[0].type = DTP_SYMBOL;
cons[0].value = (int32_t)a; /* address of 'a */
cons[1].cdr = CDR_NIL;
cons[1].type = DTP_LIST;

struct Word next_cons[2];
next_word[0].cdr_code = CDR_NORMAL;
next_word[0].type = DTP_SYMBOL;
next_word[0].data = (int32_t)b; /* address of 'b */
next_word[1].cdr_code = CDR_NIL;
next_word[1].type = DTP_NIL;
next_word[1].data = NIL;

cons[1].data = (int32_t)&next_word[0];
list.data = &cons[0]; /* list's data is the address of first cons pair */

/* the same list compactified */
struct Word compact_list;
compact_list.type = DTP_LIST;
Word cells[2];
cells[0].cdr_code = CDR_NEXT;
cells[0].type = DTP_SYMBOL;
cells[0].data = (int32_t)a; /* address of 'a */
cells[1].cdr_code = CDR_NIL;
cells[1].type = DTP_SYMBOL;
cells[1].data = (int32_t)b; /* address of 'b */

compact_list.data = &cells[0];

We see see that cdr-coding requires half as many cells as a naive linked-list structure.

1.2.6. Representations of Arrays and Strings

Object references to an array or string is a word with data type dtp-array or dtp-string.

Whether an array is refered to by dtp-array or dtp-string has no effect on its stored representation: the data type of the object reference simply serves to make the stringp predicate faster.

An array is a structure consisting of a prefix followed by optional data. A prefix is defined to be a word whose data type is dtp-header-i and whose header type is %header-type-array followed by zero or more additional words. The prefix defines the type and shape of the array.

The byte fields in a prefix header's 32-bit immediate field are:

Position   Symbolic Name Description
<31:26> 6 ARRAY-TYPE-FIELD Combination of fields below
<31:30> 2 ARRAY-ELEMENT-TYPE Element type, one of: fixnum, character, boolean, object-reference
<29:27> 3 ARRAY-BYTE-PACKING Byte packing. Base 2 logarithm (0 to 5) of the number of elements per word, 6 or 7 in this field is undefined.
<26> 1 ARRAY-LIST-BIT 1 in ART-Q-LIST arrays, 0 otherwise
<25> 1 ARRAY-NAMED-STRUCTURE-BIT 1 in named-structures, 0 otherwise
<24> 1 ARRAY-SPARE-1 (spare for software uses)
<23> 1 ARRAY-LONG-PREFIX-BIT 1 if prefix is multiple words
<22:15> 8 ARRAY-LEADER-LENGTH-FIELD Number of elements in the leader
<14:0> 15 ARRAY- Use of these bits depends on the prefix type, as described below
  • Bits <31:27> corresponds to the same bits of the control word of an array register. (Array registers are discussed in the next section.)
  • Bits <26:24> are not used by the hardware.
  • Bits <31:27.23> enable various spceial pieces of hardware or microcode dispatches.
  • Bits <22:0> are used by hardware under microcode control.
  • Bits <31:26> are sometimes grouped together as ARRAY-TYPE-FIELD

Some arrays include packed data in their stored representations. For example, character strings store each character in a single 8-bit byte. This is more efficient than general arrays, which require an entire word for each element. Accessing the n-th character of a string fetches the n/4 word of the string, extracts the mod(n,4) byte of that word, and constructs an object reference to the character whose code is equal to the contents of the byte.

(Machine instructions in compiled functions are stored in a similarly packed form.)

For uniformity, the stored representation of an object containing the packed data remains a sequence of object references. Each word in an array of element-type fixnum, boolean, or character is an immediate reference, data type dtp-fixnum, whose thirty-two bits are broken down into packed fields as required, such as four 8-bit bytes in the case of some character strings.

Array leader. An array can optionally be preceded by a leader, a sequence of object references that implements the array-leader feature. If there is a leader, the leader is preceded by a header of its own, tagged dtp-header-p and %header-type-leader; the address field of this header contains the address of the array's main header — i.e., the address of the header of the array prefix.

Note if an array has a leader, the address field of an object reference designating that array contains the address of the main header, the one after the leader, not the address of the header at the beginning of the array's store (before the leader). Schematically:

Word prefix; /* array prefix */
prefix.cdr = CDR_ARRAY;
prefix.type = DTP_HEADER_I;
prefix.data = ...;

Word *data = &prefix;
data++; /* next word after prefix */
data->type = DTP_FIXNUM;
data->data = 42; /* some random data */

data++; /* second array entry */
data->type = DTP_FIXNUM;
data->data = 31; /* some random data */


Word leader[]; /* array leader */
leader[0] = {cdr = CDR_LEADER, type = DTP_HEADER_P, data = &prefix};
/* leader[1] is a named-structure symbol */
leader[1] = {type = DTP_SYMBOL, data = &name};
leader[2] = {type = DTP_FIXNUM, data = ... };


Word array;
array.type = DTP_ARRAY;
array.data = (&prefix) + 1; /* start address of array data */

The address of leader element i of an array whose address is a, regardless of whether the prefix is long or short, is given by (- a i 1).

The two array formats — %array-prefix-short and %array-prefix-long — are provided to optimize speed and space for simple, small arays (which are the most common).

%array-prefix-short has the 15 bits in position <14:0>, with the symbolic name ARRAY-SHORT-LENGTH-FIELD, give the length of the array.

The prefix is one word, the array is one-dimensional and not displaced, but may have a leader. Most common arrays include defstructs, editor lines, and most arrays with fill-pointers use this type.

The address of data element i of a short-prefix array whose address is a and whose ARRAY-BYTE-PACKING field is b is given by (+ a (ash i (- b)) 1). When b is greater than zero, packed array elements are stored right-to-left within words, thus the right shift to right-justify data element i is (ash (logand i (1- (ash 1 b))) (- 5 b)).

%array-prefix-long has the following 15 bits in position <14:0>:

Position Bits Symbolic Name Description
<14> 1 ARRAY-DISPLACED-BIT 0 for normal array, 1 for displaced array
<13> 1 ARRAY-DISCONTIGUOUS-BIT 0 for normal aray, 1 for conformal array
<12:3> 12 ARRAY-LONG-SPARE Spare
<2:0> 3 ARRAY-LONG-DIMENSIONS-FIELD Number of dimensions.

The long prefix format is used for displaced arrays (including indirect arrays), arrays that are too large to fit in the short-prefix format, and multidimensional (including zero-dimensional) arrays. The first word in the prefix contains the number of dimensions in place of the length of the data. The total length of the prefix is (+ 4 (* d 2)) where d is the number of dimensions.

The second word of the prefix is the length of the array. For conformally displaced arrays, this is the maximum legal linear subscript, not the number of elements (which may be smaller).

The third word of the prefix is the index offset. This is always present, even for non-indirect arrays. Zero should be stored here in non-displaced arrays, since this word is always added to the subscript.

The fourth word of the prefix is the address of the data. This is a locative to the first word after the prefix for normal arrays, except for arrays with no elements (in which case it is a locative to the array itself to avoid pointing to garbage). Fir displaced arrays, this is a locative or a fixnum. For indirect arrays, this is an array.

The remaining wods of the prefix consist of two words for each dimension. The first word is the length of that dimension and the second word is the value to multiply that subscript by.

Word prefix[];
Word array_data[];

union { int32_t data,
        struct {
            int32_t element_type : 2;
            int32_t byte_packing : 3;
            int32_t list_bit     : 1;
            int32_t named_structure_bit : 1;
            int32_t spare_1      : 1;
            int32_t long_prefix_bit : 1;
            int32_t leader_length : 8;
            int32_t displaced_bit : 1;
            int32_t discontiguous_bit : 1;
            int32_t long_spare : 12;
            int32_t long_dimensions : 3;
        } fields } prefix_data;
prefix_data.fields = {element_type        = ELEMENT_FIXNUM,
                      byte_packing        = 3,
                      list_bit            = 0,
                      named_structure_bit = 0,
                      spare_1             = 0,
                      long_prefix_bit     = 1,
                      leader_length       = 0,
                      displaced_bit       = 0,
                      discontiguous_bit   = 0,
                      long_spare          = 0
                      long_dimensions     = 2};
prefix[0] = {cdr = CDR_ARRAY, type = DTP_HEADER_I,
             data = prefix_data.data};
prefix[1] = {type = DTP_FIXNUM, data = 28}; /* array length */
prefix[2] = {type = DTP_FIXNUM, data = 0};  /* index offset */
prefix[3] = {type = DTP_LOCATIVE,
             data = &array_data[0]};
/* first dimension */
prefix[4] = {type = DTP_FIXNUM, data = 7}; /* length */
prefix[5] = {type = DTP_FIXNUM, data = 4}; /* multiplier */
/* second dimension */
prefix[6] = {type = DTP_FIXNUM, data = 4}; /* length */
prefix[7] = {type = DTP_FIXNUM, data = 1}; /* multiplier */

array_data[0] = {type = DTP_FIXNUM,
                 data = ...};
array_data[1] = {type = DTP_FIXNUM,
                 data = ...};
array_data[2] = {type = DTP_FIXNUM,
                 data = ...};
array_data[3] = {type = DTP_FIXNUM,
                 data = ...};

Word array_pointer = {type = DTP_ARRAY,
                      data = &prefix[0]};

1.2.7. I-Machine Array Registers

1.2.8. Representations of Functions and Closures

1.2.8.1. Representation of Compiled Functions

The object reference to a compiled function is a word of data type dtp-compiled-function whose address field points to a word inside a compiled-function structure.

  • The prefix is two words long and has a fixed format.
  • The body is a sequence of one or more instructions.
  • The suffix is at least one word long and contains debugging information and constant data.

The cdr code for the instruction words have special meaning, see the section Instruction Sequencing for details.

The memory layout for this looks roughly as follows:

lisp-machine-0.svg

Figure 1: The memory layout for a compiled function

Now, for a (defun <function-name> (params...) body..) we have a symbol for <function-name> which points to a compiled function. Roughly, this corresponds to:

Symbol symbol;
symbol.name = pointer_to(/* <function-name> string */);
symbol.function = Word{type = DTP_ONE_Q_FORWARD,
                       data = &fn_instr[0]};

/* or really */
make_symbol(pointer_to(/* <function-name> string */), /* name */
            nil,                        /* value */
            pointer_to(fn_instr[0]),    /* function */
            nil,                        /* property */
            nil)                        /* package */
1.2.8.2. Generic Functions

An object reference to a generic function has data type dtp-generic-function. The address field points to a list-like structure whose content is not architecturally defined; it is used internally by the flavor system. See the section Generic Functions and Message Passing.

1.2.8.3. Representation of Lexical Closures

An object reference to a lexical closure is a word of data type dtp-lexical-closure which points to a cons pair. The car of the cons is the lexical environment, and the cdr is the function.

The lexical environment is a cdr-coded list of value cells associated with the closure. In such an implementation, this list must be compact (i.e., cdr-coded using cdr-next) since instructions that access the lexical variables compute addresses of the variables simply as an offset past the address of the environment.

When a lexical closure is called as a function, the environment will be made an argument to the function. For more information, see the chapter on function calling, specifically the section Starting a Function Call.

1.2.8.4. Representation of Dynamic Closures

The object reference to a dynamic closure is a word of data type dtp-dynamic-closure, which points to a list structure. The format of a dynamic closure is not architecturally defined, but is determined by software. (The hardware traps to Lisp to funcall dynamic closures.)

The list representation allows closures to be stored in the stack (a la with-stack-list); certain special forms such as error-restart exploit this.

This list is always cdr-coded, but nothing actually depends on this. The first element of the list is the function succeeding elements are taken in pairs. The first element of each pair is a locative pointer to the value cell to be bound when the closure is called. The second element of each pair is a locative pointer to the closure value cell to which that cell is to be linked.

1.2.9. Instruction Representation

The instructions ins a compiled function are a sequence of words whose data-type field selects among three types of words:

  1. Packed instructions. Data types with codes 60–77 are used for words that contain two 18-bit instructions. These are the usual stack-machine type instructions.
  2. Full-word instructions. Data types coded 50–57 are used for words that contain a single intsruction with an address field. These are used for starting function calls.

    In addition, data type dtp-external-value-cell-pointer (type code 4) is used to fetch the contents of the value cell of a special variable or the function cell of a function an push it on the stack. This is actually an optimization to save space and time; the value cell address could be pushed as a constant locative and then a car instruction could be executed.

    Besides these, there is one other full word instruction type, the entry instructions, which do not contain addresses, but instead look like pairs of half-word instructions. These are decoded by their opcode field, not by the data-type field.

  3. Constants. All other data types encountered among the instructions in a compiled function are constants. The word from the instruction stream is pushed on the stack with the cdr code set to cdr-next. The hardware will signal an error if the word is a header or an invisible pointer.

The fields within various types of instructions are described in the chapter on the Macroinstruction Set.

1.2.10. Program-Counter Representation

The program counter (pc) is a register in the I-machine that contains the virtual address of the currently executing instruction. Since instructions are packed 2 to a word, we need to specify which half-word instruction is executing.

The information is included in the data-type code of the pc contents; thus there are two pc data types dtp-even-pc and dtp-odd-pc. Word of these data types are not usually found in the stored representations of Lisp objects, but occur within stack frames or inside compiled functions for long branches. See Function Calling, Message Passing, Stack-Group Switching.

1.2.11. Representation of Locatives

A locative is a pointer to virtual memory implemented as an object with data type dtp-locative and an address field that is the address of the virtual memory word to which it points.

It is classified as a pointer object reference.

Locatives may point to locations within objects, such as the value cell of a symbol. Other uses include the pointer to the start of data in long format arrays and the base address of array registers.

1.2.12. Representation of Physical Addresses

The data type dtp-physical-address allows unmapped access to the full (up to 32 bits wide) physical address space. Since it is a separate data type it has restricted usage.

1.3. Data-Type Code Assignments

1.3.1. Headers, Special Markers, and Forwarding Pointers

Type Code Symbolic Name Description
0 DTP-NULL Unbound variable/function, uninitialized storage
1 DTP-MONITOR-FORWARD This cell is being monitored
2 DTP-HEADER-P Structure header, with pointer field
3 DTP-HEADER-I Structure header, with immediate bits
4 DTP-EXTERNAL-VALUE-CELL Invisible except for binding
5 DTP-ONE-Q-FORWARD Invisible pointer (forwards 1 cell)
6 DTP-HEADER-FORWARD Invisible pointer (forwards whole structure)
7 DTP-ELEMENT-FORWARD Invisible pointer in element of structure

1.3.2. Number Data Types

Type Code Symbolic Name Description
10 DTP-FIXNUM Small integer
11 DTP-SMALL-RATIO Ratio with small numerator and denominator
12 DTP-SINGLE-FLOAT Single-precision floating point
13 DTP-DOUBLE-FLOAT Double-precision floating point
14 DTP-BIGNUM Big integer
15 DTP-BIG-RATIO Ratio with big numerator or denominator
16 DTP-COMPLEX Complex number
17 DTP-SPARE-NUMBER A number to the hardware trap mechanism

1.3.3. Instance Data Types

Type Code Symbolic Name Description
20 DTP-INSTANCE Ordinary instance
21 DTP-LIST-INTSANCE Instance that masquerades as a cons
22 DTP-ARRAY-INSTANCE Instance that masquerades as an array
23 DTP-STRING-INSTANCE Instance that masquerades as a string

1.3.4. Primitive Data Types

There are 11 primitive types:

Type Code Symbolic Name Description
24 DTP-NIL The symbol NIL
25 DTP-LIST A cons
26 DTP-ARRAY An array that is not a string
27 DTP-STRING A string
30 DTP-SYMBOL A symbol other than NIL
31 DTP-LOCATIVE Locative pointer
32 DTP-LEXICAL-CLOSURE Lexical closure of a function
33 DTP-DYNAMIC-CLOSURE Dynamic closure of a function
34 DTP-COMPILED-FUNCTION Compiled code
35 DTP-GENERIC-FUNCTION Generic function (see later section)
36 DTP-SPARE-POINTER-1 Spare Pointer
37 DTP-SPARE-POINTER-2 Spare Pointer
40 DTP-PHYSICAL-ADDRESS Physical address
41 DTP-SPARE-IMMEDIATE-1 Spare immediate
42 DTP-SPARE-POINTER-3 Spare pointer
43 DTP-CHARACTER Common Lisp character object
44 DTP-SPARE-POINTER-4 Spare pointer

Note codes 36, 37, 42, 44 are spare pointer data types and code 41 is a spare immediate data type.

1.3.5. Special Marker for Garbage Collector

Type Code Symbolic Name Description
45 DTP-GC-FORWARD Object-moved flag for garbage collector

1.3.6. Data Types for Program Counter Values

Type Code Symbolic Name Description
46 DTP-EVEN-PC PC at first packed word instruction in word, or of full-word instruction
47 DTP-ODD-PC PC at second instruction in word

1.3.7. Full Word Instruction Data Types

Type Code Symbolic Name Description
50 DTP-CALL-COMPILED-EVEN Start call, address is compiled-function
51 DTP-CALL-COMPILED-ODD Start call, address is compiled-function
52 DTP-CALL-INDIRECT Start call, address is function cell
53 DTP-CALL-GENERIC Start cell, address is generic-function
54 DTP-CALL-COMPILED-EVEN-PREFETCH Same as DTP-CALL-COMPILED-EVEN but prefetch is desirable
55 DTP-CALL-COMPILED-ODD-PREFETCH Same as DTP-CALL-COMPILED-ODD but prefetch is desirable
56 DTP-CALL-INDIRECT-PREFETCH Same as DTP-CALL-INDIRECT but prefetch is desirable
57 DTP-CALL-GENERIC-PREFETCH Same as DTP-CALL-GENERIC but prefetch is desirable

1.3.8. Half-Word Instruction Data Types

Type Code Symbolic Name Description
60–77 DTP-PACKED-INSTRUCTION Used for instructions in compiled code.

Each word of this type contains two 18-bit instructions, which is why 16 types are used up. Bits <37:36> contain 3 to select the instruction data type. Bits <39:38>, the cdr code, contain sequencing information described in the chapter on theinstruction set. The instruction in bits <17:0> is executed before the instruction in bits <35:18>.

1.4. Appendix: Comparison of 3600-Family and I-Machine Data Representations

1.4.1. Array Differences

1.4.2. Compiled Function Differences

2. Memory Layout and Addressing

2.1. Address Space

The address space is divided into thirty-two zones, each containing 128 megabytes. The thirty-two zones are variously assigned to several sections as shown in the table below. Note that ephemeral space is a subset of the virtual address space:

  • #0o00000000000..0o00777777777 Ephemeral address space (zone 0, the low 27 = 128 megawords)
  • #0o00000000000..0o36777777777 Virtual address space (zones 0–30, the low 3968 megawords)
  • #0o37000000000..0o37777777777 Unmapped address space (zone 31, the high 128 megawords)
  • #0o00000000000..0o37777777777 Total address space (4 gigawords)

Note that 0o37777777777 is 232-1.

2.1.1. Virtual Addresses

A virtual address is divided into two fields for mapping purposes: the virtual page number and the offset within page fields.

Virtual space occupies thirty-one zones. An internal processor register allows each zone to be specified as either old or new space.

Position Meaning
<31:27> Zone number (zones 0 through 30)
<31:8> Virtual Page Number (VPN – 512K virtual pages per zone)
<7:0> Offset within Page (256 words per page)

The virtual address space is partitioned by software into regions, areas, and quanta. These have no direct hardware impact. Note, however, that the hardware hash function for the Page Hash Table is optimized for a quantum of size 65536 words (see Page Hash Table).

2.1.2. Ephemeral Addresses

The lowest zone of the virtual address is reserved for the storage of ephemeral objects. This space is provided to support a garbage collection strategy that takes advantage of recently created objects usually having a short lifetime.

Ephemeral space is divided into thirty-two levels. Data within an ephemeral level is the same age. The relative ages of different levels is up to decide and would normally change dynamically. Each level is further divided into two halves (old and new space). An internal processor register specifies which half is old and which is new.

The thirty-two ephemeral levels are grouped into four groups of eight levels each. The ephemeral level groups referenced by a page are maintained in the PHT.

Position Meaning
<31:27> 00000 => ephemeral, otherwise non-ephemeral
<26> which half of the ephemeral level
<25:21> ephemeral level number
<25:24> ephemeral level group number
<20:0> word address within an ephemeral level

2.1.3. Unmapped Addresses

2.1.4. Wired Addresses

2.1.5. Pages

2.1.5.1. Page Hash Table
2.1.5.2. PHT Lookup Algorithm
2.1.5.3. Translation Algorithm

2.2. GC Support

2.3. Address Translation

2.3.1. Page Hash Table

2.3.2. PHT Lookup Algorithm

2.3.3. Translation Algorithm

2.4. Appendix: Comparison of 3600-Family and I-Machine Memory Layout and Addressing

3. Macroinstruction Set

3.1. Introduction

3.1.1. Instruction Sequencing

The cdr-code of instruction words specify the sequencing of instructions for execution.

Cdr Code PC Increment Comment
0 +1 Normal instruction sequencing
1 illegal Fence; marks end of compiled function
2 -1 On some constants
3 +2 PC even, +3 PC odd Before some constants, on some constants

3.1.2. Internal Registers

The I-machine internal registers may be summarized in the following table. An asterisk by an address entry means the register may be defined by an implementation, and "reserved" means the register may be architecturally defined in the future. The information in this table is specific to Revision 0 of the Ivory chip.

Address Read/Write Data Type Register Name
0*     For use by microcode only
1 RW loc Frame Pointer (FP)
2 RW loc Local Pointer (LP)
3 RW loc Stack Pointer (SP)
4*     For use by microcode only
5 RW loc Stack Cache Lower Bound
6 RW loc/pa BAR0 Contents
206 RW loc/pa BAR1 Contents
406 RW loc/pa BAR2 Contents
606 RW loc/pa BAR3 Contents
7 R fix BAR0 Hashed
207 R fix BAR1 Hashed
407 R fix BAR2 Hashed
607 R fix BAR3 Hashed
10*     For use by microcode only
11*      
12 RW pc Continuation
13 RW fix DP Op
14 RW fix Control Register
15*      
16 RW fix Ephemeral Oldspace Register
17 RW fix Zone Oldspace Register
20 R fix Implementation Revision
21* RW fix FP coprocessor present
22*      
23 RW fix Preempt Register
24* RW fix Icache Control
25* RW fix Prefetcher Control
26* RW fix Map Cache Control
27* RW fix Memory Control
30* R fix ECC Log
31* R fix ECC Log Address
32* W - Invalidate Matching Map Entry for VMA in BAR0
232* W - Invalidate Matching Map Entry for VMA in BAR1
432* W - Invalidate Matching Map Entry for VMA in BAR2
632* W - Invalidate Matching Map Entry for VMA in BAR3
33*      
34 RW fix Stack Cache overflow limit
35*      
36*      
37 reserved    
40-47*      
50*      
51*      
52* W - Load Matching Map Word 1 forVMA in BAR0
252* W - Load Matching Map Word 1 forVMA in BAR1
452* W - Load Matching Map Word 1 forVMA in BAR2
652* W - Load Matching Map Word 1 forVMA in BAR3
53–777 reserved    
1000 RW -- Top of Stack (TOS)
1001 RW -- Array Event Count
1002 RW -- Binding Stack Pointer
1003 RW -- Catch Block Pointer
1004 RW -- Control Stack Limit
1005 RW -- Control Stack Extra Limit
1006 RW -- Binding Stack Limit
1007 RW -- PHT Base
1010 RW -- PHT Mask
1011 RW -- Count Map Reloads
1012 RW -- List Cache Area
1013 RW -- List Cache Address
1014 RW -- List Cache Length
1015 RW -- Structure Cache Area
1016 RW -- Structure Cache Address
1017 RW -- Structure Cache Length
1030 RW -- Maximum Frame Size
1031 RW -- Stack Cache Dump Quantum

3.1.3. Memory Side Effects

Reading memory may not cause side effects. The architecture permits an implementation to start a memory read that it will not use, perhaps because of instruction prefetching, perhaps while starting an array reference before an out of bounds check is peformed, perhaps because of instruction pipelining, or perhaps for something else.

Writing memory using a dtp-physical-address is allowed to cause side effects; dtp-physical-address is guaranteed not to be cached, and the write is guaranteed to happen exactly once. Also both %coprocessor-read and %coprocessor-write instructions may cause side effects; they are guaranteed to be performed exactly once.

3.1.4. Explanation of Instruction Definitions

3.1.4.1. Instruction Formats
3.1.4.2. Arguments: the Data Types Accepted
3.1.4.3. Types of Instruction Exceptions
3.1.4.4. Types of Memory References
3.1.4.5. Top-of-Stack Register Effects
3.1.4.6. Cdr Codes of Values Returned

3.2. The Instructions

3.2.1. List-Function Operations

3.2.2. Predicate Intsructions

3.2.3. Numeric Operations

3.2.4. Data-Movement Instructions

3.2.5. Field-Extraction Instructions

3.2.6. Array Operations

3.2.7. Block Instructions

3.2.8. Function-Calling Instructions

3.2.9. Binding Instructions

3.2.10. Catch Instructions

3.2.11. Lexical Variable Accessors

3.2.12. Instance Variable Accessors

3.2.13. Subprimitive Instructions

4. Function Calling, Message Passing, Stack-Group Switching

4.1. Stacks

4.1.1. Control Stack

4.1.2. Binding Stack

4.1.3. Data Stack

4.2. Registers Important to Function Calling and Returning

4.3. Function Calling

4.3.1. Starting a Function Call

4.3.2. Pushing the Arguments

4.3.3. Finishing the Call

4.4. Function Entry

4.4.1. Push-apply-args

4.4.2. Pull-apply-args

4.4.3. Trapping Out of Entry and Restarting

4.5. Function Returning

4.5.1. Function Return Instructions

4.5.2. Frame Cleanup

4.5.3. Value Matchup

4.6. Catch, Throw and Unwind-Protect

4.7. Generic Functions and Message Passing

4.7.1. Flavor

4.7.2. Handler Table

4.7.3. Calling a Generic Function

4.7.4. Sending a Message

4.7.5. Accessing Instance Variables

4.8. Stack-Group Switching

4.9. Appendix: Comparison of 3600-Family and I-Machine Function Calling

5. Exception Handling

Last Updated 2021-06-01 Tue 10:00.