NaN Boxing

1. Introduction
- 1.1. Summary of NaN-Boxing
2. Problems with NaN-Boxing
- 2.1. Equality Fails
3. Alternatives
4. References
- 4.1. Example Implementations

1. Introduction

The basic idea is that, if we are writing an interpreter or virtual machine, then how can we distinguish a pointer from…anything else?

One idea is to just use a naive data structure encoding the possible values. At best, we would have an overhead of an additional 64-bits (on a 64-bit system), doubling the size of a pointer.

But if we use a NaN, we can encode any 51-bit pointer (or 42-bit pointer and we get \(2^{9}=512\) possible constants).

There is another alternative, to use tagged pointers.

1.1. Summary of NaN-Boxing

More explicitly, there are 2 types of NaN values (the sign bit is arbitrary for NaN in both cases):

Quiet NaN: the exponent bits are all 1, the bit after the exponent is "set" (i.e., equal to 1), and then the rest of the 51 bits in the Mantissa are arbitrary
Signaling NaN: the exponents are all 1, the bit after the exponent is "unset" (i.e., equal to 0), and then the rest of the 51 bits are 0.

There are 51 free bits in the quiet NaN, so we use it to store information.

2. Problems with NaN-Boxing

2.1. Equality Fails

We cannot meaningfully compare 2 NaN values, so equality fails. We'd have to change it to a 64-bit integer for equality to work as expected. This is a problem in Java, for example.

Java Cafe 1: Never write NaN == NaN (they're not equal)

3. Alternatives

3.1. Pointer Tagging

When we have a 64-bit system, it turns out that not all 64-bits are necessary for pointers. After all, \(2^{30}\) bytes is a Gibibyte, and \(20^{40}\) bytes is a Tebibyte, so a 40-bit pointer can access 1 Tebibyte of memory; and \(2^{50}\) bytes is a Pebibyte. If we use 8 bits to store metadata and 56 bits for information (the "payload"), then we can refer to 64 Pebibytes (72.057 Petabytes) of memory.

For reference:

in May 2017, HP announced it built a computer with 160 Terabytes of RAM (BBC).
The architectural specifications for Intel64 and AMD64 specify 52-bit pointer size for virtual memory addresses.
ARMv8.2 has a larger VA space of up to 52-bits, before this ARM had 48-bit virtual address size. (The bits 63:48 are set to 0 for user-space, and 1 for kernel space.)

I have the feeling we may end up staggering into a false sense of security, unwittingly imitating Bill Gates's 1985 remark to InfoWorld, "When we set the upper limit of PC-DOS at 640K, we thought nobody would ever need that much memory."

We may or may not even need 8 bits for metadata (heck, 3 bits may suffice).

3.2. Pointer Tagging + Memory Management

The basic idea is that we will implement our own garbage collector, which basically allocates an "arena" of memory (a glorified array of contiguous bytes), then use our own "pointer" type (which is just an index to our array of contiguous bytes). This basically rolls our own Virtual address space.

3.3. Pointer Alignment

We can recall that a memory address \(A\) is n-byte aligned when \(A\) is a multiple of \(n\). This means \(A\) has a minimum of \(\log_{2}(n)\) least-significant zeros when expressed in binary.

So a 64-bit address (= 8-byte address) is aligned if at least \(\log_{2}(8)=3\) least-significant bits are zeros when expressed in binary.

If we stipulate pointers are always from a malloc (and we are using a 64-bit computer), then we have 3 bits which we can use for metadata.

Max Bernstein, Small objects and pointer tagging 11 Feb 2021

4. References

the secret life of NaN
Leonard Schütz Dynamic Typing and NaN Boxing 8 September 2020
NaN boxing or how to make the world dynamic
Java Cafe 1: Never write NaN == NaN (they're not equal)
Compiling a Lisp: Integers
Compiling a Lisp: Booleans, characters, nil
Bob Nystrom, Crafting Interpreters chapter 30 on optimization
Nikita Popov, Pointer magic for efficient dynamic value representations 2 February 2012, discusses examples with C++ code
Albert Yang, NaN Boxing 6 Aug 2016 for example C code
value representation in javascript implementations 18 May 2011
David Gudeman's "Representing Type Information in Dynamically Typed Languages" October 1993 PDF
https://webkit.org/blog/7846/concurrent-javascript-it-can-work/
https://stackoverflow.com/questions/57348783/how-does-v8-store-integers-like-5
https://stackoverflow.com/questions/16198700/using-the-extra-16-bits-in-64-bit-pointers
https://v8.dev/blog/pointer-compression

4.1. Example Implementations

zuiderkwast's nanboxing single header library for C
Femtolisp's object implementation
https://source.chromium.org/chromium/v8/v8.git/+/46afc4f9a6008c3880fcc00783b4210cb467aa9a:src/objects/smi.h;l=23
Chicken Scheme's data representation