\( \DeclareMathOperator{\tr}{tr} \newcommand\D{\mathrm{d}} \newcommand\E{\mathrm{e}} \newcommand\I{\mathrm{i}} \newcommand\bigOh{\mathcal{O}} \newcommand{\cat}[1]{\mathbf{#1}} \newcommand\curl{\vec{\nabla}\times} \newcommand{\CC}{\mathbb{C}} \newcommand{\NN}{\mathbb{N}} \newcommand{\QQ}{\mathbb{Q}} \newcommand{\RR}{\mathbb{R}} \newcommand{\ZZ}{\mathbb{Z}} % For +---- metric \newcommand{\BDpos}{} \newcommand{\BDneg}{-} \newcommand{\BDposs}{\phantom{-}} \newcommand{\BDnegg}{-} \newcommand{\BDplus}{+} \newcommand{\BDminus}{-} % For -+++ metric \newcommand{\BDpos}{-} \newcommand{\BDposs}{-} \newcommand{\BDneg}{} \newcommand{\BDnegg}{\phantom{-}} \newcommand{\BDplus}{-} \newcommand{\BDminus}{+} \)
UP | HOME

NaN Boxing

Table of Contents

1. Introduction

The basic idea is that, if we are writing an interpreter or virtual machine, then how can we distinguish a pointer from…anything else?

One idea is to just use a naive data structure encoding the possible values. At best, we would have an overhead of an additional 64-bits (on a 64-bit system), doubling the size of a pointer.

But if we use a NaN, we can encode any 51-bit pointer (or 42-bit pointer and we get \(2^{9}=512\) possible constants).

There is another alternative, to use tagged pointers.

1.1. Summary of NaN-Boxing

More explicitly, there are 2 types of NaN values (the sign bit is arbitrary for NaN in both cases):

  1. Quiet NaN: the exponent bits are all 1, the bit after the exponent is "set" (i.e., equal to 1), and then the rest of the 51 bits in the Mantissa are arbitrary
  2. Signaling NaN: the exponents are all 1, the bit after the exponent is "unset" (i.e., equal to 0), and then the rest of the 51 bits are 0.

There are 51 free bits in the quiet NaN, so we use it to store information.

2. Problems with NaN-Boxing

2.1. Equality Fails

We cannot meaningfully compare 2 NaN values, so equality fails. We'd have to change it to a 64-bit integer for equality to work as expected. This is a problem in Java, for example.

3. Alternatives

3.1. Pointer Tagging

When we have a 64-bit system, it turns out that not all 64-bits are necessary for pointers. After all, \(2^{30}\) bytes is a Gibibyte, and \(2^{40}\) bytes is a Tebibyte, so a 40-bit pointer can access 1 Tebibyte of memory; and \(2^{50}\) bytes is a Pebibyte. If we use 8 bits to store metadata and 56 bits for information (the "payload"), then we can refer to 64 Pebibytes (72.057 Petabytes) of memory.

For reference:

  • in May 2017, HP announced it built a computer with 160 Terabytes of RAM (BBC).
  • The architectural specifications for Intel64 and AMD64 specify 52-bit pointer size for virtual memory addresses. At the risk of link-rot, let me link the relevant documents and quote the passages (if possible):
    • AMD64 architectural specification (well, the "programmer's guide" version 24593—Rev. 3.44—release date 2026-03-06) states "The AMD64 architecture enhances this support to allow translation of 64-bit virtual addresses into 52-bit physical addresses, although processor implementations can support smaller virtual-address and physical-address spaces." (Section 5.1 physical page 128, pdf page 189)
    • Intel's specification allows for up to 52-bit physical addresses (see Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 3A, table 5-1 on page 5-2). I last checked the volume version number 325462-091US dated March 2026
  • ARMv8.2 has a larger VA space of up to 52-bits, before this ARM had 48-bit virtual address size. (The bits 63:48 are set to 0 for user-space, and 1 for kernel space.) (See Table D8-5 of "Arm Architecture Reference Manual for A-profile architecture" Version M.b; I last checked this 21 June 2026)
  • As of today (Sunday 21 June 2026), Crucial sells 32 GB DDR4 RAM sticks which can fit in one slot, and high end desktops have 8 slots for RAM, which means that desktops right now can support 256 GB RAM (or \(2^{8}\) GB). But the next generation DDR5 RAM octuples the density, meaning we could expect 2 TB of RAM in 8 slots in the next few years. (DDR6 is expected to be released around 2028.) tl;dr 2 TB of RAM is the cutting edge for desktops over the next couple years.

I have the feeling we may end up staggering into a false sense of security, unwittingly imitating Bill Gates's 1985 remark to InfoWorld, "When we set the upper limit of PC-DOS at 640K, we thought nobody would ever need that much memory."

We may or may not even need 8 bits for metadata (heck, 3 bits may suffice).

3.2. Pointer Tagging + Memory Management

The basic idea is that we will implement our own garbage collector, which basically allocates an "arena" of memory (a glorified array of contiguous bytes), then use our own "pointer" type (which is just an index to our array of contiguous bytes). This basically rolls our own Virtual address space.

3.3. Pointer Alignment

We can recall that a memory address \(A\) is n-byte aligned when \(A\) is a multiple of \(n\). This means \(A\) has a minimum of \(\log_{2}(n)\) least-significant zeros when expressed in binary.

So a 64-bit address (= 8-byte address) is aligned if at least \(\log_{2}(8)=3\) least-significant bits are zeros when expressed in binary.

If we stipulate pointers are always from a malloc (and we are using a 64-bit computer), then we have 3 bits which we can use for metadata.

4. References

Last Updated: Sun, 21 Jun 2026 07:23:50 -0700