C Programming Language
1. Overview
C is arguably the lingua franca of programming.
The "C Abstract Machine" is the hypothetical computer described by the C standard (C17 5.1.2.3 "Program execution"), which states:
The semantic descriptions in this International Standard describe the behavior of an abstract machine in which issues of optimization are irrelevant.
This abstract machine may or may not bear any relation to actual hardware.
The basic model is that there are two regions of memory:
- The stack stores local variables and is cleaned up when the function call returns.
- The heap stores the results of malloc() allocations, which must be freed explicitly and are accessed through pointers. Loosely, this corresponds to the intuitive picture of RAM.
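As a minimal sketch of the two regions, the helpers below (both names are hypothetical, introduced only for illustration) contrast a local variable with a malloc() allocation. Note that the C standard itself does not use the words "stack" or "heap"; it speaks of automatic and allocated storage duration, so the stack/heap picture is an assumption about typical implementations.

```c
#include <stdlib.h>

/* Hypothetical helper: returns a heap-allocated int holding `value`,
 * or NULL if malloc() fails. The caller must free() the result. */
int *make_heap_int(int value) {
    int *p = malloc(sizeof *p);   /* lives until free() is called */
    if (p != NULL) {
        *p = value;
    }
    return p;
}

/* Hypothetical helper: `local` has automatic storage (typically the
 * stack) and ceases to exist when the function returns, so we return
 * its value and never its address. */
int sum_with_local(int a, int b) {
    int local = a + b;
    return local;
}
```

A caller would write something like int *p = make_heap_int(5); use *p, and then call free(p); forgetting the free() leaks the allocation, whereas the local in sum_with_local() needs no cleanup.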
2. Dereferencing a NULL Pointer
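This is "undefined behaviour" according to the C Standard: the standard imposes no requirements on what happens next. In practice, the outcome depends on the compiler, the operating system, the CPU, and other sordid details.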
2.1. On x86 Family Architecture
But that doesn't stop us from writing some code like:
#include <stdlib.h>

int main() {
    int *x = NULL;
    *x = 5;
    return 0;
}
We then check what this compiles to (using GCC 8.3.0):
; gcc -fverbose-asm -S null.c -o null.s -O1
main:
        subq    $40, %rsp       ;,
        .seh_stackalloc 40
        .seh_endprologue
; null.c:3: int main() {
        call    __main          ;
; null.c:5:     *x = 5;
        movl    $5, 0           ;, MEM[(int *)0B]
; null.c:7: }
        movl    $0, %eax        ;,
        addq    $40, %rsp       ;,
        ret
The key line of code is the movl $5, 0 instruction, which tries to store the literal value 5 at address 0.
- The pointers refer to "virtual addresses", which are translated into "physical addresses" by the CPU using page tables that the operating system sets up (in most modern operating systems); this mechanism is called paging
- When the translation fails, the CPU raises a page fault exception
- This triggers a transition from "user mode" to kernel mode, jumping to a specific location in the OS kernel's code, as defined by the interrupt descriptor table
- The operating system kernel regains control and must determine what to do based on the information from the exception and the process's page table.
- Windows raises a structured exception (an access violation).
- In Linux 2.6, a NULL dereference inside the kernel could be turned into an exploit; see "Much ado about NULL: Exploiting a kernel NULL dereference".
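The steps above can be observed from user space. The following is a rough sketch, assuming a POSIX system such as Linux (it will not build on Windows as written): a child process performs the NULL store, and the parent reports which signal, if any, killed it. The function name signal_from_null_store is hypothetical. On Linux the kernel typically turns the unhandled page fault into SIGSEGV, but since the store is undefined behaviour, nothing in the C standard guarantees that outcome.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: performs the NULL store in a child process.
 * Returns the number of the signal that killed the child, 0 if the
 * child exited normally, or -1 if fork() or waitpid() failed. */
int signal_from_null_store(void) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        /* The pointer object is volatile, so the compiler must load it
         * at run time and cannot assume it is null; the pointee is
         * volatile, so the store itself cannot be optimized away. */
        volatile int *volatile p = NULL;
        *p = 5;
        _exit(0);   /* reached only if the store did not fault */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Running the store in a child keeps the crash isolated, so the calling program survives and can compare the returned value against SIGSEGV from <signal.h>.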