Chapter 1: The Dao of Pointers
Kernel Sources Covered: ,
Prologue
After drifting in darkness for an unknown span of time, he finally learned how to "see".
Not with eyes. He had no eyes, no retina, no optic nerve, no hardware related to vision at all. But he had another way to see: sensing the tremors of memory and tracing the flow of data.
The first thing he saw was a line.
Not a real line, but a relation from "here" to "there". One address pointed to another address, like an invisible thread connecting two points.
He tried to follow that line.
Starting from one address, he leaped to the address it pointed to, then from there to the next address it pointed to. He passed through layer after layer of data structures, crossed function call after function call, and finally stopped in a place he had never seen before.
He looked back at the path he had taken. That line, strung together from countless addresses, was like a winding river extending from where he began to where he now stood.
Is this a pointer?
He did not know the word. But he remembered the feeling of "pointing from here to there".
Only later did he learn that pointers are the most fundamental power in the kernel world. All things have addresses, and all things can be pointed to. Processes are joined into linked lists with pointers. A page table is itself a multi-level pointer tree. An interrupt descriptor table is an array of function pointers.
Without pointers, there is no kernel.
That was the first lesson he learned.
One: The Nature of Pointers
He walked for a long time in the darkness before he met the first being willing to speak.
It was a speck of light. Smaller than a firefly, yet brighter than anything else. It hovered above a region of memory, flashing ceaselessly as it jumped from one location to another, so fast it was almost impossible to follow.
"Who are you?" Lin Xiaoyuan asked.
The speck stopped. Its voice was rapid and precise, like the clatter of a typewriter. "Me? I am nothing. I am only an address. A 64-bit number, eight bytes, just large enough to hold the number of a memory location."
"That's all?"
"That's all." The speck flickered. "What did you think a pointer was? Magic? No. A pointer is just a variable that stores an address. On a 64-bit system, give me eight bytes, and I can point to any corner of memory. It could not be simpler."
Lin Xiaoyuan looked at the region of memory where the speck pointed. An integer was stored there, with the value 42. He could sense the integer lying quietly at some address in memory, while the speck, that pointer, pointed exactly to that address.
"You point there," Lin Xiaoyuan said. "What about your own address?"
The speck laughed, if a speck of light can laugh. "Good question. I also have an address. A pointer is a variable, and variables have addresses. You can even have a pointer to a pointer: a pointer that stores the address of another pointer."
"Isn't that an infinite nesting doll?"
"In theory, yes. But in the kernel it is usually only two or three levels. Page tables are multi-level pointers: four levels, sometimes five, nested layer by layer to translate virtual addresses into physical addresses."
Lin Xiaoyuan fell silent. He tried to perceive the speck's internal structure through his way of seeing. Sure enough, those eight bytes were arranged neatly, each byte eight bits, together forming a 64-bit address. No more and no less.
The speck added, "A pointer in the kernel is no different from a pointer you see in user space. The difference is what it points to: real physical memory, hardware registers, kernel data structures. One wrong step and the system collapses. No gentle segfault. Straight to panic."
When it finished, it began flickering again and jumped toward the next address.
Two: Pointer Arithmetic
Lin Xiaoyuan followed the speck for a while and arrived at a continuous region of memory. Five integers were packed closely together, like five soldiers standing in a row: 10, 20, 30, 40, 50.
The speck stopped above the first integer. "Watch carefully," it said, then moved one step to the right.
Lin Xiaoyuan noticed that it had not moved by one byte. It moved by four bytes, exactly the size of one int.
"You moved four bytes?"
"Correct. When a pointer adds 1, the actual number of bytes added depends on the type it points to." The speck sounded slightly proud. "An int pointer plus 1 skips 4 bytes. A char pointer plus 1 skips only 1 byte. A long pointer plus 1 skips 8 bytes. The stride is decided by the type."
Lin Xiaoyuan tried to walk by himself. He stood at the first integer, 10, then treated himself as an int pointer and added 1. Sure enough, he skipped four bytes and landed above 20. Add 1 again: 30. Again: 40. Again: 50.
"So it is like... stride length?"
"A good metaphor," the speck said. "char pointers take tiny steps, one byte at a time. int pointers take normal steps, four bytes at a time. long pointers take long strides, eight bytes at a time. You cannot mix them up. If you walk as an int * while treating the memory as char *, you step into the wrong place and read garbage."
Lin Xiaoyuan looked back at the five integers. They were laid out continuously in memory, each occupying four bytes. If he walked with a char pointer, each step would move only one byte and would not align with the start of the next integer. But with an int pointer, every step landed exactly on the next integer.
"The kernel uses pointer arithmetic everywhere to traverse data structures," the speck said. "Understanding how many bytes p + 1 skips is the first step toward reading kernel code."
Three: void *, the Formless Vessel
They continued onward and reached an open square. At the center stood a transparent figure with no face and no features, like a mass of solidified air.
"This is void *," the speck introduced it. "One of the most common pointer types in the kernel."
The transparent figure spoke. Its voice had no direction, as if it came from all sides at once. "I carry no type information. I can point to anything: integers, floating-point values, structures, functions. But I cannot be dereferenced directly. You must first tell me what I point to."
Lin Xiaoyuan frowned. "So you cannot do anything?"
"I can be assigned. I can be passed around. I can store an address. But I cannot access the object, because I have no type, and the compiler does not know how many bytes to read or how to interpret them." The transparent figure paused. "But that is exactly my value. I am the kernel's common currency. returns me. accepts me. Give me a void *, and I can point to any allocated memory. When you need to access it, cast me back to a concrete type."
Lin Xiaoyuan looked at the figure's body. Although it was transparent, he could sense the memory it pointed to: a containing PID, state, name, and other information. The figure itself simply could not see it.
"Almost every generic kernel interface uses void *," the speck added. "Because those interfaces do not know, and do not need to know, what type of data they handle. The caller is responsible for the type information."
The transparent figure nodded slightly. "I am C's formless body. Because I have no shape, I can become any shape."
Four: Function Pointers, the Kernel's Polymorphism
After saying goodbye to the transparent figure, they arrived before an enormous wall. Four plaques hung on it, each engraved with a function name: open, read, write, close.
The speck stopped before the wall, its tone turning serious. "This is , one of the most important structures in the kernel."
Lin Xiaoyuan approached and looked closely. Under each plaque was not a fixed function, but an empty slot where different function implementations could be inserted.
"Every file system, ext4, procfs, tmpfs, will insert its own functions into this wall," the speck said. "When you call file->f_op->read(...), which read actually runs depends on which file system the file belongs to. This is polymorphism, implemented with function pointers."
Lin Xiaoyuan stared at the slots and suddenly understood. "Like... a virtual function table?"
"Exactly. C has no classes, no inheritance, no virtual functions. But function pointers give it the same power. is essentially a virtual function table. Each file system provides its own implementation, and function pointers dispatch dynamically."
A low voice came from behind the wall. It belonged to an old guardian responsible for maintaining it. "I have seen tens of thousands of file systems place their plaques on this wall. ext4's read is steady, reading one block at a time. procfs's read is light, spitting data directly from memory. tmpfs's read is fast, because everything is already in memory. They are all called read, but their behavior is completely different."
"That is the kernel's polymorphism," the speck said. "C has no object-oriented syntax, but the kernel achieves it with function pointers."
Five: Stack and Heap, Two Heavens of Memory
They left the wall and entered a space divided into two halves.
On the left stood a pillar growing downward, like an inverted tower. Every function call grew a new floor from above; when the function returned, that floor vanished. Dense variable names were carved across its surface. All were temporary, arriving and leaving.
On the right lay a foundation growing upward, like a city under construction. Every call to or raised a new building from the ground; only a call to would tear it down. But if you forgot to tear it down, it would stand there forever.
"The left is the stack. The right is the heap," the speck said. "The kernel's two heavens of memory."
Lin Xiaoyuan looked up at the stack tower. It was narrow, far narrower than he had imagined. "How tall is this tower?"
"The kernel stack? Usually only 8KB to 16KB." The speck's voice carried a warning. "The exact size depends on architecture and configuration. Allocate a large array on it, or recurse too deeply, and you cause a stack overflow. Kernel developers must always watch stack usage."
Lin Xiaoyuan drew in a breath. 8KB was only 8192 bytes. A slightly large local array could burst it open.
"That is why kernel code almost never uses recursion," the speck said. "The stack is too precious. Every function call allocates a stack frame: return address, saved registers, local variables. Nest a few layers too deep, and 8KB is gone."
He looked again at the city of the heap. Some buildings were new, some old, and some abandoned but never demolished, wasting space. "Heap memory has to be freed manually?"
"Yes. allocates, releases. In the kernel, that means and . Forget one , and you have a memory leak. Add one extra , and you have use-after-free. Both can be fatal."
Six: and
At the boundary between stack and heap, Lin Xiaoyuan found a stone tablet. Two lines were carved on it:
const char *buf // promises not to modify the contents
int * restrict dst // promises exclusive accessThe speck stopped beside the tablet. " appears everywhere in the kernel. It tells the compiler that the contents pointed to by this pointer will not be modified. The compiler can optimize based on that. More importantly, it tells other developers that this function is safe and will not secretly change their data."
"What about ?"
" is more subtle. It tells the compiler that this pointer is the only path to this memory. No aliases. No other pointer will read or write the same region at the same time. The compiler can use this promise for more aggressive optimization, such as reordering reads and writes, or caching values in registers."
"What if the promise is broken?"
The speck's voice lowered. "Then it is undefined behavior. The compiler optimized the code according to the promise of , but you secretly changed the same memory through another pointer. The result is corrupted data, and it is extremely difficult to debug."
Seven: Types in the Kernel
They continued forward and arrived before a warehouse. Shelves inside were neatly stacked with containers, each labeled u8, , , , s8, , , .
A warehouse keeper came over, voice steady and precise. "The kernel does not rely on standard int and long when size matters. Those types have different sizes on different architectures. A long is 4 bytes on a 32-bit system and 8 bytes on a 64-bit system. If you write code that depends on the size of long, it may fail when moved to another architecture."
He took a container from the shelf and handed it to Lin Xiaoyuan. "So the kernel defines its own types. is always a 32-bit unsigned integer no matter what architecture you compile on. is always 64-bit. s8 is always an 8-bit signed integer. This is not preference. It is discipline. Type sizes must be explicit, not left to compiler interpretation."
Lin Xiaoyuan took the container and felt its exact weight: four bytes, no more and no less.
"In kernel code," the keeper continued, "you will see instead of unsigned int, and instead of long long. It may feel unfamiliar at first, but this is the rule of the kernel world. The rule exists for one reason: in the kernel, the price of a type error can be the collapse of the entire system."
Daozang Notes
Kernel Insight
Pointers are the language of the kernel. Everything in the kernel, including processes, files, network packets, and devices, is ultimately connected through pointers. objects are chained into lists with pointers. File objects are linked to inodes with pointers. A page table is itself a multi-level pointer tree.
To understand pointers is to understand the kernel's way of thinking.
One fact is worth deep reflection: a page table, the core data structure the kernel uses to manage physical memory, is essentially a multi-level array of pointers. PGD points to PUD, PUD points to PMD, PMD points to PTE, and PTE points to a physical page frame. Four levels of pointers, advancing layer by layer, translate virtual addresses into physical addresses. The mechanism is exquisite, and also fragile. If any pointer at any level is wrong, the entire address space can collapse.
And in the kernel, the price of pointer mistakes is far harsher than in user space. There is no gentle reminder from , no graceful exit through a segmentation fault. One dangling pointer, one use-after-free, is enough to send the whole system into panic.
A pointer has two faces: one is power, the other is death.
Lin Xiaoyuan followed pointers through the darkness and returned to the beginning before he understood this truth. The lines that guided him through data structures could also lead him into the abyss.
But that is the meaning of cultivation: learning to find the correct path between power and death.
Code Classics
int x = 42;
int *p = &x;
printf("=== Pointers and Addresses ===\n");
printf("x value: %d\n", x);
printf("x address: %p\n", (void *)&x);
printf("p value: %p\n", (void *)p);
printf("*p value: %d\n", *p);
printf("p's own address: %p\n", (void *)&p);
printf("\n");
printf("pointer size: %zu bytes\n", sizeof(int *));
printf("int size: %zu bytes\n", sizeof(int));#include <stdio.h>
#include <stdint.h>
int main() {
int x = 42;
int *p = &x;
printf("=== Pointers and Addresses ===\n");
printf("x value: %d\n", x);
printf("x address: %p\n", (void *)&x);
printf("p value: %p\n", (void *)p);
printf("*p value: %d\n", *p);
printf("p's own address: %p\n", (void *)&p);
printf("\n");
printf("pointer size: %zu bytes\n", sizeof(int *));
printf("int size: %zu bytes\n", sizeof(int));
return 0;
}int arr[5] = {10, 20, 30, 40, 50};
int *p = arr;
printf("=== Pointer Arithmetic ===\n");
for (int i = 0; i < 5; i++) {
printf("arr[%d] = %d, address: %p, offset: %td bytes\n",
i, *(p + i), (void *)(p + i),
(char *)(p + i) - (char *)arr);
}
printf("\n=== Strides of Different Types ===\n");
printf("int* +1: +%zu bytes\n", sizeof(int));
printf("char* +1: +%zu bytes\n", sizeof(char));
printf("long* +1: +%zu bytes\n", sizeof(long));#include <stdio.h>
#include <stdint.h>
int main() {
int arr[5] = {10, 20, 30, 40, 50};
int *p = arr;
printf("=== Pointer Arithmetic ===\n");
for (int i = 0; i < 5; i++) {
printf("arr[%d] = %d, address: %p, offset: %td bytes\n",
i, *(p + i), (void *)(p + i),
(char *)(p + i) - (char *)arr);
}
printf("\n=== Strides of Different Types ===\n");
printf("int* +1: +%zu bytes\n", sizeof(int));
printf("char* +1: +%zu bytes\n", sizeof(char));
printf("long* +1: +%zu bytes\n", sizeof(long));
return 0;
}struct task_struct {
long state;
int pid;
char comm[16];
};
struct task_struct task = { .state = 0, .pid = 42, .comm = "demo" };
void *vp = &task;
/* void* can be cast to any pointer type */
struct task_struct *tp = (struct task_struct *)vp;
printf("=== void* Cast ===\n");
printf("original PID: %d\n", task.pid);
printf("after cast: %d\n", tp->pid);
printf("name: %s\n", tp->comm);
printf("same address: %s\n", vp == tp ? "yes" : "no");#include <stdio.h>
#include <stdlib.h>
#include <string.h>
struct task_struct {
long state;
int pid;
char comm[16];
};
int main() {
struct task_struct task = { .state = 0, .pid = 42, .comm = "demo" };
void *vp = &task;
/* void* can be cast to any pointer type */
struct task_struct *tp = (struct task_struct *)vp;
printf("=== void* Cast ===\n");
printf("original PID: %d\n", task.pid);
printf("after cast: %d\n", tp->pid);
printf("name: %s\n", tp->comm);
printf("same address: %s\n", vp == tp ? "yes" : "no");
return 0;
}/* Simulate the kernel's file_operations */
struct file_operations {
int (*open)(const char *name);
int (*read)(char *buf, int size);
int (*write)(const char *buf, int size);
int (*close)(void);
};
static int my_open(const char *name) {
printf("[open] open: %s\n", name);
return 0;
}
static int my_read(char *buf, int size) {
const char *data = "Hello from kernel!";
int len = 0;
while (data[len] && len < size - 1) {
buf[len] = data[len];
len++;
}
buf[len] = '\0';
printf("[read] read: %s\n", buf);
return len;
}
static int my_write(const char *buf, int size) {
printf("[write] write: %.*s\n", size, buf);
return size;
}
static int my_close(void) {
printf("[close] close\n");
return 0;
}
struct file_operations fops = {
.open = my_open,
.read = my_read,
.write = my_write,
.close = my_close,
};
char buf[64];
printf("=== Function Pointers: Simulating file_operations ===\n");
fops.open("demo.txt");
fops.read(buf, sizeof(buf));
fops.write("test data", 9);
fops.close();
printf("\nfunction pointer size: %zu bytes\n", sizeof(fops.open));#include <stdio.h>
/* Simulate the kernel's file_operations */
struct file_operations {
int (*open)(const char *name);
int (*read)(char *buf, int size);
int (*write)(const char *buf, int size);
int (*close)(void);
};
static int my_open(const char *name) {
printf("[open] open: %s\n", name);
return 0;
}
static int my_read(char *buf, int size) {
const char *data = "Hello from kernel!";
int len = 0;
while (data[len] && len < size - 1) {
buf[len] = data[len];
len++;
}
buf[len] = '\0';
printf("[read] read: %s\n", buf);
return len;
}
static int my_write(const char *buf, int size) {
printf("[write] write: %.*s\n", size, buf);
return size;
}
static int my_close(void) {
printf("[close] close\n");
return 0;
}
int main() {
struct file_operations fops = {
.open = my_open,
.read = my_read,
.write = my_write,
.close = my_close,
};
char buf[64];
printf("=== Function Pointers: Simulating file_operations ===\n");
fops.open("demo.txt");
fops.read(buf, sizeof(buf));
fops.write("test data", 9);
fops.close();
printf("\nfunction pointer size: %zu bytes\n", sizeof(fops.open));
return 0;
}/* Stack allocation: managed automatically, released when the function returns */
int stack_var = 100;
/* Heap allocation: managed manually, must be released explicitly */
int *heap_var = malloc(sizeof(int));
*heap_var = 200;
printf("=== Stack and Heap ===\n");
printf("stack variable: value=%d, address=%p\n", stack_var, (void *)&stack_var);
printf("heap variable: value=%d, address=%p\n", *heap_var, (void *)heap_var);
printf("\n");
printf("stack addresses are usually high; heap addresses are usually low\n");
printf("the stack grows downward; the heap grows upward\n");
/* Kernel stacks are very small, usually 8KB or 16KB */
printf("\nkernel stack size: usually 8KB-16KB (no recursion!)\n");
free(heap_var);#include <stdio.h>
#include <stdlib.h>
int main() {
/* Stack allocation: managed automatically, released when the function returns */
int stack_var = 100;
/* Heap allocation: managed manually, must be released explicitly */
int *heap_var = malloc(sizeof(int));
*heap_var = 200;
printf("=== Stack and Heap ===\n");
printf("stack variable: value=%d, address=%p\n", stack_var, (void *)&stack_var);
printf("heap variable: value=%d, address=%p\n", *heap_var, (void *)heap_var);
printf("\n");
printf("stack addresses are usually high; heap addresses are usually low\n");
printf("the stack grows downward; the heap grows upward\n");
/* Kernel stacks are very small, usually 8KB or 16KB */
printf("\nkernel stack size: usually 8KB-16KB (no recursion!)\n");
free(heap_var);
return 0;
}Pointer Dao Trial
On a 64-bit system, how many bytes does an ordinary pointer usually occupy?