Code Generation and Linking: From AIR to Binary

Advanced

Prerequisites

  • Articles 1-3 of this series
  • Basic understanding of machine code and instruction encoding
  • Familiarity with executable formats (ELF, Mach-O, or PE/COFF)

Once Sema has produced typed AIR for a function, the compiler's backend takes over. This half of the pipeline — code generation and linking — converts abstract operations into concrete machine instructions and assembles them into an executable binary. The Zig compiler supports multiple codegen backends and multiple linker implementations, all orchestrated through a generic dispatch layer.

In this article, we'll trace how AIR flows through the codegen dispatcher, is lowered to backend-specific MIR, emitted as machine code, and linked into the final output.

The Codegen Dispatcher and Backend Selection

The codegen entry point is src/codegen.zig, which routes to the correct backend via importBackend():

fn importBackend(comptime backend: std.builtin.CompilerBackend) type {
    return switch (backend) {
        .stage2_aarch64 => aarch64,
        .stage2_c => @import("codegen/c.zig"),
        .stage2_llvm => @import("codegen/llvm.zig"),
        .stage2_riscv64 => @import("codegen/riscv64/CodeGen.zig"),
        .stage2_sparc64 => @import("codegen/sparc64/CodeGen.zig"),
        .stage2_spirv => @import("codegen/spirv/CodeGen.zig"),
        .stage2_wasm => @import("codegen/wasm/CodeGen.zig"),
        .stage2_x86, .stage2_x86_64 => @import("codegen/x86_64/CodeGen.zig"),
        // ...
    };
}

This is a comptime dispatch — the backend parameter is known at compile time, so importBackend resolves to a concrete module. Combined with dev.zig feature gating (via devFeatureForBackend at line 32), backends that aren't needed are completely dead-code-eliminated.
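The pattern is easy to see in isolation. Here's a minimal, self-contained sketch of comptime dispatch; the backend structs and enum below are invented for illustration (the real function returns imported modules keyed by std.builtin.CompilerBackend):

```zig
const std = @import("std");

// Hypothetical stand-ins for backend modules. Because the switch resolves
// at compile time, branches for unselected backends never reach the binary.
const X86Backend = struct {
    pub const name = "x86_64";
};
const WasmBackend = struct {
    pub const name = "wasm";
};

const Backend = enum { x86_64, wasm };

fn importBackend(comptime backend: Backend) type {
    return switch (backend) {
        .x86_64 => X86Backend,
        .wasm => WasmBackend,
    };
}

pub fn main() void {
    // The chosen module is a concrete type at compile time.
    const B = importBackend(.x86_64);
    std.debug.print("selected backend: {s}\n", .{B.name});
}
```

Because importBackend returns a type, every call site is monomorphized against one concrete backend module, which is what lets dev.zig gating strip the others entirely.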

flowchart TD
    AIR["AIR"] --> DISP["codegen.zig dispatcher"]
    DISP -->|"x86_64"| X86["x86_64/CodeGen.zig\n~190K lines"]
    DISP -->|"aarch64"| ARM["aarch64/\nCodeGen.zig"]
    DISP -->|"C"| CBE["c.zig\n~8K lines"]
    DISP -->|"LLVM"| LLVM["llvm.zig\n~13K lines"]
    DISP -->|"wasm"| WASM["wasm/CodeGen.zig"]
    DISP -->|"riscv64"| RV["riscv64/CodeGen.zig"]
    DISP -->|"spirv"| SPIRV["spirv/CodeGen.zig"]

Two-Phase Code Generation: MIR and Emission

Codegen is split into two phases, each with its own function. This split separates the what (which instructions to generate) from the where (putting bytes in the right place in the output file).

Phase 1: generateFunction at line 141 lowers AIR to backend-specific MIR (Machine IR). It takes an Air and produces an AnyMir:

pub fn generateFunction(
    lf: *link.File, pt: Zcu.PerThread,
    src_loc: Zcu.LazySrcLoc, func_index: InternPool.Index,
    air: *const Air, liveness: *const ?Air.Liveness,
) CodeGenError!AnyMir { ... }

Phase 2: emitFunction at line 176 converts MIR to raw machine code bytes. It's called from the linker and may query linker state (e.g., for relocation targets):

pub fn emitFunction(
    lf: *link.File, pt: Zcu.PerThread,
    src_loc: Zcu.LazySrcLoc, func_index: InternPool.Index,
    atom_index: u32, any_mir: *const AnyMir,
    w: *std.Io.Writer, debug_output: link.File.DebugInfoOutput,
) ...
flowchart LR
    AIR["AIR\n(typed)"] -->|"generateFunction"| MIR["MIR\n(backend-specific)"]
    MIR -->|"emitFunction"| MC["Machine Code\n(bytes)"]
    MC -->|"linker writes"| BIN["Output Binary"]

    style AIR fill:#e3f2fd
    style MIR fill:#fff3e0
    style MC fill:#fce4ec

The AnyMir union at line 100 is the bridge between the two phases:

pub const AnyMir = union {
    aarch64: @import("codegen/aarch64/Mir.zig"),
    riscv64: @import("codegen/riscv64/Mir.zig"),
    x86_64: @import("codegen/x86_64/Mir.zig"),
    wasm: @import("codegen/wasm/Mir.zig"),
    c: @import("codegen/c.zig").Mir,
    // ...
};

This union allows the generic codegen/linker interface to pass backend-specific data without generics or type erasure. Note that it is a bare (untagged) union: the active field is implied by the backend in use and is not stored in the value, so accessing the wrong field is a logic error caught only by runtime safety checks in safe build modes, not by the type system.
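A tiny standalone model of the bare-union idea follows; the Mir payload types here are invented for illustration:

```zig
const std = @import("std");

// Miniature of the AnyMir pattern: a bare (untagged) union whose active
// field is known from external state (the backend in use), not from a tag
// stored in the value itself.
const X86Mir = struct { instructions: u32 };
const WasmMir = struct { funcs: u16 };

const AnyMir = union {
    x86_64: X86Mir,
    wasm: WasmMir,
};

pub fn main() void {
    // The caller knows the backend is x86_64, so only that field is touched.
    // Reading .wasm here instead would be safety-checked illegal behavior.
    const mir = AnyMir{ .x86_64 = .{ .instructions = 42 } };
    std.debug.print("instructions: {d}\n", .{mir.x86_64.instructions});
}
```

The tradeoff versus a tagged union is one fewer word of storage per value, at the cost of pushing correctness onto the caller.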

Tip: The C backend is special — its "MIR" is essentially a C AST, and emitFunction is never called for it. Instead, link.C directly understands the C backend's MIR and renders it to a .c file.

Backend Spotlight: x86_64, C, and LLVM

The three most important backends serve different purposes:

Backend   Location                         Size          Role
x86_64    src/codegen/x86_64/CodeGen.zig   ~190K lines   Most mature self-hosted backend
C         src/codegen/c.zig                ~8K lines     Essential for bootstrap (outputs .c files)
LLVM      src/codegen/llvm.zig             ~13K lines    Glue layer to external LLVM for production

The x86_64 backend is the flagship self-hosted backend. It performs register allocation, instruction selection, and direct x86_64 encoding. At 190K lines, it's the largest single component in the compiler — bigger than Sema.

The C backend outputs C source code instead of machine code. This is what makes the bootstrap chain possible: zig1 uses the C backend to produce zig2.c, which the system C compiler then compiles. Despite its critical role, it's remarkably small at ~8K lines.

The LLVM backend doesn't generate machine code directly. Instead, it translates AIR into LLVM IR using LLVM's C API, then hands off to LLVM for optimization and native code generation. This is the production path for users who want maximum optimization.

There's also a generateSymbol function at line 299 that handles non-function declarations (global variables, constants). Unlike generateFunction, it doesn't go through MIR — it directly writes the value's byte representation.
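The "directly writes the byte representation" idea can be sketched as follows. This is an illustrative model (assuming a recent Zig release for std.mem.writeInt), not the compiler's actual code path, and the constant is made up:

```zig
const std = @import("std");

pub fn main() void {
    // A hypothetical u64 constant being "emitted" for a little-endian
    // target: no MIR stage, just the value's bytes written into a buffer
    // that the linker would place in a data section.
    var buf: [8]u8 = undefined;
    const value: u64 = 0xDEADBEEF;
    std.mem.writeInt(u64, &buf, value, .little);
    for (buf) |b| std.debug.print("{x:0>2}", .{b});
    std.debug.print("\n", .{});
}
```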

The Linker File Abstraction

The linker layer starts with the File struct at src/link.zig#L380:

pub const File = struct {
    tag: Tag,
    comp: *Compilation,
    emit: Path,
    file: ?fs.File,
    // ...
};

The Tag enum identifies the linker implementation:

pub const Tag = enum {
    coff2, elf, elf2, macho, c, wasm, spirv, plan9, lld,

    pub fn Type(comptime tag: Tag) type {
        return switch (tag) {
            .coff2 => Coff2, .elf => Elf, .elf2 => Elf2,
            .macho => MachO, .c => C, .wasm => Wasm,
            .spirv => SpirV, .lld => Lld, .plan9 => unreachable,
        };
    }
};
classDiagram
    class File {
        +tag: Tag
        +comp: *Compilation
        +emit: Path
        +prelink()
    }
    class Elf {
        +base: File
        +zig_object: ?*ZigObject
        +sections: MultiArrayList
        +files: MultiArrayList
    }
    class MachO {
        +base: File
    }
    class C {
        +base: File
    }
    class Wasm {
        +base: File
    }
    File <|-- Elf
    File <|-- MachO
    File <|-- C
    File <|-- Wasm

Notice there are two ELF implementations: elf (the original) and elf2 (a newer rewrite). The fromObjectFormat function selects between them based on a use_new_linker flag. This pattern of maintaining both old and new implementations during transitions is common in the Zig project.

The linker receives work from codegen through link_nav and link_type jobs in the compilation work queue. When a Nav's value is fully resolved, a link_nav job gets queued, which calls into the appropriate linker implementation to emit the declaration into the output binary.

Self-Hosted ELF Linker Deep Dive

The ELF linker at src/link/Elf.zig is the most complex self-hosted linker. Its struct embeds base: link.File and adds ELF-specific state:

base: link.File,
zig_object: ?*ZigObject,
rpath_table: std.StringArrayHashMapUnmanaged(void),
image_base: u64,
// ... many z_* flags for linker options
files: std.MultiArrayList(File.Entry) = .{},
sections: std.MultiArrayList(Section) = .{},

The linker manages Atoms (defined in Elf/Atom.zig) — the smallest relocatable units of code or data. Each function and global variable becomes an Atom. The ZigObject represents the Zig-generated object within the ELF file, while SharedObject entries represent dynamic libraries.

For incremental linking, the ELF linker can patch individual atoms in-place without re-linking the entire binary. When a function changes, only its atom gets re-emitted. This is possible because the initial link reserves padding space that later updates can fill.
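An illustrative model of that padding scheme (the field names and sizes are invented, not taken from Elf/Atom.zig):

```zig
const std = @import("std");

// Each atom reserves extra capacity at initial link time. An updated
// function fits in place if its new code is no larger than the reserved
// span; otherwise the atom must be moved and its relocations redone.
const Atom = struct {
    offset: u64, // file offset of the atom's code
    size: u64, // bytes currently used
    capacity: u64, // bytes reserved (size + padding)

    fn canPatchInPlace(self: Atom, new_size: u64) bool {
        return new_size <= self.capacity;
    }
};

pub fn main() void {
    const atom = Atom{ .offset = 0x1000, .size = 96, .capacity = 128 };
    std.debug.print("fits: {}\n", .{atom.canPatchInPlace(112)}); // within padding
    std.debug.print("fits: {}\n", .{atom.canPatchInPlace(200)}); // must relocate
}
```

The padding trades some file size for the ability to skip a full re-link on most edits.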

flowchart TD
    subgraph "ELF Linker"
        ZO["ZigObject\n(Zig code atoms)"]
        OBJ["Object Files\n(C objects, etc.)"]
        SO["Shared Objects\n(.so files)"]
        SECT["Sections\n(.text, .data, .rodata)"]
        SHDR["Section Headers"]
        OUT["ELF Binary"]
    end
    ZO --> SECT
    OBJ --> SECT
    SO --> SECT
    SECT --> SHDR
    SHDR --> OUT

Codegen and Linking in the Job System

As we saw in Article 1, the compilation uses a staged job queue. Codegen and linking are orchestrated through this system in src/Compilation.zig.

The processOneJob function handles codegen_func jobs by:

  1. Checking that all types in the AIR are fully resolved
  2. Creating a SharedMir for the function
  3. Either spawning a worker thread for codegen (if the backend supports it) or running codegen on the current thread
  4. Dispatching a link_func task to write the machine code into the output binary

The job priority system puts codegen_func and resolve_type_fully at stage 0 (highest priority), while other jobs like analyze_mod and link_nav are at stage 1. This ensures codegen threads stay busy while Sema continues analyzing other functions:

sequenceDiagram
    participant WQ as Work Queue
    participant T1 as Thread 1 (Sema)
    participant T2 as Thread 2 (Codegen)
    participant LNK as Linker

    T1->>WQ: Queue codegen_func (stage 0)
    T1->>T1: Continue analyzing next function
    T2->>WQ: Dequeue codegen_func
    T2->>T2: generateFunction (AIR → MIR)
    T2->>LNK: Dispatch link_func task
    LNK->>LNK: emitFunction (MIR → bytes)
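The staged dequeue order can be modeled in a few lines. This toy version (job names match the article, the queue mechanics are simplified) shows stage-0 jobs draining before stage-1 jobs:

```zig
const std = @import("std");

const Job = enum { codegen_func, resolve_type_fully, analyze_mod, link_nav };

// Lower-numbered stages drain first, so codegen work never waits
// behind stage-1 analysis jobs.
fn stage(job: Job) u8 {
    return switch (job) {
        .codegen_func, .resolve_type_fully => 0,
        .analyze_mod, .link_nav => 1,
    };
}

pub fn main() void {
    var queue = [_]Job{ .analyze_mod, .codegen_func, .link_nav, .resolve_type_fully };
    // A stable sort by stage reproduces the dequeue order.
    std.mem.sort(Job, &queue, {}, struct {
        fn lessThan(_: void, a: Job, b: Job) bool {
            return stage(a) < stage(b);
        }
    }.lessThan);
    for (queue) |j| std.debug.print("{s}\n", .{@tagName(j)});
}
```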

Tip: The separateCodegenThreadOk() method determines whether codegen can run on a different thread from Sema. Some backends have threading bugs (noted in Zcu.Feature.separate_thread), so the compiler falls back to single-threaded execution when needed.

What's Next

We've now traced the complete path from source bytes to binary output. In the final article, we'll zoom out and examine the orchestration layer: how Compilation.update() drives the entire cycle, how the three cache modes work, how the InternPool's dependency tracking enables fine-grained incremental recompilation, and how the multi-stage bootstrap ties everything together.