Compilation Orchestration, Caching, and Incremental Compilation
Prerequisites
- Articles 1-4 of this series
- Understanding of caching strategies and dependency graphs
- Familiarity with concurrent programming (thread pools, mutexes, atomics)
Over the previous four articles, we've traced the individual phases of the Zig compiler: parsing, AstGen, Sema, codegen, and linking. But who drives the whole thing? How does the compiler know which functions to re-analyze when a source file changes? How does caching work? And how does the bootstrap process leverage all of this to build a full compiler from a WebAssembly blob?
This final article covers the orchestration layer — the Compilation.update() loop, the job queue system, three cache modes, the InternPool's dependency tracking, and the PerThread threading model.
The Compilation.update() Loop
Everything starts with Compilation.update(). This is the function that buildOutputType() calls after constructing a Compilation. It's also what a long-running compilation server calls on each edit cycle.
The function's flow depends on the cache mode, but the general structure is:
```mermaid
flowchart TD
    START["update()"] --> CLEAR["Clear misc failures"]
    CLEAR --> CACHE{"Cache mode?"}
    CACHE -->|"whole"| CHECK["Check cache manifest"]
    CHECK -->|"hit"| DONE["Return (cache hit)"]
    CHECK -->|"miss"| WORK
    CACHE -->|"none"| WORK["Create temp directory"]
    CACHE -->|"incremental"| WORK2["Detect changed files"]
    WORK --> QUEUE["Queue AstGen, C compilation,\nSema, codegen, linking jobs"]
    WORK2 --> QUEUE
    QUEUE --> PROCESS["Process work queues\n(thread pool)"]
    PROCESS --> LINK["Wait for link tasks"]
    LINK --> FINALIZE["Finalize output"]
    FINALIZE --> END["Write cache manifest"]
```
At line 2884, the function branches on comp.cache_use:
- `whole`: Checks a full cache manifest upfront. If all inputs match, the entire compilation is skipped. This is the default mode for one-shot builds.
- `none`: No caching at all. Used for special cases.
- `incremental`: Per-declaration tracking. Changed files trigger targeted re-analysis.
For the whole mode, the cache check at lines 2898–2960 computes a manifest from all inputs (source files, flags, target triple, etc.) and checks if a cached output exists. This can short-circuit the entire compilation in a single filesystem check.
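The idea behind the manifest check can be illustrated with a small sketch (Python; the file layout and manifest format here are hypothetical stand-ins, not Zig's actual cache implementation):

```python
import hashlib
import json
import os

def manifest_digest(source_files, flags, target):
    """Hash every input that could affect the output into one digest."""
    h = hashlib.sha256()
    for path in sorted(source_files):
        with open(path, "rb") as f:
            h.update(f.read())
    h.update(json.dumps({"flags": sorted(flags), "target": target}).encode())
    return h.hexdigest()

def update(source_files, flags, target, cache_dir="zig-cache"):
    """Return (output_path, cache_hit). One filesystem check short-circuits
    the whole build when nothing changed."""
    digest = manifest_digest(source_files, flags, target)
    cached = os.path.join(cache_dir, digest)
    if os.path.exists(cached):
        return cached, True  # cache hit: skip the entire compilation
    os.makedirs(cache_dir, exist_ok=True)
    with open(cached, "w") as f:
        f.write("fake output")  # stand-in for running the real build
    return cached, False
```

Any change to a source file, a flag, or the target produces a different digest, so the cached output is simply never found and a full rebuild runs.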
Staged Work Queues and Job Priorities
The Compilation maintains an array of work queues, sized at compile time based on the number of job stages:
```zig
work_queues: [len: { ... }]std.Deque(Job),
```
The length calculation at lines 119-127 iterates over all Job.Tag variants and finds the maximum stage number. The Job union defines the work types:
```zig
const Job = union(enum) {
    codegen_func: struct { func: InternPool.Index, air: Air },
    link_nav: InternPool.Nav.Index,
    link_type: InternPool.Index,
    analyze_comptime_unit: InternPool.AnalUnit,
    analyze_func: InternPool.Index,
    analyze_mod: *Package.Module,
    resolve_type_fully: InternPool.Index,
    windows_import_lib: usize,
};
```
The stage() function assigns priorities:
```zig
fn stage(tag: Tag) usize {
    return switch (tag) {
        .resolve_type_fully, .analyze_func, .codegen_func => 0,
        else => 1,
    };
}
```
Stage 0 jobs (type resolution, function analysis, codegen) are processed before stage 1 jobs (module analysis, link_nav, etc.). This design ensures codegen threads have work to do as soon as Sema finishes analyzing a function, maximizing parallelism.
```mermaid
flowchart LR
    subgraph "Stage 0 (High Priority)"
        RF["resolve_type_fully"]
        AF["analyze_func"]
        CF["codegen_func"]
    end
    subgraph "Stage 1 (Normal Priority)"
        AM["analyze_mod"]
        LN["link_nav"]
        LT["link_type"]
        ACU["analyze_comptime_unit"]
    end
    AF -->|"produces"| CF
    CF -->|"produces"| LN
```
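A minimal model of the staged queues (a Python sketch; the real implementation is a comptime-sized array of `std.Deque(Job)`, and the stage table below just mirrors the `stage()` function shown earlier):

```python
from collections import deque

# Stage assignment mirroring stage(): lower stage = higher priority.
STAGE = {
    "resolve_type_fully": 0, "analyze_func": 0, "codegen_func": 0,
    "analyze_mod": 1, "link_nav": 1, "link_type": 1, "analyze_comptime_unit": 1,
}
# Queue count derived from the job table, like the comptime length calculation.
NUM_STAGES = max(STAGE.values()) + 1

queues = [deque() for _ in range(NUM_STAGES)]

def enqueue(job_tag, payload=None):
    queues[STAGE[job_tag]].append((job_tag, payload))

def next_job():
    # Drain stage 0 completely before touching stage 1, so codegen work
    # produced by Sema is always picked up first.
    for q in queues:
        if q:
            return q.popleft()
    return None
```

With this ordering, an `analyze_func` job enqueued after an `analyze_mod` job is still dequeued first, which is exactly the "keep codegen threads fed" property the staged design is after.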
The processOneJob function dispatches each job type. As we saw in Article 4, codegen_func jobs are handled by creating a SharedMir and either spawning a thread or running codegen inline.
Tip: The compile-time assertion at line 1012 (`assert(stage(.resolve_type_fully) <= stage(.codegen_func))`) enforces that type resolution always completes before codegen attempts to use the types. This is a structural guarantee, not just a runtime hope.
Cache Modes: None, Whole, and Incremental
The three cache modes represent fundamentally different strategies:
| Mode | When Used | Granularity | Invalidation |
|---|---|---|---|
| `none` | Special cases | — | Everything rebuilds |
| `whole` | Default one-shot | Entire compilation | Any input change invalidates |
| `incremental` | Watch mode / server | Per-declaration | Dependency-tracked invalidation |
Whole caching is the simplest and most common. It hashes all inputs (source files, compiler flags, target) into a cache manifest. If the manifest matches, the cached binary is used directly. This gives the "nothing changed, nothing to do" fast path.
Incremental caching is far more sophisticated. It tracks dependencies at the AnalUnit granularity — individual functions, comptime blocks, and type resolutions. When a source file changes, only the declarations that actually changed (detected via source hash comparison) are re-analyzed, and only their dependents are invalidated.
The incremental mode requires the InternPool's dependency tracking infrastructure, which we'll explore next.
Dependency Tracking in the InternPool
The InternPool houses eight dependency hashmaps at lines 34-65:
```zig
src_hash_deps: AutoArrayHashMap(TrackedInst.Index, DepEntry.Index),
nav_val_deps: AutoArrayHashMap(Nav.Index, DepEntry.Index),
nav_ty_deps: AutoArrayHashMap(Nav.Index, DepEntry.Index),
interned_deps: AutoArrayHashMap(Index, DepEntry.Index),
zon_file_deps: AutoArrayHashMap(FileIndex, DepEntry.Index),
embed_file_deps: AutoArrayHashMap(Zcu.EmbedFile.Index, DepEntry.Index),
namespace_deps: AutoArrayHashMap(TrackedInst.Index, DepEntry.Index),
namespace_name_deps: AutoArrayHashMap(NamespaceNameKey, DepEntry.Index),
```
```mermaid
graph TD
    subgraph "Dependency Sources (Dependees)"
        SH["src_hash_deps\n(ZIR instruction hashes)"]
        NV["nav_val_deps\n(Nav values)"]
        NT["nav_ty_deps\n(Nav types)"]
        ID["interned_deps\n(runtime funcs, containers)"]
        NS["namespace_deps\n(all names in scope)"]
        NN["namespace_name_deps\n(specific name existence)"]
    end
    subgraph "Dependency Consumers (Dependers)"
        AU["AnalUnit\n(func, comptime, nav_val, etc.)"]
    end
    SH -->|"invalidates"| AU
    NV -->|"invalidates"| AU
    NT -->|"invalidates"| AU
    ID -->|"invalidates"| AU
    NS -->|"invalidates"| AU
    NN -->|"invalidates"| AU
```
Each hashmap maps from a dependee (something that can change) to a linked list of DepEntry values. Each DepEntry records which AnalUnit depends on that thing, plus a pointer to the next entry in two linked lists (one for "all deps of this dependee" and one for "all deps from this depender").
The TrackedInst provides stable cross-update references to ZIR instructions. On an incremental update, ZIR instruction indices may shift, so TrackedInst maps old indices to new ones. When mapping fails (the instruction was deleted), the MaybeLost wrapper uses a sentinel .lost value.
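The remapping idea can be sketched in a few lines (Python; the `LOST` sentinel stands in for `MaybeLost`'s `.lost` value, and the names are illustrative):

```python
LOST = object()  # stands in for the .lost sentinel

def remap_tracked(tracked, old_to_new):
    """Map each tracked ZIR instruction's old index to its new index.
    An instruction whose old index has no mapping was deleted: mark it LOST."""
    return {inst: old_to_new.get(old_idx, LOST)
            for inst, old_idx in tracked.items()}
```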
The invalidation flow works like this:
- Source file changes → AstGen produces new ZIR
- `TrackedInst` source hashes are compared: old vs. new
- Changed hashes trigger `src_hash_deps` lookups
- Each dependent `AnalUnit` is marked for re-analysis
- Re-analysis may change Nav values/types, triggering further cascading through `nav_val_deps` and `nav_ty_deps`
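The cascade above amounts to a worklist traversal of the dependency graph. A sketch (Python; the single flat map here stands in for the InternPool's several hashmaps and `DepEntry` linked lists):

```python
def invalidate(changed, deps):
    """deps maps a dependee (e.g. a source hash or Nav) to the AnalUnits
    that depend on it. Returns every unit transitively marked outdated."""
    worklist = list(changed)
    outdated = set()
    while worklist:
        dependee = worklist.pop()
        for unit in deps.get(dependee, ()):
            if unit not in outdated:
                outdated.add(unit)
                # Re-analyzing a unit may change its value or type,
                # so anything depending on *it* must be revisited too.
                worklist.append(unit)
    return outdated
```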
The first_dependency map at line 76 provides the reverse direction: given an AnalUnit, find all its dependencies so they can be removed when the unit is re-analyzed.
The PerThread Threading Model
Thread-safe access to Zcu and InternPool is managed through PerThread:
```zig
zcu: *Zcu,
tid: Id, // dense, per-thread unique index

pub fn activate(zcu: *Zcu, tid: Id) Zcu.PerThread {
    zcu.intern_pool.activate();
    return .{ .zcu = zcu, .tid = tid };
}

pub fn deactivate(pt: Zcu.PerThread) void {
    pt.zcu.intern_pool.deactivate();
}
```
The activate/deactivate pattern is a lightweight scope guard. activate() increments the InternPool's active thread count, and deactivate() decrements it. This doesn't lock anything — the InternPool's sharded design means multiple threads can intern values simultaneously without contention, as long as they use different shards (selected by tid).
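The contention-avoidance idea can be sketched as tid-selected shards (Python with threads; a deliberate simplification of the real sharded InternPool, which also handles cross-shard reads and atomic publication):

```python
import threading

NUM_SHARDS = 4
shards = [dict() for _ in range(NUM_SHARDS)]  # one map per thread id

def intern(tid, key, value):
    # Each thread writes only to its own shard, selected by tid, so the
    # common write path never contends with other threads' writes.
    shards[tid][key] = value
    return (tid, key)  # the "index" implicitly records the owning shard

threads = [
    threading.Thread(target=intern, args=(i, f"type{i}", i))
    for i in range(NUM_SHARDS)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```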
```mermaid
sequenceDiagram
    participant T1 as Thread 1
    participant T2 as Thread 2
    participant IP as InternPool
    participant S1 as Shard[tid=0]
    participant S2 as Shard[tid=1]
    T1->>IP: activate(tid=0)
    T2->>IP: activate(tid=1)
    T1->>S1: intern type (no contention)
    T2->>S2: intern type (no contention)
    T1->>IP: deactivate()
    T2->>IP: deactivate()
```
The Compilation itself has a mutex for protecting shared mutable state like error lists, failed object tables, and the work queues. In single-threaded builds, this mutex becomes a no-op struct.
The threading model is designed for the common case where Sema runs on one thread while codegen runs on another. The InternPool's per-thread Local storage means Sema can intern new types without blocking codegen's reads. The dependency tracking data is only modified during the single-threaded invalidation phase between updates.
Tip: The `IdBacking` type is `u7`, supporting up to 128 threads. The `tid` is embedded in the high bits of InternPool indices via the `tid_shift_*` fields, so every index implicitly encodes which thread created it.
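Embedding the tid in the high bits amounts to simple bit packing (a Python sketch; the 32-bit width and shift of 25 are illustrative choices so that a `u7` tid fits in the top bits, not the compiler's actual constants):

```python
TID_BITS = 7      # matches the u7 backing type (up to 128 threads)
INDEX_BITS = 25   # illustrative: remaining bits of a 32-bit index
TID_SHIFT = INDEX_BITS

def make_index(tid, local_index):
    """Pack the creating thread's id into the high bits of an index."""
    assert tid < (1 << TID_BITS) and local_index < (1 << INDEX_BITS)
    return (tid << TID_SHIFT) | local_index

def unpack(index):
    """Recover (tid, local_index) from a packed index."""
    return index >> TID_SHIFT, index & ((1 << TID_SHIFT) - 1)
```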
The Multi-Stage Bootstrap
Let's bring everything full circle. As we introduced in Article 1, the bootstrap uses three stages. Now that we understand the compilation pipeline, we can appreciate the elegance of this design.
The bootstrap.c program orchestrates the chain:
Stage 1 — zig1 (bootstrap environment):
```sh
# 1. Build wasm2c from C
cc -o zig-wasm2c stage1/wasm2c.c

# 2. Convert zig1.wasm to C
./zig-wasm2c stage1/zig1.wasm zig1.c

# 3. Compile zig1 from C
cc -o zig1 zig1.c stage1/wasi.c
```
Then bootstrap.c writes a config.zig with pub const dev = .core; at line 145, setting the stage for zig2.
Stage 2 — zig2 (core environment):
```sh
# 4. zig1 compiles the compiler to C
./zig1 lib build-exe -ofmt=c -target host ...

# 5. System C compiler builds zig2
cc -o zig2 zig2.c compiler_rt.c
```
Stage 3 — zig3 (full environment):
zig2 runs as a normal Zig compiler (with dev = .core supporting all backends and linkers) and builds the final stage 3 compiler with dev = .full.
The feature-gating system at src/dev.zig makes each stage viable:
```mermaid
sequenceDiagram
    participant B as bootstrap (zig1)
    participant C as core (zig2)
    participant F as full (zig3)
    Note over B: 6 features enabled
    Note over B: C backend only
    Note over B: No networking, no fmt
    B->>C: Compile with -ofmt=c
    Note over C: ~35 features enabled
    Note over C: All backends + linkers
    Note over C: No fmt, fetch, init
    C->>F: Compile with all features
    Note over F: All features enabled
    Note over F: Complete compiler
```
The bootstrap environment at lines 58-67 supports only: build_exe_command, build_obj_command, ast_gen, sema, c_backend, and c_linker. Everything else — x86_64 codegen, ELF linking, LLVM, incremental compilation — is dead-code-eliminated at compile time. The resulting zig1.c is small enough to compile quickly with any C compiler.
The core environment at lines 68-130 enables most compilation features but deliberately excludes networking (network_listen), the fmt command, and other non-essential features. The comment on line 126 explains why: "Avoid dragging networking into zig2.c because it adds dependencies on some linker symbols that are annoying to satisfy while bootstrapping."
This three-stage design solves the fundamental chicken-and-egg problem of self-hosting: you need a Zig compiler to compile the Zig compiler. By using WebAssembly as the portable stage0, converting through C as the universal bootstrap language, and progressively enabling features at each stage, Zig achieves a fully self-hosted compiler that can be built from source on any platform with a C compiler.
Closing Thoughts
Over these five articles, we've mapped the entire Zig compiler — from the repository layout to the final binary output. The architecture reflects several distinctive design principles:
- Flat data structures everywhere. The AST, ZIR, AIR, and InternPool all use integer indices rather than pointers. This gives excellent cache locality and trivial serialization.
- Comptime-driven configuration. The `dev.zig` feature gating, the `importBackend()` comptime dispatch, and the `AnyMir` union all leverage Zig's comptime capabilities to dead-code-eliminate at compile time.
- Incremental from the ground up. The `AnalUnit`/`Nav`/`TrackedInst` dependency-tracking infrastructure isn't bolted on; it's woven into the core data structures.
- Frontend shared as a library. Putting the tokenizer, parser, and AstGen in `lib/std/zig/` ensures one source of truth for the language's syntax, shared between the compiler, formatter, and language server.
- Self-hosting through progressive capability. The three-stage bootstrap with `dev.zig` feature gating is an elegant solution to the self-hosting bootstrapping problem.
The Zig compiler is a living project targeting 0.16.0-dev, and these structures will continue to evolve. But the fundamental architecture — the IR chain, the InternPool, the staged job system, and the bootstrap strategy — forms a foundation that's both principled and pragmatic. Understanding it puts you in a strong position to contribute, build tooling around Zig, or simply appreciate one of the most ambitious compiler projects in modern systems programming.