Read OSS

The Build Subsystem: gRPC, BuildKit, and Image Creation

Advanced

Prerequisites

  • Article 1: Architecture and Navigation Guide
  • Article 3: Container Lifecycle from run to Exit
  • Basic familiarity with gRPC and Protocol Buffers

Every article in this series so far has centered on XPC — the macOS-native IPC mechanism that connects the CLI to the API server to the helper daemons. The build subsystem breaks that pattern entirely. When you run container build, the CLI communicates with a BuildKit process running inside a Linux container VM via gRPC over a vsock socket. No XPC, no Mach services, no launchd — just a raw socket connection from macOS into the guest VM.

This isn't arbitrary. It's a direct consequence of the one-VM-per-container architecture: BuildKit is a Linux process, and Linux processes can't participate in macOS XPC. gRPC over vsock is the natural bridge between the two worlds. Understanding this system completes the picture of how apple/container's communication patterns adapt to their constraints.

Why gRPC Instead of XPC for Builds

The build subsystem's architecture falls out of a simple constraint: BuildKit runs inside a Linux VM, and XPC only works between macOS processes.

When you run container build, a dedicated "builder" container is started (or reused if already running) with the container ID buildkit. This container runs a BuildKit daemon that listens for connections. The macOS host connects to it over a vsock socket — a virtual socket that provides direct host-to-guest communication without going through the network stack.

flowchart LR
    subgraph macOS Host
        CLI["container build<br/>(CLI process)"]
        API["container-apiserver"]
    end
    subgraph Linux VM
        BK["BuildKit daemon<br/>(Linux process)"]
    end

    CLI -->|"XPC: dial(buildkit, port)"| API
    API -->|"vsock fd"| CLI
    CLI -->|"gRPC over vsock"| BK

    style CLI fill:#4A90D9,color:#fff
    style API fill:#D94A4A,color:#fff
    style BK fill:#2ECC71,color:#fff

The connection is established in three steps: the CLI asks the API server to dial a vsock port on the builder container (reusing the same containerDial XPC route used for other vsock operations); the API server replies with a file descriptor for the connected socket; and the CLI uses that file descriptor to establish a gRPC channel.
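A sketch of those three steps in Swift follows. It assumes grpc-swift's ConnectionTarget.connectedSocket and borrows the ContainerClient.dial spelling from the sequence diagram later in the article; neither is verbatim apple/container code.

```swift
import Foundation
import GRPC
import NIOPosix

let group = MultiThreadedEventLoopGroup(numberOfThreads: 1)

// 1. XPC: ask the API server to dial vsock port 8088 on "buildkit";
//    the reply carries a file descriptor for the connected socket.
let socket: FileHandle = try await ContainerClient.dial(id: "buildkit", port: 8088)

// 2. Hand the already-connected descriptor to the gRPC channel
//    configuration, skipping TCP connection establishment entirely.
let config = ClientConnection.Configuration.default(
    target: .connectedSocket(socket.fileDescriptor),
    eventLoopGroup: group
)

// 3. The Builder type wraps a channel like this one and exposes
//    info() and build() on top of it.
let channel = ClientConnection(configuration: config)
```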

The Builder Struct: vsock gRPC Channel

Builder is the central type in the build subsystem. Its initializer takes a FileHandle (the vsock socket) and an EventLoopGroup, then configures a gRPC ClientConnection over the connected socket:

The connection configuration at lines 39-56 is tuned for the build workload:

config.connectionIdleTimeout = TimeAmount(.seconds(600))
config.connectionKeepalive = .init(
    interval: TimeAmount(.seconds(600)),
    timeout: TimeAmount(.seconds(500)),
    permitWithoutCalls: true
)
config.callStartBehavior = .fastFailure
config.httpMaxFrameSize = 8 << 10
config.maximumReceiveMessageLength = 512 << 20
config.httpTargetWindowSize = 16 << 10

Notable choices:

  • maximumReceiveMessageLength is 512 MiB, to accommodate large build outputs.
  • connectionKeepalive uses 10-minute intervals with permitWithoutCalls: true, so the connection stays alive during long builds even when no calls are in flight.
  • callStartBehavior is .fastFailure, so the client fails immediately if the server isn't ready rather than queueing the call.

The socket buffers are explicitly set before creating the connection — 4 MiB send and 2 MiB receive — via setSockOpt calls at lines 366-405.
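A runnable sketch of that buffer tuning (macOS APIs): 4 MiB send and 2 MiB receive buffers set via setsockopt before the connection is created. It is demonstrated on a socketpair so it runs without a real vsock descriptor; the helper name is illustrative, and the kernel may clamp sizes above its socket-buffer limit.

```swift
import Darwin
import Foundation

// Set explicit send/receive buffer sizes on an already-open socket.
func tuneSocketBuffers(fd: Int32, sendBytes: Int32, receiveBytes: Int32) -> Bool {
    var snd = sendBytes
    var rcv = receiveBytes
    let len = socklen_t(MemoryLayout<Int32>.size)
    return setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &snd, len) == 0
        && setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcv, len) == 0
}

// A socketpair stands in for the connected vsock fd.
var fds: [Int32] = [0, 0]
precondition(socketpair(AF_UNIX, SOCK_STREAM, 0, &fds) == 0)
let tuned = tuneSocketBuffers(fd: fds[0], sendBytes: 4 << 20, receiveBytes: 2 << 20)
```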

sequenceDiagram
    participant CLI as BuildCommand
    participant CC as ContainerClient
    participant B as Builder

    CLI->>CC: dial(id: "buildkit", port: 8088)
    CC-->>CLI: FileHandle (vsock socket)
    CLI->>B: Builder(socket: fh, group: eventLoopGroup)
    B->>B: Configure gRPC ClientConnection
    CLI->>B: info()
    B-->>CLI: InfoResponse (BuildKit ready)
    CLI->>B: build(config)

The BuildCommand in the CLI orchestrates this: it dials the vsock port, creates a Builder, verifies BuildKit is running by calling info(), and then starts the build. If the builder container isn't running, the command automatically starts it using BuilderStart.start() and waits for it to be ready.

Build Configuration as HPACK Metadata Headers

Here's the most unconventional pattern in the codebase. Instead of encoding build parameters in the gRPC request message body, they're passed as HTTP/2 HPACK metadata headers:

extension CallOptions {
    public init(_ config: Builder.BuildConfig) throws {
        var headers: [(String, String)] = [
            ("build-id", config.buildID),
            ("context", URL(filePath: config.contextDir).path(percentEncoded: false)),
            ("dockerfile", config.dockerfile.base64EncodedString()),
            ("progress", config.terminal != nil ? "tty" : "plain"),
            ("target", config.target),
        ]
        for tag in config.tags {
            headers.append(("tag", tag))
        }
        for platform in config.platforms {
            headers.append(("platforms", platform.description))
        }
        // ... build args, labels, secrets, cache options, outputs
        self.init(customMetadata: HPACKHeaders(headers))
    }
}

The Dockerfile content is base64-encoded into a header. Tags, platforms, build args, labels, and secrets all go into headers, with repeated values using the same header key multiple times (which HPACK supports). Build secrets are particularly interesting: they're base64-encoded as id=base64data pairs in the secrets header.
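The secrets encoding is easy to see in isolation. This illustrative helper (the function name is hypothetical, not an apple/container API) flattens a secret into the id=base64data shape described above:

```swift
import Foundation

// Flatten a build secret into a single header value: "id=base64data".
func secretsHeaderValue(id: String, secret: Data) -> String {
    "\(id)=\(secret.base64EncodedString())"
}

let value = secretsHeaderValue(id: "npm-token", secret: Data("s3cret".utf8))
// value == "npm-token=czNjcmV0"
```

The receiving side splits on the first "=" and base64-decodes the remainder, so secret values can contain arbitrary bytes without escaping concerns.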

Why headers instead of a message body? This design maps to how the BuildKit shim on the Linux side works. The shim process inside the container receives these headers as part of the gRPC call setup, before any streaming begins. It uses them to configure the BuildKit session, then the bidirectional stream handles the actual build I/O. This separation of configuration (headers) from data (stream) is clean — the configuration is available synchronously before streaming starts.

Bidirectional Streaming: Progress and Terminal Resize

The build operation uses gRPC bidirectional streaming via performBuild. The client sends ClientStream messages and receives ServerStream messages concurrently.

BuildPipeline manages the server-side stream. It maintains a chain of BuildPipelineHandler implementations, each responsible for a different type of server message:

sequenceDiagram
    participant Client as macOS (Builder)
    participant Stream as gRPC Stream
    participant Server as BuildKit (Linux VM)

    Note over Client,Server: Bidirectional streaming

    Server->>Stream: ServerStream (file sync request)
    Stream->>Client: BuildFSSync handles
    Client->>Stream: ClientStream (file data)

    Server->>Stream: ServerStream (content proxy request)
    Stream->>Client: BuildRemoteContentProxy handles
    Client->>Stream: ClientStream (content data)

    Server->>Stream: ServerStream (build progress)
    Stream->>Client: BuildStdio renders to terminal

    Client->>Stream: ClientStream (terminal resize)
    Stream->>Server: Resize command

The handlers in the pipeline at lines 28-35:

Handler                   Responsibility
BuildFSSync               Syncs build context files from host to BuildKit
BuildRemoteContentProxy   Proxies OCI content from host's content store
BuildImageResolver        Resolves base images during build
BuildStdio                Renders build progress output to terminal

Each handler implements accept(_ packet:) -> Bool to check if it can handle a packet, and handle(_ sender:, _ packet:) to process it. The pipeline iterates handlers in order, stopping at the first one that accepts the packet.
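A toy sketch of that dispatch loop follows. Packet, the protocol name, and the handler bodies are stand-ins; the real BuildPipelineHandler operates on gRPC ServerStream packets and an async sender.

```swift
// Stand-in for a ServerStream message.
struct Packet { let kind: String }

protocol BuildHandler {
    func accept(_ packet: Packet) -> Bool
    func handle(_ packet: Packet) -> String
}

struct FSSync: BuildHandler {
    func accept(_ p: Packet) -> Bool { p.kind == "fssync" }
    func handle(_ p: Packet) -> String { "synced build context" }
}

struct Stdio: BuildHandler {
    func accept(_ p: Packet) -> Bool { p.kind == "stdio" }
    func handle(_ p: Packet) -> String { "rendered progress" }
}

// Iterate in order; the first handler that accepts the packet wins.
func dispatch(_ packet: Packet, through handlers: [BuildHandler]) -> String? {
    for handler in handlers where handler.accept(packet) {
        return handler.handle(packet)
    }
    return nil
}

let handlers: [BuildHandler] = [FSSync(), Stdio()]
let outcome = dispatch(Packet(kind: "stdio"), through: handlers)
// outcome == "rendered progress"
```

Because dispatch stops at the first accepting handler, the chain's order matters: more specific handlers go before catch-alls.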

On the client-to-server direction, the build method at lines 80-131 sends terminal resize events. A SIGWINCH signal handler watches for terminal size changes and yields ClientStream messages containing serialized TerminalCommand structs with the new dimensions.
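A minimal sketch of such a SIGWINCH watcher on macOS: a dispatch signal source fires on terminal resize and reads the new dimensions with TIOCGWINSZ. The real code would serialize a TerminalCommand into a ClientStream message at that point; the queue and semaphore wiring here is illustrative.

```swift
import Dispatch
import Foundation

signal(SIGWINCH, SIG_IGN)  // hand the signal over to the dispatch source

let winchQueue = DispatchQueue(label: "build.winch")
let winch = DispatchSource.makeSignalSource(signal: SIGWINCH, queue: winchQueue)
let fired = DispatchSemaphore(value: 0)

winch.setEventHandler {
    var size = winsize()
    // TIOCGWINSZ reads the current terminal size; it fails (leaving
    // 0x0) when stdout is not a tty.
    _ = ioctl(STDOUT_FILENO, TIOCGWINSZ, &size)
    print("resize to \(size.ws_col)x\(size.ws_row)")
    fired.signal()
}
winch.resume()

// Simulate a resize so the handler runs once.
kill(getpid(), SIGWINCH)
let observed = fired.wait(timeout: .now() + .seconds(2))
```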

Tip: The BuildPipeline uses a custom untilFirstError concurrency primitive instead of withThrowingTaskGroup. The standard task group can't exit when a single task fails while the main loop is still iterating the stream — the untilFirstError implementation at lines 98-167 solves this by running the stream consumption and error monitoring as concurrent tasks.
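A simplified illustration of the semantics untilFirstError provides: run several long-lived operations concurrently and let the first thrown error cancel the rest. This task-group version only demonstrates the desired behavior; the real implementation at lines 98-167 is a custom primitive, needed because the stream iteration can't be restructured as a plain child task the way it is here.

```swift
struct BuildFailure: Error {}

// Run all operations concurrently; rethrow the first error and cancel
// the survivors, or return once every operation completes.
func untilFirstError(
    _ operations: [@Sendable () async throws -> Void]
) async throws {
    try await withThrowingTaskGroup(of: Void.self) { group in
        for operation in operations {
            group.addTask { try await operation() }
        }
        // next() rethrows the first failure; the group then cancels
        // and awaits the remaining tasks before propagating it.
        while try await group.next() != nil {}
    }
}
```

Calling this with a long-running "stream loop" and an immediately failing task returns the failure promptly instead of waiting out the loop.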

Build Export: From BuildKit Output to Local Content Store

When the build completes, BuildKit writes the output (OCI image layers) to a location accessible to the host. The BuildExport type defines the supported output formats:

  • oci — Export as an OCI image archive, then load it into the local content store via ClientImage.load
  • tar — Export as a tar archive to a specified destination
  • local — Export build output to a local directory

The CLI's export handling in BuildCommand.swift#L390-L442 shows the post-build flow for OCI exports: the output tar is loaded into the images service, unpacked for the target platform, and tagged with all requested image names. A ProgressTaskCoordinator tracks unpacking progress across multiple exports.

flowchart TD
    A[Build Complete] --> B{Export type?}
    B -->|oci| C[Load archive via ClientImage]
    C --> D[Unpack for target platform]
    D --> E[Tag with requested names]
    E --> F["Print: Successfully built <names>"]
    B -->|tar| G[Move out.tar to destination]
    G --> H["Print: Successfully exported to <dest>"]
    B -->|local| I[Copy local dir to destination]
    I --> H

Architectural Reflections

Looking at the build subsystem alongside the rest of apple/container reveals a clear architectural principle: use the right communication mechanism for each boundary.

Between macOS processes that share the same user context and need privilege separation: XPC with Mach services and audit token validation. Between a macOS host and a Linux VM guest: gRPC over vsock, because the guest can't participate in macOS IPC primitives. The project doesn't force everything through a single abstraction — it adapts to the constraints of each boundary.

The build subsystem also demonstrates how the one-VM-per-container model extends naturally to development workflows. BuildKit gets its own VM, isolated from your running containers. If a build goes wrong, it can't affect your running services. If you want different BuildKit configurations for different projects, you could potentially run multiple builder containers.

This concludes our deep dive into apple/container. Across six articles, we've traced the architecture from the top-level process model down to individual XPC messages, from CLI argument parsing through VM boot to DNS resolution. The codebase is remarkably consistent in its patterns — the Service/Harness split, the client/server target pairs, the typed route enums — which makes it navigable once you understand the conventions. The one area where it deliberately breaks convention — the gRPC-based build subsystem — does so for exactly the right reasons.