The XPC Communication Layer: How Processes Talk to Each Other

Advanced

Prerequisites

  • Article 1: Architecture and Navigation Guide
  • Familiarity with Swift actors and async/await
  • Basic understanding of macOS XPC and Mach services

As we saw in Part 1, apple/container is not a single process — it's five cooperating executables. Every message that flows between them passes through a custom XPC abstraction layer that lives in Sources/ContainerXPC/. This layer does three things: it provides type-safe access to XPC dictionaries, it turns XPC's callback-based API into Swift async/await, and it enforces EUID-based security on every incoming message.

This article dissects each component of that layer, then examines how the service targets build typed API contracts on top of it, and finally explores a particularly elegant security pattern used by the container runtime helper.

XPCMessage: A Typed Dictionary over xpc_object_t

Apple's XPC framework works with opaque xpc_object_t values. Reading a string out of an XPC dictionary requires calling xpc_dictionary_get_string with a C string key, getting back an UnsafePointer<CChar>?, and converting it. This is tedious and error-prone.

XPCMessage wraps an xpc_object_t with type-safe accessors for every type the project needs:

classDiagram
    class XPCMessage {
        +routeKey: String$
        +errorKey: String$
        -object: xpc_object_t
        -lock: NSLock
        +string(key) String?
        +set(key, String)
        +data(key) Data?
        +set(key, Data)
        +bool(key) Bool
        +uint64(key) UInt64
        +int64(key) Int64
        +date(key) Date
        +fileHandle(key) FileHandle?
        +set(key, FileHandle)
        +endpoint(key) xpc_endpoint_t?
        +reply() XPCMessage
        +error() throws
        +set(error: ContainerizationError)
    }
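A minimal sketch of the accessor pattern, assuming a simplified shape (the real `XPCMessage` in `Sources/ContainerXPC/` has many more accessors and different internals), shows how the C-string tedium gets hidden behind one lock-protected wrapper:

```swift
import XPC
import Foundation

// Simplified sketch of the typed-accessor pattern; illustrative only,
// not the project's exact implementation.
public struct XPCMessageSketch {
    private let object: xpc_object_t
    private let lock = NSLock()

    public init() {
        self.object = xpc_dictionary_create(nil, nil, 0)
    }

    public func string(key: String) -> String? {
        lock.lock()
        defer { lock.unlock() }
        // xpc_dictionary_get_string returns an UnsafePointer<CChar>?;
        // the wrapper converts it so callers never touch C strings.
        guard let ptr = xpc_dictionary_get_string(object, key) else { return nil }
        return String(cString: ptr)
    }

    public func set(key: String, value: String) {
        lock.lock()
        defer { lock.unlock() }
        xpc_dictionary_set_string(object, key, value)
    }
}
```

Because `xpc_object_t` is a reference type, the struct's accessors stay non-mutating while the lock serializes every touch of the shared object.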

Two details are worth calling out. First, all access to the underlying xpc_object_t is serialized through an NSLock. The object itself is marked nonisolated(unsafe) — a Swift 6 concurrency escape hatch — but actual access is always lock-protected. Second, the error handling is convention-based: errors are JSON-encoded as a ContainerXPCError struct and stored under a well-known key (XPCMessage.swift#L78-L98). The client side calls message.error() on every response to check for server-side failures.
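The error convention can be sketched without any XPC machinery at all. The struct name, key, and payload type below are illustrative stand-ins, not the project's exact definitions:

```swift
import Foundation

// Hypothetical sketch of the convention described above: errors travel as a
// JSON-encoded struct stored under a single well-known dictionary key.
struct ContainerXPCErrorSketch: Codable, Error {
    let code: String
    let message: String
}

let errorKey = "com.example.xpc.error"  // illustrative key, not the real one

// Server side: encode the failure into the reply payload.
func encodeError(_ error: ContainerXPCErrorSketch, into payload: inout [String: Data]) throws {
    payload[errorKey] = try JSONEncoder().encode(error)
}

// Client side: every response is checked for the error key before use.
func checkError(in payload: [String: Data]) throws {
    guard let data = payload[errorKey] else { return }
    throw try JSONDecoder().decode(ContainerXPCErrorSketch.self, from: data)
}
```

The appeal of the convention is that one `try message.error()` call at every client call site is enough to surface any server-side failure as a thrown Swift error.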

The fileHandle accessors deserve special attention. XPC can pass file descriptors between processes — the kernel duplicates the descriptor into the receiving process's file table. This is how stdin/stdout/stderr pipes travel from the CLI through the API server to the container runtime. The implementation at XPCMessage.swift#L218-L235 uses xpc_fd_create and xpc_fd_dup to handle the descriptor lifecycle.
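A simplified sketch of that descriptor-passing pattern (function names here are illustrative; see the real code in `XPCMessage.swift` for the exact lifecycle handling):

```swift
import XPC
import Foundation

// xpc_fd_create duplicates the caller's descriptor into an XPC object;
// xpc_fd_dup duplicates it back out on the receiving side, so each
// process ends up owning its own copy of the descriptor.
func setFileHandle(_ handle: FileHandle, key: String, in dict: xpc_object_t) {
    guard let fdObject = xpc_fd_create(handle.fileDescriptor) else { return }
    xpc_dictionary_set_value(dict, key, fdObject)
}

func fileHandle(key: String, in dict: xpc_object_t) -> FileHandle? {
    guard let fdObject = xpc_dictionary_get_value(dict, key) else { return nil }
    let fd = xpc_fd_dup(fdObject)
    guard fd >= 0 else { return nil }
    // closeOnDealloc ties the duplicated descriptor's lifetime to the handle.
    return FileHandle(fileDescriptor: fd, closeOnDealloc: true)
}
```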

Tip: dataNoCopy(key:) is a performance optimization that returns a Data backed by the XPC object's memory without copying. Use it when you'll consume the data immediately (like JSON decoding), but be aware the data becomes invalid once the XPC message is deallocated.
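The safe usage pattern for a no-copy read looks roughly like this sketch (names are illustrative, not the project's exact API): wrap the dictionary's internal buffer and consume it before the message can go away.

```swift
import XPC
import Foundation

// Illustrative sketch: borrow the dictionary's internal buffer without
// copying, decode immediately, and never let the Data outlive the message.
func decodeNoCopy<T: Decodable>(_ type: T.Type, key: String, from dict: xpc_object_t) throws -> T? {
    var length = 0
    guard let bytes = xpc_dictionary_get_data(dict, key, &length) else { return nil }
    // This Data is backed by memory owned by `dict` (deallocator: .none).
    let data = Data(bytesNoCopy: UnsafeMutableRawPointer(mutating: bytes),
                    count: length,
                    deallocator: .none)
    return try JSONDecoder().decode(type, from: data)
}
```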

XPCServer: Route-Based Dispatch and Security

XPCServer is initialized with a Mach service identifier and a dictionary mapping route strings to handler closures. When listen() is called, it creates a Mach service listener, accepts incoming connections, and dispatches each message to the appropriate handler based on a route key embedded in the message.

flowchart TD
    A[Incoming XPC Connection] --> B{xpc_get_type?}
    B -->|CONNECTION| C[handleClientConnection]
    B -->|ERROR| D[Finish Stream]
    C --> E[Receive Message]
    E --> F{Is Dictionary?}
    F -->|No| G[Reply Error]
    F -->|Yes| H{EUID Match?}
    H -->|No| I[Reply Unauthorized]
    H -->|Yes| J{Route Exists?}
    J -->|No| K[Reply Invalid]
    J -->|Yes| L[Call Handler]
    L --> M[Send Response]
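The dispatch step at the heart of this flow can be sketched without real XPC types. The key name and handler signature below are illustrative; the real server passes `XPCMessage` values to its handlers:

```swift
import Foundation

// Minimal sketch of route-based dispatch: the server is configured with a
// route→handler map and resolves each incoming message by its route key.
typealias Handler = ([String: String]) async throws -> [String: String]

struct RouteDispatcher {
    let routes: [String: Handler]
    let routeKey = "route"  // illustrative well-known key

    func dispatch(_ message: [String: String]) async -> [String: String] {
        guard let route = message[routeKey], let handler = routes[route] else {
            return ["error": "invalid route"]
        }
        do {
            return try await handler(message)
        } catch {
            // Handler failures become error replies, never dropped messages.
            return ["error": "\(error)"]
        }
    }
}
```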

The security check is concise but critical. At XPCServer.swift#L166-L184, every incoming message has its audit token extracted via xpc_dictionary_get_audit_token. The server compares the client's effective UID against its own with geteuid(). If they don't match, the request is rejected immediately. This prevents other users on the same machine from sending commands to your container daemon.
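The essence of the check fits in one expression. The project extracts a per-message audit token; this simplified sketch uses the public connection-level accessor instead, which answers the same question — does the peer's effective UID match ours?

```swift
import XPC
import Darwin

// Simplified peer-identity check (the real code uses the per-message audit
// token rather than the connection-level accessor shown here).
func peerIsSameUser(_ connection: xpc_connection_t) -> Bool {
    xpc_connection_get_euid(connection) == geteuid()
}
```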

The connection handling uses AsyncStream to bridge XPC's callback model into structured concurrency. The outer listen() method wraps the connection event handler in an AsyncStream<xpc_connection_t>, and each connection's messages are themselves wrapped in an AsyncStream<xpc_object_t>. Both streams use withThrowingDiscardingTaskGroup to process items concurrently.
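The bridging idea, stripped of XPC specifics, looks like this sketch (the `EventSource` type is a hypothetical stand-in for an XPC connection's event handler):

```swift
import Foundation

// Sketch of the callback→AsyncStream bridge: a callback-based event source
// is wrapped so consumers can use `for await`.
final class EventSource {
    var onEvent: ((String) -> Void)?
    var onFinish: (() -> Void)?
    func emit(_ e: String) { onEvent?(e) }
    func finish() { onFinish?() }
}

func events(from source: EventSource) -> AsyncStream<String> {
    AsyncStream { continuation in
        // Each callback yields into the stream; a teardown event finishes
        // it, which ends the consumer's for-await loop.
        source.onEvent = { continuation.yield($0) }
        source.onFinish = { continuation.finish() }
    }
}
```

With both connection and message events exposed as streams, the server can process them inside task groups with ordinary structured-concurrency code.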

XPCClient: Bridging Callbacks into Async/Await

On the client side, XPCClient wraps xpc_connection_create_mach_service and provides an async send method. The interesting part is how it handles timeouts.

The send(_:responseTimeout:) method uses withThrowingTaskGroup to race two tasks: the actual XPC send (wrapped in withCheckedThrowingContinuation) and a sleep task that throws after the timeout expires. Whichever finishes first wins, and the other is cancelled:

sequenceDiagram
    participant Caller
    participant TaskGroup
    participant XPC as xpc_connection_send
    participant Timer as Task.sleep

    Caller->>TaskGroup: addTask(XPC send)
    Caller->>TaskGroup: addTask(sleep timeout)

    alt XPC responds first
        XPC-->>TaskGroup: XPCMessage
        TaskGroup->>Timer: cancel
        TaskGroup-->>Caller: response
    else Timeout fires first
        Timer-->>TaskGroup: throw timeout error
        TaskGroup->>XPC: cancel
        TaskGroup-->>Caller: throw error
    end
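The race can be sketched as a generic helper, assuming a simplified shape (the real `send(_:responseTimeout:)` wraps an XPC reply in a continuation rather than the arbitrary async operation used here):

```swift
import Foundation

// Sketch of the timeout race: two tasks in a throwing group; whichever
// finishes first wins, and the group cancels the loser.
struct TimeoutError: Error {}

func withTimeout<T: Sendable>(
    _ timeout: Duration,
    operation: @escaping @Sendable () async throws -> T
) async throws -> T {
    try await withThrowingTaskGroup(of: T.self) { group in
        group.addTask { try await operation() }
        group.addTask {
            try await Task.sleep(for: timeout)
            throw TimeoutError()
        }
        // The first child to complete determines the outcome.
        let result = try await group.next()!
        group.cancelAll()
        return result
    }
}
```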

The default timeout is 60 seconds (XPCClient.xpcRegistrationTimeout), which is intentionally generous. When a container runtime helper is first registered with launchd, macOS may take several seconds to actually launch the process. Once the service is running, XPC requests complete in milliseconds.

Route and Key Enums: The Typed API Contract

The raw XPCMessage works with string keys. The service layers build type-safe contracts on top using enums. XPC+.swift in ContainerAPIClient defines two enums:

XPCKeys — field names for all data that flows through the API server: container configuration, process IDs, file descriptors for stdio, network state, volume data, progress updates, and more.

XPCRoute — all routes the API server handles: containerList, containerCreate, containerBootstrap, networkCreate, pluginLoad, ping, and dozens more.

The file also provides typed extension methods on XPCMessage that accept XPCKeys and XPCRoute values instead of raw strings, turning message.string(key: "id") into message.string(key: .id).
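The pattern is simple to sketch. The enum cases and `Message` type below are illustrative stand-ins (the real enums live in `XPC+.swift` and carry many more cases), but the shape is the same: a thin typed layer forwarding to raw-string accessors.

```swift
import Foundation

// Sketch of the enum-over-strings pattern.
enum Keys: String {
    case id, exitCode, stdin
}

struct Message {
    private var storage: [String: String] = [:]

    // Raw, stringly-typed accessors…
    func string(key: String) -> String? { storage[key] }
    mutating func set(key: String, value: String) { storage[key] = value }

    // …and the typed layer on top, so call sites can't misspell a key.
    func string(key: Keys) -> String? { string(key: key.rawValue) }
    mutating func set(key: Keys, value: String) { set(key: key.rawValue, value: value) }
}
```

The payoff is that a typo like `"exitcode"` becomes a compile error instead of a silent `nil` at runtime.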

The sandbox service has its own parallel set of enums in SandboxRoutes.swift. Each route is namespaced with the com.apple.container.sandbox/ prefix — for example, com.apple.container.sandbox/bootstrap or com.apple.container.sandbox/createProcess.

classDiagram
    class XPCRoute {
        <<enumeration>>
        containerList
        containerCreate
        containerBootstrap
        containerStop
        networkCreate
        pluginLoad
        ping
        ...
    }
    class SandboxRoutes {
        <<enumeration>>
        createEndpoint
        bootstrap
        createProcess
        start
        stop
        wait
        dial
        shutdown
        ...
    }
    class XPCKeys {
        <<enumeration>>
        id
        containerConfig
        stdin
        stdout
        stderr
        exitCode
        ...
    }
    XPCRoute ..> XPCMessage : used with
    SandboxRoutes ..> XPCMessage : used with
    XPCKeys ..> XPCMessage : used with

Tip: If you're adding a new operation to apple/container, the first step is adding a route to the appropriate enum and keys for any new data fields. This establishes the contract before you write any business logic.

The Service/Harness Pattern

Every server-side service in the codebase follows a consistent two-struct pattern:

  1. A Service (usually an actor) holds business logic and mutable state.
  2. A Harness (usually a struct) handles XPC message deserialization, calls the service, and serializes the response.

The harness methods are registered as route handlers during API server startup. Look at APIServer+Start.swift#L264-L292: ContainersService is the actor, ContainersHarness is the struct, and each route like XPCRoute.containerCreate maps to harness.create.

This separation is clean: the service layer never touches XPCMessage directly, making it testable without XPC. The harness is thin glue code — decode the request, call the service, encode the response. Every service in the API server follows this pattern: PluginsService/PluginsHarness, NetworksService/NetworksHarness, VolumesService/VolumesHarness, HealthCheckService/HealthCheckHarness.
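A condensed sketch of the pattern, with illustrative stand-in types for the real request/response messages:

```swift
import Foundation

// The actor owns state and logic and never sees a transport message;
// the harness decodes, delegates, and encodes. Types are illustrative.
struct Request { let payload: Data }
struct Response { let payload: Data }

actor ContainersServiceSketch {
    private var containers: [String] = []

    func create(id: String) -> [String] {
        containers.append(id)
        return containers
    }
}

struct ContainersHarnessSketch {
    let service: ContainersServiceSketch

    // Registered as the handler for a "create" route: thin glue only.
    func create(_ request: Request) async throws -> Response {
        let id = try JSONDecoder().decode(String.self, from: request.payload)
        let list = await service.create(id: id)
        return Response(payload: try JSONEncoder().encode(list))
    }
}
```

Because the service never imports the transport layer, it can be unit-tested by calling its methods directly, with no XPC machinery in sight.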

The Two-Server Security Pattern in container-runtime-linux

The runtime helper uses the most interesting XPC pattern in the entire codebase. Instead of exposing all its operations on a single Mach service, it runs two XPC servers.

Looking at RuntimeLinuxHelper+Start.swift#L74-L122:

sequenceDiagram
    participant Client as API Server / CLI
    participant EP as Endpoint Server<br/>(public Mach service)
    participant Main as Main Server<br/>(anonymous connection)

    Note over EP: Only exposes createEndpoint route
    Client->>EP: createEndpoint
    EP->>EP: xpc_endpoint_create(anonymousConnection)
    EP-->>Client: XPC endpoint token
    Client->>Client: xpc_connection_create_from_endpoint
    Client->>Main: bootstrap, createProcess, wait...
    Main-->>Client: responses

The endpoint server is registered with launchd under the public Mach service name (e.g., com.apple.container.runtime.container-runtime-linux.{uuid}). It exposes exactly one route: createEndpoint. This route creates an XPC endpoint from an anonymous connection and returns it.

The main server listens on that anonymous connection. It exposes all the real operations: bootstrap, createProcess, start, stop, kill, resize, wait, dial, shutdown, and statistics.

Why this split? The public Mach service name is discoverable by any process on the system. By limiting the public surface to a single createEndpoint operation, the runtime minimizes what an attacker could do even if they could connect to the service. The actual sandbox operations are only accessible through the anonymous endpoint — which requires having already successfully called createEndpoint and passed the EUID check.

The client side of this handshake lives in SandboxClient.swift#L50-L75. The static create method connects to the public Mach service, calls createEndpoint, extracts the endpoint from the response, creates a new connection from it, and returns a SandboxClient backed by that direct connection.
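A hedged sketch of that handshake, assuming simplified route and key names (the real ones are namespaced, and the real code adds error checking at every step):

```swift
import XPC
import Foundation

// Illustrative client-side handshake: connect to the public service, ask for
// an endpoint, then open a direct connection from that endpoint.
func connectToRuntime(machService: String, queue: DispatchQueue) -> xpc_connection_t? {
    let publicConn = xpc_connection_create_mach_service(machService, queue, 0)
    xpc_connection_set_event_handler(publicConn) { _ in }
    xpc_connection_activate(publicConn)

    // "route"/"createEndpoint"/"endpoint" are illustrative names.
    let request = xpc_dictionary_create(nil, nil, 0)
    xpc_dictionary_set_string(request, "route", "createEndpoint")
    let reply = xpc_connection_send_message_with_reply_sync(publicConn, request)

    guard let endpoint = xpc_dictionary_get_value(reply, "endpoint"),
          xpc_get_type(endpoint) == XPC_TYPE_ENDPOINT else { return nil }
    // All subsequent operations travel over this direct, anonymous connection.
    let direct = xpc_connection_create_from_endpoint(endpoint as! xpc_endpoint_t)
    xpc_connection_set_event_handler(direct) { _ in }
    xpc_connection_activate(direct)
    return direct
}
```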

What's Next

With the communication layer understood, we can now trace a complete operation end-to-end. In the next article, we'll follow container run from the moment you press Enter — through CLI parsing, XPC calls to the API server, plugin registration with launchd, the endpoint handshake we just described, VM creation, Linux boot, stdio pipe passing, and finally process exit.