# Container Lifecycle: From `container run` to Exit

## Prerequisites

- Article 1: Architecture and Navigation Guide
- Article 2: The XPC Communication Layer
In the previous articles we mapped the architecture and dissected the XPC communication layer. Now it's time to watch everything work together. This article traces the complete path of a `container run` command — from the moment you type it to the moment the container process exits and you get your shell prompt back.
This is where the four-layer architecture, the Service/Harness pattern, the two-server endpoint handshake, and the file-descriptor passing all converge into a single, coordinated flow.
## CLI Parsing and `ContainerConfiguration`

Everything begins in `ContainerRun.swift`. The command uses Swift Argument Parser's `@OptionGroup` pattern to organize flags into logical groups: process options (tty, interactive), resource options (CPUs, memory, storage), management options (name, detach, auto-remove), and registry options.
```mermaid
flowchart TD
    A["container run --name web -p 8080:80 nginx"] --> B[Parse Flags]
    B --> C[Generate Container ID]
    C --> D[Check for Existing Container]
    D --> E["Utility.containerConfigFromFlags()"]
    E --> F[ContainerConfiguration]
    F --> G[ContainerClient.create]
    G --> H[ContainerClient.bootstrap]
```
The flag groups are combined by `Utility.containerConfigFromFlags()` into a `ContainerConfiguration` — the central data type that describes everything about a container. This struct is `Codable` and travels across process boundaries as JSON embedded in XPC messages.

The configuration captures the complete container spec:
| Field | Purpose |
|---|---|
| `id` | Unique container identifier |
| `image` | OCI image reference |
| `mounts` | Host-to-container filesystem mounts |
| `publishedPorts` | Port mappings (host:container) |
| `networks` | Network attachment configurations |
| `resources` | CPU count, memory (default 1 GiB), storage quota |
| `rosetta` | Enable x86-64 translation |
| `ssh` | Forward SSH agent socket |
| `readOnly` | Mount rootfs read-only |
| `runtimeHandler` | Which runtime plugin to use (default: `container-runtime-linux`) |
| `initProcess` | The process to run inside the container |
Tip: The `runtimeHandler` field defaults to `"container-runtime-linux"` but is configurable — this is how the plugin system allows alternative runtimes.
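To make the shape concrete, here is a trimmed-down, hypothetical sketch of such a `Codable` configuration. Field names follow the table above, but the types are simplified and most fields are omitted:

```swift
import Foundation

// A trimmed-down sketch in the spirit of ContainerConfiguration.
// Field names follow the table above; the real struct carries many
// more fields and nested types.
struct ContainerConfigurationSketch: Codable {
    var id: String
    var image: String
    var publishedPorts: [String]   // simplified here, e.g. "8080:80"
    var readOnly: Bool
    var runtimeHandler: String

    init(id: String, image: String,
         publishedPorts: [String] = [],
         readOnly: Bool = false,
         runtimeHandler: String = "container-runtime-linux") {
        self.id = id
        self.image = image
        self.publishedPorts = publishedPorts
        self.readOnly = readOnly
        self.runtimeHandler = runtimeHandler
    }
}

// Because the type is Codable, it can travel across process boundaries
// as JSON embedded in an XPC message, as described above.
let config = ContainerConfigurationSketch(
    id: "web", image: "nginx", publishedPorts: ["8080:80"])
let json = try! JSONEncoder().encode(config)
let decoded = try! JSONDecoder().decode(
    ContainerConfigurationSketch.self, from: json)
```

The JSON round-trip is the key property: the same bytes the CLI encodes are what the API server decodes on the other side of the XPC boundary.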
## `ContainerClient`: Creating and Bootstrapping via XPC

With the configuration built, the CLI uses `ContainerClient` to make two XPC calls: `create()` and `bootstrap()`.

The `create()` call at lines 48-76 JSON-encodes the `ContainerConfiguration`, the kernel information, and creation options, stuffs them into an `XPCMessage` with route `.containerCreate`, and sends it to the API server.

The `bootstrap()` call at lines 116-146 is more interesting. It packs the stdio file handles (stdin, stdout, stderr pipes) directly into the XPC message. These file descriptors will travel from the CLI process, through the API server, and into the container runtime — crossing two process boundaries via XPC's kernel-level fd passing.
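A Foundation-only sketch of the pipe setup whose descriptors get handed across that boundary (the XPC packing itself uses Darwin-only APIs and is omitted here; the type and property names are illustrative):

```swift
import Foundation

// Foundation-only sketch: the three stdio pipes whose file descriptors
// would be packed into the bootstrap XPC message. The XPC fd-passing
// itself (Darwin-only) is omitted.
struct StdioPipes {
    let stdinPipe = Pipe()   // CLI writes, container reads
    let stdoutPipe = Pipe()  // container writes, CLI reads
    let stderrPipe = Pipe()  // container writes, CLI reads

    // The runtime side would receive these three descriptors via XPC.
    var descriptorsForRuntime: [Int32] {
        [stdinPipe.fileHandleForReading.fileDescriptor,
         stdoutPipe.fileHandleForWriting.fileDescriptor,
         stderrPipe.fileHandleForWriting.fileDescriptor]
    }
}

let pipes = StdioPipes()
// Simulate the container writing to stdout, and the CLI reading it:
pipes.stdoutPipe.fileHandleForWriting.write(Data("hello\n".utf8))
pipes.stdoutPipe.fileHandleForWriting.closeFile()
let out = pipes.stdoutPipe.fileHandleForReading.readDataToEndOfFile()
```

Note the asymmetry: for stdin the runtime gets the read end, while for stdout and stderr it gets the write ends.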
```mermaid
sequenceDiagram
    participant CLI as container CLI
    participant API as container-apiserver
    participant LD as launchd
    participant RT as container-runtime-linux
    CLI->>API: containerCreate(config, kernel)
    API->>API: Persist ContainerSnapshot
    API->>API: Find runtime plugin
    API->>LD: bootstrap plist for runtime
    LD->>RT: Launch process
    API-->>CLI: OK
    CLI->>API: containerBootstrap(id, stdio fds)
    API->>RT: createEndpoint (public Mach service)
    RT-->>API: XPC endpoint
    API->>RT: bootstrap(stdio fds, attachments)
    RT->>RT: Boot Linux VM
    RT-->>API: OK
    API-->>CLI: OK + ClientProcess handle
```
## `ContainersService`: Plugin Registration and Sandbox Setup

On the server side, `ContainersService` is the actor that manages all container state. When it receives a create request, it:

- Deserializes the `ContainerConfiguration` from the XPC message
- Creates a `ContainerSnapshot` — the persistent state record
- Persists it to disk via `FilesystemEntityStore`
- Finds the appropriate runtime plugin using `pluginLoader`
- Registers the runtime plugin with launchd via `pluginLoader.registerWithLaunchd()`
The persistence layer uses `FilesystemEntityStore` — an actor that writes JSON files to disk and maintains an in-memory index. Each container gets its own directory under `<appRoot>/containers/<id>/`, containing an `entity.json` file with the serialized snapshot.
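A minimal synchronous sketch of that layout — one directory per entity holding an `entity.json`, plus an in-memory index. The real store is an actor, and all names here are illustrative:

```swift
import Foundation

// Synchronous sketch of the FilesystemEntityStore idea:
// <root>/<id>/entity.json on disk, plus an in-memory index.
final class EntityStoreSketch<T: Codable> {
    private let root: URL
    private var index: [String: T] = [:]  // in-memory index

    init(root: URL) throws {
        self.root = root
        try FileManager.default.createDirectory(
            at: root, withIntermediateDirectories: true)
    }

    func put(id: String, _ entity: T) throws {
        let dir = root.appendingPathComponent(id)
        try FileManager.default.createDirectory(
            at: dir, withIntermediateDirectories: true)
        let data = try JSONEncoder().encode(entity)
        try data.write(to: dir.appendingPathComponent("entity.json"))
        index[id] = entity
    }

    func get(id: String) -> T? { index[id] }
}

// Hypothetical snapshot type, standing in for ContainerSnapshot.
struct Snapshot: Codable, Equatable { var id: String; var status: String }

let root = FileManager.default.temporaryDirectory
    .appendingPathComponent("containers-demo")
let store = try! EntityStoreSketch<Snapshot>(root: root)
try! store.put(id: "web", Snapshot(id: "web", status: "created"))
```

Reads are served from the index; the JSON on disk exists so state survives an API-server restart.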
The `ContainersService` maintains an in-memory dictionary of `ContainerState` structs, each holding the snapshot, the `SandboxClient` (once connected), and allocated network attachments.

When `bootstrap` is called, the `ContainersService` performs the endpoint handshake described in Article 2 — connecting to the runtime's public Mach service, obtaining an anonymous endpoint, and establishing a direct connection.
## The `SandboxClient` Endpoint Handshake

The `SandboxClient.create()` static method implements the two-server handshake in practice:

- Construct the Mach service label: `com.apple.container.runtime.container-runtime-linux.{uuid}`
- Create an `XPCClient` connected to that service
- Send a `createEndpoint` request
- Extract the `xpc_endpoint_t` from the response
- Call `xpc_connection_create_from_endpoint` to get a direct connection
- Return a `SandboxClient` backed by the direct connection
From this point on, all communication with the runtime bypasses the public Mach service entirely. The `bootstrap`, `createProcess`, `start`, `wait`, and other operations all flow through the anonymous connection.
## `SandboxService`: VM Creation and Linux Boot

Inside the runtime helper, `SandboxService` is the actor that manages the VM lifecycle. Its `bootstrap` method at lines 126-179 is where the Linux VM actually starts.

The bootstrap sequence:

- Create a container bundle on disk if it doesn't exist
- Load the container configuration and kernel from the bundle
- Configure kernel arguments (including security modules: `lsm=lockdown,capability,landlock,yama,apparmor`)
- Create a `VZVirtualMachineManager` from the `containerization` library
- Extract allocated network attachments from the XPC message
- Dynamically configure DNS nameservers if not explicitly set
- Select the network interface strategy based on macOS version
- Boot the VM
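The kernel-argument step can be sketched as simple string assembly. Everything here except the `lsm=` list is a hypothetical placeholder, not the project's actual argument set:

```swift
import Foundation

// Illustrative: building a kernel command line that includes the LSM
// ordering mentioned above. The baseline arguments are hypothetical
// placeholders, not the project's real defaults.
func kernelCommandLine(extra: [String]) -> String {
    var args = [
        "console=hvc0",  // hypothetical baseline argument
        "lsm=lockdown,capability,landlock,yama,apparmor",
    ]
    args.append(contentsOf: extra)
    return args.joined(separator: " ")
}

let line = kernelCommandLine(extra: ["root=/dev/vda"])
```

The `lsm=` ordering matters to Linux: security modules initialize in the listed order, which is why it is passed explicitly rather than left to the kernel's build-time default.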
```mermaid
flowchart TD
    A[bootstrap message] --> B[Load config from bundle]
    B --> C[Configure kernel args]
    C --> D[Create VZVirtualMachineManager]
    D --> E[Extract network attachments]
    E --> F{macOS version?}
    F -->|"macOS 26+"| G[NonisolatedInterfaceStrategy]
    F -->|"macOS 15"| H[IsolatedInterfaceStrategy]
    G --> I[Attach interfaces to VM]
    H --> I
    I --> J[Boot Linux VM]
    J --> K[Start guest agent]
```
The network interface strategy selection at `RuntimeLinuxHelper+Start.swift#L67-L72` uses `#available(macOS 26, *)` guards to choose between `NonisolatedInterfaceStrategy` (macOS 26+, which supports full container-to-container networking) and `IsolatedInterfaceStrategy` (macOS 15, where containers are network-isolated from each other).
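The version gate itself can be sketched in portable Swift. The strategy types here are empty stand-ins named after the real ones; their actual contents are not shown in this article:

```swift
import Foundation

// Illustrative protocol-plus-selection sketch of the interface strategy
// choice. Type names mirror the article; their bodies are stand-ins.
protocol InterfaceStrategy { var name: String { get } }

struct NonisolatedInterfaceStrategy: InterfaceStrategy {
    let name = "nonisolated"  // macOS 26+: container-to-container networking
}

struct IsolatedInterfaceStrategy: InterfaceStrategy {
    let name = "isolated"     // macOS 15: containers isolated from each other
}

func selectStrategy() -> InterfaceStrategy {
    // The `*` clause covers platforms not listed, so this compiles
    // (and evaluates true) outside macOS as well.
    if #available(macOS 26, *) {
        return NonisolatedInterfaceStrategy()
    } else {
        return IsolatedInterfaceStrategy()
    }
}
```

Putting the choice behind a protocol keeps the VM-attachment code identical for both paths; only the strategy value differs.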
## `ProcessIO`: Stdio, Signals, and Terminal Resize

Back on the CLI side, `ProcessIO` manages the connection between the user's terminal and the container's stdio streams. Its `create` method at lines 46-84 sets up the I/O pipeline differently based on the mode:

**Interactive TTY mode** (`--tty --interactive`): The terminal is put into raw mode via `Terminal.setraw()`. Stdin is read using non-blocking I/O with a `readabilityHandler` callback. Stdout from the container is piped directly to the user's stdout. Stderr is merged into stdout (as is standard for TTY mode).

**Non-TTY mode**: Stdout and stderr get separate pipes with independent `readabilityHandler` callbacks. An `IoTracker` coordinates stream completion — it uses an `AsyncStream<Void>` to signal when each output stream has finished (received empty data indicating EOF).

**Detached mode** (`--detach`): No output pipes are created. The container ID is printed and the CLI exits immediately.
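The non-TTY completion tracking can be sketched with an `AsyncStream<Void>`: each output pipe yields once on EOF, and a waiter finishes after seeing the expected number of signals. This is a hedged illustration; the CLI's real `IoTracker` differs in detail:

```swift
import Foundation

// Sketch of the IoTracker idea: each output pipe signals once on EOF,
// and waitForAll() returns after the expected number of signals.
// AsyncStream buffers yields, so signaling before awaiting is safe.
final class IoTrackerSketch: @unchecked Sendable {
    private let continuation: AsyncStream<Void>.Continuation
    private let stream: AsyncStream<Void>
    private let expected: Int

    init(expected: Int) {
        self.expected = expected
        var cont: AsyncStream<Void>.Continuation!
        self.stream = AsyncStream { cont = $0 }
        self.continuation = cont
    }

    // Called from a readabilityHandler when it sees empty data (EOF).
    func signalEOF() { continuation.yield() }

    func waitForAll() async {
        var seen = 0
        for await _ in stream {
            seen += 1
            if seen == expected { break }
        }
    }
}
```

A caller creates one tracker per process with `expected: 2` (stdout and stderr), has each `readabilityHandler` call `signalEOF()` on empty reads, and awaits `waitForAll()` before reporting the exit code.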
```mermaid
flowchart LR
    subgraph CLI Process
        STDIN["Host stdin"]
        STDOUT["Host stdout"]
        STDERR["Host stderr"]
    end
    subgraph "Pipes (via XPC)"
        P1["stdin pipe"]
        P2["stdout pipe"]
        P3["stderr pipe"]
    end
    subgraph Runtime Process
        VM["Container VM"]
    end
    STDIN -->|readabilityHandler| P1
    P1 -->|fd passed via XPC| VM
    VM -->|fd passed via XPC| P2
    P2 -->|readabilityHandler| STDOUT
    VM -->|fd passed via XPC| P3
    P3 -->|readabilityHandler| STDERR
```
Signal handling is centralized. `ProcessIO` registers handlers for SIGTERM, SIGINT, SIGUSR1, SIGUSR2, and SIGWINCH. For TTY sessions, SIGWINCH (terminal resize) triggers a resize command sent to the container via XPC. For non-TTY sessions, a `SignalThreshold` counter allows the user to force-exit after three consecutive SIGINT/SIGTERM signals.
Tip: The non-blocking stdin trick using `OSFile.makeNonBlocking()` is essential. Without it, a blocking read on stdin would prevent the process from responding to signals or detecting when the container has exited.
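The force-exit counter mentioned above might look roughly like this; the shape is illustrative, not the project's actual type:

```swift
import Foundation

// Sketch of the SignalThreshold idea: count consecutive SIGINT/SIGTERM
// deliveries and force-exit once the third arrives.
struct SignalThresholdSketch {
    let limit: Int
    private(set) var count = 0

    // Returns true once `limit` consecutive signals have been seen.
    mutating func record() -> Bool {
        count += 1
        return count >= limit
    }
}

var threshold = SignalThresholdSketch(limit: 3)
_ = threshold.record()               // first SIGINT: keep waiting
_ = threshold.record()               // second: still waiting
let forceExit = threshold.record()   // third: force exit
```

The point of the counter is to keep a single Ctrl-C graceful (forwarded to the container) while still giving the user an escape hatch when the container ignores it.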
## Container States and Exit

Containers move through a simple state machine:
```mermaid
stateDiagram-v2
    [*] --> created: create()
    created --> running: bootstrap()
    running --> stopped: exit / stop / kill
    stopped --> [*]: delete()
    created --> [*]: delete()
```
The `ContainersService` tracks these states in the `ContainerSnapshot` persisted to disk. When the CLI calls bootstrap, the service updates the snapshot to `running`. When the container's init process exits, the runtime notifies via the exit monitor, and the state moves to `stopped`.
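The transitions in the diagram can be captured in a small enum; the type and case names here are illustrative, not the project's actual declarations:

```swift
// Sketch of the container state machine from the diagram above,
// with legal transitions made explicit (names illustrative).
enum ContainerStatus: String {
    case created, running, stopped

    // Which states may follow this one, per the diagram.
    var next: Set<ContainerStatus> {
        switch self {
        case .created: return [.running]   // bootstrap()
        case .running: return [.stopped]   // exit / stop / kill
        case .stopped: return []           // only delete() remains
        }
    }

    func canTransition(to target: ContainerStatus) -> Bool {
        next.contains(target)
    }
}
```

Encoding the transition table this way lets the service reject nonsense requests (say, bootstrapping a stopped container) with a clear error instead of undefined behavior.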
The exit flow is instructive. After bootstrap returns, the CLI's `ContainerRun.run()` method at line 165 calls `io.handleProcess(process:log:)`, which starts the container's init process and waits for it to exit. The wait is implemented via the `containerWait` XPC route, which blocks until the runtime reports an exit code.

The exit code from the container process is propagated all the way back to the CLI, which throws it as an `ArgumentParser.ExitCode` at line 173. If the container exits with code 0, the CLI exits with 0. If it exits with 1, the CLI exits with 1. Clean and transparent.

If the `--remove` flag was set, the container is automatically deleted after exit. If an error occurs during the run, the CLI attempts to clean up by calling `client.delete(id:)` in the catch block at line 167.
## What's Next
We've now traced the full lifecycle of a container, but we glossed over a critical subsystem: networking. How does the container get an IP address? How can containers resolve each other by hostname? Why does the project include its own DNS server? The next article dives into the networking stack — from virtual network creation to IP allocation to a surprisingly specific musl libc compatibility workaround in the DNS handler.