Read OSS

Networking and DNS: Virtual Networks, IP Allocation, and Name Resolution

Advanced

Prerequisites

  • Article 1: Architecture and Navigation Guide
  • Article 2: The XPC Communication Layer
  • Article 3: Container Lifecycle

When a container boots in apple/container, it needs an IP address, a gateway, and a way to resolve hostnames — both for the outside world and for other containers on the same machine. Unlike traditional container runtimes where networking happens through Linux kernel namespaces inside a shared VM, here every container is its own VM. That means virtual networking must be configured at the macOS host level, using Apple's vmnet.framework.

This article covers the complete networking stack: how networks are created and managed, how IP addresses are allocated, and how two custom DNS servers built on SwiftNIO handle hostname resolution with a particularly interesting musl libc compatibility workaround.

Network Lifecycle: Creation, Attachment, and Teardown

Networks in apple/container are managed resources, just like containers and volumes. The API server's NetworksService coordinates network lifecycle, delegating to per-network NetworkService instances that run inside the container-network-vmnet helper.

When the API server starts, it checks for a default network and creates one if it doesn't exist. This happens in APIServer+Start.swift#L294-L331:

sequenceDiagram
    participant API as container-apiserver
    participant Net as container-network-vmnet
    participant vmnet as vmnet.framework

    Note over API: Startup
    API->>API: Check for default network
    API->>Net: Create network (NAT mode)
    Net->>vmnet: Create vmnet network
    vmnet-->>Net: Subnet info (gateway, CIDR)
    Net-->>API: NetworkState (running)

    Note over API: Container attaches
    API->>Net: allocate(hostname)
    Net->>Net: AttachmentAllocator.allocate()
    Net-->>API: Attachment (IP, MAC, gateway)

The flow for attaching a container to a network involves three steps: the API server asks the network service to allocate an IP address, receives back an Attachment with the assigned IP/MAC/gateway, and passes this information to the runtime helper during bootstrap. As we saw in Article 3, the SandboxService.bootstrap() method receives these allocated attachments via XPC and uses them to configure the VM's network interfaces.
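To make the shape of that exchange concrete, here is a minimal sketch of an Attachment-style record and the allocate step. The field names and the stubbed allocation are illustrative assumptions, not the project's exact API — the real values come back over XPC from the network helper.

```swift
// Hedged sketch: an Attachment-like record as described in the text.
// Field names and values here are illustrative, not the project's exact API.
struct Attachment {
    let hostname: String
    let ipv4Address: String   // assigned address in CIDR form
    let gateway: String       // the network's gateway address
    let macAddress: String    // generated or caller-provided MAC
}

// Steps 1-2: ask the network service for an allocation (stubbed here).
func allocateStub(hostname: String) -> Attachment {
    Attachment(
        hostname: hostname,
        ipv4Address: "192.168.64.5/24",
        gateway: "192.168.64.1",
        macAddress: "06:00:ab:cd:ef:01"
    )
}

// Step 3: the attachment is handed to the runtime helper during bootstrap.
let attachment = allocateStub(hostname: "web")
```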

The Network Protocol and macOS Version Branching

The Network protocol is minimal — just three requirements:

public protocol Network: Sendable {
    var state: NetworkState { get async }
    nonisolated func withAdditionalData(_ handler: (XPCMessage?) throws -> Void) throws
    func start() async throws
}

This protocol has two implementations, selected at runtime based on the macOS version:

classDiagram
    class Network {
        <<protocol>>
        +state: NetworkState
        +withAdditionalData(handler)
        +start()
    }
    class ReservedVmnetNetwork {
        +@available(macOS 26, *)
        -stateMutex: Mutex~State~
        -network: vmnet_network_ref?
        +start()
    }
    class AllocationOnlyVmnetNetwork {
        +actor
        -_state: NetworkState
        +start()
    }
    Network <|.. ReservedVmnetNetwork
    Network <|.. AllocationOnlyVmnetNetwork

ReservedVmnetNetwork is available on macOS 26+ and uses vmnet's reservation APIs. It creates a vmnet_network_ref that provides full network interface isolation — containers on the same network can communicate with each other, and each container gets a reserved interface. The class uses Mutex<State> for thread-safe state management (it's a final class, not an actor, because the vmnet callbacks come on arbitrary dispatch queues).

AllocationOnlyVmnetNetwork is the macOS 15 fallback. It's an actor that handles IP allocation but doesn't create a vmnet network interface itself — on macOS 15, vmnet only supports isolated networks, so containers can't communicate with each other. It also can't assign custom subnets; only NAT mode is supported.

Tip: The @available(macOS 26, *) guard on ReservedVmnetNetwork is the key conditional. If you're debugging networking issues on macOS 15, you're dealing with AllocationOnlyVmnetNetwork and its limitations (no container-to-container communication, no custom networks, potential subnet mismatches documented in the technical overview).
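The selection between the two implementations can be sketched as a factory that branches on availability. The stub types and the makeNetwork function below are illustrative assumptions; only the #available check mirrors the actual conditional.

```swift
// Hedged sketch of the runtime branch described above: pick the
// reservation-based implementation on macOS 26+, otherwise fall back
// to allocation-only. Type and function names are illustrative stubs.
protocol Network {
    var name: String { get }
}

struct ReservedVmnetNetworkStub: Network {
    let name = "reserved"        // stands in for the vmnet reservation path
}

struct AllocationOnlyVmnetNetworkStub: Network {
    let name = "allocation-only" // stands in for the macOS 15 fallback
}

func makeNetwork() -> Network {
    if #available(macOS 26, *) {
        return ReservedVmnetNetworkStub()
    } else {
        return AllocationOnlyVmnetNetworkStub()
    }
}

let network = makeNetwork()
```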

IP and MAC Address Allocation with AttachmentAllocator

The AttachmentAllocator is an actor that manages IP address assignment within a network's subnet. It's initialized with the lower bound of the allocatable range and the number of available addresses (derived from the subnet's CIDR).

The allocator maps hostnames to addresses using a simple dictionary (hostnames: [String: UInt32]). The underlying address allocator uses a rotating strategy — it cycles through available addresses rather than always reusing the lowest available one, which avoids ARP cache issues that can occur when addresses are rapidly reused.

flowchart TD
    A["allocate(hostname: 'web')"] --> B{Hostname exists?}
    B -->|Yes| C[Return existing IP]
    B -->|No| D["UInt32.rotatingAllocator.allocate()"]
    D --> E["Map: 'web' → index"]
    E --> F["Return IPv4Address(index)"]

    G["deallocate(hostname: 'web')"] --> H["Remove from map"]
    H --> I["allocator.release(index)"]

The NetworkService.allocate method ties it all together. When a container attaches, it allocates an IP index, generates or accepts a MAC address, constructs the full Attachment record (with IPv4 CIDR, gateway, optional IPv6, and MAC), and returns it via XPC. MAC addresses are either provided by the caller or randomly generated with the locally-administered bit set.
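The locally-administered MAC generation can be sketched in a few lines. This is an assumed implementation of the behavior described, not the project's code: the first octet gets the locally-administered bit (0x02) set and the multicast bit (0x01) cleared.

```swift
import Foundation

// Hedged sketch of random MAC generation as described above: random
// bytes with the locally-administered bit set and the multicast bit
// cleared in the first octet. Illustrative only.
func randomLocalMAC() -> String {
    var bytes = (0..<6).map { _ in UInt8.random(in: 0...255) }
    bytes[0] = (bytes[0] & 0xFE) | 0x02  // clear multicast bit, set local bit
    return bytes.map { String(format: "%02x", Int($0)) }.joined(separator: ":")
}
```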

An important idempotency detail: if a hostname is already allocated, the allocator returns the existing IP rather than allocating a new one. This prevents address leaks if the same container is bootstrapped multiple times.
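Both behaviors — the rotating cursor and hostname idempotency — can be modeled in a small value type. This is a simplified sketch of the described behavior, not the project's AttachmentAllocator; the real one is an actor backed by a separate rotating index allocator.

```swift
// Hedged sketch of the allocation behavior described above: a rotating
// index allocator plus a hostname map that makes allocate() idempotent.
struct AttachmentAllocatorSketch {
    private let lowerBound: UInt32
    private let size: UInt32
    private var next: UInt32 = 0            // rotating cursor
    private var inUse: Set<UInt32> = []
    private var hostnames: [String: UInt32] = [:]

    init(lowerBound: UInt32, size: UInt32) {
        self.lowerBound = lowerBound
        self.size = size
    }

    mutating func allocate(hostname: String) -> UInt32? {
        // Idempotency: an existing hostname gets its existing address back.
        if let existing = hostnames[hostname] { return existing }
        // Rotating strategy: keep advancing the cursor instead of always
        // reusing the lowest free index, which avoids stale ARP entries.
        for _ in 0..<size {
            let candidate = next
            next = (next + 1) % size
            if !inUse.contains(candidate) {
                inUse.insert(candidate)
                let address = lowerBound + candidate
                hostnames[hostname] = address
                return address
            }
        }
        return nil  // subnet exhausted
    }

    mutating func deallocate(hostname: String) {
        guard let address = hostnames.removeValue(forKey: hostname) else { return }
        inUse.remove(address - lowerBound)
    }
}
```

Note how a freed address is not immediately handed back out: after deallocating "web", the next allocation continues from the cursor position rather than returning to the just-released slot.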

The Custom DNS Server: SwiftNIO UDP

apple/container runs two DNS servers, both built on the same DNSServer infrastructure. The DNS server is a surprisingly compact SwiftNIO application — it uses DatagramBootstrap to bind a UDP socket, wraps the channel in a NIOAsyncChannel, and processes packets in a for try await loop.

The two server instances are started concurrently in APIServer+Start.swift#L106-L150:

| Server | Port | Purpose |
| --- | --- | --- |
| Container DNS | 2053 | Resolves container hostnames to IP addresses |
| Localhost DNS | 1053 | Resolves .localhost domain aliases |

Both servers use the same architecture: a handler chain built with the CompositeResolver pattern. The composite resolver iterates through a list of DNSHandler implementations, returning the first non-nil answer:

flowchart LR
    Q[DNS Query] --> V[StandardQueryValidator]
    V --> CR[CompositeResolver]
    CR --> H1[ContainerDNSHandler]
    H1 -->|nil| H2[NxDomainResolver]
    H1 -->|answer| R[Response]
    H2 --> R2[NXDOMAIN]

The StandardQueryValidator filters out non-standard queries. The CompositeResolver tries each handler in order. If the ContainerDNSHandler can resolve the hostname, it returns the answer. Otherwise, NxDomainResolver returns NXDOMAIN as the fallback.
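The first-non-nil composition is simple enough to sketch in full. The protocol and type names below mirror the article's handler chain, but the bodies are illustrative stubs — the real handlers work on DNS Message values and do async lookups.

```swift
// Hedged sketch of the CompositeResolver pattern: try each handler in
// order and return the first non-nil answer.
struct Query { let hostname: String }

protocol DNSHandler {
    func answer(_ query: Query) -> String?
}

struct ContainerDNSHandlerStub: DNSHandler {
    let table: [String: String]  // hostname -> IP, stands in for the network lookup
    func answer(_ query: Query) -> String? { table[query.hostname] }
}

struct NxDomainResolverStub: DNSHandler {
    func answer(_ query: Query) -> String? { "NXDOMAIN" }  // terminal fallback
}

struct CompositeResolver: DNSHandler {
    let handlers: [DNSHandler]
    func answer(_ query: Query) -> String? {
        for handler in handlers {
            if let result = handler.answer(query) { return result }
        }
        return nil
    }
}
```

Because the NXDOMAIN resolver always answers, placing it last makes it the guaranteed fallback for any hostname the container handler doesn't know.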

Tip: The DNS servers bind to 127.0.0.1 (localhost only), so they're not accessible from the network. Containers are configured to use these servers via their /etc/resolv.conf, which is set up during VM bootstrap based on the DNSConfiguration in the container config.

Container Hostname Resolution and the musl libc Workaround

The ContainerDNSHandler is where container-to-container name resolution happens. When container "web" wants to reach container "db" by hostname, the DNS query flows to this handler, which calls networkService.lookup(hostname:) to find the IP allocation.

The handler supports both A records (IPv4) and AAAA records (IPv6). The IPv4 path is straightforward — look up the hostname, return the IPv4 address. The IPv6 path is where it gets interesting.

Look at lines 39-53:

case ResourceRecordType.host6:
    let result = try await answerHost6(question: question)
    if result.record == nil && result.hostnameExists {
        // Return NODATA (noError with empty answers) when hostname exists but has no IPv6.
        // This is required because musl libc has issues when A record exists but AAAA returns NXDOMAIN.
        // musl treats NXDOMAIN on AAAA as "domain doesn't exist" and fails DNS resolution entirely.
        // NODATA correctly indicates "no IPv6 address available, but domain exists".
        return Message(
            id: query.id,
            type: .response,
            returnCode: .noError,
            questions: query.questions,
            answers: []
        )
    }

Here's the problem: when a container has only an IPv4 address (no IPv6), a standard DNS server would return NXDOMAIN for AAAA queries. Most DNS clients handle this fine — they got an A record, so they use that. But musl libc (used in Alpine Linux and many minimal container images) treats an NXDOMAIN on the AAAA query as "this domain doesn't exist at all" and fails the entire resolution, even if the A record query succeeded.

The fix is to return NODATA instead — a response with returnCode: .noError but an empty answers array. This tells the client "the domain exists, but there's no IPv6 address available," which musl handles correctly.
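The decision table boils down to three outcomes. Here is a reduced sketch of that logic — the enum and function are illustrative, whereas the real handler constructs a full DNS Message as shown in the excerpt above.

```swift
// Hedged sketch of the AAAA-response decision described above.
enum AAAAResponse: Equatable {
    case record(String)  // an IPv6 address to return
    case noData          // noError + empty answers (the musl workaround)
    case nxDomain        // hostname unknown
}

func answerAAAA(hostnameExists: Bool, ipv6: String?) -> AAAAResponse {
    guard hostnameExists else { return .nxDomain }
    if let ipv6 { return .record(ipv6) }
    // Hostname resolves over IPv4 only: NXDOMAIN here would make musl
    // treat the whole name as nonexistent, so answer NODATA instead.
    return .noData
}
```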

flowchart TD
    Q["AAAA query for 'db'"] --> L["networkService.lookup('db')"]
    L --> F{Found?}
    F -->|No| N1[Return nil → NXDOMAIN via fallback]
    F -->|Yes| V{Has IPv6?}
    V -->|Yes| R[Return AAAA record]
    V -->|No| ND["Return NODATA<br/>(noError + empty answers)<br/>musl libc workaround"]

This is a great example of the kind of real-world compatibility issue that only surfaces when you run diverse container images in production. It's a single if statement, but it prevents DNS resolution failures across an entire class of Linux distributions.

What's Next

We've now covered how containers get their network identities and how they find each other. The next article shifts focus to the plugin system — the extensibility mechanism that makes the runtime helpers, network helpers, and even CLI extensions all work through a common config.json-based discovery pattern with launchd integration.