Read OSS

Architecture Overview and Boot Process: From CLI to Tunnel Daemon

Intermediate

Prerequisites

  • Basic Go (goroutines, channels, context)
  • Understanding of reverse proxies and HTTP
  • Familiarity with Cloudflare Tunnels concepts


Cloudflare Tunnel turns the traditional server exposure model inside out. Instead of opening ports to the internet and hoping your firewall holds, cloudflared dials out to Cloudflare's edge and says "send me traffic." This single inversion of control — an outbound-only connection that acts as a reverse proxy — is the foundation of everything in this codebase. In this first article, we'll build a mental model of how cloudflared works, map the package structure, and trace the complete boot sequence from main() to a running tunnel daemon.

The Mental Model: Tunnel as Reverse Proxy

At its core, cloudflared maintains encrypted, persistent connections from your origin server to Cloudflare's global edge network. When a user requests app.example.com, Cloudflare routes that request through one of these connections to cloudflared, which proxies it to your local service (e.g., localhost:8080).

```mermaid
flowchart LR
    User["👤 User"] -->|HTTPS| Edge["Cloudflare Edge"]
    Edge -->|QUIC/HTTP2| CFD["cloudflared"]
    CFD -->|HTTP/TCP/WS| Origin["Origin Service\nlocalhost:8080"]

    style Edge fill:#f68b1f,color:#fff
    style CFD fill:#1a73e8,color:#fff
```

The daemon supports two major operational modes:

  1. Tunnel server mode — the primary mode, where cloudflared acts as a tunnel connector between origin services and the Cloudflare edge.
  2. Access client mode — a secondary mode for authenticating users against Cloudflare Access policies (via cloudflared access).

This series focuses on tunnel server mode, where the interesting architectural decisions live.

Package Structure and Layering

The repository contains roughly 70 packages, but they organize into clear logical layers. Understanding this layering is essential — it tells you where to look when debugging or extending any feature.

```mermaid
flowchart TD
    CLI["CLI Layer\ncmd/cloudflared/"]
    CONFIG["Configuration\nconfig/ · credentials/"]
    ORCH["Orchestration\norchestration/ · features/"]
    SUPER["Supervisor\nsupervisor/"]
    CONN["Connection\nconnection/ · quic/"]
    PROXY["Traffic Routing\nproxy/ · ingress/"]
    EDGE["Edge Discovery\nedgediscovery/"]
    OPS["Operations\nmetrics/ · management/ · diagnostic/"]

    CLI --> CONFIG
    CLI --> SUPER
    CONFIG --> ORCH
    SUPER --> CONN
    SUPER --> EDGE
    CONN --> PROXY
    ORCH --> PROXY
    OPS -.-> SUPER
    OPS -.-> CONN
```

| Layer | Key Packages | Responsibility |
| --- | --- | --- |
| CLI | cmd/cloudflared/, cmd/cloudflared/tunnel/ | Command parsing, flag handling, boot orchestration |
| Configuration | config/, credentials/, cmd/cloudflared/tunnel/configuration.go | YAML discovery, credential loading, flag merging |
| Supervisor | supervisor/ | HA connection management, reconnection, protocol fallback |
| Connection | connection/, quic/ | QUIC and HTTP/2 transport, control stream RPC |
| Traffic Routing | proxy/, ingress/ | Ingress rule matching, origin service proxying |
| Orchestration | orchestration/, features/ | Runtime config updates, feature flags |
| Operations | metrics/, management/, diagnostic/ | Prometheus, live log streaming, health checks |

Tip: When navigating the codebase, start from cmd/cloudflared/tunnel/cmd.go — it's the nexus where CLI parsing, configuration building, and daemon startup all converge.

The Bootstrap Sequence: main() to runApp()

The entry point at cmd/cloudflared/main.go#L51-L97 reveals three pragmatic decisions before any real work begins:

```mermaid
sequenceDiagram
    participant main
    participant Subsystems
    participant Platform

    main->>main: Disable QUIC ECN (TUN-8148 workaround)
    main->>main: Set GOMAXPROCS via automaxprocs
    main->>main: Create graceShutdownC channel
    main->>main: Build urfave/cli App
    main->>Subsystems: Init(tunnel, access, updater, tracing, token, tail, management)
    main->>Platform: runApp(app, graceShutdownC)
```

First, os.Setenv("QUIC_GO_DISABLE_ECN", "1") globally disables ECN (Explicit Congestion Notification) in the QUIC library. This is a pragmatic workaround for bugs in ECN support detection — the kind of fix that's easy to miss but critical for reliability in diverse network environments.

Second, maxprocs.Set() from Uber's automaxprocs library configures GOMAXPROCS to match container CPU quotas. Without it, Go in a cgroups-limited container defaults GOMAXPROCS to the host's CPU count, scheduling far more parallel work than the quota allows and triggering CPU throttling.

Third, the graceShutdownC channel is created and threaded through the entire application. This channel is the lingua franca of shutdown — when closed, every component from the supervisor to the auto-updater knows to begin winding down. On Windows, the Service Control Manager closes it; on Unix, signal handlers do the same.

Each subsystem's Init() call receives build info and the shutdown channel, registering its CLI commands and flags with the global app before runApp() dispatches to the platform-specific runner.

Command Tree and Tunnel Mode Dispatch

The command hierarchy is built in cmd/cloudflared/main.go#L99-L158:

```mermaid
flowchart TD
    Root["cloudflared"] --> Tunnel["tunnel"]
    Root --> Access["access"]
    Root --> Update["update"]
    Root --> Version["version"]
    Root --> Tail["tail"]
    Root --> Mgmt["management"]

    Tunnel --> Run["run"]
    Tunnel --> Create["create"]
    Tunnel --> Route["route"]
    Tunnel --> List["list"]
    Tunnel --> Delete["delete"]
    Tunnel --> Info["info"]
    Tunnel --> Login["login"]

    Access --> AccLogin["login"]
    Access --> AccSSH["ssh"]
    Access --> AccCurl["curl"]
```

The most important dispatch happens in TunnelCommand at cmd/cloudflared/tunnel/cmd.go#L227-L269. It handles three tunnel modes:

  1. Ad-hoc named tunnel — when --name is provided, cloudflared creates the tunnel, optionally routes DNS to it, then runs it, all in one command.
  2. Quick tunnel — when --url or --hello-world is set (and a quick-service endpoint is configured), cloudflared requests an ephemeral tunnel from trycloudflare.com.
  3. Named tunnel run — the production path, where cloudflared tunnel run <TUNNEL_ID> starts a pre-configured tunnel.

There's also a subtle fourth path: when cloudflared is invoked with no arguments at all, the action() function at cmd/cloudflared/main.go#L170-L184 detects the empty invocation and enters service mode, reading from a config file and running via the overwatch service manager.
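
The four-way dispatch can be sketched as a plain switch. tunnelFlags and selectMode are illustrative names (the real code reads these values from a *cli.Context inside TunnelCommand), but the ordering follows the description above.

```go
package main

import "fmt"

// tunnelFlags captures the CLI inputs that drive mode selection.
type tunnelFlags struct {
	Name       string // --name
	URL        string // --url
	HelloWorld bool   // --hello-world
	TunnelID   string // positional argument to `tunnel run`
}

// selectMode mirrors the order of checks described above.
func selectMode(f tunnelFlags) string {
	switch {
	case f.Name != "":
		return "adhoc-named" // create, route DNS, run: all in one command
	case f.URL != "" || f.HelloWorld:
		return "quick" // ephemeral tunnel via trycloudflare.com
	case f.TunnelID != "":
		return "named-run" // production path: run a pre-configured tunnel
	default:
		return "service" // no arguments: config-file-driven service mode
	}
}

func main() {
	fmt.Println(selectMode(tunnelFlags{URL: "http://localhost:8080"}))
}
```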

StartServer(): Where Everything Comes Together

The StartServer function at cmd/cloudflared/tunnel/cmd.go#L313-L510 is the 200-line function that turns a parsed CLI context into a running tunnel daemon. Here's its logical flow:

```mermaid
sequenceDiagram
    participant SS as StartServer
    participant Sentry
    participant Config
    participant Orch as Orchestrator
    participant Metrics
    participant Supervisor

    SS->>Sentry: Init(DSN, version)
    SS->>SS: Create connectedSignal
    SS->>SS: Start auto-updater goroutine
    SS->>Config: prepareTunnelConfig()
    Config-->>SS: TunnelConfig + orchestration.Config
    SS->>SS: Create ManagementService
    SS->>Orch: NewOrchestrator(config, internalRules)
    SS->>Metrics: CreateMetricsListener + ServeMetrics
    SS->>Supervisor: StartTunnelDaemon(tunnelConfig, orchestrator)
    SS->>SS: waitToShutdown()
```

The critical call is prepareTunnelConfig() at cmd/cloudflared/tunnel/configuration.go#L114-L150, which:

  1. Creates a FeatureSelector that fetches feature flags from DNS TXT records
  2. Builds a client.Config with a generated connector ID
  3. Selects the transport protocol (QUIC by default, with fallback to HTTP/2)
  4. Parses ingress rules from config + CLI
  5. Assembles the complete TunnelConfig and orchestration.Config
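
Step 3's default-with-fallback choice can be reduced to a few lines. This is a simplification with invented names; the real selector also consults feature flags and what the edge negotiates.

```go
package main

import "fmt"

// selectProtocol sketches the transport choice: an explicit --protocol
// flag wins, QUIC is the default, HTTP/2 is the fallback.
func selectProtocol(userChoice string, quicDialable bool) string {
	switch {
	case userChoice != "" && userChoice != "auto":
		return userChoice // explicit flag wins
	case quicDialable:
		return "quic" // default transport
	default:
		return "http2" // fallback when UDP/QUIC is blocked
	}
}

func main() {
	fmt.Println(selectProtocol("auto", true))
	fmt.Println(selectProtocol("auto", false))
	fmt.Println(selectProtocol("http2", true))
}
```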

After config assembly, StartServer creates the management service — a chi-based HTTP handler for live log streaming from the dashboard — and injects it as an internal ingress rule. The Orchestrator is then initialized with these internal rules plus the user-defined ingress configuration.

Finally, supervisor.StartTunnelDaemon() kicks off the actual tunnel connections, and the function blocks on waitToShutdown(), which listens for errors or the grace shutdown signal.

Configuration and Credentials

Config file discovery follows a deliberate search path defined in config/configuration.go#L23-L39:

```mermaid
flowchart TD
    Search["Search Directories"] --> User["~/.cloudflared/\n~/.cloudflare-warp/\n~/cloudflare-warp/"]
    Search --> Nix["/etc/cloudflared/\n/usr/local/etc/cloudflared/"]
    User --> Files["config.yml\nconfig.yaml"]
    Nix --> Files
    Files --> Found{Found?}
    Found -->|Yes| Parse["YAML decode + strict warning pass"]
    Found -->|No| CLI["Fall back to CLI flags only"]
```

The YAML config is parsed twice — once normally, and once in strict mode to generate warnings about unknown fields without failing. This is a thoughtful UX choice: users with slightly mistyped keys get warned but not blocked.
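
The parse-twice pattern can be demonstrated with encoding/json's DisallowUnknownFields, which behaves analogously to a strict YAML pass; decodeWithWarning and the config fields here are invented for illustration, not taken from the codebase.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

type config struct {
	Tunnel          string `json:"tunnel"`
	CredentialsFile string `json:"credentials-file"`
}

// decodeWithWarning parses raw twice: a lenient pass that must succeed,
// and a strict pass whose failure is surfaced as a warning only.
func decodeWithWarning(raw []byte) (config, string, error) {
	var c config
	if err := json.Unmarshal(raw, &c); err != nil {
		return config{}, "", err // hard failure: genuinely malformed input
	}
	strict := json.NewDecoder(bytes.NewReader(raw))
	strict.DisallowUnknownFields()
	var ignored config
	if err := strict.Decode(&ignored); err != nil {
		return c, fmt.Sprintf("config warning: %v", err), nil // warn, don't block
	}
	return c, "", nil
}

func main() {
	// Note the typo in the second key: it produces a warning, not an error.
	raw := []byte(`{"tunnel": "my-tunnel", "credentialz-file": "/tmp/creds.json"}`)
	c, warning, err := decodeWithWarning(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println("tunnel:", c.Tunnel)
	if warning != "" {
		fmt.Println(warning)
	}
}
```

A mistyped key still loads the rest of the config, exactly the UX trade-off described above.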

Credentials come through three paths, checked in order:

  1. --credentials-file pointing to a JSON file with account tag, tunnel secret, and tunnel ID
  2. --token providing a base64-encoded token (the dashboard flow)
  3. --credentials-contents with inline JSON

The Credentials struct at connection/connection.go#L64-L69 holds all tunnel identity information: account tag, tunnel secret, tunnel UUID, and an optional endpoint field used for FedRAMP compliance.
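
A minimal sketch of that credentials shape, assuming JSON field names in the style of the dashboard-generated file (the actual struct in connection/connection.go may differ in detail):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Credentials mirrors the tunnel identity described above: account tag,
// tunnel secret, tunnel UUID, and an optional FedRAMP endpoint.
type Credentials struct {
	AccountTag   string `json:"AccountTag"`
	TunnelSecret []byte `json:"TunnelSecret"` // base64-encoded in JSON
	TunnelID     string `json:"TunnelID"`
	Endpoint     string `json:"Endpoint,omitempty"` // FedRAMP only
}

// parseCredentials decodes a credentials-file payload.
func parseCredentials(raw []byte) (Credentials, error) {
	var c Credentials
	err := json.Unmarshal(raw, &c)
	return c, err
}

func main() {
	raw := []byte(`{"AccountTag":"abc123","TunnelSecret":"c2VjcmV0","TunnelID":"6ff42ae2-765d-4adf-8112-31c75c1551ef"}`)
	c, err := parseCredentials(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(c.TunnelID, len(c.TunnelSecret))
}
```

Note that encoding/json decodes a base64 string into a []byte field automatically, which is why the secret can live as opaque text in the JSON file.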

Key Design Decisions in the Boot Path

Several non-obvious patterns in the boot path deserve attention:

The graceShutdownC channel pattern. Rather than using context.Context for shutdown (which would require threading a context through service managers), cloudflared uses a raw chan struct{} that's created once in main() and shared everywhere. This works because the Windows SCM integration needs to close the channel from outside the normal Go context hierarchy. It's a pragmatic bridge between Go's context model and OS service management APIs.

Quick tunnels are limited. The quick tunnel path through trycloudflare.com automatically constrains to a single HA connection and disables ICMP routing. These tunnels are ephemeral, unauthenticated, and carry no uptime guarantee — the code enforces these limitations structurally rather than just documenting them.

Service mode vs command mode. When you install cloudflared as a systemd service with no arguments, it enters a completely different code path — the handleServiceMode function creates a file watcher on the config, uses the overwatch package to manage service lifecycles, and can react to config file changes on disk. This is a separate operational mode from the tunnel run command.

What's Next

With the boot sequence understood, the daemon is now running and has called supervisor.StartTunnelDaemon(). But what happens inside the supervisor? How does it manage four simultaneous connections to the edge? What happens when QUIC fails? In Part 2, we'll dive into the Supervisor's event loop, its HA connection management, and the sophisticated protocol fallback system that keeps tunnels alive even when networks misbehave.