Read OSS

Architecture Overview: Navigating the workerd Codebase

Intermediate

Prerequisites

  • General understanding of JavaScript runtimes (Node.js, Deno, or browsers)
  • Basic familiarity with C++ project layouts
  • Conceptual understanding of V8 as a JavaScript engine

Cloudflare Workers handles billions of requests daily. The runtime powering that—code-named workerd—was open-sourced in 2022, giving the world a production-grade JavaScript/Wasm server runtime built on V8, Cap'n Proto, and KJ. If you've ever wondered how a runtime can execute untrusted JavaScript with near-zero cold starts and strict isolation guarantees, this series will walk you through the source code that makes it happen.

This first article orients you in the codebase. We'll cover the design philosophy, the layered architecture, the capability-based configuration model, and the CLI entry point—everything you need to know before diving deeper in subsequent articles.

Project Overview and Design Philosophy

workerd is not another Node.js. It's a single-tenant, capability-based JavaScript runtime purpose-built for the edge. Where Node.js hands your code full access to the filesystem, network, and environment variables by default, workerd starts from zero ambient authority: a Worker has access to nothing unless you explicitly grant it through configuration bindings.

This principle is declared up front in the configuration schema at src/workerd/server/workerd.capnp#L23-L35:

A common theme in this configuration is capability-based design. We generally like to avoid giving a Worker the ability to access external resources by name, since this makes it hard to see and restrict what each Worker can access. Instead, the default is that a Worker has access to no privileged resources at all, and you must explicitly declare "bindings" to give it access to specific resources.

This shapes everything about how workerd is built. Unlike Deno's permission prompts or Node's trust-everything approach, workerd's security model is structural: you literally cannot make a subrequest unless your configuration wires up a service binding to enable it.
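
To make this concrete, here is a minimal sketch of what that wiring looks like in a workerd config. The service and binding names (`frontend`, `backend`, `BACKEND`) are illustrative, not taken from the workerd samples:

```capnp
# Illustrative fragment: two workers, where "frontend" can call "backend"
# only because a binding explicitly grants that capability.
const example :Workerd.Config = (
  services = [
    ( name = "frontend",
      worker = (
        modules = [ (name = "main", esModule = embed "frontend.js") ],
        compatibilityDate = "2024-01-01",
        # Without this binding, frontend has no way to reach backend.
        bindings = [ (name = "BACKEND", service = "backend") ],
      )
    ),
    ( name = "backend",
      worker = (
        modules = [ (name = "main", esModule = embed "backend.js") ],
        compatibilityDate = "2024-01-01",
      )
    ),
  ],
  sockets = [ (name = "http", address = "*:8080", http = (), service = "frontend") ]
);
```

Delete the `bindings` line and `frontend` loses the ability to reach `backend` entirely; there is no flag or runtime permission to fall back on.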

| Feature | Node.js | Deno | workerd |
| --- | --- | --- | --- |
| Default network access | Full | Prompt-gated | None (explicit binding required) |
| Filesystem access | Full | Prompt-gated | None (explicit disk service required) |
| Security model | Trust by default | Permission prompts | Capability-based config |
| V8 usage | Embedded | Embedded | Embedded |
| Primary language | C++ | Rust | C++ |
| Configuration | package.json / CLI flags | deno.json / CLI flags | Cap'n Proto schema |

Directory Structure and Layered Architecture

The codebase follows a clean layered design. Understanding these layers is the key to navigating the roughly 500,000 lines of C++ without getting lost.

graph TD
    subgraph "src/workerd/"
        A["server/\nServer orchestration, CLI, config parsing"] --> B["io/\nI/O context, worker lifecycle, channels"]
        B --> C["jsg/\nC++ ↔ V8 binding layer"]
        C --> D["api/\nWeb API implementations\n(fetch, crypto, streams, etc.)"]
        B --> D
    end
    A --> E["V8 Engine"]
    A --> F["KJ Async I/O"]
    A --> G["Cap'n Proto"]
    C --> E

| Directory | Purpose | Key Classes |
| --- | --- | --- |
| server/ | CLI entry point, config parsing, service orchestration | Server, CliMain, WorkerService |
| io/ | Request lifecycle, async I/O bridging, worker management | IoContext, Worker, Worker::Actor |
| jsg/ | C++ ↔ JavaScript type mapping and binding macros | Wrappable, JSG_RESOURCE_TYPE |
| api/ | Web-standard API implementations | ServiceWorkerGlobalScope, Response, Request |
| util/ | Low-level utilities: SQLite wrappers, string helpers | SqliteDatabase, SqliteKv |

The Server class at src/workerd/server/server.h#L36 sits at the top of the stack. It consumes Cap'n Proto configuration, creates services, binds sockets, and manages the lifetime of everything:

class Server final: private kj::TaskSet::ErrorHandler,
                    private ChannelTokenHandler::Resolver {
 public:
  kj::Promise<void> run(jsg::V8System& v8System,
      config::Config::Reader conf,
      kj::Promise<void> drainWhen = kj::NEVER_DONE);

The io/ layer bridges JavaScript execution with the KJ event loop. This is where IoContext—the per-request execution context—lives (src/workerd/io/io-context.h#L190). It's where garbage-collected JS objects meet event-loop-bound I/O objects, and it's probably the most architecturally important class in the codebase.

The jsg/ layer (JavaScript Glue) provides the macro-driven type system that maps C++ classes to JavaScript objects. If you've ever used pybind11 or N-API, you'll recognize the pattern—but JSG is deeply integrated with V8's garbage collector in ways those tools are not.

The api/ layer is where the actual Web APIs live—fetch(), crypto.subtle, ReadableStream, and so on. These classes use JSG macros to expose themselves to JavaScript.

Tip: When navigating the source, the dependency flow always goes server → io → jsg → api. If you find yourself in api/ code trying to understand how something gets called, trace upward through io/ to find the orchestration logic.

The Capability-Based Configuration Model

workerd's configuration is defined via a Cap'n Proto schema. This isn't just a preference—it enables the "compiled binary" pattern where your configuration and source code are appended directly to the executable, producing a single self-contained binary.

The top-level Config struct contains two primary lists: services and sockets. Services define what can run; sockets define how the world reaches them.

flowchart LR
    Config --> Services
    Config --> Sockets
    Services --> S1["Service: worker"]
    Services --> S2["Service: network"]
    Services --> S3["Service: external"]
    Services --> S4["Service: disk"]
    Sockets --> Socket1["Socket: HTTP/HTTPS"]
    Socket1 -->|service designator| S1
    S1 -->|bindings| S2
    S1 -->|bindings| S3

The schema at src/workerd/server/workerd.capnp#L163-L197 defines a Service as a tagged union with four variants:

  • worker: A JavaScript/Wasm worker with modules, bindings, and a compatibility date
  • network: Access to a network (typically the public internet)
  • external: A proxy to a specific remote server
  • disk: An HTTP service backed by a local directory
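
A sketch of how the four variants might look side by side in a config. The names, address, and path here are illustrative, and the exact field lists should be checked against workerd.capnp:

```capnp
services = [
  # worker: runs JavaScript/Wasm (definition elsewhere in the config)
  ( name = "main", worker = .mainWorker ),
  # network: grants (filtered) network access
  ( name = "internet", network = ( allow = ["public"] ) ),
  # external: proxies to one specific remote server
  ( name = "origin", external = ( address = "example.com" ) ),
  # disk: serves files from a local directory over HTTP
  ( name = "assets", disk = ( path = "/srv/www" ) )
],
```

Note that a worker only reaches `internet`, `origin`, or `assets` if a binding or global outbound in its definition names them.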

The ServiceDesignator type (src/workerd/server/workerd.capnp#L199-L235) is how one part of the config references another. A socket references a service by name; a worker's binding references another service by name. This is the plumbing that enforces the capability model: a worker can only talk to services it has been explicitly wired to.

The Worker struct itself (src/workerd/server/workerd.capnp#L237-L270) supports three source types: ES modules, a service worker script, or inheriting from another worker. It also defines the module types: esModule, commonJsModule, text, data, wasm, json, and pythonModule.

CLI Entry Point and Subcommands

The CliMain class at src/workerd/server/workerd.c++#L680 is where execution begins. It exposes several subcommands:

flowchart TD
    CLI["workerd"] --> serve["serve\nRun the server"]
    CLI --> compile["compile\nCreate self-contained binary"]
    CLI --> test["test\nRun unit tests"]
    CLI --> pylock["pyodide-lock\nOutput Pyodide package lock"]
    serve --> Server["Server::run()"]
    compile --> Binary["Append config to executable"]
    test --> Tests["Server::test()"]

The constructor of CliMain does something clever: it checks whether the executable itself contains an appended configuration (src/workerd/server/workerd.c++#L706-L728). If the last bytes of the binary match a magic suffix, CliMain reads the embedded Cap'n Proto config, enabling the "compiled binary" pattern where workerd compile bakes your configuration and source into a standalone executable.

if (kj::arrayPtr(magic) == kj::asBytes(COMPILED_MAGIC_SUFFIX)) {
  // Oh! It appears we are running a compiled binary,
  // it has a config appended to the end.

The subcommands at lines 744–756 are registered with KJ's MainBuilder:

.addSubCommand("serve", KJ_BIND_METHOD(*this, getServe), "run the server")
.addSubCommand("compile", KJ_BIND_METHOD(*this, getCompile),
    "create a self-contained binary")
.addSubCommand("test", KJ_BIND_METHOD(*this, getTest), "run unit tests")

Tip: The compile subcommand is a unique feature among JS runtimes. If you're distributing a Workers application and want a zero-dependency deployment artifact, workerd compile produces a single binary that doesn't need a separate config file at runtime.

Running the Hello World Example

The simplest way to see the architecture in action is the ESM hello-world sample. The configuration at samples/helloworld_esm/config.capnp demonstrates the minimum viable setup:

const helloWorldExample :Workerd.Config = (
  services = [ (name = "main", worker = .helloWorld) ],
  sockets = [ ( name = "http", address = "*:8080", http = (), service = "main" ) ]
);

const helloWorld :Workerd.Worker = (
  modules = [
    (name = "worker", esModule = embed "worker.js")
  ],
  compatibilityDate = "2023-02-28",
);

And the worker at samples/helloworld_esm/worker.js:

export default {
  async fetch(req, env) {
    return new Response("Hello World\n");
  }
};

Notice how the config declares one service ("main") bound to the worker, and one socket listening on port 8080 that dispatches to "main." There's no implicit fetch() access, no ambient network—the only thing this worker can do is respond to incoming requests, because that's all the config permits.
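
To see what gaining a capability looks like from the worker's side, here is a hedged sketch. `BACKEND` is a hypothetical service binding that the config would have to declare; without that declaration, `env` simply has no such property:

```javascript
// Sketch of a worker that calls another service through a binding.
// The handler is a plain function so it can be exercised outside workerd.
async function handleFetch(req, env) {
  // env.BACKEND exists only because the config wired a service binding
  // named "BACKEND" to this worker; there is no ambient network access.
  const upstream = await env.BACKEND.fetch("https://internal/api");
  const body = await upstream.text();
  return new Response(`backend said: ${body}`);
}

// In a real worker module this would be exported as:
// export default { fetch: handleFetch };
```

Keeping the handler as a plain function also makes it easy to unit-test with a stub `env`, which is exactly the kind of seam the capability model encourages.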

Running workerd serve samples/helloworld_esm/config.capnp triggers the entire boot sequence we'll trace in the next article: config parsing → service creation → socket binding → ready for requests.

What's Next

This article gave you the mental map. You know the four layers (server, io, jsg, api), you understand the capability-based configuration model, and you've seen the CLI entry point. In Article 2, we'll follow a request from the moment it hits a listening socket all the way through to JavaScript execution and back—tracing the boot sequence, the makeService() factory, and the WorkerEntrypoint that bridges HTTP and JavaScript.