Navigating the workerd Codebase: Architecture and Directory Map
Prerequisites
- Basic understanding of Cloudflare Workers as a product (fetch handlers, KV, Durable Objects)
- Familiarity with C++ project structure
- General knowledge of JavaScript runtimes and V8
If you've ever deployed a Cloudflare Worker, you've used workerd — you just didn't know it. The same C++ code that runs your fetch() handler on Cloudflare's edge now ships as an open-source binary you can run on your own machine. This article is your map to the 200,000+ line C++ codebase that makes it work. By the end, you'll know where every major subsystem lives, how the layers compose, and where to start reading code for any given task.
What is workerd and Why Does It Exist?
workerd (pronounced "worker-dee") is a JavaScript and WebAssembly server runtime built on V8, the same engine that powers Chrome and Node.js. But unlike Node.js, workerd was designed from the ground up for multi-tenant, serverless execution. It is the same code that runs Cloudflare Workers in production — extracted, open-sourced, and packaged as a standalone binary.
You might use it in three ways: as an application server to self-host Workers-compatible applications, as a development tool for local testing (this is what Wrangler's wrangler dev does under the hood), or as a programmable HTTP proxy where JavaScript intercepts and transforms every request.
The project lives at github.com/cloudflare/workerd and builds with Bazel. It embeds V8 and depends heavily on two Cloudflare-maintained C++ libraries: KJ (an async I/O framework) and Cap'n Proto (a serialization and RPC system). If you've never encountered these, think of KJ as a more opinionated alternative to Boost.Asio, and Cap'n Proto as a zero-copy alternative to Protocol Buffers.
Design Principles: Nanoservices, Capabilities, and Backwards Compatibility
Three design decisions shape the entire workerd codebase. Understanding them is essential before reading any code.
Nanoservices. In workerd, everything is a Service. A JavaScript Worker is a Service. A network connection is a Service. A directory on disk is a Service. When one Worker calls another via a binding, the callee runs in the same thread and process — the cost is a function call, not a network round-trip. This is what Cloudflare calls the "nanoservice" architecture.
Capability-based bindings. A Worker has zero ambient authority. It cannot access the network, the filesystem, or other Workers unless the configuration explicitly grants it a binding. This is a security principle baked into the config schema itself — look at the comments at the top of the config file:
src/workerd/server/workerd.capnp#L23-L35
Even internet access is modeled as a binding to a special "internet" service that is auto-created if not explicitly defined.
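To make the "zero ambient authority" idea concrete, here is a minimal sketch of how an environment object could be assembled from a list of granted bindings. It is a toy model, not workerd's implementation: `BoundService`, `WorkerGrant`, `buildEnv`, and the service names are all hypothetical.

```typescript
// Toy model of capability-based bindings. Every name here (BoundService,
// WorkerGrant, buildEnv) is hypothetical; none of it is workerd's real API.

// A service is modeled as a plain function from request text to response text.
type BoundService = (request: string) => string;

interface WorkerGrant {
  // Binding name seen by the Worker -> service name defined in the config.
  bindings: Record<string, string>;
}

// Build the env object for one Worker. It contains the granted services and
// nothing else, so an ungranted service is simply unreachable.
function buildEnv(
  grant: WorkerGrant,
  allServices: Map<string, BoundService>,
): Record<string, BoundService> {
  const env: Record<string, BoundService> = {};
  for (const [bindingName, serviceName] of Object.entries(grant.bindings)) {
    const service = allServices.get(serviceName);
    if (service === undefined) {
      throw new Error(`binding ${bindingName} names unknown service ${serviceName}`);
    }
    env[bindingName] = service;
  }
  return env;
}

const allServices = new Map<string, BoundService>([
  ["auth", (request) => `auth: ${request}`],
  ["billing", (request) => `billing: ${request}`],
]);

// This Worker is granted the "auth" service under the name AUTH. The
// "billing" service exists in the process but is not part of its env.
const workerEnv = buildEnv({ bindings: { AUTH: "auth" } }, allServices);
```

In this model the Worker can call `workerEnv["AUTH"]`, while `billing` has no name it can reach. The config, not the Worker's code, decides what exists in the Worker's world.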
Backwards compatibility via compatibility dates. workerd's version number is a date. Every behavior change, including any bug fix that could break existing code, is tracked as a boolean flag with a date annotation in a 1500+ line Cap'n Proto schema. A Worker specifies a compatibilityDate, and the runtime enables only the flags whose enable date falls on or before it. As a result, updating workerd should not change the behavior your Worker already depends on.
flowchart LR
subgraph Principles
A[Nanoservices<br/>Everything is a Service] --> D[Composable Architecture]
B[Capability Bindings<br/>No ambient authority] --> D
C[Compat Dates<br/>Never break old code] --> D
end
D --> E[workerd Runtime]
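The compatibility-date rule can be sketched in a few lines. This assumes each flag carries a single enable date and that a flag is on when that date is on or before the Worker's compatibilityDate. The flag names and dates below are invented for illustration.

```typescript
// Toy model of compatibility-date resolution. The flag names and dates are
// made up; real flags are declared in io/compatibility-date.capnp.

interface CompatFlagSketch {
  flagName: string;
  // Date on which the flag becomes the default, as YYYY-MM-DD.
  enableDate: string;
}

const HYPOTHETICAL_FLAGS: CompatFlagSketch[] = [
  { flagName: "hypotheticalFixA", enableDate: "2022-03-01" },
  { flagName: "hypotheticalFixB", enableDate: "2023-06-15" },
];

// Return the flags that are on for a Worker with the given compatibilityDate.
// ISO dates in YYYY-MM-DD form sort correctly as plain strings.
function resolveCompatFlags(
  compatibilityDate: string,
  flags: CompatFlagSketch[],
): Set<string> {
  const enabled = new Set<string>();
  for (const flag of flags) {
    if (flag.enableDate <= compatibilityDate) {
      enabled.add(flag.flagName);
    }
  }
  return enabled;
}

// A Worker pinned to late 2022 gets fix A but never sees fix B.
const pinnedFlags = resolveCompatFlags("2022-12-31", HYPOTHETICAL_FLAGS);
```

Because the result depends only on the Worker's own date, shipping a newer runtime with more flags leaves `pinnedFlags` unchanged.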
Layered Architecture Overview
workerd is organized as a layer cake. To a first approximation, each layer depends only on the layers below it, and each has a single, well-defined responsibility.
flowchart TB
CLI["CLI Layer<br/><code>workerd.c++</code><br/>Parse args, load config, manage lifecycle"]
Server["Server Layer<br/><code>server.c++ / server.h</code><br/>Build service graph, bind sockets, accept loop"]
Services["Service Layer<br/>WorkerService, NetworkService,<br/>DiskDirectoryService, ExternalHttpService"]
IO["I/O Layer<br/><code>io/</code><br/>IoContext, Worker, Actor, gates"]
API["API Layer<br/><code>api/</code><br/>fetch, streams, crypto, KV, DO, R2..."]
JSG["JSG Binding Layer<br/><code>jsg/</code><br/>C++↔V8 type marshaling"]
V8["V8 Engine<br/>JavaScript execution"]
CLI --> Server --> Services --> IO --> API --> JSG --> V8
The CLI layer (src/workerd/server/workerd.c++) uses KJ's MainBuilder to define subcommands: serve, compile, test, and pyodide-lock. It parses the Cap'n Proto config file and hands it to the Server.
The Server class (src/workerd/server/server.h#L36-L46) is the core orchestrator. It takes a parsed Config struct and a V8System reference, constructs all the Service instances, links them together (resolving inter-service references), binds sockets, and enters the accept loop.
The Service layer is where the nanoservice principle becomes concrete. Server::Service is an abstract base class with a critical startRequest() method:
src/workerd/server/server.c++#L217-L251
Concrete implementations include WorkerService (runs JavaScript), NetworkService (wraps the network as a service), DiskDirectoryService, and ExternalHttpService.
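The shape of this abstraction can be modeled outside C++. The sketch below is a deliberately simplified, synchronous stand-in: the real Server::Service is a C++ class with an asynchronous interface, and every type and class name here (`SketchService`, `DiskDirectorySketch`, `WorkerSketch`) is hypothetical. It shows one entry point shared by all service kinds, and a Worker reaching another service through a plain method call.

```typescript
// Loose model of the Service abstraction. All names are hypothetical and the
// real startRequest() signature in server.c++ is different.

interface SketchRequest {
  path: string;
}

interface SketchResponse {
  statusCode: number;
  body: string;
}

// Every kind of service answers requests through the same entry point.
interface SketchService {
  startRequest(request: SketchRequest): SketchResponse;
}

// Serves files from an in-memory map, standing in for a directory on disk.
class DiskDirectorySketch implements SketchService {
  private files: Map<string, string>;

  constructor(files: Map<string, string>) {
    this.files = files;
  }

  startRequest(request: SketchRequest): SketchResponse {
    const body = this.files.get(request.path);
    if (body === undefined) {
      return { statusCode: 404, body: "not found" };
    }
    return { statusCode: 200, body };
  }
}

type SketchHandler = (
  request: SketchRequest,
  env: Record<string, SketchService>,
) => SketchResponse;

// Runs a handler function, standing in for a JavaScript Worker.
class WorkerSketch implements SketchService {
  private handler: SketchHandler;
  private env: Record<string, SketchService>;

  constructor(handler: SketchHandler, env: Record<string, SketchService>) {
    this.handler = handler;
    this.env = env;
  }

  startRequest(request: SketchRequest): SketchResponse {
    return this.handler(request, this.env);
  }
}

const assets = new DiskDirectorySketch(new Map([["/index.html", "<h1>hi</h1>"]]));

// The Worker forwards to its ASSETS binding. The call is an ordinary method
// call on the same stack, with no serialization or network hop.
const frontend = new WorkerSketch((request, env) => {
  const target = env["ASSETS"];
  if (target === undefined) {
    return { statusCode: 500, body: "no ASSETS binding" };
  }
  const inner = target.startRequest(request);
  return { statusCode: inner.statusCode, body: `wrapped: ${inner.body}` };
}, { ASSETS: assets });
```

Calling `frontend.startRequest({ path: "/index.html" })` runs the Worker handler and the directory lookup in a single call chain, which is the nanoservice property described earlier.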
Directory Structure Deep Dive
| Directory | Purpose |
|---|---|
| src/workerd/server/ | CLI entry point, Server class, config schema, API aggregation |
| src/workerd/io/ | Request lifecycle: IoContext, Worker, Actor, gates, caching |
| src/workerd/jsg/ | JSG binding layer: C++↔V8 macros, type marshaling, setup |
| src/workerd/api/ | All Workers API implementations: fetch, streams, crypto, KV, DO, R2, queues |
| src/workerd/api/node/ | Node.js compatibility layer (buffer, crypto, etc.) |
| src/workerd/util/ | Shared utilities: SQLite wrapper, HTTP helpers, autogate system |
| src/node/ | Node.js compat modules implemented in JS/TS |
| src/cloudflare/ | Internal Cloudflare-specific APIs (AI, vectorize, pipelines) |
| src/rust/ | Rust components: transpiler, encoding |
| build/ | Bazel build rules |
| patches/ | Patches applied to third-party dependencies |
graph TD
subgraph "src/workerd/"
server["server/<br/>CLI + Config + Server"]
io["io/<br/>IoContext, Worker, Actor"]
jsg["jsg/<br/>V8 Binding Layer"]
api["api/<br/>All API Implementations"]
util["util/<br/>Shared Utilities"]
end
server --> io
io --> jsg
api --> jsg
api --> io
server --> api
io --> util
The api/ directory is by far the largest — it contains the implementation of every API that a Worker can call. Each API area gets its own header/source pair: fetch.h/fetch.c++, streams.h/streams.c++, crypto/impl.h, and so on. The global-scope.h file defines the global object (ServiceWorkerGlobalScope) and is a useful index of everything available to a Worker.
Tip: When exploring the codebase for the first time, start with api/global-scope.h. It imports every major API header and defines the global scope, making it an excellent table of contents for the entire API surface.
Configuration as Cap'n Proto Schema
workerd uses Cap'n Proto for its configuration format. The schema is defined in src/workerd/server/workerd.capnp and establishes the four core structs:
erDiagram
Config ||--o{ Service : "has many"
Config ||--o{ Socket : "has many"
Socket ||--|| ServiceDesignator : "routes to"
Service ||--o| Worker : "may be"
Service ||--o| Network : "may be"
Service ||--o| ExternalServer : "may be"
Service ||--o| DiskDirectory : "may be"
Worker ||--o{ Binding : "has"
Binding ||--|| ServiceDesignator : "references"
Worker ||--o{ Module : "contains"
Config is the top-level struct: it contains lists of Services and Sockets, plus V8 flags and extensions. Socket defines a listener (address, HTTP/HTTPS/TCP protocol, TLS options) pointing to a Service. Service is a tagged union — it can be a Worker, a Network, an ExternalServer, or a DiskDirectory. Worker contains modules, bindings, compatibility dates, and Durable Object namespace definitions.
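The relationships between these structs can be written down as types. The sketch below mirrors the struct names from the schema (with a `Def` suffix), but the field names are illustrative guesses rather than a copy of workerd.capnp. The `findDanglingReferences` helper is hypothetical too; it shows the kind of check the Server's linking step implies, where every service reference must resolve.

```typescript
// Type-level sketch of the config shape. Field names are assumptions made
// for illustration and are not taken from workerd.capnp.

interface WorkerDef {
  kind: "worker";
  compatibilityDate: string;
  // Binding name -> name of the service it designates.
  bindings: Record<string, string>;
}

interface NetworkDef { kind: "network"; }
interface ExternalServerDef { kind: "external"; address: string; }
interface DiskDirectoryDef { kind: "disk"; path: string; }

// The tagged union: each service is exactly one of these variants.
type ServiceBody = WorkerDef | NetworkDef | ExternalServerDef | DiskDirectoryDef;

interface ServiceDef { serviceName: string; body: ServiceBody; }
interface SocketDef { address: string; service: string; }
interface ConfigDef { services: ServiceDef[]; sockets: SocketDef[]; }

// Collect every service reference that does not resolve to a defined service.
function findDanglingReferences(config: ConfigDef): string[] {
  const defined = new Set(config.services.map((s) => s.serviceName));
  // The article notes that an "internet" service is auto-created if absent.
  defined.add("internet");

  const dangling: string[] = [];
  for (const socket of config.sockets) {
    if (!defined.has(socket.service)) {
      dangling.push(`socket ${socket.address} -> ${socket.service}`);
    }
  }
  for (const service of config.services) {
    const body = service.body;
    if (body.kind !== "worker") continue;
    for (const [bindingName, target] of Object.entries(body.bindings)) {
      if (!defined.has(target)) {
        dangling.push(`${service.serviceName}.${bindingName} -> ${target}`);
      }
    }
  }
  return dangling;
}

const sampleConfig: ConfigDef = {
  services: [
    {
      serviceName: "main",
      body: {
        kind: "worker",
        compatibilityDate: "2024-01-01",
        bindings: { ASSETS: "site", DB: "db" },
      },
    },
    { serviceName: "site", body: { kind: "disk", path: "./public" } },
  ],
  sockets: [{ address: "*:8080", service: "main" }],
};

// "db" is never defined, so the DB binding is reported as dangling.
const danglingRefs = findDanglingReferences(sampleConfig);
```

Modeling Service as a discriminated union keeps the "exactly one variant" rule in the type system, which is roughly what a Cap'n Proto union gives the C++ code.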
Why Cap'n Proto instead of JSON or YAML? Three reasons: (1) schema evolution guarantees — fields can be added without breaking old configs; (2) the same schema is used for both human-readable text format and efficient binary format; (3) Cap'n Proto's RPC system is used throughout the codebase, so the schema language is already ubiquitous.
Wrangler (Cloudflare's CLI tool) generates binary Cap'n Proto configs when running wrangler dev, feeding them to workerd with the --binary flag.
Where to Start Reading Code
If you're a new contributor, here's the reading order I'd recommend:
flowchart LR
A["1. workerd.capnp<br/>Understand the config shape"] --> B["2. server.h<br/>The Server API"]
B --> C["3. workerd.c++<br/>CLI and boot sequence"]
C --> D["4. server.c++<br/>Service construction"]
D --> E["5. worker-entrypoint.c++<br/>Request handling"]
E --> F["6. global-scope.h<br/>API surface index"]
F --> G["7. jsg/jsg.h<br/>Binding macros"]
For specific tasks, use this map:
| Task | Start here |
|---|---|
| Adding a new API | api/global-scope.h, then the EW_*_ISOLATE_TYPES macro in workerd-api.c++ |
| Fixing a compat bug | io/compatibility-date.capnp for the flag, then grep for flags.getXxx() |
| Understanding request flow | io/worker-entrypoint.c++ → io/io-context.h |
| Debugging Durable Objects | io/worker.h (Worker::Actor), io/actor-cache.h, io/io-gate.h |
| Modifying the config schema | server/workerd.capnp, then server/server.c++ for parsing |
The API registration point — where all Workers API types are aggregated into a single V8 isolate type — is one of the most important files to understand:
src/workerd/server/workerd-api.c++#L84-L153
This JSG_DECLARE_ISOLATE_TYPE invocation lists every EW_*_ISOLATE_TYPES macro (each defined in a different API header) and combines them into the complete type system that V8 understands. If your new API type isn't listed here, it doesn't exist in JavaScript.
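The aggregation pattern itself is simple, and a rough analogy may help before reading the macros. In the sketch below, plain arrays of names play the role of the per-header EW_*_ISOLATE_TYPES lists, and a single set plays the role of the combined isolate type. The list contents and all identifiers are hypothetical, and the real mechanism works on C++ types at compile time, not strings at run time.

```typescript
// Analogy for aggregating per-area type lists into one registry. The lists
// stand in for EW_*_ISOLATE_TYPES macros; their contents are made up.

const HTTP_TYPE_LIST = ["Request", "Response", "Headers"];
const STREAMS_TYPE_LIST = ["ReadableStream", "WritableStream"];

// The single aggregation point, analogous to one invocation that mentions
// every per-area list.
const AGGREGATED_TYPES = new Set<string>([
  ...HTTP_TYPE_LIST,
  ...STREAMS_TYPE_LIST,
]);

// A type is visible to scripts only if it went through the aggregation point.
function isExposedToScripts(typeName: string): boolean {
  return AGGREGATED_TYPES.has(typeName);
}
```

A new list that is defined but never spread into `AGGREGATED_TYPES` leaves its types invisible, which is the failure mode to watch for when adding an API.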
Tip: The codebase uses .c++ (not .cpp) and .h for C++ files. This is a KJ/Cap'n Proto convention. Don't let it trip you up when searching.
With this mental model in place — the layer cake from CLI to V8, the nanoservice principle, capability bindings, and compat dates — you're ready to dive deeper. In the next article, we'll trace the exact boot sequence: what happens between typing workerd serve config.capnp and the server being ready to accept its first request.