
Base Node Architecture: How 30 Files Orchestrate an Ethereum L2

Intermediate

Prerequisites

  • Docker and Docker Compose fundamentals
  • Basic understanding of Ethereum L1/L2 architecture (execution vs. consensus layer)
  • What the OP Stack is (Optimism's rollup framework)


Most codebases you clone are applications. They have src/ directories, package managers, and build tools that produce a single binary. The base/node repository is none of those things. It's a deployment orchestrator — roughly 30 files that fetch, build, and wire together upstream binaries from four different repositories, across three programming languages, to produce a running Base L2 Ethereum node. Understanding this mental model is the key to navigating everything else in the repository.

This is the first article in a five-part deep dive. We'll start by mapping the entire codebase, then explain the two-service architecture that underpins everything and show how a single JSON file drives the entire build pipeline.

What This Repository Actually Is

If you git clone this repository expecting to find blockchain consensus logic or EVM execution code, you'll be confused. There's no main() in the critical path. Instead, base/node is an opinionated composition layer that:

  1. Pins exact versions of four upstream dependencies (Reth, Geth, Nethermind, and op-node) in a single JSON file
  2. Builds them from source inside Docker multi-stage builds, verifying commit hashes to prevent supply chain attacks
  3. Wires them together with entrypoint scripts that handle service discovery, JWT authentication, and feature flag toggling
  4. Orchestrates startup via Docker Compose, ensuring the execution client is ready before the consensus client connects
flowchart LR
    subgraph "base/node repository"
        DC[docker-compose.yml]
        VJ[versions.json]
        EP[Entrypoint Scripts]
        DF[Dockerfiles]
    end
    subgraph "Upstream Repos"
        R[base/base<br/>Reth + base-consensus]
        G[ethereum-optimism/op-geth]
        N[NethermindEth/nethermind]
        O[ethereum-optimism/optimism<br/>op-node]
    end
    VJ -->|pins versions| DF
    DF -->|clones & builds| R
    DF -->|clones & builds| G
    DF -->|clones & builds| N
    DF -->|clones & builds| O
    DC -->|orchestrates| EP

The result is a single docker compose up command that produces a syncing Base node — regardless of which execution client you choose.

Directory Structure and File Roles

With only ~30 files, there's no deep directory tree to get lost in. But those files serve distinct roles that map to different stages of the build-and-run pipeline.

| Path | Role | Description |
| --- | --- | --- |
| docker-compose.yml | Orchestration | Defines the two services and their wiring |
| .env | User config | Default client selection and data directory |
| .env.mainnet / .env.sepolia | Network config | Network-specific endpoints, bootnodes, and cache settings |
| versions.json | Version pinning | Source of truth for all upstream dependency versions |
| versions.env | Generated config | Shell-sourceable exports consumed by Dockerfiles |
| geth/Dockerfile | Build pipeline | Multi-stage Docker build for op-geth + op-node |
| reth/Dockerfile | Build pipeline | Multi-stage Docker build for base-reth-node + base-consensus + op-node |
| nethermind/Dockerfile | Build pipeline | Multi-stage Docker build for Nethermind + op-node |
| geth/geth-entrypoint | Runtime | Geth startup script with cache tuning |
| reth/reth-entrypoint | Runtime | Reth startup with Flashblocks and historical proofs |
| nethermind/nethermind-entrypoint | Runtime | Nethermind startup script |
| consensus-entrypoint | Runtime | Dispatcher routing to op-node or base-consensus |
| op-node-entrypoint | Runtime | Legacy consensus client startup |
| base-consensus-entrypoint | Runtime | New consensus client startup with follow mode |
| supervisord.conf | Process management | Fallback for standalone Docker usage |
| dependency_updater/ | Tooling | Go CLI for automated version tracking |
| .github/workflows/ | CI/CD | Build, test, and release automation |

Tip: The flat structure is intentional. Each execution client gets its own directory with exactly two files — a Dockerfile and an entrypoint script. Everything else lives at the root. This makes it trivial to add a fourth client: create a directory, add two files, update versions.json.

The Two-Service Architecture

The heart of base/node is docker-compose.yml — a remarkably compact file that defines the entire runtime topology.

flowchart TB
    subgraph Docker["Docker Compose Network"]
        subgraph EX["execution service"]
            EC[Execution Client<br/>Reth / Geth / Nethermind]
        end
        subgraph ND["node service"]
            CC[Consensus Client<br/>op-node / base-consensus]
        end
    end
    CC -->|"Engine API<br/>port 8551<br/>JWT auth"| EC
    ND -->|"depends_on"| EX
    Internet -->|"P2P :30303"| EC
    Internet -->|"P2P :9222"| CC
    User -->|"RPC :8545"| EC

Both services use the same Dockerfile, selected at build time by the CLIENT environment variable:

# docker-compose.yml line 5
dockerfile: ${CLIENT:-geth}/Dockerfile

This means docker compose build produces two identical images. The differentiation happens at runtime through the command override — the execution service runs bash ./execution-entrypoint while the node service runs bash ./consensus-entrypoint.
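The shape of that wiring can be sketched as a hypothetical compose fragment — the service names follow the diagram above, but the exact keys and paths in the real file may differ:

```yaml
services:
  execution:
    build:
      context: .
      dockerfile: ${CLIENT:-geth}/Dockerfile   # same Dockerfile for both services
    command: ["bash", "./execution-entrypoint"]
  node:
    build:
      context: .
      dockerfile: ${CLIENT:-geth}/Dockerfile
    command: ["bash", "./consensus-entrypoint"]
    depends_on:
      - execution
```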

The depends_on directive on line 26 ensures Docker starts the execution service first. But Docker's depends_on only waits for the container to start, not for the application inside it to be ready. The actual readiness coordination happens in the consensus entrypoint scripts, which poll the Engine API — a pattern we'll explore in detail in Part 2.
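That readiness pattern can be sketched as a generic polling helper — a simplification; the real entrypoints' probe command, retry count, and intervals differ:

```shell
# Poll a probe command until it succeeds or retries are exhausted.
wait_for() {
  retries="$1"; shift
  i=0
  until "$@"; do
    i=$((i + 1))
    if [ "$i" -ge "$retries" ]; then
      return 1
    fi
    sleep 0.1
  done
}

# The real probe would be an authenticated request to the Engine API, e.g.
#   wait_for 300 curl -sf http://execution:8551    (flags assumed)
wait_for 5 true && echo "ready"    # prints "ready"
```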

Why supervisord.conf Exists But Doesn't Run

You'll notice each Dockerfile ends with CMD ["/usr/bin/supervisord"], and there's a supervisord.conf that defines both processes. This is the fallback for running the image without Docker Compose: a single container in which supervisord manages both the execution and consensus processes.

When you use docker compose up, the command directives in docker-compose.yml override the CMD in each Dockerfile. The execution container runs only the execution entrypoint; the node container runs only the consensus entrypoint. Supervisord is never invoked.

flowchart TD
    A{"How are you<br/>running it?"} -->|"docker compose up"| B["command overrides CMD<br/>Each service runs one process"]
    A -->|"docker run"| C["CMD runs supervisord<br/>Both processes in one container"]
    B --> D["execution-entrypoint"]
    B --> E["consensus-entrypoint"]
    C --> F["supervisord manages both<br/>execution + consensus"]

This dual-mode design is a pragmatic choice. Docker Compose gives you better logging, independent restarts, and resource isolation. But some deployment environments don't have Compose — and a single container with supervisord works just fine.
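For the standalone mode, supervisord.conf would look roughly like this — a sketch only; the real file's program names and options may differ:

```ini
; Hypothetical shape of supervisord.conf (section and program names assumed)
[supervisord]
nodaemon=true

[program:execution]
command=bash ./execution-entrypoint
autorestart=true

[program:consensus]
command=bash ./consensus-entrypoint
autorestart=true
```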

The Three-Layer Configuration System

Configuration flows through three layers, each overriding the last. Understanding this layering is essential for debugging "why is my node connecting to the wrong endpoint?"

flowchart TD
    A[".env<br/>CLIENT=geth<br/>USE_BASE_CONSENSUS=true<br/>HOST_DATA_DIR=./geth-data"] -->|"lowest priority"| D["Effective Config"]
    B[".env.mainnet / .env.sepolia<br/>Network endpoints, bootnodes,<br/>cache settings, L1 config"] -->|"medium priority<br/>(loaded via env_file)"| D
    C["Shell environment<br/>export CLIENT=reth"] -->|"highest priority"| D

Layer 1: .env — The default configuration is just three lines:

CLIENT=${CLIENT:-geth}
HOST_DATA_DIR=./${CLIENT}-data
USE_BASE_CONSENSUS=true

This sets the execution client to Geth by default and enables base-consensus as the default consensus client.

Layer 2: .env.mainnet / .env.sepolia — These are loaded via the env_file directive in docker-compose.yml line 19. The NETWORK_ENV variable defaults to .env.mainnet. To switch to Sepolia testnet, set NETWORK_ENV=.env.sepolia.

Layer 3: Shell environment — Any variable set in your shell before running docker compose up takes precedence over both files.
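The precedence rules come down to plain shell parameter expansion. A minimal demonstration (variable names from the repo; the `unset` lines exist only to make the demo deterministic):

```shell
# Layer 1 defaults, as the files express them: use the variable if set, else fall back.
unset CLIENT NETWORK_ENV
echo "${CLIENT:-geth}"                 # prints "geth"
echo "${NETWORK_ENV:-.env.mainnet}"    # prints ".env.mainnet"

# Layer 3: a value set in the shell wins over the file default.
CLIENT=reth
NETWORK_ENV=.env.sepolia
echo "${CLIENT:-geth}"                 # prints "reth"
echo "${NETWORK_ENV:-.env.mainnet}"    # prints ".env.sepolia"
```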

The Dual Namespace Pattern

One of the more confusing aspects of the network config files is the duplicate variable namespaces. Look at .env.mainnet:

OP_NODE_L2_ENGINE_RPC=http://execution:8551     # for op-node
BASE_NODE_L2_ENGINE_RPC=ws://execution:8551     # for base-consensus

The same Engine API endpoint is defined twice under different prefixes. This exists because op-node reads OP_NODE_* variables while base-consensus reads BASE_NODE_* variables. Both clients need the same information, but they use different environment variable conventions. The OP_NODE_* namespace is the legacy convention from Optimism; BASE_NODE_* is the newer Base-specific convention.
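Hypothetically, the duplication could be generated from one canonical value. The repo simply writes both lines out literally, but this sketch makes the relationship explicit (the `ENGINE_HOST` helper variable is invented; the endpoint values come from .env.mainnet):

```shell
# One endpoint, exported under both naming conventions.
ENGINE_HOST="execution:8551"                            # invented helper variable
export OP_NODE_L2_ENGINE_RPC="http://${ENGINE_HOST}"    # read by op-node
export BASE_NODE_L2_ENGINE_RPC="ws://${ENGINE_HOST}"    # read by base-consensus

echo "$OP_NODE_L2_ENGINE_RPC"      # prints "http://execution:8551"
echo "$BASE_NODE_L2_ENGINE_RPC"    # prints "ws://execution:8551"
```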

Tip: The hardcoded JWT secret (688f5d737bad920bdfb2fc2f488d6b6209eebda1dae949a8de91398d932c517a) in the network config files might look like a security concern, but it's deliberate. The Engine API is only exposed within the Docker network — it's not reachable from the host. The JWT exists to satisfy the Engine API spec, not to protect against external attackers.

Version Pinning: From versions.json to Docker Builds

The versions.json file is the single source of truth for every upstream dependency:

{
  "base_reth_node": {
    "tag": "v0.7.0",
    "commit": "2b0d89d4267ae7b2893e1719d2ba026074e4a8b8",
    "owner": "base",
    "repo": "base",
    "tracking": "release"
  },
  "op_geth": {
    "tag": "v1.101702.0",
    "commit": "d0734fd5f44234cde3b0a7c4beb1256fc6feedef",
    "owner": "ethereum-optimism",
    "repo": "op-geth",
    "tracking": "release"
  }
}

Each entry captures a tag (the Git tag to clone), a commit (the expected SHA at that tag), and a tracking mode that controls how the automated updater finds new versions. We'll explore the updater in depth in Part 4.

The critical insight is the two-step verification that happens during Docker builds. Look at the Geth Dockerfile's clone step on lines 9-11:

RUN . /tmp/versions.env && git clone $OP_GETH_REPO --branch $OP_GETH_TAG --single-branch . && \
    git switch -c branch-$OP_GETH_TAG && \
    bash -c '[ "$(git rev-parse HEAD)" = "$OP_GETH_COMMIT" ]'

First, it clones at the specified tag. Then it verifies that HEAD matches the expected commit hash. If an upstream maintainer were to move a tag to point at a different commit (a "tag-moving attack"), the build would fail. This is supply chain security implemented at the Docker layer.
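The same check can be run outside Docker. This sketch factors the comparison into a function — the names are mine; the Dockerfile inlines the check:

```shell
# Compare the cloned HEAD against the pinned commit; fail the build on mismatch.
verify_pin() {
  actual="$1"; expected="$2"
  if [ "$actual" = "$expected" ]; then
    echo "pin verified"
  else
    echo "commit mismatch: got $actual, expected $expected" >&2
    return 1
  fi
}

# In the Dockerfile, $actual would be "$(git rev-parse HEAD)" after the clone.
verify_pin d0734fd5f44234cde3b0a7c4beb1256fc6feedef \
           d0734fd5f44234cde3b0a7c4beb1256fc6feedef    # prints "pin verified"
```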

flowchart LR
    VJ[versions.json] -->|"dependency_updater"| VE[versions.env]
    VE -->|"COPY into build"| DF[Dockerfile]
    DF -->|"1. git clone at tag"| REPO[Upstream Repo]
    DF -->|"2. verify commit hash"| CHECK{SHA matches?}
    CHECK -->|Yes| BUILD[Build binary]
    CHECK -->|No| FAIL[Build fails]

The versions.env file is a generated artifact — shell-sourceable exports that Dockerfiles can source during build. Each dependency gets three variables following a consistent naming convention:

export OP_GETH_TAG=v1.101702.0
export OP_GETH_COMMIT=d0734fd5f44234cde3b0a7c4beb1256fc6feedef
export OP_GETH_REPO=https://github.com/ethereum-optimism/op-geth.git

The naming convention is the uppercased dependency key (op_geth → OP_GETH) concatenated with the _TAG, _COMMIT, and _REPO suffixes. This file is checked into the repository and regenerated automatically whenever versions.json changes.
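The transformation from JSON key to export prefix is mechanical. A minimal sketch of the rule — the actual generator is the Go dependency_updater, not this script:

```shell
# Derive the export prefix from a versions.json key: op_geth -> OP_GETH.
key="op_geth"
prefix="$(printf '%s' "$key" | tr '[:lower:]' '[:upper:]')"

echo "export ${prefix}_TAG=v1.101702.0"
echo "export ${prefix}_COMMIT=d0734fd5f44234cde3b0a7c4beb1256fc6feedef"
echo "export ${prefix}_REPO=https://github.com/ethereum-optimism/op-geth.git"
```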

How the Pieces Connect

Now that we've mapped the individual components, here's how they fit together end-to-end:

flowchart TD
    USER["Operator runs:<br/>CLIENT=reth docker compose up"] --> DC[docker-compose.yml]
    DC -->|"builds using"| RDF[reth/Dockerfile]
    RDF -->|"reads versions from"| VE[versions.env]
    VE -->|"generated from"| VJ[versions.json]
    RDF -->|"clones & builds"| RETH[base-reth-node binary]
    RDF -->|"clones & builds"| BC[base-consensus binary]
    RDF -->|"clones & builds"| OPN[op-node binary]
    DC -->|"starts execution service"| EX["reth-entrypoint<br/>→ base-reth-node"]
    DC -->|"starts node service"| ND["consensus-entrypoint<br/>→ base-consensus"]
    ND -->|"Engine API + JWT"| EX

The flow is: user selects a client → Docker Compose triggers the appropriate Dockerfile → the Dockerfile sources versions.env to know which upstream repos and commits to clone → binaries are built from source → entrypoint scripts handle runtime configuration and inter-service coordination.

What's Next

We've established the architectural skeleton — two services, three configuration layers, and a version-pinned build pipeline. But the skeleton doesn't explain what happens when those containers actually start. The startup sequence involves polling loops, public IP detection, JWT secret writing, and in Reth's case, a multi-stage initialization that can take up to six hours.

In Part 2, we'll trace every step from docker compose up through a fully syncing node, including the consensus dispatcher pattern and Reth's historical proofs initialization sub-sequence.