Read OSS

esbuild Architecture: How a Go Bundler Achieves Extreme Speed

Intermediate

Prerequisites

  • Basic understanding of what JavaScript bundlers do (entry points → output bundles)
  • Familiarity with Go basics (packages, goroutines, channels)

When Evan Wallace released esbuild in 2020, it rewrote expectations of what a JavaScript bundler could achieve. Builds that took minutes in webpack completed in milliseconds. The secret wasn't a single clever algorithm — it was a holistic set of architectural decisions, from the choice of language to the data structure layout to turning off the garbage collector for single-file builds.

This article is the first in a six-part series. We'll start with a bird's-eye view of the codebase, trace the three entry points into the core pipeline, and examine the key performance decisions that make esbuild fast. By the end, you'll have a mental model of how every part fits together — the foundation for the deep dives that follow.

Project Structure and Package Organization

esbuild's repository is organized around a clear separation of concerns. The Go code — roughly 87,000 lines in internal/ — does all the real work. Everything else is glue.

| Directory | Purpose | Language |
| --- | --- | --- |
| cmd/esbuild/ | CLI and IPC service entry points | Go |
| pkg/api/ | Public Go API (Build, Transform, Context) | Go |
| pkg/cli/ | CLI argument parsing | Go |
| internal/ | Core implementation: lexer, parser, linker, printer | Go |
| lib/ | JavaScript/TypeScript API wrappers | TypeScript |
| npm/ | Platform-specific npm distribution packages | JS/JSON |
| compat-table/ | Browser compatibility data generation | TypeScript |

The internal/ directory is where you'll spend most of your time reading code. Its sub-packages map directly to bundler concepts:

| Package | Responsibility |
| --- | --- |
| js_lexer | Streaming tokenization |
| js_parser | Two-pass parsing, scope analysis, syntax lowering |
| css_parser | CSS parsing (analogous structure) |
| bundler | Module graph construction (scan phase) |
| linker | Tree shaking, code splitting, chunk generation |
| resolver | Module resolution (node_modules, tsconfig paths, etc.) |
| js_printer | JavaScript code generation |
| renamer | Identifier renaming and minification |
| runtime | Embedded helper functions (__commonJS, __esm, etc.) |

graph TD
    subgraph "Entry Points"
        CLI["cmd/esbuild/main.go"]
        GoAPI["pkg/api/api.go"]
        JSAPI["lib/npm/node.ts"]
    end

    subgraph "Core Pipeline (internal/)"
        Lexer["js_lexer"]
        Parser["js_parser"]
        Bundler["bundler"]
        Resolver["resolver"]
        Linker["linker"]
        Printer["js_printer"]
        Renamer["renamer"]
        Runtime["runtime"]
    end

    CLI --> GoAPI
    JSAPI -->|IPC| CLI
    GoAPI --> Bundler
    Bundler --> Lexer
    Bundler --> Parser
    Bundler --> Resolver
    Bundler --> Linker
    Linker --> Printer
    Linker --> Renamer
    Linker --> Runtime

Tip: The internal/ prefix is significant in Go — it prevents external packages from importing these packages. This gives the esbuild team full freedom to change internal APIs without breaking consumers.

Entry Points: CLI, Go API, and JavaScript API

esbuild exposes three entry points, but they all converge on the same internal pipeline. Understanding this convergence is key to navigating the codebase.

The CLI Entry Point

The CLI's main() function in cmd/esbuild/main.go#L159-L386 does a pre-scan over arguments before any real work begins. It strips special flags like --service, --heap, --trace, and --cpuprofile from the argument list, then delegates to cli.Run():

// From cmd/esbuild/main.go
for _, arg := range osArgs {
    switch {
    case strings.HasPrefix(arg, "--service="):
        hostVersion := arg[len("--service="):]
        isRunningService = true
        // Version validation...
    case arg == "--watch" || arg == "--watch=true":
        isWatch = true
    // ...
    }
}

The --service flag is particularly interesting: it transforms esbuild from a one-shot CLI tool into a long-lived IPC server, which is how the JavaScript API communicates with the Go binary.

The Go API

The public Go API lives in pkg/api/api.go#L388-L407. The Build() function is the primary entry point:

func Build(options BuildOptions) BuildResult {
    start := time.Now()
    ctx, errors := contextImpl(options)
    if ctx == nil {
        return BuildResult{Errors: errors}
    }
    result := ctx.Rebuild()
    ctx.Dispose()
    return result
}

Notice the contextImpl() → Rebuild() → Dispose() pattern. The Context API (used for watch mode and incremental builds) exposes this same lifecycle but lets users control when Rebuild() and Dispose() are called.
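That shared lifecycle can be sketched with simplified stand-in types — buildContext and buildResult below are hypothetical placeholders, not esbuild's actual internalContext or BuildResult:

```go
package main

import "fmt"

// buildResult and buildContext are simplified stand-ins for esbuild's
// BuildResult and internalContext types.
type buildResult struct{ OutputFiles []string }

type buildContext struct{ disposed bool }

// Rebuild would run ScanBundle + Compile; here it returns a placeholder.
func (c *buildContext) Rebuild() buildResult {
	return buildResult{OutputFiles: []string{"out.js"}}
}

// Dispose releases resources held for incremental rebuilds.
func (c *buildContext) Dispose() { c.disposed = true }

// build mirrors the shape of pkg/api's Build():
// create a context, rebuild once, dispose.
func build() buildResult {
	ctx := &buildContext{}
	result := ctx.Rebuild()
	ctx.Dispose()
	return result
}

func main() {
	// One-shot build: the lifecycle is managed internally.
	fmt.Println(build().OutputFiles)

	// Watch-mode style: the caller keeps the context alive across rebuilds.
	ctx := &buildContext{}
	_ = ctx.Rebuild() // first build
	_ = ctx.Rebuild() // incremental rebuild after a file change
	ctx.Dispose()
}
```

The one-shot path and the watch path differ only in who owns the context — which is exactly why Build() can be a four-line wrapper.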

The JavaScript API

The JavaScript API doesn't execute any bundling logic directly. Instead, lib/npm/node.ts spawns the Go binary as a child process with the --service=<version> flag and communicates via a binary IPC protocol over stdin/stdout. We'll cover this protocol in detail in Part 5.

flowchart LR
    A["JS: build()"] -->|spawn child process| B["Go binary --service=0.28.0"]
    B -->|binary IPC over stdin/stdout| A
    B --> C["contextImpl()"]
    C --> D["ScanBundle()"]
    D --> E["Compile()"]

The Two-Phase Pipeline: Scan and Compile

Every build in esbuild follows a strict two-phase architecture, documented right at the top of the bundler package in internal/bundler/bundler.go#L1-L7:

// The bundler is the core of the "build" and "transform" API calls. Each
// operation has two phases. The first phase scans the module graph, and is
// represented by the "ScanBundle" function. The second phase generates the
// output files from the module graph, and is implemented by the "Compile"
// function.

Phase 1: ScanBundle

ScanBundle() constructs the module graph by:

  1. Parsing the runtime file first (always source index 0)
  2. Processing injected files
  3. Adding entry points
  4. Running scanAllDependencies() — a fan-out/fan-in loop that parses files in parallel goroutines while resolving imports sequentially

The output is an immutable Bundle struct containing all parsed files and their dependency relationships.

Phase 2: Compile

Compile() takes the immutable module graph and produces output files. It delegates to the linker, which performs tree shaking, code splitting, symbol renaming, and code generation.

flowchart TD
    EP["Entry Points"] --> SB["ScanBundle()"]
    SB --> RT["Parse runtime (index 0)"]
    RT --> INJ["Process injected files"]
    INJ --> ADD["Add entry points"]
    ADD --> SCAN["scanAllDependencies()"]
    SCAN --> BUNDLE["Immutable Bundle"]
    BUNDLE --> COMP["Compile()"]
    COMP --> LINK["Link()"]
    LINK --> OUT["Output Files"]

A critical design choice: the scan phase output is immutable and shared between linker invocations. This is what enables watch mode — when a file changes, only the affected files are re-scanned while unchanged results are cached.
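A minimal sketch of that caching idea, with a hypothetical parsedFile type standing in for esbuild's per-file scan result (the real implementation tracks much more, but the reuse pattern is the same):

```go
package main

import "fmt"

// parsedFile stands in for esbuild's per-file scan result (AST + imports).
// Scan results are treated as immutable once produced.
type parsedFile struct {
	path    string
	version int // bumped each time the file is actually re-parsed
}

type cache struct {
	files map[string]*parsedFile
}

// scan re-parses only the paths in `changed`; everything else is reused
// from the previous build, which is what makes incremental rebuilds cheap.
func (c *cache) scan(paths []string, changed map[string]bool) []*parsedFile {
	out := make([]*parsedFile, 0, len(paths))
	for _, p := range paths {
		if f, ok := c.files[p]; ok && !changed[p] {
			out = append(out, f) // cache hit: reuse the immutable result
			continue
		}
		f := &parsedFile{path: p}
		if old, ok := c.files[p]; ok {
			f.version = old.version + 1
		}
		c.files[p] = f
		out = append(out, f)
	}
	return out
}

func main() {
	c := &cache{files: map[string]*parsedFile{}}
	paths := []string{"entry.js", "a.js", "b.js"}

	c.scan(paths, nil) // initial build parses everything
	// Second build: only a.js changed, so only a.js is re-parsed.
	for _, f := range c.scan(paths, map[string]bool{"a.js": true}) {
		fmt.Println(f.path, f.version)
	}
}
```

Immutability is what makes this safe: because nothing mutates a cached result after the scan phase, reusing it across rebuilds needs no defensive copying.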

Performance by Design: Key Decisions

esbuild's speed comes from several deliberate architectural decisions, each traceable to specific code.

Decision 1: Disable the Garbage Collector

For single-entry-point, non-watch builds, esbuild disables Go's garbage collector entirely. From cmd/esbuild/main.go#L309-L323:

if !isWatch && !isServe {
    nonFlagCount := 0
    for _, arg := range osArgs {
        if !strings.HasPrefix(arg, "-") {
            nonFlagCount++
        }
    }
    if nonFlagCount <= 1 {
        debug.SetGCPercent(-1)
    }
}

The logic: if we're going to allocate a bunch of memory and then exit, why waste time collecting garbage? This is a measurable speedup for one-shot builds. The guard for nonFlagCount <= 1 prevents excessive memory usage when processing many entry points.

Decision 2: Streaming Lexer

Unlike traditional compilers that lex the entire file into a token array before parsing, esbuild's lexer is called on-demand by the parser. From internal/js_lexer/js_lexer.go#L1-L14:

// The lexer converts a source file to a stream of tokens. Unlike many
// compilers, esbuild does not run the lexer to completion before the parser is
// started. Instead, the lexer is called repeatedly by the parser as the parser
// parses the file. This is because many tokens are context-sensitive and need
// high-level information from the parser.

This isn't just about memory savings — JavaScript has context-sensitive tokens (regex literals vs. division operators, JSX elements) that require parser state to disambiguate.
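To see why the pull model matters, here is a toy lexer in the same spirit: the parser requests one token at a time and passes down whether a `/` may start a regex at this position. Everything here (the token kinds, the lexer itself) is a simplified illustration, not esbuild's actual js_lexer API:

```go
package main

import "fmt"

// Token kinds for a toy lexer; the real lexer has many more.
type tokenKind int

const (
	tIdent tokenKind = iota
	tSlash           // division operator
	tRegex           // regular expression literal
	tEOF
)

type token struct {
	kind tokenKind
	text string
}

// lexer scans one token at a time; it never builds a token array.
type lexer struct {
	src string
	pos int
}

// next returns the next token. The parser tells the lexer whether a `/`
// can start a regex here — that context lives in the parser, not the lexer.
func (l *lexer) next(regexAllowed bool) token {
	for l.pos < len(l.src) && l.src[l.pos] == ' ' {
		l.pos++
	}
	if l.pos >= len(l.src) {
		return token{tEOF, ""}
	}
	if l.src[l.pos] == '/' {
		if regexAllowed {
			// Consume through the closing `/` as a regex literal.
			end := l.pos + 1
			for end < len(l.src) && l.src[end] != '/' {
				end++
			}
			tok := token{tRegex, l.src[l.pos : end+1]}
			l.pos = end + 1
			return tok
		}
		l.pos++
		return token{tSlash, "/"}
	}
	// Identifier: anything up to a space or slash, for brevity.
	end := l.pos
	for end < len(l.src) && l.src[end] != ' ' && l.src[end] != '/' {
		end++
	}
	tok := token{tIdent, l.src[l.pos:end]}
	l.pos = end
	return tok
}

func main() {
	// The same `/` lexes differently depending on parser context:
	// after an identifier it must be division...
	l := &lexer{src: "a / b"}
	l.next(true)
	fmt.Println(l.next(false).kind == tSlash)

	// ...but at the start of an expression it starts a regex.
	l2 := &lexer{src: "/ab/"}
	fmt.Println(l2.next(true).kind == tRegex)
}
```

A batch lexer would have to guess at the `/` before the parser knows what it is; the pull model lets the parser decide.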

Decision 3: Parallel Parsing with Serial Graph Construction

The scanner parses files in parallel goroutines but constructs the dependency graph on a single goroutine. From scanAllDependencies():

func (s *scanner) scanAllDependencies() {
    for s.remaining > 0 {
        result := <-s.resultChannel
        s.remaining--
        // ... resolve imports, spawn new parse goroutines
    }
}

This fan-out/fan-in pattern maximizes CPU utilization during parsing (the most expensive phase) while avoiding the complexity of concurrent graph mutations.

sequenceDiagram
    participant C as Coordinator
    participant G1 as Goroutine 1
    participant G2 as Goroutine 2
    participant G3 as Goroutine 3

    C->>G1: Parse entry.js
    G1-->>C: AST + imports [./a, ./b]
    C->>G2: Parse a.js
    C->>G3: Parse b.js
    G2-->>C: AST + imports [./c]
    G3-->>C: AST + imports []
    C->>G2: Parse c.js
    G2-->>C: AST + imports []
    Note over C: Graph complete

Decision 4: Two-Pass Parser

The parser does everything in exactly two passes — parsing and visiting — to minimize full-tree traversals. From internal/js_parser/js_parser.go#L22-L35:

// This parser does two passes:
//
// 1. Parse the source into an AST, create the scope tree, and declare symbols.
//
// 2. Visit each node in the AST, bind identifiers to declared symbols, do
//    constant folding, substitute compile-time variable definitions, and
//    lower certain syntactic constructs as appropriate given the language target.
//
// So many things have been put in so few passes because we want to minimize
// the number of full-tree passes to improve performance.

Most bundlers make many passes over the AST — one for scope analysis, one for constant folding, one for syntax lowering, etc. esbuild packs all of this into two passes, which dramatically reduces memory traffic.
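As a miniature of the second pass, here is a single traversal that both binds identifiers and folds constants — a toy AST for illustration, not esbuild's actual node types:

```go
package main

import "fmt"

// A tiny expression AST: integer constants, identifiers, and additions.
type expr interface{}

type num struct{ val int }
type ident struct{ name string }
type add struct{ left, right expr }

// Pass 1 in esbuild parses source text into an AST and declares symbols in
// a scope tree. Here we start from a hand-built AST and a symbol table.
var symbols = map[string]int{"x": 2}

// Pass 2 visits each node once, binding identifiers to their declared
// symbols and constant-folding as it goes — one traversal does both jobs.
func visit(e expr) expr {
	switch e := e.(type) {
	case *ident:
		if v, ok := symbols[e.name]; ok {
			return &num{v} // bind + inline the compile-time value
		}
		return e
	case *add:
		l, r := visit(e.left), visit(e.right)
		if ln, ok := l.(*num); ok { // fold constants in the same pass
			if rn, ok := r.(*num); ok {
				return &num{ln.val + rn.val}
			}
		}
		return &add{l, r}
	}
	return e
}

func main() {
	// x + (1 + 2) with x = 2 folds to a single constant in one visit.
	tree := &add{&ident{"x"}, &add{&num{1}, &num{2}}}
	fmt.Println(visit(tree).(*num).val) // 5
}
```

Folding during the binding pass means the subtree is only in cache once; separate passes would walk (and fault in) the whole tree again for each transformation.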

How the Pieces Connect: contextImpl to Output

To tie everything together, let's trace a Build() call through the entire system:

flowchart TD
    Build["Build(options)"] --> CI["contextImpl(options)"]
    CI --> VBO["validateBuildOptions()"]
    CI --> LP["loadPlugins()"]
    CI --> IC["internalContext{}"]
    IC --> RB["Rebuild()"]
    RB --> SB["ScanBundle()"]
    SB --> ParseRT["Parse runtime"]
    SB --> ParseEP["Parse entry points"]
    SB --> ScanDeps["scanAllDependencies()"]
    RB --> Compile["Compile()"]
    Compile --> Link["Link()"]
    Link --> Clone["CloneLinkerGraph()"]
    Link --> SIE["scanImportsAndExports()"]
    Link --> TS["treeShakingAndCodeSplitting()"]
    Link --> CC["computeChunks()"]
    Link --> Gen["generateChunksInParallel()"]
    Gen --> OUT["[]OutputFile"]

The contextImpl() function at pkg/api/api_impl.go#L881-L955 creates the build context: it validates options, loads plugins, and constructs the internalContext that drives everything. The Rebuild() method on this context calls ScanBundle() followed by Compile(), producing the final output files.

What's Coming Next

In Part 2, we'll dive into the scan phase — the concurrent module graph construction that is the heart of esbuild's performance. We'll examine the streaming lexer's state machine, the two-pass parser's scope tree, the scanner's fan-out/fan-in concurrency model, and the resolver's PathPair design for handling the ambiguity between module and main fields in package.json.