From Module Graph to Output Files: Rollup's Chunk Generation and Rendering

Advanced

Prerequisites

  • Article 1: Rollup's Architecture
  • Article 2: Module Graph and Loading
  • Article 3: Tree-Shaking and the AST System
  • Understanding of code splitting and dynamic imports


The build phase gave us a fully analysed module graph with included/excluded markings on every AST node. Now the generate phase must answer a new set of questions: Which modules belong in which output files? What happens when two merged modules declare variables with the same name? How do you compute a content hash for a chunk when its content includes the filename of another chunk whose hash depends on the first chunk?

This article traces the generate phase end-to-end, from Bundle.generate() through the final file emission.

Three Chunk Modes: inlineDynamicImports, preserveModules, and Default Code Splitting

The generate phase starts in Bundle.generateChunks(). The very first decision is which chunking strategy to use, determined at lines 183-194:

const chunkDefinitions = inlineDynamicImports
    ? [{ alias: null, modules: includedModules }]               // one chunk
    : preserveModules
        ? includedModules.map(module => ({ alias: null, modules: [module] }))  // one per module
        : getChunkAssignments(/* ... */);                        // smart splitting

inlineDynamicImports collapses everything into a single bundle. Because the dynamically imported modules live in the same file, import() calls no longer need to fetch anything; they are rewritten to resolve immediately with the inlined module's namespace. This is the simplest mode.

preserveModules creates one output file per input module. The directory structure is preserved relative to a computed inputBase. This is useful for library builds where consumers want to import individual files.

Default mode calls getChunkAssignments(), which implements the sophisticated BigInt bitmask algorithm.
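From a user's perspective, all three strategies are selected through output options. A minimal config sketch (inlineDynamicImports and preserveModules are real Rollup output options; the file paths are illustrative):

```typescript
// rollup.config.ts sketch: choosing a chunking mode via output options.
export default {
    input: 'src/main.ts',
    output: {
        dir: 'dist',
        format: 'es',
        inlineDynamicImports: true // mode 1: everything in one bundle
        // preserveModules: true,  // mode 2: one output file per module
        // (set neither to get the default code-splitting algorithm)
    }
};
```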

The Chunk Assignment Algorithm: BigInt Bitmasks as Module Sets

The getChunkAssignments() function is preceded by a 110-line design comment at lines 41-151 that explains the algorithm in detail. Here's the core idea:

flowchart TD
    A["Assign each entry a bit index"] --> B["For each module, compute<br/>dependent entries as BigInt bitmask"]
    B --> C["Group modules with<br/>identical bitmasks into chunks"]
    C --> D["Optimize: remove dynamic entry<br/>from bitmask if already loaded"]
    D --> E["Re-group after optimization"]
    E --> F["Merge small chunks<br/>if safe to do so"]

Each entry point (static or dynamic) gets a numerical index. A module's "dependent entries" is the set of entries that can reach it via static imports. This set is represented as a BigInt where each bit corresponds to an entry index. Two modules that have the exact same dependent entry set will always be loaded together, so they belong in the same chunk.

The BigInt representation reduces each set operation to a single bitwise expression (linear in the number of entry bits, but with a tiny constant compared to manipulating Set objects):

  • Union: a | b
  • Intersection: a & b
  • Subset check: (a & b) === a

This is critical for the dynamic import optimization that follows.
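A few lines of TypeScript make the representation concrete (illustrative only; the names are not Rollup's):

```typescript
// Each entry point owns one bit; a module's dependent-entry set is a BigInt.
const toBit = (entryIndex: number): bigint => 1n << BigInt(entryIndex);

// Module x is reachable from entries 0 and 2; module y from entries 0, 1 and 2.
const x = toBit(0) | toBit(2);            // 0b101
const y = toBit(0) | toBit(1) | toBit(2); // 0b111

const union = x | y;        // entries that reach x or y
const intersection = x & y; // entries that reach both
const isSubset = (a: bigint, b: bigint): boolean => (a & b) === a;

console.log(isSubset(x, y)); // true: every entry that reaches x also reaches y
```

Two modules whose bitmasks compare equal with === go into the same chunk; no set iteration is ever needed.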

Dynamic Import Optimization: Avoiding Redundant Chunks

The 110-line comment tells the full story, but the key optimization is this: when a dynamic import fires, some modules are already guaranteed to be in memory (because the importing chunk was already loaded). If a module's dependent entries are a subset of the already-loaded entries, it doesn't need a separate chunk.

Consider: entry A imports B statically, and A dynamically imports C, which also imports B. Naively, B would be in its own chunk (dependent entries: {A, C}). But since C can only be loaded after A, and A already includes B, there's no point creating a separate chunk for B. The optimization removes C from B's dependent entry set, merging B into A's chunk.

The algorithm handles the general case where multiple dynamic importers exist. It computes the "dynamically dependent entries" by taking the union of all dynamic importers' dependent entry sets, then intersects the already-loaded modules across all importers to find what's guaranteed to be in memory.
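The A/B/C scenario above can be replayed in a few lines (a sketch under assumed names, not Rollup's code):

```typescript
// Entry A gets bit 0, dynamic entry C gets bit 1.
const A = 1n << 0n;
const C = 1n << 1n;

// B is statically reachable from both A and C.
let bEntries = A | C;

// When C fires, A's chunk is guaranteed to be in memory already.
const alreadyLoaded = A;

// If everything B needs besides C is already covered by loaded entries,
// C can be removed from B's dependent-entry set.
const withoutC = bEntries & ~C;
if ((withoutC & alreadyLoaded) === withoutC) {
    bEntries = withoutC;
}

console.log(bEntries === A); // true: B now shares A's bitmask and joins A's chunk
```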

For very large projects, a naive implementation of this has O(n³) complexity relative to dynamic entries. Rollup avoids this by using the BigInt representation and processing changes incrementally — only re-examining dynamic entries whose inputs changed in the previous iteration.

Small Chunk Merging with Side-Effect Safety

After the initial assignment, some chunks may be very small — perhaps containing a single utility function. The experimentalMinChunkSize option triggers merging of these small chunks. The merging happens with a critical constraint: a chunk can only be merged into a dependent if doing so doesn't change the observable execution order.

Consider module effects.js that logs to the console, imported by both chunks A and C. If we merge effects.js into chunk A, then when chunk C loads, effects.js has already executed (it was in A). This changes the observable timing of the console log relative to C's other code. Rollup tracks whether modules have side effects and refuses merges that would alter execution semantics.

The ChunkDescription interface tracks both containedAtoms (which initial chunks are inside this chunk) and correlatedAtoms (which chunks are guaranteed loaded alongside), plus a pure flag for side-effect-free chunks.
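A hypothetical sketch of the safety check these fields enable (the field names follow the interface above, but the merge rule itself is heavily simplified):

```typescript
interface ChunkDescription {
    containedAtoms: bigint;  // atomic chunks bundled inside this chunk
    correlatedAtoms: bigint; // atoms guaranteed loaded whenever this chunk is
    pure: boolean;           // true if the chunk has no observable side effects
}

// Simplified rule: a pure chunk can merge anywhere; a side-effectful chunk may
// only merge into a target that is always loaded alongside it, so its effects
// still run at the same observable time.
function canMergeInto(source: ChunkDescription, target: ChunkDescription): boolean {
    return source.pure ||
        (source.containedAtoms & target.correlatedAtoms) === source.containedAtoms;
}

const pureUtil   = { containedAtoms: 0b001n, correlatedAtoms: 0n,     pure: true };
const effectful  = { containedAtoms: 0b010n, correlatedAtoms: 0n,     pure: false };
const correlated = { containedAtoms: 0b100n, correlatedAtoms: 0b010n, pure: false };
const unrelated  = { containedAtoms: 0b100n, correlatedAtoms: 0n,     pure: false };

console.log(canMergeInto(pureUtil, unrelated));   // true
console.log(canMergeInto(effectful, correlated)); // true
console.log(canMergeInto(effectful, unrelated));  // false
```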

Variable Deconfliction: No Name Collisions in Merged Output

When multiple modules are merged into a single chunk, they may declare variables with the same name. The deconflictChunk() function handles this:

The algorithm varies by output format — ES and SystemJS handle imports differently than CJS/IIFE/UMD. For ES format, imported names don't need deconfliction because they're in their own syntactic space. For CJS, imported names become local variables and must be deconflicted like any other.

The deconfliction strategy builds a usedNames set, seeded with reserved identifiers like Object, Promise, module, exports, and require. Then for each module in the chunk, it walks every scope's variables and calls getSafeName() — which appends $1, $2, etc. until a unique name is found. The result is stored as variable.renderName, which is later used by Identifier.render().

Tip: The deconfliction uses Variable.forbiddenNames to prevent a variable from being renamed to a specific name. This is used to avoid conflicts with names that appear in the module's own code — if your module has a local variable called helper and imports another helper, the imported one can't be renamed back to helper.
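The naming loop itself is tiny. A sketch of the idea behind getSafeName (not Rollup's actual implementation, which differs in details):

```typescript
function getSafeName(
    baseName: string,
    usedNames: Set<string>,
    forbiddenNames: Set<string> | null
): string {
    let safeName = baseName;
    let count = 1;
    // Append $1, $2, ... until the name collides with nothing.
    while (usedNames.has(safeName) || forbiddenNames?.has(safeName)) {
        safeName = `${baseName}$${count++}`;
    }
    usedNames.add(safeName);
    return safeName;
}

const usedNames = new Set(['Object', 'Promise', 'helper']);
console.log(getSafeName('helper', usedNames, null)); // "helper$1"
console.log(getSafeName('helper', usedNames, null)); // "helper$2"
```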

The Rendering Pipeline: MagicString → Finaliser → renderChunk Hook

The rendering pipeline is orchestrated by renderChunks():

sequenceDiagram
    participant RC as renderChunks()
    participant C as Chunk
    participant AST as AST Nodes
    participant F as Finaliser
    participant PD as PluginDriver

    RC->>C: chunk.render()
    C->>C: resolve dynamic imports
    C->>C: deconflictChunk()
    loop For each module
        C->>AST: node.render(magicString)
        Note over AST: Only included nodes write output
    end
    C->>F: finaliser(magicString, options)
    Note over F: Wrap with format-specific imports/exports
    C-->>RC: ChunkRenderResult
    RC->>PD: hookReduceArg0('renderChunk', [code])
    Note over PD: Plugins can transform output
    RC->>RC: Compute content hashes
    RC->>RC: Replace hash placeholders

Each Chunk.render() call:

  1. Determines how dynamic imports and import.meta references should resolve
  2. Runs deconflictChunk() to assign unique render names
  3. Creates a MagicString.Bundle — a collection of module-level MagicString instances that preserves source positions for sourcemaps
  4. Calls .render() on each module's AST, which recursively calls .render() on included nodes
  5. Passes the bundled string to the format finaliser

After all chunks render, the renderChunk plugin hook fires, allowing plugins to post-process the output (minification, for example). Finally, content hashes are computed and placeholders are replaced.
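As a concrete example of that hook, here is a minimal plugin using renderChunk (a documented Rollup hook) to prepend a banner; returning a string replaces the chunk's code:

```typescript
const bannerPlugin = {
    name: 'add-banner',
    renderChunk(code: string, chunk: { fileName: string }): string {
        // Runs once per chunk, after the finaliser and before hashing.
        return `/* chunk: ${chunk.fileName} */\n${code}`;
    }
};

// Rollup invokes the hook itself; calling it directly shows the effect:
console.log(bannerPlugin.renderChunk('export default 1;', { fileName: 'main.js' }));
// /* chunk: main.js */
// export default 1;
```

Because the hook runs before hashing, transformations like this are reflected in the final content hashes.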

Hash Placeholders: Solving Circular Content-Based Hashing

Content-based file hashing has a chicken-and-egg problem. Chunk A imports Chunk B, so A's content includes B's filename. B's filename depends on B's content hash. But if B also imports A (directly or transitively), B's content includes A's filename, which depends on A's hash, which depends on A's content, which includes B's filename...

Rollup solves this with a placeholder system in src/utils/hashPlaceholders.ts:

flowchart TD
    A["Generate unique placeholder<br/>per chunk: !~{001}~"] --> B["Render all chunks<br/>with placeholders in filenames"]
    B --> C["Replace placeholders<br/>with default value for hashing"]
    C --> D["Compute content hash<br/>of each chunk"]
    D --> E["Track which chunks contain<br/>which placeholders"]
    E --> F["Resolve hashes in<br/>dependency order"]
    F --> G["Replace all placeholders<br/>with final hash values"]

Each chunk gets a unique placeholder string like !~{001}~ (using characters from a private-use range to minimize collision risk). During rendering, cross-chunk references use these placeholders instead of actual filenames.

After all chunks are rendered, Rollup:

  1. Replaces each placeholder with a default value (all zeros) to compute initial hashes
  2. Records which placeholders appear in which chunks
  3. Resolves final hashes in dependency order — chunks with no placeholder dependencies get their final hash first
  4. Replaces all placeholders with the resolved values

The replacePlaceholders() function uses a single regex pass to substitute all placeholders at once.
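The single-pass substitution is easy to sketch (the placeholder syntax follows the article; the function body is illustrative, not Rollup's source):

```typescript
// Matches placeholders like !~{001}~ anywhere in the rendered code.
const placeholderPattern = /!~\{\d+\}~/g;

function replacePlaceholders(code: string, hashes: Map<string, string>): string {
    // One regex pass: each placeholder is swapped for its resolved hash.
    return code.replace(placeholderPattern, match => hashes.get(match) ?? match);
}

const hashes = new Map([['!~{001}~', 'a1b2c3d4']]);
console.log(replacePlaceholders("import './chunk-!~{001}~.js';", hashes));
// import './chunk-a1b2c3d4.js';
```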

Format Finalisers: es, cjs, iife, amd, umd, system

The src/finalisers/index.ts file exports six finaliser functions that share a common interface:

export default { amd, cjs, es, iife, system, umd } as Record<string, Finaliser>;

Each receives a MagicStringBundle containing the rendered module bodies and wraps it with format-specific boilerplate. The es finaliser is the simplest — it prepends import declarations and appends export statements:

export default function es(magicString, { dependencies, exports, intro, snippets }, options): void {
    const { n } = snippets;
    const importBlock = getImportBlock(dependencies, options.importAttributesKey, snippets);
    if (importBlock.length > 0) intro += importBlock.join(n) + n + n;
    // ... intro is prepended to magicString ...
    const exportBlock = getExportBlock(exports, snippets);
    if (exportBlock.length > 0) magicString.append(n + n + exportBlock.join(n).trim());
}

The CJS finaliser wraps module bodies in a scope and replaces imports with require() calls. The IIFE finaliser creates a self-executing function. The UMD finaliser produces a hybrid wrapper that detects AMD, CJS, and global contexts. The SystemJS finaliser uses System.register().

All finalisers operate on the same rendered module body — the per-module MagicString instances are format-agnostic. This is the payoff of the two-phase design: the expensive work (parsing, analysis, tree-shaking, rendering) happens once, and format wrapping is a relatively cheap string operation.

Tip: The snippets object passed to finalisers contains format-aware code generation helpers — cnst produces const or var depending on generatedCode settings, n is the newline character (empty in compact mode), and _ is the space character (also empty in compact mode). If you're building a custom output format plugin, study how the existing finalisers use snippets.
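A toy rendering helper shows why this matters for compact output. The snippet fields mirror the ones described above, but the values here are illustrative (in real Rollup, cnst depends on generatedCode settings, not on compact mode):

```typescript
interface Snippets { cnst: string; n: string; _: string; }

// Render a binding using format-aware snippets instead of hard-coded syntax.
function renderBinding(name: string, value: string, { cnst, n, _ }: Snippets): string {
    return `${cnst} ${name}${_}=${_}${value};${n}`;
}

const normal: Snippets  = { cnst: 'const', n: '\n', _: ' ' };
const compact: Snippets = { cnst: 'var',   n: '',   _: '' };

console.log(JSON.stringify(renderBinding('foo', 'foo$1', normal)));  // "const foo = foo$1;\n"
console.log(JSON.stringify(renderBinding('foo', 'foo$1', compact))); // "var foo=foo$1;"
```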

What's Next

We've followed code from module graph to output files. The next article zooms into the system that makes Rollup infinitely extensible: the plugin architecture. We'll examine the four hook execution strategies, the layered input/output plugin driver, the plugin context API, and the FileEmitter that manages asset and chunk emission across build phases.