
From Source to Static: The Build Pipeline and Bundler Abstraction

Advanced

Prerequisites

  • Understanding of the plugin lifecycle and route generation (Article 2)
  • Webpack fundamentals (loaders, plugins, bundles)
  • Familiarity with Node.js worker_threads


With routes generated and code files written to .docusaurus/, the build pipeline takes over. Its job: compile a client bundle for the browser, compile a server bundle for rendering, execute Static Site Generation to produce HTML files, and run post-build validation. This article traces the full journey from docusaurus build to a production-ready build/ directory.

Along the way, we'll encounter a clean bundler abstraction that lets Docusaurus swap between webpack and Rspack, a pragmatic approach to multi-locale builds, and a configurable SSG executor that can parallelize page rendering across worker threads.

Multi-Locale Build Orchestration

The build command at build.ts#L23-L56 orchestrates building all configured locales. The approach is sequential: each locale is built one at a time using mapAsyncSequential().

flowchart TD
    BUILD["docusaurus build"] --> LOCALES["getLocalesToBuild()"]
    LOCALES --> L1["buildLocale('en')"]
    L1 --> L2["buildLocale('fr')"]
    L2 --> L3["buildLocale('ja')"]
    L3 --> DONE["Build complete"]
    
    style L1 fill:#e1f5fe
    style L2 fill:#e1f5fe
    style L3 fill:#e1f5fe

Why sequential? The comments at build.ts#L120-L131 tell the story: the team tried running locale builds in worker_threads but hit SIGSEGV and SIGBUS crashes. Running in child_process worked but introduced complexity around memory limits and logging. The sequential approach is the pragmatic choice — it's reliable, and each locale build can use the full CPU for its webpack compilation and SSG passes.

The default locale is always built first (lines 60-75). This prevents a subtle bug: since the default locale typically outputs to the root build/ directory while other locales write to build/<locale>/, building the default last would erase the localized sub-directories.
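The ordering and the one-at-a-time execution can be sketched in a few lines. This is a minimal illustration, not the code in build.ts: the body of mapAsyncSequential() is our own guess at what such a helper looks like, and orderLocales() and buildLocaleStub() are hypothetical names introduced only for this example.

```typescript
// Hypothetical stand-in for the mapAsyncSequential() helper named above:
// runs an async action on each item, one at a time, preserving order.
async function mapAsyncSequential<T, R>(
  items: readonly T[],
  action: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = [];
  for (const item of items) {
    // Awaiting inside the loop is what makes the work sequential.
    results.push(await action(item));
  }
  return results;
}

// Assumed ordering rule: default locale first, the others keep their order.
function orderLocales(
  locales: readonly string[],
  defaultLocale: string,
): string[] {
  return [
    ...locales.filter((locale) => locale === defaultLocale),
    ...locales.filter((locale) => locale !== defaultLocale),
  ];
}

// Stub for buildLocale(): only reports where the locale would be written.
async function buildLocaleStub(
  locale: string,
  defaultLocale: string,
): Promise<string> {
  return locale === defaultLocale ? 'build/' : `build/${locale}/`;
}

async function buildAll(
  locales: readonly string[],
  defaultLocale: string,
): Promise<string[]> {
  return mapAsyncSequential(orderLocales(locales, defaultLocale), (locale) =>
    buildLocaleStub(locale, defaultLocale),
  );
}
```

With locales ['fr', 'en', 'ja'] and 'en' as the default, the stub writes build/ first and only then build/fr/ and build/ja/, so clearing the root directory can no longer wipe a finished localized build.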

Per-Locale Build Pipeline

Each locale build in buildLocale.ts#L43-L150 follows a five-step pipeline:

flowchart TD
    BL["buildLocale()"] --> LOAD["1. loadSite()"]
    LOAD --> CONFIGS["2. Create client + server configs (parallel)"]
    CONFIGS --> COMPILE["3. compile() both bundles"]
    COMPILE --> SSG["4. executeSSG()"]
    SSG --> POST["5. postBuild() + handleBrokenLinks()"]
    
    CONFIGS --> CLEAR["Clear output dir (parallel with config creation)"]

Step 1: loadSite() — As we covered in Articles 1-2, this runs the full server pipeline: config loading, plugin lifecycle, code generation. The DOCUSAURUS_CURRENT_LOCALE environment variable is set as a workaround for site config translation.

Step 2: Create bundler configs — The client and server webpack configurations are created in parallel at lines 80-100. Notably, the output directory is also cleared during this step — it runs as a third parallel promise, since it doesn't depend on the configs and takes time.

Step 3: compile() — Both bundles are compiled together. For hash router mode, only the client bundle is needed (no SSG), so the server config is skipped.

Step 4: executeSSG() — The server bundle is loaded and used to render every route to HTML. We'll cover this in detail below.

Step 5: Post-build — The postBuild() plugin lifecycle runs, giving plugins access to the rendered output. Then handleBrokenLinks() checks all collected links against all collected anchors across the entire site.
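The five steps can be summarized as a small orchestration sketch. Every helper here is a stub that only records its name, and runLocalePipeline() is a hypothetical function, not the real buildLocale(). What the hash router does in place of SSG is not modeled; the sketch simply assumes the post-build steps still run.

```typescript
type Step = string;

// All steps are illustrative stubs, not the Docusaurus functions.
async function runLocalePipeline(opts: {hashRouter: boolean}): Promise<Step[]> {
  const log: Step[] = [];
  const step = async (name: Step): Promise<void> => {
    log.push(name);
  };

  // Step 1: load the site (config, plugins, code generation).
  await step('loadSite');

  // Step 2: config creation and output clearing are independent of each
  // other, so they are started together and awaited as a group.
  await Promise.all([
    step('createClientConfig'),
    // With the hash router there is no SSG, so no server config is needed.
    opts.hashRouter ? Promise.resolve() : step('createServerConfig'),
    step('clearOutDir'),
  ]);

  // Step 3: compile the bundle(s).
  await step('compile');

  // Step 4: render routes to HTML (skipped for the hash router).
  if (!opts.hashRouter) {
    await step('executeSSG');
  }

  // Step 5: plugin hooks, then link validation (assumed to always run here).
  await step('postBuild');
  await step('handleBrokenLinks');
  return log;
}
```

The design point worth noticing is the Promise.all() in step 2: anything that does not depend on the configs can overlap with their creation.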

Tip: Several environment variables control the pipeline for debugging. DOCUSAURUS_SKIP_BUNDLING, DOCUSAURUS_RETURN_AFTER_LOADING, DOCUSAURUS_EXIT_AFTER_LOADING, and DOCUSAURUS_EXIT_AFTER_BUNDLING let you skip bundling or stop the build early, after loading or after bundling. These are defined at buildLocale.ts#L38-L41.

The Bundler Abstraction Layer

Docusaurus supports both Webpack and Rspack behind a common interface. The abstraction lives in the docusaurus-bundler package, with the core function getCurrentBundler() at currentBundler.ts#L27-L42:

export async function getCurrentBundler({siteConfig}): Promise<CurrentBundler> {
  if (isRspack(siteConfig)) {
    return {
      name: 'rspack',
      instance: (await importRspack()) as unknown as typeof webpack,
    };
  }
  return {
    name: 'webpack',
    instance: webpack,
  };
}

The CurrentBundler type is simply {name: 'webpack' | 'rspack', instance: typeof webpack}. Because Rspack aims for API compatibility with Webpack, the imported Rspack module is cast to typeof webpack (the "as unknown as" double assertion above). All downstream code can then use currentBundler.instance without caring which bundler is active.

graph TD
    CB[getCurrentBundler] -->|rspackBundler: true| RS[Rspack instance]
    CB -->|rspackBundler: false| WP[Webpack instance]
    RS --> COMMON["CurrentBundler {name, instance}"]
    WP --> COMMON
    COMMON --> CSS[getCSSExtractPlugin]
    COMMON --> COPY[getCopyPlugin]
    COMMON --> PROG[getProgressBarPlugin]

However, some plugins differ between bundlers. The helper functions getCSSExtractPlugin, getCopyPlugin, and getProgressBarPlugin at currentBundler.ts#L57-L103 return the appropriate implementation. For Rspack, CSS extraction uses CssExtractRspackPlugin (a built-in), and the progress bar creates a custom class wrapping rspack.ProgressPlugin to mimic WebpackBar's name/color API.
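The pattern behind these helpers is a plain switch on the bundler name. The sketch below uses simplified stand-in types and returns descriptive strings instead of real plugin classes. BundlerSketch, chooseCssExtractPlugin(), and chooseProgressBar() are hypothetical names, and MiniCssExtractPlugin as the webpack-side choice is our assumption rather than something shown above.

```typescript
type BundlerName = 'webpack' | 'rspack';

// Simplified stand-in for CurrentBundler. The real `instance` is typed as
// `typeof webpack`; it is left as `unknown` to keep this sketch standalone.
interface BundlerSketch {
  name: BundlerName;
  instance: unknown;
}

// Hypothetical helper: names the CSS extraction plugin for each bundler.
// The webpack branch (MiniCssExtractPlugin) is an assumption.
function chooseCssExtractPlugin(bundler: BundlerSketch): string {
  return bundler.name === 'rspack'
    ? 'CssExtractRspackPlugin (built into Rspack)'
    : 'MiniCssExtractPlugin';
}

// Hypothetical helper: names the progress bar implementation.
function chooseProgressBar(bundler: BundlerSketch): string {
  return bundler.name === 'rspack'
    ? 'custom class wrapping rspack.ProgressPlugin'
    : 'WebpackBar';
}
```

Callers never branch on the bundler themselves; they ask a helper and receive something with the same shape either way.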

Plugin webpack configuration merging happens in configure.ts#L55-L78. Each plugin's configureWebpack() return value is deep-merged with the existing config using webpack-merge, with an optional mergeStrategy for fine-grained control over array/object merging behavior.
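From the plugin author's side, this looks roughly as follows. The plugin name and the .foo loader rule are placeholders, and the parameter types are loosened so the sketch stands alone; in a real plugin they would come from the Docusaurus type packages.

```typescript
// Hypothetical plugin: only the shape of the returned object matters here.
function fooLoaderPlugin() {
  return {
    name: 'foo-loader-plugin',
    configureWebpack(_config: unknown, _isServer: boolean) {
      return {
        // Ask the merge step to put these rules before the existing ones
        // instead of appending them.
        mergeStrategy: {'module.rules': 'prepend'} as const,
        module: {
          rules: [{test: /\.foo$/, use: 'foo-loader'}],
        },
      };
    },
  };
}
```

Everything except mergeStrategy is a partial bundler config to be merged. mergeStrategy itself is an instruction to the merge step, keyed by the config path it applies to.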

SSG: Static Site Generation

The SSG executor at ssgExecutor.ts decides between two modes based on future.faster.ssgWorkerThreads:

Simple mode (createSimpleSSGExecutor, lines 39-58) renders all pages sequentially in the current Node.js process. This is the default and works well for small-to-medium sites.

Pooled mode (createPooledSSGExecutor, lines 101-176) uses Tinypool to distribute page rendering across worker threads. The number of threads is calculated dynamically:

flowchart TD
    EX["executeSSG()"] --> CHECK{ssgWorkerThreads?}
    CHECK -->|false| SIMPLE["Simple: single thread"]
    CHECK -->|true| POOL["Pooled: Tinypool"]
    POOL --> CALC["inferNumberOfThreads()"]
    CALC --> |"pageCount / 100 vs cpuCount"| THREADS["min(workload, cpus)"]
    THREADS -->|"== 1"| SIMPLE
    THREADS -->|"> 1"| SPAWN["Spawn thread pool"]
    SPAWN --> CHUNK["Chunk pages by SSGWorkerThreadTaskSize"]
    CHUNK --> RENDER["Parallel rendering"]

The thread count inference at ssgExecutor.ts#L65-L78 uses a minPagesPerCpu threshold of 100 — if you have 50 pages, you'll get 1 thread regardless of CPU count. This avoids the overhead of thread creation for small sites. If the inferred count is 1, it falls back to simple mode entirely.
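The inference rule and the chunking step from the diagram can be sketched like this. It is an approximation of the logic described above, not a copy of ssgExecutor.ts: rounding the workload up with Math.ceil() and the chunk() helper are our assumptions.

```typescript
function inferNumberOfThreads(params: {
  pageCount: number;
  cpuCount: number;
  minPagesPerCpu: number;
}): number {
  const {pageCount, cpuCount, minPagesPerCpu} = params;
  // How many threads the workload could keep busy (rounding up is assumed).
  const threadsByWorkload = Math.ceil(pageCount / minPagesPerCpu);
  // Never exceed the CPU count, and never go below one thread.
  return Math.max(1, Math.min(threadsByWorkload, cpuCount));
}

// Hypothetical helper: split the page list into fixed-size tasks for the pool.
function chunk<T>(items: readonly T[], size: number): T[][] {
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}
```

With minPagesPerCpu set to 100 on an 8-core machine, 50 pages yield 1 thread, 250 pages yield 3, and 5,000 pages are capped at 8.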

The pool also includes memory management: maxMemoryLimitBeforeRecycle at line 136 lets the pool recycle a worker thread once its memory usage exceeds the configured limit, working around SSG memory leaks documented in issue #11161.

The actual rendering happens in serverEntry.tsx. For each page, it preloads route data, wraps <App /> in StaticRouter and HelmetProvider, renders to HTML, and collects broken link data (all <a> hrefs and id anchors on the page). The collected data is returned alongside the HTML for post-build validation.
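To make the purpose of the collected data concrete, here is a simplified validation pass over it. The data shape (CollectedPage), the findBrokenLinks() function, and the rule that only paths starting with "/" are checked are all assumptions made for this sketch; the real handleBrokenLinks() is more involved.

```typescript
// Assumed shape: what one rendered page contributes to validation.
interface CollectedPage {
  links: string[]; // hrefs found on the page
  anchors: string[]; // ids found on the page
}

interface BrokenLink {
  from: string;
  link: string;
  reason: 'missing page' | 'missing anchor';
}

function findBrokenLinks(pages: Map<string, CollectedPage>): BrokenLink[] {
  const broken: BrokenLink[] = [];
  for (const [from, page] of pages) {
    for (const link of page.links) {
      // Only site-internal absolute paths are checked in this sketch.
      if (!link.startsWith('/')) {
        continue;
      }
      const hashIndex = link.indexOf('#');
      const pathname = hashIndex === -1 ? link : link.slice(0, hashIndex);
      const hash = hashIndex === -1 ? '' : link.slice(hashIndex + 1);

      const target = pages.get(pathname);
      if (!target) {
        broken.push({from, link, reason: 'missing page'});
      } else if (hash !== '' && !target.anchors.includes(hash)) {
        broken.push({from, link, reason: 'missing anchor'});
      }
    }
  }
  return broken;
}
```

Anchor checks need the data from every page, which is why validation has to wait until all pages have been rendered.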

The future.faster Performance Flags

The future.faster config option controls a set of performance optimizations. Defaults are defined at configValidation.ts#L76-L99:

| Flag | Default | Effect |
| --- | --- | --- |
| swcJsLoader | false | Use SWC instead of Babel for JS transpilation |
| swcJsMinimizer | false | Use SWC for JS minification |
| swcHtmlMinimizer | false | Use SWC for HTML minification |
| lightningCssMinimizer | false | Use Lightning CSS for CSS minification |
| mdxCrossCompilerCache | false | Cache MDX compilation across client/server builds |
| rspackBundler | false | Use Rspack instead of Webpack |
| rspackPersistentCache | false | Enable Rspack's persistent disk cache |
| ssgWorkerThreads | false | Parallelize SSG across worker threads |
| gitEagerVcs | false | Eager VCS metadata loading |

The shortcut future: {faster: true} enables all flags at once (lines 89-99). There's also a future.v4 flag set for forward-compatibility:

graph TD
    FASTER["future.faster"] -->|true| ALL["All faster flags enabled"]
    FASTER -->|object| PICK["Pick individual flags"]
    V4["future.v4"] -->|true| ALL_V4["All v4 flags enabled"]
    V4 -->|object| PICK_V4["Pick individual flags"]
    V4 -->|fasterByDefault: true| DEFAULT["faster flags default to true"]

The future.v4 flags represent breaking changes planned for v4 that you can opt into today. One notable interaction: ssgWorkerThreads requires v4.removeLegacyPostBuildHeadAttribute to be enabled, validated at configValidation.ts#L622-L637. This is because the legacy head attribute in postBuild() includes non-serializable Helmet state that can't be passed between worker threads.

The post-processing logic at configValidation.ts#L570-L651 resolves the interplay between v4.fasterByDefault and individual faster flags — if fasterByDefault is true, any faster flag not explicitly set defaults to true.
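The resolution rule can be sketched as a pure function. This is an approximation of the behavior described above, not the code in configValidation.ts: resolveFaster() and assertSsgWorkerThreadsAllowed() are hypothetical names, and treating an explicit faster: false as "all flags off" is an assumption.

```typescript
const FASTER_FLAGS = [
  'swcJsLoader',
  'swcJsMinimizer',
  'swcHtmlMinimizer',
  'lightningCssMinimizer',
  'mdxCrossCompilerCache',
  'rspackBundler',
  'rspackPersistentCache',
  'ssgWorkerThreads',
  'gitEagerVcs',
] as const;

type FasterFlag = (typeof FASTER_FLAGS)[number];
type FasterConfig = Record<FasterFlag, boolean>;

function resolveFaster(
  faster: boolean | Partial<FasterConfig> | undefined,
  fasterByDefault: boolean,
): FasterConfig {
  const resolved = {} as FasterConfig;
  for (const flag of FASTER_FLAGS) {
    if (typeof faster === 'boolean') {
      // Shortcut form: one boolean applies to every flag (assumed for false).
      resolved[flag] = faster;
    } else {
      // Explicit values win; unset flags fall back to fasterByDefault.
      resolved[flag] = faster?.[flag] ?? fasterByDefault;
    }
  }
  return resolved;
}

// Mirrors the dependency described above between the two flag sets.
function assertSsgWorkerThreadsAllowed(
  faster: FasterConfig,
  removeLegacyPostBuildHeadAttribute: boolean,
): void {
  if (faster.ssgWorkerThreads && !removeLegacyPostBuildHeadAttribute) {
    throw new Error(
      'ssgWorkerThreads requires v4.removeLegacyPostBuildHeadAttribute',
    );
  }
}
```

For example, resolveFaster({swcJsLoader: false}, true) keeps swcJsLoader off while every other flag resolves to true.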

Tip: Start your migration to v4 today with future: {faster: true, v4: true} in your config. This enables all performance optimizations and forward-compatibility flags, giving you the fastest builds while preparing for the next major version.
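If you would rather opt in gradually, the object form lets you pick individual flags. The fragment below is a sketch of the future section of a docusaurus.config.ts file: the title, url, and baseUrl values are placeholders, and the object is left as a plain constant so the example stands alone.

```typescript
// Sketch of a docusaurus.config.ts; in a real file this object would be
// the default export and typed with the Docusaurus Config type.
const config = {
  title: 'My Site', // placeholder
  url: 'https://example.com', // placeholder
  baseUrl: '/',
  future: {
    v4: {
      // Required by ssgWorkerThreads, as described above.
      removeLegacyPostBuildHeadAttribute: true,
    },
    faster: {
      rspackBundler: true,
      ssgWorkerThreads: true,
    },
  },
};
```

Note that the faster flags also depend on the @docusaurus/faster package being installed in the site.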

What's Next

We've traced the build pipeline from CLI invocation through multi-locale orchestration, bundler compilation, and SSG rendering. But we glossed over what happens inside the bundler — specifically, how Markdown and MDX files become React components. In the next article, we'll dive into the MDX processing pipeline and the content plugins that power docs, blog, and pages.