Docusaurus Architecture: A Map of the Monorepo
Prerequisites
- Basic React knowledge (components, hooks, context)
- Familiarity with Node.js and npm/yarn workspaces
- General understanding of static site generators
Docusaurus powers tens of thousands of documentation sites — from React Native to Jest to Supabase — yet very few developers have ever looked inside the machine. Under the hood, it's a 40-package Yarn workspaces monorepo managed by Lerna, with a clean architectural split between server-side Node.js orchestration and client-side React rendering. Understanding this split is the key to reading every line of code that follows.
This article gives you the mental model you need. We'll walk through the package taxonomy, trace the two execution worlds, dissect the CLI, and follow the core loadSite() pipeline from config to generated code. By the end, you'll know exactly where to look when you want to understand any Docusaurus behavior.
Monorepo Layout and Package Categories
The root package.json declares Yarn v1 workspaces spanning packages/*, website, and several other directories. Lerna (lerna.json) orchestrates versioning and publishing at a unified version (currently 3.9.2).
The ~40 packages break down into clear categories:
| Category | Examples | Role |
|---|---|---|
| Core | docusaurus | CLI, server pipeline, client app, SSG |
| Bundler | docusaurus-bundler | Webpack/Rspack abstraction |
| Content plugins | plugin-content-docs, plugin-content-blog, plugin-content-pages | Read files, produce routes |
| Themes | theme-classic, theme-common | React UI components |
| Preset | preset-classic | Bundles plugins + themes |
| MDX | mdx-loader | Webpack loader for MDX compilation |
| Utilities | utils, utils-common, utils-validation | Shared helpers |
| Types | docusaurus-types | TypeScript type definitions |
| Scaffolding | create-docusaurus | Project initializer |
| Logger | docusaurus-logger | Structured logging |
graph TD
subgraph "Preset Classic"
PC[preset-classic]
end
subgraph "Content Plugins"
DOCS[plugin-content-docs]
BLOG[plugin-content-blog]
PAGES[plugin-content-pages]
end
subgraph "Theme Layer"
TC[theme-classic]
TCM[theme-common]
end
subgraph "Core"
CORE[docusaurus]
BUNDLER[docusaurus-bundler]
MDX[mdx-loader]
end
PC --> DOCS
PC --> BLOG
PC --> PAGES
PC --> TC
TC --> TCM
DOCS --> MDX
BLOG --> MDX
CORE --> BUNDLER
The docusaurus core package is by far the largest. It contains the CLI, the entire server-side pipeline, the client React app, the SSG engine, and the webpack configuration layer. Think of it as the kernel — everything else plugs into it.
Tip: When navigating the codebase, start at packages/docusaurus/src/. The server-side code lives in server/ and commands/, while client-side code lives in client/. This is the most important directory split to internalize.
The Two Worlds: Server-Side and Client-Side
Docusaurus has a fundamental architectural split that you must understand before reading any code: server-side (Node.js) and client-side (React in the browser) are separate codebases that communicate through generated files.
flowchart LR
subgraph "Server World (Node.js)"
CONFIG[Config Loading]
PLUGINS[Plugin Lifecycle]
CODEGEN[Code Generation]
end
subgraph ".docusaurus/"
GEN[Generated Files]
end
subgraph "Client World (React)"
APP[App Component]
ROUTES[Routes]
HYDRATION[Hydration]
end
CONFIG --> PLUGINS --> CODEGEN --> GEN
GEN --> APP
GEN --> ROUTES
ROUTES --> HYDRATION
Server-side code runs during docusaurus build and docusaurus start. It reads the config file, executes the plugin lifecycle, generates route manifests, and produces static HTML via SSG. This code lives in packages/docusaurus/src/server/ and packages/docusaurus/src/commands/.
Client-side code is a React application that hydrates in the browser. It uses React Router for navigation, lazy-loads route components, and manages theme context. This code lives in packages/docusaurus/src/client/.
The bridge between them is the .docusaurus/ directory — a generated folder containing JavaScript modules, JSON data, and route configurations that the client-side webpack build consumes via @generated/* aliases. The server writes these files; the client imports them.
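To make the alias idea concrete, here is a minimal sketch of how an import such as @generated/routes could be mapped onto a file inside .docusaurus/. The function name resolveGeneratedImport and its string-based approach are assumptions made for illustration; in practice this mapping is handled by the bundler's alias configuration rather than by hand-written code like this.

```typescript
// Hypothetical helper, not part of Docusaurus: rewrites "@generated/*"
// requests to paths inside the generated .docusaurus/ folder.
function resolveGeneratedImport(request: string, siteDir: string): string {
  const prefix = "@generated/";
  if (!request.startsWith(prefix)) {
    return request; // not a generated module, leave it untouched
  }
  // Drop trailing slashes so the joined path has exactly one separator.
  const normalizedSiteDir = siteDir.replace(/\/+$/, "");
  return `${normalizedSiteDir}/.docusaurus/${request.slice(prefix.length)}`;
}
```

The point of the indirection is that client code never needs to know where the site lives on disk: it imports a stable alias, and the server decides what sits behind it.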
The CLI as Orchestrator
The CLI entry point at packages/docusaurus/src/commands/cli.ts uses Commander.js to define all commands. The runCLI() function creates the program and parses arguments:
flowchart TD
CLI[runCLI] --> BUILD[build]
CLI --> START[start]
CLI --> SWIZZLE[swizzle]
CLI --> DEPLOY[deploy]
CLI --> SERVE[serve]
CLI --> CLEAR[clear]
CLI --> WT[write-translations]
CLI --> WHI[write-heading-ids]
CLI --> EXT{External?}
EXT -->|Yes| PLUGIN_CMD[Plugin CLI Extensions]
Each command maps to a dedicated module: build triggers the full static build pipeline, start launches the dev server with hot reload, and swizzle handles theme component customization.
One subtle detail: the CLI checks whether a command is "internal" at cli.ts#L26-L40. If the command isn't recognized, it calls externalCommand() before parsing — this is how plugins can register their own CLI commands. The docs plugin, for instance, adds docs:version for creating documentation snapshots.
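The internal-versus-external check can be sketched roughly as follows. The command names come from the diagram above; the identifiers internalCommandsSketch, isInternalCommandSketch, and dispatchTarget are hypothetical, and treating a missing command as internal is an assumption of this sketch rather than something confirmed from cli.ts.

```typescript
// Command names taken from the CLI diagram in this article.
const internalCommandsSketch = new Set<string>([
  "build",
  "start",
  "swizzle",
  "deploy",
  "serve",
  "clear",
  "write-translations",
  "write-heading-ids",
]);

// Assumption: no command at all is handled by the core CLI (e.g. to print help).
function isInternalCommandSketch(command: string | undefined): boolean {
  return command === undefined || internalCommandsSketch.has(command);
}

type DispatchTarget = "core" | "plugin-extension";

// argv follows the Node.js convention: [node binary, script path, command, ...args].
function dispatchTarget(argv: readonly string[]): DispatchTarget {
  return isInternalCommandSketch(argv[2]) ? "core" : "plugin-extension";
}
```

With this shape, a call such as dispatchTarget(["node", "docusaurus", "docs:version", "1.0"]) falls through to the plugin-extension branch, which is where plugin-registered commands would be loaded before parsing.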
Note the environment variable escape hatches at lines 53-56: DOCUSAURUS_CLI_SITE_DIR and DOCUSAURUS_CLI_CONFIG let you override the site directory and config path without passing CLI arguments. This exists because Commander.js can't determine the site directory before parsing, creating a chicken-and-egg problem for plugin CLI extensions that need config context.
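A minimal sketch of that override logic, assuming the environment variables simply take precedence over a fallback directory. The two variable names come from the text above; resolveSiteLocation, EnvSketch, and SiteLocationSketch are hypothetical names, and the environment is passed in as a plain object so the example stays independent of Node.js typings.

```typescript
type EnvSketch = Record<string, string | undefined>;

interface SiteLocationSketch {
  siteDir: string;
  config: string | undefined;
}

// Hypothetical resolver: the env vars win, otherwise fall back to the working directory.
function resolveSiteLocation(env: EnvSketch, cwd: string): SiteLocationSketch {
  return {
    siteDir: env["DOCUSAURUS_CLI_SITE_DIR"] ?? cwd,
    config: env["DOCUSAURUS_CLI_CONFIG"], // undefined means "use default config discovery"
  };
}
```

Because the values are available before any argument parsing happens, the CLI can locate the site, load its plugins, and only then let those plugins register extra commands.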
The loadSite() Pipeline
The loadSite() function in packages/docusaurus/src/server/site.ts#L276-L298 is the single most important function in the codebase. Every command that needs site data — build, start, deploy — calls it. Here's what it does:
sequenceDiagram
participant CLI as CLI Command
participant LS as loadSite()
participant LC as loadContext()
participant LP as loadPlugins()
participant CSP as createSiteProps()
participant CSF as createSiteFiles()
CLI->>LS: loadSite(params)
LS->>LC: Load config, i18n, bundler
LC-->>LS: LoadContext
LS->>LP: Run plugin lifecycle (4 phases)
LP-->>LS: plugins, routes, globalData
LS->>CSP: Merge routes, metadata, translations
CSP-->>LS: Props
LS->>CSF: Generate .docusaurus/ files
CSF-->>LS: Site ready
The pipeline has four stages:
- loadContext() (lines 81-173): Loads the site config, resolves i18n locale settings, determines the output directory, initializes the bundler (Webpack or Rspack), and loads code translations.
- loadPlugins(): Runs the full 4-phase plugin lifecycle: initialization, loadContent(), contentLoaded(), and allContentLoaded(). Returns loaded plugins, routes, and global data. We'll cover this in detail in Article 2.
- createSiteProps() (lines 175-230): Merges plugin results into a unified Props object containing routes, metadata, HTML tags, and code translations. Also handles duplicate route detection.
- createSiteFiles() (lines 233-268): Writes the .docusaurus/ directory by calling generateSiteFiles().
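The four stages compose roughly like the sketch below. Every type and function here is a hypothetical stand-in with a Sketch suffix: the real LoadContext and Props types carry many more fields, the stages presumably do file I/O, and this version is kept synchronous and side-effect free purely for brevity. Collecting duplicates into a duplicateRoutes field is also an assumption about how detection might be surfaced.

```typescript
// Hypothetical stand-in types; not the real Docusaurus definitions.
interface ContextSketch {
  siteDir: string;
  locale: string;
}
interface PluginOutputSketch {
  routes: string[];
  globalData: Record<string, unknown>;
}
interface PropsSketch extends ContextSketch, PluginOutputSketch {
  duplicateRoutes: string[];
}

// Stage 1: stands in for config, i18n, and bundler loading.
function loadContextSketch(siteDir: string): ContextSketch {
  return { siteDir, locale: "en" };
}

// Stage 2: hard-coded output standing in for the plugin lifecycle.
function loadPluginsSketch(_context: ContextSketch): PluginOutputSketch {
  return { routes: ["/", "/docs/intro", "/"], globalData: {} };
}

// Stage 3: merge results and flag routes that appear more than once.
function createSitePropsSketch(context: ContextSketch, plugins: PluginOutputSketch): PropsSketch {
  const seen = new Set<string>();
  const duplicateRoutes: string[] = [];
  for (const route of plugins.routes) {
    if (seen.has(route)) {
      duplicateRoutes.push(route);
    }
    seen.add(route);
  }
  return { ...context, ...plugins, duplicateRoutes };
}

// Stage 4: return the paths that would be written instead of touching disk.
function createSiteFilesSketch(props: PropsSketch): string[] {
  return ["routes.js", "registry.js", "globalData.json"].map(
    (file) => `${props.siteDir}/.docusaurus/${file}`,
  );
}

function loadSiteSketch(siteDir: string): { props: PropsSketch; generatedFiles: string[] } {
  const context = loadContextSketch(siteDir);
  const plugins = loadPluginsSketch(context);
  const props = createSitePropsSketch(context, plugins);
  return { props, generatedFiles: createSiteFilesSketch(props) };
}
```

Each stage only consumes the output of the stages before it, which is why the merged props object is a natural place to inspect when something downstream looks wrong.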
Tip: The Props type is the "contract" between the server pipeline and everything downstream (code generation, bundler config, dev server). If you're debugging any build issue, check what's in Props first.
The .docusaurus/ Bridge: Code Generation
The generated .docusaurus/ directory is the contract between server and client. The generateSiteFiles() function at packages/docusaurus/src/server/codegen/codegen.ts#L162-L174 writes all files in parallel:
| Generated File | Purpose |
|---|---|
| docusaurus.config.mjs | Serialized site config for client access |
| routes.js | React Router route tree with ComponentCreator lazy loading |
| registry.js | Chunk name → module path mapping for code splitting |
| routesChunkNames.json | Route path → chunk names for each route's modules |
| client-modules.js | Plugin client modules (CSS, JS side effects) |
| globalData.json | Cross-plugin data accessible via useGlobalData() |
| i18n.json | Current locale configuration |
| codeTranslations.json | UI string translations |
| site-metadata.json | Plugin versions and site metadata |
flowchart TD
GEN[generateSiteFiles] --> WARN[DONT-EDIT-THIS-FOLDER]
GEN --> CM[client-modules.js]
GEN --> SC[docusaurus.config.mjs]
GEN --> RF[routes.js + registry.js + routesChunkNames.json]
GEN --> GD[globalData.json]
GEN --> SM[site-metadata.json]
GEN --> I18N[i18n.json]
GEN --> CT[codeTranslations.json]
One design decision worth noting: client modules use require() instead of import(). Look at codegen.ts#L68-L77 — the comment explains that import() is async but client modules can include CSS, and the load order matters for CSS specificity. Using synchronous require() ensures CSS files are included in the correct order in the bundle.
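As an illustration of that choice, the sketch below generates the text of a client-modules.js file with one synchronous require() per module, in the order given. The function name genClientModulesSketch and the exact output layout are assumptions; the real generator in codegen.ts may format and escape paths differently.

```typescript
// Hypothetical generator: emits one synchronous require() per client module,
// preserving input order so CSS lands in the bundle in a predictable sequence.
function genClientModulesSketch(clientModules: readonly string[]): string {
  const lines = clientModules.map(
    // JSON.stringify quotes the path and escapes backslashes (e.g. Windows paths).
    (modulePath) => `  require(${JSON.stringify(modulePath)}),`,
  );
  return ["export default [", ...lines, "];", ""].join("\n");
}
```

For example, genClientModulesSketch(["./theme.css", "./analytics.js"]) yields an array literal whose first entry requires the stylesheet, so it is evaluated before the script that follows it.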
The route generation in codegenRoutes.ts deserves special attention. It produces three files: routes.js contains a minimal React Router config using ComponentCreator for lazy loading, registry.js maps chunk names to dynamic import() calls with webpack magic comments for chunk naming, and routesChunkNames.json connects route paths to their chunk names. This three-file system enables aggressive code splitting — each page only loads the JavaScript it needs.
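The relationship between the route-to-chunk mapping and the registry can be sketched as a two-step lookup. The data shapes below are deliberately simplified and hypothetical: the real routesChunkNames.json and registry.js use more elaborate keys and entries, and the chunk names shown here are invented for the example.

```typescript
type LoaderSketch = () => Promise<unknown>;

// Simplified stand-in for routesChunkNames.json: route path -> named chunk references.
const routesChunkNamesSketch: Record<string, Record<string, string>> = {
  "/docs/intro": { component: "comp-doc-item", content: "content-docs-intro" },
  "/blog": { component: "comp-blog-list" },
};

// Simplified stand-in for registry.js: chunk name -> lazy loader.
// Real entries would wrap dynamic import() calls; resolved promises keep this runnable.
const registrySketch: Record<string, LoaderSketch> = {
  "comp-doc-item": () => Promise.resolve({ name: "DocItem" }),
  "content-docs-intro": () => Promise.resolve({ title: "Intro" }),
  "comp-blog-list": () => Promise.resolve({ name: "BlogList" }),
};

// Only the loaders a route actually references are returned, so visiting
// one page never pulls in chunks that belong to another.
function loadersForRoute(routePath: string): LoaderSketch[] {
  const chunkNames = Object.values(routesChunkNamesSketch[routePath] ?? {});
  return chunkNames
    .map((chunkName) => registrySketch[chunkName])
    .filter((loader): loader is LoaderSketch => loader !== undefined);
}
```

Keeping the mapping in JSON and the loaders in a separate module means the route tree itself stays small, while the heavier per-page modules are fetched only on demand.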
Client Component Tree and Routing
The client React application is assembled in packages/docusaurus/src/client/App.tsx. The component tree is a Russian doll of providers:
graph TD
EB[ErrorBoundary] --> DCP[DocusaurusContextProvider]
DCP --> BCP[BrowserContextProvider]
BCP --> ROOT["Root (@theme/Root)"]
ROOT --> TP["ThemeProvider (@theme/ThemeProvider)"]
TP --> SMD[SiteMetadataDefaults]
TP --> SM["SiteMetadata (@theme/SiteMetadata)"]
TP --> BIB[BaseUrlIssueBanner]
TP --> AN[AppNavigation]
AN --> PN[PendingNavigation]
PN --> ROUTES["renderRoutes(@generated/routes)"]
Notice that @theme/Root and @theme/ThemeProvider are resolved through the theme alias system — they point to either theme-classic's implementations or user-swizzled versions. The @generated/routes import connects to the server-generated route file.
The browser entry point at clientEntry.tsx handles both hydration and client-side rendering. It preloads route data for the current path before rendering, then either calls ReactDOM.hydrateRoot() (for SSG'd pages) or ReactDOM.createRoot() (for dev mode). The router choice between BrowserRouter and HashRouter is driven by the future.experimental_router config option.
For SSR/SSG, serverEntry.tsx wraps the same <App /> with StaticRouter, HelmetProvider, and a BrokenLinksProvider that collects all links and anchors on the page for post-build validation. It renders to HTML and returns the collected metadata alongside the markup.
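The collecting behaviour can be pictured as a small accumulator that components report into while the page renders. The sketch below leaves React out entirely; createLinkCollectorSketch and its method names are hypothetical and only illustrate the collect-then-read pattern, not the actual BrokenLinksProvider API.

```typescript
interface CollectedLinksSketch {
  links: string[];
  anchors: string[];
}

// Hypothetical collector: one instance per rendered page.
function createLinkCollectorSketch() {
  const links = new Set<string>();
  const anchors = new Set<string>();
  return {
    // Called for every link target encountered during rendering.
    collectLink(link: string | undefined): void {
      if (link) links.add(link);
    },
    // Called for every anchor id the page exposes.
    collectAnchor(anchor: string | undefined): void {
      if (anchor) anchors.add(anchor);
    },
    // Read once rendering finishes; sets are flattened to arrays, deduplicated.
    getCollected(): CollectedLinksSketch {
      return { links: Array.from(links), anchors: Array.from(anchors) };
    },
  };
}
```

Once every page has been rendered, a post-build step can compare each page's collected links against the set of known routes and anchors to report anything that points nowhere.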
What's Next
You now have the map. You understand the monorepo structure, the server/client split, the CLI command dispatch, the loadSite() pipeline, the .docusaurus/ bridge, and the client component tree. In the next article, we'll zoom into the heart of the server pipeline: the 4-phase plugin lifecycle that transforms content on disk into React routes in the browser.