Content Plugins and the MDX Pipeline: From Markdown to React

Advanced

Prerequisites

  • Understanding of the plugin lifecycle and actions system (Article 2)
  • Understanding of the build pipeline and webpack loaders (Article 3)
  • Basic knowledge of MDX, remark, and rehype ecosystems

The plugin lifecycle creates routes. The build pipeline compiles bundles. But how does a .md file on disk become a React component that renders in the browser? That's the job of the MDX pipeline — a sophisticated chain of remark and rehype plugins orchestrated by a webpack loader, and three content plugins that read files and map them to routes.

This article traces the complete content processing pipeline: from raw Markdown file to rendered page. We'll start with the MDX loader, walk through every remark plugin in order, then examine the docs plugin as a case study of how content plugins coordinate file reading, versioning, sidebar generation, and route creation.

The MDX webpack Loader

The entry point for Markdown processing is the webpack loader at packages/docusaurus-mdx-loader/src/loader.ts. When webpack encounters a .md or .mdx file, this loader:

  1. Parses front matter using the configurable parseFrontMatter function (lines 44-48)
  2. Compiles MDX to JSX via compileToJSX(), which runs the full remark/rehype chain
  3. Extracts the content title from the first heading
  4. Detects MDX partials — files prefixed with _ are treated as partials and warned if they contain front matter
  5. Emits asset references: images and links referenced in Markdown are transformed into webpack require() calls

flowchart TD
    MD["document.md"] --> LOADER["MDX Loader"]
    LOADER --> FM["Parse front matter"]
    FM --> COMPILE["compileToJSX()"]
    COMPILE --> REMARK["Remark plugins"]
    REMARK --> REHYPE["Rehype plugins"]
    REHYPE --> JSX["JSX output"]
    JSX --> ASSETS["Asset require() emission"]
    ASSETS --> MODULE["Webpack module"]
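
Front matter parsing and partial detection (steps 1 and 4) can be sketched in isolation. This is a simplified illustration rather than the loader's actual code: splitFrontMatter, isMdxPartial and checkPartial are hypothetical helpers, and the parser only understands flat key: value pairs, whereas the real loader delegates to the configurable parseFrontMatter function.

```typescript
// Simplified sketch, not the actual loader code.
// splitFrontMatter, isMdxPartial and checkPartial are hypothetical helpers.

interface ParsedFile {
  frontMatter: Record<string, string>;
  content: string;
}

// Splits a leading `---` block from the Markdown body.
// Only flat `key: value` pairs are understood here.
function splitFrontMatter(source: string): ParsedFile {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(source);
  if (match === null) {
    return {frontMatter: {}, content: source};
  }
  const [whole = '', block = ''] = match;
  const frontMatter: Record<string, string> = {};
  for (const line of block.split(/\r?\n/)) {
    const separator = line.indexOf(':');
    if (separator > 0) {
      frontMatter[line.slice(0, separator).trim()] = line.slice(separator + 1).trim();
    }
  }
  return {frontMatter, content: source.slice(whole.length)};
}

// A file counts as a partial when its file name starts with `_`.
function isMdxPartial(filePath: string): boolean {
  const fileName = filePath.split(/[\\/]/).pop() ?? '';
  return fileName.startsWith('_');
}

// Returns a warning message when a partial carries front matter.
function checkPartial(filePath: string, source: string): string | undefined {
  const {frontMatter} = splitFrontMatter(source);
  if (isMdxPartial(filePath) && Object.keys(frontMatter).length > 0) {
    return `MDX partial should not contain front matter: ${filePath}`;
  }
  return undefined;
}
```

Calling checkPartial('docs/_shared.mdx', source) yields a message only when both conditions hold, which mirrors the "warn, don't fail" behavior described in step 4.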

The loader distinguishes between two webpack compiler names (client and server) and can cache compilation results across them when mdxCrossCompilerCache is enabled. This is a significant optimization, because otherwise the same MDX file is compiled twice during a build: once for the client bundle and once for the server bundle used for static site generation (SSG).
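
The idea behind the cross-compiler cache can be illustrated with a small sketch. CrossCompilerCache and its compile callback are hypothetical names, the cache key (file path plus source text) is an assumption, and the sketch is synchronous for brevity where real MDX compilation is asynchronous.

```typescript
// Hypothetical sketch of sharing one compilation between two webpack compilers.
// The key format and class name are assumptions, not the Docusaurus implementation.

type CompilerName = 'client' | 'server';

interface CompileResult {
  code: string;
  compiledBy: CompilerName; // which compiler triggered the actual compilation
}

class CrossCompilerCache {
  private readonly entries = new Map<string, CompileResult>();
  private readonly compile: (source: string) => string;
  compileCount = 0;

  constructor(compile: (source: string) => string) {
    this.compile = compile;
  }

  get(filePath: string, source: string, compiler: CompilerName): CompileResult {
    // Including the source in the key means an edited file is recompiled.
    const key = `${filePath}\u0000${source}`;
    const cached = this.entries.get(key);
    if (cached !== undefined) {
      return cached; // the other compiler already did the work
    }
    this.compileCount += 1;
    const result: CompileResult = {code: this.compile(source), compiledBy: compiler};
    this.entries.set(key, result);
    return result;
  }
}
```

With this shape, the second compiler to request an unchanged file receives the stored result and compileCount stays at one.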

The Remark/Rehype Plugin Chain

The processor factory at processor.ts assembles the full transformation chain. The order matters — each plugin transforms the AST for the next one.

The remark plugins execute in this order:

flowchart TD
    A["beforeDefaultRemarkPlugins (user)"] --> B["remark-frontmatter"]
    B --> C["remark-directive"]
    C --> D["contentTitle (extract h1)"]
    D --> E["admonitions (:::note, :::tip)"]
    E --> F["headings (extract + generate IDs)"]
    F --> G["remark-emoji"]
    G --> H["toc (table of contents)"]
    H --> I["details (HTML details/summary)"]
    I --> J["head (HTML head injection)"]
    J --> K["mermaid (if enabled)"]
    K --> L["transformImage (require assets)"]
    L --> M["resolveMarkdownLinks"]
    M --> N["transformLinks (require assets)"]
    N --> O["remark-gfm"]
    O --> P["remark-comment (if mdx1Compat)"]
    P --> Q["remarkPlugins (user)"]
    Q --> R["unusedDirectivesWarning"]
    R --> S["codeCompatPlugin (MDX1 code blocks)"]
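
The ordering in the diagram can be expressed as a plain list. The sketch below uses plugin names as strings instead of real plugin functions, and buildRemarkChain and ChainOptions are hypothetical names; it only illustrates where user plugins and conditional plugins land.

```typescript
// Sketch of the remark ordering shown in the diagram, using names only.
// buildRemarkChain and ChainOptions are hypothetical.

interface ChainOptions {
  beforeDefaultRemarkPlugins: string[];
  remarkPlugins: string[];
  mermaid: boolean;
  mdx1Compat: boolean;
}

function buildRemarkChain(options: ChainOptions): string[] {
  return [
    ...options.beforeDefaultRemarkPlugins, // user plugins that must see the raw AST
    'remark-frontmatter',
    'remark-directive',
    'contentTitle',
    'admonitions',
    'headings',
    'remark-emoji',
    'toc',
    'details',
    'head',
    ...(options.mermaid ? ['mermaid'] : []),
    'transformImage',
    'resolveMarkdownLinks',
    'transformLinks',
    'remark-gfm',
    ...(options.mdx1Compat ? ['remark-comment'] : []),
    ...options.remarkPlugins, // regular user plugins
    'unusedDirectivesWarning',
    'codeCompatPlugin', // always last
  ];
}
```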

Each plugin serves a specific purpose. Here are the most interesting ones:

headings (remark/headings) extracts all headings and generates stable anchor IDs. The anchorsMaintainCase option controls whether IDs are lowercased.
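
A minimal sketch of stable ID generation follows. createSlugger is a hypothetical helper, the character filter is ASCII-only, and the numeric suffix used for duplicate headings is an assumption about the deduplication scheme rather than a description of the plugin's exact algorithm.

```typescript
// Hypothetical slugger: ASCII-only, with numeric suffixes for duplicates.

function createSlugger(maintainCase: boolean): (text: string) => string {
  const seen = new Map<string, number>();
  return (text: string): string => {
    const cased = maintainCase ? text : text.toLowerCase();
    const base = cased
      .trim()
      .replace(/[^A-Za-z0-9\s-]/g, '') // drop punctuation
      .replace(/\s+/g, '-'); // spaces become hyphens
    const count = seen.get(base) ?? 0;
    seen.set(base, count + 1);
    // Repeated headings on the same page get a suffix so anchors stay unique.
    return count === 0 ? base : `${base}-${count}`;
  };
}
```

One slugger instance per document keeps the duplicate counter scoped to that page.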

toc (remark/toc) builds a table of contents data structure from headings and injects it as toc metadata accessible to theme components.

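transformImage and transformLinks convert relative asset paths such as ![](./img.png) and [download](./file.pdf) into webpack require() calls, enabling the bundler to process and fingerprint these assets. Links to other Markdown files, such as [link](./doc.md), are handled by resolveMarkdownLinks instead, which runs between the two.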
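
The image case can be sketched as a text rewrite. The real plugin operates on the Markdown AST and its generated require() call carries additional loader details, so the output format below is a simplified assumption and toRequireImage is a hypothetical helper.

```typescript
// String-level illustration only; the real plugin works on AST nodes.
// toRequireImage is a hypothetical helper.

function toRequireImage(markdown: string): string {
  // Only relative paths (./ or ../) are rewritten; absolute URLs are left alone.
  return markdown.replace(
    /!\[([^\]]*)\]\((\.{1,2}\/[^)\s]+)\)/g,
    (_match: string, alt: string, path: string) =>
      `<img alt=${JSON.stringify(alt)} src={require(${JSON.stringify(path)}).default} />`,
  );
}
```

Because the path ends up inside require(), webpack sees the image as a module dependency and can hash and emit it like any other asset.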

admonitions (remark/admonitions) transforms :::note / :::tip / :::warning directive blocks into <Admonition> JSX components.
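
As a rough illustration of the transformation, the sketch below rewrites directive blocks line by line. The real plugin consumes directive nodes produced by remark-directive rather than raw text, the keyword list contains only the three types named above, and transformAdmonitions is a hypothetical helper that ignores titles and nesting.

```typescript
// Line-based illustration; the real plugin works on directive AST nodes.
// transformAdmonitions is hypothetical and ignores titles and nesting.

const ADMONITION_TYPES = new Set<string>(['note', 'tip', 'warning']);

function transformAdmonitions(markdown: string): string {
  const output: string[] = [];
  let open = false;
  for (const line of markdown.split('\n')) {
    const keyword = /^:::(\w+)\s*$/.exec(line)?.[1];
    if (!open && keyword !== undefined && ADMONITION_TYPES.has(keyword)) {
      output.push(`<Admonition type="${keyword}">`);
      open = true;
    } else if (open && line.trim() === ':::') {
      output.push('</Admonition>');
      open = false;
    } else {
      output.push(line); // everything else passes through unchanged
    }
  }
  return output.join('\n');
}
```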

unusedDirectivesWarning catches directive-like syntax (e.g., :::something) that wasn't handled by any plugin, warning users about potential typos.
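
The check can be approximated with a scan for directive openers whose name no plugin claimed. findUnusedDirectives and the message format are hypothetical; the real plugin inspects leftover directive nodes in the AST after all other plugins have run.

```typescript
// Hypothetical text-based approximation of the unused-directive check.

function findUnusedDirectives(markdown: string, handled: string[]): string[] {
  const warnings: string[] = [];
  markdown.split('\n').forEach((line, index) => {
    const name = /^:::(\w+)/.exec(line)?.[1];
    if (name !== undefined && handled.indexOf(name) === -1) {
      warnings.push(`Line ${index + 1}: unused directive ":::${name}"`);
    }
  });
  return warnings;
}
```

A misspelled :::noet would be reported here, while the closing ::: lines are skipped because they carry no name.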

codeCompatPlugin handles MDX v1 code block compatibility — it's always applied last so it runs after user plugins like npm2yarn.

After remark plugins, rehype plugins run. For md format (as opposed to mdx), rehype-raw is prepended to handle raw HTML blocks, with passthrough for MDX expression nodes.

Tip: You can add custom remark/rehype plugins in your docusaurus.config.js via the content plugin options (e.g., docs: {remarkPlugins: [...]}). Your plugins are inserted between the default remark plugins and the unusedDirectivesWarning plugin, giving you access to a well-structured AST.
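
A custom remark plugin is a function that returns a tree transformer. The sketch below is self-contained: MdastNode is a reduced stand-in for the real mdast types, remarkCopyright is a hypothetical plugin, and the config object shows only the relevant slice of a classic-preset configuration.

```typescript
// Hypothetical remark plugin: replaces "(c)" with the copyright sign in text nodes.
// MdastNode is a reduced stand-in for the real mdast node types.

interface MdastNode {
  type: string;
  value?: string;
  children?: MdastNode[];
}

function remarkCopyright(): (tree: MdastNode) => void {
  const visit = (node: MdastNode): void => {
    // Only plain text is touched; code nodes have a different type and are skipped.
    if (node.type === 'text' && node.value !== undefined) {
      node.value = node.value.replace(/\(c\)/gi, '©');
    }
    (node.children ?? []).forEach(visit);
  };
  return (tree: MdastNode): void => visit(tree);
}

// Relevant slice of docusaurus.config.js, assuming the classic preset.
const config = {
  presets: [['classic', {docs: {remarkPlugins: [remarkCopyright]}}]],
};
```

Since user plugins run after the defaults, a plugin like this sees headings that already have IDs and images that have already been rewritten.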

The Docs Plugin: Content Loading and Versioning

The docs plugin at plugin-content-docs/src/index.ts#L82-L108 is the most complex content plugin. As soon as it is invoked, the plugin function resolves the sidebar path and reads version metadata:

const versionsMetadata = await readVersionsMetadata({context, options});

The versioning system supports multiple documentation versions simultaneously. Version metadata includes the version name, content path, sidebar path, and URL prefix. The "current" version points to docs/ by default, while released versions are stored in versioned_docs/version-<X>/ and tracked in versions.json.
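
The mapping from versions.json to per-version metadata can be sketched as follows. buildVersionsMetadata and the VersionMetadata shape are hypothetical; the sidebar file names and the URL prefix rules (first listed version at /docs, current version at /docs/next) are assumptions about default behavior, not values taken from readVersionsMetadata().

```typescript
// Hypothetical sketch; the URL prefix rules and sidebar file names are assumptions.

interface VersionMetadata {
  versionName: string;
  contentPath: string;
  sidebarPath: string;
  urlPrefix: string;
}

// `versionNames` stands for the parsed contents of versions.json, newest first.
function buildVersionsMetadata(versionNames: string[]): VersionMetadata[] {
  const current: VersionMetadata = {
    versionName: 'current',
    contentPath: 'docs',
    sidebarPath: 'sidebars.js',
    urlPrefix: versionNames.length > 0 ? '/docs/next' : '/docs',
  };
  const released = versionNames.map(
    (name, index): VersionMetadata => ({
      versionName: name,
      contentPath: `versioned_docs/version-${name}`,
      sidebarPath: `versioned_sidebars/version-${name}-sidebars.json`,
      urlPrefix: index === 0 ? '/docs' : `/docs/${name}`,
    }),
  );
  return [current, ...released];
}
```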

flowchart TD
    VM["readVersionsMetadata()"] --> CURRENT["Current version<br/>docs/"]
    VM --> V2["Version 2.0<br/>versioned_docs/version-2.0/"]
    VM --> V1["Version 1.0<br/>versioned_docs/version-1.0/"]
    
    CURRENT --> LOAD["loadVersion()"]
    V2 --> LOAD
    V1 --> LOAD
    LOAD --> DOCS["Per-version docs array"]
    LOAD --> SIDEBARS["Per-version sidebars"]

During loadContent(), the plugin calls loadVersion() for each version, which scans the content directory for markdown files, parses their front matter, extracts metadata (title, description, tags, sidebar position), and builds the complete docs data structure. Each version gets its own sidebar configuration and independent content path.

The plugin supports multiple instances via the id option — you can have docs for your main documentation and a second instance like community with different content paths. Each instance gets its own versioning, sidebars, and routes.
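
A second instance is declared by adding the plugin again with a distinct id. In this sketch the option names (id, path, routeBasePath, sidebarPath) are docs plugin options, while the directory and sidebar file names are illustrative assumptions.

```typescript
// Sketch of a second docs instance; directory and file names are illustrative.

interface DocsInstanceOptions {
  id: string;
  path: string; // content directory for this instance
  routeBasePath: string; // URL prefix for this instance
  sidebarPath: string;
}

type PluginEntry = [string, DocsInstanceOptions];

const plugins: PluginEntry[] = [
  [
    '@docusaurus/plugin-content-docs',
    {
      id: 'community',
      path: 'community',
      routeBasePath: 'community',
      sidebarPath: './sidebarsCommunity.js',
    },
  ],
];
```

The main docs instance usually comes from the preset and keeps the implicit default id, which is why only the additional instance needs an explicit one.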

The sidebar system in sidebars/index.ts supports two modes (a third setting, sidebarPath: false, disables sidebars entirely):

Autogenerated sidebars: the default. When sidebarPath is undefined, Docusaurus generates sidebars from the filesystem structure. The DefaultSidebars config at lines 22-29 creates a single sidebar with type: 'autogenerated' pointing to the root directory.

Autogeneration respects _category_.json or _category_.yml files for customizing category labels, positions, and descriptions. The readCategoriesMetadata() function at lines 43-68 scans for these files using glob patterns.
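
How category metadata might influence the generated sidebar can be sketched like this. readCategory and sortCategories are hypothetical helpers, and the fallback rules (directory name as label, unpositioned categories last, ties broken alphabetically) are assumptions rather than the behavior of readCategoriesMetadata().

```typescript
// Hypothetical helpers; the fallback and ordering rules are assumptions.

interface CategoryMetadata {
  label?: string;
  position?: number;
  description?: string;
}

interface Category {
  dirName: string;
  label: string;
  position: number;
  description: string;
}

// `jsonText` is the content of a _category_.json file, if one exists.
function readCategory(dirName: string, jsonText?: string): Category {
  const metadata: CategoryMetadata = jsonText ? JSON.parse(jsonText) : {};
  return {
    dirName,
    label: metadata.label ?? dirName,
    position: metadata.position ?? Number.POSITIVE_INFINITY,
    description: metadata.description ?? '',
  };
}

function sortCategories(categories: Category[]): Category[] {
  // Infinity - Infinity is NaN (falsy), so unpositioned items fall back to label order.
  return [...categories].sort(
    (a, b) => a.position - b.position || a.label.localeCompare(b.label),
  );
}
```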

Manual sidebars: a sidebars.js file that exports the sidebar configuration. The file is loaded via loadFreshModule() to support TypeScript and ESM formats.

Both modes go through the same processing pipeline: normalization → processing → post-processing. The processing step resolves autogenerated items into actual doc references by scanning the filesystem, while post-processing validates link targets and applies ordering.

graph TD
    INPUT["sidebarPath option"] -->|undefined| AUTO["DefaultSidebars<br/>autogenerated from filesystem"]
    INPUT -->|false| DISABLED["DisabledSidebars<br/>no sidebar"]
    INPUT -->|path| MANUAL["Load sidebars.js"]
    AUTO --> NORM["normalizeSidebars()"]
    MANUAL --> NORM
    NORM --> PROC["processSidebars()"]
    PROC --> POST["postProcessSidebars()"]
    POST --> FINAL["Final sidebar data"]
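
The branching on sidebarPath and the expansion of autogenerated items can be sketched together. The types are reduced stand-ins, the defaultSidebar key is an assumption, and this processSidebars expands an autogenerated item to every known doc ID, whereas the real step scans the directory named by dirName.

```typescript
// Reduced sketch of sidebar loading and processing; names and shapes are simplified.

type SidebarItem =
  | {type: 'doc'; id: string}
  | {type: 'autogenerated'; dirName: string};
type Sidebars = Record<string, SidebarItem[]>;

const DefaultSidebars: Sidebars = {
  defaultSidebar: [{type: 'autogenerated', dirName: '.'}],
};
const DisabledSidebars: Sidebars = {};

function loadSidebars(
  sidebarPath: string | false | undefined,
  loadFile: (path: string) => Sidebars, // stands in for loading sidebars.js
): Sidebars {
  if (sidebarPath === false) return DisabledSidebars;
  if (sidebarPath === undefined) return DefaultSidebars;
  return loadFile(sidebarPath);
}

// Processing step: autogenerated placeholders become concrete doc references.
function processSidebars(sidebars: Sidebars, docIds: string[]): Sidebars {
  const result: Sidebars = {};
  for (const name of Object.keys(sidebars)) {
    const expanded: SidebarItem[] = [];
    for (const item of sidebars[name] ?? []) {
      if (item.type === 'autogenerated') {
        for (const id of docIds) expanded.push({type: 'doc', id});
      } else {
        expanded.push(item);
      }
    }
    result[name] = expanded;
  }
  return result;
}
```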

Route Creation: Connecting Content to Theme Components

The most illuminating part of any content plugin is its contentLoaded() implementation. The docs plugin's route creation in routes.ts shows exactly how content becomes routes.

For each doc, the plugin:

  1. Calls actions.createData() to write the doc metadata as a JSON module
  2. Creates a route with component: options.docItemComponent (which resolves to @theme/DocItem)
  3. Sets modules: {content: doc.source} — this tells the route to load the MDX file as a module
  4. Attaches sidebar as a route attribute so parent components can render the correct sidebar

sequenceDiagram
    participant DOC as docs plugin
    participant ACT as actions
    participant GEN as .docusaurus/
    participant THEME as @theme/DocItem

    DOC->>ACT: createData('abc123.json', docMetadata)
    ACT->>GEN: Write JSON to .docusaurus/docs/default/abc123.json
    ACT-->>DOC: modulePath
    DOC->>ACT: addRoute({path, component: '@theme/DocItem', modules: {content: doc.source}})
    Note over DOC,THEME: At runtime, DocItem receives doc content and metadata as props

The route structure is hierarchical: each version creates a parent route with component: docRootComponent (resolving to @theme/DocRoot), and individual doc pages are nested subroutes. This design means the sidebar component renders at the DocRoot level and persists across doc navigation without unmounting.
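
The per-doc steps and the parent/child nesting can be sketched with reduced types. RouteConfig and Actions here are simplified stand-ins for the real Docusaurus types, createVersionRoute is a hypothetical helper, and createData is synchronous in this sketch although the real action returns a promise.

```typescript
// Reduced sketch of docs route creation; types are simplified stand-ins.

interface RouteConfig {
  path: string;
  component: string; // a theme alias string, not an import
  exact?: boolean;
  modules?: Record<string, string>;
  routes?: RouteConfig[];
  sidebar?: string;
}

interface Actions {
  createData(name: string, data: string): string; // async in the real API
  addRoute(route: RouteConfig): void;
}

interface Doc {
  id: string;
  permalink: string;
  source: string;
  sidebar?: string;
}

function createVersionRoute(versionPath: string, docs: Doc[], actions: Actions): void {
  const docRoutes = docs.map((doc): RouteConfig => {
    // Step 1: persist the doc metadata as a JSON module.
    actions.createData(`${doc.id}.json`, JSON.stringify(doc));
    // Steps 2 and 3: the page component plus the MDX file as its content module.
    const route: RouteConfig = {
      path: doc.permalink,
      component: '@theme/DocItem',
      exact: true,
      modules: {content: doc.source},
    };
    // Step 4: expose the sidebar name as a route attribute.
    if (doc.sidebar !== undefined) route.sidebar = doc.sidebar;
    return route;
  });
  // One parent route per version; doc pages are nested below it.
  actions.addRoute({path: versionPath, component: '@theme/DocRoot', routes: docRoutes});
}
```

Because every doc page is a child of the same DocRoot route, navigating between docs swaps only the nested component and leaves the sidebar mounted.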

Tip: The component field in addRoute() is always a string like '@theme/DocItem' — not an actual import. The theme alias system (covered in Article 5) resolves this string to the actual React component during webpack compilation.

The Blog and Pages Plugins

While the docs plugin is the most complex, the blog and pages plugins follow the same pattern:

plugin-content-blog scans blog posts (markdown files with date conventions), generates routes for individual posts, the blog listing, tag pages, and archive pages. It uses @theme/BlogPostPage, @theme/BlogListPage, etc.

plugin-content-pages is the simplest — it scans src/pages/ for React components and MDX files, creating routes with @theme/MDXPage for MDX content. There's no versioning, no sidebars — just straightforward file-to-route mapping.

All three plugins share common patterns: they declare getPathsToWatch() to tell the dev server which files trigger rebuilds, configureWebpack() to register their MDX loader rules for their specific content directories, and getTranslationFiles() to support i18n.
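
The shared shape can be illustrated with a minimal, hypothetical content plugin. ContentPluginSketch is a reduced stand-in for the real plugin type, createNotesPlugin and its translation keys are invented for illustration, and the loader is referenced by package name where a real plugin would resolve it to a file path.

```typescript
// Hypothetical content plugin skeleton; types are reduced stand-ins.

interface WebpackRule {
  test: RegExp;
  include: string[];
  use: {loader: string; options: Record<string, unknown>}[];
}

interface TranslationFile {
  path: string;
  content: Record<string, {message: string; description?: string}>;
}

interface ContentPluginSketch {
  name: string;
  getPathsToWatch(): string[];
  configureWebpack(): {module: {rules: WebpackRule[]}};
  getTranslationFiles(): TranslationFile[];
}

function createNotesPlugin(contentDir: string, remarkPlugins: unknown[] = []): ContentPluginSketch {
  return {
    name: 'hypothetical-notes-plugin',
    // Changes to these files trigger a reload in the dev server.
    getPathsToWatch: () => [`${contentDir}/**/*.{md,mdx}`],
    // One MDX rule scoped to this plugin's content directory.
    configureWebpack: () => ({
      module: {
        rules: [
          {
            test: /\.mdx?$/i,
            include: [contentDir],
            use: [{loader: '@docusaurus/mdx-loader', options: {remarkPlugins}}],
          },
        ],
      },
    }),
    // Strings that translators can override per locale.
    getTranslationFiles: () => [
      {path: 'notes', content: {'notes.title': {message: 'Notes'}}},
    ],
  };
}
```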

The MDX loader rules are the key connection between content plugins and the MDX pipeline. Each plugin registers a webpack rule matching .md/.mdx files in its content directories with the MDX loader and its specific configuration (remark plugins, admonition settings, etc.). This is why the MDX fallback plugin we saw in Article 2 needs to exclude these paths — it handles everything else.
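
The division of labor between plugin rules and the fallback can be sketched as a path check. pickMdxRule and isInside are hypothetical helpers that model the include/exclude logic with simple prefix matching; webpack itself evaluates the actual rule conditions.

```typescript
// Hypothetical model of which rule handles a file; webpack does the real matching.

function isInside(filePath: string, dir: string): boolean {
  // The trailing slash prevents /site/docs from matching /site/docs-old.
  const normalizedDir = dir.endsWith('/') ? dir : `${dir}/`;
  return filePath.startsWith(normalizedDir);
}

function pickMdxRule(
  filePath: string,
  contentDirs: string[],
): 'content-plugin' | 'fallback' | 'none' {
  if (!/\.mdx?$/i.test(filePath)) return 'none';
  // Content plugins own their directories; the fallback takes every other MDX file.
  return contentDirs.some((dir) => isInside(filePath, dir)) ? 'content-plugin' : 'fallback';
}
```

Keeping the two sets disjoint matters because a file matched by both rules would be run through the MDX loader twice.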

What's Next

We've traced content from markdown files through the MDX pipeline to route creation. But the route refers to theme components like @theme/DocItem — how does that string become an actual React component? In the next article, we'll explore the theme system's alias resolution, the 100+ components in theme-classic, and the swizzle mechanism that makes Docusaurus uniquely customizable.