Read OSS

Inside the Svelte Compiler: From .svelte Source to JavaScript Output

Advanced

Prerequisites

  • Article 1: Architecture and Codebase Map
  • Understanding of Abstract Syntax Trees (ASTs) and ESTree format
  • Familiarity with the visitor pattern for tree traversal
  • Basic knowledge of compiler concepts (parsing, analysis, code generation)

The Svelte compiler is the foundation that makes everything else possible. It takes a .svelte file — a mix of HTML, CSS, and JavaScript — and produces an optimized JavaScript module that imports from svelte/internal/client (or svelte/internal/server). The compiler runs in three phases: parse, analyze, transform. Each phase produces a progressively richer representation of the component. Let's trace the full journey.

The compile() Orchestrator

The top-level compile() function is the entry point. It accepts source code and options, then coordinates all three phases:

export function compile(source, options) {
    source = remove_bom(source);
    state.reset({ warning: options.warningFilter, filename: options.filename });
    const validated = validate_component_options(options, '');
    let parsed = _parse(source);
    // Abridged: the real function also strips TypeScript nodes at this point.
    // Simplified merge: options declared in <svelte:options> override the validated ones.
    const combined_options = { ...validated, ...parsed.options };
    const analysis = analyze_component(parsed, source, combined_options);
    const result = transform_component(analysis, source, combined_options);
    result.ast = to_public_ast(source, parsed, options.modernAst);
    return result;
}

Notice the state.reset() call at the top. It clears the compiler's module-level state before each compilation; the section on module-level state near the end of this article explains why that matters.

There's a sibling function, compileModule(), for .svelte.js files, which are modules that use runes but aren't components. It follows the same pattern: analyze_module() followed by transform_module().

flowchart TD
    Source[".svelte source"] --> Parse["Phase 1: Parse<br/>→ AST (Root node)"]
    Parse --> TS{"TypeScript?"}
    TS -->|yes| Strip["remove_typescript_nodes()"]
    TS -->|no| Analyze
    Strip --> Analyze["Phase 2: Analyze<br/>→ ComponentAnalysis"]
    Analyze --> Transform["Phase 3: Transform<br/>→ ESTree Program"]
    Transform --> Print["esrap.print()<br/>→ JavaScript + source map"]
    Print --> Result["CompileResult<br/>{js, css, warnings, ast}"]

Phase 1 — Parse: Source to AST

The parser is hand-written — no PEG generator, no off-the-shelf tool. It's a state machine that consumes the template character by character.

The Parser class manages parsing state: an index cursor, a stack of open nodes, and a fragments stack for building the tree. The constructor drives the state machine:

let state = fragment;

while (this.index < this.template.length) {
    state = state(this) || fragment;
}

Each state is a function that examines the current position and either returns a new state or void (falling back to fragment). The fragment state is the decision hub — it dispatches to element (when it sees <), tag (when it sees {), or text:

export default function fragment(parser) {
    if (parser.match('<')) return element;
    if (parser.match('{')) return tag;
    return text;
}

stateDiagram-v2
    [*] --> fragment
    fragment --> element: "<"
    fragment --> tag: "{"
    fragment --> text: other
    element --> fragment: element closed
    tag --> fragment: tag closed
    text --> fragment: done
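
To make the driver loop concrete, here is a minimal sketch of the same pattern on a toy grammar. ToyParser, read_through() and the simplified element, tag and text states are hypothetical names written for this illustration; they do not handle attributes, nesting, or errors the way Svelte's parser does.

```javascript
// Toy state-machine parser. All names here are hypothetical, not Svelte's code.
class ToyParser {
    constructor(template) {
        this.template = template;
        this.index = 0;
        this.nodes = [];

        // Same driver loop as above: a state returns the next state,
        // or nothing to hand control back to `fragment`.
        let state = fragment;
        while (this.index < this.template.length) {
            state = state(this) || fragment;
        }
    }

    // True if the input at the cursor starts with `str`.
    match(str) {
        return this.template.startsWith(str, this.index);
    }

    // Consume up to and including `closer`, or to the end of the input.
    read_through(closer) {
        const found = this.template.indexOf(closer, this.index);
        const end = found === -1 ? this.template.length : found + closer.length;
        const raw = this.template.slice(this.index, end);
        this.index = end;
        return raw;
    }
}

// Decision hub: inspects the cursor without consuming anything.
function fragment(parser) {
    if (parser.match('<')) return element;
    if (parser.match('{')) return tag;
    return text;
}

function element(parser) {
    parser.nodes.push({ type: 'Element', raw: parser.read_through('>') });
}

function tag(parser) {
    parser.nodes.push({ type: 'ExpressionTag', raw: parser.read_through('}') });
}

function text(parser) {
    const start = parser.index;
    while (parser.index < parser.template.length && !parser.match('<') && !parser.match('{')) {
        parser.index += 1;
    }
    parser.nodes.push({ type: 'Text', raw: parser.template.slice(start, parser.index) });
}

const parsed_nodes = new ToyParser('<p>Hi {name}</p>').nodes;
console.log(parsed_nodes.map((node) => node.type));
```

Every consuming state advances the cursor by at least one character, which is what guarantees the while loop terminates.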

The parser produces an AST.Root node with three important children: fragment (the template markup), plus instance and module script blocks (parsed using Acorn for the JavaScript). If the component uses TypeScript, a remove_typescript_nodes() pass strips type annotations from the JS AST before analysis.

Tip: The parser includes a loose mode (parse(source, true)) that tries to produce an AST even from invalid input — useful for editor tooling that needs partial parsing while the user is typing.

Phase 2 — Analyze: Scoping, Validation, and Metadata

Analysis is where the compiler understands what the component means. The analyze_component() function builds scope hierarchies, resolves variable bindings, detects runes, and validates the component structure.

The first step is scope creation. The create_scopes() function from scope.js walks the AST and builds a ScopeRoot / Scope hierarchy. Every block ({#if}, {#each}, function bodies, etc.) creates a child scope. Bindings track where variables are declared and what kind they are — state, derived, prop, bindable_prop, normal, legacy_reactive, and more.
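
The parent-chain lookup behind a scope hierarchy can be sketched in a few lines. ScopeSketch and its methods are hypothetical and far simpler than the Scope class in scope.js; the sketch only shows how a child scope shadows or inherits a binding.

```javascript
// Minimal scope chain. ScopeSketch is a hypothetical stand-in, not Svelte's Scope.
class ScopeSketch {
    constructor(parent = null) {
        this.parent = parent;
        this.declarations = new Map();
    }

    // Record a binding and the kind it was declared with.
    declare(name, kind) {
        const binding = { name, kind, scope: this };
        this.declarations.set(name, binding);
        return binding;
    }

    // Look in this scope first, then walk up through the parents.
    get(name) {
        return this.declarations.get(name) ?? this.parent?.get(name) ?? null;
    }

    child() {
        return new ScopeSketch(this);
    }
}

const instance_scope = new ScopeSketch();
instance_scope.declare('count', 'state');
instance_scope.declare('label', 'normal');

// A block such as {#if} would get its own child scope.
const block_scope = instance_scope.child();
block_scope.declare('label', 'derived');

console.log(block_scope.get('count').kind); // inherited from the parent scope
console.log(block_scope.get('label').kind); // shadowed in the child scope
```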

Rune detection happens via get_rune(), which checks whether a call expression is one of the known runes ($state, $derived, $effect, $props, etc.). This is what allows the compiler to distinguish $state(0) from a regular function call named $state.
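
As a rough illustration of that check, the sketch below recognises a rune call on an ESTree-shaped node and rejects it when the name has been declared by the user. It is an assumption-laden simplification: get_rune_sketch takes a plain Set of declared names instead of a scope, and KNOWN_RUNES lists only a few runes.

```javascript
// Hypothetical sketch of rune detection. The real get_rune() works against scopes.
const KNOWN_RUNES = new Set(['$state', '$state.raw', '$derived', '$effect', '$props']);

// Turn `$state` or `$state.raw` callees into a dotted name, or null if unsupported.
function callee_name(callee) {
    if (callee.type === 'Identifier') return callee.name;
    if (
        callee.type === 'MemberExpression' &&
        !callee.computed &&
        callee.object.type === 'Identifier' &&
        callee.property.type === 'Identifier'
    ) {
        return `${callee.object.name}.${callee.property.name}`;
    }
    return null;
}

function get_rune_sketch(node, declared_names) {
    if (!node || node.type !== 'CallExpression') return null;
    const name = callee_name(node.callee);
    if (name === null || !KNOWN_RUNES.has(name)) return null;
    // A user-declared `$state` shadows the rune, so the call is an ordinary call.
    const root = name.split('.')[0];
    if (declared_names.has(root)) return null;
    return name;
}

const state_call = {
    type: 'CallExpression',
    callee: { type: 'Identifier', name: '$state' },
    arguments: [{ type: 'Literal', value: 0 }]
};

console.log(get_rune_sketch(state_call, new Set()));           // treated as a rune
console.log(get_rune_sketch(state_call, new Set(['$state']))); // shadowed, so null
```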

Then comes the main analysis walk. The analyzer uses the zimmerframe library's walk() function with a visitors object containing around 60 visitors — one for nearly every node type:

flowchart TD
    AST["Parsed AST"] --> Scopes["create_scopes()<br/>→ ScopeRoot + Scope hierarchy"]
    Scopes --> Walk["walk(ast, state, visitors)<br/>~60 analysis visitors"]
    Walk --> CSS["analyze_css() + prune()"]
    CSS --> Result["ComponentAnalysis<br/>{runes, scope, bindings, metadata}"]

The analysis visitors perform validation (e.g., checking that $state is used correctly), gather metadata (e.g., tracking which events are used for delegation), and annotate nodes. The _ catch-all visitor at phases/2-analyze/index.js#L92-L145 handles svelte-ignore comments and scope switching for every node.
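
The interplay between the _ catch-all and the per-type visitors is easier to see in a small walker. walk_sketch below is a hypothetical re-implementation loosely modeled on the walk(node, state, visitors) call shape; it is not zimmerframe's code and omits features such as per-subtree state and the path argument.

```javascript
// Hypothetical mini-walker: the catch-all `_` runs first for every node,
// and calling next() continues to the type-specific visitor, then to children.
function walk_sketch(root, state, visitors) {
    function visit(node) {
        // Visit every child value that looks like an AST node.
        const descend = () => {
            for (const value of Object.values(node)) {
                for (const child of Array.isArray(value) ? value : [value]) {
                    if (child && typeof child.type === 'string') visit(child);
                }
            }
        };

        const specific = visitors[node.type];
        const run_specific = () =>
            specific ? specific(node, { state, next: descend }) : descend();

        if (visitors._) visitors._(node, { state, next: run_specific });
        else run_specific();
    }

    visit(root);
    return state;
}

const program = {
    type: 'Program',
    body: [
        {
            type: 'CallExpression',
            callee: { type: 'Identifier', name: '$state' },
            arguments: [{ type: 'Literal', value: 0 }]
        }
    ]
};

const analysis_state = walk_sketch(program, { visited: [], calls: [] }, {
    // Catch-all: sees every node before the specific visitor does.
    _(node, { state, next }) {
        state.visited.push(node.type);
        next();
    },
    // Specific visitor: gathers metadata, then continues into the children.
    CallExpression(node, { state, next }) {
        if (node.callee.type === 'Identifier') state.calls.push(node.callee.name);
        next();
    }
});

console.log(analysis_state.visited, analysis_state.calls);
```

If a visitor does not call next(), the walk stops descending at that node, which is how a visitor can take over the handling of a whole subtree.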

Phase 3 — Transform: AST to JavaScript

The transform phase is where the compiler generates output. The transform_component() function makes a critical fork based on the generate option:

const program =
    options.generate === 'server'
        ? server_component(analysis, options)
        : client_component(analysis, options);

For client output, client_component() performs three sequential walks over the AST — one for each logical section of the component:

  1. Module walk: processes <script module> (written <script context="module"> in Svelte 4), the top-level module scope
  2. Instance walk: processes <script> (per-component instance scope)
  3. Template walk: processes the HTML template

flowchart LR
    A["module AST"] -->|"walk 1"| M["Module output"]
    B["instance AST"] -->|"walk 2"| I["Instance output"]
    C["template AST"] -->|"walk 3"| T["Template output"]
    M --> Program["Combined ESTree Program"]
    I --> Program
    T --> Program
    Program --> esrap["esrap.print() → JS + sourcemap"]

Each walk uses the same visitors object but with different state. The instance walk gets is_instance: true and uses the instance scope. The template walk shares the instance's transform state (so that variable references are correctly rewritten) but uses the template's scope map.

The state object initialized in client_component() at line 148 is worth studying — it includes the hoisted array (starting with import * as $ from 'svelte/internal/client'), the transform record for variable rewriting, and the events set for event delegation tracking.

After all three walks, esrap.print() converts the ESTree program into JavaScript with source maps. The esrap library is Svelte's own printer, chosen for its TypeScript-aware output and precise source map generation.

Rune Compilation: $state to $.state()

Let's trace how a specific rune — $state — travels through the pipeline. When you write:

<script>
let count = $state(0);
</script>

In Phase 1 (Parse), Acorn parses this as a VariableDeclaration with a CallExpression initializer where the callee is an Identifier named $state.

In Phase 2 (Analyze), get_rune() identifies this as the $state rune. The binding for count is created with kind: 'state'.

In Phase 3 (Transform), the VariableDeclaration visitor detects the rune and generates the appropriate runtime call:

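if (rune === '$state' || rune === '$state.raw') {
    // Abridged: binding is the scope binding for the declared identifier.
    const is_state = is_state_source(binding, context.state.analysis);
    const is_proxy = should_proxy(value, context.state.scope);
    if (rune === '$state' && is_proxy) {
        value = b.call('$.proxy', value);
    }
    if (is_state) {
        value = b.call('$.state', value);
    }
}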

Two separate decisions are made here. $.proxy() is applied only when the initial value could be an object or array that needs deep reactivity; a primitive literal such as 0 is left unwrapped. $.state() is applied only when the binding needs a full reactive source, which depends on whether the variable is ever reassigned. If the variable is only mutated (e.g., obj.x = 1), the proxy alone is enough. If it's reassigned (count = 5), it needs $.state(). For the example above, assuming count is reassigned somewhere in the component, the output is $.state(0) with no proxy.

sequenceDiagram
    participant Source as .svelte source
    participant Parse as Parser
    participant Analyze as Analyzer
    participant Transform as Transformer
    participant Output as JS Output

    Source->>Parse: let count = $state(0)
    Parse->>Analyze: VariableDeclaration { init: CallExpression { callee: $state } }
    Analyze->>Analyze: get_rune() → "$state"
    Analyze->>Analyze: binding.kind = "state"
    Analyze->>Transform: ComponentAnalysis
    Transform->>Transform: is_state_source(binding)?
    Transform->>Output: let count = $.state(0)

Tip: The b module imported as #compiler/builders provides a fluent API for constructing ESTree nodes. b.call('$.state', value) creates a CallExpression node targeting $.state. This keeps the transform visitors readable despite generating complex AST structures.
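
To show what a builder call amounts to, here is a small sketch that constructs ESTree nodes by hand and applies the two wrappers. b_sketch, wrap_state_init and print_sketch are hypothetical helpers written for this example; they are not the real builders module or esrap, and the is_state and is_proxy flags are passed in rather than computed.

```javascript
// Hypothetical builders: each helper returns a plain ESTree-shaped object.
const b_sketch = {
    id: (name) => ({ type: 'Identifier', name }),
    literal: (value) => ({ type: 'Literal', value }),
    // Accepts 'name' or 'object.property' as the callee, like b.call('$.state', value).
    call(callee, ...args) {
        const [object, property] = callee.split('.');
        const target =
            property === undefined
                ? this.id(object)
                : {
                      type: 'MemberExpression',
                      object: this.id(object),
                      property: this.id(property),
                      computed: false,
                      optional: false
                  };
        return { type: 'CallExpression', callee: target, arguments: args, optional: false };
    }
};

// Apply the wrappers in the same order as the visitor: proxy first, then state.
function wrap_state_init(value, { is_state, is_proxy }) {
    if (is_proxy) value = b_sketch.call('$.proxy', value);
    if (is_state) value = b_sketch.call('$.state', value);
    return value;
}

// Tiny printer covering only the node types used in this sketch.
function print_sketch(node) {
    switch (node.type) {
        case 'Identifier':
            return node.name;
        case 'Literal':
            return JSON.stringify(node.value);
        case 'MemberExpression':
            return `${print_sketch(node.object)}.${print_sketch(node.property)}`;
        case 'CallExpression':
            return `${print_sketch(node.callee)}(${node.arguments.map(print_sketch).join(', ')})`;
        default:
            throw new Error(`unsupported node type: ${node.type}`);
    }
}

const initial = b_sketch.id('initial');
console.log(print_sketch(wrap_state_init(initial, { is_state: true, is_proxy: true })));
console.log(print_sketch(wrap_state_init(initial, { is_state: false, is_proxy: true })));
```

Working on nodes rather than strings is what lets a later pass, such as the printer, attach source map information to each generated call.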

The Template Builder and DOM Structure Generation

When the template transform encounters static HTML, it doesn't generate createElement calls. Instead, it builds up a template description using the Template class.

The Template class maintains a tree of nodes (elements, text, comments) that represent the static structure of the DOM. When finalized, this tree is serialized into either:

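  • $.from_html(content, flags): an HTML string that gets parsed once via innerHTML and then cloned via cloneNode(true) for each component instance
  • $.from_tree(structure): a programmatic tree description for when HTML parsing would be inappropriate, for example when the fragments: 'tree' compile option is set because a Content Security Policy forbids innerHTML. SVG and MathML templates otherwise go through their own $.from_svg and $.from_mathml helpers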

The Template tracks whether the content includes <script> tags (which require special handling) and whether importNode is needed (a Firefox workaround). It exposes methods like push_element(), push_text(), and pop_element() to build the tree incrementally as the template visitor walks the AST.

Dynamic content — expression tags like {count}, control flow blocks like {#if} — creates boundaries in the template. The static parts become the cloned template, and the dynamic parts become effect functions that update specific nodes after cloning.

flowchart TD
    Template["Template class"]
    Template --> Static["Static HTML string:<br/>'&lt;div&gt;&lt;span&gt;Hello&lt;/span&gt; &lt;!&gt;&lt;/div&gt;'"]
    Template --> FromHTML["$.from_html(string, flags)"]
    FromHTML --> Clone["cloneNode(true)<br/>per instance"]
    Clone --> Effects["$.template_effect(() => {<br/>  $.set_text(node, count);<br/>})"]

Module-Level State: The state.js Pattern

One design choice that permeates the compiler is the use of module-level mutable variables. The state.js module exports mutable let bindings like warnings, filename, source, dev, and runes:

export let warnings = [];
export let filename;
export let source;
export let dev;
export let runes = false;

The reset() function at line 139 zeroes everything out before each compilation:

export function reset(state) {
    dev = false;
    runes = false;
    source = '';
    filename = (state.filename ?? UNKNOWN_FILENAME).replace(/\\/g, '/');
    warnings = [];
}

Why not pass these as parameters through every function? Performance. The compiler's visitor functions are called thousands of times per compilation, and passing a context object through every call adds overhead in a hot loop. Module-level variables are effectively free to access, since they're just variable reads. The tradeoff is that two compilations can never be interleaved within one thread. Because compile() is synchronous, that never happens in practice, and build tools that want parallelism run the compiler in separate workers, each with its own copy of the module state.

The adjust() function is called after parsing and initial analysis to configure dev mode, rune mode, and rootDir-relative filenames. This two-stage initialization pattern — reset() then adjust() — exists because some state depends on information gathered during parsing (like whether TypeScript is used).
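
The reset-before-every-run pattern can be sketched in a single file. Everything below is a hypothetical miniature: compile_sketch, warn and check_line are made-up names, and the placeholder value of UNKNOWN_FILENAME_SKETCH is an assumption.

```javascript
// Module-level mutable state, shared by every function in the module.
const UNKNOWN_FILENAME_SKETCH = '(unknown)'; // assumed placeholder value
let warnings = [];
let filename;

// Called at the start of every compilation so nothing leaks from the previous one.
function reset(state) {
    filename = (state.filename ?? UNKNOWN_FILENAME_SKETCH).replace(/\\/g, '/');
    warnings = []; // a fresh array, so earlier results keep their own warnings
}

// Deep inside the "compiler", helpers read the state directly instead of
// receiving it as a parameter.
function warn(message) {
    warnings.push({ message, filename });
}

function check_line(line) {
    if (line.includes('TODO')) warn('TODO left in source');
}

function compile_sketch(source, options = {}) {
    reset(options);
    for (const line of source.split('\n')) check_line(line);
    return { warnings };
}

const first = compile_sketch('<p>TODO</p>', { filename: 'src\\App.svelte' });
const second = compile_sketch('<p>done</p>');

console.log(first.warnings, second.warnings);
```

Because reset() assigns a new array instead of emptying the old one, the result of the first compilation is unaffected by the second.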

What's Next

The compiler's output is a JavaScript module that depends heavily on svelte/internal/client. In the next article, we'll explore the reactivity engine that powers those runtime calls — the signal graph of sources, deriveds, and effects, the bitwise flag system for high-performance status tracking, and the batch scheduler that orchestrates updates.