Read OSS

Inside JavaScriptCore: The 4-Tier Execution Pipeline

Advanced

Prerequisites

  • Article 1: Navigating WebKit — Architecture Overview and Codebase Map
  • Article 2: WTF and Memory Management — RefPtr, WeakPtr, and container types
  • Basic compiler theory: AST, intermediate representation (IR), SSA form
  • Familiarity with JIT compilation concepts and speculative optimization

JavaScriptCore (JSC) is a full JavaScript engine that parses source code, compiles it to bytecode, and then progressively recompiles hot functions through increasingly aggressive optimization tiers. This multi-tier approach is how JSC balances startup latency (users want pages to load fast) against peak throughput (web apps want computation to run fast).

As we established in Part 1, JSC sits at layer 3 of WebKit's build stack, depending on WTF for containers and smart pointers (Part 2) but having no dependency on WebCore. You can build and run JSC as a standalone shell — in fact, the jsc.cpp file provides exactly this entry point, bootstrapping a VM and executing scripts from the command line.

Lexing and Parsing: Source to AST

Every JavaScript execution begins with text. The Lexer<T> class tokenizes raw source characters into a stream of tokens. It's templated on the character type (LChar for Latin-1 or UChar for UTF-16), allowing optimized paths for ASCII-only source — which covers the vast majority of JavaScript in practice.

The Parser<T> is a recursive-descent parser that consumes the token stream and emits an AST using the node types defined in Nodes.h. The AST node hierarchy is extensive — Nodes.h defines expression nodes, statement nodes, declaration nodes, and more.

flowchart LR
    Source["JavaScript Source"] --> Lexer["Lexer<br/>Tokenization"]
    Lexer --> Tokens["Token Stream"]
    Tokens --> Parser["Parser<br/>Recursive Descent"]
    Parser --> AST["AST<br/>(Nodes.h types)"]
    AST --> BytecodeGen["BytecodeGenerator"]
    BytecodeGen --> CodeBlock["CodeBlock<br/>(bytecode + metadata)"]

An important design decision: the parser uses a "TreeBuilder" abstraction. Rather than always building a full AST, it can use a SyntaxChecker tree builder that validates syntax without allocating any nodes — useful for scanning function bodies that may never be called.

Tip: JSC lazily parses function bodies. When a script is first loaded, only top-level code is fully parsed. Inner functions are syntax-checked but their ASTs are only built when they're first invoked. This dramatically reduces parse time for large scripts where many functions are never called.

Bytecode Generation and the Bytecode Format

The BytecodeGenerator walks the AST and emits register-based bytecode. Unlike stack-based VMs (like the JVM), JSC's bytecode operates on virtual registers, which map more naturally to physical CPU registers during JIT compilation.

The bytecode instructions are defined in the bytecode/ directory. Each instruction is a compact operation like get_by_id (property access), call (function invocation), or add (arithmetic). The generated bytecode, along with its constants, exception handlers, and type-profile metadata, is stored in a CodeBlock — the fundamental unit of compiled code in JSC.

| Concept | Location | Purpose |
| --- | --- | --- |
| Bytecode instructions | Source/JavaScriptCore/bytecode/ | Instruction definitions and encoding |
| BytecodeGenerator | Source/JavaScriptCore/bytecompiler/ | AST → bytecode lowering |
| CodeBlock | Source/JavaScriptCore/bytecode/CodeBlock.h | Container for a function's bytecode + metadata |
| Value profiles | Source/JavaScriptCore/bytecode/ | Type feedback collected during execution |

Each CodeBlock starts life at the interpreter tier and may be promoted to higher tiers as the function heats up.

Tier 1 — LLInt: The Low-Level Interpreter

The Low-Level Interpreter (LLInt) is the first execution tier. It executes bytecode instructions directly, without generating any native code. The entry point is LLInt::setEntrypoint, which points a CodeBlock at the interpreter's dispatch table.

LLInt is written in a custom macro assembly DSL (defined in .asm files inside llint/), which is processed by an "offline assembler" to generate C++ or native code for each target platform. This approach makes LLInt portable across architectures while remaining faster than a naive C++ switch-dispatch interpreter.

flowchart TD
    CB["CodeBlock<br/>(bytecode)"] --> LLInt["LLInt<br/>Interpreter"]
    LLInt --> Exec["Execute bytecodes<br/>one at a time"]
    Exec --> Profile["Collect type profiles:<br/>• Value types seen<br/>• Branch targets taken<br/>• Call targets"]
    Profile --> Hot{"Function hot<br/>enough?"}
    Hot -->|No| Exec
    Hot -->|Yes| Baseline["Promote to<br/>Baseline JIT"]

While executing, LLInt collects value profiles — recording what types flow through each operation. When a function's execution count exceeds a threshold, LLInt triggers promotion to the Baseline JIT.

Tier 2 — Baseline JIT: Template Compilation

The Baseline JIT is a simple "template" compiler: for each bytecode instruction, it emits a fixed sequence of native machine instructions. There's no optimization, no register allocation across instructions, and no inlining. It's fast to compile and produces code that runs perhaps 5-10x faster than interpretation.

The data structures for Baseline-compiled code are defined in BaselineJITCode.h, including inline caches for property accesses (PropertyInlineCache) and arithmetic profiles.

The Baseline JIT serves a dual purpose: it's both an execution tier and a profiler. It continues collecting the type feedback that LLInt started, but now with richer information from inline caches:

  • Property access ICs record the Structure (hidden class) of objects accessed via get_by_id and put_by_id.
  • Call ICs record which functions are called at each call site.
  • Arithmetic profiles record whether operands are integers, doubles, or BigInts.

This accumulated profile data drives the next tier's speculative optimizations.

Tier 3 — DFG JIT: Speculative Optimization

The Data Flow Graph (DFG) JIT is where things get interesting. Rather than translating bytecodes one at a time, the DFG builds an intermediate representation — a data flow graph — that captures the value dependencies between operations. It then uses the type profiles collected by lower tiers to make speculative assumptions about types.

sequenceDiagram
    participant Baseline as Baseline JIT
    participant DFG as DFG Compiler
    participant Native as DFG-compiled Code
    participant OSR as OSR Exit
    
    Baseline->>DFG: Function is hot, promote
    DFG->>DFG: Build DFG IR from bytecode
    DFG->>DFG: Insert type guards based on profiles
    DFG->>DFG: Optimize: CSE, constant folding, etc.
    DFG->>Native: Emit native code with guards
    Note right of Native: Fast path: types match
    Native->>Native: Execute optimized code
    Note right of Native: Slow path: type mismatch
    Native->>OSR: OSR Exit - deoptimize
    OSR->>Baseline: Resume in Baseline JIT

For example, if a profile shows that x + y has always received integer operands, the DFG emits a fast integer addition with a type guard. If at runtime x turns out to be a string, the guard fails and the code performs an OSR Exit (On-Stack Replacement Exit) — transferring execution back to the Baseline JIT, where the unoptimized code handles the unexpected type correctly.

The DFG source lives in Source/JavaScriptCore/dfg/ and contains:

  • The DFG IR definition (nodes, edges, types)
  • The speculative optimizer
  • OSR entry (promoting a running Baseline function mid-loop)
  • OSR exit (bailing out when speculation fails)
  • Various optimization phases (CSE, dead code elimination, strength reduction)

Tier 4 — FTL JIT and the B3 Backend

The Faster Than Light (FTL) JIT is the highest-throughput tier. It takes the DFG IR and lowers it to B3 (Bare Bones Backend), JSC's own SSA-based compiler backend. B3 replaced LLVM as the backend in 2016 — it provides most of the same classical optimizations but compiles an order of magnitude faster.

The entry point is FTL::compile:

namespace JSC { namespace FTL {
void compile(State&, Safepoint::Result&);
} }

B3 has its own IR (defined in Source/JavaScriptCore/b3/) which is a traditional SSA form with basic blocks, values, and operations. Below B3 sits Air (Assembly IR), a low-level representation close to actual machine instructions. Air handles register allocation and instruction selection before emitting the final machine code.

flowchart TD
    DFG_IR["DFG IR"] --> FTL["FTL Lowering"]
    FTL --> B3_IR["B3 SSA IR"]
    B3_IR --> Opts["B3 Optimizations:<br/>• Constant folding<br/>• CFG simplification<br/>• Strength reduction<br/>• Loop-invariant code motion"]
    Opts --> Air["Air IR<br/>(low-level)"]
    Air --> RegAlloc["Register Allocation"]
    RegAlloc --> MachineCode["Native Machine Code<br/>(x86-64 / ARM64)"]

The key insight is that B3 applies compiler-grade optimizations that the DFG doesn't. DFG optimizes at the JavaScript semantic level (type specialization, inlining); B3 optimizes at the machine level (register pressure, instruction scheduling, redundant load elimination). The two tiers complement each other.

Riptide: The Concurrent Garbage Collector

JSC uses a garbage collector called Riptide for JavaScript objects (distinct from the reference counting used for C++ objects, as we discussed in Part 2). Riptide is a concurrent, generational, mark-and-sweep collector implemented in Source/JavaScriptCore/heap/.

Key design points:

  • Concurrent marking. A GC thread walks the object graph in parallel with the mutator (the JavaScript execution thread). This minimizes pause times.
  • Write barriers. When JIT-compiled code stores a pointer into an object, it must execute a write barrier that informs the GC. The JIT tiers integrate barriers into their generated code.
  • Conservative stack scanning. During collection, the GC scans the native stack looking for values that might be object pointers. This is essential because JIT-compiled code stores JS values in CPU registers and stack slots.
  • Retreating wavefront. Rather than requiring a stop-the-world phase to fix up objects that were modified during concurrent marking, Riptide uses a retreating wavefront algorithm that handles concurrent mutations gracefully.

The heap directory contains the Heap class (orchestrating collections), MarkedBlock (the allocation unit), SlotVisitor (the marking workhorse), and WriteBarrier<T> (the barrier type used throughout JSC).

Runtime Built-Ins and the JSC Shell

The runtime/ directory contains C++ implementations of JavaScript's built-in objects: JSArray, JSObject, JSFunction, JSPromise, RegExpObject, JSMap, and dozens more. These are the native backings for the objects that JavaScript code interacts with.

The standalone jsc.cpp shell ties everything together: it creates a VM (the central JSC state object), creates a GlobalObject, and runs scripts through the full pipeline. It's invaluable for testing JSC in isolation — you can feed it JavaScript and observe which tiers kick in, dump the DFG IR, or profile GC behavior without involving WebCore or any browser UI.

| Directory | Contents |
| --- | --- |
| parser/ | Lexer, Parser, AST node types |
| bytecompiler/ | BytecodeGenerator |
| bytecode/ | Bytecode definitions, CodeBlock, value profiles |
| llint/ | LLInt interpreter (macro assembly DSL) |
| jit/ | Baseline JIT compiler |
| dfg/ | DFG IR, speculative optimizer, OSR machinery |
| ftl/ | FTL lowering from DFG IR to B3 |
| b3/ | B3 SSA compiler backend |
| b3/air/ | Air low-level IR and register allocator |
| heap/ | Riptide GC: marking, sweeping, barriers |
| runtime/ | Built-in JS objects and the VM class |

Bridging to the Next Article

JavaScriptCore provides the execution engine, but web pages are more than JavaScript. In the next article, we'll explore WebCore — the largest component in the repository — and trace the full rendering pipeline from HTML bytes to pixels. We'll see how the DOM tree is built, how CSS is resolved (including the CSS JIT selector compiler that reuses JSC's assembler infrastructure), and how layout and painting produce the visual output. Crucially, we'll also examine the Web IDL binding system that bridges JSC's JavaScript world with WebCore's C++ DOM objects.