The Binder: Building Symbol Tables and Control Flow Graphs
Prerequisites
- ›Article 1: Architecture & Codebase Map
- ›Article 2: Scanner, Parser, and AST
- ›Understanding of symbol tables and scope chains in compilers
- ›Familiarity with control flow analysis concepts
The Binder: Building Symbol Tables and Control Flow Graphs
After the parser delivers a SourceFile AST, the next phase gives names their meaning. The binder walks the AST depth-first, creating Symbol objects for every declaration, organizing them into hierarchical symbol tables, setting parent pointers on every node, and — crucially — constructing the control flow graph that enables TypeScript's type narrowing. All of this happens in src/compiler/binder.ts — roughly 3,900 lines.
The binder is the bridge between syntax and semantics. Understanding it is essential before tackling the checker, because the checker assumes everything the binder produces is already in place: parent pointers, symbol tables, and flow nodes.
bindSourceFile and the Closure-Based Binder
Like the scanner and parser, the binder uses the closure-based module pattern. createBinder() returns a function that accepts a SourceFile and CompilerOptions. Internally, it closes over dozens of var locals:
function createBinder(): (file: SourceFile, options: CompilerOptions) => void {
// Why var? It avoids TDZ checks in the runtime which can be costly.
// See: https://github.com/microsoft/TypeScript/issues/52924
var file: SourceFile;
var options: CompilerOptions;
var parent: Node;
var container: IsContainer | EntityNameExpression;
var blockScopeContainer: IsBlockScopedContainer;
// ... control flow state ...
var currentFlow: FlowNode;
var currentBreakTarget: FlowLabel | undefined;
var currentContinueTarget: FlowLabel | undefined;
// ...
The binder is instantiated once as a module-level singleton — const binder = createBinder() — and reused for every file. Each call to bindSourceFile() reinitializes the closure state and walks the new file's AST.
flowchart TD
BSF["bindSourceFile(file, options)"] --> Init["Initialize closure state<br/>Reset containers, flow nodes"]
Init --> Walk["Depth-first AST walk"]
Walk --> Bind["bind(node)"]
Bind --> SetParent["Set node.parent"]
Bind --> CheckDecl{"Is declaration?"}
CheckDecl -->|Yes| CreateSym["Create/merge Symbol"]
CheckDecl -->|No| Skip["Continue walk"]
CreateSym --> AddToTable["Add to container's symbol table"]
Bind --> FlowCheck{"Affects control flow?"}
FlowCheck -->|Yes| UpdateFlow["Create FlowNode, update currentFlow"]
FlowCheck -->|No| Continue["Continue"]
Bind --> Children["Visit child nodes"]
The bindSourceFile() wrapper is minimal — it wraps the binder call with performance marks for profiling.
Tip: The binder's closure state is the reason you can't run binding on multiple files concurrently within a single program. The closure locals act as implicit thread-local state — fast for single-threaded execution, but not parallelizable.
Symbols, SymbolFlags, and Symbol Tables
The Symbol Interface
A Symbol represents the semantic identity of a named entity. Its key fields:
flags: SymbolFlags— classifies what kind of entity this symbol representsescapedName: __String— the symbol's name (escaped to avoid collisions with built-in properties)declarations: Declaration[]— all AST nodes that declare this symbol (may be multiple for merged declarations)valueDeclaration: Declaration— the first value-producing declarationmembers: SymbolTable— instance members (for classes, interfaces, object literals)exports: SymbolTable— exported members (for modules and namespaces)
SymbolFlags: Classification
SymbolFlags is a bitflag enum that classifies symbols into categories:
flowchart TD
SF["SymbolFlags"]
SF --> Value["Value Space<br/>FunctionScopedVariable | BlockScopedVariable<br/>Property | EnumMember | Function<br/>Class | Method | Accessor"]
SF --> TypeSpace["Type Space<br/>Class | Interface | Enum<br/>EnumMember | TypeLiteral<br/>TypeParameter | TypeAlias"]
SF --> Namespace["Namespace Space<br/>ValueModule | NamespaceModule | Enum"]
SF --> Special["Special Flags<br/>Alias | Transient | Assignment<br/>Optional | ExportStar | Prototype"]
A single symbol can simultaneously exist in the value, type, and namespace "spaces" — this is how class Foo creates both a value (the constructor) and a type (the instance type). The composite flags like Value, Type, and Namespace are unions of the individual flags:
Value = Variable | Property | EnumMember | ObjectLiteral |
Function | Class | Enum | ValueModule | Method |
GetAccessor | SetAccessor,
Type = Class | Interface | Enum | EnumMember |
TypeLiteral | TypeParameter | TypeAlias,
Symbol Tables and Container Hierarchy
Symbols are stored in SymbolTable maps (essentially Map<__String, Symbol>) attached to container nodes. The binder maintains a stack of containers:
- Global container — symbols available everywhere
- Module/SourceFile — top-level declarations, populating the
exportssymbol table - Class/Interface — populating
membersfor instance members - Function/Block — populating
localsfor local declarations
When the binder encounters a declaration, it determines the appropriate container and adds the symbol to that container's symbol table. For var declarations, the container is the enclosing function (function-scoped). For let/const, it's the enclosing block (block-scoped).
Declaration Merging
When two declarations share the same name in the same container, the binder merges them into a single symbol. This is how TypeScript supports:
- Interface merging:
interface Foo { x: number }+interface Foo { y: string } - Namespace merging with classes/functions/enums
- Module augmentation
The merge logic checks SymbolFlags compatibility — for instance, a Class and an Interface can merge (class-interface augmentation), but two Class declarations cannot. The declarations array on the merged symbol accumulates all contributing declaration nodes.
SymbolLinks: Lazy Semantic Extensions
The binder creates Symbols, but it doesn't resolve types — that's the checker's job. The bridge between them is SymbolLinks, a separate interface that extends Symbol at runtime with lazily-computed type information.
classDiagram
class Symbol {
+flags: SymbolFlags
+escapedName: __String
+declarations: Declaration[]
+members: SymbolTable
+exports: SymbolTable
}
class SymbolLinks {
+type: Type
+writeType: Type
+declaredType: Type
+typeParameters: TypeParameter[]
+target: Symbol
+aliasTarget: Symbol
+mapper: TypeMapper
+resolvedExports: SymbolTable
+resolvedMembers: SymbolTable
}
Symbol ..> SymbolLinks : "checker.getSymbolLinks(symbol)"
Key SymbolLinks fields include:
type— the resolved type of the symbol's valuedeclaredType— for classes, interfaces, and type aliases, the declared typetypeParameters— generic type parameters (for type aliases and classes)aliasTarget— for import aliases, the resolved target symbolresolvedExports/resolvedMembers— merged symbol tables combining early-bound and late-bound members
The checker accesses SymbolLinks through a getSymbolLinks(symbol) function that creates the links object on first access and caches it. This lazy initialization pattern is critical for performance — the checker only computes type information for symbols that are actually queried, rather than eagerly computing types for every declaration in the program.
Tip: If you're debugging type resolution in the checker and wondering where a symbol's type comes from, look for
getTypeOfSymbol()→getSymbolLinks(symbol).type. The first call triggers resolution; subsequent calls hit the cache.
Control Flow Graph Construction
The binder's second major responsibility — beyond symbol tables — is constructing the control flow graph. This graph enables TypeScript's type narrowing: when you write if (x !== null) { x.method() }, the checker knows x is non-null inside the if block because the flow graph models the branch.
FlowNode Types
The control flow graph is represented as a linked list of FlowNode objects, classified by FlowFlags:
| FlowNode Type | FlowFlags | Purpose |
|---|---|---|
FlowStart |
Start |
Beginning of a function's flow graph |
FlowAssignment |
Assignment |
Variable assignment (narrows the type) |
FlowCondition |
TrueCondition / FalseCondition |
Branch from a conditional (if/ternary) |
FlowSwitchClause |
SwitchClause |
Branch from a switch case |
FlowLabel |
BranchLabel / LoopLabel |
Junction merging multiple flows |
FlowCall |
Call |
Potential assertion call (e.g., assert(x)) |
FlowArrayMutation |
ArrayMutation |
Array .push() that might narrow tuple type |
Each FlowNode has an antecedent pointer linking it to the preceding flow node(s). The graph is navigated backwards — from a variable use point, the checker walks antecedent links until it reaches an assignment or branch that determines the narrowed type.
How the Binder Builds Flow Graphs
flowchart TB
Start["FlowStart"] --> A1["x = getSomething()<br/>FlowAssignment"]
A1 --> Cond{"if (x !== null)"}
Cond -->|"True"| TrueFlow["FlowCondition<br/>(TrueCondition)"]
Cond -->|"False"| FalseFlow["FlowCondition<br/>(FalseCondition)"]
TrueFlow --> Body["x.method()<br/>(use of x)"]
Body --> Merge["FlowLabel<br/>(BranchLabel)"]
FalseFlow --> Merge
Merge --> After["code after if"]
As the binder walks the AST, it maintains currentFlow — the current position in the flow graph. When it encounters:
- An assignment (
x = expr): Creates aFlowAssignmentnode withcurrentFlowas antecedent, then updatescurrentFlowto the new node. - An
ifstatement: SavescurrentFlow, createsFlowConditionnodes for the true and false branches, processes each branch, then creates aFlowLabeljunction that merges both branch endpoints. - A loop: Creates a
FlowLabelwithLoopLabelflag, which gets back-edges as the loop body is processed. - A
return/throw/break: SetscurrentFlowtounreachableFlow, marking subsequent code as dead. - A function call: If the function might be an assertion (type predicate returning
asserts), creates aFlowCallnode.
The binder tracks several flow targets simultaneously — currentBreakTarget, currentContinueTarget, currentReturnTarget, currentTrueTarget, currentFalseTarget, and currentExceptionTarget — to correctly wire up flow edges for complex control flow involving loops, labeled statements, and try/catch blocks.
Connection to Narrowing
The checker (which we'll cover in Part 4) consumes this graph in getFlowTypeOfReference(). Starting from a variable reference, it walks backwards through the flow graph, applying narrowing at each FlowCondition and FlowAssignment. The id field on each FlowNode is used as a cache key — once the narrowed type at a particular flow node is computed, it's cached to avoid exponential re-computation in graphs with many branches.
What's Next
The binder has given every declaration a Symbol, organized symbols into scoped tables, and threaded a control flow graph through the AST. In Part 4, we'll enter the compiler's largest and most complex module — the 54,000-line type checker — to see how it resolves types from symbols, checks assignability, infers generics, narrows types along control flow edges, and reports the errors that make TypeScript useful.