The Graph Engine: How Terraform Builds and Walks Dependency Graphs
Prerequisites
- ›Article 1: Architecture and Codebase Navigation
- ›Understanding of DAGs, topological sort, and parallel execution concepts
- ›Go concurrency primitives (goroutines, channels, sync.WaitGroup)
The Graph Engine: How Terraform Builds and Walks Dependency Graphs
Terraform's most distinctive architectural decision is that every operation — plan, apply, validate, import — is modeled as a walk over a directed acyclic graph. Resources that don't depend on each other execute in parallel. Resources that do depend on each other execute in topological order. This single abstraction enables Terraform to safely manage thousands of infrastructure resources in a single run.
The graph engine lives in two packages: internal/dag/ provides a generic, Terraform-agnostic graph library, while internal/terraform/ wraps it with domain-specific vertex types and a pipeline of graph transformers. Understanding how these two layers interact is essential for debugging plan behavior, understanding parallelism, and reading the core engine code.
The DAG Package: A Generic Graph Library
The foundation is AcyclicGraph in internal/dag/dag.go#L15-L18:
type AcyclicGraph struct {
Graph
}
AcyclicGraph embeds the base Graph type and adds methods that assume (and verify) the absence of cycles. Vertices are interface{} — the DAG package has no knowledge of Terraform resources, providers, or any domain concepts. Edges are directional, representing dependencies.
classDiagram
class Graph {
+vertices Set
+edges Set
+Add(Vertex)
+Remove(Vertex)
+Connect(Edge)
+DownEdges(Vertex) Set
+UpEdges(Vertex) Set
}
class AcyclicGraph {
+Ancestors(vs) Set
+TransitiveReduction()
+Validate() error
+Walk(cb WalkFunc) Diagnostics
}
class Walker {
+Callback WalkFunc
+Reverse bool
+Update(g)
+Wait() Diagnostics
}
Graph <|-- AcyclicGraph
AcyclicGraph ..> Walker : uses
The key operations that Terraform relies on:
Ancestors()— finds all transitive dependencies of a vertex, used for targetingTransitiveReduction()— removes redundant edges to simplify the graph without changing execution orderValidate()— checks for cycles and self-referencing edgesWalk()— the high-level entry point that creates aWalkerand runs it
This deliberate decoupling means the DAG package can be (and is) tested independently with simple string vertices, without pulling in any Terraform-specific types.
Parallel Walk: Goroutines, Channels, and Dependency Signaling
The Walker struct in internal/dag/walk.go#L39-L68 is the parallel execution engine:
type Walker struct {
Callback WalkFunc
Reverse bool
changeLock sync.Mutex
vertices Set
edges Set
vertexMap map[Vertex]*walkerVertex
wait sync.WaitGroup
diagsMap map[Vertex]tfdiags.Diagnostics
upstreamFailed map[Vertex]struct{}
diagsLock sync.Mutex
}
For each vertex, the Walker creates a walkerVertex (lines 88-119) with two critical channels:
DoneCh— closed when this vertex finishes execution (success or failure)DepsCh— receives a boolean when all upstream dependencies complete;truemeans all succeeded,falsemeans at least one failed
flowchart TD
subgraph "Vertex A (no deps)"
A_exec["Execute callback"] --> A_done["Close DoneCh"]
end
subgraph "Vertex B (depends on A)"
B_wait["Wait on A.DoneCh"] --> B_deps["Receive on DepsCh"]
B_deps -->|"true: deps OK"| B_exec["Execute callback"]
B_deps -->|"false: upstream failed"| B_skip["Skip execution"]
B_exec --> B_done["Close DoneCh"]
B_skip --> B_done
end
A_done -.->|"signals"| B_wait
The Walker spawns two goroutines per vertex: one that waits on dependency channels and sends the aggregate result on DepsCh, and one that receives from DepsCh and runs the callback. As the comment notes, this creates V*2 goroutines total.
When an upstream vertex fails, all downstream vertices receive false on their DepsCh and are recorded in upstreamFailed. Their errors are excluded from the final diagnostic set because they're considered cascading failures — a thoughtful touch that prevents error-message avalanches.
Tip: Terraform defaults to a parallelism of 10 (set in
NewContextatinternal/terraform/context.go#L146-L148). This doesn't limit goroutines — all vertices still get goroutines — but limits how many vertices can execute their callback concurrently, via a semaphore. This protects against provider rate limits.
Terraform's Graph Wrapper and Walk Dispatch
The internal/terraform/ package wraps the generic DAG with its own Graph type at internal/terraform/graph.go#L19-L28:
type Graph struct {
dag.AcyclicGraph
Path addrs.ModuleInstance
}
This embedding gives Graph all the DAG methods while adding a module path and the crucial Walk() method (lines 37-39) that bridges to the GraphWalker interface.
The walk() function (lines 41-120+) is where domain-specific dispatch happens. For each vertex visited by the DAG walker, it:
- Checks if the vertex is
GraphNodeOverridable(for testing framework overrides) - Determines the evaluation context scope (global, module instance, or partial-expanded module)
- Dispatches to
GraphNodeExecutable.Execute()if the vertex implements it
The vertex interfaces form a rich type system:
classDiagram
class GraphNodeExecutable {
<<interface>>
+Execute(EvalContext, walkOperation) Diagnostics
}
class GraphNodeReferenceable {
<<interface>>
+ReferenceableAddrs() []Referenceable
}
class GraphNodeReferencer {
<<interface>>
+References() []*Reference
}
class GraphNodeModuleInstance {
<<interface>>
+Path() ModuleInstance
}
class GraphNodeConfigResource {
<<interface>>
+ResourceAddr() ConfigResource
}
A single node type like NodePlannableResourceInstance implements many of these interfaces simultaneously. The graph walk code uses Go type assertions to discover what each vertex can do, providing a flexible dispatch mechanism without requiring a monolithic node interface.
BasicGraphBuilder: The Transform Pipeline
Graphs are constructed through a pipeline of mutations. The BasicGraphBuilder at internal/terraform/graph_builder.go#L26-L34 holds a slice of GraphTransformer steps:
type BasicGraphBuilder struct {
Steps []GraphTransformer
Name string
SkipGraphValidation bool
}
The Build() method (lines 36-90) iterates this slice, calling Transform(g) on each:
for _, step := range b.Steps {
if step == nil { continue }
err := step.Transform(g)
// ...handle errors...
}
sequenceDiagram
participant Builder as BasicGraphBuilder
participant G as Graph
participant T1 as ConfigTransformer
participant T2 as ReferenceTransformer
participant T3 as TransitiveReductionTransformer
Builder->>T1: Transform(g)
Note over T1,G: Adds resource vertices from config
T1-->>G: mutated
Builder->>T2: Transform(g)
Note over T2,G: Wires dependency edges from HCL refs
T2-->>G: mutated
Builder->>T3: Transform(g)
Note over T3,G: Removes redundant edges
T3-->>G: mutated
Builder->>G: Validate()
Each transformer mutates the graph in place — adding vertices, adding edges, removing vertices, or modifying existing nodes. This pipeline-of-mutations pattern is powerful because transformers are composable: you can add or remove steps without changing the others. It also means the graph's structure at any point depends on the order of previous transformers, which is why the Steps() method on each graph builder is carefully ordered.
After all transforms complete, Build() calls g.Validate() to check for cycles. If validation fails, Terraform logs the graph structure and returns an error.
PlanGraphBuilder: Anatomy of the Plan Graph
The PlanGraphBuilder at internal/terraform/graph_builder_plan.go#L30-L135 produces the graph used for planning. Its Steps() method (lines 148-328) returns over 20 transformers. Here are the most important ones in order:
flowchart TD
CT["ConfigTransformer<br/>Add resource vertices from config"] --> RVT["RootVariableTransformer<br/>Add input variable nodes"]
RVT --> MVT["ModuleVariableTransformer<br/>Add module variable nodes"]
MVT --> LT["LocalTransformer<br/>Add local value nodes"]
LT --> OT["OutputTransformer<br/>Add output value nodes"]
OT --> ORIT["OrphanResourceInstanceTransformer<br/>Add nodes for state resources missing from config"]
ORIT --> ST["StateTransformer<br/>Add deposed instance nodes"]
ST --> AST["AttachStateTransformer<br/>Attach state to resource nodes"]
AST --> ARC["AttachResourceConfigTransformer<br/>Attach config to resource nodes"]
ARC --> PT["ProviderTransformer<br/>Wire provider dependencies"]
PT --> SchemaT["AttachSchemaTransformer<br/>Attach schemas to nodes"]
SchemaT --> MET["ModuleExpansionTransformer<br/>Handle count/for_each"]
MET --> RT["ReferenceTransformer<br/>Wire dependency edges from HCL refs"]
RT --> TT["TargetsTransformer<br/>Prune to targeted resources"]
TT --> CPRT["CloseProviderTransformer<br/>Add provider close nodes"]
CPRT --> TRT["TransitiveReductionTransformer<br/>Simplify graph"]
The ConfigTransformer at internal/terraform/transform_config.go#L33-L73 is typically the first substantive transformer. It walks the config tree and adds a vertex for each resource declaration. But note: at this stage, count and for_each haven't been evaluated yet, so each vertex represents a configuration-level resource, not individual instances. Instance expansion happens later via ModuleExpansionTransformer and dynamic expansion during the walk.
The ReferenceTransformer at internal/terraform/transform_reference.go#L19-L43 is where dependency edges get wired. It examines which addresses each node references (via GraphNodeReferencer) and which addresses each node provides (via GraphNodeReferenceable), then connects them with edges. This is how var.name in a resource's configuration creates a dependency on the variable node, and how aws_instance.web.id creates a dependency on the aws_instance.web resource node.
ApplyGraphBuilder: Differences from Plan
The ApplyGraphBuilder at internal/terraform/graph_builder_apply.go#L26-L93 takes a fundamentally different input: the plan's change set rather than the configuration alone. Its Steps() method (lines 105-200+) introduces the DiffTransformer:
&DiffTransformer{
Concrete: concreteResourceInstance,
State: b.State,
Changes: b.Changes,
Config: b.Config,
},
Where ConfigTransformer adds vertices from configuration, DiffTransformer adds vertices from the planned changes. This ensures the apply graph only includes resources that actually need modification — a critical safety property.
The apply graph also uses different concrete node types. Where the plan graph uses NodePlannableResourceInstance, the apply graph uses NodeApplyableResourceInstance (line 119-124), which calls provider.ApplyResourceChange() instead of PlanResourceChange().
Both builders share many of the same transformers — ReferenceTransformer, ProviderTransformer, TransitiveReductionTransformer — because the fundamental need for dependency ordering and graph simplification is the same.
Walk Operations: Eight Flavors of Graph Walk
The same graph infrastructure supports eight distinct walk types, enumerated in internal/terraform/graph_walk_operation.go#L9-L21:
const (
walkInvalid walkOperation = iota
walkApply
walkPlan
walkPlanDestroy
walkValidate
walkDestroy
walkImport
walkEval
walkInit
)
The PlanGraphBuilder.Steps() method switches on the operation type to select the right initialization (lines 149-160). For walkPlan, it calls initPlan() which sets up concrete node factories for plannable resources. For walkPlanDestroy, it calls initDestroy() which configures the graph for destruction ordering.
This design means the same transformer pipeline can produce dramatically different graphs depending on the operation, while sharing most of the structural logic.
What's Ahead
Now that you understand how graphs are built and walked, Article 3 will follow a single resource instance through the complete plan-and-apply lifecycle — from Context.Plan() creating the graph, through NodePlannableResourceInstance.Execute() calling the provider, to Context.Apply() making real infrastructure changes. We'll also examine the EvalContext interface that provides the bridge between graph nodes and the outside world.