Terraform's Architecture: A Map of the Codebase
Prerequisites
- Familiarity with Go (interfaces, packages, goroutines)
- Basic Terraform user experience (resources, providers, plan/apply workflow)
Terraform is one of the most widely-used infrastructure tools in the world, yet surprisingly few engineers have looked at how it actually works inside. The codebase is a single Go module — a true monolith — comprising roughly 65 packages under internal/. Understanding its architecture unlocks the ability to debug provider issues, contribute patches, or build tools on top of Terraform's internals.
This article provides the mental model you'll need for every subsequent deep-dive in this series. We'll trace the path from main() to a provider plugin call, mapping the key packages and abstractions along the way.
The Boot Sequence: main.go
Everything starts in main.go#L65-L352. The `realMain()` function is a roughly 290-line sequential boot routine that initializes every subsystem before dispatching to a CLI command. Here's the condensed flow:
```mermaid
flowchart TD
    A["realMain()"] --> B["OpenTelemetry init"]
    B --> C["Logging & panic handler"]
    C --> D["terminal.Init() — detect TTY"]
    D --> E["cliconfig.LoadConfig() — read .terraformrc"]
    E --> F["disco.NewWithCredentialsSource() — service discovery"]
    F --> G["providerSource() — build provider chain"]
    G --> H["backendInit.Init() — register 14 backends"]
    H --> I["extractChdirOption — handle -chdir"]
    I --> J["initCommands() — wire ~40 commands"]
    J --> K["cli.CLI.Run() — dispatch to command"]
```
The sequence is deliberately linear. Telemetry comes first (line 70) so that the top-level span wraps the entire execution. Terminal detection (line 113) happens early because output formatting depends on whether stdout is a TTY. CLI config loading (line 139) is intentionally done before -chdir processing, so that relative paths in TERRAFORM_CONFIG_FILE resolve against the true working directory.
One subtle but important detail: backendInit.Init(services) at line 212 registers all 14 built-in backends into a global map. This happens once, early in boot, and the backends are never modified afterward. We'll explore why backends are hardcoded (rather than pluggable) in Article 5.
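The registration pattern here is the classic Go idiom of a package-level map populated once at boot and treated as read-only afterward. A minimal sketch of that shape (simplified names, not Terraform's actual backend types):

```go
package main

import (
	"fmt"
	"sort"
)

// Backend is a stand-in for Terraform's backend interface.
type Backend interface{ Name() string }

type localBackend struct{}

func (localBackend) Name() string { return "local" }

type s3Backend struct{}

func (s3Backend) Name() string { return "s3" }

// backends is the package-level registry: written once during boot,
// never mutated afterward, so no mutex is needed after Init returns.
var backends = map[string]func() Backend{}

// Init registers every built-in backend exactly once.
func Init() {
	backends["local"] = func() Backend { return localBackend{} }
	backends["s3"] = func() Backend { return s3Backend{} }
}

func registered() []string {
	names := make([]string, 0, len(backends))
	for n := range backends {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

func main() {
	Init()
	fmt.Println(registered()) // [local s3]
}
```

Because the map is sealed after boot, every later lookup (e.g. when `terraform init` resolves a `backend "s3"` block) is a simple, lock-free map read.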
Tip: If you're debugging Terraform's startup, set `TF_LOG=TRACE` to see each step logged. The boot sequence is heavily instrumented with `log.Printf` calls at `[INFO]` and `[TRACE]` levels.
Command Registration and the Shared Meta
After boot completes, initCommands() in commands.go#L56-L114 constructs a single command.Meta struct that every command shares:
```go
meta := command.Meta{
	WorkingDir:           wd,
	Streams:              streams,
	View:                 views.NewView(streams).SetRunningInAutomation(inAutomation),
	Services:             services,
	ProviderSource:       providerSrc,
	ProviderDevOverrides: providerDevOverrides,
	UnmanagedProviders:   unmanagedProviders,
	ShutdownCh:           makeShutdownCh(),
	// ...
}
```
This Meta is then passed to every command factory closure in the Commands map (lines 122-451). The map contains roughly 40 entries, including subcommands like "state list" and "workspace new".
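The wiring itself is a map of factory closures, each capturing the shared meta and constructing a fresh command value on demand. A condensed sketch of that pattern (illustrative types standing in for the real CLI library):

```go
package main

import "fmt"

// Meta is a simplified stand-in for command.Meta.
type Meta struct{ WorkingDir string }

// Command mirrors the shape of a CLI command: Run returns an exit code.
type Command interface{ Run(args []string) int }

type PlanCommand struct{ Meta Meta }

func (c *PlanCommand) Run(args []string) int {
	fmt.Println("planning in", c.Meta.WorkingDir)
	return 0
}

// CommandFactory is the factory-closure pattern: each map entry
// closes over the shared meta and builds a command when invoked.
type CommandFactory func() (Command, error)

func initCommands(meta Meta) map[string]CommandFactory {
	return map[string]CommandFactory{
		"plan": func() (Command, error) {
			return &PlanCommand{Meta: meta}, nil
		},
	}
}

func main() {
	commands := initCommands(Meta{WorkingDir: "."})
	cmd, _ := commands["plan"]()
	cmd.Run(nil)
}
```

The factory indirection matters: commands are constructed lazily, so only the one command the user actually invoked pays its setup cost.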
```mermaid
classDiagram
    class Meta {
        +WorkingDir
        +Streams
        +View
        +Services
        +ProviderSource
        +ShutdownCh
    }
    class PlanCommand {
        +Meta
        +Run(args) int
    }
    class ApplyCommand {
        +Meta
        +Destroy bool
        +Run(args) int
    }
    class InitCommand {
        +Meta
        +Run(args) int
    }
    Meta <|-- PlanCommand
    Meta <|-- ApplyCommand
    Meta <|-- InitCommand
```
One particularly elegant design choice: destroy is simply `ApplyCommand` with `Destroy: true`, as seen at commands.go#L135-L140:

```go
"destroy": func() (cli.Command, error) {
	return &command.ApplyCommand{
		Meta:    meta,
		Destroy: true,
	}, nil
},
```
This pattern of reusing command implementations with behavioral flags is common throughout the codebase. It keeps the total number of command structs manageable while supporting a rich CLI surface.
Package Map: The internal/ Tree
Terraform's packages roughly organize into layers. Here's a guided map of the most important ones:
| Layer | Packages | Responsibility |
|---|---|---|
| CLI | `command/`, `command/views/`, `command/arguments/` | Parse flags, manage backends, render output |
| Backend | `backend/`, `backend/local/`, `backend/init/`, `cloud/` | State storage, operation execution |
| Core Engine | `terraform/` | Graph building, walking, plan/apply orchestration |
| Graph | `dag/` | Generic DAG library: vertices, edges, parallel walk |
| Configuration | `configs/`, `configs/configload/` | HCL parsing, module tree assembly |
| State | `states/`, `states/statemgr/` | In-memory state model, persistence, locking |
| Plans | `plans/`, `plans/planfile/` | Change tracking, plan serialization |
| Providers | `providers/`, `plugin/`, `grpcwrap/` | Provider interface, gRPC bridge, protocol translation |
| Discovery | `getproviders/` | Registry client, filesystem mirrors, multi-source |
| Addresses | `addrs/` | Naming system: `Provider`, `Module`, `Resource`, etc. |
| Diagnostics | `tfdiags/` | Rich errors/warnings with source location |
| Language | `lang/` | HCL expression evaluation, built-in functions |
| Types | `cty` (external) | Dynamic type system for configuration values |
The internal/terraform/ package is the beating heart. It contains the Context type that orchestrates all operations, the graph builders that construct dependency graphs, the node types that execute during graph walks, and the EvalContext interface that provides nodes with access to providers and state.
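The relationship between nodes and the evaluation context can be sketched roughly like this (drastically simplified, illustrative names; the real `EvalContext` interface is far larger):

```go
package main

import "fmt"

// Provider is a tiny stand-in for the provider interface a node
// obtains through the evaluation context.
type Provider interface {
	PlanResourceChange(resource string) string
}

// EvalContext is a drastically simplified stand-in for the interface
// the core hands to each graph node during a walk.
type EvalContext interface {
	Provider(addr string) Provider
}

// GraphNodeExecutable mirrors the real pattern: each node in the
// built graph exposes a single Execute method called during the walk.
type GraphNodeExecutable interface {
	Execute(ctx EvalContext) string
}

type mockProvider struct{}

func (mockProvider) PlanResourceChange(resource string) string {
	return "update " + resource
}

type evalContext struct{}

func (evalContext) Provider(addr string) Provider { return mockProvider{} }

// resourceNode plays the role of a plannable resource instance node.
type resourceNode struct{ addr string }

func (n resourceNode) Execute(ctx EvalContext) string {
	p := ctx.Provider("registry.terraform.io/hashicorp/aws")
	return p.PlanResourceChange(n.addr)
}

func main() {
	node := resourceNode{addr: "aws_instance.web"}
	fmt.Println(node.Execute(evalContext{})) // update aws_instance.web
}
```

The key design point survives the simplification: nodes never hold providers or state directly; everything flows through the context, which is what makes the graph walk testable and concurrency-safe.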
```mermaid
flowchart LR
    CLI["command/"] --> Backend["backend/local/"]
    Backend --> Core["terraform/Context"]
    Core --> GraphBuilder["terraform/GraphBuilder"]
    GraphBuilder --> DAG["dag/"]
    Core --> Providers["providers/Interface"]
    Providers --> GRPC["plugin/GRPCProvider"]
    Core --> State["states/SyncState"]
    Core --> Configs["configs/Config"]
```
Tip: The `addrs` package is worth studying early. Its types appear as map keys, graph node identifiers, and targeting criteria everywhere in the codebase. Understanding `addrs.AbsResourceInstance` and `addrs.Provider` will accelerate your reading of nearly every other package.
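The reason address types work so well as map keys is that they are comparable Go value types. A simplified sketch of the idea (this is not the real `addrs.AbsResourceInstance`, which models module paths and instance keys much more richly):

```go
package main

import "fmt"

// AbsResourceInstance is a simplified stand-in for an absolute
// resource instance address. Because it is a comparable value type,
// it can serve directly as a map key.
type AbsResourceInstance struct {
	Module   string
	Type     string
	Name     string
	Instance int
}

func (a AbsResourceInstance) String() string {
	return fmt.Sprintf("%s.%s.%s[%d]", a.Module, a.Type, a.Name, a.Instance)
}

func main() {
	// State lookups, change tracking, and targeting all reduce to
	// keying maps by address values like this.
	state := map[AbsResourceInstance]string{}
	addr := AbsResourceInstance{Module: "root", Type: "aws_instance", Name: "web", Instance: 0}
	state[addr] = "i-12345"
	fmt.Println(state[addr]) // i-12345
}
```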
End-to-End Request Flow: terraform plan
Let's trace what happens when a user runs terraform plan. This is the most important flow to understand because it touches every layer of the architecture.
Step 1: CLI parsing. PlanCommand.Run() in internal/command/plan.go#L22-L118 parses flags via the arguments package, creates a view, prepares the backend, and builds an operation request.
Step 2: Backend dispatch. The operation is sent to the backend's Operation() method. For local execution, this lands in local.opPlan() at internal/backend/local/backend_plan.go#L23-L27.
Step 3: Context creation. opPlan loads config and state, constructs a terraform.Context with NewContext(), and calls Context.Plan().
Step 4: Graph building. Plan() delegates to PlanAndEval() at internal/terraform/context_plan.go#L180-L194, which creates a PlanGraphBuilder and calls Build().
Step 5: Graph walking. The built graph is walked in parallel using the dag.Walker. Each vertex executes its Execute() method.
Step 6: Provider calls. Resource nodes like NodePlannableResourceInstance call provider.PlanResourceChange() via gRPC to compute diffs.
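From core's perspective, the provider boundary in step 6 is just an interface call; the gRPC plumbing hides behind it. A hedged sketch of that seam (request/response fields simplified; the real providers package uses cty values and many more fields):

```go
package main

import "fmt"

// Simplified request/response shapes for a plan call.
type PlanResourceChangeRequest struct {
	TypeName    string
	PriorState  map[string]string
	ProposedNew map[string]string
}

type PlanResourceChangeResponse struct {
	PlannedState map[string]string
}

// Interface is the seam between core and plugins. In Terraform the
// gRPC-backed provider satisfies this; here a local fake stands in.
type Interface interface {
	PlanResourceChange(req PlanResourceChangeRequest) PlanResourceChangeResponse
}

type fakeProvider struct{}

func (fakeProvider) PlanResourceChange(req PlanResourceChangeRequest) PlanResourceChangeResponse {
	// A real provider computes the planned state from prior + proposed
	// configuration; this fake simply accepts the proposal.
	return PlanResourceChangeResponse{PlannedState: req.ProposedNew}
}

func main() {
	var p Interface = fakeProvider{}
	resp := p.PlanResourceChange(PlanResourceChangeRequest{
		TypeName:    "aws_instance",
		ProposedNew: map[string]string{"ami": "ami-abc123"},
	})
	fmt.Println(resp.PlannedState["ami"]) // ami-abc123
}
```

Because core only sees the interface, tests can swap in an in-process fake like this one without any plugin machinery.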
```mermaid
sequenceDiagram
    participant User
    participant CLI as PlanCommand
    participant Backend as local.opPlan
    participant Ctx as terraform.Context
    participant Graph as PlanGraphBuilder
    participant Walker as dag.Walker
    participant Node as NodePlannableResourceInstance
    participant Provider as GRPCProvider
    User->>CLI: terraform plan
    CLI->>Backend: RunOperation(opReq)
    Backend->>Ctx: NewContext(opts)
    Backend->>Ctx: Plan(config, state, planOpts)
    Ctx->>Graph: Build(rootModule)
    Graph-->>Ctx: *Graph
    Ctx->>Walker: graph.Walk(walker)
    Walker->>Node: Execute(evalCtx, op)
    Node->>Provider: PlanResourceChange(req)
    Provider-->>Node: planned state + diff
    Node-->>Walker: diagnostics
    Walker-->>Ctx: accumulated changes
    Ctx-->>Backend: *plans.Plan
    Backend-->>CLI: RunningOperation
    CLI-->>User: plan output
```
This flow is the backbone of Terraform. The apply flow is nearly identical, except it uses ApplyGraphBuilder and calls provider.ApplyResourceChange() instead. Validate, import, and refresh all follow the same pattern: build a graph, walk it, collect results.
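The "build a graph, walk it, collect results" backbone, with the parallel walk collapsed to its essence, might look like this (a toy sketch, not the real dag.Walker, which additionally releases each node only after its dependencies complete and propagates diagnostics):

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

type Node interface{ Execute() string }

type resourceNode struct{ addr string }

func (n resourceNode) Execute() string { return "planned " + n.addr }

// walk runs every node concurrently and accumulates results under a
// mutex — the shape shared by plan, apply, validate, and refresh.
func walk(nodes []Node) []string {
	var (
		mu      sync.Mutex
		results []string
		wg      sync.WaitGroup
	)
	for _, n := range nodes {
		wg.Add(1)
		go func(n Node) {
			defer wg.Done()
			out := n.Execute()
			mu.Lock()
			results = append(results, out)
			mu.Unlock()
		}(n)
	}
	wg.Wait()
	sort.Strings(results) // goroutines finish in any order; sort for determinism
	return results
}

func main() {
	fmt.Println(walk([]Node{
		resourceNode{"aws_instance.web"},
		resourceNode{"aws_s3_bucket.logs"},
	}))
}
```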
The Command Pattern
Every operation command (plan, apply, refresh, import) follows the same structural pattern:
- Parse flags into a typed `arguments` struct
- Create a command-specific view (human or JSON)
- Prepare the backend with `PrepareBackend()`
- Build an `Operation` request with `OperationRequest()`
- Execute with `RunOperation()`
- Return exit code based on result
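A skeletal Run method following those steps might look like this (all helper names are illustrative stand-ins, not the actual command package API; steps 2-4 are collapsed into one call for brevity):

```go
package main

import "fmt"

// planArgs is an illustrative typed-arguments struct (step 1's output).
type planArgs struct{ outPath string }

func parsePlanArgs(raw []string) (planArgs, error) {
	// A real parser validates flags; this stub accepts everything.
	return planArgs{}, nil
}

// runOperation stands in for steps 2-5: view creation, backend
// preparation, building the operation request, and executing it.
func runOperation(args planArgs) error {
	return nil // pretend the backend ran the operation successfully
}

// Run is the uniform skeleton: parse, execute, map result to exit code.
func Run(rawArgs []string) int {
	args, err := parsePlanArgs(rawArgs) // step 1: typed arguments
	if err != nil {
		fmt.Println("error:", err)
		return 1
	}
	if err := runOperation(args); err != nil { // steps 2-5
		fmt.Println("error:", err)
		return 1
	}
	return 0 // step 6: success exit code
}

func main() {
	fmt.Println(Run([]string{"-out=plan.tfplan"}))
}
```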
This uniformity means that once you understand one command deeply, you can navigate any of them. The differences are in which graph builder is used, which node types populate the graph, and which provider methods get called.
What's Ahead
This article has given you the map. The rest of this series will take you into the territory:
- Article 2 dives into the graph engine — the generic DAG library, the parallel walker, and the transformer pipeline that constructs plan and apply graphs.
- Article 3 follows a resource instance through the full plan-and-apply lifecycle.
- Article 4 explores the provider plugin system and its gRPC-based architecture.
- Article 5 examines state management and the backend abstraction.
- Article 6 dissects the CLI layer, the views system, and the diagnostic system.
- Article 7 covers configuration loading and expression evaluation.
Each article builds on the mental model established here, so keep this package map and request flow diagram handy as reference.