Getting Your Hands Dirty: Building, Testing, and Contributing to Geth
Prerequisites
- All previous articles in the series
- Go development environment setup
- Git basics
Over the past six articles, we've traced every major subsystem of go-ethereum — from the CLI boot sequence through block execution, state storage, transaction pools, P2P networking, and the RPC layer. Now it's time for the practical chapter: how to build the project, run its multi-layered test suite, understand the code generation patterns, and — most usefully — how new hard forks get implemented. Whether you're fixing a bug, adding a feature, or just trying to understand a specific behavior, this article gives you the tools to work with the codebase effectively.
Build Pipeline: Makefile and build/ci.go
As we briefly covered in Part 1, Geth uses a two-tier build pipeline. The Makefile is the developer-facing interface:
make geth # Build just the geth binary
make evm # Build the standalone EVM tool
make all # Build all packages and executables
make test # Run tests (builds first)
make lint # Run linters
make fmt # Format all Go code
make devtools # Install code generation tools
The build, test, and lint targets all call go run build/ci.go <command>; fmt and devtools invoke gofmt and go install directly. The build/ci.go file is a Go program with a //go:build none constraint, so it's never compiled as part of the module and is instead executed as a script. It handles:
- install: Cross-platform compilation with architecture and compiler selection
- test: Test execution with coverage support
- lint: Pre-selected linter configuration
- check_generate: Verifies generated code is up-to-date
- check_baddeps: Ensures forbidden dependencies aren't introduced
- archive / debsrc / nsis: Packaging for distribution
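The "Go file as build script" idea can be sketched as a small subcommand dispatcher. This is only an illustration of the shape, not the contents of build/ci.go: the commands map, the dispatch function, and the placeholder handlers are hypothetical names, and the real handlers invoke the Go toolchain rather than returning strings.

```go
// Sketch of a script-style Go program that dispatches on a subcommand.
// The real build/ci.go starts with a "//go:build none" constraint so that
// normal package builds skip it; that line is left out here so the sketch
// compiles in any context.
package main

import "fmt"

// commands maps a subcommand name to its handler. These handlers are
// hypothetical placeholders that only describe what they would do.
var commands = map[string]func(args []string) string{
	"install": func(args []string) string { return fmt.Sprintf("would build %v", args) },
	"test":    func(args []string) string { return fmt.Sprintf("would test %v", args) },
	"lint":    func(args []string) string { return "would run linters" },
}

// dispatch treats args[0] as the subcommand and passes the rest to its handler.
func dispatch(args []string) (string, error) {
	if len(args) == 0 {
		return "", fmt.Errorf("need subcommand as first argument")
	}
	cmd, ok := commands[args[0]]
	if !ok {
		return "", fmt.Errorf("unknown command %q", args[0])
	}
	return cmd(args[1:]), nil
}

func main() {
	// Equivalent in spirit to: go run build/ci.go install ./cmd/geth
	out, err := dispatch([]string{"install", "./cmd/geth"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Keeping the build logic in Go rather than in shell means the same script runs unchanged on every platform that has a Go toolchain, which is one plausible reason for the two-tier design.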
flowchart TD
DEV["Developer"] -->|make geth| MAKE["Makefile"]
MAKE --> CI["go run build/ci.go install ./cmd/geth"]
CI --> BUILD["go build -o build/bin/geth ./cmd/geth"]
BUILD --> BIN["build/bin/geth"]
DEV -->|make test| MAKE
MAKE --> CI2["go run build/ci.go test"]
CI2 --> TEST["go test ./..."]
DEV -->|make devtools| MAKE
MAKE --> TOOLS["Install stringer, gencodec,<br/>protoc-gen-go, abigen"]
Tip: For day-to-day development, make geth is all you need. The binary appears at ./build/bin/geth. For CI or release builds, the build/ci.go script handles cross-compilation, signing, and packaging.
Testing Strategy: Unit, Integration, and Reference Tests
Geth employs a multi-layered testing strategy. The AGENTS.md file codifies the expected workflow:
flowchart TD
subgraph "During Development"
SHORT["go run ./build/ci.go test -short<br/>Fast feedback, skips slow tests"]
end
subgraph "Before Commit"
FULL["go run ./build/ci.go test<br/>Full suite including reference tests"]
LINT["go run ./build/ci.go lint<br/>Style checks"]
GEN["go run ./build/ci.go check_generate<br/>Generated code up-to-date"]
DEPS["go run ./build/ci.go check_baddeps<br/>Dependency hygiene"]
end
SHORT -->|iterate| SHORT
SHORT -->|ready to commit| FULL
FULL --> LINT
LINT --> GEN
GEN --> DEPS
The testing layers are:
- Unit tests: Standard Go _test.go files alongside the code they test. Most packages have comprehensive unit tests. Use go test ./core/vm/... to test a specific package.
- Integration tests: Tests that exercise multiple packages together, often using the in-memory database backend. The eth/ package has tests that wire up a complete handler with simulated peers.
- Ethereum reference tests: The tests/ directory contains the official Ethereum execution specification test suite. These tests validate that Geth's EVM produces the exact same results as the reference specification across every fork. They test state transitions, block processing, transaction validation, and RLP encoding.
- The cmd/evm tool: A standalone EVM that can execute state tests, trace transactions, and benchmark opcodes in isolation. It's invaluable for debugging EVM issues without running a full node.
The core/vm/runtime/ package provides a testing runtime for isolated EVM execution — you can create an EVM with a synthetic state, execute arbitrary bytecode, and inspect the results. Many internal tests use this pattern:
// Example pattern from core/vm tests
result, _, err := runtime.Execute(code, input, &runtime.Config{
	GasLimit: 1000000,
	// ... configuration
})
Code Generation Patterns
Geth uses go:generate directives for three main purposes:
- gencodec: Generates type-safe marshaling code. Many types in core/types/ use gencodec to produce gen_*.go files containing marshaling methods with per-field overrides (hex encoding, for example), so those methods don't have to be written by hand. The tool also supports other formats; the directive in eth/ethconfig/config.go, which targets TOML, is typical:

//go:generate go run github.com/fjl/gencodec -type Config -formats toml -out gen_config.go

- stringer: Generates String() methods for enum types. The make devtools target installs this tool.
- protoc-gen-go: Protocol buffer code generation for any proto-defined types.
The build system's check_generate command ensures all generated files are up-to-date. If you modify a type that has a go:generate directive, you need to:
make devtools # Install generators (first time only)
go generate ./... # Regenerate all files
flowchart LR
SOURCE["Source type<br/>(e.g., Config struct)"] -->|go:generate directive| GENCODEC["gencodec"]
GENCODEC --> GENERATED["gen_config.go<br/>Type-safe marshaling"]
SOURCE2["Enum type<br/>(e.g., SyncMode)"] -->|go:generate directive| STRINGER["stringer"]
STRINGER --> GENERATED2["syncmode_string.go<br/>String() method"]
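To make the stringer case concrete, here is a hand-written approximation of what the tool emits for a small enum. The Mode type and its constants are hypothetical and exist only for this illustration; the exact generated code varies between stringer versions.

```go
package main

import (
	"fmt"
	"strconv"
)

// Mode is a hypothetical enum, not a type taken from go-ethereum.
type Mode uint32

const (
	ModeFull Mode = iota
	ModeSnap
	ModeLight
)

// The two declarations below mimic stringer's output: all names are packed
// into one string, and an index table marks where each name starts and ends.
const _Mode_name = "ModeFullModeSnapModeLight"

var _Mode_index = [...]uint8{0, 8, 16, 25}

// String returns the constant's name, or "Mode(N)" for out-of-range values.
func (i Mode) String() string {
	if i >= Mode(len(_Mode_index)-1) {
		return "Mode(" + strconv.FormatInt(int64(i), 10) + ")"
	}
	return _Mode_name[_Mode_index[i]:_Mode_index[i+1]]
}

func main() {
	// fmt picks up the String method automatically.
	fmt.Println(ModeSnap, Mode(7))
}
```

Because the names are derived from the constant declarations, adding a constant without rerunning go generate leaves the table stale, which is the kind of drift check_generate is there to catch.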
Auxiliary Tools and Executables
Beyond geth itself, the cmd/ directory contains several useful tools:
| Tool | Purpose |
|---|---|
| cmd/evm | Standalone EVM: run state tests, trace transactions, benchmark |
| cmd/devp2p | P2P protocol testing: ENR operations, discovery crawling, protocol tests |
| cmd/clef | External signer: manages keys outside the Geth process |
| cmd/abigen | ABI binding generator: creates type-safe Go wrappers for contracts |
| cmd/rlpdump | RLP inspector: decode and display RLP-encoded data |
| cmd/era | Era1 archive tool: work with era1 archive files |
| cmd/blsync | Beacon light client sync: lightweight CL synchronization |
The devp2p tool is particularly useful for network debugging. It can crawl the discovery network, test protocol handshakes, and validate ENR records. If you're working on P2P code, this is your testing companion.
How Forks Get Implemented
The established pattern for implementing a new Ethereum hard fork is one of the most instructive ways to understand Geth's architecture. It touches nearly every subsystem we've covered. Here's the recipe:
flowchart TD
A["1. Add activation field to ChainConfig<br/>(params/config.go)"] --> B["2. Add Rules flag<br/>(params/config.go → Rules struct)"]
B --> C["3. Create new instruction set<br/>(core/vm/jump_table.go)"]
C --> D["4. Implement EIP enable functions<br/>(core/vm/eips.go)"]
D --> E["5. Add fork-specific logic<br/>(core/state_processor.go,<br/>consensus/, etc.)"]
E --> F["6. Update jump table selection<br/>(core/vm/evm.go → NewEVM)"]
F --> G["7. Update reference tests<br/>(tests/)"]
G --> H["8. Add override flag<br/>(cmd/geth/main.go)"]
Step 1: Add a time-based activation field to ChainConfig. Post-Merge forks use *uint64 timestamps (e.g., OsakaTime *uint64, AmsterdamTime *uint64). Pre-Merge forks used block numbers.
Step 2: Add a boolean flag to the Rules struct (e.g., IsOsaka bool, IsAmsterdam bool). The Rules are computed from ChainConfig at a specific block number and timestamp.
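Steps 1 and 2 can be sketched with simplified stand-in types. The ChainConfig and Rules structs below keep only two forks, and isForked, RulesAt, and newUint64 are hypothetical helper names; the real types in params/config.go carry many more fields and also take the block number into account.

```go
package main

import "fmt"

// ChainConfig is a simplified stand-in for the struct in params/config.go.
// A nil timestamp means the fork is not scheduled.
type ChainConfig struct {
	OsakaTime     *uint64
	AmsterdamTime *uint64
}

// Rules is a simplified stand-in for the Rules struct: plain booleans that
// are cheap to check on hot paths.
type Rules struct {
	IsOsaka     bool
	IsAmsterdam bool
}

// isForked reports whether a fork scheduled at time s is active at head time.
func isForked(s *uint64, head uint64) bool {
	if s == nil {
		return false
	}
	return *s <= head
}

// RulesAt flattens the config into booleans for a single block timestamp.
func (c *ChainConfig) RulesAt(timestamp uint64) Rules {
	return Rules{
		IsOsaka:     isForked(c.OsakaTime, timestamp),
		IsAmsterdam: isForked(c.AmsterdamTime, timestamp),
	}
}

// newUint64 returns a pointer to v, handy for filling optional fields.
func newUint64(v uint64) *uint64 { return &v }

func main() {
	cfg := &ChainConfig{OsakaTime: newUint64(1000)} // Amsterdam left unscheduled
	fmt.Printf("%+v\n", cfg.RulesAt(999))
	fmt.Printf("%+v\n", cfg.RulesAt(1000))
}
```

Using a pointer for the activation time lets the config distinguish "activates at genesis" (a pointer to zero) from "never scheduled" (nil), which a plain uint64 could not express.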
Step 3: Create a new instruction set constructor in jump_table.go. Copy the previous fork's table and add new opcodes:
func newAmsterdamInstructionSet() JumpTable {
	instructionSet := newOsakaInstructionSet()
	enable7843(&instructionSet) // SLOTNUM opcode
	enable8024(&instructionSet) // SWAPN, DUPN, EXCHANGE
	return validate(instructionSet)
}
Step 4: Implement the enable* functions that modify specific entries in the jump table — setting execution functions, gas costs, and stack parameters for new or modified opcodes.
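As a rough illustration of what an enable function does, the sketch below models a jump table with simplified stand-in types. The operation struct here is a reduced version of the one in core/vm, and opExample, its opcode value, and enableExample are hypothetical; the real execute functions take a program counter, the EVM, and a scope context.

```go
package main

import "fmt"

// operation is a simplified stand-in for the per-opcode entry in core/vm.
type operation struct {
	execute     func(stack *[]uint64) // what the opcode does
	constantGas uint64                // fixed gas charged before execution
	minStack    int                   // items that must already be on the stack
	maxStack    int                   // highest stack height that cannot overflow
}

// JumpTable maps each of the 256 possible opcode bytes to its operation.
// A nil entry stands for an undefined opcode in this sketch.
type JumpTable [256]*operation

const (
	stackLimit = 1024 // EVM stack depth limit
	opExample  = 0xE0 // hypothetical opcode byte, chosen for illustration
)

// enableExample installs a hypothetical opcode that pushes the constant 42.
func enableExample(jt *JumpTable) {
	jt[opExample] = &operation{
		execute:     func(stack *[]uint64) { *stack = append(*stack, 42) },
		constantGas: 2,
		minStack:    0,              // pops nothing
		maxStack:    stackLimit - 1, // pushes one item, so one slot must be free
	}
}

func main() {
	var jt JumpTable
	enableExample(&jt)

	op := jt[opExample]
	var stack []uint64
	op.execute(&stack)
	fmt.Println(op.constantGas, op.maxStack, stack)
}
```

Because each enable function only touches the entries its EIP defines, a fork's instruction set reads as the previous fork's table plus a short list of EIPs.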
Step 5: Add any non-EVM fork logic — new system contracts, modified state transition rules, consensus changes, new transaction types.
Step 6: Add the new fork to the switch statement in NewEVM(). The newest fork is always checked first.
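The ordering matters because fork flags are cumulative: once a later fork is active, the flags for earlier forks remain true as well. A minimal sketch, assuming a two-field Rules struct and a hypothetical selectInstructionSet helper that returns a name instead of a real jump table:

```go
package main

import "fmt"

// Rules is a simplified stand-in holding only two fork flags.
type Rules struct {
	IsOsaka     bool
	IsAmsterdam bool
}

// selectInstructionSet checks forks from newest to oldest. If IsOsaka were
// tested first, an Amsterdam block would wrongly get the Osaka table,
// because IsOsaka is still true after Amsterdam activates.
func selectInstructionSet(r Rules) string {
	switch {
	case r.IsAmsterdam:
		return "amsterdam"
	case r.IsOsaka:
		return "osaka"
	default:
		return "older" // earlier forks are omitted from this sketch
	}
}

func main() {
	fmt.Println(selectInstructionSet(Rules{IsOsaka: true, IsAmsterdam: true}))
	fmt.Println(selectInstructionSet(Rules{IsOsaka: true}))
}
```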
Step 7: Update the reference test suite to include the new fork's expected behavior.
Step 8: Add an --override.<forkname> CLI flag for testing the fork before mainnet activation.
Tip: The AGENTS.md file at the repo root contains contributor guidelines, including commit message format (<package>: description), the pre-commit checklist, and the pull request title convention. Read this before your first PR.
Orientation Tips for Exploring Further
After seven articles, you have a complete mental model of Geth. Here are some parting navigation heuristics:
- Follow the interfaces. When you encounter a method call on an interface, find the concrete implementation by searching for the struct that satisfies it. Go's implicit interface satisfaction means grep is often more effective than your IDE's "find implementations" feature.
- Start from eth/backend.go. The Ethereum struct and New() constructor are the Rosetta Stone. Every major subsystem is created and wired here. When in doubt about how two components connect, check eth.New().
- Use the rawdb package as your database dictionary. If you need to know what's stored in the database and how keys are structured, core/rawdb/ has the answers.
- The params package is the source of truth for protocol constants. Gas costs, fork activation logic, chain IDs, precompile addresses: it's all in params/.
- Tests are documentation. When the code comments don't explain a behavior, the tests often do. Look for _test.go files in the same package.
- Commit messages are changelogs. Geth's commit history follows a strict <package>: description format. Use git log --oneline -- core/vm/ to understand the evolution of any subsystem.
The go-ethereum codebase rewards careful reading. It's remarkably well-structured for its size and age, and the interface-driven design means you can understand any subsystem in isolation. Now that you have the map, go explore.