Channels, Network Poller, and the Runtime's Cross-Cutting Concerns
Prerequisites
- Articles 1-5: Full series through Memory and GC
- Concurrent programming (atomic operations, lock-free structures)
- I/O multiplexing concepts (epoll, kqueue)
We've traced Go from repository structure through the compiler pipeline, runtime bootstrap, scheduler, and memory management. This final article examines the high-level concurrency primitives and I/O infrastructure built atop those foundations: channels, the network poller, synchronization primitives, and the compiler directives that tie the runtime together. These are the systems that make Go's "don't communicate by sharing memory; share memory by communicating" philosophy possible at the implementation level.
Channel Implementation: hchan and sudog
Channels are the signature concurrency primitive of Go. Under the hood, they're implemented as a mutex-protected circular buffer with wait queues:
```go
type hchan struct {
	qcount   uint           // total data in the queue
	dataqsiz uint           // size of the circular queue
	buf      unsafe.Pointer // points to an array of dataqsiz elements
	elemsize uint16
	closed   uint32
	timer    *timer // timer feeding this chan
	elemtype *_type // element type
	sendx    uint   // send index
	recvx    uint   // receive index
	recvq    waitq  // list of recv waiters
	sendq    waitq  // list of send waiters
	bubble   *synctestBubble
	lock     mutex
}
```
The waitq type is a linked list of sudog structures — the runtime's representation of a goroutine waiting on a synchronization operation:
src/runtime/runtime2.go#L404-L446
```mermaid
sequenceDiagram
    participant G1 as Goroutine 1
    participant CH as hchan (buffered, cap=2)
    participant G2 as Goroutine 2
    Note over CH: buf: [_, _], sendx=0, recvx=0
    G1->>CH: ch <- "a" (chansend)
    Note over CH: buf: ["a", _], sendx=1
    G1->>CH: ch <- "b" (chansend)
    Note over CH: buf: ["a", "b"], sendx=0
    G1->>CH: ch <- "c" (buffer full!)
    Note over CH: G1 parked in sendq as sudog
    G2->>CH: <-ch (chanrecv)
    Note over CH: Returns "a", copies "c" to buf
    CH->>G1: goready(G1) — wake from sendq
    Note over CH: buf: ["c", "b"], recvx=1
```
The channel operations have three interesting fast paths:
- Direct send: When a receiver is already waiting in `recvq`, the sender copies the value directly to the receiver's stack (bypassing the buffer entirely) and wakes the receiver with `goready`. This avoids two copies through the buffer.
- Buffered send/receive: When the buffer has space (or data), the operation completes without blocking: just a copy to/from the circular buffer under the lock.
- Blocking: When neither fast path applies, the goroutine creates a sudog, enqueues itself in the appropriate wait queue, and calls `gopark` to deschedule. As we saw in Article 4, `gopark` integrates with the scheduler so the goroutine waits without consuming an OS thread.
The invariants documented at the top of chan.go are worth studying:
For buffered channels: if there are items in the buffer (qcount > 0), the receive queue must be empty. And if there's buffer space (qcount < dataqsiz), the send queue must be empty. These invariants simplify the implementation because you never have waiters and buffer space simultaneously.
Tip: The `debugChan` constant at line 31 can be set to `true` during development to enable verbose channel operation logging. The runtime has similar debug constants for most subsystems.
Select Statement Implementation
The select statement is compiled into calls to runtime.selectgo:
Each case is represented by an scase struct containing the channel and a pointer to the data element. The implementation is careful about lock ordering — when a select involves multiple channels, all channels must be locked simultaneously to prevent deadlock:
```go
func sellock(scases []scase, lockorder []uint16) {
	var c *hchan
	for _, o := range lockorder {
		c0 := scases[o].c
		if c0 != c {
			c = c0
			lock(&c.lock)
		}
	}
}
```
The lockorder slice sorts channels by address, ensuring a consistent global lock order. Cases are evaluated in a randomized order (using a pollorder slice) to prevent starvation — without randomization, the first case would be systematically favored.
```mermaid
flowchart TD
    A["select statement<br/>(N cases)"] --> B["Shuffle pollorder<br/>(random evaluation)"]
    B --> C["Sort lockorder<br/>(by channel address)"]
    C --> D["Lock all channels"]
    D --> E{"Any case ready?"}
    E -->|Yes| F["Execute that case,<br/>unlock all"]
    E -->|No| G["Create sudog for each case"]
    G --> H["Enqueue in all channel wait queues"]
    H --> I["gopark (sleep)"]
    I --> J["Woken by some channel"]
    J --> K["Dequeue from all other channels"]
    K --> F
```
The Network Poller
Go's network I/O appears blocking to the goroutine but is actually multiplexed onto non-blocking I/O under the hood. The network poller is the bridge between the two worlds.
The platform-independent interface is defined in netpoll.go:
src/runtime/netpoll.go#L15-L41
Each platform must implement: netpollinit(), netpollopen(fd, pd), netpollclose(fd), netpoll(delta), and netpollBreak(). The pollDesc structure tracks the state of each file descriptor:
src/runtime/netpoll.go#L51-L80
Each pollDesc contains two semaphores (rg and wg) for read and write operations. These semaphores use goroutine pointers as state: pdNil (idle), pdWait (preparing to park), pdReady (I/O ready), or a *g pointer (goroutine parked and waiting).
On Linux, the implementation uses epoll:
src/runtime/netpoll_epoll.go#L21-L40
```mermaid
graph TD
    subgraph "User Code"
        A["conn.Read()"]
    end
    subgraph "net package"
        B["pollDesc.waitRead()"]
    end
    subgraph "Runtime"
        C["runtime_pollWait"]
        D["gopark on pollDesc.rg"]
    end
    subgraph "Scheduler"
        E["findRunnable calls netpoll"]
        F["epoll_wait returns ready fds"]
        G["goready parked goroutines"]
    end
    A --> B --> C --> D
    E --> F --> G
    G -.->|"wake"| D
```
The integration with the scheduler (from Article 4) is elegant: findRunnable calls netpoll(0) (non-blocking) when looking for work. If a thread is about to park with no work, it calls netpoll(delta) with a timeout to wait for I/O. The sysmon thread also periodically polls to ensure no I/O events are missed.
Runtime Synchronization Primitives
The runtime builds its own synchronization hierarchy, documented in HACKING.md:
src/runtime/HACKING.md#L139-L179
| Primitive | Blocks G | Blocks M | Blocks P | Use Case |
|---|---|---|---|---|
| `mutex` | Yes | Yes | Yes | Protecting shared runtime state |
| `note` | Yes | Yes | Yes/No | One-shot notifications |
| `gopark`/`goready` | Yes | No | No | Channel ops, netpoll, timers |
The runtime mutex is the lowest-level lock. On Linux, it's implemented using futex:
src/runtime/lock_futex.go#L1-L53
This is not sync.Mutex — it's a runtime-internal lock that blocks the OS thread. Using it blocks both the goroutine and the thread, which is why it's reserved for short critical sections in the runtime's lowest levels.
The note primitive provides one-shot notification with futex:
```go
func notewakeup(n *note) {
	old := atomic.Xchg(key32(&n.key), 1)
	if old != 0 {
		throw("notewakeup - double wakeup")
	}
	futexwakeup(key32(&n.key), 1)
}
```
The semaphore implementation in sema.go is what sync.Mutex actually uses:
It uses a balanced tree of sudogs (the same structure used by channels) hashed into a fixed table of 251 entries. This design avoids allocating per-mutex kernel resources while providing O(log n) lookup for waiters on distinct addresses.
Linkname and Compiler Directives
The runtime lives in a privileged position — it needs to expose functions to other packages without making them part of the public API. The //go:linkname directive enables this:
src/runtime/HACKING.md#L277-L356
Three forms exist:
- Push linkname: Give a local definition a symbol name in another package
- Pull linkname: Reference a symbol defined in another package
- Export linkname: Mark a symbol as available for linkname by other packages
For example, runtime.main accesses the user's main.main via:
//go:linkname main_main main.main
func main_main()
The runtime also uses compiler directives that are unavailable to normal Go code:
src/runtime/HACKING.md#L424-L488
- `//go:systemstack` — Function must run on the system stack (g0)
- `//go:nowritebarrier` — Assert no write barriers in this function
- `//go:nowritebarrierrec` — Assert no write barriers in this function or any function it calls (recursively)
- `//go:nosplit` — Don't insert a stack growth check (the function must fit in the current stack)
These directives are essential for the runtime's correctness. For example, code that runs without a P (during scheduler transitions) must not trigger write barriers, because write barriers require a P. The nowritebarrierrec directive enforces this at compile time across the entire call graph.
Tip: When reading runtime code, pay attention to `//go:nosplit` annotations. They indicate functions that cannot grow the stack and therefore have strict size constraints. If you see `//go:systemstack` combined with `//go:nosplit`, the function runs on the fixed-size system stack and must be very careful about stack usage.
The Complete Picture
Over these six articles, we've traced Go from its repository structure and bootstrap process, through the go command's build orchestration, the compiler's SSA pipeline, the runtime's assembly bootstrap and G-M-P scheduler, memory allocation and garbage collection, and finally the channel, networking, and synchronization infrastructure.
The recurring design themes are worth calling out:
- Layered dispatch: Thin entry points delegate to architecture-specific implementations (compiler, linker, runtime entry, netpoll)
- Lock-free fast paths: Per-P mcaches for allocation, per-P run queues for scheduling, direct sends for channels
- Declarative constraints: SSA pass ordering, API compatibility files, lock rankings
- Cooperative integration: The scheduler, GC, netpoll, and channel operations all coordinate through `gopark`/`goready` rather than separate blocking mechanisms
The Go runtime is a cohesive system where every piece — from the first assembly instruction to the garbage collector's write barrier — is designed to work together. Understanding these internals doesn't just satisfy curiosity; it makes you a better Go programmer, giving you the mental model to reason about performance, debug mysterious behavior, and write code that works with the runtime rather than against it.