I/O in Node.js: Streams, Handles, and the Event Loop
Prerequisites
- Article 1: architecture-overview
- Article 3: cpp-object-model-and-bindings (BaseObject/Wrap hierarchy)
- Understanding of libuv's event loop model (handles vs requests, uv_run phases)
- Familiarity with the Node.js streams API from a user's perspective
Node.js's raison d'être is non-blocking I/O. Every network connection, file operation, timer, and child process ultimately flows through the same machinery: libuv handles and requests on the C++ side, streams and event emitters on the JavaScript side, connected by the Wrap hierarchy we explored in Article 3. This article shows how these pieces fit together in practice — from a TCP connection's lifecycle to the ingenious timer system to the microtask queue that powers process.nextTick().
libuv Integration: Handles vs Requests
As we established in Article 3, libuv has two fundamental abstractions that Node.js wraps:
Handles (uv_handle_t) are long-lived objects that can generate multiple events over their lifetime. TCP servers, TCP connections, timers, file system watchers, and signal handlers are all handles. They keep the event loop alive when referenced.
Requests (uv_req_t) are one-shot operations. A file read, a DNS lookup, a connect attempt — each creates a request, dispatches it, and receives a single callback when complete.
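The handle/request distinction is visible from JavaScript: objects backed by handles expose `ref()`/`unref()`. A quick illustration with a timer (a `uv_timer_t` handle under the hood):

```javascript
// A timer is backed by a libuv handle; while referenced, it keeps
// the event loop (and therefore the process) alive until it fires.
const t = setTimeout(() => {}, 60000);
console.log(t.hasRef()); // true: the loop would wait for this timer

// unref() drops the handle's reference; the loop may now exit even
// though the timer has not fired yet.
t.unref();
console.log(t.hasRef()); // false
```

Servers and sockets expose the same `ref()`/`unref()` pair, which is how idle keep-alive connections can be prevented from pinning a process open.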
```mermaid
graph TD
subgraph "Handles — Long-lived"
TCP["uv_tcp_t<br/>TCP socket"]
TIMER["uv_timer_t<br/>Timer"]
PIPE["uv_pipe_t<br/>Unix pipe / Windows named pipe"]
FSE["uv_fs_event_t<br/>File system watcher"]
SIGNAL["uv_signal_t<br/>Signal handler"]
UDP["uv_udp_t<br/>UDP socket"]
end
subgraph "Requests — One-shot"
FSREQ["uv_fs_t<br/>File system operation"]
CONN["uv_connect_t<br/>Connection attempt"]
WRITE["uv_write_t<br/>Stream write"]
DNS["uv_getaddrinfo_t<br/>DNS lookup"]
WORK["uv_work_t<br/>Thread pool work"]
end
subgraph "Event Loop"
LOOP["uv_run()<br/>Process events"]
end
TCP --> LOOP
TIMER --> LOOP
FSREQ --> LOOP
CONN --> LOOP
```
The event loop in SpinEventLoopInternal() calls uv_run(UV_RUN_DEFAULT), which processes all pending I/O events. In UV_RUN_DEFAULT mode, uv_run() keeps iterating, blocking in the poll phase when there is nothing to do, until no active handles or requests remain. Between uv_run() iterations, platform->DrainTasks(isolate) processes V8 background tasks such as optimized-code compilation and garbage-collection finalization.
The Wrap Hierarchy in Practice: TCP Connection Lifecycle
Let's trace a real TCP connection to see how the C++ wrap hierarchy (from Article 3) operates. When you call net.createServer() and a client connects:
```mermaid
sequenceDiagram
participant NET as lib/net.js
participant TW as TCPWrap (C++)
participant CW as ConnectionWrap
participant LSW as LibuvStreamWrap
participant HW as HandleWrap
participant UV as libuv
Note over NET: server.listen(port)
NET->>TW: new TCP(TCPConstants.SERVER)
TW->>HW: HandleWrap(env, object, &handle_)
HW->>UV: uv_tcp_init(loop, &handle_)
NET->>TW: bind(address, port)
NET->>TW: listen(backlog)
TW->>UV: uv_listen(&handle_, backlog, OnConnection)
Note over UV: Client connects
UV->>CW: OnConnection(handle, status)
CW->>TW: TCPWrap::Instantiate(env, parent, SOCKET)
CW->>UV: uv_accept(server_handle, &client_handle)
CW->>NET: MakeCallback(onconnection, client_wrap)
Note over NET: Data flows
NET->>LSW: ReadStart()
LSW->>UV: uv_read_start(handle, OnAlloc, OnRead)
UV->>LSW: OnRead(handle, nread, buf)
LSW->>NET: MakeCallback(onread, buffer)
```
TCPWrap inherits from ConnectionWrap<TCPWrap, uv_tcp_t>, which inherits from LibuvStreamWrap, then HandleWrap, then AsyncWrap, then BaseObject. Each layer adds functionality:
- BaseObject: Links the C++ object to the JavaScript socket object
- AsyncWrap: Provides the async_id for async_hooks tracking
- HandleWrap: Manages the libuv handle lifecycle (ref/unref/close)
- LibuvStreamWrap: Implements ReadStart()/ReadStop() and write operations
- ConnectionWrap: Handles the OnConnection() and AfterConnect() callbacks
- TCPWrap: TCP-specific methods like bind(), listen(), connect()
StreamBase is worth noting separately — it's an abstract interface that LibuvStreamWrap implements, providing a unified stream API that JavaScript can call. Both libuv streams and TLS streams implement StreamBase, which is why tls.TLSSocket can transparently replace a plain net.Socket.
JavaScript Stream Architecture
On the JavaScript side, Node.js streams are state machines built on EventEmitter. The four stream types — Readable, Writable, Duplex, and Transform — live in lib/internal/streams/.
```mermaid
stateDiagram-v2
[*] --> Flowing: pipe() or resume()
[*] --> Paused: Initial state
Paused --> Flowing: resume() / pipe() / 'data' listener
Flowing --> Paused: pause()
Flowing --> Ended: push(null)
Paused --> Ended: push(null) + drain
Ended --> [*]
state Flowing {
[*] --> Reading
Reading --> Buffering: _read() returns data
Buffering --> Reading: Data consumed below hwm
Buffering --> Backpressure: Buffer > highWaterMark
Backpressure --> Reading: Data consumed
}
```
Readable streams operate in two modes: flowing (data is pushed to consumers automatically) and paused (data must be pulled with read()). The highWaterMark controls buffering: when the internal buffer exceeds this threshold, the stream signals backpressure by returning false from push().
Writable streams have a complementary state machine. The critical method is write(), which returns false when the internal buffer is full — the caller should wait for the 'drain' event before writing more data.
The pipeline() utility in lib/internal/streams/pipeline.js handles the complex error propagation and cleanup when chaining streams, making it the recommended way to connect streams rather than .pipe().
Tip: Always use pipeline() instead of .pipe() for production code. pipeline() properly handles errors and cleanup across the entire chain, while .pipe() famously doesn't clean up on errors, leading to resource leaks.
The Timer System: Linked Lists and a Single libuv Timer
The timer implementation in lib/internal/timers.js has one of the best ASCII art comments in the codebase. The design is ingenious: rather than creating a libuv timer for each setTimeout() call (which would be expensive with thousands of active timers), Node.js groups timers by duration.
```mermaid
graph TD
subgraph "Timer Architecture"
MAP["PriorityQueue + Object Map<br/>Keys: durations in ms"]
MAP --> L40["TimersList {duration: 40ms}"]
MAP --> L320["TimersList {duration: 320ms}"]
MAP --> L1000["TimersList {duration: 1000ms}"]
L40 --> T1["Timer A<br/>_onTimeout: cb1"]
T1 --> T2["Timer B<br/>_onTimeout: cb2"]
T2 --> T3["Timer C<br/>_onTimeout: cb3"]
L1000 --> T4["Timer D<br/>_onTimeout: cb4"]
T4 --> T5["Timer E<br/>_onTimeout: cb5"]
end
UV_TIMER["Single libuv timer<br/>Set to earliest expiry"] --> MAP
```
Each duration bucket is a doubly-linked list (TimersList). Adding a timer is O(1) — just append to the list for that duration. Removing is O(1) — unlink from the doubly-linked list. When a timer fires, only the head of the relevant list needs checking, because all timers in the list share the same duration and were inserted in chronological order.
A PriorityQueue (binary heap) tracks which duration bucket expires next. A single libuv uv_timer_t is set to the earliest expiry time. When it fires, Node.js processes all expired timers across all duration buckets, then resets the libuv timer to the next expiry.
This design means Node.js can efficiently manage hundreds of thousands of active timers — a common scenario in HTTP servers where every connection has an idle timeout.
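The bucketing idea can be sketched in a few lines. This is an illustrative model only: Node's actual implementation uses intrusive doubly-linked lists and a binary heap, not arrays and a Map scan.

```javascript
// Map from duration -> FIFO list of pending timers. Because every
// timer in a list shares one duration and is appended on creation,
// each list is automatically sorted by expiry time.
const buckets = new Map();

function schedule(callback, duration, now = Date.now()) {
  let list = buckets.get(duration);
  if (list === undefined) buckets.set(duration, (list = []));
  list.push({ expiry: now + duration, callback }); // O(1) insert
}

function runExpired(now = Date.now()) {
  for (const list of buckets.values()) {
    // Only the head can be due; later entries expire strictly later.
    while (list.length > 0 && list[0].expiry <= now) {
      list.shift().callback();
    }
  }
}
```

Scheduling `schedule(a, 10, 0)` and `schedule(b, 50, 0)` then calling `runExpired(20)` fires only `a`; `b` stays buffered in its 50 ms bucket until its own expiry passes.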
Microtasks, nextTick, and setImmediate
The relationship between process.nextTick(), V8 microtasks (Promises), and setImmediate() is one of the most asked-about aspects of Node.js. They execute at different points in the event loop:
```mermaid
flowchart TD
UV["uv_run() iteration"] --> TIMERS["1. Timers phase<br/>setTimeout / setInterval"]
TIMERS --> NT1["⚡ nextTick queue + microtasks"]
NT1 --> PENDING["2. Pending I/O callbacks"]
PENDING --> NT2["⚡ nextTick queue + microtasks"]
NT2 --> POLL["3. Poll phase<br/>I/O events"]
POLL --> NT3["⚡ nextTick queue + microtasks"]
NT3 --> CHECK["4. Check phase<br/>setImmediate callbacks"]
CHECK --> NT4["⚡ nextTick queue + microtasks"]
NT4 --> CLOSE["5. Close callbacks"]
CLOSE --> NT5["⚡ nextTick queue + microtasks"]
NT5 --> UV
```
process.nextTick() uses a FixedQueue and the TickInfo shared state (an aliased typed array defined in env.h that's readable from both C++ and JavaScript without crossing the boundary). Its kHasTickScheduled flag tells the C++ layer that the nextTick queue needs draining.
The critical insight is that nextTick and microtasks run between every phase of the event loop, not just once per iteration. This means process.nextTick() callbacks execute before any I/O — a sharp tool that can starve I/O if used carelessly.
setImmediate() runs in the "check" phase, which is after the "poll" phase. This means setImmediate() callbacks execute after I/O events have been processed, making it the right choice for deferring work without starving I/O.
async_hooks and AsyncWrap Tracking
Every async operation in Node.js flows through AsyncWrap (from Article 3), which enables the async_hooks API. The tracking works through four lifecycle events:
```mermaid
sequenceDiagram
participant AH as async_hooks
participant AW as AsyncWrap (C++)
participant UV as libuv
Note over AW: Creating a new TCP connection
AW->>AH: init(asyncId, type, triggerAsyncId, resource)
Note over AH: Track: "TCPWrap #7 triggered by #3"
Note over UV: Connection callback fires
AW->>AH: before(asyncId)
Note over AH: Set execution context to #7
AW->>AW: MakeCallback(onconnection)
AW->>AH: after(asyncId)
Note over AH: Restore previous context
Note over AW: Socket closed
AW->>AH: destroy(asyncId)
Note over AH: Cleanup tracking for #7
```
The provider types are defined in src/async_wrap.h using a macro that lists every async resource type: TCPWRAP, FSREQCALLBACK, GETADDRINFOREQWRAP, HTTP2SESSION, and dozens more. Each type gets a unique enum value that async_hooks consumers can use to filter events.
The executionAsyncId() and triggerAsyncId() functions expose the async context chain, enabling tools like AsyncLocalStorage to propagate request-scoped data through async boundaries without explicit parameter passing. This is built on the same AsyncWrap infrastructure — AsyncLocalStorage stores data keyed by async_id and propagates it through the init hook's triggerAsyncId chain.
Tip: async_hooks has measurable overhead. In production, prefer AsyncLocalStorage (which optimizes the common case) over raw async_hooks. If you need diagnostic hooks, consider the diagnostics_channel API instead, which has lower overhead for pub/sub-style instrumentation.
What's Next
We've now covered the complete I/O path: from libuv events through C++ wraps to JavaScript streams and back. In the final article of this series, we'll explore Node.js's cross-cutting concerns — the permission model, the error system, Web Platform API integration, V8 snapshots, single executable applications, and the built-in test runner.