Engine.IO: Transport Negotiation, Upgrades, and Heartbeats

Advanced

Prerequisites

  • Node.js HTTP module (createServer, upgrade event)
  • WebSocket protocol basics (upgrade handshake, frames)
  • Article 1: Architecture and Codebase Navigation

As we established in Article 1, Socket.IO delegates all transport concerns to Engine.IO. This article traces the complete lifecycle of an Engine.IO connection: from the first HTTP request through verification, handshake, transport creation, the seamless upgrade from polling to WebSocket, and the heartbeat protocol that detects dead connections. Understanding this layer is essential because most Socket.IO bugs related to "connection dropping" or "client not connecting" live here.

Request Verification and the Handshake

Every Engine.IO connection begins with an HTTP request. Before anything else, that request passes through BaseServer.verify(), which performs four sequential checks:

sequenceDiagram
    participant Client
    participant Server as Engine.IO Server
    
    Client->>Server: GET /engine.io/?EIO=4&transport=polling
    
    Note over Server: verify() begins
    Server->>Server: 1. Transport check (polling|websocket valid?)
    Server->>Server: 2. Origin header validation
    Server->>Server: 3. SID check (new or existing session?)
    Server->>Server: 4. allowRequest callback (if configured)
    
    alt Verification fails
        Server-->>Client: HTTP 4xx error
    else Verification passes
        Server->>Server: handshake() — generate ID, create transport
        Server-->>Client: {"sid":"abc123","upgrades":["websocket"],"pingInterval":25000,"pingTimeout":20000}
    end

The transport check at lines 276-284 validates that the requested transport is in the opts.transports array. WebTransport is handled separately: it doesn't go through verify() at all but enters via onWebTransportSession().

The SID check is particularly interesting. If a SID is present, the server validates that the client exists and that the transport matches — you can't suddenly switch from polling to websocket without going through the upgrade protocol. If no SID is present, it's a handshake request and must be a GET.
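To make the order of the four checks concrete, here is a minimal sketch of the same decision flow. It is not the engine.io source: the SketchRequest and SketchServer types, the verifySketch name, the failure strings, the synchronous allowRequest signature, and the specific Origin rule (rejecting control characters) are all assumptions made for illustration.

```typescript
// Hypothetical, simplified model of the four checks. Not the engine.io source.
interface SketchRequest {
  method: string;
  transport: string; // value of the `transport` query parameter
  sid?: string;      // value of the `sid` query parameter, if any
  origin?: string;   // Origin header, if any
}

interface SketchServer {
  transports: string[];
  // Maps a session ID to the name of the transport that session currently uses.
  clients: { [sid: string]: string };
  // Assumption: a synchronous predicate, to keep the sketch short.
  allowRequest?: (req: SketchRequest) => boolean;
}

// Returns null when the request passes, otherwise a short failure reason.
function verifySketch(
  server: SketchServer,
  req: SketchRequest,
  upgrade: boolean,
): string | null {
  // 1. Transport check: the requested transport must be enabled.
  if (server.transports.indexOf(req.transport) === -1) return "unknown transport";

  // 2. Origin header validation (assumed rule: no control characters).
  if (req.origin !== undefined && /[\x00-\x1f\x7f]/.test(req.origin)) {
    return "bad request";
  }

  // 3. SID check: an existing session must exist and keep its transport,
  //    unless the request is part of the upgrade protocol.
  if (req.sid !== undefined) {
    if (!Object.prototype.hasOwnProperty.call(server.clients, req.sid)) {
      return "unknown sid";
    }
    const current = server.clients[req.sid];
    if (!upgrade && current !== req.transport) return "bad request";
  } else if (req.method !== "GET") {
    // No SID means a handshake, and a handshake must be a GET.
    return "bad handshake method";
  }

  // 4. Optional application-level gate.
  if (server.allowRequest && !server.allowRequest(req)) return "forbidden";
  return null;
}
```

Because the checks run in order, a request with an unknown transport is rejected before the session table or the allowRequest callback is ever consulted.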

Once verification passes, handshake() takes over. It generates a unique session ID via base64id.generateId(), detects the protocol version (EIO=4 or EIO=3), creates the appropriate Transport instance, and instantiates an Engine.IO Socket. The handshake response tells the client everything it needs: the session ID, available upgrades, and the heartbeat timing parameters.
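The handshake response can be sketched as a small payload builder. The field names match the response shown in the diagram, and the leading "0" is the Engine.IO open packet type. Everything else is an assumption of this sketch: the function names are hypothetical, the session ID is passed in rather than generated, and only the polling-to-WebSocket upgrade is modelled.

```typescript
// Hypothetical sketch of the handshake payload. Not the engine.io source.
interface HandshakeOptions {
  pingInterval: number;
  pingTimeout: number;
  allowUpgrades: boolean;
  transports: string[];
}

interface HandshakePayload {
  sid: string;
  upgrades: string[];
  pingInterval: number;
  pingTimeout: number;
}

// Assumption: the only upgrade path modelled here is polling -> websocket.
function availableUpgrades(current: string, opts: HandshakeOptions): string[] {
  if (!opts.allowUpgrades) return [];
  if (current === "polling" && opts.transports.indexOf("websocket") !== -1) {
    return ["websocket"];
  }
  return [];
}

function buildHandshake(
  sid: string,
  transport: string,
  opts: HandshakeOptions,
): HandshakePayload {
  return {
    sid: sid,
    upgrades: availableUpgrades(transport, opts),
    pingInterval: opts.pingInterval,
    pingTimeout: opts.pingTimeout,
  };
}

// An open packet is the packet type "0" followed by the JSON payload.
function encodeOpenPacket(payload: HandshakePayload): string {
  return "0" + JSON.stringify(payload);
}
```

A client that handshakes directly over WebSocket receives an empty upgrades array in this model, since there is nothing left to upgrade to.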

Server Defaults

The default configuration lives in the BaseServer constructor:

| Option | Default | Purpose |
|---|---|---|
| pingTimeout | 20,000 ms | How long to wait for a pong before declaring the connection dead |
| pingInterval | 25,000 ms | How often to send ping packets |
| upgradeTimeout | 10,000 ms | How long the upgrade probe can take |
| maxHttpBufferSize | 1,000,000 bytes | Maximum message size (DoS protection) |
| transports | ["polling", "websocket"] | Enabled transports (WebTransport is disabled by default) |
| allowUpgrades | true | Whether to allow transport upgrades |
| allowEIO3 | false | Backward compatibility with Engine.IO v3 |

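Tip: If you're deploying behind a load balancer that doesn't support WebSocket, set transports: ["polling"] to avoid wasted upgrade attempts. Conversely, if all your clients support WebSocket, set transports: ["websocket"] to skip polling entirely and connect faster. In that case, configure the clients to start with WebSocket as well, because a handshake attempted over polling would fail the transport check.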
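As a rough illustration of how these defaults combine with overrides, the sketch below models the options as a plain object. The option names and default values come from the table above; the SketchEngineOptions interface, SKETCH_DEFAULTS, and resolveOptions are hypothetical names for this example and are not part of the engine.io API.

```typescript
// Hypothetical model of the option defaults. Not the engine.io API.
interface SketchEngineOptions {
  pingTimeout: number;       // ms
  pingInterval: number;      // ms
  upgradeTimeout: number;    // ms
  maxHttpBufferSize: number; // bytes
  transports: string[];
  allowUpgrades: boolean;
  allowEIO3: boolean;
}

const SKETCH_DEFAULTS: SketchEngineOptions = {
  pingTimeout: 20000,
  pingInterval: 25000,
  upgradeTimeout: 10000,
  maxHttpBufferSize: 1000000,
  transports: ["polling", "websocket"],
  allowUpgrades: true,
  allowEIO3: false,
};

// Overrides win over defaults, one option at a time.
function resolveOptions(
  overrides: Partial<SketchEngineOptions>,
): SketchEngineOptions {
  return { ...SKETCH_DEFAULTS, ...overrides };
}

// The two deployments described in the tip:
const pollingOnly = resolveOptions({ transports: ["polling"] });
const websocketOnly = resolveOptions({ transports: ["websocket"] });
```

With the real library, the same option names would be passed when the server is created.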

Transport Implementations

Engine.IO supports three transports, each extending a common base Transport class:

classDiagram
    class Transport {
        +sid: string
        +writable: boolean
        +protocol: 3 | 4
        #_readyState: ReadyState
        +send(packets: Packet[])*
        +doClose(fn)*
        +onData(data)
        +onPacket(packet)
    }
    class Polling {
        +maxHttpBufferSize: number
        +httpCompression: object
        -req: EngineRequest
        -res: ServerResponse
        +name: "polling"
        +send(packets)
    }
    class WebSocketTransport {
        -socket: WsWebSocket
        +perMessageDeflate: options
        +name: "websocket"
        +handlesUpgrades: true
        +send(packets)
    }
    class WebTransportTransport {
        -session: any
        -writer: WritableStreamDefaultWriter
        +name: "webtransport"
        +send(packets)
    }
    
    Transport <|-- Polling
    Transport <|-- WebSocketTransport
    Transport <|-- WebTransportTransport

Polling (packages/engine.io/lib/transports/polling.ts) implements HTTP long-polling with a two-channel design: GET requests receive data from the server (the response is held open until there's data to send), and POST requests send data to the server. At most two HTTP requests are in flight at once: one long-lived GET, plus a short-lived POST whenever the client has something to send.
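The two-channel idea can be modelled without any HTTP at all. The sketch below is a toy, not the Polling class: PollingSketch and its method names are hypothetical, and buffering lives inside the transport to keep the example short. The "\x1e" record separator between packets in one response body follows the v4 polling payload format.

```typescript
// Toy model of long-polling's two channels. Not the engine.io Polling class.
type Respond = (body: string) => void;

class PollingSketch {
  received: string[] = [];                 // data that arrived via POST
  private buffered: string[] = [];         // packets waiting for a GET
  private pendingGet: Respond | null = null;

  // GET channel (server to client): answer now if data is waiting,
  // otherwise hold the response open.
  onGet(respond: Respond): void {
    if (this.buffered.length > 0) {
      respond(this.drain());
    } else {
      this.pendingGet = respond;
    }
  }

  // POST channel (client to server).
  onPost(body: string): void {
    this.received.push(body);
  }

  // Server-side send: completes the held GET if there is one.
  send(packet: string): void {
    this.buffered.push(packet);
    const respond = this.pendingGet;
    if (respond !== null) {
      this.pendingGet = null;
      respond(this.drain());
    }
  }

  // The transport can deliver immediately only while a GET is held open.
  isWritable(): boolean {
    return this.pendingGet !== null;
  }

  private drain(): string {
    const body = this.buffered.join("\x1e");
    this.buffered = [];
    return body;
  }
}
```

Packets sent while no GET is pending accumulate and go out together in the next response, which is why a polling transport is only intermittently writable.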

WebSocket (packages/engine.io/lib/transports/websocket.ts) wraps a ws WebSocket instance. It's immediately writable (no request/response cycle to manage), and sets handlesUpgrades: true to signal that it can be the target of a transport upgrade.

WebTransport (packages/engine.io/lib/transports/webtransport.ts) uses HTTP/3 bidirectional streams with the engine.io-parser's createPacketEncoderStream() for binary-framed encoding. It reads incoming data via an async iterator over the stream reader.

The Probe-Based Upgrade Mechanism

The upgrade from polling to WebSocket is one of Engine.IO's most elegant features. It happens transparently to the application, without dropping a single packet. The protocol is implemented in _maybeUpgrade():

sequenceDiagram
    participant Client
    participant Polling as Polling Transport
    participant WS as WebSocket Transport
    participant Socket as Engine.IO Socket
    
    Note over Client: Opens WebSocket alongside active polling
    Client->>WS: WebSocket upgrade request
    Socket->>Socket: _maybeUpgrade(wsTransport)
    
    Client->>WS: ping "probe"
    WS->>Client: pong "probe"
    
    Note over Socket: Start sending noop to polling<br/>to flush buffered response quickly
    Socket->>Polling: noop packet (every 100ms)
    
    Client->>WS: upgrade packet
    Note over Socket: Swap transport!
    Socket->>Socket: this.transport = wsTransport
    Socket->>Polling: discard()
    
    Note over Client,Socket: All traffic now flows over WebSocket

The five steps are:

  1. Client opens a WebSocket alongside the existing polling connection, including the session ID
  2. Client sends ping "probe" over WebSocket to test the connection
  3. Server responds pong "probe" confirming the WebSocket link works
  4. Client sends upgrade packet — the server now swaps the transport reference
  5. Old polling transport is discarded

The check() function at lines 341-346 is a subtle but important detail. Every 100 ms, it sends a noop packet to the polling transport. Why? Because the pending polling GET may be held open, waiting for data to send back. The noop forces that response to flush immediately, freeing up the client to complete the upgrade without waiting for the long-poll cycle to end on its own.

The entire upgrade must complete within upgradeTimeout (default 10 seconds) or it's aborted with cleanup().
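The upgrade sequence can be sketched as a small state machine. This is a simplified model rather than the body of _maybeUpgrade(): the UpgradeSketch class is hypothetical, timers are replaced by explicit method calls, the noop is only sent while the polling transport is writable, and the sketch insists on seeing the probe before it accepts the upgrade packet. Those last two rules are assumptions of this example.

```typescript
// Simplified model of the probe-based upgrade. Not the engine.io source.
interface SketchPacket {
  type: "ping" | "pong" | "upgrade" | "noop" | "message";
  data?: string;
}

class UpgradeSketch {
  transportName: "polling" | "websocket" = "polling";
  upgrading = false;
  probed = false;
  sentOnWebSocket: SketchPacket[] = [];
  sentOnPolling: SketchPacket[] = [];

  // Step 1: the client has opened a WebSocket carrying the session ID.
  beginUpgrade(): void {
    this.upgrading = true;
  }

  // Stands in for the 100 ms interval: nudge the held polling GET with a noop.
  check(pollingWritable: boolean): void {
    if (this.upgrading && this.probed && pollingWritable) {
      this.sentOnPolling.push({ type: "noop" });
    }
  }

  onWebSocketPacket(packet: SketchPacket): void {
    if (!this.upgrading) return;
    if (packet.type === "ping" && packet.data === "probe") {
      // Steps 2 and 3: answer the probe over the new transport.
      this.sentOnWebSocket.push({ type: "pong", data: "probe" });
      this.probed = true;
    } else if (packet.type === "upgrade" && this.probed) {
      // Steps 4 and 5: swap the transport reference, drop polling.
      this.transportName = "websocket";
      this.cleanup();
    } else {
      // Anything else aborts the attempt.
      this.cleanup();
    }
  }

  // Stands in for the upgradeTimeout timer firing.
  onUpgradeTimeout(): void {
    this.cleanup();
  }

  private cleanup(): void {
    this.upgrading = false;
    this.probed = false;
  }
}
```

If the timeout fires first, the session simply stays on polling, so a failed upgrade costs time but not the connection.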

Heartbeat Protocol and Dead Connection Detection

Engine.IO uses a ping/pong heartbeat to detect dead connections. The implementation in packages/engine.io/lib/socket.ts reveals an interesting protocol version difference:

sequenceDiagram
    participant Server
    participant Client
    
    Note over Server,Client: Protocol v4 (Engine.IO v4+)
    loop Every pingInterval (25s)
        Server->>Client: ping
        Client->>Server: pong (must arrive within pingTimeout=20s)
    end
    
    Note over Server,Client: Protocol v3 (Legacy)
    loop Every pingInterval
        Client->>Server: ping
        Server->>Client: pong
    end

In protocol v3, the client initiates pings and the server responds with pongs. In v4, the direction is reversed: the server sends pings and the client responds with pongs. Server-initiated pings are more reliable because the server detects dead clients on its own schedule and no longer depends on client-side timers, which browsers may delay or throttle in background tabs.

The schedulePing() method sets up a timer at pingInterval. When it fires, it sends a ping and calls resetPingTimeout(), which starts a pingTimeout timer. If the client's pong doesn't arrive before that timer fires, the connection is closed with reason "ping timeout".
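The interplay of the two timers is easier to follow with a simulated clock. The sketch below is a model of the v4 behaviour described above, not the Socket class: HeartbeatSketch is a hypothetical name, and real timers are replaced by a tick(now) method so the timing can be stepped through deterministically.

```typescript
// Simulated-clock model of the v4 heartbeat. Not the engine.io Socket class.
class HeartbeatSketch {
  pingsSent = 0;
  closedReason: string | null = null;
  private pingInterval: number;
  private pingTimeout: number;
  private nextPingAt: number;
  private pongDeadline: number | null = null;

  constructor(pingInterval: number, pingTimeout: number, now: number) {
    this.pingInterval = pingInterval;
    this.pingTimeout = pingTimeout;
    this.nextPingAt = now + pingInterval; // plays the role of schedulePing()
  }

  // Advance the simulated clock to `now` (ms) and fire whichever timer is due.
  tick(now: number): void {
    if (this.closedReason !== null) return;
    if (this.pongDeadline !== null && now >= this.pongDeadline) {
      this.closedReason = "ping timeout";
      return;
    }
    if (this.pongDeadline === null && now >= this.nextPingAt) {
      this.pingsSent++;
      this.pongDeadline = now + this.pingTimeout; // plays resetPingTimeout()
    }
  }

  // A pong clears the deadline and schedules the next ping.
  onPong(now: number): void {
    if (this.closedReason !== null || this.pongDeadline === null) return;
    this.pongDeadline = null;
    this.nextPingAt = now + this.pingInterval;
  }
}
```

With the defaults, a client that goes silent right after answering a ping is detected at most pingInterval + pingTimeout = 45 seconds later in this model.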

The onPacket() handler at lines 153-193 enforces the direction: a ping received on a v4 connection is treated as an error, and so is a pong received on a v3 connection. This prevents protocol confusion in mixed-version deployments.
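The direction rule reduces to a two-line check. This is a sketch of the rule as described above; the function name and the error string are illustrative and may not match the source.

```typescript
// Sketch of the heartbeat direction rule. Names and message are illustrative.
type HeartbeatPacketType = "ping" | "pong";

// Returns null when the packet is allowed for this protocol version,
// otherwise an error message.
function heartbeatDirectionError(
  protocol: 3 | 4,
  packetType: HeartbeatPacketType,
): string | null {
  // Only v3 clients are allowed to send pings to the server.
  if (packetType === "ping" && protocol !== 3) {
    return "invalid heartbeat direction";
  }
  // Only v4 clients are allowed to send pongs to the server.
  if (packetType === "pong" && protocol === 3) {
    return "invalid heartbeat direction";
  }
  return null;
}
```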

Socket.IO Binding to Engine.IO

The bridge between the two layers is established when Socket.IO's Server.bind() is called:

flowchart TD
    A["Engine.IO emits 'connection'"] --> B["Server.onconnection(conn)"]
    B --> C["new Client(this, conn)"]
    C --> D["Client.setup()"]
    D --> E["Wire data/error/close handlers"]
    D --> F["Start 45s connect timeout"]
    
    G["Engine.IO emits 'data'"] --> H["Client.ondata()"]
    H --> I["decoder.add(data)"]
    I --> J["decoder emits 'decoded'"]
    J --> K["Client.ondecoded(packet)"]
    K --> L{Packet type?}
    L -- "CONNECT" --> M["Client.connect(namespace)"]
    L -- "EVENT/ACK/etc" --> N["socket._onpacket(packet)"]

The Client.setup() method wires up four handlers: ondata feeds raw data into the Socket.IO parser, ondecoded routes decoded packets to the correct namespace, onerror propagates errors, and onclose cleans up. It also starts a 45-second connect timeout — if the client doesn't join at least one namespace within that window, the connection is forcibly closed.
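The routing and the connect timeout can be sketched together. This is a toy model, not Socket.IO's Client class: ClientSketch is a hypothetical name, each joined namespace is represented by a plain inbox array, the timer is replaced by an explicit onConnectTimeout() call, and a packet for a namespace that was never joined is simply ignored, which is an assumption of this sketch.

```typescript
// Toy model of packet routing and the connect timeout. Not the Client class.
interface DecodedPacket {
  type: string; // e.g. "CONNECT", "EVENT", "ACK"
  nsp: string;  // namespace, e.g. "/"
  data?: unknown;
}

class ClientSketch {
  closed = false;
  // One inbox per joined namespace, standing in for a namespace socket.
  sockets: { [nsp: string]: DecodedPacket[] } = {};

  ondecoded(packet: DecodedPacket): void {
    if (packet.type === "CONNECT") {
      this.connect(packet.nsp);
      return;
    }
    const inbox = this.sockets[packet.nsp];
    if (inbox !== undefined) {
      inbox.push(packet); // stands in for socket._onpacket(packet)
    }
    // Assumption: packets for an unjoined namespace are ignored here.
  }

  connect(nsp: string): void {
    if (this.sockets[nsp] === undefined) {
      this.sockets[nsp] = [];
    }
  }

  // Stands in for the 45-second timer firing.
  onConnectTimeout(): void {
    if (Object.keys(this.sockets).length === 0) {
      this.closed = true;
    }
  }
}
```

A client that joins any namespace before the timer fires is left alone, so the timeout only reaps connections that completed the Engine.IO handshake but never spoke Socket.IO.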

For protocol v3 backward compatibility, onconnection() at lines 744-747 automatically connects the client to the root namespace /, since v3 clients don't send an explicit CONNECT packet.

What's Next

We've now traced the full path from HTTP request to established connection, covering transport negotiation, the upgrade dance, and heartbeat management. In Article 3, we'll follow a packet through the dual-layer serialization pipeline: how Socket.IO encodes events, namespaces, and acknowledgements, how Engine.IO wraps those into transport frames, and how binary data is extracted and reconstructed on the other side.