Read OSS

The In-Memory Store: How Emulate Manages State Without a Database

Intermediate

Prerequisites

  • TypeScript generics
  • Article 1: Architecture Overview


In the previous article, we saw how every ServicePlugin receives a Store through the composition root. Now we need to understand what that Store actually is — because it's doing the work of a database without any of the complexity. No query language, no migrations, no connection strings. Just TypeScript generics and Maps all the way down.

The design goal is clear: provide enough functionality to emulate real API state management (CRUD, pagination, indexes, persistence) while remaining fast enough that tests don't notice the overhead. The entire store implementation fits in 287 lines.

The Entity Contract and Collection Design

Every record in the store implements the Entity interface:

packages/@emulators/core/src/store.ts#L1-L5

export interface Entity {
  id: number;
  created_at: string;
  updated_at: string;
}

Three fields. That's the contract. Every entity gets an auto-incrementing numeric id and ISO timestamp strings for creation and modification times. The InsertInput<T> type (line 7) uses Omit to strip these three fields from the insert payload (id stays available as an optional override), so the collection fills them in automatically.
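Based on that description, the shape of InsertInput<T> can be reconstructed roughly as follows (a sketch, not the exact source):

```typescript
// Hypothetical reconstruction of InsertInput<T>: strip the three
// auto-managed fields from T, then re-admit an optional explicit id.
export interface Entity {
  id: number;
  created_at: string;
  updated_at: string;
}

export type InsertInput<T extends Entity> =
  Omit<T, "id" | "created_at" | "updated_at"> & { id?: number };

// Example: a domain entity only supplies its own fields on insert.
interface User extends Entity {
  login: string;
}

const input: InsertInput<User> = { login: "octocat" }; // id and timestamps omitted
```

The intersection with `{ id?: number }` is what makes the explicit-id seeding path described below type-check.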

The Collection<T extends Entity> class is the workhorse:

classDiagram
    class Entity {
        +id: number
        +created_at: string
        +updated_at: string
    }
    class Collection~T extends Entity~ {
        -items: Map~number, T~
        -indexes: Map~string, Map~string, Set~number~~~
        -autoId: number
        +insert(data): T
        +get(id): T | undefined
        +findBy(field, value): T[]
        +findOneBy(field, value): T | undefined
        +update(id, data): T | undefined
        +delete(id): boolean
        +all(): T[]
        +query(options): PaginatedResult~T~
        +count(filter?): number
        +clear(): void
        +snapshot(): CollectionSnapshot~T~
        +restore(snap): void
    }
    class Store {
        -collections: Map~string, Collection~
        -_data: Map~string, unknown~
        +collection~T~(name, indexFields): Collection~T~
        +getData~V~(key): V | undefined
        +setData~V~(key, value): void
        +reset(): void
        +snapshot(): StoreSnapshot
        +restore(snap): void
    }
    Store "1" *-- "*" Collection
    Collection --> Entity

The insert() method on line 99 is the most interesting CRUD operation:

packages/@emulators/core/src/store.ts#L99-L115

insert(data: InsertInput<T>): T {
  const now = new Date().toISOString();
  const explicitId = data.id != null && data.id > 0 ? data.id : undefined;
  const id = explicitId ?? this.autoId++;
  if (id >= this.autoId) {
    this.autoId = id + 1;
  }
  const item = { ...data, id, created_at: now, updated_at: now } as unknown as T;
  this.items.set(id, item);
  this.addToIndex(item);
  return item;
}

Notice the explicit ID support: if you pass id: 42, it uses 42 and advances autoId past it to prevent collisions. This is essential for seeding — GitHub App installation IDs, for example, need to match what the config specifies.
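The id-assignment rule is small enough to demonstrate in isolation. A standalone sketch (not the actual Collection code):

```typescript
// Standalone sketch of the id-assignment rule: use an explicit positive id
// when given, otherwise auto-increment, and always keep the counter ahead.
class IdAllocator {
  private autoId = 1;

  next(explicit?: number): number {
    const id = explicit != null && explicit > 0 ? explicit : this.autoId++;
    if (id >= this.autoId) {
      this.autoId = id + 1; // advance past explicit ids to prevent collisions
    }
    return id;
  }
}

const ids = new IdAllocator();
ids.next();   // 1 (auto)
ids.next(42); // 42 (explicit, e.g. a seeded installation id)
ids.next();   // 43, not 2: the counter advanced past the explicit id
```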

Secondary Indexes: O(1) Lookups Without a Query Engine

Without indexes, finding a user by login would require scanning every entity in the collection. With 28 collections in the GitHub emulator alone and routes performing several lookups per request, linear scans get expensive quickly. The solution is secondary indexes implemented as nested Maps.

packages/@emulators/core/src/store.ts#L63-L97

The index structure is Map<fieldName, Map<fieldValue, Set<entityId>>>. When you call findBy("login", "octocat"):

  1. Look up the "login" index map
  2. Look up the "octocat" key in that map
  3. Get back a Set<number> of entity IDs
  4. Map those IDs to actual entities

flowchart LR
    A["findBy('login', 'octocat')"] --> B{"Index exists for 'login'?"}
    B -->|Yes| C["indexes.get('login')"]
    C --> D["indexMap.get('octocat')"]
    D --> E["Set{1}"]
    E --> F["items.get(1)"]
    F --> G["GitHubUser entity"]
    B -->|No| H["Fallback: scan all items"]

Index maintenance happens automatically. On insert(), addToIndex() iterates through indexed fields and adds the entity's ID to the appropriate sets. On update(), it first calls removeFromIndex() for the old values, then addToIndex() for the new ones. On delete(), it cleans up. This is the classic secondary index pattern, just implemented in 40 lines of TypeScript.
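The maintenance side can be sketched with the nested-Map structure described above (names follow the article; the exact source shapes are an assumption):

```typescript
// Index shape: field name → stringified field value → set of entity ids.
type Indexes = Map<string, Map<string, Set<number>>>;
type Indexed = { id: number } & Record<string, unknown>;

function addToIndex(indexes: Indexes, fields: string[], entity: Indexed): void {
  for (const field of fields) {
    const value = String(entity[field]);
    let byValue = indexes.get(field);
    if (!byValue) {
      byValue = new Map();
      indexes.set(field, byValue);
    }
    let idSet = byValue.get(value);
    if (!idSet) {
      idSet = new Set();
      byValue.set(value, idSet);
    }
    idSet.add(entity.id);
  }
}

function removeFromIndex(indexes: Indexes, fields: string[], entity: Indexed): void {
  for (const field of fields) {
    indexes.get(field)?.get(String(entity[field]))?.delete(entity.id);
  }
}
```

An update() then composes the two: removeFromIndex with the old values, mutate, addToIndex with the new ones.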

The findBy() method has a smart fallback: if no index exists for the requested field, it falls back to a linear scan with this.all().filter(). This means you can query any field — indexed fields are just faster.
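That lookup-with-fallback logic can be sketched as follows, assuming the structures described above:

```typescript
type IndexTable = Map<string, Map<string, Set<number>>>;

function findBy<T extends { id: number }>(
  items: Map<number, T>,
  indexes: IndexTable,
  field: string,
  value: string,
): T[] {
  const idSet = indexes.get(field)?.get(value);
  if (idSet) {
    // Indexed path: O(1) set lookup, then resolve ids to entities.
    const found: T[] = [];
    for (const id of idSet) {
      const item = items.get(id);
      if (item) found.push(item);
    }
    return found;
  }
  // Unindexed path: linear scan, so any field is still queryable.
  return [...items.values()].filter(
    (item) => String((item as unknown as Record<string, unknown>)[field]) === value,
  );
}
```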

Tip: When creating a new emulator, choose index fields based on how routes look up data. If a route uses :owner/:repo, you probably want an index on full_name or owner_id.

Query, Pagination, and Sorting

The query() method supports the full pipeline: filter → count → sort → paginate:

packages/@emulators/core/src/store.ts#L162-L188

query(options: QueryOptions<T> = {}): PaginatedResult<T> {
  let results = this.all();
  if (options.filter) { results = results.filter(options.filter); }
  const total_count = results.length;
  if (options.sort) { results.sort(options.sort); }
  const page = options.page ?? 1;
  const per_page = Math.min(options.per_page ?? 30, 100);
  const start = (page - 1) * per_page;
  const paged = results.slice(start, start + per_page);
  return { items: paged, total_count, page, per_page, has_next: ..., has_prev: ... };
}

The per_page cap at 100 matches GitHub's API limit — this is production-fidelity behavior, not just a convenience. If your code sends per_page=1000, the emulator will silently cap it just like GitHub would.

sequenceDiagram
    participant Route as Route Handler
    participant Collection as Collection<T>
    participant Result as PaginatedResult<T>

    Route->>Collection: query({ filter, sort, page: 2, per_page: 30 })
    Collection->>Collection: all() → full list
    Collection->>Collection: filter(fn) → filtered list
    Collection->>Collection: count filtered → total_count
    Collection->>Collection: sort(fn) → sorted list
    Collection->>Collection: slice(30, 60) → page 2
    Collection->>Result: { items, total_count, page, per_page, has_next, has_prev }
    Result-->>Route: Paginated response

The PaginatedResult<T> type includes has_next and has_prev booleans. These map directly to the Link header values in GitHub-style paginated responses, as we'll see in Article 3.
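Since the boolean expressions are elided in the excerpt above, here is a runnable sketch of the full pipeline with has_next and has_prev computed from the offsets (an assumption about the elided expressions, consistent with the result shape):

```typescript
interface PaginatedResult<T> {
  items: T[];
  total_count: number;
  page: number;
  per_page: number;
  has_next: boolean;
  has_prev: boolean;
}

interface QueryOptions<T> {
  filter?: (item: T) => boolean;
  sort?: (a: T, b: T) => number;
  page?: number;
  per_page?: number;
}

function query<T>(all: T[], options: QueryOptions<T> = {}): PaginatedResult<T> {
  let results = options.filter ? all.filter(options.filter) : [...all];
  const total_count = results.length; // counted before pagination, after filtering
  if (options.sort) results.sort(options.sort);
  const page = options.page ?? 1;
  const per_page = Math.min(options.per_page ?? 30, 100); // GitHub-style cap
  const start = (page - 1) * per_page;
  return {
    items: results.slice(start, start + per_page),
    total_count,
    page,
    per_page,
    has_next: start + per_page < total_count,
    has_prev: page > 1,
  };
}

// Asking for per_page=1000 gets silently capped to 100:
const rows = Array.from({ length: 250 }, (_, i) => i + 1);
const result = query(rows, { per_page: 1000 });
// result.per_page === 100, result.items.length === 100, result.has_next === true
```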

The Typed Store Facade Pattern

Each emulator creates a typed wrapper around the raw Store. The GitHub emulator's facade maps 28 named collections with specific indexes:

packages/@emulators/github/src/store.ts#L78-L119

export function getGitHubStore(store: Store): GitHubStore {
  return {
    users: store.collection<GitHubUser>("github.users", ["login"]),
    orgs: store.collection<GitHubOrg>("github.orgs", ["login"]),
    repos: store.collection<GitHubRepo>("github.repos", ["owner_id", "full_name"]),
    issues: store.collection<GitHubIssue>("github.issues", ["repo_id", "number"]),
    // ... 24 more collections
  };
}

Two design decisions stand out:

Namespace convention: Collection names are prefixed with the service name (github.users, github.repos). Since all services share the same Store instance in the CLI's multi-service mode, this prevents collisions. A Vercel user and a GitHub user live in vercel.users and github.users respectively.

Index selection: The repos collection is indexed on ["owner_id", "full_name"] because routes look up repos both ways — by owner ID (listing a user's repos) and by full name (the /repos/:owner/:repo pattern). Issues are indexed on ["repo_id", "number"] because every issue route starts with a repo lookup followed by an issue number.

The Store.collection() method has a safety check: if you request a collection that already exists with different index fields, it throws an error. This catches bugs where two code paths disagree about how a collection should be indexed.
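The guard itself is straightforward. A sketch of what it might look like (assumed; the source for this check isn't excerpted here):

```typescript
// Registry sketch: re-requesting a collection must use identical index fields.
const registry = new Map<string, { name: string; indexFields: string[] }>();

function collection(name: string, indexFields: string[] = []) {
  const existing = registry.get(name);
  if (existing) {
    const same =
      existing.indexFields.length === indexFields.length &&
      existing.indexFields.every((field, i) => field === indexFields[i]);
    if (!same) {
      throw new Error(
        `Collection "${name}" already exists with index fields ` +
          `[${existing.indexFields}] but was requested with [${indexFields}]`,
      );
    }
    return existing; // same definition: return the shared instance
  }
  const created = { name, indexFields };
  registry.set(name, created);
  return created;
}
```

Two call sites asking for the same name get the same instance; a mismatched request fails loudly at startup rather than silently degrading lookups.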

packages/@emulators/core/src/store.ts#L225-L241

classDiagram
    class GitHubStore {
        +users: Collection~GitHubUser~
        +orgs: Collection~GitHubOrg~
        +repos: Collection~GitHubRepo~
        +issues: Collection~GitHubIssue~
        +pullRequests: Collection~GitHubPullRequest~
        +comments: Collection~GitHubComment~
        +... 22 more collections
    }
    class Store {
        +collection(name, indexFields): Collection
    }
    Store <.. GitHubStore : getGitHubStore() wraps

Store._data: The Non-Entity Escape Hatch

Not all state fits the entity model. OAuth pending authorization codes are ephemeral — they have a TTL, aren't paginated, and don't need IDs. Session maps link cookie values to user logins. These use the _data key-value escape hatch:

packages/@emulators/core/src/store.ts#L243-L249

getData<V>(key: string): V | undefined {
  return this._data.get(key) as V | undefined;
}
setData<V>(key: string, value: V): void {
  this._data.set(key, value);
}

The GitHub OAuth routes use this extensively:

packages/@emulators/github/src/routes/oauth.ts#L30-L55

function getPendingCodes(store: Store): Map<string, PendingCode> {
  let map = store.getData<Map<string, PendingCode>>("github.oauth.pendingCodes");
  if (!map) {
    map = new Map();
    store.setData("github.oauth.pendingCodes", map);
  }
  return map;
}

The lazy initialization pattern (get-or-create) appears everywhere _data is used. The namespacing convention (github.oauth.pendingCodes) matches the collection naming to prevent cross-service collisions.
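That recurring pattern could be factored into a generic helper (hypothetical; the routes inline it as shown above):

```typescript
// Minimal interface matching the getData/setData methods described above.
interface KeyValueData {
  getData<V>(key: string): V | undefined;
  setData<V>(key: string, value: V): void;
}

// Get-or-create: lazily initialize a _data slot on first access.
function getOrCreate<V>(store: KeyValueData, key: string, create: () => V): V {
  let value = store.getData<V>(key);
  if (value === undefined) {
    value = create();
    store.setData(key, value);
  }
  return value;
}

// Usage would mirror getPendingCodes, with the same namespaced key:
// const pending = getOrCreate(store, "github.oauth.pendingCodes",
//   () => new Map<string, PendingCode>());
```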

flowchart TD
    A[OAuth authorize] --> B["getPendingCodes(store)"]
    B --> C{"_data has 'github.oauth.pendingCodes'?"}
    C -->|No| D[Create new Map, store via setData]
    C -->|Yes| E[Return existing Map]
    D --> F["map.set(code, { login, scope, ... })"]
    E --> F
    F --> G[Token exchange later reads + deletes]

Snapshot, Restore, and Persistence

The store needs to survive server restarts — especially in the Next.js adapter where serverless functions cold-start frequently. The snapshot/restore system handles this.

Store.snapshot() serializes all collections and _data into a plain object:

packages/@emulators/core/src/store.ts#L258-L287

But there's a problem: Map and Set don't survive JSON.stringify; both serialize to a plain {} and lose their entries. The serializeValue and deserializeValue functions handle this with tagged unions:

packages/@emulators/core/src/store.ts#L39-L61

export function serializeValue(value: unknown): unknown {
  if (value instanceof Map) {
    return { __type: "Map", entries: [...value.entries()].map(([k, v]) => [k, serializeValue(v)]) };
  }
  if (value instanceof Set) {
    return { __type: "Set", values: [...value.values()] };
  }
  return value;
}

This recursive approach means nested Maps (a Map containing Maps) serialize correctly. The __type tag tells deserializeValue how to reconstruct the original structure.
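The deserializing counterpart can be reconstructed from that tagged format (a sketch, not the exact source; serializeValue is repeated here so the round trip is self-contained):

```typescript
// As shown in the article excerpt.
export function serializeValue(value: unknown): unknown {
  if (value instanceof Map) {
    return { __type: "Map", entries: [...value.entries()].map(([k, v]) => [k, serializeValue(v)]) };
  }
  if (value instanceof Set) {
    return { __type: "Set", values: [...value.values()] };
  }
  return value;
}

// Reconstructed inverse: dispatch on the __type tag, recursing into
// Map values the same way serializeValue does.
function deserializeValue(value: unknown): unknown {
  if (value && typeof value === "object") {
    const tagged = value as { __type?: string; entries?: [unknown, unknown][]; values?: unknown[] };
    if (tagged.__type === "Map" && tagged.entries) {
      return new Map(tagged.entries.map(([k, v]) => [k, deserializeValue(v)]));
    }
    if (tagged.__type === "Set" && tagged.values) {
      return new Set(tagged.values);
    }
  }
  return value;
}
```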

The PersistenceAdapter interface is deliberately minimal:

packages/@emulators/core/src/persistence.ts#L1-L23

export interface PersistenceAdapter {
  load(): Promise<string | null>;
  save(data: string): Promise<void>;
}

Two methods. load() returns JSON or null, save() writes JSON. The filePersistence factory creates an adapter that reads/writes to disk, creating directories as needed. But you could implement an adapter backed by Redis, S3, or any other storage.
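A file-backed adapter needs only a handful of lines. A sketch against the stated interface (the real filePersistence may differ in details like error handling):

```typescript
import { promises as fs } from "node:fs";
import { dirname } from "node:path";

export interface PersistenceAdapter {
  load(): Promise<string | null>;
  save(data: string): Promise<void>;
}

export function filePersistence(path: string): PersistenceAdapter {
  return {
    async load() {
      try {
        return await fs.readFile(path, "utf8");
      } catch {
        return null; // no snapshot yet: first run or cleared state
      }
    },
    async save(data) {
      await fs.mkdir(dirname(path), { recursive: true }); // create dirs as needed
      await fs.writeFile(path, data, "utf8");
    },
  };
}
```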

The Next.js adapter adds a serialized save queue to prevent race conditions when multiple requests mutate state simultaneously:

packages/@emulators/adapter-next/src/index.ts#L175-L187

function enqueueSave(): void {
  if (!persistence || !apps) return;
  pendingSave = pendingSave.then(async () => {
    if (!apps) return;
    const snapshot = takeSnapshot(apps);
    const json = JSON.stringify(snapshot);
    await persistence.save(json);
  });
}

By chaining onto pendingSave, each save operation waits for the previous one to complete. This turns concurrent saves into a serial queue without locks.
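The pattern generalizes beyond snapshots. A small self-contained demonstration of the chaining (a generic sketch, not the adapter code):

```typescript
// Each enqueued task chains onto the tail of the previous one, so
// concurrent callers are serialized without any explicit lock.
let pendingSave: Promise<void> = Promise.resolve();
const completed: number[] = [];

function enqueue(id: number, work: () => Promise<void>): void {
  pendingSave = pendingSave.then(async () => {
    await work();
    completed.push(id);
  });
}

const delay = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Task 2 finishes its work faster, but still completes after task 1:
enqueue(1, () => delay(20));
enqueue(2, () => delay(0));
```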

Tip: Persistence is opt-in. If you don't pass a PersistenceAdapter, the store exists purely in memory and resets on restart — which is usually what you want for CI.

What's Next

The store provides the data layer, but who guards the door? In the next article, we'll examine the middleware stack in detail — how the auth middleware resolves identity through three different paths, how the rate limiter enforces production-like limits, and how error handling produces responses that match the real API error formats.