Inside the Page: Selectors, Injected Scripts, and DOM Interaction

Advanced

Prerequisites

  • Articles 1-3: Architecture, Protocol, and Browser Abstraction
  • DOM and CSS selector knowledge
  • Understanding of browser execution contexts

We've traced the path from API call to protocol message to server-side browser abstraction. But when the server needs to find an element, check whether it's visible, or click on it, it can't do that from Node.js: it needs to run code inside the browser page. This article explores how Playwright's injected script architecture works, how the selector engine system evaluates queries, how Locators lazily compose selectors on the client side, and how auto-waiting gives Playwright tests their stability.

Injected Script Architecture

Playwright maintains a separate package, packages/injected/, that contains code designed to run inside browser page contexts. The main entry point is InjectedScript, defined in packages/injected/src/injectedScript.ts#L1-L60.

This code is compiled separately during the build process and bundled as a string in the server (via packages/playwright-core/src/generated/injectedScriptSource). When the server needs to interact with a page, it evaluates this script in the page's JavaScript context, creating an InjectedScript instance that serves as Playwright's in-page agent.
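
To make the "bundled as a string, evaluated in the page" idea concrete, here is a minimal sketch. Everything in it is hypothetical: miniInjectedSource stands in for the generated bundle, and new Function stands in for the browser-side evaluation that the real server performs over the protocol.

```typescript
// Hypothetical stand-in for the generated bundle: JavaScript source kept as a
// string that evaluates to a factory for the in-page agent.
const miniInjectedSource = `(function () {
  class MiniInjectedScript {
    constructor(nodes) { this.nodes = nodes; }
    // Toy "selector evaluation": count nodes with a given tag name.
    countByTag(tag) { return this.nodes.filter(n => n.tag === tag).length; }
  }
  return nodes => new MiniInjectedScript(nodes);
})()`;

interface BundledNode { tag: string; }
interface MiniAgent { countByTag(tag: string): number; }

// Stand-in for "evaluate this source in the page": new Function compiles the
// string at runtime, just as a page context receives the script as plain text.
function evaluateInPage(source: string, nodes: BundledNode[]): MiniAgent {
  const factory = new Function(`return (${source});`)() as (nodes: BundledNode[]) => MiniAgent;
  return factory(nodes);
}

const agent = evaluateInPage(miniInjectedSource, [
  { tag: 'button' }, { tag: 'div' }, { tag: 'button' },
]);
const buttonCount = agent.countByTag('button');
```

The point of the sketch is the shape of the hand-off: the server holds only text, and the object that does the work exists solely on the page side.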

The InjectedScript class handles:

  • Selector evaluation (finding elements matching a selector)
  • Element state queries (visible, enabled, stable, etc.)
  • ARIA snapshot generation (for accessibility testing)
  • Hit target interception (detecting what element would be clicked)
  • Selector generation (for the recorder/codegen)

flowchart TD
    subgraph "Node.js Process"
        S["Server (frames.ts)"]
        B["Bundled injectedScriptSource"]
    end
    
    subgraph "Browser Page"
        UC["Utility Context"]
        MC["Main Context"]
        IS["InjectedScript Instance"]
    end
    
    S -->|"evaluate(injectedScriptSource)"| UC
    B -->|"compiled at build time"| S
    UC --> IS
    IS -->|"querySelector"| DOM["DOM Tree"]
    IS -->|"state checks"| DOM
    IS -->|"ARIA tree"| DOM

The key insight is the utility world isolation. Playwright evaluates InjectedScript in a separate JavaScript context (the "utility" world) that shares the DOM but not the page's JavaScript namespace. This means Playwright's selector evaluation can't be interfered with by the page's own scripts — if the page overrides document.querySelector, Playwright's code is unaffected.
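
A toy model can illustrate the property, though not the mechanism: real isolated worlds are implemented by the browser engine. In this sketch, ToyWorld and ToyDocument are invented names; each world owns its copy of the "built-ins" while both point at the same document.

```typescript
// The one thing both worlds share: the document itself.
interface ToyDocument { ids: string[]; }

type QueryFn = (doc: ToyDocument, id: string) => string | null;

// Pristine built-in behavior, used to seed every new world.
const pristineQuery: QueryFn = (doc, id) => (doc.ids.indexOf(id) !== -1 ? id : null);

class ToyWorld {
  // Each world owns its copy of the "built-ins"; patching one copy leaves the others alone.
  builtins: { querySelector: QueryFn };
  doc: ToyDocument;

  constructor(doc: ToyDocument) {
    this.doc = doc;
    this.builtins = { querySelector: pristineQuery };
  }

  query(id: string): string | null {
    return this.builtins.querySelector(this.doc, id);
  }
}

const sharedDoc: ToyDocument = { ids: ['submit'] };
const mainWorld = new ToyWorld(sharedDoc);
const utilityWorld = new ToyWorld(sharedDoc);

// The page's own script sabotages querySelector, but only in the main world.
mainWorld.builtins.querySelector = () => null;

const seenByPage = mainWorld.query('submit');
const seenByPlaywright = utilityWorld.query('submit');

// DOM mutations stay visible to both worlds because the document is shared.
sharedDoc.ids.push('cancel');
const newNodeSeen = utilityWorld.query('cancel');
```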

Tip: The utility world isolation is why Playwright can reliably interact with pages that modify built-in prototypes or use frameworks that wrap the DOM. Your tests won't break because a page redefines Element.prototype.click.

Selector Engine System

Playwright supports a rich set of selector engines. The server-side Selectors class at packages/playwright-core/src/server/selectors.ts#L23-L56 manages both built-in and custom engines:

this._builtinEngines = new Set([
  'css', 'css:light',
  'xpath', 'xpath:light',
  '_react', '_vue',
  'text', 'text:light',
  'id', 'id:light',
  'role', 'internal:attr', 'internal:label', 'internal:text',
  'internal:role', 'internal:testid',
  'internal:has', 'internal:has-not',
  'internal:has-text', 'internal:has-not-text',
  'internal:and', 'internal:or', 'internal:chain',
  'nth', 'visible', 'internal:control',
  // ...
]);
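
The split between built-in and custom engines can be sketched as a small registry. This is an illustration under assumptions, not the Selectors class itself: ToySelectorRegistry, its validation rules, and its error messages are all hypothetical.

```typescript
class ToySelectorRegistry {
  private builtinEngines: Set<string>;
  private customEngines: Map<string, string>;

  constructor(builtins: string[]) {
    this.builtinEngines = new Set(builtins);
    this.customEngines = new Map();
  }

  // Registers a custom engine by name; `source` is the engine's script text.
  register(name: string, source: string): void {
    if (!/^[a-zA-Z_0-9-]+$/.test(name))
      throw new Error(`Invalid selector engine name "${name}"`);
    if (this.builtinEngines.has(name))
      throw new Error(`"${name}" is a built-in selector engine`);
    if (this.customEngines.has(name))
      throw new Error(`"${name}" is already registered`);
    this.customEngines.set(name, source);
  }

  // True for both built-in and custom engines.
  has(name: string): boolean {
    return this.builtinEngines.has(name) || this.customEngines.has(name);
  }
}

const registry = new ToySelectorRegistry(['css', 'xpath', 'text', 'role']);
registry.register('tag', '({ query(root, selector) { return root.querySelector(selector); } })');
```

Keeping the two collections separate makes it cheap to reject a custom engine that would shadow a built-in name.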

The engines fall into several categories:

| Category | Engines | Purpose |
|---|---|---|
| CSS | css, css:light | Standard CSS selectors |
| XPath | xpath, xpath:light | XPath selectors |
| Text | text, internal:text, internal:has-text | Text content matching |
| Role | role, internal:role | ARIA role-based selectors |
| Test ID | data-testid, internal:testid | Testing attribute selectors |
| Framework | _react, _vue | Framework component selectors |
| Combinators | internal:has, internal:and, internal:or | Selector composition |

The :light suffix restricts an engine to the light DOM. Without the suffix, matching also traverses shadow DOM, which is the default behavior.
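
The difference between the two traversal modes can be sketched on a toy tree. ShadowAwareNode and queryByTag are invented for this illustration; the pierceShadow flag plays the role of choosing between an engine and its :light variant.

```typescript
interface ShadowAwareNode {
  tag: string;
  children: ShadowAwareNode[];
  // Children that live inside this element's shadow root, if it has one.
  shadowChildren?: ShadowAwareNode[];
}

function queryByTag(root: ShadowAwareNode, tag: string, pierceShadow: boolean): ShadowAwareNode[] {
  const matches: ShadowAwareNode[] = [];
  const visit = (el: ShadowAwareNode): void => {
    if (el.tag === tag) matches.push(el);
    for (const child of el.children) visit(child);
    // Only the default (non-light) mode descends into shadow roots.
    if (pierceShadow && el.shadowChildren)
      for (const child of el.shadowChildren) visit(child);
  };
  visit(root);
  return matches;
}

const body: ShadowAwareNode = {
  tag: 'body',
  children: [
    { tag: 'button', children: [] },
    { tag: 'my-widget', children: [], shadowChildren: [{ tag: 'button', children: [] }] },
  ],
};

const withShadow = queryByTag(body, 'button', true);
const lightOnly = queryByTag(body, 'button', false);
```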

Selectors can be chained using the >> syntax. For example:

role=button >> text=Submit

This means "find an element with the button role, then find the text 'Submit' inside it". Each segment is evaluated by its own engine, and the >> operator composes them sequentially: the matches of one segment become the scope for the next.
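
Sequential composition can be sketched in a few lines. The parser below is deliberately naive (it splits on >> and on the first =, ignoring quoting), and the two engines operate on an invented ChainNode tree rather than a real DOM, so treat it as a model of the idea rather than Playwright's parser.

```typescript
interface ChainNode { role?: string; text?: string; children: ChainNode[]; }
interface SelectorPart { engine: string; body: string; }
type Engine = (scope: ChainNode, body: string) => ChainNode[];

// Naive parser: "engine=body >> engine=body". No quoting or escaping support.
function parseSelector(selector: string): SelectorPart[] {
  return selector.split('>>').map(raw => {
    const part = raw.trim();
    const eq = part.indexOf('=');
    if (eq === -1) return { engine: 'css', body: part };
    return { engine: part.slice(0, eq), body: part.slice(eq + 1) };
  });
}

function descendants(node: ChainNode): ChainNode[] {
  const out: ChainNode[] = [];
  for (const child of node.children) {
    out.push(child);
    out.push(...descendants(child));
  }
  return out;
}

// Toy engines: each searches only below the scope it is given.
const engines: Record<string, Engine> = {
  role: (scope, body) => descendants(scope).filter(n => n.role === body),
  text: (scope, body) =>
    descendants(scope).filter(n => n.text !== undefined && n.text.indexOf(body) !== -1),
};

function queryAll(root: ChainNode, selector: string): ChainNode[] {
  let scopes: ChainNode[] = [root];
  for (const part of parseSelector(selector)) {
    const engine = engines[part.engine];
    if (!engine) throw new Error(`Unknown engine "${part.engine}"`);
    const next: ChainNode[] = [];
    // The matches of this part become the scopes for the next part.
    for (const scope of scopes)
      for (const match of engine(scope, part.body))
        if (next.indexOf(match) === -1) next.push(match);
    scopes = next;
  }
  return scopes;
}

const submitLabel: ChainNode = { text: 'Submit', children: [] };
const strayLabel: ChainNode = { text: 'Submit your feedback', children: [] };
const pageRoot: ChainNode = {
  children: [
    { role: 'button', children: [submitLabel] },
    { role: 'heading', children: [strayLabel] },
  ],
};

const chained = queryAll(pageRoot, 'role=button >> text=Submit');
const unscoped = queryAll(pageRoot, 'text=Submit');
```

The unscoped query matches both labels, while the chained query keeps only the one inside the button, which is the narrowing effect of >>.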

flowchart LR
    A["'role=button >> text=Submit'"] --> B["Parse selector"]
    B --> C["Part 1: role=button"]
    B --> D["Part 2: text=Submit"]
    C --> E["Role engine evaluates<br/>→ finds buttons"]
    E --> F["Scope narrows"]
    F --> D
    D --> G["Text engine evaluates<br/>within button scope"]
    G --> H["Final element"]

Locator: Lazy Client-Side Composition

The Locator class at packages/playwright-core/src/client/locator.ts#L40-L73 is one of Playwright's most important abstractions. Unlike older APIs that return ElementHandle objects (server-side references), a Locator is a pure client-side object that stores a selector string and resolves it fresh every time an action is performed.

export class Locator implements api.Locator {
  _frame: Frame;
  _selector: string;

  constructor(frame: Frame, selector: string, options?: LocatorOptions) {
    this._frame = frame;
    this._selector = selector;

    if (options?.hasText)
      this._selector += ` >> internal:has-text=${escapeForTextSelector(options.hasText, false)}`;
    if (options?.hasNotText)
      this._selector += ` >> internal:has-not-text=${escapeForTextSelector(options.hasNotText, false)}`;
    if (options?.has)
      this._selector += ` >> internal:has=` + JSON.stringify(options.has._selector);
    if (options?.hasNot)
      this._selector += ` >> internal:has-not=` + JSON.stringify(options.hasNot._selector);
    if (options?.visible !== undefined)
      this._selector += ` >> visible=${options.visible ? 'true' : 'false'}`;
  }
  // ...
}

Every filtering option (has, hasNot, hasText, hasNotText, visible) simply appends to the selector string using the >> chaining syntax. No browser round-trip occurs until an action (click, fill, textContent) is called.
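
The laziness is easy to demonstrate with a stripped-down model. MiniLocator and FakeBackend are hypothetical: the backend stands in for the protocol connection and merely records calls, and JSON.stringify replaces the real escapeForTextSelector helper.

```typescript
// Hypothetical stand-in for the protocol connection; it records every round-trip.
class FakeBackend {
  calls: string[] = [];
  click(selector: string): void { this.calls.push(selector); }
}

class MiniLocator {
  selector: string;
  backend: FakeBackend;

  constructor(backend: FakeBackend, selector: string, options?: { hasText?: string }) {
    this.backend = backend;
    this.selector = selector;
    // Simplified escaping; the real class uses escapeForTextSelector.
    if (options?.hasText)
      this.selector += ` >> internal:has-text=${JSON.stringify(options.hasText)}`;
  }

  // Each method returns a new locator with a longer selector string. No I/O happens here.
  locator(child: string): MiniLocator {
    return new MiniLocator(this.backend, `${this.selector} >> ${child}`);
  }
  filter(options: { hasText?: string }): MiniLocator {
    return new MiniLocator(this.backend, this.selector, options);
  }
  nth(index: number): MiniLocator {
    return new MiniLocator(this.backend, `${this.selector} >> nth=${index}`);
  }

  // Only an action sends the accumulated selector to the backend.
  click(): void {
    this.backend.click(this.selector);
  }
}

const backend = new FakeBackend();
const rowButton = new MiniLocator(backend, 'css=tr')
  .filter({ hasText: 'Alice' })
  .locator('role=button')
  .nth(0);
const callsBeforeAction = backend.calls.length;
rowButton.click();
```

Three chained calls produce one string and zero round-trips; the single click() is the only point where the backend is contacted.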

This design has profound implications:

  1. No stale references: the element is found fresh each time
  2. Composability: locator.filter().locator().nth() builds a compound selector string
  3. Auto-waiting: when an action is called, Playwright waits for the selector to match

classDiagram
    class Locator {
        +_frame: Frame
        +_selector: string
        +click()
        +fill()
        +locator(): Locator
        +filter(): Locator
        +nth(): Locator
        +first(): Locator
        +last(): Locator
    }
    
    note for Locator "No server-side state!<br/>Just _frame + _selector string.<br/>Browser round-trip only on actions."

Tip: Prefer Locators over page.$() / ElementHandle. Locators don't hold server-side references, so they can't cause "object collected" errors and naturally handle dynamic content.

Frame Execution and Action Flow

Frame is the execution boundary for all DOM interactions on the server side. Defined in packages/playwright-core/src/server/frames.ts, it's the largest core file at roughly 1,800 lines.

Each Frame maintains two execution contexts:

  • Main world: The page's own JavaScript context. Used for page.evaluate() calls and _react/_vue selectors.
  • Utility world: An isolated context for Playwright's own code. Used for selector evaluation and state checks.
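
As a rough sketch of how a world might be chosen for a selector, the function below sends any selector that mentions _react or _vue to the main world and everything else to the utility world. The function name and the "any part forces the main world" rule are assumptions made for illustration; only the engine-to-world mapping comes from the list above.

```typescript
type WorldName = 'main' | 'utility';

// Engines that need access to the page's own JavaScript objects.
const MAIN_WORLD_ENGINES = new Set<string>(['_react', '_vue']);

// Naive engine-name extraction: text before the first "=" in each ">>" segment.
function engineNames(selector: string): string[] {
  return selector.split('>>').map(raw => {
    const part = raw.trim();
    const eq = part.indexOf('=');
    return eq === -1 ? 'css' : part.slice(0, eq);
  });
}

function worldForSelector(selector: string): WorldName {
  return engineNames(selector).some(name => MAIN_WORLD_ENGINES.has(name)) ? 'main' : 'utility';
}

const plainWorld = worldForSelector('role=button >> text=Submit');
const frameworkWorld = worldForSelector('_react=BookItem >> text=Buy');
```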

The FrameExecutionContext class in packages/playwright-core/src/server/dom.ts#L48-L57 bridges these worlds:

export class FrameExecutionContext extends js.ExecutionContext {
  readonly frame: frames.Frame;
  readonly world: types.World | null;
  // ...
}

Let's trace the complete flow of a locator.click() call:

sequenceDiagram
    participant User as locator.click()
    participant Client as Client Frame
    participant Proto as Protocol
    participant Server as Server Frame
    participant IS as InjectedScript
    participant PD as PageDelegate

    User->>Client: click()
    Client->>Proto: {method: "click", params: {selector}}
    Proto->>Server: Frame.click()
    
    loop Retry until timeout
        Server->>IS: querySelector(selector)
        IS-->>Server: elementHandle or null
        alt Element found
            Server->>IS: checkActionability()
            IS-->>Server: visible? stable? enabled?
            alt Actionable
                Server->>PD: rawMouse.click(x, y)
                PD-->>Server: done
                Server-->>Proto: success
            else Not actionable
                Note over Server: Wait and retry
            end
        else Not found
            Note over Server: Wait and retry
        end
    end

Auto-Waiting and Retry Logic

Auto-waiting is a large part of what makes Playwright tests stable. Rather than failing immediately when an element isn't ready, Playwright retries the entire operation in a loop until the timeout expires.

The retry logic lives in the server's Frame class. For action methods like click(), the server:

  1. Resolves the selector to find the element
  2. Checks actionability: is the element visible, enabled, stable (not animating), and not obscured by another element?
  3. Performs the action
  4. If any step fails with a recoverable error, waits briefly and retries from step 1
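
The four steps above can be sketched as a retry loop with a deadline. To keep the example deterministic it is synchronous and runs against a virtual clock, whereas the real loop is asynchronous; retryUntilDone, Clock, and the backoff values are all hypothetical.

```typescript
type AttemptResult = 'done' | 'error:notconnected' | { missingState: string };

interface Clock { now(): number; sleep(ms: number): void; }

// Hypothetical backoff schedule; the values are illustrative only.
function backoffMs(attemptNumber: number): number {
  if (attemptNumber <= 1) return 0;
  if (attemptNumber === 2) return 20;
  return 100;
}

// Runs `attempt` until it reports 'done' or the next wait would cross the deadline.
// Returns the number of attempts that were needed.
function retryUntilDone(attempt: () => AttemptResult, clock: Clock, timeoutMs: number): number {
  const deadline = clock.now() + timeoutMs;
  let attempts = 0;
  while (true) {
    attempts++;
    if (attempt() === 'done') return attempts;
    const delay = backoffMs(attempts);
    if (clock.now() + delay >= deadline)
      throw new Error(`Timeout ${timeoutMs}ms exceeded after ${attempts} attempts`);
    clock.sleep(delay);
  }
}

// Virtual clock so the sketch runs instantly and deterministically.
let virtualTime = 0;
const virtualClock: Clock = {
  now: () => virtualTime,
  sleep: ms => { virtualTime += ms; },
};

// The element becomes actionable on the third attempt.
let calls = 0;
const attemptsNeeded = retryUntilDone(
  () => (++calls < 3 ? { missingState: 'visible' } : 'done'),
  virtualClock,
  1000,
);
```

Note that every retry restarts from selector resolution in the real flow; here that whole sequence is folded into the single attempt callback.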

This interacts with the ProgressController from Article 3. The ProgressController manages the overall timeout, while the frame-level retry loop keeps attempting the operation within that window.

The InjectedScript performs the actionability checks inside the page. For a click operation, it checks:

| Check | What it means |
|---|---|
| visible | Element has non-zero bounding box and is not visibility: hidden |
| stable | Element's position hasn't changed between two animation frames |
| enabled | Not a disabled form element |
| receives events | No other element would intercept the click at the target coordinates |

flowchart TD
    A["Start: click(selector)"] --> B["Resolve selector"]
    B --> C{"Element found?"}
    C -->|No| D["Wait for DOM change"]
    D --> B
    C -->|Yes| E["Check visible"]
    E --> F{"Visible?"}
    F -->|No| D
    F -->|Yes| G["Check stable"]
    G --> H{"Stable?"}
    H -->|No| I["Wait for RAF"]
    I --> G
    H -->|Yes| J["Check enabled"]
    J --> K{"Enabled?"}
    K -->|No| D
    K -->|Yes| L["Scroll into view"]
    L --> M["Check hit target"]
    M --> N{"Receives events?"}
    N -->|No| D
    N -->|Yes| O["Perform click"]
    O --> P["Success ✓"]
    
    D --> Q{"Timeout?"}
    Q -->|Yes| R["Throw TimeoutError"]
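
The ordered checks from the table and flowchart can be expressed as a single function over a snapshot of the element. This is a simplified model: ElementSnapshot is an invented structure that pre-computes what the in-page code would measure live (two consecutive frame rectangles and a hit-test result), and scrolling into view is omitted.

```typescript
interface Rect { x: number; y: number; width: number; height: number; }

interface ElementSnapshot {
  id: string;
  visibility: 'visible' | 'hidden';
  disabled: boolean;
  rectPreviousFrame: Rect;
  rectCurrentFrame: Rect;
  // Id of the topmost element at the click point (hypothetical hit-test result).
  topmostAtClickPoint: string;
}

type CheckResult =
  | 'done'
  | { missingState: 'visible' | 'stable' | 'enabled' }
  | { hitTargetDescription: string };

function sameRect(a: Rect, b: Rect): boolean {
  return a.x === b.x && a.y === b.y && a.width === b.width && a.height === b.height;
}

// Checks run in the same order as the flowchart: visible, stable, enabled, hit target.
function checkClickable(el: ElementSnapshot): CheckResult {
  const rect = el.rectCurrentFrame;
  if (rect.width === 0 || rect.height === 0 || el.visibility === 'hidden')
    return { missingState: 'visible' };
  if (!sameRect(el.rectPreviousFrame, rect))
    return { missingState: 'stable' };
  if (el.disabled)
    return { missingState: 'enabled' };
  if (el.topmostAtClickPoint !== el.id)
    return { hitTargetDescription: el.topmostAtClickPoint };
  return 'done';
}

const box: Rect = { x: 10, y: 10, width: 80, height: 24 };
const readyButton: ElementSnapshot = {
  id: 'submit',
  visibility: 'visible',
  disabled: false,
  rectPreviousFrame: box,
  rectCurrentFrame: box,
  topmostAtClickPoint: 'submit',
};

const readyResult = checkClickable(readyButton);
const animatingResult = checkClickable({ ...readyButton, rectCurrentFrame: { ...box, x: 14 } });
const coveredResult = checkClickable({ ...readyButton, topmostAtClickPoint: 'cookie-banner' });
```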

The PerformActionResult type in packages/playwright-core/src/server/dom.ts#L39 reveals the full set of recoverable error states:

type PerformActionResult = 'error:notvisible' | 'error:notconnected' | 
  'error:notinviewport' | 'error:optionsnotfound' | 'error:optionnotenabled' | 
  { missingState: ElementState } | { hitTargetDescription: string } | 'done';

Each of these except 'done' triggers a retry. The NonRecoverableDOMError class represents errors that should not be retried — like trying to fill a non-input element.
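
One way to read the union is to write the function that consumes it. The sketch below narrows each variant and produces a log-style message; describeResult and its message wording are hypothetical, and ElementState is reduced to an assumed subset because its definition is not part of the excerpt.

```typescript
// Assumed subset for illustration; the full ElementState type is not shown in the excerpt.
type ElementState = 'visible' | 'enabled' | 'stable';

type PerformActionResult = 'error:notvisible' | 'error:notconnected' |
  'error:notinviewport' | 'error:optionsnotfound' | 'error:optionnotenabled' |
  { missingState: ElementState } | { hitTargetDescription: string } | 'done';

// Maps every variant to a message; anything other than 'done' implies another attempt.
function describeResult(result: PerformActionResult): string {
  if (result === 'done') return 'action completed';
  // Remaining string variants all carry the "error:" prefix.
  if (typeof result === 'string') return `retry: ${result.slice('error:'.length)}`;
  if ('missingState' in result) return `retry: element is not ${result.missingState}`;
  return `retry: ${result.hitTargetDescription} intercepts pointer events`;
}

const messages = [
  describeResult('done'),
  describeResult('error:notconnected'),
  describeResult({ missingState: 'stable' }),
  describeResult({ hitTargetDescription: '<div class="overlay">' }),
];
```

Because the string variants and the two object shapes are disjoint, TypeScript can narrow each branch without explicit tags.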

Tip: If your test is timing out on a click, Playwright's error message includes a "Call log" showing exactly which actionability check was failing. Look for messages like "waiting for element to be visible" or "element is not stable" to diagnose the issue.

What's Next

We've now covered Playwright from the user-facing API through the protocol, server-side abstraction, and all the way into the browser page. In the next article, we turn to Playwright's test runner — the multi-process architecture, the fixture system, the task pipeline, and how test.extend() builds composable test configurations.