Tools and the Scheduler: How Gemini CLI Executes Actions

Advanced

Prerequisites

  • Article 2: The Agentic Loop
  • Understanding of builder and strategy design patterns
  • Basic knowledge of MCP (Model Context Protocol)

When Gemini's model decides to read a file, run a shell command, or invoke an MCP tool, that decision enters a sophisticated pipeline: the tool call is validated, policy-checked, possibly confirmed by the user, executed, and the result is fed back to the model for the next turn. This article explores how Gemini CLI's tool system and scheduler orchestrate this entire flow.

The ToolInvocation Interface

Every tool call in Gemini CLI follows a uniform lifecycle defined by the ToolInvocation interface:

flowchart LR
    V[validate params] --> D[getDescription]
    D --> C[shouldConfirmExecute]
    C --> E[execute]
    E --> P[getPolicyUpdateOptions]

The interface provides a clean contract:

  • params — Validated parameters for this invocation
  • getDescription() — Human-readable description of what the tool will do
  • toolLocations() — File paths the tool will affect (used for UI display)
  • shouldConfirmExecute(abortSignal, forcedDecision?) — Returns confirmation details if user approval is needed, false if not
  • execute(signal, updateOutput?, options?) — Runs the tool and returns a ToolResult
  • getPolicyUpdateOptions(outcome) — Provides tool-specific options when the user approves "always allow"

The shouldConfirmExecute method is where policy decisions materialize into user-facing behavior. It receives a forcedDecision that may come from the MessageBus policy check, and returns either false (proceed without confirmation) or a ToolCallConfirmationDetails object with type-specific UI data.
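Under simplifying assumptions (field names trimmed, several parameters such as `updateOutput` and `forcedDecision` omitted), the contract can be sketched as a TypeScript interface, along with how a caller branches on `shouldConfirmExecute`'s return value:

```typescript
// Simplified sketch of the contract -- the real interfaces carry more fields.
type ToolResult = { llmContent: string; returnDisplay: string };

type ToolCallConfirmationDetails =
  | { type: 'exec'; command: string }
  | { type: 'edit'; filePath: string; fileDiff: string };

interface ToolInvocation<TParams> {
  params: TParams;
  getDescription(): string;
  toolLocations(): { path: string }[];
  // `false` means "proceed without asking"; details mean "render a prompt".
  shouldConfirmExecute(
    signal: AbortSignal,
  ): Promise<ToolCallConfirmationDetails | false>;
  execute(signal: AbortSignal): Promise<ToolResult>;
}

// A caller only branches on the shape of the confirmation result:
async function runInvocation<T>(
  inv: ToolInvocation<T>,
  signal: AbortSignal,
): Promise<ToolResult | null> {
  const confirmation = await inv.shouldConfirmExecute(signal);
  if (confirmation !== false) {
    // Hand off to the UI layer; confirmation.type selects the dialog.
    return null;
  }
  return inv.execute(signal);
}
```

The discriminated union on `type` is what lets the terminal UI render exec-style and edit-style confirmations differently from the same code path.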

BaseDeclarativeTool and the Builder Pattern

Tools are defined declaratively through BaseDeclarativeTool, which separates schema definition from execution. Each tool class specifies:

  • A name, display name, and description for the model's function declarations
  • A JSON parameter schema for validation
  • A Kind enum value (ReadOnly, Write, Execute, Other)
  • Whether the output is rendered as markdown and whether it supports live (streaming) updates

When the model requests a tool call, the BaseDeclarativeTool validates the parameters against the schema using SchemaValidator, then calls createInvocation() to produce a ToolInvocation instance. This is the builder pattern: the tool definition is the builder, and the invocation is the product.
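A minimal sketch of that relationship, with a hypothetical required-keys check standing in for the real SchemaValidator:

```typescript
// Sketch only: names simplified, validation reduced to a required-keys check.
interface Invocation<TParams> {
  params: TParams;
  describe(): string;
}

abstract class BaseDeclarativeTool<TParams> {
  constructor(
    readonly name: string,
    readonly parameterSchema: { required: string[] },
  ) {}

  // The "builder" step: validate raw params, then construct the product.
  build(raw: Record<string, unknown>): Invocation<TParams> {
    for (const key of this.parameterSchema.required) {
      if (!(key in raw)) {
        throw new Error(`${this.name}: missing required param '${key}'`);
      }
    }
    return this.createInvocation(raw as unknown as TParams);
  }

  protected abstract createInvocation(params: TParams): Invocation<TParams>;
}

class ReadFileTool extends BaseDeclarativeTool<{ path: string }> {
  constructor() {
    super('read_file', { required: ['path'] });
  }
  protected createInvocation(params: { path: string }) {
    return { params, describe: () => `Read ${params.path}` };
  }
}
```

The payoff of the split is that validation lives once in the base class, while each tool only supplies its schema and its invocation construction.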

classDiagram
    class BaseDeclarativeTool~TParams, TResult~ {
        +name: string
        +displayName: string
        +description: string
        +kind: Kind
        +parameterSchema: object
        #createInvocation(): ToolInvocation
        +validate(params): ToolInvocation
    }
    
    class ToolInvocation~TParams, TResult~ {
        <<interface>>
        +params: TParams
        +getDescription(): string
        +shouldConfirmExecute(): Promise
        +execute(): Promise~TResult~
        +toolLocations(): ToolLocation[]
    }
    
    class BaseToolInvocation~TParams, TResult~ {
        +messageBus: MessageBus
        +respectsAutoEdit: boolean
        +getApprovalMode: Function
        #getConfirmationDetails(): Promise
        #getMessageBusDecision(): Promise
    }
    
    BaseDeclarativeTool ..> ToolInvocation : creates
    BaseToolInvocation ..|> ToolInvocation

The BaseToolInvocation abstract class provides the default confirmation flow. Its shouldConfirmExecute method checks whether the tool respects AUTO_EDIT mode, then queries the MessageBus for a policy decision. If the decision is allow, no confirmation is needed; if deny, it throws; if ask_user, it delegates to getConfirmationDetails() for tool-specific UI.
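The branching can be sketched as a standalone function (the decision values, mode names, and signature here are illustrative, not the actual API):

```typescript
// Illustrative model of the default confirmation flow described above.
type PolicyDecision = 'allow' | 'deny' | 'ask_user';
type ApprovalMode = 'default' | 'auto_edit';

async function defaultShouldConfirm(
  decision: PolicyDecision,
  respectsAutoEdit: boolean,
  approvalMode: ApprovalMode,
  getConfirmationDetails: () => Promise<{ type: string }>,
): Promise<{ type: string } | false> {
  // Tools that opt in to AUTO_EDIT skip confirmation in that mode.
  if (respectsAutoEdit && approvalMode === 'auto_edit') return false;
  switch (decision) {
    case 'allow':
      return false; // proceed silently
    case 'deny':
      throw new Error('Tool call denied by policy');
    case 'ask_user':
      return getConfirmationDetails(); // tool-specific UI data
  }
}
```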

Tip: The respectsAutoEdit flag on BaseToolInvocation controls whether a tool can auto-approve in AUTO_EDIT mode. Only write-type tools (WriteFile, Edit) set this to true — shell commands always require explicit confirmation in default mode.

Tool Registry and Model-Aware Definitions

The ToolRegistry is the central store for all tool definitions — both built-in and dynamically discovered. It maps tool names to AnyDeclarativeTool instances and provides getFunctionDeclarations(modelId?) for the model's function-calling API.

A key design feature is model-family-aware tool definitions. The getToolSet function at the top of coreTools.ts routes to different schema sets based on the model:

export function getToolSet(modelId?: string): CoreToolSet {
  const family = getToolFamily(modelId);
  switch (family) {
    case 'gemini-3':
      return GEMINI_3_SET;
    case 'default-legacy':
    default:
      return DEFAULT_LEGACY_SET;
  }
}

Gemini 3 models may support different parameter schemas or descriptions than legacy models. Each tool definition (like READ_FILE_DEFINITION) has a base property returning the default-legacy schema and an overrides function that resolves to the appropriate set for the current model.
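One way to model that base/overrides resolution, assuming a simple merge of partial overrides onto the base declaration (the actual shape in coreTools.ts may differ):

```typescript
// Illustrative shape of a model-aware tool definition. The `base` and
// `overrides` property names follow the article; the rest is assumed.
type ToolFamily = 'gemini-3' | 'default-legacy';

interface FunctionDeclaration {
  name: string;
  description: string;
}

interface ModelAwareDefinition {
  base: FunctionDeclaration;
  overrides: (family: ToolFamily) => Partial<FunctionDeclaration>;
}

function resolve(
  def: ModelAwareDefinition,
  family: ToolFamily,
): FunctionDeclaration {
  // Start from the default-legacy schema, layer model-specific changes on top.
  return { ...def.base, ...def.overrides(family) };
}

const READ_FILE_DEFINITION: ModelAwareDefinition = {
  base: { name: 'read_file', description: 'Reads a file (legacy wording)' },
  overrides: (family) =>
    family === 'gemini-3'
      ? { description: 'Reads a file (gemini-3 wording)' }
      : {},
};
```

This keeps one canonical definition per tool while letting a model family swap in a sharper description or stricter schema without duplicating the rest.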

The full inventory of built-in tools spans file operations, code intelligence, web access, and agent control:

| Category      | Tools                                             |
| ------------- | ------------------------------------------------- |
| File I/O      | ReadFile, WriteFile, Edit, ReadManyFiles, Glob, LS |
| Search        | Grep (backed by ripgrep)                          |
| Execution     | Shell                                             |
| Web           | WebSearch, WebFetch                               |
| Memory        | Memory (save/recall facts)                        |
| Agent Control | ActivateSkill, AskUser, ExitPlanMode, UpdateTopic |
| Task Tracking | WriteTodos                                        |

The Scheduler: Event-Driven Tool Orchestration

The Scheduler is the bridge between tool call requests from the model and actual tool execution. It's described in the source as an "Event-Driven Orchestrator for Tool Execution."

flowchart TD
    A[Tool Call Requests from Turn] --> B[Scheduler.schedule]
    B --> C{Already processing?}
    C -- Yes --> D[Enqueue request]
    C -- No --> E[Start batch]
    E --> F[ToolModificationHandler<br/>validates & transforms]
    F --> G[Check Policy via PolicyEngine]
    G --> H{Policy Decision}
    H -- ALLOW --> I[Execute tool]
    H -- DENY --> J[Return error]
    H -- ASK_USER --> K[Evaluate BeforeTool hook]
    K --> L[Resolve confirmation via MessageBus]
    L --> M{User confirms?}
    M -- Yes --> N[Update policy if always-allow]
    N --> I
    M -- No --> O[Return cancelled]
    I --> P[ToolExecutor.execute]
    P --> Q[Track state via SchedulerStateManager]
    Q --> R[Return CompletedToolCall]

The scheduler coordinates five components:

  1. ToolModificationHandler — Validates and transforms tool call requests before execution
  2. PolicyEngine (via checkPolicy) — Evaluates rules to determine allow/deny/ask_user
  3. evaluateBeforeToolHook — Fires pre-execution hooks that can modify or block
  4. resolveConfirmation — Handles user confirmation flow through the MessageBus
  5. ToolExecutor — Actually runs the tool and captures results

The schedule() method at line 192 accepts one or more ToolCallRequestInfo objects. When the model emits multiple function calls in a single response (parallel tool use), they are batched together. If a batch is already in flight, new requests are queued via _enqueueRequest().
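The queueing behavior can be illustrated with a toy scheduler (MiniScheduler and its internals are invented for this sketch, not the real class):

```typescript
// Toy model: a batch runs to completion; requests arriving mid-batch are
// queued and drained afterwards, mirroring _enqueueRequest's role.
class MiniScheduler {
  private running = false;
  private queue: string[][] = [];
  readonly log: string[] = [];

  async schedule(requests: string[]): Promise<void> {
    if (this.running) {
      this.queue.push(requests); // batch in flight: enqueue for later
      return;
    }
    this.running = true;
    for (const r of requests) {
      await Promise.resolve(); // stand-in for real async tool execution
      this.log.push(r);
    }
    this.running = false;
    const next = this.queue.shift();
    if (next) await this.schedule(next); // drain queued batches in order
  }
}
```

Calling `schedule(['shell'])` while a `['read_file', 'grep']` batch is executing leaves the first batch intact and runs the shell request only after it drains.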

Each tool call progresses through states tracked by SchedulerStateManager: Scheduled → Validating → Executing → Completed (or Errored). The state manager also handles MCP progress updates — when an MCP tool reports incremental progress, the scheduler updates the executing state with progress percentage and message.
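A minimal state machine capturing those transitions (the transition table is an assumption from the lifecycle described above; the real SchedulerStateManager tracks additional detail):

```typescript
// Lifecycle states as described: Scheduled -> Validating -> Executing
// -> Completed, with Errored reachable from the active states.
type ToolCallState =
  | 'scheduled'
  | 'validating'
  | 'executing'
  | 'completed'
  | 'errored';

const TRANSITIONS: Record<ToolCallState, ToolCallState[]> = {
  scheduled: ['validating'],
  validating: ['executing', 'errored'],
  executing: ['completed', 'errored'],
  completed: [],
  errored: [],
};

class CallState {
  state: ToolCallState = 'scheduled';
  progress?: { percent: number; message: string };

  advance(next: ToolCallState): void {
    if (!TRANSITIONS[this.state].includes(next)) {
      throw new Error(`illegal transition ${this.state} -> ${next}`);
    }
    this.state = next;
  }

  // MCP progress notifications only make sense while executing.
  reportProgress(percent: number, message: string): void {
    if (this.state !== 'executing') throw new Error('not executing');
    this.progress = { percent, message };
  }
}
```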

Deep Dive: ShellTool and EditTool

Among the built-in tools, ShellTool is the most complex. It must:

  1. Parse the command for policy evaluation (using shell-quote parsing)
  2. Integrate with the sandbox manager for command wrapping
  3. Handle background execution with PID tracking
  4. Support platform-specific execution via ShellExecutionService

The Edit tool takes a different approach — rather than replacing file contents wholesale, it uses a diff-based modification model. The model provides old_string and new_string parameters, and the tool performs a surgical replacement. This is more token-efficient than sending entire file contents and produces cleaner diffs for user confirmation.
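A sketch of the replacement semantics; the exactly-one-match safeguard shown here is a common design for diff-based edit tools and an assumption about this one:

```typescript
// Apply a surgical old_string -> new_string replacement. Requiring the
// match to be unique (assumed here) prevents editing the wrong occurrence.
function applyEdit(
  content: string,
  oldString: string,
  newString: string,
): string {
  const first = content.indexOf(oldString);
  if (first === -1) throw new Error('old_string not found in file');
  const second = content.indexOf(oldString, first + 1);
  if (second !== -1) throw new Error('old_string is ambiguous (multiple matches)');
  return (
    content.slice(0, first) + newString + content.slice(first + oldString.length)
  );
}
```

Both failure modes surface as errors the model can recover from, typically by re-reading the file and retrying with a longer, unambiguous old_string.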

Both tools demonstrate the confirmation flow in action. ShellTool produces exec-type confirmation details with the parsed command and root command. EditTool produces edit-type details with the file diff, original content, and new content — rendered as a visual diff in the terminal UI.

MCP Tool Integration

MCP (Model Context Protocol) tools are integrated through the DiscoveredMCPTool class and the naming convention mcp_{serverName}_{toolName}. When MCP servers are configured, the McpClientManager discovers available tools and registers them in the ToolRegistry with the mcp_ prefix.

sequenceDiagram
    participant Config as Config.initialize()
    participant MCM as McpClientManager
    participant MCP as MCP Server
    participant TR as ToolRegistry
    
    Config->>MCM: connectToServers()
    MCM->>MCP: listTools()
    MCP-->>MCM: tool schemas
    MCM->>TR: registerTool(DiscoveredMCPTool)
    Note over TR: Tool registered as<br/>mcp_serverName_toolName

MCP tools flow through the same scheduler pipeline as built-in tools — they're validated, policy-checked, and confirmed before execution. The mcp_ prefix enables wildcard policy rules like mcp_serverName_* to allow or deny all tools from a specific server, as we'll explore in the next article on the policy engine.
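A toy matcher illustrating how such wildcard rules could resolve (the rule syntax and first-match precedence are assumptions, not the engine's exact logic):

```typescript
// Match a rule pattern like `mcp_serverName_*` against a prefixed tool name.
function ruleMatches(rule: string, toolName: string): boolean {
  if (rule.endsWith('*')) return toolName.startsWith(rule.slice(0, -1));
  return rule === toolName;
}

type Decision = 'allow' | 'deny' | 'ask_user';

function decide(
  rules: { pattern: string; decision: 'allow' | 'deny' }[],
  toolName: string,
): Decision {
  // First matching rule wins (an assumed precedence for this sketch).
  for (const r of rules) {
    if (ruleMatches(r.pattern, toolName)) return r.decision;
  }
  return 'ask_user'; // no rule: fall back to asking the user
}
```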

Tip: When debugging MCP tool issues, check the ToolRegistry's allKnownTools map. MCP tools are registered under their prefixed names and include the server name in their _serverName annotation for policy matching.

The scheduler, combined with the tool system's builder pattern and the MessageBus confirmation flow, creates a flexible yet secure execution pipeline. In the next article, we'll examine the security layers that govern what this pipeline is allowed to do — the policy engine, sandboxing, and safety checkers.