The Spec-Driven Workflow: How Templates Instruct AI Agents

Intermediate

Prerequisites

  • Article 1: Architecture and Project Navigation
  • Article 3: Integration System (for understanding how templates are processed)
  • Basic YAML frontmatter knowledge
  • Familiarity with at least one AI coding assistant

Spec Kit's real innovation isn't the Python CLI — it's the Markdown files the CLI produces. These 9 command templates are structured instructions for AI agents. They encode a complete development workflow as a directed acyclic graph through YAML frontmatter, embed shell script invocations for filesystem operations the AI can't do alone, and include hook checkpoints where extensions inject additional behavior. The AI agent is both the reader and executor of these documents. This article is about what's inside them.

The 9 Slash Commands and the Workflow DAG

The templates/commands/ directory contains 9 Markdown files, each representing a slash command:

| Command | Purpose | Handoffs to |
| --- | --- | --- |
| `specify` | Create feature specification from description | `plan`, `clarify` |
| `plan` | Generate technical implementation plan | `tasks`, `checklist` |
| `tasks` | Break plan into ordered task list | `analyze`, `implement` |
| `implement` | Execute tasks from task list | |
| `clarify` | Structured clarification of underspecified requirements | |
| `analyze` | Cross-artifact consistency analysis | |
| `constitution` | Create/update project governing principles | |
| `checklist` | Generate quality validation checklists | |
| `taskstoissues` | Convert tasks.md to GitHub issues | |

The handoffs field in YAML frontmatter creates a DAG:

```mermaid
flowchart TD
    specify["speckit.specify"] -->|"Build Technical Plan"| plan["speckit.plan"]
    specify -->|"Clarify Requirements"| clarify["speckit.clarify"]
    plan -->|"Create Tasks"| tasks["speckit.tasks"]
    plan -->|"Create Checklist"| checklist["speckit.checklist"]
    tasks -->|"Analyze Consistency"| analyze["speckit.analyze"]
    tasks -->|"Implement"| implement["speckit.implement"]

    constitution["speckit.constitution"]
    taskstoissues["speckit.taskstoissues"]
```

The handoffs aren't just documentation: they're metadata that some AI assistants render as actionable buttons or suggestions. For example, the frontmatter of templates/commands/specify.md includes:

```yaml
handoffs:
  - label: Build Technical Plan
    agent: speckit.plan
    prompt: Create a plan for the spec. I am building with...
  - label: Clarify Spec Requirements
    agent: speckit.clarify
    prompt: Clarify specification requirements
    send: true
```

The send: true field indicates the handoff should auto-trigger rather than waiting for user confirmation. This creates a natural flow where completing a specification leads directly into planning.

Anatomy of a Command Template

Every command template follows the same structure. Let's annotate the plan.md template at templates/commands/plan.md:

```mermaid
flowchart TD
    subgraph "YAML Frontmatter"
        A["description: one-line summary"]
        B["handoffs: next-step commands"]
        C["scripts:<br/>  sh: scripts/bash/setup-plan.sh --json<br/>  ps: scripts/powershell/setup-plan.ps1 -Json"]
        D["agent_scripts:<br/>  sh: scripts/bash/update-agent-context.sh __AGENT__"]
    end
    subgraph "Body"
        E["## User Input<br/>{ARGS} placeholder"]
        F["## Pre-Execution Checks<br/>Hook system checkpoint"]
        G["## Outline<br/>Step-by-step instructions for the AI"]
        H["## Guidelines<br/>Quality constraints"]
    end
    A --> E
    C --> E
```

The three placeholder types serve distinct purposes:

  • {SCRIPT} — replaced by the shell command from scripts.<type>, giving the AI a command to run for filesystem setup
  • {ARGS} — replaced by the agent-specific argument placeholder ($ARGUMENTS for most, {{args}} for Gemini)
  • __AGENT__ — replaced by the agent name, used in scripts that need to know which agent is active

As we saw in Part 3, the process_template() pipeline resolves all three during init, and strips the scripts: and agent_scripts: blocks from the frontmatter so the final output is clean.
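Under simplifying assumptions (the function and variable names here are illustrative, and the real pipeline also handles path rewriting and frontmatter stripping), the placeholder substitution can be sketched as:

```python
# Hypothetical sketch of the placeholder resolution described above; the
# real process_template() also strips the scripts:/agent_scripts: blocks.
def resolve_placeholders(template_text: str, script_cmd: str, agent: str) -> str:
    # {ARGS} differs per agent: Gemini uses {{args}}, most others $ARGUMENTS
    args_placeholder = "{{args}}" if agent == "gemini" else "$ARGUMENTS"
    return (
        template_text
        .replace("{SCRIPT}", script_cmd)
        .replace("{ARGS}", args_placeholder)
        .replace("__AGENT__", agent)
    )

resolved = resolve_placeholders(
    "Run {SCRIPT}, then pass {ARGS} to __AGENT__",
    ".specify/scripts/bash/setup-plan.sh --json",
    "claude",
)
```

After resolution, the template contains only concrete commands and agent-native placeholders, so the AI never sees the unresolved markers.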

Shell Scripts as the Operational Layer

AI agents can read files and write code, but they can't reliably perform structured filesystem operations like "scan existing spec directories and determine the next sequential number." That's where the shell scripts come in.

The scripts/bash/create-new-feature.sh script is the most important — it's invoked by before_specify hooks (via the git extension) to create feature branches with sequential numbering. It accepts a --json flag for machine-parseable output:

```bash
JSON_MODE=false
# ... argument parsing ...
if [ "$JSON_MODE" = true ]; then
    echo "{\"BRANCH_NAME\":\"$BRANCH_NAME\",\"FEATURE_NUM\":\"$FEATURE_NUM\"}"
fi
```
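The "scan existing spec directories and determine the next sequential number" operation the script performs can be sketched like this (in Python for brevity; the actual implementation is bash, and the helper name here is hypothetical):

```python
# Hypothetical sketch of the sequential-numbering scan that
# create-new-feature.sh performs over the specs/ directory.
import re
from pathlib import Path

def next_feature_num(specs_dir: str) -> str:
    highest = 0
    for entry in Path(specs_dir).iterdir():
        if not entry.is_dir():
            continue
        # Spec directories are named like 003-user-auth: numeric prefix first
        match = re.match(r"(\d+)", entry.name)
        if match:
            highest = max(highest, int(match.group(1)))
    return f"{highest + 1:03d}"  # zero-padded, e.g. "004"
```

This is exactly the kind of deterministic bookkeeping that's better done in a script than left to the model's judgment.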

The --json flag is a recurring pattern across all scripts. Command templates instruct the AI to execute the script, parse the JSON output, and use the values in subsequent steps. For example, check-prerequisites.sh outputs:

```json
{"FEATURE_DIR": "specs/003-user-auth", "AVAILABLE_DOCS": ["spec.md", "plan.md"]}
```

```mermaid
sequenceDiagram
    participant AI as AI Agent
    participant CMD as Command Template
    participant SH as Shell Script

    AI->>CMD: Read /speckit.tasks
    CMD->>AI: "Run: .specify/scripts/bash/check-prerequisites.sh --json"
    AI->>SH: Execute script
    SH->>AI: {"FEATURE_DIR": "...", "AVAILABLE_DOCS": [...]}
    AI->>AI: Parse JSON, locate spec artifacts
    AI->>AI: Generate tasks.md
```

Every script exists in both bash/ and powershell/ variants with identical behavior, so the cross-platform coverage is complete: every .sh script has a .ps1 counterpart. The --script flag during specify init determines which variant gets installed to .specify/scripts/.

Tip: The --json output mode is the key insight for working with AI agents and shell scripts. Human-readable output is nice for debugging, but JSON output lets the AI reliably extract structured data without regex parsing. If you're building similar AI-agent tooling, always provide a machine-parseable output mode.
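On the consuming side, extracting values from that output is trivial once it's valid JSON (sketched in Python; an AI agent does the equivalent when it reads the script's stdout):

```python
# Parsing the machine-readable output of check-prerequisites.sh --json.
# The raw string below stands in for the script's captured stdout.
import json

raw = '{"FEATURE_DIR": "specs/003-user-auth", "AVAILABLE_DOCS": ["spec.md", "plan.md"]}'
result = json.loads(raw)

feature_dir = result["FEATURE_DIR"]
has_plan = "plan.md" in result["AVAILABLE_DOCS"]
```

Compare this with scraping the same values out of free-form log lines, where a changed label or reordered field silently breaks extraction.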

The Hook System: Extension Points in Every Command

Every command template contains two hook checkpoints — one before execution, one after. Here's the pre-execution check from specify.md:

```markdown
## Pre-Execution Checks

**Check for extension hooks (before specification)**:
- Check if `.specify/extensions.yml` exists in the project root.
- If it exists, read it and look for entries under the `hooks.before_specify` key
- Filter out hooks where `enabled` is explicitly `false`
- For each executable hook, output based on its `optional` flag:
  - **Optional hook**: Present to user with prompt
  - **Mandatory hook**: Execute immediately via `EXECUTE_COMMAND: {command}`
```

This is a declarative plugin system where the AI agent is the runtime. The template tells the AI: "Read this YAML file, check for hooks, execute mandatory ones, suggest optional ones." The actual hook logic is defined in extension manifests (like the git extension) and merged into .specify/extensions.yml during extension installation.
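A hypothetical .specify/extensions.yml illustrating the shape the template describes (only `hooks`, `enabled`, `optional`, and the `{command}` field are named in the template; everything else here is an assumption):

```yaml
# Hypothetical extensions.yml sketch; field names beyond those the
# template references (command, enabled, optional) are illustrative.
hooks:
  before_specify:
    - command: speckit.git.feature
      optional: false   # mandatory: executed immediately
    - command: speckit.git.status
      optional: true    # presented to the user as a suggestion
      enabled: false    # explicitly disabled: filtered out
```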

```mermaid
flowchart TD
    A["AI reads command template"] --> B{"extensions.yml exists?"}
    B -->|No| C["Skip hooks silently"]
    B -->|Yes| D["Parse hooks.before_{stage}"]
    D --> E{"Hook optional?"}
    E -->|Yes| F["Present to user:<br/>'Run /speckit.git.commit?'"]
    E -->|No| G["Execute immediately:<br/>EXECUTE_COMMAND: speckit.git.feature"]
    F --> H["User decides"]
    G --> I["Wait for result"]
    H --> J["Continue to main logic"]
    I --> J
```

The hook system is covered in depth in Part 5. What's important here is that every command template includes both before_ and after_ checkpoints, creating 18 potential hook points across the 9 commands. The git extension uses all 18 of them.
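The filtering those checkpoints describe can be sketched as ordinary data processing (Python over an already-parsed YAML structure; the function name is hypothetical, and the hook fields follow the template excerpt above):

```python
# Sketch of the hook filtering the command templates instruct the AI
# to perform. `manifest` stands in for parsed .specify/extensions.yml.
def partition_hooks(manifest: dict, stage: str) -> tuple[list, list]:
    hooks = manifest.get("hooks", {}).get(stage, [])
    # A hook is skipped only when enabled is explicitly false
    active = [h for h in hooks if h.get("enabled") is not False]
    mandatory = [h for h in active if not h.get("optional")]
    optional = [h for h in active if h.get("optional")]
    return mandatory, optional

manifest = {
    "hooks": {
        "before_specify": [
            {"command": "speckit.git.feature", "optional": False},
            {"command": "speckit.git.status", "optional": True, "enabled": False},
        ]
    }
}
mandatory, optional = partition_hooks(manifest, "before_specify")
```

The AI then executes each mandatory hook immediately and presents each optional hook to the user, exactly as the checkpoint text specifies.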

Document Templates as LLM Constraints

Beyond command templates, Spec Kit includes document templates that constrain the output of AI agents. The templates/spec-template.md is the most important — it defines the structure of a feature specification:

```markdown
# Feature Specification: [FEATURE NAME]

## User Scenarios & Testing *(mandatory)*

<!--
  IMPORTANT: User stories should be PRIORITIZED as user journeys ordered by importance.
  Each user story/journey must be INDEPENDENTLY TESTABLE
-->

### User Story 1 - [Brief Title] (Priority: P1)
```

The template enforces several constraints on the AI:

  1. Prioritized user stories — each story gets a P1/P2/P3 priority, preventing flat lists where everything seems equally important
  2. [NEEDS CLARIFICATION] markers — the command template limits these to a maximum of 3, forcing the AI to make informed defaults rather than asking dozens of questions
  3. Technology-agnostic success criteria — "Users can complete checkout in under 3 minutes" not "API response time under 200ms"
  4. Mandatory vs. optional sections — some sections must be completed, others should be removed entirely if not applicable (not left as "N/A")

The templates/plan-template.md includes "Phase -1 gates" that reference constitutional principles:

```markdown
## Constitution Check

Language/Version: [e.g., Python 3.11]
Primary Dependencies: [e.g., FastAPI]
```

And the templates/constitution-template.md defines the governance skeleton — core principles like Library-First, CLI Interface Mandate, and Test-First Imperative that constrain all generated implementation plans.

These templates function as prompt engineering at scale. Instead of crafting perfect prompts for each interaction, the templates embed structural constraints that guide the AI toward higher-quality output regardless of the underlying model. As the spec-driven.md philosophy document puts it: the templates "transform the LLM from a creative writer into a disciplined specification engineer."

What's Next

Commands and templates define what the AI does. But how do you add new commands? How do lifecycle hooks get wired into the workflow? Part 5 covers Spec Kit's two extensibility mechanisms: extensions (custom commands + hooks) and presets (template overrides), including a detailed walkthrough of the bundled Git extension and the multi-catalog discovery system.