Spec Kit Architecture: How a CLI Orchestrates AI-Driven Development

Intermediate

Prerequisites

  • Basic Python packaging concepts (pyproject.toml, wheels, entry points)
  • Familiarity with CLI frameworks (typer or click)
  • General awareness of AI coding assistants (Claude, Copilot, Gemini)

Every AI coding assistant has its own configuration directory, its own command format, its own way of receiving instructions. If you want to give structured workflow guidance to Claude Code, Copilot, Gemini CLI, Cursor, and more than twenty other agents, you're looking at a multiplying pile of format-specific files: every workflow command duplicated in every agent's format. GitHub's Spec Kit solves this with a single CLI — specify — that transforms a shared set of workflow templates into agent-specific instructions for over 25 assistants. This article maps the architecture that makes that possible.

What Is Spec Kit and Spec-Driven Development?

Spec-Driven Development (SDD) inverts the usual relationship between specs and code. Instead of specifications serving as discardable scaffolding for implementation, SDD treats them as the primary artifact. Code becomes an expression of the spec, not the other way around.

Spec Kit operationalizes this philosophy through a four-phase workflow:

flowchart LR
    A["/speckit.specify"] --> B["/speckit.plan"]
    B --> C["/speckit.tasks"]
    C --> D["/speckit.implement"]
    A -.->|"clarify"| E["/speckit.clarify"]
    E -.-> A

Each phase is a slash command that an AI agent executes. The specify command turns a natural-language feature description into a structured specification. The plan command transforms that spec into a technical implementation plan. The tasks command breaks the plan into an ordered task list. And implement executes those tasks. The AI agent is both the reader and executor of these commands — the markdown files are the program, and the LLM is the runtime.
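To make "markdown as the program" concrete, a slash command template pairs YAML frontmatter with natural-language instructions for the agent. The sketch below is illustrative; the field names and script path are assumptions, not Spec Kit's exact schema:

```markdown
---
description: Create a feature specification from a natural-language description
scripts:
  sh: scripts/bash/create-feature.sh   # hypothetical path
---

Given the feature description in $ARGUMENTS:

1. Run the setup script to create the feature branch and spec file.
2. Fill in the spec template with user stories and acceptance criteria.
3. Report the branch name and spec file path when done.
```

The frontmatter is machine-readable metadata the CLI can rewrite per agent; the body is the "program" the LLM runtime executes.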

The specify CLI tool itself doesn't run these workflows. It bootstraps projects with the right files, in the right format, for whichever AI assistant the developer uses. Think of it as a compiler that targets 25+ instruction set architectures from a single source.

Directory Structure and Module Roles

The repository is organized into a clear separation between the Python CLI package and the content it distributes:

| Directory | Purpose |
| --- | --- |
| src/specify_cli/ | Python package — CLI commands, integration system, extension/preset managers |
| templates/commands/ | The 9 slash command templates (markdown with YAML frontmatter) |
| templates/*.md | Document templates (spec, plan, tasks, constitution, checklist) |
| scripts/bash/ and scripts/powershell/ | Shell scripts invoked by command templates |
| extensions/git/ | Bundled git extension — 5 commands + 18 lifecycle hooks |
| presets/lean/ | Bundled "lean" preset — lighter workflow templates |
| tests/ | pytest suite mirroring the integration subpackage structure |
| docs/ | DocFX documentation site source |

The Python package under src/specify_cli/ has a compact module structure:

graph TD
    INIT["__init__.py<br/>(4,143 lines — CLI + TUI + orchestration)"]
    INT["integrations/<br/>__init__.py + base.py + 27 agent subpackages"]
    MAN["integrations/manifest.py<br/>(SHA-256 file tracking)"]
    EXT["extensions.py<br/>(ExtensionManifest, Registry, Manager)"]
    PRE["presets.py<br/>(PresetManifest, Registry, Manager)"]
    AGT["agents.py<br/>(CommandRegistrar — bridges extensions to agents)"]

    INIT --> INT
    INIT --> EXT
    INIT --> PRE
    EXT --> AGT
    PRE --> AGT
    INT --> MAN

Everything radiates from __init__.py. The integrations package owns the agent abstraction layer. Extensions and presets are parallel plugin systems that both use agents.py as their bridge to agent-specific output formats.
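The SHA-256 file tracking in integrations/manifest.py lets the CLI know which installed files it owns and whether they have changed. A minimal sketch of that idea, with a function name and details of my own rather than the repository's:

```python
import hashlib
from pathlib import Path


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file under root (by relative path) to its SHA-256 digest."""
    manifest: dict[str, str] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[str(path.relative_to(root))] = digest
    return manifest
```

Comparing a stored manifest against a freshly computed one is enough to detect which installed files a user has edited since the last run.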

The Monolithic __init__.py — Why It Matters

At 4,143 lines, src/specify_cli/__init__.py is the gravitational center of the project. It contains:

  • The main() entry point and Typer app configuration
  • All CLI commands: init, check, version, extension, preset, integration
  • The StepTracker TUI component for progress display
  • The select_with_arrows() interactive selection widget
  • Shared infrastructure installers (_install_shared_infra, _locate_core_pack)
  • Asset resolution logic for both wheel installs and source checkouts

This single-file approach is a deliberate tradeoff. For a CLI tool that users install via uv tool install, startup time matters, and a single module means fewer file-system lookups during import. What decomposition exists happens around this file rather than within it: the integration class hierarchy lives in integrations/base.py, manifest tracking in integrations/manifest.py, and the extension/preset systems in their own modules.

The entry point itself is remarkably simple — __init__.py#L4139-L4143:

def main():
    app()

if __name__ == "__main__":
    main()

The app object is a Typer instance with a custom BannerGroup that displays the ASCII banner before help output — __init__.py#L301-L307:

app = typer.Typer(
    name=\"specify\",
    help=\"Setup tool for Specify spec-driven development projects\",
    add_completion=False,
    invoke_without_command=True,
    cls=BannerGroup,
)

Tip: If you're exploring the codebase for the first time, search for @app.command() decorators within __init__.py to find every CLI command. There are about a dozen, each handling a distinct subcommand like init, extension add, or preset list.
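If you'd rather do that search programmatically, a short ast walk finds every decorated command. This is a generic sketch, not code from the repository:

```python
import ast


def find_cli_commands(source: str) -> list[str]:
    """Return names of functions decorated with @app.command(...)."""
    names = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            for dec in node.decorator_list:
                # Match a decorator of the form @<anything>.command(...)
                if (isinstance(dec, ast.Call)
                        and isinstance(dec.func, ast.Attribute)
                        and dec.func.attr == "command"):
                    names.append(node.name)
    return names


sample = """
@app.command()
def init():
    pass

@app.command()
def check():
    pass
"""

print(find_cli_commands(sample))  # ['init', 'check']
```

Pointing this at the real __init__.py would list every subcommand without running the CLI.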

Module Dependency Graph and Bootstrap Chain

When a user runs specify init, the import chain triggers a cascade that registers all 25+ integrations before the first line of command logic executes. Here's the sequence:

flowchart TD
    A["specify (entry point)"] --> B["specify_cli:main()"]
    B --> C["__init__.py module loads"]
    C --> D["_build_agent_config()"]
    D --> E["from .integrations import INTEGRATION_REGISTRY"]
    E --> F["integrations/__init__.py loads"]
    F --> G["_register_builtins()"]
    G --> H["Imports 27 agent subpackages"]
    H --> I["_register() for each → populates INTEGRATION_REGISTRY"]
    I --> J["AGENT_CONFIG dict built from registry"]

The critical function is _build_agent_config(), called at module level:

def _build_agent_config() -> dict[str, dict[str, Any]]:
    \"\"\"Derive AGENT_CONFIG from INTEGRATION_REGISTRY.\"\"\"
    from .integrations import INTEGRATION_REGISTRY
    config: dict[str, dict[str, Any]] = {}
    for key, integration in INTEGRATION_REGISTRY.items():
        if integration.config:
            config[key] = dict(integration.config)
    return config

AGENT_CONFIG = _build_agent_config()

This means INTEGRATION_REGISTRY must be fully populated before __init__.py finishes loading. That's accomplished by _register_builtins(), which imports every integration subpackage alphabetically and registers each instance. The final line of integrations/__init__.py triggers it unconditionally:

_register_builtins()

This "register at import time" pattern means the registry is always complete and consistent — there's no risk of a partially-initialized state. The downside is that adding a new integration with a syntax error will prevent the entire CLI from starting.
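A stripped-down sketch of the register-at-import-time pattern, with placeholder dictionary entries standing in for Spec Kit's real Integration objects:

```python
from typing import Any

INTEGRATION_REGISTRY: dict[str, Any] = {}


def _register(key: str, integration: Any) -> None:
    INTEGRATION_REGISTRY[key] = integration


def _register_builtins() -> None:
    # In Spec Kit, each agent subpackage is imported here and registers
    # an Integration instance; two placeholder entries stand in.
    _register("claude", {"dir": ".claude/commands"})
    _register("copilot", {"dir": ".github/prompts"})


# Runs at import time, so the registry is fully populated before any
# consumer module can read it.
_register_builtins()
```

Any module that imports this one sees a complete registry, which is exactly why a syntax error in one subpackage takes the whole CLI down with it.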

The agents.py module follows a similar pattern with lazy initialization. Its CommandRegistrar class builds AGENT_CONFIGS from the registry on first access, with a try/except to handle circular import scenarios during module loading — agents.py#L30-L57.
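The lazy variant can be sketched as a cached property guarded by a try/except; the names below are illustrative, not the exact implementation:

```python
class CommandRegistrar:
    """Builds agent configs on first access instead of at import time."""

    def __init__(self) -> None:
        self._configs: dict | None = None

    @property
    def agent_configs(self) -> dict:
        if self._configs is None:
            try:
                # Spec Kit imports INTEGRATION_REGISTRY here; if that
                # import fails mid-bootstrap (e.g. circularly), fall
                # back to an empty mapping instead of crashing.
                self._configs = dict(self._load_registry())
            except ImportError:
                self._configs = {}
        return self._configs

    def _load_registry(self) -> dict:
        # Stand-in for `from .integrations import INTEGRATION_REGISTRY`.
        return {"claude": {}, "copilot": {}}
```

Deferring the registry read means agents.py can be imported during bootstrap without forcing the integrations package to load first.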

Air-Gapped Bundling and Distribution

Enterprise environments often can't reach the internet during installation. Spec Kit solves this with Hatch's force-include mechanism, which bundles all runtime assets directly into the Python wheel.

The configuration lives in pyproject.toml#L28-L45:

[tool.hatch.build.targets.wheel.force-include]
\"templates/agent-file-template.md\" = \"specify_cli/core_pack/templates/agent-file-template.md\"
\"templates/commands\" = \"specify_cli/core_pack/commands\"
\"scripts/bash\" = \"specify_cli/core_pack/scripts/bash\"
\"scripts/powershell\" = \"specify_cli/core_pack/scripts/powershell\"
\"extensions/git\" = \"specify_cli/core_pack/extensions/git\"
\"presets/lean\" = \"specify_cli/core_pack/presets/lean\"

At build time, Hatch copies the repository's templates/, scripts/, extensions/, and presets/ directories into specify_cli/core_pack/ inside the wheel. At runtime, _locate_core_pack() resolves which path to use:

def _locate_core_pack() -> Path | None:
    candidate = Path(__file__).parent / "core_pack"
    if candidate.is_dir():
        return candidate
    return None

flowchart TD
    A["_locate_core_pack()"] --> B{"core_pack/ exists?"}
    B -->|"Yes (wheel install)"| C["Use specify_cli/core_pack/"]
    B -->|"No (source checkout)"| D["Fallback to repo root: templates/, scripts/"]

Every function that needs assets — _install_shared_infra(), _locate_bundled_extension(), _locate_bundled_preset() — follows this dual-path pattern. The fallback to repository-relative paths means developers running from a source checkout (uv run specify init) get the same behavior without building a wheel first.
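The dual-path pattern itself reduces to a few lines. Function and parameter names here are hypothetical, chosen to show the shape of the fallback:

```python
from pathlib import Path


def resolve_asset_root(package_dir: Path, repo_root: Path) -> Path:
    """Prefer wheel-bundled assets; fall back to the source checkout."""
    core_pack = package_dir / "core_pack"
    if core_pack.is_dir():
        return core_pack   # wheel install: assets were baked in at build time
    return repo_root       # source checkout: use repository-relative paths
```

Because every asset-consuming function routes through the same check, a wheel install and a source checkout stay behaviorally identical.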

Tip: If you're debugging asset resolution issues, check whether core_pack/ exists under your specify_cli installation directory. Its absence means you're running from source, which is fine for development but means asset paths resolve differently.

What's Next

With the architecture mapped out — the monolithic CLI core, the integration registry bootstrap, and the air-gapped bundling — we're ready to dive into the most important command in the entire system. In Part 2, we'll trace the complete execution path of specify init, from its 17 CLI parameters through the interactive TUI to the 8-step orchestration pipeline that transforms an empty directory into a fully scaffolded SDD project.