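Testing 25 Integrations: The Test Suite and Contributor Guide

A system that generates files for 25 or more AI agents faces a combinatorially explosive testing challenge. Each integration produces its own output format, file extension, directory structure, and frontmatter. A single bad placeholder substitution can silently generate instructions that confuse the AI agent. Spec Kit's test suite tackles this with a pattern-based approach: every integration test verifies the same invariants, adapted to each agent's output format. This article walks through the test architecture and closes with a practical guide to the most common contribution scenario: adding a new AI agent.

Test Architecture Overview

The test suite's directory layout mirrors the source tree: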

tests/
├── conftest.py                          # ANSI escape code stripping helper
├── integrations/
│   ├── conftest.py                      # StubIntegration test helper class
│   ├── test_base.py                     # Base class unit tests
│   ├── test_integration_claude.py       # Claude-specific tests
│   ├── test_integration_copilot.py      # Copilot-specific tests
│   ├── test_integration_windsurf.py     # ...one file per agent (27 in total)
│   ├── test_manifest.py                 # IntegrationManifest tests
│   └── test_registry.py                 # INTEGRATION_REGISTRY tests
├── test_extensions.py                   # Extension system tests
├── test_presets.py                      # Preset system tests
├── test_agent_config_consistency.py     # Cross-integration consistency checks
├── test_merge.py                        # JSON merge logic tests
└── test_branch_numbering.py             # Branch naming tests

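There are 51 test files in total. The largest group is the per-agent tests under tests/integrations/: one file per supported agent, plus test files for the base classes, the manifest, and the registry.

The shared conftest.py provides one utility function: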

import re

_ANSI_ESCAPE_RE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")

def strip_ansi(text: str) -> str:
    """Remove ANSI escape codes from Rich-formatted CLI output."""
    return _ANSI_ESCAPE_RE.sub("", text)

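This function is essential because the CLI uses Rich to print styled output. Any test that captures the CLI's standard output has to strip the ANSI escape codes before asserting on it.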
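To make the effect concrete, here is a minimal, self-contained sketch that repeats the helper and applies it to a styled string. The sample output text is invented for illustration and is not taken from the real CLI.

```python
import re

# Same pattern as the shared conftest.py helper shown above.
_ANSI_ESCAPE_RE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def strip_ansi(text: str) -> str:
    """Remove ANSI escape codes from Rich-formatted CLI output."""
    return _ANSI_ESCAPE_RE.sub("", text)


# Hypothetical captured output: bold green "Done", then a style reset.
styled = "\x1b[1;32mDone\x1b[0m: 3 files created"
plain = strip_ansi(styled)

# Asserting on the raw string would fail because of the escape codes;
# asserting on the stripped string works.
assert styled != "Done: 3 files created"
assert plain == "Done: 3 files created"
```

Stripping first keeps assertions independent of whatever styling the CLI applies to a message.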

The integration-test conftest at tests/integrations/conftest.py provides a StubIntegration, a minimal subclass of MarkdownIntegration used to test base class behavior without depending on any real agent:

class StubIntegration(MarkdownIntegration):
    key = "stub"
    config = {
        "name": "Stub Agent",
        "folder": ".stub/",
        "commands_subdir": "commands",
        "install_url": None,
        "requires_cli": False,
    }

Integration Test Pattern

Every integration test file follows the same structure. Take tests/integrations/test_integration_claude.py as an example:

Registration check:

def test_registered(self):
    assert "claude" in INTEGRATION_REGISTRY
    assert get_integration("claude") is not None

Config validation:

def test_config_uses_skills(self):
    integration = get_integration("claude")
    assert integration.config["folder"] == ".claude/"
    assert integration.config["commands_subdir"] == "skills"

The critical placeholder test:

def test_setup_creates_skill_files(self, tmp_path):
    integration = get_integration("claude")
    manifest = IntegrationManifest("claude", tmp_path)
    created = integration.setup(tmp_path, manifest, script_type="sh")

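    # plan_skill is the path to the generated plan skill file;
    # its lookup is omitted from this excerpt.
    content = plan_skill.read_text(encoding="utf-8")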
    assert "{SCRIPT}" not in content
    assert "{ARGS}" not in content
    assert "__AGENT__" not in content

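These three assertions are the most important pattern in the whole test suite. They verify that process_template() has replaced every placeholder. If an unprocessed {SCRIPT} remains in a command file, the AI agent sees the literal placeholder text instead of a shell command, and the whole workflow fails silently without raising any error.

The Copilot tests in tests/integrations/test_integration_copilot.py add format-specific assertions: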
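One way to make a failure of this check more informative is to report which placeholders survived. The sketch below is an illustration only: find_unresolved_placeholders is a hypothetical helper, not part of Spec Kit, and it covers just the three tokens named above.

```python
import re

# The three placeholder tokens named in the article.
_PLACEHOLDER_RE = re.compile(r"\{SCRIPT\}|\{ARGS\}|__AGENT__")


def find_unresolved_placeholders(content: str) -> list[str]:
    """Return the sorted, de-duplicated placeholders still present in content."""
    return sorted(set(_PLACEHOLDER_RE.findall(content)))


# A fully processed file yields an empty list.
assert find_unresolved_placeholders("Run scripts/setup.sh now") == []

# A partially processed file names exactly what was missed.
leftover = find_unresolved_placeholders("Run {SCRIPT} with {ARGS}, then {SCRIPT}")
assert leftover == ["{ARGS}", "{SCRIPT}"]
```

In a test, asserting that the result equals an empty list makes pytest print the offending tokens on failure.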

def test_setup_creates_agent_md_files(self, tmp_path):
    # Verify the .agent.md extension
    for f in agent_files:
        assert f.name.endswith(".agent.md")

def test_setup_creates_companion_prompts(self, tmp_path):
    # Verify the companion .prompt.md files
    for f in prompt_files:
        content = f.read_text(encoding="utf-8")
        assert content.startswith("---\nagent: speckit.")

def test_agent_and_prompt_counts_match(self, tmp_path):
    # Critical: every .agent.md must have a matching .prompt.md
    assert len(agents) == len(prompts)
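Equal counts do not by themselves prove that each agent file has its own companion. A stricter check could compare names. The sketch below assumes that an agent file and its prompt share a base name (for example speckit.plan.agent.md and speckit.plan.prompt.md); that naming is an assumption made for illustration, and unpaired_agents is a hypothetical helper.

```python
from pathlib import Path


def unpaired_agents(agent_files: list[Path], prompt_files: list[Path]) -> list[str]:
    """Return base names that have an .agent.md file but no .prompt.md file."""
    agent_names = {f.name.removesuffix(".agent.md") for f in agent_files}
    prompt_names = {f.name.removesuffix(".prompt.md") for f in prompt_files}
    return sorted(agent_names - prompt_names)


# Hypothetical file lists: the tasks command is missing its prompt.
agents = [Path("speckit.plan.agent.md"), Path("speckit.tasks.agent.md")]
prompts = [Path("speckit.plan.prompt.md")]

assert unpaired_agents(agents, prompts) == ["speckit.tasks"]
```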

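Tip: When adding a new integration test, start by copying test_integration_windsurf.py (the simplest example), then adjust the assertions to the target agent's format. The placeholder checks ({SCRIPT}, {ARGS}, __AGENT__) should appear in every integration test.

All integration tests use pytest's tmp_path fixture to isolate filesystem operations, so tests do not interfere with each other and leave no stray files on the developer's machine.

Extension and Preset Tests

The extension tests in tests/test_extensions.py cover the full lifecycle: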

flowchart TD
    A["Manifest Validation"] --> B["Registry Operations"]
    B --> C["Manager Install/Remove"]
    C --> D["Command Registration"]
    D --> E["Catalog Discovery"]
    E --> F["Hook Execution"]

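The manifest validation tests cover both positive and negative cases: a valid manifest parses correctly, and an invalid one raises a ValidationError with a specific message. These tests manage temporary directories with tempfile.mkdtemp() and shutil.rmtree() rather than tmp_path; either approach works.

The preset tests in tests/test_presets.py have the same structure as the extension tests and cover manifest validation, registry operations, and template resolution. The resolver's priority stack (local > preset > extension > core) is verified by setting up several sources that provide a template with the same name and confirming that the higher-priority source wins.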
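The priority stack can be illustrated with a small sketch. This is not Spec Kit's resolver: resolve_template, the layer names used as dictionary keys, and the in-memory sources are all hypothetical, chosen only to show the "first layer that has the name wins" rule.

```python
from typing import Optional

# Highest priority first, matching the order described above.
PRIORITY = ("local", "preset", "extension", "core")


def resolve_template(name: str, sources: dict[str, dict[str, str]]) -> Optional[str]:
    """Return the template body from the highest-priority layer that defines name."""
    for layer in PRIORITY:
        templates = sources.get(layer, {})
        if name in templates:
            return templates[name]
    return None


# Two layers define the same template name; the preset outranks core.
sources = {
    "core": {"plan": "core plan body"},
    "preset": {"plan": "preset plan body"},
}
assert resolve_template("plan", sources) == "preset plan body"

# Adding a local override changes the winner.
sources["local"] = {"plan": "local plan body"}
assert resolve_template("plan", sources) == "local plan body"
```

A test in this style builds the conflicting sources first and then asserts on the winner, which is the scenario the preset tests are described as constructing.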

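CI Pipeline and Release Process

CI runs on every push to the main branch and on every pull request. The test.yml workflow contains two jobs:

Ruff lint: runs uvx ruff check src/ on Python 3.13 only (linting does not need a version matrix).

pytest: runs the full test suite on Python 3.11, 3.12, and 3.13: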

strategy:
  matrix:
    python-version: ["3.11", "3.12", "3.13"]
steps:
  - name: Install dependencies
    run: uv sync --extra test
  - name: Run tests
    run: uv run pytest

flowchart LR
    A["Push/PR"] --> B["ruff check<br/>(3.13 only)"]
    A --> C["pytest<br/>(3.11)"]
    A --> D["pytest<br/>(3.12)"]
    A --> E["pytest<br/>(3.13)"]
    B --> F{"All green?"}
    C --> F
    D --> F
    E --> F
    F -->|Yes| G["✓ Merge allowed"]

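Releases are triggered by tags and handled by release.yml. After a tag matching v* is pushed, the workflow extracts the version number, generates release notes from the commits since the previous tag, and creates a GitHub Release. The wheel is built by Hatch, which bundles all static assets through the force-include configuration introduced in the first article of this series.

The tooling choice is worth noting: CI uses uv throughout for dependency management and script execution (uv sync, uv run). There is no pip install and no requirements.txt; uv resolves and installs every dependency from pyproject.toml.

Contributor Guide: Adding a New Integration

The most common contribution is adding support for a new AI coding assistant. AGENTS.md is the authoritative reference; what follows is a condensed version of the process: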

flowchart TD
    A["1. Choose base class"] --> B{"Agent format?"}
    B -->|"Standard .md"| C["MarkdownIntegration"]
    B -->|".toml"| D["TomlIntegration"]
    B -->|"skill dirs"| E["SkillsIntegration"]
    B -->|"Fully custom"| F["IntegrationBase"]
    C --> G["2. Create subpackage"]
    D --> G
    E --> G
    F --> G
    G --> H["3. Register in _register_builtins()"]
    H --> I["4. Add scripts/ dir"]
    I --> J["5. Write tests"]
    J --> K["6. Run test suite"]

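Step 1: Choose a base class. Most agents use MarkdownIntegration. If the agent needs TOML (as Gemini does), use TomlIntegration. If it uses a skills directory layout, use SkillsIntegration. Use IntegrationBase directly only when you need companion files or settings merging (as Copilot does).

Step 2: Create the subpackage. Create src/specify_cli/integrations/myagent/__init__.py: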

"""MyAgent integration."""
from ..base import MarkdownIntegration

class MyAgentIntegration(MarkdownIntegration):
    key = "myagent"
    config = {
        "name": "My Agent",
        "folder": ".myagent/",
        "commands_subdir": "commands",
        "install_url": "https://myagent.dev/install",
        "requires_cli": True,  # or False for IDE-only agents
    }
    registrar_config = {
        "dir": ".myagent/commands",
        "format": "markdown",
        "args": "$ARGUMENTS",
        "extension": ".md",
    }
    context_file = ".myagent/rules.md"

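The key should match the actual CLI binary name so that tool detection works. The folder value must end with /. registrar_config["dir"] is the path where CommandRegistrar writes extension commands.

Step 3: Register it. Add it to integrations/__init__.py in alphabetical order: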
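Rules like the trailing slash are easy to get wrong when copying a config. The sketch below shows how such rules could be checked in a test. check_config is a hypothetical helper, and treating the five keys from the example above as required is an assumption made for illustration, not a documented Spec Kit contract.

```python
# Keys that appear in the example config above; treating them as
# required is an assumption for this sketch.
REQUIRED_CONFIG_KEYS = ("name", "folder", "commands_subdir", "install_url", "requires_cli")


def check_config(config: dict) -> list[str]:
    """Return a list of human-readable problems found in an integration config."""
    problems = [f"missing key: {k}" for k in REQUIRED_CONFIG_KEYS if k not in config]
    folder = config.get("folder")
    if isinstance(folder, str) and not folder.endswith("/"):
        problems.append("folder must end with '/'")
    return problems


good = {
    "name": "My Agent",
    "folder": ".myagent/",
    "commands_subdir": "commands",
    "install_url": "https://myagent.dev/install",
    "requires_cli": True,
}
assert check_config(good) == []

# Dropping the trailing slash is reported.
bad = dict(good, folder=".myagent")
assert check_config(bad) == ["folder must end with '/'"]
```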

from .myagent import MyAgentIntegration
# ...
_register(MyAgentIntegration())

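Step 4: Add scripts. Create the src/specify_cli/integrations/myagent/scripts/ directory and add update-context.sh and update-context.ps1 to it. Both are thin wrapper scripts that update the agent's context file.

Step 5: Write tests. Create tests/integrations/test_integration_myagent.py with at least a registration check, config validation, verification of the installed files, and a placeholder substitution test.

Step 6: Run the test suite. Run uv run pytest to confirm everything passes, then run uvx ruff check src/ to lint.

Tip: CONTRIBUTING.md states that larger changes should be discussed with the maintainers beforehand. Adding a new integration is a well-established contribution path and is usually welcome, but before you start, check the issue tracker for issues related to agent_request; someone may already be doing the same work.

Series Summary

Across these six articles we started from the architectural foundations, then worked through the specify init pipeline, the four-layer integration system, the command-template workflow engine, the extension and preset plugin system, and finally the test suite and contribution workflow. The core insight is that Spec Kit plays two roles at once: a Python CLI that generates files, and a declarative instruction set in which the markdown templates are the program and the AI agent is the runtime. The CLI is the compiler, the templates are the code, and the large language model is the CPU.

The codebase rewards close reading. The large __init__.py is dense but clearly structured. The integration system is a textbook application of the Template Method pattern. Offline asset bundling through Hatch's force-include is a pattern worth borrowing for any project that ships runtime assets with its CLI. And the hook system, in which the AI reads YAML configuration and executes hooks at runtime, is a new take on plugin architecture tailored to the era of AI-assisted development.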