Read OSS

GitHub Automation — AI-Powered Issue Management at Scale

Intermediate

Prerequisites

  • Article 1: Architecture Overview
  • Basic GitHub Actions knowledge (workflows, events, permissions)
  • Basic TypeScript/shell scripting familiarity

The claude-code repository doesn't just contain Claude Code plugins; it uses Claude Code to manage itself. This recursive dogfooding pattern spans 12 GitHub Actions workflows, custom slash commands, sandboxed CLI wrappers, and TypeScript lifecycle scripts. When someone opens an issue, Claude Code triages it. When duplicates appear, Claude Code finds them, and a script closes them after a grace period unless the author objects. When issues go stale, a sweep script closes them once their lifecycle timeout expires.

This article traces an issue's complete lifecycle and examines the three-layer security model designed to make running AI agents on a public GitHub repository safe.

The Issue Lifecycle Pipeline

An issue in the claude-code repository passes through up to seven stages, each managed by a different workflow or script:

stateDiagram-v2
    [*] --> Opened: Issue created
    Opened --> Triaged: claude-issue-triage.yml
    Triaged --> DupeChecked: claude-dedupe-issues.yml
    DupeChecked --> Active: No duplicates found
    DupeChecked --> DupeCommented: Duplicate detected
    DupeCommented --> AutoClosed: 3-day grace + no dispute
    DupeCommented --> Active: Author disputes (👎 reaction)
    Active --> Stale: sweep.ts (14d inactive)
    Stale --> Closed: sweep.ts (14d after stale label)
    Active --> NeedsInfo: Triage labels needs-info/needs-repro
    NeedsInfo --> Closed: sweep.ts (7d no response)
    NeedsInfo --> Active: Info provided → re-triage
    Closed --> Locked: lock-closed-issues.yml

The lifecycle labels and their timeouts are defined in a single source of truth at scripts/issue-lifecycle.ts:

export const lifecycle = [
  { label: "invalid",   days: 3,  reason: "this doesn't appear to be about Claude Code" },
  { label: "needs-repro", days: 7, reason: "we still need reproduction steps" },
  { label: "needs-info",  days: 7, reason: "we still need more information" },
  { label: "stale",      days: 14, reason: "inactive for too long" },
  { label: "autoclose",  days: 14, reason: "inactive for too long" },
] as const;

Every script and workflow that needs timeout information imports from this file. The STALE_UPVOTE_THRESHOLD constant (10 thumbs-up reactions) provides an escape valve — popular issues are exempt from staleness, even if inactive.
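As a rough illustration of how a consumer of this table might behave, the sketch below looks up a label's timeout and applies the upvote exemption. The lifecycle array is copied from the snippet above. The helpers isExpired and isExemptFromStale and their argument shapes are hypothetical, and the threshold value of 10 comes from the text here rather than from the source file.

```typescript
// The lifecycle table as shown in the article.
const lifecycle = [
  { label: "invalid", days: 3, reason: "this doesn't appear to be about Claude Code" },
  { label: "needs-repro", days: 7, reason: "we still need reproduction steps" },
  { label: "needs-info", days: 7, reason: "we still need more information" },
  { label: "stale", days: 14, reason: "inactive for too long" },
  { label: "autoclose", days: 14, reason: "inactive for too long" },
] as const;

type LifecycleLabel = (typeof lifecycle)[number]["label"];

// Assumed value, taken from the article text (10 thumbs-up reactions).
const STALE_UPVOTE_THRESHOLD = 10;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: has `label` been on the issue longer than its timeout?
function isExpired(label: LifecycleLabel, labeledAt: Date, now: Date): boolean {
  const entry = lifecycle.find((e) => e.label === label);
  if (!entry) return false; // not reachable for a valid LifecycleLabel
  return now.getTime() - labeledAt.getTime() > entry.days * MS_PER_DAY;
}

// Hypothetical helper: popular issues are exempt from the stale label.
function isExemptFromStale(thumbsUp: number): boolean {
  return thumbsUp >= STALE_UPVOTE_THRESHOLD;
}
```

Deriving LifecycleLabel from the array itself means a label added to the table becomes a valid argument without a second edit, which is one practical benefit of keeping a single source of truth.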

AI-Powered Triage and Deduplication

When an issue is opened, two workflows fire in parallel:

claude-issue-triage.yml at /.github/workflows/claude-issue-triage.yml runs the /triage-issue command via anthropics/claude-code-action@v1. It uses Claude Opus (claude-opus-4-6) with a 5-minute timeout. The concurrency group ensures only one triage runs per issue at a time.

The triage command at /.claude/commands/triage-issue.md is carefully constrained:

  • Only add/remove labels — no comments, no issue edits
  • Only use labels that exist — fetches the label list first, never invents new ones
  • Conservative with lifecycle labels — "false positives are worse than missing labels"
  • Different behavior for new issues vs. comments — new issues get full categorization; comments only trigger lifecycle label updates

The command distinguishes between issues events (full triage) and issue_comment events (lifecycle label management). On comments, it removes stale/autoclose labels (new activity means the issue is alive) and evaluates whether needs-repro/needs-info should be removed (the missing information was provided).

flowchart TD
    EVENT{"Event type?"} --> NEW["issues (new issue)"]
    EVENT --> COMMENT["issue_comment"]

    NEW --> LABELS["Fetch available labels"]
    LABELS --> READ["Read issue details"]
    READ --> VALID{"About<br/>Claude Code?"}
    VALID -->|No| INVALID["Label: invalid"]
    VALID -->|Yes| CATEGORIZE["Apply category labels<br/>(type, area, platform)"]
    CATEGORIZE --> DUPECHECK["Search for duplicates"]
    DUPECHECK --> LIFECYCLE["Evaluate lifecycle labels<br/>(needs-repro, needs-info)"]
    LIFECYCLE --> APPLY["Apply all labels"]

    COMMENT --> READ_CONV["Read full conversation"]
    READ_CONV --> STALE{"Has stale/<br/>autoclose?"}
    STALE -->|Yes| REMOVE["Remove stale labels"]
    STALE -->|No| NEEDS{"Has needs-repro/<br/>needs-info?"}
    NEEDS -->|Yes| PROVIDED{"Info<br/>provided?"}
    PROVIDED -->|Yes| REMOVE_NEEDS["Remove needs-* label"]
    PROVIDED -->|No| KEEP["Keep label"]

    style INVALID fill:#E53935,color:#fff
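The comment branch can be sketched as a small pure function. This is only an approximation: in the real workflow the decision is made by the model following the command's instructions, not by code. The sketch follows the prose description, which treats the stale check and the needs-* check as independent. The function name and the infoProvided flag (a stand-in for the model's judgment) are hypothetical.

```typescript
// Labels that signal inactivity; any new comment clears them.
const STALE_LABELS = ["stale", "autoclose"];
// Labels that wait on the reporter; cleared only when the info has arrived.
const NEEDS_LABELS = ["needs-repro", "needs-info"];

// Hypothetical helper: which of the issue's current labels should be removed
// after a new comment? `infoProvided` stands in for the model's reading of
// the conversation.
function labelsToRemoveOnComment(current: string[], infoProvided: boolean): string[] {
  return current.filter(
    (label) =>
      STALE_LABELS.includes(label) ||
      (infoProvided && NEEDS_LABELS.includes(label)),
  );
}
```

Category labels such as bug or area labels never appear in the result, which matches the rule that comments only trigger lifecycle label updates.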

claude-dedupe-issues.yml at /.github/workflows/claude-dedupe-issues.yml runs the /dedupe command with Sonnet. The dedupe command at /.claude/commands/dedupe.md launches five parallel search agents, each using different keywords and approaches to find potential duplicates. A final filtering agent eliminates false positives before the results are posted via a comment script. The workflow also logs events to Statsig for analytics.
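The fan-out and filter shape of the dedupe command resembles the following sketch. It is not how the command is implemented, since the real agents are prompted in Markdown rather than coded. The Candidate type, its score field, and the findDuplicates function are all hypothetical and exist only to illustrate the pattern.

```typescript
// Hypothetical shapes: a candidate duplicate and a search strategy.
type Candidate = { issue: number; score: number };
type Searcher = (query: string) => Promise<Candidate[]>;

// Hypothetical fan-out/fan-in: run one search per query in parallel, merge
// hits by issue number (keeping the best score), then apply a final filter.
async function findDuplicates(
  queries: string[],
  search: Searcher,
  keep: (c: Candidate) => boolean,
): Promise<Candidate[]> {
  const results = await Promise.all(queries.map((q) => search(q)));
  const best = new Map<number, Candidate>();
  for (const list of results) {
    for (const c of list) {
      const prev = best.get(c.issue);
      if (!prev || c.score > prev.score) best.set(c.issue, c);
    }
  }
  return Array.from(best.values())
    .filter(keep)
    .sort((a, b) => b.score - a.score);
}
```

Running the searches with different queries widens recall, while the single filter at the end keeps precision high before anything is posted publicly.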

Tip: The triage command's tool restrictions (Bash(./scripts/gh.sh:*) and Bash(./scripts/edit-issue-labels.sh:*)) demonstrate the same pattern we saw in Part 2's allowed-tools — constraining AI agents to specific, auditable operations. This is essential when running AI on public repositories.

Auto-Close with Grace Period

The auto-close-duplicates script at scripts/auto-close-duplicates.ts implements the grace period. The three-day clock starts when the dedupe command posts a "possible duplicate" comment. Once that comment is more than three days old, the script decides whether the issue should be closed:

flowchart TD
    START["Fetch open issues<br/>created >3 days ago"] --> SCAN["For each issue"]
    SCAN --> DUPE{"Has bot 'possible<br/>duplicate' comment?"}
    DUPE -->|No| SKIP["Skip"]
    DUPE -->|Yes| AGE{"Comment older<br/>than 3 days?"}
    AGE -->|No| SKIP
    AGE -->|Yes| ACTIVITY{"Any comments<br/>after dupe comment?"}
    ACTIVITY -->|Yes| SKIP
    ACTIVITY -->|No| REACTION{"Author gave 👎<br/>on dupe comment?"}
    REACTION -->|Yes| SKIP
    REACTION -->|No| EXTRACT["Extract duplicate<br/>issue number"]
    EXTRACT --> CLOSE["Close as duplicate"]

    style CLOSE fill:#E53935,color:#fff
    style SKIP fill:#4CAF50,color:#fff

The author veto mechanism is the key design detail. The script checks reactions on the duplicate detection comment for a thumbs-down from the issue author specifically (matched by reaction.user.id === issue.user.id):

scripts/auto-close-duplicates.ts#L228-L241

If the author disagrees, the issue stays open. If they don't respond within three days, it's closed with a comment explaining what happened and inviting them to reopen.
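A minimal sketch of that decision, assuming simplified data shapes, might look like the following. It is not the code from scripts/auto-close-duplicates.ts. The types and the shouldCloseAsDuplicate function are hypothetical, while the "-1" content value is how the GitHub reactions API represents a thumbs-down.

```typescript
// Hypothetical, simplified shapes for the data the script inspects.
type Reaction = { content: string; user: { id: number } };
type TimedComment = { createdAt: Date };

const GRACE_MS = 3 * 24 * 60 * 60 * 1000; // three-day grace period from the text

// Hypothetical decision function following the flowchart above.
function shouldCloseAsDuplicate(
  issueAuthorId: number,
  dupeComment: TimedComment,
  otherComments: TimedComment[],
  dupeCommentReactions: Reaction[],
  now: Date,
): boolean {
  const postedAt = dupeComment.createdAt.getTime();
  // The grace period must have fully elapsed.
  if (now.getTime() - postedAt < GRACE_MS) return false;
  // Any comment after the duplicate notice counts as activity.
  if (otherComments.some((c) => c.createdAt.getTime() > postedAt)) return false;
  // Author veto: a thumbs-down from the issue author specifically.
  const vetoed = dupeCommentReactions.some(
    (r) => r.content === "-1" && r.user.id === issueAuthorId,
  );
  return !vetoed;
}
```

Matching on the author's user ID matters here: a thumbs-down from any other user does not block the close, so third parties cannot keep someone else's duplicate open.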

The @claude Mentions Handler

The claude.yml workflow enables natural language interaction with Claude Code directly in GitHub issues and PRs. It triggers when @claude appears in issue comments, PR review comments, PR reviews, or issue bodies:

if: |
  (github.event_name == 'issue_comment' && contains(github.event.comment.body, '@claude')) ||
  (github.event_name == 'pull_request_review_comment' && contains(github.event.comment.body, '@claude')) ||
  ...

This turns every issue and PR into a potential Claude Code interaction point. The workflow uses anthropics/claude-code-action@v1 with Sonnet and read-only permissions — it can read code and respond but can't push changes.

sequenceDiagram
    participant U as User
    participant GH as GitHub
    participant WF as claude.yml
    participant CC as Claude Code Action

    U->>GH: Comment: "@claude explain this error"
    GH->>WF: issue_comment event
    WF->>WF: Check: contains '@claude'?
    WF->>CC: Run claude-code-action
    CC->>GH: Read repository context
    CC->>GH: Post response comment
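To make the trigger condition concrete, here is a hypothetical TypeScript mirror of the two clauses shown in the workflow snippet. The clauses elided in the original are left out rather than guessed. The MentionEvent type and the mentionsClaude function are assumptions made for illustration.

```typescript
// Hypothetical, flattened view of the fields the workflow condition reads.
type MentionEvent = { eventName: string; commentBody?: string };

// Mirrors only the two clauses shown in the article; elided clauses are omitted.
function mentionsClaude(e: MentionEvent): boolean {
  const commentEvents = ["issue_comment", "pull_request_review_comment"];
  if (!commentEvents.includes(e.eventName)) return false;
  // GitHub's contains() ignores case, so compare in lowercase.
  return (e.commentBody ?? "").toLowerCase().includes("@claude");
}
```

Because the check is a plain substring match, text such as "@claudette" also triggers the workflow. The condition is a cheap gate that avoids starting a job for unrelated comments, not an access control.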

Security Sandboxing: Three Layers

Running AI agents on a public GitHub repository requires serious security consideration. The claude-code repository implements a three-layer defense:

Layer 1: gh.sh — Sandboxed CLI Wrapper

The script at scripts/gh.sh restricts the gh CLI to four read-only subcommands:

case "$CMD" in
  "issue view"|"issue list"|"search issues"|"label list")
    ;;
  *)
    echo "Error: only 'issue view', 'issue list', 'search issues', 'label list' are allowed"
    exit 1
    ;;
esac

Flags are allowlisted (--comments, --state, --limit, --label). The search command explicitly blocks repo:, org:, and user: qualifiers to prevent cross-repository access. Issue numbers must be numeric. The repository is always scoped via GH_REPO — no escaping to other repositories.

This is defense in depth: even if Claude Code somehow generates an unexpected gh command, the wrapper rejects it.
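The validation rules described above can be restated in TypeScript to show how the checks compose. This is a sketch of the same idea, not a translation of gh.sh: the validateGhArgs function, its return convention, and the exact parsing rules are assumptions.

```typescript
const ALLOWED_COMMANDS = ["issue view", "issue list", "search issues", "label list"];
const ALLOWED_FLAGS = ["--comments", "--state", "--limit", "--label"];

// Hypothetical validator: returns an error message, or null if the args pass.
function validateGhArgs(args: string[]): string | null {
  const cmd = args.slice(0, 2).join(" ");
  if (!ALLOWED_COMMANDS.includes(cmd)) return `command not allowed: ${cmd}`;

  for (const arg of args.slice(2)) {
    if (arg.startsWith("-")) {
      // Accept both "--flag value" and "--flag=value" spellings.
      const flag = arg.split("=")[0];
      if (!ALLOWED_FLAGS.includes(flag)) return `flag not allowed: ${flag}`;
    } else if (cmd === "issue view" && !/^\d+$/.test(arg)) {
      return `issue number must be numeric: ${arg}`;
    } else if (cmd === "search issues" && /\b(repo|org|user):/i.test(arg)) {
      // Block qualifiers that would widen the search beyond this repository.
      return `qualifier not allowed in search: ${arg}`;
    }
  }
  return null;
}
```

Everything here is an allowlist except the qualifier check, so an argument the validator has never seen is rejected by default rather than passed through.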

Layer 2: DevContainer Firewall

The DevContainer configuration at .devcontainer/devcontainer.json requires NET_ADMIN and NET_RAW capabilities:

"runArgs": [
  "--cap-add=NET_ADMIN",
  "--cap-add=NET_RAW"
]

These capabilities allow the firewall script at .devcontainer/init-firewall.sh to set up iptables rules. The firewall implements a default-deny policy:

iptables -P INPUT DROP
iptables -P FORWARD DROP
iptables -P OUTPUT DROP

Then it selectively allows traffic to:

  • GitHub (IPs fetched from api.github.com/meta)
  • npm registry (registry.npmjs.org)
  • Anthropic API (api.anthropic.com)
  • Statsig (statsig.anthropic.com, statsig.com)
  • VS Code marketplace and update servers
  • Sentry for error reporting

The firewall verifies itself: it confirms that example.com is unreachable and api.github.com is reachable. If either check fails, the script exits with an error.
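The self-verification step is a two-sided test, which the sketch below expresses with an injectable probe so it can be exercised without network access. The Probe type and the verifyFirewall function are hypothetical. The real check lives in the shell script, and how it probes each host is not shown here.

```typescript
// Hypothetical probe: resolves to true when the URL is reachable.
type Probe = (url: string) => Promise<boolean>;

// Hypothetical self-check: the blocked host must fail and the allowed host
// must succeed. Returns a list of problems; an empty list means the firewall
// behaves as intended.
async function verifyFirewall(probe: Probe): Promise<string[]> {
  const errors: string[] = [];
  if (await probe("https://example.com")) {
    errors.push("example.com is reachable: default-deny is not in effect");
  }
  if (!(await probe("https://api.github.com"))) {
    errors.push("api.github.com is unreachable: the allowlist is too strict");
  }
  return errors;
}
```

Checking both directions catches both failure modes: rules that never loaded (everything reachable) and rules that block too much (nothing reachable).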

Layer 3: Enterprise Managed Settings

The examples at examples/settings/ show how organizations can lock down Claude Code at the policy level.

flowchart TD
    subgraph "Layer 1: CLI Wrapper"
        GH["gh.sh<br/>4 allowed subcommands<br/>Flag allowlist<br/>Injection prevention"]
    end

    subgraph "Layer 2: Network Firewall"
        FW["init-firewall.sh<br/>Default-deny iptables<br/>Allowlisted domains only<br/>Self-verification"]
    end

    subgraph "Layer 3: Enterprise Settings"
        ES["managed-settings.json<br/>Permission lockdown<br/>Hook restrictions<br/>Tool deny lists"]
    end

    GH --> FW --> ES

    style GH fill:#4CAF50,color:#fff
    style FW fill:#2196F3,color:#fff
    style ES fill:#9C27B0,color:#fff

Enterprise Settings Profiles

The repository includes three deployment profiles showing progressive lockdown:

The three profiles (Lax, Strict, Bash-Sandbox) differ in which of the following settings they enforce:

  • Disable --dangerously-skip-permissions
  • Block plugin marketplaces
  • Block user/project permission rules
  • Block user/project hooks
  • Deny WebSearch/WebFetch tools
  • Bash requires approval
  • Bash must run sandboxed

The strict profile at examples/settings/settings-strict.json demonstrates the full lockdown:

{
  "permissions": {
    "disableBypassPermissionsMode": "disable",
    "ask": ["Bash"],
    "deny": ["WebSearch", "WebFetch"]
  },
  "allowManagedPermissionRulesOnly": true,
  "allowManagedHooksOnly": true,
  "strictKnownMarketplaces": []
}

The key properties:

  • allowManagedPermissionRulesOnly: Ignores user and project-level permission rules — only enterprise-managed rules apply
  • allowManagedHooksOnly: Blocks hooks from user settings and project configuration
  • strictKnownMarketplaces: []: Empty array blocks all plugin marketplaces
  • disableBypassPermissionsMode: "disable": Prevents users from running with --dangerously-skip-permissions
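One way to reason about these properties is a small audit that reports which lockdowns a settings object is missing. The sketch below assumes the JSON shape shown in the strict profile. The ManagedSettings type and the missingLockdowns function are hypothetical and not part of Claude Code.

```typescript
// Hypothetical type covering only the fields shown in the strict profile.
type ManagedSettings = {
  permissions?: {
    disableBypassPermissionsMode?: string;
    ask?: string[];
    deny?: string[];
  };
  allowManagedPermissionRulesOnly?: boolean;
  allowManagedHooksOnly?: boolean;
  strictKnownMarketplaces?: string[];
};

// Hypothetical audit: names of the key lockdown properties that are not set
// the way the strict profile sets them.
function missingLockdowns(s: ManagedSettings): string[] {
  const missing: string[] = [];
  if (s.permissions?.disableBypassPermissionsMode !== "disable") {
    missing.push("disableBypassPermissionsMode");
  }
  if (s.allowManagedPermissionRulesOnly !== true) {
    missing.push("allowManagedPermissionRulesOnly");
  }
  if (s.allowManagedHooksOnly !== true) {
    missing.push("allowManagedHooksOnly");
  }
  // An absent key is not the same as an empty array, so require the array.
  if (!Array.isArray(s.strictKnownMarketplaces) || s.strictKnownMarketplaces.length !== 0) {
    missing.push("strictKnownMarketplaces");
  }
  return missing;
}
```

A check like this could run in CI against a settings file before rollout, so a dropped key is caught before it reaches users.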

The sandbox configuration in the bash-sandbox profile goes further with network restrictions, limiting which domains bash commands can reach and blocking local socket access.

Tip: These settings files work at any level of the settings hierarchy, but properties like strictKnownMarketplaces, allowManagedHooksOnly, and allowManagedPermissionRulesOnly only take effect in enterprise settings. Test locally by applying to managed-settings.json before deploying to your organization.

The Staleness Sweep

The sweep script at scripts/sweep.ts runs on a schedule and handles two operations: marking inactive issues as stale, and closing issues whose lifecycle labels have expired.

The markStale function paginates through open issues sorted by update time (oldest first). It skips pull requests, locked issues, assigned issues, already-stale issues, and issues with ≥10 thumbs-up reactions. Everything else that hasn't been updated within the staleness window gets labeled.

The closeExpired function iterates through each lifecycle label, finds issues where the label was applied longer ago than the timeout, and checks for human activity since the label was applied. This is the safety net — if a real user (not a bot) commented after the label was applied, the issue is spared even if the script would otherwise close it:

scripts/sweep.ts#L124-L138

The script also supports --dry-run mode, which logs what would happen without making changes — essential for testing lifecycle configuration updates.
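The two checks at the heart of the sweep can be sketched as pure predicates. This is an illustration under assumed data shapes, not the code in scripts/sweep.ts. The Issue and IssueComment types and both function names are hypothetical, and the 14-day window and threshold of 10 are the values quoted earlier in this article.

```typescript
// Hypothetical, simplified shapes for what the sweep inspects.
type Issue = {
  isPullRequest: boolean;
  locked: boolean;
  assignees: string[];
  labels: string[];
  thumbsUp: number;
  updatedAt: Date;
};
type IssueComment = { authorType: string; createdAt: Date };

const STALE_DAYS = 14; // from the lifecycle table
const UPVOTE_THRESHOLD = 10; // from the article text
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical version of the markStale skip rules.
function shouldMarkStale(issue: Issue, now: Date): boolean {
  if (issue.isPullRequest || issue.locked) return false;
  if (issue.assignees.length > 0) return false;
  if (issue.labels.includes("stale")) return false;
  if (issue.thumbsUp >= UPVOTE_THRESHOLD) return false;
  return now.getTime() - issue.updatedAt.getTime() > STALE_DAYS * DAY_MS;
}

// Hypothetical version of the closeExpired safety net: did a non-bot user
// comment after the lifecycle label was applied?
function hasHumanActivitySince(comments: IssueComment[], labeledAt: Date): boolean {
  return comments.some(
    (c) => c.authorType !== "Bot" && c.createdAt.getTime() > labeledAt.getTime(),
  );
}
```

Keeping the predicates free of API calls is also what makes a --dry-run mode cheap to support: the same decisions are computed, and only the write step is skipped.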

Wrapping Up the Series

Across these five articles, we've mapped the anthropics/claude-code repository from its high-level architecture through the plugin component model, multi-agent orchestration patterns, hook implementations, and GitHub automation infrastructure. The repository is a reference implementation for how to extend Claude Code — and how to use Claude Code to manage itself.

The patterns we've examined — convention-over-configuration plugin discovery, model-tier-appropriate agent design, fail-open vs. fail-closed hook philosophies, and layered security sandboxing — apply far beyond this specific repository. They're the building blocks for any team integrating AI agents into their development workflow.

If you're building a Claude Code plugin, start with the plugin-dev skill's documentation. If you're setting up enterprise deployment, start with the settings examples. And if you're curious about running AI agents on GitHub at scale, the automation infrastructure here is a detailed, working reference implementation.