Automated Supply Chain Security: The Dependency Updater and Version Pinning System

Advanced

Prerequisites

  • Article 1: Architecture (for versions.json context)
  • Basic semantic versioning (semver) understanding
  • Familiarity with the GitHub API

In Parts 1-3 we treated versions.json as a given — the source of truth that Dockerfiles read to clone and verify upstream dependencies. But someone (or something) has to keep those versions current. Doing it manually means monitoring four upstream repositories for new releases, parsing different tag formats, verifying that updates aren't downgrades, updating both versions.json and versions.env, and creating pull requests with meaningful diff links.

The dependency_updater is a Go command-line tool that automates this entire pipeline. It runs daily via GitHub Actions, queries the GitHub Tags API with pagination, filters versions by tracking mode, enforces anti-downgrade protection, and auto-generates PRs. The version parsing logic alone handles four distinct tag formats — a testament to the diversity of upstream versioning conventions.

versions.json: The Source of Truth

Let's look more carefully at the structure of versions.json. Each of the four dependencies has a well-defined schema:

{
  "op_node": {
    "tag": "op-node/v1.16.11",
    "commit": "cba7aba0c98aae22720b21c3a023990a486cb6e0",
    "tagPrefix": "op-node",
    "owner": "ethereum-optimism",
    "repo": "optimism",
    "tracking": "release"
  }
}

| Field | Purpose |
| --- | --- |
| tag | The Git tag to clone at; the human-readable version identifier |
| commit | The expected SHA at that tag, used for supply chain verification |
| tagPrefix | Optional prefix for monorepo tags (e.g., op-node in op-node/v1.16.11) |
| owner / repo | GitHub organization and repository name |
| tracking | Version selection strategy: release, tag, or branch |
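
To make the schema concrete, here is a minimal sketch of how these entries could be decoded in Go. The JSON keys come from the file above; the Go type and field names are assumptions for illustration (the article's later snippets only confirm that a Dependencies type with Tag and Commit fields exists), and parseVersions is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Dependency mirrors one entry in versions.json. The Go field names are
// assumptions; only the JSON keys are taken from the file shown above.
type Dependency struct {
	Tag       string `json:"tag"`
	Commit    string `json:"commit"`
	TagPrefix string `json:"tagPrefix,omitempty"` // optional, monorepo tags only
	Owner     string `json:"owner"`
	Repo      string `json:"repo"`
	Tracking  string `json:"tracking"`
	Branch    string `json:"branch,omitempty"` // only meaningful for branch tracking
}

// Dependencies maps a dependency key such as "op_node" to its entry.
type Dependencies map[string]*Dependency

// parseVersions (hypothetical helper) decodes the raw contents of versions.json.
func parseVersions(data []byte) (Dependencies, error) {
	var deps Dependencies
	if err := json.Unmarshal(data, &deps); err != nil {
		return nil, fmt.Errorf("decode versions.json: %w", err)
	}
	return deps, nil
}

const sample = `{
  "op_node": {
    "tag": "op-node/v1.16.11",
    "commit": "cba7aba0c98aae22720b21c3a023990a486cb6e0",
    "tagPrefix": "op-node",
    "owner": "ethereum-optimism",
    "repo": "optimism",
    "tracking": "release"
  }
}`

func main() {
	deps, err := parseVersions([]byte(sample))
	if err != nil {
		panic(err)
	}
	d := deps["op_node"]
	fmt.Println(d.Owner, d.Repo, d.Tag, d.Tracking)
}
```

Marking tagPrefix and branch as omitempty reflects that they are optional in the schema; dependencies without a monorepo prefix simply leave them out.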

The tracking field controls which upstream versions the updater considers valid:

  • release: Only stable versions (no prerelease suffixes). Used by all four current dependencies.
  • tag: Stable releases plus release candidates (-rc1, -rc.2). Useful for following pre-release chains.
  • branch: Track the latest commit on a specific branch, ignoring tags entirely. The branch field specifies which branch to follow.
flowchart TD
    T{"tracking mode"} -->|release| R["Only stable versions<br/>v0.7.0 ✅<br/>v0.7.1-rc1 ❌<br/>v0.7.1-synctest.0 ❌"]
    T -->|tag| RC["Stable + RC versions<br/>v0.7.0 ✅<br/>v0.7.1-rc1 ✅<br/>v0.7.1-synctest.0 ❌"]
    T -->|branch| BR["Latest commit on branch<br/>No tag filtering<br/>Tracks HEAD of branch"]
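
The filtering rules in the diagram can be sketched as a small predicate. This is an illustrative approximation rather than the updater's actual code: acceptTag and rcSuffix are hypothetical names, the tag is assumed to have its prefix already stripped, and build metadata is ignored.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// rcSuffix matches release-candidate prerelease labels such as rc1, rc.2 or RC-3.
var rcSuffix = regexp.MustCompile(`(?i)^rc[.-]?\d+$`)

// acceptTag (hypothetical helper) reports whether a prefix-stripped tag passes
// the given tracking mode.
func acceptTag(tracking, tag string) bool {
	// Everything after the first "-" is treated as the prerelease label.
	_, pre, hasPre := strings.Cut(tag, "-")
	switch tracking {
	case "release":
		return !hasPre // stable versions only
	case "tag":
		return !hasPre || rcSuffix.MatchString(pre) // stable plus RCs
	default:
		return false // branch mode follows commits, not tags
	}
}

func main() {
	for _, tag := range []string{"v0.7.0", "v0.7.1-rc1", "v0.7.1-synctest.0"} {
		fmt.Printf("%-20s release=%-5v tag=%v\n",
			tag, acceptTag("release", tag), acceptTag("tag", tag))
	}
}
```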

Version Parsing Across Four Formats

The four upstream dependencies use four different versioning conventions. The version.go file handles all of them through a normalization pipeline.

| Dependency | Example Tag | Format |
| --- | --- | --- |
| base_reth_node | v0.7.0 | Standard v-prefixed semver |
| nethermind | 1.36.2 | Bare semver (no v-prefix) |
| op_geth | v1.101702.0 | V-prefixed with unusual minor version |
| op_node | op-node/v1.16.11 | Monorepo-prefixed tag |

The ParseVersion function handles this diversity in three steps:

flowchart LR
    A["Input:<br/>op-node/v1.16.11-rc1"] -->|"Step 1: Strip prefix"| B["v1.16.11-rc1"]
    B -->|"Step 2: Normalize RC"| C["v1.16.11-rc.1"]
    C -->|"Step 3: Parse semver"| D["Version{1, 16, 11, rc.1}"]

Step 1: Strip tag prefix. If a tagPrefix is configured, it's removed along with any trailing slash. op-node/v1.16.11 becomes v1.16.11.

Step 2: Normalize RC formats. The normalizeRCFormat function uses a regex to convert all RC format variants to semver-compatible form:

var rcPattern = regexp.MustCompile(`(?i)-rc[.-]?(\d+)`)

func normalizeRCFormat(version string) string {
    return rcPattern.ReplaceAllString(version, "-rc.$1")
}

This single regex handles -rc1, -rc.1, -rc-1, and -RC1 — all becoming -rc.1. The (?i) flag makes it case-insensitive. The tests in version_test.go verify all these variants.

Step 3: Parse with Masterminds/semver. The normalized string is handed to the semver.NewVersion() function from the Masterminds library, which handles the v-prefix automatically and produces a structured version object.

Tip: The v1.101702.0 format from op-geth looks weird but it's valid semver. The minor version 101702 encodes information about the Optimism protocol version. The semver library treats it as any other integer — no special handling needed.
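
The three steps can be tried end to end with the sketch below. To stay free of third-party imports it swaps the Masterminds parser for a small hand-rolled one, so it is a simplified stand-in: Version and parseVersionSketch are hypothetical names, and the parser only understands MAJOR.MINOR.PATCH with an optional prerelease label.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

var rcPattern = regexp.MustCompile(`(?i)-rc[.-]?(\d+)`)

// Version is a simplified stand-in for the semver library's version type.
type Version struct {
	Major, Minor, Patch uint64
	Pre                 string
}

// parseVersionSketch (hypothetical) follows the three steps described above.
func parseVersionSketch(tag, tagPrefix string) (Version, error) {
	// Step 1: strip the monorepo prefix and its trailing slash.
	s := tag
	if tagPrefix != "" {
		s = strings.TrimPrefix(s, tagPrefix)
		s = strings.TrimPrefix(s, "/")
	}
	// Step 2: normalize every RC spelling to "-rc.N".
	s = rcPattern.ReplaceAllString(s, "-rc.$1")
	// Step 3: parse [v]MAJOR.MINOR.PATCH[-PRE]; the real tool delegates this
	// step to the Masterminds semver library.
	s = strings.TrimPrefix(s, "v")
	core, pre, _ := strings.Cut(s, "-")
	parts := strings.Split(core, ".")
	if len(parts) != 3 {
		return Version{}, fmt.Errorf("invalid version %q", tag)
	}
	var nums [3]uint64
	for i, p := range parts {
		n, err := strconv.ParseUint(p, 10, 64)
		if err != nil {
			return Version{}, fmt.Errorf("invalid version %q: %w", tag, err)
		}
		nums[i] = n
	}
	return Version{nums[0], nums[1], nums[2], pre}, nil
}

func main() {
	cases := []struct{ tag, prefix string }{
		{"v0.7.0", ""},
		{"1.36.2", ""},
		{"v1.101702.0", ""},
		{"op-node/v1.16.11-rc1", "op-node"},
	}
	for _, c := range cases {
		v, err := parseVersionSketch(c.tag, c.prefix)
		fmt.Printf("%-22s -> %+v (err=%v)\n", c.tag, v, err)
	}
}
```

Running it against the four example tags shows that the same code path handles each convention, including the large minor version used by op-geth.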

Version Comparison and Anti-Downgrade Protection

Once versions are parsed, the updater needs to determine whether a new version is actually an upgrade. The ValidateVersionUpgrade function enforces strict forward-only movement:

func ValidateVersionUpgrade(currentTag, newTag, tagPrefix string) error {
    if currentTag == "" {
        _, err := ParseVersion(newTag, tagPrefix)
        return err
    }
    currentVersion, err := ParseVersion(currentTag, tagPrefix)
    if err != nil {
        _, newErr := ParseVersion(newTag, tagPrefix)
        return newErr
    }
    newVersion, err := ParseVersion(newTag, tagPrefix)
    if err != nil {
        return fmt.Errorf("new version %q is not a valid semver: %w", newTag, err)
    }
    if newVersion.LessThan(currentVersion) {
        return fmt.Errorf("version downgrade detected: %s -> %s", currentTag, newTag)
    }
    return nil
}

The anti-downgrade logic is simple but has thoughtful edge cases:

  • Empty current version: Any valid new version is accepted (first-time setup)
  • Unparseable current version: If the current tag can't be parsed, the update is allowed as long as the new tag is valid semver
  • RC ordering: RC versions sort below their corresponding stable release (v0.3.0-rc2 < v0.3.0)
flowchart TD
    A["v0.2.2 (current)"] --> B{"Is v0.3.0-rc1 an upgrade?"}
    B -->|"0.3.0-rc1 > 0.2.2 ✅"| C["Valid upgrade"]
    C --> D{"Is v0.3.0-rc2 an upgrade?"}
    D -->|"0.3.0-rc2 > 0.3.0-rc1 ✅"| E["Valid upgrade"]
    E --> F{"Is v0.3.0 an upgrade?"}
    F -->|"0.3.0 > 0.3.0-rc2 ✅"| G["Valid: RC → stable"]
    G --> H{"Is v0.3.0-rc2 an upgrade?"}
    H -->|"0.3.0-rc2 < 0.3.0 ❌"| I["BLOCKED: downgrade!"]
    style I fill:#ff6666

The test suite at version_test.go lines 71-122 is comprehensive, covering upgrade paths, downgrades, edge cases with unparseable versions, and even cross-prefix validation (preventing rollup-boost/v0.7.11 from being "upgraded" to websocket-proxy/v0.0.2).
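
The upgrade chain from the diagram can be replayed with a deliberately small model of the ordering rules. This is a sketch under simplifying assumptions, not the updater's code: ver, compare, and validateUpgrade are hypothetical, and the only prerelease form modeled is a numbered RC.

```go
package main

import "fmt"

// ver is a toy version: rc == 0 means a stable release, rc == N means -rc.N.
type ver struct {
	major, minor, patch int
	rc                  int
}

func (v ver) String() string {
	s := fmt.Sprintf("v%d.%d.%d", v.major, v.minor, v.patch)
	if v.rc > 0 {
		s += fmt.Sprintf("-rc.%d", v.rc)
	}
	return s
}

// compare returns -1, 0 or 1. For equal major.minor.patch, any RC sorts below
// the stable release, and RCs sort by their number.
func compare(a, b ver) int {
	core := [][2]int{{a.major, b.major}, {a.minor, b.minor}, {a.patch, b.patch}}
	for _, p := range core {
		if p[0] != p[1] {
			if p[0] > p[1] {
				return 1
			}
			return -1
		}
	}
	switch {
	case a.rc == b.rc:
		return 0
	case a.rc == 0:
		return 1
	case b.rc == 0:
		return -1
	case a.rc > b.rc:
		return 1
	default:
		return -1
	}
}

// validateUpgrade rejects any move to a strictly lower version.
func validateUpgrade(current, next ver) error {
	if compare(next, current) < 0 {
		return fmt.Errorf("version downgrade detected: %s -> %s", current, next)
	}
	return nil
}

func main() {
	chain := []ver{{0, 2, 2, 0}, {0, 3, 0, 1}, {0, 3, 0, 2}, {0, 3, 0, 0}, {0, 3, 0, 2}}
	for i := 1; i < len(chain); i++ {
		err := validateUpgrade(chain[i-1], chain[i])
		fmt.Printf("%s -> %s: %v\n", chain[i-1], chain[i], err)
	}
}
```

Only the last step, from v0.3.0 back to v0.3.0-rc.2, returns an error; every earlier step is accepted.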

The Update Flow: GitHub API to Pull Request

The main update logic lives in dependency_updater.go. The flow for each dependency follows this sequence:

flowchart TD
    A["Read versions.json"] --> B["For each dependency"]
    B --> C{"tracking mode?"}
    C -->|"release/tag"| D["Paginate GitHub Tags API"]
    C -->|"branch"| E["Get latest commit on branch"]
    D --> F["Filter by tagPrefix"]
    F --> G["Filter by tracking mode<br/>(release-only or release+RC)"]
    G --> H["Validate: is this an upgrade?"]
    H --> I["Find maximum valid version"]
    I --> J{"Version changed?"}
    J -->|Yes| K["Update versions.json"]
    J -->|No| L["Skip dependency"]
    E --> M{"Commit changed?"}
    M -->|Yes| K
    M -->|No| L
    K --> N["Regenerate versions.env"]
    N --> O["Create commit with diff URL"]

The tag-based update logic in getVersionAndCommit is particularly well-structured. Rather than taking the first valid upgrade it finds, it collects all valid tags across all pages, then finds the maximum version among them:

// Collect all valid tags across all pages, then find the max version
var validTags []*github.RepositoryTag

for {
    tags, resp, err := client.Repositories.ListTags(ctx, owner, repo, options)
    // ... filter by prefix, tracking mode, upgrade validation ...
    if resp.NextPage == 0 {
        break
    }
    options.Page = resp.NextPage
}

// Find the maximum version among valid tags
for _, tag := range validTags {
    if selectedTag == nil {
        selectedTag = tag
        continue
    }
    cmp, _ := CompareVersions(*tag.Name, *selectedTag.Name, tagPrefix)
    if cmp > 0 {
        selectedTag = tag
    }
}

This collect-then-max approach matters because the GitHub Tags API does not return tags in semantic version order. The highest version can sit on any page, so taking only the first page's results, or stopping at the first valid upgrade, could miss it.
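
The pattern can be exercised without the network using a fake paginated source. Everything in this sketch is hypothetical (page, maxAcrossPages, demoFetch, cmpSemver); the only convention borrowed from the snippet above is that NextPage == 0 marks the last page.

```go
package main

import "fmt"

// page models one paginated response; NextPage == 0 marks the last page.
type page struct {
	Tags     []string
	NextPage int
}

// maxAcrossPages walks every page first, then returns the greatest tag by cmp.
func maxAcrossPages(fetch func(n int) (page, error), cmp func(a, b string) int) (string, error) {
	var all []string
	for n := 1; ; {
		p, err := fetch(n)
		if err != nil {
			return "", err
		}
		all = append(all, p.Tags...)
		if p.NextPage == 0 {
			break
		}
		n = p.NextPage
	}
	best := ""
	for i, t := range all {
		if i == 0 || cmp(t, best) > 0 {
			best = t
		}
	}
	return best, nil
}

// cmpSemver is a demo-only comparator for plain vMAJOR.MINOR.PATCH tags.
func cmpSemver(a, b string) int {
	var x, y [3]int
	fmt.Sscanf(a, "v%d.%d.%d", &x[0], &x[1], &x[2])
	fmt.Sscanf(b, "v%d.%d.%d", &y[0], &y[1], &y[2])
	for i := range x {
		if x[i] != y[i] {
			if x[i] > y[i] {
				return 1
			}
			return -1
		}
	}
	return 0
}

// demoFetch serves two fixed pages; the highest version is on the second one.
func demoFetch(n int) (page, error) {
	pages := map[int]page{
		1: {Tags: []string{"v1.9.0", "v1.2.0"}, NextPage: 2},
		2: {Tags: []string{"v1.10.0", "v1.3.0"}},
	}
	p, ok := pages[n]
	if !ok {
		return page{}, fmt.Errorf("no such page %d", n)
	}
	return p, nil
}

func main() {
	best, err := maxAcrossPages(demoFetch, cmpSemver)
	fmt.Println(best, err)
}
```

Here v1.9.0 appears first and would also win a plain string comparison, yet v1.10.0 on the second page is the semantic maximum.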

The function also generates a diff URL for the commit message: https://github.com/{owner}/{repo}/compare/{old_tag}...{new_tag}. This gives PR reviewers a direct link to see what changed upstream.
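
Building that link is a one-line format call. The sketch below uses a hypothetical helper name, compareURL, and an illustrative previous tag (op-node/v1.16.10) that does not come from the repository. The same helper covers branch tracking if commit SHAs are passed instead of tags.

```go
package main

import "fmt"

// compareURL (hypothetical helper) builds a GitHub compare link between two
// refs, which may be tags or commit SHAs.
func compareURL(owner, repo, oldRef, newRef string) string {
	return fmt.Sprintf("https://github.com/%s/%s/compare/%s...%s", owner, repo, oldRef, newRef)
}

func main() {
	// The old tag here is illustrative only.
	fmt.Println(compareURL("ethereum-optimism", "optimism", "op-node/v1.16.10", "op-node/v1.16.11"))
}
```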

For branch tracking mode (lines 284-307), the logic is simpler: query the latest commit on the specified branch and compare SHA hashes. If the commit has changed, generate a diff URL between the old and new commits.

All API calls are wrapped with retry.Do0 from the Optimism SDK (line 100), retrying up to 3 times with 1-second fixed delays. This handles transient GitHub API failures gracefully.
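
For readers unfamiliar with the pattern, a fixed-delay retry can be sketched with the standard library alone. This is not the implementation of retry.Do0; retryFixed and demoFlaky are hypothetical, and the demo uses a 10 ms delay instead of the 1 second mentioned above so that it finishes quickly.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// retryFixed (hypothetical) runs op up to attempts times, sleeping a fixed
// delay between failures and giving up early if ctx is cancelled.
func retryFixed(ctx context.Context, attempts int, delay time.Duration, op func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = op(); err == nil {
			return nil
		}
		if i == attempts-1 {
			break // no point sleeping after the final attempt
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(delay):
		}
	}
	return fmt.Errorf("failed after %d attempts: %w", attempts, err)
}

// demoFlaky runs an operation that fails the given number of times before
// succeeding, and reports how many calls were made.
func demoFlaky(failures int) (calls int, err error) {
	op := func() error {
		calls++
		if calls <= failures {
			return errors.New("transient failure")
		}
		return nil
	}
	err = retryFixed(context.Background(), 3, 10*time.Millisecond, op)
	return calls, err
}

func main() {
	calls, err := demoFlaky(2) // succeeds on the third attempt
	fmt.Println(calls, err)
	calls, err = demoFlaky(5) // exhausts all three attempts
	fmt.Println(calls, err)
}
```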

versions.env Generation

After updating versions.json, the updater regenerates versions.env via the createVersionsEnv function:

func createVersionsEnv(repoPath string, dependencies Dependencies) error {
    envLines := []string{}
    for dependency := range dependencies {
        repoUrl := generateGithubRepoUrl(dependencies, dependency) + ".git"
        dependencyPrefix := strings.ToUpper(dependency)
        envLines = append(envLines, fmt.Sprintf("export %s_%s=%s",
            dependencyPrefix, "TAG", dependencies[dependency].Tag))
        envLines = append(envLines, fmt.Sprintf("export %s_%s=%s",
            dependencyPrefix, "COMMIT", dependencies[dependency].Commit))
        envLines = append(envLines, fmt.Sprintf("export %s_%s=%s",
            dependencyPrefix, "REPO", repoUrl))
    }
    slices.Sort(envLines)
    // ... write to file
}

The naming convention is deterministic: the dependency key is uppercased (op_geth becomes OP_GETH), then suffixed with _TAG, _COMMIT, or _REPO. The lines are sorted alphabetically before writing, which keeps diffs clean when only one dependency changes.

flowchart LR
    VJ["versions.json<br/>op_geth.tag = v1.101702.0<br/>op_geth.commit = d0734fd..."] -->|createVersionsEnv| VE["versions.env<br/>export OP_GETH_TAG=v1.101702.0<br/>export OP_GETH_COMMIT=d0734fd...<br/>export OP_GETH_REPO=https://..."]
    VE -->|"COPY into Dockerfile"| DF["RUN . /tmp/versions.env && ..."]

Tip: Both versions.json and versions.env are checked into the repository. The .env file is a derived artifact from the .json file. If they ever get out of sync, running the updater will reconcile them. Never edit versions.env manually — always modify versions.json and regenerate.

Commit Hash Verification in Docker Builds

The commit hash stored alongside each tag isn't just for reference — it's an active security control. As we saw in Part 3, every Dockerfile verifies that the cloned repository's HEAD matches the expected commit. Let's trace the full verification chain:

sequenceDiagram
    participant GH as GitHub Tags API
    participant UP as dependency_updater
    participant VJ as versions.json
    participant DF as Dockerfile (build time)
    participant REPO as Upstream Repo

    UP->>GH: ListTags(owner, repo)
    GH-->>UP: tags with commit SHAs
    UP->>UP: Find max valid version
    UP->>VJ: Write tag + commit SHA
    Note over DF: At build time...
    DF->>REPO: git clone --branch TAG
    DF->>DF: git rev-parse HEAD
    DF->>DF: Compare HEAD == expected COMMIT
    alt Match
        DF->>DF: Proceed to build
    else Mismatch
        DF->>DF: BUILD FAILS
    end

When the updater queries GitHub's Tags API, each tag object includes the commit SHA it points to. The updater stores this SHA in versions.json. At build time, the Dockerfile clones at the tag, then compares HEAD against the stored SHA. If they don't match, the build fails.

This protects against a specific threat: an upstream maintainer (or compromised account) could move a tag to point at a different commit. Without the SHA check, the Dockerfile would happily build whatever code the tag now points to. With the check, the build fails immediately, alerting operators to the discrepancy.
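
The check itself is a string comparison that fails closed. The real check lives in the Dockerfiles, so the Go function below is only an illustration of the logic; verifyCommit is a hypothetical name, and its first argument stands for the output of git rev-parse HEAD.

```go
package main

import (
	"fmt"
	"strings"
)

// verifyCommit (hypothetical) compares the checked-out HEAD with the pinned
// SHA and returns an error on any mismatch or missing pin.
func verifyCommit(head, expected string) error {
	head, expected = strings.TrimSpace(head), strings.TrimSpace(expected)
	if expected == "" {
		return fmt.Errorf("no pinned commit configured")
	}
	if !strings.EqualFold(head, expected) {
		return fmt.Errorf("commit mismatch: HEAD is %s, expected %s", head, expected)
	}
	return nil
}

func main() {
	pinned := "cba7aba0c98aae22720b21c3a023990a486cb6e0"
	fmt.Println(verifyCommit(pinned+"\n", pinned))                                // matching HEAD
	fmt.Println(verifyCommit("0000000000000000000000000000000000000000", pinned)) // moved tag
}
```

Treating an empty pin as an error is a design choice in this sketch: a missing SHA should stop the build rather than silently disable the check.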

Daily Automation with GitHub Actions

The entire update pipeline runs automatically via update-dependencies.yml:

on:
  schedule:
    - cron: '0 13 * * *'  # Daily at 1 PM UTC
  workflow_dispatch:

The workflow is straightforward: checkout the repo, build the Go updater, run it, and create a PR if versions changed:

flowchart TD
    A["Daily cron at 1 PM UTC"] --> B["Checkout repository"]
    B --> C["cd dependency_updater && go build"]
    C --> D["Run updater with --github-action"]
    D --> E{"Any versions updated?"}
    E -->|Yes| F["Updater writes TITLE and DESC<br/>to GITHUB_OUTPUT"]
    F --> G["peter-evans/create-pull-request<br/>creates PR with diff links"]
    E -->|No| H["Workflow exits cleanly"]

When running in GitHub Actions mode (--github-action true), the updater writes the commit title and description to GITHUB_OUTPUT (lines 385-407) using GitHub's multiline output syntax. The peter-evans/create-pull-request action then creates a PR with the updater's output:

- name: create pull request
  if: ${{ steps.run_dependency_updater.outputs.TITLE != '' }}
  uses: peter-evans/create-pull-request@271a8d0340265f705b14b6d32b9829c1cb33d45e
  with:
    title: ${{ steps.run_dependency_updater.outputs.TITLE }}
    body: "${{ steps.run_dependency_updater.outputs.DESC }}"
    branch: run-dependency-updater
    delete-branch: true

The PR title follows the pattern "chore: updated op-geth, op-node", and the body includes diff links for each updated dependency. The delete-branch: true setting removes the run-dependency-updater branch once it no longer has an open pull request, for example after a merge.

Notice the if condition: if the updater's TITLE output is empty, no PR is created. This happens when all dependencies are already at their latest versions.
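
The multiline output syntax mentioned above writes a value between two copies of a delimiter: a name<<DELIMITER line, the value, then the delimiter alone on its own line. The sketch below shows one way to produce such an entry; formatOutput and the EOF_DESC delimiter are hypothetical choices, not the updater's own.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// formatOutput (hypothetical) renders one multiline entry for GITHUB_OUTPUT.
// It conservatively rejects values that contain the delimiter anywhere.
func formatOutput(name, value, delimiter string) (string, error) {
	if strings.Contains(value, delimiter) {
		return "", fmt.Errorf("delimiter %q appears in value", delimiter)
	}
	return fmt.Sprintf("%s<<%s\n%s\n%s\n", name, delimiter, value, delimiter), nil
}

func main() {
	entry, err := formatOutput("DESC", "first line\nsecond line", "EOF_DESC")
	if err != nil {
		panic(err)
	}
	// Inside GitHub Actions, append to the file named by GITHUB_OUTPUT;
	// elsewhere, print to stdout so the format can be inspected.
	out := os.Stdout
	if path := os.Getenv("GITHUB_OUTPUT"); path != "" {
		f, err := os.OpenFile(path, os.O_APPEND|os.O_WRONLY|os.O_CREATE, 0o644)
		if err != nil {
			panic(err)
		}
		defer f.Close()
		out = f
	}
	fmt.Fprint(out, entry)
}
```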

The Full Supply Chain Pipeline

Stepping back, here's the complete supply chain security pipeline:

  1. Daily: GitHub Actions runs the dependency updater
  2. Updater: Queries GitHub API → finds latest versions → validates upgrades → captures commit SHAs → updates files → creates PR
  3. PR Review: Human reviews the diff links and approves
  4. Merge: Updated versions.json and versions.env land on main
  5. CI Build: Docker builds clone at pinned tags and verify commit hashes
  6. Runtime: Containers run verified binaries

Every link in this chain includes a verification step. The updater validates versions aren't downgrades. The Dockerfiles verify commit hashes match. The CI pipeline uses pinned action SHAs (not version tags) for its own supply chain security. And the whole system runs on a daily cadence, ensuring Base nodes stay current with upstream changes.

What's Next

The dependency updater creates PRs, but those PRs need to be validated before merging. In Part 5, we'll explore the CI/CD pipeline that builds Docker images for three execution clients across two CPU architectures, using a sophisticated build → upload digest → merge manifest pattern. We'll see how platform-specific optimizations like asm-keccak are conditionally applied, and how the pipeline uses pinned action SHAs and harden-runner for defense in depth.