Schema-Driven Design: Kong's Database Layer from Validation to Query

Advanced

Prerequisites

  • Articles 1-4 (full architecture and plugin understanding)
  • Understanding of Lua metatables and OOP patterns
  • Basic knowledge of database concepts (CRUD, migrations, schemas)

One of Kong's most elegant architectural decisions is that a single schema definition drives everything: database table creation, input validation, API endpoint generation, cache key computation, and foreign key enforcement. Add a new entity, and the system generates its CRUD endpoints, database queries, and validation logic with no hand-written boilerplate.

This article explores the schema system from the metaschema at the top, through concrete entity definitions, down to the DAO layer and database strategies — and shows how the same schemas power the Admin API.

The MetaSchema: Schema of Schemas

How do you validate a schema? With another schema. kong/db/schema/metaschema.lua defines the rules for writing entity schemas — the allowed field types, validators, transformations, and constraints.

The metaschema declares the vocabulary of field attributes at lines 47–60:

local validators = {
  { between = { type = "array", elements = { type = "number" }, len_eq = 2 } },
  { eq = { type = "any" } },
  { ne = { type = "any" } },
  { gt = { type = "number" } },
  { len_eq = { type = "integer" } },
  { match = { type = "string" } },
  { starts_with = { type = "string" } },
  -- ...
}

Each validator maps to a validation function in the Schema class. When you write { type = "string", match = "^[a-z]+$" } in an entity schema, the metaschema confirms that match is a recognized field attribute whose value is a string. The Schema class later runs the corresponding match validator function against incoming field values.
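
The attribute-to-function mapping can be sketched in plain Lua. This is a minimal illustration, not Kong's Schema class: the validators table, validate_field, and name_field are hypothetical names, and the type check relies on Lua's built-in type(), so it only covers primitive types such as "string" and "number".

```lua
-- Hypothetical, simplified validator table. The keys mirror attribute names
-- from the metaschema excerpt above; the function bodies are illustrative.
local validators = {
  match = function(value, pattern)
    return type(value) == "string" and value:find(pattern) ~= nil
  end,
  starts_with = function(value, prefix)
    return type(value) == "string" and value:sub(1, #prefix) == prefix
  end,
  len_eq = function(value, n)
    return #value == n
  end,
  between = function(value, range)
    return value >= range[1] and value <= range[2]
  end,
}

-- Check the declared type, then run every attribute of `field` that names
-- a known validator. Attributes without a validator (such as `type`) are skipped.
local function validate_field(field, value)
  if type(value) ~= field.type then
    return false, "expected a " .. field.type
  end
  for attr, arg in pairs(field) do
    local fn = validators[attr]
    if fn and not fn(value, arg) then
      return false, "failed validator '" .. attr .. "'"
    end
  end
  return true
end

local name_field = { type = "string", match = "^[a-z]+$" }
print(validate_field(name_field, "kong"))   --> true
print(validate_field(name_field, "Kong42")) --> false   failed validator 'match'
```

Keeping validators in a table keyed by attribute name means a new validator is one more table entry, and the field definitions stay purely declarative.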

The metaschema also defines the MetaSubSchema variant used for plugin configuration schemas. The distinction matters: entity schemas define database tables with primary keys and foreign keys, while subschemas define the config field within the plugins entity.

classDiagram
    class MetaSchema {
        +validate(schema_def)
        +fields: validators, types, constraints
    }
    class MetaSubSchema {
        +validate(plugin_schema)
        +plugin-specific rules
    }
    class EntitySchema {
        +name: string
        +fields: field_defs[]
        +primary_key: string[]
        +entity_checks: check[]
    }
    class PluginSubSchema {
        +name: string
        +fields: config_fields[]
    }
    MetaSchema --> EntitySchema : validates
    MetaSubSchema --> PluginSubSchema : validates
    MetaSchema --> MetaSubSchema : derives from
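
To make the "schema of schemas" idea concrete, here is a rough sketch of a checker that validates a schema definition rather than an entity. It assumes a tiny, hypothetical vocabulary (field_attributes) and a hypothetical validate_schema function; Kong's real metaschema covers far more attributes and nested rules.

```lua
-- Hypothetical mini-metaschema: each allowed field attribute and the Lua
-- type its argument must have.
local field_attributes = {
  type     = "string",
  required = "boolean",
  match    = "string",
  len_eq   = "number",
  between  = "table",
}

-- Validate a schema definition (not an entity) against the vocabulary above.
local function validate_schema(schema_def)
  if type(schema_def.name) ~= "string" then
    return false, "schema needs a string 'name'"
  end
  for _, entry in ipairs(schema_def.fields or {}) do
    -- Each entry is a single-key table: { field_name = field_definition }.
    local field_name, field_def = next(entry)
    for attr, arg in pairs(field_def) do
      local expected = field_attributes[attr]
      if not expected then
        return false, field_name .. ": unknown attribute '" .. attr .. "'"
      end
      if type(arg) ~= expected then
        return false, field_name .. ": '" .. attr .. "' must be a " .. expected
      end
    end
  end
  return true
end

local good = { name = "routes", fields = { { name = { type = "string", required = true } } } }
local typo = { name = "routes", fields = { { name = { type = "string", mtach = "^[a-z]+$" } } } }
print(validate_schema(good)) --> true
print(validate_schema(typo)) --> false   name: unknown attribute 'mtach'
```

Catching a misspelled attribute when the schema is loaded is the point: the mistake surfaces at startup instead of silently disabling a validator.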

Entity Schema Definitions

A concrete entity schema demonstrates the system in action. kong/db/schema/entities/routes.lua defines the routes entity with field definitions, foreign keys, and entity-level checks:

local entity_checks = {
  { conditional = {
    if_field = "protocols",
    if_match = { elements = { type = "string", not_one_of = { "grpcs", "https", "tls", "tls_passthrough" }}},
    then_field = "snis",
    then_match = { len_eq = 0 },
    then_err = "'snis' can only be set when 'protocols' is 'grpcs', 'https', 'tls' or 'tls_passthrough'",
  }},
}

Entity checks enforce cross-field constraints that can't be expressed on individual fields. In this case, SNIs (Server Name Indications) only make sense for TLS-based protocols — setting SNIs on an HTTP route is an error.
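
The logic of that conditional can be written out by hand to show what the declarative form encodes. The sketch below is a hard-coded stand-in for this one rule; TLS_PROTOCOLS and check_snis are hypothetical names, and Kong evaluates the rule generically from the entity_checks table instead.

```lua
-- Protocols for which SNIs are meaningful (taken from the check above).
local TLS_PROTOCOLS = { grpcs = true, https = true, tls = true, tls_passthrough = true }

local function check_snis(route)
  -- if_match: every listed protocol is outside the TLS set.
  local no_tls = true
  for _, protocol in ipairs(route.protocols or {}) do
    if TLS_PROTOCOLS[protocol] then
      no_tls = false
      break
    end
  end
  -- then_match: when the condition holds, snis must be empty (len_eq = 0).
  if no_tls and #(route.snis or {}) > 0 then
    return false, "'snis' can only be set when 'protocols' is 'grpcs', 'https', 'tls' or 'tls_passthrough'"
  end
  return true
end

print(check_snis({ protocols = { "https" }, snis = { "example.com" } })) --> true
print((check_snis({ protocols = { "http" }, snis = { "example.com" } }))) --> false
```

Note that a route listing both http and https passes: the condition only fires when none of the protocols is TLS-based.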

The loading order in constants.lua matters. The CORE_ENTITIES list at lines 137–155 specifies: workspaces, consumers, certificates, services, routes, snis, upstreams, targets, plugins, tags, ... — dependencies are listed before dependents. Services must be loaded before Routes because Routes have a foreign key to Services.

Tip: The CORE_ENTITIES table doubles as both an array and a set. After the array is populated, a loop at lines 307–309 sets CORE_ENTITIES[name] = true for every entry (for example, CORE_ENTITIES["routes"] = true), which gives O(1) membership checks. This pattern appears throughout Kong's Lua code.
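
A minimal sketch of the array-plus-set pattern, using an abbreviated entity list (the real list is longer):

```lua
-- Abbreviated list for illustration; order still encodes load order.
local CORE_ENTITIES = { "workspaces", "consumers", "certificates", "services", "routes" }

-- Add a string-keyed entry for each name. ipairs only walks the integer
-- keys, so the array part and its length are unaffected.
for _, name in ipairs(CORE_ENTITIES) do
  CORE_ENTITIES[name] = true
end

print(#CORE_ENTITIES)         --> 5
print(CORE_ENTITIES.routes)   --> true
print(CORE_ENTITIES.plugins)  --> nil (not in this abbreviated list)
```

One table therefore answers both "in what order do I load these?" (ipairs) and "is this a core entity?" (direct lookup).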

DAO Auto-Generation and the DB Module

kong/db/init.lua is where schemas become functional. The DB.new() constructor iterates over core entity schemas, validates each against the metaschema, creates Entity objects, and instantiates DAOs:

for _, entity_name in ipairs(constants.CORE_ENTITIES) do
  local entity_schema = require("kong.db.schema.entities." .. entity_name)
  local ok, err_t = MetaSchema:validate(entity_schema)
  -- ...
  local entity, err = Entity.new(entity_schema)
  schemas[entity_name] = entity
end

Then, at lines 108–118, it pairs each schema with its strategy and builds the DAO:

for _, schema in pairs(schemas) do
  local strategy = strategies[schema.name]
  daos[schema.name] = DAO.new(self, schema, strategy, errors)
end

The __index metamethod at lines 31–33 makes kong.db.routes resolve to self.daos['routes']:

DB.__index = function(self, k)
  return DB[k] or rawget(self, "daos")[k]
end

This means kong.db.routes:select({ id = "..." }) transparently delegates to the Routes DAO, which uses the Routes schema for validation and the Routes strategy for database access.
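
The lookup order in that metamethod is easier to see in isolation. The sketch below reuses the __index function quoted above but wires it to a stub DAO. The DB.new signature, the entity_names helper, and routes_dao are hypothetical stand-ins for illustration, not Kong's constructor.

```lua
local DB = {}

-- Same shape as the metamethod above: methods on DB win, then DAOs by name.
DB.__index = function(self, k)
  return DB[k] or rawget(self, "daos")[k]
end

-- Hypothetical constructor that takes ready-made DAOs.
function DB.new(daos)
  return setmetatable({ daos = daos }, DB)
end

-- Hypothetical helper method, here to show that DB methods still resolve.
function DB:entity_names()
  local names = {}
  for name in pairs(self.daos) do
    names[#names + 1] = name
  end
  table.sort(names)
  return names
end

-- Stub DAO standing in for the generated Routes DAO.
local routes_dao = {
  select = function(self, pk)
    return { id = pk.id, name = "stub-route" }
  end,
}

local db = DB.new({ routes = routes_dao })
local route = db.routes:select({ id = "r1" })
print(route.name)             --> stub-route
print(db:entity_names()[1])   --> routes
print(db.no_such_entity)      --> nil
```

Using rawget for the daos lookup matters: going through self.daos inside __index would be safe here because daos is a real field, but rawget makes it explicit that the metamethod never re-enters itself.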

flowchart TD
    A[DB.new] --> B[Load each entity schema]
    B --> C[MetaSchema:validate]
    C --> D[Entity.new - create schema object]
    D --> E[Strategy.new - create DB strategy]
    E --> F[DAO.new - create DAO]
    F --> G["kong.db.routes = DAO(routes_schema, postgres_strategy)"]
    G --> H["kong.db.routes:select() → schema.validate → strategy.select → SQL"]

Database Strategies: Postgres vs DB-less (LMDB)

The DAO layer is strategy-agnostic. Each DAO delegates to a strategy object that implements the actual database operations. Kong ships with two strategies:

Postgres (kong/db/strategies/postgres/): the traditional strategy generates SQL queries from schema definitions. The connector talks to the database through pgmoon, a cosocket-based Postgres driver, and returns connections to a keepalive pool with setkeepalive() so they can be reused across requests.

Off/LMDB (kong/db/strategies/off/): used in DB-less and Data Plane modes. Instead of querying a remote database, this strategy reads from LMDB (Lightning Memory-Mapped Database), an embedded key-value store. Reads go through a memory-mapped file with zero-copy semantics, and writes happen only when a new configuration is loaded, which suits a data plane that receives configuration from a control plane and needs fast local lookups.
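
The strategy split can be sketched with an in-memory backend standing in for either Postgres or LMDB. Everything here is hypothetical (new_memory_strategy, the two-argument DAO.new, the id check); the point is only that the DAO's code does not change when the strategy does.

```lua
-- Hypothetical in-memory strategy exposing the operations the DAO calls.
local function new_memory_strategy()
  local rows = {}
  return {
    insert = function(_, entity)
      rows[entity.id] = entity
      return entity
    end,
    select = function(_, pk)
      return rows[pk.id]
    end,
  }
end

local DAO = {}
DAO.__index = DAO

-- Simplified constructor: the real DAO.new takes more arguments.
function DAO.new(schema, strategy)
  return setmetatable({ schema = schema, strategy = strategy }, DAO)
end

function DAO:insert(entity)
  -- Stand-in for schema validation before touching storage.
  if type(entity.id) ~= "string" then
    return nil, self.schema.name .. ": 'id' must be a string"
  end
  return self.strategy:insert(entity)
end

function DAO:select(pk)
  return self.strategy:select(pk)
end

local routes = DAO.new({ name = "routes" }, new_memory_strategy())
routes:insert({ id = "r1", paths = { "/v1" } })
print(routes:select({ id = "r1" }).paths[1]) --> /v1
print(routes:insert({ paths = { "/v2" } }))  --> nil   routes: 'id' must be a string
```

Swapping in a different table with the same insert and select functions is all it takes to change the backend, which is the property the two shipped strategies rely on.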

flowchart LR
    subgraph "Traditional Mode"
        A1[DAO] --> B1[Postgres Strategy]
        B1 --> C1[(PostgreSQL)]
    end
    subgraph "DB-less / Data Plane"
        A2[DAO] --> B2[Off Strategy]
        B2 --> C2[(LMDB)]
    end
    D[Admin API / Plugin] --> A1
    D --> A2

The ENTITY_CACHE_STORE mapping in constants.lua (lines 156–174) determines which shared-memory cache holds each entity type. Core routing entities (routes, services, plugins, upstreams, targets) go to core_cache, while less critical entities (consumers, tags) go to the regular cache. Because the regular cache is also where plugins keep their own cached data, the split keeps routing data from being evicted when plugin caching fills that cache.
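
A lookup table of this kind might look like the following sketch. The entries follow the grouping described above; the fallback to "cache" for unlisted entities and the cache_store_for helper are assumptions of this example.

```lua
local ENTITY_CACHE_STORE = setmetatable({
  routes    = "core_cache",
  services  = "core_cache",
  plugins   = "core_cache",
  upstreams = "core_cache",
  targets   = "core_cache",
  consumers = "cache",
  tags      = "cache",
}, {
  -- Assumed default: entities not listed above use the regular cache.
  __index = function()
    return "cache"
  end,
})

-- Hypothetical helper that a caller would use to pick a cache by entity name.
local function cache_store_for(entity_name)
  return ENTITY_CACHE_STORE[entity_name]
end

print(cache_store_for("routes"))        --> core_cache
print(cache_store_for("consumers"))     --> cache
print(cache_store_for("custom_entity")) --> cache
```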

Declarative Config and DB-less Mode

In DB-less mode, Kong reads its entire configuration from a YAML/JSON file rather than a database. The kong/db/declarative/init.lua module manages this:

function _M.new_config(kong_config, partial)
  local schema, err = declarative_config.load(
    kong_config.loaded_plugins,
    kong_config.loaded_vaults
  )
  local self = { schema = schema, partial = partial }
  return setmetatable(self, _MT)
end

The declarative config schema is dynamically generated based on the loaded plugins — it includes fields for every core entity plus plugin-specific custom entities. The loading pipeline:

  1. Parse YAML/JSON into Lua tables
  2. Validate against the generated schema (using the same Schema class as database validation)
  3. Flatten into entity records keyed by entity type
  4. Write to LMDB in a single transaction
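
Step 3 is the least obvious of the four, so here is a rough sketch of what flattening means. It assumes every service already has an id and handles only services and routes; the flatten function is hypothetical and much simpler than Kong's declarative loader.

```lua
-- Turn routes nested under a service into top-level records that carry a
-- foreign key back to the service, grouped by entity type.
local function flatten(config)
  local records = { services = {}, routes = {} }

  for _, service in ipairs(config.services or {}) do
    -- Copy the service without its nested routes.
    local flat_service = {}
    for k, v in pairs(service) do
      if k ~= "routes" then
        flat_service[k] = v
      end
    end
    records.services[#records.services + 1] = flat_service

    -- Promote each nested route and record which service owns it.
    for _, route in ipairs(service.routes or {}) do
      local flat_route = { service = { id = service.id } }
      for k, v in pairs(route) do
        flat_route[k] = v
      end
      records.routes[#records.routes + 1] = flat_route
    end
  end

  -- Routes declared at the top level are already flat.
  for _, route in ipairs(config.routes or {}) do
    records.routes[#records.routes + 1] = route
  end

  return records
end

local config = {
  services = {
    { id = "s1", name = "api", routes = { { id = "r1", paths = { "/v1" } } } },
  },
}
local records = flatten(config)
print(#records.services, #records.routes)  --> 1   1
print(records.routes[1].service.id)        --> s1
```

After flattening, each record looks like a database row, which is why the same DAO and schema code can serve both the Postgres and the DB-less paths.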

When a Data Plane receives configuration from a Control Plane (as we'll explore in Part 6), it arrives as a serialized declarative config payload. The same load_into_cache_with_events function processes both file-based and network-received configurations, rebuilding the router and plugins iterator afterward.

flowchart TD
    A["kong.yml (YAML/JSON)"] --> B[Parse to Lua tables]
    B --> C["Validate against declarative schema"]
    C --> D{Valid?}
    D -->|No| E[Error with path to invalid field]
    D -->|Yes| F[Flatten into entity records]
    F --> G[Write to LMDB transaction]
    G --> H[Rebuild router]
    H --> I[Rebuild plugins iterator]

Admin API: Schema-Driven Endpoint Generation

The Admin API in kong/api/init.lua auto-generates CRUD endpoints from entity schemas. Core routes are loaded first, then entity-based endpoints are generated:

-- Load core routes
for _, v in ipairs({"kong", "health", "cache", "config", "debug"}) do
  local routes = require("kong.api.routes." .. v)
  api_helpers.attach_routes(app, routes)
end

The kong/api/endpoints.lua module provides the auto-generation machinery. For each entity schema, it generates:

  • GET /routes — List all routes (paginated)
  • POST /routes — Create a route
  • GET /routes/:routes — Get a specific route
  • PATCH /routes/:routes — Update a route
  • PUT /routes/:routes — Upsert a route
  • DELETE /routes/:routes — Delete a route

Foreign key relationships generate nested routes automatically. Since Routes have a foreign key to Services, the system generates GET /services/:services/routes to list routes for a specific service.
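
The derivation of paths from a schema can be illustrated with a short sketch. It assumes a hypothetical foreign_keys list on the schema (Kong expresses foreign keys as fields rather than a separate list) and a hypothetical generate_endpoints function that returns route strings instead of registering handlers.

```lua
local function generate_endpoints(schema)
  local name = schema.name
  local collection = "/" .. name
  local item = collection .. "/:" .. name

  local endpoints = {
    "GET " .. collection,
    "POST " .. collection,
    "GET " .. item,
    "PATCH " .. item,
    "PUT " .. item,
    "DELETE " .. item,
  }

  -- One nested listing endpoint per referenced parent entity.
  for _, parent in ipairs(schema.foreign_keys or {}) do
    endpoints[#endpoints + 1] = "GET /" .. parent .. "/:" .. parent .. collection
  end

  return endpoints
end

for _, endpoint in ipairs(generate_endpoints({ name = "routes", foreign_keys = { "services" } })) do
  print(endpoint)
end
```

Running it prints the six endpoints listed above followed by GET /services/:services/routes.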

Plugins can extend the Admin API via their api.lua module. The customize_routes function at line 58 allows plugin-defined endpoints to override or wrap auto-generated ones, with the original function passed as a parent parameter.
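
The wrapping idea can be sketched as follows. This is not the signature of Kong's customize_routes: the customize function, the route-key format, and the single request argument are simplifications chosen to keep the example self-contained.

```lua
-- Merge plugin-defined handlers over generated ones. Each custom handler
-- receives the generated handler for the same route as `parent`.
local function customize(generated, custom)
  local result = {}
  for route, handler in pairs(generated) do
    result[route] = handler
  end
  for route, handler in pairs(custom) do
    local parent = generated[route]
    result[route] = function(request)
      return handler(request, parent)
    end
  end
  return result
end

local generated = {
  ["GET /routes"] = function(request) return "list of routes" end,
  ["POST /routes"] = function(request) return "created" end,
}

local custom = {
  -- Wrap the generated handler instead of replacing it outright.
  ["GET /routes"] = function(request, parent)
    return "audited: " .. parent(request)
  end,
}

local handlers = customize(generated, custom)
print(handlers["GET /routes"]({}))   --> audited: list of routes
print(handlers["POST /routes"]({}))  --> created
```

Passing the original handler in, rather than exposing it globally, lets a plugin choose between decorating the default behavior and ignoring it entirely.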

Tip: The error codes in endpoints.lua at lines 25–40 map internal error types to HTTP status codes. UNIQUE_VIOLATION → 409, NOT_FOUND → 404, SCHEMA_VIOLATION → 400. This mapping is the single source of truth for Admin API error responses.
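
A mapping like that reduces to a table lookup. In this sketch the error is identified by a string code, which is a simplification, and the 500 fallback for unmapped errors is an assumption; ERROR_TO_STATUS and status_for are hypothetical names.

```lua
-- The three mappings named above.
local ERROR_TO_STATUS = {
  UNIQUE_VIOLATION = 409,
  NOT_FOUND        = 404,
  SCHEMA_VIOLATION = 400,
}

-- Resolve an error table to an HTTP status, assuming 500 for anything unmapped.
local function status_for(err_t)
  return ERROR_TO_STATUS[err_t.code] or 500
end

print(status_for({ code = "UNIQUE_VIOLATION" })) --> 409
print(status_for({ code = "SOMETHING_ELSE" }))   --> 500
```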

The Schema Cascade

The beauty of this design is the cascade. A single schema definition produces:

| Artifact | Source |
|---|---|
| Database table (DDL) | Schema fields + types |
| Input validation | Schema validators + entity_checks |
| Cache key computation | Schema cache_key field |
| Admin API endpoints | Schema name + foreign keys |
| Declarative config format | Schema fields (YAML structure) |
| Plugin subschemas | MetaSubSchema validation |

When Kong adds a new entity (like filter_chains for Wasm), a single schema file cascades into all of these artifacts automatically. It's a compelling example of how centralized data modeling reduces inconsistency across system boundaries.

In Part 6, we'll see how this data layer connects to Kong's distributed architecture — how Control Planes serialize configuration and push it to Data Planes that apply it through the declarative config pipeline.