System32 CI

Kata CI

Container-native continuous integration that runs every pipeline step in isolated containers. Hermetic builds, parallel execution, and content-addressable caching.

v0.1.0
Built with Go
Open Source

Overview

Kata CI is a container-native CI system that runs every pipeline step inside an isolated container. No shared state between steps, no flaky tests from environment drift, no "works on my machine" failures. Every build is hermetic and reproducible.

Pipelines are defined in a single YAML file (.kata-ci.yaml) at the root of your repo. Kata CI parses the dependency graph between steps and runs independent steps in parallel, with content-addressable caching for near-instant reruns when inputs haven't changed.

  • Hermetic Builds: Every step runs in a fresh container. No shared filesystem, no leaked state, fully reproducible.
  • Parallel Execution: Independent steps run concurrently. Kata CI builds the DAG from your dependencies automatically.
  • Smart Caching: Content-addressable cache keyed on inputs. Unchanged steps resolve instantly from cache.

Architecture

system-overview
              kata-cli          CLI & local runner
                 |
             kata-engine        DAG scheduler, step orchestrator
                 |
     +-----------+-----------+
     |           |           |
   runner      cache      triggers
(container) (CAS store)  (git hooks)

  • kata-cli — The command-line interface. Parses config, invokes the engine, streams logs.
  • kata-engine — The scheduler. Builds the step DAG, resolves cache hits, dispatches steps to runners.
  • runner — Container runtime adapter. Pulls images, mounts artifacts, executes commands, captures output.
  • cache — Content-addressable store. Keys are hashed from step inputs (image, commands, dependencies, mounted files).
  • triggers — Git-native triggers. Watches branches, tags, and PRs. Filters by path globs.

How It Works

  1. Parse: Read .kata-ci.yaml and build the step dependency graph (DAG).
  2. Cache Check: Hash each step's inputs. Skip steps with a cache hit.
  3. Schedule: Run independent steps in parallel. Respect dependency ordering.
  4. Execute: Spin up a container per step. Mount artifacts, run commands, capture output.
  5. Report: Stream logs, collect exit codes, update cache, report pass/fail status.
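
The Parse and Schedule steps can be illustrated with a small sketch that layers steps into groups that may run in parallel. This is not Kata CI's actual engine code: the Step type and the parallelGroups function are hypothetical names, and the layering uses a standard topological sort (Kahn's algorithm) as one plausible way to derive the "parallel groups" shown in the logs. It assumes step names are unique.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// Step is a hypothetical, simplified view of one pipeline step.
type Step struct {
	Name      string
	DependsOn []string
}

// parallelGroups layers steps so that every step lands in a later group
// than all of its dependencies. Steps inside one group do not depend on
// each other and could run concurrently. It returns an error for unknown
// dependencies and for dependency cycles.
func parallelGroups(steps []Step) ([][]string, error) {
	indegree := make(map[string]int, len(steps))
	dependents := make(map[string][]string, len(steps))
	for _, s := range steps {
		indegree[s.Name] = 0
	}
	for _, s := range steps {
		for _, dep := range s.DependsOn {
			if _, ok := indegree[dep]; !ok {
				return nil, fmt.Errorf("step %q depends on unknown step %q", s.Name, dep)
			}
			indegree[s.Name]++
			dependents[dep] = append(dependents[dep], s.Name)
		}
	}

	// Steps without dependencies form the first group.
	var ready []string
	for name, n := range indegree {
		if n == 0 {
			ready = append(ready, name)
		}
	}

	var groups [][]string
	placed := 0
	for len(ready) > 0 {
		sort.Strings(ready) // deterministic order within a group
		groups = append(groups, ready)
		placed += len(ready)
		var next []string
		for _, name := range ready {
			for _, d := range dependents[name] {
				indegree[d]--
				if indegree[d] == 0 {
					next = append(next, d)
				}
			}
		}
		ready = next
	}
	if placed != len(steps) {
		// Some steps never reached indegree 0, so they sit on a cycle.
		return nil, errors.New("dependency cycle detected")
	}
	return groups, nil
}

func main() {
	groups, err := parallelGroups([]Step{
		{Name: "build"},
		{Name: "test", DependsOn: []string{"build"}},
		{Name: "lint", DependsOn: []string{"build"}},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(groups)
}
```

For the three-step quickstart pipeline this yields two groups: build on its own, then lint and test together.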

Quickstart

Install Kata CI, add a pipeline config to your repo, and run it.

install
# Install via shell script
$ curl -fsSL https://get.kata.ci | sh
# Or with Go
$ go install github.com/system32-ai/kata-ci/cmd/kata@latest
# Verify installation
$ kata version
kata-ci v0.1.0 (go1.22, linux/amd64)

Create a .kata-ci.yaml in your project root:

.kata-ci.yaml yaml
pipeline: "my-app"

steps:
  - name: "build"
    image: "node:20-alpine"
    commands:
      - "npm ci"
      - "npm run build"
    artifacts:
      - "dist/"

  - name: "test"
    image: "node:20-alpine"
    depends_on: ["build"]
    commands:
      - "npm test"

  - name: "lint"
    image: "node:20-alpine"
    depends_on: ["build"]
    commands:
      - "npm run lint"
terminal
$ kata run
[PARSE] Pipeline "my-app": 3 steps, 2 parallel groups
[CACHE] build: MISS
[RUN] build: npm ci && npm run build
[PASS] build (2.1s)
[RUN] test, lint (parallel)
[PASS] lint (0.4s)
[PASS] test (1.2s)
[DONE] All 3 steps passed (3.7s total, 3.3s wall)
Tip: Run kata run again — the build step will resolve from cache instantly if source files haven't changed.

Pipeline Config

The pipeline config lives in .kata-ci.yaml at the root of your repository. It defines the pipeline name, global settings, and a list of steps.

.kata-ci.yaml yaml
pipeline: "my-service"

settings:
  timeout: "15m"           # Max pipeline duration
  max_parallel: 4          # Max concurrent steps
  fail_fast: true           # Stop on first failure
  cache_enabled: true       # Enable content-addressable cache

env:                         # Global env vars for all steps
  NODE_ENV: "ci"
  CI: "true"

steps:
  # ... step definitions

Field                   Type      Description
pipeline                string    Pipeline name. Used in logs and cache keys.
settings.timeout        duration  Max total pipeline duration. Default: 30m.
settings.max_parallel   int       Max steps to run concurrently. Default: CPU count.
settings.fail_fast      bool      Cancel remaining steps on first failure. Default: true.
settings.cache_enabled  bool      Enable the content-addressable cache. Default: true.
env                     map       Environment variables injected into every step.
steps                   list      Ordered list of step definitions.

Steps

Each step runs in its own container. Define the container image, commands to run, dependencies on other steps, and artifacts to pass between steps.

step definition yaml
steps:
  - name: "build"
    image: "golang:1.22-alpine"
    commands:
      - "go build -o bin/app ./cmd/app"
    artifacts:
      - "bin/"
    env:
      CGO_ENABLED: "0"
    timeout: "5m"

  - name: "unit-test"
    image: "golang:1.22-alpine"
    depends_on: ["build"]
    commands:
      - "go test -race -cover ./..."

  - name: "integration-test"
    image: "golang:1.22-alpine"
    depends_on: ["build"]
    services:
      - name: "postgres"
        image: "postgres:16-alpine"
        env:
          POSTGRES_PASSWORD: "test"
    commands:
      - "go test -tags=integration ./..."

  - name: "deploy"
    image: "alpine:3.19"
    depends_on: ["unit-test", "integration-test"]
    commands:
      - "./bin/app deploy --env staging"
    only:
      branches: ["main"]

Field       Type      Description
name        string    Unique step name. Referenced in depends_on.
image       string    Container image to run the step in.
commands    list      Shell commands to execute sequentially.
depends_on  list      Steps that must complete before this step runs.
artifacts   list      Paths to persist and pass to dependent steps.
services    list      Sidecar containers (databases, queues) available during the step.
env         map       Step-specific environment variables (merged with global env).
timeout     duration  Max duration for this step. Default: pipeline timeout.
only        object    Conditional execution. Filter by branches, tags, or paths.
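
The env and timeout rows describe fallbacks to pipeline-level values. A minimal sketch of how those fallbacks could be resolved follows. The function names mergeEnv and effectiveTimeout are hypothetical, and the rule that a step-level variable overrides a global one with the same name is an assumption; the reference above only says the two maps are merged.

```go
package main

import (
	"fmt"
	"time"
)

// mergeEnv returns a new map holding the global variables plus the
// step-specific ones. ASSUMPTION: on a name clash the step value wins.
func mergeEnv(global, step map[string]string) map[string]string {
	merged := make(map[string]string, len(global)+len(step))
	for k, v := range global {
		merged[k] = v
	}
	for k, v := range step {
		merged[k] = v
	}
	return merged
}

// effectiveTimeout parses the step timeout, falling back to the pipeline
// timeout when the step does not set one. Values use Go duration syntax
// such as "5m" or "15m", which matches the examples in this document.
func effectiveTimeout(stepTimeout, pipelineTimeout string) (time.Duration, error) {
	if stepTimeout == "" {
		stepTimeout = pipelineTimeout
	}
	return time.ParseDuration(stepTimeout)
}

func main() {
	env := mergeEnv(
		map[string]string{"NODE_ENV": "ci", "CI": "true"},
		map[string]string{"CGO_ENABLED": "0"},
	)
	fmt.Println(env)

	d, err := effectiveTimeout("", "15m")
	if err != nil {
		panic(err)
	}
	fmt.Println(d)
}
```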

Caching

Kata CI uses a content-addressable cache. Each step's cache key is computed from: the container image, commands, environment variables, dependency artifacts, and watched file hashes. If the key matches, the step is skipped and its artifacts are restored from cache.
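
To make the key computation concrete, here is a hedged sketch of hashing those inputs into a single digest. The StepInputs type and the cacheKey function are hypothetical, and SHA-256 is an assumption: this document does not say which hash function or encoding Kata CI uses. Map entries are sorted before hashing so the key does not depend on Go's map iteration order, while commands keep their order because they run sequentially.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// StepInputs is a hypothetical bundle of everything that feeds the key.
type StepInputs struct {
	Image      string
	Commands   []string
	Env        map[string]string
	FileHashes map[string]string // path -> content digest (artifacts, watched files)
}

// sortedKeys returns the map keys in a stable, sorted order.
func sortedKeys(m map[string]string) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

// cacheKey hashes all inputs into one hex digest.
func cacheKey(in StepInputs) string {
	h := sha256.New()
	write := func(label, value string) {
		// Length-prefix each value so adjacent fields cannot be confused
		// with one another (for example "ab"+"c" versus "a"+"bc").
		fmt.Fprintf(h, "%s:%d:%s\n", label, len(value), value)
	}
	write("image", in.Image)
	for _, c := range in.Commands {
		write("cmd", c)
	}
	for _, k := range sortedKeys(in.Env) {
		write("env-name", k)
		write("env-value", in.Env[k])
	}
	for _, p := range sortedKeys(in.FileHashes) {
		write("file-path", p)
		write("file-hash", in.FileHashes[p])
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	key := cacheKey(StepInputs{
		Image:      "node:20-alpine",
		Commands:   []string{"npm ci"},
		Env:        map[string]string{"NODE_ENV": "ci", "CI": "true"},
		FileHashes: map[string]string{"package-lock.json": "placeholder-digest"},
	})
	fmt.Println(key)
}
```

Changing any single input, such as adding a command or editing a watched file, produces a different key, which is what forces the step to rerun.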

You can also define explicit cache paths for dependencies like node_modules or Go module cache.

.kata-ci.yaml yaml
steps:
  - name: "install"
    image: "node:20-alpine"
    commands:
      - "npm ci"
    cache:
      paths:
        - "node_modules/"
      key_files:
        - "package-lock.json"
    artifacts:
      - "node_modules/"

Field            Description
cache.paths      Directories to cache between runs.
cache.key_files  Files whose content is hashed into the cache key. Changes invalidate the cache.

terminal — cache hit
$ kata run
[CACHE] install: HIT (key: a3f8c2d...)
[CACHE] build: HIT (key: 91b4e7f...)
[CACHE] lint: HIT (key: d5a1c09...)
[RUN] test: MISS — running
[PASS] test (1.1s)
[DONE] 4 steps (3 cached, 1 ran) — 1.1s wall time

Parallel Execution

Kata CI automatically parallelizes steps that don't depend on each other. The engine builds a directed acyclic graph (DAG) from your depends_on declarations and schedules independent steps concurrently.

Use settings.max_parallel to cap concurrency. Steps with no depends_on are eligible to run immediately.

build --+--> test --+--> deploy
        |           |
        +--> lint --+

In this example, test and lint run in parallel after build completes. deploy waits for both to finish.
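
The scheduling behavior described in this section can be sketched with goroutines and a counting semaphore. This is an illustration under stated assumptions, not the engine's real scheduler: Step, runAll, and recorder are hypothetical names, the graph is assumed to be acyclic with every dependency name defined, and failure handling (fail_fast) is left out.

```go
package main

import (
	"fmt"
	"sync"
)

// Step is a hypothetical step with a function standing in for its container run.
type Step struct {
	Name      string
	DependsOn []string
	Run       func()
}

// runAll runs every step once, only after all of its dependencies have
// finished, with at most maxParallel steps executing at the same time.
func runAll(steps []Step, maxParallel int) {
	if maxParallel < 1 {
		maxParallel = 1
	}
	done := make(map[string]chan struct{}, len(steps))
	for _, s := range steps {
		done[s.Name] = make(chan struct{})
	}
	slots := make(chan struct{}, maxParallel) // counting semaphore
	var wg sync.WaitGroup
	for _, s := range steps {
		wg.Add(1)
		go func(s Step) {
			defer wg.Done()
			// Wait for dependencies before taking a slot, so a waiting
			// step never blocks a step that is ready to run.
			for _, dep := range s.DependsOn {
				<-done[dep]
			}
			slots <- struct{}{}
			s.Run()
			<-slots
			close(done[s.Name])
		}(s)
	}
	wg.Wait()
}

// recorder collects step names in the order the steps ran; used by the demo.
type recorder struct {
	mu    sync.Mutex
	names []string
}

// step builds a Step whose Run function records its own name.
func (r *recorder) step(name string, deps ...string) Step {
	return Step{Name: name, DependsOn: deps, Run: func() {
		r.mu.Lock()
		defer r.mu.Unlock()
		r.names = append(r.names, name)
	}}
}

func main() {
	rec := &recorder{}
	runAll([]Step{
		rec.step("deploy", "test", "lint"),
		rec.step("test", "build"),
		rec.step("lint", "build"),
		rec.step("build"),
	}, 2)
	// build always comes first and deploy last; test and lint may swap.
	fmt.Println(rec.names)
}
```

Each step waits on its dependencies' channels rather than on a precomputed group, so a step starts as soon as its own inputs are ready, even if unrelated steps are still running.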

Secrets

Secrets are injected as environment variables at runtime and are never written to disk or included in cache keys. Define them in your environment or pass them via CLI flags.

.kata-ci.yaml yaml
steps:
  - name: "deploy"
    image: "alpine:3.19"
    secrets:
      - "DEPLOY_TOKEN"
      - "AWS_ACCESS_KEY_ID"
      - "AWS_SECRET_ACCESS_KEY"
    commands:
      - "./deploy.sh"
terminal
# Pass secrets via environment
$ DEPLOY_TOKEN=xxx kata run
# Or via CLI flag
$ kata run --secret DEPLOY_TOKEN=xxx
Secrets are masked in log output. If a secret value appears in stdout/stderr, Kata CI replaces it with ***.
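
As an illustration of the masking rule, the following sketch replaces secret values in a single log line. The maskSecrets function is a hypothetical name and this is not Kata CI's implementation; it assumes masking happens line by line, so a secret split across two lines or two output chunks would need extra buffering that is not shown here.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// maskSecrets replaces every occurrence of each secret value in line with "***".
func maskSecrets(line string, secrets []string) string {
	values := make([]string, 0, len(secrets))
	for _, s := range secrets {
		if s != "" { // an empty value would match at every position
			values = append(values, s)
		}
	}
	if len(values) == 0 {
		return line
	}
	// Try longer values first so that a secret containing another secret
	// is masked as a whole rather than leaving its tail visible.
	sort.Slice(values, func(i, j int) bool { return len(values[i]) > len(values[j]) })

	pairs := make([]string, 0, 2*len(values))
	for _, v := range values {
		pairs = append(pairs, v, "***")
	}
	// strings.Replacer scans the line once and compares old strings in
	// argument order, so replaced text is never rescanned.
	return strings.NewReplacer(pairs...).Replace(line)
}

func main() {
	fmt.Println(maskSecrets("deploying with token abc123", []string{"abc123"}))
}
```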

Triggers

Define when pipelines run. Triggers support branch filters, tag patterns, path globs, and PR events. Combine multiple triggers with AND/OR logic.

.kata-ci.yaml yaml
triggers:
  - event: "push"
    branches: ["main", "release/*"]
    paths:
      - "src/**"
      - "go.mod"

  - event: "pull_request"
    actions: ["opened", "synchronize"]

  - event: "tag"
    pattern: "v*"

Field     Description
event     Git event type: push, pull_request, tag, schedule.
branches  Branch name patterns (glob supported).
paths     Only trigger when matching file paths change.
actions   PR actions: opened, synchronize, closed.
pattern   Tag name pattern (glob).
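
A rough sketch of trigger evaluation follows, limited to the event and branches fields. It rests on two assumptions that this document does not confirm: entries in the triggers list are alternatives (a run starts if any one matches), and the fields inside one entry must all match. Trigger, Event, and shouldRun are hypothetical names. Branch globs are matched with Go's path.Match, where * does not cross a "/" separator; recursive patterns such as src/** would need a dedicated glob matcher and are not covered.

```go
package main

import (
	"fmt"
	"path"
)

// Trigger is a hypothetical, reduced form of one triggers entry.
type Trigger struct {
	Event    string
	Branches []string // glob patterns; empty means "any branch"
}

// Event is a hypothetical incoming Git event.
type Event struct {
	Type   string
	Branch string
}

// matchesAny reports whether value matches at least one glob pattern.
// Malformed patterns are treated as non-matching.
func matchesAny(patterns []string, value string) bool {
	for _, p := range patterns {
		if ok, err := path.Match(p, value); err == nil && ok {
			return true
		}
	}
	return false
}

// matches applies every filter of a single trigger (ASSUMED AND semantics).
func (t Trigger) matches(e Event) bool {
	if t.Event != e.Type {
		return false
	}
	return len(t.Branches) == 0 || matchesAny(t.Branches, e.Branch)
}

// shouldRun reports whether any trigger matches (ASSUMED OR semantics).
func shouldRun(triggers []Trigger, e Event) bool {
	for _, t := range triggers {
		if t.matches(e) {
			return true
		}
	}
	return false
}

func main() {
	triggers := []Trigger{
		{Event: "push", Branches: []string{"main", "release/*"}},
		{Event: "pull_request"},
	}
	fmt.Println(shouldRun(triggers, Event{Type: "push", Branch: "release/1.0"}))
	fmt.Println(shouldRun(triggers, Event{Type: "push", Branch: "feature/x"}))
}
```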

CLI Commands

Kata CI provides a minimal CLI for running, validating, and inspecting pipelines.

Command        Description
kata run       Execute the pipeline from .kata-ci.yaml. Use --step <name> to run a single step.
kata validate  Validate the pipeline config without executing. Checks YAML syntax, image refs, and DAG cycles.
kata graph     Print the step dependency graph. Use --format dot for Graphviz output.
kata cache     Manage the local cache. kata cache list, kata cache clear, kata cache stats.
kata logs      View logs from the last run. Use --step <name> to filter by step.
kata version   Print version, Go version, and platform info.

terminal
# Run the full pipeline
$ kata run
# Run a single step
$ kata run --step build
# Validate config
$ kata validate
Config valid. 4 steps, 0 cycles, 3 parallel groups
# View dependency graph
$ kata graph
build -> test, lint
test, lint -> deploy
# Cache stats
$ kata cache stats
Entries: 12 | Size: 48MB | Hit rate: 94%
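
For readers unfamiliar with Graphviz, the sketch below shows how a step graph could be rendered in the DOT language. The exact output of kata graph --format dot is not shown in this document, so this is only an illustration of the format; Step and toDOT are hypothetical names. Go's %q quoting is used for identifiers, which is adequate for plain step names but is not a full DOT escaper.

```go
package main

import (
	"fmt"
	"strings"
)

// Step is a hypothetical, simplified view of one pipeline step.
type Step struct {
	Name      string
	DependsOn []string
}

// toDOT renders the steps as a Graphviz digraph with one edge per
// dependency, pointing from the dependency to the dependent step.
func toDOT(pipeline string, steps []Step) string {
	var b strings.Builder
	fmt.Fprintf(&b, "digraph %q {\n", pipeline)
	for _, s := range steps {
		fmt.Fprintf(&b, "  %q;\n", s.Name) // declare the node even if it has no edges
		for _, dep := range s.DependsOn {
			fmt.Fprintf(&b, "  %q -> %q;\n", dep, s.Name)
		}
	}
	b.WriteString("}\n")
	return b.String()
}

func main() {
	fmt.Print(toDOT("my-app", []Step{
		{Name: "build"},
		{Name: "test", DependsOn: []string{"build"}},
		{Name: "lint", DependsOn: []string{"build"}},
	}))
}
```

The resulting text can be piped into the Graphviz dot tool (for example dot -Tsvg) to draw the graph.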

Core Concepts

Concept    Description
Pipeline   A named collection of steps defined in .kata-ci.yaml. Pipelines run on triggers or manually.
Step       A single unit of work that runs inside a container. Steps declare dependencies, commands, and artifacts.
DAG        Directed acyclic graph built from step dependencies. Determines execution order and parallelism.
Artifact   Files or directories produced by a step and consumed by dependent steps.
Cache Key  Content-addressable hash of step inputs. Used to skip unchanged steps.
Service    Sidecar container (e.g. database) available during a step's execution.
Trigger    Git event (push, PR, tag) that starts a pipeline run.

Roadmap

  • Remote Execution: Run pipelines on remote runners for faster builds with dedicated compute.
  • Matrix Builds: Run the same step across multiple configurations (OS, language version, etc.).
  • Web Dashboard: Browser UI for viewing pipeline history, logs, cache stats, and build trends.
  • Distributed Cache: Shared cache backend (S3, GCS) for team-wide cache hits across machines.
  • GitHub App: Native GitHub integration with PR status checks, commit status, and annotations.
  • Plugin System: Extend Kata CI with custom step types, reporters, and cache backends.