
Krane

Kubernetes cluster agent for deployment orchestration

Krane is a Kubernetes cluster agent that follows a pull-based model similar to the Kubernetes kubelet. It polls the control plane for state changes using a sequence-based synchronization protocol, applying changes to local Kubernetes resources. This architecture enables multi-cluster orchestration without requiring the control plane to track connected clients or push events.

Krane pulls desired state from ctrl and ensures the actual cluster state matches. It handles deployment and sentinel lifecycle operations (create, update, delete) by translating high-level state messages into Kubernetes resources.

Architecture

Pull-Based Model

Krane implements a streaming architecture where agents in each cluster connect to ctrl's ClusterService via WatchDeployments and WatchSentinels RPCs. These establish server-streaming connections where the control plane queries resource tables directly and streams state changes. Krane processes each state change, applies it to Kubernetes, and updates its version watermark. On reconnection, Krane sends its last-seen version to resume incrementally without missing events.

This model eliminates the need for the control plane to track connected clients in memory, simplifying horizontal scaling and removing a class of connection state bugs.

Version-Based Synchronization

The sync engine uses version numbers embedded in resource tables to track state changes. Every modification to deployments or sentinels updates a version column via the Restate VersioningService singleton, providing a globally unique, monotonically increasing version across all resources. Krane maintains a versionLastSeen watermark and requests changes after that version.

On fresh start (version=0), Krane receives the complete desired state as a bootstrap. After bootstrap, the stream switches to incremental mode, receiving only new changes as they occur.
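The watermark logic above can be sketched in a few lines of Go. This is a simplified illustration, not Krane's actual implementation: `StateChange` and `syncLoop` are hypothetical names, and the payload is a plain string standing in for the real state messages.

```go
package main

import "fmt"

// StateChange is a hypothetical, simplified stand-in for a streamed state
// message: a globally unique, monotonically increasing version plus a payload.
type StateChange struct {
	Version int64
	Payload string
}

// syncLoop sketches the watermark logic: a fresh agent (lastSeen == 0)
// treats the first changes it receives as bootstrap state, and after every
// applied change it advances the watermark so that a reconnect can resume
// incrementally without missing or re-applying events.
func syncLoop(lastSeen int64, stream []StateChange, apply func(StateChange)) int64 {
	for _, change := range stream {
		if change.Version <= lastSeen {
			continue // already applied; safe to skip on replay
		}
		apply(change)
		lastSeen = change.Version // advance the watermark
	}
	return lastSeen
}

func main() {
	var applied []string
	stream := []StateChange{
		{Version: 5, Payload: "bootstrap: dep-abc"},
		{Version: 7, Payload: "update: dep-abc replicas=3"},
	}
	last := syncLoop(0, stream, func(c StateChange) {
		applied = append(applied, c.Payload)
	})
	fmt.Println(last, applied)
}
```

Because versions are globally unique and monotonic, the agent never needs to diff state: "apply everything newer than my watermark" is sufficient.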

Why StatefulSets Instead of Deployments?

We use StatefulSets for stateless containers, which is unusual. The system expects each instance to have a stable DNS address that doesn't change when pods restart.

StatefulSets guarantee this. Each pod gets a predictable name (dep-abc-0) and DNS record (dep-abc-0.dep-abc.unkey.svc.cluster.local). The control plane tracks these addresses for service discovery and routing.
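The naming scheme above is mechanical, which is what makes it useful for service discovery: given a deployment ID and an ordinal, the address is fully determined. A small sketch (the `unkey` namespace and `podDNS` helper are illustrative):

```go
package main

import "fmt"

// podDNS builds the stable per-instance DNS name a StatefulSet pod gets
// via its headless Service, following the pattern shown above:
// <statefulset>-<ordinal>.<service>.<namespace>.svc.cluster.local
func podDNS(deploymentID, namespace string, ordinal int) string {
	return fmt.Sprintf("%s-%d.%s.%s.svc.cluster.local",
		deploymentID, ordinal, deploymentID, namespace)
}

func main() {
	fmt.Println(podDNS("dep-abc", "unkey", 0))
	// dep-abc-0.dep-abc.unkey.svc.cluster.local
}
```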

Standard Deployments use random pod names and changing DNS addresses. This works fine behind a load balancer, but our current architecture needs stable instance addressing.

This is a known design compromise. Future versions might move instance addressing to service meshes instead of requiring stable DNS.

Deployment Flow

```mermaid
sequenceDiagram
    autonumber
    participant User
    participant Ctrl as Control Plane
    participant DB as Database
    participant Krane as Krane Agent
    participant K8s as Kubernetes API
    participant Pod
    User->>Ctrl: Create deployment request
    Ctrl->>DB: Store desired state with version=N
    Note over Krane: WatchDeployments stream
    Krane->>Ctrl: WatchDeployments(version > lastSeen)
    Ctrl->>DB: Query deployment_topology
    Ctrl->>Krane: DeploymentState(version=N, ApplyDeployment)
    Krane->>K8s: Create/Update Service
    K8s->>Krane: Service ready
    Krane->>K8s: Create/Update StatefulSet
    K8s->>Krane: StatefulSet created
    K8s->>K8s: Schedule pods
    K8s->>Pod: Pull image & start container
    Pod->>Pod: Container running
    Note over Krane: versionLastSeen = N
```
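One detail worth noting in the flow: the Service is applied before the StatefulSet, so pod DNS records resolve as soon as pods start. A hedged sketch of that ordering (the `applyDeployment` helper and its function parameters are illustrative stubs, not Krane's real API):

```go
package main

import "fmt"

// applyDeployment sketches the apply ordering from the flow above: the
// headless Service first, then the StatefulSet. If the Service apply fails,
// the StatefulSet is never created.
func applyDeployment(applyService, applyStatefulSet func() error) error {
	if err := applyService(); err != nil {
		return fmt.Errorf("apply service: %w", err)
	}
	if err := applyStatefulSet(); err != nil {
		return fmt.Errorf("apply statefulset: %w", err)
	}
	return nil
}

func main() {
	var steps []string
	_ = applyDeployment(
		func() error { steps = append(steps, "service"); return nil },
		func() error { steps = append(steps, "statefulset"); return nil },
	)
	fmt.Println(steps) // [service statefulset]
}
```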

Kubernetes Backend

The Kubernetes backend runs inside a cluster with appropriate RBAC permissions. It uses in-cluster config to authenticate with the API server.

These permissions still need to be fine-tuned, but they work for now.

- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: [""]
  resources: ["pods", "services"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: [""]
  resources: ["configmaps", "secrets"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: ["apps"]
  resources: ["statefulsets"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

Resource Creation

Creating a deployment produces two Kubernetes resources:

Headless Service with ClusterIP: None and publishNotReadyAddresses: true for DNS-based discovery. Each pod gets a DNS record even before it's ready. The Service selector matches unkey.deployment.id.
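A minimal sketch of what such a headless Service might look like (names, namespace, and label values are illustrative, following the conventions described in this document):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: dep-abc
  namespace: unkey
  labels:
    unkey.managed.by: krane
    unkey.deployment.id: dep-abc
spec:
  clusterIP: None                  # headless: per-pod DNS records, no virtual IP
  publishNotReadyAddresses: true   # pods get DNS records before passing readiness
  selector:
    unkey.deployment.id: dep-abc
```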

StatefulSet with the specified replicas, CPU, and memory. Resource requests and limits are set to the same value for predictable scheduling. Image pull secrets are automatically added for Depot registry images. The restart policy is Always.

Resource limits are enforced at the pod level: exceeding the memory limit kills the pod, while exceeding the CPU limit throttles it.

RBAC Requirements

The Kubernetes backend requires specific RBAC permissions to function. Krane needs the ability to create, read, and delete StatefulSets in the apps API group; create, read, and delete Services in the core API group; and list and read Pods in the core API group to query status.

A typical RBAC configuration looks like this:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: krane
  namespace: unkey
rules:
  - apiGroups: ["apps"]
    resources: ["statefulsets"]
    verbs: ["create", "get", "list", "delete"]
  - apiGroups: [""]
    resources: ["services"]
    verbs: ["create", "get", "delete"]
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list"]

Without these permissions, Krane cannot manage deployments and will return permission denied errors.

Labels and Management

All resources created by Krane are labeled with unkey.managed.by=krane and unkey.deployment.id=\{deployment_id\}. These labels serve multiple purposes: they identify resources managed by Krane for filtering and querying, they enable automatic cleanup during eviction scans, and they prevent Krane from interfering with non-Krane resources in shared namespaces.

When querying deployments, Krane verifies the unkey.managed.by label matches krane. This prevents it from returning information about deployments created by other tools or controllers in the same namespace.
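The ownership check described above amounts to a single label comparison. A sketch (the `managedByKrane` helper is hypothetical; the map stands in for a Kubernetes object's metadata labels):

```go
package main

import "fmt"

// managedByKrane reports whether a resource belongs to Krane, based on the
// management label described above. Resources created by other tools or
// controllers in the same namespace fail this check and are ignored.
func managedByKrane(labels map[string]string) bool {
	return labels["unkey.managed.by"] == "krane"
}

func main() {
	ours := map[string]string{
		"unkey.managed.by":   "krane",
		"unkey.deployment.id": "dep-abc",
	}
	theirs := map[string]string{"app": "other-controller"}
	fmt.Println(managedByKrane(ours), managedByKrane(theirs)) // true false
}
```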

ClusterService API

The control plane exposes a ClusterService with these key RPCs:

WatchDeployments and WatchSentinels establish server-streaming connections for receiving state changes. Krane sends its region and last-seen version; the control plane streams bootstrap state (if version=0) followed by incremental changes by querying resource tables directly.

GetDesiredDeploymentState and GetDesiredSentinelState return the current desired state for a specific resource. Used for on-demand reconciliation when Kubernetes reports unexpected state.

ReportDeploymentStatus and ReportSentinelStatus receive status updates from Krane about actual Kubernetes state (pod running, pod failed, etc.).

State Change Distribution

When deployment changes occur, the control plane stores the desired state in the database with an updated version number. Krane instances watching that region receive the change on their stream. Each Krane instance independently applies changes to its local cluster.

Multi-Region Support

Resource tables include a region column, so each Krane instance only receives changes relevant to its cluster. The control plane doesn't need to know which Krane instances exist or are connected; it simply writes changes to the database, and any Krane watching that region will receive them.
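Conceptually, the control plane's query combines the region column with the version watermark. A simplified sketch (the `Row` struct and `changesFor` function are illustrative stand-ins for the real resource-table query):

```go
package main

import "fmt"

// Row is a hypothetical, simplified resource-table row carrying the
// region and version columns described above.
type Row struct {
	Region  string
	Version int64
	ID      string
}

// changesFor sketches the control plane's filter: only rows in the
// requesting agent's region, newer than its last-seen version.
func changesFor(rows []Row, region string, lastSeen int64) []Row {
	var out []Row
	for _, r := range rows {
		if r.Region == region && r.Version > lastSeen {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	rows := []Row{
		{Region: "us-east-1", Version: 10, ID: "dep-a"},
		{Region: "eu-west-1", Version: 11, ID: "dep-b"},
		{Region: "us-east-1", Version: 12, ID: "dep-c"},
	}
	fmt.Println(changesFor(rows, "us-east-1", 10)) // only dep-c
}
```

Because the filter is just a database predicate, the control plane stays stateless with respect to agents: any number of Krane instances can watch the same region without coordination.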
