Deployment

Sentinel runs as a Kubernetes Deployment managed by Krane. For details on how Krane manages sentinel state, see the Krane documentation.

When sentinels are created

Sentinels are created as part of the deployment workflow in the control plane worker (svc/ctrl/worker/deploy/deploy_handler.go). During the deploying phase, the ensureSentinels step checks whether each target region already has a sentinel for the environment. If a region has no sentinel, the workflow inserts one into the database and writes a deployment_changes outbox entry in the same transaction. Once the outbox entry exists, Krane picks it up via the WatchDeploymentChanges stream and applies the corresponding Kubernetes resources.

Convergence tracking

After creating a sentinel, the deploy workflow calls SentinelService.Deploy() (svc/ctrl/worker/sentinel/) and blocks until the sentinel has fully converged in Kubernetes. This ensures traffic can route to the sentinel before the deployment proceeds to domain assignment. SentinelService is a Restate virtual object keyed by sentinel ID, so calls for the same sentinel serialise automatically. A Deploy call:

Reads the current sentinel row and merges the request fields over it (zero values mean “keep current”). If nothing changed and the sentinel is already healthy on the desired image, the call returns READY immediately.
Writes the new config plus a deployment_changes outbox entry in a single transaction and sets deploy_status = progressing. Krane picks up the outbox entry and applies the update to Kubernetes.
Creates a Restate awakeable and persists its ID under the notify_ready_awakeable state key, then suspends until either the awakeable resolves or a 10-minute timeout fires.
The awakeable is resolved by ReportSentinelStatus on the control-plane cluster service: when Krane reports a sentinel whose running_image matches image and whose health is healthy, the RPC calls SentinelService.NotifyReady (a shared handler on the same virtual object), which resolves the stored awakeable ID and unblocks Deploy.
On resolve, deploy_status is set to ready and the call returns. On timeout, deploy_status is set to failed. The single-sentinel path does not roll itself back: a failed sentinel stays in whatever state Kubernetes ended up in, and fleet-level rollback is the operator’s call via SentinelRolloutService.RollbackAll.

On every Deployment status change, Krane reports the following fields. The control plane writes them into the sentinels row, so anything that needs to reason about rollout progress can do so from the database:

Field	Purpose
`available_replicas`	Pods that have been ready for at least MinReadySeconds
`updated_replicas`	Pods running the current pod template spec
`ready_replicas`	Pods passing readiness probes
`observed_generation`	Last generation processed by Kubernetes

The sentinel’s deploy_status column is an enum with four values: idle, progressing, ready, failed. Fleet rollouts have their own lifecycle state (including rolling_back) which is stored in Restate K/V, not in the sentinel row — see the next section.

Fleet-wide image rollouts

Changing the sentinel image for a single sentinel is a SentinelService.Deploy call. Rolling the image across the whole fleet is a separate service: SentinelRolloutService (svc/ctrl/worker/sentinel/rollout_*.go). The deploy workflow never does this implicitly — it only creates sentinels for regions that don’t have one yet and never auto-upgrades existing sentinels. Fleet rollouts are initiated explicitly (e.g. operator tooling). SentinelRolloutService is a Restate virtual object keyed by the literal string singleton, which serialises all rollout operations globally. Its state lives in the Restate K/V store under the rollout key and tracks the full lifecycle so Resume, Cancel, and RollbackAll can pick up where the previous call left off.

Rollout lifecycle

Rollout state is one of:

State	Meaning
`idle`	No rollout in flight (also the effective state before the first `Rollout` call).
`in_progress`	A rollout is executing waves.
`paused`	A wave had at least one failing sentinel. Waiting for the operator to call `Resume`, `Cancel`, or `RollbackAll`.
`rolling_back`	`RollbackAll` is reverting succeeded sentinels to their previous image.
`cancelled`	Operator called `Cancel`, or a rollback finished.
`completed`	All waves finished with no failures.

Waves

Rollout(image, wavePercentages?, slackWebhookUrl?) starts a rollout:

Lists all running sentinels (paged) and their current image. Sentinels already on the target image are filtered out; the rest are captured into PreviousImages so a later rollback knows what to revert to.
Splits the remaining sentinel IDs into waves by cumulative percentage. Defaults to [1, 5, 25, 50, 100] — e.g. with 100 sentinels, waves of [1, 4, 20, 25, 50]. Callers can override via wave_percentages (computeWaves in rollout_state.go).
Persists the full rolloutState (image, waves, previous images, Slack webhook, counters) into Restate state and starts executing waves.

Each wave fans out SentinelService.Deploy calls via RequestFuture and then collects all responses:

If every sentinel in the wave reports READY, the wave is recorded in SucceededIDs and the next wave starts.
If any sentinel fails or errors, the failed IDs are recorded in FailedIDs, the rollout transitions to paused, and the call returns. Sentinels that succeeded within the paused wave stay in SucceededIDs.

Resume, Cancel, RollbackAll

Operator handlers, normally called once a rollout has paused:

Resume — only valid from paused. Advances CurrentWave by one (skipping the wave that failed) and re-enters executeWaves. Sentinels that failed in the skipped wave are not retried; they stay in FailedIDs.
Cancel — valid from in_progress or paused. Flips the state to cancelled. Succeeded sentinels keep the new image; failed ones stay where they are. This is the “live with it” exit.
RollbackAll — valid from paused or cancelled. Fans out SentinelService.Deploy back to the per-sentinel entry in PreviousImages for everything in SucceededIDs. Failed sentinels are not touched — they already never made it to the new image. Returns the count of sentinels that reverted successfully, then transitions to cancelled.

Re-entrancy: Rollout is rejected while a rollout is in any non-terminal state (in_progress, paused, or rolling_back; that is, anything that isn’t idle, completed, or cancelled). This check lives in Rollout itself rather than relying solely on the virtual-object lock: the lock serialises calls to the singleton object but does not reject them, so a second Rollout call from a client would otherwise run as soon as the first one returns.
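The state preconditions described in this section can be collected into one guard. The table below is transcribed from the text above; the type, the constant names, and checkTransition are hypothetical and only illustrate how such a check could be expressed.

```go
package main

import "fmt"

type rolloutState string

const (
	stateIdle        rolloutState = "idle"
	stateInProgress  rolloutState = "in_progress"
	statePaused      rolloutState = "paused"
	stateRollingBack rolloutState = "rolling_back"
	stateCancelled   rolloutState = "cancelled"
	stateCompleted   rolloutState = "completed"
)

// allowedFrom lists, per handler, the states in which it is accepted.
var allowedFrom = map[string][]rolloutState{
	"Rollout":     {stateIdle, stateCompleted, stateCancelled},
	"Resume":      {statePaused},
	"Cancel":      {stateInProgress, statePaused},
	"RollbackAll": {statePaused, stateCancelled},
}

// checkTransition returns nil when handler may run in the current state.
func checkTransition(handler string, current rolloutState) error {
	if current == "" {
		current = stateIdle // no stored state yet behaves like idle
	}
	for _, s := range allowedFrom[handler] {
		if s == current {
			return nil
		}
	}
	return fmt.Errorf("%s is not allowed while the rollout is %q", handler, current)
}

func main() {
	fmt.Println(checkTransition("Rollout", ""))                // <nil>
	fmt.Println(checkTransition("Rollout", statePaused))       // rejected
	fmt.Println(checkTransition("RollbackAll", statePaused))   // <nil>
	fmt.Println(checkTransition("Resume", stateRollingBack))   // rejected
}
```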

Slack notifications

If slack_webhook_url is provided on the initial Rollout call, the service posts progress updates at each phase transition (rollout started, wave started, wave completed, rollout paused, rollout resumed, rollout completed, rollback started, rollback completed). Notification failures are logged but do not fail the rollout.

Where sentinels run

All sentinel pods run in a dedicated sentinel Kubernetes namespace, separate from customer workloads and other Unkey services. This namespace contains:

Sentinel Deployments (one per environment per region)
Services for routing traffic to sentinel pods
Gossip headless Services and CiliumNetworkPolicies for cache invalidation
Secrets for database, ClickHouse, and Redis credentials

Sentinel pods are scheduled onto nodes of the dedicated sentinel node class, using a toleration for the node-class=sentinel:NoSchedule taint. This keeps sentinel workloads isolated from customer instance pods at the node level, preventing resource contention between the proxy layer and the workloads it routes to.

Kubernetes resources

Each sentinel consists of five resources, all created via server-side apply:

Resource	Scope	Purpose
Deployment	Per sentinel	Sentinel pods with rolling update strategy
ClusterIP Service	Per sentinel	Routes traffic to sentinel pods on port 8040
PodDisruptionBudget	Per sentinel	Keeps at least one pod available during disruptions
Headless Service	Per environment	Gossip peer discovery (resolves to pod IPs on port 7946)
CiliumNetworkPolicy	Per environment	Allows gossip traffic between sentinel pods

The environment-scoped resources (headless Service, CiliumNetworkPolicy) are shared across all sentinels in an environment and are not owned by any single Deployment.

Deployment spec

Setting	Value
Strategy	RollingUpdate
Max Unavailable	0
Max Surge	1
Min Ready Seconds	5
Topology Spread	maxSkew=1 across availability zones, ScheduleAnyway
Ports	8040 (HTTP), 7946 (gossip TCP+UDP)

Probes

Probe	Value
Liveness	`GET /_unkey/internal/health` on port 8040
Readiness	`GET /_unkey/internal/health` on port 8040

Two consecutive readiness probe failures remove the pod from the Service endpoints, stopping traffic from reaching it.

Both probes hit the same trivial endpoint that returns 200 unconditionally without checking dependencies. A sentinel with a dead database connection or unavailable Redis reports as healthy. This needs to be migrated to proper liveness and readiness checks like our other services: liveness must verify the process is alive, readiness must verify sentinel can actually serve traffic (database reachable, middleware engine initialized). Tracked in #5367.

Labels

All sentinel resources carry these labels:

Label	Value
`app.kubernetes.io/managed-by`	`krane`
`app.kubernetes.io/component`	`sentinel`
`unkey.com/workspace.id`	Workspace ID
`unkey.com/project.id`	Project ID
`unkey.com/app.id`	App ID
`unkey.com/environment.id`	Environment ID
`unkey.com/sentinel.id`	Sentinel ID
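As an illustration, the label set can be built from a sentinel's identifiers and then used to select its resources. The sentinelMeta type and both helper functions are hypothetical; the label keys and fixed values come from the table above, and the selector uses the standard Kubernetes `key=value,key=value` equality syntax.

```go
package main

import "fmt"

// sentinelMeta carries the identifiers that end up in labels. The type is an
// assumption for this example.
type sentinelMeta struct {
	WorkspaceID   string
	ProjectID     string
	AppID         string
	EnvironmentID string
	SentinelID    string
}

// sentinelLabels returns the labels listed in the table above.
func sentinelLabels(m sentinelMeta) map[string]string {
	return map[string]string{
		"app.kubernetes.io/managed-by": "krane",
		"app.kubernetes.io/component":  "sentinel",
		"unkey.com/workspace.id":       m.WorkspaceID,
		"unkey.com/project.id":         m.ProjectID,
		"unkey.com/app.id":             m.AppID,
		"unkey.com/environment.id":     m.EnvironmentID,
		"unkey.com/sentinel.id":        m.SentinelID,
	}
}

// sentinelSelector builds an equality-based label selector for one sentinel.
func sentinelSelector(sentinelID string) string {
	return "app.kubernetes.io/component=sentinel,unkey.com/sentinel.id=" + sentinelID
}

func main() {
	labels := sentinelLabels(sentinelMeta{
		WorkspaceID:   "ws_1",
		ProjectID:     "proj_1",
		AppID:         "app_1",
		EnvironmentID: "env_1",
		SentinelID:    "sen_1",
	})
	fmt.Println(len(labels), labels["app.kubernetes.io/managed-by"]) // 7 krane
	fmt.Println(sentinelSelector("sen_1"))
}
```

A selector like this could be passed to `kubectl get pods -n sentinel -l <selector>` to list the pods of a single sentinel.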

Cache prewarming

On startup, the router service loads all deployments with status READY in the environment and prefetches their instances. This avoids cold-start latency spikes on the first requests after a pod restart.
