These are the “tests for the live data”, not for the code. They assert properties that must hold by construction of the pipeline. If a query here returns any rows, something upstream is broken and any aggregate derived from the current state could be wrong.
Why these exist
The metering pipeline’s correctness guarantees are structural: counters are monotonic, max - min is idempotent under duplicates, and container_uid isolates restarts. Structure holds only as long as nobody changes it. These queries detect the drift: the moment a counter goes backward, a bucket has fewer samples than it should, or a negative delta sneaks into the data, an alert fires and a human looks before any downstream consumer acts on it.
Run them hourly against the raw or MV tables, depending on query cost. None take longer than a few seconds on the current data volume.
1. Counter monotonicity per container_uid
A kernel counter inside a single container incarnation should only ever increase. A decrease implies one of: (a) a BPF map reset without a corresponding container_uid change (severity: undercount), (b) a replay-triggered row insertion with stale data that the ReplacingMergeTree hasn’t merged yet (severity: none on FINAL reads, but still a smell), or (c) a genuine bug.
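A minimal sketch of the CPU variant, assuming a raw table named billable_usage_raw with a ts timestamp column (both names are placeholders; cpu_usage_usec and container_uid are from the schema these docs describe):

```sql
-- Rows where a counter went backward within a single container incarnation.
-- billable_usage_raw and ts are assumed names; adjust to the real schema.
SELECT container_uid, ts, prev, cpu_usage_usec
FROM
(
    SELECT
        container_uid,
        ts,
        cpu_usage_usec,
        lagInFrame(cpu_usage_usec) OVER (
            PARTITION BY container_uid
            ORDER BY ts
            ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
        ) AS prev
    FROM billable_usage_raw
    WHERE ts > now() - INTERVAL 1 HOUR
)
WHERE prev > cpu_usage_usec
```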
To check the network counters, swap cpu_usage_usec (in both the inner lagInFrame and the outer prev-comparison) for one of: network_egress_public_bytes, network_egress_private_bytes, network_ingress_public_bytes, network_ingress_private_bytes. The window stays partitioned by container_uid.
When this fires: the most common cause is the BPF counter map resetting without the Go side knowing (now mitigated by bpffs pinning, see svc/heimdall/internal/network/network_linux.go). Second most common: someone ran a backfill against the raw table with stale rows. Both undercount, so aggregates stay safe, but investigate.
2. 15s bucket sample density
The per-15s MV expects ~3 samples per bucket (heimdall ticks every 5s). A bucket with only 1 sample yields max - min = 0 for that bucket, so the chart goes flat for exactly the window where the pod was under the heaviest load (that’s when heimdall gets CPU-throttled and misses ticks). Day/month-grain aggregates are unaffected, but this is the canary for the problem we hit during wrk load in dev.
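A sketch of the density check, with the same assumed table and timestamp names as above; alerting below 2 samples rather than 3, to avoid firing on ordinary bucket-boundary jitter, is a judgment call:

```sql
-- Buckets with fewer samples than heimdall's 5s tick should produce (~3 per 15s).
SELECT
    container_uid,
    toStartOfInterval(ts, INTERVAL 15 SECOND) AS bucket,
    count() AS samples
FROM billable_usage_raw
WHERE ts > now() - INTERVAL 1 HOUR
GROUP BY container_uid, bucket
HAVING samples < 2
```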
When this fires, the suspects are: heimdall missing ticks under CPU throttling (check the heimdall_tick_duration_seconds p99), containerd drop + informer lag (both signals firing at once), or a bug in the ingest path.
3. Negative deltas in counter columns
Every counter delta must be >= 0: max - min over a monotonic counter can never legitimately go negative; if one does, either the container_uid partitioning let a counter reset leak in, or UInt64 arithmetic underflowed somewhere. This runs directly against the raw table since there is no long-retention aggregate yet.
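One way to express it, under the same table/column-name assumptions as above: compute the consecutive-sample delta with signed arithmetic, so a reset surfaces as a negative number instead of a UInt64 wraparound. This is the signed cousin of the query in #1:

```sql
-- toInt64 keeps the subtraction signed; a negative delta means a reset
-- leaked past the container_uid partitioning.
SELECT container_uid, ts, delta
FROM
(
    SELECT
        container_uid,
        ts,
        toInt64(cpu_usage_usec) - toInt64(lagInFrame(cpu_usage_usec) OVER (
            PARTITION BY container_uid
            ORDER BY ts
            ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
        )) AS delta
    FROM billable_usage_raw
)
WHERE delta < 0
```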
When this fires: either a counter reset leaked past container_uid (see #1), or a schema drift reintroduced UInt64 arithmetic somewhere in the query layer.
4. Rows without a label
Every billable row should have a non-empty workspace_id, project_id, environment_id, and resource_id. Empty values are usually a signal that pods escaped krane’s label-injection or that the informer cache was racing.
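The check itself is trivial; the table name is assumed as above, and the four label columns come straight from the text:

```sql
-- Billable rows missing any of the four tenancy labels.
SELECT count() AS unlabeled_rows
FROM billable_usage_raw
WHERE workspace_id = ''
   OR project_id = ''
   OR environment_id = ''
   OR resource_id = ''
```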
When this fires: trace the offending rows back to their pods and check whether they carry krane’s managed-by label.
5. Disk-used vs disk-allocated sanity
disk_used_bytes must never exceed disk_allocated_bytes. If it does, either statfs is reading the wrong mount (audit risk #5) or the PVC was resized and we captured the old allocation mid-resize.
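A sketch under the same table-name assumption:

```sql
-- Used space can never legitimately exceed the allocation.
SELECT container_uid, ts, disk_used_bytes, disk_allocated_bytes
FROM billable_usage_raw
WHERE disk_used_bytes > disk_allocated_bytes
```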
6. Pods reported by heimdall vs pods scheduled
A sanity check that every scheduled pod is producing checkpoints. Runs against the K8s API + ClickHouse; easiest to do as a Go program, but the ClickHouse half is a one-liner (see the sketch below). The K8s half is kubectl get pods --field-selector spec.nodeName=<node>, restricted to billable pods (managed-by=krane, component in (deployment, sentinel)). A gap means heimdall is either not running on that node or can’t reach that pod’s cgroup.
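A sketch of the ClickHouse side; node_name and pod_name are assumed column names, and the 5-minute freshness window is a judgment call:

```sql
-- Pods that checkpointed recently on one node; diff this set against kubectl's.
SELECT DISTINCT pod_name
FROM billable_usage_raw
WHERE node_name = {node:String}
  AND ts > now() - INTERVAL 5 MINUTE
```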
7. BPF map headroom
Alert before the LRU starts evicting (and thus silently dropping traffic from the oldest veth). The gauge is exposed by heimdall as unkey_heimdall_bpf_map_entries (added in this branch).
Prometheus alert rule:
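A minimal sketch, alerting at 80% of capacity; the capacity (65536) and the threshold are assumptions, not values taken from heimdall:

```yaml
groups:
  - name: heimdall-bpf
    rules:
      - alert: HeimdallBpfMapNearCapacity
        # 65536 is an assumed max_entries; replace with the real map size.
        expr: unkey_heimdall_bpf_map_entries > 0.8 * 65536
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "heimdall BPF map nearing capacity; LRU eviction will silently drop traffic from the oldest veth"
```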
Where these should live
- Short-term: copy a select few (monotonicity, sample density, negative deltas) into a Grafana alert that runs against ClickHouse hourly. One Slack alert per violation.
- Medium-term: a small Go binary in svc/ctrl that runs the full list nightly, writes results to a metering_quality_check table, and alerts on any row > 0.
- Long-term: if/when invoicing is built on top, gate invoice generation on these checks passing for the period. Anything flagged gets manual review before downstream consumers act on it.
Related docs
- metrics-architecture. The pipeline design these checks are asserting properties of.
- heimdall. The collector these checks verify.

