# engineering

## Docs

- [Architecture](https://engineering.unkey.com/architecture/index.md): System architecture and design references
- [0000 Template](https://engineering.unkey.com/architecture/rfcs/0000-template.md): You may copy this as a starting point, but it's not required
- [0001 RBAC](https://engineering.unkey.com/architecture/rfcs/0001-rbac.md)
- [0002 Secret Scanning](https://engineering.unkey.com/architecture/rfcs/0002-github-secret-scanning.md)
- [0003 Sentinel Pools](https://engineering.unkey.com/architecture/rfcs/0003-key-shape.md)
- [0004 COSS Starter](https://engineering.unkey.com/architecture/rfcs/0004-coss-starter.md)
- [0005 Analytics API](https://engineering.unkey.com/architecture/rfcs/0005-analytics-api.md): Unkey exposes APIs to retrieve all required data to build end-user facing dashboards and drive our customers' usage-based billing.
- [0006 Auth Migration](https://engineering.unkey.com/architecture/rfcs/0006-auth-migration.md): Migrate everything to WorkOS, despite their bad APIs.
- [0007 Client-side file structure](https://engineering.unkey.com/architecture/rfcs/0007-client-file-structure.md): File structure for our client apps
- [0008 Dataplane](https://engineering.unkey.com/architecture/rfcs/0008-dataplane.md): Global Unkey Deployment Architecture
- [0009 Pricing refresh for 2025](https://engineering.unkey.com/architecture/rfcs/0009-pricing-updates.md): We need to update our pricing for 2025
- [0010 Splitting the monorepo](https://engineering.unkey.com/architecture/rfcs/0010-split-monos.md): Splitting the monorepo into multiple smaller, more focused repositories / monorepos.
- [0011 URNs](https://engineering.unkey.com/architecture/rfcs/0011-unkey-resource-names.md): Implementing Uniform Resource Names (URNs) and Structured Error Codes at Unkey
- [0012 Stricter Linter](https://engineering.unkey.com/architecture/rfcs/0012-stricter-linter.md): Adding stricter lint rules to minimize issues in our codebase.
- [0013 TLS Certificates for custom domains](https://engineering.unkey.com/architecture/rfcs/0013-custom-domains.md): Issuing certificates for custom domains using Let's Encrypt's HTTP-01 challenge.
- [0014 Sentinel Middleware](https://engineering.unkey.com/architecture/rfcs/0014-sentinel-middleware.md): Composable HTTP middleware schema for Sentinel, Unkey's reverse proxy.
- [Authentication](https://engineering.unkey.com/architecture/services/api/api-design/auth.md): Authenticating with the Unkey API
- [Error handling](https://engineering.unkey.com/architecture/services/api/api-design/errors.md): Understanding and working with API errors
- [API design overview](https://engineering.unkey.com/architecture/services/api/api-design/index.md): Design philosophy for Unkey APIs
- [RPC-style API](https://engineering.unkey.com/architecture/services/api/api-design/rpc.md): Action-oriented API design
- [Configuration](https://engineering.unkey.com/architecture/services/api/configuration.md): Configuration model and required settings for the api service
- [Architecture](https://engineering.unkey.com/architecture/services/api/overview.md): API service components, request flow, and dependencies
- [Architecture](https://engineering.unkey.com/architecture/services/control-plane/api/architecture.md): Runtime composition and request flow for the control plane API
- [Configuration](https://engineering.unkey.com/architecture/services/control-plane/api/configuration.md): Configuration model and required settings for the control plane API
- [Overview](https://engineering.unkey.com/architecture/services/control-plane/api/overview.md): Control plane API for deployment intent and orchestration
- [Configuration](https://engineering.unkey.com/architecture/services/control-plane/worker/configuration.md): Configuration model and required settings for the control plane worker
- [Deployment sync](https://engineering.unkey.com/architecture/services/control-plane/worker/deployment-sync.md): How the control plane streams state changes to krane agents
- [Overview](https://engineering.unkey.com/architecture/services/control-plane/worker/overview.md): Control plane worker for workflow execution
- [Certificates](https://engineering.unkey.com/architecture/services/control-plane/worker/workflows/certificates.md): ACME challenge and certificate issuance
- [Custom domains](https://engineering.unkey.com/architecture/services/control-plane/worker/workflows/custom-domains.md): Custom domain verification and lifecycle
- [Deployments](https://engineering.unkey.com/architecture/services/control-plane/worker/workflows/deployments.md): Deploy, promote, rollback, cancel, and build-queue workflows
- [GitHub App](https://engineering.unkey.com/architecture/services/control-plane/worker/workflows/github-app.md): GitHub App authentication and failure modes
- [Key Last Used Sync](https://engineering.unkey.com/architecture/services/control-plane/worker/workflows/key-last-used-sync.md): How lastUsedAt timestamps flow from ClickHouse to MySQL.
- [Routing](https://engineering.unkey.com/architecture/services/control-plane/worker/workflows/routing.md): Frontline route assignment and traffic switching
- [Configuration](https://engineering.unkey.com/architecture/services/frontline/configuration.md): Configuration model and required settings for the frontline service
- [Frontline ingress](https://engineering.unkey.com/architecture/services/frontline/ingress.md): How frontline routes traffic to sentinel
- [Overview](https://engineering.unkey.com/architecture/services/frontline/overview.md): Ingress service for TLS termination and routing
- [Routing and failover](https://engineering.unkey.com/architecture/services/frontline/routing.md): Frontline routing decisions and cross-region forwarding
- [Configuration](https://engineering.unkey.com/architecture/services/krane/configuration.md): Configuration model and required settings for the krane service
- [Overview](https://engineering.unkey.com/architecture/services/krane/overview.md): Kubernetes control agent for deployments and secrets
- [Secrets service](https://engineering.unkey.com/architecture/services/krane/secrets.md): Secrets decryption RPC and authentication
- [Configuration](https://engineering.unkey.com/architecture/services/sentinel/configuration.md): Config fields, defaults, and dependencies
- [Deployment](https://engineering.unkey.com/architecture/services/sentinel/deployment.md): Namespace, resources, and pod lifecycle
- [Failure modes](https://engineering.unkey.com/architecture/services/sentinel/failure-modes.md): Failure scenarios, responses, and diagnosis
- [Overview](https://engineering.unkey.com/architecture/services/sentinel/overview.md): Per-environment reverse proxy and gateway
- [Firewall](https://engineering.unkey.com/architecture/services/sentinel/policies/firewall.md): Policy that denies matched requests
- [Policies](https://engineering.unkey.com/architecture/services/sentinel/policies/index.md): Policy engine and evaluation model
- [JWTAuth](https://engineering.unkey.com/architecture/services/sentinel/policies/jwtauth.md): JWT authentication policy (schema only)
- [KeyAuth](https://engineering.unkey.com/architecture/services/sentinel/policies/keyauth.md): API key authentication policy
- [Match expressions](https://engineering.unkey.com/architecture/services/sentinel/policies/match-expressions.md): Request matching rules for policies
- [OpenAPI validation](https://engineering.unkey.com/architecture/services/sentinel/policies/openapi.md): OpenAPI request validation (schema only)
- [Policy schema](https://engineering.unkey.com/architecture/services/sentinel/policies/policy.md): The sentinel.v1.Policy message structure
- [Principal](https://engineering.unkey.com/architecture/services/sentinel/policies/principal.md): Authenticated identity from auth policies
- [RateLimit](https://engineering.unkey.com/architecture/services/sentinel/policies/ratelimit.md): Gateway rate limiting (schema only)
- [Request flow](https://engineering.unkey.com/architecture/services/sentinel/request-flow.md): Request lifecycle through sentinel
- [Authentication](https://engineering.unkey.com/architecture/services/vault/auth.md): RPC authentication and bearer token handling
- [Configuration](https://engineering.unkey.com/architecture/services/vault/configuration.md): Configuration model and required settings for the vault service
- [Overview](https://engineering.unkey.com/architecture/services/vault/overview.md): Encryption key service backed by object storage
- [Overview](https://engineering.unkey.com/company/index.md): How Unkey works
- [Meetings](https://engineering.unkey.com/company/meetings.md): Fight for your time and the time of others
- [Contributing](https://engineering.unkey.com/contributing/how-to-contribute.md): Guidelines for contributing to Unkey
- [Local development](https://engineering.unkey.com/contributing/local/development.md): Set up, run, and test Unkey locally
- [Code quality](https://engineering.unkey.com/contributing/quality/code-quality.md): Design goals and coding standards for Unkey
- [Documentation](https://engineering.unkey.com/contributing/quality/documentation.md): Standards for internal documentation and code comments
- [Anti-patterns](https://engineering.unkey.com/contributing/quality/testing/anti-patterns.md): Common testing mistakes to avoid
- [Fuzz tests](https://engineering.unkey.com/contributing/quality/testing/fuzz-tests.md): Finding edge cases with randomized inputs
- [HTTP handler tests](https://engineering.unkey.com/contributing/quality/testing/http-handler-tests.md): Testing API endpoints with the test harness
- [Testing](https://engineering.unkey.com/contributing/quality/testing/index.md): Testing standards and patterns for Unkey
- [Integration tests](https://engineering.unkey.com/contributing/quality/testing/integration-tests.md): Testing components with real dependencies
- [Simulation tests](https://engineering.unkey.com/contributing/quality/testing/simulation-tests.md): Property-based testing with the simulation framework
- [Unit tests](https://engineering.unkey.com/contributing/quality/testing/unit-tests.md): Table-driven patterns and unit test conventions
- [Bazel](https://engineering.unkey.com/contributing/tooling/bazel.md): Why Unkey uses Bazel and how to work with it
- [Users & Roles](https://engineering.unkey.com/infra/clickhouse/index.md)
- [grafana_readonly](https://engineering.unkey.com/infra/clickhouse/roles/grafana-readonly.md)
- [insertonly_role](https://engineering.unkey.com/infra/clickhouse/roles/insertonly-role.md)
- [readonly_role](https://engineering.unkey.com/infra/clickhouse/roles/readonly-role.md): Read-only access to the default database
- [apiv2](https://engineering.unkey.com/infra/clickhouse/users/apiv2.md)
- [github](https://engineering.unkey.com/infra/clickhouse/users/github.md)
- [grafana](https://engineering.unkey.com/infra/clickhouse/users/grafana.md)
- [sentinel](https://engineering.unkey.com/infra/clickhouse/users/sentinel.md)
- [unkey_admin](https://engineering.unkey.com/infra/clickhouse/users/unkey-admin.md)
- [vector](https://engineering.unkey.com/infra/clickhouse/users/vector.md)
- [vercel_dashboard](https://engineering.unkey.com/infra/clickhouse/users/vercel-dashboard.md)
- [Creating a New EKS Cluster Region](https://engineering.unkey.com/infra/clusters/create-region.md): Adding a new EKS region.
- [Deleting an EKS Cluster](https://engineering.unkey.com/infra/clusters/delete-cluster.md): Tearing down an EKS cluster cleanly.
- [Kubeconfig Setup](https://engineering.unkey.com/infra/clusters/kubeconfig.md): Configure kubeconfig for all live clusters.
- [Network CIDR Assignments](https://engineering.unkey.com/infra/clusters/networks.md): VPC CIDR allocations for all regions.
- [Scheduling Workloads](https://engineering.unkey.com/infra/clusters/scheduling.md): Node groups, taints, and pod scheduling.
- [ArgoCD Debugging Commands](https://engineering.unkey.com/infra/deployments/argocd-debugging.md): Debugging ArgoCD apps, syncs, and clusters.
- [CronJobs (Restate Scheduled Tasks)](https://engineering.unkey.com/infra/deployments/cronjobs.md): How to define and deploy scheduled tasks that trigger Restate service endpoints.
- [Sentinel Rollout](https://engineering.unkey.com/infra/deployments/sentinel-rollout.md): How to roll out a new sentinel image across the fleet.
- [Domain Connect](https://engineering.unkey.com/infra/domain-connect.md): How our one-click custom domain DNS setup works, and how it's wired together.
- [Environments](https://engineering.unkey.com/infra/environments.md): Staging and production environment URLs and how traffic flows between them.
- [Overview](https://engineering.unkey.com/infra/index.md): Documentation for Unkey's infrastructure.
- [Working on Infrastructure](https://engineering.unkey.com/infra/infra-work.md): Planning, promotions, and production deploys.
- [Configuring GitHubActionsDeployRole](https://engineering.unkey.com/infra/legacy-2025/github-actions-deploy-role-setup.md): Creating the GitHubActionsDeployRole IAM role.
- [Configuring GitHub OIDC](https://engineering.unkey.com/infra/legacy-2025/github-oidc.md): GitHub Actions OIDC setup with AWS.
- [How Pulumi IaC and ESC Work Together](https://engineering.unkey.com/infra/legacy-2025/pulumi-iac-esc.md): Pulumi stacks, ESC environments, and secrets.
- [Unkey Infrastructure as Code Documentation](https://engineering.unkey.com/infra/legacy-2025/pulumi-infrastructure-architecture.md): Pulumi AWS infrastructure architecture and workflows.
- [Pulumi Workflow](https://engineering.unkey.com/infra/legacy-2025/pulumi-workflow.md): Day-to-day Pulumi deployment workflow.
- [Data quality checks](https://engineering.unkey.com/infra/metering/data-quality-checks.md): Invariant queries that should always return zero rows. If one of them returns anything, the metering pipeline is producing bad data. Wire these into an hourly alert or a CI canary.
- [eBPF primer (what it is, why we use it)](https://engineering.unkey.com/infra/metering/ebpf-primer.md): Background on eBPF for engineers touching heimdall's network metering. Covers what it is, why the kernel lets us run code inside it, and why it was the right tool for per-pod byte accounting.
- [heimdall (per-pod metering agent)](https://engineering.unkey.com/infra/metering/heimdall.md): How we collect per-pod CPU, memory, disk, and network counters from kernel sources and write them to ClickHouse.
- [Metrics architecture (from kernel to chart)](https://engineering.unkey.com/infra/metering/metrics-architecture.md): Plain-English walkthrough of how we turn kernel numbers into charts and bills. Canonical reference for everything metering-related. Other docs in this section link here rather than re-explain.
- [Why monotonic counters, not deltas](https://engineering.unkey.com/infra/metering/why-counters-not-deltas.md): The defence for storing raw cumulative counters and computing aggregates with max minus min at query time. Citations and prior art for pushing back on the "just write deltas" proposal.
- [Alerting](https://engineering.unkey.com/infra/observability/alerting.md): Alert routing, severities, and adding rules.
- [Checkly Alerts](https://engineering.unkey.com/infra/observability/checkly.md): Configuration and management of alerts in checklyhq.com
- [incident.io](https://engineering.unkey.com/infra/observability/incident-io.md): Alert routing config and API usage.
- [AWS Secrets Manager - Required Secrets](https://engineering.unkey.com/infra/secrets/aws-secrets.md): Required secrets for EKS cluster infrastructure.