Prerequisites
Before starting, ensure you have:

- AWS credentials configured (`AWS_PROFILE`) with permissions for EKS, IAM, Route53, Secrets Manager, and ELB
- CLI tools installed: `awscli`, `eksctl`, `kubectl`, `helm`, `argocd`
- GitHub App credentials for ArgoCD repository access
- Route53 hosted zones created for `<environment>.aws.unkey.com` and `aws.unkey.cloud`
- CIDR allocation — confirm the target region has an entry in `networks`. The generator script will refuse to run if the CIDR is missing.
Step 1: Generate configuration
The `generate-region-config.sh` script creates all eksctl and helm environment files for a region.
Dry run first
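A sketch of the dry run — the flag and argument names below are assumptions; check the script's usage output for the real ones:

```shell
# Preview what would be generated without writing any files
# (--environment/--region/--dry-run argument names are assumptions)
./generate-region-config.sh --environment production --region eu-west-1 --dry-run
```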
Generate files
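Then generate for real — same caveat on the argument names; `--with-deploy` is the flag this doc documents for including the deploy-only apps:

```shell
# Generate the always-generated set (argument names assumed)
./generate-region-config.sh --environment production --region eu-west-1

# Or include the deploy-only apps (control-api, frontline, krane, ...)
./generate-region-config.sh --environment production --region eu-west-1 --with-deploy
```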
What gets created
| Category | Apps | When |
|---|---|---|
| Always generated | eksctl config, argocd, core, networking, reloader, runtime, dragonfly, tailscale, external-dns, observability, thanos, vector-logs, unkey-api | Every run |
| Deploy-only (`--with-deploy`) | control-api, control-worker, restate, sentinel, frontline, krane, vault | Only with `--with-deploy` |
Files are written to `configs/<environment>/<region>.yaml` and `helm-chart/<chart>/environments/<environment>/<region>.yaml`. The script refuses to overwrite existing files — delete them first if you need to regenerate.
Step 2: Review & commit
Check the generated files make sense:

- VPC CIDR matches the `networks` assignment
- Hostnames and domain patterns look correct
- gossip WAN seeds have a `TODO` comment (expected — you’ll fill them in at Step 6)
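Once the files look right, commit and push them — region name and commit message here are illustrative:

```shell
git add configs/ helm-chart/
git commit -m "feat(infra): add eu-west-1 region config"   # example message
git push
```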
Promote all apps to the new commit
Each ArgoCD ApplicationSet reads a promotion file (eks-cluster/promotions/<environment>/<app>.yaml) that pins a specific git SHA as the targetRevision. After pushing the new region’s config, you must promote every app to a revision that includes the new env files — otherwise ArgoCD will check out the older pinned commit where the files don’t exist, and all apps will show Unknown sync status.
Use the promote script to update all apps to the pushed commit:
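A sketch of that invocation — only the `promote` script name and the promotion-file layout come from this doc; the argument names are assumptions:

```shell
# Pin every app's promotion file to the commit that contains the new region's files
COMMIT="$(git rev-parse HEAD)"
./promote --environment production --revision "$COMMIT"   # argument names assumed
```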
Step 3: Verify secrets replication
All secrets in AWS Secrets Manager (unkey/shared, unkey/control, unkey/krane, etc.) are already replicated from us-east-1 to the regions where unkey-api runs. Once the cluster is up, External Secrets will pull from the local region’s Secrets Manager automatically.
Verify replication is in place for your region:
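One way to check with the AWS CLI — `describe-secret` reports a `ReplicationStatus` array listing the target regions (`unkey/shared` is one of the secret names above):

```shell
# The new region should appear with Status=InSync
aws secretsmanager describe-secret \
  --region us-east-1 \
  --secret-id unkey/shared \
  --query 'ReplicationStatus[].{Region:Region,Status:Status}'
```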
If replication is missing, the `replicate-secrets-to-new-region.sh` script automates this for all secrets at once:
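An invocation sketch — the argument form is an assumption:

```shell
# Add the new region as a replication target for the unkey/* secrets
./replicate-secrets-to-new-region.sh eu-west-1   # argument form assumed
```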
Step 4: Create cluster
Set the required variables and run the bootstrap script. It proceeds through these steps:

| Step | What happens |
|---|---|
| 1 | Create IAM policies (ExternalDNS, SecretsManager, ALB, ACK) |
| 2 | Create EKS cluster (without node groups) |
| 3 | Wait for cluster ACTIVE status |
| 4 | Update kubeconfig |
| 5 | Patch addon tolerations |
| 6 | Create node groups |
| 7 | Create observability S3 bucket |
| 8 | Install AWS Load Balancer Controller |
| 9 | Install CRDs (Prometheus, External Secrets) |
| 10 | Install and configure ArgoCD |
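A sketch of kicking off the bootstrap — `AWS_PROFILE` comes from the prerequisites; the other variable names and all values are assumptions:

```shell
export AWS_PROFILE=unkey-production   # example profile name
export AWS_REGION=eu-west-1           # variable name assumed
export ENVIRONMENT=production         # variable name assumed

./setup-cluster.sh
```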
Step 5: Verify deployment
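This doc does not spell out the checks, so here are standard ones that cover the basics:

```shell
# Nodes from every node group should be Ready
kubectl get nodes -o wide

# All ArgoCD applications should reach Synced / Healthy
argocd app list

# Pods stuck Pending usually mean a missing toleration (see Troubleshooting)
kubectl get pods -A --field-selector=status.phase=Pending
```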
Step 6: Configure gossip WAN seeds
Both unkey-api and frontline (if `--with-deploy`) use memberlist-based WAN gossip for cross-region state sharing. This is a chicken-and-egg problem: each region needs to know the other region’s NLB DNS name, but that NLB doesn’t exist until the chart deploys.
6a. Deploy with empty seeds (already done)
The generated config has `UNKEY_GOSSIP_WAN_SEEDS: ""` for unkey-api and appropriate defaults for frontline. ArgoCD will deploy them, creating the NLB and registering DNS via ExternalDNS.
6b. Verify the gossip NLB DNS is registered
Wait for ExternalDNS to create the DNS records, then verify the gossip hostname resolves.

6c. Update the new region’s WAN seeds
Point the new region to the existing region(s).

unkey-api — edit `helm-chart/unkey-api/environments/<env>/<region>.yaml`:

frontline — edit `helm-chart/frontline/environments/<env>/<region>.yaml`:
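The shape of the edit, sketched — the hostname is illustrative (derived from the `<environment>.aws.unkey.com` zones in the prerequisites), and port 7947 is the WAN gossip port named under Troubleshooting. Check your generated file for the real key and hostname:

```yaml
# helm-chart/unkey-api/environments/<env>/<region>.yaml (sketch)
UNKEY_GOSSIP_WAN_SEEDS: "gossip.us-east-1.production.aws.unkey.com:7947"  # hostname illustrative
```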
6d. Update existing regions to include the new region
Each existing region must add the new region as a seed. Seeds are comma-separated if there are multiple peer regions. Example — existing `us-east-1` unkey-api config gets:
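Sketched with illustrative hostnames — one comma-separated entry per peer region:

```yaml
# us-east-1 now lists every other region in the ring as a seed (hostnames illustrative)
UNKEY_GOSSIP_WAN_SEEDS: "gossip.eu-west-1.production.aws.unkey.com:7947,gossip.ap-southeast-2.production.aws.unkey.com:7947"
```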
6e. Commit, push, and sync
6f. Verify gossip is healthy
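The doc does not name an exact health check; these generic ones work. The hostname pattern and label selector are assumptions — the `api` namespace and `unkey-api-gossip-wan` service name come from the Troubleshooting section below:

```shell
# The peer region's gossip NLB DNS must resolve
dig +short gossip.us-east-1.production.aws.unkey.com   # hostname illustrative

# The local gossip service should show an external hostname
kubectl get svc -n api unkey-api-gossip-wan

# memberlist logs should show peers from other regions joining
kubectl logs -n api -l app.kubernetes.io/name=unkey-api --tail=200 | grep -i member
```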
Step 7: Enable Global Accelerator (deploy regions only)
For regions running frontline with `--with-deploy`, the generated config already sets `globalAccelerator.enabled: true` and includes the listener ARN. After the frontline NLB is created:
- The GA resolver Helm hook job runs automatically
- It discovers the NLB ARN and creates an `EndpointGroup` CRD
- The ACK Global Accelerator controller reconciles and attaches the NLB to the Global Accelerator
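To confirm the attachment — note the Global Accelerator API is only served from `us-west-2`, and the listener ARN is the one in your generated config:

```shell
# EndpointDescriptions should include the new region's frontline NLB ARN
aws globalaccelerator list-endpoint-groups \
  --region us-west-2 \
  --listener-arn "$LISTENER_ARN" \
  --query 'EndpointGroups[].{Region:EndpointGroupRegion,Endpoints:EndpointDescriptions[].EndpointId}'
```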
Quick Reference
| Script | What it does |
|---|---|
| `generate-region-config.sh` | Generate all config files for a new region |
| `promote` | Update promotion files to deploy a revision via ArgoCD |
| `promotion-changelists` | Generate a changelog of PRs between the old and new promotion revisions |
| `replicate-secrets-to-new-region.sh` | Add a new region to secrets replication (only needed for regions not already replicated) |
| `setup-cluster.sh` | Full cluster bootstrap (IAM → EKS → nodes → ArgoCD) |
| `setup-global-accelerator.sh` | Create Global Accelerator (one-time) |
| `setup-acm-certificate.sh` | Create wildcard ACM cert for a region |
| `validate-aws-resources.sh` | Validate AWS resources exist |
| `apply-addon-tolerations.sh` | Patch EKS addon tolerations |
Troubleshooting
CIDR not found
The region is missing from `CIDR_MAP` in `generate-region-config.sh`. Add it there and in `networks`.
Node groups not scheduling pods
All node groups use taints. Pods need matching tolerations. Check:

| Node group | Taint |
|---|---|
| unkey | `node-class=unkey:NoSchedule` |
| untrusted | `node-class=untrusted:NoSchedule` |
| sentinel | `node-class=sentinel:NoSchedule` |
| observability | `node-class=observability:NoSchedule` |
| api | `node-class=api:NoSchedule` |
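A matching toleration, in standard Kubernetes syntax (here for the `api` node group):

```yaml
tolerations:
  - key: node-class
    operator: Equal
    value: api
    effect: NoSchedule
```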
Gossip not joining
- DNS not resolving — ExternalDNS may not have registered yet. Check `kubectl logs -n networking -l app.kubernetes.io/name=external-dns`.
- NLB not ready — `kubectl get svc -n api unkey-api-gossip-wan` should show an external hostname.
- Security groups — WAN gossip uses port 7947 TCP+UDP. The NLB must allow inbound on this port.
- Secret mismatch — All regions in a gossip ring must share the same `UNKEY_GOSSIP_SECRET_KEY` (pulled from AWS Secrets Manager).
ExternalSecrets failing
Check that:

- Secrets are replicated to this region (see Step 3)
- A Pod Identity association exists for the service account
- The SecretStore references the correct region
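Standard External Secrets Operator debugging — the status conditions usually name the failure (auth, missing secret, wrong region):

```shell
kubectl get externalsecrets -A
kubectl describe externalsecret <name> -n <namespace>
kubectl describe secretstore <name> -n <namespace>
```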

