End-to-end guide for adding a new AWS region to the Unkey EKS infrastructure. Assumes familiarity with Kubernetes, AWS, and the existing repo layout.

Prerequisites

Before starting, ensure you have:
  • AWS credentials configured (AWS_PROFILE) with permissions for EKS, IAM, Route53, Secrets Manager, and ELB
  • CLI tools installed: awscli, eksctl, kubectl, helm, argocd
  • GitHub App credentials for ArgoCD repository access
  • Route53 hosted zones created for <environment>.aws.unkey.com and aws.unkey.cloud
  • CIDR allocation — confirm the target region has an entry in networks. The generator script will refuse to run if the CIDR is missing.
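Before running the generator it can save a round-trip to check the CIDR assignment yourself. A hedged sketch, assuming a simple "region → CIDR" mapping; the real mapping lives in CIDR_MAP inside generate-region-config.sh and the inline map below is illustrative only:

```shell
# Pre-check: does the target region have a CIDR assignment?
# The inline map is a stand-in for the real CIDR_MAP / networks entries.
region="eu-west-2"
cidr=$(awk -v r="$region" '$1 == r {print $2}' <<'EOF'
us-east-1    10.0.0.0/16
eu-central-1 10.1.0.0/16
eu-west-2    10.2.0.0/16
EOF
)
if [ -z "$cidr" ]; then
  echo "No CIDR for $region - add it to CIDR_MAP and networks first" >&2
  exit 1
fi
echo "CIDR for $region: $cidr"
```

If the lookup comes back empty, add the allocation first; the generator will refuse to run otherwise.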

Step 1: Generate configuration

The generate-region-config.sh script creates all eksctl and helm environment files for a region.

Dry run first

cd eks-cluster
./scripts/generate-region-config.sh <region> --dry-run

# With a non-default environment:
./scripts/generate-region-config.sh <region> staging --dry-run
This prints the file list and CIDR without writing anything.

Generate files

# Base region (unkey-api + infrastructure only)
./scripts/generate-region-config.sh <region>

# Full deploy region (adds control-api, frontline, krane, vault, etc.)
./scripts/generate-region-config.sh <region> --with-deploy

What gets created

  • Always generated (every run): eksctl config, argocd, core, networking, reloader, runtime, dragonfly, tailscale, external-dns, observability, thanos, vector-logs, unkey-api
  • Deploy-only (only with --with-deploy): control-api, control-worker, restate, sentinel, frontline, krane, vault
Files are written to configs/<environment>/<region>.yaml and helm-chart/<chart>/environments/<environment>/<region>.yaml. The script refuses to overwrite existing files — delete them first if you need to regenerate.

Step 2: Review & commit

Check the generated files make sense:
git diff --stat
git diff
Things to verify:
  • VPC CIDR matches the networks assignment
  • Hostnames and domain patterns look correct
  • gossip WAN seeds have a TODO comment (expected — you’ll fill them in at Step 6)
Commit the generated config and push:
git add configs/ helm-chart/
git commit -m "Add region config for <region>"
git push

Promote all apps to the new commit

Each ArgoCD ApplicationSet reads a promotion file (eks-cluster/promotions/<environment>/<app>.yaml) that pins a specific git SHA as the targetRevision. After pushing the new region’s config, you must promote every app to a revision that includes the new env files — otherwise ArgoCD will check out the older pinned commit where the files don’t exist, and all apps will show Unknown sync status. Use the promote script to update all apps to the pushed commit:
./scripts/promote <environment> $(git ls-remote origin main | awk '{print $1}')
git add eks-cluster/promotions/
git commit -m "Promote all apps for <region>"
git push
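The promote invocation above pipes git ls-remote through awk: ls-remote prints one "<sha><TAB><ref>" line per ref, and awk '{print $1}' keeps only the SHA column. A minimal sketch with simulated ls-remote output (not a real commit):

```shell
# git ls-remote prints "<sha>\t<ref>"; awk '{print $1}' extracts the SHA column.
sha=$(printf '3f9d2c81aa0e4b67c5d1e2f3a4b5c6d7e8f90a1b\trefs/heads/main\n' | awk '{print $1}')
echo "$sha"
```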

Step 3: Verify secrets replication

All secrets in AWS Secrets Manager (unkey/shared, unkey/control, unkey/krane, etc.) are already replicated from us-east-1 to the regions where unkey-api runs. Once the cluster is up, External Secrets will pull from the local region’s Secrets Manager automatically. Verify replication is in place for your region:
aws secretsmanager describe-secret \
  --secret-id unkey/shared \
  --region us-east-1 \
  --query 'ReplicationStatus[].Region' \
  --output text
If your region is not in the list, you need to add it to each secret’s replication configuration:
# Add a new replica region to an existing secret
aws secretsmanager replicate-secret-to-regions \
  --secret-id unkey/shared \
  --add-replica-regions Region=<region> \
  --region us-east-1

# Repeat for each secret: unkey/control, unkey/krane,
# unkey/sentinel, unkey/vault, unkey/vector, unkey/frontline, unkey/argocd
The replicate-secrets-to-new-region.sh script automates this for all secrets at once:
./scripts/replicate-secrets-to-new-region.sh us-east-1 <region>
After initial replication, AWS keeps them in sync automatically — no cron or Lambda needed. See AWS Secrets for the full secret inventory.
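The loop the script runs can be sketched as follows, using the secret list from this guide. This version prints each aws command instead of executing it (drop the quotes-and-echo wrapper to run for real); the target region is illustrative:

```shell
# Dry-run sketch of replicating every Unkey secret to a new region.
region="eu-west-2"   # illustrative target region
cmds=$(for secret in unkey/shared unkey/control unkey/krane unkey/sentinel \
                     unkey/vault unkey/vector unkey/frontline unkey/argocd; do
  echo "aws secretsmanager replicate-secret-to-regions --secret-id $secret --add-replica-regions Region=$region --region us-east-1"
done)
echo "$cmds"
```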

Step 4: Create cluster

Set the required variables and run the bootstrap script:
ENVIRONMENT=production001 PRIMARY_REGION=<region> ./scripts/setup-cluster.sh
The script executes in order:
  1. Create IAM policies (ExternalDNS, SecretsManager, ALB, ACK)
  2. Create EKS cluster (without node groups)
  3. Wait for cluster ACTIVE status
  4. Update kubeconfig
  5. Patch addon tolerations
  6. Create node groups
  7. Create observability S3 bucket
  8. Install AWS Load Balancer Controller
  9. Install CRDs (Prometheus, External Secrets)
  10. Install and configure ArgoCD
For production environments you’ll be prompted to type the environment name to confirm. Expected duration: 15–25 minutes (mostly waiting for EKS cluster and node group creation).

Step 5: Verify deployment

# Nodes are ready
kubectl get nodes

# ArgoCD is running
kubectl get pods -n argocd

# Get ArgoCD admin password
kubectl -n argocd get secret argocd-initial-admin-secret \
  -o jsonpath="{.data.password}" | base64 -d; echo

# Access ArgoCD UI
kubectl port-forward svc/argocd-server -n argocd 8080:443
Check that ArgoCD has picked up the new region’s ApplicationSets and apps are syncing. Core infrastructure apps (external-dns, observability, etc.) should sync automatically.

Step 6: Configure gossip WAN seeds

Both unkey-api and frontline (if --with-deploy) use memberlist-based WAN gossip for cross-region state sharing. This is a chicken-and-egg problem: each region needs to know the other region’s NLB DNS name, but that NLB doesn’t exist until the chart deploys.

6a. Deploy with empty seeds (already done)

The generated config has UNKEY_GOSSIP_WAN_SEEDS: "" for unkey-api and appropriate defaults for frontline. ArgoCD will deploy them, creating the NLB and registering DNS via ExternalDNS.

6b. Verify the gossip NLB DNS is registered

Wait for ExternalDNS to create the DNS records, then verify:
# unkey-api
dig unkey-api-gossip.<region>.aws.unkey.cloud

# frontline (deploy regions only)
dig frontline-gossip.<region>.aws.unkey.cloud
If ExternalDNS hasn’t registered the friendly name yet, get the raw NLB hostname:
kubectl get svc -n api unkey-api-gossip-wan \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'

6c. Update the new region’s WAN seeds

Point the new region to the existing region(s). unkey-api — edit helm-chart/unkey-api/environments/<env>/<region>.yaml:
env:
  UNKEY_GOSSIP_WAN_SEEDS: "unkey-api-gossip.<existing-region>.aws.unkey.cloud"
frontline (deploy regions only) — edit helm-chart/frontline/environments/<env>/<region>.yaml:
gossip:
  wanSeeds: "frontline-gossip.<existing-region>.aws.unkey.cloud"

6d. Update existing regions to include the new region

Each existing region must add the new region as a seed. Seeds are comma-separated if there are multiple peer regions. Example — existing us-east-1 unkey-api config gets:
env:
  UNKEY_GOSSIP_WAN_SEEDS: "unkey-api-gossip.eu-central-1.aws.unkey.cloud,unkey-api-gossip.<new-region>.aws.unkey.cloud"
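The seed list grows by one entry per peer region. A small shell sketch of assembling it, with an illustrative region list and the hostname pattern from the examples above:

```shell
# Build the comma-separated WAN seed list for a set of peer regions.
peers="eu-central-1 ap-southeast-2"   # illustrative peer regions
seeds=""
for r in $peers; do
  seeds="${seeds:+$seeds,}unkey-api-gossip.$r.aws.unkey.cloud"
done
echo "$seeds"
```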

6e. Commit, push, and sync

git add helm-chart/
git commit -m "Wire gossip WAN seeds for <region>"
git push
ArgoCD will redeploy the affected services. Pods restart and join the WAN gossip ring.

6f. Verify gossip is healthy

# Check unkey-api gossip logs
kubectl logs -n api -l app.kubernetes.io/component=unkey-api --tail=50 | grep -i gossip

# Check the WAN NLB has a healthy target
aws elbv2 describe-target-health \
  --target-group-arn <TARGET_GROUP_ARN> \
  --region <region>

Step 7: Enable Global Accelerator (deploy regions only)

For regions running frontline with --with-deploy, the generated config already sets globalAccelerator.enabled: true and includes the listener ARN. After the frontline NLB is created:
  1. The GA resolver Helm hook job runs automatically
  2. It discovers the NLB ARN and creates an EndpointGroup CRD
  3. The ACK Global Accelerator controller reconciles and attaches the NLB to the Global Accelerator
Verify:

# EndpointGroup exists
kubectl get endpointgroups -n frontline
If the Global Accelerator doesn’t exist yet (first-time setup), create it first:
ENVIRONMENT=production001 ./scripts/setup-global-accelerator.sh

Quick Reference

Script                                  What it does
generate-region-config.sh               Generate all config files for a new region
promote                                 Update promotion files to deploy a revision via ArgoCD
promotion-changelists                   Generate a changelog of PRs between the old and new promotion revisions
replicate-secrets-to-new-region.sh      Add a new region to secrets replication (only needed for regions not already replicated)
setup-cluster.sh                        Full cluster bootstrap (IAM → EKS → nodes → ArgoCD)
setup-global-accelerator.sh             Create Global Accelerator (one-time)
setup-acm-certificate.sh                Create wildcard ACM cert for a region
validate-aws-resources.sh               Validate AWS resources exist
apply-addon-tolerations.sh              Patch EKS addon tolerations

Troubleshooting

CIDR not found

Error: No CIDR found for 'production001-xx-xxxx-1'
The region isn’t in the CIDR_MAP in generate-region-config.sh. Add it there and in networks.

Node groups not scheduling pods

All node groups use taints. Pods need matching tolerations. Check:
kubectl describe node <node-name> | grep Taints
kubectl get pods -A --field-selector=status.phase!=Running
kubectl describe pod <pending-pod> -n <namespace>  # look for "Insufficient" or "didn't match"
Common taints:
Node group       Taint
unkey            node-class=unkey:NoSchedule
untrusted        node-class=untrusted:NoSchedule
sentinel         node-class=sentinel:NoSchedule
observability    node-class=observability:NoSchedule
api              node-class=api:NoSchedule
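A pod targeting the unkey node group needs a matching toleration. A minimal sketch using standard Kubernetes pod-spec fields; the nodeSelector label is an assumption about how the node groups are labeled, not something this guide states:

```yaml
tolerations:
  - key: node-class
    operator: Equal
    value: unkey
    effect: NoSchedule
nodeSelector:
  node-class: unkey  # assumes node groups carry a node-class label matching their taint
```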

Gossip not joining

  1. DNS not resolving — ExternalDNS may not have registered yet. Check kubectl logs -n networking -l app.kubernetes.io/name=external-dns.
  2. NLB not ready — kubectl get svc -n api unkey-api-gossip-wan should show an external hostname.
  3. Security groups — WAN gossip uses port 7947 TCP+UDP. The NLB must allow inbound on this port.
  4. Secret mismatch — All regions in a gossip ring must share the same UNKEY_GOSSIP_SECRET_KEY (pulled from AWS Secrets Manager).

ExternalSecrets failing

kubectl get externalsecrets -A
kubectl describe externalsecret <name> -n <namespace>
Check that:
  • Secrets are replicated to this region (see Step 3)
  • Pod Identity association exists for the service account
  • The SecretStore references the correct region
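For the last point, the region lives on the SecretStore's AWS provider block. A hedged sketch of what to look for, with illustrative metadata and the schema following the External Secrets Operator API:

```yaml
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: aws-secretsmanager   # illustrative name
  namespace: api
spec:
  provider:
    aws:
      service: SecretsManager
      region: eu-west-2      # must be the cluster's own region, not us-east-1
```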

ArgoCD apps not syncing

argocd app list
argocd app get <app-name>
kubectl logs -n argocd -l app.kubernetes.io/name=argocd-application-controller --tail=100
Verify the ApplicationSet generator includes the new cluster/region.