Multi-Cluster Deployment Guide¶
This guide covers deploying NovaEdge across multiple Kubernetes clusters using a hub-spoke architecture.
Overview¶
NovaEdge supports multi-cluster deployments where:
- Hub Cluster: Runs the control plane (operator, controller, web UI)
- Spoke Clusters: Run only data plane agents that connect back to the hub
This architecture enables: - Centralized configuration management - Unified observability across all clusters - Cross-cluster load balancing and failover - Edge/remote location support
Architecture¶
Hub-Spoke Overview¶
flowchart TB
subgraph Hub["Hub Cluster (Management)"]
subgraph ControlPlane["Control Plane"]
OP["NovaEdge<br/>Operator"]
CTRL["NovaEdge<br/>Controller"]
WEBUI["Web UI<br/>Dashboard"]
end
subgraph HubConfig["Configuration"]
RC1[("RemoteCluster<br/>edge-west")]
RC2[("RemoteCluster<br/>edge-east")]
RC3[("RemoteCluster<br/>datacenter-1")]
end
HubAgents["Local Agents<br/>(DaemonSet)"]
end
subgraph Edge1["Edge Cluster 1 (us-west)"]
A1["Agents<br/>(DaemonSet)"]
V1{{"VIP<br/>10.1.0.100"}}
end
subgraph Edge2["Edge Cluster 2 (eu-west)"]
A2["Agents<br/>(DaemonSet)"]
V2{{"VIP<br/>10.2.0.100"}}
end
subgraph DC1["Datacenter 1 (on-prem)"]
A3["Agents<br/>(DaemonSet)"]
V3{{"VIP<br/>10.3.0.100"}}
end
CTRL <-->|"gRPC/mTLS"| A1
CTRL <-->|"gRPC/mTLS"| A2
CTRL <-->|"gRPC/mTLS"| A3
CTRL --> HubAgents
OP --> CTRL
OP --> HubAgents
RC1 --> CTRL
RC2 --> CTRL
RC3 --> CTRL
style Hub fill:#e6f3ff
style Edge1 fill:#f0fff0
style Edge2 fill:#fff0f0
style DC1 fill:#fff5e6
Connection Flow¶
sequenceDiagram
participant SA as Spoke Agent
participant HC as Hub Controller
participant RCR as RemoteCluster<br/>Registry
SA->>HC: Connect (gRPC/mTLS)
SA->>HC: StreamConfigRequest<br/>{clusterName, region, zone}
HC->>RCR: Register agent
HC->>SA: ConfigSnapshot (streaming)
loop Heartbeat (30s)
SA->>HC: ReportStatus {healthy, metrics}
HC->>RCR: Update connection status
end
SA--xHC: Disconnect
HC->>RCR: Mark cluster disconnected
Cross-Cluster Traffic Flow¶
flowchart LR
subgraph Client["Client Region: US"]
C((Client))
end
subgraph Edge1["Edge US-West"]
GW1["Gateway"]
RT1["Router"]
BE1["Backend"]
end
subgraph Edge2["Edge EU-West"]
GW2["Gateway"]
RT2["Router"]
BE2["Backend"]
end
C -->|"1. Request"| GW1
GW1 -->|"2. Route"| RT1
RT1 -->|"3a. Local<br/>(preferred)"| BE1
RT1 -.->|"3b. Cross-cluster<br/>(if needed)"| BE2
style Edge1 fill:#90EE90
style Edge2 fill:#FFE4B5
Prerequisites¶
- Kubernetes 1.29+ on all clusters
- Helm 3.0+
- Network connectivity from spoke clusters to hub (direct or tunnel)
- TLS certificates for secure communication
Deployment Steps¶
Step 1: Deploy Hub Cluster¶
First, deploy the full NovaEdge stack on the hub cluster:
# On the hub cluster
# Install the operator
helm install novaedge-operator ./charts/novaedge-operator \
--namespace novaedge-system \
--create-namespace
# Create NovaEdgeCluster with external access enabled
kubectl apply -f - <<EOF
apiVersion: novaedge.io/v1alpha1
kind: NovaEdgeCluster
metadata:
name: novaedge-hub
namespace: novaedge-system
spec:
version: "v0.1.0"
controller:
replicas: 3 # HA for production
leaderElection: true
grpcPort: 9090
agent:
hostNetwork: true
vip:
enabled: true
mode: L2
webUI:
enabled: true
service:
type: LoadBalancer # For external access
tls:
enabled: true
certManager:
enabled: true
issuerRef:
name: novaedge-ca-issuer
kind: ClusterIssuer
EOF
Step 2: Expose Controller for Remote Access¶
The controller needs to be accessible from spoke clusters:
# Option 1: LoadBalancer Service
apiVersion: v1
kind: Service
metadata:
name: novaedge-controller-external
namespace: novaedge-system
spec:
type: LoadBalancer
ports:
- name: grpc
port: 9090
targetPort: 9090
selector:
app.kubernetes.io/name: novaedge-controller
---
# Option 2: Ingress with TLS passthrough
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: novaedge-controller
namespace: novaedge-system
annotations:
nginx.ingress.kubernetes.io/ssl-passthrough: "true"
nginx.ingress.kubernetes.io/backend-protocol: "GRPCS"
spec:
ingressClassName: nginx
rules:
- host: novaedge-controller.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: novaedge-controller
port:
number: 9090
tls:
- hosts:
- novaedge-controller.example.com
secretName: novaedge-controller-tls
Step 3: Generate Agent Certificates¶
Generate mTLS certificates for agent authentication:
# Using cert-manager
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: novaedge-agent-cert
namespace: novaedge-system
spec:
secretName: novaedge-agent-cert
duration: 8760h # 1 year
renewBefore: 720h # 30 days
subject:
organizations:
- NovaEdge
commonName: novaedge-agent
usages:
- client auth
issuerRef:
name: novaedge-ca-issuer
kind: ClusterIssuer
Export certificates for spoke clusters:
# Export CA certificate
kubectl get secret novaedge-ca -n novaedge-system \
-o jsonpath='{.data.ca\.crt}' | base64 -d > ca.crt
# Export agent certificate and key
kubectl get secret novaedge-agent-cert -n novaedge-system \
-o jsonpath='{.data.tls\.crt}' | base64 -d > tls.crt
kubectl get secret novaedge-agent-cert -n novaedge-system \
-o jsonpath='{.data.tls\.key}' | base64 -d > tls.key
Step 4: Deploy Agents on Spoke Clusters¶
On each spoke cluster:
# Switch to spoke cluster context
kubectl config use-context spoke-cluster-1
# Create namespace
kubectl create namespace novaedge-system
# Create TLS secrets
kubectl create secret generic novaedge-ca \
--from-file=ca.crt=ca.crt \
-n novaedge-system
kubectl create secret tls novaedge-agent-cert \
--cert=tls.crt \
--key=tls.key \
-n novaedge-system
# Install agent chart
helm install novaedge-agent ./charts/novaedge-agent \
--namespace novaedge-system \
--set cluster.name=edge-west-1 \
--set cluster.region=us-west \
--set connection.controllerEndpoint=novaedge-controller.example.com:9090 \
--set tls.enabled=true \
--set tls.caSecretName=novaedge-ca \
--set tls.clientCertSecretName=novaedge-agent-cert \
--set vip.enabled=true \
--set vip.mode=L2
Step 5: Register Remote Cluster in Hub¶
Back on the hub cluster, create a NovaEdgeRemoteCluster resource:
apiVersion: novaedge.io/v1alpha1
kind: NovaEdgeRemoteCluster
metadata:
name: edge-west-1
namespace: novaedge-system
spec:
clusterName: edge-west-1
region: us-west
zone: us-west-2a
labels:
environment: production
tier: edge
connection:
mode: Direct
controllerEndpoint: novaedge-controller.novaedge-system.svc.cluster.local:9090
tls:
enabled: true
routing:
enabled: true
priority: 100
weight: 100
localPreference: true
allowCrossClusterTraffic: true
healthCheck:
enabled: true
interval: 30s
timeout: 10s
healthyThreshold: 2
unhealthyThreshold: 3
failoverEnabled: true
Step 6: Verify Deployment¶
Check the status of remote clusters:
# On hub cluster
kubectl get novaedgeremoteclusters -n novaedge-system
# Output:
# NAME CLUSTER REGION PHASE CONNECTED AGENTS AGE
# edge-west-1 edge-west-1 us-west Connected true 3 5m
# edge-east-1 edge-east-1 eu-west Connected true 2 3m
# Describe for details
kubectl describe novaedgeremotecluster edge-west-1 -n novaedge-system
# Check status with novactl
novactl status
Cross-Cluster Routing¶
Global Backend Definition¶
Create backends that span multiple clusters:
apiVersion: novaedge.io/v1alpha1
kind: ProxyBackend
metadata:
name: api-backend
namespace: default
spec:
# Endpoints are automatically discovered from all clusters
serviceRef:
name: api-service
port: 8080
lbPolicy: P2C # Power of Two Choices for optimal latency
# Multi-cluster settings
multiCluster:
enabled: true
# Prefer local endpoints
localityAware: true
# Weight by cluster
clusterWeights:
edge-west-1: 50
edge-east-1: 30
hub: 20
healthCheck:
interval: 10s
httpHealthCheck:
path: /health
Geo-Aware Routing¶
Route traffic based on client location:
apiVersion: novaedge.io/v1alpha1
kind: ProxyRoute
metadata:
name: geo-route
namespace: default
spec:
parentRefs:
- name: global-gateway
hostnames:
- api.example.com
rules:
# Route US traffic to US cluster
- matches:
- headers:
- name: X-Client-Region
value: us-*
type: RegularExpression
backendRef:
name: api-backend
clusterSelector:
matchLabels:
region: us-west
# Route EU traffic to EU cluster
- matches:
- headers:
- name: X-Client-Region
value: eu-*
type: RegularExpression
backendRef:
name: api-backend
clusterSelector:
matchLabels:
region: eu-west
# Default: nearest cluster
- backendRef:
name: api-backend
Failover Configuration¶
Configure automatic failover between clusters:
apiVersion: novaedge.io/v1alpha1
kind: NovaEdgeRemoteCluster
metadata:
name: edge-west-1
namespace: novaedge-system
spec:
clusterName: edge-west-1
region: us-west
connection:
mode: Direct
controllerEndpoint: novaedge-controller.example.com:9090
routing:
enabled: true
priority: 100 # Primary
weight: 100
healthCheck:
enabled: true
interval: 10s
timeout: 5s
unhealthyThreshold: 3
failoverEnabled: true # Enable automatic failover
---
apiVersion: novaedge.io/v1alpha1
kind: NovaEdgeRemoteCluster
metadata:
name: edge-west-2
namespace: novaedge-system
spec:
clusterName: edge-west-2
region: us-west
routing:
enabled: true
priority: 200 # Secondary (higher number = lower priority)
weight: 0 # Standby - no traffic unless primary fails
healthCheck:
enabled: true
failoverEnabled: true
Tunnel Mode for NAT/Firewall Traversal¶
For spoke clusters behind NAT or firewalls:
WireGuard Tunnel¶
# On hub: Deploy tunnel relay
apiVersion: apps/v1
kind: Deployment
metadata:
name: novaedge-tunnel-relay
namespace: novaedge-system
spec:
replicas: 2
selector:
matchLabels:
app: novaedge-tunnel-relay
template:
spec:
containers:
- name: wireguard
image: linuxserver/wireguard
# ... WireGuard configuration
---
# On spoke: Configure tunnel connection
# values.yaml for novaedge-agent chart
connection:
mode: Tunnel
controllerEndpoint: via-tunnel
tunnel:
type: WireGuard
relayEndpoint: tunnel.example.com:51820
wireGuard:
publicKey: "HUB_WIREGUARD_PUBLIC_KEY"
endpoint: "tunnel.example.com:51820"
allowedIPs:
- 10.0.0.0/8
persistentKeepalive: 25
Monitoring Multi-Cluster Deployments¶
Centralized Metrics¶
All clusters export metrics to the hub's Prometheus:
# On hub cluster
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
name: novaedge
namespace: novaedge-system
spec:
# Federate metrics from remote clusters
additionalScrapeConfigs:
name: remote-cluster-scrape-configs
key: prometheus.yaml
---
apiVersion: v1
kind: Secret
metadata:
name: remote-cluster-scrape-configs
namespace: novaedge-system
stringData:
prometheus.yaml: |
- job_name: 'novaedge-edge-west-1'
static_configs:
- targets: ['edge-west-1-metrics.example.com:9090']
tls_config:
ca_file: /etc/prometheus/certs/ca.crt
bearer_token_file: /etc/prometheus/tokens/edge-west-1
Web UI Dashboard¶
The NovaEdge web UI provides a unified view of all clusters:
# Access the web UI
kubectl port-forward -n novaedge-system svc/novaedge-webui 9080:9080
# Open http://localhost:9080
# Navigate to: Clusters > Overview
Best Practices¶
Security¶
- Always use mTLS for agent-controller communication
- Rotate certificates regularly (automate with cert-manager)
- Use network policies to restrict agent communication
- Audit remote cluster access via controller logs
High Availability¶
- Run 3+ controller replicas in the hub cluster
- Use anti-affinity to spread controllers across nodes/zones
- Configure health checks with appropriate thresholds
- Test failover regularly
Performance¶
- Enable locality-aware routing to prefer local backends
- Use appropriate LB algorithms (P2C or EWMA for latency-sensitive)
- Monitor latency between clusters
- Size agents appropriately based on traffic
Operations¶
- Use GitOps for configuration management
- Version control all cluster configurations
- Implement gradual rollouts for agent updates
- Monitor certificate expiration
Troubleshooting¶
Agent Connection Issues¶
# Check agent logs
kubectl logs -n novaedge-system -l app.kubernetes.io/name=novaedge-agent --tail=100
# Verify network connectivity
kubectl exec -n novaedge-system <agent-pod> -- \
nc -zv novaedge-controller.example.com 9090
# Check TLS certificates
kubectl exec -n novaedge-system <agent-pod> -- \
openssl s_client -connect novaedge-controller.example.com:9090 \
-CAfile /etc/novaedge/tls/ca.crt
Remote Cluster Not Connecting¶
# Check remote cluster status
kubectl get novaedgeremotecluster -n novaedge-system
# View detailed status
kubectl describe novaedgeremotecluster edge-west-1 -n novaedge-system
# Check controller logs for connection attempts
kubectl logs -n novaedge-system -l app.kubernetes.io/name=novaedge-controller \
--tail=100 | grep "edge-west-1"
Cross-Cluster Routing Not Working¶
# Verify backend endpoints
novactl describe backend api-backend
# Check routing configuration
novactl describe route api-route
# Test connectivity manually
kubectl exec -n novaedge-system <agent-pod> -- \
curl -v http://backend-service.remote-ns.svc.cluster.local:8080/health
Next Steps¶
- Operator Guide - Managing the hub cluster
- CRD Reference - NovaEdgeRemoteCluster details
- Helm Values Reference - Agent chart configuration
- Observability Guide - Monitoring multi-cluster deployments