Overview
KubeBolt is an open-source Kubernetes monitoring and management platform. Full cluster visibility in under 2 minutes — no agents, no Prometheus, no databases.
What is KubeBolt?
KubeBolt reads directly from the Kubernetes API Server and Metrics Server to give you a visual, intuitive dashboard with actionable insights. It's designed for development teams and small-to-medium engineering organizations that deploy on Kubernetes but don't have dedicated SRE teams.
What you get
- 23 resource views with live CPU/memory metrics from Metrics Server
- Interactive cluster topology map with Grid and Flow layouts
- 12-rule insights engine with actionable recommendations
- Multi-cluster support — all kubeconfig contexts auto-discovered
- Gateway API support — Gateways and HTTPRoutes via dynamic client
- Cluster actions — Pod terminal, file browser, port-forward, restart, scale, YAML edit
- AI Copilot — 16 tools querying live data, multi-provider (Claude, GPT, Ollama, vLLM)
- Built-in authentication — Admin/Editor/Viewer roles with JWT sessions
- Global search — ⌘K across all resource types
- RBAC-aware — adapts to your ServiceAccount permissions automatically
Performance
| Metric | Value |
|---|---|
| Backend memory usage | ~19 MB |
| Frontend bundle (gzipped) | ~148 KB |
| API response time | <5ms (from informer cache) |
| Startup time | <3s |
Compatibility
| Provider | Metrics Server | Notes |
|---|---|---|
| Amazon EKS | Pre-installed | Mount ~/.aws for credentials |
| Google GKE | Pre-installed | Works out of the box |
| Azure AKS | Pre-installed | Works out of the box |
| k3s / k0s | Pre-installed | Works out of the box |
| Docker Desktop | Manual install | Use docker-kubeconfig.sh helper |
| Minikube | Addon | minikube addons enable metrics-server |
Quick Start
Get KubeBolt running in under 2 minutes.
Fastest: Helm Chart
helm install kubebolt \
  oci://ghcr.io/clm-cloud-solutions/kubebolt/helm/kubebolt

kubectl port-forward svc/kubebolt 3000:80
Open http://localhost:3000 and log in with the default admin user.
The install seeds an admin user and prints the generated password to the server logs. Check with kubectl logs deployment/kubebolt-api | grep "Generated admin password". You can also set it explicitly: --set auth.adminPassword=YourPassword.
Docker Compose
# Remote clusters (EKS, GKE, AKS)
kubectl config use-context my-cluster
cd deploy && docker compose up -d

# Docker Desktop K8s (needs kubeconfig rewrite)
./deploy/docker-kubeconfig.sh
cd deploy && docker compose up -d
Frontend at http://localhost:3000. Nginx proxies /api and /ws to the Go backend.
Local Development
Requires Go 1.25+ and Node 20+.
# Backend (port 8080)
cd apps/api && go run cmd/server/main.go --kubeconfig ~/.kube/config

# Frontend (port 5173, proxies /api and /ws to backend)
cd apps/web && npm install && npm run dev
Installation Methods
Multiple ways to install KubeBolt, from Docker Compose to in-cluster Helm deployment.
Helm Chart (recommended for Kubernetes)
Production deployment inside your cluster. OCI-based chart on GitHub Container Registry. Also listed on Artifact Hub.
helm install kubebolt oci://ghcr.io/clm-cloud-solutions/kubebolt/helm/kubebolt

kubectl port-forward svc/kubebolt 3000:80

# With Ingress and custom admin password
helm install kubebolt \
  oci://ghcr.io/clm-cloud-solutions/kubebolt/helm/kubebolt \
  --set auth.adminPassword=YourSecurePassword \
  --set ingress.enabled=true \
  --set ingress.host=kubebolt.example.com
The chart creates a ClusterRole with full read permissions, a ServiceAccount, and the KubeBolt deployment. Configurable values include image tags, resource limits, Ingress, auth, and RBAC settings.
Docker Compose
Full stack with separate API and web containers. Multi-arch images (amd64/arm64).
# Remote clusters (EKS, GKE, AKS) kubectl config use-context my-cluster cd deploy && docker compose up -d # Docker Desktop K8s (needs kubeconfig rewrite) ./deploy/docker-kubeconfig.sh cd deploy && docker compose up -d
Frontend at http://localhost:3000. Admin password printed to logs (docker compose logs api).
For EKS, the compose file mounts ~/.aws for AWS credential access. Ensure your AWS session is active.
Single Binary (macOS, Linux, Windows)
One executable with embedded frontend. API + UI in a single process on one port. Download from GitHub Releases.
# macOS Apple Silicon
curl -LO https://github.com/clm-cloud-solutions/kubebolt/releases/latest/download/kubebolt-darwin-arm64
chmod +x kubebolt-darwin-arm64 && mv kubebolt-darwin-arm64 /usr/local/bin/kubebolt

# Linux amd64
curl -LO https://github.com/clm-cloud-solutions/kubebolt/releases/latest/download/kubebolt-linux-amd64
chmod +x kubebolt-linux-amd64 && sudo mv kubebolt-linux-amd64 /usr/local/bin/kubebolt

# Run (auto-detects kubeconfig; admin password printed to logs)
kubebolt --kubeconfig ~/.kube/config
Available for darwin-arm64, darwin-amd64, linux-arm64, linux-amd64, and windows-amd64. Verify with sha256sum -c CHECKSUMS.txt.
The binary reads a .env file from the current directory. Put KUBEBOLT_ADMIN_PASSWORD, KUBEBOLT_AI_API_KEY, etc. in a .env file next to the binary. System env vars and CLI flags take precedence.
Local Development
Requires Go 1.25+ and Node 20+.
# Backend (port 8080)
cd apps/api && go run cmd/server/main.go --kubeconfig ~/.kube/config

# Frontend (port 5173, proxies /api and /ws to backend)
cd apps/web && npm install && npm run dev
To build your own single binary:
make build-binary # Produces apps/api/kubebolt (embedded frontend)
Homebrew (macOS, Linux)
Install and update via Homebrew. Automatic version management via brew upgrade. Available for macOS and Linux (amd64 + arm64).
# Add the CLM tap (one-time setup)
brew tap clm-cloud-solutions/tap

# Install
brew install kubebolt

# Run
kubebolt --kubeconfig ~/.kube/config

# Update to the latest version
brew upgrade kubebolt
Docker (single container)
Single image with embedded frontend. No nginx, no compose. Multi-arch (amd64/arm64). Runs as non-root.
docker run -p 3000:3000 \
-v ~/.kube:/root/.kube:ro \
ghcr.io/clm-cloud-solutions/kubebolt:latest
The image is signed with Cosign. Verify with:
cosign verify ghcr.io/clm-cloud-solutions/kubebolt:latest \
--certificate-identity-regexp 'https://github.com/clm-cloud-solutions/kubebolt/.*' \
--certificate-oidc-issuer https://token.actions.githubusercontent.com
For EKS, mount ~/.aws and set AWS_PROFILE so the binary can obtain tokens. The image includes aws-cli.
kubectl Plugin (krew)
Run KubeBolt as a kubectl subcommand. Available via CLM's custom krew index.
# Prerequisite: install krew (one-time)
# https://krew.sigs.k8s.io/docs/user-guide/setup/install/

# Add the CLM custom index
kubectl krew index add clm https://github.com/clm-cloud-solutions/krew-index.git

# Install
kubectl krew install clm/kubebolt

# Run (uses current kubectl context)
kubectl kubebolt

# Update
kubectl krew upgrade clm/kubebolt
Coming Soon
The following distribution methods are planned but require additional infrastructure.
Install Script
One-command install that auto-detects OS and architecture. Requires a custom domain (kubebolt.dev) to host the script.
curl -fsSL https://get.kubebolt.dev | sh
Official krew-index
After upstream acceptance into kubernetes-sigs/krew-index, users will be able to install without adding the CLM custom index:
kubectl krew install kubebolt
Kubernetes Operator
Declarative lifecycle management via a KubeBolt CRD. Separate controller project with Kubebuilder, handles upgrades and self-healing.
kubectl apply -f https://get.kubebolt.dev/operator.yaml
Architecture
Go backend with in-memory caches and BoltDB for auth. React frontend with live WebSocket updates.
System Diagram
Kubernetes Cluster(s)
  API Server + Metrics Server
        │ kubeconfig (all contexts)
        ▼
KubeBolt Backend (Go)
├─ Cluster Manager    → multi-cluster lifecycle, async connection
├─ Shared Informers   → typed resources (client-go)
├─ Dynamic Client     → Gateway API CRDs (unstructured)
├─ Permission Probe   → 22 SelfSubjectAccessReview calls
├─ Metrics Collector  → 30s poll, in-memory cache
├─ Insights Engine    → 12 rules
├─ Auth Service       → JWT sessions, BoltDB user store, role enforcement
├─ REST API (Chi v5)  → resource lists, details, YAML, logs
├─ WebSocket Hub      → real-time broadcasts
└─ Copilot Proxy      → LLM tool-calling bridge
        │ REST + WebSocket
        ▼
KubeBolt Frontend (React 18 + TypeScript + Vite 5 + Tailwind)
├─ 23 Resource Views  → TanStack Table + Query
├─ Cluster Map        → React Flow 11
├─ AI Copilot         → multi-provider, 16 tools
└─ Theme              → dark/light via CSS custom properties (--kb-*)
Go Workspace
Monorepo with go.work containing three modules:
- apps/api — Main backend server (entry: cmd/server/main.go)
- packages/agent — Phase 2 lightweight node agent (stub)
- packages/shared — Shared Go utilities
Key Backend Packages
| Package | Purpose |
|---|---|
cluster/manager.go | Multi-cluster lifecycle, context switching, async initial connection |
cluster/connector.go | Shared informers + dynamic client, 20s cache sync timeout, 15s rest timeout |
cluster/permissions.go | RBAC probing via SSAR, cluster-wide then namespace fallback, semaphore of 10 |
cluster/nslister.go | Multi-namespace lister wrappers for namespace-scoped ServiceAccounts |
cluster/graph.go | In-memory topology graph with debounced rebuild (2s) |
cluster/relationships.go | Edge detection: ownerRefs, selectors, Gateway parentRefs, volumes |
metrics/collector.go | Metrics Server polling, per-namespace fallback, graceful degradation |
insights/engine.go | 12-rule evaluation engine |
auth/service.go | User management, JWT issue/verify, role enforcement, BoltDB persistence |
auth/middleware.go | Auth middleware with httpOnly cookie extraction and role-based route guards |
websocket/hub.go | Broadcast hub, 4096 buffer, silent drops when no clients |
api/router.go | Chi router with requireConnector middleware |
api/handlers.go | REST handlers with metrics injection, YAML, logs, deployment history |
Data Flow
- Manager reads kubeconfig contexts → async connection (HTTP server binds immediately, returns 503 until connected)
- Permission probe: 22 SSAR calls, cluster-wide then namespace fallback, ~2-5s
- Informers start only for permitted resources
- Namespace-scoped SAs → per-namespace informer factories with multi-lister aggregation
- Dynamic client discovers Gateway API CRDs (5s timeout, gracefully skipped)
- Metrics Collector polls every 30s → in-memory cache (per-namespace when cluster-wide denied)
- REST API serves enriched resources with metrics injection, paginated (50/page). 403 for restricted.
- WebSocket broadcasts resource changes with debounced topology rebuilds
Resource Views
23 resource types, each with search, filtering, pagination, and live metrics.
Supported Resources
| Category | Resources |
|---|---|
| Workloads | Pods, Deployments, StatefulSets, DaemonSets, Jobs, CronJobs, ReplicaSets |
| Traffic | Services, Ingresses, Gateways, HTTPRoutes, EndpointSlices |
| Storage | PersistentVolumeClaims, PersistentVolumes, StorageClasses |
| Config | ConfigMaps, Secrets (keys only, never values), HPAs |
| Cluster | Nodes, Namespaces, Events, Roles, ClusterRoles, RoleBindings, ClusterRoleBindings |
Resource Detail Views
Each resource has a tabbed detail page at /:type/:namespace/:name. Available tabs vary by resource type:
- Overview — Key fields, status, conditions, labels, annotations
- YAML — Syntax-highlighted definition with copy/download/edit. Secrets redacted.
- Pods — For workloads (Deployments, StatefulSets, DaemonSets, Jobs)
- Logs — Pod and workload logs with container selector, tail lines (100/500/1000), 10s auto-refresh
- Containers — Container specs, env vars, ports, mounts (Pods only)
- Volumes — Volume mounts and claims (Pods only)
- Related — Parent and child resources from topology edges
- History — Revision history via ReplicaSets (Deployments) or ControllerRevisions (StatefulSets, DaemonSets)
- Events — Events filtered to the specific resource
- Monitor — SVG donut gauges for CPU/memory from Metrics Server
- Files — Exec-based file browser with directory navigation and content viewer
- Terminal — WebSocket-to-SPDY exec bridge with xterm.js
Insights Engine
12 built-in rules that detect common Kubernetes issues and provide actionable recommendations.
Rule Definitions
| Rule | Severity | Condition |
|---|---|---|
| crash-loop | Critical | Pod in CrashLoopBackOff with restarts >3/hour |
| oom-killed | Critical | Container terminated with OOMKilled (exit code 137) |
| zero-replicas | Critical | Deployment with 0 available replicas |
| node-not-ready | Critical | Node condition Ready ≠ True |
| image-pull-backoff | Critical | Pod in ImagePullBackOff state |
| cpu-throttle-risk | Warning | CPU usage >80% of limit sustained |
| memory-pressure | Warning | Memory usage >85% of limit |
| hpa-maxed-out | Warning | HPA current replicas == max replicas |
| pvc-pending | Warning | PVC in Pending state for >5 minutes |
| frequent-restarts | Warning | Pod with >5 restarts in 24 hours (non crash-loop) |
| resource-underrequest | Info | Requests <40% of actual usage |
| evicted-pods | Info | Pods evicted from node due to pressure |
Each insight includes the affected resource, a human-readable message, and a specific suggestion with remediation steps or kubectl commands.
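The shape of such a rule engine can be sketched in Go. The struct fields, rule set, and messages below are illustrative stand-ins; KubeBolt's real engine evaluates its 12 rules against live informer data.

```go
package main

import "fmt"

// PodStatus is a simplified stand-in for the live pod data a rule sees.
type PodStatus struct {
	Name          string
	Reason        string
	RestartsPerHr int
}

type Insight struct {
	Rule     string
	Severity string
	Message  string
}

// Rule pairs a predicate with a message builder, mirroring the
// condition/suggestion split in the table above.
type Rule struct {
	Name     string
	Severity string
	Match    func(PodStatus) bool
	Message  func(PodStatus) string
}

var rules = []Rule{
	{
		Name: "crash-loop", Severity: "Critical",
		Match:   func(p PodStatus) bool { return p.Reason == "CrashLoopBackOff" && p.RestartsPerHr > 3 },
		Message: func(p PodStatus) string { return fmt.Sprintf("%s is crash-looping (%d restarts/hr)", p.Name, p.RestartsPerHr) },
	},
	{
		Name: "image-pull-backoff", Severity: "Critical",
		Match:   func(p PodStatus) bool { return p.Reason == "ImagePullBackOff" },
		Message: func(p PodStatus) string { return p.Name + " cannot pull its image" },
	},
}

func evaluate(pods []PodStatus) []Insight {
	var out []Insight
	for _, p := range pods {
		for _, r := range rules {
			if r.Match(p) {
				out = append(out, Insight{r.Name, r.Severity, r.Message(p)})
			}
		}
	}
	return out
}

func main() {
	pods := []PodStatus{
		{Name: "api-7f9", Reason: "CrashLoopBackOff", RestartsPerHr: 6},
		{Name: "web-2c1"},
	}
	for _, ins := range evaluate(pods) {
		fmt.Printf("[%s] %s: %s\n", ins.Severity, ins.Rule, ins.Message)
	}
}
```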
Cluster Map
Interactive topology visualization of resource relationships.
Layout Modes
- Grid — Compact grid of resources within namespace regions. Best for overview.
- Flow — Horizontal dependency chain: Ingress/Gateway → HTTPRoute → Service → Deployment → ReplicaSet → Pod. Best for understanding traffic flow.
In both modes, namespace regions are arranged in a grid of up to 3 columns. Regions are ReactFlow group nodes with child resource nodes.
Relationship Detection
| Relationship | Detection Method | Edge Type |
|---|---|---|
| Deployment → ReplicaSet → Pod | ownerReferences chain | owns |
| Service → Pods | Label selector matching | selects |
| Ingress → Service | spec.rules.http.paths.backend | routes |
| Gateway → HTTPRoute | HTTPRoute spec.parentRefs | routes |
| HTTPRoute → Service | spec.rules.backendRefs | routes |
| HPA → Deployment | spec.scaleTargetRef | hpa |
| Pod → PVC | spec.volumes.persistentVolumeClaim | mounts |
| PVC → PV | spec.volumeName | bound |
| Pod → ConfigMap/Secret | volumes + envFrom | mounts / envFrom |
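The "selects" edge (Service → Pods) boils down to equality-based label-selector matching. A minimal sketch, with invented data — the real detection runs over informer caches in cluster/relationships.go:

```go
package main

import "fmt"

// selectorMatches reports whether every selector key/value pair is
// present in the pod's labels (equality-based matching only; set-based
// matchExpressions are out of scope for this sketch).
func selectorMatches(selector, labels map[string]string) bool {
	for k, v := range selector {
		if labels[k] != v {
			return false
		}
	}
	return len(selector) > 0 // treat an empty selector as selecting nothing
}

func main() {
	svcSelector := map[string]string{"app": "web"}
	pods := map[string]map[string]string{
		"web-1": {"app": "web", "pod-template-hash": "abc"},
		"db-1":  {"app": "db"},
	}
	for name, labels := range pods {
		if selectorMatches(svcSelector, labels) {
			fmt.Printf("edge: service/web -> pod/%s (selects)\n", name)
		}
	}
}
```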
AI Copilot
Talk to your cluster in natural language. Press ⌘J to open.
How it Works
The copilot sends your question to a configured LLM provider along with tool definitions that map to KubeBolt's REST API. The LLM calls tools to fetch live cluster data, analyzes the results, and responds with data-backed answers and kubectl commands.
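The bridge between LLM tool calls and live data is essentially a dispatch table from tool name to handler. A sketch with canned handlers — the tool names echo the real ones, but the argument shapes and return payloads here are invented:

```go
package main

import "fmt"

// toolCall is what the LLM returns: a tool name plus string arguments.
type toolCall struct {
	Name string
	Args map[string]string
}

// tools maps a tool name to a handler; real handlers would hit the
// REST API / informer caches and return live JSON.
var tools = map[string]func(map[string]string) string{
	"get_insights": func(_ map[string]string) string {
		return `[{"rule":"crash-loop","severity":"Critical"}]`
	},
	"list_resources": func(a map[string]string) string {
		return fmt.Sprintf(`{"type":%q,"items":[]}`, a["type"])
	},
}

// dispatch runs one tool call; unknown tools return an error payload
// that is fed back to the LLM instead of crashing the loop.
func dispatch(c toolCall) string {
	fn, ok := tools[c.Name]
	if !ok {
		return `{"error":"unknown tool"}`
	}
	return fn(c.Args)
}

func main() {
	fmt.Println(dispatch(toolCall{Name: "get_insights"}))
	fmt.Println(dispatch(toolCall{Name: "list_resources", Args: map[string]string{"type": "pods"}}))
}
```

The loop repeats — tool call, result, next tool call — until the model produces a final text answer.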
16 Native Tools
| Tool | What it does |
|---|---|
get_cluster_overview | Resource counts, CPU/memory, health score, events |
list_resources | List any of 23 resource types with filtering |
get_resource_detail | Full detail of a specific resource |
get_resource_yaml | Raw YAML definition (secrets redacted) |
get_resource_describe | kubectl-describe output for deep troubleshooting |
get_pod_logs | Pod logs with container, tail, since and grep options |
get_workload_pods | Pods owned by a workload controller |
get_workload_history | Revision history for Deployments/StatefulSets/DaemonSets |
get_cronjob_jobs | Job children of a CronJob to investigate execution history |
get_topology | Full cluster topology graph |
get_insights | Active insights with severity |
get_events | Events with filtering |
search_resources | Global search by name across 16 resource types |
get_permissions | Detected RBAC permissions |
list_clusters | All available kubeconfig contexts |
get_kubebolt_docs | Product knowledge base (features, navigation, admin pages) |
Supported Providers
- Anthropic — Claude Sonnet 4.6, Claude Opus 4.6/4.7, Claude Haiku 4.5. Prompt caching on system prompt + tool definitions.
- OpenAI — GPT-5, GPT-5 Mini, GPT-4o, GPT-4o Mini. Automatic prompt caching, max_completion_tokens for reasoning models.
- Self-hosted — Ollama, vLLM, Groq, DeepSeek (OpenAI-compatible).
Contextual "Ask Copilot"
Launches the Copilot panel with a pre-loaded prompt that already carries the cluster, namespace, resource name and symptom. Three surfaces:
- Insights — "Diagnose this insight and recommend a fix"
- Resource Detail (Pods, Deployments, StatefulSets, Services, Nodes) — "Investigate this resource"
- Events — button on every Warning row, "Explain this Kubernetes Warning event"
Templates live in services/copilot/triggers.ts and are versioned so the LLM never sees stale framing.
Conversation memory
Long sessions stop bleeding context. When the estimated conversation size crosses SESSION_BUDGET_TOKENS × AUTO_COMPACT_THRESHOLD (default 80%), the handler folds older turns into a summary generated by the provider's cheap-tier model (Haiku 4.5 / gpt-4o-mini) and stubs bulky tool_results in the preserved tail. The active turn's tool_results are always protected so mid-flight compacts never truncate a response.
A Scissors icon in the panel header exposes the same primitive on demand — "new session with summary" collapses the whole transcript into a single summary message so you can pivot topics without losing context.
| Env var | Default | Purpose |
|---|---|---|
KUBEBOLT_AI_AUTO_COMPACT | true | Master switch for auto-compaction |
KUBEBOLT_AI_SESSION_BUDGET_TOKENS | context window of the model | Total ceiling. Trigger fires at budget × threshold |
KUBEBOLT_AI_AUTO_COMPACT_THRESHOLD | 0.80 | Fraction of the budget at which compact fires |
KUBEBOLT_AI_COMPACT_MODEL | auto (cheap tier of same provider) | Override the summarisation model |
KUBEBOLT_AI_COMPACT_PRESERVE_TURNS | 3 | Turns kept intact after compaction |
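The trigger condition from the table above is simple arithmetic: compaction fires once the estimated conversation size crosses budget × threshold. A sketch (the function name and the way tokens are estimated are illustrative):

```go
package main

import "fmt"

// shouldCompact reports whether auto-compaction fires: the estimated
// conversation tokens have reached budget × threshold.
func shouldCompact(estimatedTokens, budgetTokens int, threshold float64) bool {
	return float64(estimatedTokens) >= float64(budgetTokens)*threshold
}

func main() {
	budget := 200000 // e.g. the model's context window
	// With the default threshold of 0.80, the trigger point is 160000.
	fmt.Println(shouldCompact(150000, budget, 0.80))
	fmt.Println(shouldCompact(165000, budget, 0.80))
}
```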
Scope guardrail
The system prompt defines in-scope (Kubernetes operations, DevOps/SRE topics that support the user's cluster, KubeBolt itself) and out-of-scope (general coding unrelated to cluster resources, non-technical topics, competitor cloud products). The LLM refuses out-of-scope questions with a one-sentence polite redirect in the user's language — never answers partially.
Admin Copilot Usage
At /admin/copilot-usage: sessions, tokens billed, cache hit rate, estimated USD cost (best-effort pricing table for Anthropic and OpenAI), top tools with error rates, and a per-session drill-down modal with tool breakdown and compact events. Range selector 24h / 7d / 30d. Stored locally in BoltDB with a 30-day / 5000-entry retention cap. Requires authentication to be enabled.
RBAC & Permissions
KubeBolt auto-detects your kubeconfig's permissions and adapts automatically.
Permission Detection
- Uses the SelfSubjectAccessReview API to test the list verb for 22 resource types
- Two-phase: cluster-wide first, then namespace-level fallback for RoleBinding-based access
- Concurrent execution (semaphore of 10), completes in ~2-5 seconds
- If the SSAR API itself is unavailable, falls back to assuming full access
Access Levels
| Level | Backend | Frontend |
|---|---|---|
| Cluster-admin | All informers start normally | Full UI, no restrictions |
| Cluster read-only | Informers for permitted resources only | Restricted items dimmed, "Limited access" banner |
| Namespace-scoped | Per-namespace informer factories with multi-lister aggregation | Resources scoped to permitted namespaces |
Frontend Behavior
- "Limited access — showing X of Y resource types" banner
- Sidebar items dimmed with shield icon for restricted resources
- Summary cards show "No access" instead of "0"
- "No access to Nodes — capacity data unavailable" for node restrictions
- PermissionDenied component for 403 resource pages
API Endpoint
GET /api/v1/cluster/permissions returns the full permission map per resource type with canList, canWatch, canGet, namespaceScoped, and namespaces fields.
Authentication
Built-in username/password authentication with role-based access control. No external identity provider required.
Overview
KubeBolt ships with a built-in auth system that supports three roles: Admin, Editor, and Viewer. Auth is enabled by default and uses BoltDB for user storage — no external database needed. Sessions are managed via JWT tokens stored in httpOnly cookies.
An admin user is seeded automatically on first startup. The generated password is printed to the server logs. Change it immediately after first login.
Roles
KubeBolt enforces three roles with increasing levels of access:
| Action | Viewer | Editor | Admin |
|---|---|---|---|
| View resources, metrics, topology, insights | Yes | Yes | Yes |
| View pod logs | Yes | Yes | Yes |
| Use AI Copilot (read-only tools) | Yes | Yes | Yes |
| Pod terminal (exec) | No | Yes | Yes |
| Edit YAML / Apply changes | No | Yes | Yes |
| Restart / Scale workloads | No | Yes | Yes |
| Delete resources | No | Yes | Yes |
| Port forwarding | No | Yes | Yes |
| Switch clusters | No | Yes | Yes |
| Manage users (create, edit, delete) | No | No | Yes |
| Change auth settings | No | No | Yes |
Session Management
- JWT tokens issued on login, stored in httpOnly secure cookies (not localStorage)
- Token expiry is configurable (default: 24 hours)
- Tokens are validated on every API request via auth middleware
- Logout invalidates the cookie client-side
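The issue/verify cycle behind those sessions can be sketched with a minimal HS256 JWT built from the Go standard library. This is an illustration of the mechanism, not KubeBolt's auth/service.go; the claim names and demo secret are invented (the real secret is auto-generated or taken from KUBEBOLT_JWT_SECRET):

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"fmt"
	"strings"
	"time"
)

func b64(b []byte) string { return base64.RawURLEncoding.EncodeToString(b) }

// sign builds header.payload.signature with an HMAC-SHA256 signature
// over the base64url-encoded signing input.
func sign(secret []byte, claims map[string]any) (string, error) {
	header := b64([]byte(`{"alg":"HS256","typ":"JWT"}`))
	payload, err := json.Marshal(claims)
	if err != nil {
		return "", err
	}
	signingInput := header + "." + b64(payload)
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(signingInput))
	return signingInput + "." + b64(mac.Sum(nil)), nil
}

// verify recomputes the signature and compares in constant time.
// (A real verifier would also check the exp claim and the alg header.)
func verify(secret []byte, token string) bool {
	i := strings.LastIndex(token, ".")
	if i < 0 {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(token[:i]))
	return hmac.Equal([]byte(b64(mac.Sum(nil))), []byte(token[i+1:]))
}

func main() {
	secret := []byte("demo-secret")
	tok, _ := sign(secret, map[string]any{
		"sub":  "admin",
		"role": "Admin",
		"exp":  time.Now().Add(24 * time.Hour).Unix(),
	})
	fmt.Println("valid:", verify(secret, tok))
	fmt.Println("tampered key:", verify([]byte("wrong"), tok))
}
```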
Storage
User accounts are stored in a local BoltDB file. By default, the database is written to ./data/kubebolt.db. Use the KUBEBOLT_DATA_DIR environment variable to customize the storage path. In Kubernetes deployments, mount a PersistentVolume to this path for durability.
Environment Variables
| Variable | Default | Description |
|---|---|---|
KUBEBOLT_AUTH_ENABLED | true | Enable or disable authentication. Set to false to allow anonymous access. |
KUBEBOLT_ADMIN_PASSWORD | auto-generated | Override the default admin password on first boot. Ignored if admin user already exists. |
KUBEBOLT_JWT_SECRET | auto-generated | Secret key for signing JWT tokens. Auto-generated and persisted in BoltDB if not set. |
KUBEBOLT_DATA_DIR | ./data | Directory for BoltDB storage file. |
Helm Configuration
When deploying via Helm, configure auth through values:
# values.yaml
auth:
  enabled: true
  adminPassword: "my-secure-password"

# Or use an existing Kubernetes secret
auth:
  enabled: true
  existingSecret: "kubebolt-auth-secret"
  # Secret must contain keys: admin-password, jwt-secret
# Install with inline password
helm install kubebolt \
  oci://ghcr.io/clm-cloud-solutions/kubebolt/helm/kubebolt \
  --set auth.adminPassword="my-secure-password"

# Install with existing secret
helm install kubebolt \
  oci://ghcr.io/clm-cloud-solutions/kubebolt/helm/kubebolt \
  --set auth.existingSecret=kubebolt-auth-secret
Disabling Authentication
To run KubeBolt without authentication (e.g., behind a VPN or for local development):
# Local development
KUBEBOLT_AUTH_ENABLED=false go run cmd/server/main.go --kubeconfig ~/.kube/config

# Docker Compose (set in deploy/.env)
KUBEBOLT_AUTH_ENABLED=false

# Helm
helm install kubebolt \
  oci://ghcr.io/clm-cloud-solutions/kubebolt/helm/kubebolt \
  --set auth.enabled=false
Metrics Server
KubeBolt uses the Kubernetes Metrics Server for CPU/memory data.
How it Works
The Metrics Collector polls metrics.k8s.io/v1beta1 (PodMetrics and NodeMetrics) every 30 seconds. Results are stored in an in-memory cache — no database required.
Graceful Degradation
If Metrics Server is not installed, KubeBolt shows all resource state and events normally. CPU/memory bars display a message with a one-click install command. The Collector distinguishes between "not installed" and "403 Forbidden" via apierrors.IsForbidden().
Namespace-scoped Metrics
When cluster-wide metrics access is denied (namespace-scoped ServiceAccounts), the Collector falls back to per-namespace polling: PodMetricses(ns).List() for each accessible namespace.
Multi-Cluster
All kubeconfig contexts auto-discovered. Switch clusters in one click.
How it Works
The Cluster Manager reads all contexts from the kubeconfig file at startup. The initial connection targets the current-context. You can switch clusters at runtime via the API or the cluster selector in the UI.
Switching
When switching clusters, the manager tears down the old connector (informers, metrics collector, insights engine) and creates a new one for the target context. The permission probe runs again for the new cluster. The frontend shows a "Connecting to cluster" overlay during the switch.
API
GET /api/v1/clusters → list all contexts
POST /api/v1/clusters/switch → { "context": "production-eks" }
Notifications
Alert delivery to Slack, Discord and email when insights fire. Optional — disabled when no channel is configured.
Channels
Each channel activates when its connection details are set. You can enable one, two or all three at once.
- Slack — Block Kit messages with severity colour bar and a button that deep-links back to the resource in KubeBolt.
- Discord — Embed messages with equivalent formatting.
- Email (SMTP) — Plain/HTML email with three delivery modes: instant, hourly digest, or daily digest. Supports multiple recipients.
Global settings
Cross-channel knobs that apply to every notifier:
| Env var | Default | Purpose |
|---|---|---|
KUBEBOLT_NOTIFICATIONS_ENABLED | true | Master kill switch — set false for maintenance windows. |
KUBEBOLT_NOTIFICATIONS_MIN_SEVERITY | warning | Insights below this threshold don't notify. Values: critical, warning, info. |
KUBEBOLT_NOTIFICATIONS_COOLDOWN | 1h | Dedup window — the same insight fires at most once per cooldown. |
KUBEBOLT_NOTIFICATIONS_BASE_URL | — | Public URL of your KubeBolt — embedded as a deep link in messages. |
KUBEBOLT_NOTIFICATIONS_INCLUDE_RESOLVED | false | Also notify when an insight transitions to resolved. |
Slack
# Create an Incoming Webhook in Slack, then:
KUBEBOLT_SLACK_WEBHOOK_URL=https://hooks.slack.com/services/T.../B.../...
Discord
# Server Settings → Integrations → Webhooks → New Webhook, then:
KUBEBOLT_DISCORD_WEBHOOK_URL=https://discord.com/api/webhooks/...
Email (SMTP)
KUBEBOLT_SMTP_HOST=smtp.example.com
KUBEBOLT_SMTP_PORT=587
KUBEBOLT_SMTP_USERNAME=alerts@example.com
KUBEBOLT_SMTP_PASSWORD=...
KUBEBOLT_SMTP_FROM=KubeBolt <alerts@example.com>
KUBEBOLT_SMTP_TO=sre@example.com,oncall@example.com
KUBEBOLT_SMTP_DIGEST_MODE=hourly # instant | hourly | daily
The /admin/notifications page shows which channels are configured, surfaces the current global settings, and lets you send a test notification to each channel without restarting. Visible to authenticated users with the Admin role only.
REST API Reference
All endpoints under /api/v1.
Cluster
| Method | Endpoint | Description |
|---|---|---|
| GET | /clusters | List all kubeconfig contexts |
| POST | /clusters/switch | Switch active cluster |
| GET | /cluster/overview | Full cluster summary with counts, CPU/memory, health, events, workloads |
| GET | /cluster/permissions | Probed RBAC permissions per resource type |
Resources
| Method | Endpoint | Description |
|---|---|---|
| GET | /resources/:type | List with pagination (?limit=50), filtering (?namespace=, ?search=, ?status=) |
| GET | /resources/:type/:ns/:name | Detail with metrics injection |
| GET | /resources/:type/:ns/:name/yaml | Raw YAML (secrets redacted, managedFields stripped) |
| GET | /resources/pods/:ns/:name/logs | Pod logs (?container=, ?tailLines=100) |
| GET | /resources/:workload/:ns/:name/pods | Pods owned by deployment/statefulset/daemonset/job |
| GET | /resources/deployments/:ns/:name/history | Revision history via ReplicaSets |
Other
| Method | Endpoint | Description |
|---|---|---|
| GET | /topology | Full topology graph (nodes + edges) |
| GET | /insights | Active insights (?severity=critical,warning) |
| GET | /events | Events (?type=Warning, ?involvedName=, ?involvedKind=) |
| WS | /ws | WebSocket for real-time updates |
WebSocket Events
Real-time updates via /api/v1/ws.
Event Types
| Type | Description |
|---|---|
resource:updated | Kubernetes resource changed |
resource:deleted | Kubernetes resource removed |
event:new | New Kubernetes event |
insight:new | New insight detected |
insight:resolved | Insight resolved |
metrics:refresh | Metrics cache updated (every 30s) |
cluster:switched | Active cluster changed |
Broadcast buffer: 4096 messages. Messages dropped silently when no clients are connected to avoid log spam during cluster switches.
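The non-blocking drop behavior is a standard Go pattern: publish into a bounded channel with a select/default. A sketch (buffer shrunk and a drop counter added for the demo; the real hub drops silently):

```go
package main

import "fmt"

// hub broadcasts messages through a bounded buffer; when the buffer is
// full, Publish drops the message instead of blocking the informer
// callbacks that produce events.
type hub struct {
	broadcast chan string
	dropped   int
}

func newHub(size int) *hub { return &hub{broadcast: make(chan string, size)} }

// Publish never blocks: a full buffer means the message is dropped.
func (h *hub) Publish(msg string) {
	select {
	case h.broadcast <- msg:
	default:
		h.dropped++ // counted here for illustration only
	}
}

func main() {
	h := newHub(2)
	for i := 0; i < 5; i++ {
		h.Publish(fmt.Sprintf("resource:updated #%d", i))
	}
	fmt.Println("buffered:", len(h.broadcast), "dropped:", h.dropped)
}
```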
Amazon EKS Guide
Deploy KubeBolt on Amazon EKS clusters.
Authentication
- IRSA (IAM Roles for Service Accounts) — recommended for in-cluster deployment
- Pod Identity — newer EKS authentication method, also supported
- AWS CLI — for local/Docker deployment, mount
~/.awswith active SSO session
Docker Compose
The compose file already mounts ~/.aws for EKS token generation. Ensure your AWS profile/SSO session is active:
kubectl config use-context my-eks-cluster
cd deploy && docker compose up -d
Helm with ALB Ingress
helm install kubebolt \
oci://ghcr.io/clm-cloud-solutions/kubebolt/helm/kubebolt \
--set ingress.enabled=true \
--set ingress.className=alb \
--set ingress.host=kubebolt.example.com
EKS Fargate is compatible — KubeBolt doesn't require DaemonSets in Phase 1.
Google GKE Guide
Deploy KubeBolt on Google Kubernetes Engine.
Authentication
- Workload Identity — recommended for in-cluster deployments
- GKE comes with Metrics Server pre-installed
Helm with GCE Ingress
helm install kubebolt \
oci://ghcr.io/clm-cloud-solutions/kubebolt/helm/kubebolt \
--set ingress.enabled=true \
--set ingress.className=gce
GKE Autopilot clusters are compatible — no privileged containers needed.
Azure AKS Guide
Deploy KubeBolt on Azure Kubernetes Service.
Authentication
- Azure AD Workload Identity — recommended for in-cluster deployments
- Azure RBAC integration via kubeconfig
Helm with AGIC
helm install kubebolt \
oci://ghcr.io/clm-cloud-solutions/kubebolt/helm/kubebolt \
--set ingress.enabled=true \
--set ingress.className=azure-application-gateway
Docker Desktop Guide
Run KubeBolt with Docker Desktop's built-in Kubernetes.
The Problem
Docker Desktop K8s uses 127.0.0.1:6443 as the API server address. This works from your host machine, but not from inside a container (which has its own localhost).
The Solution
Use the helper script to rewrite the kubeconfig to use kubernetes.docker.internal instead:
# 1. Enable Kubernetes in Docker Desktop → Settings → Kubernetes → Enable
# 2. Switch context
kubectl config use-context docker-desktop

# 3. Generate container-compatible kubeconfig
./deploy/docker-kubeconfig.sh

# 4. Start
cd deploy && docker compose up -d
Metrics Server
Docker Desktop doesn't include Metrics Server. Install it manually:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
Roadmap
Where KubeBolt is heading.
Phase 1.0 — Core Platform ✅
23 resource views, multi-cluster, Gateway API, insights engine, cluster map, RBAC detection, Docker Compose.
Phase 1.3 — Terminal & Actions ✅
Pod terminal (xterm.js), port-forwarding, restart/scale, describe, YAML editing (CodeMirror 6), delete, global search (⌘K).
Phase 1.4 — File Browser & History ✅
Exec-based file browser, StatefulSet/DaemonSet history via ControllerRevisions, CronJob child jobs, YAML export.
Phase 1.5 — Distribution ✅
Helm chart (OCI), multi-arch images (amd64/arm64), GitHub releases, Artifact Hub, cloud guides.
Phase 1.6 — Animated Map & Notifications ✅
Draggable map nodes with animated pulse halos, cluster management UI, Slack/Discord/email notifications with digest modes, global settings (master toggle, base URL, resolved-insight alerts).
Phase 1.7 — Authentication ✅
Built-in username/password auth with Admin/Editor/Viewer roles, user management UI, JWT sessions, BoltDB storage.
Phase 1.8 — AI Copilot ✅
In-app AI assistant with 16 cluster tools, multi-provider (Claude, GPT, Ollama, vLLM), BYO API key, SSE streaming, fallback model, structured logging with per-session token accounting.
Phase 1.9 — Extended Distribution ✅
Single binary with embedded frontend, Homebrew tap, Docker single-container, kubectl plugin (krew).
Phase 1.5.x — Copilot maturity ✅
Contextual "Ask Copilot" buttons across insights, resource detail pages and warning events. Conversation memory with auto-compact at 80% of the budget (via cheap-tier model) and manual "new session with summary". Admin usage analytics page with cost estimates (BoltDB-backed, 30-day retention). Product knowledge base tool (get_kubebolt_docs) and scope guardrail. Prompt caching for Anthropic and OpenAI.
Phase 2.0 — OTel-native agent
Lightweight DaemonSet agent as an optimized distribution of the OpenTelemetry Collector. <1% CPU, <50MB RAM per node, <40MB binary. Exports in parallel to the customer's OTel backend — citizen of the ecosystem, not a silo.
Phase 2.1 — Hierarchical AI agents
Six specialized layers with different autonomy and compute budgets: deterministic detectors → router (Haiku) → investigator (Sonnet) → planner → deterministic executor → postmortem. 70% resolved without AI, 25% with cheap AI, 5% with expensive AI.
Phase 3.0 — SaaS Platform
Multi-tenant platform with OAuth/SSO, team collaboration, custom dashboards, billing.
Contributing
KubeBolt is open source under the MIT license. Contributions are welcome.
Development Setup
Requires Go 1.25+ and Node 20+.
# Clone
git clone https://github.com/clm-cloud-solutions/kubebolt.git
cd kubebolt

# Backend
cd apps/api && go run cmd/server/main.go --kubeconfig ~/.kube/config

# Frontend (separate terminal)
cd apps/web && npm install && npm run dev
CI
GitHub Actions on push/PR to main:
- Backend: go build ./... (Go 1.25, ubuntu-latest)
- Frontend: npm ci && npm run build (Node 20, ubuntu-latest)
Repository Structure
kubebolt/
├── apps/api/          # Go backend
├── apps/web/          # React frontend
├── packages/agent/    # Phase 2 node agent (stub)
├── packages/shared/   # Shared Go utilities
├── deploy/            # Docker Compose + Helm + scripts
├── docs/              # SPEC.md + images
├── go.work            # Go workspace
├── CLAUDE.md          # Claude Code context
└── README.md