
Overview

KubeBolt is an open-source Kubernetes monitoring and management platform. Full cluster visibility in under 2 minutes — no agents, no Prometheus, no databases.

What is KubeBolt?

KubeBolt reads directly from the Kubernetes API Server and Metrics Server to give you a visual, intuitive dashboard with actionable insights. It's designed for development teams and small-to-medium engineering organizations that deploy on Kubernetes but don't have dedicated SRE teams.

Core philosophy: Connect. See. Fix. — Only a kubeconfig is needed. No YAML to write, no CRDs to install, no sidecars to inject.

What you get

  • 23 resource views with live CPU/memory metrics from Metrics Server
  • Interactive cluster topology map with Grid and Flow layouts
  • 12-rule insights engine with actionable recommendations
  • Multi-cluster support — all kubeconfig contexts auto-discovered
  • Gateway API support — Gateways and HTTPRoutes via dynamic client
  • Cluster actions — Pod terminal, file browser, port-forward, restart, scale, YAML edit
  • AI Copilot — 16 tools querying live data, multi-provider (Claude, GPT, Ollama, vLLM)
  • Built-in authentication — Admin/Editor/Viewer roles with JWT sessions
  • Global search — ⌘K across all resource types
  • RBAC-aware — adapts to your ServiceAccount permissions automatically

Performance

  • Backend memory usage — ~19 MB
  • Frontend bundle (gzipped) — ~148 KB
  • API response time — <5ms (from informer cache)
  • Startup time — <3s

Compatibility

  • Amazon EKS — Metrics Server pre-installed. Mount ~/.aws for credentials.
  • Google GKE — Metrics Server pre-installed. Works out of the box.
  • Azure AKS — Metrics Server pre-installed. Works out of the box.
  • k3s / k0s — Metrics Server pre-installed. Works out of the box.
  • Docker Desktop — Manual Metrics Server install. Use the docker-kubeconfig.sh helper.
  • Minikube — Metrics Server addon: minikube addons enable metrics-server

Quick Start

Get KubeBolt running in under 2 minutes.

Fastest: Helm Chart

helm install kubebolt \
  oci://ghcr.io/clm-cloud-solutions/kubebolt/helm/kubebolt

kubectl port-forward svc/kubebolt 3000:80

Open http://localhost:3000 and log in with the default admin user.

Default admin password: On first boot, KubeBolt seeds a default admin user and prints the generated password to the server logs. Check with kubectl logs deployment/kubebolt-api | grep "Generated admin password". You can also set it explicitly: --set auth.adminPassword=YourPassword.

Docker Compose

# Remote clusters (EKS, GKE, AKS)
kubectl config use-context my-cluster
cd deploy && docker compose up -d

# Docker Desktop K8s (needs kubeconfig rewrite)
./deploy/docker-kubeconfig.sh
cd deploy && docker compose up -d

Frontend at http://localhost:3000. Nginx proxies /api and /ws to the Go backend.


Local Development

Requires Go 1.25+ and Node 20+.

# Backend (port 8080)
cd apps/api && go run cmd/server/main.go --kubeconfig ~/.kube/config

# Frontend (port 5173, proxies /api and /ws to backend)
cd apps/web && npm install && npm run dev

Installation Methods

Multiple ways to install KubeBolt, from Docker Compose to in-cluster Helm deployment.

Helm Chart (recommended for Kubernetes)

Production deployment inside your cluster. OCI-based chart on GitHub Container Registry. Also listed on Artifact Hub.

helm install kubebolt oci://ghcr.io/clm-cloud-solutions/kubebolt/helm/kubebolt

kubectl port-forward svc/kubebolt 3000:80

# With Ingress and custom admin password
helm install kubebolt \
  oci://ghcr.io/clm-cloud-solutions/kubebolt/helm/kubebolt \
  --set auth.adminPassword=YourSecurePassword \
  --set ingress.enabled=true \
  --set ingress.host=kubebolt.example.com

The chart creates a ClusterRole with full read permissions, a ServiceAccount, and the KubeBolt deployment. Configurable values include image tags, resource limits, Ingress, auth, and RBAC settings.

Docker Compose

Full stack with separate API and web containers. Multi-arch images (amd64/arm64).

# Remote clusters (EKS, GKE, AKS)
kubectl config use-context my-cluster
cd deploy && docker compose up -d

# Docker Desktop K8s (needs kubeconfig rewrite)
./deploy/docker-kubeconfig.sh
cd deploy && docker compose up -d

Frontend at http://localhost:3000. Admin password printed to logs (docker compose logs api).

EKS note: The compose file mounts ~/.aws for AWS credential access. Ensure your AWS session is active.

Single Binary (macOS, Linux, Windows)

One executable with embedded frontend. API + UI in a single process on one port. Download from GitHub Releases.

# macOS Apple Silicon
curl -LO https://github.com/clm-cloud-solutions/kubebolt/releases/latest/download/kubebolt-darwin-arm64
chmod +x kubebolt-darwin-arm64 && mv kubebolt-darwin-arm64 /usr/local/bin/kubebolt

# Linux amd64
curl -LO https://github.com/clm-cloud-solutions/kubebolt/releases/latest/download/kubebolt-linux-amd64
chmod +x kubebolt-linux-amd64 && sudo mv kubebolt-linux-amd64 /usr/local/bin/kubebolt

# Run (auto-detects kubeconfig; admin password printed to logs)
kubebolt --kubeconfig ~/.kube/config

Available for darwin-arm64, darwin-amd64, linux-arm64, linux-amd64, and windows-amd64. Verify with sha256sum -c CHECKSUMS.txt.

.env support: The binary auto-loads a .env file from the current directory. Put KUBEBOLT_ADMIN_PASSWORD, KUBEBOLT_AI_API_KEY, etc. in a .env file next to the binary. System env vars and CLI flags take precedence.

Local Development

Requires Go 1.25+ and Node 20+.

# Backend (port 8080)
cd apps/api && go run cmd/server/main.go --kubeconfig ~/.kube/config

# Frontend (port 5173, proxies /api and /ws to backend)
cd apps/web && npm install && npm run dev

To build your own single binary:

make build-binary
# Produces apps/api/kubebolt (embedded frontend)

Homebrew (macOS, Linux)

Install and update via Homebrew. Automatic version management via brew upgrade. Available for macOS and Linux (amd64 + arm64).

# Add the CLM tap (one-time setup)
brew tap clm-cloud-solutions/tap

# Install
brew install kubebolt

# Run
kubebolt --kubeconfig ~/.kube/config

# Update to the latest version
brew upgrade kubebolt

Docker (single container)

Single image with embedded frontend. No nginx, no compose. Multi-arch (amd64/arm64). Runs as non-root.

docker run -p 3000:3000 \
  -v ~/.kube:/root/.kube:ro \
  ghcr.io/clm-cloud-solutions/kubebolt:latest

The image is signed with Cosign. Verify with:

cosign verify ghcr.io/clm-cloud-solutions/kubebolt:latest \
  --certificate-identity-regexp 'https://github.com/clm-cloud-solutions/kubebolt/.*' \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com
EKS note: For EKS clusters, mount ~/.aws and set AWS_PROFILE so the binary can obtain tokens. The image includes aws-cli.

kubectl Plugin (krew)

Run KubeBolt as a kubectl subcommand. Available via CLM's custom krew index.

# Prerequisite: install krew (one-time)
# https://krew.sigs.k8s.io/docs/user-guide/setup/install/

# Add the CLM custom index
kubectl krew index add clm https://github.com/clm-cloud-solutions/krew-index.git

# Install
kubectl krew install clm/kubebolt

# Run (uses current kubectl context)
kubectl kubebolt

# Update
kubectl krew upgrade clm/kubebolt

Coming Soon

The following distribution methods are planned but require additional infrastructure.

Install Script

One-command install that auto-detects OS and architecture. Requires a custom domain (get.kubebolt.dev) to host the script.

curl -fsSL https://get.kubebolt.dev | sh

Official krew-index

After upstream acceptance into kubernetes-sigs/krew-index, users will be able to install without adding the CLM custom index:

kubectl krew install kubebolt

Kubernetes Operator

Declarative lifecycle management via a KubeBolt CRD. Separate controller project with Kubebuilder, handles upgrades and self-healing.

kubectl apply -f https://get.kubebolt.dev/operator.yaml


Architecture

Go backend with in-memory caches and BoltDB for auth. React frontend with live WebSocket updates.

System Diagram

Kubernetes Cluster(s)
  │ API Server + Metrics Server
  ▼ kubeconfig (all contexts)
KubeBolt Backend (Go)
  ├─ Cluster Manager     → multi-cluster lifecycle, async connection
  ├─ Shared Informers    → typed resources (client-go)
  ├─ Dynamic Client      → Gateway API CRDs (unstructured)
  ├─ Permission Probe    → 22 SelfSubjectAccessReview calls
  ├─ Metrics Collector   → 30s poll, in-memory cache
  ├─ Insights Engine     → 12 rules
  ├─ Auth Service        → JWT sessions, BoltDB user store, role enforcement
  ├─ REST API (Chi v5)   → resource lists, details, YAML, logs
  ├─ WebSocket Hub       → real-time broadcasts
  └─ Copilot Proxy       → LLM tool-calling bridge
  │
  ▼ REST + WebSocket
KubeBolt Frontend (React 18 + TypeScript + Vite 5 + Tailwind)
  ├─ 23 Resource Views   → TanStack Table + Query
  ├─ Cluster Map         → React Flow 11
  ├─ AI Copilot          → multi-provider, 16 tools
  └─ Theme               → dark/light via CSS custom properties (--kb-*)

Go Workspace

Monorepo with go.work containing three modules:

  • apps/api — Main backend server (entry: cmd/server/main.go)
  • packages/agent — Phase 2 lightweight node agent (stub)
  • packages/shared — Shared Go utilities

Key Backend Packages

  • cluster/manager.go — Multi-cluster lifecycle, context switching, async initial connection
  • cluster/connector.go — Shared informers + dynamic client, 20s cache sync timeout, 15s REST timeout
  • cluster/permissions.go — RBAC probing via SSAR, cluster-wide then namespace fallback, semaphore of 10
  • cluster/nslister.go — Multi-namespace lister wrappers for namespace-scoped ServiceAccounts
  • cluster/graph.go — In-memory topology graph with debounced rebuild (2s)
  • cluster/relationships.go — Edge detection: ownerRefs, selectors, Gateway parentRefs, volumes
  • metrics/collector.go — Metrics Server polling, per-namespace fallback, graceful degradation
  • insights/engine.go — 12-rule evaluation engine
  • auth/service.go — User management, JWT issue/verify, role enforcement, BoltDB persistence
  • auth/middleware.go — Auth middleware with httpOnly cookie extraction and role-based route guards
  • websocket/hub.go — Broadcast hub, 4096 buffer, silent drops when no clients
  • api/router.go — Chi router with requireConnector middleware
  • api/handlers.go — REST handlers with metrics injection, YAML, logs, deployment history

Data Flow

  • Manager reads kubeconfig contexts → async connection (HTTP server binds immediately, returns 503 until connected)
  • Permission probe: 22 SSAR calls, cluster-wide then namespace fallback, ~2-5s
  • Informers start only for permitted resources
  • Namespace-scoped SAs → per-namespace informer factories with multi-lister aggregation
  • Dynamic client discovers Gateway API CRDs (5s timeout, gracefully skipped)
  • Metrics Collector polls every 30s → in-memory cache (per-namespace when cluster-wide denied)
  • REST API serves enriched resources with metrics injection, paginated at 50 per page; returns 403 for restricted resources
  • WebSocket broadcasts resource changes with debounced topology rebuilds

Resource Views

23 resource types, each with search, filtering, pagination, and live metrics.

Supported Resources

  • Workloads — Pods, Deployments, StatefulSets, DaemonSets, Jobs, CronJobs, ReplicaSets
  • Traffic — Services, Ingresses, Gateways, HTTPRoutes, EndpointSlices
  • Storage — PersistentVolumeClaims, PersistentVolumes, StorageClasses
  • Config — ConfigMaps, Secrets (keys only, never values), HPAs
  • Cluster — Nodes, Namespaces, Events, Roles, ClusterRoles, RoleBindings, ClusterRoleBindings

Resource Detail Views

Each resource has a tabbed detail page at /:type/:namespace/:name. Available tabs vary by resource type:

  • Overview — Key fields, status, conditions, labels, annotations
  • YAML — Syntax-highlighted definition with copy/download/edit. Secrets redacted.
  • Pods — For workloads (Deployments, StatefulSets, DaemonSets, Jobs)
  • Logs — Pod and workload logs with container selector, tail lines (100/500/1000), 10s auto-refresh
  • Containers — Container specs, env vars, ports, mounts (Pods only)
  • Volumes — Volume mounts and claims (Pods only)
  • Related — Parent and child resources from topology edges
  • History — Revision history via ReplicaSets (Deployments) or ControllerRevisions (StatefulSets, DaemonSets)
  • Events — Events filtered to the specific resource
  • Monitor — SVG donut gauges for CPU/memory from Metrics Server
  • Files — Exec-based file browser with directory navigation and content viewer
  • Terminal — WebSocket-to-SPDY exec bridge with xterm.js

Insights Engine

12 built-in rules that detect common Kubernetes issues and provide actionable recommendations.

Rule Definitions

  • crash-loop (Critical) — Pod in CrashLoopBackOff with restarts >3/hour
  • oom-killed (Critical) — Container terminated with OOMKilled (exit code 137)
  • zero-replicas (Critical) — Deployment with 0 available replicas
  • node-not-ready (Critical) — Node condition Ready ≠ True
  • image-pull-backoff (Critical) — Pod in ImagePullBackOff state
  • cpu-throttle-risk (Warning) — CPU usage >80% of limit, sustained
  • memory-pressure (Warning) — Memory usage >85% of limit
  • hpa-maxed-out (Warning) — HPA current replicas == max replicas
  • pvc-pending (Warning) — PVC in Pending state for >5 minutes
  • frequent-restarts (Warning) — Pod with >5 restarts in 24 hours (non crash-loop)
  • resource-underrequest (Info) — Requests <40% of actual usage
  • evicted-pods (Info) — Pods evicted from a node due to pressure

Each insight includes the affected resource, a human-readable message, and a specific suggestion with remediation steps or kubectl commands.
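To illustrate how a rule maps a resource to an insight, here is a sketch of the oom-killed check in Python. The function and field names are illustrative, not KubeBolt's actual Go implementation.

```python
def check_oom_killed(pod: dict) -> list[dict]:
    """Flag containers whose last termination was OOMKilled (exit code 137)."""
    insights = []
    for status in pod.get("containerStatuses", []):
        term = (status.get("lastState") or {}).get("terminated")
        if term and (term.get("reason") == "OOMKilled" or term.get("exitCode") == 137):
            insights.append({
                "rule": "oom-killed",
                "severity": "critical",
                "resource": f'{pod["namespace"]}/{pod["name"]}',
                "message": f'Container {status["name"]} was OOMKilled',
                # Real insights carry remediation steps or kubectl commands.
                "suggestion": "Raise the container's memory limit or investigate the leak",
            })
    return insights
```

Each rule follows this shape: a pure predicate over cached resource state, emitting zero or more insight records.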

Cluster Map

Interactive topology visualization of resource relationships.

Layout Modes

  • Grid — Compact grid of resources within namespace regions. Best for overview.
  • Flow — Horizontal dependency chain: Ingress/Gateway → HTTPRoute → Service → Deployment → ReplicaSet → Pod. Best for understanding traffic flow.

In both modes, namespace regions are arranged in a grid of up to 3 columns. Regions are ReactFlow group nodes with child resource nodes.

Relationship Detection

  • Deployment → ReplicaSet → Pod — ownerReferences chain (edge: owns)
  • Service → Pods — label selector matching (edge: selects)
  • Ingress → Service — spec.rules.http.paths.backend (edge: routes)
  • Gateway → HTTPRoute — HTTPRoute spec.parentRefs (edge: routes)
  • HTTPRoute → Service — spec.rules.backendRefs (edge: routes)
  • HPA → Deployment — spec.scaleTargetRef (edge: hpa)
  • Pod → PVC — spec.volumes.persistentVolumeClaim (edge: mounts)
  • PVC → PV — spec.volumeName (edge: bound)
  • Pod → ConfigMap/Secret — volumes + envFrom (edges: mounts / envFrom)
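The selector-based edge (Service → Pods) reduces to a label-subset check. A minimal sketch using plain dicts, not client-go types:

```python
def selects(service: dict, pod: dict) -> bool:
    """A Service selects a Pod when every selector label matches the Pod's labels."""
    selector = (service.get("spec") or {}).get("selector") or {}
    if not selector:  # Services without selectors (manual Endpoints) select nothing
        return False
    labels = (pod.get("metadata") or {}).get("labels") or {}
    return all(labels.get(k) == v for k, v in selector.items())

def service_pod_edges(services: list, pods: list) -> list:
    """Build (service, pod, 'selects') edges for the topology graph."""
    return [(s["metadata"]["name"], p["metadata"]["name"], "selects")
            for s in services for p in pods if selects(s, p)]
```

The ownerReferences and parentRefs edges work the same way: a pure comparison over cached objects, re-run on the debounced graph rebuild.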

AI Copilot

Talk to your cluster in natural language. Press ⌘J to open.

How it Works

The copilot sends your question to a configured LLM provider along with tool definitions that map to KubeBolt's REST API. The LLM calls tools to fetch live cluster data, analyzes the results, and responds with data-backed answers and kubectl commands.

16 Native Tools

  • get_cluster_overview — Resource counts, CPU/memory, health score, events
  • list_resources — List any of 23 resource types with filtering
  • get_resource_detail — Full detail of a specific resource
  • get_resource_yaml — Raw YAML definition (secrets redacted)
  • get_resource_describe — kubectl-describe output for deep troubleshooting
  • get_pod_logs — Pod logs with container, tail, since and grep options
  • get_workload_pods — Pods owned by a workload controller
  • get_workload_history — Revision history for Deployments/StatefulSets/DaemonSets
  • get_cronjob_jobs — Job children of a CronJob to investigate execution history
  • get_topology — Full cluster topology graph
  • get_insights — Active insights with severity
  • get_events — Events with filtering
  • search_resources — Global search by name across 16 resource types
  • get_permissions — Detected RBAC permissions
  • list_clusters — All available kubeconfig contexts
  • get_kubebolt_docs — Product knowledge base (features, navigation, admin pages)

Supported Providers

  • Anthropic — Claude Sonnet 4.6, Claude Opus 4.6/4.7, Claude Haiku 4.5. Prompt caching on system prompt + tool definitions.
  • OpenAI — GPT-5, GPT-5 Mini, GPT-4o, GPT-4o Mini. Automatic prompt caching, max_completion_tokens for reasoning models.
  • Self-hosted — Ollama, vLLM, Groq, DeepSeek (OpenAI-compatible).
BYO Key: You bring your own API key. KubeBolt never stores or transmits your key to any server other than the provider you configure. For production, use the API proxy mode to keep keys server-side.

Contextual "Ask Copilot"

Launches the Copilot panel with a pre-loaded prompt that already carries the cluster, namespace, resource name, and symptom. Entry points:

  • Insights — "Diagnose this insight and recommend a fix"
  • Resource Detail (Pods, Deployments, StatefulSets, Services, Nodes) — "Investigate this resource"
  • Events — button on every Warning row, "Explain this Kubernetes Warning event"

Templates live in services/copilot/triggers.ts and are versioned so the LLM never sees stale framing.

Conversation memory

Long sessions stop bleeding context. When the estimated conversation size crosses SESSION_BUDGET_TOKENS × AUTO_COMPACT_THRESHOLD (default 80%), the handler folds older turns into a summary generated by the provider's cheap-tier model (Haiku 4.5 / gpt-4o-mini) and stubs bulky tool_results in the preserved tail. The active turn's tool_results are always protected so mid-flight compacts never truncate a response.

A Scissors icon in the panel header exposes the same primitive on demand — "new session with summary" collapses the whole transcript into a single summary message so you can pivot topics without losing context.

  • KUBEBOLT_AI_AUTO_COMPACT (default: true) — Master switch for auto-compaction
  • KUBEBOLT_AI_SESSION_BUDGET_TOKENS (default: the model's context window) — Total ceiling; the trigger fires at budget × threshold
  • KUBEBOLT_AI_AUTO_COMPACT_THRESHOLD (default: 0.80) — Fraction of the budget at which compaction fires
  • KUBEBOLT_AI_COMPACT_MODEL (default: auto, the same provider's cheap tier) — Override the summarisation model
  • KUBEBOLT_AI_COMPACT_PRESERVE_TURNS (default: 3) — Turns kept intact after compaction
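The trigger condition reduces to simple arithmetic. A sketch with illustrative names (the real switches are the KUBEBOLT_AI_* variables above):

```python
def should_compact(estimated_tokens: int, budget_tokens: int,
                   threshold: float = 0.80, enabled: bool = True) -> bool:
    """Auto-compact fires once the conversation estimate crosses budget x threshold."""
    return enabled and estimated_tokens >= budget_tokens * threshold
```

With a 200,000-token budget and the default 0.80 threshold, compaction fires at 160,000 estimated tokens.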

Scope guardrail

The system prompt defines in-scope (Kubernetes operations, DevOps/SRE topics that support the user's cluster, KubeBolt itself) and out-of-scope (general coding unrelated to cluster resources, non-technical topics, competitor cloud products). The LLM refuses out-of-scope questions with a one-sentence polite redirect in the user's language — never answers partially.

Admin Copilot Usage

At /admin/copilot-usage: sessions, tokens billed, cache hit rate, estimated USD cost (best-effort pricing table for Anthropic and OpenAI), top tools with error rates, and a per-session drill-down modal with tool breakdown and compact events. Range selector 24h / 7d / 30d. Stored locally in BoltDB with a 30-day / 5000-entry retention cap. Requires authentication to be enabled.

RBAC & Permissions

KubeBolt auto-detects your kubeconfig's permissions and adapts automatically.

Permission Detection

  • Uses SelfSubjectAccessReview API to test list verb for 22 resource types
  • Two-phase: cluster-wide first, then namespace-level fallback for RoleBinding-based access
  • Concurrent execution (semaphore of 10), completes in ~2-5 seconds
  • If the SSAR API itself is unavailable, falls back to assuming full access
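The two-phase probe can be sketched as bounded-concurrency checks. The callbacks below stand in for real SelfSubjectAccessReview calls; this is an illustration of the flow, not KubeBolt's Go code.

```python
import concurrent.futures
import threading

RESOURCE_TYPES = ["pods", "deployments", "services"]  # KubeBolt probes 22 types

_sem = threading.Semaphore(10)  # matches the documented semaphore of 10

def probe(resource, can_list_cluster_wide, namespaces_allowing_list):
    with _sem:
        if can_list_cluster_wide(resource):       # phase 1: cluster-wide SSAR
            return {"resource": resource, "canList": True, "namespaceScoped": False}
        ns = namespaces_allowing_list(resource)   # phase 2: per-namespace fallback
        return {"resource": resource, "canList": bool(ns),
                "namespaceScoped": True, "namespaces": ns}

def probe_all(can_list_cluster_wide, namespaces_allowing_list):
    """Probe every resource type concurrently; informers start only where canList holds."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=10) as pool:
        results = pool.map(
            lambda r: probe(r, can_list_cluster_wide, namespaces_allowing_list),
            RESOURCE_TYPES)
        return {r["resource"]: r for r in results}
```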

Access Levels

  • Cluster-admin — All informers start normally; full UI with no restrictions
  • Cluster read-only — Informers start for permitted resources only; restricted items dimmed, "Limited access" banner
  • Namespace-scoped — Per-namespace informer factories with multi-lister aggregation; resources scoped to permitted namespaces

Frontend Behavior

  • "Limited access — showing X of Y resource types" banner
  • Sidebar items dimmed with shield icon for restricted resources
  • Summary cards show "No access" instead of "0"
  • "No access to Nodes — capacity data unavailable" for node restrictions
  • PermissionDenied component for 403 resource pages

API Endpoint

GET /api/v1/cluster/permissions returns the full permission map per resource type with canList, canWatch, canGet, namespaceScoped, and namespaces fields.

Authentication

Built-in username/password authentication with role-based access control. No external identity provider required.

Overview

KubeBolt ships with a built-in auth system that supports three roles: Admin, Editor, and Viewer. Auth is enabled by default and uses BoltDB for user storage — no external database needed. Sessions are managed via JWT tokens stored in httpOnly cookies.

First boot: A default admin user is seeded automatically on first startup. The generated password is printed to the server logs. Change it immediately after first login.

Roles

KubeBolt enforces three roles with increasing levels of access:

  • All roles (Viewer, Editor, Admin) — View resources, metrics, topology, and insights; view pod logs; use the AI Copilot (read-only tools)
  • Editor and Admin — Pod terminal (exec); edit YAML / apply changes; restart and scale workloads; delete resources; port forwarding; switch clusters
  • Admin only — Manage users (create, edit, delete); change auth settings

Session Management

  • JWT tokens issued on login, stored in httpOnly secure cookies (not localStorage)
  • Token expiry is configurable (default: 24 hours)
  • Tokens are validated on every API request via auth middleware
  • Logout invalidates the cookie client-side
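To illustrate the session mechanics, here is a minimal HS256 issue/verify sketch using only the Python standard library. It mirrors the flow described above (signed token, expiry claim, constant-time comparison); it is not KubeBolt's actual Go implementation.

```python
import base64
import hashlib
import hmac
import json
import time

def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def issue(claims: dict, secret: bytes, ttl_seconds: int = 24 * 3600) -> str:
    """Issue an HS256 JWT with an exp claim (default 24h, as in the docs)."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64url(json.dumps({**claims, "exp": int(time.time()) + ttl_seconds}).encode())
    sig = _b64url(hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest())
    return f"{header}.{body}.{sig}"

def verify(token: str, secret: bytes):
    """Return the claims if the signature and expiry check out, else None."""
    header, body, sig = token.split(".")
    expected = _b64url(hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        return None  # tampered, or signed with a different secret
    claims = json.loads(base64.urlsafe_b64decode(body + "=" * (-len(body) % 4)))
    return claims if claims.get("exp", 0) > time.time() else None
```

In KubeBolt the token travels in an httpOnly cookie, so it is never readable from page JavaScript.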

Storage

User accounts are stored in a local BoltDB file. By default, the database is written to ./data/kubebolt.db. Use the KUBEBOLT_DATA_DIR environment variable to customize the storage path. In Kubernetes deployments, mount a PersistentVolume to this path for durability.

Environment Variables

  • KUBEBOLT_AUTH_ENABLED (default: true) — Enable or disable authentication. Set to false to allow anonymous access.
  • KUBEBOLT_ADMIN_PASSWORD (default: auto-generated) — Override the default admin password on first boot. Ignored if the admin user already exists.
  • KUBEBOLT_JWT_SECRET (default: auto-generated) — Secret key for signing JWT tokens. Auto-generated and persisted in BoltDB if not set.
  • KUBEBOLT_DATA_DIR (default: ./data) — Directory for the BoltDB storage file.

Helm Configuration

When deploying via Helm, configure auth through values:

# values.yaml
auth:
  enabled: true
  adminPassword: "my-secure-password"

# Or use an existing Kubernetes secret
auth:
  enabled: true
  existingSecret: "kubebolt-auth-secret"
  # Secret must contain keys: admin-password, jwt-secret
# Install with inline password
helm install kubebolt \
  oci://ghcr.io/clm-cloud-solutions/kubebolt/helm/kubebolt \
  --set auth.adminPassword="my-secure-password"

# Install with existing secret
helm install kubebolt \
  oci://ghcr.io/clm-cloud-solutions/kubebolt/helm/kubebolt \
  --set auth.existingSecret=kubebolt-auth-secret

Disabling Authentication

To run KubeBolt without authentication (e.g., behind a VPN or for local development):

# Local development
KUBEBOLT_AUTH_ENABLED=false go run cmd/server/main.go --kubeconfig ~/.kube/config

# Docker Compose (set in deploy/.env)
KUBEBOLT_AUTH_ENABLED=false

# Helm
helm install kubebolt \
  oci://ghcr.io/clm-cloud-solutions/kubebolt/helm/kubebolt \
  --set auth.enabled=false
Warning: Disabling auth exposes full cluster management capabilities to anyone who can reach the KubeBolt UI. Only disable auth when access is already restricted at the network level.

Metrics Server

KubeBolt uses the Kubernetes Metrics Server for CPU/memory data.

How it Works

The Metrics Collector polls metrics.k8s.io/v1beta1 (PodMetrics and NodeMetrics) every 30 seconds. Results are stored in an in-memory cache — no database required.

Graceful Degradation

If Metrics Server is not installed, KubeBolt shows all resource state and events normally. CPU/memory bars display a message with a one-click install command. The Collector distinguishes between "not installed" and "403 Forbidden" via apierrors.IsForbidden().
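The degradation decision can be sketched as a small classifier over the probe's HTTP status (illustrative Python, not the actual apierrors-based Go code):

```python
def metrics_state(status_code) -> str:
    """Classify the Metrics Server probe result for the UI."""
    if status_code is None or status_code == 404:
        return "not-installed"   # metrics.k8s.io absent: show the install hint
    if status_code == 403:
        return "forbidden"       # RBAC denial: try per-namespace fallback instead
    return "available" if status_code == 200 else "error"
```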

Namespace-scoped Metrics

When cluster-wide metrics access is denied (namespace-scoped ServiceAccounts), the Collector falls back to per-namespace polling: PodMetricses(ns).List() for each accessible namespace.

Multi-Cluster

All kubeconfig contexts auto-discovered. Switch clusters in one click.

How it Works

The Cluster Manager reads all contexts from the kubeconfig file at startup. The initial connection targets the current-context. You can switch clusters at runtime via the API or the cluster selector in the UI.

Switching

When switching clusters, the manager tears down the old connector (informers, metrics collector, insights engine) and creates a new one for the target context. The permission probe runs again for the new cluster. The frontend shows a "Connecting to cluster" overlay during the switch.

API

GET  /api/v1/clusters              → list all contexts
POST /api/v1/clusters/switch       → { "context": "production-eks" }

Notifications

Alert delivery to Slack, Discord and email when insights fire. Optional — disabled when no channel is configured.

Channels

Each channel activates when its connection details are set. You can enable one, two or all three at once.

  • Slack — Block Kit messages with severity colour bar and a button that deep-links back to the resource in KubeBolt.
  • Discord — Embed messages with equivalent formatting.
  • Email (SMTP) — Plain/HTML email with three delivery modes: instant, hourly digest, or daily digest. Supports multiple recipients.

Global settings

Cross-channel knobs that apply to every notifier:

  • KUBEBOLT_NOTIFICATIONS_ENABLED (default: true) — Master kill switch; set to false for maintenance windows.
  • KUBEBOLT_NOTIFICATIONS_MIN_SEVERITY (default: warning) — Insights below this threshold don't notify. Values: critical, warning, info.
  • KUBEBOLT_NOTIFICATIONS_COOLDOWN (default: 1h) — Dedup window; the same insight fires at most once per cooldown.
  • KUBEBOLT_NOTIFICATIONS_BASE_URL (no default) — Public URL of your KubeBolt instance, embedded as a deep link in messages.
  • KUBEBOLT_NOTIFICATIONS_INCLUDE_RESOLVED (default: false) — Also notify when an insight transitions to resolved.
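The cooldown dedup amounts to remembering the last send time per insight key. A minimal sketch of that logic (names illustrative):

```python
import time

class Deduper:
    """Suppress repeat notifications for the same insight within the cooldown."""

    def __init__(self, cooldown_seconds: float = 3600):  # default matches 1h
        self.cooldown = cooldown_seconds
        self.last_sent = {}  # insight key -> last send timestamp

    def should_notify(self, insight_key: str, now=None) -> bool:
        now = time.time() if now is None else now
        last = self.last_sent.get(insight_key)
        if last is not None and now - last < self.cooldown:
            return False  # still inside the dedup window
        self.last_sent[insight_key] = now
        return True
```

A natural key is rule + resource (e.g. "crash-loop/default/api-1"), so distinct resources still alert independently.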

Slack

# Create an Incoming Webhook in Slack, then:
KUBEBOLT_SLACK_WEBHOOK_URL=https://hooks.slack.com/services/T.../B.../...

Discord

# Server Settings → Integrations → Webhooks → New Webhook, then:
KUBEBOLT_DISCORD_WEBHOOK_URL=https://discord.com/api/webhooks/...

Email

KUBEBOLT_SMTP_HOST=smtp.example.com
KUBEBOLT_SMTP_PORT=587
KUBEBOLT_SMTP_USERNAME=alerts@example.com
KUBEBOLT_SMTP_PASSWORD=...
KUBEBOLT_SMTP_FROM=KubeBolt <alerts@example.com>
KUBEBOLT_SMTP_TO=sre@example.com,oncall@example.com
KUBEBOLT_SMTP_DIGEST_MODE=hourly   # instant | hourly | daily
Admin UI: The /admin/notifications page shows which channels are configured, surfaces the current global settings, and lets you send a test notification to each channel without restarting. Authenticated users with the Admin role only.

REST API Reference

All endpoints under /api/v1.

Cluster

  • GET /clusters — List all kubeconfig contexts
  • POST /clusters/switch — Switch the active cluster
  • GET /cluster/overview — Full cluster summary with counts, CPU/memory, health, events, workloads
  • GET /cluster/permissions — Probed RBAC permissions per resource type

Resources

  • GET /resources/:type — List with pagination (?limit=50) and filtering (?namespace=, ?search=, ?status=)
  • GET /resources/:type/:ns/:name — Detail with metrics injection
  • GET /resources/:type/:ns/:name/yaml — Raw YAML (secrets redacted, managedFields stripped)
  • GET /resources/pods/:ns/:name/logs — Pod logs (?container=, ?tailLines=100)
  • GET /resources/:workload/:ns/:name/pods — Pods owned by a deployment/statefulset/daemonset/job
  • GET /resources/deployments/:ns/:name/history — Revision history via ReplicaSets

Other

  • GET /topology — Full topology graph (nodes + edges)
  • GET /insights — Active insights (?severity=critical,warning)
  • GET /events — Events (?type=Warning, ?involvedName=, ?involvedKind=)
  • WS /ws — WebSocket for real-time updates

WebSocket Events

Real-time updates via /api/v1/ws.

Event Types

  • resource:updated — Kubernetes resource changed
  • resource:deleted — Kubernetes resource removed
  • event:new — New Kubernetes event
  • insight:new — New insight detected
  • insight:resolved — Insight resolved
  • metrics:refresh — Metrics cache updated (every 30s)
  • cluster.switched — Active cluster changed

Broadcast buffer: 4096 messages. Messages dropped silently when no clients are connected to avoid log spam during cluster switches.
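The drop policy can be sketched as a bounded per-client queue (illustrative Python; the real hub uses Go channels):

```python
from collections import deque

class Hub:
    """Broadcast hub that never blocks producers: full or absent clients drop silently."""

    def __init__(self, buffer: int = 4096):  # matches the documented buffer size
        self.buffer = buffer
        self.clients = {}  # client id -> deque of pending messages

    def broadcast(self, msg: dict) -> int:
        delivered = 0
        for queue in self.clients.values():
            if len(queue) < self.buffer:  # full queue: silent drop, no log spam
                queue.append(msg)
                delivered += 1
        return delivered  # 0 when nobody is connected, e.g. during a cluster switch
```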

Amazon EKS Guide

Deploy KubeBolt on Amazon EKS clusters.

Authentication

  • IRSA (IAM Roles for Service Accounts) — recommended for in-cluster deployment
  • Pod Identity — newer EKS authentication method, also supported
  • AWS CLI — for local/Docker deployment, mount ~/.aws with active SSO session

Docker Compose

The compose file already mounts ~/.aws for EKS token generation. Ensure your AWS profile/SSO session is active:

kubectl config use-context my-eks-cluster
cd deploy && docker compose up -d

Helm with ALB Ingress

helm install kubebolt \
  oci://ghcr.io/clm-cloud-solutions/kubebolt/helm/kubebolt \
  --set ingress.enabled=true \
  --set ingress.className=alb \
  --set ingress.host=kubebolt.example.com

EKS Fargate is compatible — KubeBolt doesn't require DaemonSets in Phase 1.

Google GKE Guide

Deploy KubeBolt on Google Kubernetes Engine.

Authentication

  • Workload Identity — recommended for in-cluster deployments
  • GKE comes with Metrics Server pre-installed

Helm with GCE Ingress

helm install kubebolt \
  oci://ghcr.io/clm-cloud-solutions/kubebolt/helm/kubebolt \
  --set ingress.enabled=true \
  --set ingress.className=gce

GKE Autopilot clusters are compatible — no privileged containers needed.

Azure AKS Guide

Deploy KubeBolt on Azure Kubernetes Service.

Authentication

  • Azure AD Workload Identity — recommended for in-cluster deployments
  • Azure RBAC integration via kubeconfig

Helm with AGIC

helm install kubebolt \
  oci://ghcr.io/clm-cloud-solutions/kubebolt/helm/kubebolt \
  --set ingress.enabled=true \
  --set ingress.className=azure-application-gateway

Docker Desktop Guide

Run KubeBolt with Docker Desktop's built-in Kubernetes.

The Problem

Docker Desktop K8s uses 127.0.0.1:6443 as the API server address. This works from your host machine, but not from inside a container (which has its own localhost).

The Solution

Use the helper script to rewrite the kubeconfig to use kubernetes.docker.internal instead:

# 1. Enable Kubernetes in Docker Desktop → Settings → Kubernetes → Enable
# 2. Switch context
kubectl config use-context docker-desktop

# 3. Generate container-compatible kubeconfig
./deploy/docker-kubeconfig.sh

# 4. Start
cd deploy && docker compose up -d

Metrics Server

Docker Desktop doesn't include Metrics Server. Install it manually:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

Roadmap

Where KubeBolt is heading.

Phase 1.0 — Core Platform ✅

23 resource views, multi-cluster, Gateway API, insights engine, cluster map, RBAC detection, Docker Compose.

Phase 1.3 — Terminal & Actions ✅

Pod terminal (xterm.js), port-forwarding, restart/scale, describe, YAML editing (CodeMirror 6), delete, global search (⌘K).

Phase 1.4 — File Browser & History ✅

Exec-based file browser, StatefulSet/DaemonSet history via ControllerRevisions, CronJob child jobs, YAML export.

Phase 1.5 — Distribution ✅

Helm chart (OCI), multi-arch images (amd64/arm64), GitHub releases, Artifact Hub, cloud guides.

Phase 1.6 — Animated Map & Notifications ✅

Draggable map nodes with animated pulse halos, cluster management UI, Slack/Discord/email notifications with digest modes, global settings (master toggle, base URL, resolved-insight alerts).

Phase 1.7 — Authentication ✅

Built-in username/password auth with Admin/Editor/Viewer roles, user management UI, JWT sessions, BoltDB storage.

Phase 1.8 — AI Copilot ✅

In-app AI assistant with 16 cluster tools, multi-provider (Claude, GPT, Ollama, vLLM), BYO API key, SSE streaming, fallback model, structured logging with per-session token accounting.

Phase 1.9 — Extended Distribution ✅

Single binary with embedded frontend, Homebrew tap, Docker single-container, kubectl plugin (krew).

Phase 1.5.x — Copilot maturity ✅

Contextual "Ask Copilot" buttons across insights, resource detail pages and warning events. Conversation memory with auto-compact at 80% of the budget (via cheap-tier model) and manual "new session with summary". Admin usage analytics page with cost estimates (BoltDB-backed, 30-day retention). Product knowledge base tool (get_kubebolt_docs) and scope guardrail. Prompt caching for Anthropic and OpenAI.

Phase 2.0 — OTel-native agent

Lightweight DaemonSet agent as an optimized distribution of the OpenTelemetry Collector. <1% CPU, <50MB RAM per node, <40MB binary. Exports in parallel to the customer's OTel backend — citizen of the ecosystem, not a silo.

Phase 2.1 — Hierarchical AI agents

Six specialized layers with different autonomy and compute budgets: deterministic detectors → router (Haiku) → investigator (Sonnet) → planner → deterministic executor → postmortem. 70% resolved without AI, 25% with cheap AI, 5% with expensive AI.

Phase 3.0 — SaaS Platform

Multi-tenant platform with OAuth/SSO, team collaboration, custom dashboards, billing.

Contributing

KubeBolt is open source under the MIT license. Contributions are welcome.

Development Setup

Requires Go 1.25+ and Node 20+.

# Clone
git clone https://github.com/clm-cloud-solutions/kubebolt.git
cd kubebolt

# Backend
cd apps/api && go run cmd/server/main.go --kubeconfig ~/.kube/config

# Frontend (separate terminal)
cd apps/web && npm install && npm run dev

CI

GitHub Actions on push/PR to main:

  • Backend: go build ./... (Go 1.25, ubuntu-latest)
  • Frontend: npm ci && npm run build (Node 20, ubuntu-latest)

Repository Structure

kubebolt/
├── apps/api/          # Go backend
├── apps/web/          # React frontend
├── packages/agent/    # Phase 2 node agent (stub)
├── packages/shared/   # Shared Go utilities
├── deploy/            # Docker Compose + Helm + scripts
├── docs/              # SPEC.md + images
├── go.work            # Go workspace
├── CLAUDE.md          # Claude Code context
└── README.md