
Grafana Dashboards

Dashboards are provisioned automatically from ConfigMaps. Create a ConfigMap with your dashboard JSON, apply the correct label, and Grafana picks it up.

apiVersion: v1
kind: ConfigMap
metadata:
  name: observability-dashboards-tenant-workloads
  namespace: observability
  labels:
    kupe.cloud/grafana-dashboards: "true"
    kupe.cloud/grafana-folder: "Workloads"
data:
  my-app-overview.json: |-
    {
      "title": "My App Overview",
      "uid": "my-app-overview",
      ...
    }

  • The label kupe.cloud/grafana-dashboards: "true" is required for operator-driven tenant dashboard sync.
  • Use kupe.cloud/grafana-folder: "<Folder Name>" to control the target folder (for example, Overview or Workloads).
  • Place ConfigMaps in the Grafana namespace (observability by default).
  • Each key in data must end in .json.
  • The uid field in the dashboard JSON must be unique across all dashboards.
  • Do not use grafana_dashboard or grafana_folder for tenant dashboards. Those are sidecar labels for Main Org dashboards and can route dashboards to the wrong place.
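
To illustrate the rules above: a single ConfigMap can carry several dashboards, provided every key under data ends in .json and every uid is distinct. The following is a minimal sketch; the ConfigMap name, file names, titles, and uids are hypothetical placeholders, not names the platform requires.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: observability-dashboards-tenant-my-app   # hypothetical name
  namespace: observability                       # Grafana namespace (default)
  labels:
    kupe.cloud/grafana-dashboards: "true"        # required for tenant dashboard sync
    kupe.cloud/grafana-folder: "Workloads"       # target folder
data:
  # Every key ends in .json; every uid is different.
  my-app-latency.json: |-
    { "title": "My App Latency", "uid": "my-app-latency", "panels": [] }
  my-app-errors.json: |-
    { "title": "My App Errors", "uid": "my-app-errors", "panels": [] }
```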

Tenant dashboards are organized by ConfigMap label, not filename prefix.

apiVersion: v1
kind: ConfigMap
metadata:
  name: observability-dashboards-tenant-overview
  namespace: observability
  labels:
    kupe.cloud/grafana-dashboards: "true"
    kupe.cloud/grafana-folder: "Overview"
data:
  service-health.json: |-
    { ... }

Mapping examples:

  • kupe.cloud/grafana-folder: "Overview" -> Overview
  • kupe.cloud/grafana-folder: "Workloads" -> Workloads
  • No folder label -> General

Folders are created automatically in each tenant org if they do not exist.
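
As a sketch of the last mapping, a ConfigMap that carries the sync label but omits the folder label should land its dashboards in General. The ConfigMap name, file name, and uid below are hypothetical.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: observability-dashboards-tenant-misc   # hypothetical name
  namespace: observability
  labels:
    kupe.cloud/grafana-dashboards: "true"
    # No kupe.cloud/grafana-folder label: dashboards go to General.
data:
  scratch.json: |-
    { "title": "Scratch", "uid": "scratch", "panels": [] }
```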

The easiest way to create dashboard JSON:

  1. Build your dashboard in the Grafana UI.
  2. Click the Share icon and select Export.
  3. Toggle Export for sharing externally to replace data source references with variables.
  4. Copy the JSON and paste it into your ConfigMap.

A minimal dashboard JSON looks like this:

{
  "title": "My Dashboard",
  "uid": "unique-id",
  "version": 1,
  "schemaVersion": 39,
  "refresh": "30s",
  "time": { "from": "now-1h", "to": "now" },
  "tags": ["my-app"],
  "templating": {
    "list": [
      {
        "name": "datasource",
        "type": "datasource",
        "query": "prometheus",
        "refresh": 1
      },
      {
        "name": "namespace",
        "type": "query",
        "datasource": { "type": "prometheus", "uid": "${datasource}" },
        "query": "label_values(kube_pod_info, namespace)",
        "refresh": 2,
        "includeAll": true
      }
    ]
  },
  "panels": [
    {
      "title": "Request Rate",
      "type": "timeseries",
      "gridPos": { "h": 8, "w": 12, "x": 0, "y": 0 },
      "datasource": { "type": "prometheus", "uid": "${datasource}" },
      "targets": [
        {
          "expr": "sum(rate(http_requests_total{namespace=~\"$namespace\"}[5m])) by (pod)",
          "legendFormat": "{{ pod }}"
        }
      ]
    }
  ]
}

Use a datasource variable for tenant dashboards:

"datasource": { "type": "prometheus", "uid": "${datasource}" }

With this templating block:

{
  "name": "datasource",
  "type": "datasource",
  "query": "prometheus"
}

  • Use variables for namespace, workload, and cluster to make dashboards reusable.
  • Keep time range and resolution appropriate to the investigation window.
  • Pair rate and error metrics on one panel for faster correlation.
  • Set meaningful thresholds with color coding (green/yellow/red).
  • Use stat panels for key numbers and timeseries for trends.

If you manage dashboards via Helm, render them into tenant dashboard ConfigMaps with the Kupe labels:

templates/configmap-dashboard.yaml
{{- if .Values.dashboard.enabled }}
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ .Values.dashboard.configMapName | default "observability-dashboards-tenant-workloads" }}
  namespace: {{ .Values.dashboard.configMapNamespace | default "observability" }}
  labels:
    kupe.cloud/grafana-dashboards: "true"
    kupe.cloud/grafana-folder: {{ .Values.dashboard.folder | default "Workloads" | quote }}
data:
{{ (.Files.Glob "dashboards/*.json").AsConfig | indent 2 }}
{{- end }}

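Place dashboard JSON files in a dashboards/ directory at the chart root, next to templates/. Helm resolves .Files.Glob paths relative to the chart root, and files inside templates/ are not available through .Files.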
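
The template reads four values under dashboard. A values.yaml that exercises them might look like the sketch below; the key names come from the template itself, while the ConfigMap name shown is a hypothetical example.

```yaml
# values.yaml (sketch)
dashboard:
  enabled: true
  # Optional overrides; the template's defaults are noted in the comments.
  configMapName: observability-dashboards-tenant-my-app   # hypothetical; default: observability-dashboards-tenant-workloads
  configMapNamespace: observability                       # default: observability
  folder: "Workloads"                                     # default: Workloads
```

With dashboard.enabled set to false (or left unset), the template renders nothing, so the chart ships no dashboard ConfigMap.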

Kupe provides baseline tenant dashboards out of the box:

  • Overview / Cluster — cluster health and capacity overview.
  • Workloads / Pod Resources — per-pod CPU, memory, network, and filesystem usage.

Platform admins have additional dashboards in Main Org (overview, workloads, system, infrastructure, billing).

  • Review dashboard usefulness after incidents.
  • Remove low-signal panels.
  • Add panels for recurring failure patterns.