Metrics

Kupe Cloud stores tenant metrics in Mimir and exposes them in Grafana through a tenant-scoped metrics datasource. All of your managed clusters are queryable from the same place, and the cluster label is the main way to filter or compare them.

Kupe collects baseline Kubernetes and workload metrics for every managed cluster, including pod, container, deployment, namespace, and storage signals used by the built-in dashboards.

Application metrics are collected when your pods expose a Prometheus endpoint and opt in to scraping with annotations.

To explore metrics in Grafana:

  1. Open Grafana from the Kupe Cloud Portal.
  2. Go to Explore.
  3. Select the tenant metrics datasource if it is not already selected.
  4. Start with a simple query such as up.
  5. Narrow the result with labels such as cluster, namespace, pod, container, or job.
  6. Adjust the time range before concluding that a metric is missing.
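
Step 5 above narrows a query by adding label matchers. As a rough illustration of how such a selector is put together, the following Python sketch builds one from a metric name and label values. The helper name promql_selector and the example values prod-eu and checkout are hypothetical and not part of Kupe Cloud; in Explore you would normally type the resulting query directly.

```python
def promql_selector(metric: str, **labels: str) -> str:
    """Build a PromQL selector that uses equality matchers only."""

    def escape(value: str) -> str:
        # Double-quoted PromQL strings need backslash, quote, and newline escaped.
        return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

    # Sort label names so the same inputs always produce the same query text.
    matchers = ",".join(
        f'{name}="{escape(value)}"' for name, value in sorted(labels.items())
    )
    return f"{metric}{{{matchers}}}" if matchers else metric


# Hypothetical cluster and namespace names, for illustration only.
print(promql_selector("up"))
print(promql_selector("up", cluster="prod-eu", namespace="checkout"))
```

The second call prints up{cluster="prod-eu",namespace="checkout"}, which can be pasted into Explore to restrict the result to a single cluster and namespace.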

Top 10 pods by CPU usage:

topk(10,
  sum by (cluster, namespace, pod) (
    rate(container_cpu_usage_seconds_total{container!="",pod!=""}[5m])
  )
)

Top 10 pods by working-set memory:

topk(10,
  sum by (cluster, namespace, pod) (
    container_memory_working_set_bytes{container!="",pod!=""}
  )
)

Container restarts per pod over the last 30 minutes:

sum by (cluster, namespace, pod) (
  increase(kube_pod_container_status_restarts_total[30m])
)

Percentage of CPU periods throttled, per pod:

100 *
  sum by (cluster, namespace, pod) (
    rate(container_cpu_cfs_throttled_periods_total{container!="",pod!=""}[5m])
  )
/
  clamp_min(
    sum by (cluster, namespace, pod) (
      rate(container_cpu_cfs_periods_total{container!="",pod!=""}[5m])
    ),
    1
  )

Expose a Prometheus endpoint from your workload and annotate the pods so Kupe can scrape it.

Preferred annotations:

metadata:
  annotations:
    k8s.grafana.com/scrape: "true"
    k8s.grafana.com/metrics.path: "/metrics"
    k8s.grafana.com/metrics.portNumber: "8080"

Compatibility annotations are also supported:

metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/path: "/metrics"
    prometheus.io/port: "8080"

Prefer the k8s.grafana.com/* annotations for new workloads. The prometheus.io/* form is kept for compatibility.
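
On the workload side, the endpoint only needs to return metrics in the Prometheus text exposition format at the annotated path and port. The sketch below uses only the Python standard library and assumes the /metrics path and port 8080 from the annotations above. The metric name app_http_requests_total and the REQUEST_COUNTS dictionary are hypothetical, and most real services would use a Prometheus client library for their language instead of rendering the format by hand.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process counters keyed by (method, status).
# A real service would increment these while handling requests.
REQUEST_COUNTS = {("GET", "200"): 0}


def render_metrics(counts: dict) -> str:
    """Render the counters in the Prometheus text exposition format."""
    lines = [
        "# HELP app_http_requests_total Total HTTP requests handled.",
        "# TYPE app_http_requests_total counter",
    ]
    for (method, status), value in sorted(counts.items()):
        lines.append(
            f'app_http_requests_total{{method="{method}",status="{status}"}} {value}'
        )
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Serve metrics only on the path named in the scrape annotation.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(REQUEST_COUNTS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Block forever, serving /metrics on the annotated port."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()
```

Calling serve() from the container entrypoint starts the endpoint. The port passed to serve() must match the port in the scrape annotation.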

Labels you will use most often:

  • cluster: the managed cluster name, derived by the platform
  • namespace, pod, container: the main workload dimensions for troubleshooting
  • job: usually derived from app.kubernetes.io/name, with fallback to app

Use low-cardinality labels in your own metrics. Avoid request IDs, UUIDs, timestamps, or other values that create unbounded series counts.
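
To see why unbounded label values are a problem, note that the number of series a single metric can produce is at most the product of the distinct values of each label. The sketch below works through that arithmetic. The label names and counts are made-up illustrative figures, not measurements from Kupe Cloud.

```python
from math import prod


def series_upper_bound(label_cardinalities: dict) -> int:
    """Upper bound on series for one metric: the product of distinct values per label."""
    return prod(label_cardinalities.values())


# Illustrative numbers only: three bounded labels.
bounded = series_upper_bound({"method": 5, "status": 10, "route": 20})

# Swapping route for a per-request ID makes the bound grow with traffic.
unbounded = series_upper_bound({"method": 5, "status": 10, "request_id": 1_000_000})

print(bounded)    # 1000
print(unbounded)  # 50000000
```

Replacing a raw value with a bounded one, such as a route template instead of a full URL, keeps the product small and stable.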

The kube-system namespace is reserved for platform-managed components and is excluded from tenant observability. Workloads deployed to kube-system inside your cluster will not appear in your tenant metrics datasource.

If expected metrics are missing, work through these checks:

  1. Confirm you are in the correct tenant org and metrics datasource.
  2. Run the query up, or up{namespace="your-namespace"}, to verify that scrape targets exist.
  3. Check that the workload is not deployed in kube-system.
  4. Verify the scrape annotations, path, and port on the pods.
  5. Expand the time range and retry.
  6. Switch Explore to table view if you need to inspect labels directly.