Metrics

Kupe Cloud stores metrics in Mimir and exposes them in Grafana inside your tenant org.

All clusters in your tenant are queryable from the same Grafana metrics datasource. Use the cluster label to filter or compare clusters.
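
As a minimal sketch of comparing clusters with the cluster label, the query below counts the scrape targets that are currently healthy in each cluster. It assumes the built-in up metric is present for your targets.

```promql
# Number of scrape targets reporting up == 1, one result row per cluster.
count by (cluster) (up == 1)
```

To look at a single cluster instead, add a matcher such as cluster="prod-eu" to the selector. The name prod-eu is a hypothetical placeholder; use a value that appears in your own tenant.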

  1. Open Grafana from the Kupe Cloud Portal.
  2. Go to Explore.
  3. Under Metrics, enter the metric you would like to query. Let's start with: up
  4. Narrow by labels such as cluster, namespace, pod, container, or job.
  5. Adjust time range and resolution before concluding a metric is missing.
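
The narrowing step above might look like the following sketch. The namespace and pod prefix are placeholders, and whether up carries a pod label depends on how your targets are labeled, so treat the matchers as assumptions to adapt.

```promql
# Scrape health for one namespace, limited to pods whose name starts with "your-app-".
# PromQL regex matchers are fully anchored, so the trailing .* is required.
up{namespace="your-namespace", pod=~"your-app-.*"}
```

If this returns nothing, remove one matcher at a time to see which label value does not match.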

To test any query below in Grafana Explore:

  1. Open Explore.
  2. In the query editor, switch from Builder to Code mode.
  3. Paste the query.
  4. Click Run query (the refresh button) to execute it.
  5. Adjust time range and step if the chart looks sparse or noisy.

Top 10 pods by CPU usage (cores, 5-minute rate):

topk(10,
  sum by (cluster, namespace, pod) (
    rate(container_cpu_usage_seconds_total{container!="",pod!=""}[5m])
  )
)

Top 10 pods by working-set memory (bytes):

topk(10,
  sum by (cluster, namespace, pod) (
    container_memory_working_set_bytes{container!="",pod!=""}
  )
)

Container restarts per pod over the last 30 minutes:

sum by (cluster, namespace, pod) (
  increase(kube_pod_container_status_restarts_total[30m])
)

Percentage of CPU periods throttled per pod (5-minute rate):

100 *
sum by (cluster, namespace, pod) (
  rate(container_cpu_cfs_throttled_periods_total{container!="",pod!=""}[5m])
)
/
clamp_min(
  sum by (cluster, namespace, pod) (
    rate(container_cpu_cfs_periods_total{container!="",pod!=""}[5m])
  ),
  1
)

To get app metrics into Kupe Cloud, expose a Prometheus endpoint and annotate pods.

Preferred annotations:

metadata:
  annotations:
    k8s.grafana.com/scrape: "true"
    k8s.grafana.com/metrics.path: "/metrics"
    k8s.grafana.com/metrics.portNumber: "8080"

Compatibility annotations are also supported:

metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/path: "/metrics"
    prometheus.io/port: "8080"

Use one annotation family consistently rather than mixing them.
If both are set, k8s.grafana.com/scrape controls whether the pod is scraped, but which path and port are used can be ambiguous.

Common labels:

  • cluster: derived by Kupe from the managed cluster context and used for multi-cluster filtering.
  • namespace, pod, container: primary workload dimensions for troubleshooting.
  • job: often mapped from app.kubernetes.io/name when available.

Use low-cardinality labels in application metrics. Avoid per-request IDs, UUIDs, or timestamps as labels.
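
One way to spot high-cardinality metrics is to count active series per metric name. The query below is a sketch that assumes your application metrics carry a namespace label; it can be expensive on large namespaces, so run it over a short time range.

```promql
# Ten metric names with the most active series in one namespace.
topk(10, count by (__name__) ({namespace="your-namespace"}))
```

A metric that dominates this list often has a label with unbounded values, such as a request ID or raw URL path.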

If a metric appears to be missing:

  1. Confirm you’re in the correct tenant org and datasource.
  2. Run up and up{namespace="your-namespace"} to verify scrape targets exist.
  3. Check pod annotations for scrape enablement and correct path/port.
  4. Expand the Grafana time range (for example, last 6h) and retry.
  5. Validate metric names and labels in table view before building final queries.
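
As a complement to the checklist above, this sketch lists targets that were discovered but whose most recent scrape failed. The namespace is a placeholder.

```promql
# Targets in the namespace reporting up == 0, counted per cluster and job.
count by (cluster, job) (up{namespace="your-namespace"} == 0)
```

An empty result means no failing targets were found. Compare it with plain up{namespace="your-namespace"}: if that is also empty, the pods are probably not being discovered at all, which points back to the annotations.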