Logs

Kupe Cloud collects workload logs from every managed cluster and stores them in a tenant-scoped Loki backend. Your Grafana org includes the logs datasource and a built-in Logs dashboard, so there is nothing extra to install before you can start investigating.

The platform ships container logs from tenant workloads automatically and labels them with the fields you need for investigation:

  • cluster
  • namespace
  • pod
  • container

The kube-system namespace is reserved for platform-managed components and is excluded from tenant log shipping.
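One quick way to see this exclusion in practice: a selector that returns streams for a tenant namespace returns nothing for kube-system (the cluster name `production` here is illustrative):

```logql
# Tenant namespaces return streams as usual
{cluster="production", namespace="backend"}

# kube-system is excluded from tenant log shipping, so this returns no streams
{cluster="production", namespace="kube-system"}
```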

The Logs dashboard in the Workloads folder is the fastest way to investigate an issue without writing LogQL from scratch.

Use it when you want to:

  • narrow quickly by cluster, namespace, pod, or container
  • spot spikes in overall log volume
  • compare total log volume with likely error volume
  • read recent lines for the affected workload

Use Explore when you need deeper filtering, parsing, or ad-hoc queries.

  1. Open Grafana.
  2. Go to Explore.
  3. Select the tenant logs datasource if it is not already selected.
  4. Start with a narrow selector that includes cluster and namespace.
  5. Add line filters or parsers only after you have the right log stream.
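Steps 4 and 5 typically look like this in practice: start from a stream selector, confirm it matches the right workload, and only then layer filters and parsers on top (label values and the filter pattern are illustrative):

```logql
# Step 4: narrow by cluster and namespace first
{cluster="production", namespace="backend"}

# Step 5: then add line filters and parsers
{cluster="production", namespace="backend"} |~ "(?i)timeout" | json | level="error"
```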
The automatically applied labels:

Label       Example            Meaning
cluster     production         Managed cluster name
namespace   backend            Kubernetes namespace
pod         api-7d9b5f-abcde   Pod name
container   api                Container name within the pod
Some common LogQL patterns:

# All logs from a namespace
{cluster="production", namespace="backend"}
# Error logs from pods with a matching prefix
{cluster="production", namespace="backend", pod=~"api-.*"} |~ "(?i)error"
# Logs from one container in a multi-container pod
{cluster="production", namespace="backend", container="api"}
# Parse JSON logs and filter on a field
{cluster="production", namespace="backend"} | json | level="error"
# Parse logfmt logs and filter on a numeric field
{cluster="production", namespace="backend"} | logfmt | http_status >= 500
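If your JSON logs carry fields such as level and msg, line_format can reshape each line for easier reading in Explore (the field names are assumptions about your log schema):

```logql
# Show only the level and message of each JSON log line
{cluster="production", namespace="backend"} | json | line_format "{{.level}}: {{.msg}}"
```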

You can also turn log streams into metrics:

# Top 10 pods by error count over the last 5 minutes
topk(10,
  sum by (pod) (
    count_over_time(
      {cluster="production", namespace="backend"} |~ "(?i)error" [5m]
    )
  )
)

# Per-namespace log volume in the last 5 minutes
sum by (namespace) (
  count_over_time({cluster="production"}[5m])
)
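count_over_time gives totals over a window; for a per-second rate that is easier to compare across different time ranges, rate works the same way over a log selector:

```logql
# Per-pod log rate (lines per second) over the last 5 minutes
sum by (pod) (
  rate({cluster="production", namespace="backend"}[5m])
)
```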
To make your logs easy to query, when writing application logs:

  • prefer structured logs such as JSON or logfmt
  • include stable fields such as level, service, and request or trace IDs
  • avoid logging secrets, tokens, or sensitive payloads
  • keep fields machine-readable so they can be filtered in Explore
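The guidelines above can be followed with any logging library. As one minimal sketch in Python, a formatter that emits JSON lines to stdout with stable level, service, and request-ID fields (the service name and the request-ID convention are assumptions, not platform requirements):

```python
import json
import logging
import sys

class JsonFormatter(logging.Formatter):
    """Render each record as one JSON line with stable, machine-readable fields."""

    def format(self, record):
        payload = {
            "level": record.levelname.lower(),
            "service": "api",  # hypothetical service name
            "msg": record.getMessage(),
        }
        # Attach a request ID when the caller passes one via `extra`.
        if hasattr(record, "request_id"):
            payload["request_id"] = record.request_id
        return json.dumps(payload)

handler = logging.StreamHandler(sys.stdout)  # logs are shipped from stdout/stderr
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("api")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.error("upstream timeout", extra={"request_id": "req-123"})
```

Because the output is plain JSON on stdout, queries like `| json | level="error"` apply to it directly in Explore.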
If you do not see the logs you expect:

  1. Confirm you are in the correct tenant org and logs datasource.
  2. Check that the workload is not running in kube-system.
  3. Narrow by cluster and namespace before adding more filters.
  4. Expand the time range if the workload is quiet.
  5. Confirm the application is writing logs to stdout or stderr.
Related:

  • Metrics: correlate a log spike with resource or error signals
  • Notifications: route alert notifications to your team