# Query fleet metrics

> Read fleet stats and chart data from CLRK's metrics API: a typed catalog plus a time-series query over the spans and logs your agents already emit.

CLRK serves fleet stats - token usage, request rates, latency
percentiles, error counts - from a read-only metrics API on the
aggregated apiserver. Every value is a query-time aggregation over the same
`otel_traces` / `otel_logs` tables your `EgressGateway` already writes (see
[Send telemetry to OTLP endpoints](/docs/clrk/guides/send-telemetry-to-otlp.md)
for what those spans carry).

## The mental model

The API has two halves, both under the `metrics.clrk.apoxy.dev` group:

- **A catalog** - the `metrics` resource. Each entry is a named
  aggregation recipe (`gen_ai.tokens`, `egress.requests`, …) with the
  dimensions it can be grouped by. The catalog is the LIST of this
  resource, so `kubectl get metrics` prints it and the console renders
  its menus from a typed object instead of hardcoded JS.
- **A query** - the `series` subresource. `GET metrics/{id}/series`
  runs the recipe and returns a `MetricSeriesSet`: one labeled series
  per group, each carrying one point (a scalar) or one point per time
  bucket (a range).

The split mirrors the familiar `pods` + `pods/log` pattern:
`metrics/{id}` is the *descriptor* (what the metric is; cheap, no datastore
hit), and `metrics/{id}/series` is the *data* (runs the query, with parameters).

Separately, the **per-agent snapshot** resources - `taskagentmetrics`
and `daemonagentmetrics` - hold a flat scalar rollup per agent: the
numbers behind the agents page (invocations, errors, token totals,
latency). Each also exposes its own `series` subresource, which runs the
same time-series query scoped to that one agent.

## Browse the catalog

```bash title="terminal"
kubectl get metrics
```

```
NAME                TYPE        UNIT          SOURCE
gen_ai.tokens       Counter     tokens        traces
gen_ai.duration     Histogram   ms            traces
mcp.calls           Counter     calls         traces
mcp.duration        Histogram   ms            traces
egress.requests     Counter     requests      traces
egress.bytes        Counter     bytes         traces
agent.invocations   Counter     invocations   traces
agent.errors        Counter     errors        traces
budget.denied       Counter     denials       traces
log.severity        Counter     records       logs
```

A `GET` on one id returns its descriptor - the value type, the unit,
the backing table, and the `groupBy` dimensions it accepts:

```bash title="terminal"
kubectl get metric gen_ai.tokens -o yaml
```

The `type` tells you how to query it:

- **Counter** - a monotonic count or sum (`count()`, `sum(...)`).
- **Histogram** - a duration distribution, queried as quantile series.
- **Gauge** - a point-in-time value (reserved; the v1 catalog has none).

## Run a query

The query is a `GET` on the `series` subresource. The path carries the
metric id; the rest is query parameters. `kubectl get --raw` is the
simplest way to call it by hand:

```bash title="terminal"
kubectl get --raw \
"/apis/metrics.clrk.apoxy.dev/v1alpha1/namespaces/default/metrics/egress.requests/series"
```

```json
{
  "kind": "MetricSeriesSet",
  "apiVersion": "metrics.clrk.apoxy.dev/v1alpha1",
  "metric": "egress.requests",
  "type": "Counter",
  "unit": "requests",
  "since": "2026-06-25T06:02:16Z",
  "until": "2026-06-25T07:02:16Z",
  "series": [
    { "points": [ { "timestamp": "2026-06-25T07:02:16Z", "value": "22" } ] }
  ]
}
```
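
For clients that consume this response in code rather than by eye, the shape above maps onto a few small types. The following is a minimal Python sketch that uses only the fields visible in the sample; the class and function names are illustrative and not part of any published CLRK client. Point values are deliberately kept as strings, since that is how they arrive.

```python
import json
from dataclasses import dataclass


@dataclass
class Point:
    timestamp: str  # RFC3339 instant, as returned
    value: str      # left as a string; values are serialized quantities


@dataclass
class Series:
    points: list


@dataclass
class MetricSeriesSet:
    metric: str
    type: str
    unit: str
    since: str
    until: str
    series: list


def parse_series_set(body: str) -> MetricSeriesSet:
    """Decode a series response, checking that it is the expected kind."""
    doc = json.loads(body)
    if doc.get("kind") != "MetricSeriesSet":
        raise ValueError(f"unexpected kind: {doc.get('kind')!r}")
    series = [
        Series(points=[Point(p["timestamp"], p["value"]) for p in s.get("points", [])])
        for s in doc.get("series", [])
    ]
    return MetricSeriesSet(
        doc["metric"], doc["type"], doc["unit"], doc["since"], doc["until"], series
    )


# The sample response from this guide.
SAMPLE = """
{
  "kind": "MetricSeriesSet",
  "apiVersion": "metrics.clrk.apoxy.dev/v1alpha1",
  "metric": "egress.requests",
  "type": "Counter",
  "unit": "requests",
  "since": "2026-06-25T06:02:16Z",
  "until": "2026-06-25T07:02:16Z",
  "series": [
    { "points": [ { "timestamp": "2026-06-25T07:02:16Z", "value": "22" } ] }
  ]
}
"""

result = parse_series_set(SAMPLE)
print(result.metric, result.series[0].points[0].value)  # egress.requests 22
```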

### Scalar vs. time-series

Omit `step` and you get a **scalar**: one point per series, aggregated
over the whole window - the cards and counters.

Set `step` and you get a **range**: one point per `toStartOfInterval`
bucket - the charts.

```bash title="terminal"
# one point per 5-minute bucket
kubectl get --raw \
"/apis/metrics.clrk.apoxy.dev/v1alpha1/namespaces/default/metrics/egress.requests/series?step=5m"
```

The response echoes the resolved `step` and bucket timestamps. An
ungrouped query always returns exactly one series, even over an empty
window (zero points on a range, a single zero point on a scalar).
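
The scalar and range calls differ only in their query string, so a script that issues both can build the path from one helper. A minimal Python sketch, assuming the path layout shown in the `kubectl get --raw` examples; `series_path` is a hypothetical helper name, not part of CLRK.

```python
from urllib.parse import quote, urlencode

API_PREFIX = "/apis/metrics.clrk.apoxy.dev/v1alpha1"


def series_path(namespace, metric_id, **params):
    """Build the path for GET metrics/{id}/series, dropping unset parameters."""
    path = (
        f"{API_PREFIX}/namespaces/{quote(namespace, safe='')}"
        f"/metrics/{quote(metric_id, safe='')}/series"
    )
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{path}?{query}" if query else path


scalar = series_path("default", "egress.requests")            # no step: one point per series
ranged = series_path("default", "egress.requests", step="5m")  # step set: one point per bucket
print(scalar)
print(ranged)
```

Either string can be passed to `kubectl get --raw` unchanged.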

### Group by a dimension

`groupBy` splits the result into one series per distinct value of an
emitted attribute. Only the dimensions listed in the metric's
descriptor are valid:

```bash title="terminal"
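# $B abbreviates the namespaced API prefix in this and the later examples
B=/apis/metrics.clrk.apoxy.dev/v1alpha1/namespaces/default
# requests split into 2xx / 4xx / 5xx classes
kubectl get --raw "$B/metrics/egress.requests/series?groupBy=http.response.status_class"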
```

Each series carries its group value under a label keyed by the
`groupBy` dimension. A metric that reports more than one value per point
(e.g. `gen_ai.tokens` → input + output) adds a `measure` label, so the
result is one series per `(group × measure)`.
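
On the client side, a `(group × measure)` result is often easier to use once pivoted into a nested mapping. The Python sketch below does that under one loud assumption: that each series exposes its labels as a JSON object named `labels`. This guide does not show that field name, so check it against the schema reference first. The sample data is made up for illustration and is not real output.

```python
from collections import defaultdict


def pivot_by_group(series_list, group_by):
    """Return {group value: {measure: series}} for a grouped response."""
    table = defaultdict(dict)
    for s in series_list:
        labels = s.get("labels", {})         # ASSUMPTION: field is called "labels"
        group = labels.get(group_by, "")     # label keyed by the groupBy dimension
        measure = labels.get("measure", "")  # absent on single-measure metrics
        table[group][measure] = s
    return dict(table)


# Hypothetical series shaped like a gen_ai.tokens query grouped by model.
example = [
    {"labels": {"gen_ai.request.model": "model-a", "measure": "input"}, "points": []},
    {"labels": {"gen_ai.request.model": "model-a", "measure": "output"}, "points": []},
    {"labels": {"gen_ai.request.model": "model-b", "measure": "input"}, "points": []},
    {"labels": {"gen_ai.request.model": "model-b", "measure": "output"}, "points": []},
]

pivot = pivot_by_group(example, "gen_ai.request.model")
print(sorted(pivot))             # ['model-a', 'model-b']
print(sorted(pivot["model-a"]))  # ['input', 'output']
```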

### Histograms

A histogram metric returns one series per requested quantile. Omit
`quantiles` for the default p50 / p95 / p99:

```bash title="terminal"
kubectl get --raw "$B/metrics/gen_ai.duration/series?quantiles=0.5,0.99"
```

Each series is labeled with its `quantile`; values are whole
milliseconds. Combine with `step` for a per-quantile trend and `groupBy`
for one set of quantiles per group.
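
When quantiles come from configuration rather than being typed by hand, it helps to format the parameter in one place. A small Python sketch follows; both function names are hypothetical, and the `[0, 1]` range check is a client-side sanity check, not a statement of what the server accepts.

```python
def quantiles_param(quantiles):
    """Format quantiles as the comma-separated `quantiles` parameter value."""
    for q in quantiles:
        if not 0 <= q <= 1:
            raise ValueError(f"quantile out of range: {q}")
    return ",".join(format(q, "g") for q in quantiles)


def percentile_name(q):
    """Human-readable name for a quantile, e.g. 0.95 -> 'p95'."""
    return f"p{q * 100:g}"


print(quantiles_param([0.5, 0.99]))                       # 0.5,0.99
print([percentile_name(q) for q in (0.5, 0.95, 0.99)])    # ['p50', 'p95', 'p99']
```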

### The window

`since` and `until` are RFC3339 instants bounding a half-open
`[since, until)` window. When omitted, they default to the trailing hour
ending now, so a bare query still returns something sensible:

```bash title="terminal"
kubectl get --raw "$B/metrics/egress.requests/series?since=2026-06-25T00:00:00Z&until=2026-06-25T12:00:00Z&step=1h"
```

A future `until` is clamped to now (there is no data past now); the
response's `until` reflects the clamped value.
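
To predict the window a request will resolve to (for labeling a chart axis before the response arrives, say), the rules above can be mirrored client-side. This is a minimal Python sketch, not the server's implementation. In particular, defaulting `since` to one hour before `until` when only `until` is given is an assumption; the guide only states the default for a bare query.

```python
from datetime import datetime, timedelta, timezone

RFC3339 = "%Y-%m-%dT%H:%M:%SZ"


def resolve_window(since=None, until=None, now=None):
    """Mirror the documented defaults: trailing hour, future `until` clamped to now."""
    now = now or datetime.now(timezone.utc)
    until = min(until, now) if until else now
    # ASSUMPTION: a missing `since` trails the resolved `until` by one hour.
    since = since or until - timedelta(hours=1)
    if since >= until:
        raise ValueError("window is empty: since must be before until")
    return since, until


def in_window(ts, since, until):
    """Half-open membership test: since is included, until is excluded."""
    return since <= ts < until


now = datetime(2026, 6, 25, 7, 2, 16, tzinfo=timezone.utc)
since, until = resolve_window(now=now)
print(since.strftime(RFC3339), until.strftime(RFC3339))
# 2026-06-25T06:02:16Z 2026-06-25T07:02:16Z
```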

## Scope

Reads are scoped, and the scope is server-enforced - you cannot widen
it past what the path grants.

- **Fleet** (`metrics/{id}/series`) is scoped to the request namespace.
  Narrow it within that namespace with `scopeKind` + `scopeName`, where
  `scopeKind` is `TaskAgent`, `DaemonAgent`, or `EgressGateway`:

  ```bash title="terminal"
  kubectl get --raw "$B/metrics/egress.requests/series?scopeKind=EgressGateway&scopeName=prod-agents&groupBy=http.response.status_class"
  ```

- **Per-agent** (`taskagentmetrics/{name}/series`,
  `daemonagentmetrics/{name}/series`) is fixed to the agent the path
  names. The metric id moves to `?metric=`, and `scopeKind` / `scopeName`
  are rejected (the path already fixes the scope):

  ```bash title="terminal"
  kubectl get --raw "$B/taskagentmetrics/my-agent/series?metric=gen_ai.tokens&groupBy=gen_ai.request.model&step=5m"
  ```
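
The two scoping styles can be captured in a pair of path builders that fail early on combinations the server would refuse. A Python sketch under stated assumptions: the function names are hypothetical, the kind-to-resource mapping is inferred from the resource names in this guide, and requiring `scopeKind` and `scopeName` as a pair is this sketch's choice rather than documented behavior.

```python
from urllib.parse import urlencode

API_PREFIX = "/apis/metrics.clrk.apoxy.dev/v1alpha1"
SCOPE_KINDS = {"TaskAgent", "DaemonAgent", "EgressGateway"}
AGENT_RESOURCES = {"TaskAgent": "taskagentmetrics", "DaemonAgent": "daemonagentmetrics"}


def _with_query(path, params):
    query = urlencode(params)
    return f"{path}?{query}" if query else path


def fleet_series_path(namespace, metric_id, scope_kind=None, scope_name=None, **params):
    """Namespace-wide query, optionally narrowed with scopeKind + scopeName."""
    if (scope_kind is None) != (scope_name is None):
        # ASSUMPTION: the two parameters are only meaningful together.
        raise ValueError("scopeKind and scopeName must be set together")
    if scope_kind is not None:
        if scope_kind not in SCOPE_KINDS:
            raise ValueError(f"unknown scopeKind: {scope_kind}")
        params = {"scopeKind": scope_kind, "scopeName": scope_name, **params}
    return _with_query(f"{API_PREFIX}/namespaces/{namespace}/metrics/{metric_id}/series", params)


def agent_series_path(namespace, agent_kind, agent_name, metric_id, **params):
    """Per-agent query: the path fixes the scope, the metric id moves to ?metric=."""
    if "scopeKind" in params or "scopeName" in params:
        raise ValueError("per-agent paths already fix the scope")
    resource = AGENT_RESOURCES[agent_kind]
    path = f"{API_PREFIX}/namespaces/{namespace}/{resource}/{agent_name}/series"
    return _with_query(path, {"metric": metric_id, **params})


print(fleet_series_path("default", "egress.requests",
                        scope_kind="EgressGateway", scope_name="prod-agents",
                        groupBy="http.response.status_class"))
print(agent_series_path("default", "TaskAgent", "my-agent", "gen_ai.tokens",
                        groupBy="gen_ai.request.model", step="5m"))
```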

## Values, caps, and truncation

- **Values are exact.** Each point's `value` is a Kubernetes
  `resource.Quantity` serialized as a string, so an integer counter
  total stays exact regardless of JSON number precision - a token sum
  above 2⁵³ does not round.
- **Bounded fan-out.** A range query is capped at 1500 buckets
  (`window / step`) and a grouped query at the top 50 groups by total
  value; when more groups exist, the response sets `truncated: true`.
  The scanned window is capped at 31 days.
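
Both bullets have client-side consequences. First, a point's `value` should be parsed without going through a float. Kubernetes' canonical quantity form can also abbreviate round values with a suffix (for example `2k` for 2000), so a plain `int()` is not always enough; whether CLRK emits suffixed forms is something to verify against real responses. Second, a request can be checked against the caps before it is sent. The Python sketch below covers both. The helper names are hypothetical, the bucket count is rounded up as a conservative guess, and how the server reacts to a request over the cap is not stated here.

```python
import re
from datetime import timedelta
from fractions import Fraction

MAX_BUCKETS = 1500
MAX_WINDOW = timedelta(days=31)

# Decimal and binary suffixes used by Kubernetes quantities.
_SUFFIXES = {
    "n": Fraction(1, 10**9), "u": Fraction(1, 10**6), "m": Fraction(1, 10**3),
    "": 1, "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12, "P": 10**15, "E": 10**18,
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40, "Pi": 2**50, "Ei": 2**60,
}
_QUANTITY = re.compile(r"([+-]?\d+(?:\.\d+)?)(.*)")


def parse_quantity(text):
    """Parse a quantity string into an exact Fraction (no float rounding)."""
    match = _QUANTITY.fullmatch(text)
    if match is None:
        raise ValueError(f"not a quantity: {text!r}")
    number, suffix = Fraction(match.group(1)), match.group(2)
    if suffix in _SUFFIXES:
        return number * _SUFFIXES[suffix]
    if re.fullmatch(r"[eE][+-]?\d+", suffix):  # exponent form, e.g. 1e3
        return number * Fraction(10) ** int(suffix[1:])
    raise ValueError(f"unknown suffix: {suffix!r}")


def check_caps(window, step=None):
    """Return the bucket count (1 for a scalar), or raise if a documented cap is exceeded."""
    if window > MAX_WINDOW:
        raise ValueError("window exceeds the 31-day cap")
    if step is None:
        return 1
    buckets = -(-window // step)  # ceiling division on timedeltas
    if buckets > MAX_BUCKETS:
        raise ValueError(f"{buckets} buckets exceeds the cap of {MAX_BUCKETS}")
    return buckets


big = "9007199254740993"                 # 2**53 + 1
print(parse_quantity(big) == 2**53 + 1)  # True
print(int(float(big)) == 2**53 + 1)      # False: the float path rounds
print(check_caps(timedelta(hours=12), timedelta(hours=1)))  # 12
```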

## Where to next

- Understand the spans these recipes aggregate - see [Send telemetry to
  OTLP endpoints](/docs/clrk/guides/send-telemetry-to-otlp.md).
- Walk a single request across every span it produced - see [Trace
  requests through agents](/docs/clrk/guides/trace-requests-through-agents.md).
- Look up the endpoints and schemas - see [HTTP
  APIs](/docs/clrk/reference/http-apis.md) (the CLRK Metrics API section).
