Send telemetry to OTLP endpoints
Wire CLRK's request logs and traces to your observability backend. Per-EgressGateway configuration, OTLP/HTTP, no extra collector required.
CLRK's observability story is per-EgressGateway. Each gateway ships
its own captured request/response records and traces to an OTLP/HTTP
endpoint you configure. There is no global controller-manager OTLP
flag; observability is declarative cluster state, not process state.
The mental model
Every outbound request from an agent traverses an EgressGateway.
The gateway's ext_proc captures the L7 transaction (request +
response, with bounded body capture), enriches it with provider-aware
attributes (gen_ai.*), and ships one OTLP record per transaction.
That record lands in whatever OTLP/HTTP endpoint you point it at.
Captured request/response pairs always persist to the
controller-manager's embedded ClickHouse, regardless of
spec.otlp.endpoint. Setting endpoint only adds a best-effort
external re-export on top. Under clrk dev, records are also tee'd
to a local in-process receiver that lights up the TUI's otel-logs /
otel-traces panes.
Configure the endpoint
EgressGateway.spec.otlp:

    apiVersion: clrk.apoxy.dev/v1alpha1
    kind: EgressGateway
    metadata:
      name: prod-agents
    spec:
      defaultPolicy: deny-all
      listeners:
        - name: tls-out
          protocol: TLS
          tls:
            mode: Terminate
      otlp:
        endpoint: "https://api.honeycomb.io"
        headers:
          x-honeycomb-team: "${HONEYCOMB_API_KEY}"
        captureBody:
          maxBytes: 65536

Two important details:
- endpoint is a base URL. CLRK appends /v1/traces and /v1/logs itself. Don't include the path.
- OTLP/HTTP only today. No gRPC option. If your collector is gRPC-only, front it with otelcol-contrib doing HTTP-in / gRPC-out.
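
If your existing collector only speaks gRPC, the HTTP-in / gRPC-out bridge can be sketched as an otelcol-contrib config along these lines. This is a minimal sketch, not a tested deployment: the upstream address grpc-collector.observability.svc.cluster.local:4317 is a hypothetical placeholder, and tls.insecure assumes plaintext in-cluster traffic.

```yaml
receivers:
  otlp:
    protocols:
      http:
        # CLRK posts to <endpoint>/v1/traces and <endpoint>/v1/logs here.
        endpoint: 0.0.0.0:4318

exporters:
  # The plain "otlp" exporter speaks OTLP/gRPC.
  otlp:
    # Hypothetical address of the gRPC-only collector.
    endpoint: grpc-collector.observability.svc.cluster.local:4317
    tls:
      # Assumes unencrypted in-cluster traffic; remove for TLS upstreams.
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp]
    logs:
      receivers: [otlp]
      exporters: [otlp]
```

Point spec.otlp.endpoint at the bridge's HTTP listener (port 4318 in this sketch), without a path.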
What you'll see in spans
One span per L7 transaction through the gateway. Attributes worth querying on:

| Namespace | Attribute | Meaning |
|---|---|---|
| gen_ai.* | system | anthropic, openai, google_genai |
| gen_ai.* | operation.name | chat, text_completion, embeddings |
| gen_ai.* | request.model / response.model | model the request asked for vs. what answered |
| gen_ai.* | usage.input_tokens / usage.output_tokens | token counts from the provider response |
| gen_ai.* | response.stream | true if SSE / streaming |
| http.* | request.method, response.status_code | standard HTTP semantics |
| clrk.* | component | always egress-extproc for these spans (the ingress dispatch path uses ingress-extproc, worker spans use worker) |
| clrk.* | aiproviderroute.name / .namespace / .matched | which APR rule matched |
| clrk.* | dst.name | DNS-bound hostname for the connection |
| clrk.* | req.bytes / resp.bytes / req.truncated / resp.truncated | size and truncation flags |
| clrk.* | body.bytes / body.b64 / body.truncated / body.usage_visible / body.request_rewritten | body capture metadata |
| clrk.* | duration_ms | end-to-end transaction time |
| clrk.* | budget.denied / budget.daily_used / budget.daily_max | budget enforcement signal |
| clrk.* | l4.bytes_upstream | byte counter on the L4 leg (TCP routes) |
| (agent) | agent.kind, agent.namespace, agent.name, agent.uid, agent.revision | which agent originated the traffic |
| (agent) | invocation.id | stable per-invocation ID - join key across spans |

invocation.id is the load-bearing one for debugging: filter on it
to walk the full chain of egress traffic produced by a single
inbound request. See Trace requests through
agents for a worked debugging
session.
What clrk dev does differently
The clrk dev host CLI runs its own OTLP/HTTP receiver in-process
(port 14318). Under clrk dev, the controller-manager is launched
with --dev-otlp-fallback-endpoint pointing at that receiver. The
controller-manager's OTLP receiver then tees every captured signal to
the TUI receiver, in addition to writing to its embedded ClickHouse and
forwarding to any per-EgressGateway spec.otlp.endpoint. The dev tee does
not override the per-EgressGateway endpoint. That's why the example manifests can ship with
otlp.endpoint empty and still surface in the TUI's otel-logs /
otel-traces panes.
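
As an illustration, a dev-only manifest might leave the endpoint out entirely. This sketch reuses the fields from the EgressGateway example earlier on this page; the name dev-agents is hypothetical, and whether your CLRK version accepts an otlp block without endpoint (rather than expecting the block to be omitted) is an assumption to check against the CRD.

```yaml
apiVersion: clrk.apoxy.dev/v1alpha1
kind: EgressGateway
metadata:
  name: dev-agents   # hypothetical name
spec:
  defaultPolicy: deny-all
  listeners:
    - name: tls-out
      protocol: TLS
      tls:
        mode: Terminate
  otlp:
    # endpoint is intentionally not set: under clrk dev, records still
    # reach the TUI through the dev tee and the embedded ClickHouse.
    captureBody:
      maxBytes: 65536
```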
In production, no dev tee applies. Captured request/response pairs
always persist to the controller-manager's embedded ClickHouse; the
endpoint you set on the EgressGateway adds a best-effort external
re-export on top of that.
Wire it to Honeycomb

    otlp:
      endpoint: "https://api.honeycomb.io"
      headers:
        x-honeycomb-team: "${HONEYCOMB_API_KEY}"

Honeycomb's OTLP/HTTP ingest accepts both traces and logs at the
auto-appended /v1/traces and /v1/logs paths. Surface
HONEYCOMB_API_KEY via your usual secret-templating path before
applying.
Wire it to Grafana Tempo

    otlp:
      endpoint: "https://tempo-prod-04-prod-us-east-0.grafana.net"
      headers:
        Authorization: "Basic ${TEMPO_BASIC_AUTH}"

For Grafana Cloud, TEMPO_BASIC_AUTH is base64(<instance-id>:<api-key>).
Wire it to a self-hosted collector
If you already run otelcol-contrib, point CLRK at it and let the
collector fan out to your destinations:

    otlp:
      endpoint: "http://otelcol.observability.svc.cluster.local:4318"

The collector config side:

    receivers:
      otlp:
        protocols:
          http: {}
    exporters:
      otlphttp/honeycomb:
        endpoint: "https://api.honeycomb.io"
        headers:
          x-honeycomb-team: "${HONEYCOMB_API_KEY}"
    service:
      pipelines:
        traces:
          receivers: [otlp]
          exporters: [otlphttp/honeycomb]
        logs:
          receivers: [otlp]
          exporters: [otlphttp/honeycomb]

What's not in OTLP today
Honest gaps so you don't go looking:
- No worker-side spans. Sandbox spawn, teardown, and libcontainer events are slog-only. Coming soon.
- No metrics. Only traces + logs ship over OTLP today. Counters, gauges, and histograms are coming soon. If you need them now, talk to us.
- No sampling control. All spans ship; no head- or tail-sampling knob in the spec. Use your collector for sampling if you need it.
- Bounded body capture. otlp.captureBody.maxBytes defaults to 64 KiB per direction; bodies over that are truncated and marked clrk.body.truncated=true. Body capture only fires for application/json, application/x-ndjson, and text/event-stream by default; override includeContentTypes if you need others.
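
For the sampling gap, one way to thin out traces is to head-sample in a self-hosted collector before export. The sketch below assumes otelcol-contrib with its probabilistic_sampler processor and reuses the Honeycomb exporter from the collector example above; the 10 percent rate is an arbitrary illustration, not a recommendation.

```yaml
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318

processors:
  # Keeps roughly 10% of traces, decided per trace ID.
  probabilistic_sampler:
    sampling_percentage: 10

exporters:
  otlphttp/honeycomb:
    endpoint: "https://api.honeycomb.io"
    headers:
      x-honeycomb-team: "${HONEYCOMB_API_KEY}"

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [probabilistic_sampler]
      exporters: [otlphttp/honeycomb]
    # Logs pass through unsampled in this sketch.
    logs:
      receivers: [otlp]
      exporters: [otlphttp/honeycomb]
```

CLRK still ships every span to the collector; sampling here only reduces what the collector forwards downstream.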
Where to next
- Walk a single request from inbound HTTP through every OTLP span it generates - see Trace requests through agents.
- Alert on budget denials - see Cap LLM spend per agent.
- Alert on egress denials (or the absence of records you expected) - see Lock down agent egress.