Send telemetry to OTLP endpoints

CLRK's observability story is per-EgressGateway. Each gateway ships its own captured request/response records and traces to an OTLP/HTTP endpoint you configure. There is no global controller-manager OTLP flag; observability is declarative cluster state, not process state.

The mental model

Every outbound request from an agent traverses an EgressGateway. The gateway's ext_proc captures the L7 transaction (request + response, with bounded body capture), enriches it with provider-aware attributes (gen_ai.*), and ships one OTLP record per transaction. That record lands in whatever OTLP/HTTP endpoint you point it at.

Captured request/response pairs always persist to the controller-manager's embedded ClickHouse, regardless of spec.otlp.endpoint. Setting endpoint only adds a best-effort external re-export on top. Under clrk dev, records are also tee'd to a local in-process receiver that lights up the TUI's otel-logs / otel-traces panes.

Configure the endpoint

EgressGateway.spec.otlp:

apiVersion: clrk.apoxy.dev/v1alpha1
kind: EgressGateway
metadata:
  name: prod-agents
spec:
  defaultPolicy: deny-all
  listeners:
    - name: tls-out
      protocol: TLS
      tls:
        mode: Terminate
  otlp:
    endpoint: "https://api.honeycomb.io"
    headers:
      x-honeycomb-team: "${HONEYCOMB_API_KEY}"
    captureBody:
      maxBytes: 65536

Two important details:

endpoint is a base URL. CLRK appends /v1/traces and /v1/logs itself. Don't include the path.
OTLP/HTTP only today. No gRPC option. If your collector is gRPC-only, front it with otelcol-contrib doing HTTP-in / gRPC-out.
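
For the gRPC-only case, a minimal otelcol-contrib bridge might look like the sketch below. The upstream address grpc-backend.observability.svc.cluster.local:4317 and the otlp/grpc-backend exporter name are placeholders, and the plaintext tls.insecure setting is an assumption that only suits in-cluster traffic; adjust both for your backend.

```yaml
receivers:
  otlp:
    protocols:
      http:
        # Listen on all interfaces so the gateway can reach the receiver
        # from another pod; 4318 is the conventional OTLP/HTTP port.
        endpoint: 0.0.0.0:4318
exporters:
  # The collector's "otlp" exporter speaks OTLP over gRPC.
  otlp/grpc-backend:
    # Hypothetical gRPC-only backend; replace with your own address.
    endpoint: "grpc-backend.observability.svc.cluster.local:4317"
    tls:
      # Assumption: plaintext inside the cluster. Remove for TLS backends.
      insecure: true
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp/grpc-backend]
    logs:
      receivers: [otlp]
      exporters: [otlp/grpc-backend]
```

With a bridge like this in place, spec.otlp.endpoint points at the collector's HTTP port (4318 here), not at the gRPC backend.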

What you'll see in spans

One span per L7 transaction through the gateway. Attributes worth querying on:

Namespace	Attribute	Meaning
`gen_ai.*`	`system`	`anthropic`, `openai`, `google_genai`
	`operation.name`	`chat`, `text_completion`, `embeddings`
	`request.model` / `response.model`	model the request asked for vs. what answered
	`usage.input_tokens` / `usage.output_tokens`	token counts from the provider response
	`response.stream`	true if SSE / streaming
`http.*`	`request.method`, `response.status_code`	standard HTTP semantics
`clrk.*`	`component`	always `egress-extproc` for these spans (the ingress dispatch path uses `ingress-extproc`, worker spans use `worker`)
	`aiproviderroute.name` / `.namespace` / `.matched`	which AIProviderRoute (APR) rule matched
	`dst.name`	DNS-bound hostname for the connection
	`req.bytes` / `resp.bytes` / `req.truncated` / `resp.truncated`	size and truncation flags
	`body.bytes` / `body.b64` / `body.truncated` / `body.usage_visible` / `body.request_rewritten`	body capture metadata
	`duration_ms`	end-to-end transaction time
	`budget.denied` / `budget.daily_used` / `budget.daily_max`	budget enforcement signal
	`l4.bytes_upstream`	byte counter on the L4 leg (TCP routes)
(agent)	`agent.kind`, `agent.namespace`, `agent.name`, `agent.uid`, `agent.revision`	which agent originated the traffic
	`invocation.id`	stable per-invocation ID - join key across spans

invocation.id is the load-bearing one for debugging: filter on it to walk the full chain of egress traffic produced by a single inbound request. See Trace requests through agents for a worked debugging session.

What `clrk dev` does differently

The clrk dev host CLI runs its own OTLP/HTTP receiver in-process (port 14318). Under clrk dev, the controller-manager is launched with --dev-otlp-fallback-endpoint pointing at that receiver. The controller-manager's OTLP receiver then tees every captured signal to the TUI receiver, in addition to its embedded ClickHouse and any per-EgressGateway spec.otlp.endpoint forwarder. It does not override the per-gateway endpoint. That's why the example manifests can ship with otlp.endpoint empty and still surface in the TUI's otel-logs / otel-traces panes.

In production, no dev tee applies. Captured request/response pairs always persist to the controller-manager's embedded ClickHouse; the endpoint you set on the EgressGateway adds a best-effort external re-export on top of that.

Wire it to Honeycomb

otlp:
  endpoint: "https://api.honeycomb.io"
  headers:
    x-honeycomb-team: "${HONEYCOMB_API_KEY}"

Honeycomb's OTLP/HTTP ingest accepts both traces and logs at the auto-appended /v1/traces and /v1/logs paths. Surface HONEYCOMB_API_KEY via your usual secret-templating path before applying.

Wire it to Grafana Tempo

otlp:
  endpoint: "https://tempo-prod-04-prod-us-east-0.grafana.net"
  headers:
    Authorization: "Basic ${TEMPO_BASIC_AUTH}"

For Grafana Cloud, TEMPO_BASIC_AUTH is base64(<instance-id>:<api-key>).

Wire it to a self-hosted collector

If you already run otelcol-contrib, point CLRK at it and let the collector fan out to your destinations:

otlp:
  endpoint: "http://otelcol.observability.svc.cluster.local:4318"

The collector config side:

receivers:
  otlp:
    protocols:
      http: {}
exporters:
  otlphttp/honeycomb:
    endpoint: "https://api.honeycomb.io"
    headers:
      x-honeycomb-team: "${HONEYCOMB_API_KEY}"
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlphttp/honeycomb]
    logs:
      receivers: [otlp]
      exporters: [otlphttp/honeycomb]
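
To fan out to more than one destination, list several exporters on a pipeline. The sketch below adds a second, traces-only destination; the otlphttp/tempo exporter name is illustrative, its endpoint and TEMPO_BASIC_AUTH are reused from the Grafana Tempo section, and whether a given backend also accepts logs is something to confirm with that vendor.

```yaml
receivers:
  otlp:
    protocols:
      http: {}
exporters:
  otlphttp/honeycomb:
    endpoint: "https://api.honeycomb.io"
    headers:
      x-honeycomb-team: "${HONEYCOMB_API_KEY}"
  # Illustrative second destination, reusing the Tempo example's values.
  otlphttp/tempo:
    endpoint: "https://tempo-prod-04-prod-us-east-0.grafana.net"
    headers:
      Authorization: "Basic ${TEMPO_BASIC_AUTH}"
service:
  pipelines:
    traces:
      receivers: [otlp]
      # Every span is sent to both exporters.
      exporters: [otlphttp/honeycomb, otlphttp/tempo]
    logs:
      receivers: [otlp]
      # Logs go only to the backend known to accept them.
      exporters: [otlphttp/honeycomb]
```

Keeping the fan-out in the collector means the EgressGateway spec still carries a single endpoint, and destinations can change without re-applying gateway manifests.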

What's not in OTLP today

Honest gaps so you don't go looking:

No worker-side spans. Sandbox spawn, teardown, and libcontainer events are slog-only. Coming soon.
No metrics. Only traces + logs ship over OTLP today. Counters, gauges, and histograms are coming soon. If you need them now, talk to us.
No sampling control. All spans ship; no head- or tail-sampling knob in the spec. Use your collector for sampling if you need it.
Bounded body capture. otlp.captureBody.maxBytes defaults to 64 KiB per direction; bodies over that are truncated and marked clrk.body.truncated=true. By default, body capture only fires for application/json, application/x-ndjson, and text/event-stream; override includeContentTypes if you need others.
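
For the sampling gap, collector-side sampling is the workaround. Here is a minimal sketch using otelcol-contrib's probabilistic_sampler processor on the traces pipeline. The 10% rate is an arbitrary illustration, and the exporter is carried over from the self-hosted collector example.

```yaml
receivers:
  otlp:
    protocols:
      http: {}
processors:
  # Head sampling: keeps roughly this percentage of traces.
  probabilistic_sampler:
    sampling_percentage: 10  # arbitrary example rate
exporters:
  otlphttp/honeycomb:
    endpoint: "https://api.honeycomb.io"
    headers:
      x-honeycomb-team: "${HONEYCOMB_API_KEY}"
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [probabilistic_sampler]
      exporters: [otlphttp/honeycomb]
    logs:
      # Logs are left unsampled in this sketch.
      receivers: [otlp]
      exporters: [otlphttp/honeycomb]
```

Sampling here affects only what the collector forwards. The gateway still ships every span to the collector, and captured request/response pairs still persist to the controller-manager's embedded ClickHouse.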

Where to next

Walk a single request from inbound HTTP through every OTLP span it generates - see Trace requests through agents.
Alert on budget denials - see Cap LLM spend per agent.
Alert on egress denials (or the absence of records you expected) - see Lock down agent egress.
