# Core concepts

> The CRDs CLRK installs and how they compose - agents, pools, egress, routes, policies, and observability.

Every resource in CLRK is a Kubernetes custom resource under
`clrk.apoxy.dev/v1alpha1`. You apply them with `clrk apply -f` or `kubectl apply -f`;
the controller-manager reconciles them; the worker materializes them as sandboxed
processes and intercepting Envoy listeners.

| Concept | What it is |
|---|---|
| **TaskAgent** | Triggered agent - one run per request, cron, or webhook |
| **DaemonAgent** | Long-lived agent process - supervised and restarted per its restart policy |
| **AgentSandboxRevision** | Immutable snapshot of an agent's template, named `{agent}-{5-digit-generation}` |
| **WorkerPool** | Fleet of worker pods that host sandboxes |
| **EgressGateway** | Intercepting proxy every outbound connection traverses |
| **AIProviderRoute** | Provider-aware match for LLM traffic (OpenAI, Anthropic, etc.) |
| **MCPRoute** | Match for MCP tool-call traffic |
| **EgressL4Route** | L4 routing on CIDR, port, hostname, or SNI |
| **CredentialInjectionPolicy** | Inject a Secret as a header or query param at the proxy |
| **RateLimitPolicy** | Cap request rates per agent, execution, or route |
| **EgressDenyPolicy** | Flip a route from allow to deny |
| **LoggingPolicy** | Toggle request/response capture and header redaction |

---

## Agents

CLRK splits "agent" into two kinds based on how the work is triggered.

### TaskAgent

A `TaskAgent` runs to completion in response to an external trigger - an HTTP request, a
cron fire, or anything that knows how to POST to the materialized ingress. Each fire
gets its own sandbox.

```yaml title="taskagent.yaml"
apiVersion: clrk.apoxy.dev/v1alpha1
kind: TaskAgent
metadata:
  name: word-count
spec:
  workerPoolRef: default
  timeout: 60s
  template:
    spec:
      image: registry.example.com/word-count:0.1
  egressRefs:
    - gatewayRef: default-egress
```

The wire format the agent reads on stdin is a CloudEvents structured-mode envelope
(`.data` is the caller's body). Set `spec.delivery.mode: Metadata` to switch to an
IMDS-style HTTP transport instead - useful for runtimes that don't read stdin gracefully.
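
As a sketch, switching the `word-count` agent above to the metadata transport should only
need a `delivery` block. The field path `spec.delivery.mode` and the value `Metadata` are
the ones named in the paragraph above; the rest of the manifest is copied unchanged, and the
name of the default (stdin) mode is left out because this page does not state it.

```yaml title="taskagent-metadata.yaml"
apiVersion: clrk.apoxy.dev/v1alpha1
kind: TaskAgent
metadata:
  name: word-count
spec:
  workerPoolRef: default
  timeout: 60s
  # Hand the request to the agent over the IMDS-style HTTP transport
  # instead of stdin.
  delivery:
    mode: Metadata
  template:
    spec:
      image: registry.example.com/word-count:0.1
  egressRefs:
    - gatewayRef: default-egress
```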

`spec.schedule` adds a cron trigger without disabling HTTP triggers. `spec.scheduleInput`
becomes `.data` on cron fires. `spec.warmPoolSize` keeps N pre-spawned sandboxes ready
to absorb cold-start cost; `spec.maxConcurrent` caps in-flight executions globally.
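
A minimal sketch that combines these fields on the `word-count` agent. The field names come
from the paragraph above; the five-field cron expression, the shape of `scheduleInput`, and
the numeric values are assumptions for illustration, so check the reference for the exact
types before copying.

```yaml title="taskagent-scheduled.yaml"
apiVersion: clrk.apoxy.dev/v1alpha1
kind: TaskAgent
metadata:
  name: word-count
spec:
  workerPoolRef: default
  timeout: 60s
  # Assumed standard 5-field cron syntax: fire every 15 minutes.
  # HTTP triggers keep working alongside the schedule.
  schedule: "*/15 * * * *"
  # Assumed to accept a free-form object; it becomes `.data` on cron fires.
  scheduleInput:
    source: scheduled-run
  # Keep two sandboxes pre-spawned to absorb cold starts.
  warmPoolSize: 2
  # Cap in-flight executions across the whole agent.
  maxConcurrent: 10
  template:
    spec:
      image: registry.example.com/word-count:0.1
  egressRefs:
    - gatewayRef: default-egress
```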

### DaemonAgent

A `DaemonAgent` is a long-lived process that the supervisor keeps alive per
`restartPolicy` (`Always`, `OnFailure`, `Never`). No HTTP ingress - the daemon drives
its own work loop and pushes outbound traffic through the EgressGateway.

```yaml title="daemonagent.yaml"
apiVersion: clrk.apoxy.dev/v1alpha1
kind: DaemonAgent
metadata:
  name: echo-bot
spec:
  workerPoolRef: default
  restartPolicy: Always
  egressRefs:
    - gatewayRef: echo-bot
  template:
    spec:
      image: docker.io/curlimages/curl:8.10.1
      command: ["sh", "-c"]
      args: ["while :; do curl -sS https://api.anthropic.com/v1/messages ...; sleep 5; done"]
```

`spec.maxRestarts` caps restart attempts; `spec.maxLifetimeSeconds` forces a periodic
recycle for processes that accumulate state.
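
For illustration, a daemon that restarts only on failure, gives up after a bounded number of
attempts, and is recycled once a day might look like the following. The two field names come
from the paragraph above; the values are arbitrary, and `command`/`args` are omitted here
for brevity.

```yaml title="daemonagent-bounded.yaml"
apiVersion: clrk.apoxy.dev/v1alpha1
kind: DaemonAgent
metadata:
  name: echo-bot
spec:
  workerPoolRef: default
  restartPolicy: OnFailure
  # Cap restart attempts (illustrative value).
  maxRestarts: 5
  # Force a recycle every 24 hours (86400 seconds).
  maxLifetimeSeconds: 86400
  egressRefs:
    - gatewayRef: echo-bot
  template:
    spec:
      image: docker.io/curlimages/curl:8.10.1
```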

When in doubt, pick `TaskAgent` - the request model composes uniformly with HTTP ingress,
attribution, OTLP, credential injection, and budgets. Reach for `DaemonAgent` only when
the work genuinely needs to outlive a request.

### AgentSandboxRevision

Every change to `spec.template` on a `TaskAgent` or `DaemonAgent` produces a new
`AgentSandboxRevision`, named `{agent}-{5-digit-generation}` with the zero-padded number
taken from `metadata.generation` (e.g. `word-count-00001`, `word-count-00002`, ...).
Revisions are immutable; the agent points at `latestReadyRevisionName` once the new image
is pulled and the sandbox is schedulable. To roll back, pin the agent to an earlier
revision.

## Worker pools

A `WorkerPool` is the fleet of worker pods that host sandboxes. Each pool sets its own
sizing, node placement, and execution caps. Agents bind to a pool by name in the same
namespace.

```yaml title="workerpool.yaml"
apiVersion: clrk.apoxy.dev/v1alpha1
kind: WorkerPool
metadata:
  name: default
spec:
  replicas: 3
  maxExecutionsPerWorker: 32
  warmPool: 4
  template:
    image: us-west1-docker.pkg.dev/apoxy-dev/public/clrk-worker:latest
    nodeSelector:
      clrk.apoxy.dev/role: worker
```

`clrk dev` auto-creates a `default` pool with one replica. You override it by applying
your own `WorkerPool` of the same name. `replicas` controls pod count;
`maxExecutionsPerWorker` caps concurrent sandboxes per pod; `warmPool` pre-spawns
sandboxes so cold-start cost doesn't appear on the request hot path.

## Egress

Every outbound connection a sandbox makes goes through an `EgressGateway` - a per-team
Envoy that terminates TLS (or passes it through), captures L7 records, injects
credentials, and enforces policy.

```yaml title="egressgateway.yaml"
apiVersion: clrk.apoxy.dev/v1alpha1
kind: EgressGateway
metadata:
  name: echo-bot
spec:
  defaultPolicy: deny-all
  listeners:
    - name: egress
      protocol: TLS
      tls:
        mode: Terminate
  otlp:
    captureBody:
      maxBytes: 65536
    endpoint: https://otlp.example.com
```

Listeners declare interception capabilities by protocol:

<DocsTable>
| Protocol | What it intercepts |
|---|---|
| `TCP` | Raw L4 - match on CIDR/port/hostname (DNS-snooped). |
| `UDP` | Raw L4 - same matching surface. |
| `TLS` (`mode: Passthrough`) | SNI-routed, no termination. Hostname matching only. |
| `TLS` (`mode: Terminate`) | MITM - leaf cert signed by per-EG CA. Full L7 visibility. |
| `HTTP` | Plain L7. |
| `HTTPS` | Implicitly Terminate-mode TLS + HTTP. |
</DocsTable>
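
As a sketch of how the rows above could combine, the gateway below declares one listener per
interception style. The `protocol` and `tls.mode` values are taken from the table; the
gateway and listener names are hypothetical, and whether these protocols can be mixed freely
on one gateway is an assumption to verify against the reference.

```yaml title="egressgateway-listeners.yaml"
apiVersion: clrk.apoxy.dev/v1alpha1
kind: EgressGateway
metadata:
  name: mixed-egress          # hypothetical name
spec:
  defaultPolicy: deny-all
  listeners:
    # Raw L4: match on CIDR, port, or DNS-snooped hostname.
    - name: raw-tcp
      protocol: TCP
    # SNI-routed without termination: hostname matching only.
    - name: tls-passthrough
      protocol: TLS
      tls:
        mode: Passthrough
    # Implicitly Terminate-mode TLS + HTTP: full L7 visibility.
    - name: https
      protocol: HTTPS
```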

`defaultPolicy: deny-all` (the default) drops any connection that no route matches.
`defaultPolicy: allow-all` is for dev convenience and should not be used in production.

`spec.otlp.endpoint` is a best-effort external re-export of request records. Captured
request/response pairs always persist to the controller-manager's embedded ClickHouse
regardless of this field. Under `clrk dev`, the host CLI runs an in-process OTLP receiver,
and the controller-manager tees every captured signal to it. That tee is what populates
the TUI's `otel-logs` and `otel-traces` panes, in addition to ClickHouse and any
`spec.otlp.endpoint`.

## Routes

Routes attach to a listener via `parentRefs` (Gateway API conventions). Three route
kinds exist, each tuned to the traffic it matches.

### AIProviderRoute

Match LLM-provider traffic by provider, model, or endpoint, and attach filters
(token budgets, custom extension policies). The proxy parses the provider's wire
protocol and emits `gen_ai.*` OTLP attributes for tokens, model, and latency.

```yaml title="aiproviderroute.yaml"
apiVersion: clrk.apoxy.dev/v1alpha1
kind: AIProviderRoute
metadata:
  name: anthropic
spec:
  parentRefs:
    - group: clrk.apoxy.dev
      kind: EgressGateway
      name: echo-bot
  rules:
    - matches:
        - provider: anthropic
          models: ["claude-*"]
          endpoints: ["/v1/messages"]
      filters:
        - type: TokenBudget
          tokenBudget:
            maxTokensPerExecution: 10000
            maxTokensPerDay: 1000000
```

Only `maxTokensPerDay` is enforced today (a counter per route, per EgressGateway, per
UTC calendar day). `maxTokensPerExecution` and `maxOutputTokensPerRequest` are accepted
by the schema but not yet enforced by any code path.
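
Given that, a route that relies only on the enforced limit could be written as below. This
is the same `anthropic` route with the unenforced field dropped; it assumes the individual
`tokenBudget` fields are optional, which this page does not state explicitly.

```yaml title="aiproviderroute-daily-budget.yaml"
apiVersion: clrk.apoxy.dev/v1alpha1
kind: AIProviderRoute
metadata:
  name: anthropic
spec:
  parentRefs:
    - group: clrk.apoxy.dev
      kind: EgressGateway
      name: echo-bot
  rules:
    - matches:
        - provider: anthropic
          models: ["claude-*"]
          endpoints: ["/v1/messages"]
      filters:
        - type: TokenBudget
          tokenBudget:
            # The only limit enforced today: counted per route, per
            # EgressGateway, per UTC calendar day.
            maxTokensPerDay: 1000000
```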

Accepted provider values today: `openai`, `anthropic`, `google`, `azure-openai`,
`bedrock`, and `custom` for self-hosted endpoints (`provider` is a registry-validated
string, not a kubebuilder enum).

### MCPRoute

Match traffic to MCP tool servers and constrain which tools can be invoked. Useful
when the agent is wired to a tool host that you don't fully trust.

```yaml title="mcproute.yaml"
apiVersion: clrk.apoxy.dev/v1alpha1
kind: MCPRoute
metadata:
  name: github-tools
spec:
  parentRefs:
    - kind: EgressGateway
      name: echo-bot
  rules:
    - matches:
        - servers: ["https://mcp.example.com/*"]
          tools: ["github_*"]
      filters:
        - type: ToolPolicy
          toolPolicy:
            allowedTools: ["github_search_issues", "github_get_pr"]
            maxCallsPerExecution: 50
```

### EgressL4Route

Catch-all L4 routing on hostnames, CIDRs, and ports. Hostname matching honors SNI on
TLS listeners and DNS-snooped destination names on plain TCP listeners.

```yaml title="egressl4route.yaml"
apiVersion: clrk.apoxy.dev/v1alpha1
kind: EgressL4Route
metadata:
  name: postgres-allow
spec:
  parentRefs:
    - kind: EgressGateway
      name: echo-bot
  rules:
    - matches:
        - protocol: TCP
          destinationHostnames: ["db.internal.example.com"]
          ports:
            - port: 5432
```

## Policies

Policies attach to a route or gateway via `parentRefs` / `targetRef` (Gateway API
GEP-713) and modify how the proxy treats matched traffic.

### CredentialInjectionPolicy

The headline policy. Store a Secret on the cluster, attach it to a route, and the
proxy inserts it as a header (or query param, or provider-specific signature) on the
way out. The agent process never sees the value.

```yaml title="credentialinjectionpolicy.yaml"
apiVersion: clrk.apoxy.dev/v1alpha1
kind: CredentialInjectionPolicy
metadata:
  name: anthropic
spec:
  parentRefs:
    - group: clrk.apoxy.dev
      kind: AIProviderRoute
      name: anthropic
  secretRef:
    name: anthropic-credentials
  secretKey: api-key
  target: Header
  headerName: x-api-key
```
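
The policy above expects a Secret named `anthropic-credentials` with an `api-key` entry. A
minimal sketch of that Secret follows, using the standard Kubernetes `v1` Secret schema. The
value is a placeholder, and placing the Secret in the same namespace as the policy is an
assumption.

```yaml title="anthropic-credentials.yaml"
apiVersion: v1
kind: Secret
metadata:
  name: anthropic-credentials
type: Opaque
stringData:
  # Placeholder: replace with the real key. `stringData` lets you write
  # the value in plain text; the apiserver stores it base64-encoded.
  api-key: "<your Anthropic API key>"
```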

`target: ProviderAuth` (with `providerAuth.type: AWSv4` or `GCPServiceAccount`)
performs provider-specific request signing instead of a plain header swap.
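
A hedged sketch of the signing variant is shown below. Only `target: ProviderAuth` and
`providerAuth.type` come from this page; the policy, route, and Secret names are
hypothetical, and any further `providerAuth` settings (region, service, Secret key names)
are not shown because this page does not list them.

```yaml title="credentialinjectionpolicy-awsv4.yaml"
apiVersion: clrk.apoxy.dev/v1alpha1
kind: CredentialInjectionPolicy
metadata:
  name: bedrock-sigv4          # hypothetical name
spec:
  parentRefs:
    - group: clrk.apoxy.dev
      kind: AIProviderRoute
      name: bedrock            # hypothetical route matching `provider: bedrock`
  secretRef:
    name: aws-credentials      # hypothetical Secret holding the AWS credentials
  # Sign each outbound request instead of swapping in a plain header.
  target: ProviderAuth
  providerAuth:
    type: AWSv4
```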

### RateLimitPolicy

Cap request rates at the proxy. `scope: PerAgent` shares the budget across all
executions of the same agent; `PerExecution` is per-sandbox; `PerRoute` is per-route
globally.

```yaml title="ratelimitpolicy.yaml"
apiVersion: clrk.apoxy.dev/v1alpha1
kind: RateLimitPolicy
metadata:
  name: anthropic-1rps
spec:
  requests: 1
  window: 1s
  scope: PerAgent
```
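
For comparison, a per-sandbox budget would change only `scope` and the numbers. In this
sketch the policy name and values are illustrative, and `window: 1m` assumes the field
accepts the usual duration suffixes beyond the `1s` shown above.

```yaml title="ratelimitpolicy-per-execution.yaml"
apiVersion: clrk.apoxy.dev/v1alpha1
kind: RateLimitPolicy
metadata:
  name: anthropic-60-per-minute   # hypothetical name
spec:
  # Each sandbox gets its own budget of 60 requests per minute.
  requests: 60
  window: 1m
  scope: PerExecution
```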

### EgressDenyPolicy

Flip an otherwise-allowed route to deny. Useful when a parent gateway is `allow-all`
but a specific subset should be blocked, or as a break-glass switch that turns a route
off without deleting it.

```yaml title="egressdenypolicy.yaml"
apiVersion: clrk.apoxy.dev/v1alpha1
kind: EgressDenyPolicy
metadata:
  name: block-github
spec:
  targetRef:
    group: clrk.apoxy.dev
    kind: EgressL4Route
    name: github
  denyResponse:
    statusCode: 451
    message: "Outbound to GitHub is currently disabled by policy."
```

### LoggingPolicy

Per-route override for what the OTLP capture records: request body, response body, and
which headers to redact.

```yaml title="loggingpolicy.yaml"
apiVersion: clrk.apoxy.dev/v1alpha1
kind: LoggingPolicy
metadata:
  name: anthropic-redact
spec:
  captureRequest: true
  captureResponse: true
  redactHeaders: ["x-api-key", "authorization"]
```

## Observability

L7 capture is always on for HTTP and Terminate-mode TLS listeners - the proxy emits one
OTLP log record per request with method, host, path, status, latency, and
provider-specific attributes (`gen_ai.tokens.input`, `gen_ai.tokens.output`, model, route
name, trace ID). Spans for the same request land in OTLP traces.

Captured records always persist to the controller-manager's embedded ClickHouse. Set
`spec.otlp.endpoint` on the EgressGateway to additionally re-export to your collector
(`https://otlp.example.com`) - a best-effort fan-out, not a redirect. Under `clrk dev`
the dev TUI receives a tee of the same records regardless of `spec.otlp.endpoint`.

`spec.otlp.captureBody.maxBytes` controls whether bodies are recorded and how many bytes of each are kept.
Redaction (via [`LoggingPolicy`](#loggingpolicy) or the
[`CredentialInjectionPolicy`](#credentialinjectionpolicy) default) runs before the
record is emitted - captured request/response payloads never carry the real credential
value.

## Where to next

- [Quickstart](/docs/clrk/getting-started/quickstart.md) - apply the canonical example and
  watch every concept on this page light up.
- [Local development](/docs/clrk/getting-started/local.md) - every flag on `clrk dev`, plus
  the dev-loop patterns that keep you off the apiserver-restart treadmill.
- [Guides](/docs/clrk/guides.md) - task-oriented walkthroughs for credentials, budgets, egress
  lockdown, OTLP wiring, and authentication.
- [Reference](/docs/clrk/reference.md) - every CRD field and CLI flag, autogenerated from the
  source.
