Lock down agent egress

Default-deny outbound traffic. Allowlist the upstreams your agents actually need by hostname, CIDR, and port.

A prompt-injected or compromised agent will try to phone home. The defense is to restrict what it can reach. CLRK's EgressGateway defaults to deny-all - egress is closed unless you open it. This guide shows how to open exactly what you need and nothing else.

Default is closed

EgressGateway.spec.defaultPolicy defaults to deny-all. Any outbound destination that doesn't match an attached route is dropped at the worker's dialer, before any connection establishes. You opt into traffic, not out of it.

The example manifests in _examples/ use allow-all because they're demos. For anything past the demo, leave the default alone and add routes for what the agent actually needs.
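
To make the evaluation order concrete, here is a minimal Python sketch of default-deny matching. It is a toy model, not CLRK's implementation: AllowRule and decide are hypothetical names, and matching is reduced to exact hostname plus port. Only the deny-all and allow-all policy values come from this guide.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class AllowRule:
    """Hypothetical stand-in for one allowlisted destination."""
    hostname: str
    port: int


def decide(hostname: str, port: int, rules: Iterable[AllowRule],
           default_policy: str = "deny-all") -> str:
    """Return "allow" if any rule matches, else fall back to the default."""
    for rule in rules:
        if rule.hostname == hostname and rule.port == port:
            return "allow"
    # No rule matched: the gateway's default policy decides.
    return "allow" if default_policy == "allow-all" else "deny"


rules = [AllowRule("api.anthropic.com", 443), AllowRule("pg.prod.internal", 5432)]
print(decide("api.anthropic.com", 443, rules))   # allow
print(decide("paste.example.com", 443, rules))   # deny
```

With deny-all, the unmatched destination falls through to deny. Passing default_policy="allow-all" is the only change that would let it out, which is why the demo setting should not survive past the demo.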

Two policy primitives

  • EgressL4Route declares allowed L4 destinations: destinationCIDRs, destinationHostnames, ports, protocol (TCP or UDP). Optional sourceAgents label selector scopes the rule to specific agents.
  • EgressDenyPolicy attaches to a route via targetRef and inverts it from allow to deny, with an HTTP-layer denyResponse (default 403, custom message optional).

For the default-deny + allowlist pattern, you'll mostly use EgressL4Route. EgressDenyPolicy is for the inverse case: a default-allow gateway where you want to punch specific holes shut.

Recipe: default-deny + allowlist

This is the standard production shape. Allow only the upstreams the agent needs:

$terminalYAML
apiVersion: clrk.apoxy.dev/v1alpha1
kind: EgressGateway
metadata:
  name: prod-agents
spec:
  defaultPolicy: deny-all
  listeners:
    - name: tcp-out
      protocol: TCP
    - name: tls-out
      protocol: TLS
      tls:
        mode: Terminate
---
# Allow TLS to Anthropic.
apiVersion: clrk.apoxy.dev/v1alpha1
kind: EgressL4Route
metadata:
  name: anthropic-tls
spec:
  parentRefs:
    - group: clrk.apoxy.dev
      kind: EgressGateway
      name: prod-agents
      sectionName: tls-out
  rules:
    - matches:
        - destinationHostnames: ["api.anthropic.com"]
          ports: [{ port: 443 }]
          protocol: TCP
---
# Allow plain TCP to one Postgres.
apiVersion: clrk.apoxy.dev/v1alpha1
kind: EgressL4Route
metadata:
  name: app-postgres
spec:
  parentRefs:
    - group: clrk.apoxy.dev
      kind: EgressGateway
      name: prod-agents
      sectionName: tcp-out
  rules:
    - matches:
        - destinationHostnames: ["pg.prod.internal"]
          ports: [{ port: 5432 }]
          protocol: TCP

Anything not matched by these rules - DNS lookups to attacker infrastructure, surprise outbound HTTPS to a paste site, an nc to a hostile listener - is dropped at the worker.

How hostnames get resolved

Hostname matching is real, not advisory, but only when the agent uses plain DNS. The worker snoops UDP/53 responses and caches (resolved IP) → name bindings. On connection, the dialer attaches the snooped hostname to the connection's PROXY v2 frame so the L4 routing layer can match on it. SNI on TLS listeners is read too, but SNI is agent-supplied and an attacker can lie there.

Recommendation: have your agents use the kernel resolver (plain UDP/53). Encrypted resolvers (DoT, DoH) bypass the snoop, which means hostname rules degrade to whatever the agent declares via SNI: useful for the cooperative case, weak as a security boundary.
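
The snoop-then-lookup flow described above can be sketched as a small cache. This is an illustrative model only: SnoopCache and its methods are hypothetical names, the IP addresses are documentation placeholders, and a real cache would also need to handle TTLs and several names sharing one IP.

```python
from typing import Dict, Iterable, Optional


class SnoopCache:
    """Toy IP -> hostname cache fed by observed plain-DNS answers."""

    def __init__(self) -> None:
        self._name_by_ip: Dict[str, str] = {}

    def observe_answer(self, name: str, ips: Iterable[str]) -> None:
        # Normalize the queried name and bind every answered IP to it.
        normalized = name.rstrip(".").lower()
        for ip in ips:
            self._name_by_ip[ip] = normalized

    def hostname_for(self, ip: str) -> Optional[str]:
        # None means no plain-DNS answer was seen for this IP.
        return self._name_by_ip.get(ip)


cache = SnoopCache()
cache.observe_answer("api.anthropic.com.", ["203.0.113.10"])
print(cache.hostname_for("203.0.113.10"))  # api.anthropic.com
print(cache.hostname_for("198.51.100.7"))  # None
```

The second lookup models an address resolved over DoH or DoT: the answer never crossed UDP/53, so there is no binding to attach and only the agent-supplied SNI remains.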

Even on a TLS-terminated listener (mode: Terminate), the EgressL4Route hostname match still runs against the agent-supplied SNI - decrypted Host / :authority matching is an L7 feature (AIProviderRoute / the MCP route layer), not EgressL4Route. For a hard egress boundary, pair the hostname rule with a destinationCIDRs match, or move hostname enforcement up to the L7 route layer.

Wildcards and CIDRs

$terminalYAML
destinationHostnames:
  - "api.openai.com"        # exact
  - "*.azure.openai.com"    # single-label wildcard
destinationCIDRs:
  - "10.20.0.0/16"          # whole subnet
  - "203.0.113.5/32"        # single IP
ports:
  - port: 443
  - startPort: 8000         # inclusive range
    endPort: 8099
protocol: TCP

Wildcard semantics follow Gateway API's Hostname: *.openai.com matches api.openai.com but not eu.api.openai.com. One prefix label only. If you need to match deeper subdomains, list each pattern explicitly (for example, *.api.openai.com).

CIDRs match IP only - no DNS involved. Useful for "allow this internal VPC range" rules.
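
The three matchers can be modeled in a few lines of Python. This is a sketch of the semantics as this guide describes them (single-label wildcard, IP-only CIDR match, inclusive port range), not CLRK's code. All function names are hypothetical.

```python
import ipaddress


def hostname_matches(pattern: str, hostname: str) -> bool:
    """Exact match, or a "*." wildcard standing for exactly one leading label."""
    pattern = pattern.lower().rstrip(".")
    hostname = hostname.lower().rstrip(".")
    if not pattern.startswith("*."):
        return pattern == hostname
    suffix = pattern[1:]  # e.g. ".openai.com"
    if not hostname.endswith(suffix):
        return False
    prefix = hostname[: -len(suffix)]
    # Exactly one label: non-empty and containing no further dots.
    return prefix != "" and "." not in prefix


def cidr_matches(cidr: str, ip: str) -> bool:
    """Pure IP comparison; no DNS lookup is involved."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)


def port_in_range(port: int, start_port: int, end_port: int) -> bool:
    """Both ends of the range are inclusive."""
    return start_port <= port <= end_port


print(hostname_matches("*.openai.com", "api.openai.com"))     # True
print(hostname_matches("*.openai.com", "eu.api.openai.com"))  # False
print(cidr_matches("10.20.0.0/16", "10.20.3.4"))              # True
print(port_in_range(8099, 8000, 8099))                        # True
```

Note that under these semantics the bare domain (openai.com) does not match its own wildcard either, so list it separately if the agent needs it.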

What a denial looks like

A denied L4 connection is refused at the worker before any backend is selected or any data is spliced - the connection never reaches Envoy or the upstream. Two things to know:

  • The worker emits a dedicated deny record. A denied connection produces no L4 ext_proc allow-record (it never reaches Envoy), but the worker DOES publish an egress.dial.denied OTLP span and log record carrying clrk.egress.deny_reason=policy (plus agent.name, clrk.dst.name, and the peer address/port). Query on clrk.egress.deny_reason to find denials directly - don't rely on record-absence.
  • The agent sees a TCP connection failure: whatever your language's socket library reports when connect() fails with ECONNREFUSED or the connection closes immediately (EOF).
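
On the agent side, a denial is indistinguishable from any other failed dial, so it helps to handle it explicitly. The following Python sketch uses only the standard socket module. The helper names and the returned labels are hypothetical, and which exception your agent actually sees depends on how the denial surfaces in your environment.

```python
import socket


def describe_connect_failure(exc: BaseException) -> str:
    """Map a dial exception to a short label for logging."""
    if isinstance(exc, ConnectionRefusedError):
        return "refused"
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timeout"
    if isinstance(exc, OSError):
        return "os-error"
    return "unexpected"


def try_connect(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt a TCP dial and report the outcome instead of raising."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except OSError as exc:
        # Covers refusals, timeouts, and resolution errors alike.
        return describe_connect_failure(exc)
```

Logging the label alongside the destination gives you something to correlate with the worker's egress.dial.denied records when an agent reports that an upstream is unreachable.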

If you want a friendlier denial - say, a 403 with a custom message on the L7 side - use EgressDenyPolicy against a specific allowed route to flip it. The denyResponse is HTTP-shaped (status + body), so it only applies to L7 routes.

Three canned allowlists

Agent that only talks to Anthropic:

$terminalYAML
- destinationHostnames: ["api.anthropic.com"]
  ports: [{ port: 443 }]
  protocol: TCP

Agent with customer data:

$terminalYAML
- destinationHostnames: ["api.anthropic.com", "s3.us-east-1.amazonaws.com"]
  ports: [{ port: 443 }]
  protocol: TCP
- destinationHostnames: ["pg.prod.internal"]
  ports: [{ port: 5432 }]
  protocol: TCP

Agent that only reaches cluster-internal services:

$terminalYAML
- destinationHostnames: ["*.svc.cluster.local"]
  protocol: TCP
- destinationCIDRs: ["10.0.0.0/8"]
  protocol: TCP

What this does NOT do

  • Does not inspect HTTPS request bodies. L4 is a destination-only matcher. For body inspection use AIProviderRoute filters and the MCP route layer.
  • Does not prevent leakage to allowed hosts. If you allow Slack, an agent can DM the attacker via Slack. Hostname allowlists buy you a lot but they don't replace output review.
  • Does not bound bandwidth. No bytes-per-second policy today - coming soon. Talk to us if you need bandwidth caps before then.

Where to next

  • Pair the allowlist with credential injection so the agent doesn't even need to know the key - see Hide credentials from agents.
  • Confirm denials by querying the worker's egress.dial.denied records (clrk.egress.deny_reason=policy), and confirm allowed traffic by reading its OTLP records - see Trace requests through agents.
  • Cap LLM-side cost on the upstreams you do allow - see Cap LLM spend per agent.