Lock down agent egress
Default-deny outbound traffic. Allowlist the upstreams your agents actually need by hostname, CIDR, and port.
A prompt-injected or compromised agent will try to phone home. The
defense is to restrict what it can reach. CLRK's EgressGateway
defaults to deny-all: egress is closed unless you open it. This
guide shows how to open exactly what you need and nothing else.
Default is closed
EgressGateway.spec.defaultPolicy defaults to deny-all. Any
outbound destination that doesn't match an attached route is dropped
at the worker's dialer, before any connection is established. You opt
into traffic, not out of it.
The example manifests in _examples/ use allow-all because they're
demos. For anything past the demo, leave the default alone and add
routes for what the agent actually needs.
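As a minimal sketch of leaving the default alone, the gateway below omits defaultPolicy and so stays deny-all until routes are attached. The gateway name is a placeholder, and it is an assumption that the field may simply be left out of spec to take the default.

```yaml
apiVersion: clrk.apoxy.dev/v1alpha1
kind: EgressGateway
metadata:
  name: staging-agents   # hypothetical name
spec:
  # defaultPolicy is intentionally omitted: it defaults to deny-all,
  # so nothing leaves until an EgressL4Route is attached.
  listeners:
    - name: tcp-out
      protocol: TCP
```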
Two policy primitives
- EgressL4Route declares allowed L4 destinations: destinationCIDRs, destinationHostnames, ports, protocol (TCP or UDP). Optional sourceAgents label selector scopes the rule to specific agents.
- EgressDenyPolicy attaches to a route via targetRef and inverts it from allow to deny, with an HTTP-layer denyResponse (default 403, custom message optional).
For the default-deny + allowlist pattern, you'll mostly use
EgressL4Route. EgressDenyPolicy is for the inverse case: a
default-allow gateway where you want to block specific destinations.
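A sketch of scoping a route with sourceAgents follows. This guide only says that sourceAgents is a label selector on the rule. The Kubernetes-style matchLabels shape, its placement next to matches, and the role: billing label are assumptions to check against the CRD reference. The parentRefs point at the prod-agents gateway from the recipe in the next section.

```yaml
apiVersion: clrk.apoxy.dev/v1alpha1
kind: EgressL4Route
metadata:
  name: billing-postgres        # hypothetical name
spec:
  parentRefs:
    - group: clrk.apoxy.dev
      kind: EgressGateway
      name: prod-agents
      sectionName: tcp-out
  rules:
    - sourceAgents:             # assumed placement and selector shape
        matchLabels:
          role: billing         # hypothetical agent label
      matches:
        - destinationHostnames: ["pg.prod.internal"]
          ports: [{ port: 5432 }]
          protocol: TCP
```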
Recipe: default-deny + allowlist
This is the standard production shape. Allow only the upstreams the agent needs:
apiVersion: clrk.apoxy.dev/v1alpha1
kind: EgressGateway
metadata:
  name: prod-agents
spec:
  defaultPolicy: deny-all
  listeners:
    - name: tcp-out
      protocol: TCP
    - name: tls-out
      protocol: TLS
      tls:
        mode: Terminate
---
# Allow TLS to Anthropic.
apiVersion: clrk.apoxy.dev/v1alpha1
kind: EgressL4Route
metadata:
  name: anthropic-tls
spec:
  parentRefs:
    - group: clrk.apoxy.dev
      kind: EgressGateway
      name: prod-agents
      sectionName: tls-out
  rules:
    - matches:
        - destinationHostnames: ["api.anthropic.com"]
          ports: [{ port: 443 }]
          protocol: TCP
---
# Allow plain TCP to one Postgres.
apiVersion: clrk.apoxy.dev/v1alpha1
kind: EgressL4Route
metadata:
  name: app-postgres
spec:
  parentRefs:
    - group: clrk.apoxy.dev
      kind: EgressGateway
      name: prod-agents
      sectionName: tcp-out
  rules:
    - matches:
        - destinationHostnames: ["pg.prod.internal"]
          ports: [{ port: 5432 }]
          protocol: TCP

Anything not matched by these rules - DNS lookups to attacker
infrastructure, surprise outbound HTTPS to a paste site, an nc to a
hostile listener - is dropped at the worker.
How hostnames get resolved
Hostname matching is real, not advisory, but only when the agent uses
plain DNS. The worker snoops UDP/53 responses and caches
(resolved IP) → name bindings. On connection, the dialer attaches
the snooped hostname to the connection's PROXY v2 frame so the
L4 routing layer can match on it. SNI on TLS listeners is read too,
but SNI is agent-supplied and an attacker can lie there.
Recommendation: have your agents use the system resolver (plain UDP/53). Encrypted resolvers (DoT, DoH) bypass the snoop, which means hostname rules degrade to whatever the agent declares via SNI: useful for the cooperative case, weak as a security boundary.
Even on a TLS-terminated listener (mode: Terminate), the
EgressL4Route hostname match still runs against the agent-supplied
SNI - decrypted Host / :authority matching is an L7 feature
(AIProviderRoute / the MCP route layer), not EgressL4Route. For a
hard egress boundary, pair the hostname rule with a
destinationCIDRs match, or move hostname enforcement up to the L7
route layer.
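One way to pair a hostname rule with a CIDR, sketched under assumptions: the match below lists both destinationHostnames and destinationCIDRs, assuming that fields inside a single match entry are ANDed (as hostname, port, and protocol appear to be in the recipe above). The CIDR is a documentation-range placeholder, not Anthropic's real address space. Substitute the ranges your upstream actually publishes.

```yaml
rules:
  - matches:
      - destinationHostnames: ["api.anthropic.com"]
        destinationCIDRs: ["203.0.113.0/24"]   # placeholder range (TEST-NET-3), not a real upstream
        ports: [{ port: 443 }]
        protocol: TCP
```

If the two fields turn out to be ORed rather than ANDed, this would widen the rule instead of narrowing it, so confirm the semantics before relying on it as a boundary.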
Wildcards and CIDRs
destinationHostnames:
  - "api.openai.com"        # exact
  - "*.azure.openai.com"    # single-label wildcard
destinationCIDRs:
  - "10.20.0.0/16"          # whole subnet
  - "203.0.113.5/32"        # single IP
ports:
  - port: 443
  - startPort: 8000         # inclusive range
    endPort: 8099
protocol: TCP

Wildcard semantics follow Gateway API's Hostname: *.openai.com
matches api.openai.com but not eu.api.openai.com. One prefix
label only. If you need multi-label wildcards, list them explicitly.
CIDRs match IP only - no DNS involved. Useful for "allow this internal VPC range" rules.
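Because a wildcard covers one prefix label only, covering deeper names means listing each level. A sketch using only the fields shown above, where the second pattern is an illustrative addition:

```yaml
destinationHostnames:
  - "*.openai.com"        # matches api.openai.com
  - "*.api.openai.com"    # matches eu.api.openai.com, which the pattern above does not
ports:
  - port: 443
protocol: TCP
```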
What a denial looks like
A denied L4 connection is refused at the worker before any backend is selected or any data is spliced - the connection never reaches Envoy or the upstream. Two things to know:
- The worker emits a dedicated deny record. A denied connection produces no L4 ext_proc allow-record (it never reaches Envoy), but the worker DOES publish an egress.dial.denied OTLP span and log record carrying clrk.egress.deny_reason=policy (plus agent.name, clrk.dst.name, and the peer address/port). Query on clrk.egress.deny_reason to find denials directly rather than inferring them from missing records.
- The agent sees a TCP connection failure. Expect whatever your language's socket library reports when connect() fails with ECONNREFUSED or the connection closes immediately (EOF).
If you want a friendlier denial - say, a 403 with a custom message
on the L7 side - use EgressDenyPolicy against a specific allowed
route to flip it. The denyResponse is HTTP-shaped (status + body),
so it only applies to L7 routes.
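A hedged sketch of such a policy: EgressDenyPolicy, targetRef, and denyResponse are named in this guide, but their exact field layout is not. The apiVersion, the targetRef shape (group/kind/name), the status and body field names, and the route name openai-chat are all assumptions to verify against the CRD reference.

```yaml
apiVersion: clrk.apoxy.dev/v1alpha1   # assumed to match the other CLRK kinds
kind: EgressDenyPolicy
metadata:
  name: block-openai-chat       # hypothetical name
spec:
  targetRef:                    # assumed Gateway API style reference
    group: clrk.apoxy.dev
    kind: AIProviderRoute       # an L7 route, since denyResponse is HTTP-shaped
    name: openai-chat           # hypothetical route
  denyResponse:
    status: 403                 # assumed field name; 403 is the documented default
    body: "Egress to this provider is blocked by policy."   # assumed field name
```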
Three canned allowlists
Agent that only talks to Anthropic:

- destinationHostnames: ["api.anthropic.com"]
  ports: [{ port: 443 }]
  protocol: TCP

Agent with customer data:

- destinationHostnames: ["api.anthropic.com", "s3.us-east-1.amazonaws.com"]
  ports: [{ port: 443 }]
  protocol: TCP
- destinationHostnames: ["pg.prod.internal"]
  ports: [{ port: 5432 }]
  protocol: TCP

Agent that only reaches cluster-internal services:

- destinationHostnames: ["*.svc.cluster.local"]
  protocol: TCP
- destinationCIDRs: ["10.0.0.0/8"]
  protocol: TCP

What this does NOT do
- Does not inspect HTTPS request bodies. L4 is a destination-only matcher. For body inspection use AIProviderRoute filters and the MCP route layer.
- Does not prevent leakage to allowed hosts. If you allow Slack, an agent can DM the attacker via Slack. Hostname allowlists buy you a lot but they don't replace output review.
- Does not bound bandwidth. No bytes-per-second policy today - coming soon. Talk to us if you need bandwidth caps before then.
Where to next
- Pair the allowlist with credential injection so the agent doesn't even need to know the key - see Hide credentials from agents.
- Confirm denials by querying the worker's egress.dial.denied records (clrk.egress.deny_reason=policy), and confirm allowed traffic by reading its OTLP records - see Trace requests through agents.
- Cap LLM-side cost on the upstreams you do allow - see Cap LLM spend per agent.