Claude Code over HTTP
Stand up Claude Code as an internal HTTP service so callers can prompt it without holding the Anthropic API key.
You want one internal endpoint your services can POST a prompt to, get
a Claude response back, and never deal with API keys outside the
cluster. This guide shows the smallest version of that, using
_examples/jq-bot as the worked template, then explains how to fork
it for your own prompt.
What you'll build
A TaskAgent named jq-bot that accepts a JSON body, runs the
@anthropic-ai/claude-code CLI inside a fresh sandbox per request,
and returns the model's structured output. The Anthropic key lives in
a Secret; the proxy injects it on the way to api.anthropic.com. A
cold call runs in 10-30s; warm-pool callers see closer to a second.
The HTTP contract
CLRK materializes a per-TaskAgent Envoy Gateway. Callers send:
POST / HTTP/1.1
Host: <gateway-host>
X-Clrk-TaskAgent: default/<agent-name>
Content-Type: application/json
{ ...your JSON input... }
Two things matter:
- X-Clrk-TaskAgent is required. The ingress ext_proc reads it before the HTTPRoute header filter runs and 400s requests that don't carry it. Make sure your auth proxy or client always sets it. The value is <namespace>/<agent-name>.
- The agent's stdout becomes the response body. Stderr is captured for logs but does not reach the caller. Set Content-Type from your agent if you want anything other than whatever Envoy infers.
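Because a missing X-Clrk-TaskAgent header is rejected with a 400, it can help to wrap the call so the header is always built from explicit arguments. The following is a minimal sketch, not part of CLRK: the function name call_agent and its argument order are hypothetical, and it assumes the gateway URL is reachable from wherever the caller runs.

```shell
# Hypothetical helper (not shipped with CLRK): POST a JSON body from stdin
# to a TaskAgent gateway, always setting the required routing header.
# usage: call_agent <gateway-url> <namespace> <agent-name> < body.json
call_agent() {
  local url ns agent
  url="${1:-}"
  ns="${2:-}"
  agent="${3:-}"
  if [ -z "$url" ] || [ -z "$ns" ] || [ -z "$agent" ]; then
    echo "usage: call_agent <gateway-url> <namespace> <agent-name> < body.json" >&2
    return 2
  fi
  # --fail makes curl exit non-zero on HTTP errors such as the 400 for a
  # missing header; --data-binary @- sends stdin unmodified as the body.
  curl -sS --fail \
    -H 'Content-Type: application/json' \
    -H "X-Clrk-TaskAgent: ${ns}/${agent}" \
    --data-binary @- \
    "$url"
}
```

With the port-forward from this guide in place, a call would look like `call_agent http://localhost:18080/ default jq-bot < request.json`, where request.json is whatever JSON input your agent expects.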
Bring up jq-bot
ANTHROPIC_API_KEY=sk-ant-... clrk dev \
  --apply _examples/jq-bot/manifests \
  --secret anthropic-credentials=ANTHROPIC_API_KEY:api-key
Wait for the per-TaskAgent Gateway to come up:
export KUBECONFIG=~/.clrk/kubeconfig.host
kubectl get gateway jq-bot # PROGRAMMED=True
Then port-forward and call it:
kubectl port-forward -n clrk svc/clrk-jq-bot 18080:80 &
curl -sS http://localhost:18080/ \
-H 'content-type: application/json' \
-H 'X-Clrk-TaskAgent: default/jq-bot' \
--data '{
"input": [
{"name":"alice","age":30,"role":"eng"},
{"name":"bob","age":42,"role":"pm"},
{"name":"carol","age":25,"role":"eng"}
],
"want": "names of engineers, ascending by age"
}'
# {"filter":"[.[] | select(.role == \"eng\")] | sort_by(.age) | map(.name)","output":["carol","alice"]}
Claude generates the jq filter; the shell runs jq against your
input to verify it; you get back both the filter and its output. The
output field cannot be hallucinated, because jq computes it from your
input and the model never writes it.
What's inside the image
_examples/jq-bot/Dockerfile:
FROM node:22-alpine
RUN apk add --no-cache bash jq ca-certificates curl \
&& npm install -g @anthropic-ai/claude-code \
&& mkdir -p /root/.claude/projects \
&& rm -rf /root/.npm
# Claude Code refuses to start without an API key in the env. The
# value here is a placeholder - the egress MITM rewrites the
# x-api-key header on every Anthropic request.
ENV ANTHROPIC_API_KEY=clrk-injected-by-proxy
COPY agent.sh /usr/local/bin/agent.sh
RUN chmod +x /usr/local/bin/agent.sh
ENTRYPOINT ["/usr/local/bin/agent.sh"]
agent.sh is short. It reads the CloudEvents envelope from stdin,
extracts .data.input and .data.want, asks Claude for one jq
filter, runs it against the input, and emits a single JSON object. The
manifest pins two env vars:
env:
  - name: ANTHROPIC_API_KEY
    value: clrk-injected-by-proxy
  - name: HOME
    value: /tmp
HOME=/tmp is mandatory - Claude Code writes session artifacts under
$HOME at startup, and the sandbox's root filesystem isn't writable
where the CLI expects. Without it, claude --print exits 0 with no
output and no stderr (a particularly silent failure mode).
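The agent.sh flow described above can be sketched roughly as follows. This is an illustration of the steps, not the file shipped in _examples/jq-bot: the function names and the prompt wording are hypothetical, and the claude flags are the ones this guide names (--print, --bare, --no-session-persistence) rather than something verified against a particular CLI version.

```shell
# Hypothetical sketch of the jq-bot flow; the real agent.sh may differ.

# Ask the model for one jq filter. The prompt text is illustrative only.
ask_for_filter() {
  local input want
  input="$1"
  want="$2"
  claude --print --bare --no-session-persistence \
    "Reply with exactly one jq filter and nothing else. Input: ${input} Goal: ${want}"
}

# Read a CloudEvents envelope on stdin; write {"filter":...,"output":...} to stdout.
run_agent() {
  local envelope input want filter output
  envelope="$(cat)"
  input="$(printf '%s' "$envelope" | jq -c '.data.input')" || return 1
  want="$(printf '%s' "$envelope" | jq -r '.data.want')" || return 1
  filter="$(ask_for_filter "$input" "$want")" || return 1
  # jq, not the model, computes the output; an invalid filter fails here.
  output="$(printf '%s' "$input" | jq -c "$filter")" || return 1
  jq -cn --arg filter "$filter" --argjson output "$output" \
    '{filter: $filter, output: $output}'
}
```

In a standalone script you would end with a bare `run_agent` call so the function consumes the envelope the sandbox passes on stdin. Keeping the model call in its own function also makes it easy to stub out when testing the jq steps locally.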
Fork it for your own prompt
Copy _examples/jq-bot/ somewhere outside the repo, then:
- Replace agent.sh with your prompt logic. Keep the CloudEvents-envelope-from-stdin contract - jq '.data' lifts the caller's payload out. See Package a custom agent for the full envelope shape.
- Build multi-arch: docker buildx build --platform=linux/amd64,linux/arm64 -t <your-registry>/<name>:<tag> --push . Worker pools pull whatever architecture they're on; ship both if you're not certain.
- Update the manifest at manifests/taskagent.yaml: spec.template.spec.image to your reference, and metadata.name to your agent name (becomes the Gateway/Service name too).
- Re-apply: clrk apply -f manifests/. Tag with a content hash, not :latest - the apply only re-rolls the sandbox when the image reference changes.
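One way to get a content-hash tag is to hash the build context itself, so the tag changes exactly when a file in the fork changes. The sketch below is an assumption about how you might do this, not a CLRK feature: content_tag is a hypothetical helper, it relies on sha256sum being available (it is on most Linux systems), and it hashes every regular file under the directory, including any you exclude via .dockerignore.

```shell
# Hypothetical helper: derive a 12-character tag from the contents of a
# build context directory. Same files and contents give the same tag.
content_tag() {
  # Hash each file, sort the "hash  path" lines so the result does not
  # depend on directory traversal order, then hash the sorted listing.
  (cd "$1" && find . -type f -exec sha256sum {} + | LC_ALL=C sort) \
    | sha256sum | cut -c1-12
}
```

You could then build with `-t <your-registry>/<name>:"$(content_tag .)"` and write the same value into spec.template.spec.image before re-applying.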
Operational gotchas specific to Claude Code
These are real ones we have hit. Each is independent.
- HOME=/tmp is mandatory. Claude CLI writes session artifacts on startup; without a writable $HOME it exits 0 with no output.
- --bare --no-session-persistence is the right invocation. Without --bare you get ANSI escapes in the output; without --no-session-persistence you accumulate session files even with HOME=/tmp.
- --dangerously-skip-permissions doesn't work as root. Sandboxes run as root by default. Run Claude tool-less and have the shell verify the output (jq-bot's pattern) instead of asking the model to execute shell.
- Image-baked ENV does not reach the agent. Only PATH, CA-trust hints, CLRK_METADATA_*, and entries you list under spec.template.spec.env are visible. The Dockerfile's ENV ANTHROPIC_API_KEY=clrk-injected-by-proxy is dead weight at runtime; the manifest's env: block is what survives.
- spec.template.spec.env[].valueFrom.secretKeyRef is silently dropped. Use literal .value for placeholders and CredentialInjectionPolicy for real secrets. See Hide credentials from agents.
- Cold sandboxes need 10-30s. Claude Code's bundle is large and the cold path includes image-pull + libcontainer setup. The TaskAgent's spec.timeout (default 100s) is pinned end-to-end by the ingress controller, so the cap holds - but if you want closer to a second on every call, set spec.warmPoolSize to keep a pre-built sandbox ready.
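Since the HOME failure mode is an exit code of 0 with empty output, a forked agent script may want to turn it into a loud error. The following guards are a sketch under that assumption; both function names are hypothetical and neither is part of CLRK or Claude Code.

```shell
# Hypothetical preflight: confirm $HOME accepts writes by creating and
# removing a probe file, instead of discovering the problem as empty output.
require_writable_home() {
  local probe
  probe="$(mktemp "${HOME:-/nonexistent}/.write-probe.XXXXXX" 2>/dev/null)" || {
    echo "HOME='${HOME:-}' is not writable; set HOME=/tmp in the manifest env" >&2
    return 1
  }
  rm -f "$probe"
}

# Hypothetical postcheck: treat empty model output as a failure even when
# the CLI exited 0.
require_nonempty() {
  if [ -z "$1" ]; then
    echo "model returned no output" >&2
    return 1
  fi
}
```

Calling require_writable_home before the CLI and require_nonempty on its captured stdout means a misconfigured sandbox produces a message on stderr, which is captured for logs, rather than an empty response body.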
The metadata chain on a single call
Every call lands an OTLP span with gen_ai.system=anthropic, the
input/output token counts, the model used, and a deterministic
invocation.id you can join through the chain. See Trace requests
through agents for the full
debugging walkthrough.
Confirm the key stayed out of your fork
In the otel-traces pane in the clrk dev TUI, find the most recent
span where gen_ai.system=anthropic. Two checks:
- The response status is 200. Anthropic 401s when the key is wrong; a 200 with real gen_ai.usage.*_tokens is proof the injection fired.
- Expand the request span's headers. The x-api-key header attribute (http.request.header.x-api-key) reads [redacted] - CLRK replaces known credential headers with [redacted] before exporting telemetry, so it never ships credentials through OTLP.
Where to next
- Authenticate the callers (this guide assumes you trust whoever reaches the ingress) - see Authenticate users before agents.
- Run on a schedule with the same image - see Schedule recurring agents.
- Cap token spend so a runaway prompt loop has a ceiling - see Cap LLM spend per agent.
- Restrict the agent so it can only reach Anthropic and nothing else