# Claude Code over HTTP

> Stand up Claude Code as an internal HTTP service so callers can prompt it without holding the Anthropic API key.

You want one internal endpoint that your services can POST a prompt
to and get a Claude response back from, without any caller handling
API keys outside the cluster. This guide shows the smallest version of
that, using `_examples/jq-bot` as the worked template, then explains
how to fork it for your own prompt.

## What you'll build

A `TaskAgent` named `jq-bot` that accepts a JSON body, runs the
`@anthropic-ai/claude-code` CLI inside a fresh sandbox per request,
and returns the model's structured output. The Anthropic key lives in
a `Secret`; the egress proxy injects it on the way to
`api.anthropic.com`. A cold call takes 10-30s; with a warm pool, calls
take closer to a second.

## The HTTP contract

CLRK materializes a per-`TaskAgent` Envoy Gateway. Callers send:

```http
POST / HTTP/1.1
Host: <gateway-host>
X-Clrk-TaskAgent: default/<agent-name>
Content-Type: application/json

{ ...your JSON input... }
```

Two things matter:

- **`X-Clrk-TaskAgent` is required.** The ingress ext_proc reads it
  before the HTTPRoute header filter runs and 400s requests that
  don't carry it. Make sure your auth proxy or client always sets
  it. The value is `<namespace>/<agent-name>`.
- **The agent's stdout becomes the response body.** Stderr is
  captured for logs but does not reach the caller. Set
  `Content-Type` from your agent if you want anything other than
  whatever Envoy infers.
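
Because a request without the header is rejected before it reaches the agent, it can help to wrap the call so the header is never forgotten. The following is a minimal sketch, not part of CLRK: `taskagent_header` and `call_agent` are hypothetical helper names, and the gateway URL is whatever address you reach the ingress on. Only the header name and its `<namespace>/<agent-name>` value format come from the contract above.

```bash
#!/usr/bin/env bash
# Hypothetical client helpers (not shipped with CLRK).

# Print the routing header for a TaskAgent.
taskagent_header() {
  local namespace=$1 agent=$2
  if [ -z "$namespace" ] || [ -z "$agent" ]; then
    echo "usage: taskagent_header <namespace> <agent-name>" >&2
    return 2
  fi
  printf 'X-Clrk-TaskAgent: %s/%s\n' "$namespace" "$agent"
}

# POST the JSON body read from stdin to the gateway, always sending the header.
call_agent() {
  local url=$1 namespace=$2 agent=$3 header
  header=$(taskagent_header "$namespace" "$agent") || return 2
  curl -sS "$url" \
    -H 'Content-Type: application/json' \
    -H "$header" \
    --data-binary @-
}
```

With the gateway address in a variable of your choosing, a call would look like `echo '{"want":"..."}' | call_agent "$GATEWAY_URL" default jq-bot`.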

## Bring up jq-bot

```bash
ANTHROPIC_API_KEY=sk-ant-... clrk dev \
  --apply _examples/jq-bot/manifests \
  --secret anthropic-credentials=ANTHROPIC_API_KEY:api-key
```

Wait for the per-`TaskAgent` Gateway to come up:

```bash
export KUBECONFIG=~/.clrk/kubeconfig.host
kubectl get gateway jq-bot   # PROGRAMMED=True
```

Then port-forward and call it:

```bash
kubectl port-forward -n clrk svc/clrk-jq-bot 18080:80 &

curl -sS http://localhost:18080/ \
  -H 'content-type: application/json' \
  -H 'X-Clrk-TaskAgent: default/jq-bot' \
  --data '{
    "input": [
      {"name":"alice","age":30,"role":"eng"},
      {"name":"bob","age":42,"role":"pm"},
      {"name":"carol","age":25,"role":"eng"}
    ],
    "want": "names of engineers, ascending by age"
  }'
# {"filter":"[.[] | select(.role == \"eng\")] | sort_by(.age) | map(.name)","output":["carol","alice"]}
```

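Claude generates the `jq` filter; the shell runs `jq` against your
input to verify it; you get back both the filter and its output. The
output cannot be hallucinated, because it comes from running `jq` in
the shell, not from the model. The filter can still be wrong for your
intent, but you see exactly what it produced.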
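
The verify step can be sketched as follows. This illustrates the pattern and is not the contents of `_examples/jq-bot/agent.sh`; `verify_filter` is a hypothetical name, and the filter string stands in for whatever the model returned.

```bash
#!/usr/bin/env bash
# Sketch of "the shell verifies the model's answer". Requires jq.

# Run a candidate jq filter against the input and emit {filter, output}.
# Returns non-zero if jq rejects the filter or the filter fails at runtime;
# jq's own error message goes to stderr.
verify_filter() {
  local input_json=$1 filter=$2 results
  results=$(printf '%s' "$input_json" | jq -c "$filter") || return 1
  # Slurp the results: a single result is reported as-is, several as an array.
  printf '%s\n' "$results" | jq -cs --arg filter "$filter" \
    '{filter: $filter, output: (if length == 1 then .[0] else . end)}'
}
```

A filter that does not compile makes `verify_filter` return non-zero instead of emitting a plausible-looking answer, which is the point of letting the shell have the last word.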

## What's inside the image

`_examples/jq-bot/Dockerfile`:

```dockerfile
FROM node:22-alpine

RUN apk add --no-cache bash jq ca-certificates curl \
 && npm install -g @anthropic-ai/claude-code \
 && mkdir -p /root/.claude/projects \
 && rm -rf /root/.npm

# Claude Code refuses to start without an API key in the env. The
# value here is a placeholder - the egress MITM rewrites the
# x-api-key header on every Anthropic request.
ENV ANTHROPIC_API_KEY=clrk-injected-by-proxy

COPY agent.sh /usr/local/bin/agent.sh
RUN chmod +x /usr/local/bin/agent.sh

ENTRYPOINT ["/usr/local/bin/agent.sh"]
```

`agent.sh` is short. It reads the CloudEvents envelope from stdin,
extracts `.data.input` and `.data.want`, asks Claude for one `jq`
filter, runs it against the input, and emits a single JSON object. The
manifest pins two env vars:

```yaml
env:
  - name: ANTHROPIC_API_KEY
    value: clrk-injected-by-proxy
  - name: HOME
    value: /tmp
```

`HOME=/tmp` is mandatory - Claude Code writes session artifacts under
`$HOME` at startup, and the sandbox's root filesystem isn't writable
where the CLI expects. Without it, `claude --print` exits 0 with no
output and no stderr (a particularly silent failure mode).
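
Since the failure is silent, an agent script may want to check `$HOME` itself before invoking the CLI. This is a minimal sketch under that assumption; `require_writable_home` is a hypothetical helper, not something the example image is known to ship.

```bash
#!/usr/bin/env bash
# Hypothetical preflight for an agent script: fail loudly instead of
# letting the CLI exit 0 with no output.

require_writable_home() {
  local home=${HOME:-} probe
  if [ -z "$home" ] || [ ! -d "$home" ]; then
    echo "HOME is unset or not a directory: '${home}'" >&2
    return 1
  fi
  # Probe with a real write; permission bits alone can mislead when
  # running as root on a read-only filesystem.
  probe="$home/.home-write-probe.$$"
  if ! ( : > "$probe" ) 2>/dev/null; then
    echo "HOME is not writable: '${home}'" >&2
    return 1
  fi
  rm -f "$probe"
}
```

Calling `require_writable_home || exit 1` near the top of the script turns the silent case into a non-zero exit with a message on stderr, which lands in the logs.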

## Fork it for your own prompt

Copy `_examples/jq-bot/` somewhere outside the repo, then:

1. **Replace `agent.sh`** with your prompt logic. Keep the
   CloudEvents-envelope-from-stdin contract - `jq '.data'` lifts the
   caller's payload out. See [Package a custom
   agent](./package-a-custom-agent) for the full envelope shape.
2. **Build multi-arch**: `docker buildx build --platform=linux/amd64,linux/arm64
   -t <your-registry>/<name>:<tag> --push .`. Worker pools pull the
   image for whatever architecture they run on; ship both if you're
   not certain.
3. **Update the manifest** at `manifests/taskagent.yaml`:
   `spec.template.spec.image` to your reference, and `metadata.name`
   to your agent name (becomes the Gateway/Service name too).
4. **Re-apply**: `clrk apply -f manifests/`. Tag with a content
   hash, not `:latest` - the apply only re-rolls the sandbox when
   the image reference changes.
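
For the content-hash tag in step 4, one option is to derive the tag from the files in the build context. This is a sketch of one way to do it, not a CLRK feature: `content_tag` is a hypothetical helper, and it assumes `sha256sum` plus a `find`, `sort`, and `xargs` that support NUL-delimited names.

```bash
#!/usr/bin/env bash
# Hypothetical helper: derive an image tag from the contents of a build
# context, so the tag changes whenever a file's path or bytes change.

content_tag() {
  local context_dir=${1:-.}
  (
    cd "$context_dir" || exit 1
    # Hash every file in a stable order, then hash the list of hashes
    # and keep the first 12 hex characters.
    find . -type f -print0 \
      | LC_ALL=C sort -z \
      | xargs -0 sha256sum \
      | sha256sum \
      | cut -c1-12
  )
}
```

The result can be substituted for `<tag>` in the build command from step 2, for example `-t <your-registry>/<name>:"$(content_tag .)"`, and the same value goes into `spec.template.spec.image`.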

## Operational gotchas specific to Claude Code

These are real problems we have hit. Each one is independent of the others.

- **`HOME=/tmp` is mandatory.** Claude CLI writes session artifacts on
  startup; without a writable `$HOME` it exits 0 with no output.
- **`--bare --no-session-persistence`** is the right invocation. Without
  `--bare` you get ANSI escapes in the output; without
  `--no-session-persistence` you accumulate session files even with
  `HOME=/tmp`.
- **`--dangerously-skip-permissions` doesn't work as root.** Sandboxes
  run as root by default. Run Claude tool-less and have the shell
  verify the output (jq-bot's pattern) instead of asking the model to
  execute shell.
- **Image-baked `ENV` does not reach the agent.** Only `PATH`,
  CA-trust hints, `CLRK_METADATA_*`, and entries you list under
  `spec.template.spec.env` are visible. The Dockerfile's
  `ENV ANTHROPIC_API_KEY=clrk-injected-by-proxy` is dead weight at
  runtime; the manifest's `env:` block is what survives.
- **`spec.template.spec.env[].valueFrom.secretKeyRef` is silently dropped.** Use
  literal `.value` for placeholders and `CredentialInjectionPolicy`
  for real secrets. See [Hide credentials from
  agents](./hide-credentials-from-agents).
- **Cold sandboxes need 10-30s.** Claude Code's bundle is large and
  the cold path includes image-pull + libcontainer setup. The
  TaskAgent's `spec.timeout` (default 100s) is pinned end-to-end by
  the ingress controller, so the cap holds even on a cold start. If
  you want closer to a second on every call, set `spec.warmPoolSize`
  to keep a pre-built sandbox ready.
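
The invocation flags and the empty-output failure mode from the list above can be folded into one wrapper. This is a minimal sketch, assuming the prompt is passed as a positional argument to `claude --print`; `ask_claude` is a hypothetical function name, and the flags are the ones named in this guide.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper around the Claude Code CLI for use in an agent script.

ask_claude() {
  local prompt=$1 reply
  # Flags as recommended in the gotchas above.
  reply=$(claude --print --bare --no-session-persistence "$prompt") || return 1
  # Exit 0 with no output is the symptom of an unwritable $HOME.
  if [ -z "$reply" ]; then
    echo "claude exited 0 with empty output; check that HOME is writable" >&2
    return 1
  fi
  printf '%s\n' "$reply"
}
```

Treating empty output as an error means the caller gets a failed request and a log line rather than an empty 200.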

## The metadata chain on a single call

```mermaid
sequenceDiagram
  participant Client
  participant Ingress
  participant Sandbox
  participant EG as EgressGateway
  participant Up as Anthropic
  Client->>Ingress: POST
  Ingress->>Sandbox: dispatch
  Sandbox->>EG: HTTPS + placeholder key
  EG->>EG: swap key from Secret
  EG->>Up: HTTPS + real key
  Up-->>EG: response
  EG-->>Sandbox: response
  Sandbox-->>Ingress: stdout
  Ingress-->>Client: response
```

Every call lands an OTLP span with `gen_ai.system=anthropic`, the
input/output token counts, the model used, and a deterministic
`invocation.id` you can join through the chain. See [Trace requests
through agents](./trace-requests-through-agents) for the full
debugging walkthrough.

## Confirm the key stayed out of your fork

In the `otel-traces` pane in the `clrk dev` TUI, find the most recent
span where `gen_ai.system=anthropic`. Two checks:

- The response status is 200. Anthropic 401s when the key is wrong,
  and the placeholder is not a valid key, so a 200 with real
  `gen_ai.usage.*_tokens` shows the injection fired.
- Expand the request span's headers. The `x-api-key` header attribute
  (`http.request.header.x-api-key`) reads `[redacted]` - CLRK replaces
  known credential headers with `[redacted]` before exporting
  telemetry, so it never ships credentials through OTLP.
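
As an extra check outside the TUI, you can scan anything the agent emitted (the HTTP response, captured logs) for key-shaped strings. This is a rough sketch rather than a CLRK feature: `assert_no_key_leak` is a hypothetical helper, and the `sk-ant-` prefix is taken from the example key earlier in this guide.

```bash
#!/usr/bin/env bash
# Hypothetical leak check: succeeds only if stdin contains no string
# shaped like an Anthropic API key.

assert_no_key_leak() {
  # Match the sk-ant- prefix followed by at least one key character.
  if grep -q -e 'sk-ant-[A-Za-z0-9_-]'; then
    echo "found a key-shaped string (sk-ant-...)" >&2
    return 1
  fi
}
```

Piping a saved response or log file through `assert_no_key_leak` gives a quick pass/fail; the placeholder `clrk-injected-by-proxy` does not match, so seeing it in output is not flagged.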

## Where to next

- Authenticate the callers (this guide assumes you trust whoever
  reaches the ingress) - see [Authenticate users before
  agents](./authenticate-users-before-agents).
- Run on a schedule with the same image - see [Schedule recurring
  agents](./schedule-recurring-agents).
- Cap token spend so a runaway prompt loop has a ceiling - see [Cap
  LLM spend per agent](./cap-llm-spend-per-agent).
- Restrict the agent so it can only reach Anthropic and nothing
  else - see [Lock down agent egress](./lock-down-agent-egress).
