# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

Wrenn Sandbox is a microVM-based code execution platform. Users create isolated sandboxes (Firecracker microVMs), run code inside them, and get output back via SDKs. Think E2B but with persistent sandboxes, pool-based pricing, and a single-binary deployment story.

## Build & Development Commands

All commands go through the Makefile. Never use raw `go build` or `go run`.

```bash
make build              # Build all binaries → builds/
make build-cp           # Control plane only
make build-agent        # Host agent only
make build-envd         # envd static binary (verified statically linked)

make dev                # Full local dev: infra + migrate + control plane
make dev-infra          # Start PostgreSQL + Prometheus + Grafana (Docker)
make dev-down           # Stop dev infra
make dev-cp             # Control plane with hot reload (if air installed)
make dev-agent          # Host agent (sudo required)
make dev-envd           # envd in TCP debug mode

make check              # fmt + vet + lint + test (CI order)
make test               # Unit tests: go test -race -v ./internal/...
make test-integration   # Integration tests (require host agent + Firecracker)
make fmt                # gofmt both modules
make vet                # go vet both modules
make lint               # golangci-lint

make migrate-up         # Apply pending migrations
make migrate-down       # Rollback last migration
make migrate-create name=xxx  # Scaffold new goose migration (never create manually)
make migrate-reset      # Drop + re-apply all

make generate           # Proto (buf) + sqlc codegen
make proto              # buf generate for all proto dirs
make tidy               # go mod tidy both modules
```

Run a single test: `go test -race -v -run TestName ./internal/path/...`

## Architecture

```
User SDK → HTTPS/WS → Control Plane → Connect RPC → Host Agent → HTTP/Connect RPC over TAP → envd (inside VM)
```

**Three binaries, two Go modules:**

| Binary | Module | Entry point | Runs as |
|--------|--------|-------------|---------|
| wrenn-cp | `git.omukk.dev/wrenn/sandbox` | `cmd/control-plane/main.go` | Unprivileged |
| wrenn-agent | `git.omukk.dev/wrenn/sandbox` | `cmd/host-agent/main.go` | Root (NET_ADMIN + /dev/kvm) |
| envd | `git.omukk.dev/wrenn/sandbox/envd` (standalone `envd/go.mod`) | `envd/main.go` | PID 1 inside guest VM |

envd is a **completely independent Go module**. It is never imported by the main module. The only connection is the protobuf contract. It compiles to a static binary baked into rootfs images.

**Key architectural invariant:** The host agent is **stateful** (in-memory `boxes` map is the source of truth for running VMs). The control plane is **stateless** (all persistent state in PostgreSQL). The reconciler (`internal/api/reconciler.go`) bridges the gap: it periodically compares DB records against the host agent's live state and marks orphaned sandboxes as "stopped".
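The reconciliation described above boils down to a set difference between what the database believes is running and what the host agent reports as live. The following is a minimal sketch of that comparison only, not the code in `internal/api/reconciler.go`: `findOrphans` and its plain string slices are hypothetical stand-ins for the sqlc query result and the `ListSandboxes` RPC response.

```go
package main

import "fmt"

// findOrphans returns the sandbox IDs that the database lists as running but
// that the host agent does not report as live. Both parameters are
// hypothetical stand-ins for the real DB rows and RPC response.
func findOrphans(dbRunning, agentLive []string) []string {
	// Index the agent's live sandboxes for O(1) membership checks.
	live := make(map[string]struct{}, len(agentLive))
	for _, id := range agentLive {
		live[id] = struct{}{}
	}

	var orphans []string
	for _, id := range dbRunning {
		if _, ok := live[id]; !ok {
			orphans = append(orphans, id)
		}
	}
	return orphans
}

func main() {
	// Example IDs follow the `sb-` + 8 hex convention; the values are made up.
	dbRunning := []string{"sb-0a1b2c3d", "sb-11223344", "sb-deadbeef"}
	agentLive := []string{"sb-0a1b2c3d", "sb-deadbeef"}

	for _, id := range findOrphans(dbRunning, agentLive) {
		// A real reconciler would issue a DB update here instead of printing.
		fmt.Printf("mark %s as stopped\n", id)
	}
}
```

Because the agent's in-memory map is treated as the source of truth, the comparison only ever moves DB rows toward the agent's view, never the other way around.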
### Control Plane

**Packages:** `internal/api/`, `internal/admin/`, `internal/auth/`, `internal/scheduler/`, `internal/lifecycle/`, `internal/config/`, `internal/db/`

Startup (`cmd/control-plane/main.go`) wires: config (env vars) → pgxpool → `db.Queries` (sqlc-generated) → Connect RPC client to host agent → `api.Server`. Everything flows through constructor injection.

- **API Server** (`internal/api/server.go`): chi router with middleware. Creates handler structs (`sandboxHandler`, `execHandler`, `filesHandler`, etc.) injected with `db.Queries` and the host agent Connect RPC client. Routes under `/v1/sandboxes/*`.
- **Reconciler** (`internal/api/reconciler.go`): background goroutine (every 30s) that compares DB records against `agent.ListSandboxes()` RPC. Marks orphaned DB entries as "stopped".
- **Admin UI** at `/admin/` (htmx + basecoat + alpine.js + Go html/template, no SPA, no build step)
- **Database**: PostgreSQL via pgx/v5. Queries generated by sqlc from `db/queries/sandboxes.sql`. Migrations in `db/migrations/` (goose, plain SQL).
- **Config** (`internal/config/config.go`): purely environment variables (`DATABASE_URL`, `CP_LISTEN_ADDR`, `CP_HOST_AGENT_ADDR`), no YAML/file config.

### Host Agent

**Packages:** `internal/hostagent/`, `internal/sandbox/`, `internal/vm/`, `internal/network/`, `internal/devicemapper/`, `internal/envdclient/`, `internal/snapshot/`

Startup (`cmd/host-agent/main.go`) wires: root check → enable IP forwarding → clean up stale dm devices → `sandbox.Manager` (containing `vm.Manager` + `network.SlotAllocator` + `devicemapper.LoopRegistry`) → `hostagent.Server` (Connect RPC handler) → HTTP server.

- **RPC Server** (`internal/hostagent/server.go`): implements `hostagentv1connect.HostAgentServiceHandler`. Thin wrapper: every method delegates to `sandbox.Manager`. Maps Connect error codes on return.
- **Sandbox Manager** (`internal/sandbox/manager.go`): the core orchestration layer. Maintains in-memory state in `boxes map[string]*sandboxState` (protected by `sync.RWMutex`). Each `sandboxState` holds a `models.Sandbox`, a `*network.Slot`, and an `*envdclient.Client`. Runs a TTL reaper (every 10s) that auto-destroys timed-out sandboxes.
- **VM Manager** (`internal/vm/manager.go`, `fc.go`, `config.go`): manages Firecracker processes. Uses raw HTTP API over Unix socket (`/tmp/fc-{sandboxID}.sock`), not the firecracker-go-sdk Machine type. Launches Firecracker via `unshare -m` + `ip netns exec`. Configures VM via PUT to `/boot-source`, `/drives/rootfs`, `/network-interfaces/eth0`, `/machine-config`, then starts with PUT `/actions`.
- **Network** (`internal/network/setup.go`, `allocator.go`): per-sandbox network namespace with veth pair + TAP device. See Networking section below.
- **Device Mapper** (`internal/devicemapper/devicemapper.go`): CoW rootfs via device-mapper snapshots. Shared read-only loop devices per base template (refcounted `LoopRegistry`), per-sandbox sparse CoW files, dm-snapshot create/restore/remove/flatten operations.
- **envd Client** (`internal/envdclient/client.go`, `health.go`): dual interface to the guest agent. Connect RPC for streaming process exec (`process.Start()` bidirectional stream). Plain HTTP for file operations (POST/GET `/files?path=...&username=root`). Health check polls `GET /health` every 100ms until ready (30s timeout).

### envd (Guest Agent)

**Module:** `envd/` with its own `go.mod` (`git.omukk.dev/wrenn/sandbox/envd`)

Runs as PID 1 inside the microVM via `wrenn-init.sh` (mounts procfs/sysfs/dev, sets hostname, writes resolv.conf, then execs envd). Extracted from E2B (Apache 2.0), with shared packages internalized into `envd/internal/shared/`. Listens on TCP `0.0.0.0:49983`.
- **ProcessService**: start processes, stream stdout/stderr, signal handling, PTY support
- **FilesystemService**: stat/list/mkdir/move/remove/watch files
- **Health**: GET `/health`

### Networking (per sandbox)

Each sandbox gets its own Linux network namespace (`ns-{idx}`). Slot index (1-based, up to 65534) determines all addressing:

```
Host Namespace                     Namespace "ns-{idx}"                 Guest VM
──────────────────────────────────────────────────────────────────────────────────────
veth-{idx}  ←──── veth pair ────→  eth0
10.12.0.{idx*2}/31                 10.12.0.{idx*2+1}/31
                                   │
                                   tap0 (169.254.0.22/30)  ←── TAP ──→  eth0 (169.254.0.21)
                                                                        ↑ kernel ip= boot arg
```

- **Host-reachable IP**: `10.11.0.{idx}/32`, routed through veth to namespace, DNAT'd to guest
- **Outbound NAT**: guest (169.254.0.21) → SNAT to vpeerIP inside namespace → MASQUERADE on host to default interface
- **Inbound NAT**: host traffic to 10.11.0.{idx} → DNAT to 169.254.0.21 inside namespace
- IP forwarding enabled inside each namespace
- All details in `internal/network/setup.go`

### Sandbox State Machine

```
PENDING → STARTING → RUNNING → PAUSED → HIBERNATED
                        │         │
                        ↓         ↓
                     STOPPED   STOPPED → (destroyed)

Any state → ERROR (on crash/failure)
PAUSED → RUNNING (warm snapshot resume)
HIBERNATED → RUNNING (cold snapshot resume, slower)
```

### Key Request Flows

**Sandbox creation** (`POST /v1/sandboxes`):

1. API handler generates sandbox ID, inserts into DB as "pending"
2. RPC `CreateSandbox` → host agent → `sandbox.Manager.Create()`
3. Manager: resolve base rootfs → acquire shared loop device → create dm-snapshot (sparse CoW file) → allocate network slot → `CreateNetwork()` (netns + veth + tap + NAT) → `vm.Create()` (start Firecracker with `/dev/mapper/wrenn-{id}`, configure via HTTP API, boot) → `envdclient.WaitUntilReady()` (poll /health) → store in-memory state
4. API handler updates DB to "running" with host_ip

**Command execution** (`POST /v1/sandboxes/{id}/exec`):

1. API handler verifies sandbox is "running" in DB
2. RPC `Exec` → host agent → `sandbox.Manager.Exec()` → `envdclient.Exec()`
3. envd client opens bidirectional Connect RPC stream (`process.Start`), collects stdout/stderr/exit_code
4. API handler checks UTF-8 validity (base64-encodes if binary), updates last_active_at, returns result

**Streaming exec** (`WS /v1/sandboxes/{id}/exec/stream`):

1. WebSocket upgrade, read first message for cmd/args
2. RPC `ExecStream` → host agent → `sandbox.Manager.ExecStream()` → `envdclient.ExecStream()`
3. envd client returns a channel of events; host agent forwards events through the RPC stream
4. API handler forwards stream events to WebSocket as JSON messages (`{type: "stdout"|"stderr"|"exit", ...}`)

**File transfer**: Write uses multipart POST to envd `/files`; read uses GET. Streaming variants chunk in 64KB pieces through the RPC stream.

## REST API

Routes defined in `internal/api/server.go`, handlers in `internal/api/handlers_*.go`. OpenAPI spec embedded via `//go:embed` and served at `/openapi.yaml` (Swagger UI at `/docs`). JSON request/response. API key auth via `X-API-Key` header. Error responses: `{"error": {"code": "...", "message": "..."}}`.

## Code Generation

### Proto (Connect RPC)

Proto source of truth is `proto/envd/*.proto` and `proto/hostagent/*.proto`. Run `make proto` to regenerate. Three `buf.gen.yaml` files control output:

| buf.gen.yaml location | Generates to | Used by |
|---|---|---|
| `proto/envd/buf.gen.yaml` | `proto/envd/gen/` | Main module (host agent's envd client) |
| `proto/hostagent/buf.gen.yaml` | `proto/hostagent/gen/` | Main module (control plane ↔ host agent) |
| `envd/spec/buf.gen.yaml` | `envd/internal/services/spec/` | envd module (guest agent server) |

The envd `buf.gen.yaml` reads from `../../proto/envd/` (same source protos) but generates into envd's own module. This means the same `.proto` files produce two independent sets of Go stubs, one for each Go module.
To add a new RPC method: edit the `.proto` file → `make proto` → implement the handler on both sides.

### sqlc

Config: `sqlc.yaml` (project root). Reads queries from `db/queries/*.sql`, reads schema from `db/migrations/`, outputs to `internal/db/`.

To add a new query: add it to the appropriate `.sql` file in `db/queries/` → `make generate` → use the new method on `*db.Queries`.

## Key Technical Decisions

- **Connect RPC** (not gRPC) for all RPC communication between components
- **Buf + protoc-gen-connect-go** for code generation (not protoc-gen-go-grpc)
- **Raw Firecracker HTTP API** via Unix socket (not firecracker-go-sdk Machine type)
- **TAP networking** (not vsock) for host-to-envd communication
- **Device-mapper snapshots** for rootfs CoW: shared read-only loop device per base template, per-sandbox sparse CoW file, Firecracker gets `/dev/mapper/wrenn-{id}`
- **PostgreSQL** via pgx/v5 + sqlc (type-safe query generation). Goose for migrations (plain SQL, up/down)
- **Admin UI**: htmx + Go html/template + chi router. No SPA, no React, no build step
- **Lago** for billing (external service, not in this codebase)

## Coding Conventions

- **Go style**: `gofmt`, `go vet`, `context.Context` everywhere, errors wrapped with `fmt.Errorf("action: %w", err)`, `slog` for logging, no global state
- **Naming**: Sandbox IDs `sb-` + 8 hex, API keys `wrn_` + 32 chars, Host IDs `host-` + 8 hex
- **Dependencies**: Use `go get` to add deps, never hand-edit go.mod. For envd deps: `cd envd && go get ...` (separate module)
- **Generated code**: Always commit generated code (proto stubs, sqlc). Never add generated code to .gitignore
- **Migrations**: Always use `make migrate-create name=xxx`, never create migration files manually
- **Testing**: Table-driven tests for handlers and state machine transitions

### Two-module gotcha

The main module (`go.mod`) and envd (`envd/go.mod`) are fully independent. `make tidy`, `make fmt`, `make vet` already operate on both. But when adding dependencies manually, remember to target the correct module (`cd envd && go get ...` for envd deps). `make proto` also generates stubs for both modules from the same proto sources.

## Rootfs & Guest Init

- **wrenn-init** (`images/wrenn-init.sh`): the PID 1 init script baked into every rootfs. Mounts virtual filesystems, sets hostname, writes `/etc/resolv.conf`, then execs envd.
- **Updating the rootfs** after changing envd or wrenn-init: `bash scripts/update-debug-rootfs.sh [rootfs_path]`. This builds envd via `make build-envd`, mounts the rootfs image, copies in the new binaries, and unmounts. Defaults to `/var/lib/wrenn/images/minimal.ext4`.
- Rootfs images are minimal debootstrap: no systemd, no coreutils beyond busybox. Use `/bin/sh -c` for shell builtins inside the guest.

## Fixed Paths (on host machine)

- Kernel: `/var/lib/wrenn/kernels/vmlinux`
- Base rootfs images: `/var/lib/wrenn/images/{template}.ext4`
- Sandbox clones: `/var/lib/wrenn/sandboxes/`
- Firecracker: `/usr/local/bin/firecracker` (E2B's fork of Firecracker)

## Web UI Styling

**Wrenn brand:** Warm earthy developer tool with crafted organic character.

**Color palette (light/dark):** Background scale: #f8f6f1 → #f1eeea → #e8e5e0 → #dedbd5 (light); #090b0a → #0f1211 → #151918 → #1b201e → #222826 (dark). Text hierarchy: bright #2c2a26 / body #4a4740 / dim #7a766e / faint #a09b93 (light); #e8e5df / #c8c4bc / #8a867f / #5f5c57 (dark). Sage green brand accent: #5e8c58 (light) / #89a785 (dark), with glow variant rgba(94,140,88,0.08). Borders: #e2dfd9 (light) / #262c2a (dark). Semantic status colors: amber #9e7c2e (warning/building), red #b35544 (error/failed), blue #3d7aac (info/stopped), each with a color-dim transparent bg variant for badge backgrounds. Destructive: #b35544 light / #c27b6d dark.

**Typography:** Four fonts. Manrope (variable, weights 300–700) for all UI labels, nav, body. Instrument Serif (400) for page titles, empty-state headings, large metric values.
JetBrains Mono (400/500) for code, env var keys/values, deployment IDs, commit SHAs, log viewer, URL paths. Alice for the sidebar wordmark only. Base body size 14px. Headings: h1 24px serif, h2 20px, h3 18px, h4–h6 11px sans-serif uppercase wide-tracked. Metric card values 34px serif at letter-spacing: -0.08em. Section labels at 0.06–0.07em tracking, weight 550–600.

**Spacing:** 4px base unit (Tailwind scale). Page content p-8 (32px). Cards p-4–p-5. Sidebar nav items 7px 10px. Consistent, moderate density: functional but not cramped.

**Borders & depth:** Flat aesthetic: `--shadow-sm: 0 0 #0000`, no drop shadows. Depth is achieved through background color stepping (bg → bg-3 → bg-4 → bg-5), not shadows. Borders 1px solid in warm muted tones. Corner radii: cards/surfaces 12px, inputs/small buttons 6–8px, avatars 8px, dots 50%.

**Components:** Active sidebar nav items use a 3px left-border in sage green rather than filled backgrounds, with a sage glow bg (rgba(94,140,88,0.08)). Focus rings are double-ring: 0 0 0 2px background, 0 0 0 4px ring. Status system has four states (Live/sage, Building/amber+pulse, Failed/red, Stopped/faint), each with a solid dot + transparent-bg badge pair. Buttons follow ghost → outline → filled hierarchy. Tables wrapped in rounded-xl border. Dialogs via native `<dialog>`. Toasts bottom-anchored.

**Animation:** Crisp 150ms transitions on all interactive elements. Sidebar width 250ms ease. Custom wrenn-pulse keyframe (2.5s ease infinite box-shadow bloom) on live/building status dots. Top-of-page loading bar (h-0.5, sage green) on navigation.

**Dark mode:** Full support. Very dark near-black-green backgrounds with warm off-white text and desaturated sage accent. Flat (no card shadows). System preference detection + localStorage persistence.

**Overall feel:** Warm, earthy, semi-flat. Avoids cold grays entirely; the palette leans slightly warm/brown-tinted throughout. The serif + mono + geometric sans type stack gives a designed but unfussy developer-tool character. Organic and considered, not sterile.