CLAUDE.md: replace bloated 850-line version with focused 230-line guide. Fix inaccuracies (module path, build dir, Connect RPC vs gRPC, buf vs protoc). Add detailed architecture with request flows, code generation workflow, rootfs update process, and two-module gotchas. README.md: add core deployment instructions (prerequisites, build, host setup, configuration, running, rootfs workflow).
CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Project Overview
Wrenn Sandbox is a microVM-based code execution platform. Users create isolated sandboxes (Firecracker microVMs), run code inside them, and get output back via SDKs. Think E2B but with persistent sandboxes, pool-based pricing, and a single-binary deployment story.
Build & Development Commands
All commands go through the Makefile. Never use raw go build or go run.
make build # Build all binaries → builds/
make build-cp # Control plane only
make build-agent # Host agent only
make build-envd # envd static binary (verified statically linked)
make dev # Full local dev: infra + migrate + control plane
make dev-infra # Start PostgreSQL + Prometheus + Grafana (Docker)
make dev-down # Stop dev infra
make dev-cp # Control plane with hot reload (if air installed)
make dev-agent # Host agent (sudo required)
make dev-envd # envd in TCP debug mode
make check # fmt + vet + lint + test (CI order)
make test # Unit tests: go test -race -v ./internal/...
make test-integration # Integration tests (require host agent + Firecracker)
make fmt # gofmt both modules
make vet # go vet both modules
make lint # golangci-lint
make migrate-up # Apply pending migrations
make migrate-down # Rollback last migration
make migrate-create name=xxx # Scaffold new goose migration (never create manually)
make migrate-reset # Drop + re-apply all
make generate # Proto (buf) + sqlc codegen
make proto # buf generate for all proto dirs
make tidy # go mod tidy both modules
Run a single test: go test -race -v -run TestName ./internal/path/...
Architecture
User SDK → HTTPS/WS → Control Plane → Connect RPC → Host Agent → HTTP/Connect RPC over TAP → envd (inside VM)
Three binaries, two Go modules:
| Binary | Module | Entry point | Runs as |
|---|---|---|---|
| wrenn-cp | git.omukk.dev/wrenn/sandbox | cmd/control-plane/main.go | Unprivileged |
| wrenn-agent | git.omukk.dev/wrenn/sandbox | cmd/host-agent/main.go | Root (NET_ADMIN + /dev/kvm) |
| envd | git.omukk.dev/wrenn/sandbox/envd (standalone envd/go.mod) | envd/main.go | PID 1 inside guest VM |
envd is a completely independent Go module. It is never imported by the main module. The only connection is the protobuf contract. It compiles to a static binary baked into rootfs images.
Key architectural invariant: The host agent is stateful (in-memory boxes map is the source of truth for running VMs). The control plane is stateless (all persistent state in PostgreSQL). The reconciler (internal/api/reconciler.go) bridges the gap — it periodically compares DB records against the host agent's live state and marks orphaned sandboxes as "stopped".
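The comparison at the heart of the reconciler can be sketched as a set difference. This is a minimal illustration, not the code in internal/api/reconciler.go: the function name findOrphans and the use of plain ID slices are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// findOrphans returns the IDs that the database still lists as live but that
// the host agent no longer reports. Hypothetical helper: the real reconciler
// works with DB rows and RPC responses rather than plain strings.
func findOrphans(dbLive, agentLive []string) []string {
	live := make(map[string]struct{}, len(agentLive))
	for _, id := range agentLive {
		live[id] = struct{}{}
	}

	var orphans []string
	for _, id := range dbLive {
		if _, ok := live[id]; !ok {
			orphans = append(orphans, id)
		}
	}
	sort.Strings(orphans) // stable order for logging and tests
	return orphans
}

func main() {
	db := []string{"sb-aaaaaaaa", "sb-bbbbbbbb", "sb-cccccccc"}
	agent := []string{"sb-bbbbbbbb"}
	for _, id := range findOrphans(db, agent) {
		fmt.Println("mark stopped:", id)
	}
}
```

Each returned ID would then be updated to "stopped" in PostgreSQL, which keeps the stateless control plane consistent with the agent's in-memory map.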
Control Plane
Packages: internal/api/, internal/admin/, internal/auth/, internal/scheduler/, internal/lifecycle/, internal/config/, internal/db/
Startup (cmd/control-plane/main.go) wires: config (env vars) → pgxpool → db.Queries (sqlc-generated) → Connect RPC client to host agent → api.Server. Everything flows through constructor injection.
- API Server (internal/api/server.go): chi router with middleware. Creates handler structs (sandboxHandler, execHandler, filesHandler, etc.) injected with db.Queries and the host agent Connect RPC client. Routes under /v1/sandboxes/*.
- Reconciler (internal/api/reconciler.go): background goroutine (every 30s) that compares DB records against the agent.ListSandboxes() RPC. Marks orphaned DB entries as "stopped".
- Admin UI at /admin/ (htmx + Go html/template, no SPA, no build step).
- Database: PostgreSQL via pgx/v5. Queries generated by sqlc from db/queries/sandboxes.sql. Migrations in db/migrations/ (goose, plain SQL).
- Config (internal/config/config.go): purely environment variables (DATABASE_URL, CP_LISTEN_ADDR, CP_HOST_AGENT_ADDR), no YAML/file config.
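Environment-only configuration of the kind the Config bullet describes might look roughly like the sketch below. The three variable names come from this guide. The struct shape, the default values, and treating DATABASE_URL as required are assumptions for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Config mirrors the three environment variables named in this guide.
// The field names are hypothetical.
type Config struct {
	DatabaseURL   string
	ListenAddr    string
	HostAgentAddr string
}

// loadFrom builds a Config from any lookup function with the signature of
// os.LookupEnv, which keeps the loader testable without touching the process
// environment.
func loadFrom(lookup func(string) (string, bool)) (Config, error) {
	get := func(key, fallback string) string {
		if v, ok := lookup(key); ok && v != "" {
			return v
		}
		return fallback
	}

	cfg := Config{
		DatabaseURL:   get("DATABASE_URL", ""),
		ListenAddr:    get("CP_LISTEN_ADDR", ":8080"),                      // default is an assumption
		HostAgentAddr: get("CP_HOST_AGENT_ADDR", "http://localhost:50051"), // default is an assumption
	}
	if cfg.DatabaseURL == "" {
		// Assumption: the control plane cannot start without a database.
		return Config{}, errors.New("config: DATABASE_URL is required")
	}
	return cfg, nil
}

func main() {
	cfg, err := loadFrom(os.LookupEnv)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```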
Host Agent
Packages: internal/hostagent/, internal/sandbox/, internal/vm/, internal/network/, internal/filesystem/, internal/envdclient/, internal/snapshot/
Startup (cmd/host-agent/main.go) wires: root check → enable IP forwarding → sandbox.Manager (containing vm.Manager + network.SlotAllocator) → hostagent.Server (Connect RPC handler) → HTTP server.
- RPC Server (internal/hostagent/server.go): implements hostagentv1connect.HostAgentServiceHandler. Thin wrapper: every method delegates to sandbox.Manager. Maps Connect error codes on return.
- Sandbox Manager (internal/sandbox/manager.go): the core orchestration layer. Maintains in-memory state in boxes map[string]*sandboxState (protected by sync.RWMutex). Each sandboxState holds a models.Sandbox, a *network.Slot, and an *envdclient.Client. Runs a TTL reaper (every 10s) that auto-destroys timed-out sandboxes.
- VM Manager (internal/vm/manager.go, fc.go, config.go): manages Firecracker processes. Uses raw HTTP API over Unix socket (/tmp/fc-{sandboxID}.sock), not the firecracker-go-sdk Machine type. Launches Firecracker via unshare -m + ip netns exec. Configures VM via PUT to /boot-source, /drives/rootfs, /network-interfaces/eth0, /machine-config, then starts with PUT /actions.
- Network (internal/network/setup.go, allocator.go): per-sandbox network namespace with veth pair + TAP device. See Networking section below.
- Filesystem (internal/filesystem/clone.go): CoW rootfs clones via cp --reflink=auto.
- envd Client (internal/envdclient/client.go, health.go): dual interface to the guest agent. Connect RPC for streaming process exec (process.Start() bidirectional stream). Plain HTTP for file operations (POST/GET /files?path=...&username=root). Health check polls GET /health every 100ms until ready (30s timeout).
envd (Guest Agent)
Module: envd/ with its own go.mod (git.omukk.dev/wrenn/sandbox/envd)
Runs as PID 1 inside the microVM via wrenn-init.sh (mounts procfs/sysfs/dev, sets hostname, writes resolv.conf, then execs envd). Extracted from E2B (Apache 2.0), with shared packages internalized into envd/internal/shared/. Listens on TCP 0.0.0.0:49983.
- ProcessService: start processes, stream stdout/stderr, signal handling, PTY support
- FilesystemService: stat/list/mkdir/move/remove/watch files
- Health: GET /health
Networking (per sandbox)
Each sandbox gets its own Linux network namespace (ns-{idx}). Slot index (1-based, up to 65534) determines all addressing:
Host Namespace                    Namespace "ns-{idx}"           Guest VM
──────────────────────────────────────────────────────────────────────────────────────
veth-{idx}  ←──── veth pair ────→ eth0
10.12.0.{idx*2}/31                10.12.0.{idx*2+1}/31
                                    │
                                  tap0 (169.254.0.22/30) ←── TAP ──→ eth0 (169.254.0.21)
                                                                       ↑ kernel ip= boot arg
- Host-reachable IP: 10.11.0.{idx}/32, routed through veth to namespace, DNAT'd to guest
- Outbound NAT: guest (169.254.0.21) → SNAT to vpeerIP inside namespace → MASQUERADE on host to default interface
- Inbound NAT: host traffic to 10.11.0.{idx} → DNAT to 169.254.0.21 inside namespace
- IP forwarding enabled inside each namespace
- All details in internal/network/setup.go
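To make the slot-to-address mapping concrete, the sketch below derives the three per-slot addresses from the slot index. It assumes the {idx} notation means "base address plus offset", with carry into the third octet for larger indices. That reading is an assumption, as are the type and function names. internal/network/setup.go remains the reference.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"net/netip"
)

// offsetAddr returns base + n, treating the IPv4 address as a 32-bit integer.
func offsetAddr(base netip.Addr, n uint32) netip.Addr {
	b := base.As4()
	binary.BigEndian.PutUint32(b[:], binary.BigEndian.Uint32(b[:])+n)
	return netip.AddrFrom4(b)
}

// slotAddrs holds the per-sandbox addresses (hypothetical type).
type slotAddrs struct {
	HostIP  netip.Addr // 10.11.0.{idx}: host-reachable address
	VethIP  netip.Addr // 10.12.0.{idx*2}: host side of the veth pair
	VpeerIP netip.Addr // 10.12.0.{idx*2+1}: namespace side of the veth pair
}

// addrsForSlot maps a 1-based slot index to its addresses.
func addrsForSlot(idx uint32) (slotAddrs, error) {
	if idx < 1 || idx > 65534 {
		return slotAddrs{}, fmt.Errorf("slot index %d out of range [1, 65534]", idx)
	}
	hostBase := netip.MustParseAddr("10.11.0.0")
	vethBase := netip.MustParseAddr("10.12.0.0")
	return slotAddrs{
		HostIP:  offsetAddr(hostBase, idx),
		VethIP:  offsetAddr(vethBase, idx*2),
		VpeerIP: offsetAddr(vethBase, idx*2+1),
	}, nil
}

func main() {
	for _, idx := range []uint32{1, 300} {
		a, err := addrsForSlot(idx)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("slot %d: host=%s veth=%s/31 vpeer=%s/31\n", idx, a.HostIP, a.VethIP, a.VpeerIP)
	}
}
```

The guest-side addresses (169.254.0.21 and 169.254.0.22/30) are the same in every namespace, so they need no per-slot computation.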
Sandbox State Machine
PENDING → STARTING → RUNNING → PAUSED → HIBERNATED
│ │
↓ ↓
STOPPED STOPPED → (destroyed)
Any state → ERROR (on crash/failure)
PAUSED → RUNNING (warm snapshot resume)
HIBERNATED → RUNNING (cold snapshot resume, slower)
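A transition table is one way to encode the diagram for table-driven tests. This is a sketch only: the source states of the two STOPPED arrows are assumed here to be RUNNING and PAUSED, and the type and function names are hypothetical.

```go
package main

import "fmt"

// State is a sandbox lifecycle state.
type State string

const (
	Pending    State = "PENDING"
	Starting   State = "STARTING"
	Running    State = "RUNNING"
	Paused     State = "PAUSED"
	Hibernated State = "HIBERNATED"
	Stopped    State = "STOPPED"
	Error      State = "ERROR"
)

// transitions lists the allowed next states, excluding ERROR, which is
// handled separately in canTransition.
var transitions = map[State][]State{
	Pending:    {Starting},
	Starting:   {Running},
	Running:    {Paused, Stopped},             // Stopped from Running is an assumption
	Paused:     {Running, Hibernated, Stopped}, // Stopped from Paused is an assumption
	Hibernated: {Running},
}

// canTransition reports whether moving from one state to another is allowed.
func canTransition(from, to State) bool {
	if to == Error {
		return from != Error // any other state may fail into ERROR
	}
	for _, next := range transitions[from] {
		if next == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(canTransition(Pending, Starting))   // true
	fmt.Println(canTransition(Pending, Running))    // false
	fmt.Println(canTransition(Hibernated, Running)) // true
}
```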
Key Request Flows
Sandbox creation (POST /v1/sandboxes):
- API handler generates sandbox ID, inserts into DB as "pending"
- RPC CreateSandbox → host agent → sandbox.Manager.Create()
- Manager: resolve base rootfs → cp --reflink clone → allocate network slot → CreateNetwork() (netns + veth + tap + NAT) → vm.Create() (start Firecracker, configure via HTTP API, boot) → envdclient.WaitUntilReady() (poll /health) → store in-memory state
- API handler updates DB to "running" with host_ip
Command execution (POST /v1/sandboxes/{id}/exec):
- API handler verifies sandbox is "running" in DB
- RPC Exec → host agent → sandbox.Manager.Exec() → envdclient.Exec()
- envd client opens bidirectional Connect RPC stream (process.Start), collects stdout/stderr/exit_code
- API handler checks UTF-8 validity (base64-encodes if binary), updates last_active_at, returns result
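The UTF-8 check in the last step can be sketched with the standard library. The struct and its Encoding field are assumptions about how the result might be labeled. Only the rule itself (valid UTF-8 passes through, anything else is base64-encoded) comes from the flow above.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"unicode/utf8"
)

// encodedOutput is a hypothetical response fragment for stdout or stderr.
type encodedOutput struct {
	Data     string
	Encoding string // "utf-8" or "base64" (labels are assumptions)
}

// encodeOutput passes valid UTF-8 through unchanged and base64-encodes
// anything else so it survives JSON transport.
func encodeOutput(raw []byte) encodedOutput {
	if utf8.Valid(raw) {
		return encodedOutput{Data: string(raw), Encoding: "utf-8"}
	}
	return encodedOutput{Data: base64.StdEncoding.EncodeToString(raw), Encoding: "base64"}
}

func main() {
	fmt.Printf("%+v\n", encodeOutput([]byte("hello\n")))
	fmt.Printf("%+v\n", encodeOutput([]byte{0xff, 0xfe}))
}
```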
Streaming exec (WS /v1/sandboxes/{id}/exec/stream):
- WebSocket upgrade, read first message for cmd/args
- RPC ExecStream → host agent → sandbox.Manager.ExecStream() → envdclient.ExecStream()
- envd client returns a channel of events; host agent forwards events through the RPC stream
- API handler forwards stream events to WebSocket as JSON messages ({type: "stdout"|"stderr"|"exit", ...})
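A possible Go shape for those WebSocket messages is sketched below. Only the type field and its three values appear in this guide. The data and exit_code fields stand in for the elided part of the message and are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// streamEvent is a hypothetical wire format for one streamed exec event.
type streamEvent struct {
	Type     string `json:"type"`                // "stdout", "stderr", or "exit"
	Data     string `json:"data,omitempty"`      // assumed field: output chunk
	ExitCode *int   `json:"exit_code,omitempty"` // assumed field: set only on "exit"
}

// encodeEvent renders an event as the JSON text sent over the WebSocket.
func encodeEvent(ev streamEvent) (string, error) {
	b, err := json.Marshal(ev)
	if err != nil {
		return "", fmt.Errorf("encode stream event: %w", err)
	}
	return string(b), nil
}

func main() {
	code := 0
	events := []streamEvent{
		{Type: "stdout", Data: "hello"},
		{Type: "exit", ExitCode: &code},
	}
	for _, ev := range events {
		msg, err := encodeEvent(ev)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println(msg)
	}
}
```

Using a pointer for the exit code keeps a zero exit status in the output while still omitting the field from stdout and stderr messages.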
File transfer: Write uses multipart POST to envd /files; read uses GET. Streaming variants chunk in 64KB pieces through the RPC stream.
REST API
Routes defined in internal/api/server.go, handlers in internal/api/handlers_*.go. OpenAPI spec embedded via //go:embed and served at /openapi.yaml (Swagger UI at /docs). JSON request/response. API key auth via X-API-Key header. Error responses: {"error": {"code": "...", "message": "..."}}.
Code Generation
Proto (Connect RPC)
Proto source of truth is proto/envd/*.proto and proto/hostagent/*.proto. Run make proto to regenerate. Three buf.gen.yaml files control output:
| buf.gen.yaml location | Generates to | Used by |
|---|---|---|
| proto/envd/buf.gen.yaml | proto/envd/gen/ | Main module (host agent's envd client) |
| proto/hostagent/buf.gen.yaml | proto/hostagent/gen/ | Main module (control plane ↔ host agent) |
| envd/spec/buf.gen.yaml | envd/internal/services/spec/ | envd module (guest agent server) |
The envd buf.gen.yaml reads from ../../proto/envd/ (same source protos) but generates into envd's own module. This means the same .proto files produce two independent sets of Go stubs — one for each Go module.
To add a new RPC method: edit the .proto file → make proto → implement the handler on both sides.
sqlc
Config: sqlc.yaml (project root). Reads queries from db/queries/*.sql, reads schema from db/migrations/, outputs to internal/db/.
To add a new query: add it to the appropriate .sql file in db/queries/ → make generate → use the new method on *db.Queries.
Key Technical Decisions
- Connect RPC (not gRPC) for all RPC communication between components
- Buf + protoc-gen-connect-go for code generation (not protoc-gen-go-grpc)
- Raw Firecracker HTTP API via Unix socket (not firecracker-go-sdk Machine type)
- TAP networking (not vsock) for host-to-envd communication
- PostgreSQL via pgx/v5 + sqlc (type-safe query generation). Goose for migrations (plain SQL, up/down)
- Admin UI: htmx + Go html/template + chi router. No SPA, no React, no build step
- Lago for billing (external service, not in this codebase)
Coding Conventions
- Go style: gofmt, go vet, context.Context everywhere, errors wrapped with fmt.Errorf("action: %w", err), slog for logging, no global state
- Naming: Sandbox IDs sb- + 8 hex, API keys wrn_ + 32 chars, Host IDs host- + 8 hex
- Dependencies: Use go get to add deps, never hand-edit go.mod. For envd deps: cd envd && go get ... (separate module)
- Generated code: Always commit generated code (proto stubs, sqlc). Never add generated code to .gitignore
- Migrations: Always use make migrate-create name=xxx, never create migration files manually
- Testing: Table-driven tests for handlers and state machine transitions
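As an illustration of the naming and error-wrapping conventions, the sketch below generates an ID in the sb- + 8 hex format. Using crypto/rand as the source of randomness and lowercase hex digits are assumptions, and both helper names are hypothetical.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"strings"
)

// newID returns prefix followed by hexChars random lowercase hex digits.
func newID(prefix string, hexChars int) (string, error) {
	if hexChars <= 0 || hexChars%2 != 0 {
		return "", fmt.Errorf("new id: hex length %d must be positive and even", hexChars)
	}
	buf := make([]byte, hexChars/2)
	if _, err := rand.Read(buf); err != nil {
		// Wrapped in the "action: %w" style from the conventions above.
		return "", fmt.Errorf("new id: read random bytes: %w", err)
	}
	return prefix + hex.EncodeToString(buf), nil
}

// isSandboxID reports whether s looks like "sb-" plus 8 lowercase hex digits.
func isSandboxID(s string) bool {
	if !strings.HasPrefix(s, "sb-") || len(s) != len("sb-")+8 {
		return false
	}
	for _, c := range s[len("sb-"):] {
		if !strings.ContainsRune("0123456789abcdef", c) {
			return false
		}
	}
	return true
}

func main() {
	id, err := newID("sb-", 8)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(id, isSandboxID(id))
}
```

The same helper would cover host IDs with newID("host-", 8).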
Two-module gotcha
The main module (go.mod) and envd (envd/go.mod) are fully independent. make tidy, make fmt, make vet already operate on both. But when adding dependencies manually, remember to target the correct module (cd envd && go get ... for envd deps). make proto also generates stubs for both modules from the same proto sources.
Rootfs & Guest Init
- wrenn-init (images/wrenn-init.sh): the PID 1 init script baked into every rootfs. Mounts virtual filesystems, sets hostname, writes /etc/resolv.conf, then execs envd.
- Updating the rootfs after changing envd or wrenn-init: bash scripts/update-debug-rootfs.sh [rootfs_path]. This builds envd via make build-envd, mounts the rootfs image, copies in the new binaries, and unmounts. Defaults to /var/lib/wrenn/images/minimal.ext4.
- Rootfs images are minimal debootstrap: no systemd, no coreutils beyond busybox. Use /bin/sh -c for shell builtins inside the guest.
Fixed Paths (on host machine)
- Kernel: /var/lib/wrenn/kernels/vmlinux
- Base rootfs images: /var/lib/wrenn/images/{template}.ext4
- Sandbox clones: /var/lib/wrenn/sandboxes/
- Firecracker: /usr/local/bin/firecracker
Web UI Styling
Wrenn brand: Warm earthy developer tool with crafted organic character.
Color palette (light/dark): Background scale: #f8f6f1 → #f1eeea → #e8e5e0 → #dedbd5 (light); #090b0a → #0f1211 → #151918 → #1b201e → #222826 (dark). Text hierarchy: bright #2c2a26 / body #4a4740 / dim #7a766e / faint #a09b93 (light); #e8e5df / #c8c4bc / #8a867f / #5f5c57 (dark). Sage green brand accent: #5e8c58 (light) / #89a785 (dark), with glow variant rgba(94,140,88,0.08). Borders: #e2dfd9 (light) / #262c2a (dark). Semantic status colors: amber #9e7c2e (warning/building), red #b35544 (error/failed), blue #3d7aac (info/stopped) — each with a color-dim transparent bg variant for badge backgrounds. Destructive: #b35544 light / #c27b6d dark.
Typography: Four fonts. Manrope (variable, weights 300–700) for all UI labels, nav, body. Instrument Serif (400) for page titles, empty-state headings, large metric values. JetBrains Mono (400/500) for code, env var keys/values, deployment IDs, commit SHAs, log viewer, URL paths. Alice for the sidebar wordmark only. Base body size 14px. Headings: h1 24px serif, h2 20px, h3 18px, h4–h6 11px sans-serif uppercase wide-tracked. Metric card values 34px serif at letter-spacing: -0.08em. Section labels at 0.06–0.07em tracking, weight 550–600.
Spacing: 4px base unit (Tailwind scale). Page content p-8 (32px). Cards p-4–p-5. Sidebar nav items 7px 10px. Consistent, moderate density: functional but not cramped.
Borders & depth: Flat aesthetic — --shadow-sm: 0 0 #0000, no drop shadows. Depth is achieved through background color stepping (bg → bg-3 → bg-4 → bg-5), not shadows. Borders 1px solid in warm muted tones. Corner radii: cards/surfaces 12px, inputs/small buttons 6–8px, avatars 8px, dots 50%.
Components: Active sidebar nav items use a 3px left-border in sage green rather than filled backgrounds, with a sage glow bg (rgba(94,140,88,0.08)). Focus rings are double-ring: 0 0 0 2px background, 0 0 0 4px ring. Status system has four states (Live/sage, Building/amber+pulse, Failed/red, Stopped/faint) each with solid dot + transparent-bg badge pair. Buttons follow ghost → outline → filled hierarchy. Tables wrapped in rounded-xl border. Dialogs via native <dialog>. Toasts bottom-anchored.
Animation: Crisp 150ms transitions on all interactive elements. Sidebar width 250ms ease. Custom wrenn-pulse keyframe (2.5s ease infinite box-shadow bloom) on live/building status dots. Top-of-page loading bar (h-0.5, sage green) on navigation.
Dark mode: Full support. Very dark near-black-green backgrounds with warm off-white text and desaturated sage accent. Flat (no card shadows). System preference detection + localStorage persistence.
Overall feel: Warm, earthy, semi-flat. Avoids cold grays entirely — palette leans slightly warm/brown-tinted throughout. The serif + mono + geometric sans type stack gives a designed but unfussy developer-tool character. Organic and considered, not sterile.