pptx704 51b5d7b3ba fix: resolve pause/snapshot failures and CoW exhaustion on large VMs

Remove hard 10s timeout from Firecracker HTTP client — callers already
pass context.Context with appropriate deadlines, and 20GB+ memfile
writes easily exceed 10s.

Ensure CoW file is at least as large as the origin rootfs. Previously,
WRENN_DEFAULT_ROOTFS_SIZE=30Gi expanded the base image to 30GB but the
default 5GB CoW could not hold all writes, causing dm-snapshot
invalidation and EIO on all guest I/O.

Destroy frozen VMs in resumeOnError instead of leaving zombies that
report "running" but can't execute. Use fresh context for the resume
attempt so a cancelled caller context doesn't falsely trigger destroy.

Increase CP→Agent ResponseHeaderTimeout from 45s to 5min and
PrepareSnapshot timeout from 3s to 30s for large-memory VMs.

After failed pause, ping agent to detect destroyed sandboxes and mark
DB status as "error" instead of reverting to "running".

2026-05-04 01:46:57 +06:00

cmd

fix: prevent sandbox halt after resume by fixing HTTP/2 HOL blocking and adding timeouts

2026-05-02 13:48:51 +06:00

feat: admin grant/revoke from admin panel

2026-05-03 15:24:34 +06:00

deploy

Add production file logging with logrotate support

2026-04-16 15:09:26 +06:00

envd-rs

fix: drop page cache before snapshot to reduce memory dump size

2026-05-03 14:27:49 +06:00

frontend

fix: fetch sandbox metrics immediately on page load

2026-05-03 16:43:26 +06:00

images

rename guest hostname from "sandbox" to "capsule"

2026-05-03 03:32:03 +06:00

internal

fix: resolve pause/snapshot failures and CoW exhaustion on large VMs

2026-05-04 01:46:57 +06:00

pkg

fix: resolve pause/snapshot failures and CoW exhaustion on large VMs

2026-05-04 01:46:57 +06:00

proto

Refactored to maintain a separate cloud version

2026-04-15 21:41:48 +06:00

recipes

Fix build recipe execution and flatten reliability

2026-04-15 18:24:54 +06:00

scripts

feat: rewrite envd guest agent in Rust (envd-rs)

2026-05-03 02:47:15 +06:00

tests

Initial project structure for Wrenn Sandbox

2026-03-09 17:22:47 +06:00

.env.example

Add production file logging with logrotate support

2026-04-16 15:09:26 +06:00

.gitignore

feat: rewrite envd guest agent in Rust (envd-rs)

2026-05-03 02:47:15 +06:00

CLAUDE.md

refactor: remove Go envd module, update host agent for Rust envd

2026-05-03 03:12:25 +06:00

go.mod

Bump netlink v1.3.1 and netns v0.0.5

2026-04-13 00:13:40 +06:00

go.sum

Bump netlink v1.3.1 and netns v0.0.5

2026-04-13 00:13:40 +06:00

LICENSE

chore: relicense from BSL 1.1 to Apache 2.0

2026-04-09 14:28:19 +06:00

Makefile

Updated static link check for envd

2026-05-03 03:32:41 +06:00

README.md

refactor: remove Go envd module, update host agent for Rust envd

2026-05-03 03:12:25 +06:00

sqlc.yaml

Refactored to maintain a separate cloud version

2026-04-15 21:41:48 +06:00

VERSION_AGENT

fix: prevent Go runtime memory corruption and sandbox halt after snapshot restore

2026-05-02 17:22:51 +06:00

VERSION_CP

fix: prevent Go runtime memory corruption and sandbox halt after snapshot restore

2026-05-02 17:22:51 +06:00

README.md

Wrenn

Secure infrastructure for AI

Prerequisites

Linux host with /dev/kvm access (bare metal or nested virtualization)
Firecracker binary at /usr/local/bin/firecracker
PostgreSQL
Go 1.25+
Rust 1.88+ with x86_64-unknown-linux-musl target (rustup target add x86_64-unknown-linux-musl)
pnpm (for frontend)
Docker (for dev infra and rootfs builds)

Build

make build    # outputs to builds/

Produces three binaries: wrenn-cp (control plane), wrenn-agent (host agent), envd (guest agent).

Host setup

The host agent needs a kernel, a minimal rootfs image, and working directories on the host machine.

Directory structure

/var/lib/wrenn/
├── kernels/
│   └── vmlinux              # uncompressed Linux kernel (not bzImage)
├── images/
│   └── minimal/
│       └── rootfs.ext4      # base rootfs (all other templates snapshot from this)
├── sandboxes/               # per-sandbox CoW files (created at runtime)
└── snapshots/               # pause/hibernate snapshot files (created at runtime)

Create the directories:

sudo mkdir -p /var/lib/wrenn/{kernels,images/minimal,sandboxes,snapshots}

Kernel

Place an uncompressed vmlinux kernel at /var/lib/wrenn/kernels/vmlinux. Versioned kernels (vmlinux-{semver}) are also supported — the agent picks the latest by semver.
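
The "latest by semver" rule can be sketched as follows. This is a minimal illustration, not the agent's actual code: `parseSemver`, `less`, and `latestKernel` are hypothetical names, only plain `MAJOR.MINOR.PATCH` versions are handled, and the fallback to an unversioned `vmlinux` when no versioned kernel exists is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSemver parses "MAJOR.MINOR.PATCH" into three integers.
// Pre-release and build suffixes are not handled in this sketch.
func parseSemver(s string) ([3]int, bool) {
	var v [3]int
	parts := strings.Split(s, ".")
	if len(parts) != 3 {
		return v, false
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return v, false
		}
		v[i] = n
	}
	return v, true
}

// less reports whether version a sorts before version b.
func less(a, b [3]int) bool {
	for i := range a {
		if a[i] != b[i] {
			return a[i] < b[i]
		}
	}
	return false
}

// latestKernel returns the name with the highest version among names of the
// form "vmlinux-{semver}". If none match, it returns "vmlinux" when that
// name is present (an assumed fallback), and "" otherwise.
func latestKernel(names []string) string {
	best, found, plain := "", false, false
	var bestVer [3]int
	for _, name := range names {
		if name == "vmlinux" {
			plain = true
			continue
		}
		rest, ok := strings.CutPrefix(name, "vmlinux-")
		if !ok {
			continue
		}
		ver, ok := parseSemver(rest)
		if !ok {
			continue
		}
		if !found || less(bestVer, ver) {
			best, bestVer, found = name, ver, true
		}
	}
	if !found && plain {
		return "vmlinux"
	}
	return best
}

func main() {
	names := []string{"vmlinux", "vmlinux-6.1.9", "vmlinux-6.1.102", "vmlinux-5.10.200"}
	fmt.Println(latestKernel(names)) // vmlinux-6.1.102
}
```

Versions are compared numerically per component, so 6.1.102 ranks above 6.1.9 even though a plain string sort would order them the other way.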

Minimal rootfs

The minimal rootfs is the base image that all other templates (Python, Node, etc.) are built on top of via device-mapper snapshots. It must contain:

| Package | Why |
| --- | --- |
| `socat` | Bidirectional relay for port forwarding |
| `chrony` | Time sync from KVM PTP clock (`/dev/ptp0`) |
| `tini` | PID 1 zombie reaper (injected by build script, not apt) |
| `sudo` | User privilege management inside the guest |
| `wget` | HTTP fetching |
| `curl` | HTTP client |
| `ca-certificates` | TLS certificate verification |

To build a rootfs from a Docker container:

Create and configure a container with the required packages:

docker run -it --name wrenn-minimal debian:bookworm bash
# Inside the container:
apt update && apt install -y socat chrony sudo wget curl ca-certificates
exit

Export to a rootfs image (builds envd, injects wrenn-init + tini, shrinks to minimum size):
```
sudo bash scripts/rootfs-from-container.sh wrenn-minimal minimal
```

To update an existing rootfs after changing envd or wrenn-init.sh:

bash scripts/update-minimal-rootfs.sh

This rebuilds envd via make build-envd and copies the fresh binaries into the mounted rootfs image.

IP forwarding

sudo sysctl -w net.ipv4.ip_forward=1

Configure

Copy .env.example to .env and edit:

# Required
DATABASE_URL=postgres://wrenn:wrenn@localhost:5432/wrenn?sslmode=disable

# Control plane
WRENN_CP_LISTEN_ADDR=:8000
CP_HOST_AGENT_ADDR=http://localhost:50051

# Host agent
WRENN_HOST_LISTEN_ADDR=:50051
WRENN_DIR=/var/lib/wrenn

Development

make dev          # Start PostgreSQL (Docker), run migrations, start control plane
make dev-agent    # Start host agent (separate terminal, sudo)
make dev-frontend # Vite dev server with HMR (port 5173)
make check        # fmt + vet + lint + test

Host registration

Hosts must be registered with the control plane before they can serve sandboxes.

Create a host record (via API or dashboard):

curl -X POST http://localhost:8000/v1/hosts \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"type": "regular"}'

This returns a registration_token (valid for 1 hour).

Start the host agent with the registration token and its externally-reachable address:
```
sudo WRENN_CP_URL=http://localhost:8000 \
     ./builds/wrenn-agent \
     --register <token-from-step-1> \
     --address <host-ip>:50051
```
On first startup the agent sends its specs (arch, CPU, memory, disk) to the control plane, receives a long-lived host JWT, and saves it to $WRENN_DIR/host-token.
Subsequent startups don't need --register — the agent loads the saved JWT automatically:
```
sudo ./builds/wrenn-agent --address <host-ip>:50051
```
If registration fails (e.g., a network error after the token was consumed), regenerate a token:
```
curl -X POST http://localhost:8000/v1/hosts/$HOST_ID/token \
  -H "Authorization: Bearer $JWT_TOKEN"
```
Then restart the agent with the new token.

The agent sends heartbeats to the control plane every 30 seconds.

See CLAUDE.md for full architecture documentation.

Languages

Go 44.1%

Svelte 39.8%

Rust 9.5%

TypeScript 2.3%

Python 1.5%

Other 2.8%