pptx704 7ef9a64613 fix: close stale TCP connections across snapshot/restore to prevent envd hangs

After Firecracker snapshot restore, zombie TCP sockets from the previous
session cause Go runtime corruption inside the guest VM, making envd
unresponsive. This manifests as infinite loading in the file browser and
terminal timeouts (524) in production (HTTP/2 + Cloudflare) but not locally.

Four-part fix:
- Add ServerConnTracker to envd that tracks connections via ConnState callback,
  closes idle connections and disables keep-alives before snapshot, then closes
  all pre-snapshot zombie connections on restore (while preserving post-restore
  connections like the /init request)
- Split envdclient into timeout (2min) and streaming (no timeout) HTTP clients;
  use streaming client for file transfers and process RPCs
- Close host-side idle envdclient connections before PrepareSnapshot so FIN
  packets propagate during the 3s quiesce window
- Add StreamingHTTPClient() accessor; streaming file transfer handlers in
  hostagent use it instead of the timeout client

2026-05-02 05:19:37 +06:00

cmd

fix: sandbox network responsiveness under port-binding apps

2026-04-25 04:21:55 +06:00

feat: send email notification on account hard-delete

2026-04-21 16:01:56 +06:00

deploy

Add production file logging with logrotate support

2026-04-16 15:09:26 +06:00

envd

fix: close stale TCP connections across snapshot/restore to prevent envd hangs

2026-05-02 05:19:37 +06:00

frontend

feat: add audit logging for all admin actions and admin audit page

2026-04-21 15:41:45 +06:00

images

Fix runtime env leaking into templates, add hostname to /etc/hosts

2026-04-12 02:43:09 +06:00

internal

fix: close stale TCP connections across snapshot/restore to prevent envd hangs

2026-05-02 05:19:37 +06:00

pkg

fix: sandbox network responsiveness under port-binding apps

2026-04-25 04:21:55 +06:00

proto

Refactored to maintain a separate cloud version

2026-04-15 21:41:48 +06:00

recipes

Fix build recipe execution and flatten reliability

2026-04-15 18:24:54 +06:00

scripts

Added host preparation script and updated claude md

2026-04-16 16:56:04 +06:00

tests

Initial project structure for Wrenn Sandbox

2026-03-09 17:22:47 +06:00

.env.example

Add production file logging with logrotate support

2026-04-16 15:09:26 +06:00

.gitignore

Updated gitignore

2026-04-12 22:24:54 +06:00

CLAUDE.md

Minor patch

2026-04-16 18:14:50 +06:00

go.mod

Bump netlink v1.3.1 and netns v0.0.5

2026-04-13 00:13:40 +06:00

go.sum

Bump netlink v1.3.1 and netns v0.0.5

2026-04-13 00:13:40 +06:00

LICENSE

chore: relicense from BSL 1.1 to Apache 2.0

2026-04-09 14:28:19 +06:00

Makefile

Refactored to maintain a separate cloud version

2026-04-15 21:41:48 +06:00

NOTICE

chore: relicense from BSL 1.1 to Apache 2.0

2026-04-09 14:28:19 +06:00

README.md

Minor patch

2026-04-16 18:14:50 +06:00

sqlc.yaml

Refactored to maintain a separate cloud version

2026-04-15 21:41:48 +06:00

VERSION_AGENT

Version bump

2026-04-25 04:49:17 +06:00

VERSION_CP

fix: security and stability fixes from code review

2026-04-24 15:48:38 +06:00

README.md

Wrenn

Secure infrastructure for AI

Prerequisites

Linux host with /dev/kvm access (bare metal or nested virt)
Firecracker binary at /usr/local/bin/firecracker
PostgreSQL
Go 1.25+
pnpm (for frontend)
Docker (for dev infra and rootfs builds)

Build

make build    # outputs to builds/

Produces three binaries: wrenn-cp (control plane), wrenn-agent (host agent), envd (guest agent).

Host setup

The host agent needs a kernel, a minimal rootfs image, and working directories on the host machine.

Directory structure

/var/lib/wrenn/
├── kernels/
│   └── vmlinux              # uncompressed Linux kernel (not bzImage)
├── images/
│   └── minimal/
│       └── rootfs.ext4      # base rootfs (all other templates snapshot from this)
├── sandboxes/               # per-sandbox CoW files (created at runtime)
└── snapshots/               # pause/hibernate snapshot files (created at runtime)

Create the directories:

sudo mkdir -p /var/lib/wrenn/{kernels,images/minimal,sandboxes,snapshots}

Kernel

Place an uncompressed vmlinux kernel at /var/lib/wrenn/kernels/vmlinux. Versioned kernels (vmlinux-{semver}) are also supported — the agent picks the latest by semver.
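
As an illustration of "latest by semver", here is a minimal sketch of how such a selection could work. It is not the agent's code: the helper names `parseVersion` and `latestKernel` are hypothetical, and it assumes plain `MAJOR.MINOR.PATCH` suffixes with no pre-release or build metadata.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion extracts MAJOR.MINOR.PATCH from a name like "vmlinux-6.1.102".
// It reports false for any name that does not have that shape.
func parseVersion(name string) ([3]int, bool) {
	var v [3]int
	rest, ok := strings.CutPrefix(name, "vmlinux-")
	if !ok {
		return v, false
	}
	parts := strings.Split(rest, ".")
	if len(parts) != 3 {
		return v, false
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return v, false
		}
		v[i] = n
	}
	return v, true
}

// less compares two versions field by field, numerically.
func less(a, b [3]int) bool {
	for i := range a {
		if a[i] != b[i] {
			return a[i] < b[i]
		}
	}
	return false
}

// latestKernel returns the versioned file name with the highest version,
// or "" if no name carries a parseable version.
func latestKernel(names []string) string {
	best := ""
	var bestV [3]int
	for _, name := range names {
		v, ok := parseVersion(name)
		if !ok {
			continue
		}
		if best == "" || less(bestV, v) {
			best, bestV = name, v
		}
	}
	return best
}

func main() {
	names := []string{"vmlinux", "vmlinux-5.10.9", "vmlinux-6.1.102", "vmlinux-6.1.20"}
	fmt.Println(latestKernel(names)) // prints "vmlinux-6.1.102"
}
```

The comparison is numeric per field, which is why 6.1.102 ranks above 6.1.20; a plain string sort would order them the other way.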

Minimal rootfs

The minimal rootfs is the base image that all other templates (Python, Node, etc.) are built on top of via device-mapper snapshots. It must contain:

| Package | Why |
|---|---|
| `socat` | Bidirectional relay for port forwarding |
| `chrony` | Time sync from KVM PTP clock (`/dev/ptp0`) |
| `tini` | PID 1 zombie reaper (injected by build script, not apt) |
| `sudo` | User privilege management inside the guest |
| `wget` | HTTP fetching |
| `curl` | HTTP client |
| `ca-certificates` | TLS certificate verification |

To build a rootfs from a Docker container:

1. Create and configure a container with the required packages:

docker run -it --name wrenn-minimal debian:bookworm bash
# Inside the container:
apt update && apt install -y socat chrony sudo wget curl ca-certificates
exit

2. Export to a rootfs image (builds envd, injects wrenn-init + tini, shrinks to minimum size):
```
sudo bash scripts/rootfs-from-container.sh wrenn-minimal minimal
```

To update an existing rootfs after changing envd or wrenn-init.sh:

bash scripts/update-minimal-rootfs.sh

This rebuilds envd via make build-envd and copies the fresh binaries into the mounted rootfs image.

IP forwarding

sudo sysctl -w net.ipv4.ip_forward=1

Configure

Copy .env.example to .env and edit:

# Required
DATABASE_URL=postgres://wrenn:wrenn@localhost:5432/wrenn?sslmode=disable

# Control plane
WRENN_CP_LISTEN_ADDR=:8000
CP_HOST_AGENT_ADDR=http://localhost:50051

# Host agent
WRENN_HOST_LISTEN_ADDR=:50051
WRENN_DIR=/var/lib/wrenn
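
For orientation, the sketch below shows one way a Go binary might read these variables. It is an assumption-laden illustration, not the project's loader: the `config` type, `loadConfig`, and `orDefault` are hypothetical, the fallback values are simply the ones from the example above, and it reads the process environment only (whether the real binaries parse `.env` themselves is not stated here).

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// config mirrors the variables listed above (hypothetical type).
type config struct {
	DatabaseURL    string
	CPListenAddr   string
	HostAgentAddr  string
	HostListenAddr string
	WrennDir       string
}

// orDefault returns fallback when value is empty.
func orDefault(value, fallback string) string {
	if value == "" {
		return fallback
	}
	return value
}

// loadConfig reads settings through getenv so it can be exercised without
// touching the real environment. The fallbacks are the example values from
// the README, assumed here for illustration.
func loadConfig(getenv func(string) string) (config, error) {
	cfg := config{
		DatabaseURL:    getenv("DATABASE_URL"),
		CPListenAddr:   orDefault(getenv("WRENN_CP_LISTEN_ADDR"), ":8000"),
		HostAgentAddr:  orDefault(getenv("CP_HOST_AGENT_ADDR"), "http://localhost:50051"),
		HostListenAddr: orDefault(getenv("WRENN_HOST_LISTEN_ADDR"), ":50051"),
		WrennDir:       orDefault(getenv("WRENN_DIR"), "/var/lib/wrenn"),
	}
	if cfg.DatabaseURL == "" {
		return config{}, errors.New("DATABASE_URL is required")
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```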

Development

make dev          # Start PostgreSQL (Docker), run migrations, start control plane
make dev-agent    # Start host agent (separate terminal, sudo)
make dev-frontend # Vite dev server with HMR (port 5173)
make check        # fmt + vet + lint + test

Host registration

Hosts must be registered with the control plane before they can serve sandboxes.

1. Create a host record (via API or dashboard):

curl -X POST http://localhost:8000/v1/hosts \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"type": "regular"}'

This returns a registration_token (valid for 1 hour).

2. Start the host agent with the registration token and its externally-reachable address:
```
sudo WRENN_CP_URL=http://localhost:8000 \
     ./builds/wrenn-agent \
     --register <token-from-step-1> \
     --address <host-ip>:50051
```
On first startup the agent sends its specs (arch, CPU, memory, disk) to the control plane, receives a long-lived host JWT, and saves it to $WRENN_DIR/host-token.
Subsequent startups don't need --register; the agent loads the saved JWT automatically:
```
sudo ./builds/wrenn-agent --address <host-ip>:50051
```
3. If registration fails (e.g., network error after token was consumed), regenerate a token:
```
curl -X POST http://localhost:8000/v1/hosts/$HOST_ID/token \
  -H "Authorization: Bearer $JWT_TOKEN"
```
Then restart the agent with the new token.

The agent sends heartbeats to the control plane every 30 seconds.

See CLAUDE.md for full architecture documentation.

Languages

Go 44.1%

Svelte 39.8%

Rust 9.5%

TypeScript 2.3%

Python 1.5%

Other 2.8%