pptx704 bd98610153 fix: sandbox network responsiveness under port-binding apps

Running port-binding applications (Jupyter, http.server, Next.js) inside
sandboxes caused severe PTY sluggishness and proxy navigation errors.

Root cause: the CP sandbox proxy and Connect RPC pool shared a single
HTTP transport. Heavy proxy traffic (Jupyter WebSocket, REST polling)
interfered with PTY RPC streams via HTTP/2 flow control contention.

Transport isolation (main fix):
- Add dedicated proxy transport on CP (NewProxyTransport) with HTTP/2
  disabled, separate from the RPC pool transport
- Add dedicated proxy transport on host agent, replacing
  http.DefaultTransport
- Add dedicated envdclient transport with tuned connection pooling
- Replace http.DefaultClient in file streaming RPCs with per-sandbox
  envd client
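
As a rough illustration of the transport split described above, the sketch
below builds a proxy-only transport with HTTP/2 turned off. Only the name
NewProxyTransport comes from this commit message; the function body, the
dial timeouts, and the pool sizes are assumptions chosen for the example,
not the values used in the repository.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net"
	"net/http"
	"time"
)

// NewProxyTransport builds a transport reserved for sandbox proxy traffic.
// All numeric values below are illustrative assumptions.
func NewProxyTransport() *http.Transport {
	dialer := &net.Dialer{Timeout: 5 * time.Second, KeepAlive: 30 * time.Second}
	return &http.Transport{
		DialContext: dialer.DialContext,
		// Stay on HTTP/1.1: each proxied request uses its own connection
		// instead of a multiplexed HTTP/2 stream.
		ForceAttemptHTTP2: false,
		// A non-nil, empty TLSNextProto map disables automatic HTTP/2
		// negotiation for TLS upstreams.
		TLSNextProto:        map[string]func(string, *tls.Conn) http.RoundTripper{},
		MaxIdleConns:        256,
		MaxIdleConnsPerHost: 32,
		IdleConnTimeout:     90 * time.Second,
	}
}

func main() {
	// Two independent clients: proxy traffic never touches the RPC transport.
	proxyClient := &http.Client{Transport: NewProxyTransport()}
	rpcClient := &http.Client{Transport: &http.Transport{ForceAttemptHTTP2: true}}
	fmt.Println("separate transports:", proxyClient.Transport != rpcClient.Transport)
}
```

Because the two clients own separate connection pools, heavy proxy
traffic cannot compete with RPC streams for the same connections.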

Proxy path rewriting (navigation fix):
- Add ModifyResponse to rewrite Location headers with /proxy/{id}/{port}
  prefix, handling both root-relative and absolute-URL redirects
- Strip prefix back out in CP subdomain proxy for correct browser
  behavior
- Replace path.Join with string concat in CP Director to preserve
  trailing slashes (prevents redirect loops on directory listings)
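
The Location rewriting can be sketched as a pure function, which a
ReverseProxy.ModifyResponse hook could then apply to the response header.
The helper name rewriteLocation, its parameters, and the rule for which
absolute URLs count as "pointing at the upstream" are assumptions made for
this example; the real implementation may differ.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// rewriteLocation prefixes a redirect target with the proxy path.
// prefix is e.g. "/proxy/{id}/{port}"; upstreamHost is the host:port the
// proxy forwards to. Redirects to any other host are returned unchanged.
func rewriteLocation(loc, prefix, upstreamHost string) string {
	u, err := url.Parse(loc)
	if err != nil {
		return loc
	}
	switch {
	case u.Host == "" && strings.HasPrefix(u.Path, "/"):
		// Root-relative redirect such as "/tree".
	case u.Host == upstreamHost:
		// Absolute URL pointing back at the sandbox: keep only the path.
		u.Scheme, u.Host = "", ""
	default:
		return loc
	}
	u.Path = prefix + u.Path
	u.RawPath = "" // let net/url re-encode the new path
	return u.String()
}

func main() {
	prefix := "/proxy/sb1/8888"
	fmt.Println(rewriteLocation("/tree?token=x", prefix, "10.0.0.2:8888"))
	fmt.Println(rewriteLocation("http://10.0.0.2:8888/lab/", prefix, "10.0.0.2:8888"))

	// path.Join cleans its result and drops the trailing slash;
	// plain concatenation keeps it.
	fmt.Println(path.Join(prefix, "/files/"))
	fmt.Println(prefix + "/files/")
}
```

The last two lines show why path.Join is a poor fit here: a directory
listing that redirects to its slash-terminated URL would loop forever if
the proxy kept stripping the slash.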

Proxy resilience:
- Add dial retry with linear backoff (3 attempts) to handle socat
  startup delay when ports are first detected
- Cache ReverseProxy instances per sandbox+port+host in sync.Map
- Add EvictProxy callback wired into sandbox Manager.Destroy
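
A minimal sketch of the two resilience pieces, assuming a 100 ms backoff
step and a cache key of the form "sandbox|port|host". The names
retryLinear, retryingDial, proxyCache, and errNotReady are hypothetical;
only the three attempts, the linear backoff, and the sync.Map cache with
eviction are taken from the list above.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
	"net/http/httputil"
	"net/url"
	"strconv"
	"strings"
	"sync"
	"time"
)

var errNotReady = errors.New("upstream not ready")

// retryLinear runs op up to attempts times, sleeping step, 2*step, ...
// between tries, and returns the last error if every try fails.
func retryLinear(attempts int, step time.Duration, op func() error) error {
	var err error
	for i := 1; i <= attempts; i++ {
		if err = op(); err == nil {
			return nil
		}
		if i < attempts {
			time.Sleep(time.Duration(i) * step)
		}
	}
	return err
}

// retryingDial adapts retryLinear to the DialContext signature.
func retryingDial(ctx context.Context, network, addr string) (net.Conn, error) {
	var conn net.Conn
	d := &net.Dialer{Timeout: 2 * time.Second}
	err := retryLinear(3, 100*time.Millisecond, func() error {
		c, err := d.DialContext(ctx, network, addr)
		conn = c
		return err
	})
	return conn, err
}

var proxyTransport = &http.Transport{DialContext: retryingDial}

// proxyCache maps "sandbox|port|host" to a *httputil.ReverseProxy.
type proxyCache struct{ m sync.Map }

func (c *proxyCache) get(sandboxID string, port int, host string) *httputil.ReverseProxy {
	key := sandboxID + "|" + strconv.Itoa(port) + "|" + host
	if p, ok := c.m.Load(key); ok {
		return p.(*httputil.ReverseProxy)
	}
	target := &url.URL{Scheme: "http", Host: net.JoinHostPort(host, strconv.Itoa(port))}
	rp := httputil.NewSingleHostReverseProxy(target)
	rp.Transport = proxyTransport
	p, _ := c.m.LoadOrStore(key, rp) // keep whichever proxy was stored first
	return p.(*httputil.ReverseProxy)
}

// evict drops every cached proxy that belongs to one sandbox.
func (c *proxyCache) evict(sandboxID string) {
	c.m.Range(func(k, _ any) bool {
		if strings.HasPrefix(k.(string), sandboxID+"|") {
			c.m.Delete(k)
		}
		return true
	})
}

func main() {
	calls := 0
	err := retryLinear(3, time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errNotReady
		}
		return nil
	})
	fmt.Println("attempts:", calls, "err:", err)

	cache := &proxyCache{}
	first := cache.get("sb1", 8888, "10.0.0.2")
	fmt.Println("cache hit:", first == cache.get("sb1", 8888, "10.0.0.2"))
	cache.evict("sb1")
	fmt.Println("rebuilt after evict:", first != cache.get("sb1", 8888, "10.0.0.2"))
}
```

In this sketch evict plays the role of the EvictProxy callback: calling
it when a sandbox is destroyed keeps the map from accumulating stale
proxies.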

Buffer and server hardening:
- Increase PTY and exec stream channel buffers from 16 to 256
- Add ReadHeaderTimeout (10s) and IdleTimeout (620s) to host agent
  HTTP server
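
For reference, the hardening values listed above map onto the standard
library roughly as follows. The helper names newAgentServer and
newStreamChan and the []byte element type are assumptions for this
sketch; the two timeouts and the buffer size of 256 come from the list.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// streamBuffer is the channel capacity for PTY and exec streams (was 16).
const streamBuffer = 256

// newAgentServer applies the two timeouts named in the commit message.
func newAgentServer(addr string, h http.Handler) *http.Server {
	return &http.Server{
		Addr:              addr,
		Handler:           h,
		ReadHeaderTimeout: 10 * time.Second,  // bound slow or stalled header reads
		IdleTimeout:       620 * time.Second, // close keep-alive connections left idle
	}
}

// newStreamChan returns a buffered channel for output chunks, so a briefly
// slow consumer does not immediately block the producer.
func newStreamChan() chan []byte {
	return make(chan []byte, streamBuffer)
}

func main() {
	srv := newAgentServer(":50051", http.NotFoundHandler())
	ch := newStreamChan()
	fmt.Println(srv.ReadHeaderTimeout, srv.IdleTimeout, cap(ch))
}
```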

Network tuning:
- Set TAP device TxQueueLen to 5000 (up from default 1000)
- Add Firecracker tx_rate_limiter (200 MB/s sustained, 100 MB burst)
  to prevent guest traffic from saturating the TAP
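
The rate limiter above can be expressed as a Firecracker token bucket:
the bucket size divided by the refill time gives the sustained rate, and
one_time_burst is extra initial credit. The sketch below builds the JSON
body for a network-interface request. It assumes 1 MB means 2^20 bytes
and uses made-up interface names; the Go type and function names are
hypothetical. The TAP queue-length change is not shown.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// tokenBucket mirrors the Firecracker TokenBucket object.
type tokenBucket struct {
	Size         int64 `json:"size"`                     // bucket capacity in bytes
	OneTimeBurst int64 `json:"one_time_burst,omitempty"` // initial extra credit in bytes
	RefillTime   int64 `json:"refill_time"`              // milliseconds to refill the bucket
}

type rateLimiter struct {
	Bandwidth *tokenBucket `json:"bandwidth,omitempty"`
}

type networkInterface struct {
	IfaceID       string       `json:"iface_id"`
	HostDevName   string       `json:"host_dev_name"`
	TxRateLimiter *rateLimiter `json:"tx_rate_limiter,omitempty"`
}

const mib = 1 << 20 // assumption: "MB" in the commit message means MiB

// txLimiter converts a sustained rate (MB/s) and a burst (MB) into a bucket
// that refills once per second.
func txLimiter(sustainedMBps, burstMB int64) *rateLimiter {
	return &rateLimiter{Bandwidth: &tokenBucket{
		Size:         sustainedMBps * mib,
		OneTimeBurst: burstMB * mib,
		RefillTime:   1000,
	}}
}

func main() {
	iface := networkInterface{
		IfaceID:       "eth0", // hypothetical names
		HostDevName:   "tap0",
		TxRateLimiter: txLimiter(200, 100),
	}
	body, err := json.MarshalIndent(iface, "", "  ")
	if err != nil {
		fmt.Println("marshal error:", err)
		return
	}
	fmt.Println(string(body))
}
```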

2026-04-25 04:21:55 +06:00

cmd

fix: sandbox network responsiveness under port-binding apps

2026-04-25 04:21:55 +06:00

feat: send email notification on account hard-delete

2026-04-21 16:01:56 +06:00

deploy

Add production file logging with logrotate support

2026-04-16 15:09:26 +06:00

envd

fix: security and stability fixes from code review

2026-04-24 15:48:38 +06:00

frontend

feat: add audit logging for all admin actions and admin audit page

2026-04-21 15:41:45 +06:00

images

Fix runtime env leaking into templates, add hostname to /etc/hosts

2026-04-12 02:43:09 +06:00

internal

fix: sandbox network responsiveness under port-binding apps

2026-04-25 04:21:55 +06:00

pkg

fix: sandbox network responsiveness under port-binding apps

2026-04-25 04:21:55 +06:00

proto

Refactored to maintain a separate cloud version

2026-04-15 21:41:48 +06:00

recipes

Fix build recipe execution and flatten reliability

2026-04-15 18:24:54 +06:00

scripts

Added host preparation script and updated claude md

2026-04-16 16:56:04 +06:00

tests

Initial project structure for Wrenn Sandbox

2026-03-09 17:22:47 +06:00

.env.example

Add production file logging with logrotate support

2026-04-16 15:09:26 +06:00

.gitignore

Updated gitignore

2026-04-12 22:24:54 +06:00

CLAUDE.md

Minor patch

2026-04-16 18:14:50 +06:00

go.mod

Bump netlink v1.3.1 and netns v0.0.5

2026-04-13 00:13:40 +06:00

go.sum

Bump netlink v1.3.1 and netns v0.0.5

2026-04-13 00:13:40 +06:00

LICENSE

chore: relicense from BSL 1.1 to Apache 2.0

2026-04-09 14:28:19 +06:00

Makefile

Refactored to maintain a separate cloud version

2026-04-15 21:41:48 +06:00

NOTICE

chore: relicense from BSL 1.1 to Apache 2.0

2026-04-09 14:28:19 +06:00

README.md

Minor patch

2026-04-16 18:14:50 +06:00

sqlc.yaml

Refactored to maintain a separate cloud version

2026-04-15 21:41:48 +06:00

VERSION_AGENT

Refactored to maintain a separate cloud version

2026-04-15 21:41:48 +06:00

VERSION_CP

fix: security and stability fixes from code review

2026-04-24 15:48:38 +06:00

README.md

Wrenn

Secure infrastructure for AI

Prerequisites

Linux host with /dev/kvm access (bare metal or nested virt)
Firecracker binary at /usr/local/bin/firecracker
PostgreSQL
Go 1.25+
pnpm (for frontend)
Docker (for dev infra and rootfs builds)

Build

make build    # outputs to builds/

Produces three binaries: wrenn-cp (control plane), wrenn-agent (host agent), envd (guest agent).

Host setup

The host agent needs a kernel, a minimal rootfs image, and working directories on the host machine.

Directory structure

/var/lib/wrenn/
├── kernels/
│   └── vmlinux              # uncompressed Linux kernel (not bzImage)
├── images/
│   └── minimal/
│       └── rootfs.ext4      # base rootfs (all other templates snapshot from this)
├── sandboxes/               # per-sandbox CoW files (created at runtime)
└── snapshots/               # pause/hibernate snapshot files (created at runtime)

Create the directories:

sudo mkdir -p /var/lib/wrenn/{kernels,images/minimal,sandboxes,snapshots}

Kernel

Place an uncompressed vmlinux kernel at /var/lib/wrenn/kernels/vmlinux. Versioned kernels (vmlinux-{semver}) are also supported; when several are present, the agent picks the one with the highest semantic version.
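
The selection rule can be sketched as below. This is not the agent's code: the function names are hypothetical, only plain MAJOR.MINOR.PATCH versions are parsed, and falling back to the unversioned vmlinux when no versioned kernel exists is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSemver parses "MAJOR.MINOR.PATCH" into three integers.
func parseSemver(s string) ([3]int, bool) {
	var v [3]int
	parts := strings.Split(s, ".")
	if len(parts) != 3 {
		return v, false
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return v, false
		}
		v[i] = n
	}
	return v, true
}

// less reports whether version a sorts before version b.
func less(a, b [3]int) bool {
	for i := range a {
		if a[i] != b[i] {
			return a[i] < b[i]
		}
	}
	return false
}

// pickKernel returns the vmlinux-{semver} name with the highest version,
// or "vmlinux" if only the unversioned file is present, or "" if neither.
func pickKernel(names []string) string {
	best, bestVer, fallback := "", [3]int{}, ""
	for _, name := range names {
		if name == "vmlinux" {
			fallback = name
			continue
		}
		rest, ok := strings.CutPrefix(name, "vmlinux-")
		if !ok {
			continue
		}
		v, ok := parseSemver(rest)
		if !ok {
			continue // ignore names that are not valid versions
		}
		if best == "" || less(bestVer, v) {
			best, bestVer = name, v
		}
	}
	if best != "" {
		return best
	}
	return fallback
}

func main() {
	names := []string{"vmlinux", "vmlinux-6.1.9", "vmlinux-6.1.102", "vmlinux-5.10.200"}
	fmt.Println(pickKernel(names))
}
```

Comparing the numeric components matters here: a plain string sort would rank 6.1.9 above 6.1.102.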

Minimal rootfs

The minimal rootfs is the base image that all other templates (Python, Node, etc.) are built on top of via device-mapper snapshots. It must contain:

| Package | Why |
| --- | --- |
| `socat` | Bidirectional relay for port forwarding |
| `chrony` | Time sync from KVM PTP clock (`/dev/ptp0`) |
| `tini` | PID 1 zombie reaper (injected by build script, not apt) |
| `sudo` | User privilege management inside the guest |
| `wget` | HTTP fetching |
| `curl` | HTTP client |
| `ca-certificates` | TLS certificate verification |

To build a rootfs from a Docker container:

Create and configure a container with the required packages:

docker run -it --name wrenn-minimal debian:bookworm bash
# Inside the container:
apt update && apt install -y socat chrony sudo wget curl ca-certificates
exit

Export to a rootfs image (builds envd, injects wrenn-init + tini, shrinks to minimum size):
```
sudo bash scripts/rootfs-from-container.sh wrenn-minimal minimal
```

To update an existing rootfs after changing envd or wrenn-init.sh:

bash scripts/update-minimal-rootfs.sh

This rebuilds envd via make build-envd and copies the fresh binaries into the mounted rootfs image.

IP forwarding

sudo sysctl -w net.ipv4.ip_forward=1

Configure

Copy .env.example to .env and edit:

# Required
DATABASE_URL=postgres://wrenn:wrenn@localhost:5432/wrenn?sslmode=disable

# Control plane
WRENN_CP_LISTEN_ADDR=:8000
CP_HOST_AGENT_ADDR=http://localhost:50051

# Host agent
WRENN_HOST_LISTEN_ADDR=:50051
WRENN_DIR=/var/lib/wrenn
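
For readers wiring these variables into their own tooling, here is a small sketch of reading them from the environment. The variable names are the ones listed above; the struct, the loadConfig function, and the choice to fall back to the example values when a variable is unset are assumptions, and the actual binaries may behave differently.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// config holds the settings shown in .env.example.
type config struct {
	DatabaseURL    string
	CPListenAddr   string
	HostAgentAddr  string
	HostListenAddr string
	WrennDir       string
}

// loadConfig reads settings through getenv so it can be exercised
// without touching the real process environment.
func loadConfig(getenv func(string) string) (config, error) {
	orDefault := func(key, def string) string {
		if v := getenv(key); v != "" {
			return v
		}
		return def
	}
	cfg := config{
		DatabaseURL:    getenv("DATABASE_URL"),
		CPListenAddr:   orDefault("WRENN_CP_LISTEN_ADDR", ":8000"),
		HostAgentAddr:  orDefault("CP_HOST_AGENT_ADDR", "http://localhost:50051"),
		HostListenAddr: orDefault("WRENN_HOST_LISTEN_ADDR", ":50051"),
		WrennDir:       orDefault("WRENN_DIR", "/var/lib/wrenn"),
	}
	if cfg.DatabaseURL == "" {
		return config{}, errors.New("DATABASE_URL is required")
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Println("control plane listens on", cfg.CPListenAddr)
	fmt.Println("host agent listens on", cfg.HostListenAddr)
	fmt.Println("data directory:", cfg.WrennDir)
}
```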

Development

make dev          # Start PostgreSQL (Docker), run migrations, start control plane
make dev-agent    # Start host agent (separate terminal, sudo)
make dev-frontend # Vite dev server with HMR (port 5173)
make check        # fmt + vet + lint + test

Host registration

Hosts must be registered with the control plane before they can serve sandboxes.

Create a host record (via API or dashboard):

curl -X POST http://localhost:8000/v1/hosts \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"type": "regular"}'

This returns a registration_token (valid for 1 hour).

Start the host agent with the registration token and its externally-reachable address:
```
sudo WRENN_CP_URL=http://localhost:8000 \
     ./builds/wrenn-agent \
     --register <token-from-step-1> \
     --address <host-ip>:50051
```
On first startup the agent sends its specs (arch, CPU, memory, disk) to the control plane, receives a long-lived host JWT, and saves it to $WRENN_DIR/host-token.
Subsequent startups don't need --register; the agent loads the saved JWT automatically:
```
sudo ./builds/wrenn-agent --address <host-ip>:50051
```
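
The register-once, reuse-afterwards behaviour described above could look roughly like this. It is a sketch under assumptions: hostToken and demo are hypothetical names, the register callback stands in for the real call to the control plane, and preferring a saved token over a supplied registration token is a guess.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// hostToken returns the host JWT saved at dir/host-token. If no token is
// saved and regToken is non-empty, it calls register and persists the result.
func hostToken(dir, regToken string, register func(regToken string) (string, error)) (string, error) {
	path := filepath.Join(dir, "host-token")
	data, err := os.ReadFile(path)
	if err == nil {
		return strings.TrimSpace(string(data)), nil
	}
	if !errors.Is(err, os.ErrNotExist) {
		return "", err
	}
	if regToken == "" {
		return "", errors.New("no saved host token; start with --register <token>")
	}
	jwt, err := register(regToken)
	if err != nil {
		return "", err
	}
	// 0600: the token is a credential and should not be world-readable.
	if err := os.WriteFile(path, []byte(jwt), 0o600); err != nil {
		return "", err
	}
	return jwt, nil
}

// demo simulates a first start with a registration token followed by a
// restart without one, using a temporary directory as $WRENN_DIR.
func demo() (first, second string, registerCalls int, err error) {
	dir, err := os.MkdirTemp("", "wrenn-demo")
	if err != nil {
		return
	}
	defer os.RemoveAll(dir)
	register := func(string) (string, error) {
		registerCalls++
		return "host-jwt", nil
	}
	if first, err = hostToken(dir, "one-time-token", register); err != nil {
		return
	}
	second, err = hostToken(dir, "", register)
	return
}

func main() {
	first, second, calls, err := demo()
	fmt.Println(first, second, calls, err)
}
```
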
If registration fails (e.g., network error after token was consumed), regenerate a token:
```
curl -X POST http://localhost:8000/v1/hosts/$HOST_ID/token \
  -H "Authorization: Bearer $JWT_TOKEN"
```
Then restart the agent with the new token.

The agent sends heartbeats to the control plane every 30 seconds.
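
A heartbeat loop of this kind is usually a ticker plus a context. The sketch below assumes that a failed heartbeat is logged and retried on the next tick rather than stopping the loop; the function names are hypothetical and the send callback stands in for the real request to the control plane.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// heartbeatInterval is the production interval; the demo uses a shorter one.
const heartbeatInterval = 30 * time.Second

// runHeartbeat calls send once per interval until ctx is cancelled and
// returns the number of successful heartbeats.
func runHeartbeat(ctx context.Context, interval time.Duration, send func(context.Context) error) int {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	sent := 0
	for {
		select {
		case <-ctx.Done():
			return sent
		case <-ticker.C:
			if ctx.Err() != nil {
				return sent // cancelled while a tick was already pending
			}
			if err := send(ctx); err != nil {
				fmt.Println("heartbeat failed:", err)
				continue // try again on the next tick
			}
			sent++
		}
	}
}

// demoHeartbeat fails the first beat, then cancels after three successes.
func demoHeartbeat() int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	attempts, ok := 0, 0
	return runHeartbeat(ctx, time.Millisecond, func(context.Context) error {
		attempts++
		if attempts == 1 {
			return errors.New("control plane unreachable")
		}
		ok++
		if ok == 3 {
			cancel()
		}
		return nil
	})
}

func main() {
	fmt.Println("production interval:", heartbeatInterval)
	fmt.Println("successful heartbeats in demo:", demoHeartbeat())
}
```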

See CLAUDE.md for full architecture documentation.

Languages

Go 44.1%

Svelte 39.8%

Rust 9.5%

TypeScript 2.3%

Python 1.5%

Other 2.8%