forked from wrenn/wrenn

Merge pull request 'Improved codebase to prepare for production' (#32) from chore/hardening into dev

Reviewed-on: wrenn/wrenn#32
2026-04-16 13:00:06 +00:00
13 changed files with 761 additions and 79 deletions


@ -1,3 +1,7 @@
# Shared (applies to both control plane and host agent)
WRENN_DIR=/var/lib/wrenn
LOG_LEVEL=info
# Database # Database
DATABASE_URL=postgres://wrenn:wrenn@localhost:5432/wrenn?sslmode=disable DATABASE_URL=postgres://wrenn:wrenn@localhost:5432/wrenn?sslmode=disable
@ -9,7 +13,6 @@ WRENN_CP_LISTEN_ADDR=:9725
# Host Agent # Host Agent
WRENN_HOST_LISTEN_ADDR=:50051 WRENN_HOST_LISTEN_ADDR=:50051
WRENN_DIR=/var/lib/wrenn
WRENN_HOST_INTERFACE=eth0 WRENN_HOST_INTERFACE=eth0
WRENN_CP_URL=http://localhost:9725 WRENN_CP_URL=http://localhost:9725
WRENN_DEFAULT_ROOTFS_SIZE=5Gi WRENN_DEFAULT_ROOTFS_SIZE=5Gi


@ -12,10 +12,10 @@ All commands go through the Makefile. Never use raw `go build` or `go run`.
```bash ```bash
make build # Build all binaries → builds/ make build # Build all binaries → builds/
make build-cp # Control plane only (builds frontend first) make build-cp # Control plane only
make build-agent # Host agent only make build-agent # Host agent only
make build-envd # envd static binary (verified statically linked) make build-envd # envd static binary (verified statically linked)
make build-frontend # SvelteKit dashboard → internal/dashboard/static/ make build-frontend # SvelteKit dashboard → frontend/build/ (served by Caddy)
make dev # Full local dev: infra + migrate + control plane make dev # Full local dev: infra + migrate + control plane
make dev-infra # Start PostgreSQL + Prometheus + Grafana (Docker) make dev-infra # Start PostgreSQL + Prometheus + Grafana (Docker)
@ -55,7 +55,7 @@ User SDK → HTTPS/WS → Control Plane → Connect RPC → Host Agent → HTTP/
| Binary | Module | Entry point | Runs as | | Binary | Module | Entry point | Runs as |
|--------|--------|-------------|---------| |--------|--------|-------------|---------|
| wrenn-cp | `git.omukk.dev/wrenn/wrenn` | `cmd/control-plane/main.go` | Unprivileged | | wrenn-cp | `git.omukk.dev/wrenn/wrenn` | `cmd/control-plane/main.go` | Unprivileged |
| wrenn-agent | `git.omukk.dev/wrenn/wrenn` | `cmd/host-agent/main.go` | Root (NET_ADMIN + /dev/kvm) | | wrenn-agent | `git.omukk.dev/wrenn/wrenn` | `cmd/host-agent/main.go` | `wrenn` user with capabilities (SYS_ADMIN, NET_ADMIN, NET_RAW, SYS_PTRACE, KILL, DAC_OVERRIDE, MKNOD) via setcap; also accepts root |
| envd | `git.omukk.dev/wrenn/wrenn/envd` (standalone `envd/go.mod`) | `envd/main.go` | PID 1 inside guest VM | | envd | `git.omukk.dev/wrenn/wrenn/envd` (standalone `envd/go.mod`) | `envd/main.go` | PID 1 inside guest VM |
envd is a **completely independent Go module**. It is never imported by the main module. The only connection is the protobuf contract. It compiles to a static binary baked into rootfs images. envd is a **completely independent Go module**. It is never imported by the main module. The only connection is the protobuf contract. It compiles to a static binary baked into rootfs images.
@ -64,7 +64,7 @@ envd is a **completely independent Go module**. It is never imported by the main
### Control Plane ### Control Plane
**Internal packages:** `internal/api/`, `internal/dashboard/`, `internal/email/` **Internal packages:** `internal/api/`, `internal/email/`
**Public packages (importable by cloud repo):** `pkg/config/`, `pkg/db/`, `pkg/auth/`, `pkg/auth/oauth/`, `pkg/scheduler/`, `pkg/lifecycle/`, `pkg/channels/`, `pkg/audit/`, `pkg/service/`, `pkg/events/`, `pkg/id/`, `pkg/validate/` **Public packages (importable by cloud repo):** `pkg/config/`, `pkg/db/`, `pkg/auth/`, `pkg/auth/oauth/`, `pkg/scheduler/`, `pkg/lifecycle/`, `pkg/channels/`, `pkg/audit/`, `pkg/service/`, `pkg/events/`, `pkg/id/`, `pkg/validate/`
@ -78,7 +78,7 @@ Startup (`cmd/control-plane/main.go`) is a thin wrapper: `cpserver.Run(cpserver.
- **API Server** (`internal/api/server.go`): chi router with middleware. Creates handler structs (`sandboxHandler`, `execHandler`, `filesHandler`, etc.) injected with `db.Queries` and the host agent Connect RPC client. Routes under `/v1/capsules/*`. Accepts `[]cpextension.Extension` — each extension's `RegisterRoutes()` is called after all core routes are registered. - **API Server** (`internal/api/server.go`): chi router with middleware. Creates handler structs (`sandboxHandler`, `execHandler`, `filesHandler`, etc.) injected with `db.Queries` and the host agent Connect RPC client. Routes under `/v1/capsules/*`. Accepts `[]cpextension.Extension` — each extension's `RegisterRoutes()` is called after all core routes are registered.
- **Reconciler** (`internal/api/reconciler.go`): background goroutine (every 30s) that compares DB records against `agent.ListSandboxes()` RPC. Marks orphaned DB entries as "stopped". - **Reconciler** (`internal/api/reconciler.go`): background goroutine (every 30s) that compares DB records against `agent.ListSandboxes()` RPC. Marks orphaned DB entries as "stopped".
- **Dashboard** (SvelteKit + Tailwind + Bits UI, statically built and embedded via `go:embed`, served as catch-all at root) - **Dashboard** (SvelteKit + Tailwind + Bits UI, built to static files in `frontend/build/`, served by Caddy as a reverse proxy)
- **Database**: PostgreSQL via pgx/v5. Queries generated by sqlc from `db/queries/*.sql` → `pkg/db/`. Migrations in `db/migrations/` (goose, plain SQL). `db/migrations/embed.go` exposes `migrations.FS` so the cloud repo can run OSS migrations via `go:embed`. - **Database**: PostgreSQL via pgx/v5. Queries generated by sqlc from `db/queries/*.sql` → `pkg/db/`. Migrations in `db/migrations/` (goose, plain SQL). `db/migrations/embed.go` exposes `migrations.FS` so the cloud repo can run OSS migrations via `go:embed`.
- **Config** (`pkg/config/config.go`): purely environment variables (`DATABASE_URL`, `CP_LISTEN_ADDR`, `CP_HOST_AGENT_ADDR`), no YAML/file config. - **Config** (`pkg/config/config.go`): purely environment variables (`DATABASE_URL`, `CP_LISTEN_ADDR`, `CP_HOST_AGENT_ADDR`), no YAML/file config.
@ -86,7 +86,9 @@ Startup (`cmd/control-plane/main.go`) is a thin wrapper: `cpserver.Run(cpserver.
**Packages:** `internal/hostagent/`, `internal/sandbox/`, `internal/vm/`, `internal/network/`, `internal/devicemapper/`, `internal/envdclient/`, `internal/snapshot/` **Packages:** `internal/hostagent/`, `internal/sandbox/`, `internal/vm/`, `internal/network/`, `internal/devicemapper/`, `internal/envdclient/`, `internal/snapshot/`
Startup (`cmd/host-agent/main.go`) wires: root check → enable IP forwarding → clean up stale dm devices → `sandbox.Manager` (containing `vm.Manager` + `network.SlotAllocator` + `devicemapper.LoopRegistry`) → `hostagent.Server` (Connect RPC handler) → HTTP server. **Production deployment:** `scripts/prepare-wrenn-user.sh` creates the `wrenn` system user, sets Linux capabilities (setcap) on wrenn-agent and all child binaries (iptables, losetup, dmsetup, etc.), installs an apt hook to restore capabilities after package updates, configures udev rules for `/dev/net/tun`, loads required kernel modules, and writes systemd unit files for both services. No sudo grants — all privilege is via capabilities.
Startup (`cmd/host-agent/main.go`) wires: root/capabilities check → enable IP forwarding → clean up stale dm devices → `sandbox.Manager` (containing `vm.Manager` + `network.SlotAllocator` + `devicemapper.LoopRegistry`) → `hostagent.Server` (Connect RPC handler) → HTTP server.
- **RPC Server** (`internal/hostagent/server.go`): implements `hostagentv1connect.HostAgentServiceHandler`. Thin wrapper — every method delegates to `sandbox.Manager`. Maps Connect error codes on return. - **RPC Server** (`internal/hostagent/server.go`): implements `hostagentv1connect.HostAgentServiceHandler`. Thin wrapper — every method delegates to `sandbox.Manager`. Maps Connect error codes on return.
- **Sandbox Manager** (`internal/sandbox/manager.go`): the core orchestration layer. Maintains in-memory state in `boxes map[string]*sandboxState` (protected by `sync.RWMutex`). Each `sandboxState` holds a `models.Sandbox`, a `*network.Slot`, and an `*envdclient.Client`. Runs a TTL reaper (every 10s) that auto-destroys timed-out sandboxes. - **Sandbox Manager** (`internal/sandbox/manager.go`): the core orchestration layer. Maintains in-memory state in `boxes map[string]*sandboxState` (protected by `sync.RWMutex`). Each `sandboxState` holds a `models.Sandbox`, a `*network.Slot`, and an `*envdclient.Client`. Runs a TTL reaper (every 10s) that auto-destroys timed-out sandboxes.
@ -113,8 +115,8 @@ Runs as PID 1 inside the microVM via `wrenn-init.sh` (mounts procfs/sysfs/dev, s
- **Package manager**: pnpm - **Package manager**: pnpm
- **Routing**: SvelteKit file-based routing under `frontend/src/routes/` - **Routing**: SvelteKit file-based routing under `frontend/src/routes/`
- **Routing layout**: `/login` and `/signup` at root, authenticated pages under `/dashboard/*` (e.g. `/dashboard/capsules`, `/dashboard/keys`) - **Routing layout**: `/login` and `/signup` at root, authenticated pages under `/dashboard/*` (e.g. `/dashboard/capsules`, `/dashboard/keys`)
- **Build output**: `frontend/build/` → copied to `internal/dashboard/static/` → embedded via `go:embed` into the control plane binary - **Build output**: `frontend/build/` — static files served by Caddy
- **Serving**: `internal/dashboard/dashboard.go` registers a `NotFound` catch-all SPA handler with fallback to `index.html`. API routes (`/v1/*`, `/openapi.yaml`, `/docs`) are registered first and take priority - **Serving**: Caddy reverse-proxies API requests to the control plane and serves the SvelteKit SPA directly. The control plane does not serve frontend assets.
- **Dev workflow**: `make dev-frontend` runs Vite dev server on port 5173 with HMR. API calls proxy to `http://localhost:8000` - **Dev workflow**: `make dev-frontend` runs Vite dev server on port 5173 with HMR. API calls proxy to `http://localhost:8000`
- **Fonts**: Manrope (UI), Instrument Serif (headings), JetBrains Mono (code), Alice (brand wordmark) — all self-hosted via `@fontsource` - **Fonts**: Manrope (UI), Instrument Serif (headings), JetBrains Mono (code), Alice (brand wordmark) — all self-hosted via `@fontsource`
- **Dark mode**: class-based (`.dark` on `<html>`) with system preference detection + localStorage persistence - **Dark mode**: class-based (`.dark` on `<html>`) with system preference detection + localStorage persistence
@ -209,7 +211,7 @@ To add a new query: add it to the appropriate `.sql` file in `db/queries/` → `
- **TAP networking** (not vsock) for host-to-envd communication - **TAP networking** (not vsock) for host-to-envd communication
- **Device-mapper snapshots** for rootfs CoW — shared read-only loop device per base template, per-sandbox sparse CoW file, Firecracker gets `/dev/mapper/wrenn-{id}` - **Device-mapper snapshots** for rootfs CoW — shared read-only loop device per base template, per-sandbox sparse CoW file, Firecracker gets `/dev/mapper/wrenn-{id}`
- **PostgreSQL** via pgx/v5 + sqlc (type-safe query generation). Goose for migrations (plain SQL, up/down) - **PostgreSQL** via pgx/v5 + sqlc (type-safe query generation). Goose for migrations (plain SQL, up/down)
- **Dashboard**: SvelteKit (Svelte 5, adapter-static) + Tailwind CSS v4 + Bits UI. Built to static files, embedded into the Go binary via `go:embed`, served as catch-all at root - **Dashboard**: SvelteKit (Svelte 5, adapter-static) + Tailwind CSS v4 + Bits UI. Built to static files in `frontend/build/`, served by Caddy (not embedded in the Go binary)
- **Lago** for billing (external service, not in this codebase) - **Lago** for billing (external service, not in this codebase)
## Coding Conventions ## Coding Conventions

133
README.md

@ -2,16 +2,16 @@
Secure infrastructure for AI Secure infrastructure for AI
## Deployment ## Prerequisites
### Prerequisites
- Linux host with `/dev/kvm` access (bare metal or nested virt) - Linux host with `/dev/kvm` access (bare metal or nested virt)
- Firecracker binary at `/usr/local/bin/firecracker` - Firecracker binary at `/usr/local/bin/firecracker`
- PostgreSQL - PostgreSQL
- Go 1.25+ - Go 1.25+
- pnpm (for frontend)
- Docker (for dev infra and rootfs builds)
### Build ## Build
```bash ```bash
make build # outputs to builds/ make build # outputs to builds/
@ -19,30 +19,77 @@ make build # outputs to builds/
Produces three binaries: `wrenn-cp` (control plane), `wrenn-agent` (host agent), `envd` (guest agent). Produces three binaries: `wrenn-cp` (control plane), `wrenn-agent` (host agent), `envd` (guest agent).
### Host setup ## Host setup
The host agent machine needs: The host agent needs a kernel, a minimal rootfs image, and working directories on the host machine.
```bash ### Directory structure
# Kernel for guest VMs
mkdir -p /var/lib/wrenn/kernels
# Place a vmlinux kernel at /var/lib/wrenn/kernels/vmlinux
# Rootfs images ```
mkdir -p /var/lib/wrenn/images /var/lib/wrenn/
# Build or place .ext4 rootfs images (e.g., minimal.ext4) ├── kernels/
│ └── vmlinux # uncompressed Linux kernel (not bzImage)
# Sandbox working directory ├── images/
mkdir -p /var/lib/wrenn/sandboxes │ └── minimal/
│ └── rootfs.ext4 # base rootfs (all other templates snapshot from this)
# Snapshots directory ├── sandboxes/ # per-sandbox CoW files (created at runtime)
mkdir -p /var/lib/wrenn/snapshots └── snapshots/ # pause/hibernate snapshot files (created at runtime)
# Enable IP forwarding
sysctl -w net.ipv4.ip_forward=1
``` ```
### Configure Create the directories:
```bash
sudo mkdir -p /var/lib/wrenn/{kernels,images/minimal,sandboxes,snapshots}
```
### Kernel
Place an uncompressed `vmlinux` kernel at `/var/lib/wrenn/kernels/vmlinux`. Versioned kernels (`vmlinux-{semver}`) are also supported — the agent picks the latest by semver.
### Minimal rootfs
The minimal rootfs is the base image that all other templates (Python, Node, etc.) are built on top of via device-mapper snapshots. It must contain:
| Package | Why |
|---------|-----|
| `socat` | Bidirectional relay for port forwarding |
| `chrony` | Time sync from KVM PTP clock (`/dev/ptp0`) |
| `tini` | PID 1 zombie reaper (injected by build script, not apt) |
| `sudo` | User privilege management inside the guest |
| `wget` | HTTP fetching |
| `curl` | HTTP client |
| `ca-certificates` | TLS certificate verification |
**To build a rootfs from a Docker container:**
1. Create and configure a container with the required packages:
```bash
docker run -it --name wrenn-minimal debian:bookworm bash
# Inside the container:
apt update && apt install -y socat chrony sudo wget curl ca-certificates
exit
```
2. Export to a rootfs image (builds envd, injects wrenn-init + tini, shrinks to minimum size):
```bash
sudo bash scripts/rootfs-from-container.sh wrenn-minimal minimal
```
**To update an existing rootfs** after changing envd or `wrenn-init.sh`:
```bash
bash scripts/update-minimal-rootfs.sh
```
This rebuilds envd via `make build-envd` and copies the fresh binaries into the mounted rootfs image.
### IP forwarding
```bash
sudo sysctl -w net.ipv4.ip_forward=1
```
## Configure
Copy `.env.example` to `.env` and edit: Copy `.env.example` to `.env` and edit:
@ -59,25 +106,21 @@ WRENN_HOST_LISTEN_ADDR=:50051
WRENN_DIR=/var/lib/wrenn WRENN_DIR=/var/lib/wrenn
``` ```
### Run ## Development
```bash ```bash
# Apply database migrations make dev # Start PostgreSQL (Docker), run migrations, start control plane
make migrate-up make dev-agent # Start host agent (separate terminal, sudo)
make dev-frontend # Vite dev server with HMR (port 5173)
# Start control plane make check # fmt + vet + lint + test
./builds/wrenn-cp
``` ```
Control plane listens on `WRENN_CP_LISTEN_ADDR` (default `:8000`).
### Host registration ### Host registration
Hosts must be registered with the control plane before they can serve sandboxes. Hosts must be registered with the control plane before they can serve sandboxes.
1. **Create a host record** (via API or dashboard): 1. **Create a host record** (via API or dashboard):
```bash ```bash
# As an admin (JWT auth)
curl -X POST http://localhost:8000/v1/hosts \ curl -X POST http://localhost:8000/v1/hosts \
-H "Authorization: Bearer $JWT_TOKEN" \ -H "Authorization: Bearer $JWT_TOKEN" \
-H "Content-Type: application/json" \ -H "Content-Type: application/json" \
@ -87,17 +130,16 @@ Hosts must be registered with the control plane before they can serve sandboxes.
2. **Start the host agent** with the registration token and its externally-reachable address: 2. **Start the host agent** with the registration token and its externally-reachable address:
```bash ```bash
sudo WRENN_CP_URL=http://cp-host:8000 \ sudo WRENN_CP_URL=http://localhost:8000 \
./builds/wrenn-agent \ ./builds/wrenn-agent \
--register <token-from-step-1> \ --register <token-from-step-1> \
--address 10.0.1.5:50051 --address <host-ip>:50051
``` ```
On first startup the agent sends its specs (arch, CPU, memory, disk) to the control plane, receives a long-lived host JWT, and saves it to `$WRENN_DIR/host-token`. On first startup the agent sends its specs (arch, CPU, memory, disk) to the control plane, receives a long-lived host JWT, and saves it to `$WRENN_DIR/host-token`.
3. **Subsequent startups** don't need `--register` — the agent loads the saved JWT automatically: 3. **Subsequent startups** don't need `--register` — the agent loads the saved JWT automatically:
```bash ```bash
sudo WRENN_CP_URL=http://cp-host:8000 \ sudo ./builds/wrenn-agent --address <host-ip>:50051
./builds/wrenn-agent --address 10.0.1.5:50051
``` ```
4. **If registration fails** (e.g., network error after token was consumed), regenerate a token: 4. **If registration fails** (e.g., network error after token was consumed), regenerate a token:
@ -107,23 +149,6 @@ Hosts must be registered with the control plane before they can serve sandboxes.
``` ```
Then restart the agent with the new token. Then restart the agent with the new token.
The agent sends heartbeats to the control plane every 30 seconds. Host agent listens on `WRENN_HOST_LISTEN_ADDR` (default `:50051`). The agent sends heartbeats to the control plane every 30 seconds.
### Rootfs images
envd must be baked into every rootfs image. After building:
```bash
make build-envd
bash scripts/update-debug-rootfs.sh /var/lib/wrenn/images/minimal.ext4
```
## Development
```bash
make dev # Start PostgreSQL (Docker), run migrations, start control plane
make dev-agent # Start host agent (separate terminal, sudo)
make check # fmt + vet + lint + test
```
See `CLAUDE.md` for full architecture documentation. See `CLAUDE.md` for full architecture documentation.


@ -1,14 +1,18 @@
package main package main
import ( import (
"bufio"
"context" "context"
"crypto/tls" "crypto/tls"
"flag" "flag"
"fmt"
"log/slog" "log/slog"
"net/http" "net/http"
"os" "os"
"os/signal" "os/signal"
"path/filepath" "path/filepath"
"strconv"
"strings"
"sync" "sync"
"syscall" "syscall"
"time" "time"
@ -21,6 +25,7 @@ import (
"git.omukk.dev/wrenn/wrenn/internal/network" "git.omukk.dev/wrenn/wrenn/internal/network"
"git.omukk.dev/wrenn/wrenn/internal/sandbox" "git.omukk.dev/wrenn/wrenn/internal/sandbox"
"git.omukk.dev/wrenn/wrenn/pkg/auth" "git.omukk.dev/wrenn/wrenn/pkg/auth"
"git.omukk.dev/wrenn/wrenn/pkg/logging"
"git.omukk.dev/wrenn/wrenn/proto/hostagent/gen/hostagentv1connect" "git.omukk.dev/wrenn/wrenn/proto/hostagent/gen/hostagentv1connect"
) )
@ -38,18 +43,24 @@ func main() {
advertiseAddr := flag.String("address", "", "Externally-reachable address (ip:port) for this host agent") advertiseAddr := flag.String("address", "", "Externally-reachable address (ip:port) for this host agent")
flag.Parse() flag.Parse()
slog.SetDefault(slog.New(slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{ rootDir := envOrDefault("WRENN_DIR", "/var/lib/wrenn")
Level: slog.LevelDebug, cleanupLog := logging.Setup(filepath.Join(rootDir, "logs"), "host-agent")
}))) defer cleanupLog()
if os.Geteuid() != 0 { if err := checkPrivileges(); err != nil {
slog.Error("host agent must run as root") slog.Error("insufficient privileges", "error", err)
os.Exit(1) os.Exit(1)
} }
// Enable IP forwarding (required for NAT). // Enable IP forwarding (required for NAT). The write may fail if running
// as non-root without DAC_OVERRIDE on this path — that's OK if the systemd
// unit's ExecStartPre already set it. We verify the value regardless.
if err := os.WriteFile("/proc/sys/net/ipv4/ip_forward", []byte("1"), 0644); err != nil { if err := os.WriteFile("/proc/sys/net/ipv4/ip_forward", []byte("1"), 0644); err != nil {
slog.Warn("failed to enable ip_forward", "error", err) slog.Warn("failed to enable ip_forward (may have been set by systemd unit)", "error", err)
}
if b, err := os.ReadFile("/proc/sys/net/ipv4/ip_forward"); err != nil || strings.TrimSpace(string(b)) != "1" {
slog.Error("ip_forward is not enabled — sandbox networking will be broken", "error", err)
os.Exit(1)
} }
// Clean up stale resources from a previous crash. // Clean up stale resources from a previous crash.
@ -57,7 +68,6 @@ func main() {
network.CleanupStaleNamespaces() network.CleanupStaleNamespaces()
listenAddr := envOrDefault("WRENN_HOST_LISTEN_ADDR", ":50051") listenAddr := envOrDefault("WRENN_HOST_LISTEN_ADDR", ":50051")
rootDir := envOrDefault("WRENN_DIR", "/var/lib/wrenn")
cpURL := os.Getenv("WRENN_CP_URL") cpURL := os.Getenv("WRENN_CP_URL")
credsFile := filepath.Join(rootDir, "host-credentials.json") credsFile := filepath.Join(rootDir, "host-credentials.json")
@ -170,6 +180,7 @@ func main() {
shutdownCtx, shutdownCancel := context.WithTimeout(context.Background(), 30*time.Second) shutdownCtx, shutdownCancel := context.WithTimeout(context.Background(), 30*time.Second)
defer shutdownCancel() defer shutdownCancel()
mgr.Shutdown(shutdownCtx) mgr.Shutdown(shutdownCtx)
sandbox.ShrinkMinimalImage(rootDir)
if err := httpServer.Shutdown(shutdownCtx); err != nil { if err := httpServer.Shutdown(shutdownCtx); err != nil {
slog.Error("http server shutdown error", "error", err) slog.Error("http server shutdown error", "error", err)
} }
@ -245,3 +256,63 @@ func envOrDefault(key, def string) string {
} }
return def return def
} }
// checkPrivileges verifies the process has the required Linux capabilities.
// Always reads CapEff — even for root — because a root process inside a
// restricted container (e.g. docker --cap-drop=all) may not have all caps.
func checkPrivileges() error {
capEff, err := readEffectiveCaps()
if err != nil {
return fmt.Errorf("read capabilities: %w", err)
}
// All capabilities required by the host agent at runtime.
required := []struct {
bit uint
name string
}{
{1, "CAP_DAC_OVERRIDE"}, // /dev/loop*, /dev/mapper/*, /dev/net/tun
{5, "CAP_KILL"}, // SIGTERM/SIGKILL to Firecracker processes
{12, "CAP_NET_ADMIN"}, // netlink, iptables, routing, TAP/veth
{13, "CAP_NET_RAW"}, // raw sockets (iptables)
{19, "CAP_SYS_PTRACE"}, // reading /proc/self/ns/net (netns.Get)
{21, "CAP_SYS_ADMIN"}, // netns, mount ns, losetup, dmsetup
{27, "CAP_MKNOD"}, // device-mapper node creation
}
var missing []string
for _, cap := range required {
if capEff&(1<<cap.bit) == 0 {
missing = append(missing, cap.name)
}
}
if len(missing) > 0 {
return fmt.Errorf("missing capabilities: %s — run as root or apply setcap to the binary",
strings.Join(missing, ", "))
}
return nil
}
// readEffectiveCaps parses the CapEff bitmask from /proc/self/status.
func readEffectiveCaps() (uint64, error) {
f, err := os.Open("/proc/self/status")
if err != nil {
return 0, err
}
defer f.Close()
scanner := bufio.NewScanner(f)
for scanner.Scan() {
line := scanner.Text()
if hexStr, ok := strings.CutPrefix(line, "CapEff:"); ok {
return strconv.ParseUint(strings.TrimSpace(hexStr), 16, 64)
}
}
if err := scanner.Err(); err != nil {
return 0, fmt.Errorf("read /proc/self/status: %w", err)
}
return 0, fmt.Errorf("CapEff not found in /proc/self/status")
}
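
Review note: the bitmask test in `checkPrivileges` can be exercised without touching `/proc`. The following is a minimal sketch, assuming a hypothetical helper `missingCaps` and a hypothetical `capability` type (neither is part of the diff) that take the hexadecimal `CapEff` string directly. The bit indices are the ones listed in the diff.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// capability pairs a Linux capability bit index with its name.
// (Hypothetical type, introduced only for this sketch.)
type capability struct {
	bit  uint
	name string
}

// missingCaps returns the names of the required capabilities whose bits
// are not set in the hexadecimal CapEff mask.
func missingCaps(capEffHex string, required []capability) ([]string, error) {
	mask, err := strconv.ParseUint(strings.TrimSpace(capEffHex), 16, 64)
	if err != nil {
		return nil, fmt.Errorf("parse CapEff: %w", err)
	}
	var missing []string
	for _, c := range required {
		if mask&(uint64(1)<<c.bit) == 0 {
			missing = append(missing, c.name)
		}
	}
	return missing, nil
}

func main() {
	required := []capability{
		{12, "CAP_NET_ADMIN"},
		{21, "CAP_SYS_ADMIN"},
	}
	// 0x1000 has only bit 12 set, so CAP_SYS_ADMIN is reported missing.
	missing, err := missingCaps("0000000000001000", required)
	fmt.Println(missing, err)
}
```

Taking the mask as a string keeps the parsing and the bit test in one pure function, which may make it easier to unit-test than a function that opens `/proc/self/status` itself.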

19
deploy/logrotate/wrenn Normal file

@ -0,0 +1,19 @@
/var/lib/wrenn/logs/control-plane.log
/var/lib/wrenn/logs/host-agent.log
{
daily
rotate 3
missingok
notifempty
dateext
dateformat -%Y-%m-%d
compress
delaycompress
sharedscripts
postrotate
# Signal the processes to reopen their log files.
# Use SIGHUP — both binaries handle it gracefully.
pkill -HUP -f wrenn-cp || true
pkill -HUP -f wrenn-agent || true
endscript
}
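
Review note: this stanza has no `copytruncate`, so it relies on the process reopening its log file after the rename. The sketch below is an illustration of that behaviour on Linux, not the code from `pkg/logging`. The function name `rotateDemo` and the file names are assumptions. It shows that writes keep landing in the renamed file until the descriptor is closed and reopened at the original path.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// rotateDemo writes one line before a rename, one after the rename but
// before reopening, and one after reopening. It returns the contents of
// the rotated file and of the fresh file. (Hypothetical helper.)
func rotateDemo() (rotated, fresh string, err error) {
	dir, err := os.MkdirTemp("", "rotate-demo")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "app.log")
	old := path + ".1"
	const flags = os.O_CREATE | os.O_APPEND | os.O_WRONLY

	f, err := os.OpenFile(path, flags, 0o640)
	if err != nil {
		return "", "", err
	}
	fmt.Fprintln(f, "before rotate")

	// What logrotate does without copytruncate: rename the live file.
	if err := os.Rename(path, old); err != nil {
		f.Close()
		return "", "", err
	}
	// The open descriptor still refers to the renamed file.
	fmt.Fprintln(f, "after rename, before reopen")

	// What a SIGHUP handler would do: close, then reopen the same path.
	f.Close()
	f, err = os.OpenFile(path, flags, 0o640)
	if err != nil {
		return "", "", err
	}
	defer f.Close()
	fmt.Fprintln(f, "after reopen")

	a, err := os.ReadFile(old)
	if err != nil {
		return "", "", err
	}
	b, err := os.ReadFile(path)
	if err != nil {
		return "", "", err
	}
	return string(a), string(b), nil
}

func main() {
	rotated, fresh, err := rotateDemo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("rotated file:\n%sfresh file:\n%s", rotated, fresh)
}
```

If the SIGHUP never arrives (for example because `pkill -f` matches nothing), the process would keep appending to the rotated file, which is the failure mode worth watching for with this configuration.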


@ -0,0 +1 @@
export const prerender = false;


@ -28,6 +28,7 @@ var openapiYAML []byte
type Server struct { type Server struct {
router chi.Router router chi.Router
BuildSvc *service.BuildService BuildSvc *service.BuildService
version string
} }
// New constructs the chi router and registers all routes. // New constructs the chi router and registers all routes.
@ -48,6 +49,7 @@ func New(
mailer email.Mailer, mailer email.Mailer,
extensions []cpextension.Extension, extensions []cpextension.Extension,
sctx cpextension.ServerContext, sctx cpextension.ServerContext,
version string,
) *Server { ) *Server {
r := chi.NewRouter() r := chi.NewRouter()
r.Use(requestLogger()) r.Use(requestLogger())
@ -86,6 +88,12 @@ func New(
adminCapsules := newAdminCapsuleHandler(sandboxSvc, queries, pool, al) adminCapsules := newAdminCapsuleHandler(sandboxSvc, queries, pool, al)
meH := newMeHandler(queries, pgPool, rdb, jwtSecret, mailer, oauthRegistry, oauthRedirectURL, teamSvc) meH := newMeHandler(queries, pgPool, rdb, jwtSecret, mailer, oauthRegistry, oauthRedirectURL, teamSvc)
// Health check.
r.Get("/health", func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
fmt.Fprintf(w, `{"status":"ok","version":%q}`, version)
})
// OpenAPI spec and docs. // OpenAPI spec and docs.
r.Get("/openapi.yaml", serveOpenAPI) r.Get("/openapi.yaml", serveOpenAPI)
r.Get("/docs", serveDocs) r.Get("/docs", serveDocs)
@ -270,7 +278,7 @@ func New(
ext.RegisterRoutes(r, sctx) ext.RegisterRoutes(r, sctx)
} }
return &Server{router: r, BuildSvc: buildSvc} return &Server{router: r, BuildSvc: buildSvc, version: version}
} }
// Handler returns the HTTP handler. // Handler returns the HTTP handler.
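
Review note: the new `/health` route can be checked in-process with `net/http/httptest`. This is a minimal sketch, assuming a standalone `healthHandler` constructor and a `probe` helper (both hypothetical; the diff registers the handler inline on the chi router). The version string `v0.1.0` in `main` is a placeholder.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// healthHandler mirrors the response shape of the /health route in the
// diff: a fixed status plus the build version.
func healthHandler(version string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprintf(w, `{"status":"ok","version":%q}`, version)
	}
}

// probe calls the handler in-process and decodes the JSON body.
func probe(version string) (status, got string, err error) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/health", nil)
	healthHandler(version)(rec, req)

	var body struct {
		Status  string `json:"status"`
		Version string `json:"version"`
	}
	if err := json.NewDecoder(rec.Body).Decode(&body); err != nil {
		return "", "", err
	}
	return body.Status, body.Version, nil
}

func main() {
	status, version, err := probe("v0.1.0") // placeholder version
	fmt.Println(status, version, err)
}
```

One caveat: `%q` produces a Go string literal, which matches JSON for plain ASCII version strings but can diverge for unusual characters. Encoding the body with `encoding/json` would avoid that edge case if versions are ever not plain ASCII.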


@ -24,7 +24,7 @@ func (a *SlotAllocator) Allocate() (int, error) {
a.mu.Lock() a.mu.Lock()
defer a.mu.Unlock() defer a.mu.Unlock()
for i := 1; i <= 65534; i++ { for i := 1; i <= 32767; i++ {
if !a.inUse[i] { if !a.inUse[i] {
a.inUse[i] = true a.inUse[i] = true
return i, nil return i, nil
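
Review note: the hunk only changes the upper bound of the scan. The sketch below is a self-contained approximation of a first-free-slot allocator with a configurable bound, so the exhaustion path can be tested with a small limit. The type `slotAllocator`, the `Release` method, the constructor, and the error text are assumptions and not taken from the diff.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// slotAllocator hands out the lowest free slot index in [1, max].
// (Hypothetical stand-in for the SlotAllocator touched by the diff.)
type slotAllocator struct {
	mu    sync.Mutex
	inUse map[int]bool
	max   int
}

func newSlotAllocator(max int) *slotAllocator {
	return &slotAllocator{inUse: make(map[int]bool), max: max}
}

// Allocate returns the lowest unused slot or an error when all are taken.
func (a *slotAllocator) Allocate() (int, error) {
	a.mu.Lock()
	defer a.mu.Unlock()

	for i := 1; i <= a.max; i++ {
		if !a.inUse[i] {
			a.inUse[i] = true
			return i, nil
		}
	}
	return 0, errors.New("no free slots")
}

// Release marks a slot as free again.
func (a *slotAllocator) Release(i int) {
	a.mu.Lock()
	defer a.mu.Unlock()
	delete(a.inUse, i)
}

func main() {
	a := newSlotAllocator(32767) // bound used by the diff
	first, _ := a.Allocate()
	second, _ := a.Allocate()
	a.Release(first)
	reused, _ := a.Allocate()
	fmt.Println(first, second, reused)
}
```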


@ -104,6 +104,37 @@ func ParseSizeToMB(s string) (int, error) {
} }
} }
// ShrinkMinimalImage shrinks the built-in minimal rootfs back to its minimum
// size using resize2fs -M. This is the inverse of EnsureImageSizes and should
// be called during graceful shutdown so the image is stored compactly on disk.
func ShrinkMinimalImage(wrennDir string) {
minimalRootfs := layout.TemplateRootfs(wrennDir, id.PlatformTeamID, id.MinimalTemplateID)
shrinkImage(minimalRootfs)
}
// shrinkImage shrinks a single rootfs image to its minimum size.
func shrinkImage(rootfs string) {
if _, err := os.Stat(rootfs); err != nil {
return
}
slog.Info("shrinking base image", "path", rootfs)
if out, err := exec.Command("e2fsck", "-fy", rootfs).CombinedOutput(); err != nil {
if exitErr, ok := err.(*exec.ExitError); ok && exitErr.ExitCode() > 1 {
slog.Warn("e2fsck before shrink failed", "path", rootfs, "output", string(out), "error", err)
return
}
}
if out, err := exec.Command("resize2fs", "-M", rootfs).CombinedOutput(); err != nil {
slog.Warn("resize2fs -M failed", "path", rootfs, "output", string(out), "error", err)
return
}
slog.Info("base image shrunk", "path", rootfs)
}
// expandImage expands a single rootfs image if it is smaller than targetBytes. // expandImage expands a single rootfs image if it is smaller than targetBytes.
func expandImage(rootfs string, targetBytes int64, targetMB int) error { func expandImage(rootfs string, targetBytes int64, targetMB int) error {
info, err := os.Stat(rootfs) info, err := os.Stat(rootfs)
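
Review note: `shrinkImage` treats any e2fsck exit status above 1 as a reason to skip the resize. The e2fsck man page documents the status as a sum of flag values, so a small decoder can make the log output more readable. The sketch below is illustrative; `describeFsckExit` and `safeToShrink` are hypothetical names. Unlike the `> 1` test in the diff, `safeToShrink` also rejects negative codes, which `ExitCode()` returns when the process was terminated by a signal.

```go
package main

import "fmt"

// fsckFlags lists the exit status bits documented in the e2fsck man page.
var fsckFlags = []struct {
	bit     int
	meaning string
}{
	{1, "file system errors corrected"},
	{2, "file system errors corrected, system should be rebooted"},
	{4, "file system errors left uncorrected"},
	{8, "operational error"},
	{16, "usage or syntax error"},
	{32, "canceled by user request"},
	{128, "shared library error"},
}

// describeFsckExit expands an e2fsck exit status into its meanings.
func describeFsckExit(code int) []string {
	if code == 0 {
		return []string{"no errors"}
	}
	var out []string
	for _, f := range fsckFlags {
		if code&f.bit != 0 {
			out = append(out, f.meaning)
		}
	}
	return out
}

// safeToShrink reports whether a resize may follow: only a clean check (0)
// or a check whose errors were all corrected (1) qualifies.
func safeToShrink(code int) bool {
	return code == 0 || code == 1
}

func main() {
	for _, code := range []int{0, 1, 4, 12} {
		fmt.Println(code, safeToShrink(code), describeFsckExit(code))
	}
}
```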


@ -14,6 +14,7 @@ type Config struct {
RedisURL string RedisURL string
ListenAddr string ListenAddr string
JWTSecret string JWTSecret string
WrennDir string // WRENN_DIR — base directory for wrenn data (logs, etc.)
// mTLS — CP→Agent channel. Both must be set to enable mTLS; omitting either // mTLS — CP→Agent channel. Both must be set to enable mTLS; omitting either
// disables cert issuance and leaves agent connections on plain HTTP (dev mode). // disables cert issuance and leaves agent connections on plain HTTP (dev mode).
@ -48,6 +49,7 @@ func Load() Config {
RedisURL: envOrDefault("REDIS_URL", "redis://localhost:6379/0"), RedisURL: envOrDefault("REDIS_URL", "redis://localhost:6379/0"),
ListenAddr: envOrDefault("WRENN_CP_LISTEN_ADDR", ":8080"), ListenAddr: envOrDefault("WRENN_CP_LISTEN_ADDR", ":8080"),
JWTSecret: os.Getenv("JWT_SECRET"), JWTSecret: os.Getenv("JWT_SECRET"),
WrennDir: envOrDefault("WRENN_DIR", "/var/lib/wrenn"),
CACert: os.Getenv("WRENN_CA_CERT"), CACert: os.Getenv("WRENN_CA_CERT"),
CAKey: os.Getenv("WRENN_CA_KEY"), CAKey: os.Getenv("WRENN_CA_KEY"),
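
Review note: both binaries now derive their log directory from `WRENN_DIR` with a `/var/lib/wrenn` fallback. The sketch below shows that lookup in isolation on a Unix-style path layout. The body of `envOrDefault` is an assumption (the diff shows only its tail), and `logsDir` with an injectable `getenv` parameter is a hypothetical helper added so the fallback can be tested without mutating the process environment.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// envOrDefault returns the value from getenv for key, or def when it is
// empty. (Assumed body; the diff only shows the function's last lines.)
func envOrDefault(getenv func(string) string, key, def string) string {
	if v := getenv(key); v != "" {
		return v
	}
	return def
}

// logsDir resolves the directory that log files would be written to.
func logsDir(getenv func(string) string) string {
	root := envOrDefault(getenv, "WRENN_DIR", "/var/lib/wrenn")
	return filepath.Join(root, "logs")
}

func main() {
	// With the real environment; prints the fallback path when WRENN_DIR
	// is unset.
	fmt.Println(logsDir(os.Getenv))
}
```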


@ -6,6 +6,7 @@ import (
"net/http" "net/http"
"os" "os"
"os/signal" "os/signal"
"path/filepath"
"strings" "strings"
"syscall" "syscall"
"time" "time"
@ -22,6 +23,7 @@ import (
"git.omukk.dev/wrenn/wrenn/pkg/config" "git.omukk.dev/wrenn/wrenn/pkg/config"
"git.omukk.dev/wrenn/wrenn/pkg/db" "git.omukk.dev/wrenn/wrenn/pkg/db"
"git.omukk.dev/wrenn/wrenn/pkg/lifecycle" "git.omukk.dev/wrenn/wrenn/pkg/lifecycle"
"git.omukk.dev/wrenn/wrenn/pkg/logging"
"git.omukk.dev/wrenn/wrenn/pkg/scheduler" "git.omukk.dev/wrenn/wrenn/pkg/scheduler"
) )
@ -39,11 +41,9 @@ func Run(opts ...Option) {
opt(o) opt(o)
} }
slog.SetDefault(slog.New(slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{
Level: slog.LevelDebug,
})))
cfg := config.Load() cfg := config.Load()
cleanupLog := logging.Setup(filepath.Join(cfg.WrennDir, "logs"), "control-plane")
defer cleanupLog()
if len(cfg.JWTSecret) < 32 { if len(cfg.JWTSecret) < 32 {
slog.Error("JWT_SECRET must be at least 32 characters") slog.Error("JWT_SECRET must be at least 32 characters")
@ -175,7 +175,7 @@ func Run(opts ...Option) {
} }
// API server. // API server.
srv := api.New(queries, hostPool, hostScheduler, pool, rdb, []byte(cfg.JWTSecret), oauthRegistry, cfg.OAuthRedirectURL, ca, al, channelSvc, mailer, o.extensions, sctx) srv := api.New(queries, hostPool, hostScheduler, pool, rdb, []byte(cfg.JWTSecret), oauthRegistry, cfg.OAuthRedirectURL, ca, al, channelSvc, mailer, o.extensions, sctx, o.version)
// Start template build workers (2 concurrent). // Start template build workers (2 concurrent).
stopBuildWorkers := srv.BuildSvc.StartWorkers(ctx, 2) stopBuildWorkers := srv.BuildSvc.StartWorkers(ctx, 2)

135
pkg/logging/logging.go Normal file

@ -0,0 +1,135 @@
package logging
import (
"io"
"log/slog"
"os"
"os/signal"
"path/filepath"
"strings"
"sync"
"syscall"
)
// Setup configures the global slog logger with dual output (stderr + rotating
// log file). logsDir is the directory where log files are written. binaryName
// is used as the log filename (e.g. "control-plane" → "control-plane.log").
//
// If logsDir is empty or the directory cannot be created, Setup falls back to
// stderr-only logging and returns a no-op cleanup function.
//
// The returned cleanup function closes the log file and must be deferred.
// Setup also installs a SIGHUP handler that reopens the log file, allowing
// external log rotation tools (e.g. logrotate) to rotate files in place.
func Setup(logsDir, binaryName string) func() {
level := parseLevel(os.Getenv("LOG_LEVEL"))
if logsDir == "" {
slog.SetDefault(slog.New(slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{
Level: level,
})))
return func() {}
}
if err := os.MkdirAll(logsDir, 0750); err != nil {
// Fall back to stderr-only; log the error so operators notice.
slog.SetDefault(slog.New(slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{
Level: level,
})))
slog.Warn("file logging unavailable: failed to create log directory", "dir", logsDir, "error", err)
return func() {}
}
logPath := filepath.Join(logsDir, binaryName+".log")
rf, err := newReopenableFile(logPath)
if err != nil {
slog.SetDefault(slog.New(slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{
Level: level,
})))
slog.Warn("file logging unavailable: failed to open log file", "path", logPath, "error", err)
return func() {}
}
mw := io.MultiWriter(os.Stderr, rf)
slog.SetDefault(slog.New(slog.NewTextHandler(mw, &slog.HandlerOptions{
Level: level,
})))
// SIGHUP reopens the log file so logrotate can rotate in place.
sigCh := make(chan os.Signal, 1)
signal.Notify(sigCh, syscall.SIGHUP)
go func() {
for range sigCh {
if err := rf.Reopen(); err != nil {
slog.Error("failed to reopen log file on SIGHUP", "path", logPath, "error", err)
} else {
slog.Info("log file reopened", "path", logPath)
}
}
}()
return func() {
signal.Stop(sigCh)
close(sigCh)
rf.Close()
}
}
func parseLevel(s string) slog.Level {
switch strings.ToLower(strings.TrimSpace(s)) {
case "debug":
return slog.LevelDebug
case "warn", "warning":
return slog.LevelWarn
case "error":
return slog.LevelError
default:
return slog.LevelInfo
}
}
// reopenableFile is an io.Writer backed by an *os.File that can be atomically
// reopened (for log rotation via SIGHUP). All operations are goroutine-safe.
type reopenableFile struct {
path string
mu sync.Mutex
f *os.File
}
func newReopenableFile(path string) (*reopenableFile, error) {
f, err := os.OpenFile(path, os.O_CREATE|os.O_APPEND|os.O_WRONLY, 0640)
if err != nil {
return nil, err
}
return &reopenableFile{path: path, f: f}, nil
}
func (r *reopenableFile) Write(p []byte) (int, error) {
r.mu.Lock()
defer r.mu.Unlock()
return r.f.Write(p)
}
// Reopen closes the current file and opens a new one at the same path.
// This is the mechanism that makes logrotate's copytruncate-free rotation work:
// logrotate renames the old file, then sends SIGHUP, and the process opens a
// fresh file at the original path.
func (r *reopenableFile) Reopen() error {
r.mu.Lock()
defer r.mu.Unlock()
// Open the new file before closing the old one so a failed open doesn't
// leave the writer in a broken state with a closed fd.
f, err := os.OpenFile(r.path, os.O_CREATE|os.O_APPEND|os.O_WRONLY, 0640)
if err != nil {
return err
}
r.f.Close()
r.f = f
return nil
}
func (r *reopenableFile) Close() error {
r.mu.Lock()
defer r.mu.Unlock()
return r.f.Close()
}
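
A note on pairing the SIGHUP handler with logrotate: the sketch below writes a rotation policy that renames the log and then signals the processes, so that `Reopen` opens a fresh file at the original path. It is a hypothetical example and not part of this PR. The helper name `write_logrotate_conf`, the weekly schedule, the retention count, and the `*.log` glob under `/var/lib/wrenn/logs` are assumptions; the process names are taken from the binaries referenced in the setup script.

```bash
# Sketch only: writes a logrotate policy to the path given as $1.
# Assumed (not in the PR): schedule, retention, and the log glob.
write_logrotate_conf() {
    local dest="$1"
    cat > "${dest}" << 'LOGROTATE'
/var/lib/wrenn/logs/*.log {
    weekly
    rotate 8
    compress
    delaycompress
    missingok
    notifempty
    create 0640 wrenn wrenn
    sharedscripts
    postrotate
        # Ask each process to reopen its log file; ignore "no such process".
        pkill -HUP -x wrenn-agent || true
        pkill -HUP -x wrenn-cp || true
    endscript
}
LOGROTATE
}

# Example (as root): write_logrotate_conf /etc/logrotate.d/wrenn
```

`delaycompress` leaves the most recently rotated file uncompressed until the next cycle, which matters here because the process may keep writing to the renamed file until it has handled the signal.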

385
scripts/prepare-wrenn-user.sh Executable file

@ -0,0 +1,385 @@
#!/usr/bin/env bash
#
# prepare-wrenn-user.sh — Create the wrenn system user and configure minimal privileges.
#
# Creates a locked-down 'wrenn' system user that can run wrenn-agent and wrenn-cp
# with only the privileges they need. The agent binary gets Linux capabilities
# via setcap — no sudo is configured for the wrenn user at all. If an attacker
# compromises the wrenn user, they cannot escalate via sudo.
#
# What this script does:
# 1. Creates the 'wrenn' system user (bash shell for debugging, home set to
#    /var/lib/wrenn) and adds it to the kvm group
# 2. Creates required directories with correct ownership
# 3. Sets Linux capabilities on wrenn-agent, firecracker, and all child binaries
# 4. Installs an apt hook to restore capabilities after package updates
# 5. Installs a udev rule granting the wrenn group access to /dev/net/tun
# 6. Ensures required kernel modules are loaded now and after reboots
# 7. Installs a sudoers drop-in (comment-only, no grants; absence is the cage)
# 8. Writes systemd unit files for both wrenn-agent and wrenn-cp
#
# Usage:
# sudo bash scripts/prepare-wrenn-user.sh
#
# Prerequisites:
# - wrenn-agent binary at /usr/local/bin/wrenn-agent
# - wrenn-cp binary at /usr/local/bin/wrenn-cp
# - firecracker binary at /usr/local/bin/firecracker
# - libcap2-bin installed (for setcap)
set -euo pipefail
# ── Guard ────────────────────────────────────────────────────────────────────
if [[ $EUID -ne 0 ]]; then
echo "ERROR: This script must be run as root."
exit 1
fi
# ── Configuration ────────────────────────────────────────────────────────────
WRENN_USER="wrenn"
WRENN_GROUP="wrenn"
WRENN_DIR="/var/lib/wrenn"
AGENT_BIN="/usr/local/bin/wrenn-agent"
CP_BIN="/usr/local/bin/wrenn-cp"
FC_BIN="/usr/local/bin/firecracker"
RESTORE_CAPS_SCRIPT="/etc/wrenn/restore-caps.sh"
# ── 1. Create system user ───────────────────────────────────────────────────
if id "${WRENN_USER}" &>/dev/null; then
echo "==> User '${WRENN_USER}' already exists, skipping creation."
else
echo "==> Creating system user '${WRENN_USER}'..."
useradd \
--system \
--no-create-home \
--home-dir "${WRENN_DIR}" \
--shell /bin/bash \
"${WRENN_USER}"
fi
# Add wrenn to kvm group for /dev/kvm access.
if getent group kvm &>/dev/null; then
usermod -aG kvm "${WRENN_USER}"
echo "==> Added '${WRENN_USER}' to 'kvm' group."
fi
# ── 2. Create directories with correct ownership ────────────────────────────
echo "==> Setting up directories..."
directories=(
"${WRENN_DIR}"
"${WRENN_DIR}/images"
"${WRENN_DIR}/kernels"
"${WRENN_DIR}/sandboxes"
"${WRENN_DIR}/snapshots"
"${WRENN_DIR}/logs"
"/run/netns"
)
for dir in "${directories[@]}"; do
mkdir -p "${dir}"
done
# Only chown wrenn-owned dirs (not /run/netns which is system-managed).
for dir in "${WRENN_DIR}" "${WRENN_DIR}/images" "${WRENN_DIR}/kernels" \
"${WRENN_DIR}/sandboxes" "${WRENN_DIR}/snapshots" "${WRENN_DIR}/logs"; do
chown "${WRENN_USER}:${WRENN_GROUP}" "${dir}"
chmod 750 "${dir}"
done
# ── 3. Set capabilities on binaries ─────────────────────────────────────────
#
# These capabilities replace full root access. The wrenn-agent binary gets
# exactly the capabilities it needs for:
#
# CAP_SYS_ADMIN — network namespaces (netns create/enter), mount namespaces
# (unshare -m), losetup, dmsetup, mount/umount
# CAP_NET_ADMIN — veth/TAP creation (netlink), iptables rules, IP forwarding,
# routing table manipulation
# CAP_NET_RAW — raw socket access (needed by iptables internally)
# CAP_SYS_PTRACE — reading /proc/self/ns/net (netns.Get)
# CAP_KILL — sending SIGTERM/SIGKILL to Firecracker processes
# CAP_DAC_OVERRIDE — accessing /dev/loop*, /dev/mapper/*, /dev/net/tun,
# /proc/sys/net/ipv4/ip_forward
# CAP_MKNOD — creating device nodes (dm-snapshot)
#
# The 'ep' suffix means Effective + Permitted (granted at exec time).
echo "==> Setting capabilities on wrenn-agent..."
if [[ ! -f "${AGENT_BIN}" ]]; then
echo "WARNING: ${AGENT_BIN} not found, skipping setcap. Install the binary first."
else
setcap \
cap_sys_admin,cap_net_admin,cap_net_raw,cap_sys_ptrace,cap_kill,cap_dac_override,cap_mknod+ep \
"${AGENT_BIN}"
echo " Capabilities set on ${AGENT_BIN}:"
getcap "${AGENT_BIN}"
fi
# Firecracker also needs capabilities when spawned by a non-root parent.
# CAP_NET_ADMIN is required for network device access inside the netns.
if [[ -f "${FC_BIN}" ]]; then
setcap cap_net_admin,cap_sys_admin,cap_dac_override+ep "${FC_BIN}"
echo " Capabilities set on ${FC_BIN}:"
getcap "${FC_BIN}"
fi
# ── Helper: resolve binary path and apply setcap ────────────────────────────
#
# Uses `command -v` to find the binary in PATH (handles /usr/bin vs /usr/sbin
# differences across distros), then `readlink -f` to resolve symlinks so that
# setcap hits the real inode (important for iptables-nft/alternatives).
setcap_binary() {
local name="$1" caps="$2"
local bin
bin=$(command -v "$name" 2>/dev/null) || {
echo " WARNING: ${name} not found in PATH, skipping."
return 0
}
bin=$(readlink -f "$bin")
setcap "$caps" "$bin"
echo " $(getcap "$bin")"
}
# The child binaries invoked by wrenn-agent (iptables, losetup, dmsetup, etc.)
# also need capabilities since they'll be exec'd by a non-root user.
echo "==> Setting capabilities on child binaries..."
setcap_binary iptables "cap_net_admin,cap_net_raw+ep"
setcap_binary iptables-save "cap_net_admin,cap_net_raw+ep"
setcap_binary ip "cap_sys_admin,cap_net_admin+ep"
setcap_binary sysctl "cap_net_admin+ep"
setcap_binary losetup "cap_sys_admin,cap_dac_override+ep"
setcap_binary blockdev "cap_sys_admin,cap_dac_override+ep"
setcap_binary dmsetup "cap_sys_admin,cap_dac_override,cap_mknod+ep"
setcap_binary e2fsck "cap_sys_admin,cap_dac_override+ep"
setcap_binary resize2fs "cap_sys_admin,cap_dac_override+ep"
setcap_binary dd "cap_dac_override+ep"
setcap_binary unshare "cap_sys_admin+ep"
setcap_binary mount "cap_sys_admin,cap_dac_override+ep"
# ── 4. Persist capabilities across package updates ──────────────────────────
#
# apt/dpkg overwrites binaries on package updates, which strips the xattr-based
# capabilities set by setcap. This installs:
# - /etc/wrenn/restore-caps.sh: re-applies setcap to all child binaries
# - /etc/apt/apt.conf.d/99-wrenn-setcap: apt post-invoke hook that calls it
echo "==> Installing capability restore hook..."
mkdir -p /etc/wrenn
cat > "${RESTORE_CAPS_SCRIPT}" << 'RESTORE'
#!/usr/bin/env bash
#
# restore-caps.sh — Re-apply Linux capabilities to wrenn child binaries.
# Called automatically by apt after package updates (see /etc/apt/apt.conf.d/99-wrenn-setcap).
# Can also be run manually: sudo /etc/wrenn/restore-caps.sh
set -euo pipefail
setcap_binary() {
local name="$1" caps="$2"
local bin
bin=$(command -v "$name" 2>/dev/null) || return 0
bin=$(readlink -f "$bin")
setcap "$caps" "$bin" 2>/dev/null || true
}
# wrenn-agent and firecracker (only if present — they aren't package-managed).
[[ -f /usr/local/bin/wrenn-agent ]] && \
setcap cap_sys_admin,cap_net_admin,cap_net_raw,cap_sys_ptrace,cap_kill,cap_dac_override,cap_mknod+ep \
/usr/local/bin/wrenn-agent 2>/dev/null || true
[[ -f /usr/local/bin/firecracker ]] && \
setcap cap_net_admin,cap_sys_admin,cap_dac_override+ep \
/usr/local/bin/firecracker 2>/dev/null || true
# Child binaries (these are the ones wiped by apt).
setcap_binary iptables "cap_net_admin,cap_net_raw+ep"
setcap_binary iptables-save "cap_net_admin,cap_net_raw+ep"
setcap_binary ip "cap_sys_admin,cap_net_admin+ep"
setcap_binary sysctl "cap_net_admin+ep"
setcap_binary losetup "cap_sys_admin,cap_dac_override+ep"
setcap_binary blockdev "cap_sys_admin,cap_dac_override+ep"
setcap_binary dmsetup "cap_sys_admin,cap_dac_override,cap_mknod+ep"
setcap_binary e2fsck "cap_sys_admin,cap_dac_override+ep"
setcap_binary resize2fs "cap_sys_admin,cap_dac_override+ep"
setcap_binary dd "cap_dac_override+ep"
setcap_binary unshare "cap_sys_admin+ep"
setcap_binary mount "cap_sys_admin,cap_dac_override+ep"
RESTORE
chmod 755 "${RESTORE_CAPS_SCRIPT}"
cat > /etc/apt/apt.conf.d/99-wrenn-setcap << 'APT'
// Re-apply Linux capabilities to wrenn child binaries after any package update.
// Capabilities (xattr) are stripped when dpkg overwrites a binary.
DPkg::Post-Invoke { "/etc/wrenn/restore-caps.sh"; };
APT
echo " Installed ${RESTORE_CAPS_SCRIPT} and apt post-invoke hook."
# ── 5. Device access ────────────────────────────────────────────────────────
#
# /dev/kvm — handled by kvm group membership above
# /dev/net/tun — needs to be accessible by wrenn user
echo "==> Configuring device access..."
# Ensure /dev/net/tun is accessible (udev rule for persistence across reboots).
cat > /etc/udev/rules.d/99-wrenn.rules << 'UDEV'
# Allow wrenn user access to TUN device for TAP networking.
SUBSYSTEM=="misc", KERNEL=="tun", GROUP="wrenn", MODE="0660"
UDEV
udevadm control --reload-rules 2>/dev/null || true
echo " Installed udev rule for /dev/net/tun."
# ── 6. Kernel modules ───────────────────────────────────────────────────────
echo "==> Ensuring kernel modules are loaded..."
modules=(dm_snapshot dm_mod loop tun)
for mod in "${modules[@]}"; do
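# Match the exact module name (a bare "^tun" prefix would also match "tunnel4"),
# and read /proc/modules directly to avoid a pipeline under `set -o pipefail`.
if ! grep -q "^${mod} " /proc/modules 2>/dev/null; then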
modprobe "${mod}" 2>/dev/null && echo " Loaded ${mod}" || echo " WARNING: Could not load ${mod}"
else
echo " ${mod} already loaded."
fi
done
# Persist across reboots.
for mod in "${modules[@]}"; do
grep -qxF "${mod}" /etc/modules-load.d/wrenn.conf 2>/dev/null || echo "${mod}" >> /etc/modules-load.d/wrenn.conf
done
echo " Module persistence written to /etc/modules-load.d/wrenn.conf."
# ── 7. Sudoers ──────────────────────────────────────────────────────────────
#
# The wrenn user has no sudo grants. The absence of a grant is the cage — an
# explicit "!ALL" deny is weaker due to known bypasses (CVE-2019-14287).
# This file exists purely as documentation for operators running `sudo -l`.
echo "==> Writing sudoers drop-in..."
cat > /etc/sudoers.d/wrenn << 'SUDOERS'
# Wrenn system user — no sudo access permitted.
# All privilege is granted via Linux capabilities on specific binaries (setcap).
# This file contains no active rules. The absence of any grant is intentional
# and is the strongest way to deny escalation.
#
# Do not add rules here. If the wrenn user needs new privileges, use setcap
# on the specific binary instead.
SUDOERS
chmod 440 /etc/sudoers.d/wrenn
visudo -c -f /etc/sudoers.d/wrenn
echo " /etc/sudoers.d/wrenn installed and validated."
# ── 8. Systemd units ────────────────────────────────────────────────────────
echo "==> Writing systemd service files..."
cat > /etc/systemd/system/wrenn-agent.service << 'UNIT'
[Unit]
Description=Wrenn Host Agent
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
User=wrenn
Group=wrenn
EnvironmentFile=-/etc/wrenn/agent.env
# The binary has capabilities set via setcap. These systemd directives ensure
# the capabilities are inherited into the process at exec time.
AmbientCapabilities=CAP_SYS_ADMIN CAP_NET_ADMIN CAP_NET_RAW CAP_SYS_PTRACE CAP_KILL CAP_DAC_OVERRIDE CAP_MKNOD
CapabilityBoundingSet=CAP_SYS_ADMIN CAP_NET_ADMIN CAP_NET_RAW CAP_SYS_PTRACE CAP_KILL CAP_DAC_OVERRIDE CAP_MKNOD
# IMPORTANT: must be false — child binaries (iptables, losetup, dmsetup, etc.)
# have their own file capabilities via setcap which must be honored at exec time.
NoNewPrivileges=false
# Enable IP forwarding before the agent starts. The "+" prefix runs this
# directive as root (bypassing User=wrenn) so it can write to procfs.
ExecStartPre=+/bin/sh -c 'sysctl -w net.ipv4.ip_forward=1'
ExecStart=/usr/local/bin/wrenn-agent --address ${WRENN_ADVERTISE_ADDR}
Restart=on-failure
RestartSec=5
# File descriptor limits (Firecracker + loop devices + sockets).
LimitNOFILE=65536
LimitNPROC=4096
# Filesystem hardening. ProtectSystem= is not set for the agent, so the
# ReadWritePaths= list below mainly documents which paths the agent writes to.
ProtectHome=true
ReadWritePaths=/var/lib/wrenn /tmp /run/netns /dev/mapper
ReadOnlyPaths=/usr/local/bin/firecracker
[Install]
WantedBy=multi-user.target
UNIT
cat > /etc/systemd/system/wrenn-cp.service << 'UNIT'
[Unit]
Description=Wrenn Control Plane
After=network-online.target postgresql.service
Wants=network-online.target
[Service]
Type=simple
User=wrenn
Group=wrenn
EnvironmentFile=-/etc/wrenn/cp.env
# Control plane is fully unprivileged — no capabilities needed.
NoNewPrivileges=true
CapabilityBoundingSet=
ExecStart=/usr/local/bin/wrenn-cp
Restart=on-failure
RestartSec=5
ProtectHome=true
ProtectSystem=strict
ReadWritePaths=/tmp
[Install]
WantedBy=multi-user.target
UNIT
mkdir -p /etc/wrenn
touch /etc/wrenn/agent.env /etc/wrenn/cp.env
chmod 640 /etc/wrenn/agent.env /etc/wrenn/cp.env
chown "root:${WRENN_GROUP}" /etc/wrenn/agent.env /etc/wrenn/cp.env
systemctl daemon-reload
echo " wrenn-agent.service and wrenn-cp.service installed."
# ── Done ─────────────────────────────────────────────────────────────────────
echo ""
echo "=== Setup complete ==="
echo ""
echo "Next steps:"
echo " 1. If any binary was reported missing above, install it to /usr/local/bin/ and run ${RESTORE_CAPS_SCRIPT}"
echo " 2. Edit /etc/wrenn/agent.env with WRENN_CP_URL and WRENN_ADVERTISE_ADDR"
echo " 3. Edit /etc/wrenn/cp.env with DATABASE_URL and other control plane config"
echo " 4. systemctl enable --now wrenn-agent"
echo " 5. systemctl enable --now wrenn-cp"
echo ""
echo "Security summary:"
echo " - wrenn user: bash shell (for debugging), home at ${WRENN_DIR}, no sudo (no grants in sudoers)"
echo " - wrenn-agent: runs as wrenn with 7 capabilities via setcap (not root)"
echo " - wrenn-cp: runs as wrenn with zero capabilities"
echo " - Capabilities auto-restored after apt upgrades via /etc/wrenn/restore-caps.sh"
echo ""
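
As a post-install check, the kernel modules from step 6 can be verified by exact name. The sketch below is a hypothetical helper and not part of the script: the function names `module_loaded` and `missing_modules` and the optional file argument (useful for testing against a fixture) are assumptions. Modules compiled into the kernel do not appear in `/proc/modules`, so a name reported here as missing may still be available.

```bash
# module_loaded NAME [MODULES_FILE]
# Succeeds when NAME is the first field of some line in MODULES_FILE
# (default: /proc/modules). Comparison is exact, not a prefix match.
module_loaded() {
    local name="$1" file="${2:-/proc/modules}"
    awk -v m="${name}" '$1 == m { found = 1 } END { exit !found }' "${file}"
}

# missing_modules [MODULES_FILE]
# Prints, one per line, each module required by the setup script that is
# not listed in MODULES_FILE.
missing_modules() {
    local file="${1:-/proc/modules}" mod
    for mod in dm_snapshot dm_mod loop tun; do
        module_loaded "${mod}" "${file}" || echo "${mod}"
    done
}

# Example: missing_modules    # prints nothing when all four are listed
```

Comparing the first field exactly avoids treating `tunnel4` as `tun` or a `loop`-prefixed module as `loop`.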