Merge pull request 'Added a multi-distro system and asynchronized snapshot action' (#52) from fix/image-creation-and-maintenance into dev

Reviewed-on: #52
feat(templates): multi-distro system base images + paused-state snapshotting
2026-05-22 20:07:18 +00:00 · 2026-05-23 01:58:51 +06:00 · 2026-05-22 21:36:46 +06:00
53 changed files with 1148 additions and 839 deletions
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -97,7 +97,7 @@ Startup (`cmd/control-plane/main.go`) is a thin wrapper: `cpserver.Run(cpserver.

 **Packages:** `internal/hostagent/`, `internal/sandbox/`, `internal/vm/`, `internal/network/`, `internal/devicemapper/`, `internal/envdclient/`, `internal/snapshot/`

-**Production deployment:** `scripts/prepare-wrenn-user.sh` creates the `wrenn` system user, sets Linux capabilities (setcap) on wrenn-agent and all child binaries (iptables, losetup, dmsetup, etc.), installs an apt hook to restore capabilities after package updates, configures udev rules for `/dev/net/tun`, loads required kernel modules, and writes systemd unit files for both services. No sudo grants — all privilege is via capabilities.
+**Production deployment:** `make setup-host` (→ `scripts/setup-host.sh`) prepares the host: creates the `wrenn` system user, sets Linux capabilities (setcap) on wrenn-agent and all child binaries (iptables, losetup, dmsetup, etc.), installs an apt hook to restore capabilities after package updates, configures udev rules for `/dev/net/tun`, and loads required kernel modules. No sudo grants — all privilege is via capabilities. `make install` then copies the binaries to `/usr/local/bin` and installs the systemd units from `deploy/systemd/`.

 Startup (`cmd/host-agent/main.go`) wires: root/capabilities check → enable IP forwarding → clean up stale dm devices → `sandbox.Manager` (containing `vm.Manager` + `network.SlotAllocator` + `devicemapper.LoopRegistry`) → `hostagent.Server` (Connect RPC handler) → HTTP server.

@@ -258,13 +258,14 @@ To add a new query: add it to the appropriate `.sql` file in `db/queries/` → `
 ## Rootfs & Guest Init

 - **wrenn-init** (`images/wrenn-init.sh`): the PID 1 init script baked into every rootfs. Mounts virtual filesystems, sets hostname, writes `/etc/resolv.conf`, then execs envd.
- **Updating the rootfs** after changing envd or wrenn-init: `bash scripts/update-minimal-rootfs.sh`. This builds envd via `make build-envd` (Rust → static musl binary), mounts the rootfs image, copies in the new binaries, and unmounts. Defaults to `/var/lib/wrenn/images/minimal.ext4`.
- Rootfs images are minimal debootstrap — no systemd, no coreutils beyond busybox. Use `/bin/sh -c` for shell builtins inside the guest.
+- **System base templates**: four built-in distro images — `minimal-ubuntu` (id 0, default), `minimal-alpine` (1), `minimal-arch` (2), `minimal-fedora` (3) — built via `images/build-{ubuntu,alpine,arch,fedora}.sh` (or `make images`). All platform-owned, protected from deletion (reserved IDs 0–1024). Same static envd + tini run on all four. Each has a `wrenn-user` with passwordless sudo.
+- **Updating the rootfs** after changing envd or wrenn-init: `bash scripts/update-minimal-rootfs.sh`. Builds envd via `make build-envd` (Rust → static musl binary), then re-injects envd + wrenn-init + tini into all four system base images.
+- Rootfs images are built from distro containers — no systemd (init is overridden to `wrenn-init`). Use `/bin/sh -c` for shell builtins inside the guest.

 ## Fixed Paths (on host machine)

 - Kernel: `/var/lib/wrenn/kernels/vmlinux`
- Base rootfs images: `/var/lib/wrenn/images/{template}.ext4`
+- Base rootfs images: `/var/lib/wrenn/images/teams/{base36(teamID)}/{base36(templateID)}/rootfs.ext4` (system templates use the platform team, base36 all-zeros)
 - Sandbox clones: `/var/lib/wrenn/sandboxes/`
 - Cloud Hypervisor: `/usr/local/bin/cloud-hypervisor`
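
The team-scoped image path above can be illustrated with a short TypeScript sketch. It assumes, without confirmation from this diff, that `base36(...)` means the 128-bit UUID read as an unsigned integer, written in lowercase base 36 and zero-padded to 25 characters (the width of the directory names in the README tree). `uuidToBase36` and `rootfsPath` are hypothetical helper names, not functions from the repository.

```typescript
// Hypothetical helpers. The encoding rule is an assumption inferred from the
// all-zeros directory names in the README, not taken from the Go code.
const IMAGES_ROOT = '/var/lib/wrenn/images';

// 36^25 > 2^128, so 25 base36 digits can hold any 128-bit UUID.
const BASE36_WIDTH = 25;

function uuidToBase36(uuid: string): string {
	const hex = uuid.replace(/-/g, '');
	if (!/^[0-9a-fA-F]{32}$/.test(hex)) {
		throw new Error(`not a UUID: ${uuid}`);
	}
	// Base-16 digits of the value, most significant first.
	let digits: number[] = hex.split('').map((c) => parseInt(c, 16));
	let out = '';
	// Repeated long division by 36; each remainder is the next base36 digit.
	while (digits.length > 0) {
		let remainder = 0;
		const quotient: number[] = [];
		for (const d of digits) {
			const acc = remainder * 16 + d;
			const q = Math.floor(acc / 36);
			remainder = acc % 36;
			// Skip leading zeros so the loop terminates.
			if (quotient.length > 0 || q > 0) quotient.push(q);
		}
		out = remainder.toString(36) + out;
		digits = quotient;
	}
	while (out.length < BASE36_WIDTH) out = '0' + out;
	return out;
}

function rootfsPath(teamId: string, templateId: string): string {
	return `${IMAGES_ROOT}/teams/${uuidToBase36(teamId)}/${uuidToBase36(templateId)}/rootfs.ext4`;
}

const PLATFORM_TEAM = '00000000-0000-0000-0000-000000000000';
const MINIMAL_ALPINE = '00000000-0000-0000-0000-000000000001';
console.log(rootfsPath(PLATFORM_TEAM, MINIMAL_ALPINE));
```

Under these assumptions, the platform team and `minimal-ubuntu` (id 0) both map to a directory name of 25 zeros, which matches the tree documented in the README.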

--- a/Makefile
+++ b/Makefile
@@ -131,32 +131,24 @@ check: fmt vet lint test
 # ═══════════════════════════════════════════════════
 #  Rootfs Images
 # ═══════════════════════════════════════════════════
-.PHONY: images image-minimal image-python image-node
+.PHONY: images rootfs-ubuntu rootfs-alpine rootfs-arch rootfs-fedora

-images: build-envd image-minimal image-python image-node
+# Build all four system base rootfs images (ubuntu/alpine/arch/fedora). Each
+# spawns a distro container, installs the required packages + wrenn-user, then
+# exports to images/teams/<platform>/<id>/rootfs.ext4. Requires docker + sudo.
+images: rootfs-ubuntu rootfs-alpine rootfs-arch rootfs-fedora

-image-minimal:
-	sudo bash images/templates/minimal/build.sh
+rootfs-ubuntu:
+	bash images/build-ubuntu.sh

-image-python:
-	sudo bash images/templates/python312/build.sh
+rootfs-alpine:
+	bash images/build-alpine.sh

-image-node:
-	sudo bash images/templates/node20/build.sh
+rootfs-arch:
+	bash images/build-arch.sh

-# ═══════════════════════════════════════════════════
-#  Deployment
-# ═══════════════════════════════════════════════════
-.PHONY: setup-host install
-
-setup-host:
-	sudo bash scripts/setup-host.sh
-
-install: build
-	sudo cp $(BIN_DIR)/wrenn-cp /usr/local/bin/
-	sudo cp $(BIN_DIR)/wrenn-agent /usr/local/bin/
-	sudo cp deploy/systemd/*.service /etc/systemd/system/
-	sudo systemctl daemon-reload
+rootfs-fedora:
+	bash images/build-fedora.sh

 # ═══════════════════════════════════════════════════
 #  Clean
--- a/README.md
+++ b/README.md
@@ -22,7 +22,7 @@ Produces three binaries: `wrenn-cp` (control plane), `wrenn-agent` (host agent),

 ## Host setup

-The host agent needs a kernel, a minimal rootfs image, and working directories on the host machine.
+The host agent needs a kernel, the system base rootfs images, and working directories on the host machine.

 ### Directory structure

@@ -31,59 +31,74 @@ The host agent needs a kernel, a minimal rootfs image, and working directories o
 ├── kernels/
 │   └── vmlinux              # uncompressed Linux kernel (not bzImage)
 ├── images/
-│   └── minimal/
-│       └── rootfs.ext4      # base rootfs (all other templates snapshot from this)
+│   └── teams/
+│       └── 0000000000000000000000000/   # platform team (base36 all-zeros)
+│           ├── 0000000000000000000000000/rootfs.ext4   # minimal-ubuntu (id 0)
+│           ├── 0000000000000000000000001/rootfs.ext4   # minimal-alpine (id 1)
+│           ├── 0000000000000000000000002/rootfs.ext4   # minimal-arch   (id 2)
+│           └── 0000000000000000000000003/rootfs.ext4   # minimal-fedora (id 3)
 ├── sandboxes/               # per-sandbox CoW files (created at runtime)
 └── snapshots/               # pause/hibernate snapshot files (created at runtime)
 ```

-Create the directories:
+Create the base directories (the per-template image dirs are created by the build scripts):

 ```bash
-sudo mkdir -p /var/lib/wrenn/{kernels,images/minimal,sandboxes,snapshots}
+sudo mkdir -p /var/lib/wrenn/{kernels,images,sandboxes,snapshots}
 ```

 ### Kernel

 Place an uncompressed `vmlinux` kernel at `/var/lib/wrenn/kernels/vmlinux`. Versioned kernels (`vmlinux-{semver}`) are also supported — the agent picks the latest by semver.

-### Minimal rootfs
+### System base rootfs images

-The minimal rootfs is the base image that all other templates (Python, Node, etc.) are built on top of via device-mapper snapshots. It must contain:
+There are four built-in **system base templates** — one per distro — that all other
+templates snapshot from via device-mapper. They are platform-owned (visible to every
+team) and protected from deletion (reserved template IDs 0–1024):
+
+| Template | Distro | ID |
+|----------|--------|----|
+| `minimal-ubuntu` | `ubuntu:26.04` | 0 |
+| `minimal-alpine` | `alpine:3.22` | 1 |
+| `minimal-arch` | `archlinux:base` | 2 |
+| `minimal-fedora` | `fedora:45` | 3 |
+
+`minimal-ubuntu` is the default template for new sandboxes and builds. The same
+statically-linked `envd` + `tini` run on all four regardless of the distro's libc
+(glibc on Ubuntu/Arch/Fedora, musl on Alpine).
+
+Each image contains these packages plus a `wrenn-user` account with passwordless `sudo`:

 | Package | Why |
 |---------|-----|
 | `socat` | Bidirectional relay for port forwarding |
 | `chrony` | Time sync from KVM PTP clock (`/dev/ptp0`) |
-| `tini` | PID 1 zombie reaper (injected by build script, not apt) |
+| `iproute2` (`iproute` on Fedora) | `ip` for guest network setup in `wrenn-init` |
+| `tini` | PID 1 zombie reaper |
 | `sudo` | User privilege management inside the guest |
 | `wget` | HTTP fetching |
 | `curl` | HTTP client |
 | `ca-certificates` | TLS certificate verification |
+| `git` | Version control |

-**To build a rootfs from a Docker container:**
+**To build all four images** (each spawns a distro container, installs the packages +
+`wrenn-user`, builds `envd`, injects `wrenn-init` + `tini`, and exports to the
+team-scoped path). Requires Docker + sudo:

-1. Create and configure a container with the required packages:
-   ```bash
-   docker run -it --name wrenn-minimal debian:bookworm bash
-   # Inside the container:
-   apt update && apt install -y socat chrony sudo wget curl ca-certificates
-   exit
-   ```
+```bash
+make images
+```

-2. Export to a rootfs image (builds envd, injects wrenn-init + tini, shrinks to minimum size):
-   ```bash
-   sudo bash scripts/rootfs-from-container.sh wrenn-minimal minimal
-   ```
+Or build a single distro: `make rootfs-ubuntu` / `rootfs-alpine` / `rootfs-arch` / `rootfs-fedora`.

-**To update an existing rootfs** after changing envd or `wrenn-init.sh`:
+**To update the images** after changing `envd` or `wrenn-init.sh` (rebuilds `envd` once,
+then re-injects `envd` + `wrenn-init` + `tini` into every system base image):

 ```bash
 bash scripts/update-minimal-rootfs.sh
 ```

-This rebuilds envd via `make build-envd` and copies the fresh binaries into the mounted rootfs image.
-
 ### IP forwarding

 ```bash
--- a/cmd/host-agent/main.go
+++ b/cmd/host-agent/main.go
@@ -228,7 +228,7 @@ func main() {
 			// snapshotted state. User-initiated Pauses already running are
 			// awaited by PauseAll/Destroy's lifecycleMu serialization.
 			mgr.Shutdown(shutdownCtx)
-			sandbox.ShrinkMinimalImage(rootDir)
+			sandbox.ShrinkSystemImages(rootDir)
 			if err := httpServer.Shutdown(shutdownCtx); err != nil {
 				slog.Error("http server shutdown error", "error", err)
 			}
--- a/db/migrations/20260522154716_seed_system_base_templates.sql
+++ b/db/migrations/20260522154716_seed_system_base_templates.sql
@@ -0,0 +1,49 @@
+-- +goose Up
+
+-- Replace the old all-zeros "minimal" base template with the four system base
+-- templates (ubuntu/alpine/arch/fedora). All are platform-owned (team_id
+-- all-zeros) with reserved template IDs 0..3, default user wrenn-user.
+--
+-- Template IDs are well-known: the all-zeros UUID + low byte = {0,1,2,3}.
+-- On disk each lives at images/teams/{base36(0)}/{base36(id)}/rootfs.ext4.
+
+-- 0 → minimal-ubuntu (was "minimal").
+UPDATE templates
+SET name = 'minimal-ubuntu',
+    default_user = 'wrenn-user'
+WHERE id = '00000000-0000-0000-0000-000000000000';
+
+-- Seed the row if it did not already exist (fresh DBs).
+INSERT INTO templates (id, name, type, vcpus, memory_mb, size_bytes, team_id, default_user)
+VALUES ('00000000-0000-0000-0000-000000000000', 'minimal-ubuntu', 'base', 1, 512, 0,
+        '00000000-0000-0000-0000-000000000000', 'wrenn-user')
+ON CONFLICT (id) DO NOTHING;
+
+-- 1 → minimal-alpine, 2 → minimal-arch, 3 → minimal-fedora.
+INSERT INTO templates (id, name, type, vcpus, memory_mb, size_bytes, team_id, default_user)
+VALUES
+    ('00000000-0000-0000-0000-000000000001', 'minimal-alpine', 'base', 1, 512, 0,
+     '00000000-0000-0000-0000-000000000000', 'wrenn-user'),
+    ('00000000-0000-0000-0000-000000000002', 'minimal-arch', 'base', 1, 512, 0,
+     '00000000-0000-0000-0000-000000000000', 'wrenn-user'),
+    ('00000000-0000-0000-0000-000000000003', 'minimal-fedora', 'base', 1, 512, 0,
+     '00000000-0000-0000-0000-000000000000', 'wrenn-user')
+ON CONFLICT (id) DO NOTHING;
+
+-- Point the sandboxes.template column default at the new default base template.
+ALTER TABLE sandboxes ALTER COLUMN template SET DEFAULT 'minimal-ubuntu';
+
+-- +goose Down
+
+ALTER TABLE sandboxes ALTER COLUMN template SET DEFAULT 'minimal';
+
+DELETE FROM templates WHERE id IN (
+    '00000000-0000-0000-0000-000000000001',
+    '00000000-0000-0000-0000-000000000002',
+    '00000000-0000-0000-0000-000000000003'
+);
+
+UPDATE templates
+SET name = 'minimal',
+    default_user = 'root'
+WHERE id = '00000000-0000-0000-0000-000000000000';
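
The migration depends on well-known template IDs, and the documentation in this PR describes IDs 0–1024 as reserved and protected from deletion. A minimal TypeScript sketch of such a range check follows. The inclusive upper bound and the name `isReservedTemplateId` are assumptions for illustration; the frontend in this PR only reads a `protected` flag from the API, and the server-side check is not shown in this excerpt.

```typescript
// Assumed inclusive upper bound, taken from the "reserved IDs 0–1024" wording.
const RESERVED_MAX = 1024;

// Hypothetical helper: true when the UUID, read as a 128-bit integer,
// falls inside the reserved range.
function isReservedTemplateId(uuid: string): boolean {
	const hex = uuid.replace(/-/g, '').toLowerCase();
	if (!/^[0-9a-f]{32}$/.test(hex)) return false;
	// 1024 fits in the low 16 bits, so the upper 28 hex digits must all be zero.
	if (!/^0{28}/.test(hex)) return false;
	return parseInt(hex.slice(28), 16) <= RESERVED_MAX;
}

console.log(isReservedTemplateId('00000000-0000-0000-0000-000000000003')); // true (minimal-fedora)
console.log(isReservedTemplateId('7f3c2a10-0000-0000-0000-000000000003')); // false
```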
--- a/envd-rs/Cargo.toml
+++ b/envd-rs/Cargo.toml
@@ -2,7 +2,7 @@
 name = "envd"
 version = "0.3.0"
 edition = "2024"
-rust-version = "1.88"
+rust-version = "1.95"

 [dependencies]
 # Async runtime
--- a/envd-rs/README.md
+++ b/envd-rs/README.md
@@ -128,13 +128,15 @@ src/
 After building the static binary, copy it into the rootfs:

 ```bash
-bash scripts/update-debug-rootfs.sh [rootfs_path]
+bash scripts/update-minimal-rootfs.sh [rootfs_path]
 ```

-Or manually:
+With no argument it updates all four system base images; pass a path to target one.
+
+Or manually (example path: the minimal-ubuntu image, platform team + template id 0):

 ```bash
-sudo mount -o loop /var/lib/wrenn/images/minimal.ext4 /mnt
-sudo cp target/x86_64-unknown-linux-musl/release/envd /mnt/usr/bin/envd
+sudo mount -o loop /var/lib/wrenn/images/teams/0000000000000000000000000/0000000000000000000000000/rootfs.ext4 /mnt
+sudo cp target/x86_64-unknown-linux-musl/release/envd /mnt/usr/local/bin/envd
 sudo umount /mnt
 ```
--- a/frontend/src/lib/api/admin-capsules.ts
+++ b/frontend/src/lib/api/admin-capsules.ts
@@ -18,7 +18,9 @@ export async function destroyAdminCapsule(id: string): Promise<ApiResult<void>>
 	return apiFetch('DELETE', `/api/v1/admin/capsules/${id}`);
 }

-export async function snapshotAdminCapsule(id: string, name?: string): Promise<ApiResult<Snapshot>> {
+// Async: returns 202 with the capsule now in the "snapshotting" state. The
+// template lands later (watch template.snapshot.create or poll templates).
+export async function snapshotAdminCapsule(id: string, name?: string): Promise<ApiResult<Capsule>> {
 	return apiFetch('POST', `/api/v1/admin/capsules/${id}/snapshot`, { name });
 }

@@ -35,6 +37,7 @@ export async function listPlatformTemplates(): Promise<ApiResult<Snapshot[]>> {
 		size_bytes: t.size_bytes,
 		created_at: t.created_at,
 		platform: true,
+		protected: t.protected,
 	}));
 	return { ok: true, data: snapshots };
 }
--- a/frontend/src/lib/api/builds.ts
+++ b/frontend/src/lib/api/builds.ts
@@ -97,6 +97,8 @@ export type AdminTemplate = {
 	size_bytes: number;
 	team_id: string;
 	created_at: string;
+	/** True for built-in system base templates, which cannot be deleted. */
+	protected: boolean;
 };

 export async function listAdminTemplates(): Promise<ApiResult<AdminTemplate[]>> {
--- a/frontend/src/lib/api/capsules.ts
+++ b/frontend/src/lib/api/capsules.ts
@@ -8,6 +8,7 @@ export type CapsuleStatus =
 	| 'running'
 	| 'pausing'
 	| 'paused'
+	| 'snapshotting'
 	| 'resuming'
 	| 'stopping'
 	| 'hibernated'
@@ -26,6 +27,7 @@ export const TRANSIENT_STATUSES: ReadonlySet<CapsuleStatus> = new Set([
 	'pending',
 	'starting',
 	'pausing',
+	'snapshotting',
 	'resuming',
 	'stopping'
 ]);
@@ -88,9 +90,14 @@ export type Snapshot = {
 	size_bytes: number;
 	created_at: string;
 	platform: boolean;
+	/** True for built-in system base templates, which cannot be deleted. */
+	protected?: boolean;
 };

-export async function createSnapshot(capsuleId: string, name?: string): Promise<ApiResult<Snapshot>> {
+// Snapshots are async: the call returns 202 with the capsule now in the
+// "snapshotting" state. The resulting template arrives later via the
+// template.snapshot.create SSE event (or by polling listSnapshots).
+export async function createSnapshot(capsuleId: string, name?: string): Promise<ApiResult<Capsule>> {
 	return apiFetch('POST', '/api/v1/snapshots', { sandbox_id: capsuleId, name });
 }

--- a/frontend/src/lib/components/CreateCapsuleDialog.svelte
+++ b/frontend/src/lib/components/CreateCapsuleDialog.svelte
@@ -11,7 +11,7 @@
 	};
 	let { open, onclose, oncreated, templateSource = 'team' }: Props = $props();

-	let createForm = $state<CreateCapsuleParams>({ template: 'minimal', vcpus: 1, memory_mb: 512, timeout_sec: 0 });
+	let createForm = $state<CreateCapsuleParams>({ template: 'minimal-ubuntu', vcpus: 1, memory_mb: 512, timeout_sec: 0 });
 	let creating = $state(false);
 	let createError = $state<string | null>(null);

@@ -120,8 +120,8 @@
 			const creator = templateSource === 'platform' ? createAdminCapsule : createCapsule;
 			const result = await creator(createForm);
 			if (result.ok) {
-				createForm = { template: 'minimal', vcpus: 1, memory_mb: 512, timeout_sec: 0 };
-				templateQuery = 'minimal';
+				createForm = { template: 'minimal-ubuntu', vcpus: 1, memory_mb: 512, timeout_sec: 0 };
+				templateQuery = 'minimal-ubuntu';
 				onclose();
 				oncreated?.(result.data);
 			} else {
--- a/frontend/src/lib/components/SnapshotDialog.svelte
+++ b/frontend/src/lib/components/SnapshotDialog.svelte
@@ -1,13 +1,38 @@
 <script lang="ts">
-	import { createSnapshot } from '$lib/api/capsules';
+	import type { Snippet } from 'svelte';
+	import { createSnapshot, type Capsule } from '$lib/api/capsules';
+	import type { ApiResult } from '$lib/api/client';
+
+	type SnapshotFn = (capsuleId: string, name?: string) => Promise<ApiResult<Capsule>>;

 	type Props = {
 		open: boolean;
 		capsuleId: string;
 		onclose: () => void;
-		onsnapshot?: () => void;
+		onsnapshot?: (capsule: Capsule) => void;
+		title?: string;
+		label?: string;
+		placeholder?: string;
+		hint?: string;
+		confirmLabel?: string;
+		pendingLabel?: string;
+		snapshotFn?: SnapshotFn;
+		description?: Snippet;
 	};
-	let { open, capsuleId, onclose, onsnapshot }: Props = $props();
+	let {
+		open,
+		capsuleId,
+		onclose,
+		onsnapshot,
+		title = 'Capture snapshot',
+		label = 'Snapshot name',
+		placeholder = 'e.g. after-apt-install, pre-migration',
+		hint = 'Leave blank to use an auto-generated name.',
+		confirmLabel = 'Start snapshot',
+		pendingLabel = 'Starting...',
+		snapshotFn = createSnapshot,
+		description
+	}: Props = $props();

 	let snapshotName = $state('');
 	let snapshotting = $state(false);
@@ -21,14 +46,14 @@
 	async function handleConfirm() {
 		snapshotting = true;
 		error = null;
-		const result = await createSnapshot(capsuleId, snapshotName.trim() || undefined);
+		const result = await snapshotFn(capsuleId, snapshotName.trim() || undefined);
 		if (!result.ok) {
 			error = result.error;
 			snapshotting = false;
 			return;
 		}
 		reset();
-		onsnapshot?.();
+		onsnapshot?.(result.data);
 		onclose();
 		snapshotting = false;
 	}
@@ -41,6 +66,10 @@
 	}
 </script>

+{#snippet defaultDescription()}
+	<p class="text-ui text-[var(--color-text-tertiary)]">The capsule moves to a <span class="font-mono text-[var(--color-blue)]">snapshotting</span> state while its memory and disk are written to a new template, then returns to running. This runs in the background; you'll be notified when it completes.</p>
+{/snippet}
+
 {#if open}
 	<div class="fixed inset-0 z-50 flex items-center justify-center">
 		<!-- svelte-ignore a11y_no_static_element_interactions -->
@@ -59,13 +88,13 @@
 					</svg>
 				</div>
 				<div>
-					<h2 class="font-serif text-heading text-[var(--color-text-bright)]">Capture snapshot</h2>
+					<h2 class="font-serif text-heading text-[var(--color-text-bright)]">{title}</h2>
 					<p class="mt-0.5 text-meta text-[var(--color-text-muted)] font-mono">{capsuleId}</p>
 				</div>
 			</div>

 			<div class="px-6 pt-5 pb-6 space-y-4">
-				<p class="text-ui text-[var(--color-text-tertiary)]">Live snapshot: the capsule briefly pauses, its memory + disk are written to a new template, then the capsule resumes — your session keeps running.</p>
+				{@render (description ?? defaultDescription)()}

 				{#if error}
 					<div class="rounded-[var(--radius-input)] border border-[var(--color-red)]/30 bg-[var(--color-red)]/5 px-3 py-2 text-meta text-[var(--color-red)]">
@@ -75,7 +104,7 @@

 				<div>
 					<div class="mb-1.5 flex items-baseline justify-between">
-						<label class="text-label font-semibold uppercase tracking-[0.05em] text-[var(--color-text-tertiary)]" for="snapshot-name">Snapshot name</label>
+						<label class="text-label font-semibold uppercase tracking-[0.05em] text-[var(--color-text-tertiary)]" for="snapshot-name">{label}</label>
 						<span class="text-meta text-[var(--color-text-muted)]">optional</span>
 					</div>
 					<input
@@ -84,10 +113,10 @@
 						bind:value={snapshotName}
 						disabled={snapshotting}
 						class="w-full rounded-[var(--radius-input)] border border-[var(--color-border)] bg-[var(--color-bg-4)] px-3 py-2 font-mono text-ui text-[var(--color-text-bright)] outline-none placeholder:text-[var(--color-text-muted)] transition-colors duration-150 focus:border-[var(--color-accent)] disabled:opacity-50"
-						placeholder="e.g. after-apt-install, pre-migration"
+						placeholder={placeholder}
 						onkeydown={(e) => { if (e.key === 'Enter' && !snapshotting) handleConfirm(); }}
 					/>
-					<p class="mt-1.5 text-meta text-[var(--color-text-muted)]">Leave blank to use an auto-generated name.</p>
+					<p class="mt-1.5 text-meta text-[var(--color-text-muted)]">{hint}</p>
 				</div>

 				<div class="flex justify-end gap-3 pt-1">
@@ -107,9 +136,9 @@
 							<svg class="animate-spin" width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
 								<path d="M21 12a9 9 0 1 1-6.219-8.56" />
 							</svg>
-							Capturing...
+							{pendingLabel}
 						{:else}
-							Capture snapshot
+							{confirmLabel}
 						{/if}
 					</button>
 				</div>
--- a/frontend/src/lib/lifecycle-toasts.ts
+++ b/frontend/src/lib/lifecycle-toasts.ts
@@ -0,0 +1,39 @@
+import type { SSEEvent } from '$lib/api/events';
+import { toast } from '$lib/toast.svelte';
+
+// Terminal copy per lifecycle verb. Success and failure are paired so the two
+// can never drift apart.
+const VERBS: Record<string, { done: string; failed: string }> = {
+	'capsule.create': { done: 'Capsule created', failed: 'Capsule failed to start' },
+	'capsule.pause': { done: 'Capsule paused', failed: 'Capsule failed to pause' },
+	'capsule.resume': { done: 'Capsule resumed', failed: 'Capsule failed to resume' },
+	'capsule.destroy': { done: 'Capsule destroyed', failed: 'Capsule failed to destroy' }
+};
+
+/**
+ * Surfaces lifecycle outcomes as toasts. Only system-actor events with an
+ * outcome are terminal: the user-actor events published at request-accept time
+ * carry a premature outcome (the operation has only been accepted, not yet
+ * completed) and are skipped, so each operation toasts exactly once.
+ */
+export function lifecycleToast(event: SSEEvent): void {
+	if (event.actor?.type !== 'system' || !event.outcome) return;
+
+	if (event.event === 'template.snapshot.create') {
+		const name = event.resource?.id;
+		if (event.outcome === 'success') {
+			toast.success(name ? `Snapshot "${name}" captured` : 'Snapshot captured');
+		} else {
+			toast.error(event.error ? `Snapshot failed: ${event.error}` : 'Snapshot failed');
+		}
+		return;
+	}
+
+	const verb = VERBS[event.event];
+	if (!verb) return;
+	if (event.outcome === 'success') {
+		toast.success(verb.done);
+	} else {
+		toast.error(event.error ? `${verb.failed}: ${event.error}` : verb.failed);
+	}
+}
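
The filtering rule in `lifecycleToast` (toast only on system-actor events that carry an outcome) can be exercised in isolation with the sketch below. It uses a simplified, assumed event shape rather than the real `SSEEvent` type from `$lib/api/events`, returns the toast text instead of calling `toast`, and treats `'failure'` as the non-success outcome value, which the diff does not confirm. `toastText` is a hypothetical name.

```typescript
// Assumed, simplified event shape; the real SSEEvent type is not shown in this diff.
type LifecycleEvent = {
	event: string;
	outcome?: 'success' | 'failure'; // 'failure' is an assumed value
	error?: string;
	actor?: { type: 'user' | 'system' };
};

const VERBS: { [event: string]: { done: string; failed: string } } = {
	'capsule.pause': { done: 'Capsule paused', failed: 'Capsule failed to pause' }
};

// Returns the toast text, or null when the event should not produce a toast.
function toastText(e: LifecycleEvent): string | null {
	// User-actor events are published at request-accept time, so they are skipped.
	if (!e.actor || e.actor.type !== 'system' || !e.outcome) return null;
	const verb = VERBS[e.event];
	if (!verb) return null;
	if (e.outcome === 'success') return verb.done;
	return e.error ? `${verb.failed}: ${e.error}` : verb.failed;
}

const accepted: LifecycleEvent = { event: 'capsule.pause', outcome: 'success', actor: { type: 'user' } };
const finished: LifecycleEvent = { event: 'capsule.pause', outcome: 'success', actor: { type: 'system' } };

console.log(toastText(accepted)); // null
console.log(toastText(finished)); // Capsule paused
```

Returning a string keeps the rule testable without the toast store; the real function calls `toast.success` / `toast.error` directly.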
--- a/frontend/src/routes/admin/capsules/[id]/+page.svelte
+++ b/frontend/src/routes/admin/capsules/[id]/+page.svelte
@@ -6,6 +6,7 @@
 	import FilesTab from '$lib/components/FilesTab.svelte';
 	import MetricsPanel from '$lib/components/MetricsPanel.svelte';
 	import DestroyDialog from '$lib/components/DestroyDialog.svelte';
+	import SnapshotDialog from '$lib/components/SnapshotDialog.svelte';
 	import CopyButton from '$lib/components/CopyButton.svelte';
 	import { toast } from '$lib/toast.svelte';
 	import {
@@ -29,9 +30,6 @@

 	// Snapshot dialog
 	let showSnapshot = $state(false);
-	let snapshotName = $state('');
-	let snapshotting = $state(false);
-	let snapshotError = $state<string | null>(null);

 	const metricsAvailable = $derived(
 		capsule?.status === 'running' || capsule?.status === 'paused'
@@ -58,28 +56,12 @@
 		capsuleLoading = false;
 	}

-	async function handleSnapshot() {
-		snapshotting = true;
-		snapshotError = null;
-		const result = await snapshotAdminCapsule(capsuleId, snapshotName.trim() || undefined);
-		if (result.ok) {
-			toast.success(`Snapshot "${result.data.name}" created`);
-			showSnapshot = false;
-			snapshotName = '';
-			// Capsule keeps running after a live snapshot; refresh local state.
-			void loadCapsule();
-		} else {
-			snapshotError = result.error;
-		}
-		snapshotting = false;
-	}
-
 	function statusColor(status: string): string {
 		switch (status) {
 			case 'running': return 'var(--color-accent)';
 			case 'paused': case 'hibernated':  return 'var(--color-amber)';
 			case 'error':   return 'var(--color-red)';
-			case 'pending': case 'starting': case 'resuming': case 'pausing': case 'stopping':
+			case 'pending': case 'starting': case 'resuming': case 'pausing': case 'snapshotting': case 'stopping':
 				return 'var(--color-blue)';
 			default:        return 'var(--color-text-muted)';
 		}
@@ -90,7 +72,7 @@
 			case 'running': return 'rgba(94,140,88,0.12)';
 			case 'paused': case 'hibernated':  return 'rgba(212,167,60,0.12)';
 			case 'error':   return 'rgba(207,129,114,0.12)';
-			case 'pending': case 'starting': case 'resuming': case 'pausing': case 'stopping':
+			case 'pending': case 'starting': case 'resuming': case 'pausing': case 'snapshotting': case 'stopping':
 				return 'rgba(90,159,212,0.12)';
 			default:        return 'rgba(255,255,255,0.05)';
 		}
@@ -101,7 +83,7 @@
 			case 'running': return 'rgba(94,140,88,0.3)';
 			case 'paused': case 'hibernated':  return 'rgba(212,167,60,0.3)';
 			case 'error':   return 'rgba(207,129,114,0.3)';
-			case 'pending': case 'starting': case 'resuming': case 'pausing': case 'stopping':
+			case 'pending': case 'starting': case 'resuming': case 'pausing': case 'snapshotting': case 'stopping':
 				return 'rgba(90,159,212,0.3)';
 			default:        return 'rgba(255,255,255,0.08)';
 		}
@@ -211,8 +193,7 @@
 					<div class="ml-auto flex items-center gap-2">
 						{#if canSnapshot}
 							<button
-								onclick={() => { showSnapshot = true; snapshotName = ''; snapshotError = null; }}
-								disabled={snapshotting}
+								onclick={() => { showSnapshot = true; }}
 								class="flex items-center gap-1.5 rounded-[var(--radius-button)] border border-[var(--color-accent)]/30 bg-[var(--color-accent)]/8 px-3 py-1.5 text-meta font-medium text-[var(--color-accent-bright)] transition-all duration-150 hover:bg-[var(--color-accent)]/15 hover:border-[var(--color-accent)]/50 disabled:opacity-50"
 							>
 								<svg width="12" height="12" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="1.75" stroke-linecap="round" stroke-linejoin="round"><path d="M14.5 4h-5L7 7H2v13a2 2 0 002 2h16a2 2 0 002-2V7h-5l-2.5-3z" /><circle cx="12" cy="15" r="3" /></svg>
@@ -270,83 +251,24 @@
 		</footer>
 </main>

-<!-- Snapshot dialog -->
-{#if showSnapshot}
-	<div class="fixed inset-0 z-50 flex items-center justify-center">
-		<!-- svelte-ignore a11y_no_static_element_interactions -->
-		<div
-			class="absolute inset-0 bg-black/60"
-			onclick={() => { if (!snapshotting) showSnapshot = false; }}
-			onkeydown={(e) => { if (e.key === 'Escape' && !snapshotting) showSnapshot = false; }}
-		></div>
+{#snippet adminSnapshotDescription()}
+	<p class="text-ui text-[var(--color-text-tertiary)]">The capsule moves to a <span class="font-mono text-[var(--color-blue)]">snapshotting</span> state while its memory and disk are written to a new platform template available to all teams, then returns to running. This runs in the background.</p>
+{/snippet}

-		<div class="relative w-full max-w-[420px] rounded-[var(--radius-card)] border border-[var(--color-border-mid)] bg-[var(--color-bg-2)] overflow-hidden" style="animation: fadeUp 0.2s ease both; box-shadow: var(--shadow-dialog)">
-			<div class="flex items-center gap-4 border-b border-[var(--color-border)] bg-[var(--color-bg-3)] px-6 py-5">
-				<div class="flex h-10 w-10 shrink-0 items-center justify-center rounded-[var(--radius-input)] bg-[var(--color-accent)]/15 text-[var(--color-accent)] shadow-[0_0_12px_var(--color-accent-glow)]">
-					<svg width="18" height="18" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="1.75" stroke-linecap="round" stroke-linejoin="round">
-						<path d="M14.5 4h-5L7 7H2v13a2 2 0 002 2h16a2 2 0 002-2V7h-5l-2.5-3z" />
-						<circle cx="12" cy="15" r="3" />
-					</svg>
-				</div>
-				<div>
-					<h2 class="font-serif text-heading text-[var(--color-text-bright)]">Snapshot as platform template</h2>
-					<p class="mt-0.5 text-meta text-[var(--color-text-muted)] font-mono">{capsuleId}</p>
-				</div>
-			</div>
-
-			<div class="px-6 pt-5 pb-6 space-y-4">
-				<p class="text-ui text-[var(--color-text-tertiary)]">Live snapshot: the capsule briefly pauses, its memory + disk are written to a new platform template available to all teams, then the capsule resumes — your session keeps running.</p>
-
-				{#if snapshotError}
-					<div class="rounded-[var(--radius-input)] border border-[var(--color-red)]/30 bg-[var(--color-red)]/5 px-3 py-2 text-meta text-[var(--color-red)]">
-						{snapshotError}
-					</div>
-				{/if}
-
-				<div>
-					<div class="mb-1.5 flex items-baseline justify-between">
-						<label class="text-label font-semibold uppercase tracking-[0.05em] text-[var(--color-text-tertiary)]" for="admin-snapshot-name">Template name</label>
-						<span class="text-meta text-[var(--color-text-muted)]">optional</span>
-					</div>
-					<input
-						id="admin-snapshot-name"
-						type="text"
-						bind:value={snapshotName}
-						disabled={snapshotting}
-						class="w-full rounded-[var(--radius-input)] border border-[var(--color-border)] bg-[var(--color-bg-4)] px-3 py-2 font-mono text-ui text-[var(--color-text-bright)] outline-none placeholder:text-[var(--color-text-muted)] transition-colors duration-150 focus:border-[var(--color-accent)] disabled:opacity-50"
-						placeholder="e.g. python-3.12, node-22-dev"
-						onkeydown={(e) => { if (e.key === 'Enter' && !snapshotting) handleSnapshot(); }}
-					/>
-					<p class="mt-1.5 text-meta text-[var(--color-text-muted)]">Leave blank for an auto-generated name. If the name already exists, it will be overwritten.</p>
-				</div>
-
-				<div class="flex justify-end gap-3 pt-1">
-					<button
-						onclick={() => { showSnapshot = false; }}
-						disabled={snapshotting}
-						class="rounded-[var(--radius-button)] border border-[var(--color-border)] px-4 py-2 text-ui text-[var(--color-text-secondary)] transition-colors duration-150 hover:border-[var(--color-border-mid)] hover:text-[var(--color-text-primary)] disabled:opacity-50"
-					>
-						Cancel
-					</button>
-					<button
-						onclick={handleSnapshot}
-						disabled={snapshotting}
-						class="flex items-center gap-2 rounded-[var(--radius-button)] bg-[var(--color-accent)] px-5 py-2 text-ui font-semibold text-white transition-all duration-150 hover:brightness-115 hover:-translate-y-px active:translate-y-0 disabled:opacity-50 disabled:hover:translate-y-0"
-					>
-						{#if snapshotting}
-							<svg class="animate-spin" width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
-								<path d="M21 12a9 9 0 1 1-6.219-8.56" />
-							</svg>
-							Snapshotting...
-						{:else}
-							Snapshot
-						{/if}
-					</button>
-				</div>
-			</div>
-		</div>
-	</div>
-{/if}
+<SnapshotDialog
+	open={showSnapshot}
+	{capsuleId}
+	onclose={() => { showSnapshot = false; }}
+	onsnapshot={(updated) => { toast.success('Snapshot started'); capsule = updated; }}
+	snapshotFn={snapshotAdminCapsule}
+	title="Snapshot as platform template"
+	label="Template name"
+	placeholder="e.g. python-3.12, node-22-dev"
+	hint="Leave blank for an auto-generated name. Each snapshot needs a unique name."
+	confirmLabel="Snapshot"
+	pendingLabel="Snapshotting..."
+	description={adminSnapshotDescription}
+/>

 <DestroyDialog
 	open={showDestroy}
--- a/frontend/src/routes/admin/templates/+page.svelte
+++ b/frontend/src/routes/admin/templates/+page.svelte
@ -44,7 +44,7 @@
 	let showCreate = $state(false);
 	let createForm = $state({
 		name: '',
-		base_template: 'minimal',
+		base_template: 'minimal-ubuntu',
 		vcpus: 1,
 		memory_mb: 512,
 		recipe: '',
@ -80,7 +80,7 @@
 	const PLATFORM_TEAM_ID = 'team-0000000000000000000000000';

 	function canDeleteTemplate(tmpl: AdminTemplate): boolean {
-		if (tmpl.name === 'minimal') return false;
+		if (tmpl.protected) return false;
 		return tmpl.team_id === PLATFORM_TEAM_ID;
 	}

@ -140,7 +140,7 @@

 		const result = await createBuild({
 			name: createForm.name.trim(),
-			base_template: createForm.base_template.trim() || 'minimal',
+			base_template: createForm.base_template.trim() || 'minimal-ubuntu',
 			recipe: lines,
 			healthcheck: createForm.healthcheck.trim() || undefined,
 			vcpus: createForm.vcpus,
@ -152,7 +152,7 @@

 		if (result.ok) {
 			showCreate = false;
-			createForm = { name: '', base_template: 'minimal', vcpus: 1, memory_mb: 512, recipe: '', healthcheck: '', skip_pre_post: false, run_as_root: false, archive: null };
+			createForm = { name: '', base_template: 'minimal-ubuntu', vcpus: 1, memory_mb: 512, recipe: '', healthcheck: '', skip_pre_post: false, run_as_root: false, archive: null };
 			toast.success('Build queued');
 			goto(`/admin/templates/builds/${result.data.id}`);
 		} else {
@ -246,7 +246,7 @@
 					</p>
 				</div>
 				<button
-					onclick={() => { showCreate = true; createError = null; createForm = { name: '', base_template: 'minimal', vcpus: 1, memory_mb: 512, recipe: '', healthcheck: '', skip_pre_post: false, run_as_root: false, archive: null }; }}
+					onclick={() => { showCreate = true; createError = null; createForm = { name: '', base_template: 'minimal-ubuntu', vcpus: 1, memory_mb: 512, recipe: '', healthcheck: '', skip_pre_post: false, run_as_root: false, archive: null }; }}
 					class="group flex items-center gap-2.5 rounded-[var(--radius-button)] bg-[var(--color-accent)] px-5 py-2.5 text-ui font-semibold text-white shadow-sm transition-all duration-200 hover:shadow-[0_0_20px_var(--color-accent-glow-mid)] hover:brightness-115 hover:-translate-y-px active:translate-y-0"
 				>
 					<svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" stroke-linecap="round" stroke-linejoin="round" class="transition-transform duration-200 group-hover:rotate-90"><line x1="12" y1="5" x2="12" y2="19"/><line x1="5" y1="12" x2="19" y2="12"/></svg>
@ -397,7 +397,7 @@
 		</p>
 		{#if type === 'templates'}
 			<button
-				onclick={() => { showCreate = true; createError = null; createForm = { name: '', base_template: 'minimal', vcpus: 1, memory_mb: 512, recipe: '', healthcheck: '', skip_pre_post: false, run_as_root: false, archive: null }; }}
+				onclick={() => { showCreate = true; createError = null; createForm = { name: '', base_template: 'minimal-ubuntu', vcpus: 1, memory_mb: 512, recipe: '', healthcheck: '', skip_pre_post: false, run_as_root: false, archive: null }; }}
 				class="mt-6 flex items-center gap-2 rounded-[var(--radius-button)] border border-[var(--color-accent)]/30 bg-[var(--color-accent)]/10 px-4 py-2 text-ui font-medium text-[var(--color-accent-bright)] transition-all duration-200 hover:bg-[var(--color-accent)]/20 hover:border-[var(--color-accent)]/50"
 			>
 				<svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" stroke-linecap="round" stroke-linejoin="round"><line x1="12" y1="5" x2="12" y2="19"/><line x1="5" y1="12" x2="19" y2="12"/></svg>
@ -476,7 +476,7 @@
 							<button
 								onclick={() => { deleteTarget = tmpl; deleteError = null; }}
 								disabled={!canDeleteTemplate(tmpl)}
-								title={tmpl.name === 'minimal' ? 'The minimal template cannot be deleted' : !canDeleteTemplate(tmpl) ? 'Cannot delete templates owned by other teams' : undefined}
+								title={tmpl.protected ? 'System base templates cannot be deleted' : !canDeleteTemplate(tmpl) ? 'Cannot delete templates owned by other teams' : undefined}
 								class="rounded-[var(--radius-button)] px-3 py-1.5 text-meta transition-all duration-150 {canDeleteTemplate(tmpl)
 									? 'text-[var(--color-text-tertiary)] hover:bg-[var(--color-red)]/10 hover:text-[var(--color-red)]'
 									: 'text-[var(--color-text-muted)] cursor-not-allowed opacity-40'}"
--- a/frontend/src/routes/dashboard/+layout.svelte
+++ b/frontend/src/routes/dashboard/+layout.svelte
@ -2,7 +2,8 @@
 	import { onMount } from 'svelte';
 	import Sidebar from '$lib/components/Sidebar.svelte';
 	import Toaster from '$lib/components/Toaster.svelte';
-	import { startSSE, stopSSE } from '$lib/sse.svelte';
+	import { startSSE, stopSSE, subscribeSSE } from '$lib/sse.svelte';
+	import { lifecycleToast } from '$lib/lifecycle-toasts';
 	let { children } = $props();

 	let collapsed = $state(
@ -13,7 +14,13 @@

 	onMount(() => {
 		startSSE();
-		return () => stopSSE();
+		// Lifecycle toasts live at the layout so they fire regardless of which
+		// dashboard page is open (and survive navigation between them).
+		const unsubscribe = subscribeSSE(lifecycleToast);
+		return () => {
+			unsubscribe();
+			stopSSE();
+		};
 	});
 </script>

--- a/frontend/src/routes/dashboard/capsules/+page.svelte
+++ b/frontend/src/routes/dashboard/capsules/+page.svelte
@ -395,7 +395,7 @@
 			</div>
 		{:else}
 			{#each filteredCapsules as capsule, i (capsule.id)}
-				{@const isTransient = ['starting', 'resuming', 'pausing', 'stopping'].includes(capsule.status)}
+				{@const isTransient = ['starting', 'resuming', 'pausing', 'snapshotting', 'stopping'].includes(capsule.status)}
 				{@const stripeColor = capsule.status === 'running' ? 'bg-[var(--color-accent)]' : (capsule.status === 'paused' || capsule.status === 'hibernated') ? 'bg-[var(--color-amber)]' : isTransient ? 'bg-[var(--color-blue)]' : 'bg-[var(--color-text-muted)]'}
 				<div
 					class="capsule-row relative grid grid-cols-[1.6fr_0.8fr_0.5fr_0.5fr_0.6fr_1fr_0.9fr] items-center overflow-hidden border-b border-[var(--color-border)] transition-colors duration-150 hover:bg-[var(--color-bg-3)] last:border-b-0 {newCapsuleId === capsule.id ? 'capsule-born' : ''}"
--- a/frontend/src/routes/dashboard/capsules/[id]/+page.svelte
+++ b/frontend/src/routes/dashboard/capsules/[id]/+page.svelte
@ -434,7 +434,7 @@
 			case 'running': return 'var(--color-accent)';
 			case 'paused':  return 'var(--color-amber)';
 			case 'error':   return 'var(--color-red)';
-			case 'starting': case 'resuming': case 'pausing': case 'stopping':
+			case 'starting': case 'resuming': case 'pausing': case 'snapshotting': case 'stopping':
 				return 'var(--color-blue)';
 			default:        return 'var(--color-text-muted)';
 		}
@ -445,7 +445,7 @@
 			case 'running': return 'rgba(94,140,88,0.12)';
 			case 'paused':  return 'rgba(212,167,60,0.12)';
 			case 'error':   return 'rgba(207,129,114,0.12)';
-			case 'starting': case 'resuming': case 'pausing': case 'stopping':
+			case 'starting': case 'resuming': case 'pausing': case 'snapshotting': case 'stopping':
 				return 'rgba(90,159,212,0.12)';
 			default:        return 'rgba(255,255,255,0.05)';
 		}
@ -456,7 +456,7 @@
 			case 'running': return 'rgba(94,140,88,0.3)';
 			case 'paused':  return 'rgba(212,167,60,0.3)';
 			case 'error':   return 'rgba(207,129,114,0.3)';
-			case 'starting': case 'resuming': case 'pausing': case 'stopping':
+			case 'starting': case 'resuming': case 'pausing': case 'snapshotting': case 'stopping':
 				return 'rgba(90,159,212,0.3)';
 			default:        return 'rgba(255,255,255,0.08)';
 		}
--- a/images/build-alpine.sh
+++ b/images/build-alpine.sh
@ -0,0 +1,17 @@
+#!/usr/bin/env bash
+#
+# build-alpine.sh — Build the minimal-alpine system base rootfs (template id 1).
+#
+# Usage: bash images/build-alpine.sh
+
+set -euo pipefail
+source "$(cd "$(dirname "$0")" && pwd)/build-common.sh"
+
+# Alpine is musl-based: the static envd + static tini run fine. bash is added so
+# wrenn-user has a familiar login shell; wrenn-init itself only needs /bin/sh.
+PREP="set -e
+apk add --no-cache socat chrony sudo wget curl ca-certificates git iproute2 tini bash
+adduser -D wrenn-user
+${WRENN_SUDOERS_SETUP}"
+
+build_system_rootfs "alpine:3.22" 1 "${PREP}"
--- a/images/build-arch.sh
+++ b/images/build-arch.sh
@ -0,0 +1,20 @@
+#!/usr/bin/env bash
+#
+# build-arch.sh — Build the minimal-arch system base rootfs (template id 2).
+#
+# Arch is rolling-release; archlinux:base is the minimal base group.
+#
+# Usage: bash images/build-arch.sh
+
+set -euo pipefail
+source "$(cd "$(dirname "$0")" && pwd)/build-common.sh"
+
+# tini is AUR-only on Arch (not in core/extra), so it is not installed here —
+# rootfs-from-container.sh injects the static tini binary instead.
+PREP="set -e
+pacman -Sy --noconfirm --needed socat chrony sudo wget curl ca-certificates git iproute2 inetutils
+useradd -m -s /bin/bash wrenn-user
+${WRENN_SUDOERS_SETUP}
+pacman -Scc --noconfirm || true"
+
+build_system_rootfs "archlinux:base" 2 "${PREP}"
--- a/images/build-common.sh
+++ b/images/build-common.sh
@ -0,0 +1,59 @@
+#!/usr/bin/env bash
+#
+# build-common.sh — shared helpers for building the system base rootfs images.
+#
+# Sourced by images/build-{ubuntu,alpine,arch,fedora}.sh. Each caller defines
+# the distro base image, reserved template ID, and the in-container prep snippet
+# (install packages + create wrenn-user), then calls build_system_rootfs.
+#
+# The same statically-linked envd + tini run on every distro; the per-OS prep
+# only differs in the package manager and the user-creation command.
+
+set -euo pipefail
+
+# base36(all-zeros UUID) = the platform team that owns every system base
+# template. Must match id.PlatformTeamID / id.UUIDToBase36 on the Go side.
+PLATFORM_TEAM_B36="0000000000000000000000000"
+
+# WRENN_SUDOERS_SETUP grants wrenn-user passwordless sudo. Identical on every
+# distro; appended to each prep snippet after the user is created.
+WRENN_SUDOERS_SETUP='echo "wrenn-user ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/wrenn-user && chmod 0440 /etc/sudoers.d/wrenn-user'
+
+# build_system_rootfs <base_image> <template_id_int> <prep_snippet>
+#
+# Spawns a throwaway container from base_image, runs prep_snippet inside it,
+# then exports it to the system base template's on-disk path
+# (images/teams/<platform>/<base36(id)>/rootfs.ext4) via rootfs-from-container.sh.
+build_system_rootfs() {
+    local base_image="$1" template_id="$2" prep="$3"
+    local script_dir project_root container dest tmpl_b36
+
+    script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+    project_root="$(cd "${script_dir}/.." && pwd)"
+    container="wrenn-build-${template_id}-$$"
+
+    # base36(template_id). System IDs are single-digit (0-3), so base36 equals
+    # the decimal digit and the 25-char zero-padded decimal matches what
+    # id.UUIDToBase36 produces for these well-known IDs.
+    tmpl_b36="$(printf '%025d' "${template_id}")"
+    dest="teams/${PLATFORM_TEAM_B36}/${tmpl_b36}"
+
+    echo "==> Pulling ${base_image}..."
+    docker pull "${base_image}"
+
+    echo "==> Preparing container ${container}..."
+    docker rm -f "${container}" >/dev/null 2>&1 || true
+
+    # Arm cleanup before starting the container so a failed run still removes it.
+    # Expand the name into the trap now: it must survive after this function's
+    # locals go out of scope (set -u would error on a stale reference otherwise).
+    trap "docker rm -f '${container}' >/dev/null 2>&1 || true" EXIT
+
+    docker run --name "${container}" "${base_image}" /bin/sh -c "${prep}"
+
+    # Run the exporter as the normal user, NOT under sudo: it builds envd via
+    # `make build-envd` (needs cargo on the user's PATH) and uses sudo itself
+    # for the privileged mount/mkfs/copy steps.
+    echo "==> Exporting to images/${dest}/rootfs.ext4..."
+    bash "${project_root}/scripts/rootfs-from-container.sh" "${container}" "${dest}"
+}
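The comment in `build_system_rootfs` relies on base36 and decimal agreeing for single-digit template IDs. Below is a minimal sketch that checks this assumption in isolation. The helper names `to_base36` and `pad25` are hypothetical and exist only for illustration; they are not part of the repository, and the Go-side `id.UUIDToBase36` is not reproduced here.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: render a non-negative integer in base36 (0-9, a-z).
to_base36() {
    local n="$1" digits="0123456789abcdefghijklmnopqrstuvwxyz" out=""
    if [ "${n}" -eq 0 ]; then
        echo "0"
        return
    fi
    while [ "${n}" -gt 0 ]; do
        out="${digits:$((n % 36)):1}${out}"
        n=$((n / 36))
    done
    echo "${out}"
}

# Hypothetical helper: left-pad a string with zeros to 25 characters.
pad25() {
    local s="$1"
    while [ "${#s}" -lt 25 ]; do
        s="0${s}"
    done
    echo "${s}"
}

for id in 0 1 2 3; do
    # For single-digit IDs both renderings give the same directory name.
    [ "$(pad25 "$(to_base36 "${id}")")" = "$(printf '%025d' "${id}")" ]
done

# From 10 upward they diverge: base36(10) is "a", while decimal is "10".
[ "$(pad25 "$(to_base36 10)")" != "$(printf '%025d' 10)" ]
echo "ok"
```

If a system template ID of 10 or higher were ever reserved, the `printf '%025d'` shortcut would presumably need to be replaced by a real base36 conversion along these lines.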
--- a/images/build-fedora.sh
+++ b/images/build-fedora.sh
@ -0,0 +1,19 @@
+#!/usr/bin/env bash
+#
+# build-fedora.sh — Build the minimal-fedora system base rootfs (template id 3).
+#
+# Usage: bash images/build-fedora.sh
+
+set -euo pipefail
+source "$(cd "$(dirname "$0")" && pwd)/build-common.sh"
+
+# Fedora's iproute package provides `ip` (no "2" suffix, unlike Debian/Arch).
+PREP="set -e
+# install_weak_deps=False keeps the image lean. The guest never runs systemd:
+# PID 1 is wrenn-init -> tini -> envd.
+dnf install -y --setopt=install_weak_deps=False socat chrony sudo wget curl ca-certificates git iproute hostname tini
+useradd -m -s /bin/bash wrenn-user
+${WRENN_SUDOERS_SETUP}
+dnf clean all"
+
+build_system_rootfs "fedora:45" 3 "${PREP}"
--- a/images/build-rootfs.sh
+++ b/images/build-rootfs.sh
--- a/images/build-ubuntu.sh
+++ b/images/build-ubuntu.sh
@ -0,0 +1,25 @@
+#!/usr/bin/env bash
+#
+# build-ubuntu.sh — Build the minimal-ubuntu system base rootfs (template id 0).
+#
+# Usage: bash images/build-ubuntu.sh
+
+set -euo pipefail
+source "$(cd "$(dirname "$0")" && pwd)/build-common.sh"
+
+PREP="set -e
+export DEBIAN_FRONTEND=noninteractive
+apt-get update
+# --no-install-recommends keeps the image lean (avoids pulling systemd-adjacent
+# recommends). The guest never runs systemd: PID 1 is wrenn-init -> tini -> envd.
+apt-get install -y --no-install-recommends socat chrony sudo wget curl ca-certificates git iproute2 hostname tini
+# Remove the stock 'ubuntu' user (uid 1000) shipped by the base image; it is
+# replaced by wrenn-user. Also drop its cloud-init sudoers drop-in.
+userdel -r ubuntu 2>/dev/null || true
+rm -f /etc/sudoers.d/90-cloud-init-users
+useradd -m -s /bin/bash wrenn-user
+${WRENN_SUDOERS_SETUP}
+apt-get clean
+rm -rf /var/lib/apt/lists/*"
+
+build_system_rootfs "ubuntu:26.04" 0 "${PREP}"
--- a/images/docker-to-rootfs.sh
+++ b/images/docker-to-rootfs.sh
--- a/images/templates/minimal/build.sh
+++ b/images/templates/minimal/build.sh
--- a/images/templates/node20/build.sh
+++ b/images/templates/node20/build.sh
--- a/images/templates/python312/build.sh
+++ b/images/templates/python312/build.sh
--- a/images/wrenn-init.sh
+++ b/images/wrenn-init.sh
@ -23,9 +23,11 @@ echo "+cpu +memory +io" > /sys/fs/cgroup/cgroup.subtree_control 2>/dev/null || t
 { echo 0 > /sys/block/vda/queue/write_zeroes_max_bytes; } 2>/dev/null || true
 { echo 0 > /sys/block/vda/queue/discard_max_bytes; } 2>/dev/null || true

-# Set hostname and make it resolvable (sudo requires this).
-hostname capsule
-echo "127.0.0.1 capsule" >> /etc/hosts
+# Set hostname and make it resolvable (sudo requires this). Use the kernel knob
+# directly so we don't depend on the `hostname` binary, which is absent from
+# minimal Arch/Fedora images. Guard so a failure never aborts init under set -e.
+echo capsule > /proc/sys/kernel/hostname 2>/dev/null || hostname capsule 2>/dev/null || true
+echo "127.0.0.1 capsule" >> /etc/hosts 2>/dev/null || true

 # Configure networking if the kernel ip= boot arg did not already set it up.
 if ! ip addr show eth0 2>/dev/null | grep -q "169.254.0.21"; then
@ -35,9 +37,14 @@ if ! ip addr show eth0 2>/dev/null | grep -q "169.254.0.21"; then
    ip route add default via 169.254.0.22 2>/dev/null || true
 fi

-# Configure DNS resolver.
-echo "nameserver 8.8.8.8" > /etc/resolv.conf
-echo "nameserver 8.8.4.4" >> /etc/resolv.conf
+# Configure DNS resolver. Drop any existing symlink first — on some distros
+# (e.g. Fedora) /etc/resolv.conf is a dangling symlink into systemd-resolved,
+# and writing through it would fail and abort init under set -e.
+rm -f /etc/resolv.conf 2>/dev/null || true
+{
+    echo "nameserver 8.8.8.8"
+    echo "nameserver 8.8.4.4"
+} > /etc/resolv.conf 2>/dev/null || true

 # Set a standard PATH so envd and all child processes can find common binaries.
 export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
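The resolv.conf change above can be tried outside a guest. This is a minimal sketch, assuming a temporary directory stands in for `/etc`; the function name `write_resolv_conf` and the simulated symlink target are hypothetical. It shows that removing the path first replaces a dangling symlink with a regular file instead of failing on the write.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper mirroring the init step, parameterised on the target path.
write_resolv_conf() {
    local target="$1"
    rm -f "${target}" 2>/dev/null || true
    {
        echo "nameserver 8.8.8.8"
        echo "nameserver 8.8.4.4"
    } > "${target}" 2>/dev/null || true
}

tmp="$(mktemp -d)"
trap 'rm -rf "${tmp}"' EXIT

# Simulate a dangling symlink: the target's parent directory does not exist.
ln -s "${tmp}/run/systemd/resolve/stub-resolv.conf" "${tmp}/resolv.conf"

write_resolv_conf "${tmp}/resolv.conf"

# The symlink is gone and a regular file with both nameservers is in place.
[ ! -L "${tmp}/resolv.conf" ]
[ "$(wc -l < "${tmp}/resolv.conf" | tr -d ' ')" = "2" ]
echo "ok"
```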
--- a/internal/api/handlers_admin_capsules.go
+++ b/internal/api/handlers_admin_capsules.go
@ -137,22 +137,15 @@ func (h *adminCapsuleHandler) Snapshot(w http.ResponseWriter, r *http.Request) {
 		}
 	}

-	tmpl, err := h.svc.CreateSnapshot(r.Context(), sandboxID, id.PlatformTeamID, req.Name)
-	ac := auth.MustFromContext(r.Context())
-	ac.TeamID = id.PlatformTeamID
-	name := req.Name
-	if err == nil {
-		name = tmpl.Name
-	}
-	h.audit.LogSnapshotCreate(r.Context(), ac, name, err)
+	sb, name, err := h.svc.CreateSnapshot(r.Context(), sandboxID, id.PlatformTeamID, req.Name)
 	if err != nil {
-		if name != "" {
-			h.audit.LogSnapshotDeleteSystem(r.Context(), id.PlatformTeamID, name, "cleanup_after_create_error", nil)
-		}
 		status, code, msg := serviceErrToHTTP(err)
 		writeError(w, status, code, msg)
 		return
 	}
+	ac := auth.MustFromContext(r.Context())
+	ac.TeamID = id.PlatformTeamID
+	h.audit.LogSnapshotCreateRequested(r.Context(), ac, name)

-	writeJSON(w, http.StatusCreated, templateToResponse(tmpl))
+	writeJSON(w, http.StatusAccepted, sandboxToResponse(sb))
 }
--- a/internal/api/handlers_builds.go
+++ b/internal/api/handlers_builds.go
@ -246,6 +246,7 @@ func (h *buildHandler) ListTemplates(w http.ResponseWriter, r *http.Request) {
 		SizeBytes int64  `json:"size_bytes"`
 		TeamID    string `json:"team_id"`
 		CreatedAt string `json:"created_at"`
+		Protected bool   `json:"protected"`
 	}

 	resp := make([]templateResponse, len(templates))
@ -257,6 +258,7 @@ func (h *buildHandler) ListTemplates(w http.ResponseWriter, r *http.Request) {
 			MemoryMB:  t.MemoryMb,
 			SizeBytes: t.SizeBytes,
 			TeamID:    id.FormatTeamID(t.TeamID),
+			Protected: layout.IsSystemTemplate(t.TeamID, t.ID),
 		}
 		if t.CreatedAt.Valid {
 			resp[i].CreatedAt = t.CreatedAt.Time.Format(time.RFC3339)
@ -280,8 +282,8 @@ func (h *buildHandler) DeleteTemplate(w http.ResponseWriter, r *http.Request) {
 		writeError(w, http.StatusNotFound, "not_found", "template not found")
 		return
 	}
-	if layout.IsMinimal(tmpl.TeamID, tmpl.ID) {
-		writeError(w, http.StatusForbidden, "forbidden", "the minimal template cannot be deleted")
+	if layout.IsSystemTemplate(tmpl.TeamID, tmpl.ID) {
+		writeError(w, http.StatusForbidden, "forbidden", "system base templates cannot be deleted")
 		return
 	}

--- a/internal/api/handlers_sandbox_events.go
+++ b/internal/api/handlers_sandbox_events.go
@ -158,6 +158,11 @@ func (h *sandboxEventHandler) verbForFailure(ctx context.Context, sandboxID pgty
 		return events.CapsuleResume
 	case "pausing":
 		return events.CapsulePause
+	case "snapshotting":
+		// A snapshot pauses then resumes the VM; a host-side failure leaves the
+		// sandbox errored, not destroyed. Route through CapsuleCreate so the
+		// consumer's handleFailed marks it "error" rather than removing the row.
+		return events.CapsuleCreate
 	default:
 		return events.CapsuleDestroy
 	}
--- a/internal/api/handlers_snapshots.go
+++ b/internal/api/handlers_snapshots.go
@ -79,6 +79,7 @@ type snapshotResponse struct {
 	SizeBytes int64             `json:"size_bytes"`
 	CreatedAt string            `json:"created_at"`
 	Platform  bool              `json:"platform"`
+	Protected bool              `json:"protected"`
 	Metadata  map[string]string `json:"metadata,omitempty"`
 }

@ -88,6 +89,7 @@ func templateToResponse(t db.Template) snapshotResponse {
 		Type:      t.Type,
 		SizeBytes: t.SizeBytes,
 		Platform:  t.TeamID == id.PlatformTeamID,
+		Protected: layout.IsSystemTemplate(t.TeamID, t.ID),
 	}
 	if t.Vcpus != 0 {
 		resp.VCPUs = &t.Vcpus
@ -112,8 +114,8 @@ type createSnapshotRequest struct {
 	Name      string `json:"name"`
 }

-// Create handles POST /v1/snapshots. Takes a live snapshot of a running
-// sandbox and registers the result as a new template.
+// Create handles POST /v1/snapshots. Snapshots a running or paused sandbox and
+// registers the result as a new template.
 func (h *snapshotHandler) Create(w http.ResponseWriter, r *http.Request) {
 	var req createSnapshotRequest
 	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
@ -131,22 +133,18 @@ func (h *snapshotHandler) Create(w http.ResponseWriter, r *http.Request) {
 	}
 	ac := auth.MustFromContext(r.Context())

-	tmpl, err := h.sandboxSvc.CreateSnapshot(r.Context(), sandboxID, ac.TeamID, req.Name)
-	name := req.Name
-	if err == nil {
-		name = tmpl.Name
-	}
-	h.audit.LogSnapshotCreate(r.Context(), ac, name, err)
+	// Async: the sandbox enters a "snapshotting" state, then returns to its prior
+	// state. The template is registered by a background goroutine; clients learn
+	// of the result via the SSE template.snapshot.create event (or by polling).
+	sb, name, err := h.sandboxSvc.CreateSnapshot(r.Context(), sandboxID, ac.TeamID, req.Name)
 	if err != nil {
-		if name != "" {
-			h.audit.LogSnapshotDeleteSystem(r.Context(), ac.TeamID, name, "cleanup_after_create_error", nil)
-		}
 		status, code, msg := serviceErrToHTTP(err)
 		writeError(w, status, code, msg)
 		return
 	}
+	h.audit.LogSnapshotCreateRequested(r.Context(), ac, name)

-	writeJSON(w, http.StatusCreated, templateToResponse(tmpl))
+	writeJSON(w, http.StatusAccepted, sandboxToResponse(sb))
 }

 // List handles GET /v1/snapshots.
@ -188,8 +186,8 @@ func (h *snapshotHandler) Delete(w http.ResponseWriter, r *http.Request) {
 		writeError(w, http.StatusForbidden, "forbidden", "platform templates cannot be deleted here")
 		return
 	}
-	if layout.IsMinimal(tmpl.TeamID, tmpl.ID) {
-		writeError(w, http.StatusForbidden, "forbidden", "the minimal template cannot be deleted")
+	if layout.IsSystemTemplate(tmpl.TeamID, tmpl.ID) {
+		writeError(w, http.StatusForbidden, "forbidden", "system base templates cannot be deleted")
 		return
 	}

--- a/internal/api/host_monitor.go
+++ b/internal/api/host_monitor.go
@ -32,6 +32,14 @@ const unreachableThreshold = 90 * time.Second
 // that may not have registered the sandbox on the host agent yet.
 const transientGracePeriod = 2 * time.Minute

+// snapshotGracePeriod is the grace for a sandbox stuck in "snapshotting" while
+// the VM is still alive on the host. Snapshots dump guest RAM and flatten the
+// rootfs, which can run for minutes on large sandboxes, and the agent reports
+// the VM as alive throughout — so we must not race the in-flight operation.
+// It exceeds the background goroutine's 10-minute deadline, so reaching it
+// means the control plane crashed mid-snapshot and the sandbox needs recovery.
+const snapshotGracePeriod = 15 * time.Minute
+
 // HostMonitor runs on a fixed interval and performs two duties:
 //
 //  1. Passive check: marks hosts whose last_heartbeat_at is stale as
@ -350,7 +358,7 @@ func (m *HostMonitor) checkHost(ctx context.Context, host db.Host) {

 	transientSandboxes, err := m.db.ListSandboxesByHostAndStatus(ctx, db.ListSandboxesByHostAndStatusParams{
 		HostID:  host.ID,
-		Column2: []string{"starting", "resuming", "pausing", "stopping"},
+		Column2: []string{"starting", "resuming", "pausing", "stopping", "snapshotting"},
 	})
 	if err != nil {
 		slog.Warn("host monitor: failed to list transient sandboxes", "host_id", id.FormatHostID(host.ID), "error", err)
@ -359,7 +367,7 @@ func (m *HostMonitor) checkHost(ctx context.Context, host db.Host) {

 	for _, sb := range transientSandboxes {
 		sbIDStr := id.FormatSandboxID(sb.ID)
-		if _, ok := aliveStatus[sbIDStr]; ok {
+		if agentStatus, ok := aliveStatus[sbIDStr]; ok {
 			// Sandbox is alive on host — the background goroutine should
 			// finalize the transition. For starting/resuming, if the sandbox
 			// is alive it means creation/resume succeeded.
@ -370,6 +378,26 @@ func (m *HostMonitor) checkHost(ctx context.Context, host db.Host) {
 					slog.Info("host monitor: promoted transient sandbox to running", "sandbox_id", sbIDStr, "from", sb.Status)
 				}
 			}
+			// A snapshot keeps the source sandbox alive throughout, so an alive
+			// sandbox does NOT mean the snapshot finished. Only recover it once
+			// it has been stuck past the snapshot grace period (i.e. the CP
+			// crashed mid-op). Recover to the sandbox's actual host-side status:
+			// a running sandbox is snapshotted live and stays running, but a
+			// paused sandbox is snapshotted from disk and must return to paused.
+			if sb.Status == "snapshotting" &&
+				sb.LastUpdated.Valid && time.Since(sb.LastUpdated.Time) >= snapshotGracePeriod {
+				recoverTo := agentStatus
+				if recoverTo != "running" && recoverTo != "paused" {
+					// Coerced/unknown agent label — default to running.
+					recoverTo = "running"
+				}
+				if _, err := m.db.UpdateSandboxStatusIf(ctx, db.UpdateSandboxStatusIfParams{
+					ID: sb.ID, Status: "snapshotting", Status_2: recoverTo,
+				}); err == nil {
+					slog.Info("host monitor: recovered stuck snapshotting sandbox", "sandbox_id", sbIDStr, "to", recoverTo)
+					m.audit.LogSnapshotCreateSystem(ctx, sb.TeamID, sb.ID, "snapshot_recovered", nil)
+				}
+			}
 			continue
 		}
 		// Sandbox is not alive on host. If the transition is recent, give the
@ -390,6 +418,9 @@ func (m *HostMonitor) checkHost(ctx context.Context, host db.Host) {
 			finalStatus = "paused"
 		case "stopping":
 			finalStatus = "stopped"
+		case "snapshotting":
+			// VM is gone but DB says snapshotting → the snapshot died with the VM.
+			finalStatus = "error"
 		}
 		fromStatus := sb.Status
 		if _, err := m.db.UpdateSandboxStatusIf(ctx, db.UpdateSandboxStatusIfParams{
@ -405,6 +436,9 @@ func (m *HostMonitor) checkHost(ctx context.Context, host db.Host) {
 			case "pausing":
 				// Pause assumed to have succeeded host-side; emit success with inferred metadata.
 				m.audit.LogSandboxAutoPause(ctx, sb.TeamID, sb.ID, "transient_timeout_inferred", nil)
+			case "snapshotting":
+				// VM gone mid-snapshot; the sandbox is errored.
+				m.audit.LogSnapshotCreateSystem(ctx, sb.TeamID, sb.ID, "transient_timeout", inferredErr)
 			case "stopping":
 				m.audit.LogSandboxDestroySystem(ctx, sb.TeamID, sb.ID, "transient_timeout_inferred", nil)
 			}
--- a/internal/api/openapi.yaml
+++ b/internal/api/openapi.yaml
@ -1421,10 +1421,19 @@ paths:
        - apiKeyAuth: []
        - sessionAuth: []
      description: |
-        Live snapshot: briefly pauses the capsule, writes its VM state +
-        memory + flattened rootfs to a new template directory, then resumes
-        the capsule. The source capsule keeps running after the snapshot;
-        the resulting template can be used to create new capsules.
+        Snapshot a capsule, processed asynchronously. The call returns
+        immediately with the capsule in the `snapshotting` state; the capsule
+        returns to its original state on completion. The capsule must be
+        `running` or `paused`.
+
+        A `running` capsule is snapshotted live: it briefly pauses while its VM
+        state + memory + flattened rootfs are written to a new template, then
+        resumes to `running`. A `paused` capsule is snapshotted directly from
+        its on-disk state without reviving the VM, and stays `paused`.
+
+        Because it is async, the response does NOT contain the template. Watch
+        for the `template.snapshot.create` SSE event (its `outcome` reports
+        success or failure) or poll `GET /v1/snapshots` to observe completion.

        Snapshots are immutable: each call must use a fresh name. Re-using
        an existing name returns 409 Conflict.
@ -1435,14 +1444,14 @@ paths:
            schema:
              $ref: "#/components/schemas/CreateSnapshotRequest"
      responses:
-        "201":
-          description: Snapshot created
+        "202":
+          description: Snapshot accepted; capsule is now snapshotting
          content:
            application/json:
              schema:
-                $ref: "#/components/schemas/Template"
+                $ref: "#/components/schemas/Capsule"
        "409":
-          description: Name already exists or capsule not running
+          description: Name already exists, or capsule is not running or paused
          content:
            application/json:
              schema:
@ -2813,7 +2822,7 @@ paths:
              schema:
                type: array
                items:
-                  $ref: "#/components/schemas/Template"
+                  $ref: "#/components/schemas/AdminTemplate"

  /v1/admin/templates/{name}:
    delete:
@ -2989,6 +2998,10 @@ paths:
      summary: Create snapshot from any capsule (admin)
      operationId: adminCreateSnapshotFromCapsule
      tags: [admin]
+      description: |
+        Snapshots a `running` or `paused` capsule into a platform template,
+        processed asynchronously (see `POST /v1/snapshots`). A running capsule
+        resumes to `running`; a paused capsule stays `paused`.
      security:
        - sessionAuth: []
      parameters:
@ -2997,21 +3010,22 @@ paths:
          required: true
          schema: {type: string}
      requestBody:
-        required: true
+        required: false
        content:
          application/json:
            schema:
              type: object
-              required: [name]
              properties:
-                name: {type: string}
+                name:
+                  type: string
+                  description: Optional; an auto-generated name is used when omitted.
      responses:
-        "201":
-          description: Snapshot created
+        "202":
+          description: Snapshot accepted; capsule is now snapshotting
          content:
            application/json:
              schema:
-                $ref: "#/components/schemas/Template"
+                $ref: "#/components/schemas/Capsule"

  /v1/admin/capsules/{id}/exec:
    parameters:
@ -3506,7 +3520,7 @@ components:
      properties:
        template:
          type: string
-          default: minimal
+          default: minimal-ubuntu
        vcpus:
          type: integer
          default: 1
@ -3610,7 +3624,7 @@ components:
          type: string
        status:
          type: string
-          enum: [pending, starting, running, pausing, paused, resuming, stopping, hibernated, stopped, missing, error]
+          enum: [pending, starting, running, pausing, paused, snapshotting, resuming, stopping, hibernated, stopped, missing, error]
        template:
          type: string
        vcpus:
@ -3684,13 +3698,51 @@ components:
          type: boolean
          description: |
            True when the template is platform-managed (visible to all teams,
-            e.g. the built-in `minimal` rootfs). False for team-owned
+            e.g. the built-in `minimal-ubuntu` rootfs). False for team-owned
            snapshot templates.
+        protected:
+          type: boolean
+          description: |
+            True for built-in system base templates (minimal-ubuntu,
+            minimal-alpine, minimal-arch, minimal-fedora). Protected templates
+            cannot be deleted.
        metadata:
          type: object
          additionalProperties: {type: string}
          nullable: true

+    AdminTemplate:
+      type: object
+      description: |
+        Template as returned by the admin templates list. Unlike `Template`
+        (the team-facing snapshot shape), this includes the owning `team_id`
+        and omits `platform`/`metadata`.
+      properties:
+        name:
+          type: string
+        type:
+          type: string
+          enum: [base, snapshot]
+        vcpus:
+          type: integer
+        memory_mb:
+          type: integer
+        size_bytes:
+          type: integer
+          format: int64
+        team_id:
+          type: string
+          description: Owning team ID (formatted, e.g. `team-…`). Platform team for global templates.
+        created_at:
+          type: string
+          format: date-time
+        protected:
+          type: boolean
+          description: |
+            True for built-in system base templates (minimal-ubuntu,
+            minimal-alpine, minimal-arch, minimal-fedora). Protected templates
+            cannot be deleted.
+
    ExecRequest:
      type: object
      required: [cmd]
--- a/internal/api/sandbox_event_consumer.go
+++ b/internal/api/sandbox_event_consumer.go
@ -266,7 +266,7 @@ func (c *SandboxEventConsumer) handleStopped(ctx context.Context, sandboxID pgty
 // audit.Log writes the row only — it does NOT republish an event, which would
 // loop back into this consumer. Do not switch to LogSandboxCreateSystem here.
 func (c *SandboxEventConsumer) handleFailed(ctx context.Context, sandboxID pgtype.UUID, event events.Event) {
-	for _, fromStatus := range []string{"running", "starting", "pausing", "resuming"} {
+	for _, fromStatus := range []string{"running", "starting", "pausing", "resuming", "snapshotting"} {
 		if _, err := c.db.UpdateSandboxStatusIf(ctx, db.UpdateSandboxStatusIfParams{
 			ID: sandboxID, Status: fromStatus, Status_2: "error",
 		}); err == nil {
--- a/internal/api/server.go
+++ b/internal/api/server.go
@ -83,7 +83,13 @@ func New(
 	sandboxSvc := &service.SandboxService{DB: queries, Pool: pool, Scheduler: sched}
 	sandboxSvc.PublishEvent = func(ctx context.Context, event service.SandboxStateEvent) {
 		if evt, ok := serviceEventToCanonical(event); ok {
-			eventPub.Publish(ctx, evt)
+			// State-change events are ephemeral UI signals — mirror them to the
+			// dashboard via Pub/Sub only, never to durable channel subscribers.
+			if evt.Event == events.CapsuleStateChanged {
+				eventPub.PublishTransient(ctx, evt)
+			} else {
+				eventPub.Publish(ctx, evt)
+			}
 		}
 	}
 	apiKeySvc := &service.APIKeyService{DB: queries}
@ -482,6 +488,39 @@ func serviceEventToCanonical(e service.SandboxStateEvent) (events.Event, bool) {
 		eventType = events.CapsuleCreate
 		outcome = events.OutcomeError
 		metadata = map[string]string{"reason": "create_failed"}
+	case "sandbox.snapshotted":
+		// Completion of an async snapshot. The resource is the template name,
+		// not the sandbox, so the dashboard's snapshot list refreshes.
+		return events.Event{
+			Event:     events.SnapshotCreate,
+			Outcome:   events.OutcomeSuccess,
+			Timestamp: events.Now(),
+			TeamID:    e.TeamID,
+			Actor:     events.SystemActor(),
+			Resource:  events.Resource{ID: e.Metadata["name"], Type: "snapshot"},
+		}, true
+	case "sandbox.snapshot_failed":
+		return events.Event{
+			Event:     events.SnapshotCreate,
+			Outcome:   events.OutcomeError,
+			Timestamp: events.Now(),
+			TeamID:    e.TeamID,
+			Actor:     events.SystemActor(),
+			Resource:  events.Resource{ID: e.Metadata["name"], Type: "snapshot"},
+			Metadata:  map[string]string{"reason": "snapshot_failed"},
+			Error:     e.Error,
+		}, true
+	case "sandbox.state_changed":
+		// Transient badge transition with no terminal verb of its own. Carries
+		// from/to in metadata; routed via Pub/Sub only by the caller.
+		return events.Event{
+			Event:     events.CapsuleStateChanged,
+			Timestamp: events.Now(),
+			TeamID:    e.TeamID,
+			Actor:     events.SystemActor(),
+			Resource:  events.Resource{ID: e.SandboxID, Type: "sandbox"},
+			Metadata:  e.Metadata,
+		}, true
 	default:
 		return events.Event{}, false
 	}
--- a/internal/layout/layout.go
+++ b/internal/layout/layout.go
@ -15,20 +15,19 @@ import (

 func timeNowNano() int64 { return time.Now().UnixNano() }

-// IsMinimal reports whether the given team and template IDs represent the
-// built-in "minimal" template (both all-zeros).
-func IsMinimal(teamID, templateID pgtype.UUID) bool {
-	return teamID.Bytes == id.PlatformTeamID.Bytes && templateID.Bytes == id.MinimalTemplateID.Bytes
+// IsSystemTemplate reports whether the given team and template IDs represent a
+// built-in system base template (minimal-ubuntu / -alpine / -arch / -fedora):
+// platform-owned with a template ID in the reserved range. System templates are
+// protected from deletion.
+func IsSystemTemplate(teamID, templateID pgtype.UUID) bool {
+	return teamID.Bytes == id.PlatformTeamID.Bytes && id.IsReservedTemplateID(templateID)
 }

-// TemplateDir returns the on-disk directory for a template.
+// TemplateDir returns the on-disk directory for a template. Every template —
+// including the built-in system base templates — lives under the teams tree:
 //
-//	minimal (zeros, zeros): {wrennDir}/images/minimal
-//	all others:             {wrennDir}/images/teams/{base36(teamID)}/{base36(templateID)}
+//	{wrennDir}/images/teams/{base36(teamID)}/{base36(templateID)}
 func TemplateDir(wrennDir string, teamID, templateID pgtype.UUID) string {
-	if IsMinimal(teamID, templateID) {
-		return filepath.Join(wrennDir, "images", "minimal")
-	}
 	return filepath.Join(wrennDir, "images", "teams",
 		id.UUIDToBase36(teamID.Bytes),
 		id.UUIDToBase36(templateID.Bytes))
--- a/internal/layout/layout_test.go
+++ b/internal/layout/layout_test.go
@ -9,7 +9,7 @@ import (
 	"git.omukk.dev/wrenn/wrenn/pkg/id"
 )

-func TestIsMinimal(t *testing.T) {
+func TestIsSystemTemplate(t *testing.T) {
 	tests := []struct {
 		name       string
 		teamID     pgtype.UUID
@ -17,35 +17,41 @@ func TestIsMinimal(t *testing.T) {
 		want       bool
 	}{
 		{
-			name:       "both zeros",
+			name:       "ubuntu (zeros, zeros)",
 			teamID:     id.PlatformTeamID,
-			templateID: id.MinimalTemplateID,
+			templateID: id.UbuntuTemplateID,
 			want:       true,
 		},
 		{
-			name:       "non-zero team",
-			teamID:     pgtype.UUID{Bytes: [16]byte{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1}, Valid: true},
-			templateID: id.MinimalTemplateID,
-			want:       false,
-		},
-		{
-			name:       "non-zero template",
+			name:       "fedora (platform, id 3)",
 			teamID:     id.PlatformTeamID,
-			templateID: pgtype.UUID{Bytes: [16]byte{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1}, Valid: true},
+			templateID: id.FedoraTemplateID,
+			want:       true,
+		},
+		{
+			name:       "platform, max reserved id",
+			teamID:     id.PlatformTeamID,
+			templateID: pgtype.UUID{Bytes: [16]byte{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0x04, 0x00}, Valid: true}, // 1024
+			want:       true,
+		},
+		{
+			name:       "platform, above reserved range",
+			teamID:     id.PlatformTeamID,
+			templateID: pgtype.UUID{Bytes: [16]byte{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0x04, 0x01}, Valid: true}, // 1025
 			want:       false,
 		},
 		{
-			name:       "both non-zero",
-			teamID:     pgtype.UUID{Bytes: [16]byte{1}, Valid: true},
-			templateID: pgtype.UUID{Bytes: [16]byte{2}, Valid: true},
+			name:       "non-platform team, reserved id",
+			teamID:     pgtype.UUID{Bytes: [16]byte{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1}, Valid: true},
+			templateID: id.UbuntuTemplateID,
 			want:       false,
 		},
 	}

 	for _, tt := range tests {
 		t.Run(tt.name, func(t *testing.T) {
-			if got := IsMinimal(tt.teamID, tt.templateID); got != tt.want {
-				t.Errorf("IsMinimal() = %v, want %v", got, tt.want)
+			if got := IsSystemTemplate(tt.teamID, tt.templateID); got != tt.want {
+				t.Errorf("IsSystemTemplate() = %v, want %v", got, tt.want)
 			}
 		})
 	}
@ -54,9 +60,11 @@ func TestIsMinimal(t *testing.T) {
 func TestTemplateDir(t *testing.T) {
 	wrennDir := "/var/lib/wrenn"

-	t.Run("minimal", func(t *testing.T) {
-		got := TemplateDir(wrennDir, id.PlatformTeamID, id.MinimalTemplateID)
-		want := filepath.Join(wrennDir, "images", "minimal")
+	t.Run("system base template (ubuntu) lives under teams", func(t *testing.T) {
+		got := TemplateDir(wrennDir, id.PlatformTeamID, id.UbuntuTemplateID)
+		want := filepath.Join(wrennDir, "images", "teams",
+			id.UUIDToBase36(id.PlatformTeamID.Bytes),
+			id.UUIDToBase36(id.UbuntuTemplateID.Bytes))
 		if got != want {
 			t.Errorf("TemplateDir() = %q, want %q", got, want)
 		}
@ -88,8 +96,11 @@ func TestTemplateDir(t *testing.T) {

 func TestTemplateRootfs(t *testing.T) {
 	wrennDir := "/var/lib/wrenn"
-	got := TemplateRootfs(wrennDir, id.PlatformTeamID, id.MinimalTemplateID)
-	want := filepath.Join(wrennDir, "images", "minimal", "rootfs.ext4")
+	got := TemplateRootfs(wrennDir, id.PlatformTeamID, id.UbuntuTemplateID)
+	want := filepath.Join(wrennDir, "images", "teams",
+		id.UUIDToBase36(id.PlatformTeamID.Bytes),
+		id.UUIDToBase36(id.UbuntuTemplateID.Bytes),
+		"rootfs.ext4")
 	if got != want {
 		t.Errorf("TemplateRootfs() = %q, want %q", got, want)
 	}
--- a/internal/models/sandbox.go
+++ b/internal/models/sandbox.go
@ -9,12 +9,13 @@ import (
 type SandboxStatus string

 const (
-	StatusPending SandboxStatus = "pending"
-	StatusRunning SandboxStatus = "running"
-	StatusPausing SandboxStatus = "pausing"
-	StatusPaused  SandboxStatus = "paused"
-	StatusStopped SandboxStatus = "stopped"
-	StatusError   SandboxStatus = "error"
+	StatusPending      SandboxStatus = "pending"
+	StatusRunning      SandboxStatus = "running"
+	StatusPausing      SandboxStatus = "pausing"
+	StatusPaused       SandboxStatus = "paused"
+	StatusSnapshotting SandboxStatus = "snapshotting"
+	StatusStopped      SandboxStatus = "stopped"
+	StatusError        SandboxStatus = "error"
 )

 // Sandbox holds all state for a running sandbox on this host.
--- a/internal/sandbox/images.go
+++ b/internal/sandbox/images.go
@ -9,6 +9,8 @@ import (
 	"strconv"
 	"strings"

+	"github.com/jackc/pgx/v5/pgtype"
+
 	"git.omukk.dev/wrenn/wrenn/internal/layout"
 	"git.omukk.dev/wrenn/wrenn/pkg/id"
 )
@ -29,13 +31,9 @@ func EnsureImageSizes(wrennDir string, targetMB int) error {
 	}
 	targetBytes := int64(targetMB) * 1024 * 1024

-	// Expand the built-in minimal image.
-	minimalRootfs := layout.TemplateRootfs(wrennDir, id.PlatformTeamID, id.MinimalTemplateID)
-	if err := expandImage(minimalRootfs, targetBytes, targetMB); err != nil {
-		return err
-	}
-
-	// Walk teams/{teamDir}/{templateDir}/rootfs.ext4 two levels deep.
+	// Walk teams/{teamDir}/{templateDir}/rootfs.ext4 two levels deep. The
+	// built-in system base templates live under teams/{base36(0)}/... so this
+	// covers them too.
 	teamsDir := layout.TeamsDir(wrennDir)
 	teamEntries, err := os.ReadDir(teamsDir)
 	if err != nil {
@ -104,12 +102,19 @@ func ParseSizeToMB(s string) (int, error) {
 	}
 }

-// ShrinkMinimalImage shrinks the built-in minimal rootfs back to its minimum
-// size using resize2fs -M. This is the inverse of EnsureImageSizes and should
-// be called during graceful shutdown so the image is stored compactly on disk.
-func ShrinkMinimalImage(wrennDir string) {
-	minimalRootfs := layout.TemplateRootfs(wrennDir, id.PlatformTeamID, id.MinimalTemplateID)
-	shrinkImage(minimalRootfs)
+// ShrinkSystemImages shrinks the built-in system base rootfs images back to
+// their minimum size using resize2fs -M. This is the inverse of
+// EnsureImageSizes and should be called during graceful shutdown so the images
+// are stored compactly on disk.
+func ShrinkSystemImages(wrennDir string) {
+	for _, tmplID := range []pgtype.UUID{
+		id.UbuntuTemplateID,
+		id.AlpineTemplateID,
+		id.ArchTemplateID,
+		id.FedoraTemplateID,
+	} {
+		shrinkImage(layout.TemplateRootfs(wrennDir, id.PlatformTeamID, tmplID))
+	}
 }

 // shrinkImage shrinks a single rootfs image to its minimum size.
--- a/internal/sandbox/manager.go
+++ b/internal/sandbox/manager.go
@ -294,12 +294,12 @@ func (m *Manager) Create(
 	// Snapshot template? Route to the CH-restore path; the launcher manages
 	// its own resource lifecycle and registers the sandbox itself.
 	//
-	// The minimal base template never carries a memory snapshot; guarding
-	// here prevents a stray state.json (e.g. from a failed CreateSnapshot
-	// that mis-targeted minimal) from silently rerouting fresh boots into
+	// System base templates never carry a memory snapshot; guarding here
+	// prevents a stray state.json (e.g. from a failed CreateSnapshot that
+	// mis-targeted a base template) from silently rerouting fresh boots into
 	// the restore path with a confusing error downstream.
 	templateDir := layout.TemplateDir(m.cfg.WrennDir, teamID, templateID)
-	if !layout.IsMinimal(teamID, templateID) && layout.IsSnapshotTemplate(templateDir) {
+	if !layout.IsSystemTemplate(teamID, templateID) && layout.IsSnapshotTemplate(templateDir) {
 		return m.createFromSnapshotTemplate(ctx, sandboxID, teamID, templateID,
 			vcpus, memoryMB, timeoutSec, diskSizeMB, defaultUser, defaultEnv)
 	}
--- a/internal/sandbox/pause.go
+++ b/internal/sandbox/pause.go
@ -32,6 +32,7 @@ import (
 	"fmt"
 	"log/slog"
 	"os"
+	"os/exec"
 	"path/filepath"
 	"strconv"
 	"strings"
@ -695,11 +696,12 @@ func (m *Manager) waitForMemoryLoader(ctx context.Context, sb *sandboxState) err
 }

 // CreateSnapshot writes a self-contained template snapshot to
-// WRENN_DIR/images/teams/{teamID}/{templateID}/. The sandbox is briefly
-// paused, the dm-snapshot is flattened into rootfs.ext4, CH writes the
-// memory snapshot, then the sandbox is resumed.
+// WRENN_DIR/images/teams/{teamID}/{templateID}/, then returns the total size
+// (in bytes) of the artefacts written.
 //
-// Returns the total size (in bytes) of the artefacts written.
+// A running sandbox is snapshotted live (briefly paused, memory dumped, rootfs
+// flattened, then resumed). A paused sandbox is snapshotted straight from its
+// on-disk pause artefacts without reviving the VM — it stays paused.
 func (m *Manager) CreateSnapshot(ctx context.Context, sandboxID string, teamID, templateID pgtype.UUID, name string) (int64, error) {
 	sb, err := m.get(sandboxID)
 	if err != nil {
@ -709,10 +711,6 @@ func (m *Manager) CreateSnapshot(ctx context.Context, sandboxID string, teamID,
 	sb.lifecycleMu.Lock()
 	defer sb.lifecycleMu.Unlock()

-	if sb.Status != models.StatusRunning {
-		return 0, fmt.Errorf("%w: %s (status: %s)", ErrNotRunning, sandboxID, sb.Status)
-	}
-
 	// Refuse silent overwrites: every snapshot must land in a fresh
 	// templateID. Defends against caller bugs and concurrent CreateSnapshot
 	// races for the same destination. User-facing snapshot-name uniqueness
@ -722,6 +720,22 @@ func (m *Manager) CreateSnapshot(ctx context.Context, sandboxID string, teamID,
 			id.UUIDString(teamID), id.UUIDString(templateID))
 	}

+	switch sb.Status {
+	case models.StatusRunning:
+		return m.snapshotRunningToTemplate(ctx, sb, teamID, templateID, name)
+	case models.StatusPaused:
+		return m.snapshotPausedToTemplate(ctx, sb, teamID, templateID, name)
+	default:
+		return 0, fmt.Errorf("%w: %s (status: %s)", ErrNotRunning, sandboxID, sb.Status)
+	}
+}
+
+// snapshotRunningToTemplate takes a live snapshot of a running sandbox: pause
+// CH, dump memory + flatten the rootfs into a staging dir, resume CH, then
+// promote the staged template into place. The sandbox returns to running.
+func (m *Manager) snapshotRunningToTemplate(ctx context.Context, sb *sandboxState, teamID, templateID pgtype.UUID, name string) (int64, error) {
+	sandboxID := sb.ID
+
 	// Same rationale as Pause: wait for the background memory loader so the
 	// resulting memory-ranges is self-contained when this sandbox itself was
 	// previously restored from an ondemand snapshot.
@ -821,6 +835,152 @@ func (m *Manager) CreateSnapshot(ctx context.Context, sandboxID string, teamID,
 	return size, nil
 }

+// snapshotPausedToTemplate builds a self-contained template from a paused
+// sandbox's on-disk artefacts without reviving the VM. The pause snapshot
+// already holds a self-contained CH memory image (Pause blocks on the memory
+// loader before snapshotting), so we copy those memory files verbatim and
+// flatten the persistent CoW into rootfs.ext4. The sandbox stays Paused.
+func (m *Manager) snapshotPausedToTemplate(ctx context.Context, sb *sandboxState, teamID, templateID pgtype.UUID, name string) (int64, error) {
+	snapDir := layout.PauseSnapshotDir(m.cfg.WrennDir, sb.ID)
+	meta, err := readSnapshotMeta(snapDir)
+	if err != nil {
+		return 0, fmt.Errorf("load pause snapshot meta: %w", err)
+	}
+
+	dstDir := layout.TemplateDir(m.cfg.WrennDir, teamID, templateID)
+	stageDir := filepath.Join(layout.SandboxesDir(m.cfg.WrennDir),
+		fmt.Sprintf(".stage-%s-%d", sb.ID, time.Now().UnixNano()))
+	if err := os.MkdirAll(stageDir, 0o755); err != nil {
+		return 0, fmt.Errorf("mkdir stage dir: %w", err)
+	}
+	defer os.RemoveAll(stageDir)
+
+	// Flatten the persistent CoW into a standalone rootfs.ext4. The VM is down,
+	// so re-attach a throwaway dm-snapshot over the base image + CoW just long
+	// enough to read through it; the CoW file is left intact for a later Resume.
+	if err := m.flattenPausedCow(ctx, sb.ID, meta, filepath.Join(stageDir, "rootfs.ext4")); err != nil {
+		return 0, err
+	}
+
+	// Copy CH's memory snapshot files verbatim (state.json, config.json,
+	// memory-ranges, …) — everything except the CoW and the pause meta, which
+	// the template replaces with its own rootfs.ext4 and meta below.
+	if err := copyMemorySnapshotFiles(snapDir, stageDir); err != nil {
+		return 0, err
+	}
+
+	// Template meta: no SlotIndex (a template allocates a fresh slot per launch);
+	// SandboxDir + BaseTemplate carried forward so the restore path resolves the
+	// tmpfs disk path baked into CH's config.json.
+	tmplMeta := &snapshotMeta{
+		TemplateName: name,
+		TeamID:       id.UUIDString(teamID),
+		TemplateID:   id.UUIDString(templateID),
+		VCPUs:        meta.VCPUs,
+		MemoryMB:     meta.MemoryMB,
+		TimeoutSec:   meta.TimeoutSec,
+		BaseTemplate: meta.BaseTemplate,
+		SandboxDir:   meta.SandboxDir,
+		CreatedAt:    time.Now(),
+	}
+	if err := writeSnapshotMeta(stageDir, tmplMeta); err != nil {
+		slog.Warn("template meta write failed", "id", sb.ID, "error", err)
+	}
+
+	if err := promoteSnapshotDir(stageDir, dstDir); err != nil {
+		return 0, fmt.Errorf("promote snapshot: %w", err)
+	}
+
+	size, err := snapshot.DirSize(dstDir, "")
+	if err != nil {
+		slog.Warn("snapshot size calc failed", "id", sb.ID, "error", err)
+	}
+	slog.Info("paused snapshot created",
+		"id", sb.ID,
+		"team_id", teamID,
+		"template_id", templateID,
+		"dir", dstDir,
+		"bytes", size,
+	)
+	return size, nil
+}
+
+// flattenPausedCow re-attaches a temporary dm-snapshot over a paused sandbox's
+// base image + persistent CoW, flattens it into outPath, then tears the dm
+// device down. The CoW file is preserved (RemoveSnapshot never deletes it) so a
+// later Resume still works. A distinct dm name avoids colliding with the
+// "wrenn-{id}" device a concurrent Resume would create — though lifecycleMu
+// already serialises the two.
+func (m *Manager) flattenPausedCow(ctx context.Context, sandboxID string, meta *snapshotMeta, outPath string) error {
+	originLoop, err := m.loops.Acquire(meta.BaseTemplate)
+	if err != nil {
+		return fmt.Errorf("acquire loop: %w", err)
+	}
+	defer m.loops.Release(meta.BaseTemplate)
+
+	originSize, err := devicemapper.OriginSizeBytes(originLoop)
+	if err != nil {
+		return fmt.Errorf("origin size: %w", err)
+	}
+
+	dmDev, err := devicemapper.RestoreSnapshot(ctx, "wrenn-flat-"+sandboxID, originLoop, meta.CowPath, originSize)
+	if err != nil {
+		return fmt.Errorf("restore dm-snapshot: %w", err)
+	}
+	defer func() {
+		if rerr := devicemapper.RemoveSnapshot(context.Background(), dmDev); rerr != nil {
+			slog.Warn("dm remove after paused flatten", "id", sandboxID, "error", rerr)
+		}
+	}()
+
+	if err := devicemapper.FlattenSnapshot(dmDev.DevicePath, outPath); err != nil {
+		return fmt.Errorf("flatten rootfs: %w", err)
+	}
+	return nil
+}
+
+// copyMemorySnapshotFiles copies every regular file from a pause snapshot dir
+// into dstDir except the CoW and the wrenn meta — i.e. CH's own memory snapshot
+// artefacts (state.json, config.json, memory-ranges, …). It hardlinks when the
+// dirs share a filesystem (instant, preserves sparseness) and falls back to a
+// sparse-preserving copy across filesystems. Pause never mutates these files in
+// place — the next Pause writes a fresh dir and swaps — so a hardlink stays a
+// valid, immutable view for the template.
+func copyMemorySnapshotFiles(srcDir, dstDir string) error {
+	entries, err := os.ReadDir(srcDir)
+	if err != nil {
+		return fmt.Errorf("read pause dir: %w", err)
+	}
+	for _, e := range entries {
+		if e.IsDir() {
+			continue
+		}
+		name := e.Name()
+		if name == layout.SandboxCowName || name == snapshotMetaFile {
+			continue
+		}
+		if err := linkOrCopyFile(filepath.Join(srcDir, name), filepath.Join(dstDir, name)); err != nil {
+			return fmt.Errorf("copy %s: %w", name, err)
+		}
+	}
+	return nil
+}
+
+// linkOrCopyFile hardlinks from→to, falling back to a sparse-preserving copy
+// whenever os.Link fails (typically EXDEV, when the two paths live on different
+// filesystems). A plain byte copy would materialise the zero pages punched out
+// of memory-ranges, inflating a multi-GB snapshot to its full apparent size, so
+// the fallback uses `cp --sparse=always`, which re-detects and re-punches holes.
+func linkOrCopyFile(from, to string) error {
+	if err := os.Link(from, to); err == nil {
+		return nil
+	}
+	if out, err := exec.Command("cp", "--sparse=always", from, to).CombinedOutput(); err != nil {
+		return fmt.Errorf("sparse copy: %s: %w", string(out), err)
+	}
+	return nil
+}
+
 // DeleteSnapshot removes a template snapshot directory. Refuses deletion
 // while any in-memory sandbox is still derived from this template — even
 // though Linux unlink lets the open loop device keep working, the agent
@ -983,9 +1143,10 @@ func (m *Manager) PauseAll(ctx context.Context) {
 	wg.Wait()
 }

-// CleanupOrphanPauseDirs removes leftover *.staging-* and *.trash-* dirs
-// under sandboxes/ from any Pause that crashed before completing the swap.
-// Safe to call at agent startup before any sandbox is created or restored.
+// CleanupOrphanPauseDirs removes leftover *.staging-*, *.stage-*, and *.trash-*
+// dirs under sandboxes/ from any Pause/snapshot/flatten that crashed before
+// completing its swap or promote. Safe to call at agent startup before any
+// sandbox is created or restored.
 //
 // Per-sandbox cleanup happens implicitly during Destroy (which removes the
 // whole PauseSnapshotDir) — this function only handles agent-crash orphans.
@ -1001,7 +1162,12 @@ func CleanupOrphanPauseDirs(wrennDir string) {
 			continue
 		}
 		name := e.Name()
-		if !strings.Contains(name, ".staging-") && !strings.Contains(name, ".trash-") {
+		// ".stage-" is the prefix used by snapshot/flatten staging dirs;
+		// ".staging-" + ".trash-" are used by Pause's swap. (".stage-" is not a
+		// substring of ".staging-", so all three need an explicit check.)
+		if !strings.Contains(name, ".stage-") &&
+			!strings.Contains(name, ".staging-") &&
+			!strings.Contains(name, ".trash-") {
 			continue
 		}
 		path := filepath.Join(sandboxesDir, name)
--- a/pkg/audit/logger.go
+++ b/pkg/audit/logger.go
@ -390,16 +390,25 @@ func (l *AuditLogger) LogSandboxStateChanged(ctx context.Context, teamID, sandbo

 // --- Snapshot events (scope: team) ---

-func (l *AuditLogger) LogSnapshotCreate(ctx context.Context, ac auth.AuthContext, name string, err error) {
-	l.Log(ctx, newEntry(ac, ac.TeamID, "team", "snapshot", name, "create", auditStatusFor(err, "success"), mergeMeta(nil, err)))
-	l.publish(ctx, events.Event{
-		Event:     events.SnapshotCreate,
-		Outcome:   outcomeFromErr(err),
-		Timestamp: events.Now(),
-		TeamID:    id.FormatTeamID(ac.TeamID),
-		Actor:     actorToEvent(ac),
-		Resource:  events.Resource{ID: name, Type: "snapshot"},
-		Error:     errString(err),
+// LogSnapshotCreateRequested records that a user requested an async snapshot.
+// It writes the user-attributed audit row only — the terminal success/failure
+// event is published later by the background goroutine (system actor). Mirrors
+// the accept-time audit pattern used by LogSandboxPause.
+func (l *AuditLogger) LogSnapshotCreateRequested(ctx context.Context, ac auth.AuthContext, name string) {
+	l.Log(ctx, newEntry(ac, ac.TeamID, "team", "snapshot", name, "create", "success", nil))
+}
+
+// LogSnapshotCreateSystem records a system-actor snapshot transition inferred
+// by a reconciler (e.g. the HostMonitor recovering or failing a sandbox stuck
+// in "snapshotting"). It writes an audit row only and does NOT publish a
+// SnapshotCreate event: the reconciler has no template name, and emitting one
+// would surface a spurious "snapshot captured/failed" toast.
+func (l *AuditLogger) LogSnapshotCreateSystem(ctx context.Context, teamID, sandboxID pgtype.UUID, reason string, err error) {
+	l.Log(ctx, Entry{
+		TeamID: teamID, ActorType: "system",
+		ResourceType: "sandbox", ResourceID: id.FormatSandboxID(sandboxID),
+		Action: "snapshot", Scope: "team", Status: auditStatusFor(err, "info"),
+		Metadata: mergeMeta(map[string]any{"reason": reason}, err),
 	})
 }

--- a/pkg/id/id.go
+++ b/pkg/id/id.go
@ -2,6 +2,7 @@ package id

 import (
 	"crypto/rand"
+	"encoding/binary"
 	"encoding/hex"
 	"fmt"
 	"math/big"
@ -156,10 +157,37 @@ func ParseChannelID(s string) (pgtype.UUID, error)   { return parseUUID(PrefixCh
 // (e.g. base templates, shared infrastructure).
 var PlatformTeamID = pgtype.UUID{Bytes: [16]byte{}, Valid: true}

-// MinimalTemplateID is the all-zeros UUID sentinel for the built-in "minimal"
-// template. When both team_id and template_id are zero, the host agent uses
-// the minimal rootfs at WRENN_DIR/images/minimal/.
-var MinimalTemplateID = pgtype.UUID{Bytes: [16]byte{}, Valid: true}
+// SystemTemplateMaxID is the highest template ID reserved for built-in system
+// base templates. Template IDs in [0, SystemTemplateMaxID] under the platform
+// team are protected: they cannot be deleted and live at the well-known
+// teams/{base36(0)}/{base36(id)} on-disk paths.
+const SystemTemplateMaxID = 1024
+
+// templateID returns the all-zeros UUID with its low 64 bits set to n. Used to
+// mint the well-known IDs for the built-in system base templates.
+func templateID(n uint64) pgtype.UUID {
+	var b [16]byte
+	binary.BigEndian.PutUint64(b[8:], n)
+	return pgtype.UUID{Bytes: b, Valid: true}
+}
+
+// Well-known system base template IDs (platform team). The on-disk rootfs for
+// each lives at WRENN_DIR/images/teams/{base36(PlatformTeamID)}/{base36(id)}/.
+var (
+	UbuntuTemplateID = templateID(0) // minimal-ubuntu (replaces the old "minimal")
+	AlpineTemplateID = templateID(1) // minimal-alpine
+	ArchTemplateID   = templateID(2) // minimal-arch
+	FedoraTemplateID = templateID(3) // minimal-fedora
+)
+
+// IsReservedTemplateID reports whether t falls in the reserved system template
+// ID range [0, SystemTemplateMaxID] (i.e. the top 64 bits are zero and the
+// bottom 64 bits are <= SystemTemplateMaxID).
+func IsReservedTemplateID(t pgtype.UUID) bool {
+	hi := binary.BigEndian.Uint64(t.Bytes[:8])
+	lo := binary.BigEndian.Uint64(t.Bytes[8:])
+	return hi == 0 && lo <= SystemTemplateMaxID
+}

 // UUIDString converts a pgtype.UUID to a standard hyphenated UUID string
 // (e.g., "6ba7b810-9dad-11d1-80b4-00c04fd430c8"). Used for RPC wire format.
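Reviewer note on the reserved ID range above: the boundary values used in `layout_test.go` (bytes `0x04, 0x00` for 1024 and `0x04, 0x01` for 1025) follow directly from the big-endian layout. The sketch below is a self-contained illustration, not the package code. It works on a raw `[16]byte` rather than `pgtype.UUID`, and `mintTemplateID` and `isReserved` are hypothetical stand-ins for `templateID` and `IsReservedTemplateID`.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// systemTemplateMaxID mirrors the SystemTemplateMaxID constant from the diff.
const systemTemplateMaxID = 1024

// mintTemplateID returns an all-zeros 16-byte ID whose low 64 bits hold n in
// big-endian order, as the unexported templateID helper does.
func mintTemplateID(n uint64) [16]byte {
	var b [16]byte
	binary.BigEndian.PutUint64(b[8:], n)
	return b
}

// isReserved applies the same test as IsReservedTemplateID: the high 64 bits
// must be zero and the low 64 bits must not exceed systemTemplateMaxID.
func isReserved(b [16]byte) bool {
	hi := binary.BigEndian.Uint64(b[:8])
	lo := binary.BigEndian.Uint64(b[8:])
	return hi == 0 && lo <= systemTemplateMaxID
}

func main() {
	for _, n := range []uint64{0, 3, 1024, 1025} {
		id := mintTemplateID(n)
		fmt.Printf("id=%-4d last bytes=%#02x,%#02x reserved=%v\n",
			n, id[14], id[15], isReserved(id))
	}

	// Any bit set in the high half puts the ID outside the reserved range,
	// regardless of the low half.
	var high [16]byte
	high[0] = 1
	fmt.Println("high half set, reserved:", isReserved(high))
}
```

Note that the range is inclusive at both ends, so 1025 IDs (0 through 1024) are reserved under the platform team, of which this PR assigns four.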
--- a/pkg/service/build.go
+++ b/pkg/service/build.go
@ -106,7 +106,7 @@ func (s *BuildService) takeArchive(buildID string) []byte {
 // Create inserts a new build record and enqueues it to Redis.
 func (s *BuildService) Create(ctx context.Context, p BuildCreateParams) (db.TemplateBuild, error) {
 	if p.BaseTemplate == "" {
-		p.BaseTemplate = "minimal"
+		p.BaseTemplate = "minimal-ubuntu"
 	}
 	if p.VCPUs <= 0 {
 		p.VCPUs = 1
@ -447,17 +447,15 @@ func (s *BuildService) provisionBuildSandbox(
 	sandboxIDStr := id.FormatSandboxID(sandboxID)
 	log.Info("provisioning build sandbox", "sandbox_id", sandboxIDStr, "host_id", id.FormatHostID(host.ID))

-	baseTeamID := id.PlatformTeamID
-	baseTemplateID := id.MinimalTemplateID
-	if build.BaseTemplate != "minimal" {
-		baseTmpl, err := s.DB.GetPlatformTemplateByName(ctx, build.BaseTemplate)
-		if err != nil {
-			s.failBuild(ctx, buildID, fmt.Sprintf("base template %q not found: %v", build.BaseTemplate, err))
-			return nil, "", nil, err
-		}
-		baseTeamID = baseTmpl.TeamID
-		baseTemplateID = baseTmpl.ID
+	// All base templates — including the built-in system ones — are
+	// platform-owned rows, so resolve the path from the DB record.
+	baseTmpl, err := s.DB.GetPlatformTemplateByName(ctx, build.BaseTemplate)
+	if err != nil {
+		s.failBuild(ctx, buildID, fmt.Sprintf("base template %q not found: %v", build.BaseTemplate, err))
+		return nil, "", nil, err
 	}
+	baseTeamID := baseTmpl.TeamID
+	baseTemplateID := baseTmpl.ID

 	resp, err := agent.CreateSandbox(ctx, connect.NewRequest(&pb.CreateSandboxRequest{
 		SandboxId:  sandboxIDStr,
@ -481,6 +479,23 @@ func (s *BuildService) provisionBuildSandbox(
 		HostID:    host.ID,
 	})

+	if _, err := s.DB.InsertSandbox(ctx, db.InsertSandboxParams{
+		ID:             sandboxID,
+		TeamID:         id.PlatformTeamID,
+		HostID:         host.ID,
+		Template:       build.BaseTemplate,
+		Status:         "running",
+		Vcpus:          build.Vcpus,
+		MemoryMb:       build.MemoryMb,
+		TimeoutSec:     0,
+		DiskSizeMb:     5120,
+		TemplateID:     baseTemplateID,
+		TemplateTeamID: baseTeamID,
+		Metadata:       []byte("{}"),
+	}); err != nil {
+		log.Warn("failed to insert builder sandbox record", "error", err)
+	}
+
 	archive := s.takeArchive(buildIDStr)
 	if len(archive) > 0 {
 		if err := s.uploadAndExtractArchive(ctx, agent, sandboxIDStr, archive, buildIDStr); err != nil {
@ -602,6 +617,7 @@ func (s *BuildService) finalizeBuild(
 	}
 	s.publishStatus(ctx, buildID, "success", build.TotalSteps, build.TotalSteps, "")

+	s.destroySandbox(ctx, agent, sandboxIDStr)
 	log.Info("template build completed successfully", "name", build.Name)
 }

@ -796,6 +812,13 @@ func (s *BuildService) destroySandbox(_ context.Context, agent buildAgentClient,
 	})); err != nil {
 		slog.Warn("failed to destroy build sandbox", "sandbox_id", sandboxIDStr, "error", err)
 	}
+	if sbID, err := id.ParseSandboxID(sandboxIDStr); err == nil {
+		if _, err := s.DB.UpdateSandboxStatus(ctx, db.UpdateSandboxStatusParams{
+			ID: sbID, Status: "stopped",
+		}); err != nil {
+			slog.Warn("failed to mark builder sandbox stopped", "sandbox_id", sandboxIDStr, "error", err)
+		}
+	}
 }

 // fetchSandboxEnv executes the 'env' command inside the specified sandbox via
--- a/pkg/service/sandbox.go
+++ b/pkg/service/sandbox.go
@ -121,7 +121,7 @@ type hostagentClient = interface {
 // sandbox event to the Redis stream when the operation completes.
 func (s *SandboxService) Create(ctx context.Context, p SandboxCreateParams) (db.Sandbox, error) {
 	if p.Template == "" {
-		p.Template = "minimal"
+		p.Template = "minimal-ubuntu"
 	}
 	if err := validate.SafeName(p.Template); err != nil {
 		return db.Sandbox{}, fmt.Errorf("invalid template name: %w", err)
@ -137,26 +137,23 @@ func (s *SandboxService) Create(ctx context.Context, p SandboxCreateParams) (db.
 	}
 	p.TimeoutSec = clampTimeout(p.TimeoutSec)

-	// Resolve template name → (teamID, templateID).
-	templateTeamID := id.PlatformTeamID
-	templateID := id.MinimalTemplateID
-	var templateDefaultUser string
+	// Resolve template name → (teamID, templateID). System base templates are
+	// platform-owned rows like any other, so the lookup handles them too (the
+	// query also matches platform templates for any team).
+	tmpl, err := s.DB.GetTemplateByTeam(ctx, db.GetTemplateByTeamParams{Name: p.Template, TeamID: p.TeamID})
+	if err != nil {
+		return db.Sandbox{}, fmt.Errorf("template %q not found: %w", p.Template, err)
+	}
+	templateTeamID := tmpl.TeamID
+	templateID := tmpl.ID
+	templateDefaultUser := tmpl.DefaultUser
 	var templateDefaultEnv map[string]string
-	if p.Template != "minimal" {
-		tmpl, err := s.DB.GetTemplateByTeam(ctx, db.GetTemplateByTeamParams{Name: p.Template, TeamID: p.TeamID})
-		if err != nil {
-			return db.Sandbox{}, fmt.Errorf("template %q not found: %w", p.Template, err)
-		}
-		templateTeamID = tmpl.TeamID
-		templateID = tmpl.ID
-		templateDefaultUser = tmpl.DefaultUser
-		if len(tmpl.DefaultEnv) > 0 {
-			_ = json.Unmarshal(tmpl.DefaultEnv, &templateDefaultEnv)
-		}
-		if tmpl.Type == "snapshot" {
-			p.VCPUs = tmpl.Vcpus
-			p.MemoryMB = tmpl.MemoryMb
-		}
+	if len(tmpl.DefaultEnv) > 0 {
+		_ = json.Unmarshal(tmpl.DefaultEnv, &templateDefaultEnv)
+	}
+	if tmpl.Type == "snapshot" {
+		p.VCPUs = tmpl.Vcpus
+		p.MemoryMB = tmpl.MemoryMb
 	}

 	if !p.TeamID.Valid {
@ -461,59 +458,140 @@ func (s *SandboxService) resumeInBackground(
 	})
 }

-// CreateSnapshot takes a live snapshot of a running sandbox, publishing
-// the result as a new template owned by the sandbox's team. Returns the
-// inserted template record.
-func (s *SandboxService) CreateSnapshot(ctx context.Context, sandboxID, teamID pgtype.UUID, name string) (db.Template, error) {
+// CreateSnapshot asynchronously snapshots a running or paused sandbox,
+// publishing the result as a new template owned by the sandbox's team. The DB
+// CAS from the sandbox's current status to "snapshotting" is the authoritative
+// gate against concurrent Pause/Snapshot/Destroy calls; if it loses, no agent
+// RPC fires. A running sandbox is snapshotted live (CH briefly paused, then
+// resumed); a paused sandbox is snapshotted from its on-disk artefacts without
+// reviving the VM. Either way the sandbox returns to its original status on
+// completion. Returns the sandbox (now "snapshotting") and the resolved name.
+func (s *SandboxService) CreateSnapshot(ctx context.Context, sandboxID, teamID pgtype.UUID, name string) (db.Sandbox, string, error) {
 	sb, err := s.DB.GetSandboxByTeam(ctx, db.GetSandboxByTeamParams{ID: sandboxID, TeamID: teamID})
 	if err != nil {
-		return db.Template{}, fmt.Errorf("sandbox not found: %w", err)
+		return db.Sandbox{}, "", fmt.Errorf("sandbox not found: %w", err)
 	}
-	if sb.Status != "running" {
-		return db.Template{}, fmt.Errorf("sandbox is not running (status: %s)", sb.Status)
+	if sb.Status != "running" && sb.Status != "paused" {
+		return db.Sandbox{}, "", fmt.Errorf("sandbox is not running or paused (status: %s)", sb.Status)
 	}
+	origStatus := sb.Status

 	if name == "" {
 		name = id.NewSnapshotName()
 	}
 	if err := validate.SafeName(name); err != nil {
-		return db.Template{}, fmt.Errorf("invalid name: %w", err)
+		return db.Sandbox{}, "", fmt.Errorf("invalid name: %w", err)
+	}
+	// Reject duplicate names up front so we don't pause the VM and dump memory
+	// only to fail on the template insert at the very end.
+	if _, err := s.DB.GetTemplateByTeam(ctx, db.GetTemplateByTeamParams{Name: name, TeamID: teamID}); err == nil {
+		return db.Sandbox{}, "", fmt.Errorf("conflict: a template named %q already exists", name)
+	}
+
+	if _, err := s.DB.UpdateSandboxStatusIf(ctx, db.UpdateSandboxStatusIfParams{
+		ID: sandboxID, Status: origStatus, Status_2: "snapshotting",
+	}); err != nil {
+		return db.Sandbox{}, "", fmt.Errorf("sandbox is no longer %s (status changed concurrently)", origStatus)
 	}

 	agent, err := s.agentForHost(ctx, sb.HostID)
 	if err != nil {
-		return db.Template{}, err
+		// Roll back the CAS so the sandbox isn't stuck in "snapshotting".
+		if _, rerr := s.DB.UpdateSandboxStatusIf(ctx, db.UpdateSandboxStatusIfParams{
+			ID: sandboxID, Status: "snapshotting", Status_2: origStatus,
+		}); rerr != nil {
+			slog.Warn("failed to roll back snapshotting→"+origStatus, "id", id.FormatSandboxID(sandboxID), "error", rerr)
+		}
+		return db.Sandbox{}, "", err
 	}

+	sandboxIDStr := id.FormatSandboxID(sandboxID)
+	hostIDStr := id.FormatHostID(sb.HostID)
+	teamIDStr := id.FormatTeamID(sb.TeamID)
+
+	// Notify other clients that the badge moved to "snapshotting".
+	s.publishStateChanged(ctx, sandboxIDStr, teamIDStr, hostIDStr, origStatus, "snapshotting")
+
+	go s.snapshotInBackground(sandboxID, sandboxIDStr, hostIDStr, teamIDStr, teamID, agent, name, origStatus, sb.Vcpus, sb.MemoryMb)
+
+	sb.Status = "snapshotting"
+	return sb, name, nil
+}
+
+func (s *SandboxService) snapshotInBackground(
+	sandboxID pgtype.UUID, sandboxIDStr, hostIDStr, teamIDStr string, teamID pgtype.UUID,
+	agent hostagentClient, name, origStatus string, vcpus, memoryMB int32,
+) {
+	bgCtx, cancel := context.WithTimeout(context.Background(), 10*time.Minute)
+	defer cancel()
+
 	newTemplateID := id.NewSandboxID() // any random UUID
 	templateUUID := pgtype.UUID{Bytes: newTemplateID.Bytes, Valid: true}

-	resp, err := agent.CreateSnapshot(ctx, connect.NewRequest(&pb.CreateSnapshotRequest{
-		SandboxId:  id.FormatSandboxID(sandboxID),
+	resp, err := agent.CreateSnapshot(bgCtx, connect.NewRequest(&pb.CreateSnapshotRequest{
+		SandboxId:  sandboxIDStr,
 		Name:       name,
 		TeamId:     id.UUIDString(teamID),
 		TemplateId: id.UUIDString(templateUUID),
 	}))
-	if err != nil {
-		return db.Template{}, fmt.Errorf("agent snapshot: %w", err)
+
+	// Either way, the host-side op is done; return the badge to its original
+	// status (running for a live snapshot, paused for an on-disk one). Use a CAS
+	// so a concurrent Destroy (which sets "stopping") wins: if the CAS misses,
+	// the sandbox is no longer ours and we must NOT announce its old status. The
+	// snapshot itself is still valid and is registered below — a snapshot
+	// template outlives its source sandbox.
+	if _, derr := s.DB.UpdateSandboxStatusIf(bgCtx, db.UpdateSandboxStatusIfParams{
+		ID: sandboxID, Status: "snapshotting", Status_2: origStatus,
+	}); derr != nil {
+		slog.Warn("snapshotting→"+origStatus+" CAS missed (sandbox moved on); skipping state signal", "sandbox_id", sandboxIDStr, "error", derr)
+	} else {
+		s.publishStateChanged(bgCtx, sandboxIDStr, teamIDStr, hostIDStr, "snapshotting", origStatus)
 	}

-	tmpl, err := s.DB.InsertTemplate(ctx, db.InsertTemplateParams{
+	if err != nil {
+		slog.Warn("background snapshot failed", "sandbox_id", sandboxIDStr, "error", err)
+		s.publishEvent(bgCtx, SandboxStateEvent{
+			Event: "sandbox.snapshot_failed", SandboxID: sandboxIDStr, TeamID: teamIDStr, HostID: hostIDStr,
+			Metadata: map[string]string{"name": name}, Error: err.Error(), Timestamp: time.Now().Unix(),
+		})
+		return
+	}
+
+	if _, err := s.DB.InsertTemplate(bgCtx, db.InsertTemplateParams{
 		ID:          templateUUID,
 		Name:        name,
 		Type:        "snapshot",
-		Vcpus:       sb.Vcpus,
-		MemoryMb:    sb.MemoryMb,
+		Vcpus:       vcpus,
+		MemoryMb:    memoryMB,
 		SizeBytes:   resp.Msg.SizeBytes,
 		TeamID:      teamID,
 		DefaultUser: "",
 		DefaultEnv:  []byte("{}"),
 		Metadata:    []byte("{}"),
-	})
-	if err != nil {
-		return db.Template{}, fmt.Errorf("insert template: %w", err)
+	}); err != nil {
+		slog.Warn("failed to insert snapshot template", "sandbox_id", sandboxIDStr, "name", name, "error", err)
+		s.publishEvent(bgCtx, SandboxStateEvent{
+			Event: "sandbox.snapshot_failed", SandboxID: sandboxIDStr, TeamID: teamIDStr, HostID: hostIDStr,
+			Metadata: map[string]string{"name": name}, Error: "failed to register snapshot", Timestamp: time.Now().Unix(),
+		})
+		return
 	}
-	return tmpl, nil
+
+	s.publishEvent(bgCtx, SandboxStateEvent{
+		Event: "sandbox.snapshotted", SandboxID: sandboxIDStr, TeamID: teamIDStr, HostID: hostIDStr,
+		Metadata: map[string]string{"name": name}, Timestamp: time.Now().Unix(),
+	})
+}
+
+// publishStateChanged emits a transient sandbox.state_changed event so the
+// dashboard flips the status badge during a transition that has no terminal
+// lifecycle verb of its own (e.g. the snapshotting round-trip).
+func (s *SandboxService) publishStateChanged(ctx context.Context, sandboxIDStr, teamIDStr, hostIDStr, from, to string) {
+	s.publishEvent(ctx, SandboxStateEvent{
+		Event: "sandbox.state_changed", SandboxID: sandboxIDStr, TeamID: teamIDStr, HostID: hostIDStr,
+		Metadata: map[string]string{"from": from, "to": to}, Timestamp: time.Now().Unix(),
+	})
 }

 // Destroy stops a sandbox asynchronously. Pre-marks the DB status as
--- a/recipes/code-runner-beta/code-runner-beta.healthcheck
+++ b/recipes/code-runner-beta/code-runner-beta.healthcheck
--- a/recipes/code-runner-beta/code-runner-beta.recipefile
+++ b/recipes/code-runner-beta/code-runner-beta.recipefile
--- a/recipes/code-runner-beta/test-jupyter-kernel.py
+++ b/recipes/code-runner-beta/test-jupyter-kernel.py
--- a/scripts/prepare-wrenn-user.sh
+++ b/scripts/prepare-wrenn-user.sh
@ -1,393 +0,0 @@
-#!/usr/bin/env bash
-#
-# prepare-wrenn-user.sh — Create the wrenn system user and configure minimal privileges.
-#
-# Creates a locked-down 'wrenn' system user that can run wrenn-agent and wrenn-cp
-# with only the privileges they need. The agent binary gets Linux capabilities
-# via setcap — no sudo is configured for the wrenn user at all. If an attacker
-# compromises the wrenn user, they cannot escalate via sudo.
-#
-# What this script does:
-#   1. Creates the 'wrenn' system user (bash shell for debugging, no home dir)
-#   2. Creates required directories with correct ownership
-#   3. Sets Linux capabilities on wrenn-agent and all child binaries
-#   4. Installs an apt hook to restore capabilities after package updates
-#   5. Installs a sudoers drop-in (comment-only, no grants — absence is the cage)
-#   6. Ensures required kernel modules are loaded
-#   7. Writes systemd unit files for both wrenn-agent and wrenn-cp
-#
-# Usage:
-#   sudo bash scripts/prepare-wrenn-user.sh
-#
-# Prerequisites:
-#   - wrenn-agent binary at /usr/local/bin/wrenn-agent
-#   - wrenn-cp binary at /usr/local/bin/wrenn-cp
-#   - cloud-hypervisor binary at /usr/local/bin/cloud-hypervisor
-#   - libcap2-bin installed (for setcap)
-
-set -euo pipefail
-
-# ── Guard ────────────────────────────────────────────────────────────────────
-
-if [[ $EUID -ne 0 ]]; then
-    echo "ERROR: This script must be run as root."
-    exit 1
-fi
-
-# ── Configuration ────────────────────────────────────────────────────────────
-
-WRENN_USER="wrenn"
-WRENN_GROUP="wrenn"
-WRENN_DIR="/var/lib/wrenn"
-AGENT_BIN="/usr/local/bin/wrenn-agent"
-CP_BIN="/usr/local/bin/wrenn-cp"
-CH_BIN="/usr/local/bin/cloud-hypervisor"
-RESTORE_CAPS_SCRIPT="/etc/wrenn/restore-caps.sh"
-
-# ── 1. Create system user ───────────────────────────────────────────────────
-
-if id "${WRENN_USER}" &>/dev/null; then
-    echo "==> User '${WRENN_USER}' already exists, skipping creation."
-else
-    echo "==> Creating system user '${WRENN_USER}'..."
-    useradd \
-        --system \
-        --no-create-home \
-        --home-dir "${WRENN_DIR}" \
-        --shell /bin/bash \
-        "${WRENN_USER}"
-fi
-
-# Add wrenn to kvm group for /dev/kvm access.
-if getent group kvm &>/dev/null; then
-    usermod -aG kvm "${WRENN_USER}"
-    echo "==> Added '${WRENN_USER}' to 'kvm' group."
-fi
-
-# ── 2. Create directories with correct ownership ────────────────────────────
-
-echo "==> Setting up directories..."
-
-directories=(
-    "${WRENN_DIR}"
-    "${WRENN_DIR}/images"
-    "${WRENN_DIR}/kernels"
-    "${WRENN_DIR}/sandboxes"
-    "${WRENN_DIR}/snapshots"
-    "${WRENN_DIR}/logs"
-    "/run/netns"
-)
-
-for dir in "${directories[@]}"; do
-    mkdir -p "${dir}"
-done
-
-# Only chown wrenn-owned dirs (not /run/netns which is system-managed).
-for dir in "${WRENN_DIR}" "${WRENN_DIR}/images" "${WRENN_DIR}/kernels" \
-           "${WRENN_DIR}/sandboxes" "${WRENN_DIR}/snapshots" "${WRENN_DIR}/logs"; do
-    chown "${WRENN_USER}:${WRENN_GROUP}" "${dir}"
-    chmod 750 "${dir}"
-done
-
-# ── 3. Set capabilities on binaries ─────────────────────────────────────────
-#
-# These capabilities replace full root access. The wrenn-agent binary gets
-# exactly the capabilities it needs for:
-#
-#   CAP_SYS_ADMIN   — network namespaces (netns create/enter), mount namespaces
-#                     (unshare -m), losetup, dmsetup, mount/umount
-#   CAP_NET_ADMIN   — veth/TAP creation (netlink), iptables rules, IP forwarding,
-#                     routing table manipulation
-#   CAP_NET_RAW     — raw socket access (needed by iptables internally)
-#   CAP_SYS_PTRACE  — reading /proc/self/ns/net (netns.Get)
-#   CAP_KILL        — sending SIGTERM/SIGKILL to Cloud Hypervisor processes
-#   CAP_DAC_OVERRIDE — accessing /dev/loop*, /dev/mapper/*, /dev/net/tun,
-#                      /proc/sys/net/ipv4/ip_forward
-#   CAP_MKNOD       — creating device nodes (dm-snapshot)
-#
-# The 'ep' suffix means Effective + Permitted (granted at exec time).
-
-echo "==> Setting capabilities on wrenn-agent..."
-
-if [[ ! -f "${AGENT_BIN}" ]]; then
-    echo "WARNING: ${AGENT_BIN} not found, skipping setcap. Install the binary first."
-else
-    setcap \
-        cap_sys_admin,cap_net_admin,cap_net_raw,cap_sys_ptrace,cap_kill,cap_dac_override,cap_mknod+ep \
-        "${AGENT_BIN}"
-
-    echo "    Capabilities set on ${AGENT_BIN}:"
-    getcap "${AGENT_BIN}"
-fi
-
-# Cloud Hypervisor also needs capabilities when spawned by a non-root parent.
-# CAP_NET_ADMIN is required for network device access inside the netns.
-if [[ -f "${CH_BIN}" ]]; then
-    setcap cap_net_admin,cap_sys_admin,cap_dac_override+ep "${CH_BIN}"
-    echo "    Capabilities set on ${CH_BIN}:"
-    getcap "${CH_BIN}"
-fi
-
-# ── Helper: resolve binary path and apply setcap ────────────────────────────
-#
-# Uses `command -v` to find the binary in PATH (handles /usr/bin vs /usr/sbin
-# differences across distros), then `readlink -f` to resolve symlinks so that
-# setcap hits the real inode (important for iptables-nft/alternatives).
-
-setcap_binary() {
-    local name="$1" caps="$2"
-    local bin
-    bin=$(command -v "$name" 2>/dev/null) || {
-        echo "    WARNING: ${name} not found in PATH, skipping."
-        return 0
-    }
-    bin=$(readlink -f "$bin")
-    setcap "$caps" "$bin"
-    echo "    $(getcap "$bin")"
-}
-
-# The child binaries invoked by wrenn-agent (iptables, losetup, dmsetup, etc.)
-# also need capabilities since they'll be exec'd by a non-root user.
-echo "==> Setting capabilities on child binaries..."
-
-setcap_binary iptables      "cap_net_admin,cap_net_raw+ep"
-setcap_binary iptables-save "cap_net_admin,cap_net_raw+ep"
-setcap_binary ip            "cap_sys_admin,cap_net_admin+ep"
-setcap_binary sysctl        "cap_net_admin+ep"
-setcap_binary losetup       "cap_sys_admin,cap_dac_override+ep"
-setcap_binary blockdev      "cap_sys_admin,cap_dac_override+ep"
-setcap_binary dmsetup       "cap_sys_admin,cap_dac_override,cap_mknod+ep"
-setcap_binary e2fsck        "cap_sys_admin,cap_dac_override+ep"
-setcap_binary resize2fs     "cap_sys_admin,cap_dac_override+ep"
-setcap_binary dd            "cap_dac_override+ep"
-setcap_binary unshare       "cap_sys_admin+ep"
-setcap_binary mount         "cap_sys_admin,cap_dac_override+ep"
-
-# ── 4. Persist capabilities across package updates ──────────────────────────
-#
-# apt/dpkg overwrites binaries on package updates, which strips the xattr-based
-# capabilities set by setcap. This installs:
-#   - /etc/wrenn/restore-caps.sh: re-applies setcap to all child binaries
-#   - /etc/apt/apt.conf.d/99-wrenn-setcap: apt post-invoke hook that calls it
-
-echo "==> Installing capability restore hook..."
-
-mkdir -p /etc/wrenn
-
-cat > "${RESTORE_CAPS_SCRIPT}" << 'RESTORE'
-#!/usr/bin/env bash
-#
-# restore-caps.sh — Re-apply Linux capabilities to wrenn child binaries.
-# Called automatically by apt after package updates (see /etc/apt/apt.conf.d/99-wrenn-setcap).
-# Can also be run manually: sudo /etc/wrenn/restore-caps.sh
-
-set -euo pipefail
-
-setcap_binary() {
-    local name="$1" caps="$2"
-    local bin
-    bin=$(command -v "$name" 2>/dev/null) || return 0
-    bin=$(readlink -f "$bin")
-    setcap "$caps" "$bin" 2>/dev/null || true
-}
-
-# wrenn-agent and cloud-hypervisor (only if present — they aren't package-managed).
-[[ -f /usr/local/bin/wrenn-agent ]] && \
-    setcap cap_sys_admin,cap_net_admin,cap_net_raw,cap_sys_ptrace,cap_kill,cap_dac_override,cap_mknod+ep \
-        /usr/local/bin/wrenn-agent 2>/dev/null || true
-[[ -f /usr/local/bin/cloud-hypervisor ]] && \
-    setcap cap_net_admin,cap_sys_admin,cap_dac_override+ep \
-        /usr/local/bin/cloud-hypervisor 2>/dev/null || true
-
-# Child binaries (these are the ones wiped by apt).
-setcap_binary iptables      "cap_net_admin,cap_net_raw+ep"
-setcap_binary iptables-save "cap_net_admin,cap_net_raw+ep"
-setcap_binary ip            "cap_sys_admin,cap_net_admin+ep"
-setcap_binary sysctl        "cap_net_admin+ep"
-setcap_binary losetup       "cap_sys_admin,cap_dac_override+ep"
-setcap_binary blockdev      "cap_sys_admin,cap_dac_override+ep"
-setcap_binary dmsetup       "cap_sys_admin,cap_dac_override,cap_mknod+ep"
-setcap_binary e2fsck        "cap_sys_admin,cap_dac_override+ep"
-setcap_binary resize2fs     "cap_sys_admin,cap_dac_override+ep"
-setcap_binary dd            "cap_dac_override+ep"
-setcap_binary unshare       "cap_sys_admin+ep"
-setcap_binary mount         "cap_sys_admin,cap_dac_override+ep"
-RESTORE
-
-chmod 755 "${RESTORE_CAPS_SCRIPT}"
-
-cat > /etc/apt/apt.conf.d/99-wrenn-setcap << 'APT'
-// Re-apply Linux capabilities to wrenn child binaries after any package update.
-// Capabilities (xattr) are stripped when dpkg overwrites a binary.
-DPkg::Post-Invoke { "/etc/wrenn/restore-caps.sh"; };
-APT
-
-echo "    Installed ${RESTORE_CAPS_SCRIPT} and apt post-invoke hook."
-
-# ── 5. Device access ────────────────────────────────────────────────────────
-#
-# /dev/kvm   — handled by kvm group membership above
-# /dev/net/tun — needs to be accessible by wrenn user
-
-echo "==> Configuring device access..."
-
-# Ensure /dev/net/tun is accessible (udev rule for persistence across reboots).
-cat > /etc/udev/rules.d/99-wrenn.rules << 'UDEV'
-# Allow wrenn user access to TUN device for TAP networking.
-SUBSYSTEM=="misc", KERNEL=="tun", GROUP="wrenn", MODE="0660"
-UDEV
-
-udevadm control --reload-rules 2>/dev/null || true
-echo "    Installed udev rule for /dev/net/tun."
-
-# ── 6. Kernel modules ───────────────────────────────────────────────────────
-
-echo "==> Ensuring kernel modules are loaded..."
-
-modules=(dm_snapshot dm_mod loop tun)
-for mod in "${modules[@]}"; do
-    if ! lsmod | grep -q "^${mod}"; then
-        modprobe "${mod}" 2>/dev/null && echo "    Loaded ${mod}" || echo "    WARNING: Could not load ${mod}"
-    else
-        echo "    ${mod} already loaded."
-    fi
-done
-
-# Persist across reboots.
-for mod in "${modules[@]}"; do
-    grep -qxF "${mod}" /etc/modules-load.d/wrenn.conf 2>/dev/null || echo "${mod}" >> /etc/modules-load.d/wrenn.conf
-done
-echo "    Module persistence written to /etc/modules-load.d/wrenn.conf."
-
-# ── 7. Sudoers ──────────────────────────────────────────────────────────────
-#
-# The wrenn user has no sudo grants. The absence of a grant is the cage — an
-# explicit "!ALL" deny is weaker due to known bypasses (CVE-2019-14287).
-# This file exists purely as documentation for operators running `sudo -l`.
-
-echo "==> Writing sudoers drop-in..."
-
-cat > /etc/sudoers.d/wrenn << 'SUDOERS'
-# Wrenn system user — no sudo access permitted.
-# All privilege is granted via Linux capabilities on specific binaries (setcap).
-# This file contains no active rules. The absence of any grant is intentional
-# and is the strongest way to deny escalation.
-#
-# Do not add rules here. If the wrenn user needs new privileges, use setcap
-# on the specific binary instead.
-SUDOERS
-
-chmod 440 /etc/sudoers.d/wrenn
-visudo -c -f /etc/sudoers.d/wrenn
-echo "    /etc/sudoers.d/wrenn installed and validated."
-
-# ── 8. Systemd units ────────────────────────────────────────────────────────
-
-echo "==> Writing systemd service files..."
-
-cat > /etc/systemd/system/wrenn-agent.service << 'UNIT'
-[Unit]
-Description=Wrenn Host Agent
-After=network-online.target
-Wants=network-online.target
-
-[Service]
-Type=simple
-User=wrenn
-Group=wrenn
-EnvironmentFile=-/etc/wrenn/agent.env
-
-# The binary has capabilities set via setcap. These systemd directives ensure
-# the capabilities are inherited into the process at exec time.
-AmbientCapabilities=CAP_SYS_ADMIN CAP_NET_ADMIN CAP_NET_RAW CAP_SYS_PTRACE CAP_KILL CAP_DAC_OVERRIDE CAP_MKNOD
-CapabilityBoundingSet=CAP_SYS_ADMIN CAP_NET_ADMIN CAP_NET_RAW CAP_SYS_PTRACE CAP_KILL CAP_DAC_OVERRIDE CAP_MKNOD
-
-# IMPORTANT: must be false — child binaries (iptables, losetup, dmsetup, etc.)
-# have their own file capabilities via setcap which must be honored at exec time.
-NoNewPrivileges=false
-
-# Enable IP forwarding before the agent starts. The "+" prefix runs this
-# directive as root (bypassing User=wrenn) so it can write to procfs.
-ExecStartPre=+/bin/sh -c 'sysctl -w net.ipv4.ip_forward=1'
-
-ExecStart=/usr/local/bin/wrenn-agent --address ${WRENN_ADVERTISE_ADDR}
-
-Restart=on-failure
-RestartSec=5
-
-# File descriptor limits (Cloud Hypervisor + loop devices + sockets).
-LimitNOFILE=65536
-LimitNPROC=4096
-
-# IO priority + cgroup weight. Large-VM snapshot writes (CH memfile dump,
-# zero-page hole punching, dm-snapshot flatten) can saturate a single-disk
-# host and starve sshd/journal reads. Best-effort scheduling class +
-# below-default cgroup weight lets latency-sensitive workloads keep up.
-IOSchedulingClass=best-effort
-IOSchedulingPriority=5
-IOWeight=50
-
-# Protect host filesystem — only allow access to what's needed.
-ProtectHome=true
-ReadWritePaths=/var/lib/wrenn /tmp /run/netns /dev/mapper
-ReadOnlyPaths=/usr/local/bin/cloud-hypervisor
-
-[Install]
-WantedBy=multi-user.target
-UNIT
-
-cat > /etc/systemd/system/wrenn-cp.service << 'UNIT'
-[Unit]
-Description=Wrenn Control Plane
-After=network-online.target postgresql.service
-Wants=network-online.target
-
-[Service]
-Type=simple
-User=wrenn
-Group=wrenn
-EnvironmentFile=-/etc/wrenn/cp.env
-
-# Control plane is fully unprivileged — no capabilities needed.
-NoNewPrivileges=true
-CapabilityBoundingSet=
-
-ExecStart=/usr/local/bin/wrenn-cp
-
-Restart=on-failure
-RestartSec=5
-
-ProtectHome=true
-ProtectSystem=strict
-ReadWritePaths=/tmp
-
-[Install]
-WantedBy=multi-user.target
-UNIT
-
-mkdir -p /etc/wrenn
-touch /etc/wrenn/agent.env /etc/wrenn/cp.env
-chmod 640 /etc/wrenn/agent.env /etc/wrenn/cp.env
-chown root:${WRENN_GROUP} /etc/wrenn/agent.env /etc/wrenn/cp.env
-
-systemctl daemon-reload
-echo "    wrenn-agent.service and wrenn-cp.service installed."
-
-# ── Done ─────────────────────────────────────────────────────────────────────
-
-echo ""
-echo "=== Setup complete ==="
-echo ""
-echo "Next steps:"
-echo "  1. Copy wrenn-agent and wrenn-cp binaries to /usr/local/bin/"
-echo "  2. Edit /etc/wrenn/agent.env with WRENN_CP_URL and WRENN_ADVERTISE_ADDR"
-echo "  3. Edit /etc/wrenn/cp.env with DATABASE_URL and other control plane config"
-echo "  4. systemctl enable --now wrenn-agent"
-echo "  5. systemctl enable --now wrenn-cp"
-echo ""
-echo "Security summary:"
-echo "  - wrenn user: bash shell (for debugging), no home, no sudo (no grants in sudoers)"
-echo "  - wrenn-agent: runs as wrenn with 7 capabilities via setcap (not root)"
-echo "  - wrenn-cp: runs as wrenn with zero capabilities"
-echo "  - Capabilities auto-restored after apt upgrades via /etc/wrenn/restore-caps.sh"
-echo ""
--- a/scripts/rootfs-from-container.sh
+++ b/scripts/rootfs-from-container.sh
@ -38,7 +38,9 @@ IMAGE_NAME="$2"
 OUTPUT_DIR="${WRENN_IMAGES_PATH}/${IMAGE_NAME}"
 OUTPUT_FILE="${OUTPUT_DIR}/rootfs.ext4"
 MOUNT_DIR="/tmp/wrenn-rootfs-build"
-TAR_FILE="/tmp/wrenn-rootfs-export-${IMAGE_NAME}.tar"
+# IMAGE_NAME may contain slashes (e.g. teams/<team>/<id>); flatten them so the
+# temp tar is a single file in /tmp rather than a path into a missing dir.
+TAR_FILE="/tmp/wrenn-rootfs-export-${IMAGE_NAME//\//_}.tar"
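The flattening above relies on bash pattern substitution. A minimal sketch of the same expansion in isolation, assuming bash; `flatten_image_name` and the sample names are hypothetical and not part of the PR:

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn a slash-separated image name into a single
# path component, mirroring ${IMAGE_NAME//\//_} from the script.
flatten_image_name() {
    local name="$1"
    # ${var//pattern/replacement} replaces every match; "\/" is a literal slash.
    printf '%s\n' "${name//\//_}"
}

flatten_image_name "teams/acme/abc123"   # prints teams_acme_abc123
flatten_image_name "minimal"             # prints minimal
```

Names without slashes pass through unchanged, so existing flat image names keep their old temp-file path.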

 # Verify the container exists.
 if ! docker inspect "${CONTAINER}" > /dev/null 2>&1; then
@ -121,16 +123,24 @@ if [ -z "${TINI_BIN}" ]; then
        aarch64) TINI_ARCH="arm64" ;;
        *)        echo "ERROR: Unsupported architecture: ${ARCH}"; exit 1 ;;
    esac
+    # Use the statically linked tini so the binary runs regardless of the
+    # guest's libc (glibc on Ubuntu/Arch/Fedora, musl on Alpine).
    TINI_VERSION="v0.19.0"
-    TINI_URL="https://github.com/krallin/tini/releases/download/${TINI_VERSION}/tini-${TINI_ARCH}"
-    TINI_TMP="/tmp/tini-${TINI_ARCH}"
-    echo "    Downloading tini ${TINI_VERSION} (${TINI_ARCH})..."
+    TINI_URL="https://github.com/krallin/tini/releases/download/${TINI_VERSION}/tini-static-${TINI_ARCH}"
+    TINI_TMP="/tmp/tini-static-${TINI_ARCH}"
+    echo "    Downloading tini ${TINI_VERSION} static (${TINI_ARCH})..."
    curl -fsSL "${TINI_URL}" -o "${TINI_TMP}"
    chmod +x "${TINI_TMP}"
    TINI_BIN="${TINI_TMP}"
 fi
 sudo mkdir -p "${MOUNT_DIR}/sbin"
-sudo cp "${TINI_BIN}" "${MOUNT_DIR}/sbin/tini"
+# On usr-merged distros (e.g. Fedora) /sbin is a symlink to /usr/bin, so a tini
+# already at /usr/bin/tini IS /sbin/tini — copying onto itself errors. Skip then.
+if [ "${TINI_BIN}" -ef "${MOUNT_DIR}/sbin/tini" ]; then
+    echo "    tini already at /sbin/tini (usr-merged); skipping copy"
+else
+    sudo cp "${TINI_BIN}" "${MOUNT_DIR}/sbin/tini"
+fi
 sudo chmod 755 "${MOUNT_DIR}/sbin/tini"
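The `-ef` guard above can be tried outside a real rootfs. The following is a sketch under assumptions: a scratch directory stands in for the mounted image, and `same_file` is a hypothetical wrapper around the test operator.

```shell
#!/usr/bin/env bash
# same_file A B: succeeds when both paths resolve to the same device and inode.
# It fails when either path does not exist, so a missing target still gets copied.
same_file() {
    [ "$1" -ef "$2" ]
}

# Simulate a usr-merged layout in a scratch directory (hypothetical paths).
root="$(mktemp -d)"
mkdir -p "${root}/usr/bin"
ln -s usr/bin "${root}/sbin"                  # sbin -> usr/bin
printf 'fake tini\n' > "${root}/usr/bin/tini"

if same_file "${root}/usr/bin/tini" "${root}/sbin/tini"; then
    echo "same file; skipping copy"
else
    cp "${root}/usr/bin/tini" "${root}/sbin/tini"
fi

rm -rf "${root}"
```

Comparing by device and inode rather than by path string is what lets the check see through the symlinked `/sbin`.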

 # Step 6: Verify injected binaries and required container packages.
--- a/scripts/update-minimal-rootfs.sh
+++ b/scripts/update-minimal-rootfs.sh
@ -1,32 +1,46 @@
 #!/usr/bin/env bash
 #
-# update-debug-rootfs.sh — Build envd and inject it (plus wrenn-init + tini) into the debug rootfs.
+# update-minimal-rootfs.sh — Rebuild envd and inject it (plus wrenn-init + tini)
+# into the system base rootfs images.
 #
 # This script:
-#   1. Builds a fresh envd static binary via make
-#   2. Mounts the rootfs image
-#   3. Copies envd, wrenn-init, and tini into the image
-#   4. Unmounts cleanly
+#   1. Builds a fresh envd static binary via make (once)
+#   2. For each system base rootfs (ubuntu/alpine/arch/fedora): mounts it,
+#      copies envd + wrenn-init + tini in, and unmounts cleanly
 #
 # Usage:
-#   bash scripts/update-debug-rootfs.sh [rootfs_path]
+#   bash scripts/update-minimal-rootfs.sh [rootfs_path]
 #
-# Defaults to /var/lib/wrenn/images/minimal/rootfs.ext4
+# With no argument it updates all four system base rootfs images under
+#   ${WRENN_DIR}/images/teams/<platform>/<id>/rootfs.ext4
+# With a path argument it updates only that single rootfs.

 set -euo pipefail

 SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
 PROJECT_ROOT="$(cd "${SCRIPT_DIR}/.." && pwd)"
 WRENN_DIR="${WRENN_DIR:-/var/lib/wrenn}"
-ROOTFS="${1:-${WRENN_DIR}/images/minimal/rootfs.ext4}"
 MOUNT_DIR="/tmp/wrenn-rootfs-update"

-if [ ! -f "${ROOTFS}" ]; then
-    echo "ERROR: Rootfs not found at ${ROOTFS}"
-    exit 1
+# base36(all-zeros UUID) = platform team that owns every system base template.
+PLATFORM_TEAM_B36="0000000000000000000000000"
+
+# System base template IDs (well-known reserved IDs 0..3). Single-digit IDs, so
+# the 25-char base36 string is just the zero-padded decimal.
+SYSTEM_TEMPLATE_IDS=(0 1 2 3)
+
+# Resolve which rootfs images to update.
+ROOTFS_LIST=()
+if [ $# -ge 1 ]; then
+    ROOTFS_LIST=("$1")
+else
+    for tid in "${SYSTEM_TEMPLATE_IDS[@]}"; do
+        tmpl_b36="$(printf '%025d' "${tid}")"
+        ROOTFS_LIST+=("${WRENN_DIR}/images/teams/${PLATFORM_TEAM_B36}/${tmpl_b36}/rootfs.ext4")
+    done
 fi
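The path derivation above can be sketched as a standalone function. `system_rootfs_path` is a hypothetical name. The zero-padded decimal only equals the base36 form for IDs 0 through 9, which is the case the script's comment describes; an ID of 10 or more would need a real base36 conversion.

```shell
#!/usr/bin/env bash
# system_rootfs_path ID [WRENN_DIR]
# Print the rootfs path for a single-digit system template ID.
system_rootfs_path() {
    local tid="$1" wrenn_dir="${2:-/var/lib/wrenn}"
    # Platform team ID in base36, copied from the script.
    local platform_team_b36="0000000000000000000000000"
    local tmpl_b36
    # %025d left-pads the decimal ID with zeros to 25 characters.
    tmpl_b36="$(printf '%025d' "${tid}")"
    printf '%s\n' "${wrenn_dir}/images/teams/${platform_team_b36}/${tmpl_b36}/rootfs.ext4"
}

for tid in 0 1 2 3; do
    system_rootfs_path "${tid}"
done
```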

-# Step 1: Build envd.
+# Step 1: Build envd (once).
 echo "==> Building envd..."
 cd "${PROJECT_ROOT}"
 make build-envd
@ -42,64 +56,84 @@ if ! ldd "${ENVD_BIN}" | grep -q "statically linked"; then
    exit 1
 fi

-# Step 2: Mount the rootfs.
-echo "==> Mounting rootfs at ${MOUNT_DIR}..."
-mkdir -p "${MOUNT_DIR}"
-sudo mount -o loop,rw "${ROOTFS}" "${MOUNT_DIR}"
-
-cleanup() {
-    echo "==> Unmounting rootfs..."
-    sudo umount "${MOUNT_DIR}" 2>/dev/null || true
-    rmdir "${MOUNT_DIR}" 2>/dev/null || true
-}
-trap cleanup EXIT
-
-# Step 3: Copy files into rootfs.
-echo "==> Installing envd..."
-sudo mkdir -p "${MOUNT_DIR}/usr/local/bin"
-sudo cp "${ENVD_BIN}" "${MOUNT_DIR}/usr/local/bin/envd"
-sudo chmod 755 "${MOUNT_DIR}/usr/local/bin/envd"
-
-echo "==> Installing wrenn-init..."
-sudo cp "${PROJECT_ROOT}/images/wrenn-init.sh" "${MOUNT_DIR}/usr/local/bin/wrenn-init"
-sudo chmod 755 "${MOUNT_DIR}/usr/local/bin/wrenn-init"
-
-echo "==> Installing tini..."
-TINI_BIN=""
-# 1. Already in the rootfs?
-for p in "${MOUNT_DIR}/usr/bin/tini" "${MOUNT_DIR}/sbin/tini" "${MOUNT_DIR}/usr/local/bin/tini"; do
-    if [ -f "$p" ]; then TINI_BIN="$p"; break; fi
-done
-# 2. Available on the host?
-if [ -z "${TINI_BIN}" ]; then
-    for p in /usr/bin/tini /usr/local/bin/tini /sbin/tini; do
-        if [ -f "$p" ]; then TINI_BIN="$p"; break; fi
+# resolve_tini ROOTFS_MOUNT — echo a path to a tini binary suitable for the
+# mounted rootfs. Prefers one already in the image, then a static download.
+resolve_tini() {
+    local mount_dir="$1" p tini_arch arch
+    for p in "${mount_dir}/usr/bin/tini" "${mount_dir}/sbin/tini" "${mount_dir}/usr/local/bin/tini"; do
+        if [ -f "$p" ]; then echo "$p"; return; fi
    done
-fi
-# 3. Download from GitHub releases.
-if [ -z "${TINI_BIN}" ]; then
-    ARCH="$(uname -m)"
-    case "${ARCH}" in
-        x86_64)  TINI_ARCH="amd64" ;;
-        aarch64) TINI_ARCH="arm64" ;;
-        *)        echo "ERROR: Unsupported architecture: ${ARCH}"; exit 1 ;;
+    arch="$(uname -m)"
+    case "${arch}" in
+        x86_64)  tini_arch="amd64" ;;
+        aarch64) tini_arch="arm64" ;;
+        *) echo "ERROR: Unsupported architecture: ${arch}" >&2; exit 1 ;;
    esac
-    TINI_VERSION="v0.19.0"
-    TINI_URL="https://github.com/krallin/tini/releases/download/${TINI_VERSION}/tini-${TINI_ARCH}"
-    TINI_TMP="/tmp/tini-${TINI_ARCH}"
-    echo "    Downloading tini ${TINI_VERSION} (${TINI_ARCH})..."
-    curl -fsSL "${TINI_URL}" -o "${TINI_TMP}"
-    chmod +x "${TINI_TMP}"
-    TINI_BIN="${TINI_TMP}"
+    # Static tini runs under any libc (glibc or musl).
+    local tmp="/tmp/tini-static-${tini_arch}"
+    if [ ! -f "${tmp}" ]; then
+        echo "    Downloading tini v0.19.0 static (${tini_arch})..." >&2
+        curl -fsSL "https://github.com/krallin/tini/releases/download/v0.19.0/tini-static-${tini_arch}" -o "${tmp}"
+        chmod +x "${tmp}"
+    fi
+    echo "${tmp}"
+}
+
+# inject_rootfs ROOTFS — mount, copy guest binaries in, unmount.
+inject_rootfs() {
+    local rootfs="$1" tini_bin
+    echo ""
+    echo "==> Updating ${rootfs}"
+
+    mkdir -p "${MOUNT_DIR}"
+    sudo mount -o loop,rw "${rootfs}" "${MOUNT_DIR}"
+
+    local mounted=1
+    cleanup_mount() {
+        if [ "${mounted}" = "1" ]; then
+            sudo umount "${MOUNT_DIR}" 2>/dev/null || true
+            rmdir "${MOUNT_DIR}" 2>/dev/null || true
+            mounted=0
+        fi
+    }
+    trap cleanup_mount RETURN
+
+    sudo mkdir -p "${MOUNT_DIR}/usr/local/bin"
+    sudo cp "${ENVD_BIN}" "${MOUNT_DIR}/usr/local/bin/envd"
+    sudo chmod 755 "${MOUNT_DIR}/usr/local/bin/envd"
+
+    sudo cp "${PROJECT_ROOT}/images/wrenn-init.sh" "${MOUNT_DIR}/usr/local/bin/wrenn-init"
+    sudo chmod 755 "${MOUNT_DIR}/usr/local/bin/wrenn-init"
+
+    tini_bin="$(resolve_tini "${MOUNT_DIR}")"
+    sudo mkdir -p "${MOUNT_DIR}/sbin"
+    # On usr-merged distros (e.g. Fedora) /sbin -> /usr/bin, so a tini already at
+    # /usr/bin/tini IS /sbin/tini — copying onto itself errors. Skip then.
+    if [ "${tini_bin}" -ef "${MOUNT_DIR}/sbin/tini" ]; then
+        echo "    tini already at /sbin/tini (usr-merged); skipping copy"
+    else
+        sudo cp "${tini_bin}" "${MOUNT_DIR}/sbin/tini"
+    fi
+    sudo chmod 755 "${MOUNT_DIR}/sbin/tini"
+
+    ls -la "${MOUNT_DIR}/usr/local/bin/envd" "${MOUNT_DIR}/usr/local/bin/wrenn-init" "${MOUNT_DIR}/sbin/tini"
+    cleanup_mount
+}
+
+# Step 2: Update each rootfs that exists.
+UPDATED=0
+for rootfs in "${ROOTFS_LIST[@]}"; do
+    if [ ! -f "${rootfs}" ]; then
+        echo "==> Skipping (not found): ${rootfs}"
+        continue
+    fi
+    inject_rootfs "${rootfs}"
+    UPDATED=$((UPDATED + 1))
+done
+
+echo ""
+if [ "${UPDATED}" -eq 0 ]; then
+    echo "==> No rootfs images updated. Build them first with: make images"
+    exit 1
 fi
-sudo mkdir -p "${MOUNT_DIR}/sbin"
-sudo cp "${TINI_BIN}" "${MOUNT_DIR}/sbin/tini"
-sudo chmod 755 "${MOUNT_DIR}/sbin/tini"
-
-# Step 4: Verify.
-echo ""
-echo "==> Installed files:"
-ls -la "${MOUNT_DIR}/usr/local/bin/envd" "${MOUNT_DIR}/usr/local/bin/wrenn-init" "${MOUNT_DIR}/sbin/tini"
-
-echo ""
-echo "==> Done. Rootfs updated: ${ROOTFS}"
+echo "==> Done. Updated ${UPDATED} rootfs image(s)."
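One point that may be worth checking in `inject_rootfs`: bash runs a RETURN trap when the function returns, and under `set -e` a failing copy step exits the shell instead, so the image could stay mounted. A possible hardening, sketched here with a temp directory standing in for the loop mount, is an idempotent cleanup that is also registered on EXIT. All names below are hypothetical and this is not what the PR does.

```shell
#!/usr/bin/env bash
# Sketch only: a temp directory stands in for the loop-mounted rootfs.
RESOURCE=""   # global on purpose: an EXIT trap may run after function locals are gone

release_resource() {
    # Idempotent: safe to call explicitly and again from the EXIT trap.
    if [ -n "${RESOURCE}" ]; then
        rm -rf "${RESOURCE}"      # stand-in for: sudo umount "${MOUNT_DIR}"
        RESOURCE=""
    fi
}
trap release_resource EXIT

update_one() {
    RESOURCE="$(mktemp -d)"       # stand-in for: sudo mount -o loop,rw ...
    touch "${RESOURCE}/envd"      # stand-in for the copy steps
    release_resource              # normal path: release before returning
}

update_one
```

Because the cleanup clears `RESOURCE` after releasing it, the explicit call and the EXIT trap cannot both act on the same mount.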