wrenn-releases

Author	SHA1	Message	Date
pptx704	6898528096	Replace one-shot clock_settime with chrony for continuous guest time sync Switch from the envd /init endpoint pushing host time via syscall to chronyd reading the KVM PTP hardware clock (/dev/ptp0) continuously. This fixes clock drift between init calls and handles snapshot resume gracefully. Changes: - Add clocksource=kvm-clock kernel boot arg - Start chronyd in wrenn-init.sh before tini (PHC /dev/ptp0, makestep 1.0 -1) - Remove clock_settime logic from envd SetData and shouldSetSystemTime - Remove client.Init() clock sync calls from sandbox manager (3 sites) - Remove Init() method from envdclient (no longer needed) - Simplify rootfs scripts: socat/chrony now come from apt in the container image, only envd/wrenn-init/tini are injected by build scripts	2026-03-26 04:47:44 +06:00
pptx704	4be65b0abb	WIP: Add sandbox proxy catch-all to control plane Add SandboxProxyWrapper that intercepts requests with Host headers matching {port}-{sandbox_id}.{domain} and proxies them through the owning host agent's /proxy endpoint. Authentication is via X-API-Key only (no JWT). The API key's team must own the sandbox. Export EnsureScheme from lifecycle package for reuse. Request flow: SDK -> Caddy -> CP catch-all -> Host Agent -> sandbox VM. This is an intermediate state — needs further work for the full code interpreter feature.	2026-03-26 02:12:10 +06:00
pptx704	f4675ebfc0	WIP: Add HTTP proxy endpoint to host agent Add /proxy/{sandbox_id}/{port}/* handler that reverse-proxies HTTP requests to services running inside sandbox VMs. The sandbox's host IP (10.11.0.{idx}) is used as the upstream target. Includes port validation (1-65535) and shared HTTP transport for connection pooling. Supports WebSocket upgrades for protocols like Jupyter's streaming API. This is an intermediate state — needs further work for the full code interpreter feature.	2026-03-26 02:12:01 +06:00
pptx704	ed7880bc6c	Add per-capsule stats detail page with live CPU/RAM charts - New detail page at /dashboard/capsules/[id] with Stats and Files tabs - Stats tab shows capsule info card (status, template, CPU, memory, disk, started, idle timeout) and two stacked Chart.js charts with live values - Metrics API client with 10s polling and moving-average smoothing - Capsule ID in list table is now a clickable link to the detail page - Layout breadcrumb header (Capsules > sb-xxx) with back navigation - Fix metrics sampler: use v.PID() directly as Firecracker PID since unshare -m execs (not forks) through the bash/ip-netns-exec/firecracker chain, so all share the same PID. Removes unused findChildPID. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-25 22:31:05 +06:00
pptx704	27ff828e60	Push GetSandboxMetricPoints time filter into SQL The query was fetching all rows for a (sandbox_id, tier) pair and filtering by timestamp in Go. For repeatedly-paused sandboxes the 24h tier can accumulate up to 30 days of data, causing up to 120x over-fetching for a 6h range request. Add AND ts >= $3 to the query so Postgres filters on the primary key (sandbox_id, tier, ts) directly. Drop the redundant Go-side loop.	2026-03-25 21:53:19 +06:00
pptx704	6eacf0f735	Fix LIKE pattern injection in user email search Escape LIKE metacharacters (% and _) in the email prefix before passing to the SQL query, and enforce the documented '@' requirement to prevent broad user enumeration. Move search logic out of TeamService into usersHandler since it is a site-wide lookup, not team-scoped.	2026-03-25 21:53:09 +06:00
pptx704	49b0b646a8	Add 5m, 1h, 6h, 12h range filters to metrics endpoint Maps each user-facing range to the appropriate underlying ring buffer tier and applies a time cutoff filter. No new ring buffers needed — 5m/10m read from the 10m tier, 1h/2h from the 2h tier, 6h/12h/24h from the 24h tier.	2026-03-25 20:44:28 +06:00
pptx704	9acdbb5ae9	Add per-sandbox CPU/memory/disk metrics collection Samples /proc/{fc_pid}/stat (CPU%), /proc/{fc_pid}/status (VmRSS), and stat() on CoW files at 500ms intervals per running sandbox. Three tiered ring buffers downsample into 30s and 5min averages for 10min/2h/24h retention. Metrics are flushed to DB on pause (all tiers) and destroy (24h only). New GetSandboxMetrics and FlushSandboxMetrics RPCs on the host agent, proxied through GET /v1/sandboxes/{id}/metrics?range= on the control plane. Returns live data for running sandboxes, DB data for paused, and 404 for stopped.	2026-03-25 20:10:33 +06:00
pptx704	e3750f79f9	Fix metrics sampler to record zero-value snapshots when idle SampleSandboxMetrics previously filtered WHERE status IN ('running', 'starting', 'paused'), which returned no rows when all capsules were stopped. This caused zero snapshots to be skipped, leaving the time-series charts with no trailing data points instead of showing the expected zero values. Remove the WHERE filter so the query groups by all teams that have any sandbox row. The per-status FILTER clauses on the aggregates already produce correct zero counts for stopped capsules. Also includes the per-VM RAM ceiling formula change (sum(ceil(each/2)) instead of ceil(sum/2)).	2026-03-25 15:50:19 +06:00
pptx704	47b0ed5b52	Fix metrics correctness, redesign stats page - Replace stale snapshot read (GetCurrentMetrics) with live query (GetLiveMetrics) against sandboxes table — always returns correct zeros when no capsules are running - Fix CPU reserved formula: running + starting only; paused VMs no longer contribute vCPUs (RAM reservation for paused unchanged) - Merge top cards into 3 paired Now/Peak cards with colored accent borders (green/blue/amber matching chart colors) - Move Live badge from Running Capsules card to page-level header - Add colored category dots to card and chart headers - Charts stacked vertically, flex-1 to fill remaining page height - vCPUs chart color changed to blue (#5a9fd4), RAM stays amber	2026-03-25 15:11:46 +06:00
pptx704	fee66bda50	Add live stats page with metrics sampling and route split - New sandbox_metrics_snapshots table sampled every 10s (60-day retention) - Background MetricsSampler goroutine wired into control plane startup - GET /v1/sandboxes/stats?range=5m|1h|6h|24h|30d endpoint with adaptive polling intervals; reserved CPU/RAM uses ceil(paused/2) formula - StatsPanel component: 4 stat cards + 2 Chart.js line charts (straight lines, integer y-axis for running count, dual-axis for CPU/RAM) - Range filter persisted in URL query param; polls update data silently (no blink: loading state only shown on initial mount) - Split /dashboard/capsules into /list and /stats sub-routes with shared layout; capsuleRunningCount store syncs badge across routes - CreateCapsuleDialog extracted as reusable component	2026-03-25 14:41:05 +06:00
pptx704	1be30034bd	Add audit log infrastructure and GET /v1/audit-logs endpoint Introduces an append-only audit trail for all user and system actions: sandbox lifecycle (create/pause/resume/destroy/auto-pause), snapshots, team rename, API key create/revoke, member add/remove/leave/role_update, and BYOC host add/delete/marked_down/marked_up. - New audit_logs table (migration) with team_id, actor, resource, action, scope (team|admin), status (success|info|warning|error), metadata, and created_at - AuditLogger (internal/audit) with named fire-and-forget methods per event; system actor used for background events (HostMonitor, TTL reaper) - GET /v1/audit-logs: JWT-only, cursor pagination (max 200), multi-value filters for resource_type and action (comma-sep or repeated params); members see team-scoped events only, admins/owners see all - AuthContext extended with APIKeyID + APIKeyName so API key requests record meaningful actor identity - HostMonitor wired with AuditLogger for auto-pause and host marked_down	2026-03-25 05:15:16 +06:00
pptx704	e069b3e679	Add BYOC page, admin section, and is_byoc team visibility gating - Frontend: BYOC hosts page (/dashboard/byoc) with register/delete flows, shimmer loading, pulsing online status, animated token reveal checkmark - Frontend: Admin section (/admin/hosts) with platform + BYOC tabs, stat pills, skeleton loading, slide-in animations for new rows - Frontend: AdminSidebar component with accent top bar and admin pill badge - Frontend: BYOC nav item shown only when team.is_byoc is true (derived from teams store, not JWT); disabled for members - Frontend: Admin shield button in Sidebar, visible only to platform admins - Backend: is_admin in JWT claims + requireAdmin middleware (DB-validated) - Backend: is_byoc added to teamResponse so frontend derives visibility from fresh team data rather than stale JWT fields - Backend: SetBYOC admin endpoint (PUT /v1/admin/teams/{id}/byoc) - Backend: Admin hosts list enriches BYOC entries with team_name - Host agent: load .env file via godotenv on startup	2026-03-25 03:10:41 +06:00
pptx704	9bf67aa7f7	Implement host registration, JWT refresh tokens, and multi-host scheduling Replaces the hardcoded CP_HOST_AGENT_ADDR single-agent setup with a DB-driven registration system supporting multiple host agents (BYOC). Key changes: - Host agents register via one-time token, receive a 7-day JWT + 60-day refresh token; heartbeat loop auto-refreshes on 401/403 and pauses all sandboxes if refresh fails - HostClientPool: lazy Connect RPC client cache keyed by host ID, replacing the single static agent client throughout the API and service layers - RoundRobinScheduler: picks an online host for each new sandbox via ListActiveHosts; extensible for future scheduling strategies - HostMonitor (replaces Reconciler): passive heartbeat staleness check marks hosts unreachable and sandboxes missing after 90s; active reconciliation per online host restores missing-but-alive sandboxes and stops orphans - Graceful host delete: returns 409 with affected sandbox list without ?force=true; force-delete destroys sandboxes then evicts pool client - Snapshot delete broadcasts to all online hosts (templates have no host_id) - sandbox.Manager.PauseAll: pauses all running VMs on CP connectivity loss - New migration: host_refresh_tokens table with token rotation (issue-then-revoke ordering to prevent lockout on mid-rotation crash) - New sandbox status 'missing' (reversible, unlike 'stopped') and host status 'unreachable'; both reflected in OpenAPI spec - Fix: refresh token auth failure now returns 401 (was 400 via generic 'invalid' substring match in serviceErrToHTTP)	2026-03-24 18:32:05 +06:00
pptx704	3932bc056e	Add user names, team-scoped sandbox guard, and login robustness fixes - Add name column to users (migration + sqlc regen); propagate through JWT claims, auth context, all auth/OAuth handlers, service layer, and frontend - Sidebar and team page show name instead of email; team page splits Name/Email into separate columns - Block sandbox creation in UI and API when user has no active team context - loginTeam helper falls back to first active team when no default is set, fixing login for invited users with no is_default membership - Exclude soft-deleted teams from GetDefaultTeamForUser, GetBYOCTeams queries - Guard host creation against soft-deleted teams in service/host.go - SwitchTeam re-fetches name from DB instead of trusting stale JWT claim - Reset teams store on login so stale data from a previous session never persists - Update openapi.yaml: add name to SignupRequest and AuthResponse schemas	2026-03-24 16:56:10 +06:00
pptx704	71a7fdb76f	Fix user search to trigger on 3 characters without requiring @ The anti-enumeration guard required @ in the email prefix, causing the typeahead to silently return nothing until the user typed @. Replace with a minimum 3-character length check to match the frontend trigger condition.	2026-03-24 14:41:01 +06:00
pptx704	b3e8bdd171	Refine team management: name chars, danger zone, no-team state - Allow hyphens, @, and apostrophes in team names (backend regex) - After delete/leave, switch to next available team instead of logging out; if no teams remain, show a toast prompting to create one - Disable delete/leave button when user has only one team, with explanatory hint to create another team first - Show empty state on /dashboard/team when auth has no team context, pointing user to the sidebar to create a team - Fetch all teams in parallel with team detail on page load to power the isLastTeam guard	2026-03-24 14:34:20 +06:00
pptx704	8e5d426638	Add team management endpoints - Three-role model (owner/admin/member) with owner protection invariants - Team CRUD: create, rename (admin+), soft-delete with VM cleanup (owner only) - Member management: add by email, remove, role updates (admin+), leave - Switch-team endpoint re-issues JWT after DB membership verification - User email prefix search for add-member UI autocomplete - JWT carries role as a hint; all authorization decisions verified from DB - Team slug: immutable 12-char hex (e.g. a1b2c3-d1e2f3), reserved on soft-delete - Migration adds slug + deleted_at to teams; backfills existing rows	2026-03-24 13:29:54 +06:00
pptx704	5f0dbadea6	Fix snapshot and sandbox delete consistency - Snapshot delete: make agent RPC failure a hard error so DB record is not removed when files cannot be deleted from disk - Snapshot overwrite: call agent to delete old files before removing the DB record, preventing stale memfile.{uuid} generations from accumulating on disk across repeated overwrites - Sandbox destroy: only swallow CodeNotFound from the agent (sandbox already gone / TTL-reaped); any other error now propagates to the caller instead of being silently ignored	2026-03-23 02:59:30 +06:00
pptx704	36782e1b4f	Add tini as PID 1, guest clock sync, and fix PATH in guest VMs - Use tini as PID 1 in wrenn-init.sh so zombie processes are reaped and signals are forwarded correctly to envd - Set standard PATH in wrenn-init.sh so child processes spawned by envd can find common binaries (fixes "nice: ls command not found") - Add envdclient.Init() to POST /init on envd after every boot/resume, syncing the guest clock via unix.ClockSettime — critical after snapshot resume where the guest clock is frozen - Run Init in a background goroutine so it doesn't block the CreateSandbox RPC response; a slow Init (vCPU busy with envd startup) was causing the RPC context to be canceled before the response reached the control plane - Update rootfs-from-container.sh and update-debug-rootfs.sh to inject tini into the rootfs, checking the container image and host first, downloading from GitHub releases as fallback	2026-03-23 02:45:27 +06:00
pptx704	97292ba0bf	Added basic frontend (#1) Reviewed-on: wrenn/sandbox#1 Co-authored-by: pptx704 <rafeed@omukk.dev> Co-committed-by: pptx704 <rafeed@omukk.dev>	2026-03-22 19:01:38 +00:00
pptx704	2c66959b92	Add host registration, heartbeat, and multi-host management Implements the full host ↔ control plane connection flow: - Host CRUD endpoints (POST/GET/DELETE /v1/hosts) with role-based access: regular hosts admin-only, BYOC hosts for admins and team owners - One-time registration token flow: admin creates host → gets token (1hr TTL in Redis + Postgres audit trail) → host agent registers with specs → gets long-lived JWT (1yr) - Host agent registration client with automatic spec detection (arch, CPU, memory, disk) and token persistence to disk - Periodic heartbeat (30s) via POST /v1/hosts/{id}/heartbeat with X-Host-Token auth and host ID cross-check - Token regeneration endpoint (POST /v1/hosts/{id}/token) for retry after failed registration - Tag management (add/remove/list) with team-scoped access control - Host JWT with typ:"host" claim, cross-use prevention in both VerifyJWT and VerifyHostJWT - requireHostToken middleware for host agent authentication - DB-level race protection: RegisterHost uses AND status='pending' with rows-affected check; Redis GetDel for atomic token consume - Migration for future mTLS support (cert_fingerprint, mtls_enabled columns) - Host agent flags: --register (one-time token), --address (required ip:port) - serviceErrToHTTP extended with "forbidden" → 403 mapping - OpenAPI spec, .env.example, and README updated	2026-03-17 05:51:28 +06:00
pptx704	e4ead076e3	Add admin users, BYOC teams, hosts schema, and Redis for host registration Introduce three migrations: admin permissions (is_admin + permissions table), BYOC team tracking, and multi-host support (hosts, host_tokens, host_tags). Add Redis to dev infra and wire up client in control plane for ephemeral host registration tokens. Add go-redis dependency.	2026-03-17 03:26:42 +06:00
pptx704	1d59b50e49	Remove empty admin UI stubs The internal/admin/ package was never imported or mounted — just placeholder files. Removing to avoid confusion before the real dashboard is built.	2026-03-16 05:39:43 +06:00
pptx704	f38d5812d1	Extract shared service layer for sandbox, API key, and template operations Moves business logic from API handlers into internal/service/ so that both the REST API and the upcoming dashboard can share the same operations without duplicating code. API handlers now delegate to the service layer and only handle HTTP-specific concerns (request parsing, response formatting).	2026-03-16 05:39:30 +06:00
pptx704	931b7d54b3	Add GitHub OAuth login with provider registry Implement OAuth 2.0 login via GitHub as an alternative to email/password. Uses a provider registry pattern (internal/auth/oauth/) so adding Google or other providers later requires only a new Provider implementation. Flow: GET /v1/auth/oauth/github redirects to GitHub, callback exchanges the code for a user profile, upserts the user + team atomically, and redirects to the frontend with a JWT token. Key changes: - Migration: make password_hash nullable, add oauth_providers table - Provider registry with GitHubProvider (profile + email fallback) - CSRF state cookie with HMAC-SHA256 validation - Race-safe registration (23505 collision retries as login) - Startup validation: CP_PUBLIC_URL required when OAuth is configured Not fully tested — needs integration tests with a real GitHub OAuth app and end-to-end testing with the frontend callback page.	2026-03-15 06:31:58 +06:00
pptx704	477d4f8cf6	Add auto-pause TTL and ping endpoint for sandbox inactivity management Replace the existing auto-destroy TTL behavior with auto-pause: when a sandbox exceeds its timeout_sec of inactivity, the TTL reaper now pauses it (snapshot + teardown) instead of destroying it, preserving the ability to resume later. Key changes: - TTL reaper calls Pause instead of Destroy, with fallback to Destroy if pause fails (e.g. Firecracker process already gone) - New PingSandbox RPC resets the in-memory LastActiveAt timer - New POST /v1/sandboxes/{id}/ping REST endpoint resets both agent memory and DB last_active_at - ListSandboxes RPC now includes auto_paused_sandbox_ids so the reconciler can distinguish auto-paused sandboxes from crashed ones in a single call - Reconciler polls every 5s (was 30s) and marks auto-paused as "paused" vs orphaned as "stopped" - Resume RPC accepts timeout_sec from DB so TTL survives pause/resume cycles - Reaper checks every 2s (was 10s) and uses a detached context to avoid incomplete pauses on app shutdown - Default timeout_sec changed from 300 to 0 (no auto-pause unless requested)	2026-03-15 05:15:18 +06:00
pptx704	88246fac2b	Fix sandbox lifecycle cleanup and dmsetup remove reliability - Add retry with backoff to dmsetupRemove for transient "device busy" errors caused by kernel not releasing the device immediately after Firecracker exits. Only retries on "Device or resource busy"; other errors (not found, permission denied) return immediately. - Thread context.Context through RemoveSnapshot/RestoreSnapshot so retries respect cancellation. Use context.Background() in all error cleanup paths to prevent cancelled contexts from skipping cleanup and leaking dm devices on the host. - Resume vCPUs on pause failure: if snapshot creation or memfile processing fails after freezing the VM, unfreeze vCPUs so the sandbox stays usable instead of becoming a frozen zombie. - Fix resource leaks in Pause when CoW rename or metadata write fails: properly clean up network, slot, loop device, and remove from boxes map instead of leaving a dead sandbox with leaked host resources. - Fix Resume WaitUntilReady failure: roll back CoW file to the snapshot directory instead of deleting it, preserving the paused state so the user can retry. - Skip m.loops.Release when RemoveSnapshot fails during pause since the stale dm device still references the origin loop device. - Fix incorrect VCPUs placeholder in Resume VMConfig that used memory size instead of a sensible default.	2026-03-14 06:42:34 +06:00
pptx704	1846168736	Fix device-mapper "Device or resource busy" error on sandbox resume Pause was logging RemoveSnapshot failures as warnings and continuing, which left stale dm devices behind. Resume then failed trying to create a device with the same name. - Make RemoveSnapshot failure a hard error in Pause (clean up remaining resources and return error instead of silently proceeding) - Add defensive stale device cleanup in RestoreSnapshot before creating the new dm device	2026-03-14 03:57:14 +06:00
pptx704	c92cc29b88	Add authentication, authorization, and team-scoped access control Implement email/password auth with JWT sessions and API key auth for sandbox lifecycle. Users get a default team on signup; sandboxes, snapshots, and API keys are scoped to teams. - Add user, team, users_teams, and team_api_keys tables (goose migrations) - Add JWT middleware (Bearer token) for user management endpoints - Add API key middleware (X-API-Key header, SHA-256 hashed) for sandbox ops - Add signup/login handlers with transactional user+team creation - Add API key CRUD endpoints (create/list/delete) - Replace owner_id with team_id on sandboxes and templates - Update all handlers to use team-scoped queries - Add godotenv for .env file loading - Update OpenAPI spec and test UI with auth flows	2026-03-14 03:57:06 +06:00
pptx704	80a99eec87	Add diff snapshots for re-pause to avoid UFFD fault-in storm Use Firecracker's Diff snapshot type when re-pausing a previously resumed sandbox, capturing only dirty pages instead of a full memory dump. Chains up to 10 incremental generations before collapsing back to a Full snapshot. Multi-generation diff files (memfile.{buildID}) are supported alongside the legacy single-file format in resume, template creation, and snapshot existence checks.	2026-03-13 09:41:58 +06:00
pptx704	a0d635ae5e	Fix path traversal in template/snapshot names and network cleanup leaks Add SafeName validator (allowlist regex) to reject directory traversal in user-supplied template and snapshot names. Validated at both API handlers (400 response) and sandbox manager (defense in depth). Refactor CreateNetwork with rollback slice so partially created resources (namespace, veth, routes, iptables rules) are cleaned up on any error. Refactor RemoveNetwork to collect and return errors instead of silently ignoring them.	2026-03-13 08:40:36 +06:00
pptx704	63e9132d38	Add device-mapper snapshots, test UI, fix pause ordering and lint errors - Replace reflink rootfs copy with device-mapper snapshots (shared read-only loop device per base template, per-sandbox sparse CoW file) - Add devicemapper package with create/restore/remove/flatten operations and refcounted LoopRegistry for base image loop devices - Fix pause ordering: destroy VM before removing dm-snapshot to avoid "device busy" error (FC must release the dm device first) - Add test UI at GET /test for sandbox lifecycle management (create, pause, resume, destroy, exec, snapshot create/list/delete) - Fix DirSize to report actual disk usage (stat.Blocks * 512) instead of apparent size, so sparse CoW files report correctly - Add timing logs to pause flow for performance diagnostics - Fix all lint errors across api, network, vm, uffd, and sandbox packages - Remove obsolete internal/filesystem package (replaced by devicemapper) - Update CLAUDE.md with device-mapper architecture documentation	2026-03-13 08:25:40 +06:00
pptx704	778894b488	Made license related changes	2026-03-13 05:42:10 +06:00
pptx704	a1bd439c75	Add sandbox snapshot and restore with UFFD lazy memory loading Implement full snapshot lifecycle: pause (snapshot + free resources), resume (UFFD-based lazy restore), and named snapshot templates that can spawn new sandboxes from frozen VM state. Key changes: - Snapshot header system with generational diff mapping (inspired by e2b) - UFFD server for lazy page fault handling during snapshot restore - Stable rootfs symlink path (/tmp/fc-vm/) for snapshot compatibility - Templates DB table and CRUD API endpoints (POST/GET/DELETE /v1/snapshots) - CreateSnapshot/DeleteSnapshot RPCs in hostagent proto - Reconciler excludes paused sandboxes (expected absent from host agent) - Snapshot templates lock vcpus/memory to baked-in values - Proper cleanup of uffd sockets and pause snapshot files on destroy	2026-03-12 09:19:37 +06:00
pptx704	0c245e9e1c	Fix guest VM outbound networking and DNS resolution Add resolv.conf to wrenn-init so guests can resolve DNS, and fix the host MASQUERADE rule to match vpeerIP (the actual source after namespace SNAT) instead of hostIP.	2026-03-11 06:02:31 +06:00
pptx704	b4d8edb65b	Add streaming exec and file transfer endpoints Add WebSocket-based streaming exec endpoint and streaming file upload/download endpoints to the control plane API. Includes new host agent RPC methods (ExecStream, StreamWriteFile, StreamReadFile), envd client streaming support, and OpenAPI spec updates.	2026-03-11 05:42:42 +06:00
pptx704	ec3360d9ad	Add minimal control plane with REST API, database, and reconciler - REST API (chi router): sandbox CRUD, exec, pause/resume, file write/read - PostgreSQL persistence via pgx/v5 + sqlc (sandboxes table with goose migration) - Connect RPC client to host agent for all VM operations - Reconciler syncs host agent state with DB every 30s (detects TTL-reaped sandboxes) - OpenAPI 3.1 spec served at /openapi.yaml, Swagger UI at /docs - Added WriteFile/ReadFile RPCs to hostagent proto and implementations - File upload via multipart form, download via JSON body POST - sandbox_id propagated from control plane to host agent on create	2026-03-10 16:50:12 +06:00
pptx704	6f0c365d44	Add host agent RPC server with sandbox lifecycle management Implement the host agent as a Connect RPC server that orchestrates sandbox creation, destruction, pause/resume, and command execution. Includes sandbox manager with TTL-based reaper, network slot allocator, rootfs cloning, hostagent proto definition with generated stubs, and test/debug scripts. Fix Firecracker process lifetime bug where VM was tied to HTTP request context instead of background context.	2026-03-10 03:54:53 +06:00
pptx704	7753938044	Add host agent with VM lifecycle, TAP networking, and envd client Implements Phase 1: boot a Firecracker microVM, execute a command inside it via envd, and get the output back. Uses raw Firecracker HTTP API via Unix socket (not the Go SDK) for full control over the VM lifecycle. - internal/vm: VM manager with create/pause/resume/destroy, Firecracker HTTP client, process launcher with unshare + ip netns exec isolation - internal/network: per-sandbox network namespace with veth pair, TAP device, NAT rules, and IP forwarding - internal/envdclient: Connect RPC client for envd process/filesystem services with health check retry - cmd/host-agent: demo binary that boots a VM, runs "echo hello", prints output, and cleans up - proto/envd: canonical proto files with buf + protoc-gen-connect-go code generation - images/wrenn-init.sh: minimal PID 1 init script for guest VMs - CLAUDE.md: updated architecture to reflect TAP networking (not vsock) and Firecracker HTTP API (not Go SDK)	2026-03-10 00:06:47 +06:00
pptx704	bd78cc068c	Initial project structure for Wrenn Sandbox Set up directory layout, Makefiles, go.mod files, docker-compose, and empty placeholder files for all packages.	2026-03-09 17:22:47 +06:00

41 Commits