1
0
forked from wrenn/wrenn
Commit Graph

149 Commits

Author SHA1 Message Date
9878156798 Merge pull request 'Set up working host registration (including BYOC) with the CP' (#6) from host-registration into dev
Reviewed-on: wrenn/sandbox#6
2026-03-24 21:19:12 +00:00
e069b3e679 Add BYOC page, admin section, and is_byoc team visibility gating
- Frontend: BYOC hosts page (/dashboard/byoc) with register/delete flows,
  shimmer loading, pulsing online status, animated token reveal checkmark
- Frontend: Admin section (/admin/hosts) with platform + BYOC tabs, stat
  pills, skeleton loading, slide-in animations for new rows
- Frontend: AdminSidebar component with accent top bar and admin pill badge
- Frontend: BYOC nav item shown only when team.is_byoc is true (derived
  from teams store, not JWT); disabled for members
- Frontend: Admin shield button in Sidebar, visible only to platform admins
- Backend: is_admin in JWT claims + requireAdmin middleware (DB-validated)
- Backend: is_byoc added to teamResponse so frontend derives visibility
  from fresh team data rather than stale JWT fields
- Backend: SetBYOC admin endpoint (PUT /v1/admin/teams/{id}/byoc)
- Backend: Admin hosts list enriches BYOC entries with team_name
- Host agent: load .env file via godotenv on startup
2026-03-25 03:10:41 +06:00
9bf67aa7f7 Implement host registration, JWT refresh tokens, and multi-host scheduling
Replaces the hardcoded CP_HOST_AGENT_ADDR single-agent setup with a
DB-driven registration system supporting multiple host agents (BYOC).

Key changes:
- Host agents register via one-time token, receive a 7-day JWT + 60-day
  refresh token; heartbeat loop auto-refreshes on 401/403 and pauses all
  sandboxes if refresh fails
- HostClientPool: lazy Connect RPC client cache keyed by host ID, replacing
  the single static agent client throughout the API and service layers
- RoundRobinScheduler: picks an online host for each new sandbox via
  ListActiveHosts; extensible for future scheduling strategies
- HostMonitor (replaces Reconciler): passive heartbeat staleness check marks
  hosts unreachable and sandboxes missing after 90s; active reconciliation
  per online host restores missing-but-alive sandboxes and stops orphans
- Graceful host delete: returns 409 with affected sandbox list without
  ?force=true; force-delete destroys sandboxes then evicts pool client
- Snapshot delete broadcasts to all online hosts (templates have no host_id)
- sandbox.Manager.PauseAll: pauses all running VMs on CP connectivity loss
- New migration: host_refresh_tokens table with token rotation (issue-then-
  revoke ordering to prevent lockout on mid-rotation crash)
- New sandbox status 'missing' (reversible, unlike 'stopped') and host
  status 'unreachable'; both reflected in OpenAPI spec
- Fix: refresh token auth failure now returns 401 (was 400 via generic
  'invalid' substring match in serviceErrToHTTP)
2026-03-24 18:32:05 +06:00
f968da9768 Minor frontend enhancements 2026-03-24 17:25:00 +06:00
3932bc056e Add user names, team-scoped sandbox guard, and login robustness fixes
- Add name column to users (migration + sqlc regen); propagate through JWT
  claims, auth context, all auth/OAuth handlers, service layer, and frontend
- Sidebar and team page show name instead of email; team page splits Name/Email
  into separate columns
- Block sandbox creation in UI and API when user has no active team context
- loginTeam helper falls back to first active team when no default is set,
  fixing login for invited users with no is_default membership
- Exclude soft-deleted teams from GetDefaultTeamForUser, GetBYOCTeams queries
- Guard host creation against soft-deleted teams in service/host.go
- SwitchTeam re-fetches name from DB instead of trusting stale JWT claim
- Reset teams store on login so stale data from a previous session never persists
- Update openapi.yaml: add name to SignupRequest and AuthResponse schemas
2026-03-24 16:56:10 +06:00
aaeccd32ce Merge pull request 'Frontend consistency and improvements' (#5) from frontend-enhancement into dev
Reviewed-on: wrenn/sandbox#5
2026-03-24 10:00:27 +00:00
915d934c26 Frontend consistency pass: delight, audit, and normalization
Delight (keys page):
- Animated checkmark draw + circle pop on key reveal dialog open
- Key display area pulses accent glow on open to draw eye to "copy this"
- Copy button spring-bounces on successful copy (re-triggers on repeat)
- Empty state key icon floats (iconFloat, now global)
- Row hover uses scaleY left-accent stripe (matches capsules pattern)
- New key row flashes accent on reveal dialog dismiss (matches capsule-born)

Audit fixes (all dashboard pages):
- Page titles standardized to em dash: "Wrenn — X" across all four pages
- formatDate/timeAgo extracted to src/lib/utils/format.ts (string | undefined
  signatures); keys and snapshots now import from there instead of duplicating
- team formatDate gains undefined guard (kept local, date-only format differs)
- spin-once and iconFloat keyframes moved to app.css as globals; scoped copies
  removed from capsules and keys
- Snapshots empty state icon was referencing undefined @keyframes float; fixed
  to iconFloat

Normalization:
- Snapshots table rows: replaced ::before pseudo-element accent (opacity-only,
  single color) with DOM row-stripe element using scaleY transition, type-keyed
  color (green for snapshots, blue for images) — matches capsules pattern
- Create Key dialog: max-w-[400px] → max-w-[420px] to align with form dialogs
- Snapshots count and empty-state heading are now terminology-aware: shows
  "templates/snapshots/images" based on active filter; empty heading for all
  filter reads "No templates yet" instead of "No snapshots yet"

Not done (documented in audit, deferred):
- Sidebar nav items pointing to unimplemented routes (audit, usage, billing,
  notifications, settings) — left as-is, needs product decision
- Dialog max-widths fully normalized beyond Create Key — minor, deferred
- capsules timeAgo not imported from shared util (formatTime differs intentionally)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-24 15:51:11 +06:00
336080bb6d Merge pull request 'Added team related functionalities' (#4) from team-management into dev
Reviewed-on: wrenn/sandbox#4
2026-03-24 08:58:32 +00:00
90c296f5e1 Polish team page: delight micro-interactions and layout improvements
- Slug + Team ID rows collapsed into a 2-column grid for better density
- "you" badge moved inline with email instead of stacked below it
- Copy checkmark draws itself via SVG stroke-dashoffset animation
- New member row flashes accent-green on entry
- Removed member row slides out smoothly (fly transition)
- Member rows use staggered fly-in on page load
- Team name briefly highlights accent color after a successful rename
- Search result avatars get colorized initials based on email character
2026-03-24 14:56:19 +06:00
bf494f73fc Fix team name blink on navigation by lifting teams into a singleton store
Teams list was fetched on every Sidebar mount (each page navigation),
causing a flash from '…' to the real name on every tab switch. Move teams
into a module-level reactive store (teams.svelte.ts) that fetches once per
session and is shared between Sidebar and the team page.
2026-03-24 14:44:09 +06:00
71a7fdb76f Fix user search to trigger on 3 characters without requiring @
The anti-enumeration guard required @ in the email prefix, causing the
typeahead to silently return nothing until the user typed @. Replace with
a minimum 3-character length check to match the frontend trigger condition.
2026-03-24 14:41:01 +06:00
b3e8bdd171 Refine team management: name chars, danger zone, no-team state
- Allow hyphens, @, and apostrophes in team names (backend regex)
- After delete/leave, switch to next available team instead of logging
  out; if no teams remain, show a toast prompting to create one
- Disable delete/leave button when user has only one team, with
  explanatory hint to create another team first
- Show empty state on /dashboard/team when auth has no team context,
  pointing user to the sidebar to create a team
- Fetch all teams in parallel with team detail on page load to power
  the isLastTeam guard
2026-03-24 14:34:20 +06:00
1e681da738 Add team management frontend
- New /dashboard/team page with inline team name editing, slug/ID copy,
  members table with split-button (remove + make admin/member), add member
  typeahead, and danger zone (delete/leave) with confirmation dialogs
- Sidebar now fetches real teams from API, supports team switching and
  team creation via dialog
- Rename nav item Members → Team, route /dashboard/members → /dashboard/team
- New src/lib/api/team.ts with typed functions for all team endpoints
2026-03-24 14:21:53 +06:00
8e5d426638 Add team management endpoints
- Three-role model (owner/admin/member) with owner protection invariants
- Team CRUD: create, rename (admin+), soft-delete with VM cleanup (owner only)
- Member management: add by email, remove, role updates (admin+), leave
- Switch-team endpoint re-issues JWT after DB membership verification
- User email prefix search for add-member UI autocomplete
- JWT carries role as a hint; all authorization decisions verified from DB
- Team slug: immutable 12-char hex (e.g. a1b2c3-d1e2f3), reserved on soft-delete
- Migration adds slug + deleted_at to teams; backfills existing rows
2026-03-24 13:29:54 +06:00
4e26d7a292 Merge pull request 'Minor frontend enhancement' (#3) from frontend into dev
Reviewed-on: wrenn/sandbox#3
2026-03-24 06:36:17 +00:00
79eba782fb Updated design docs 2026-03-24 12:34:58 +06:00
b786a825d4 Polish dashboard frontend: spacing, copy, resilience
- Increase content padding (p-7→p-8) and table cell padding (px-4→px-5,
  py-3→py-4 for data rows) across capsules, keys, and snapshots pages
- Improve animation performance: wrenn-glow uses opacity instead of
  box-shadow (compositor-only, no paint cost)
- Add prefers-reduced-motion media query covering inline style animations
- Fix OAuth error display on login page (read ?error= param on mount)
- Harden clipboard copy with try-catch and toast fallback
- Improve empty state copy, dialog microcopy, and error messages
- Add retry button to error banners on keys page
- Replace "All systems operational" footer bar with a clean 1px divider
- Fix text truncation on long capsule/snapshot names (min-w-0 + truncate)
2026-03-24 12:33:18 +06:00
71564b202e Merge branch 'main' of git.omukk.dev:wrenn/sandbox into dev 2026-03-24 01:11:43 +06:00
32e5a5a715 Prototype with single host server and no admin panel (#2)
Reviewed-on: wrenn/sandbox#2
Co-authored-by: pptx704 <rafeed@omukk.dev>
Co-committed-by: pptx704 <rafeed@omukk.dev>
2026-03-22 21:01:23 +00:00
5f0dbadea6 Fix snapshot and sandbox delete consistency
- Snapshot delete: make agent RPC failure a hard error so DB record is
  not removed when files cannot be deleted from disk
- Snapshot overwrite: call agent to delete old files before removing the
  DB record, preventing stale memfile.{uuid} generations from accumulating
  on disk across repeated overwrites
- Sandbox destroy: only swallow CodeNotFound from the agent (sandbox
  already gone / TTL-reaped); any other error now propagates to the caller
  instead of being silently ignored
2026-03-23 02:59:30 +06:00
36782e1b4f Add tini as PID 1, guest clock sync, and fix PATH in guest VMs
- Use tini as PID 1 in wrenn-init.sh so zombie processes are reaped
  and signals are forwarded correctly to envd
- Set standard PATH in wrenn-init.sh so child processes spawned by envd
  can find common binaries (fixes "nice: ls command not found")
- Add envdclient.Init() to POST /init on envd after every boot/resume,
  syncing the guest clock via unix.ClockSettime — critical after snapshot
  resume where the guest clock is frozen
- Run Init in a background goroutine so it doesn't block the CreateSandbox
  RPC response; a slow Init (vCPU busy with envd startup) was causing the
  RPC context to be canceled before the response reached the control plane
- Update rootfs-from-container.sh and update-debug-rootfs.sh to inject
  tini into the rootfs, checking the container image and host first,
  downloading from GitHub releases as fallback
2026-03-23 02:45:27 +06:00
97292ba0bf Added basic frontend (#1)
Reviewed-on: wrenn/sandbox#1
Co-authored-by: pptx704 <rafeed@omukk.dev>
Co-committed-by: pptx704 <rafeed@omukk.dev>
2026-03-22 19:01:38 +00:00
866f3ac012 Consolidate host agent path env vars into single AGENT_FILES_ROOTDIR
Replace AGENT_KERNEL_PATH, AGENT_IMAGES_PATH, AGENT_SANDBOXES_PATH,
AGENT_SNAPSHOTS_PATH, and AGENT_TOKEN_FILE with a single
AGENT_FILES_ROOTDIR (default /var/lib/wrenn) that derives all
subdirectory paths automatically.
2026-03-17 05:59:26 +06:00
2c66959b92 Add host registration, heartbeat, and multi-host management
Implements the full host ↔ control plane connection flow:

- Host CRUD endpoints (POST/GET/DELETE /v1/hosts) with role-based access:
  regular hosts admin-only, BYOC hosts for admins and team owners
- One-time registration token flow: admin creates host → gets token (1hr TTL
  in Redis + Postgres audit trail) → host agent registers with specs → gets
  long-lived JWT (1yr)
- Host agent registration client with automatic spec detection (arch, CPU,
  memory, disk) and token persistence to disk
- Periodic heartbeat (30s) via POST /v1/hosts/{id}/heartbeat with X-Host-Token
  auth and host ID cross-check
- Token regeneration endpoint (POST /v1/hosts/{id}/token) for retry after
  failed registration
- Tag management (add/remove/list) with team-scoped access control
- Host JWT with typ:"host" claim, cross-use prevention in both VerifyJWT and
  VerifyHostJWT
- requireHostToken middleware for host agent authentication
- DB-level race protection: RegisterHost uses AND status='pending' with
  rows-affected check; Redis GetDel for atomic token consume
- Migration for future mTLS support (cert_fingerprint, mtls_enabled columns)
- Host agent flags: --register (one-time token), --address (required ip:port)
- serviceErrToHTTP extended with "forbidden" → 403 mapping
- OpenAPI spec, .env.example, and README updated
2026-03-17 05:51:28 +06:00
e4ead076e3 Add admin users, BYOC teams, hosts schema, and Redis for host registration
Introduce three migrations: admin permissions (is_admin + permissions table),
BYOC team tracking, and multi-host support (hosts, host_tokens, host_tags).
Add Redis to dev infra and wire up client in control plane for ephemeral
host registration tokens. Add go-redis dependency.
2026-03-17 03:26:42 +06:00
1d59b50e49 Remove empty admin UI stubs
The internal/admin/ package was never imported or mounted — just
placeholder files. Removing to avoid confusion before the real
dashboard is built.
2026-03-16 05:39:43 +06:00
f38d5812d1 Extract shared service layer for sandbox, API key, and template operations
Moves business logic from API handlers into internal/service/ so that
both the REST API and the upcoming dashboard can share the same operations
without duplicating code. API handlers now delegate to the service layer
and only handle HTTP-specific concerns (request parsing, response formatting).
2026-03-16 05:39:30 +06:00
931b7d54b3 Add GitHub OAuth login with provider registry
Implement OAuth 2.0 login via GitHub as an alternative to email/password.
Uses a provider registry pattern (internal/auth/oauth/) so adding Google
or other providers later requires only a new Provider implementation.

Flow: GET /v1/auth/oauth/github redirects to GitHub, callback exchanges
the code for a user profile, upserts the user + team atomically, and
redirects to the frontend with a JWT token.

Key changes:
- Migration: make password_hash nullable, add oauth_providers table
- Provider registry with GitHubProvider (profile + email fallback)
- CSRF state cookie with HMAC-SHA256 validation
- Race-safe registration (23505 collision retries as login)
- Startup validation: CP_PUBLIC_URL required when OAuth is configured

Not fully tested — needs integration tests with a real GitHub OAuth app
and end-to-end testing with the frontend callback page.
2026-03-15 06:31:58 +06:00
477d4f8cf6 Add auto-pause TTL and ping endpoint for sandbox inactivity management
Replace the existing auto-destroy TTL behavior with auto-pause: when a
sandbox exceeds its timeout_sec of inactivity, the TTL reaper now pauses
it (snapshot + teardown) instead of destroying it, preserving the ability
to resume later.

Key changes:
- TTL reaper calls Pause instead of Destroy, with fallback to Destroy if
  pause fails (e.g. Firecracker process already gone)
- New PingSandbox RPC resets the in-memory LastActiveAt timer
- New POST /v1/sandboxes/{id}/ping REST endpoint resets both agent memory
  and DB last_active_at
- ListSandboxes RPC now includes auto_paused_sandbox_ids so the reconciler
  can distinguish auto-paused sandboxes from crashed ones in a single call
- Reconciler polls every 5s (was 30s) and marks auto-paused as "paused"
  vs orphaned as "stopped"
- Resume RPC accepts timeout_sec from DB so TTL survives pause/resume cycles
- Reaper checks every 2s (was 10s) and uses a detached context to avoid
  incomplete pauses on app shutdown
- Default timeout_sec changed from 300 to 0 (no auto-pause unless requested)
2026-03-15 05:15:18 +06:00
88246fac2b Fix sandbox lifecycle cleanup and dmsetup remove reliability
- Add retry with backoff to dmsetupRemove for transient "device busy"
  errors caused by kernel not releasing the device immediately after
  Firecracker exits. Only retries on "Device or resource busy"; other
  errors (not found, permission denied) return immediately.

- Thread context.Context through RemoveSnapshot/RestoreSnapshot so
  retries respect cancellation. Use context.Background() in all error
  cleanup paths to prevent cancelled contexts from skipping cleanup
  and leaking dm devices on the host.

- Resume vCPUs on pause failure: if snapshot creation or memfile
  processing fails after freezing the VM, unfreeze vCPUs so the
  sandbox stays usable instead of becoming a frozen zombie.

- Fix resource leaks in Pause when CoW rename or metadata write fails:
  properly clean up network, slot, loop device, and remove from boxes
  map instead of leaving a dead sandbox with leaked host resources.

- Fix Resume WaitUntilReady failure: roll back CoW file to the snapshot
  directory instead of deleting it, preserving the paused state so the
  user can retry.

- Skip m.loops.Release when RemoveSnapshot fails during pause since
  the stale dm device still references the origin loop device.

- Fix incorrect VCPUs placeholder in Resume VMConfig that used memory
  size instead of a sensible default.
2026-03-14 06:42:34 +06:00
1846168736 Fix device-mapper "Device or resource busy" error on sandbox resume
Pause was logging RemoveSnapshot failures as warnings and continuing,
which left stale dm devices behind. Resume then failed trying to create
a device with the same name.

- Make RemoveSnapshot failure a hard error in Pause (clean up remaining
  resources and return error instead of silently proceeding)
- Add defensive stale device cleanup in RestoreSnapshot before creating
  the new dm device
2026-03-14 03:57:14 +06:00
c92cc29b88 Add authentication, authorization, and team-scoped access control
Implement email/password auth with JWT sessions and API key auth for
sandbox lifecycle. Users get a default team on signup; sandboxes,
snapshots, and API keys are scoped to teams.

- Add user, team, users_teams, and team_api_keys tables (goose migrations)
- Add JWT middleware (Bearer token) for user management endpoints
- Add API key middleware (X-API-Key header, SHA-256 hashed) for sandbox ops
- Add signup/login handlers with transactional user+team creation
- Add API key CRUD endpoints (create/list/delete)
- Replace owner_id with team_id on sandboxes and templates
- Update all handlers to use team-scoped queries
- Add godotenv for .env file loading
- Update OpenAPI spec and test UI with auth flows
2026-03-14 03:57:06 +06:00
712b77b01c Add script to create rootfs from Docker container 2026-03-13 09:41:58 +06:00
80a99eec87 Add diff snapshots for re-pause to avoid UFFD fault-in storm
Use Firecracker's Diff snapshot type when re-pausing a previously
resumed sandbox, capturing only dirty pages instead of a full memory
dump. Chains up to 10 incremental generations before collapsing back
to a Full snapshot. Multi-generation diff files (memfile.{buildID})
are supported alongside the legacy single-file format in resume,
template creation, and snapshot existence checks.
2026-03-13 09:41:58 +06:00
a0d635ae5e Fix path traversal in template/snapshot names and network cleanup leaks
Add SafeName validator (allowlist regex) to reject directory traversal
in user-supplied template and snapshot names. Validated at both API
handlers (400 response) and sandbox manager (defense in depth).

Refactor CreateNetwork with rollback slice so partially created
resources (namespace, veth, routes, iptables rules) are cleaned up
on any error. Refactor RemoveNetwork to collect and return errors
instead of silently ignoring them.
2026-03-13 08:40:36 +06:00
63e9132d38 Add device-mapper snapshots, test UI, fix pause ordering and lint errors
- Replace reflink rootfs copy with device-mapper snapshots (shared
  read-only loop device per base template, per-sandbox sparse CoW file)
- Add devicemapper package with create/restore/remove/flatten operations
  and refcounted LoopRegistry for base image loop devices
- Fix pause ordering: destroy VM before removing dm-snapshot to avoid
  "device busy" error (FC must release the dm device first)
- Add test UI at GET /test for sandbox lifecycle management (create,
  pause, resume, destroy, exec, snapshot create/list/delete)
- Fix DirSize to report actual disk usage (stat.Blocks * 512) instead
  of apparent size, so sparse CoW files report correctly
- Add timing logs to pause flow for performance diagnostics
- Fix all lint errors across api, network, vm, uffd, and sandbox packages
- Remove obsolete internal/filesystem package (replaced by devicemapper)
- Update CLAUDE.md with device-mapper architecture documentation
2026-03-13 08:25:40 +06:00
778894b488 Made license related changes 2026-03-13 05:42:10 +06:00
a1bd439c75 Add sandbox snapshot and restore with UFFD lazy memory loading
Implement full snapshot lifecycle: pause (snapshot + free resources),
resume (UFFD-based lazy restore), and named snapshot templates that
can spawn new sandboxes from frozen VM state.

Key changes:
- Snapshot header system with generational diff mapping (inspired by e2b)
- UFFD server for lazy page fault handling during snapshot restore
- Stable rootfs symlink path (/tmp/fc-vm/) for snapshot compatibility
- Templates DB table and CRUD API endpoints (POST/GET/DELETE /v1/snapshots)
- CreateSnapshot/DeleteSnapshot RPCs in hostagent proto
- Reconciler excludes paused sandboxes (expected absent from host agent)
- Snapshot templates lock vcpus/memory to baked-in values
- Proper cleanup of uffd sockets and pause snapshot files on destroy
2026-03-12 09:19:37 +06:00
9b94df7f56 Rewrite CLAUDE.md and README.md
CLAUDE.md: replace bloated 850-line version with focused 230-line
guide. Fix inaccuracies (module path, build dir, Connect RPC vs gRPC,
buf vs protoc). Add detailed architecture with request flows, code
generation workflow, rootfs update process, and two-module gotchas.

README.md: add core deployment instructions (prerequisites, build,
host setup, configuration, running, rootfs workflow).
2026-03-11 06:37:11 +06:00
0c245e9e1c Fix guest VM outbound networking and DNS resolution
Add resolv.conf to wrenn-init so guests can resolve DNS, and fix the
host MASQUERADE rule to match vpeerIP (the actual source after namespace
SNAT) instead of hostIP.
2026-03-11 06:02:31 +06:00
b4d8edb65b Add streaming exec and file transfer endpoints
Add WebSocket-based streaming exec endpoint and streaming file
upload/download endpoints to the control plane API. Includes new
host agent RPC methods (ExecStream, StreamWriteFile, StreamReadFile),
envd client streaming support, and OpenAPI spec updates.
2026-03-11 05:42:42 +06:00
ec3360d9ad Add minimal control plane with REST API, database, and reconciler
- REST API (chi router): sandbox CRUD, exec, pause/resume, file write/read
- PostgreSQL persistence via pgx/v5 + sqlc (sandboxes table with goose migration)
- Connect RPC client to host agent for all VM operations
- Reconciler syncs host agent state with DB every 30s (detects TTL-reaped sandboxes)
- OpenAPI 3.1 spec served at /openapi.yaml, Swagger UI at /docs
- Added WriteFile/ReadFile RPCs to hostagent proto and implementations
- File upload via multipart form, download via JSON body POST
- sandbox_id propagated from control plane to host agent on create
2026-03-10 16:50:12 +06:00
d7b25b0891 updated license structure 2026-03-10 04:32:29 +06:00
34c89e814d Added basic license information 2026-03-10 04:28:51 +06:00
6f0c365d44 Add host agent RPC server with sandbox lifecycle management
Implement the host agent as a Connect RPC server that orchestrates
sandbox creation, destruction, pause/resume, and command execution.
Includes sandbox manager with TTL-based reaper, network slot allocator,
rootfs cloning, hostagent proto definition with generated stubs, and
test/debug scripts. Fix Firecracker process lifetime bug where VM was
tied to HTTP request context instead of background context.
2026-03-10 03:54:53 +06:00
c31ce90306 Centralize envd proto source of truth to proto/envd/
Remove duplicate proto files from envd/spec/ and update envd's
buf.gen.yaml to generate stubs from the canonical proto/envd/ location.
Both modules now generate their own Connect RPC stubs from the same
source protos.
2026-03-10 02:49:31 +06:00
7753938044 Add host agent with VM lifecycle, TAP networking, and envd client
Implements Phase 1: boot a Firecracker microVM, execute a command inside
it via envd, and get the output back. Uses raw Firecracker HTTP API via
Unix socket (not the Go SDK) for full control over the VM lifecycle.

- internal/vm: VM manager with create/pause/resume/destroy, Firecracker
  HTTP client, process launcher with unshare + ip netns exec isolation
- internal/network: per-sandbox network namespace with veth pair, TAP
  device, NAT rules, and IP forwarding
- internal/envdclient: Connect RPC client for envd process/filesystem
  services with health check retry
- cmd/host-agent: demo binary that boots a VM, runs "echo hello", prints
  output, and cleans up
- proto/envd: canonical proto files with buf + protoc-gen-connect-go
  code generation
- images/wrenn-init.sh: minimal PID 1 init script for guest VMs
- CLAUDE.md: updated architecture to reflect TAP networking (not vsock)
  and Firecracker HTTP API (not Go SDK)
2026-03-10 00:06:47 +06:00
a3898d68fb Port envd from e2b with internalized shared packages and Connect RPC
- Copy envd source from e2b-dev/infra, internalize shared dependencies
  into envd/internal/shared/ (keys, filesystem, id, smap, utils)
- Switch from gRPC to Connect RPC for all envd services
- Update module paths to git.omukk.dev/wrenn/{sandbox,sandbox/envd}
- Add proto specs (process, filesystem) with buf-based code generation
- Implement full envd: process exec, filesystem ops, port forwarding,
  cgroup management, MMDS integration, and HTTP API
- Update main module dependencies (firecracker SDK, pgx, goose, etc.)
- Remove placeholder .gitkeep files replaced by real implementations
2026-03-09 21:03:19 +06:00
bd78cc068c Initial project structure for Wrenn Sandbox
Set up directory layout, Makefiles, go.mod files, docker-compose,
and empty placeholder files for all packages.
2026-03-09 17:22:47 +06:00