Resume() was building VMConfig without TemplateID, so Firecracker MMDS
received an empty string. envd's PostInit then wrote that empty value to
/run/wrenn/.WRENN_TEMPLATE_ID. Fix by persisting the template ID in
snapshot metadata during Pause and reading it back during Resume.
Running port-binding applications (Jupyter, http.server, NextJS) inside
sandboxes caused severe PTY sluggishness and proxy navigation errors.
Root cause: the CP sandbox proxy and Connect RPC pool shared a single
HTTP transport. Heavy proxy traffic (Jupyter WebSocket, REST polling)
interfered with PTY RPC streams via HTTP/2 flow control contention.
Transport isolation (main fix):
- Add dedicated proxy transport on CP (NewProxyTransport) with HTTP/2
disabled, separate from the RPC pool transport
- Add dedicated proxy transport on host agent, replacing
http.DefaultTransport
- Add dedicated envdclient transport with tuned connection pooling
- Replace http.DefaultClient in file streaming RPCs with per-sandbox
envd client
Proxy path rewriting (navigation fix):
- Add ModifyResponse to rewrite Location headers with /proxy/{id}/{port}
prefix, handling both root-relative and absolute-URL redirects
- Strip prefix back out in CP subdomain proxy for correct browser
behavior
- Replace path.Join with string concat in CP Director to preserve
trailing slashes (prevents redirect loops on directory listings)
Proxy resilience:
- Add dial retry with linear backoff (3 attempts) to handle socat
startup delay when ports are first detected
- Cache ReverseProxy instances per sandbox+port+host in sync.Map
- Add EvictProxy callback wired into sandbox Manager.Destroy
Buffer and server hardening:
- Increase PTY and exec stream channel buffers from 16 to 256
- Add ReadHeaderTimeout (10s) and IdleTimeout (620s) to host agent
HTTP server
Network tuning:
- Set TAP device TxQueueLen to 5000 (up from default 1000)
- Add Firecracker tx_rate_limiter (200 MB/s sustained, 100 MB burst)
to prevent guest traffic from saturating the TAP
ConnectProvider computed HMAC over bare state, but Callback always
verifies HMAC(state+":"+intent). This caused the account-linking
flow to always fail with invalid_state.
- Scope WebSocket auth bypass to only WS endpoints by restructuring
routes into separate chi Groups. Non-WS routes no longer passthrough
unauthenticated requests with spoofed Upgrade headers. Added
optionalAPIKeyOrJWT middleware for WS routes (injects auth context
from API key/JWT if present, passes through otherwise) and
markAdminWS middleware for admin WS routes.
- Fix nil pointer dereference in envd Handler.Wait() — p.tty.Close()
was called unconditionally but p.tty is nil for non-PTY processes,
crashing every non-PTY process exit.
- Fix goroutine leak in sandbox Pause — stopSampler was never called,
leaking one sampler goroutine per successful pause operation.
- Decouple PTY WebSocket reads from RPC dispatch using a buffered
channel to prevent backpressure-induced connection drops under fast
typing. Includes input coalescing to reduce RPC call volume.
## What's new
Compliance, audit, and account lifecycle improvements — admin actions are now fully auditable, user data is properly anonymized on deletion, and OAuth signup flow gives users control over their profile.
### Audit
- Added audit logging for all admin actions (user activate/deactivate, team BYOC toggle, team delete, template delete, build create/cancel)
- Added admin audit page with infinite scroll and hierarchical filters
- Fixed audit log team assignment — admin/host actions now correctly land under PlatformTeamID
- Anonymize audit logs on user hard-delete (actor name, IDs, emails stripped)
- Deduplicated audit logger internals (665 → 374 lines, no behavior change)
### Authentication
- Separated GitHub OAuth login/signup flows — login no longer auto-creates accounts
- Added name confirmation dialog for new GitHub signups
### Account Lifecycle
- Email notification sent when account is permanently deleted after grace period
- Audit log anonymization tied to user purge (per-user transactional)
### UX
- Removed accent gradient bars from admin host dialogs (border + shadow only)
- Frontend renders deleted users as styled badge in audit log view
### Others
- Version bump
- Bug fixes
Reviewed-on: wrenn/wrenn#36
Notify users via email when their account is permanently deleted after
the 15-day soft-delete grace period. Query now returns email alongside
user ID so the notification can be sent after deletion.
Email failure is logged as a warning but does not block cleanup.
Replace repetitive actorFields + write boilerplate across all 25+ typed
Log methods with shared helpers: newEntry (general), newAdminEntry
(platform-level), resolveHostTeamID, and logSystemHostEvent.
Reduces logger.go from 665 to 374 lines with no behavior change.
Log every admin-panel action (user activate/deactivate, team BYOC toggle,
team delete, template delete, build create/cancel) to the audit_logs table
under PlatformTeamID with scope "admin".
Add GET /v1/admin/audit-logs endpoint and /admin/audit frontend page with
infinite scroll and hierarchical filters. Expose audit.Entry + Log() for
cloud repo extensibility.
Fix seed_platform_team down-migration FK violation by deleting dependent
rows before the team row.
Normalize admin host page dialogs to match design system pattern:
border + shadow only, no colored gradient strips. Align animation
timing and shadow to reference components (DestroyDialog, etc).
Anonymize audit logs when soft-deleted users are purged after 15 days:
actor_name set to 'deleted-user', actor_id and resource_id nulled,
email stripped from member metadata. Per-user delete ensures no user
is removed without successful anonymization.
Frontend renders deleted-user as a styled red badge in audit log view.
Fix shared host create/delete audit logs landing in admin's personal
team — now correctly assigned to PlatformTeamID.
Block auto-account creation when signing in via GitHub from login mode.
Signup via GitHub now shows a name confirmation dialog before redirecting
to dashboard, letting users verify/edit their display name pulled from
GitHub.
- Add intent query param to OAuth redirect, persisted in HMAC-signed state cookie
- Block registration in callback when intent=login, return no_account error
- Set wrenn_oauth_new_signup cookie on new account creation
- Frontend callback shows name confirmation dialog for new signups
- Add no_account error message to login page
Separate summary cards with proper surface hierarchy, add staggered
entrance animations, tighten padding, and rewrite labels/descriptions
to be specific and actionable rather than generic.
Introduce pre-computed daily usage rollups from sandbox_metrics_snapshots.
An hourly background worker aggregates completed days, while today's
usage is computed live from snapshots at query time for freshness.
Backend: new daily_usage table, rollup worker, UsageService, and
GET /v1/capsules/usage endpoint with date range filtering (up to 92 days).
Frontend: replace Usage page placeholder with bar charts (Chart.js),
summary total cards, and preset/custom date range controls.
Extracts Mailer interface, EmailData, and Button to pkg/email/types.go
so the cloud repo can use them via ServerContext. internal/email re-exports
the types as aliases so existing callers are unchanged. Also fixes
pre-existing lint errors (unchecked rollback and deadline calls).
On startup EnsureImageSizes expands the minimal rootfs to the configured
disk size. This adds the inverse: ShrinkMinimalImage runs e2fsck + resize2fs -M
during graceful shutdown so the image is stored compactly on disk.
Both control plane and host agent now write structured slog output to
$WRENN_DIR/logs/ in addition to stderr. Log level is configurable via
LOG_LEVEL env var (default: info). SIGHUP reopens the log file so
logrotate can rotate without copytruncate.
The veth addressing uses 10.12.0.0/16 with 2 IPs per slot. At slot
index 32768, vethOffset=65536 overflows byte arithmetic and wraps back
to 10.12.0.0, causing silent IP collisions with existing sandboxes.
Cap the allocator at 32767, which is the actual addressable limit.
When an admin disables a user, all active sandboxes (running, paused,
hibernated) for teams they own are now destroyed and their API keys
are deleted. User queries now filter by status column instead of
deleted_at, so re-enabling a user always works. OAuth login paths
use ensureDefaultTeam to auto-create a team if the user has none,
matching the email/password login behavior.
Sidebar and AdminSidebar were re-instantiated on every page navigation
(17 pages total), causing unnecessary DOM teardown/rebuild and redundant
localStorage reads. Now each lives in its respective +layout.svelte as a
single persistent instance.
Also adds onDestroy cleanup for leaked timers (settings, team, login RAF
loop) and CSS containment on <main> to isolate layout recalculations.
Delete all API keys created by a user when their account is disabled,
deleted, or soft-deleted. Store build archives before enqueuing to Redis
so workers never dequeue a build with missing files.
- Add ON DELETE CASCADE to users_teams, oauth_providers, admin_permissions
and ON DELETE SET NULL (with nullable columns) to team_api_keys.created_by,
hosts.created_by, host_tokens.created_by so HardDeleteExpiredUsers no longer
fails with FK violations
- User account deletion now cascades to sole-owned teams via DeleteTeamInternal,
preventing orphaned teams with live sandboxes after account removal
- ListActiveSandboxesByTeam now includes hibernated sandboxes so their disk
snapshots are cleaned up during team deletion
- Team soft-delete now hard-deletes sandbox metric points, metric snapshots,
API keys, and channels to prevent data accumulation on deleted teams
- Extract deleteTeamCore() to deduplicate shared logic across DeleteTeam,
AdminDeleteTeam, and DeleteTeamInternal
- Fix ListAPIKeysByTeamWithCreator to use LEFT JOIN after created_by became
nullable, and update handler to read pgtype.Text.String for creator_email