Show "[session disconnected]" in terminal when PTY websocket closes cleanly.
Map scheduler and agent unavailability errors to 503 with user-friendly
message instead of leaking internal details.
Running port-binding applications (Jupyter, http.server, NextJS) inside
sandboxes caused severe PTY sluggishness and proxy navigation errors.
Root cause: the CP sandbox proxy and Connect RPC pool shared a single
HTTP transport. Heavy proxy traffic (Jupyter WebSocket, REST polling)
interfered with PTY RPC streams via HTTP/2 flow control contention.
Transport isolation (main fix):
- Add dedicated proxy transport on CP (NewProxyTransport) with HTTP/2
disabled, separate from the RPC pool transport
- Add dedicated proxy transport on host agent, replacing
http.DefaultTransport
- Add dedicated envdclient transport with tuned connection pooling
- Replace http.DefaultClient in file streaming RPCs with per-sandbox
envd client
Proxy path rewriting (navigation fix):
- Add ModifyResponse to rewrite Location headers with /proxy/{id}/{port}
prefix, handling both root-relative and absolute-URL redirects
- Strip prefix back out in CP subdomain proxy for correct browser
behavior
- Replace path.Join with string concat in CP Director to preserve
trailing slashes (prevents redirect loops on directory listings)
Proxy resilience:
- Add dial retry with linear backoff (3 attempts) to handle socat
startup delay when ports are first detected
- Cache ReverseProxy instances per sandbox+port+host in sync.Map
- Add EvictProxy callback wired into sandbox Manager.Destroy
Buffer and server hardening:
- Increase PTY and exec stream channel buffers from 16 to 256
- Add ReadHeaderTimeout (10s) and IdleTimeout (620s) to host agent
HTTP server
Network tuning:
- Set TAP device TxQueueLen to 5000 (up from default 1000)
- Add Firecracker tx_rate_limiter (200 MB/s sustained, 100 MB burst)
to prevent guest traffic from saturating the TAP
ConnectProvider computed HMAC over bare state, but Callback always
verifies HMAC(state+":"+intent). This caused the account-linking
flow to always fail with invalid_state.
- Scope WebSocket auth bypass to only WS endpoints by restructuring
routes into separate chi Groups. Non-WS routes no longer passthrough
unauthenticated requests with spoofed Upgrade headers. Added
optionalAPIKeyOrJWT middleware for WS routes (injects auth context
from API key/JWT if present, passes through otherwise) and
markAdminWS middleware for admin WS routes.
- Fix nil pointer dereference in envd Handler.Wait() — p.tty.Close()
was called unconditionally but p.tty is nil for non-PTY processes,
crashing every non-PTY process exit.
- Fix goroutine leak in sandbox Pause — stopSampler was never called,
leaking one sampler goroutine per successful pause operation.
- Decouple PTY WebSocket reads from RPC dispatch using a buffered
channel to prevent backpressure-induced connection drops under fast
typing. Includes input coalescing to reduce RPC call volume.
Log every admin-panel action (user activate/deactivate, team BYOC toggle,
team delete, template delete, build create/cancel) to the audit_logs table
under PlatformTeamID with scope "admin".
Add GET /v1/admin/audit-logs endpoint and /admin/audit frontend page with
infinite scroll and hierarchical filters. Expose audit.Entry + Log() for
cloud repo extensibility.
Fix seed_platform_team down-migration FK violation by deleting dependent
rows before the team row.
Block auto-account creation when signing in via GitHub from login mode.
Signup via GitHub now shows a name confirmation dialog before redirecting
to dashboard, letting users verify/edit their display name pulled from
GitHub.
- Add intent query param to OAuth redirect, persisted in HMAC-signed state cookie
- Block registration in callback when intent=login, return no_account error
- Set wrenn_oauth_new_signup cookie on new account creation
- Frontend callback shows name confirmation dialog for new signups
- Add no_account error message to login page
Introduce pre-computed daily usage rollups from sandbox_metrics_snapshots.
An hourly background worker aggregates completed days, while today's
usage is computed live from snapshots at query time for freshness.
Backend: new daily_usage table, rollup worker, UsageService, and
GET /v1/capsules/usage endpoint with date range filtering (up to 92 days).
Frontend: replace Usage page placeholder with bar charts (Chart.js),
summary total cards, and preset/custom date range controls.
Extracts Mailer interface, EmailData, and Button to pkg/email/types.go
so the cloud repo can use them via ServerContext. internal/email re-exports
the types as aliases so existing callers are unchanged. Also fixes
pre-existing lint errors (unchecked rollback and deadline calls).