wrenn-releases

Author	SHA1	Message	Date
pptx704	3deecbff89	fix: prevent Go runtime memory corruption and sandbox halt after snapshot restore Three root causes addressed: 1. Go page allocator corruption: allocations between the pre-snapshot GC and VM freeze leave the summary tree inconsistent. After restore, GC reads corrupted metadata — either panicking (killing PID 1 → kernel panic) or silently failing to collect, causing unbounded heap growth until OOM. Fix: move GC to after all HTTP allocations in PostSnapshotPrepare, then set GOMAXPROCS(1) so any remaining allocations run sequentially with no concurrent page allocator access. GOMAXPROCS is restored on first health check after restore. 2. PostInit timeout starvation: WaitUntilReady and PostInit shared a single 30s context. If WaitUntilReady consumed most of it, PostInit failed — RestoreAfterSnapshot never ran, leaving envd with keep-alives disabled and zombie connections. Fix: separate timeout contexts. 3. CP HTTP server missing timeouts: no ReadHeaderTimeout or IdleTimeout caused goroutine leaks from hung proxy connections. Fix: add both, matching host agent values. Also adds UFFD prefetch to proactively load all guest pages after restore, eliminating on-demand page fault latency for subsequent RPC calls.	2026-05-02 17:22:51 +06:00
pptx704	bb582deefa	fix: prevent sandbox halt after resume by fixing HTTP/2 HOL blocking and adding timeouts Disable HTTP/2 on both host agent server and CP→agent transport — multiplexing caused head-of-line blocking when a slow sandbox RPC stalled the shared connection. Add ResponseHeaderTimeout to envd HTTP clients. Merge SetDefaults into Resume's PostInit call to eliminate an extra round-trip that could hang on a stale connection.	2026-05-02 13:48:51 +06:00
pptx704	bd98610153	fix: sandbox network responsiveness under port-binding apps Running port-binding applications (Jupyter, http.server, NextJS) inside sandboxes caused severe PTY sluggishness and proxy navigation errors. Root cause: the CP sandbox proxy and Connect RPC pool shared a single HTTP transport. Heavy proxy traffic (Jupyter WebSocket, REST polling) interfered with PTY RPC streams via HTTP/2 flow control contention. Transport isolation (main fix): - Add dedicated proxy transport on CP (NewProxyTransport) with HTTP/2 disabled, separate from the RPC pool transport - Add dedicated proxy transport on host agent, replacing http.DefaultTransport - Add dedicated envdclient transport with tuned connection pooling - Replace http.DefaultClient in file streaming RPCs with per-sandbox envd client Proxy path rewriting (navigation fix): - Add ModifyResponse to rewrite Location headers with /proxy/{id}/{port} prefix, handling both root-relative and absolute-URL redirects - Strip prefix back out in CP subdomain proxy for correct browser behavior - Replace path.Join with string concat in CP Director to preserve trailing slashes (prevents redirect loops on directory listings) Proxy resilience: - Add dial retry with linear backoff (3 attempts) to handle socat startup delay when ports are first detected - Cache ReverseProxy instances per sandbox+port+host in sync.Map - Add EvictProxy callback wired into sandbox Manager.Destroy Buffer and server hardening: - Increase PTY and exec stream channel buffers from 16 to 256 - Add ReadHeaderTimeout (10s) and IdleTimeout (620s) to host agent HTTP server Network tuning: - Set TAP device TxQueueLen to 5000 (up from default 1000) - Add Firecracker tx_rate_limiter (200 MB/s sustained, 100 MB burst) to prevent guest traffic from saturating the TAP	2026-04-25 04:21:55 +06:00
pptx704	11928a172a	feat: send email notification on account hard-delete Notify users via email when their account is permanently deleted after the 15-day soft-delete grace period. Query now returns email alongside user ID so the notification can be sent after deletion. Email failure is logged as a warning but does not block cleanup.	2026-04-21 16:01:56 +06:00
pptx704	bb2146d838	refactor: deduplicate audit logger with shared entry builders Replace repetitive actorFields + write boilerplate across all 25+ typed Log methods with shared helpers: newEntry (general), newAdminEntry (platform-level), resolveHostTeamID, and logSystemHostEvent. Reduces logger.go from 665 to 374 lines with no behavior change.	2026-04-21 15:54:39 +06:00
pptx704	d270ab7752	Version bump	2026-04-21 15:54:04 +06:00
pptx704	7fd801c1eb	feat: add audit logging for all admin actions and admin audit page Log every admin-panel action (user activate/deactivate, team BYOC toggle, team delete, template delete, build create/cancel) to the audit_logs table under PlatformTeamID with scope "admin". Add GET /v1/admin/audit-logs endpoint and /admin/audit frontend page with infinite scroll and hierarchical filters. Expose audit.Entry + Log() for cloud repo extensibility. Fix seed_platform_team down-migration FK violation by deleting dependent rows before the team row.	2026-04-21 15:41:45 +06:00
pptx704	ebbbde9cd1	feat: anonymize audit logs on user hard-delete and fix host audit log team assignment Anonymize audit logs when soft-deleted users are purged after 15 days: actor_name set to 'deleted-user', actor_id and resource_id nulled, email stripped from member metadata. Per-user delete ensures no user is removed without successful anonymization. Frontend renders deleted-user as a styled red badge in audit log view. Fix shared host create/delete audit logs landing in admin's personal team — now correctly assigned to PlatformTeamID.	2026-04-21 14:42:09 +06:00
pptx704	47be1143fb	Add MiddlewareProvider interface for extension middleware Allows cloud extensions to inject middleware that wraps OSS routes (e.g. billing enforcement) before they are registered.	2026-04-18 14:47:29 +06:00
pptx704	92aab09104	Add daily usage metrics (CPU-minutes, RAM GB-minutes) Introduce pre-computed daily usage rollups from sandbox_metrics_snapshots. An hourly background worker aggregates completed days, while today's usage is computed live from snapshots at query time for freshness. Backend: new daily_usage table, rollup worker, UsageService, and GET /v1/capsules/usage endpoint with date range filtering (up to 92 days). Frontend: replace Usage page placeholder with bar charts (Chart.js), summary total cards, and preset/custom date range controls.	2026-04-18 14:29:09 +06:00
pptx704	5fa3529df9	Move email types to pkg/email for cloud repo access Extracts Mailer interface, EmailData, and Button to pkg/email/types.go so the cloud repo can use them via ServerContext. internal/email re-exports the types as aliases so existing callers are unchanged. Also fixes pre-existing lint errors (unchecked rollback and deadline calls).	2026-04-17 16:36:54 +06:00
Rafeed M. Bhuiyan	605ad666a0	v0.1.0 (#17 )	2026-04-16 19:24:25 +00:00

12 Commits