1
0
forked from wrenn/wrenn
Commit Graph

11 Commits

Author SHA1 Message Date
3671af2498 feat: immediate sandbox reconciliation on host reconnect
When a host transitions from unreachable → online via heartbeat, trigger
ReconcileHost in a background goroutine so "missing" sandboxes are
resolved instantly instead of waiting up to 60s for the next monitor tick.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-16 16:15:49 +06:00
6faad45a28 feat: async sandbox lifecycle with Redis Stream events
Replace synchronous RPC-based CP-host communication for sandbox
lifecycle operations (Create, Pause, Resume, Destroy) with an async
pattern. CP handlers now return 202 Accepted immediately, fire agent
RPCs in background goroutines, and publish state events to a Redis
Stream. A background consumer processes events as a fallback writer.

Agent-side auto-pause events are pushed to the CP via HTTP callback
(POST /v1/hosts/sandbox-events), keeping Redis internal to the CP.

All DB status transitions use conditional updates
(UpdateSandboxStatusIf, UpdateSandboxRunningIf) to prevent race
conditions between concurrent operations and background goroutines.

The HostMonitor reconciler is kept at 60s as a safety net, extended
to handle transient statuses (starting, pausing, resuming, stopping).

Frontend updated to handle 202 responses with empty bodies and render
transient statuses with blue indicators.
2026-05-15 12:25:16 +06:00
3deecbff89 fix: prevent Go runtime memory corruption and sandbox halt after snapshot restore
Three root causes addressed:

1. Go page allocator corruption: allocations between the pre-snapshot GC
   and VM freeze leave the summary tree inconsistent. After restore, GC
   reads corrupted metadata — either panicking (killing PID 1 → kernel
   panic) or silently failing to collect, causing unbounded heap growth
   until OOM. Fix: move GC to after all HTTP allocations in
   PostSnapshotPrepare, then set GOMAXPROCS(1) so any remaining
   allocations run sequentially with no concurrent page allocator access.
   GOMAXPROCS is restored on first health check after restore.

2. PostInit timeout starvation: WaitUntilReady and PostInit shared a
   single 30s context. If WaitUntilReady consumed most of it, PostInit
   failed — RestoreAfterSnapshot never ran, leaving envd with keep-alives
   disabled and zombie connections. Fix: separate timeout contexts.

3. CP HTTP server missing timeouts: no ReadHeaderTimeout or IdleTimeout
   caused goroutine leaks from hung proxy connections. Fix: add both,
   matching host agent values.

Also adds UFFD prefetch to proactively load all guest pages after restore,
eliminating on-demand page fault latency for subsequent RPC calls.
2026-05-02 17:22:51 +06:00
11928a172a feat: send email notification on account hard-delete
Notify users via email when their account is permanently deleted after
the 15-day soft-delete grace period. Query now returns email alongside
user ID so the notification can be sent after deletion.

Email failure is logged as a warning but does not block cleanup.
2026-04-21 16:01:56 +06:00
ebbbde9cd1 feat: anonymize audit logs on user hard-delete and fix host audit log team assignment
Anonymize audit logs when soft-deleted users are purged after 15 days:
actor_name set to 'deleted-user', actor_id and resource_id nulled,
email stripped from member metadata. Per-user delete ensures no user
is removed without successful anonymization.

Frontend renders deleted-user as a styled red badge in audit log view.

Fix shared host create/delete audit logs landing in admin's personal
team — now correctly assigned to PlatformTeamID.
2026-04-21 14:42:09 +06:00
92aab09104 Add daily usage metrics (CPU-minutes, RAM GB-minutes)
Introduce pre-computed daily usage rollups from sandbox_metrics_snapshots.
An hourly background worker aggregates completed days, while today's
usage is computed live from snapshots at query time for freshness.

Backend: new daily_usage table, rollup worker, UsageService, and
GET /v1/capsules/usage endpoint with date range filtering (up to 92 days).

Frontend: replace Usage page placeholder with bar charts (Chart.js),
summary total cards, and preset/custom date range controls.
2026-04-18 14:29:09 +06:00
e6e3975426 Add unauthenticated /health endpoint to control plane
Returns JSON with status and build version for monitoring and
load balancer health checks.
2026-04-16 16:13:42 +06:00
bba5f80294 Add production file logging with logrotate support
Both control plane and host agent now write structured slog output to
$WRENN_DIR/logs/ in addition to stderr. Log level is configurable via
LOG_LEVEL env var (default: info). SIGHUP reopens the log file so
logrotate can rotate without copytruncate.
2026-04-16 15:09:26 +06:00
f69fa8cded Add /v1/me account management endpoints
Adds self-service endpoints: GET/PATCH/DELETE /v1/me, POST /v1/me/password,
POST /v1/me/password/reset{/confirm}, GET/DELETE /v1/me/providers/{provider}.
Includes OAuth account-linking flow via cookie, hard-delete cleanup goroutine
(24h ticker, 15-day grace period), and OpenAPI spec for all new routes.
2026-04-16 03:24:55 +06:00
9d68eb5f00 Add transactional email system via SMTP
Introduce internal/email package with SMTP sending, embedded HTML/text
templates, and multipart MIME assembly. Emails use a generic EmailData
struct (recipient name, message, optional button, optional closing) so
new email types can be added without code changes.

Wired into signup (welcome email), team creation, and team member
addition. No-op mailer when SMTP_HOST is not configured.
2026-04-16 00:46:08 +06:00
a5ad3731f2 Refactored to maintain a separate cloud version
Moves 12 packages from internal/ to pkg/ (config, id, validate, events, db,
auth, lifecycle, scheduler, channels, audit, service) so they can be imported
by the enterprise repo as a Go module dependency.

Introduces pkg/cpextension (shared Extension interface + ServerContext) and
pkg/cpserver (Run() entrypoint with functional options) so the enterprise
main.go can call cpserver.Run(cpserver.WithExtensions(...)) without duplicating
the 20-step server bootstrap. Adds db/migrations/embed.go for go:embed access
to OSS SQL migrations from the enterprise module.

cmd/control-plane/main.go is reduced to a 10-line wrapper around cpserver.Run.
2026-04-15 21:41:48 +06:00