Introduce pre-computed daily usage rollups from sandbox_metrics_snapshots.
An hourly background worker aggregates completed days, while today's
usage is computed live from snapshots at query time for freshness.
Backend: new daily_usage table, rollup worker, UsageService, and
GET /v1/capsules/usage endpoint with date range filtering (up to 92 days).
Frontend: replace Usage page placeholder with bar charts (Chart.js),
summary total cards, and preset/custom date range controls.
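Rough sketch of how the combined read path could work. UsageService, daily_usage, and
the 92-day cap come from this change; the method names, struct fields, and clamping
behavior below are assumptions, not the shipped code.

```go
// Sketch only: combines pre-aggregated daily_usage rows (written by the
// hourly rollup worker) with a live aggregation over
// sandbox_metrics_snapshots for the current, still-incomplete day.
package usage

import (
	"context"
	"time"
)

// DailyUsage is an assumed shape for one day's rollup.
type DailyUsage struct {
	Day          time.Time
	CPUSeconds   float64
	RAMByteHours float64
}

// Store is an assumed interface over the two queries the service needs.
type Store interface {
	GetDailyUsage(ctx context.Context, teamID string, from, to time.Time) ([]DailyUsage, error)
	GetLiveUsageForDay(ctx context.Context, teamID string, day time.Time) (DailyUsage, error)
}

type UsageService struct{ store Store }

const maxRangeDays = 92 // matches the endpoint's date range limit

// Range returns completed days from daily_usage and, if the range reaches
// into today, appends a live rollup computed from raw snapshots.
func (s *UsageService) Range(ctx context.Context, teamID string, from, to time.Time) ([]DailyUsage, error) {
	if to.Sub(from) > maxRangeDays*24*time.Hour {
		to = from.AddDate(0, 0, maxRangeDays) // the real endpoint may reject instead of clamping
	}
	today := time.Now().UTC().Truncate(24 * time.Hour)

	end := to
	if today.AddDate(0, 0, -1).Before(end) {
		end = today.AddDate(0, 0, -1) // rolled-up rows only exist for completed days
	}
	days, err := s.store.GetDailyUsage(ctx, teamID, from, end)
	if err != nil {
		return nil, err
	}
	if !to.Before(today) {
		live, err := s.store.GetLiveUsageForDay(ctx, teamID, today)
		if err != nil {
			return nil, err
		}
		days = append(days, live)
	}
	return days, nil
}
```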
- Add ON DELETE CASCADE to users_teams, oauth_providers, and admin_permissions,
and ON DELETE SET NULL (with nullable columns) to team_api_keys.created_by,
hosts.created_by, and host_tokens.created_by, so HardDeleteExpiredUsers no
longer fails with FK violations (see the migration sketch after this list)
- User account deletion now cascades to sole-owned teams via DeleteTeamInternal,
preventing orphaned teams with live sandboxes after account removal
- ListActiveSandboxesByTeam now includes hibernated sandboxes so their disk
snapshots are cleaned up during team deletion
- Team soft-delete now hard-deletes sandbox metric points, metric snapshots,
API keys, and channels to prevent data accumulation on deleted teams
- Extract deleteTeamCore() to deduplicate shared logic across DeleteTeam,
AdminDeleteTeam, and DeleteTeamInternal
- Fix ListAPIKeysByTeamWithCreator to use a LEFT JOIN now that created_by is
nullable, and update the handler to read creator_email from pgtype.Text.String
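The migrations behind the first bullet could look roughly like this; the constraint
names follow Postgres defaults and are assumptions, as is the embedded-SQL style.

```go
// Sketch of the FK changes described above, not the actual migration files.
package migrations

const fkCleanupUp = `
-- Rows that are meaningless without the user: delete them with the user.
ALTER TABLE users_teams
    DROP CONSTRAINT users_teams_user_id_fkey,
    ADD CONSTRAINT users_teams_user_id_fkey
        FOREIGN KEY (user_id) REFERENCES users (id) ON DELETE CASCADE;

-- Rows that should outlive their creator: make the column nullable and
-- null it out when the user is hard-deleted.
ALTER TABLE team_api_keys
    ALTER COLUMN created_by DROP NOT NULL,
    DROP CONSTRAINT team_api_keys_created_by_fkey,
    ADD CONSTRAINT team_api_keys_created_by_fkey
        FOREIGN KEY (created_by) REFERENCES users (id) ON DELETE SET NULL;
`
```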
The query was fetching all rows for a (sandbox_id, tier) pair and
filtering by timestamp in Go. For repeatedly paused sandboxes the
24h tier can accumulate up to 30 days of data, causing up to 120x
over-fetching for a 6h range request.
Add AND ts >= $3 to the query so Postgres filters on the primary key
(sandbox_id, tier, ts) directly. Drop the redundant Go-side loop.
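Roughly, the before/after looks like this (table and column names are assumptions):

```go
// Sketch only. Before, the query selected every row for the pair and the
// caller discarded old points in Go:
//
//   SELECT ts, cpu_pct, rss_bytes FROM sandbox_metric_points
//   WHERE sandbox_id = $1 AND tier = $2 ORDER BY ts
//
// After, Postgres walks the (sandbox_id, tier, ts) primary key and returns
// only the requested window:
package queries

const getMetricPointsSince = `
SELECT ts, cpu_pct, rss_bytes
FROM sandbox_metric_points
WHERE sandbox_id = $1 AND tier = $2 AND ts >= $3
ORDER BY ts
`
```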
Samples /proc/{fc_pid}/stat (CPU%), /proc/{fc_pid}/status (VmRSS), and
stat() on CoW files at 500ms intervals per running sandbox. Three tiered
ring buffers keep raw samples for 10min and downsample into 30s and 5min
averages for 2h and 24h retention. Metrics are flushed to DB on pause (all
tiers) and destroy
(24h only). New GetSandboxMetrics and FlushSandboxMetrics RPCs on the
host agent, proxied through GET /v1/sandboxes/{id}/metrics?range= on
the control plane. The endpoint returns live data for running sandboxes, DB
data for paused ones, and 404 for stopped ones.
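A minimal sketch of one 500ms sample for a single Firecracker process; the real
sampler's structs, the CoW-file handling, and the ring buffers are not shown, and
CPU% is derived from the delta between two consecutive samples, not a single read.

```go
// Sketch only: reads the raw counters behind one sample.
package sampler

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

type sample struct {
	cpuTicks uint64 // utime+stime from /proc/<pid>/stat, in clock ticks
	rssBytes uint64 // VmRSS from /proc/<pid>/status, in bytes
}

// readSample reads the counters for one Firecracker PID. CPU% comes from the
// tick delta between two samples divided by the 500ms interval.
func readSample(fcPID int) (sample, error) {
	var s sample

	stat, err := os.ReadFile(fmt.Sprintf("/proc/%d/stat", fcPID))
	if err != nil {
		return s, err
	}
	// utime and stime are fields 14 and 15 (1-indexed); split after the
	// closing parenthesis because the comm field may contain spaces.
	rest := string(stat)[strings.LastIndexByte(string(stat), ')')+2:]
	fields := strings.Fields(rest)
	utime, _ := strconv.ParseUint(fields[11], 10, 64)
	stime, _ := strconv.ParseUint(fields[12], 10, 64)
	s.cpuTicks = utime + stime

	status, err := os.ReadFile(fmt.Sprintf("/proc/%d/status", fcPID))
	if err != nil {
		return s, err
	}
	for _, line := range strings.Split(string(status), "\n") {
		if strings.HasPrefix(line, "VmRSS:") {
			kb, _ := strconv.ParseUint(strings.Fields(line)[1], 10, 64)
			s.rssBytes = kb * 1024
			break
		}
	}
	return s, nil
}
```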
SampleSandboxMetrics previously filtered WHERE status IN ('running',
'starting', 'paused'), which returned no rows when all capsules were
stopped. As a result, zero-value snapshots were never written, leaving the
time-series charts with no trailing data points instead of showing
the expected zero values.
Remove the WHERE filter so the query groups by all teams that have
any sandbox row. The per-status FILTER clauses on the aggregates
already produce correct zero counts for stopped capsules.
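The resulting query shape, with assumed table and column names:

```go
// Sketch only. Without a status filter every team that has any sandbox row
// gets a group, and the FILTER clauses return 0 for statuses with no rows.
package queries

const sampleTeamSandboxCounts = `
SELECT
    team_id,
    count(*) FILTER (WHERE status = 'running')  AS running,
    count(*) FILTER (WHERE status = 'starting') AS starting,
    count(*) FILTER (WHERE status = 'paused')   AS paused
FROM sandboxes
GROUP BY team_id
`
```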
Also includes the per-VM RAM ceiling formula change: reserved RAM is now
sum(ceil(each/2)) instead of ceil(sum/2). Worked example below.
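Worked example with made-up sizes: two paused VMs of 1 GiB and 3 GiB reserve
ceil(1/2) + ceil(3/2) = 3 GiB under the new formula, versus ceil((1+3)/2) = 2 GiB
under the old one, so rounding each VM up individually never under-reserves.

```go
// Integer-arithmetic version of the new formula (assumed helper, sizes in GiB).
package capacity

func reservedPausedRAMGiB(perVMGiB []int) int {
	total := 0
	for _, gib := range perVMGiB {
		total += (gib + 1) / 2 // ceil(gib/2) for non-negative sizes
	}
	return total
}
```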
- Replace stale snapshot read (GetCurrentMetrics) with a live query
(GetLiveMetrics) against the sandboxes table, which always returns correct
zeros when no capsules are running (see the query sketch after this list)
- Fix CPU reserved formula: running + starting only; paused VMs no
longer contribute vCPUs (RAM reservation for paused unchanged)
- Merge top cards into 3 paired Now/Peak cards with colored accent
borders (green/blue/amber matching chart colors)
- Move Live badge from Running Capsules card to page-level header
- Add colored category dots to card and chart headers
- Charts stacked vertically, flex-1 to fill remaining page height
- vCPUs chart color changed to blue (#5a9fd4), RAM stays amber
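Sketch of what the live query could look like (table, column, and query names are
assumptions); the point is that paused VMs drop out of the vCPU reservation but keep
contributing half their RAM.

```go
// Sketch only, not the shipped GetLiveMetrics query.
package queries

const getLiveMetrics = `
SELECT
    count(*) FILTER (WHERE status = 'running') AS running,
    coalesce(sum(vcpus)   FILTER (WHERE status IN ('running', 'starting')), 0) AS reserved_vcpus,
    coalesce(sum(ram_gib) FILTER (WHERE status IN ('running', 'starting')), 0)
        + coalesce(sum(ceil(ram_gib / 2.0)) FILTER (WHERE status = 'paused'), 0) AS reserved_ram_gib
FROM sandboxes
`
```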
- New sandbox_metrics_snapshots table sampled every 10s (60-day retention)
- Background MetricsSampler goroutine wired into control plane startup
- GET /v1/sandboxes/stats?range=5m|1h|6h|24h|30d endpoint with adaptive
polling intervals (see the polling sketch after this list); reserved CPU/RAM
uses the ceil(paused/2) formula
- StatsPanel component: 4 stat cards + 2 Chart.js line charts (straight
lines, integer y-axis for running count, dual-axis for CPU/RAM)
- Range filter persisted in URL query param; polls update data silently
(no blink; the loading state is only shown on initial mount)
- Split /dashboard/capsules into /list and /stats sub-routes with shared
layout; capsuleRunningCount store syncs badge across routes
- CreateCapsuleDialog extracted as a reusable component
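The adaptive polling from the stats bullet could map ranges to refresh intervals
roughly like this; the exact values are assumptions, not the shipped numbers.

```go
// Sketch only: the shipped bucket sizes and intervals may differ.
package stats

import "time"

// pollInterval picks how often the StatsPanel refreshes a given range:
// roughly one new 10s snapshot per poll on short ranges, backing off as the
// window grows so long ranges don't hammer the endpoint.
func pollInterval(rng string) time.Duration {
	switch rng {
	case "5m":
		return 10 * time.Second
	case "1h":
		return 30 * time.Second
	case "6h":
		return time.Minute
	case "24h":
		return 5 * time.Minute
	default: // "30d"
		return 15 * time.Minute
	}
}
```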