1
0
forked from wrenn/wrenn

feat: async sandbox lifecycle with Redis Stream events

Replace synchronous RPC-based CP-host communication for sandbox
lifecycle operations (Create, Pause, Resume, Destroy) with an async
pattern. CP handlers now return 202 Accepted immediately, fire agent
RPCs in background goroutines, and publish state events to a Redis
Stream. A background consumer processes events as a fallback writer.

Agent-side auto-pause events are pushed to the CP via HTTP callback
(POST /v1/hosts/sandbox-events), keeping Redis internal to the CP.

All DB status transitions use conditional updates
(UpdateSandboxStatusIf, UpdateSandboxRunningIf) to prevent race
conditions between concurrent operations and background goroutines.

The HostMonitor reconciler is kept at 60s as a safety net, extended
to handle transient statuses (starting, pausing, resuming, stopping).

Frontend updated to handle 202 responses with empty bodies and render
transient statuses with blue indicators.
This commit is contained in:
2026-05-15 12:25:16 +06:00
parent c08884fa2c
commit 6faad45a28
18 changed files with 944 additions and 128 deletions

View File

@ -41,6 +41,17 @@ type Config struct {
AgentVersion string // host agent version (injected via ldflags)
}
// LifecycleEvent describes an autonomous state change initiated by the agent.
type LifecycleEvent struct {
Event string
SandboxID string
}
// EventSender sends autonomous lifecycle events to the control plane.
type EventSender interface {
SendAsync(event LifecycleEvent)
}
// Manager orchestrates sandbox lifecycle: VM, network, filesystem, envd.
type Manager struct {
cfg Config
@ -57,6 +68,11 @@ type Manager struct {
// onDestroy is called with the sandbox ID after cleanup completes.
// Used by ProxyHandler to evict cached reverse proxies.
onDestroy func(sandboxID string)
// eventSender sends autonomous lifecycle events (auto-pause, auto-destroy)
// to the CP via HTTP callback. Optional — nil means events are only
// propagated through the HostMonitor reconciler.
eventSender EventSender
}
// SetOnDestroy registers a callback invoked after each sandbox is cleaned up.
@ -64,6 +80,11 @@ func (m *Manager) SetOnDestroy(fn func(sandboxID string)) {
m.onDestroy = fn
}
// SetEventSender registers the callback sender for autonomous lifecycle events.
func (m *Manager) SetEventSender(sender EventSender) {
m.eventSender = sender
}
// sandboxState holds the runtime state for a single sandbox.
type sandboxState struct {
models.Sandbox
@ -1681,6 +1702,13 @@ func (m *Manager) reapExpired(_ context.Context) {
m.autoPausedMu.Lock()
m.autoPausedIDs = append(m.autoPausedIDs, id)
m.autoPausedMu.Unlock()
if m.eventSender != nil {
m.eventSender.SendAsync(LifecycleEvent{
Event: "sandbox.auto_paused",
SandboxID: id,
})
}
}
}