forked from wrenn/wrenn
Pre-pause snapshot signal to prevent Go runtime crash on restore
envd crashes with "fatal error: bad summary data" after Firecracker snapshot/restore because the page allocator radix tree is inconsistent when vCPUs are frozen mid-allocation. The port scanner goroutine allocates heavily every second, making it the primary trigger.

Add POST /snapshot/prepare to envd. The host agent calls it before vm.Pause to quiesce continuous goroutines and force GC. On restore, PostInit restarts the port subsystem via the existing /init endpoint.

- New PortSubsystem abstraction with Start/Stop/Restart lifecycle
- Context-based goroutine cancellation (replaces irreversible channel close)
- Context-aware Signal to prevent scanner/forwarder deadlock
- Fix forwarder goroutine leak (was spinning forever on closed channel)
- Kill socat children on stop to prevent orphans across snapshots
- Fix double cmd.Wait panic (exec.Command instead of CommandContext)
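The Start/Stop/Restart lifecycle with context-based cancellation named in the commit message might look roughly like the sketch below. It is an illustration only, not the code from this commit: the field names, the NewPortSubsystem constructor, the Running helper, and the ticker-driven loop are all assumptions. The point it shows is that cancelling a context, unlike closing a shared channel, leaves the subsystem in a state from which it can be started again.

```go
package main

import (
	"context"
	"fmt"
	"runtime"
	"sync"
	"time"
)

// PortSubsystem is a hypothetical stand-in for the abstraction named in the
// commit message. It owns one background goroutine that can be stopped and
// started repeatedly.
type PortSubsystem struct {
	mu     sync.Mutex
	period time.Duration
	scan   func() // one scan pass; placeholder for the real port scanner
	cancel context.CancelFunc
	done   chan struct{}
}

// NewPortSubsystem is an assumed constructor, not part of the original commit.
func NewPortSubsystem(period time.Duration, scan func()) *PortSubsystem {
	return &PortSubsystem{period: period, scan: scan}
}

// Start launches the scan loop unless it is already running.
func (p *PortSubsystem) Start() {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.cancel != nil {
		return
	}
	ctx, cancel := context.WithCancel(context.Background())
	done := make(chan struct{})
	p.cancel, p.done = cancel, done
	go func() {
		defer close(done)
		ticker := time.NewTicker(p.period)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return // cancelled: exit instead of spinning
			case <-ticker.C:
				p.scan()
			}
		}
	}()
}

// Stop cancels the loop and blocks until the goroutine has exited, so no
// scan pass is in flight when Stop returns. It is safe to call twice.
func (p *PortSubsystem) Stop() {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.cancel == nil {
		return
	}
	p.cancel()
	<-p.done
	p.cancel, p.done = nil, nil
}

// Restart stops the loop if needed and starts a fresh one with a new context.
func (p *PortSubsystem) Restart() {
	p.Stop()
	p.Start()
}

// Running reports whether the loop is currently active.
func (p *PortSubsystem) Running() bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	return p.cancel != nil
}

func main() {
	p := NewPortSubsystem(10*time.Millisecond, func() {})
	p.Start()
	p.Stop()     // roughly what a /snapshot/prepare handler would do first
	runtime.GC() // then force a collection while nothing is allocating
	fmt.Println("running after Stop:", p.Running())
	p.Restart() // roughly what the post-restore /init path would do
	fmt.Println("running after Restart:", p.Running())
	p.Stop()
}
```

Holding the mutex across Stop keeps a concurrent Start from racing with the shutdown; the trade-off is that the scan callback must never call back into Start or Stop.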
@@ -1,4 +1,5 @@
 # SPDX-License-Identifier: Apache-2.0
+# Modifications by M/S Omukk
 
 openapi: 3.0.0
 info:
@@ -70,6 +71,13 @@ paths:
         "204":
           description: Env vars set, the time and metadata is synced with the host
 
+  /snapshot/prepare:
+    post:
+      summary: Quiesce continuous goroutines before Firecracker snapshot
+      responses:
+        "204":
+          description: Goroutines quiesced, safe to snapshot
+
   /envs:
     get:
       summary: Get the environment variables
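
On the host side, the new endpoint would be called just before the VM is paused. The sketch below shows one way such a call could look. It is not the host agent code from this repository: prepareSnapshot, its timeout parameter, and the newFakeEnvd test server are hypothetical names introduced for illustration, and only the path, the POST method, and the 204 response come from the spec change above.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// prepareSnapshot (hypothetical helper) asks envd to quiesce and treats
// anything other than 204 No Content as "not safe to pause".
func prepareSnapshot(client *http.Client, baseURL string, timeout time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	req, err := http.NewRequestWithContext(ctx, http.MethodPost, baseURL+"/snapshot/prepare", nil)
	if err != nil {
		return fmt.Errorf("snapshot prepare: %w", err)
	}
	resp, err := client.Do(req)
	if err != nil {
		return fmt.Errorf("snapshot prepare: %w", err)
	}
	defer resp.Body.Close()
	_, _ = io.Copy(io.Discard, resp.Body) // drain so the connection can be reused

	if resp.StatusCode != http.StatusNoContent {
		return fmt.Errorf("snapshot prepare: unexpected status %d", resp.StatusCode)
	}
	return nil
}

// newFakeEnvd is a stand-in server for this example only. It answers
// POST /snapshot/prepare with the given status code.
func newFakeEnvd(status int) *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/snapshot/prepare", func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			w.WriteHeader(http.StatusMethodNotAllowed)
			return
		}
		w.WriteHeader(status)
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := newFakeEnvd(http.StatusNoContent)
	defer srv.Close()

	if err := prepareSnapshot(srv.Client(), srv.URL, 2*time.Second); err != nil {
		fmt.Println("not safe to pause:", err)
		return
	}
	fmt.Println("envd quiesced; the caller would pause the VM here")
}
```

Bounding the call with a timeout is a design choice of this sketch: a guest that never answers should fail the snapshot attempt instead of blocking the host agent indefinitely.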