wrenn-releases

Author	SHA1	Message	Date
pptx704	dd8a940431	feat(envd): update guest agent for Cloud Hypervisor Remove Firecracker-specific MMDS metadata fetching and metrics host module. CH communicates with the guest purely over TAP networking, so MMDS (Firecracker's metadata service via MMDS address) is no longer needed. - Remove src/host/ module (mmds.rs, metrics.rs) - Remove reqwest dependency (was only used for MMDS HTTP calls) - Remove --isnotfc CLI flag (no longer dual-mode) - Simplify health endpoint and init handler - Update state management for CH snapshot lifecycle - Bump version to 0.3.0	2026-05-17 01:33:25 +06:00
pptx704	485be22a16	test(envd): add 136 unit tests across 12 modules Cover all pure-function modules with inline #[cfg(test)] blocks: crypto (NIST/RFC 4231 known-answer vectors), auth (SecureToken ops, signature generation/validation), conntracker (snapshot lifecycle), execcontext, util (AtomicMax concurrent correctness), http/encoding (RFC 7231 negotiation), port/conn (/proc/net/tcp parsing), rpc/entry (format_permissions), and permissions/path (tilde expansion, ensure_dirs). Add tempfile dev-dep for filesystem tests. Update Makefile test target to include cargo test.	2026-05-13 10:39:54 +06:00
pptx704	aca43d51eb	fix: resolve process stream hangs, pause race, and PTY signal loss - Cache terminal EndEvent on ProcessHandle so connect() can detect already-exited processes instead of hanging forever on broadcast receivers that missed the event. Subscribe before checking cache to close the TOCTOU window. - Protect sb.Status writes in Pause with m.mu to prevent data race with concurrent readers (AcquireProxyConn, Exec, etc.). - Restart metrics sampler in restoreRunning so a failed pause attempt doesn't permanently kill sandbox metrics collection. - Return dequeued non-input messages from coalescePtyInput instead of dropping them, preventing silent loss of kill/resize signals during typing bursts.	2026-05-09 18:11:15 +06:00
pptx704	522e1c5e90	fix: subscribe to process channels before spawning threads to prevent event loss Fast-exiting processes (e.g. echo) sent data/end events before start() subscribed to the broadcast channels, causing the stream to hang indefinitely and the exec RPC to time out with 502. Move channel subscription into spawn_process, before reader/waiter threads start, and return pre-subscribed receivers via SpawnedProcess.	2026-05-09 17:28:37 +06:00
pptx704	d1d316f35c	fix: resolve exec 502 by terminating process streams on exit The start() and connect() streaming RPCs blocked forever in the data event loop because ProcessHandle retains a broadcast sender (needed for reconnection via connect()), preventing the channel from closing. Race data_rx against end_rx with tokio::select! so the stream terminates when the process exits. Remaining buffered data is drained before yielding the end event.	2026-05-09 16:36:33 +06:00
pptx704	2af8412cdc	fix: use RwLock for envd Defaults to fix silent mutation loss The /init handler's default_user mutation cloned the Defaults struct, mutated the clone, then dropped it — the actual state was never updated. This caused processes to always run as "root" regardless of the user set via POST /init. Additionally, default_workdir was accepted in the init request but never applied. Wrap user and workdir fields in RwLock with accessor methods so mutations propagate correctly through the shared AppState.	2026-05-09 15:28:09 +06:00
pptx704	01819642cc	fix: drop page cache before snapshot to reduce memory dump size Linux keeps freed memory as page cache, which Firecracker snapshots as non-zero blocks. A 16GB VM with 12GB stale cache would write all 12GB to disk. Dropping pagecache (not dentries/inodes) in /snapshot/prepare before blocking the reclaimer shrinks snapshots to actual working set size with minimal resume latency impact.	2026-05-03 14:27:49 +06:00
pptx704	1178ab8b21	fix: accurate sandbox metrics and memory management Three issues fixed: 1. Memory metrics read host-side VmRSS of the Firecracker process, which includes guest page cache and never decreases. Replaced readMemRSS(fcPID) with readEnvdMemUsed(client) that queries envd's /metrics endpoint for guest-side total - MemAvailable. This matches neofetch and reflects actual process memory. 2. Added Firecracker balloon device (deflate_on_oom, 5s stats) and envd-side periodic page cache reclaimer (drop_caches when >80% used). Reclaimer is gated by snapshot_in_progress flag with sync() before freeze to prevent memory corruption during pause. 3. Sampling interval 500ms → 1s, ring buffer capacities adjusted to maintain same time windows. Reduces per-host HTTP load from 240 calls/sec to 120 calls/sec at 120 capsules. Also: maxDiffGenerations 8 → 1 (merge every re-pause since UFFD lazy-loads anyway), envd mem_used formula uses total - available.	2026-05-03 12:19:01 +06:00
pptx704	31456fd169	fix: resolve PTY failure, MMDS file writes, and metrics instability in envd-rs Three bugs fixed: 1. PTY connections failed because home directory was hardcoded as /home/{username} instead of reading from /etc/passwd. For root, this produced /home/root/ which doesn't exist — CWD validation rejected every PTY Start request without explicit cwd. Fixed all 6 locations to use user.dir from nix::unistd::User. 2. MMDS polling silently failed to parse metadata because the logs_collector_address field lacked #[serde(default)]. The host agent only sends instanceID + envID — missing "address" field caused every deserialize attempt to fail, so .WRENN_SANDBOX_ID and .WRENN_TEMPLATE_ID were never written. Also added error logging and create_dir_all before file writes. 3. Metrics CPU values were non-deterministic because a fresh sysinfo::System was created per request with a 100ms sleep between reads. Replaced with a background thread that samples CPU at fixed 1-second intervals via a persistent System instance, matching gopsutil's internal caching behavior. Metrics endpoint now reads cached atomic values — no blocking, consistent window. Also: close master PTY fd in child pre_exec, add process.Start request logging, bump version to 0.2.0.	2026-05-03 04:28:10 +06:00
pptx704	0b53d34417	feat: rewrite envd guest agent in Rust (envd-rs) Complete Rust rewrite of the Go envd guest daemon that runs as PID 1 inside Firecracker microVMs. Feature-complete across all 8 phases: - Health, metrics, and env var endpoints - Crypto (SHA-256/512, HMAC), auth (secure token, signing), init/snapshot - Connect RPC via connectrpc + buffa (process + filesystem services) - File transfer (GET/POST /files) with gzip, multipart, chown, ENOSPC - Port subsystem (/proc/net/tcp scanner, socat forwarder) - Cgroup2 manager with noop fallback - Snapshot/restore lifecycle (conntracker, port subsystem stop/restart) - SIGTERM graceful shutdown, --cmd initial process spawn - MMDS metadata polling for Firecracker mode 42 source files, ~4200 LOC, 4.1MB stripped release binary. Makefile updated: build-envd now targets Rust (musl static), build-envd-go preserved for Go builds.	2026-05-03 02:47:15 +06:00
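
The silent mutation loss fixed in 2af8412cdc can be illustrated with a minimal sketch. The names below (`PlainDefaults`, `Defaults`, `buggy_init`, `fixed_init`, and the accessor methods) are hypothetical stand-ins and not the actual envd-rs code; only the clone-mutate-drop bug pattern and the RwLock remedy are taken from the commit message.

```rust
use std::sync::{Arc, RwLock};

// Buggy shape (hypothetical): plain fields behind a shared handle.
#[derive(Clone)]
struct PlainDefaults {
    user: String,
}

// Fixed shape (hypothetical): each mutable field sits behind an RwLock,
// and callers go through accessor methods instead of touching fields.
struct Defaults {
    user: RwLock<String>,
    workdir: RwLock<Option<String>>,
}

impl Defaults {
    fn new() -> Self {
        Defaults {
            user: RwLock::new("root".to_string()),
            workdir: RwLock::new(None),
        }
    }

    fn user(&self) -> String {
        self.user.read().unwrap().clone()
    }

    fn set_user(&self, user: &str) {
        *self.user.write().unwrap() = user.to_string();
    }

    fn workdir(&self) -> Option<String> {
        self.workdir.read().unwrap().clone()
    }

    fn set_workdir(&self, dir: &str) {
        *self.workdir.write().unwrap() = Some(dir.to_string());
    }
}

// Reproduces the bug: the handler mutates a clone, then drops it,
// so the shared state never changes.
fn buggy_init(shared: &Arc<PlainDefaults>, user: &str) {
    let mut copy = (**shared).clone();
    copy.user = user.to_string();
    drop(copy);
}

// The remedy: write through the lock so every holder of the Arc sees it.
fn fixed_init(shared: &Arc<Defaults>, user: &str, workdir: &str) {
    shared.set_user(user);
    shared.set_workdir(workdir);
}

fn main() {
    let plain = Arc::new(PlainDefaults { user: "root".to_string() });
    buggy_init(&plain, "alice");
    println!("buggy: user = {}", plain.user);

    let shared = Arc::new(Defaults::new());
    fixed_init(&shared, "alice", "/home/alice");
    println!("fixed: user = {}, workdir = {:?}", shared.user(), shared.workdir());
}
```

Interior mutability through the lock lets the shared state stay behind an immutable `Arc` while writes remain visible to every reader.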

10 Commits
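
The guest-side memory formula named in 1178ab8b21 (total minus MemAvailable) can be sketched as follows. This assumes the values come from `/proc/meminfo`-style text; the commit does not say how envd reads them, and the function names `meminfo_kib` and `mem_used_kib` are hypothetical.

```rust
/// Extracts the numeric kB value for `key` from /proc/meminfo-style text.
/// Returns None if the key is missing or the value is not a number.
fn meminfo_kib(meminfo: &str, key: &str) -> Option<u64> {
    meminfo.lines().find_map(|line| {
        // Match "Key:" exactly so "MemTotal" does not match a longer key.
        let rest = line.strip_prefix(key)?.strip_prefix(':')?;
        rest.split_whitespace().next()?.parse::<u64>().ok()
    })
}

/// Used memory as MemTotal - MemAvailable, in kB (assumed formula source:
/// the commit message; the parsing approach is an illustration only).
fn mem_used_kib(meminfo: &str) -> Option<u64> {
    let total = meminfo_kib(meminfo, "MemTotal")?;
    let available = meminfo_kib(meminfo, "MemAvailable")?;
    // saturating_sub guards against an inconsistent reading underflowing.
    Some(total.saturating_sub(available))
}

fn main() {
    // Made-up sample values, not measurements from a real sandbox.
    let sample = "MemTotal:       16384000 kB\n\
                  MemFree:         1024000 kB\n\
                  MemAvailable:   12288000 kB\n\
                  Cached:         11000000 kB\n";
    println!("{:?}", mem_used_kib(sample)); // prints Some(4096000)
}
```

Because MemAvailable counts reclaimable page cache as available, this figure drops when cache is freed, unlike the host-side VmRSS reading the commit replaced.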