forked from wrenn/wrenn
fix: sandbox network responsiveness under port-binding apps

Running port-binding applications (Jupyter, http.server, Next.js) inside
sandboxes caused severe PTY sluggishness and proxy navigation errors.

Root cause: the CP sandbox proxy and Connect RPC pool shared a single
HTTP transport. Heavy proxy traffic (Jupyter WebSocket, REST polling)
interfered with PTY RPC streams via HTTP/2 flow control contention.

Transport isolation (main fix):
- Add dedicated proxy transport on CP (NewProxyTransport) with HTTP/2
  disabled, separate from the RPC pool transport
- Add dedicated proxy transport on host agent, replacing
  http.DefaultTransport
- Add dedicated envdclient transport with tuned connection pooling
- Replace http.DefaultClient in file streaming RPCs with per-sandbox
  envd client
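
As an illustration of the transport isolation described above, here is a minimal sketch of a dedicated proxy transport that never negotiates HTTP/2. The function name newProxyTransport and all pool sizes and timeouts are assumptions for illustration; the commit's actual NewProxyTransport is not shown in this diff and may differ.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net"
	"net/http"
	"time"
)

// newProxyTransport is a hypothetical stand-in for the commit's
// NewProxyTransport. Each call returns a fresh transport with its own
// connection pool, so proxy traffic cannot share connections (or HTTP/2
// flow-control windows) with RPC traffic.
func newProxyTransport() *http.Transport {
	return &http.Transport{
		DialContext: (&net.Dialer{
			Timeout:   10 * time.Second, // assumed value
			KeepAlive: 30 * time.Second, // assumed value
		}).DialContext,
		// A non-nil, empty TLSNextProto map disables HTTP/2 on this transport.
		TLSNextProto:        map[string]func(string, *tls.Conn) http.RoundTripper{},
		ForceAttemptHTTP2:   false,
		MaxIdleConns:        256,              // assumed value
		MaxIdleConnsPerHost: 64,               // assumed value
		IdleConnTimeout:     90 * time.Second, // assumed value
	}
}

func main() {
	proxyTransport := newProxyTransport()
	rpcTransport := newProxyTransport()
	// Two separate pools: a busy proxy cannot starve the RPC client.
	fmt.Println("isolated:", proxyTransport != rpcTransport)
}
```

With HTTP/1.1 only, every proxied request or WebSocket gets its own TCP connection, so a slow Jupyter stream cannot hold up unrelated requests multiplexed on the same connection.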

Proxy path rewriting (navigation fix):
- Add ModifyResponse to rewrite Location headers with /proxy/{id}/{port}
  prefix, handling both root-relative and absolute-URL redirects
- Strip prefix back out in CP subdomain proxy for correct browser
  behavior
- Replace path.Join with string concat in CP Director to preserve
  trailing slashes (prevents redirect loops on directory listings)
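
The path handling above can be sketched as follows. The helper names (joinPath, concatPath, rewriteLocation, stripPrefix) and the upstreamHost parameter are hypothetical; the host agent's real ModifyResponse is not part of this diff, so treat rewriteLocation as one plausible reading of "root-relative and absolute-URL redirects".

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// joinPath shows the old behavior: path.Join cleans the result and
// therefore drops a trailing slash.
func joinPath(id, port, reqPath string) string {
	return path.Join("/proxy", id, port, path.Clean("/"+reqPath))
}

// concatPath shows the new behavior: plain concatenation keeps the
// request path byte-for-byte, trailing slash included.
func concatPath(id, port, reqPath string) string {
	return "/proxy/" + id + "/" + port + reqPath
}

// rewriteLocation (hypothetical) adds the proxy prefix to a redirect target.
// upstreamHost is the host:port the sandboxed app believes it is serving on.
func rewriteLocation(loc, prefix, upstreamHost string) string {
	// Root-relative redirect such as "/tree/".
	if strings.HasPrefix(loc, "/") && !strings.HasPrefix(loc, "//") {
		if loc == prefix || strings.HasPrefix(loc, prefix+"/") {
			return loc // already prefixed
		}
		return prefix + loc
	}
	// Absolute redirect that points back at the upstream app.
	u, err := url.Parse(loc)
	if err != nil || !u.IsAbs() || u.Host != upstreamHost {
		return loc // external or unparseable: leave untouched
	}
	out := prefix + u.EscapedPath()
	if u.RawQuery != "" {
		out += "?" + u.RawQuery
	}
	return out
}

// stripPrefix mirrors what the CP subdomain proxy does with the prefix.
func stripPrefix(loc, prefix string) string {
	return strings.TrimPrefix(loc, prefix)
}

func main() {
	fmt.Println(joinPath("sb1", "8888", "/docs/"))
	fmt.Println(concatPath("sb1", "8888", "/docs/"))
	fmt.Println(rewriteLocation("/tree/", "/proxy/sb1/8888", "127.0.0.1:8888"))
}
```

The redirect loop arises because a directory-listing server answers /docs with a redirect to /docs/; if the proxy then strips the slash again, the browser bounces between the two forever.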

Proxy resilience:
- Add dial retry with linear backoff (3 attempts) to handle socat
  startup delay when ports are first detected
- Cache ReverseProxy instances per sandbox+port+host in sync.Map
- Add EvictProxy callback wired into sandbox Manager.Destroy
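
A rough sketch of the two resilience mechanisms listed above: a dial wrapper with linear backoff and a sync.Map cache keyed by sandbox, port, and host. All names here (withRetry, proxyCache, cacheKey, evict, simulateDial) and the backoff step are assumptions, not the commit's actual code.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http/httputil"
	"strings"
	"sync"
	"time"
)

type dialFunc func(ctx context.Context, network, addr string) (net.Conn, error)

// withRetry retries dial up to attempts times, waiting step, 2*step, ...
// between tries (linear backoff). It stops early if ctx is cancelled.
func withRetry(dial dialFunc, attempts int, step time.Duration) dialFunc {
	return func(ctx context.Context, network, addr string) (net.Conn, error) {
		var lastErr error
		for i := 1; i <= attempts; i++ {
			conn, err := dial(ctx, network, addr)
			if err == nil {
				return conn, nil
			}
			lastErr = err
			if i == attempts {
				break
			}
			select {
			case <-time.After(time.Duration(i) * step):
			case <-ctx.Done():
				return nil, ctx.Err()
			}
		}
		return nil, lastErr
	}
}

// proxyCache keeps one ReverseProxy per sandbox+port+host key.
type proxyCache struct{ m sync.Map }

func cacheKey(sandboxID, port, host string) string {
	return sandboxID + "|" + port + "|" + host
}

func (c *proxyCache) get(key string, build func() *httputil.ReverseProxy) *httputil.ReverseProxy {
	if v, ok := c.m.Load(key); ok {
		return v.(*httputil.ReverseProxy)
	}
	v, _ := c.m.LoadOrStore(key, build())
	return v.(*httputil.ReverseProxy)
}

// evict drops every cached proxy for one sandbox, as a destroy hook might.
func (c *proxyCache) evict(sandboxID string) {
	c.m.Range(func(k, _ any) bool {
		if strings.HasPrefix(k.(string), sandboxID+"|") {
			c.m.Delete(k)
		}
		return true
	})
}

func newEmptyProxy() *httputil.ReverseProxy { return &httputil.ReverseProxy{} }

// simulateDial runs withRetry against a fake dialer that fails the first
// `failures` calls, and reports how many dial calls were made.
func simulateDial(failures, attempts int) (calls int, err error) {
	fake := func(ctx context.Context, network, addr string) (net.Conn, error) {
		calls++
		if calls <= failures {
			return nil, errors.New("connection refused")
		}
		client, server := net.Pipe()
		server.Close()
		return client, nil
	}
	conn, err := withRetry(fake, attempts, time.Millisecond)(context.Background(), "tcp", "10.0.0.2:8888")
	if conn != nil {
		conn.Close()
	}
	return calls, err
}

func main() {
	calls, err := simulateDial(2, 3)
	fmt.Println("dial calls:", calls, "error:", err)
}
```

A wrapper like this can be assigned to http.Transport.DialContext, so the retry stays invisible to the ReverseProxy above it. Evicting on destroy matters because a cached proxy would otherwise outlive its sandbox.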

Buffer and server hardening:
- Increase PTY and exec stream channel buffers from 16 to 256
- Add ReadHeaderTimeout (10s) and IdleTimeout (620s) to host agent
  HTTP server
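
For reference, the hardening settings above map onto standard library fields roughly as follows. The function names newAgentServer and newStreamChan, the listen address, and the []byte element type are assumptions; only the numeric values come from the commit message.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newAgentServer (hypothetical name) builds an http.Server with the two
// timeouts named in the commit message.
func newAgentServer(addr string, h http.Handler) *http.Server {
	return &http.Server{
		Addr:    addr,
		Handler: h,
		// Bounds how long a client may take to send request headers.
		ReadHeaderTimeout: 10 * time.Second,
		// Closes keep-alive connections that stay idle longer than this.
		IdleTimeout: 620 * time.Second,
	}
}

// newStreamChan (hypothetical name) returns a stream channel with the larger
// buffer, so a briefly slow consumer does not block the producer at once.
func newStreamChan() chan []byte {
	return make(chan []byte, 256)
}

func main() {
	srv := newAgentServer("127.0.0.1:0", http.NotFoundHandler())
	fmt.Println(srv.ReadHeaderTimeout, srv.IdleTimeout, cap(newStreamChan()))
}
```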

Network tuning:
- Set TAP device TxQueueLen to 5000 (up from default 1000)
- Add Firecracker tx_rate_limiter (200 MB/s sustained, 100 MB burst)
  to prevent guest traffic from saturating the TAP
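
The rate limiter above could be expressed in Firecracker's token-bucket form roughly as shown below. This is a sketch under assumptions: MB is taken as 10^6 bytes, the sustained rate is modeled as a 200 MB bucket refilled every 1000 ms, and the burst is mapped to one_time_burst. The commit's real values and struct names are not visible in this diff, and the TxQueueLen change is not covered here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// tokenBucket mirrors the fields of a Firecracker token bucket:
// size in bytes, one_time_burst in bytes, refill_time in milliseconds.
type tokenBucket struct {
	Size         int64 `json:"size"`
	OneTimeBurst int64 `json:"one_time_burst,omitempty"`
	RefillTime   int64 `json:"refill_time"`
}

// rateLimiter is the value placed under "tx_rate_limiter" in a network
// interface definition; only the bandwidth bucket is set in this sketch.
type rateLimiter struct {
	Bandwidth *tokenBucket `json:"bandwidth,omitempty"`
}

// txRateLimiterJSON renders the assumed limiter as JSON.
func txRateLimiterJSON() (string, error) {
	const mb = 1_000_000 // assumption: decimal megabytes
	rl := rateLimiter{Bandwidth: &tokenBucket{
		Size:         200 * mb, // 200 MB per refill period
		OneTimeBurst: 100 * mb, // extra initial burst credit
		RefillTime:   1000,     // one second, so 200 MB/s sustained
	}}
	b, err := json.Marshal(rl)
	return string(b), err
}

func main() {
	s, err := txRateLimiterJSON()
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```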
@@ -8,7 +8,6 @@ import (
 	"net/http"
 	"net/http/httputil"
 	"net/url"
-	"path"
 	"regexp"
 	"strconv"
 	"strings"
@@ -74,7 +73,7 @@ func NewSandboxProxyWrapper(inner http.Handler, queries *db.Queries, pool *lifec
 		inner: inner,
 		db: queries,
 		pool: pool,
-		transport: pool.Transport(),
+		transport: pool.NewProxyTransport(),
 		cache: make(map[pgtype.UUID]proxyCacheEntry),
 	}
 }
@@ -167,14 +166,29 @@ func (h *SandboxProxyWrapper) ServeHTTP(w http.ResponseWriter, r *http.Request)
 		return
 	}
 
+	// The host agent's proxy adds a /proxy/{id}/{port} prefix to Location
+	// headers for path-based routing. For subdomain routing the browser is at
+	// {port}-{id}.domain, so we strip the prefix back out.
+	agentProxyPrefix := "/proxy/" + sandboxIDStr + "/" + port
+
 	proxy := &httputil.ReverseProxy{
 		Transport: h.transport,
 		Director: func(req *http.Request) {
 			req.URL.Scheme = agentURL.Scheme
 			req.URL.Host = agentURL.Host
-			req.URL.Path = path.Join("/proxy", sandboxIDStr, port, path.Clean("/"+req.URL.Path))
+			// Use string concatenation instead of path.Join to preserve trailing
+			// slashes. path.Join strips them, causing redirect loops for directory
+			// listings in apps like python http.server and Jupyter.
+			req.URL.Path = "/proxy/" + sandboxIDStr + "/" + port + req.URL.Path
 			req.Host = agentURL.Host
 		},
+		ModifyResponse: func(resp *http.Response) error {
+			if loc := resp.Header.Get("Location"); loc != "" {
+				loc = strings.TrimPrefix(loc, agentProxyPrefix)
+				resp.Header.Set("Location", loc)
+			}
+			return nil
+		},
 		ErrorHandler: func(w http.ResponseWriter, r *http.Request, err error) {
 			slog.Debug("sandbox proxy error",
 				"sandbox_id", sandboxIDStr,