## Design Context

### Users

Developers across the full spectrum — solo engineers building side projects, startup teams integrating sandboxed execution into products, and platform/infra engineers at larger organizations running production workloads on Firecracker microVMs. They arrive with context: they know what a process is, what a rootfs is, what a TTY means. The interface must feel at home for all three: approachable enough not to intimidate a hacker, precise enough to earn the trust of a production ops team. Never condescend, never oversimplify. Trust the user to understand what they're looking at.

**Primary job to be done:** Understand what's running, act on it confidently, and get back to code.

### Brand Personality

**Precise. Warm. Uncompromising.**

Wrenn is an engineer's favorite tool — built with visible care, not assembled from defaults. It runs real infrastructure (Firecracker microVMs), so the UI should reflect that seriousness without becoming cold or corporate. The warmth comes from the typography and color palette; the precision comes from hierarchy, density, and data fidelity.

Emotional goal: **in control.** Users leave a session with full confidence in what's running, what happened, and what comes next. Nothing is hidden, nothing is ambiguous.

### Aesthetic Direction

**Dark-only (permanently), industrial-warm, data-forward.**

No light mode planned. All design decisions should optimize for dark. The near-black-green background palette (`#0a0c0b` through `#2a302d`) reads as "black with intention" — not pitch black (cold) and not charcoal (dated). The sage green accent (`#5e8c58`) is muted and organic, a meaningful departure from the startup-green neon that saturates the developer tool space.

**Anti-references:**

- **Supabase**: avoid the friendly, approachable startup-green energy — too generic, too eager to please
- **AWS / GCP consoles**: avoid utility-first density without craft — functional but joyless, visually dated

**References that capture the right spirit:**

- The precision of a well-calibrated instrument
- Editorial typography from technical publications
- The quiet confidence of tools that don't need to explain themselves

### Type System

Four fonts with strict roles — this is the design system's strongest personality trait and must be respected:

| Font | CSS Class | Role | When to use |
|------|-----------|------|-------------|
| **Manrope** (variable, sans) | `font-sans` | UI workhorse | All body copy, nav, labels, buttons, form text |
| **Instrument Serif** | `font-serif` | Display / editorial | Page titles (h1), dialog headings, metric values, hero moments |
| **JetBrains Mono** (variable) | `font-mono` | Data / code | IDs, timestamps, key prefixes, file paths, terminal output, metrics |
| **Alice** | brand wordmark only | Brand wordmark | "Wrenn" in sidebar and login only — nowhere else |

Instrument Serif at scale creates the signature editorial moments. Mono provides the precision signal for technical data. Never swap these roles.

**Tracking overrides (app.css):**

- `.font-serif` — `letter-spacing: 0.015em` (positive tracking; Instrument Serif reads less condensed at display sizes)
- `.font-mono` — `font-variant-numeric: tabular-nums` (numbers align in tables and metric displays)

**Type scale (root: 87.5% = 14px base):**

| Token | Value | Use |
|---|---|---|
| `--text-display` | 2.571rem (~36px) | Auth section headings |
| `--text-page` | 2rem (~28px) | Page h1 titles |
| `--text-heading` | 1.429rem (~20px) | Dialog headings, empty states |
| `--text-body` | 1rem (~14px) | Primary body, buttons, inputs |
| `--text-ui` | 0.929rem (~13px) | Nav labels, table cells |
| `--text-meta` | 0.857rem (~12px) | Key prefixes, minor info |
| `--text-label` | 0.786rem (~11px) | Uppercase section labels |
| `--text-badge` | 0.714rem (~10px) | Live badges, tiny indicators |

### Color System

All values are CSS custom properties in `frontend/src/app.css`.

**Backgrounds (6-step near-black-green scale):**

| Token | Value | Use |
|---|---|---|
| `--color-bg-0` | `#0a0c0b` | Page base, sidebar deepest layer |
| `--color-bg-1` | `#0f1211` | Sidebar surface |
| `--color-bg-2` | `#141817` | Card backgrounds |
| `--color-bg-3` | `#1a1e1c` | Table headers, elevated surfaces |
| `--color-bg-4` | `#212624` | Hover states, inputs |
| `--color-bg-5` | `#2a302d` | Highlighted items, selected rows |

**Text (5-level hierarchy):**

| Token | Value | Use |
|---|---|---|
| `--color-text-bright` | `#eae7e2` | H1s, dialog headings |
| `--color-text-primary` | `#d0cdc6` | Body copy, primary labels |
| `--color-text-secondary` | `#9b9790` | Secondary labels, descriptions |
| `--color-text-tertiary` | `#6b6862` | Hints, placeholders |
| `--color-text-muted` | `#454340` | Dividers as text, ultra-subtle |

**Accent (sage green — use sparingly, must feel earned):**

| Token | Value | Use |
|---|---|---|
| `--color-accent` | `#5e8c58` | Primary CTA, live indicators, focus rings, active nav |
| `--color-accent-mid` | `#89a785` | Hover accent text |
| `--color-accent-bright` | `#a4c89f` | Accent on dark backgrounds |
| `--color-accent-glow` | `rgba(94,140,88,0.07)` | Subtle tinted backgrounds |
| `--color-accent-glow-mid` | `rgba(94,140,88,0.14)` | Hover tint on accent items |

**Status semantics:**

| Token | Value | Use |
|---|---|---|
| `--color-amber` | `#d4a73c` | Warning, paused state |
| `--color-red` | `#cf8172` | Error, destructive actions |
| `--color-blue` | `#5a9fd4` | Info, neutral system states |

**Borders:** `--color-border` (`#1f2321`) default; `--color-border-mid` (`#2a2f2c`) for inputs/hover.

### Component Patterns

**Buttons:**

- Primary: solid sage green (`--color-accent`), hover brightness boost + micro-lift (`-translate-y-px`)
- Secondary: bordered (`--color-border-mid`), text transitions to accent on hover
- Danger: red text + subtle red background on hover
- All: `transition-all duration-150`

**Inputs:**

- Border `--color-border`, background `--color-bg-2`; focus transitions border and icon to accent
- Group focus pattern: `group` wrapper + `group-focus-within:text-[var(--color-accent)]` on icon

**Tables / data lists:**

- Grid layout; header `bg-3` + uppercase `--text-label`; row hover `hover:bg-[var(--color-bg-3)]`
- Status stripe: left border color matches sandbox state

**Status indicators:** Running = animated ping + sage green dot; Paused = amber dot; Stopped = muted gray. Color is never the sole differentiator.

**Modals & dialogs:** Border + shadow only — no accent gradient bars/strips. `fadeUp` 0.35s entrance.

**Empty states:** Large icon with glow, Instrument Serif heading, secondary body text, CTA below, `iconFloat` 4s animation.

**Animations (always respect `prefers-reduced-motion`):** `fadeUp` (entrance), `status-ping` (live indicator), `iconFloat` (empty states), `spin-once` (refresh), staggered `animation-delay` on lists.

### Design Principles

1. **Precision over friendliness.** Every element earns its place.
Wrenn doesn't need to tell you it's developer-friendly — that should be self-evident from the quality of the information architecture.
2. **Density with breathing room.** Data-forward doesn't mean cramped. Strategic whitespace creates calm hierarchy within dense contexts. Sections breathe; rows don't waste space.
3. **Industrial warmth.** The serif + mono + warm-black combination prevents sterility. This is a forge, not a gallery. The warmth is in the details, not the primary colors.
4. **Legible at speed.** Users scan dashboards in seconds. Strong typographic contrast (serif h1, mono IDs, sans body), consistent patterns, and predictable placement let users orientate instantly without reading everything.
5. **Craft signals trust.** For infrastructure that runs production code, the quality of the UI is a proxy for the quality of the product. Pixel-level decisions matter. Polish is not decoration — it's a trust signal.

## MCP Tools: code-review-graph

**IMPORTANT: This project has a knowledge graph. ALWAYS use the code-review-graph MCP tools BEFORE using Grep/Glob/Read to explore the codebase.** The graph is faster, cheaper (fewer tokens), and gives you structural context (callers, dependents, test coverage) that file scanning cannot.

### When to use graph tools FIRST

- **Exploring code**: `semantic_search_nodes` or `query_graph` instead of Grep
- **Understanding impact**: `get_impact_radius` instead of manually tracing imports
- **Code review**: `detect_changes` + `get_review_context` instead of reading entire files
- **Finding relationships**: `query_graph` with callers_of/callees_of/imports_of/tests_for
- **Architecture questions**: `get_architecture_overview` + `list_communities`

Fall back to Grep/Glob/Read **only** when the graph doesn't cover what you need.

### Key Tools

| Tool | Use when |
|------|----------|
| `detect_changes` | Reviewing code changes — gives risk-scored analysis |
| `get_review_context` | Need source snippets for review — token-efficient |
| `get_impact_radius` | Understanding blast radius of a change |
| `get_affected_flows` | Finding which execution paths are impacted |
| `query_graph` | Tracing callers, callees, imports, tests, dependencies |
| `semantic_search_nodes` | Finding functions/classes by name or keyword |
| `get_architecture_overview` | Understanding high-level codebase structure |
| `refactor_tool` | Planning renames, finding dead code |

### Workflow

1. The graph auto-updates on file changes (via hooks).
2. Use `detect_changes` for code review.
3. Use `get_affected_flows` to understand impact.
4. Use `query_graph` pattern="tests_for" to check coverage.

## Code Runner Module

`wrenn.code_runner` — stateful code execution capsule via persistent Jupyter kernel.

- **Module path:** `wrenn.code_runner` (canonical). The old path `wrenn.code_interpreter` is a deprecation alias that emits a `FutureWarning` on import; do not introduce new uses.
- **Defaults:** template `code-runner-beta`, kernelspec `wrenn`. Both overridable via `Capsule(template=..., kernel=...)`.
- **Kernel reuse:** `_ensure_kernel` lists `/api/kernels`, reuses the first kernel whose `name` matches the configured kernelspec, else POSTs `{"name": }` to create one. Matching by name (not just "any kernel") is intentional — multiple kernelspecs may coexist on the same Jupyter.
- **Lifecycle invariant:** the constructor sets `_kernel_id`, `_kernel_name`, `_proxy_client` to safe defaults *before* calling `super().__init__`. `__del__` must never assume construction completed. Async `__del__` only drops the reference — the proxy `httpx.AsyncClient` must be closed via `await close()` or `async with`.

## Client Config

`WrennClient` / `AsyncWrennClient` accept:

- `api_key` — falls back to `WRENN_API_KEY`.
- `base_url` — falls back to `WRENN_BASE_URL`, then `DEFAULT_BASE_URL` (`https://app.wrenn.dev/api`).
- `proxy_domain` — host suffix for capsule proxy URLs (`{port}-{capsule_id}.`). Resolution:
  1. explicit `proxy_domain=` kwarg
  2. `WRENN_PROXY_DOMAIN` env
  3. `wrenn.dev` when `base_url` host == `app.wrenn.dev` exactly
  4. else `base_url` host (with port) verbatim

  Exact match in step 3 is intentional: staging/other Wrenn envs keep their host so they don't accidentally collapse to prod `wrenn.dev`.
- `timeout` — `httpx.Timeout | float | None`. Default `httpx.Timeout(30.0, connect=10.0)`. Helper `_resolve_timeout` centralizes the float-or-Timeout coercion.

`_build_proxy_url` / `_build_http_proxy_url` in `wrenn.capsule` now take an optional `proxy_domain` arg. When omitted they fall back to the `base_url` host (legacy behavior, preserved for direct callers/tests). Production call sites pass `self._client._proxy_domain`.

### Tests

- `tests/test_code_runner_unit.py` — pure unit tests (respx + mocked WebSocket). Covers `Result.from_bundle`, MIME unpacking, quote-stripping, `Execution.text`, kernel reuse vs create, retry on 5xx, 4xx propagation, ctor-failure-safe `__del__`, deprecation alias.
- `tests/test_code_runner_e2e.py` — live integration tests (marked `integration`, skipped without `WRENN_API_KEY`). Covers stateful execution, exceptions, callbacks, rich outputs (HTML, matplotlib, pandas), async variant, isolation between capsules, and the deprecated `code_interpreter` import path.
- Run both: `make test-code-runner`.
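
As an illustration of the Client Config fallbacks described above, here is a minimal sketch of how the `base_url` and `proxy_domain` resolution order could be written. The function names `resolve_base_url` and `resolve_proxy_domain`, the `env` parameter, and the two `PROD_*` constants are hypothetical and not taken from the `wrenn` package. Comparing the full netloc (host plus any port) in step 3 is also an assumption about what "exactly" means; the real client may differ.

```python
from __future__ import annotations

import os
from typing import Mapping, Optional
from urllib.parse import urlsplit

# Value quoted in the Client Config section.
DEFAULT_BASE_URL = "https://app.wrenn.dev/api"

# Hypothetical constant names, used only for this sketch.
PROD_APP_HOST = "app.wrenn.dev"
PROD_PROXY_DOMAIN = "wrenn.dev"


def resolve_base_url(
    base_url: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Explicit kwarg, then WRENN_BASE_URL, then DEFAULT_BASE_URL."""
    env = os.environ if env is None else env
    return base_url or env.get("WRENN_BASE_URL") or DEFAULT_BASE_URL


def resolve_proxy_domain(
    proxy_domain: Optional[str] = None,
    base_url: str = DEFAULT_BASE_URL,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Follow the four-step resolution order from the section above."""
    env = os.environ if env is None else env

    # 1. An explicit proxy_domain= kwarg wins.
    if proxy_domain:
        return proxy_domain

    # 2. Otherwise use the WRENN_PROXY_DOMAIN environment variable.
    from_env = env.get("WRENN_PROXY_DOMAIN")
    if from_env:
        return from_env

    # 3. Only the exact production app host collapses to wrenn.dev.
    #    Assumption: "host" here means the netloc, including any port.
    host = urlsplit(base_url).netloc
    if host == PROD_APP_HOST:
        return PROD_PROXY_DOMAIN

    # 4. Any other host (with port) is kept verbatim.
    return host
```

Passing `env` explicitly keeps the sketch testable without mutating `os.environ`. Under these assumptions, `resolve_proxy_domain(base_url="https://staging.wrenn.dev/api", env={})` keeps the staging host rather than collapsing to the production domain, which is the behavior step 3 is meant to protect.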