feat: redesign code interpreter with structured Execution model
Some checks failed
ci/woodpecker/push/check Pipeline failed

Replace flat CodeResult with a proper model hierarchy: Execution
(top-level), Result (per-output with typed MIME fields), Logs
(stdout/stderr as lists), and ExecutionError (structured
name/value/traceback). Handle display_data messages for rich output,
add streaming callbacks (on_result, on_stdout, on_stderr, on_error),
and remove the misleading stdout-to-text fallback.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 03:13:16 +06:00
parent 7e7ecbd48a
commit 3f97c73b2f
11 changed files with 863 additions and 108 deletions

2
.gitignore vendored

@ -175,3 +175,5 @@ cython_debug/
.pypirc
CODE_EXECUTION.md
docs/

132
CLAUDE.md Normal file

@ -0,0 +1,132 @@
## Design Context
### Users
Developers across the full spectrum — solo engineers building side projects, startup teams integrating sandboxed execution into products, and platform/infra engineers at larger organizations running production workloads on Firecracker microVMs. They arrive with context: they know what a process is, what a rootfs is, what a TTY means. The interface must feel at home for all three: approachable enough not to intimidate a hacker, precise enough to earn the trust of a production ops team. Never condescend, never oversimplify. Trust the user to understand what they're looking at.
**Primary job to be done:** Understand what's running, act on it confidently, and get back to code.
### Brand Personality
**Precise. Warm. Uncompromising.**
Wrenn is an engineer's favorite tool — built with visible care, not assembled from defaults. It runs real infrastructure (Firecracker microVMs), so the UI should reflect that seriousness without becoming cold or corporate. The warmth comes from the typography and color palette; the precision comes from hierarchy, density, and data fidelity.
Emotional goal: **in control.** Users leave a session with full confidence in what's running, what happened, and what comes next. Nothing is hidden, nothing is ambiguous.
### Aesthetic Direction
**Dark-only (permanently), industrial-warm, data-forward.**
No light mode planned. All design decisions should optimize for dark. The near-black-green background palette (`#0a0c0b` through `#2a302d`) reads as "black with intention" — not pitch black (cold) and not charcoal (dated). The sage green accent (`#5e8c58`) is muted and organic, a meaningful departure from the startup-green neon that saturates the developer tool space.
**Anti-references:**
- **Supabase**: avoid the friendly, approachable startup-green energy — too generic, too eager to please
- **AWS / GCP consoles**: avoid utility-first density without craft — functional but joyless, visually dated
**References that capture the right spirit:**
- The precision of a well-calibrated instrument
- Editorial typography from technical publications
- The quiet confidence of tools that don't need to explain themselves
### Type System
Four fonts with strict roles — this is the design system's strongest personality trait and must be respected:
| Font | CSS Class | Role | When to use |
|------|-----------|------|-------------|
| **Manrope** (variable, sans) | `font-sans` | UI workhorse | All body copy, nav, labels, buttons, form text |
| **Instrument Serif** | `font-serif` | Display / editorial | Page titles (h1), dialog headings, metric values, hero moments |
| **JetBrains Mono** (variable) | `font-mono` | Data / code | IDs, timestamps, key prefixes, file paths, terminal output, metrics |
| **Alice** | brand wordmark only | Brand wordmark | "Wrenn" in sidebar and login only — nowhere else |
Instrument Serif at scale creates the signature editorial moments. Mono provides the precision signal for technical data. Never swap these roles.
**Tracking overrides (app.css):**
- `.font-serif` → `letter-spacing: 0.015em` (positive tracking; Instrument Serif reads less condensed at display sizes)
- `.font-mono` → `font-variant-numeric: tabular-nums` (numbers align in tables and metric displays)
**Type scale (root: 87.5% = 14px base):**
| Token | Value | Use |
|---|---|---|
| `--text-display` | 2.571rem (~36px) | Auth section headings |
| `--text-page` | 2rem (~28px) | Page h1 titles |
| `--text-heading` | 1.429rem (~20px) | Dialog headings, empty states |
| `--text-body` | 1rem (~14px) | Primary body, buttons, inputs |
| `--text-ui` | 0.929rem (~13px) | Nav labels, table cells |
| `--text-meta` | 0.857rem (~12px) | Key prefixes, minor info |
| `--text-label` | 0.786rem (~11px) | Uppercase section labels |
| `--text-badge` | 0.714rem (~10px) | Live badges, tiny indicators |
### Color System
All values are CSS custom properties in `frontend/src/app.css`.
**Backgrounds (6-step near-black-green scale):**
| Token | Value | Use |
|---|---|---|
| `--color-bg-0` | `#0a0c0b` | Page base, sidebar deepest layer |
| `--color-bg-1` | `#0f1211` | Sidebar surface |
| `--color-bg-2` | `#141817` | Card backgrounds |
| `--color-bg-3` | `#1a1e1c` | Table headers, elevated surfaces |
| `--color-bg-4` | `#212624` | Hover states, inputs |
| `--color-bg-5` | `#2a302d` | Highlighted items, selected rows |
**Text (5-level hierarchy):**
| Token | Value | Use |
|---|---|---|
| `--color-text-bright` | `#eae7e2` | H1s, dialog headings |
| `--color-text-primary` | `#d0cdc6` | Body copy, primary labels |
| `--color-text-secondary` | `#9b9790` | Secondary labels, descriptions |
| `--color-text-tertiary` | `#6b6862` | Hints, placeholders |
| `--color-text-muted` | `#454340` | Dividers as text, ultra-subtle |
**Accent (sage green — use sparingly, must feel earned):**
| Token | Value | Use |
|---|---|---|
| `--color-accent` | `#5e8c58` | Primary CTA, live indicators, focus rings, active nav |
| `--color-accent-mid` | `#89a785` | Hover accent text |
| `--color-accent-bright` | `#a4c89f` | Accent on dark backgrounds |
| `--color-accent-glow` | `rgba(94,140,88,0.07)` | Subtle tinted backgrounds |
| `--color-accent-glow-mid` | `rgba(94,140,88,0.14)` | Hover tint on accent items |
**Status semantics:**
| Token | Value | Use |
|---|---|---|
| `--color-amber` | `#d4a73c` | Warning, paused state |
| `--color-red` | `#cf8172` | Error, destructive actions |
| `--color-blue` | `#5a9fd4` | Info, neutral system states |
**Borders:** `--color-border` (`#1f2321`) default; `--color-border-mid` (`#2a2f2c`) for inputs/hover.
### Component Patterns
**Buttons:**
- Primary: solid sage green (`--color-accent`), hover brightness boost + micro-lift (`-translate-y-px`)
- Secondary: bordered (`--color-border-mid`), text transitions to accent on hover
- Danger: red text + subtle red background on hover
- All: `transition-all duration-150`
**Inputs:**
- Border `--color-border`, background `--color-bg-2`; focus transitions border and icon to accent
- Group focus pattern: `group` wrapper + `group-focus-within:text-[var(--color-accent)]` on icon
**Tables / data lists:**
- Grid layout; header `bg-3` + uppercase `--text-label`; row hover `hover:bg-[var(--color-bg-3)]`
- Status stripe: left border color matches sandbox state
**Status indicators:** Running = animated ping + sage green dot; Paused = amber dot; Stopped = muted gray. Color is never the sole differentiator.
**Modals & dialogs:** Border + shadow only — no accent gradient bars/strips. `fadeUp` 0.35s entrance.
**Empty states:** Large icon with glow, Instrument Serif heading, secondary body text, CTA below, `iconFloat` 4s animation.
**Animations (always respect `prefers-reduced-motion`):** `fadeUp` (entrance), `status-ping` (live indicator), `iconFloat` (empty states), `spin-once` (refresh), staggered `animation-delay` on lists.
### Design Principles
1. **Precision over friendliness.** Every element earns its place. Wrenn doesn't need to tell you it's developer-friendly — that should be self-evident from the quality of the information architecture.
2. **Density with breathing room.** Data-forward doesn't mean cramped. Strategic whitespace creates calm hierarchy within dense contexts. Sections breathe; rows don't waste space.
3. **Industrial warmth.** The serif + mono + warm-black combination prevents sterility. This is a forge, not a gallery. The warmth is in the details, not the primary colors.
4. **Legible at speed.** Users scan dashboards in seconds. Strong typographic contrast (serif h1, mono IDs, sans body), consistent patterns, and predictable placement let users orientate instantly without reading everything.
5. **Craft signals trust.** For infrastructure that runs production code, the quality of the UI is a proxy for the quality of the product. Pixel-level decisions matter. Polish is not decoration — it's a trust signal.


@ -2,7 +2,7 @@
.PHONY: generate lint test check test-integration
# Variables
SPEC_URL = "https://git.omukk.dev/wrenn/wrenn/raw/branch/dev/internal/api/openapi.yaml"
SPEC_URL = "https://git.omukk.dev/wrenn/wrenn/raw/branch/main/internal/api/openapi.yaml"
SPEC_PATH = "api/openapi.yaml"
generate:


@ -273,7 +273,7 @@ from wrenn.code_interpreter import Capsule
with Capsule(wait=True) as capsule:
result = capsule.run_code("print('hello')")
print(result.text) # "hello"
print("".join(result.logs.stdout)) # "hello\n"
```
### Stateful Execution
@ -297,25 +297,43 @@ with Capsule(wait=True) as capsule:
print(result.text) # "hello world"
```
The `text` field returns the expression result when available. For `print()` calls (which produce no expression result), it falls back to the stripped stdout output.
The `text` property returns the `text/plain` value of the main `execute_result` (the last expression in the cell). Printed output goes to `result.logs.stdout` instead.
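
To make the distinction concrete, here is a minimal sketch of that lookup. `FakeResult` and `main_text` are hypothetical stand-ins written for illustration, not the SDK's own classes; the sketch only assumes that each output carries a `text` value and an `is_main_result` flag.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class FakeResult:
    """Hypothetical stand-in for one rich output."""

    text: str | None = None
    is_main_result: bool = False


def main_text(results: list[FakeResult]) -> str | None:
    """Return the text of the first main result, or None if there is none."""
    for r in results:
        if r.is_main_result:
            return r.text
    return None


# A cell that only printed or displayed something has no main result.
display_only = [FakeResult(text="<Figure>", is_main_result=False)]
# A cell ending in an expression has one main result.
with_expression = display_only + [FakeResult(text="hello world", is_main_result=True)]

print(main_text(display_only))     # None
print(main_text(with_expression))  # hello world
```

Code that previously relied on `text` falling back to stdout would, under this model, need to read the stdout chunks explicitly.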
### Error Handling in Code
```python
result = capsule.run_code("1 / 0")
print(result.error) # "ZeroDivisionError: division by zero\n..."
print(result.error.name) # "ZeroDivisionError"
print(result.error.value) # "division by zero"
print(result.error.traceback) # full traceback string
```
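
The three fields correspond to the `ename`, `evalue`, and `traceback` keys of a Jupyter `error` message. The sketch below shows that mapping on a hand-written payload; `ErrorInfo` and `parse_error` are hypothetical names used only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ErrorInfo:
    """Hypothetical stand-in mirroring the name/value/traceback shape."""

    name: str = ""
    value: str = ""
    traceback: str = ""


def parse_error(content: dict) -> ErrorInfo:
    """Build an ErrorInfo from the `content` of a Jupyter `error` message."""
    return ErrorInfo(
        name=content.get("ename", ""),
        value=content.get("evalue", ""),
        # The kernel sends the traceback as a list of lines.
        traceback="\n".join(content.get("traceback", [])),
    )


# Hand-written sample payload; a real kernel usually sends ANSI-coloured frames.
sample = {
    "ename": "ZeroDivisionError",
    "evalue": "division by zero",
    "traceback": [
        "Traceback (most recent call last):",
        "ZeroDivisionError: division by zero",
    ],
}
err = parse_error(sample)
print(f"{err.name}: {err.value}")
```

Keeping the name and message separate lets callers branch on the exception type without parsing the traceback string.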
### Rich Output
Each call to `display()`, `plt.show()`, or similar produces a `Result` in `execution.results`. Known MIME types are unpacked into named fields:
```python
result = capsule.run_code("""
import matplotlib.pyplot as plt
plt.plot([1, 2, 3])
plt.savefig('/tmp/plot.png')
plt.show()
""")
print(result.data) # {"image/png": "base64...", "text/plain": "..."}
for r in result.results:
if r.png:
print(f"Got PNG image ({len(r.png)} bytes base64)")
print(r.formats()) # e.g. ["text", "png"]
```
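
Because the `png` field holds base64 text rather than raw bytes, it has to be decoded before it can be written to disk or passed to an image library. A minimal sketch using only the standard library follows; `decode_png` is a hypothetical helper, and the payload is a simulated stand-in for a real `r.png` value.

```python
import base64

# The 8-byte PNG signature, used here as a tiny stand-in for real image data.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_png(b64_payload: str) -> bytes:
    """Decode a base64 string and check that it starts like a PNG file."""
    raw = base64.b64decode(b64_payload)
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError("payload is not PNG data")
    return raw


# Simulated value of `r.png`; a real result would carry a full image.
fake_png_field = base64.b64encode(PNG_SIGNATURE + b"rest-of-image").decode("ascii")

image_bytes = decode_png(fake_png_field)
print(len(image_bytes))
```

The signature check is optional; it is included here only to catch a mislabeled payload early.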
### Streaming Callbacks
```python
capsule.run_code(
code,
on_result=lambda r: print("result:", r.formats()),
on_stdout=lambda text: print("stdout:", text),
on_stderr=lambda text: print("stderr:", text),
on_error=lambda err: print(f"error: {err.name}: {err.value}"),
)
```
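
The callback flow can be tried without a running capsule. The sketch below routes a few hand-written Jupyter-style `stream` messages to optional callbacks while also collecting them; `dispatch_stream` is a hypothetical helper for illustration and is not part of the SDK.

```python
from __future__ import annotations

from collections.abc import Callable
from typing import Any


def dispatch_stream(
    messages: list[dict],
    on_stdout: Callable[[str], Any] | None = None,
    on_stderr: Callable[[str], Any] | None = None,
) -> dict[str, list[str]]:
    """Route Jupyter-style `stream` messages to callbacks and collect them."""
    logs: dict[str, list[str]] = {"stdout": [], "stderr": []}
    for content in messages:
        text = content.get("text", "")
        # Anything not explicitly marked stderr is treated as stdout.
        if content.get("name", "stdout") == "stderr":
            logs["stderr"].append(text)
            if on_stderr is not None:
                on_stderr(text)
        else:
            logs["stdout"].append(text)
            if on_stdout is not None:
                on_stdout(text)
    return logs


seen: list[str] = []
logs = dispatch_stream(
    [
        {"name": "stdout", "text": "step 1\n"},
        {"name": "stderr", "text": "warning\n"},
        {"name": "stdout", "text": "step 2\n"},
    ],
    on_stdout=seen.append,
)
print("".join(logs["stdout"]), end="")
```

Each callback is optional, so a caller can subscribe to stdout alone and still find every chunk in the returned logs afterwards.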
### Custom Templates
@ -327,17 +345,19 @@ capsule = Capsule(template="my-custom-jupyter-template", wait=True)
result = capsule.run_code("print('running on custom template')")
```
### CodeResult Fields
### Execution Model
`run_code()` returns an `Execution` object:
| Field | Type | Description |
|-------|------|-------------|
| `text` | `str \| None` | Expression result, or stripped stdout if no expression result |
| `data` | `dict \| None` | Rich MIME bundle (e.g. `{"image/png": "..."}`) |
| `stdout` | `str` | Raw accumulated stdout output |
| `stderr` | `str` | Raw accumulated stderr output |
| `error` | `str \| None` | Error traceback string |
| `results` | `list[Result]` | All rich outputs (charts, images, expression values) |
| `logs` | `Logs` | `.stdout: list[str]` and `.stderr: list[str]` chunks |
| `error` | `ExecutionError \| None` | `.name`, `.value`, `.traceback` |
| `execution_count` | `int \| None` | Jupyter cell execution counter |
| `text` | `str \| None` | (property) `text/plain` of the main `execute_result` |
String expression results have quotes stripped automatically (e.g. `'hello'` becomes `hello`).
Each `Result` has typed MIME fields: `text`, `html`, `markdown`, `svg`, `png`, `jpeg`, `pdf`, `latex`, `json`, `javascript`, plus `extra` for unknown types. String expression results have quotes stripped automatically.
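
As a rough illustration of that unpacking, the sketch below splits a MIME bundle into named fields and an `extra` dict, then strips one pair of matching quotes from the plain-text value. `MIME_TO_FIELD` is a shortened, hypothetical subset of the mapping and `unpack_bundle` is an illustrative helper, not the SDK's method.

```python
from __future__ import annotations

# Subset of the MIME-to-field mapping, shortened for the example.
MIME_TO_FIELD = {
    "text/plain": "text",
    "text/html": "html",
    "image/png": "png",
}


def unpack_bundle(bundle: dict[str, str]) -> dict[str, object]:
    """Split a MIME bundle into named fields plus an `extra` dict."""
    fields: dict[str, object] = {}
    extra: dict[str, str] = {}
    for mime, value in bundle.items():
        name = MIME_TO_FIELD.get(mime)
        if name is not None:
            fields[name] = value
        else:
            extra[mime] = value
    if extra:
        fields["extra"] = extra

    # Strip one pair of matching quotes left over from repr() of a string.
    text = fields.get("text")
    if isinstance(text, str) and len(text) >= 2:
        if text[0] == text[-1] and text[0] in ("'", '"'):
            fields["text"] = text[1:-1]
    return fields


unpacked = unpack_bundle(
    {"text/plain": "'hello'", "application/x-custom": "payload"}
)
print(unpacked)
```

Note that the quote stripping applies to any plain-text value that begins and ends with the same quote character, so a value that legitimately starts and ends with a quote would lose them as well.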
### Code Interpreter + Commands/Files


@ -16,6 +16,10 @@ paths:
summary: Create a new account
operationId: signup
tags: [auth]
description: |
Creates an inactive user account and sends an activation email.
The user must activate their account within 30 minutes.
Does not return a JWT — the user must activate first, then sign in.
requestBody:
required: true
content:
@ -24,11 +28,11 @@ paths:
$ref: "#/components/schemas/SignupRequest"
responses:
"201":
description: Account created
description: Account created, activation email sent
content:
application/json:
schema:
$ref: "#/components/schemas/AuthResponse"
$ref: "#/components/schemas/SignupResponse"
"400":
description: Invalid request (bad email, short password)
content:
@ -36,7 +40,39 @@ paths:
schema:
$ref: "#/components/schemas/Error"
"409":
description: Email already registered
description: Email already registered or signup cooldown active
content:
application/json:
schema:
$ref: "#/components/schemas/Error"
/v1/auth/activate:
post:
summary: Activate account via email token
operationId: activate
tags: [auth]
description: |
Consumes the activation token sent via email and activates the user account.
Creates a default team and returns a JWT to log the user in.
requestBody:
required: true
content:
application/json:
schema:
type: object
required: [token]
properties:
token:
type: string
responses:
"200":
description: Account activated, JWT issued
content:
application/json:
schema:
$ref: "#/components/schemas/AuthResponse"
"400":
description: Invalid or expired token
content:
application/json:
schema:
@ -175,6 +211,252 @@ paths:
"302":
description: Redirect to frontend with token or error
/v1/me:
get:
summary: Get current user profile
operationId: getMe
tags: [account]
security:
- bearerAuth: []
responses:
"200":
description: User profile
content:
application/json:
schema:
$ref: "#/components/schemas/MeResponse"
patch:
summary: Update display name
operationId: updateName
tags: [account]
security:
- bearerAuth: []
requestBody:
required: true
content:
application/json:
schema:
type: object
required: [name]
properties:
name:
type: string
minLength: 1
maxLength: 100
responses:
"200":
description: Name updated, new JWT issued
content:
application/json:
schema:
$ref: "#/components/schemas/AuthResponse"
"400":
description: Invalid name
content:
application/json:
schema:
$ref: "#/components/schemas/Error"
delete:
summary: Delete current account
operationId: deleteAccount
tags: [account]
security:
- bearerAuth: []
description: |
Soft-deletes the account (sets status=deleted, deleted_at=now).
The account is permanently removed after 15 days. Blocked if the user
owns any team that has other members.
requestBody:
required: true
content:
application/json:
schema:
type: object
required: [confirmation]
properties:
confirmation:
type: string
description: Must match the user's email address (case-insensitive)
responses:
"204":
description: Account scheduled for deletion
"400":
description: Confirmation does not match email
content:
application/json:
schema:
$ref: "#/components/schemas/Error"
"409":
description: User owns teams with other members
content:
application/json:
schema:
$ref: "#/components/schemas/Error"
/v1/me/password:
post:
summary: Change or add password
operationId: changePassword
tags: [account]
security:
- bearerAuth: []
description: |
For users with an existing password: requires `current_password` and `new_password`.
For OAuth-only users adding a password: requires `new_password` and `confirm_password`.
requestBody:
required: true
content:
application/json:
schema:
$ref: "#/components/schemas/ChangePasswordRequest"
responses:
"204":
description: Password updated
"400":
description: Invalid request (short password, mismatch, etc.)
content:
application/json:
schema:
$ref: "#/components/schemas/Error"
"401":
description: Current password is incorrect
content:
application/json:
schema:
$ref: "#/components/schemas/Error"
/v1/me/password/reset:
post:
summary: Request a password reset email
operationId: requestPasswordReset
tags: [account]
description: |
Sends a password reset link to the given email. Always returns 204
regardless of whether the email exists, to prevent account enumeration.
The reset token expires in 15 minutes.
requestBody:
required: true
content:
application/json:
schema:
type: object
required: [email]
properties:
email:
type: string
format: email
responses:
"204":
description: Request accepted (email sent if account exists)
/v1/me/password/reset/confirm:
post:
summary: Confirm password reset
operationId: confirmPasswordReset
tags: [account]
description: |
Consumes a password reset token and sets a new password. The token is
single-use and expires after 15 minutes.
requestBody:
required: true
content:
application/json:
schema:
type: object
required: [token, new_password]
properties:
token:
type: string
description: Raw reset token from the email link
new_password:
type: string
minLength: 8
responses:
"204":
description: Password reset successful
"400":
description: Invalid or expired token, or password too short
content:
application/json:
schema:
$ref: "#/components/schemas/Error"
/v1/me/providers/{provider}/connect:
parameters:
- name: provider
in: path
required: true
schema:
type: string
enum: [github]
description: OAuth provider name
get:
summary: Initiate OAuth provider link
operationId: connectProvider
tags: [account]
security:
- bearerAuth: []
description: |
Sets OAuth state and link cookies, then returns the provider's
authorization URL. The frontend navigates to this URL to start the
OAuth flow. On callback, the provider is linked to the current account
(not a new registration).
responses:
"200":
description: Authorization URL
content:
application/json:
schema:
type: object
properties:
auth_url:
type: string
format: uri
"404":
description: Provider not found or not configured
content:
application/json:
schema:
$ref: "#/components/schemas/Error"
/v1/me/providers/{provider}:
parameters:
- name: provider
in: path
required: true
schema:
type: string
enum: [github]
description: OAuth provider name
delete:
summary: Disconnect an OAuth provider
operationId: disconnectProvider
tags: [account]
security:
- bearerAuth: []
description: |
Unlinks the OAuth provider from the current account. Blocked if this
is the user's only login method (no password and no other providers).
responses:
"204":
description: Provider disconnected
"400":
description: Cannot disconnect last login method
content:
application/json:
schema:
$ref: "#/components/schemas/Error"
"404":
description: Provider not connected
content:
application/json:
schema:
$ref: "#/components/schemas/Error"
/v1/api-keys:
post:
summary: Create an API key
@ -1386,7 +1668,6 @@ paths:
PTY data (input and output) is base64-encoded because it contains raw
terminal bytes (escape sequences, control codes) that are not valid UTF-8.
Sessions have a 120-second inactivity timeout (reset on input/resize).
Sessions persist across WebSocket disconnections — the process keeps
running in the capsule. Use the `tag` from the "started" response to
reconnect later.
@ -2078,6 +2359,13 @@ components:
password:
type: string
SignupResponse:
type: object
properties:
message:
type: string
description: Confirmation message instructing user to check email
AuthResponse:
type: object
properties:
@ -2781,6 +3069,37 @@ components:
nullable: true
description: Webhook secret. Only returned on creation, never again.
MeResponse:
type: object
properties:
name:
type: string
email:
type: string
format: email
has_password:
type: boolean
description: Whether the user has a password set (false for OAuth-only accounts)
providers:
type: array
items:
type: string
description: List of linked OAuth provider names (e.g. ["github"])
ChangePasswordRequest:
type: object
required: [new_password]
properties:
current_password:
type: string
description: Required when changing an existing password
new_password:
type: string
minLength: 8
confirm_password:
type: string
description: Required when adding a password to an OAuth-only account (must match new_password)
Error:
type: object
properties:


@ -1,10 +1,19 @@
from wrenn.code_interpreter.async_capsule import AsyncCapsule
from wrenn.code_interpreter.capsule import Capsule, CodeResult
from wrenn.code_interpreter.capsule import Capsule
from wrenn.code_interpreter.models import (
Execution,
ExecutionError,
Logs,
Result,
)
__all__ = [
"AsyncCapsule",
"Capsule",
"CodeResult",
"Execution",
"ExecutionError",
"Logs",
"Result",
"Sandbox",
]


@ -4,6 +4,8 @@ import asyncio
import json
import time
import uuid
from collections.abc import Callable
from typing import Any
import httpx
import httpx_ws
@ -11,7 +13,13 @@ import httpx_ws
from wrenn.async_capsule import AsyncCapsule as BaseAsyncCapsule
from wrenn.capsule import _build_proxy_url
from wrenn.client import AsyncWrennClient
from wrenn.code_interpreter.capsule import CodeResult, DEFAULT_TEMPLATE
from wrenn.code_interpreter.capsule import DEFAULT_TEMPLATE
from wrenn.code_interpreter.models import (
Execution,
ExecutionError,
Logs,
Result,
)
class AsyncCapsule(BaseAsyncCapsule):
@ -151,15 +159,36 @@ class AsyncCapsule(BaseAsyncCapsule):
language: str = "python",
timeout: float = 30,
jupyter_timeout: float = 30,
) -> CodeResult:
"""Execute code in a persistent Jupyter kernel (async)."""
on_result: Callable[[Result], Any] | None = None,
on_stdout: Callable[[str], Any] | None = None,
on_stderr: Callable[[str], Any] | None = None,
on_error: Callable[[ExecutionError], Any] | None = None,
) -> Execution:
"""Execute code in a persistent Jupyter kernel (async).
Args:
code: Code string to execute.
language: Execution backend language. Currently only ``"python"``.
timeout: Maximum seconds to wait for execution to complete.
jupyter_timeout: Maximum seconds to wait for Jupyter to become
available.
on_result: Called for each rich output (charts, images, expression
values).
on_stdout: Called for each stdout chunk.
on_stderr: Called for each stderr chunk.
on_error: Called when the cell raises an exception.
Returns:
An :class:`Execution` with ``.results``, ``.logs``, ``.error``,
and a convenience ``.text`` property.
"""
kernel_id = await self._ensure_kernel(jupyter_timeout=jupyter_timeout)
ws_url = self._jupyter_ws_url(kernel_id)
msg = self._jupyter_execute_request(code)
msg_id = msg["msg_id"]
result = CodeResult()
execution = Execution()
deadline = time.monotonic() + timeout
headers = {"X-API-Key": self._client._api_key}
@ -186,31 +215,43 @@ class AsyncCapsule(BaseAsyncCapsule):
content = data.get("content", {})
if msg_type == "stream":
text = content.get("text", "")
name = content.get("name", "stdout")
if name == "stderr":
result.stderr += content.get("text", "")
execution.logs.stderr.append(text)
if on_stderr is not None:
on_stderr(text)
else:
result.stdout += content.get("text", "")
elif msg_type == "execute_result":
execution.logs.stdout.append(text)
if on_stdout is not None:
on_stdout(text)
elif msg_type in ("execute_result", "display_data"):
bundle = content.get("data", {})
text = bundle.get("text/plain")
if text and (
(text.startswith("'") and text.endswith("'"))
or (text.startswith('"') and text.endswith('"'))
):
text = text[1:-1]
result.text = text
result.data = bundle
is_main = msg_type == "execute_result"
result = Result.from_bundle(bundle, is_main_result=is_main)
execution.results.append(result)
if is_main:
execution.execution_count = content.get(
"execution_count"
)
if on_result is not None:
on_result(result)
elif msg_type == "error":
traceback = content.get("traceback", [])
result.error = "\n".join(traceback)
elif msg_type == "status" and content.get("execution_state") == "idle":
err = ExecutionError(
name=content.get("ename", ""),
value=content.get("evalue", ""),
traceback="\n".join(content.get("traceback", [])),
)
execution.error = err
if on_error is not None:
on_error(err)
elif (
msg_type == "status"
and content.get("execution_state") == "idle"
):
break
if result.text is None and result.stdout:
result.text = result.stdout.strip()
return result
return execution
async def __aexit__(self, *args) -> None:
if self._proxy_client is not None:


@ -3,37 +3,24 @@ from __future__ import annotations
import json
import time
import uuid
from dataclasses import dataclass
from collections.abc import Callable
from typing import Any
import httpx
import httpx_ws
from wrenn.capsule import Capsule as BaseCapsule
from wrenn.capsule import _build_proxy_url
from wrenn.code_interpreter.models import (
Execution,
ExecutionError,
Logs,
Result,
)
DEFAULT_TEMPLATE = "code-runner-beta"
@dataclass
class CodeResult:
"""Result from stateful code execution.
Attributes:
text: text/plain representation of the result.
data: rich MIME bundle (e.g. ``{"image/png": "..."}``).
stdout: accumulated stdout output.
stderr: accumulated stderr output.
error: language-specific error/traceback string.
"""
text: str | None = None
data: dict[str, str] | None = None
stdout: str = ""
stderr: str = ""
error: str | None = None
class Capsule(BaseCapsule):
"""Code interpreter capsule with ``run_code`` support.
@ -43,7 +30,7 @@ class Capsule(BaseCapsule):
capsule = Capsule()
result = capsule.run_code("print('hello')")
print(result.stdout) # "hello\\n"
print(result.logs.stdout) # ["hello\\n"]
"""
_kernel_id: str | None
@ -184,7 +171,11 @@ class Capsule(BaseCapsule):
language: str = "python",
timeout: float = 30,
jupyter_timeout: float = 30,
) -> CodeResult:
on_result: Callable[[Result], Any] | None = None,
on_stdout: Callable[[str], Any] | None = None,
on_stderr: Callable[[str], Any] | None = None,
on_error: Callable[[ExecutionError], Any] | None = None,
) -> Execution:
"""Execute code in a persistent Jupyter kernel.
Variables, imports, and function definitions survive across calls.
@ -193,10 +184,17 @@ class Capsule(BaseCapsule):
code: Code string to execute.
language: Execution backend language. Currently only ``"python"``.
timeout: Maximum seconds to wait for execution to complete.
jupyter_timeout: Maximum seconds to wait for Jupyter to become available.
jupyter_timeout: Maximum seconds to wait for Jupyter to become
available.
on_result: Called for each rich output (charts, images, expression
values).
on_stdout: Called for each stdout chunk.
on_stderr: Called for each stderr chunk.
on_error: Called when the cell raises an exception.
Returns:
A ``CodeResult`` with ``.text``, ``.data``, ``.stdout``, ``.stderr``, ``.error``.
An :class:`Execution` with ``.results``, ``.logs``, ``.error``,
and a convenience ``.text`` property.
"""
kernel_id = self._ensure_kernel(jupyter_timeout=jupyter_timeout)
ws_url = self._jupyter_ws_url(kernel_id)
@ -204,7 +202,7 @@ class Capsule(BaseCapsule):
msg = self._jupyter_execute_request(code)
msg_id = msg["msg_id"]
result = CodeResult()
execution = Execution()
deadline = time.monotonic() + timeout
headers = {"X-API-Key": self._client._api_key}
@ -229,31 +227,43 @@ class Capsule(BaseCapsule):
content = data.get("content", {})
if msg_type == "stream":
text = content.get("text", "")
name = content.get("name", "stdout")
if name == "stderr":
result.stderr += content.get("text", "")
execution.logs.stderr.append(text)
if on_stderr is not None:
on_stderr(text)
else:
result.stdout += content.get("text", "")
elif msg_type == "execute_result":
execution.logs.stdout.append(text)
if on_stdout is not None:
on_stdout(text)
elif msg_type in ("execute_result", "display_data"):
bundle = content.get("data", {})
text = bundle.get("text/plain")
if text and (
(text.startswith("'") and text.endswith("'"))
or (text.startswith('"') and text.endswith('"'))
):
text = text[1:-1]
result.text = text
result.data = bundle
is_main = msg_type == "execute_result"
result = Result.from_bundle(bundle, is_main_result=is_main)
execution.results.append(result)
if is_main:
execution.execution_count = content.get(
"execution_count"
)
if on_result is not None:
on_result(result)
elif msg_type == "error":
traceback = content.get("traceback", [])
result.error = "\n".join(traceback)
elif msg_type == "status" and content.get("execution_state") == "idle":
err = ExecutionError(
name=content.get("ename", ""),
value=content.get("evalue", ""),
traceback="\n".join(content.get("traceback", [])),
)
execution.error = err
if on_error is not None:
on_error(err)
elif (
msg_type == "status"
and content.get("execution_state") == "idle"
):
break
if result.text is None and result.stdout:
result.text = result.stdout.strip()
return result
return execution
def __exit__(self, *args) -> None:
if self._proxy_client is not None:


@ -0,0 +1,156 @@
from __future__ import annotations
from dataclasses import dataclass, field
_MIME_MAP: dict[str, str] = {
"text/plain": "text",
"text/html": "html",
"text/markdown": "markdown",
"image/svg+xml": "svg",
"image/png": "png",
"image/jpeg": "jpeg",
"application/pdf": "pdf",
"text/latex": "latex",
"application/json": "json",
"application/javascript": "javascript",
}
@dataclass
class ExecutionError:
"""Error raised during code execution.
Attributes:
name: Exception class name (e.g. ``"NameError"``).
value: Exception message.
traceback: Full traceback string.
"""
name: str = ""
value: str = ""
traceback: str = ""
@dataclass
class Logs:
"""Captured stdout/stderr streams.
Each element in the list is one chunk of text as it arrived from
the kernel.
"""
stdout: list[str] = field(default_factory=list)
stderr: list[str] = field(default_factory=list)
@dataclass
class Result:
"""A single rich output from code execution.
Jupyter cells can produce multiple outputs — one ``execute_result``
(the expression value) and zero or more ``display_data`` messages
(from ``plt.show()``, ``display()``, etc.). Each becomes a
``Result``.
Known MIME types are unpacked into named attributes; anything else
lands in :attr:`extra`.
"""
# --- MIME type fields ---
text: str | None = None
"""``text/plain`` representation."""
html: str | None = None
"""``text/html`` representation."""
markdown: str | None = None
"""``text/markdown`` representation."""
svg: str | None = None
"""``image/svg+xml`` representation."""
png: str | None = None
"""``image/png`` — base64-encoded."""
jpeg: str | None = None
"""``image/jpeg`` — base64-encoded."""
pdf: str | None = None
"""``application/pdf`` — base64-encoded."""
latex: str | None = None
"""``text/latex`` representation."""
json: dict | None = None
"""``application/json`` representation."""
javascript: str | None = None
"""``application/javascript`` representation."""
extra: dict[str, str] | None = None
"""MIME types not covered by the named fields above."""
is_main_result: bool = False
"""``True`` when this came from an ``execute_result`` message
(i.e. the value of the last expression in the cell). ``False``
for ``display_data`` outputs."""
@classmethod
def from_bundle(
cls, bundle: dict[str, str], *, is_main_result: bool = False
) -> Result:
"""Build a ``Result`` from a Jupyter MIME bundle dict."""
kwargs: dict = {"is_main_result": is_main_result}
extra: dict[str, str] = {}
for mime, value in bundle.items():
attr = _MIME_MAP.get(mime)
if attr is not None:
kwargs[attr] = value
else:
extra[mime] = value
if extra:
kwargs["extra"] = extra
# Strip surrounding quotes from text/plain (Jupyter repr artefact)
text = kwargs.get("text")
if isinstance(text, str) and len(text) >= 2:
if (text[0] == text[-1]) and text[0] in ("'", '"'):
kwargs["text"] = text[1:-1]
return cls(**kwargs)
def formats(self) -> list[str]:
"""Return names of non-``None`` MIME-type fields."""
out: list[str] = []
for attr in (
"text",
"html",
"markdown",
"svg",
"png",
"jpeg",
"pdf",
"latex",
"json",
"javascript",
):
if getattr(self, attr) is not None:
out.append(attr)
if self.extra:
out.extend(self.extra)
return out
@dataclass
class Execution:
"""Complete result of a ``run_code`` call.
Attributes:
results: All rich outputs produced by the cell — charts, tables,
images, expression values, etc.
logs: Captured stdout/stderr text.
error: Populated when the cell raised an exception.
execution_count: Jupyter execution counter (the ``[N]`` number).
"""
results: list[Result] = field(default_factory=list)
logs: Logs = field(default_factory=Logs)
error: ExecutionError | None = None
execution_count: int | None = None
@property
def text(self) -> str | None:
"""Convenience — ``text/plain`` of the main ``execute_result``,
or ``None`` if the cell had no expression value."""
for r in self.results:
if r.is_main_result:
return r.text
return None
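
Reviewer sketch of the bundle-to-fields mapping in `from_bundle`, rebuilt on a stand-in dataclass so it runs without the `wrenn` package. `SketchResult`, `sketch_from_bundle`, and the three-entry `MIME_MAP` are hypothetical names for illustration. The real `_MIME_MAP` is defined outside this excerpt and may contain different entries.

```python
from __future__ import annotations

from dataclasses import dataclass

# Assumed subset of the MIME-to-attribute table (the real _MIME_MAP is not
# shown in this excerpt).
MIME_MAP = {
    "text/plain": "text",
    "text/html": "html",
    "image/png": "png",
}


@dataclass
class SketchResult:
    """Stand-in for Result carrying only three of the named fields."""

    text: str | None = None
    html: str | None = None
    png: str | None = None
    extra: dict[str, str] | None = None
    is_main_result: bool = False


def sketch_from_bundle(
    bundle: dict[str, str], *, is_main_result: bool = False
) -> SketchResult:
    kwargs: dict = {"is_main_result": is_main_result}
    extra: dict[str, str] = {}
    for mime, value in bundle.items():
        attr = MIME_MAP.get(mime)
        if attr is not None:
            kwargs[attr] = value  # known MIME type -> named field
        else:
            extra[mime] = value  # unknown MIME type -> extra dict
    if extra:
        kwargs["extra"] = extra
    # Same quote stripping as the diff: only when both ends carry the
    # same quote character.
    text = kwargs.get("text")
    if isinstance(text, str) and len(text) >= 2:
        if text[0] == text[-1] and text[0] in ("'", '"'):
            kwargs["text"] = text[1:-1]
    return SketchResult(**kwargs)


result = sketch_from_bundle(
    {
        "text/plain": "'hello'",
        "image/png": "base64data",
        "application/vnd.custom": "data",
    },
    is_main_result=True,
)
```

Here `result.text` is `hello` with the repr quotes removed, `result.png` holds the payload unchanged, and the unrecognised MIME type ends up under `result.extra`.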

@ -1,6 +1,6 @@
# generated by datamodel-codegen:
# filename: openapi.yaml
# timestamp: 2026-04-15T08:37:41+00:00
# timestamp: 2026-04-16T20:32:20+00:00
from __future__ import annotations
from pydantic import AwareDatetime, BaseModel, EmailStr, Field
@ -19,6 +19,13 @@ class LoginRequest(BaseModel):
password: str
class SignupResponse(BaseModel):
message: Annotated[
str | None,
Field(description="Confirmation message instructing user to check email"),
] = None
class AuthResponse(BaseModel):
token: Annotated[str | None, Field(description="JWT token (valid for 6 hours)")] = (
None
@ -534,6 +541,34 @@ class ChannelResponse(BaseModel):
] = None
class MeResponse(BaseModel):
name: str | None = None
email: EmailStr | None = None
has_password: Annotated[
bool | None,
Field(
description="Whether the user has a password set (false for OAuth-only accounts)"
),
] = None
providers: Annotated[
list[str] | None,
Field(description='List of linked OAuth provider names (e.g. ["github"])'),
] = None
class ChangePasswordRequest(BaseModel):
current_password: Annotated[
str | None, Field(description="Required when changing an existing password")
] = None
new_password: Annotated[str, Field(min_length=8)]
confirm_password: Annotated[
str | None,
Field(
description="Required when adding a password to an OAuth-only account (must match new_password)"
),
] = None
class Error2(BaseModel):
code: str | None = None
message: str | None = None

@ -4,7 +4,7 @@ import pytest
import respx
from wrenn.capsule import Capsule, _build_proxy_url
from wrenn.code_interpreter.capsule import CodeResult
from wrenn.code_interpreter.models import Execution, ExecutionError, Logs, Result
BASE = "https://app.wrenn.dev/api"
@ -120,30 +120,61 @@ class TestCapsuleConnect:
assert cap.capsule_id == "cl-1"
class TestCodeResult:
def test_defaults(self):
r = CodeResult()
assert r.text is None
assert r.data is None
assert r.stdout == ""
assert r.stderr == ""
assert r.error is None
class TestExecutionModels:
def test_execution_defaults(self):
e = Execution()
assert e.results == []
assert e.logs.stdout == []
assert e.logs.stderr == []
assert e.error is None
assert e.text is None
def test_with_values(self):
r = CodeResult(
text="84",
data={"text/plain": "84"},
stdout="",
stderr="",
error=None,
)
def test_result_from_bundle(self):
bundle = {"text/plain": "84", "image/png": "base64data"}
r = Result.from_bundle(bundle, is_main_result=True)
assert r.text == "84"
assert r.data["text/plain"] == "84"
assert r.png == "base64data"
assert r.is_main_result is True
def test_error_result(self):
r = CodeResult(error="ZeroDivisionError: division by zero\n...")
assert r.error is not None
assert "ZeroDivisionError" in r.error
def test_result_from_bundle_strips_quotes(self):
bundle = {"text/plain": "'hello'"}
r = Result.from_bundle(bundle)
assert r.text == "hello"
def test_result_from_bundle_extra_mimes(self):
bundle = {"text/plain": "x", "application/vnd.custom": "data"}
r = Result.from_bundle(bundle)
assert r.extra == {"application/vnd.custom": "data"}
def test_result_formats(self):
r = Result(text="hi", png="data")
assert "text" in r.formats()
assert "png" in r.formats()
assert "html" not in r.formats()
def test_execution_text_property(self):
e = Execution(
results=[
Result(text="chart", is_main_result=False),
Result(text="42", is_main_result=True),
]
)
assert e.text == "42"
def test_execution_error(self):
err = ExecutionError(
name="ZeroDivisionError",
value="division by zero",
traceback="Traceback ...\nZeroDivisionError: division by zero",
)
e = Execution(error=err)
assert e.error is not None
assert "ZeroDivisionError" in e.error.name
def test_logs(self):
logs = Logs(stdout=["hello\n", "world\n"], stderr=["warn\n"])
assert "".join(logs.stdout) == "hello\nworld\n"
assert "".join(logs.stderr) == "warn\n"
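
Reviewer sketch of what the tests above imply for callers: with the stdout-to-text fallback removed, a cell that only prints has `text` equal to `None`, and its output lives in `logs.stdout`. The `Sketch*` classes are hypothetical stand-ins that mirror only the fields needed here. They are not the `wrenn` models themselves.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SketchLogs:
    stdout: list[str] = field(default_factory=list)
    stderr: list[str] = field(default_factory=list)


@dataclass
class SketchResult:
    text: str | None = None
    is_main_result: bool = False


@dataclass
class SketchExecution:
    results: list[SketchResult] = field(default_factory=list)
    logs: SketchLogs = field(default_factory=SketchLogs)

    @property
    def text(self) -> str | None:
        # Same lookup as Execution.text: first main result wins.
        for r in self.results:
            if r.is_main_result:
                return r.text
        return None


# A cell such as print("hi") produces stdout but no expression value.
printed = SketchExecution(logs=SketchLogs(stdout=["hi\n"]))

# A cell ending in an expression yields a main result.
evaluated = SketchExecution(
    results=[SketchResult(text="42", is_main_result=True)]
)
```

Callers that previously read printed output from `text` would, under this model, join `logs.stdout` explicitly.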
class TestDeprecationWarnings: