feat: redesign code interpreter with structured Execution model

Replace flat CodeResult with a proper model hierarchy: Execution (top-level), Result (per-output with typed MIME fields), Logs (stdout/stderr as lists), and ExecutionError (structured name/value/traceback). Handle display_data messages for rich output, add streaming callbacks (on_result, on_stdout, on_stderr, on_error), and remove the misleading stdout-to-text fallback.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 03:13:16 +06:00
parent 7e7ecbd48a
commit 3f97c73b2f
11 changed files with 863 additions and 108 deletions
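
The hunks in this excerpt import `Execution`, `ExecutionError`, `Logs`, and `Result` from `wrenn.code_interpreter.models`, but the module itself is not shown here. As a rough orientation only, the hierarchy described in the commit message could look something like the sketch below. The MIME-to-field mapping, the `is_main_result` attribute, and the bodies of `from_bundle`, `formats`, and `text` are assumptions inferred from the README changes, not the actual implementation. Quote stripping for string results is not modelled.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

# Assumed mapping from MIME type to the named Result field (hypothetical).
_MIME_FIELDS = {
    "text/plain": "text",
    "text/html": "html",
    "text/markdown": "markdown",
    "image/svg+xml": "svg",
    "image/png": "png",
    "image/jpeg": "jpeg",
    "application/pdf": "pdf",
    "text/latex": "latex",
    "application/json": "json",
    "application/javascript": "javascript",
}


@dataclass
class Result:
    """One rich output: an expression value or a display() call."""

    text: Optional[str] = None
    html: Optional[str] = None
    markdown: Optional[str] = None
    svg: Optional[str] = None
    png: Optional[str] = None
    jpeg: Optional[str] = None
    pdf: Optional[str] = None
    latex: Optional[str] = None
    json: Any = None
    javascript: Optional[str] = None
    extra: dict[str, Any] = field(default_factory=dict)
    is_main_result: bool = False

    @classmethod
    def from_bundle(cls, bundle: dict[str, Any], is_main_result: bool = False) -> "Result":
        result = cls(is_main_result=is_main_result)
        for mime, value in bundle.items():
            name = _MIME_FIELDS.get(mime)
            if name is None:
                result.extra[mime] = value  # unknown MIME types are kept as-is
            else:
                setattr(result, name, value)
        return result

    def formats(self) -> list[str]:
        """Names of the known fields that are populated."""
        return [n for n in _MIME_FIELDS.values() if getattr(self, n) is not None]


@dataclass
class Logs:
    stdout: list[str] = field(default_factory=list)
    stderr: list[str] = field(default_factory=list)


@dataclass
class ExecutionError:
    name: str
    value: str
    traceback: str


@dataclass
class Execution:
    results: list[Result] = field(default_factory=list)
    logs: Logs = field(default_factory=Logs)
    error: Optional[ExecutionError] = None
    execution_count: Optional[int] = None

    @property
    def text(self) -> Optional[str]:
        """text/plain of the main execute_result, if there is one."""
        for r in self.results:
            if r.is_main_result:
                return r.text
        return None
```

Under these assumptions, `Execution().text` is `None` for a cell that only prints, which matches the removal of the stdout fallback described in the commit message.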
--- a/.gitignore
+++ b/.gitignore
@@ -175,3 +175,5 @@ cython_debug/
 .pypirc

 CODE_EXECUTION.md
+
+docs/
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -0,0 +1,132 @@
+## Design Context
+
+### Users
+Developers across the full spectrum — solo engineers building side projects, startup teams integrating sandboxed execution into products, and platform/infra engineers at larger organizations running production workloads on Firecracker microVMs. They arrive with context: they know what a process is, what a rootfs is, what a TTY means. The interface must feel at home for all three: approachable enough not to intimidate a hacker, precise enough to earn the trust of a production ops team. Never condescend, never oversimplify. Trust the user to understand what they're looking at.
+
+**Primary job to be done:** Understand what's running, act on it confidently, and get back to code.
+
+### Brand Personality
+**Precise. Warm. Uncompromising.**
+
+Wrenn is an engineer's favorite tool — built with visible care, not assembled from defaults. It runs real infrastructure (Firecracker microVMs), so the UI should reflect that seriousness without becoming cold or corporate. The warmth comes from the typography and color palette; the precision comes from hierarchy, density, and data fidelity.
+
+Emotional goal: **in control.** Users leave a session with full confidence in what's running, what happened, and what comes next. Nothing is hidden, nothing is ambiguous.
+
+### Aesthetic Direction
+**Dark-only (permanently), industrial-warm, data-forward.**
+
+No light mode planned. All design decisions should optimize for dark. The near-black-green background palette (`#0a0c0b` through `#2a302d`) reads as "black with intention" — not pitch black (cold) and not charcoal (dated). The sage green accent (`#5e8c58`) is muted and organic, a meaningful departure from the startup-green neon that saturates the developer tool space.
+
+**Anti-references:**
+- **Supabase**: avoid the friendly, approachable startup-green energy — too generic, too eager to please
+- **AWS / GCP consoles**: avoid utility-first density without craft — functional but joyless, visually dated
+
+**References that capture the right spirit:**
+- The precision of a well-calibrated instrument
+- Editorial typography from technical publications
+- The quiet confidence of tools that don't need to explain themselves
+
+### Type System
+Four fonts with strict roles — this is the design system's strongest personality trait and must be respected:
+
+| Font | CSS Class | Role | When to use |
+|------|-----------|------|-------------|
+| **Manrope** (variable, sans) | `font-sans` | UI workhorse | All body copy, nav, labels, buttons, form text |
+| **Instrument Serif** | `font-serif` | Display / editorial | Page titles (h1), dialog headings, metric values, hero moments |
+| **JetBrains Mono** (variable) | `font-mono` | Data / code | IDs, timestamps, key prefixes, file paths, terminal output, metrics |
+| **Alice** | brand wordmark only | Brand wordmark | "Wrenn" in sidebar and login only — nowhere else |
+
+Instrument Serif at scale creates the signature editorial moments. Mono provides the precision signal for technical data. Never swap these roles.
+
+**Tracking overrides (app.css):**
+- `.font-serif` — `letter-spacing: 0.015em` (positive tracking; Instrument Serif reads less condensed at display sizes)
+- `.font-mono` — `font-variant-numeric: tabular-nums` (numbers align in tables and metric displays)
+
+**Type scale (root: 87.5% = 14px base):**
+| Token | Value | Use |
+|---|---|---|
+| `--text-display` | 2.571rem (~36px) | Auth section headings |
+| `--text-page` | 2rem (~28px) | Page h1 titles |
+| `--text-heading` | 1.429rem (~20px) | Dialog headings, empty states |
+| `--text-body` | 1rem (~14px) | Primary body, buttons, inputs |
+| `--text-ui` | 0.929rem (~13px) | Nav labels, table cells |
+| `--text-meta` | 0.857rem (~12px) | Key prefixes, minor info |
+| `--text-label` | 0.786rem (~11px) | Uppercase section labels |
+| `--text-badge` | 0.714rem (~10px) | Live badges, tiny indicators |
+
+### Color System
+
+All values are CSS custom properties in `frontend/src/app.css`.
+
+**Backgrounds (6-step near-black-green scale):**
+| Token | Value | Use |
+|---|---|---|
+| `--color-bg-0` | `#0a0c0b` | Page base, sidebar deepest layer |
+| `--color-bg-1` | `#0f1211` | Sidebar surface |
+| `--color-bg-2` | `#141817` | Card backgrounds |
+| `--color-bg-3` | `#1a1e1c` | Table headers, elevated surfaces |
+| `--color-bg-4` | `#212624` | Hover states, inputs |
+| `--color-bg-5` | `#2a302d` | Highlighted items, selected rows |
+
+**Text (5-level hierarchy):**
+| Token | Value | Use |
+|---|---|---|
+| `--color-text-bright` | `#eae7e2` | H1s, dialog headings |
+| `--color-text-primary` | `#d0cdc6` | Body copy, primary labels |
+| `--color-text-secondary` | `#9b9790` | Secondary labels, descriptions |
+| `--color-text-tertiary` | `#6b6862` | Hints, placeholders |
+| `--color-text-muted` | `#454340` | Dividers as text, ultra-subtle |
+
+**Accent (sage green — use sparingly, must feel earned):**
+| Token | Value | Use |
+|---|---|---|
+| `--color-accent` | `#5e8c58` | Primary CTA, live indicators, focus rings, active nav |
+| `--color-accent-mid` | `#89a785` | Hover accent text |
+| `--color-accent-bright` | `#a4c89f` | Accent on dark backgrounds |
+| `--color-accent-glow` | `rgba(94,140,88,0.07)` | Subtle tinted backgrounds |
+| `--color-accent-glow-mid` | `rgba(94,140,88,0.14)` | Hover tint on accent items |
+
+**Status semantics:**
+| Token | Value | Use |
+|---|---|---|
+| `--color-amber` | `#d4a73c` | Warning, paused state |
+| `--color-red` | `#cf8172` | Error, destructive actions |
+| `--color-blue` | `#5a9fd4` | Info, neutral system states |
+
+**Borders:** `--color-border` (`#1f2321`) default; `--color-border-mid` (`#2a2f2c`) for inputs/hover.
+
+### Component Patterns
+
+**Buttons:**
+- Primary: solid sage green (`--color-accent`), hover brightness boost + micro-lift (`-translate-y-px`)
+- Secondary: bordered (`--color-border-mid`), text transitions to accent on hover
+- Danger: red text + subtle red background on hover
+- All: `transition-all duration-150`
+
+**Inputs:**
+- Border `--color-border`, background `--color-bg-2`; focus transitions border and icon to accent
+- Group focus pattern: `group` wrapper + `group-focus-within:text-[var(--color-accent)]` on icon
+
+**Tables / data lists:**
+- Grid layout; header `bg-3` + uppercase `--text-label`; row hover `hover:bg-[var(--color-bg-3)]`
+- Status stripe: left border color matches sandbox state
+
+**Status indicators:** Running = animated ping + sage green dot; Paused = amber dot; Stopped = muted gray. Color is never the sole differentiator.
+
+**Modals & dialogs:** Border + shadow only — no accent gradient bars/strips. `fadeUp` 0.35s entrance.
+
+**Empty states:** Large icon with glow, Instrument Serif heading, secondary body text, CTA below, `iconFloat` 4s animation.
+
+**Animations (always respect `prefers-reduced-motion`):** `fadeUp` (entrance), `status-ping` (live indicator), `iconFloat` (empty states), `spin-once` (refresh), staggered `animation-delay` on lists.
+
+### Design Principles
+
+1. **Precision over friendliness.** Every element earns its place. Wrenn doesn't need to tell you it's developer-friendly — that should be self-evident from the quality of the information architecture.
+
+2. **Density with breathing room.** Data-forward doesn't mean cramped. Strategic whitespace creates calm hierarchy within dense contexts. Sections breathe; rows don't waste space.
+
+3. **Industrial warmth.** The serif + mono + warm-black combination prevents sterility. This is a forge, not a gallery. The warmth is in the details, not the primary colors.
+
+4. **Legible at speed.** Users scan dashboards in seconds. Strong typographic contrast (serif h1, mono IDs, sans body), consistent patterns, and predictable placement let users orientate instantly without reading everything.
+
+5. **Craft signals trust.** For infrastructure that runs production code, the quality of the UI is a proxy for the quality of the product. Pixel-level decisions matter. Polish is not decoration — it's a trust signal.
--- a/2
+++ b/2
@@ -2,7 +2,7 @@
 .PHONY: generate lint test check test-integration

 # Variables
-SPEC_URL = "https://git.omukk.dev/wrenn/wrenn/raw/branch/dev/internal/api/openapi.yaml"
+SPEC_URL = "https://git.omukk.dev/wrenn/wrenn/raw/branch/main/internal/api/openapi.yaml"
 SPEC_PATH = "api/openapi.yaml"

 generate:
--- a/README.md
+++ b/README.md
@@ -273,7 +273,7 @@ from wrenn.code_interpreter import Capsule

 with Capsule(wait=True) as capsule:
    result = capsule.run_code("print('hello')")
-    print(result.text)  # "hello"
+    print("".join(result.logs.stdout))  # "hello\n"
 ```

 ### Stateful Execution
@@ -297,25 +297,43 @@ with Capsule(wait=True) as capsule:
    print(result.text)  # "hello world"
 ```

-The `text` field returns the expression result when available. For `print()` calls (which produce no expression result), it falls back to the stripped stdout output.
+The `text` property returns the `text/plain` value of the main `execute_result` (the last expression in the cell). Printed output goes to `result.logs.stdout` instead.
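
For callers that relied on the old fallback, the previous behaviour can be approximated on the caller side. The helper below is a hypothetical sketch (the name `legacy_text` is not part of the SDK); it takes the new `text` value and the list of stdout chunks as plain arguments and mirrors the removed logic, which used stripped stdout only when there was no expression result.

```python
from typing import Optional


def legacy_text(text: Optional[str], stdout_chunks: list[str]) -> Optional[str]:
    """Emulate the removed stdout-to-text fallback (hypothetical helper)."""
    if text is not None:
        # An expression result always wins, as before.
        return text
    joined = "".join(stdout_chunks)
    # The old code fell back to stripped stdout only when stdout was non-empty.
    return joined.strip() if joined else None
```

It would be called as `legacy_text(result.text, result.logs.stdout)`.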

 ### Error Handling in Code

 ```python
 result = capsule.run_code("1 / 0")
-print(result.error)  # "ZeroDivisionError: division by zero\n..."
+print(result.error.name)       # "ZeroDivisionError"
+print(result.error.value)      # "division by zero"
+print(result.error.traceback)  # full traceback string
 ```

 ### Rich Output

+Each call to `display()`, `plt.show()`, or similar produces a `Result` in `execution.results`. Known MIME types are unpacked into named fields:
+
 ```python
 result = capsule.run_code("""
 import matplotlib.pyplot as plt
 plt.plot([1, 2, 3])
-plt.savefig('/tmp/plot.png')
 plt.show()
 """)
-print(result.data)  # {"image/png": "base64...", "text/plain": "..."}
+for r in result.results:
+    if r.png:
+        print(f"Got PNG image ({len(r.png)} bytes base64)")
+    print(r.formats())  # e.g. ["text", "png"]
+```
+
+### Streaming Callbacks
+
+```python
+capsule.run_code(
+    code,
+    on_result=lambda r: print("result:", r.formats()),
+    on_stdout=lambda text: print("stdout:", text),
+    on_stderr=lambda text: print("stderr:", text),
+    on_error=lambda err: print(f"error: {err.name}: {err.value}"),
+)
 ```

 ### Custom Templates
@@ -327,17 +345,19 @@ capsule = Capsule(template="my-custom-jupyter-template", wait=True)
 result = capsule.run_code("print('running on custom template')")
 ```

-### CodeResult Fields
+### Execution Model
+
+`run_code()` returns an `Execution` object:

 | Field | Type | Description |
 |-------|------|-------------|
-| `text` | `str \| None` | Expression result, or stripped stdout if no expression result |
-| `data` | `dict \| None` | Rich MIME bundle (e.g. `{"image/png": "..."}`) |
-| `stdout` | `str` | Raw accumulated stdout output |
-| `stderr` | `str` | Raw accumulated stderr output |
-| `error` | `str \| None` | Error traceback string |
+| `results` | `list[Result]` | All rich outputs (charts, images, expression values) |
+| `logs` | `Logs` | `.stdout: list[str]` and `.stderr: list[str]` chunks |
+| `error` | `ExecutionError \| None` | `.name`, `.value`, `.traceback` |
+| `execution_count` | `int \| None` | Jupyter cell execution counter |
+| `text` | `str \| None` | (property) `text/plain` of the main `execute_result` |

-String expression results have quotes stripped automatically (e.g. `'hello'` becomes `hello`).
+Each `Result` has typed MIME fields: `text`, `html`, `markdown`, `svg`, `png`, `jpeg`, `pdf`, `latex`, `json`, `javascript`, plus `extra` for unknown types. String expression results have quotes stripped automatically.

 ### Code Interpreter + Commands/Files

--- a/api/openapi.yaml
+++ b/api/openapi.yaml
@@ -16,6 +16,10 @@ paths:
      summary: Create a new account
      operationId: signup
      tags: [auth]
+      description: |
+        Creates an inactive user account and sends an activation email.
+        The user must activate their account within 30 minutes.
+        Does not return a JWT — the user must activate first, then sign in.
      requestBody:
        required: true
        content:
@@ -24,11 +28,11 @@ paths:
              $ref: "#/components/schemas/SignupRequest"
      responses:
        "201":
-          description: Account created
+          description: Account created, activation email sent
          content:
            application/json:
              schema:
-                $ref: "#/components/schemas/AuthResponse"
+                $ref: "#/components/schemas/SignupResponse"
        "400":
          description: Invalid request (bad email, short password)
          content:
@@ -36,7 +40,39 @@ paths:
              schema:
                $ref: "#/components/schemas/Error"
        "409":
-          description: Email already registered
+          description: Email already registered or signup cooldown active
+          content:
+            application/json:
+              schema:
+                $ref: "#/components/schemas/Error"
+
+  /v1/auth/activate:
+    post:
+      summary: Activate account via email token
+      operationId: activate
+      tags: [auth]
+      description: |
+        Consumes the activation token sent via email and activates the user account.
+        Creates a default team and returns a JWT to log the user in.
+      requestBody:
+        required: true
+        content:
+          application/json:
+            schema:
+              type: object
+              required: [token]
+              properties:
+                token:
+                  type: string
+      responses:
+        "200":
+          description: Account activated, JWT issued
+          content:
+            application/json:
+              schema:
+                $ref: "#/components/schemas/AuthResponse"
+        "400":
+          description: Invalid or expired token
          content:
            application/json:
              schema:
@@ -175,6 +211,252 @@ paths:
        "302":
          description: Redirect to frontend with token or error

+  /v1/me:
+    get:
+      summary: Get current user profile
+      operationId: getMe
+      tags: [account]
+      security:
+        - bearerAuth: []
+      responses:
+        "200":
+          description: User profile
+          content:
+            application/json:
+              schema:
+                $ref: "#/components/schemas/MeResponse"
+
+    patch:
+      summary: Update display name
+      operationId: updateName
+      tags: [account]
+      security:
+        - bearerAuth: []
+      requestBody:
+        required: true
+        content:
+          application/json:
+            schema:
+              type: object
+              required: [name]
+              properties:
+                name:
+                  type: string
+                  minLength: 1
+                  maxLength: 100
+      responses:
+        "200":
+          description: Name updated, new JWT issued
+          content:
+            application/json:
+              schema:
+                $ref: "#/components/schemas/AuthResponse"
+        "400":
+          description: Invalid name
+          content:
+            application/json:
+              schema:
+                $ref: "#/components/schemas/Error"
+
+    delete:
+      summary: Delete current account
+      operationId: deleteAccount
+      tags: [account]
+      security:
+        - bearerAuth: []
+      description: |
+        Soft-deletes the account (sets status=deleted, deleted_at=now).
+        The account is permanently removed after 15 days. Blocked if the user
+        owns any team that has other members.
+      requestBody:
+        required: true
+        content:
+          application/json:
+            schema:
+              type: object
+              required: [confirmation]
+              properties:
+                confirmation:
+                  type: string
+                  description: Must match the user's email address (case-insensitive)
+      responses:
+        "204":
+          description: Account scheduled for deletion
+        "400":
+          description: Confirmation does not match email
+          content:
+            application/json:
+              schema:
+                $ref: "#/components/schemas/Error"
+        "409":
+          description: User owns teams with other members
+          content:
+            application/json:
+              schema:
+                $ref: "#/components/schemas/Error"
+
+  /v1/me/password:
+    post:
+      summary: Change or add password
+      operationId: changePassword
+      tags: [account]
+      security:
+        - bearerAuth: []
+      description: |
+        For users with an existing password: requires `current_password` and `new_password`.
+        For OAuth-only users adding a password: requires `new_password` and `confirm_password`.
+      requestBody:
+        required: true
+        content:
+          application/json:
+            schema:
+              $ref: "#/components/schemas/ChangePasswordRequest"
+      responses:
+        "204":
+          description: Password updated
+        "400":
+          description: Invalid request (short password, mismatch, etc.)
+          content:
+            application/json:
+              schema:
+                $ref: "#/components/schemas/Error"
+        "401":
+          description: Current password is incorrect
+          content:
+            application/json:
+              schema:
+                $ref: "#/components/schemas/Error"
+
+  /v1/me/password/reset:
+    post:
+      summary: Request a password reset email
+      operationId: requestPasswordReset
+      tags: [account]
+      description: |
+        Sends a password reset link to the given email. Always returns 204
+        regardless of whether the email exists, to prevent account enumeration.
+        The reset token expires in 15 minutes.
+      requestBody:
+        required: true
+        content:
+          application/json:
+            schema:
+              type: object
+              required: [email]
+              properties:
+                email:
+                  type: string
+                  format: email
+      responses:
+        "204":
+          description: Request accepted (email sent if account exists)
+
+  /v1/me/password/reset/confirm:
+    post:
+      summary: Confirm password reset
+      operationId: confirmPasswordReset
+      tags: [account]
+      description: |
+        Consumes a password reset token and sets a new password. The token is
+        single-use and expires after 15 minutes.
+      requestBody:
+        required: true
+        content:
+          application/json:
+            schema:
+              type: object
+              required: [token, new_password]
+              properties:
+                token:
+                  type: string
+                  description: Raw reset token from the email link
+                new_password:
+                  type: string
+                  minLength: 8
+      responses:
+        "204":
+          description: Password reset successful
+        "400":
+          description: Invalid or expired token, or password too short
+          content:
+            application/json:
+              schema:
+                $ref: "#/components/schemas/Error"
+
+  /v1/me/providers/{provider}/connect:
+    parameters:
+      - name: provider
+        in: path
+        required: true
+        schema:
+          type: string
+          enum: [github]
+        description: OAuth provider name
+
+    get:
+      summary: Initiate OAuth provider link
+      operationId: connectProvider
+      tags: [account]
+      security:
+        - bearerAuth: []
+      description: |
+        Sets OAuth state and link cookies, then returns the provider's
+        authorization URL. The frontend navigates to this URL to start the
+        OAuth flow. On callback, the provider is linked to the current account
+        (not a new registration).
+      responses:
+        "200":
+          description: Authorization URL
+          content:
+            application/json:
+              schema:
+                type: object
+                properties:
+                  auth_url:
+                    type: string
+                    format: uri
+        "404":
+          description: Provider not found or not configured
+          content:
+            application/json:
+              schema:
+                $ref: "#/components/schemas/Error"
+
+  /v1/me/providers/{provider}:
+    parameters:
+      - name: provider
+        in: path
+        required: true
+        schema:
+          type: string
+          enum: [github]
+        description: OAuth provider name
+
+    delete:
+      summary: Disconnect an OAuth provider
+      operationId: disconnectProvider
+      tags: [account]
+      security:
+        - bearerAuth: []
+      description: |
+        Unlinks the OAuth provider from the current account. Blocked if this
+        is the user's only login method (no password and no other providers).
+      responses:
+        "204":
+          description: Provider disconnected
+        "400":
+          description: Cannot disconnect last login method
+          content:
+            application/json:
+              schema:
+                $ref: "#/components/schemas/Error"
+        "404":
+          description: Provider not connected
+          content:
+            application/json:
+              schema:
+                $ref: "#/components/schemas/Error"
+
  /v1/api-keys:
    post:
      summary: Create an API key
@@ -1386,7 +1668,6 @@ paths:
        PTY data (input and output) is base64-encoded because it contains raw
        terminal bytes (escape sequences, control codes) that are not valid UTF-8.

-        Sessions have a 120-second inactivity timeout (reset on input/resize).
        Sessions persist across WebSocket disconnections — the process keeps
        running in the capsule. Use the `tag` from the "started" response to
        reconnect later.
@@ -2078,6 +2359,13 @@ components:
        password:
          type: string

+    SignupResponse:
+      type: object
+      properties:
+        message:
+          type: string
+          description: Confirmation message instructing user to check email
+
    AuthResponse:
      type: object
      properties:
@@ -2781,6 +3069,37 @@ components:
          nullable: true
          description: Webhook secret. Only returned on creation, never again.

+    MeResponse:
+      type: object
+      properties:
+        name:
+          type: string
+        email:
+          type: string
+          format: email
+        has_password:
+          type: boolean
+          description: Whether the user has a password set (false for OAuth-only accounts)
+        providers:
+          type: array
+          items:
+            type: string
+          description: List of linked OAuth provider names (e.g. ["github"])
+
+    ChangePasswordRequest:
+      type: object
+      required: [new_password]
+      properties:
+        current_password:
+          type: string
+          description: Required when changing an existing password
+        new_password:
+          type: string
+          minLength: 8
+        confirm_password:
+          type: string
+          description: Required when adding a password to an OAuth-only account (must match new_password)
+
    Error:
      type: object
      properties:
--- a/src/wrenn/code_interpreter/__init__.py
+++ b/src/wrenn/code_interpreter/__init__.py
@@ -1,10 +1,19 @@
 from wrenn.code_interpreter.async_capsule import AsyncCapsule
-from wrenn.code_interpreter.capsule import Capsule, CodeResult
+from wrenn.code_interpreter.capsule import Capsule
+from wrenn.code_interpreter.models import (
+    Execution,
+    ExecutionError,
+    Logs,
+    Result,
+)

 __all__ = [
    "AsyncCapsule",
    "Capsule",
-    "CodeResult",
+    "Execution",
+    "ExecutionError",
+    "Logs",
+    "Result",
    "Sandbox",
 ]

--- a/src/wrenn/code_interpreter/async_capsule.py
+++ b/src/wrenn/code_interpreter/async_capsule.py
@@ -4,6 +4,8 @@ import asyncio
 import json
 import time
 import uuid
+from collections.abc import Callable
+from typing import Any

 import httpx
 import httpx_ws
@@ -11,7 +13,13 @@ import httpx_ws
 from wrenn.async_capsule import AsyncCapsule as BaseAsyncCapsule
 from wrenn.capsule import _build_proxy_url
 from wrenn.client import AsyncWrennClient
-from wrenn.code_interpreter.capsule import CodeResult, DEFAULT_TEMPLATE
+from wrenn.code_interpreter.capsule import DEFAULT_TEMPLATE
+from wrenn.code_interpreter.models import (
+    Execution,
+    ExecutionError,
+    Logs,
+    Result,
+)


 class AsyncCapsule(BaseAsyncCapsule):
@@ -151,15 +159,36 @@ class AsyncCapsule(BaseAsyncCapsule):
        language: str = "python",
        timeout: float = 30,
        jupyter_timeout: float = 30,
-    ) -> CodeResult:
-        """Execute code in a persistent Jupyter kernel (async)."""
+        on_result: Callable[[Result], Any] | None = None,
+        on_stdout: Callable[[str], Any] | None = None,
+        on_stderr: Callable[[str], Any] | None = None,
+        on_error: Callable[[ExecutionError], Any] | None = None,
+    ) -> Execution:
+        """Execute code in a persistent Jupyter kernel (async).
+
+        Args:
+            code: Code string to execute.
+            language: Execution backend language. Currently only ``"python"``.
+            timeout: Maximum seconds to wait for execution to complete.
+            jupyter_timeout: Maximum seconds to wait for Jupyter to become
+                available.
+            on_result: Called for each rich output (charts, images, expression
+                values).
+            on_stdout: Called for each stdout chunk.
+            on_stderr: Called for each stderr chunk.
+            on_error: Called when the cell raises an exception.
+
+        Returns:
+            An :class:`Execution` with ``.results``, ``.logs``, ``.error``,
+            and a convenience ``.text`` property.
+        """
        kernel_id = await self._ensure_kernel(jupyter_timeout=jupyter_timeout)
        ws_url = self._jupyter_ws_url(kernel_id)

        msg = self._jupyter_execute_request(code)
        msg_id = msg["msg_id"]

-        result = CodeResult()
+        execution = Execution()
        deadline = time.monotonic() + timeout
        headers = {"X-API-Key": self._client._api_key}

@@ -186,31 +215,43 @@ class AsyncCapsule(BaseAsyncCapsule):
                content = data.get("content", {})

                if msg_type == "stream":
+                    text = content.get("text", "")
                    name = content.get("name", "stdout")
                    if name == "stderr":
-                        result.stderr += content.get("text", "")
+                        execution.logs.stderr.append(text)
+                        if on_stderr is not None:
+                            on_stderr(text)
                    else:
-                        result.stdout += content.get("text", "")
-                elif msg_type == "execute_result":
+                        execution.logs.stdout.append(text)
+                        if on_stdout is not None:
+                            on_stdout(text)
+                elif msg_type in ("execute_result", "display_data"):
                    bundle = content.get("data", {})
-                    text = bundle.get("text/plain")
-                    if text and (
-                        (text.startswith("'") and text.endswith("'"))
-                        or (text.startswith('"') and text.endswith('"'))
-                    ):
-                        text = text[1:-1]
-                    result.text = text
-                    result.data = bundle
+                    is_main = msg_type == "execute_result"
+                    result = Result.from_bundle(bundle, is_main_result=is_main)
+                    execution.results.append(result)
+                    if is_main:
+                        execution.execution_count = content.get(
+                            "execution_count"
+                        )
+                    if on_result is not None:
+                        on_result(result)
                elif msg_type == "error":
-                    traceback = content.get("traceback", [])
-                    result.error = "\n".join(traceback)
-                elif msg_type == "status" and content.get("execution_state") == "idle":
+                    err = ExecutionError(
+                        name=content.get("ename", ""),
+                        value=content.get("evalue", ""),
+                        traceback="\n".join(content.get("traceback", [])),
+                    )
+                    execution.error = err
+                    if on_error is not None:
+                        on_error(err)
+                elif (
+                    msg_type == "status"
+                    and content.get("execution_state") == "idle"
+                ):
                    break

-        if result.text is None and result.stdout:
-            result.text = result.stdout.strip()
-
-        return result
+        return execution

    async def __aexit__(self, *args) -> None:
        if self._proxy_client is not None:
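
The message handling changed in the hunk above can be read in isolation as a fold over kernel messages. The sketch below restates that branch logic as a standalone function so it can be exercised without a capsule or WebSocket. It is a simplification under stated assumptions: messages are passed as `(msg_type, content)` tuples, results and errors are plain dicts rather than the SDK's `Result` and `ExecutionError` classes, and the name `fold_messages` is hypothetical.

```python
from typing import Any, Callable, Optional

# Simplified stand-in for a decoded kernel message: (msg_type, content).
Message = tuple[str, dict[str, Any]]


def fold_messages(
    messages: list[Message],
    on_result: Optional[Callable[[dict], Any]] = None,
    on_stdout: Optional[Callable[[str], Any]] = None,
    on_stderr: Optional[Callable[[str], Any]] = None,
    on_error: Optional[Callable[[dict], Any]] = None,
) -> dict[str, Any]:
    """Accumulate messages into an execution record, firing callbacks as they arrive."""
    execution: dict[str, Any] = {
        "stdout": [], "stderr": [], "results": [],
        "error": None, "execution_count": None,
    }
    for msg_type, content in messages:
        if msg_type == "stream":
            text = content.get("text", "")
            if content.get("name", "stdout") == "stderr":
                execution["stderr"].append(text)
                if on_stderr is not None:
                    on_stderr(text)
            else:
                execution["stdout"].append(text)
                if on_stdout is not None:
                    on_stdout(text)
        elif msg_type in ("execute_result", "display_data"):
            # Only execute_result is the "main" result and carries the counter.
            is_main = msg_type == "execute_result"
            result = {"bundle": content.get("data", {}), "is_main_result": is_main}
            execution["results"].append(result)
            if is_main:
                execution["execution_count"] = content.get("execution_count")
            if on_result is not None:
                on_result(result)
        elif msg_type == "error":
            err = {
                "name": content.get("ename", ""),
                "value": content.get("evalue", ""),
                "traceback": "\n".join(content.get("traceback", [])),
            }
            execution["error"] = err
            if on_error is not None:
                on_error(err)
        elif msg_type == "status" and content.get("execution_state") == "idle":
            break  # the kernel reports idle once the cell has finished
    return execution
```

Because stdout and stderr are appended chunk by chunk and each callback fires at append time, a caller can stream output while the cell runs and still receive the full lists at the end.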
--- a/src/wrenn/code_interpreter/capsule.py
+++ b/src/wrenn/code_interpreter/capsule.py
@@ -3,37 +3,24 @@ from __future__ import annotations
 import json
 import time
 import uuid
-from dataclasses import dataclass
+from collections.abc import Callable
+from typing import Any

 import httpx
 import httpx_ws

 from wrenn.capsule import Capsule as BaseCapsule
 from wrenn.capsule import _build_proxy_url
-
+from wrenn.code_interpreter.models import (
+    Execution,
+    ExecutionError,
+    Logs,
+    Result,
+)

 DEFAULT_TEMPLATE = "code-runner-beta"


-@dataclass
-class CodeResult:
-    """Result from stateful code execution.
-
-    Attributes:
-        text: text/plain representation of the result.
-        data: rich MIME bundle (e.g. ``{"image/png": "..."}``).
-        stdout: accumulated stdout output.
-        stderr: accumulated stderr output.
-        error: language-specific error/traceback string.
-    """
-
-    text: str | None = None
-    data: dict[str, str] | None = None
-    stdout: str = ""
-    stderr: str = ""
-    error: str | None = None
-
-
 class Capsule(BaseCapsule):
    """Code interpreter capsule with ``run_code`` support.

@@ -43,7 +30,7 @@ class Capsule(BaseCapsule):

        capsule = Capsule()
        result = capsule.run_code("print('hello')")
-        print(result.stdout)  # "hello\\n"
+        print(result.logs.stdout)  # ["hello\\n"]
    """

    _kernel_id: str | None
@@ -184,7 +171,11 @@ class Capsule(BaseCapsule):
        language: str = "python",
        timeout: float = 30,
        jupyter_timeout: float = 30,
-    ) -> CodeResult:
+        on_result: Callable[[Result], Any] | None = None,
+        on_stdout: Callable[[str], Any] | None = None,
+        on_stderr: Callable[[str], Any] | None = None,
+        on_error: Callable[[ExecutionError], Any] | None = None,
+    ) -> Execution:
        """Execute code in a persistent Jupyter kernel.

        Variables, imports, and function definitions survive across calls.
@ -193,10 +184,17 @@ class Capsule(BaseCapsule):
            code: Code string to execute.
            language: Execution backend language. Currently only ``"python"``.
            timeout: Maximum seconds to wait for execution to complete.
-            jupyter_timeout: Maximum seconds to wait for Jupyter to become available.
+            jupyter_timeout: Maximum seconds to wait for Jupyter to become
+                available.
+            on_result: Called for each rich output (charts, images, expression
+                values).
+            on_stdout: Called for each stdout chunk.
+            on_stderr: Called for each stderr chunk.
+            on_error: Called when the cell raises an exception.

        Returns:
-            A ``CodeResult`` with ``.text``, ``.data``, ``.stdout``, ``.stderr``, ``.error``.
+            An :class:`Execution` with ``.results``, ``.logs``, ``.error``,
+            and a convenience ``.text`` property.
        """
        kernel_id = self._ensure_kernel(jupyter_timeout=jupyter_timeout)
        ws_url = self._jupyter_ws_url(kernel_id)
@ -204,7 +202,7 @@ class Capsule(BaseCapsule):
        msg = self._jupyter_execute_request(code)
        msg_id = msg["msg_id"]

-        result = CodeResult()
+        execution = Execution()
        deadline = time.monotonic() + timeout
        headers = {"X-API-Key": self._client._api_key}

@ -229,31 +227,43 @@ class Capsule(BaseCapsule):
                content = data.get("content", {})

                if msg_type == "stream":
+                    text = content.get("text", "")
                    name = content.get("name", "stdout")
                    if name == "stderr":
-                        result.stderr += content.get("text", "")
+                        execution.logs.stderr.append(text)
+                        if on_stderr is not None:
+                            on_stderr(text)
                    else:
-                        result.stdout += content.get("text", "")
-                elif msg_type == "execute_result":
+                        execution.logs.stdout.append(text)
+                        if on_stdout is not None:
+                            on_stdout(text)
+                elif msg_type in ("execute_result", "display_data"):
                    bundle = content.get("data", {})
-                    text = bundle.get("text/plain")
-                    if text and (
-                        (text.startswith("'") and text.endswith("'"))
-                        or (text.startswith('"') and text.endswith('"'))
-                    ):
-                        text = text[1:-1]
-                    result.text = text
-                    result.data = bundle
+                    is_main = msg_type == "execute_result"
+                    result = Result.from_bundle(bundle, is_main_result=is_main)
+                    execution.results.append(result)
+                    if is_main:
+                        execution.execution_count = content.get(
+                            "execution_count"
+                        )
+                    if on_result is not None:
+                        on_result(result)
                elif msg_type == "error":
-                    traceback = content.get("traceback", [])
-                    result.error = "\n".join(traceback)
-                elif msg_type == "status" and content.get("execution_state") == "idle":
+                    err = ExecutionError(
+                        name=content.get("ename", ""),
+                        value=content.get("evalue", ""),
+                        traceback="\n".join(content.get("traceback", [])),
+                    )
+                    execution.error = err
+                    if on_error is not None:
+                        on_error(err)
+                elif (
+                    msg_type == "status"
+                    and content.get("execution_state") == "idle"
+                ):
                    break

-        if result.text is None and result.stdout:
-            result.text = result.stdout.strip()
-
-        return result
+        return execution

    def __exit__(self, *args) -> None:
        if self._proxy_client is not None:
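
Reviewer sketch (not part of the commit): a minimal, self-contained rendering of the dispatch logic in the `run_code` hunk above. It uses plain dicts instead of the `Execution` models and a hand-written message list instead of the kernel WebSocket. The `dispatch` function, the top-level `msg_type` key, and the sample messages are illustrative assumptions, not SDK API.

```python
from __future__ import annotations

from collections.abc import Callable
from typing import Any


def dispatch(
    messages: list[dict[str, Any]],
    on_stdout: Callable[[str], Any] | None = None,
    on_stderr: Callable[[str], Any] | None = None,
) -> dict[str, Any]:
    """Hypothetical stand-in for the receive loop in run_code."""
    out: dict[str, Any] = {"stdout": [], "stderr": [], "results": [], "error": None}
    for msg in messages:
        msg_type = msg.get("msg_type")
        content = msg.get("content", {})
        if msg_type == "stream":
            text = content.get("text", "")
            # Chunks are appended as they arrive, then forwarded to the callback.
            if content.get("name", "stdout") == "stderr":
                out["stderr"].append(text)
                if on_stderr is not None:
                    on_stderr(text)
            else:
                out["stdout"].append(text)
                if on_stdout is not None:
                    on_stdout(text)
        elif msg_type in ("execute_result", "display_data"):
            # Store (is_main_result, MIME bundle) pairs in arrival order.
            is_main = msg_type == "execute_result"
            out["results"].append((is_main, content.get("data", {})))
        elif msg_type == "error":
            out["error"] = {
                "name": content.get("ename", ""),
                "value": content.get("evalue", ""),
                "traceback": "\n".join(content.get("traceback", [])),
            }
        elif msg_type == "status" and content.get("execution_state") == "idle":
            # An idle status ends the loop; later messages are not read.
            break
    return out


# Hand-written sample traffic (assumed shape, for illustration only).
sample = [
    {"msg_type": "stream", "content": {"name": "stdout", "text": "hello\n"}},
    {"msg_type": "display_data", "content": {"data": {"image/png": "<base64>"}}},
    {"msg_type": "execute_result", "content": {"data": {"text/plain": "42"}}},
    {"msg_type": "status", "content": {"execution_state": "idle"}},
    {"msg_type": "stream", "content": {"name": "stdout", "text": "ignored\n"}},
]
seen: list[str] = []
summary = dispatch(sample, on_stdout=seen.append)
```

In this sketch the callback sees each chunk after it has been appended, so a callback that raises would still leave the chunk recorded. Whether that ordering is the intended contract for the real `run_code` is a question for the author.
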
--- a/src/wrenn/code_interpreter/models.py
+++ b/src/wrenn/code_interpreter/models.py
@ -0,0 +1,156 @@
+from __future__ import annotations
+
+from dataclasses import dataclass, field
+
+_MIME_MAP: dict[str, str] = {
+    "text/plain": "text",
+    "text/html": "html",
+    "text/markdown": "markdown",
+    "image/svg+xml": "svg",
+    "image/png": "png",
+    "image/jpeg": "jpeg",
+    "application/pdf": "pdf",
+    "text/latex": "latex",
+    "application/json": "json",
+    "application/javascript": "javascript",
+}
+
+
+@dataclass
+class ExecutionError:
+    """Error raised during code execution.
+
+    Attributes:
+        name: Exception class name (e.g. ``"NameError"``).
+        value: Exception message.
+        traceback: Full traceback string.
+    """
+
+    name: str = ""
+    value: str = ""
+    traceback: str = ""
+
+
+@dataclass
+class Logs:
+    """Captured stdout/stderr streams.
+
+    Each element in the list is one chunk of text as it arrived from
+    the kernel.
+    """
+
+    stdout: list[str] = field(default_factory=list)
+    stderr: list[str] = field(default_factory=list)
+
+
+@dataclass
+class Result:
+    """A single rich output from code execution.
+
+    A Jupyter cell can produce multiple outputs: normally at most one
+    ``execute_result`` (the value of the last expression) and zero or
+    more ``display_data`` messages (from ``plt.show()``, ``display()``,
+    etc.).  Each becomes a ``Result``.
+
+    Known MIME types are unpacked into named attributes; anything else
+    lands in :attr:`extra`.
+    """
+
+    # --- MIME type fields ---
+    text: str | None = None
+    """``text/plain`` representation."""
+    html: str | None = None
+    """``text/html`` representation."""
+    markdown: str | None = None
+    """``text/markdown`` representation."""
+    svg: str | None = None
+    """``image/svg+xml`` representation."""
+    png: str | None = None
+    """``image/png`` — base64-encoded."""
+    jpeg: str | None = None
+    """``image/jpeg`` — base64-encoded."""
+    pdf: str | None = None
+    """``application/pdf`` — base64-encoded."""
+    latex: str | None = None
+    """``text/latex`` representation."""
+    json: dict | None = None
+    """``application/json`` representation."""
+    javascript: str | None = None
+    """``application/javascript`` representation."""
+    extra: dict[str, str] | None = None
+    """MIME types not covered by the named fields above."""
+
+    is_main_result: bool = False
+    """``True`` when this came from an ``execute_result`` message
+    (i.e. the value of the last expression in the cell).  ``False``
+    for ``display_data`` outputs."""
+
+    @classmethod
+    def from_bundle(
+        cls, bundle: dict[str, str], *, is_main_result: bool = False
+    ) -> Result:
+        """Build a ``Result`` from a Jupyter MIME bundle dict."""
+        kwargs: dict = {"is_main_result": is_main_result}
+        extra: dict[str, str] = {}
+        for mime, value in bundle.items():
+            attr = _MIME_MAP.get(mime)
+            if attr is not None:
+                kwargs[attr] = value
+            else:
+                extra[mime] = value
+        if extra:
+            kwargs["extra"] = extra
+        # Strip surrounding quotes from text/plain (Jupyter repr artefact)
+        text = kwargs.get("text")
+        if isinstance(text, str) and len(text) >= 2:
+            if (text[0] == text[-1]) and text[0] in ("'", '"'):
+                kwargs["text"] = text[1:-1]
+        return cls(**kwargs)
+
+    def formats(self) -> list[str]:
+        """Return populated field names, followed by any ``extra`` MIME types."""
+        out: list[str] = []
+        for attr in (
+            "text",
+            "html",
+            "markdown",
+            "svg",
+            "png",
+            "jpeg",
+            "pdf",
+            "latex",
+            "json",
+            "javascript",
+        ):
+            if getattr(self, attr) is not None:
+                out.append(attr)
+        if self.extra:
+            out.extend(self.extra)
+        return out
+
+
+@dataclass
+class Execution:
+    """Complete result of a ``run_code`` call.
+
+    Attributes:
+        results: All rich outputs produced by the cell — charts, tables,
+            images, expression values, etc.
+        logs: Captured stdout/stderr text.
+        error: Populated when the cell raised an exception.
+        execution_count: Jupyter ``[N]`` counter; set only from ``execute_result``.
+    """
+
+    results: list[Result] = field(default_factory=list)
+    logs: Logs = field(default_factory=Logs)
+    error: ExecutionError | None = None
+    execution_count: int | None = None
+
+    @property
+    def text(self) -> str | None:
+        """Convenience — ``text/plain`` of the main ``execute_result``,
+        or ``None`` if the cell had no expression value."""
+        for r in self.results:
+            if r.is_main_result:
+                return r.text
+        return None
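
Reviewer sketch (not part of the commit): the quote-stripping step in `Result.from_bundle` is a textual trim, not an inverse of `repr`. The standalone copy below, under the hypothetical name `strip_repr_quotes`, shows what that rule does with a few inputs, including a string whose repr contains an escape sequence.

```python
def strip_repr_quotes(text: str) -> str:
    """Standalone copy of the trimming rule used in Result.from_bundle."""
    if len(text) >= 2 and text[0] == text[-1] and text[0] in ("'", '"'):
        return text[1:-1]
    return text


plain = strip_repr_quotes("'hello'")        # quotes removed
number = strip_repr_quotes("84")            # no surrounding quotes, unchanged
lone_quote = strip_repr_quotes("'")         # too short to trim, unchanged
escaped = strip_repr_quotes(repr("a\nb"))   # backslash-n stays as two characters

print(plain, number, lone_quote, escaped)
```

The last case is the one worth noting: the trimmed text holds a literal backslash followed by `n`, not a newline, so `Execution.text` for a string value is the repr minus its quotes rather than the original string.
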
--- a/src/wrenn/models/_generated.py
+++ b/src/wrenn/models/_generated.py
@ -1,6 +1,6 @@
 # generated by datamodel-codegen:
 #   filename:  openapi.yaml
-#   timestamp: 2026-04-15T08:37:41+00:00
+#   timestamp: 2026-04-16T20:32:20+00:00

 from __future__ import annotations
 from pydantic import AwareDatetime, BaseModel, EmailStr, Field
@ -19,6 +19,13 @@ class LoginRequest(BaseModel):
    password: str


+class SignupResponse(BaseModel):
+    message: Annotated[
+        str | None,
+        Field(description="Confirmation message instructing user to check email"),
+    ] = None
+
+
 class AuthResponse(BaseModel):
    token: Annotated[str | None, Field(description="JWT token (valid for 6 hours)")] = (
        None
@ -534,6 +541,34 @@ class ChannelResponse(BaseModel):
    ] = None


+class MeResponse(BaseModel):
+    name: str | None = None
+    email: EmailStr | None = None
+    has_password: Annotated[
+        bool | None,
+        Field(
+            description="Whether the user has a password set (false for OAuth-only accounts)"
+        ),
+    ] = None
+    providers: Annotated[
+        list[str] | None,
+        Field(description='List of linked OAuth provider names (e.g. ["github"])'),
+    ] = None
+
+
+class ChangePasswordRequest(BaseModel):
+    current_password: Annotated[
+        str | None, Field(description="Required when changing an existing password")
+    ] = None
+    new_password: Annotated[str, Field(min_length=8)]
+    confirm_password: Annotated[
+        str | None,
+        Field(
+            description="Required when adding a password to an OAuth-only account (must match new_password)"
+        ),
+    ] = None
+
+
 class Error2(BaseModel):
    code: str | None = None
    message: str | None = None
--- a/tests/test_capsule_features.py
+++ b/tests/test_capsule_features.py
@ -4,7 +4,7 @@ import pytest
 import respx

 from wrenn.capsule import Capsule, _build_proxy_url
-from wrenn.code_interpreter.capsule import CodeResult
+from wrenn.code_interpreter.models import Execution, ExecutionError, Logs, Result

 BASE = "https://app.wrenn.dev/api"

@ -120,30 +120,61 @@ class TestCapsuleConnect:
        assert cap.capsule_id == "cl-1"


-class TestCodeResult:
-    def test_defaults(self):
-        r = CodeResult()
-        assert r.text is None
-        assert r.data is None
-        assert r.stdout == ""
-        assert r.stderr == ""
-        assert r.error is None
+class TestExecutionModels:
+    def test_execution_defaults(self):
+        e = Execution()
+        assert e.results == []
+        assert e.logs.stdout == []
+        assert e.logs.stderr == []
+        assert e.error is None
+        assert e.text is None

-    def test_with_values(self):
-        r = CodeResult(
-            text="84",
-            data={"text/plain": "84"},
-            stdout="",
-            stderr="",
-            error=None,
-        )
+    def test_result_from_bundle(self):
+        bundle = {"text/plain": "84", "image/png": "base64data"}
+        r = Result.from_bundle(bundle, is_main_result=True)
        assert r.text == "84"
-        assert r.data["text/plain"] == "84"
+        assert r.png == "base64data"
+        assert r.is_main_result is True

-    def test_error_result(self):
-        r = CodeResult(error="ZeroDivisionError: division by zero\n...")
-        assert r.error is not None
-        assert "ZeroDivisionError" in r.error
+    def test_result_from_bundle_strips_quotes(self):
+        bundle = {"text/plain": "'hello'"}
+        r = Result.from_bundle(bundle)
+        assert r.text == "hello"
+
+    def test_result_from_bundle_extra_mimes(self):
+        bundle = {"text/plain": "x", "application/vnd.custom": "data"}
+        r = Result.from_bundle(bundle)
+        assert r.extra == {"application/vnd.custom": "data"}
+
+    def test_result_formats(self):
+        r = Result(text="hi", png="data")
+        assert "text" in r.formats()
+        assert "png" in r.formats()
+        assert "html" not in r.formats()
+
+    def test_execution_text_property(self):
+        e = Execution(
+            results=[
+                Result(text="chart", is_main_result=False),
+                Result(text="42", is_main_result=True),
+            ]
+        )
+        assert e.text == "42"
+
+    def test_execution_error(self):
+        err = ExecutionError(
+            name="ZeroDivisionError",
+            value="division by zero",
+            traceback="Traceback ...\nZeroDivisionError: division by zero",
+        )
+        e = Execution(error=err)
+        assert e.error is not None
+        assert "ZeroDivisionError" in e.error.name
+
+    def test_logs(self):
+        logs = Logs(stdout=["hello\n", "world\n"], stderr=["warn\n"])
+        assert "".join(logs.stdout) == "hello\nworld\n"
+        assert "".join(logs.stderr) == "warn\n"


 class TestDeprecationWarnings: