feat: add sandbox filesystem and terminal support

Add sandbox filesystem methods (list_dir, mkdir, remove, upload,
download, stream_upload, stream_download) and interactive PTY sessions
(PtySession, AsyncPtySession) with reconnect support per
FILE_TERMINAL.md spec. Refactor error handling into exceptions.py as
shared handle_response(). Replace API-key-only proxy auth with unified
_proxy_headers() supporting both API key and JWT. Fix stream_upload to
build multipart manually instead of relying on httpx files= with
generators. Switch Makefile SPEC_URL from main to dev branch. Regenerate
models from updated OpenAPI spec (adds teams, channels, metrics, PTY
endpoints). Add comprehensive unit and integration tests. Trim AGENTS.md
to verified facts only.
This commit is contained in:
Tasnim Kabir Sadik
2026-04-12 02:35:20 +06:00
parent f51a962fff
commit a5bf66c199
13 changed files with 3180 additions and 445 deletions

272
AGENTS.md
View File

@ -1,252 +1,80 @@
# AGENTS.md
This file provides strict guidance to AI coding agents and assistants when modifying code in the `wrenn-python-sdk` repository. Read this entirely before writing or refactoring any code.
## What this repo is
## Project Overview
Python SDK for **Wrenn** (microVM code execution platform). Communicates with the Control Plane via REST + WebSockets only — no gRPC. The `envd` and `HostAgentService` are internal to the Go backend and never reachable from this SDK.
This is the official Python SDK for **Wrenn**, a microVM-based code execution platform. The SDK provides developers and AI agents with a clean, typed interface to interact with the Wrenn Control Plane over REST and WebSockets.
## Build & dev commands
**Important:** The SDK communicates exclusively with the Control Plane over HTTP/HTTPS and WebSockets. It does **not** generate or use gRPC stubs. The `envd` guest agent and `HostAgentService` are internal RPCs between the control plane and host agents — they are never reachable from the SDK. All data-plane operations (exec, file I/O) are proxied through the control plane's REST/WS endpoints.
## Repository Architecture & Structure
This is a modern Python package managed entirely by `uv`. It uses a flattened `src/` layout.
```text
.
├── LICENSE
├── Makefile # Central command runner
├── pyproject.toml # uv dependency and build config
├── uv.lock # Exact dependency resolution
├── internal/
│ └── api/
│ └── openapi.yaml # Cached OpenAPI spec from the Go backend
├── src/
│ └── wrenn/ # The actual importable Python package
│ ├── __init__.py # Version + top-level re-exports
│ ├── client.py # WrennClient & AsyncWrennClient (httpx transport)
│ ├── sandbox.py # Sandbox class (exec, files, context manager)
│ ├── exceptions.py # Typed exception hierarchy
│ ├── py.typed # PEP 561 marker
│ └── models/
│ ├── __init__.py # Public re-exports via __all__
│ └── _generated.py # DO NOT EDIT — generated by datamodel-codegen
└── tests/ # Pytest suite
```
## Build & Development Commands
Never use raw `pip`, `venv`, or `python -m venv`. **All dependency management and script execution goes through `uv` and the `Makefile`.**
All commands go through `uv` and the `Makefile`. Never use raw `pip`, `venv`, or `python -m venv`.
```bash
make generate # Fetches openapi.yaml and runs datamodel-codegen → models/_generated.py
make lint # Runs ruff check and ruff format
make test # Runs pytest
make check # Runs lint + test
make generate # Fetch openapi.yaml → src/wrenn/models/_generated.py
make lint # ruff check + ruff format --check on src/
make test # runs ONLY tests/test_client.py
make test-integration # runs ALL tests (unit + integration, needs live server)
make check # lint + test (test_client.py only)
```
There is no `make proto`. The SDK does not generate gRPC stubs — the `envd` and `HostAgentService` protos are internal to the Go backend.
To run all unit tests (not just test_client.py):
## Dependency Management (`uv`)
- **Adding a runtime dependency:** `uv add <package>` (e.g., `uv add httpx pydantic`)
- **Adding a dev dependency:** `uv add --dev <package>` (e.g., `uv add --dev pytest ruff`)
- **Running isolated scripts:** Use `uv run <command>`. `uv` implicitly manages the `.venv`; do not try to manually activate it in automation scripts.
## Code Generation Invariants (CRITICAL)
The data models for this SDK are generated directly from the Go backend's OpenAPI contract (`internal/api/openapi.yaml`).
1. **Never manually edit `src/wrenn/models/_generated.py`.** Any custom logic placed here will be destroyed on the next `make generate`.
2. If the Go API contract changes, run `make generate`.
3. **Export routing:** The `_generated.py` file is large. Users must never import from it directly. All user-facing models must be explicitly re-exported in `src/wrenn/models/__init__.py` using the `__all__` dunder list.
4. **Extending models:** If a generated Pydantic model needs custom Python methods, subclass it in a new file (e.g., `src/wrenn/sandbox.py` extends the generated `Sandbox` model) and export the subclass.
## Authentication
The SDK supports two authentication mechanisms, set via the `WrennClient` constructor:
1. **API Key (primary):** Pass `api_key="wrn_..."` to the constructor. Sent as `X-API-Key` header. Format: `wrn_` + 32 hex chars. Used for programmatic/agent access.
2. **JWT (secondary):** Pass `token="<jwt>"` to the constructor. Sent as `Authorization: Bearer <jwt>` header. Used for user-facing tooling. Tokens expire after 6 hours.
Host tokens (`X-Host-Token`) are for the host agent binary only and are **not** exposed in the SDK.
```python
client = WrennClient(api_key="wrn_ab12cd34...") # typical usage
client = WrennClient(token="eyJhbGci...") # alternative
```bash
uv run pytest tests/test_client.py tests/test_sandbox_features.py tests/test_filesystem_pty.py -v
```
## Core SDK Design Patterns
To run a single test:
### 1. Sync and Async Parity
The SDK must natively support both synchronous and asynchronous workflows.
- Core logic lives in `WrennClient` and `AsyncWrennClient` inside `client.py`.
- Under the hood, rely on `httpx.Client` and `httpx.AsyncClient`.
- Resource namespaces are injected via constructor.
### 2. Resource Namespaces
The client exposes resources as plural namespaces matching the API path convention:
```python
client = WrennClient(api_key="wrn_...")
client.sandboxes.create(template="base-python")
client.sandboxes.list()
client.snapshots.create(sandbox_id="cl-...")
client.api_keys.create(name="my-key")
client.hosts.list()
client.teams.list()
client.audit.list(limit=50)
client.builds.list() # admin-only
```bash
uv run pytest tests/test_client.py::TestAuth::test_signup -v
```
### 3. The Sandbox Class
## Code generation (CRITICAL)
The `Sandbox` object is the primary developer-facing interface. It wraps the generated `Sandbox` model with lifecycle and data-plane methods:
Models in `src/wrenn/models/_generated.py` are generated by `datamodel-codegen` from `api/openapi.yaml`.
```python
with client.sandboxes.create("base-python") as sb:
sb.wait_ready(timeout=30)
1. **Never edit `_generated.py`** — overwritten on next `make generate`.
2. All user-facing models must be re-exported in `src/wrenn/models/__init__.py` via `__all__`.
3. To extend a generated model with custom methods, subclass it (e.g. `Sandbox` in `sandbox.py` subclasses the generated `SandboxModel`).
result = sb.exec("echo hello")
print(result.stdout) # "hello\n"
print(result.exit_code) # 0
## Dependency management
sb.upload("/app/main.py", b"print('hello')")
data = sb.download("/app/main.py")
sb.ping()
sb.pause()
sb.resume()
# Exiting the block automatically calls sb.destroy()
```bash
uv add <package> # runtime dep
uv add --dev <package> # dev dep
uv run <command> # run in managed .venv
```
**Key methods:**
## Implemented resource namespaces
| Method | Endpoint | Description |
|--------|----------|-------------|
| `sb.exec(cmd)` | `POST /v1/sandboxes/{id}/exec` | Synchronous exec. Returns `ExecResult` with `stdout`, `stderr`, `exit_code`, `duration_ms`. |
| `sb.exec_stream(cmd)` | `WS GET /v1/sandboxes/{id}/exec/stream` | Streaming exec via WebSocket. Returns an `Iterator[StreamEvent]` yielding `start`, `stdout`, `stderr`, `exit`, `error` events. |
| `sb.upload(path, data)` | `POST /v1/sandboxes/{id}/files/write` | Upload a small file (multipart form-data). |
| `sb.download(path)` | `POST /v1/sandboxes/{id}/files/read` | Download a small file. Returns bytes. |
| `sb.stream_upload(path, stream)` | `POST /v1/sandboxes/{id}/files/stream/write` | Streaming multipart upload for large files. No in-memory buffering. |
| `sb.stream_download(path)` | `POST /v1/sandboxes/{id}/files/stream/read` | Streaming chunked download for large files. Returns `Iterator[bytes]`. |
| `sb.wait_ready(timeout=30)` | Polls `GET /v1/sandboxes/{id}` | Blocks until status is `running`. Raises `TimeoutError` on expiry. |
| `sb.ping()` | `POST /v1/sandboxes/{id}/ping` | Resets inactivity timer. |
| `sb.pause()` | `POST /v1/sandboxes/{id}/pause` | Snapshots and releases resources. |
| `sb.resume()` | `POST /v1/sandboxes/{id}/resume` | Restores from snapshot. |
| `sb.destroy()` | `DELETE /v1/sandboxes/{id}` | Tears down the sandbox. Called automatically by context manager. |
| `sb.metrics(range="10m")` | `GET /v1/sandboxes/{id}/metrics` | Returns CPU, memory, disk time-series. |
| `sb.run_code(code, language="python")` | Jupyter kernel via proxy WS | Stateful code execution in any language with a Jupyter kernel. Variables persist across calls. Returns `CodeResult` with `.text`, `.stdout`, `.stderr`, `.error`, `.data`. See `CODE_EXECUTION.md`. |
Only these are currently implemented in `client.py`:
### 4. Context Managers
- **`client.auth`** — `signup`, `login`
- **`client.api_keys`** — `create`, `list`, `delete`
- **`client.sandboxes`** — `create`, `list`, `get`, `destroy`
- **`client.snapshots`** — `create`, `list`, `delete`
- **`client.hosts`** — `create`, `list`, `get`, `delete`, `regenerate_token`, `list_tags`, `add_tag`, `remove_tag`
Sandboxes are ephemeral. The SDK must use context managers (`with` and `async with`) to guarantee cleanup:
Both sync and async variants exist for every resource.
```python
with client.sandboxes.create("base-python") as sb:
sb.wait_ready(timeout=30)
result = sb.exec("python -c 'print(42)'")
# __exit__ calls sb.destroy() / DELETE /v1/sandboxes/{id}
```
## Architecture notes
### 5. Streaming Executions
- **Sync/async parity**: `WrennClient` + `AsyncWrennClient` in `client.py`, using `httpx.Client`/`httpx.AsyncClient`. Async methods on `Sandbox` are prefixed `async_` (e.g. `async_exec`, `async_upload`).
- **WebSocket library**: `httpx-ws` (not `websockets`). Used for `exec_stream`, `pty`, and `run_code`.
- **Sandbox proxy URL**: `get_url(port)` returns `ws://` or `wss://` scheme. The `http_client` property converts to `http://`/`https://` automatically.
- **`Sandbox`** (in `sandbox.py`) is the main developer-facing class — subclasses generated model, adds lifecycle methods (`exec`, `upload`, `download`, `list_dir`, `mkdir`, `remove`, `pty`, `run_code`, `wait_ready`, `pause`, `resume`, `destroy`, `ping`, `metrics`), context manager support, and proxy helpers.
- **Error handling**: `handle_response()` in `exceptions.py` maps server error `code` field to typed exceptions (not just HTTP status). All inherit from `WrennError` with `.code`, `.message`, `.status_code`.
There are two distinct exec endpoints:
## Testing
**Synchronous exec**`sb.exec(cmd, args=[], timeout_sec=30)`
- Calls `POST /v1/sandboxes/{id}/exec`. Blocks until the command completes.
- Returns an `ExecResult` with `stdout`, `stderr`, `exit_code`, `duration_ms`, `encoding`.
- **HTTP mocking**: `respx` library (not `responses` or `pytest-httpx`). Mock routes with `@respx.mock` decorator or `respx.mock` context manager.
- **Async tests**: use `@pytest.mark.asyncio` (backed by `pytest-asyncio`).
- **Integration tests**: in `test_integration.py`, require env vars `WRENN_API_KEY` or `WRENN_TOKEN` (plus optional `WRENN_BASE_URL`, `WRENN_TEST_EMAIL`, `WRENN_TEST_PASSWORD`). They are skipped via `@requires_auth` if credentials are absent.
- **Fixtures**: test fixtures create `WrennClient(api_key="wrn_test1234567890abcdef12345678")` with context manager cleanup.
**Streaming exec**`sb.exec_stream(cmd, args=[])`
- Opens a WebSocket to `GET /v1/sandboxes/{id}/exec/stream`.
- Returns an `Iterator[StreamEvent]` (or `AsyncIterator[StreamEvent]` for async).
- The client sends `{"type": "start", "cmd": "...", "args": [...]}` as the first message.
- The server sends events: `StreamStartEvent(pid)`, `StreamStdoutEvent(data)`, `StreamStderrEvent(data)`, `StreamExitEvent(exit_code)`, `StreamErrorEvent(data)`.
- The connection closes after the process exits. The client can send `{"type": "stop"}` to terminate early.
## Coding conventions
### 6. Error Handling
Do not leak raw `httpx.HTTPStatusError` to the user. The server returns errors as:
```json
{"error": {"code": "not_found", "message": "sandbox not found"}}
```
Map the `code` field (not just HTTP status) to typed exceptions:
| Error code | HTTP status | Exception |
|-----------|-------------|-----------|
| `invalid_request` | 400 | `WrennValidationError` |
| `unauthorized` | 401 | `WrennAuthenticationError` |
| `forbidden` | 403 | `WrennForbiddenError` |
| `not_found` | 404 | `WrennNotFoundError` |
| `invalid_state` | 409 | `WrennConflictError` |
| `conflict` | 409 | `WrennConflictError` |
| `host_has_sandboxes` | 409 | `WrennHostHasSandboxesError` (includes `sandbox_ids`) |
| `host_unavailable` | 503 | `WrennHostUnavailableError` |
| `agent_error` | 502 | `WrennAgentError` |
| `internal_error` | 500 | `WrennInternalError` |
All exceptions inherit from `WrennError` and expose `.code`, `.message`, and `.status_code`.
### 7. Resource Coverage
The full API surface exposed through resource namespaces:
**`client.sandboxes`** — `create`, `list`, `get`, `destroy`, `get_stats`
**`client.snapshots`** — `create`, `list`, `delete`
**`client.api_keys`** — `create`, `list`, `delete`
**`client.hosts`** — `create`, `list`, `get`, `delete`, `delete_preview`, `regenerate_token`, `list_tags`, `add_tag`, `remove_tag`
**`client.teams`** — `list`, `create`, `get`, `rename`, `delete`, `list_members`, `add_member`, `update_member_role`, `remove_member`, `leave`
**`client.audit`** — `list` (paginated with `before`/`before_id` cursors)
**`client.builds`** — `create`, `list`, `get`, `cancel` (admin-only)
**`client.admin`** — `set_team_byoc`, `list_templates`, `delete_template`
### 8. Sandbox Proxy / Port Forwarding
Services running inside a sandbox are accessible via a reverse proxy. The control plane intercepts requests whose `Host` header matches `{port}-{sandbox_id}.{domain}` and forwards them to the host agent.
The SDK exposes two helpers on the `Sandbox` object:
**`sb.get_url(port) -> str`**
- Constructs the proxy URL from the client's `base_url`.
- Derivation: parse `base_url` host, build `http://{port}-{sandbox_id}.{host}`.
- Example: `base_url="https://api.wrenn.dev"`, `sb.id="cl-abc123"``"http://8888-cl-abc123.api.wrenn.dev"`
- Example: `base_url="http://localhost:8080"`, `sb.id="cl-abc123"``"http://8888-cl-abc123.localhost:8080"`
**`sb.http_client -> httpx.Client`**
- A pre-configured `httpx.Client` with:
- `base_url` set to the proxy URL (root `/` maps to the proxied service)
- `X-API-Key` header set from the parent client's API key
- Allows direct HTTP interaction with services inside the sandbox without manual header management.
- Closed automatically when the sandbox context manager exits.
**Auth:** Proxy requests require the `X-API-Key` header. JWT is not supported for proxy routes. If the client was constructed with a JWT token only, `sb.get_url()` and `sb.http_client` must raise `WrennAuthenticationError`.
**Example: Jupyter inside a sandbox**
```python
with client.sandboxes.create("python-jupyter") as sb:
sb.wait_ready(timeout=60)
# High-level: stateful code execution (see CODE_EXECUTION.md)
result = sb.run_code("print('hello from persistent kernel')")
print(result.stdout)
# Low-level: direct HTTP to Jupyter REST API
resp = sb.http_client.get("/api/kernels")
print(resp.json())
# Low-level: direct proxy URL for browser access
jupyter_url = sb.get_url(8888)
```
## Coding Conventions & Typing
- **Python Target:** `3.13+`. Use modern syntax (`|` for Unions, standard library generics like `list[str]`).
- **Typing:** Everything must be strictly typed. Use `pyright` for validation.
- **Formatting:** `ruff` is the sole linter and formatter. Do not use `black`, `isort`, or `flake8`.
- **Docstrings:** Use Google-style docstrings. These surface to end-users via IDE hover.
- **No comments:** Do not add comments unless explicitly asked.
- **Python 3.13+** with modern syntax (`|` unions, `list[str]` generics).
- **Strict typing** throughout. `pyright`/`mypy` available but not in CI.
- **`ruff`** is the sole linter and formatter. Do not use `black`, `isort`, or `flake8`.
- **Google-style docstrings** on all public APIs.
- **No comments** unless explicitly asked.