- unit.yml: unit tests on every push and pull_request, all branches.
- code-runner.yml: PR to dev/main, gated on src/wrenn/code_runner/**
or tests/test_code_runner_*.py; runs `make test-code-runner`.
- integration.yml: PR to dev/main, gated on src/** excluding
src/wrenn/code_runner/**; runs `make test-integration`.
E2E pipelines require a src/** change, so docs/test-only PRs only
trigger the unit pipeline.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Rename `wrenn.code_interpreter` → `wrenn.code_runner` (canonical).
Keep old path as deprecation alias that emits a FutureWarning on
import, mirroring the existing `Sandbox` → `Capsule` pattern.
Submodule shims `code_interpreter/{capsule,async_capsule,models}.py`
keep direct-submodule imports working.
- Make the sync/async `__del__` safe after a failed constructor:
initialise `_kernel_id`, `_kernel_name`, `_proxy_client` before
calling `super().__init__` so a failed creation no longer crashes the
destructor with AttributeError.
- Send the kernel name to Jupyter. Previously `POST /api/kernels` had
no body, so the server picked an arbitrary default kernelspec. Now
sends `{"name": "wrenn"}` (override via `Capsule(kernel=...)`) and
reuses an existing kernel only when its `name` matches.
- Preserve Jupyter `text/plain` verbatim in `Result.from_bundle`.
The previous outer-quote strip was lossy (the string `'2'` became
indistinguishable from the int `2`, and strings containing escaped
quotes were mangled). `text` is now the `repr()` Jupyter sends.
Updated the stale `test_capsule_features` quote-strip test.
- Validate `run_code(language=...)`. Anything other than `"python"`
now raises `ValueError` instead of being silently ignored.
- Async `__del__` no longer touches the event loop; users must call
`await close()` or use `async with`.
- New unit suite `tests/test_code_runner_unit.py` (46 tests): MIME
unpacking, deprecation alias + warning, default template + kernel,
custom kernel override, ctor-failure-safe __del__, kernel
create/reuse/cache, retry on 5xx, 4xx propagation, request shape,
run_code stream/result/error/foreign-parent/idle/unsupported-language,
async variants.
- New e2e suite `tests/test_code_runner_e2e.py` (44 tests, integration
marker): template == `code-runner-beta`, kernel == `wrenn`, stdout
/stderr capture, state/import/function/class persistence, exceptions
(Value/Name/Syntax), callbacks, multi-line, `text` repr preservation,
filesystem round-trip, isolation between capsules, deprecated import
path. MIME-type class covers html, markdown, json, latex, svg,
javascript, png (matplotlib + seaborn), jpeg, multi-format bundles,
and text-round-trip via numpy + requests.
- `make test-code-runner` runs unit + e2e together. `make test`
extended to include the unit file.
- README: "Code Interpreter" section renamed to "Code Runner", all
imports updated, `kernel=` documented, removed the incorrect
"quotes stripped automatically" claim, replaced with the actual
`text/plain` semantics.
- CLAUDE.md: appended a "Code Runner Module" section covering module
path, defaults, kernel-reuse semantics, lifecycle invariant, and
the new test files + make target.
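The constructor-failure-safe `__del__` fix listed above can be sketched roughly as follows. This is a minimal illustration, not the SDK's code: `BaseCapsule`, `CodeRunnerCapsule`, and the `fail` flag are hypothetical stand-ins, and only the three attribute names come from the commit message.

```python
from typing import Any, Optional


class BaseCapsule:
    """Hypothetical stand-in for the parent class; creation may fail."""

    def __init__(self, fail: bool = False) -> None:
        if fail:
            raise RuntimeError("capsule creation failed")


class CodeRunnerCapsule(BaseCapsule):
    """Hypothetical subclass showing the attribute-ordering fix."""

    def __init__(self, fail: bool = False) -> None:
        # Assign everything the destructor reads BEFORE the call that can raise.
        self._kernel_id: Optional[str] = None
        self._kernel_name: str = "wrenn"
        self._proxy_client: Optional[Any] = None
        super().__init__(fail=fail)

    def close(self) -> None:
        # Safe to call repeatedly and on a half-constructed object.
        if self._proxy_client is not None:
            self._proxy_client.close()
            self._proxy_client = None

    def __del__(self) -> None:
        # Python calls __del__ even when __init__ raised, so every
        # attribute touched here must already exist.
        self.close()
```

If the three assignments came after `super().__init__`, a failed creation would leave the object without `_proxy_client`, and the destructor would raise AttributeError.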
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move per-step `when` filters: unit tests now run on every branch push;
integration tests keep the pull_request and main/dev branch restriction.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tests:
- tests/test_commands.py: unit coverage for Commands/AsyncCommands —
payload construction (cwd, envs, tag, timeout), background dispatch,
base64 response decoding, stream-event parsing, stream/connect iterators.
- tests/test_integration_advanced.py: live tests for cwd/env handling,
long-running commands (apt-get), PTY sessions, streaming exec,
process connect, and git workflows including cloning wrennhq/wrenn.
- test_filesystem_pty.py: PTY ping/pong reply tests.
- test_integration.py: poll for async process-registry prune in
test_kill_process instead of asserting on a zero-delay list().
Fixes:
- commands.py / pty.py: stream(), connect() and the PTY iterators only
caught WebSocketDisconnect. The server closes exec/process streams
abruptly, raising WebSocketNetworkError — a sibling under
HTTPXWSException — which crashed connect() entirely. Both are now
caught via _WS_CLOSED so abrupt closes end iteration cleanly.
- pty.py: reply to the server keepalive ping with a pong so idle PTY
sessions stay open.
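The `_WS_CLOSED` fix above amounts to catching both sibling exceptions with one tuple. The sketch below is an assumption-laden illustration: the three exception classes are local stand-ins that mirror the hierarchy described in the commit (so the block runs without the WebSocket dependency), and `iter_events` and its `receive` callable are hypothetical.

```python
from typing import Callable, Iterator


# Local stand-ins mirroring the hierarchy named in the commit message.
class HTTPXWSException(Exception):
    pass


class WebSocketDisconnect(HTTPXWSException):
    pass


class WebSocketNetworkError(HTTPXWSException):
    pass


# One tuple covers both the clean and the abrupt close.
_WS_CLOSED = (WebSocketDisconnect, WebSocketNetworkError)


def iter_events(receive: Callable[[], str]) -> Iterator[str]:
    """Yield events until the socket closes, cleanly or abruptly."""
    while True:
        try:
            event = receive()
        except _WS_CLOSED:
            # Either kind of close simply ends iteration.
            return
        yield event
```

Catching only `WebSocketDisconnect` would let `WebSocketNetworkError` escape the generator, which matches the crash the commit describes.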
Resolve conflicts in api/openapi.yaml and src/wrenn/models/_generated.py
by keeping the fix/0.2-compatibility versions (v0.2 API is authoritative).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Drop AuthResponse from models __init__ (renamed SessionResponse server-side; the SDK authenticates via API key and needs neither)
- Regenerate models from updated 0.2 openapi spec
- Add wait: bool = False kwarg to Capsule/AsyncCapsule destroy/pause/resume (instance + _static_*); 500ms poll for resume/destroy, 2s for pause
- Unify polling into _poll_until / _apoll_until + _wait_for_status helper; remove duplicated _POLL_INTERVALS tables
- wait_ready: drop implicit paused->resume side effect; treat missing as fail
- Capsule.connect: handle transient pausing (wait for paused first) before resuming, fixes hang when caller pauses then connects immediately
- Drop dead "if self._id is None" branch in Capsule.__init__ after assigning from already-truthy _capsule_id
- files.make_dir: detect already_exists across 409/wrapped error messages via shared _is_already_exists helper
- tests/test_integration.py: assertions on final lifecycle state use wait=True
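A shared polling helper of the kind described above might look like the sketch below. It is only an illustration: the name `poll_until`, its parameters, and the timeout behaviour are assumptions, not the signature of the SDK's `_poll_until`.

```python
import time
from typing import Callable, Optional


def poll_until(
    fetch_status: Callable[[], Optional[str]],
    target: str,
    interval: float = 0.5,
    timeout: float = 30.0,
) -> str:
    """Call fetch_status() until it returns target or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status == target:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"status is {status!r}, expected {target!r}")
        time.sleep(interval)
```

With a helper like this, a `wait=True` call only has to pick the target status and the interval (for example 0.5 s for resume/destroy and 2 s for pause, as stated in the commit) instead of keeping its own interval table.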
Sync OpenAPI spec to v0.2.0, fix type annotation shadowing by using
builtins.list in annotated signatures, guard poll interval lookup
against None status, and reorder capsule ID assignment to validate
before storing.
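The annotation-shadowing problem arises when a class defines a method called `list`: later annotations in that class body that say `list[...]` resolve to the method rather than the built-in type. A minimal sketch of the workaround, using a hypothetical `Capsules` class:

```python
import builtins


class Capsules:
    """Hypothetical resource class that exposes a method named `list`."""

    def __init__(self) -> None:
        self._ids: builtins.list[str] = ["c1", "c2"]

    def list(self) -> builtins.list[str]:
        # Return a copy so callers cannot mutate internal state.
        return builtins.list(self._ids)

    def first(self, n: int) -> builtins.list[str]:
        # A bare `list[str]` here would refer to the method defined above,
        # which type checkers reject; `builtins.list` is unambiguous.
        return self.list()[:n]
```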
Bugs fixed:
- files.py: use typed error checking (_raise_for_status) instead of raw
raise_for_status(), ensuring WrennNotFoundError etc. are raised
correctly
- exceptions.py: check both "capsule_ids" and "sandbox_ids" response
keys for backwards compatibility
- code_interpreter: retry _ensure_kernel on 5xx errors (only fail on
4xx), remove redundant TimeoutError in bare except, clean up
non-standard top-level msg_id/msg_type from Jupyter messages
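The "retry on 5xx, fail on 4xx" rule for `_ensure_kernel` can be sketched as below. This is a hedged illustration only: `HTTPStatusError` is a local stand-in rather than the httpx class, and `call_with_retry`, the attempt count, and the delay are assumptions not taken from the commit.

```python
import time
from typing import Callable


class HTTPStatusError(Exception):
    """Local stand-in for an HTTP error that carries a status code."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def call_with_retry(
    request: Callable[[], str], attempts: int = 3, delay: float = 0.5
) -> str:
    """Retry request() on 5xx responses; re-raise 4xx immediately."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return request()
        except HTTPStatusError as exc:
            # Client errors will not fix themselves; the last attempt
            # has nothing left to retry.
            if exc.status_code < 500 or attempt == attempts:
                raise
            time.sleep(delay)
    raise AssertionError("unreachable")
```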
Resource leaks fixed:
- capsule.py: close WrennClient if capsule creation or init fails
- code_interpreter: add close()/__del__ for _proxy_client cleanup when
not using context manager
Logic fixes:
- pty.py: yield exit events to callers instead of silently discarding
them
- capsule.py: auto-resume paused capsules in wait_ready instead of
failing
- capsule.py: log warnings on destroy failure in __exit__ instead of
silently swallowing errors
non-JSON error responses
- Set per-request httpx timeout (command timeout + 10s buffer) in
Commands.run() and AsyncCommands.run() for foreground exec calls,
preventing HTTP read timeouts on long-running commands
- Raise WrennInternalError instead of raw httpx.HTTPStatusError when
handle_response() encounters a non-JSON error body (e.g. 502 from
a reverse proxy)
- Update Woodpecker to run unit and integration tests in parallel
- Add GitHub Actions workflow for PyPI trusted publishing on main
- Add license, classifiers, keywords, and URLs to pyproject.toml
- Fix ruff lint errors (unused imports, duplicate class name) and formatting
43 tests across 4 classes hitting the live API. Shared capsule per class
to minimize VM boot overhead. All capsules destroyed in teardown.
Skips automatically when WRENN_API_KEY is not available.
The server-side agent runs commands through a `nice` wrapper that uses
"${@}" expansion. Sending the full command string as a single cmd field
made `nice` treat the whole string as one executable name. Commands.run
now sends cmd=/bin/sh args=["-c", cmd_string] so "${@}" expands into a
proper argv.
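The payload shape described above can be sketched as a small helper. The `cmd` and `args` field names come from the commit message; the function name `build_exec_payload` is hypothetical, and any other fields the real request carries are omitted.

```python
from typing import Any


def build_exec_payload(command: str) -> dict[str, Any]:
    """Wrap a shell command string so the server receives a proper argv."""
    # The shell, not the wrapper, parses pipes, quotes, and redirects.
    return {"cmd": "/bin/sh", "args": ["-c", command]}
```

For example, `build_exec_payload("echo hi | wc -c")` yields an argv of `/bin/sh`, `-c`, and the full command string as a single argument, so the pipe is interpreted by the shell.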
Replace flat CodeResult with a proper model hierarchy: Execution
(top-level), Result (per-output with typed MIME fields), Logs
(stdout/stderr as lists), and ExecutionError (structured
name/value/traceback). Handle display_data messages for rich output,
add streaming callbacks (on_result, on_stdout, on_stderr, on_error),
and remove the misleading stdout-to-text fallback.
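A rough sketch of the model hierarchy described above, using dataclasses. The class names, the `stdout`/`stderr` lists, and `name`/`value`/`traceback` come from the commit message; the specific MIME fields on `Result`, the `from_bundle` body, and the choice of dataclasses are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Logs:
    stdout: list[str] = field(default_factory=list)
    stderr: list[str] = field(default_factory=list)


@dataclass
class Result:
    # One optional field per MIME type (illustrative subset).
    text: Optional[str] = None
    html: Optional[str] = None
    png: Optional[str] = None

    @classmethod
    def from_bundle(cls, bundle: dict[str, str]) -> "Result":
        # Map a Jupyter MIME bundle onto the typed fields.
        return cls(
            text=bundle.get("text/plain"),
            html=bundle.get("text/html"),
            png=bundle.get("image/png"),
        )


@dataclass
class ExecutionError:
    name: str
    value: str
    traceback: str


@dataclass
class Execution:
    results: list[Result] = field(default_factory=list)
    logs: Logs = field(default_factory=Logs)
    error: Optional[ExecutionError] = None
```

Keeping `logs` separate from `results` is what makes it possible to drop the stdout-to-text fallback: printed output never masquerades as an expression result.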
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Rename Capsule.kill/AsyncCapsule.kill to destroy for frontend consistency
- Add Sandbox deprecation alias to wrenn.code_interpreter module
- run_code text falls back to stripped stdout when no expression result
- Strip quotes from string expression results (matching e2b behavior)
- _ensure_kernel reuses existing Jupyter kernels before creating new ones
- Rewrite README with complete examples for capsules and code interpreter
- Remove stale AGENTS.md
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the WrennClient-centric API with a top-level Capsule class that
mirrors e2b's Sandbox interface, enabling drop-in migration. Key changes:
- Capsule/AsyncCapsule with direct construction (reads WRENN_API_KEY and
WRENN_BASE_URL env vars), namespaced sub-objects (capsule.commands,
capsule.files), dual instance/static lifecycle methods via _DualMethod
descriptor (capsule.kill() and Capsule.kill(id))
- WrennClient simplified to API-key-only endpoints (capsules, snapshots);
JWT-based resources (auth, hosts, teams) removed
- wrenn.code_interpreter submodule with Capsule subclass defaulting to
code-runner-beta template and run_code() support
- Sandbox alias emits FutureWarning instead of DeprecationWarning
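The dual instance/static lifecycle methods rely on a descriptor that returns a different callable depending on whether it is accessed on the class or on an instance. The sketch below shows one way this could work; it is not the SDK's `_DualMethod`, and the `DualMethod` class, the `_kill`/`_static_kill` helpers, and their return values are hypothetical.

```python
from typing import Any, Callable, Optional


class DualMethod:
    """Descriptor: bound method on instances, static function on the class."""

    def __init__(self, instance_fn: Callable[..., Any], static_fn: Any) -> None:
        self._instance_fn = instance_fn
        self._static_fn = static_fn

    def __get__(self, obj: Optional[object], objtype: Optional[type] = None) -> Any:
        if obj is None:
            # Class access: unwrap the staticmethod into a plain function.
            return self._static_fn.__get__(None, objtype)
        # Instance access: bind the instance function to obj.
        return self._instance_fn.__get__(obj, objtype)


class Capsule:
    def __init__(self, capsule_id: str) -> None:
        self.id = capsule_id

    def _kill(self) -> str:
        return f"killed {self.id}"

    @staticmethod
    def _static_kill(capsule_id: str) -> str:
        return f"killed {capsule_id}"

    # One public name, two call styles.
    kill = DualMethod(_kill, _static_kill)
```

With this in place, `Capsule("c1").kill()` and `Capsule.kill("c2")` both work, which is the call pattern the commit describes.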
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add sandbox filesystem methods (list_dir, mkdir, remove, upload,
download, stream_upload, stream_download) and interactive PTY sessions
(PtySession, AsyncPtySession) with reconnect support per
FILE_TERMINAL.md spec. Refactor error handling into exceptions.py as
shared handle_response(). Replace API-key-only proxy auth with unified
_proxy_headers() supporting both API key and JWT. Fix stream_upload to
build multipart manually instead of relying on httpx files= with
generators. Switch Makefile SPEC_URL from main to dev branch. Regenerate
models from updated OpenAPI spec (adds teams, channels, metrics, PTY
endpoints). Add comprehensive unit and integration tests. Trim AGENTS.md
to verified facts only.
Introduces the core Wrenn client and a dedicated sandbox execution
environment. This includes automated model generation and a custom
exception hierarchy to support robust integration.
- Add `WrennClient` in `src/wrenn/client.py` for API interaction.
- Implement `Sandbox` in `src/wrenn/sandbox.py` for isolated execution.
- Add Pydantic/model support via `_generated.py`.
- Define project-specific error types in `exceptions.py`.
- Include AGENTS.md documentation for specialized logic.
- Add comprehensive unit and integration tests.
- Update build system (Makefile, uv.lock, pyproject.toml) and LICENSE.