CLI Reference¶
Complete documentation for all Assay commands.
Installation¶
Verify installation:
Commands Overview¶
| Command | Description |
|---|---|
assay run | Run tests against traces |
assay generate | Generate policy scaffolding from trace/profile input |
assay explain | Explain why trace steps were allowed/blocked |
assay bundle | Create/verify replay bundles |
assay replay | Replay from a replay bundle |
assay import | Import sessions from MCP Inspector, etc. |
assay migrate | Upgrade config from v0 to v1 |
assay doctor | Diagnose setup and optionally auto-fix known issues |
assay watch | Re-run on config/policy/trace changes |
assay monitor | Runtime Security (Linux Kernel Enforcement) |
assay mcp wrap | Wrap an MCP process with policy enforcement |
Global Options¶
Common top-level options:
| Option | Description |
|---|---|
--help, -h | Show help message |
--version, -V | Show version |
Quick Examples¶
Run Tests¶
# Basic run
assay run --config eval.yaml
# Strict mode (fail on any violation)
assay run --config eval.yaml --strict
# Specific trace file
assay run --config eval.yaml --trace-file traces/golden.jsonl
# CI reports
assay ci --config eval.yaml --trace-file traces/golden.jsonl --sarif sarif.json --junit junit.xml
Generate Policy (With Diff Preview)¶
# Generate policy from trace
assay generate --input traces/session.jsonl --output policy.yaml
# Preview changes against existing policy file
assay generate --input traces/session.jsonl --output policy.yaml --diff --dry-run
Replay Bundles¶
# Create bundle from latest run artifacts
assay bundle create
# Verify bundle safety/integrity
assay bundle verify --bundle .assay/bundles/12345.tar.gz
# Replay from bundle (offline default)
assay replay --bundle .assay/bundles/12345.tar.gz
# Replay live with seed override
assay replay --bundle .assay/bundles/12345.tar.gz --live --seed 42
Migrate Config¶
# Upgrade to v1 format
assay migrate --config old-eval.yaml
# Preview changes without writing
assay migrate --config old-eval.yaml --dry-run
Start MCP Wrapper¶
# Enforcing mode
assay mcp wrap --policy assay.yaml -- <real-mcp-command> [args...]
# Dry-run mode
assay mcp wrap --policy assay.yaml --dry-run -- <real-mcp-command> [args...]
Diagnose and Watch¶
# Diagnose and auto-fix known issues
assay doctor --config eval.yaml --trace-file traces/dev.jsonl --fix --yes
# Live re-run loop on local edits
assay watch --config eval.yaml --trace-file traces/dev.jsonl --strict
Exit Codes¶
| Code | Meaning |
|---|---|
| 0 | Success (all tests passed) |
| 1 | Test failure (one or more tests failed) |
| 2 | Configuration error |
| 3 | Infrastructure/judge error |
| 4 | Would block (sandbox/policy) |
Environment Variables¶
| Variable | Description | Default |
|---|---|---|
ASSAY_EXIT_CODES | Exit code compatibility mode (v1 or v2) | v2 |
MCP_CONFIG_LEGACY | Enable legacy config mode when set to 1 | disabled |
ASSAY_STRICT_DEPRECATIONS | Fail on deprecated policy/config usage when set to 1 | disabled |
OPENAI_API_KEY | API key for OpenAI-backed judge/embedder paths | unset |
NO_COLOR | Disable colored output | unset |
Configuration File¶
Most run/ci commands read from eval.yaml by default:
version: 1
suite: my-agent
model: gpt-4o-mini
tests:
- id: args_valid
input:
prompt: "Summarize this task."
expected:
type: args_valid
policy: policies/default.yaml
See Configuration for full reference.
Command Details¶
-
assay run
Run tests against traces. The main command for CI/CD.
-
assay explain
Explain blocked/allowed trace steps and evaluated rules.
-
assay generate
Generate policy scaffolding from traces/profiles and preview diffs.
-
assay import
Import sessions from MCP Inspector and other formats.
-
assay migrate
Upgrade configuration from v0 to v1 format.
-
assay doctor
Diagnose environment/config issues and apply known fixes.
-
assay watch
Watch files and rerun Assay on changes.
-
assay replay
Replay runs from a replay bundle (
--bundle), offline by default. -
assay bundle
Create and verify replay bundles.
-
assay mcp wrap
Wrap a real MCP process with policy enforcement for agent self-correction.
-
assay monitor
Real-time kernel enforcement (SOTA). Blocks attacks before they happen.