Skip to content

Quick Reference

Purpose: Fast lookup for AI agents - commands, patterns, exit codes, and common operations. Version: 2.15.0 (February 2026)

TL;DR - What is Assay?

Assay = Policy-as-Code engine for AI agent validation - Input: Agent traces (JSONL) + Policy (YAML) - Output: Pass/Fail + SARIF report - Key insight: Deterministic replay testing (no LLM calls needed in CI)

Most Common Commands

# First-time setup
assay init                    # Generate eval.yaml + policy.yaml
assay init --ci               # Also generate GitHub workflow

# Validate traces
assay validate --config eval.yaml --trace-file traces.jsonl
assay run --config eval.yaml --trace-file traces.jsonl

# CI gate (strict mode)
assay ci --config eval.yaml --trace-file traces.jsonl

# Debug failures
assay doctor                  # Diagnose common issues
assay explain --trace traces.jsonl --policy policy.yaml  # Explain violations

Exit Codes

Code Name Reason Code Pattern When
0 SUCCESS (none) All tests pass
1 TEST_FAILURE E_TEST_FAILED, E_POLICY_VIOLATION, E_JUDGE_UNCERTAIN Test or policy failure; judge abstain → E_JUDGE_UNCERTAIN
2 CONFIG_ERROR E_CFG_PARSE, E_TRACE_NOT_FOUND, E_MISSING_CONFIG Config or input error
3 INFRA_ERROR E_JUDGE_UNAVAILABLE, E_RATE_LIMIT, E_TIMEOUT Infrastructure issue
4 WOULD_BLOCK (sandbox/policy) Execution would be blocked

Migration note: Use --exit-codes=v2 (default) or --exit-codes=v1 for legacy behavior.

DX Note: assay ci treats Report IO failures (JUnit/SARIF writing) as Warnings (checks strictly pass), ensuring robust pipelines. Diagnostics are injected into run.json warnings.

Reason Code Registry

Config Errors (exit 2)

Code Meaning Next Step
E_CFG_PARSE YAML/JSON parse error assay doctor --config <file>
E_TRACE_NOT_FOUND Trace file missing Check path exists
E_MISSING_CONFIG Config file missing assay init
E_BASELINE_INVALID Baseline file invalid assay baseline record
E_POLICY_PARSE Policy syntax error assay policy validate <file>

Infra Errors (exit 3)

Code Meaning Next Step
E_JUDGE_UNAVAILABLE LLM judge down Check API key, retry
E_RATE_LIMIT Rate limited Wait, reduce concurrency
E_PROVIDER_5XX Provider error Retry, check status page
E_TIMEOUT Request timeout Increase timeout, check network

Test Failures (exit 1)

Code Meaning Next Step
E_TEST_FAILED Test assertion failed assay explain <test-id>
E_JUDGE_UNCERTAIN Judge returned abstain (could not decide) Review borderline result; assay explain <test-id>; adjust threshold
E_POLICY_VIOLATION Policy rule violated Review policy or fix agent
E_SEQUENCE_VIOLATION Wrong tool call order Check sequence rules

Run / CI Output (PR gate, PR #159)

After assay run or assay ci:

Output Contents
run.json exit_code, reason_code, reason_code_version, seed_version, order_seed, judge_seed (string or null), judge_metrics (abstain_rate, flip_rate, etc.)
summary.json Same plus seeds object, schema_version, full Summary
Console footer One line: Seeds: seed_version=1 order_seed=… judge_seed=…; then judge metrics line if present

Seeds are decimal strings or null (no JSON number) for JS/TS precision safety. See Run Output and SPEC-PR-Gate-Outputs-v1.

File Locations

File Purpose Created By
eval.yaml Main config assay init
policy.yaml Policy rules assay init
traces/*.jsonl Agent traces SDK or import
baseline.json Regression baseline assay run --export-baseline
run.json Run outcome (exit, reason_code, seeds, judge_metrics, sarif.omitted when truncated) assay run / assay ci
summary.json Machine-readable summary (seeds, judge_metrics, sarif.omitted when truncated) assay run / assay ci
.github/workflows/assay.yml CI workflow assay init --ci
.assay/reports/junit.xml JUnit output assay run --junit
.assay/reports/sarif.json SARIF output (truncated at 25k results by default; run.json/summary have sarif.omitted when truncated) assay run --sarif
.assay/evidence/*.tar.gz Evidence bundles Test runs

GitHub Action Usage

# Recommended (v2 action)
- uses: Rul1an/assay/assay-action@v2
  with:
    fail_on: error      # error | warn | info | none
    sarif: true         # Upload to Security tab
    comment_diff: true  # PR comment on findings

# Alternative (CLI only)
- run: |
    assay ci \
      --config ci-eval.yaml \
      --trace-file traces/ci.jsonl \
      --junit .assay-reports/junit.xml \
      --sarif .assay-reports/sarif.json

Policy Quick Reference

# policy.yaml structure
version: "1"

tools:
  filesystem_read:
    args:
      path:
        type: string
        pattern: "^/allowed/.*"

  http_request:
    args:
      url:
        blocklist:
          - "*.internal.*"

sequences:
  - name: auth_before_data
    pattern: [authenticate, fetch_data]
    required: true

blocklist:
  - "rm_rf"
  - "drop_database"

Trace Format

{"tool": "filesystem_read", "args": {"path": "/tmp/file.txt"}, "result": "contents..."}
{"tool": "http_request", "args": {"url": "https://api.example.com"}, "result": {"status": 200}}

Python SDK Quick Start

from assay import AssayClient, Coverage, validate

# Record traces
client = AssayClient("traces.jsonl")
client.record_trace({"tool": "read_file", "args": {"path": "/tmp/x"}})

# Validate
result = validate("policy.yaml", traces)
assert result["passed"]

# Coverage analysis
coverage = Coverage.analyze(traces, min_coverage=80.0)
print(f"Coverage: {coverage.score}%")

MCP Server Quick Start

# Start MCP proxy with policy enforcement
assay mcp wrap \
  --policy policy.yaml \
  --decision-log decisions.jsonl \
  --event-source "assay://myapp"

# Dry-run mode (log but don't block)
assay mcp wrap --policy policy.yaml --dry-run

Evidence Commands

# Export bundle from profile
assay evidence export --profile profile.yaml --out bundle.tar.gz

# Verify bundle integrity
assay evidence verify bundle.tar.gz

# Lint for security issues (SARIF output)
assay evidence lint bundle.tar.gz --format sarif

# Compare two bundles
assay evidence diff baseline.tar.gz current.tar.gz

Tool Signing

# Generate keypair
assay tool keygen --out keys/

# Sign tool definition
assay tool sign tool.yaml --key keys/private.pem --out tool-signed.yaml

# Verify signature
assay tool verify tool-signed.yaml --trust-policy trust.yaml

Common Patterns

Pattern 1: CI Gate

assay run --config eval.yaml --trace-file traces.jsonl --baseline baseline.json
# Exit 0 = merge allowed
# Exit 1 = block PR

Pattern 2: Learning Mode

assay record --output policy.yaml -- your-agent-command
assay generate -i traces/session.jsonl --output policy.yaml

Pattern 3: Debug Violation

assay doctor                           # Check setup
assay explain --trace traces.jsonl --policy policy.yaml  # Explain failure
assay coverage --trace-file traces.jsonl # Check coverage

Pattern 4: Baseline Regression

# On main branch
assay run --config eval.yaml --export-baseline baseline.json

# On feature branch
assay run --config eval.yaml --baseline baseline.json

Crate Responsibilities

Crate Responsibility Key Types
assay-core Evaluation engine Runner, Store, EvalConfig
assay-cli CLI interface Cli, Command, dispatchers
assay-metrics Metric implementations MustContain, JsonSchema, etc.
assay-mcp-server MCP proxy McpProxy, JSON-RPC handlers
assay-policy Policy compilation CompiledPolicy, Tier ½
assay-evidence Evidence bundles BundleWriter, Manifest
assay-monitor eBPF monitoring Linux kernel integration

Key Paths in Codebase

crates/assay-cli/src/cli/commands/mod.rs  # Command dispatch
crates/assay-core/src/engine/runner.rs    # Test execution
crates/assay-core/src/storage/store.rs    # SQLite persistence
crates/assay-core/src/mcp/proxy.rs        # MCP proxy
crates/assay-core/src/report/sarif.rs     # SARIF output
crates/assay-cli/src/templates.rs         # CI templates
infra/bpf-runner/health_check.sh          # Runner health
.github/workflows/kernel-matrix.yml       # eBPF CI

Environment Variables

Variable Purpose Default
RUST_LOG Log level info
ASSAY_EXIT_CODES Exit code version v2
OPENAI_API_KEY LLM API key (required for judge)