Assay
CI-native evidence compiler for agent governance
Assay compiles agent runtime signals and selected external outcomes into verifiable evidence and bounded Trust Basis claims. MCP policy enforcement is the fast path: Assay can sit between an agent and its tools, make deterministic allow/deny decisions, and preserve the evidence chain for CI, security review, and audit without a hosted backend.
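As a rough picture of what a deterministic allow/deny decision can look like, the sketch below checks a tool call against an allowlist and a simple ordering rule. It is an illustration only: `Policy`, `Decision`, `decide`, and the tool names are hypothetical and do not reflect Assay's actual policy format or API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Policy:
    # Hypothetical policy shape, not Assay's real schema.
    allowed_tools: frozenset[str]
    # Maps a tool to the tools that must already have been called.
    requires_before: dict[str, frozenset[str]] = field(default_factory=dict)


@dataclass(frozen=True)
class Decision:
    allow: bool
    reason: str


def decide(policy: Policy, history: list[str], tool: str) -> Decision:
    """Pure function: the same policy, history, and tool always give the same decision."""
    if tool not in policy.allowed_tools:
        return Decision(False, f"tool '{tool}' is not on the allowlist")
    missing = policy.requires_before.get(tool, frozenset()) - set(history)
    if missing:
        needed = ", ".join(sorted(missing))
        return Decision(False, f"'{tool}' requires a prior call to: {needed}")
    return Decision(True, "allowed")


policy = Policy(
    allowed_tools=frozenset({"read_file", "write_file"}),
    requires_before={"write_file": frozenset({"read_file"})},
)
print(decide(policy, history=[], tool="write_file"))
print(decide(policy, history=["read_file"], tool="write_file"))
```

Because the decision depends only on its inputs and makes no model or network calls, the same check can be replayed in CI against recorded tool calls.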
If you already have machine-readable AI run artifacts, start with the smallest receipt path: Promptfoo JSONL to evidence receipts. For runtime flag decisions, use the second adoption path: OpenFeature EvaluationDetails to CI Review Artifact. For model inventory/provenance boundaries, use the third adoption path: CycloneDX ML-BOM Model to Inventory Receipt.
Install
Core Capabilities
- Protocol Policy Enforcement: Validate MCP tool calls against JSON Schema constraints, sequence rules, and allowlists. No LLM calls in CI.
- Evidence Bundles: Tamper-evident audit trails with content-addressed IDs. Verify bundles offline and keep canonical evidence separate from projections.
- Trust Basis & Receipts: Compile verified bundles into Trust Basis claims and import bounded external receipt families for eval outcomes, runtime decisions, and model inventory. See also: Receipt Matrix, Promptfoo JSONL Receipts, OpenFeature Decision Receipts, CycloneDX Inventory Receipts.
- Tool Signing: Ed25519 signatures for tool definitions. DSSE envelope format. Trust policies for supply chain security.
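To make the DSSE reference under Tool Signing more concrete, the sketch below builds the pre-authentication encoding (PAE) that the DSSE specification defines as the byte string a signer actually signs. The payload type and the tool definition are hypothetical placeholders and are not taken from Assay; the Ed25519 signing step is left out because it needs a third-party library.

```python
import json


def dsse_pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding:
    "DSSEv1" SP LEN(type) SP type SP LEN(body) SP body,
    where LEN is the ASCII decimal byte length.
    """
    type_bytes = payload_type.encode("utf-8")
    return b" ".join(
        [
            b"DSSEv1",
            str(len(type_bytes)).encode("ascii"),
            type_bytes,
            str(len(payload)).encode("ascii"),
            payload,
        ]
    )


# Hypothetical payload type and tool definition, for illustration only.
HYPOTHETICAL_PAYLOAD_TYPE = "application/vnd.example.tool+json"
tool_definition = {"name": "read_file", "inputSchema": {"type": "object"}}

# Serialize with sorted keys so the same definition always yields the same bytes.
payload = json.dumps(tool_definition, sort_keys=True, separators=(",", ":")).encode("utf-8")
to_sign = dsse_pae(HYPOTHETICAL_PAYLOAD_TYPE, payload)
print(to_sign[:48])
```

Signing the PAE rather than the raw payload binds the payload type into the signature, so a signed tool definition cannot be reinterpreted as a different kind of document.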
Quick Start
1. Capture Traces
2. Validate
3. Export Evidence
assay profile init --output assay-profile.yaml --name quickstart
assay evidence export --profile assay-profile.yaml --out bundle.tar.gz
assay evidence verify bundle.tar.gz
4. Generate Trust Artifacts
assay trust-basis generate bundle.tar.gz --out trust-basis.json
assay trust-card generate bundle.tar.gz --out-dir trustcard
trustcard.json is the canonical Trust Card artifact. trustcard.md and trustcard.html are deterministic reviewer projections of the same claim rows and frozen non-goals.
5. Optional: Lint with a Pack
| Result | Exit Code | Output |
|---|---|---|
| Pass | 0 | Summary |
| Fail | 1 | SARIF with findings |
| Error | 2 | Config/schema validation error |
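A CI wrapper can branch on these exit codes instead of parsing output. The sketch below maps the three codes from the table to labels; `classify_exit` and `run_and_classify` are hypothetical helper names, and a stand-in child process replaces the real command so the example runs without Assay installed.

```python
import subprocess
import sys

# Exit-code meanings copied from the table above.
EXIT_MEANINGS = {0: "pass", 1: "fail", 2: "error"}


def classify_exit(code: int) -> str:
    """Map a process exit code to its meaning; anything else is 'unknown'."""
    return EXIT_MEANINGS.get(code, "unknown")


def run_and_classify(argv: list[str]) -> str:
    """Run a command without raising on non-zero exit, then classify the result."""
    completed = subprocess.run(argv, check=False)
    return classify_exit(completed.returncode)


if __name__ == "__main__":
    # Stand-in for the real command: a child process that exits with code 1.
    stand_in = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(run_and_classify(stand_in))  # prints "fail"
```

Keeping "fail" (findings) distinct from "error" (bad configuration) lets a pipeline block merges on findings while routing configuration problems to whoever maintains the setup.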
GitHub Action
Zero-config. Discovers evidence bundles, verifies integrity, uploads SARIF to GitHub Security.
Defense in Depth: Runtime Enforcement (Linux, Optional)
Optional kernel-level hardening for Linux deployments.
# Landlock sandbox (rootless)
assay sandbox --policy policy.yaml -- python agent.py
# eBPF/LSM kernel-level enforcement
sudo assay monitor --policy policy.yaml --pid <agent-pid>
Standards Alignment
| Standard | Integration |
|---|---|
| CloudEvents v1.0 | Evidence envelope format |
| W3C Trace Context | traceparent correlation |
| SARIF 2.1.0 | GitHub Code Scanning |
| EU AI Act Article 12 | Optional pack mapping |
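For readers unfamiliar with the `traceparent` header used for correlation, the sketch below parses the version 00 layout defined by W3C Trace Context. It is a generic illustration of the header format, not Assay's own parser; `TraceParent` and `parse_traceparent` are hypothetical names, and the sample value is the example from the W3C specification.

```python
import re
from typing import NamedTuple, Optional

# version "-" trace-id "-" parent-id "-" trace-flags, all lowercase hex.
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    parent_id: str
    sampled: bool


def parse_traceparent(header: str) -> Optional[TraceParent]:
    """Strict parse of the four-field layout; returns None if the header is invalid."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # The spec forbids version ff and all-zero trace or parent IDs.
    if version == "ff" or set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    # Bit 0 of trace-flags is the "sampled" flag.
    return TraceParent(version, trace_id, parent_id, bool(int(flags, 16) & 0x01))


sample = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
print(parse_traceparent(sample))
```

The `trace_id` field is the value that ties a piece of evidence back to the distributed trace it came from.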
Next Steps
- Getting Started
- Scope & Boundaries
- Evidence Receipts Technical Note
- Promptfoo JSONL to Evidence Receipts
- OpenFeature EvaluationDetails to CI Review Artifact
- CycloneDX ML-BOM Model to Inventory Receipt
- Evidence Receipt Assurance Mapping
- Receipt Family Matrix
- Receipt Schema Registry
- Operator Proof Flow
- Python SDK
- OpenTelemetry & Langfuse
- CLI Reference
- Architecture