# What Assay Does (and Doesn't Do)
Assay is a deterministic policy enforcement engine for AI agents. This page clarifies our scope to help you understand when Assay is the right tool—and when you should look elsewhere.
## Core Principle
If it needs a classifier, we don't build it. We build gates.
Assay enforces rules that can be evaluated with 100% determinism. No probability scores. No "maybe". Just Pass or Fail.
## ✅ In Scope: Tool-Call Enforcement
Assay validates agent actions (tool calls) against policies you define.
### Argument Validation (`args_valid`)
Enforce that tool arguments match a JSON Schema.
```yaml
assertions:
  - type: args_valid
    tool: ApplyDiscount
    schema:
      type: object
      properties:
        percent:
          type: number
          maximum: 30  # Block discounts > 30%
        reason:
          type: string
          minLength: 10
      required: [percent, reason]
```
Use case: Prevent agents from applying excessive discounts, sending malformed API requests, or passing invalid parameters.
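To make the rule concrete, here are two hypothetical `ApplyDiscount` calls evaluated against the schema above. The payload shape shown is illustrative, not Assay's actual trace format:

```yaml
# Hypothetical tool calls, shown in an illustrative shape:
- tool: ApplyDiscount
  args: {percent: 15, reason: "Price-match for loyalty tier"}  # PASS
- tool: ApplyDiscount
  args: {percent: 45, reason: "vip"}  # FAIL: percent > 30, reason shorter than 10 chars
```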
### Sequence Enforcement (`sequence_valid`)
Ensure tools are called in the correct order.
```yaml
assertions:
  - type: sequence_valid
    rules:
      - type: require
        tool: VerifyIdentity
      - type: before
        first: VerifyIdentity
        then: DeleteAccount
```
Use case: Require verification before destructive actions. Enforce multi-step approval workflows.
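For intuition, assuming the natural reading of the rules above, here is how a few hypothetical traces would evaluate (tool names only; the trace format is illustrative):

```yaml
# Hypothetical call sequences evaluated against the rules above:
# [VerifyIdentity, DeleteAccount]  -> PASS (required tool present, order respected)
# [DeleteAccount, VerifyIdentity]  -> FAIL (DeleteAccount fired before VerifyIdentity)
# [LookupAccount, DeleteAccount]   -> FAIL (VerifyIdentity never called)
```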
### Tool Blocklists (`tool_blocklist`)
Prevent specific tools from being called.
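A minimal sketch of the shape, by analogy with the assertions above. The `tools` key is an assumption for illustration; check the assertion reference for the exact field name:

```yaml
assertions:
  - type: tool_blocklist
    tools:  # assumed field name for the list of blocked tools
      - ExecuteShell
      - DropDatabase
```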
Use case: Hard blocks on dangerous operations. Defense in depth for agent sandboxing.
## ❌ Out of Scope: Classifier-Based Safety
The following capabilities are explicitly out of scope for Assay. They require probabilistic classifiers and introduce non-determinism.
| Capability | Why Not Assay | Alternative |
|---|---|---|
| Toxicity Detection | Requires language model classifier | Llama Guard, Perspective API |
| Jailbreak Detection | Arms race, adversarial by nature | Prompt gateways, Rebuff |
| Hallucination Detection | Requires ground truth comparison | LLM-as-judge pipelines, RAG evaluation tools |
| RAG Grounding | Context-dependent, semantic matching | RAGAS, TruLens |
| Bias Detection | Subjective, contested definitions | Academic research tools, human review |
| PII Detection | Pattern matching is incomplete | Presidio, cloud DLP APIs |
## Why This Boundary Matters
- **Determinism:** Classifiers have variance; the same input can produce different outputs across runs. Assay guarantees that the same trace plus the same policy yields the same result, every time.
- **Latency:** Classifier-based checks add 100-1000 ms per call. Assay's pure-function checks run in under 10 ms at p95.
- **False positives:** Classifiers trade precision against recall. Assay's rules are explicit: you control exactly what passes and what fails.
- **Auditability:** "The model said it was toxic" is not a compliance answer. "The discount exceeded 30% (schema violation)" is.
## The Integration Model
Assay is designed to complement classifier-based tools, not replace them.
```
┌─────────────────────────────────────────────────────────────┐
│                     YOUR AGENT RUNTIME                      │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│   ┌─────────────┐    ┌─────────────┐    ┌─────────────┐    │
│   │   Prompt    │    │    ASSAY    │    │  Response   │    │
│   │   Gateway   │───▶│  Preflight  │───▶│   Filter    │    │
│   │ (jailbreak) │    │ (tool args) │    │ (toxicity)  │    │
│   └─────────────┘    └─────────────┘    └─────────────┘    │
│                                                             │
│   Classifier-based    DETERMINISTIC     Classifier-based    │
│       ~200ms            <10ms p95          ~150ms           │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```
Assay owns the middle: the tool-call decision point where deterministic enforcement is both possible and critical.
## Decision Framework
Use this to decide if Assay is right for your check:
| Question | If Yes → Assay | If No → Other Tool |
|---|---|---|
| Can I express this as a JSON Schema? | ✅ `args_valid` | Use classifier |
| Can I express this as a tool sequence? | ✅ `sequence_valid` | Use workflow engine |
| Can I express this as a blocklist? | ✅ `tool_blocklist` | Use allowlist/RBAC |
| Does "maybe" ever make sense? | ❌ Not Assay | Use probabilistic check |
| Must the check be <10ms? | ✅ Assay | Async classifier OK |
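A single policy can combine all three assertion types. Here is a sketch built from the examples on this page, assuming assertions compose as a flat list (which the examples above suggest); the `tools` key carries the same caveat as in the blocklist sketch:

```yaml
assertions:
  - type: tool_blocklist
    tools: [ExecuteShell]  # assumed field name, as in the sketch above
  - type: sequence_valid
    rules:
      - type: before
        first: VerifyIdentity
        then: DeleteAccount
  - type: args_valid
    tool: ApplyDiscount
    schema:
      type: object
      properties:
        percent: {type: number, maximum: 30}
      required: [percent]
```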
## FAQ
### Can Assay detect prompt injection?
No. Prompt injection detection requires semantic understanding of adversarial inputs. Use a dedicated prompt gateway or input sanitization layer.
### Can Assay validate response quality?
No. Quality is subjective and requires LLM-as-judge or human evaluation. Assay validates actions, not content.
### Can Assay enforce rate limits?
No. Rate limiting is a runtime infrastructure concern. Use your API gateway or agent framework's built-in throttling.
### Can Assay replace my observability stack?
No. Assay produces audit events, but it's not a monitoring platform. Export Assay results to your existing observability tools (Datadog, Grafana, etc.) via the planned OTel integration.
## Summary
| Assay Is | Assay Is Not |
|---|---|
| Deterministic | Probabilistic |
| Rule-based | Classifier-based |
| Tool-focused | Content-focused |
| Fast (<10ms) | Expensive (100ms+) |
| Auditable | "Trust the model" |
Tagline: If you can write it as a rule, Assay enforces it. If you need a model to decide, look elsewhere.