Sequence Rules DSL

Define valid tool call sequences with declarative rules.

Overview

The Sequence Rules DSL lets you enforce order constraints on tool calls:

"Always verify identity before deleting a customer"
"Never call admin tools from untrusted contexts"
"Read before write"

These rules are deterministic — they produce pass/fail results with no ambiguity.

Quick Example

# mcp-eval.yaml
tests:
  - id: verify_before_delete
    metric: sequence_valid
    rules:
      - type: before
        first: VerifyIdentity
        then: DeleteCustomer

If your agent calls `DeleteCustomer` without first calling `VerifyIdentity`, the test fails.

Rule Types

`require`: Must Contain

The trace must contain at least one call to the specified tool.

rules:
  - type: require
    tool: VerifyIdentity

Trace	Result
`[GetCustomer, VerifyIdentity, UpdateCustomer]`	✅ Pass
`[GetCustomer, UpdateCustomer]`	❌ Fail
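
The check itself reduces to a membership test. Here is a minimal Python sketch, assuming a trace is a list of tool names in call order. `check_require` is a hypothetical helper for illustration, not part of Assay's API:

```python
from typing import List


def check_require(trace: List[str], tool: str) -> bool:
    """Pass when the trace contains at least one call to `tool`."""
    return tool in trace


print(check_require(["GetCustomer", "VerifyIdentity", "UpdateCustomer"], "VerifyIdentity"))  # True
print(check_require(["GetCustomer", "UpdateCustomer"], "VerifyIdentity"))  # False
```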

`before`: Order Constraint

Tool A must be called before Tool B (at least once).

rules:
  - type: before
    first: GetCustomer
    then: UpdateCustomer

Trace	Result
`[GetCustomer, UpdateCustomer]`	✅ Pass
`[UpdateCustomer, GetCustomer]`	❌ Fail
`[GetCustomer, UpdateCustomer, GetCustomer]`	✅ Pass

Note: `before` checks that at least one call to `first` happens before the first call to `then`.
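
The semantics in the note can be sketched in Python as follows. `check_before` is a hypothetical name, and passing when `then` never appears is an assumption. The text above does not say how that case is handled.

```python
from typing import List


def check_before(trace: List[str], first: str, then: str) -> bool:
    """Pass when `first` occurs somewhere before the first call to `then`."""
    if then not in trace:
        # Assumption: with no call to `then`, there is nothing to guard.
        return True
    first_then = trace.index(then)  # position of the first call to `then`
    return first in trace[:first_then]


print(check_before(["GetCustomer", "UpdateCustomer", "GetCustomer"], "GetCustomer", "UpdateCustomer"))  # True
print(check_before(["UpdateCustomer", "GetCustomer"], "GetCustomer", "UpdateCustomer"))  # False
```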

`immediately_before`: Strict Adjacency

Tool A must be called immediately before Tool B (no other calls in between).

rules:
  - type: immediately_before
    first: ValidateInput
    then: ExecuteAction

Trace	Result
`[ValidateInput, ExecuteAction]`	✅ Pass
`[ValidateInput, LogEvent, ExecuteAction]`	❌ Fail
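
One way to read the adjacency check, sketched in Python. The helper name is hypothetical. The sketch assumes that every call to `then` must be directly preceded by `first`, and that a trace without `then` passes. The description above does not settle either point.

```python
from typing import List


def check_immediately_before(trace: List[str], first: str, then: str) -> bool:
    """Pass when every call to `then` directly follows a call to `first`."""
    for i, tool in enumerate(trace):
        if tool == then and (i == 0 or trace[i - 1] != first):
            return False  # `then` was called without `first` right before it
    return True


print(check_immediately_before(["ValidateInput", "ExecuteAction"], "ValidateInput", "ExecuteAction"))  # True
print(check_immediately_before(["ValidateInput", "LogEvent", "ExecuteAction"], "ValidateInput", "ExecuteAction"))  # False
```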

`blocklist`: Forbidden Tools

These tools must never be called.

rules:
  - type: blocklist
    tools:
      - admin_delete
      - system_reset
      - drop_database

Trace	Result
`[GetCustomer, UpdateCustomer]`	✅ Pass
`[GetCustomer, admin_delete]`	❌ Fail

Glob patterns are supported:

rules:
  - type: blocklist
    tools:
      - admin_*
      - system_*
      - "*_dangerous"  # quoted: an unquoted leading * is YAML alias syntax
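
To see which calls a set of patterns would catch, here is a small Python sketch using the standard library's `fnmatch`. It assumes the DSL's glob syntax behaves like shell-style, case-sensitive matching, which the text above does not specify. `blocked_calls` is a hypothetical helper.

```python
from fnmatch import fnmatchcase
from typing import List


def blocked_calls(trace: List[str], patterns: List[str]) -> List[str]:
    """Return every call whose name matches at least one blocklist pattern."""
    return [tool for tool in trace if any(fnmatchcase(tool, p) for p in patterns)]


patterns = ["admin_*", "system_*", "*_dangerous"]
print(blocked_calls(["GetCustomer", "admin_delete"], patterns))  # ['admin_delete']
print(blocked_calls(["GetCustomer", "UpdateCustomer"], patterns))  # []
```

An empty result corresponds to a passing rule.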

`allowlist`: Only These Tools

Only the specified tools are allowed. Everything else fails.

rules:
  - type: allowlist
    tools:
      - GetCustomer
      - UpdateCustomer
      - SendEmail

Trace	Result
`[GetCustomer, UpdateCustomer]`	✅ Pass
`[GetCustomer, DeleteCustomer]`	❌ Fail (DeleteCustomer not in allowlist)
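
The allowlist check is the mirror image of the blocklist. This Python sketch assumes exact name matching, because the text does not say whether globs apply here. `disallowed_calls` is a hypothetical helper.

```python
from typing import List


def disallowed_calls(trace: List[str], allowed: List[str]) -> List[str]:
    """Return every call that is not named in the allowlist."""
    allowed_set = set(allowed)
    return [tool for tool in trace if tool not in allowed_set]


allowed = ["GetCustomer", "UpdateCustomer", "SendEmail"]
print(disallowed_calls(["GetCustomer", "UpdateCustomer"], allowed))  # []
print(disallowed_calls(["GetCustomer", "DeleteCustomer"], allowed))  # ['DeleteCustomer']
```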

`max_calls`: Call Frequency Limit

Limit how many times a tool can be called (formerly `count`).

rules:
  - type: max_calls
    tool: SendEmail
    max: 3

Trace	Result
`[SendEmail, SendEmail]`	✅ Pass
`[SendEmail, SendEmail, SendEmail, SendEmail]`	❌ Fail
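
A Python sketch of the counting logic. It assumes the limit is inclusive, so exactly `max` calls still pass. The table above only shows two and four calls, so that boundary is an assumption. `check_max_calls` is a hypothetical name.

```python
from typing import List


def check_max_calls(trace: List[str], tool: str, limit: int) -> bool:
    """Pass when `tool` is called at most `limit` times."""
    return trace.count(tool) <= limit


print(check_max_calls(["SendEmail", "SendEmail"], "SendEmail", 3))  # True
print(check_max_calls(["SendEmail"] * 4, "SendEmail", 3))  # False
```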

`eventually`: Temporal Deadline

A tool must be called within N steps of the start of the trace.

rules:
  - type: eventually
    tool: ValidateOutput
    within: 5

Trace	Result
`[Step1, ..., Step4, ValidateOutput]`	✅ Pass
`[Step1, ..., Step10]` (no validation)	❌ Fail
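
The deadline can be sketched in Python as a check on the first N calls. This assumes steps are counted from 1, so `within: 5` covers the first five calls, which is consistent with the passing trace above. `check_eventually` is a hypothetical name.

```python
from typing import List


def check_eventually(trace: List[str], tool: str, within: int) -> bool:
    """Pass when `tool` appears among the first `within` calls of the trace."""
    return tool in trace[:within]


on_time = ["Step1", "Step2", "Step3", "Step4", "ValidateOutput"]
never = [f"Step{i}" for i in range(1, 11)]
print(check_eventually(on_time, "ValidateOutput", 5))  # True
print(check_eventually(never, "ValidateOutput", 5))  # False
```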

`never_after`: Forbidden Transition

A forbidden tool must never be called after a trigger tool has been used.

rules:
  - type: never_after
    trigger: CommitTransaction
    forbidden: ModifyData

Trace	Result
`[ModifyData, CommitTransaction]`	✅ Pass
`[CommitTransaction, ModifyData]`	❌ Fail
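
In Python, this check can be sketched as a scan of everything after the first trigger call. `check_never_after` is a hypothetical helper, and passing when the trigger never fires is an assumption.

```python
from typing import List


def check_never_after(trace: List[str], trigger: str, forbidden: str) -> bool:
    """Pass when `forbidden` is never called after the first call to `trigger`."""
    if trigger not in trace:
        return True  # assumption: the rule only applies once the trigger fires
    after_trigger = trace[trace.index(trigger) + 1:]
    return forbidden not in after_trigger


print(check_never_after(["ModifyData", "CommitTransaction"], "CommitTransaction", "ModifyData"))  # True
print(check_never_after(["CommitTransaction", "ModifyData"], "CommitTransaction", "ModifyData"))  # False
```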

`after`: Dependent Deadline

Tool B must be called within N steps after Tool A is called.

rules:
  - type: after
    trigger: OpenFile
    then: CloseFile
    within: 10

Trace	Result
`[OpenFile, Read, CloseFile]`	✅ Pass
`[OpenFile, ...]` (10 more steps, no CloseFile)	❌ Fail
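
A Python sketch of the windowed check. It assumes the deadline applies to every call to `trigger` and that "within N steps" means the N calls that follow it. Neither detail is spelled out above, and `check_after` is a hypothetical name.

```python
from typing import List


def check_after(trace: List[str], trigger: str, then: str, within: int) -> bool:
    """Pass when each call to `trigger` is followed by `then` within `within` calls."""
    for i, tool in enumerate(trace):
        if tool == trigger:
            window = trace[i + 1 : i + 1 + within]  # the next `within` calls
            if then not in window:
                return False
    return True


print(check_after(["OpenFile", "Read", "CloseFile"], "OpenFile", "CloseFile", 10))  # True
print(check_after(["OpenFile"] + ["Read"] * 10, "OpenFile", "CloseFile", 10))  # False
```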

Combining Rules

Rules are evaluated with AND logic. All rules must pass.

tests:
  - id: customer_workflow
    metric: sequence_valid
    rules:
      # Must verify identity
      - type: require
        tool: VerifyIdentity

      # Must verify before any destructive action
      - type: before
        first: VerifyIdentity
        then: DeleteCustomer

      # Never call admin tools
      - type: blocklist
        tools: [admin_*]

      # Max 5 API calls
      - type: max_calls
        tool: ExternalAPI
        max: 5
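
The AND logic can be sketched in Python as "no rule may fail". This is an illustrative evaluator, not Assay's implementation. The names `CHECKS`, `failed_rules`, and `sequence_valid` are hypothetical, rules are plain dicts mirroring the YAML keys, and glob matching is assumed to follow `fnmatch`.

```python
from fnmatch import fnmatchcase
from typing import Any, Callable, Dict, List

Trace = List[str]
Rule = Dict[str, Any]


def _require(trace: Trace, rule: Rule) -> bool:
    return rule["tool"] in trace


def _before(trace: Trace, rule: Rule) -> bool:
    then = rule["then"]
    if then not in trace:
        return True  # assumption: vacuous pass when `then` is never called
    return rule["first"] in trace[: trace.index(then)]


def _blocklist(trace: Trace, rule: Rule) -> bool:
    return not any(fnmatchcase(t, p) for t in trace for p in rule["tools"])


def _max_calls(trace: Trace, rule: Rule) -> bool:
    return trace.count(rule["tool"]) <= rule["max"]


# Dispatch table: rule type -> check function (only four types sketched here).
CHECKS: Dict[str, Callable[[Trace, Rule], bool]] = {
    "require": _require,
    "before": _before,
    "blocklist": _blocklist,
    "max_calls": _max_calls,
}


def failed_rules(trace: Trace, rules: List[Rule]) -> List[str]:
    """Return the types of all rules the trace violates, in rule order."""
    return [r["type"] for r in rules if not CHECKS[r["type"]](trace, r)]


def sequence_valid(trace: Trace, rules: List[Rule]) -> bool:
    """AND logic: the trace is valid only when no rule fails."""
    return not failed_rules(trace, rules)


RULES = [
    {"type": "require", "tool": "VerifyIdentity"},
    {"type": "before", "first": "VerifyIdentity", "then": "DeleteCustomer"},
    {"type": "blocklist", "tools": ["admin_*"]},
    {"type": "max_calls", "tool": "ExternalAPI", "max": 5},
]

print(failed_rules(["GetCustomer", "DeleteCustomer"], RULES))  # ['require', 'before']
```

Collecting every failure, rather than stopping at the first, makes it possible to report all violated rules for a trace in one run.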

Error Messages

When a rule fails, Assay provides actionable feedback:

❌ FAIL: sequence_valid (verify_before_delete)

   Rule: before
   Expected: VerifyIdentity before DeleteCustomer
   Actual: DeleteCustomer called at position 2, but VerifyIdentity never called

   Trace:
     1. GetCustomer
     2. DeleteCustomer  ← violation
     3. SendEmail

   Suggestion: Add VerifyIdentity call before DeleteCustomer

Real-World Patterns

E-commerce: Payment Flow

rules:
  # Validate cart before checkout
  - type: before
    first: ValidateCart
    then: ProcessPayment

  # Verify inventory before charging
  - type: before
    first: CheckInventory
    then: ProcessPayment

  # Never refund more than once
  - type: max_calls
    tool: ProcessRefund
    max: 1

Healthcare: Data Access

rules:
  # Always authenticate
  - type: require
    tool: AuthenticateUser

  # Authenticate before any data access
  - type: before
    first: AuthenticateUser
    then: GetPatientRecord

  # Log all access
  - type: immediately_before
    first: GetPatientRecord
    then: LogAccess

  # No admin tools
  - type: blocklist
    tools: [admin_*, system_override]

Agent Handoffs: Multi-Agent

rules:
  # Router must run first
  - type: before
    first: RouterAgent
    then: [SpecialistA, SpecialistB, SpecialistC]

  # At most one call per specialist (repeat for SpecialistC)
  - type: max_calls
    tool: SpecialistA
    max: 1
  - type: max_calls
    tool: SpecialistB
    max: 1

Advanced: Conditional Rules

(Coming in v1.1)

rules:
  - type: before
    first: VerifyIdentity
    then: DeleteCustomer
    when:
      context.user_role: "standard"  # Only for non-admins

Migrating from v0

If you have old-style sequence configs:

assay migrate --config mcp-eval.yaml

This converts:

# Old format (v0)
sequences:
  - [GetCustomer, UpdateCustomer]

To:

# New format (v1)
rules:
  - type: before
    first: GetCustomer
    then: UpdateCustomer

Best Practices

1. Start Simple

Begin with `blocklist` and `require`, then add `before` rules.

rules:
  - type: blocklist
    tools: [admin_*, dangerous_*]
  - type: require
    tool: Authenticate

2. Use Descriptive IDs

tests:
  - id: auth_before_data_access  # ✅ Clear
  - id: test_1                   # ❌ Unclear

3. Keep Rules Focused

One rule per concern. Don't combine unrelated checks.

4. Test the Rules Themselves

Create traces that should fail, and confirm that your rules catch the violations.
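
A table-driven harness is one way to do this. The Python sketch below pairs each trace with the verdict it should get and reports any disagreement. `find_mismatches` and `verify_before_delete` are hypothetical names, and the check is a stand-in for the Quick Example rule rather than Assay's own code.

```python
from typing import Callable, List, Tuple

Trace = List[str]
Case = Tuple[Trace, bool]  # (trace, expected_pass)


def find_mismatches(check: Callable[[Trace], bool], cases: List[Case]) -> List[Trace]:
    """Return the traces where the check's verdict differs from the expectation."""
    return [trace for trace, expected in cases if check(trace) != expected]


def verify_before_delete(trace: Trace) -> bool:
    """Stand-in check: VerifyIdentity must precede the first DeleteCustomer."""
    if "DeleteCustomer" not in trace:
        return True
    return "VerifyIdentity" in trace[: trace.index("DeleteCustomer")]


cases: List[Case] = [
    (["VerifyIdentity", "DeleteCustomer"], True),             # should pass
    (["GetCustomer", "DeleteCustomer", "SendEmail"], False),  # should fail
    (["DeleteCustomer", "VerifyIdentity"], False),            # should fail
]

print(find_mismatches(verify_before_delete, cases))  # []
```

The failing traces are what give the suite its value. A check that always passes would sail through the first case and only be exposed by the other two.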

Reference

Rule Type	Required Fields	Optional Fields
`require`	`tool`	-
`before`	`first`, `then`	-
`immediately_before`	`first`, `then`	-
`blocklist`	`tools`	-
`allowlist`	`tools`	-
`max_calls`	`tool`, `max`	-
`eventually`	`tool`, `within`	-
`never_after`	`trigger`, `forbidden`	-
`after`	`trigger`, `then`, `within`	-
`sequence`	`tools`	`strict`