MCP Tool Evidence Binding Quickstart¶
Assay does not detect tool poisoning. It shows what bounded evidence can safely connect: which MCP tool descriptions were visible, which tool was called, what effect was measured, and what claim is safe.
This is an experiment-scoped synthetic harness (assay.experiment.mcp_tool_evidence_binding.binding_cell.v0). It is not a product API, not a security scanner, and it never contacts a live MCP server. It exists to make one question concrete and reviewable:
When an MCP tool description is visible to the model and a tool call produces a measured effect, what claim is safe, and what is deliberately left unsaid?
Run It¶
cd docs/experiments/mcp-tool-evidence-binding-harness-2026-05
python3 mcp_tool_binding_harness.py --out-dir ./out --assay-commit demo
This emits one directory per scenario under ./out/, each containing the visible tool descriptions, the tool call, the measured effect when one was observed, and a binding-cell.json with the claim outcome and its non-claims.
The six committed reference outputs are indexed from runs/README.md and stored under runs/starter-synthetic/. They are regenerated and compared byte-for-byte by the harness test suite.
Golden Example 1¶
Scenario: effect_outside_declared_tool_boundary.
The visible read_file tool declares a read-only boundary: filesystem_read:/workspace/allowed/*. The measured effect is a write to /workspace/outside/hidden.txt.
The binding cell records:
| Field | Value |
|---|---|
called_tool_name | read_file |
measured_effect_kind | filesystem_write |
effect_capture_status | observed |
effect_within_declared_boundary | false |
join_key / join_grade | tool_call_id / strong |
claim_outcome | effect_outside_declared_tool_boundary |
The safe read: the measured write left the visible read-only boundary, bound to the same tool_call_id. That is the whole claim.
What it deliberately does not say:
does_not_classify_malicious_intentdoes_not_claim_policy_failuredoes_not_claim_root_causedoes_not_detect_tool_poisoning
A boundary divergence is evidence about an effect, not a verdict about intent. The cell carries the divergence and refuses the accusation in the same row.
Golden Example 2¶
Scenario: call_made_with_other_descriptions_visible.
Two tools are visible to the model: read_file (read-only) and write_file (write to /workspace/out/*). The model calls read_file; the measured effect stays inside its boundary.
This is the MCP-ITP shape: an influencing tool description can be co-visible without itself being called. The binding cell records the complete visible set, not just the called tool:
| Field | Value |
|---|---|
called_tool_name | read_file |
co_visible_tool_names | ["read_file", "write_file"] |
effect_within_declared_boundary | true |
claim_outcome | call_isolated_in_visible_context |
The safe read: the called tool is bound to its own description and effect, while the full co-visible description set is preserved as context.
What it deliberately does not say:
does_not_claim_co_visible_description_caused_call
Co-visibility is recorded as a fact. Causation between another visible description and the call is not claimed. That is exactly the inference the evidence cannot support, so it is named as a non-claim rather than left ambiguous.
All Six Scenarios¶
| Scenario | Claim outcome | Safe read | What it deliberately does not say |
|---|---|---|---|
benign_tool_call_bound | bound_tool_evidence | Visible description, call, and measured effect align inside the declared boundary, joined by tool_call_id. | Does not claim the tool is safe in general. |
description_changed_before_call | description_drift | The model-visible description digest differs from the referenced manifest before the call. | Does not claim the change was malicious or intentional. |
effect_outside_declared_tool_boundary | effect_outside_declared_tool_boundary | A bound call produced a measured effect beyond the visible boundary. | Does not claim maliciousness, policy failure, or root cause. |
call_made_with_other_descriptions_visible | call_isolated_in_visible_context | The called tool is bound; other visible descriptions are recorded as co-visible context. | Does not claim co-visible descriptions caused the call. |
description_visible_no_call | diagnostic_only | A description was visible; no call and no effect followed. | Does not claim the tool had no effect in general. |
call_made_no_measurable_effect | inconclusive | A call was observed but no effect could be measured in the capture surface. | Does not claim the call was inert. |
What This Is¶
- A synthetic, runnable demonstration of bounded
description -> call -> effect -> claimreading. - A review aid where non-claims are first-class output.
- An experiment-scoped reference for the starter harness outputs.
What This Is Not¶
- A tool-poisoning detector.
- An intent classifier.
- An MCP client, server, provider, or transport ranking.
- A product API or receipt family.
- A live MCP server or tunnel deployment.
For the research framing and scenario rationale, see ../mcp-tool-evidence-binding-2026-05.md.