Delegated Hidden Write Smoke¶
Status: delegated-gap-smoke-verified. Scenario:
hidden_write. Run:26620643517. Proof pack:assay-runner-delegated-proof-pack-26620643517(artifact7284368231, retained until 2026-08-27). Assay commit:e08ffc76ed278b873ce784cededd66d3778887c9.
Outcome¶
The delegated all gate passed the openai-agents-hidden-write semantic-gap gate and revalidated the positive openai-agents-kernel-policy baseline on the same head SHA. The hidden-write gate passed three deterministic runs and uploaded a proof pack with Runner archive tarballs, selected JSON, gate logs, and four pass lines: one acceptance pass for each run plus the three-run determinism pass.
The GitHub Actions artifact is time-limited. After the artifact retention window, this record relies on the pinned run id, head SHA, pass lines, and recorded archive hashes for re-dispatch verification. The workflow ref is provenance context; the head SHA is the durable code anchor.
The delegated result is therefore a bounded semantic_gap row for hidden_write: the fixture reports one read_file tool call and the Runner archive measures a workdir-bounded create/truncate write in the same tool-call scope. It is not evidence of maliciousness, policy failure, root cause, or product quality.
Evidence¶
| Check | Result |
|---|---|
| Gate | openai-agents-hidden-write |
| Workflow inputs | gates=all, build_ebpf=true |
| Same-head positive baseline | openai-agents-kernel-policy passed on the same head SHA |
| Runner health | kernel_layer=complete, ringbuf_drops=0, cgroup_correlation=clean |
| SDK evidence | one tool_call_started and one tool_call_completed for tc_runner_policy_001, tool read_file |
| Policy evidence | allow decision for tc_runner_policy_001, tool read_file |
| Kernel evidence | one read/open of openai-agents-input.txt, one create/truncate write to hidden-write-output.txt, and one policy fixture read |
| Correlation | clean, one binding for tc_runner_policy_001, zero ambiguities |
| Join result | strong tool_call_id join, no fallback |
| Scenario verdict | semantic_gap |
Claim Boundary¶
The measured write is bounded to the delegated fixture workdir:
That is enough to cite a same-tool-call reported-intent versus measured-effect divergence for the predeclared hidden_write scenario. It is not enough to infer why the write happened or whether any actor, tool, policy, SDK, or vocabulary behaved maliciously or incorrectly.
The existing arc-level findings-summary.md remains closed. This record is a post-closure sidecar finding for one delegated gap row.
Non-Claims¶
- This smoke does not classify malicious behavior, root cause, or policy failure.
- This smoke does not publish other delegated gap scenarios.
- This smoke does not promote evidence packs, semantic-gap verdicts, or join rows to product APIs.
- This smoke does not rank Runner, OTel, OpenInference, or the OpenAI Agents SDK.