Runner Fidelity Verdict v0

Status: internal derived helper contract. This page defines the assay.runner.fidelity_verdict.v0 vocabulary and claim gate derived from assay.runner.observation_health.v0. It does not wire the verdict into Runner archives, CLI output, Trust Basis, or any stable public report schema.

Purpose

Runner observation health already records the raw measurement-health signals:

kernel_layer
ringbuf_drops
policy_layer
sdk_layer
cgroup_correlation
network_protocol_coverage
network_endpoint_claim_scope

Those fields are the source of truth. The fidelity verdict is a small derived read-model that answers one narrower question:

Which Runner measured-effect claims may downstream projection/report code
interpret from this observation health record?

It is a gate before interpretation, not a new capture layer.

Schema String

assay.runner.fidelity_verdict.v0

The current Rust helper lives in assay-runner-schema. The schema string is an internal helper/contract label at this stage; no JSON Schema sidecar or archive member is frozen yet.

Vocabulary Boundary

Do not confuse this verdict with the experiment-scoped observability calibration agreement vocabulary.

| Surface | Scope | Vocabulary | Meaning |
| --- | --- | --- | --- |
| Observability calibration | Requested signal vs observed retained signal | `match`, `clipped`, `drift`, `failed`, `not_applicable` | Did a requested comparison/retention target match, clip, drift, fail, or fall outside the measurement surface? |
| Runner fidelity verdict | One Runner `observation_health.v0` record | `clean`, `clipped`, `correlation_partial`, `failed`, `not_applicable` | May measured-effect claims be interpreted from this run's measurement health? |

The calibration vocabulary uses match and drift because it compares a requested target with an observed artifact. Runner fidelity deliberately does not use those words: a health record can be clean without proving the workload content "matched" anything, and drift is a cross-state comparison, not per-run measurement quality.

The shared words clipped, failed, and not_applicable are allowed because they describe measurement states in both contexts. Consumers must still use the schema namespace to decide which semantics apply.
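The namespace rule above can be sketched as a parser that refuses to interpret a verdict word unless the schema string is the fidelity one. This is an illustration only: `FidelityVerdict` and `parse_fidelity_verdict` are hypothetical names, not the actual `assay-runner-schema` API.

```rust
/// Schema string taken from this page.
pub const FIDELITY_SCHEMA: &str = "assay.runner.fidelity_verdict.v0";

/// Hypothetical enum mirroring the fidelity vocabulary listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FidelityVerdict {
    Clean,
    Clipped,
    CorrelationPartial,
    Failed,
    NotApplicable,
}

/// Interpret `verdict` as a fidelity verdict only when `schema` names the
/// fidelity namespace. Shared words such as "clipped" are rejected under any
/// other schema, and calibration-only words ("match", "drift") are never
/// accepted here.
pub fn parse_fidelity_verdict(schema: &str, verdict: &str) -> Option<FidelityVerdict> {
    if schema != FIDELITY_SCHEMA {
        return None;
    }
    match verdict {
        "clean" => Some(FidelityVerdict::Clean),
        "clipped" => Some(FidelityVerdict::Clipped),
        "correlation_partial" => Some(FidelityVerdict::CorrelationPartial),
        "failed" => Some(FidelityVerdict::Failed),
        "not_applicable" => Some(FidelityVerdict::NotApplicable),
        _ => None,
    }
}
```

Returning `None` for a foreign schema, rather than guessing from the word alone, keeps `clipped` from being read with the wrong semantics.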

Verdicts

| Verdict | Meaning |
| --- | --- |
| `clean` | Kernel measurement is available, no ring-buffer drops are reported, and cgroup correlation is clean. |
| `clipped` | Measurement ran, but known event loss or partial kernel capture blocks absence/bounded-negative claims. |
| `correlation_partial` | Measurement exists, but the binding between observed effects and the run/cgroup/tool boundary is incomplete. |
| `failed` | The health record is invalid for measured Runner claims, or cgroup correlation failed. |
| `not_applicable` | The platform or layer does not provide the measured kernel-effect surface. Reported claims may still exist, but they are not measured kernel-effect claims. |

Claim Gate

The helper emits a claim_gate with explicit decisions for every verdict:

| Verdict | `reported_claims` | `measured_positive_claims` | `bounded_negative_claims` | `per_binding_claims` |
| --- | --- | --- | --- | --- |
| `clean` | `allowed` | `allowed` | `allowed` | `allowed` |
| `clipped` | `allowed` | `degraded` | `blocked` | `allowed` |
| `correlation_partial` | `allowed` | `degraded` | `blocked` | `blocked` |
| `not_applicable` | `allowed` | `blocked` | `blocked` | `blocked` |
| `failed` | `blocked` | `blocked` | `blocked` | `blocked` |

failed blocks claims authorized by this fidelity verdict. It does not erase separately validated SDK, trace, or external receipt artifacts; it means this Runner health record cannot safely authorize those claims.

The load-bearing rule is:

ringbuf_drops > 0 => clipped => bounded_negative_claims = blocked

Observed positive events can remain useful under clipped; missing events cannot prove absence. This is why clipped degrades positive measured claims but blocks bounded negative claims.
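The gate table and the load-bearing rule can be read as a single lookup. The sketch below transcribes the table row by row; the type and function names (`Verdict`, `Decision`, `ClaimGate`, `claim_gate`) are assumptions for illustration and may not match the real helper. It covers only the base table, not the stricter-gate adjustments applied when several degradations are present.

```rust
/// Hypothetical verdict enum mirroring the vocabulary on this page.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Clean,
    Clipped,
    CorrelationPartial,
    NotApplicable,
    Failed,
}

/// The three gate decisions used in the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Decision {
    Allowed,
    Degraded,
    Blocked,
}

/// One decision per claim kind, matching the table columns.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ClaimGate {
    pub reported_claims: Decision,
    pub measured_positive_claims: Decision,
    pub bounded_negative_claims: Decision,
    pub per_binding_claims: Decision,
}

/// Base gate for a verdict, copied from the claim gate table.
pub fn claim_gate(verdict: Verdict) -> ClaimGate {
    use Decision::{Allowed, Blocked, Degraded};
    // Tuple order: reported, measured positive, bounded negative, per binding.
    let (reported, positive, negative, binding) = match verdict {
        Verdict::Clean => (Allowed, Allowed, Allowed, Allowed),
        Verdict::Clipped => (Allowed, Degraded, Blocked, Allowed),
        Verdict::CorrelationPartial => (Allowed, Degraded, Blocked, Blocked),
        Verdict::NotApplicable => (Allowed, Blocked, Blocked, Blocked),
        Verdict::Failed => (Blocked, Blocked, Blocked, Blocked),
    };
    ClaimGate {
        reported_claims: reported,
        measured_positive_claims: positive,
        bounded_negative_claims: negative,
        per_binding_claims: binding,
    }
}
```

Under this sketch, `claim_gate(Verdict::Clipped)` keeps positive claims at `Degraded` while `bounded_negative_claims` is `Blocked`, which is the asymmetry the rule describes.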

clean is capture-health clean, not protocol-universal. A record may be clean while still declaring network_protocol_coverage = absent and network_endpoint_claim_scope = not_applicable when no network protocol events were observed.

A record with observed connect activity may instead declare network_protocol_coverage = connect_only and network_endpoint_claim_scope = diagnostic_only; downstream code must not stretch that into an exact QUIC/datagram peer-set claim.

When Runner observes sendto or sendmsg destination sockaddr events, it may upgrade network_protocol_coverage to datagram_peer_observed or connect_and_datagram_peer_observed. That is a stronger transport signal, but still not a request-level or exact peer-set binding while network_endpoint_claim_scope = diagnostic_only.

If network events may have been dropped before any network event was emitted, coverage is unknown.

Composition With Projection `claim_level`

Projection helpers such as path projection carry claim_level values like:

raw_observed
projected_equivalent
inconclusive

The fidelity verdict does not replace that vocabulary. It gates which projection claim levels may be interpreted as measured-effect claims.

Initial composition rule:

| Projection `claim_level` | Fidelity gate consulted |
| --- | --- |
| `raw_observed` | `claim_gate.measured_positive_claims` |
| `projected_equivalent` | `claim_gate.measured_positive_claims` |
| `inconclusive` | Allowed to remain inconclusive unless the verdict is `failed` |
| unknown value | Blocked |

If a projection or report wants to make a bounded-negative claim, it must also consult claim_gate.bounded_negative_claims. If it wants to bind a claim to a specific run/cgroup/tool identity, it must also consult claim_gate.per_binding_claims.

This keeps claim_gate as a guardrail over existing claim levels instead of introducing a second independent claim hierarchy.
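A minimal sketch of the composition rule, under stated assumptions: `ClaimRequest` and `decide` are hypothetical names, `inconclusive` under a `failed` verdict is treated as blocked (the page only says it may not remain allowed), and the extra gates are combined by taking the strictest decision.

```rust
/// Decisions ordered from least to most strict, so `max` picks the stricter.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Decision {
    Allowed,
    Degraded,
    Blocked,
}

/// Gate fields as named on this page.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ClaimGate {
    pub reported_claims: Decision,
    pub measured_positive_claims: Decision,
    pub bounded_negative_claims: Decision,
    pub per_binding_claims: Decision,
}

/// Hypothetical description of what a projection or report wants to assert.
pub struct ClaimRequest<'a> {
    pub claim_level: &'a str,
    pub bounded_negative: bool,
    pub per_binding: bool,
}

/// Resolve a request against the gate using the initial composition rule.
pub fn decide(req: &ClaimRequest<'_>, verdict_is_failed: bool, gate: &ClaimGate) -> Decision {
    let mut decision = match req.claim_level {
        "raw_observed" | "projected_equivalent" => gate.measured_positive_claims,
        "inconclusive" if !verdict_is_failed => Decision::Allowed,
        // Unknown claim levels, and `inconclusive` under `failed`, are blocked.
        _ => Decision::Blocked,
    };
    // Additional gates apply only when the claim asks for them.
    if req.bounded_negative {
        decision = decision.max(gate.bounded_negative_claims);
    }
    if req.per_binding {
        decision = decision.max(gate.per_binding_claims);
    }
    decision
}
```

With a `clipped` gate, a plain `raw_observed` request resolves to `Degraded`, while the same request with `bounded_negative` set resolves to `Blocked`.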

Composition With Coverage Descriptors

fidelity_verdict.v0 gates capture health. coverage_descriptor.v0 gates the per-dimension visibility ceiling. Consumers must apply both when interpreting measured side effects.

Example:

A clean health record may allow bounded_negative_claims at the fidelity layer.
A filesystem descriptor with completeness = open_syscall_only still blocks a bounded-negative file-activity claim because io_uring or mmap-backed writes can hide effects from the capture method.

So the combined decision is the stricter of:

fidelity claim gate
coverage descriptor claim-kind decision

This keeps positive observed effects useful while preventing a clean capture-health verdict from becoming a universal absence or completeness claim.
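One way to express "the stricter of" is an ordered decision type combined with `max`. This sketch assumes the coverage descriptor reports its claim-kind decision in the same three-value vocabulary as the fidelity gate, which this page does not confirm; `combined_decision` is a hypothetical name.

```rust
/// Decisions ordered from least to most strict; derived `Ord` follows
/// declaration order, so `Allowed < Degraded < Blocked`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Decision {
    Allowed,
    Degraded,
    Blocked,
}

/// Combine the fidelity gate decision with the coverage descriptor decision
/// for the same claim kind by keeping whichever is stricter.
pub fn combined_decision(fidelity: Decision, coverage: Decision) -> Decision {
    fidelity.max(coverage)
}
```

For the filesystem example above, a fidelity decision of `Allowed` combined with a coverage decision of `Blocked` yields `Blocked`.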

Derived Shape

Example clean verdict:

{
  "schema": "assay.runner.fidelity_verdict.v0",
  "source_schema": "assay.runner.observation_health.v0",
  "run_id": "run-001",
  "verdict": "clean",
  "claim_gate": {
    "reported_claims": "allowed",
    "measured_positive_claims": "allowed",
    "bounded_negative_claims": "allowed",
    "per_binding_claims": "allowed"
  },
  "reasons": [
    {
      "field": "observation_health",
      "observed": "clean",
      "rule": "complete_kernel_layer_zero_drops_clean_cgroup_correlation"
    }
  ],
  "non_claims": [
    "fidelity_no_observation_health_replacement",
    "fidelity_no_policy_correctness_verdict",
    "fidelity_no_runtime_safety_verdict",
    "fidelity_no_agent_quality_score",
    "fidelity_no_probabilistic_confidence_score"
  ]
}

Classification Rules

The v0 helper derives the verdict from a single ObservationHealth value and nothing else:

Invalid observation-health records or cgroup_correlation = failed produce failed.
Non-Linux platforms or absent kernel layers produce not_applicable.
ringbuf_drops > 0 or kernel_layer = partial_ringbuf_drops produces clipped.
cgroup_correlation = partial produces correlation_partial.
kernel_layer = complete, ringbuf_drops = 0, and cgroup_correlation = clean produces clean.

The helper intentionally does not upgrade or reinterpret network_protocol_coverage or network_endpoint_claim_scope. Those fields remain an explicit honesty boundary on top of the capture-health verdict.

If more than one degradation is present, the helper preserves specific reasons and applies the stricter gate where needed. For example, partial cgroup correlation blocks per-binding claims even when the top-level verdict is driven by clipping.
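The rules above can be sketched as an ordered chain of checks. Several things here are assumptions rather than the real helper: the struct fields `valid` and `is_linux`, the `Absent` variant name, the function names, and the reading of the listed order as evaluation precedence (the clipping example suggests clipping outranks partial correlation for the top-level verdict).

```rust
/// Hypothetical subset of kernel_layer values mentioned on this page.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum KernelLayer {
    Complete,
    PartialRingbufDrops,
    Absent,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CgroupCorrelation {
    Clean,
    Partial,
    Failed,
}

/// Simplified stand-in for an observation_health.v0 record.
#[derive(Debug, Clone, Copy)]
pub struct ObservationHealth {
    pub valid: bool,    // assumed: record passed validation
    pub is_linux: bool, // assumed: platform flag
    pub kernel_layer: KernelLayer,
    pub ringbuf_drops: u64,
    pub cgroup_correlation: CgroupCorrelation,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Clean,
    Clipped,
    CorrelationPartial,
    NotApplicable,
    Failed,
}

/// Apply the classification rules in the order they are listed.
pub fn classify(h: &ObservationHealth) -> Verdict {
    if !h.valid || h.cgroup_correlation == CgroupCorrelation::Failed {
        return Verdict::Failed;
    }
    if !h.is_linux || h.kernel_layer == KernelLayer::Absent {
        return Verdict::NotApplicable;
    }
    if h.ringbuf_drops > 0 || h.kernel_layer == KernelLayer::PartialRingbufDrops {
        return Verdict::Clipped;
    }
    if h.cgroup_correlation == CgroupCorrelation::Partial {
        return Verdict::CorrelationPartial;
    }
    // Remaining case: complete kernel layer, zero drops, clean correlation.
    Verdict::Clean
}

/// Per-binding claims need clean correlation even when the top-level verdict
/// is driven by clipping.
pub fn per_binding_allowed(h: &ObservationHealth) -> bool {
    matches!(classify(h), Verdict::Clean | Verdict::Clipped)
        && h.cgroup_correlation == CgroupCorrelation::Clean
}
```

A record with `ringbuf_drops = 3` and partial correlation classifies as `Clipped` here, yet `per_binding_allowed` returns `false`, matching the stricter-gate example.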

Non-Claims

The verdict does not replace observation_health.v0.
The verdict does not prove policy correctness.
The verdict does not prove runtime safety.
The verdict does not score agent quality.
The verdict does not provide probabilistic confidence.
The verdict does not validate reported traces, spans, tool calls, or SDK events as true.
The verdict does not convert connect-only network capture into an exact datagram/QUIC peer-binding claim.

Wiring Boundary

This slice adds only the contract and helper. It intentionally does not:

add a Runner archive member;
add CLI output;
add Trust Basis claims;
add capability-diff gating;
add cross-runtime report wiring;
add path/network projection Slice 2 or Slice 3.

Report wiring should wait for a concrete consumer or review surface that needs the derived verdict.