Data Flow

This page describes the primary data flows through the Assay system.

Core Pipeline: Trace → Gate → Evidence

flowchart LR
    T[Trace File<br>JSONL] --> R[Runner<br>Engine]
    P[Policy<br>YAML] --> R
    C[Config<br>eval.yaml] --> R
    R --> |per test| M[Metrics<br>Evaluation]
    M --> V[Verdict<br>Pass/Fail/Warn]
    V --> O[Outputs]
    O --> CON[Console]
    O --> JSON[run.json<br>summary.json]
    O --> SARIF[SARIF]
    O --> JUNIT[JUnit XML]
    R --> E[Evidence<br>Export]
    E --> B[Bundle<br>.tar.gz]

Trace Ingestion

Read: read_events() parses the JSONL trace file line by line
Aggregate: aggregate() groups tool calls by name and computes per-tool statistics
Evaluate: Each test applies its metric evaluator to the aggregated data
Gate: The exit code is determined by the pass/fail counts and the --strict mode
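
The Aggregate and Gate steps above can be sketched roughly as follows. This is a minimal illustration, not Assay's actual implementation: the `ToolCall`, `ToolStats`, `aggregate_calls`, and `exit_code` names are hypothetical, JSONL parsing is skipped (events are assumed to be already parsed), and the rule that `--strict` turns warnings into a failing exit code is an assumption.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified trace event: one tool call and whether it succeeded.
#[derive(Debug, Clone)]
pub struct ToolCall {
    pub tool: String,
    pub ok: bool,
}

/// Per-tool statistics produced by the aggregation step (illustrative fields).
#[derive(Debug, Default, PartialEq)]
pub struct ToolStats {
    pub calls: u32,
    pub failures: u32,
}

/// Group tool calls by name, counting total calls and failed calls.
pub fn aggregate_calls(events: &[ToolCall]) -> BTreeMap<String, ToolStats> {
    let mut stats: BTreeMap<String, ToolStats> = BTreeMap::new();
    for event in events {
        let entry = stats.entry(event.tool.clone()).or_default();
        entry.calls += 1;
        if !event.ok {
            entry.failures += 1;
        }
    }
    stats
}

/// Per-test verdict, mirroring the Pass/Fail/Warn node in the flowchart.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Verdict {
    Pass,
    Fail,
    Warn,
}

/// Assumed gating rule: any Fail yields exit code 1;
/// a Warn yields exit code 1 only when `strict` is set.
pub fn exit_code(verdicts: &[Verdict], strict: bool) -> i32 {
    let failed = verdicts.iter().any(|v| *v == Verdict::Fail);
    let warned = verdicts.iter().any(|v| *v == Verdict::Warn);
    if failed || (strict && warned) {
        1
    } else {
        0
    }
}
```

Under these assumptions, a run with one passing and one warning test exits 0 by default and 1 when strict mode is enabled.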

Evidence Pipeline

Collect: ProfileCollector gathers events during a run (OTel Collector pattern)
Map: EvidenceMapper transforms each event into an EvidenceEvent (CloudEvents v1.0 envelope)
Export: assay evidence export creates a content-addressed, JCS-canonicalized bundle
Verify: assay evidence verify checks bundle integrity offline (SHA-256 manifests)
Lint: assay evidence lint scans for compliance findings (SARIF output)
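
To make the Map step more concrete, the sketch below wraps a collected event in an envelope that carries the four context attributes CloudEvents v1.0 requires (`specversion`, `id`, `source`, `type`). Everything else is an assumption for illustration: the `CollectedEvent` and `Envelope` types, the `map_event` function, the `urn:example:` source scheme, and the `example.` type prefix are hypothetical and do not describe Assay's real `EvidenceEvent` layout.

```rust
/// Hypothetical raw event as a collector might record it during a run.
#[derive(Debug, Clone)]
pub struct CollectedEvent {
    pub run_id: String,
    pub seq: u64,
    pub kind: String,    // e.g. "tool.call"
    pub payload: String, // already-serialized JSON payload
}

/// Illustrative envelope with the required CloudEvents v1.0 attributes plus data.
#[derive(Debug, Clone, PartialEq)]
pub struct Envelope {
    pub specversion: &'static str,
    pub id: String,
    pub source: String,
    pub event_type: String, // serialized as "type" in CloudEvents
    pub datacontenttype: &'static str,
    pub data: String,
}

/// Map a collected event to an envelope.
/// The id combines run id and sequence number so it is unique per source.
pub fn map_event(event: &CollectedEvent) -> Envelope {
    Envelope {
        specversion: "1.0",
        id: format!("{}:{}", event.run_id, event.seq),
        source: format!("urn:example:run:{}", event.run_id),
        event_type: format!("example.{}", event.kind),
        datacontenttype: "application/json",
        data: event.payload.clone(),
    }
}
```

Deriving `id` deterministically from the run and sequence number, rather than generating a random identifier, is one way to keep exported output reproducible, which matters if the bundle is content-addressed.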

Generate / Profile Pipeline

Ingest: Parse trace events via ingest::read_events()
Aggregate: Count tool calls, compute statistics via ingest::aggregate()
Classify: Wilson lower-bound scoring via profile::classify_entry()
Output: Generate policy.yaml with allow/review/deny sections
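
The Classify step relies on the Wilson score interval, whose lower bound gives a conservative estimate of a proportion when the sample is small. The sketch below shows the standard formula and one possible mapping onto the allow/review/deny sections. The function names, the meaning of `observed` and `total`, the z-value of 1.96, and the 0.8 and 0.3 thresholds are all assumptions chosen for illustration; the values used by `profile::classify_entry()` may differ.

```rust
/// Lower bound of the Wilson score interval for `observed` out of `total`
/// trials, at the confidence level implied by `z` (1.96 is roughly 95%).
pub fn wilson_lower_bound(observed: u32, total: u32, z: f64) -> f64 {
    if total == 0 {
        return 0.0; // no data: the most conservative estimate
    }
    let n = total as f64;
    let p = observed as f64 / n;
    let z2 = z * z;
    let centre = p + z2 / (2.0 * n);
    let margin = z * (p * (1.0 - p) / n + z2 / (4.0 * n * n)).sqrt();
    (centre - margin) / (1.0 + z2 / n)
}

/// Policy section an entry is assigned to.
#[derive(Debug, PartialEq)]
pub enum Section {
    Allow,
    Review,
    Deny,
}

/// Classify an entry by its Wilson lower bound.
/// Thresholds are illustrative assumptions, not Assay's defaults.
pub fn classify(observed: u32, total: u32) -> Section {
    let score = wilson_lower_bound(observed, total, 1.96);
    if score >= 0.8 {
        Section::Allow
    } else if score >= 0.3 {
        Section::Review
    } else {
        Section::Deny
    }
}
```

With these thresholds, an entry observed in 100 of 100 runs lands in allow, while one observed in 5 of 5 runs only reaches review: the raw rate is the same, but the lower bound penalizes the smaller sample.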
