CI Integration
Add Assay to your CI/CD pipeline for zero-flake AI agent testing.
Why CI Integration?
Traditional approach:
With Assay:
GitHub Actions
Using the Assay Action (Recommended)
# .github/workflows/assay.yml
name: AI Agent Security
on:
  push:
    branches: [main]
  pull_request:
permissions:
  contents: read
  security-events: write
  pull-requests: write
jobs:
  assay:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run tests with Assay
        run: |
          curl -fsSL https://getassay.dev/install.sh | sh
          assay ci --config ci-eval.yaml --trace-file traces/ci.jsonl --sarif .assay/reports/sarif.json --junit .assay/reports/junit.xml
      - name: Verify AI agent behavior
        uses: Rul1an/assay-action@v2
        with:
          fail_on: error
Use the canonical public slug Rul1an/assay-action@v2 (Marketplace) in your workflows. The local path ./assay-action is reserved for internal monorepo contract tests and works only inside this repository.
Action Inputs
| Input | Description | Default |
|---|---|---|
| bundles | Glob pattern for evidence bundles | Auto-detect |
| fail_on | Fail threshold: error, warn, info, none | error |
| sarif | Upload to GitHub Security tab | true |
| comment_diff | Post PR comment (only if findings) | true |
| baseline_key | Key for baseline comparison | - |
| write_baseline | Save baseline (main branch only) | false |
Action Outputs
| Output | Description |
|---|---|
| verified | true if all bundles verified |
| findings_error | Count of error-level findings |
| findings_warn | Count of warning-level findings |
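Later steps can read these outputs once the action step has an id. A minimal sketch, assuming the step id `assay` (a name chosen here for illustration) and the output names from the table above:

```yaml
steps:
  - uses: actions/checkout@v4
  - name: Verify AI agent behavior
    id: assay  # hypothetical step id, required to reference the outputs below
    uses: Rul1an/assay-action@v2
  - name: Summarize findings
    if: always()  # run even when the verification step fails the job
    run: |
      echo "verified: ${{ steps.assay.outputs.verified }}"
      echo "errors:   ${{ steps.assay.outputs.findings_error }}"
      echo "warnings: ${{ steps.assay.outputs.findings_warn }}"
```

The `if: always()` condition is what keeps the summary visible on failing runs, which is usually when the counts matter most.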
SARIF Integration (Automatic)
The action automatically uploads SARIF results to GitHub Code Scanning. Findings appear in the Security tab and inline in PR diffs.
No manual SARIF upload step is needed; just grant the security-events: write permission.
GitLab CI
# .gitlab-ci.yml
stages:
  - test
assay:
  stage: test
  image: rust:latest
  before_script:
    - cargo install assay
  script:
    - assay ci --config eval.yaml --trace-file traces/golden.jsonl --junit .assay/reports/junit.xml
    - assay ci --config eval.yaml --trace-file traces/golden.jsonl --sarif .assay/reports/sarif.json
  artifacts:
    reports:
      junit: .assay/reports/junit.xml
    when: always
GitLab Security Report (SARIF)
assay:
  script:
    - assay ci --config eval.yaml --trace-file traces/golden.jsonl --sarif .assay/reports/sarif.json
  artifacts:
    paths:
      - .assay/reports/sarif.json
Azure Pipelines
# azure-pipelines.yml
trigger:
  - main
pool:
  vmImage: 'ubuntu-latest'
steps:
  - script: cargo install assay
    displayName: 'Install Assay'
  - script: assay ci --config eval.yaml --trace-file traces/golden.jsonl --strict --junit .assay/reports/junit.xml
    displayName: 'Run Assay Tests'
  - task: PublishTestResults@2
    inputs:
      testResultsFormat: 'JUnit'
      testResultsFiles: '.assay/reports/junit.xml'
    condition: always()
CircleCI
# .circleci/config.yml
version: 2.1
jobs:
  assay:
    docker:
      - image: rust:latest
    steps:
      - checkout
      - run:
          name: Install Assay
          command: cargo install assay
      - run:
          name: Run Tests
          command: assay ci --config eval.yaml --trace-file traces/golden.jsonl --strict --junit .assay/reports/junit.xml
      - store_test_results:
          path: .assay/reports
workflows:
  version: 2
  test:
    jobs:
      - assay
Jenkins
// Jenkinsfile
pipeline {
    agent any
    stages {
        stage('Install Assay') {
            steps {
                sh 'cargo install assay'
            }
        }
        stage('Run Tests') {
            steps {
                sh 'assay ci --config eval.yaml --trace-file traces/golden.jsonl --junit .assay/reports/junit.xml'
            }
        }
    }
    post {
        always {
            junit '.assay/reports/junit.xml'
        }
    }
}
Docker-Based CI
For environments without Rust:
# Any CI system
steps:
  - run: |
      docker run --rm \
        -v "$(pwd)":/workspace \
        ghcr.io/rul1an/assay:latest \
        run --config /workspace/eval.yaml --strict
Best Practices
1. Store Golden Traces in Git
your-repo/
├── eval.yaml
├── policies/
│   └── discount.yaml
└── traces/
    └── golden.jsonl   # ← Commit this
2. Use fail_on for Strict Mode
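The fail_on input sets the severity at which the action fails the job. A sketch of a stricter gate, assuming that lowering the threshold to warn also fails the job on warning-level findings (inferred from the Action Inputs table, not verified against the action's source):

```yaml
- name: Verify AI agent behavior
  uses: Rul1an/assay-action@v2
  with:
    fail_on: warn  # stricter than the default threshold (error)
```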
3. Cache Cargo Installation
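Caching the installed binary avoids rebuilding Assay on every run. A minimal sketch for GitHub Actions, assuming cargo install places the binary at ~/.cargo/bin/assay (the Cargo default location) and using an illustrative cache key and step id:

```yaml
steps:
  - uses: actions/checkout@v4
  - name: Cache Assay binary
    id: cache-assay  # hypothetical step id
    uses: actions/cache@v4
    with:
      path: ~/.cargo/bin/assay  # assumed install location of the binary
      key: assay-bin-${{ runner.os }}-v1  # illustrative key; bump the suffix to force a reinstall
  - name: Install Assay
    if: steps.cache-assay.outputs.cache-hit != 'true'  # skip the build on a cache hit
    run: cargo install assay
```

Because the key does not track the Assay version, a new release is picked up only after the key changes.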
4. Run on Relevant Changes Only
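Path filters keep the job from running on unrelated changes. A sketch using GitHub Actions paths filters; the paths mirror the example repository layout under "Store Golden Traces in Git" and are assumptions about where your config, policies, and traces live:

```yaml
on:
  pull_request:
    paths:
      - "eval.yaml"                    # Assay config
      - "policies/**"                  # policy definitions
      - "traces/**"                    # golden traces
      - ".github/workflows/assay.yml"  # the workflow itself
```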
5. Separate Fast and Slow Tests
jobs:
  assay:
    # Evidence verification (fast)
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: Rul1an/assay-action@v2
  integration:
    needs: assay
    # Real LLM tests (slow), run only if Assay passes
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pytest tests/integration
Debugging CI Failures
View Detailed Output
Download Artifacts
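Uploading the report directory makes the SARIF and JUnit files downloadable from the workflow run page. A sketch using actions/upload-artifact, assuming the reports were written to .assay/reports as in the commands in this guide; the artifact name is illustrative:

```yaml
- name: Upload Assay reports
  if: always()  # upload even when the Assay step fails
  uses: actions/upload-artifact@v4
  with:
    name: assay-reports  # illustrative artifact name
    path: .assay/reports/
    include-hidden-files: true  # .assay is a dot-directory, which recent v4 releases skip by default
```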
Local Reproduction
# Same command as CI
assay ci --config eval.yaml --trace-file traces/golden.jsonl --strict --db :memory: --sarif .assay/reports/sarif.json --junit .assay/reports/junit.xml
Performance
| Metric | GitHub Actions | GitLab CI |
|---|---|---|
| Install time | ~60s (cached: 2s) | ~60s |
| Test time (100 tests) | ~50ms | ~50ms |
| Total job time | ~70s | ~70s |
By comparison, LLM-based tests take 3-10 minutes and cost $0.50-$5.00 per run.