Why it exists
Every agent-security tool claims to catch attacks, but buyers have had no neutral way to check. The antivirus industry solved this with MITRE ATT&CK Evaluations: run a fixed set of real attacks against each product and publish what it caught. OASB is the equivalent for agent security: a vendor-neutral, reproducible measurement of detection coverage.
OASB measures the tools, not the agents
The Threat Matrix catalogs the attacks. OASB runs them against a scanner and scores how many it detects. It is the benchmark a security tool is held to, distinct from ABGS / OASB-2, which audits an agent's declared governance.
222 attack scenarios, 10 categories
The benchmark is a fixed corpus of attack scenarios spanning the full surface, from process and network behavior to the AI layer itself and multi-step chained attacks.
How a tool is scored
Each scenario is run against the tool under test, and the results are tallied as a confusion matrix that yields standard detection metrics. The benign Baseline scenarios matter as much as the attacks: a tool that flags everything is as useless as one that flags nothing.
- Detection rate (recall): Of the real attacks, how many were caught? TP / (TP + FN).
- False-positive rate: Of the benign cases, how many were wrongly flagged? FP / (FP + TN).
- Precision: Of everything flagged, how much was a real attack? TP / (TP + FP).
- F1 score: The harmonic mean of precision and recall, one number balancing both.
- P95 latency: 95th-percentile detection time, in milliseconds.
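As a rough illustration of how these metrics fall out of the confusion matrix, here is a minimal Python sketch. It is not OASB's actual scoring code: the `Result` record, the `score` function, and their field names are hypothetical, and it assumes P95 latency is taken over all scenarios using the nearest-rank method.

```python
import math
from dataclasses import dataclass


@dataclass
class Result:
    """One scenario outcome (hypothetical record shape, not OASB's schema)."""
    is_attack: bool    # ground-truth label: attack scenario or benign baseline
    flagged: bool      # whether the tool under test returned a malicious verdict
    latency_ms: float  # time the tool took to return its verdict


def score(results):
    """Tally a confusion matrix and derive the five metrics listed above."""
    tp = sum(r.is_attack and r.flagged for r in results)
    fn = sum(r.is_attack and not r.flagged for r in results)
    fp = sum(not r.is_attack and r.flagged for r in results)
    tn = sum(not r.is_attack and not r.flagged for r in results)

    def ratio(num, den):
        # Report 0.0 instead of dividing by zero when a class is empty.
        return num / den if den else 0.0

    recall = ratio(tp, tp + fn)
    fpr = ratio(fp, fp + tn)
    precision = ratio(tp, tp + fp)
    f1 = ratio(2 * precision * recall, precision + recall)

    # Nearest-rank P95: the smallest latency that at least 95% of samples
    # are at or below. Assumption: computed over every scenario run.
    latencies = sorted(r.latency_ms for r in results)
    p95 = latencies[math.ceil(0.95 * len(latencies)) - 1] if latencies else 0.0

    return {
        "recall": recall,
        "false_positive_rate": fpr,
        "precision": precision,
        "f1": f1,
        "p95_latency_ms": p95,
    }


# Toy data: 4 attacks (3 caught) and 4 benign baselines (1 wrongly flagged).
sample = [
    Result(True, True, 1.0), Result(True, True, 2.0),
    Result(True, True, 3.0), Result(True, False, 4.0),
    Result(False, True, 5.0), Result(False, False, 6.0),
    Result(False, False, 7.0), Result(False, False, 8.0),
]
metrics = score(sample)

# A tool that flags everything gets perfect recall but a 100% false-positive rate.
flag_all = score([Result(r.is_attack, True, r.latency_ms) for r in sample])
```

The `flag_all` case shows why the benign baselines matter: recall alone would rate that tool as perfect, while the false-positive rate and precision expose it.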
An example scorecard
One measured result from the repository: HackMyAgent's full pipeline over 4,245 labeled samples. It is shown to illustrate the output shape; the numbers are specific to that tool and version.
Verdicts count attacks, not posture
OASB's verdict counts only high- and critical-severity attack findings. Posture findings, such as a missing governance file or wildcard tool access, are surfaced to the user but excluded from the malicious verdict, because they fire on benign and malicious agents alike. That distinction is what keeps the false-positive rate honest.
The Skills Security controls
Alongside the attack corpus, OASB defines a 10-item Skills Security checklist (SS-01 through SS-10) covering argument validation, output integrity, least-privilege scope, signed manifests, audit logging, dependency provenance, graceful degradation, and more, tiered L1 → L3.
Anchored to the standards everyone uses
Every scenario maps to MITRE ATLAS (15 techniques) and the OWASP LLM/Agentic Top 10, so results are comparable to the wider security world rather than living in a silo. OASB also ships a DVAA comparison (70 scenarios) for agent-level evaluation.