Why it exists
You cannot defend against threats you haven't named. Network defenders have MITRE ATT&CK, a shared catalog of how adversaries behave. AI agents needed their own, because their attack surface is genuinely new:
“AI agents make decisions based on natural language, delegate actions to tools with varying trust levels, communicate with other agents via open protocols, and maintain persistent memory that can be poisoned. These properties create attack surfaces that existing frameworks do not adequately model.”
The AI Agent Threat Matrix classifies 61 techniques across 9 attacker tactics, and, unusually for a threat framework, tags every technique with how strong the evidence for it is.
The kill chain
The nine tactics are the stages of an attack, from mapping a target to causing impact. Each technique (id format T-XYYY, where X is the stage) belongs to exactly one tactic.
Defenders don't need to stop all nine stages: breaking the chain at any one of them stops the attack.
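The id convention lends itself to a small parser. The sketch below is illustrative only: the source gives the pattern T-XYYY but not the exact digit ranges, so it assumes one stage digit from 1 to 9 followed by a three-digit sequence number. The names `TechniqueId` and `parse_technique_id` are hypothetical and not part of the matrix.

```python
import re
from typing import NamedTuple

# Assumed shape of T-XYYY: X is one stage digit (1-9), YYY a three-digit sequence.
_TECHNIQUE_ID = re.compile(r"T-([1-9])(\d{3})")


class TechniqueId(NamedTuple):
    stage: int     # position of the tactic in the nine-stage chain
    sequence: int  # technique number within that stage


def parse_technique_id(raw: str) -> TechniqueId:
    """Split an id such as 'T-1001' into its stage and sequence parts."""
    match = _TECHNIQUE_ID.fullmatch(raw.strip())
    if match is None:
        raise ValueError(f"not a T-XYYY technique id: {raw!r}")
    return TechniqueId(stage=int(match.group(1)), sequence=int(match.group(2)))
```

Under these assumptions, `parse_technique_id("T-1001")` yields stage 1 and sequence 1, while an unhyphenated MITRE-style id such as `T1059` is rejected with a `ValueError`.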
Evidence, not just imagination
The matrix's distinguishing feature: every technique carries an evidence tier, so readers can tell a seen-in-the-wild attack from a theoretical one.
- Observed: confirmed in real-world production systems, for example through exposure sweeps, npm attacks, and agent cards found in the wild.
- Validated: reproduced in a controlled lab, for example in DVAA (Damn Vulnerable AI Agent) challenges, HackMyAgent scans, and penetration tests.
- Adapted: a well-understood traditional technique applied to the agent context, not yet observed agent-specifically (e.g. DNS exfiltration).
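One way to put the tiers to work is to treat them as an ordered scale and filter on it. The sketch below assumes the ranking observed > validated > adapted, which is a reading of the tier descriptions and not something the matrix states as a formal ordering. `EvidenceTier` and `at_least` are hypothetical names; the `evidenceTier` field and its lowercase values follow the sample technique record in this section.

```python
from enum import IntEnum


class EvidenceTier(IntEnum):
    """Evidence tiers; a higher value is assumed to mean stronger evidence."""
    ADAPTED = 1    # traditional technique applied to agents, not yet seen agent-specifically
    VALIDATED = 2  # reproduced in a controlled lab
    OBSERVED = 3   # confirmed in real-world production systems


def at_least(techniques: list[dict], minimum: EvidenceTier) -> list[dict]:
    """Keep the techniques whose evidenceTier is at or above `minimum`."""
    return [
        t for t in techniques
        if EvidenceTier[t["evidenceTier"].upper()] >= minimum
    ]
```

A team that wants to prioritize attacks already seen in the wild could call `at_least(techniques, EvidenceTier.OBSERVED)` and work outward from there.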
Anatomy of a technique
Each technique is more than a description: it is a join point between the threat, the tools that detect it, and the controls that mitigate it.
{
  "id": "T-1001",
  "name": "Endpoint Enumeration",
  "tactic": "reconnaissance",
  "description": "Discover exposed API endpoints, health checks, and information disclosure routes on target agents",
  "evidenceTier": "observed",
  "attackClass": "RETROACTIVE-PRIV",
  "hmaChecks": ["WEBEXPOSE-001", "WEBEXPOSE-002", "MCP-011"], // → detection
  "dvaaValidation": "All agents expose /health and /info", // → lab repro
  "oasbControls": ["1.1", "1.2", "1.3", "1.4", "10.5"] // → mitigation
}
One technique, three tools
hmaChecks points to the HackMyAgent scanner checks that detect the technique; dvaaValidation points to a reproducible lab scenario; oasbControls points to the OASB controls that defend against it. The matrix is the connective tissue between threat, detection, and defense.
How it relates to MITRE
The matrix is a sibling of MITRE's frameworks, not a fork. MITRE ATLAS models attacks on the ML pipeline (training data, model extraction); this matrix models the agent infrastructure above the model: governance files, memory poisoning, multi-agent protocols, and tool supply chains. Roughly 37 of its techniques have no ATLAS equivalent.
Distinct IDs, on purpose
The matrix uses T-XYYY (the stage is encoded in the id), deliberately distinct from MITRE's sequential TXXXX. It has not been submitted to MITRE; all MITRE references are cross-mappings, not claims of inclusion.
How it applies to agents
- Red teams walk the chain stage by stage and cite techniques by id (e.g. “ATM T-2001”).
- Blue teams map their defenses to each tactic and aim to break the chain early.
- Tool vendors map detection coverage to technique ids for honest, comparable claims.
- It is the shared vocabulary that lets the rest of the stack (AIIS, OASB, the scanners) agree on what they're defending against.
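Mapping detection coverage to technique ids can be sketched in a few lines. This is not how any particular vendor or scanner computes coverage; it is a hypothetical illustration that reads the `hmaChecks` field from technique records and reports, per technique id, what share of those checks a tool claims to implement. The function name `detection_coverage` and the choice to score a technique with no mapped checks as 0.0 are assumptions.

```python
def detection_coverage(techniques: list[dict], implemented: set[str]) -> dict[str, float]:
    """For each technique id, return the share of its hmaChecks found in `implemented`."""
    coverage: dict[str, float] = {}
    for technique in techniques:
        checks = technique.get("hmaChecks", [])
        if not checks:
            # Assumption: a technique with no mapped checks counts as uncovered.
            coverage[technique["id"]] = 0.0
            continue
        hits = sum(1 for check in checks if check in implemented)
        coverage[technique["id"]] = hits / len(checks)
    return coverage


# Record trimmed from the T-1001 example earlier in this section.
matrix = [{"id": "T-1001", "hmaChecks": ["WEBEXPOSE-001", "WEBEXPOSE-002", "MCP-011"]}]
report = detection_coverage(matrix, {"WEBEXPOSE-001", "MCP-011"})
```

Here the hypothetical tool implements two of the three checks mapped to T-1001, so `report["T-1001"]` is two thirds. Because coverage is keyed by technique id, two tools scored this way can be compared directly.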