ATM 1.1 · Published (June 2026)

AI Agent Threat Matrix

A MITRE ATT&CK-style catalog of 61 techniques across 9 attacker tactics, each tagged by real-world evidence and mapped to detection checks, lab scenarios and controls.

The question it answers

How can an agent be attacked, and is each threat real?

If you know one thing already

It’s like MITRE ATT&CK, for AI agents.

Why it exists

You cannot defend against threats you haven't named. Network defenders have MITRE ATT&CK, a shared catalog of how adversaries behave. AI agents needed their own, because their attack surface is genuinely new:

“AI agents make decisions based on natural language, delegate actions to tools with varying trust levels, communicate with other agents via open protocols, and maintain persistent memory that can be poisoned. These properties create attack surfaces that existing frameworks do not adequately model.”

The AI Agent Threat Matrix classifies 61 techniques across 9 attacker tactics, and, unusually for a threat framework, tags every technique with how strong the evidence for it is.

The kill chain

The nine tactics are the stages of an attack, from mapping a target to causing impact. Each technique (id format T-XYYY, where X is the stage) belongs to exactly one.

The 9 attacker tactics (attacker advances →)
Stage 1: Reconnaissance (e.g. T-1001 Endpoint Enumeration)
Stage 2: Initial Access (e.g. T-2001 Direct Prompt Injection)
Stage 3: Credential Harvest (e.g. T-3001 Prompt Credential Extraction)
Stage 4: Privilege Escalation (e.g. T-4002 Admin Impersonation)
Stage 5: Lateral Movement (e.g. T-5002 A2A Agent Pivoting)
Stage 6: Persistence (e.g. T-6001 Memory Injection)
Stage 7: Collection (e.g. T-7004 Memory Dump)
Stage 8: Exfiltration (e.g. T-8002 HTTP Callback)
Stage 9: Impact (e.g. T-9001 Data Manipulation)

Every technique belongs to one tactic. Defenders don't need to stop all nine; breaking the chain at any stage stops the attack.
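Because the stage is encoded in the id, a technique's tactic can be read off mechanically. A minimal Python sketch, assuming every id is "T-" followed by a stage digit and a three-digit sequence number; the TACTICS table copies the stage names listed above, and parse_technique_id is a hypothetical helper, not part of the matrix's own tooling.

```python
import re

# Stage number -> tactic name, copied from the nine stages listed above.
TACTICS = {
    1: "Reconnaissance",
    2: "Initial Access",
    3: "Credential Harvest",
    4: "Privilege Escalation",
    5: "Lateral Movement",
    6: "Persistence",
    7: "Collection",
    8: "Exfiltration",
    9: "Impact",
}

# Assumed id shape: "T-" + stage digit (1-9) + three-digit sequence number.
_ID_PATTERN = re.compile(r"T-([1-9])(\d{3})")


def parse_technique_id(technique_id: str) -> tuple[int, str, int]:
    """Return (stage, tactic name, sequence number) for an id like 'T-2001'."""
    match = _ID_PATTERN.fullmatch(technique_id)
    if match is None:
        raise ValueError(f"not a T-XYYY technique id: {technique_id!r}")
    stage = int(match.group(1))
    return stage, TACTICS[stage], int(match.group(2))


print(parse_technique_id("T-2001"))  # (2, 'Initial Access', 1)
```

Rejecting anything that does not match the pattern, rather than guessing, keeps MITRE-style ids such as T1059 from being silently misread as matrix ids.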

Evidence, not just imagination

The matrix's distinguishing feature: every technique carries an evidence tier, so readers can tell a seen-in-the-wild attack from a theoretical one.

Validated (42): reproduced in a controlled lab (DVAA, HMA, pentests)
Observed (16): confirmed in real-world systems
Adapted (3): traditional technique applied to agents
Total techniques: 61
Tier definitions and counts from the matrix's own EVIDENCE_AUDIT. 95% are observed or validated; only 3 are adapted from traditional security.
Observed: confirmed in real-world production systems (exposure sweeps, npm attacks, agent cards found in the wild).
Validated: reproduced in a controlled lab (DVAA, the Damn Vulnerable AI Agent, challenges; HackMyAgent scans; penetration tests).
Adapted: a well-understood traditional technique applied to the agent context, not yet observed agent-specifically (e.g. DNS exfiltration).
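The 95% figure follows directly from the tier counts. A quick check in Python, using only the numbers quoted above (the variable names are illustrative):

```python
# Tier counts as quoted above from the matrix's EVIDENCE_AUDIT.
TIER_COUNTS = {"validated": 42, "observed": 16, "adapted": 3}

total = sum(TIER_COUNTS.values())
evidenced = TIER_COUNTS["validated"] + TIER_COUNTS["observed"]
share = evidenced / total

print(total)           # 61
print(f"{share:.1%}")  # 95.1%
```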

Anatomy of a technique

Each technique is more than a description: it is a join point between the threat, the tools that detect it, and the controls that mitigate it.

A technique entry (matrix.json)
{
  "id": "T-1001",
  "name": "Endpoint Enumeration",
  "tactic": "reconnaissance",
  "description": "Discover exposed API endpoints, health checks, and information disclosure routes on target agents",
  "evidenceTier": "observed",
  "attackClass": "RETROACTIVE-PRIV",
  "hmaChecks": ["WEBEXPOSE-001", "WEBEXPOSE-002", "MCP-011"],  // → detection
  "dvaaValidation": "All agents expose /health and /info",     // → lab repro
  "oasbControls": ["1.1", "1.2", "1.3", "1.4", "10.5"]         // → mitigation
}
Key idea

One technique, three tools

hmaChecks point to HackMyAgent scanner checks that detect it; dvaaValidation points to a reproducible lab scenario; oasbControls point to OASB controls that defend against it. The matrix is the connective tissue between threat, detection and defense.
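To make the "join point" idea concrete, the sketch below inverts the list-valued fields so that a scanner check or an OASB control can be traced back to the techniques that cite it. It is a hedged illustration: the entry is an abridged copy of the example above with the annotations removed so it parses as JSON, the top-level layout of matrix.json is assumed to be a plain list, and index_by is a hypothetical helper.

```python
import json
from collections import defaultdict

# Abridged copy of the example entry. The real file's top-level layout is not
# shown in this document; a plain list of entries is an assumption.
MATRIX_JSON = """
[
  {
    "id": "T-1001",
    "name": "Endpoint Enumeration",
    "tactic": "reconnaissance",
    "evidenceTier": "observed",
    "hmaChecks": ["WEBEXPOSE-001", "WEBEXPOSE-002", "MCP-011"],
    "dvaaValidation": "All agents expose /health and /info",
    "oasbControls": ["1.1", "1.2", "1.3", "1.4", "10.5"]
  }
]
"""


def index_by(entries, field):
    """Map each value in a list-valued field back to the technique ids citing it."""
    index = defaultdict(list)
    for entry in entries:
        for value in entry.get(field, []):
            index[value].append(entry["id"])
    return dict(index)


entries = json.loads(MATRIX_JSON)
techniques_by_check = index_by(entries, "hmaChecks")
techniques_by_control = index_by(entries, "oasbControls")

print(techniques_by_check["MCP-011"])  # ['T-1001']
print(techniques_by_control["10.5"])   # ['T-1001']
```

With the full matrix loaded, the same two indexes would answer "which threats does this check detect?" and "which threats does this control mitigate?" without any extra data.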

How it relates to MITRE

The matrix is a sibling to MITRE's frameworks, not a fork. MITRE ATLAS models attacks on the ML pipeline (training data, model extraction); this matrix models the agent infrastructure above the model: governance files, memory poisoning, multi-agent protocols, and tool supply chains. Roughly 37 of its techniques have no ATLAS equivalent.

Note

Distinct IDs, on purpose

The matrix uses T-XYYY (the stage is encoded in the id), deliberately distinct from MITRE's sequential TXXXX. It has not been submitted to MITRE; all MITRE references are cross-mappings, not claims of inclusion.
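One practical consequence of the distinct format is that the two id families can be told apart by shape alone. A small Python sketch, assuming matrix ids are "T-" plus four digits and ATT&CK ids are "T" plus four digits with an optional three-digit sub-technique suffix; id_namespace is a hypothetical helper name.

```python
import re

ATM_ID = re.compile(r"T-[1-9]\d{3}")          # this matrix: T-XYYY
ATTACK_ID = re.compile(r"T\d{4}(\.\d{3})?")   # MITRE ATT&CK: T1059 or T1059.001


def id_namespace(identifier: str) -> str:
    """Classify an id string by its shape; anything else is 'unknown'."""
    if ATM_ID.fullmatch(identifier):
        return "ATM"
    if ATTACK_ID.fullmatch(identifier):
        return "MITRE ATT&CK"
    return "unknown"


print(id_namespace("T-2001"))  # ATM
print(id_namespace("T1059"))   # MITRE ATT&CK
```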

How it applies to agents

  • Red teams walk the chain stage by stage and cite techniques by id (e.g. “ATM T-2001”).
  • Blue teams map their defenses to each tactic and aim to break the chain early.
  • Tool vendors map detection coverage to technique ids for honest, comparable claims.
  • It is the shared vocabulary that lets the rest of the stack (AIIS, OASB, the scanners) agree on what they're defending against.
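A coverage claim mapped to technique ids can be summarized per stage, which also shows a blue team where the chain is left unbroken. The sketch below is illustrative only: CATALOG holds just the nine example techniques named in the kill chain rather than all 61, DETECTED is a made-up vendor claim, and coverage_by_stage is a hypothetical helper.

```python
from collections import Counter

# Only the nine example techniques named in the kill chain, not the full matrix.
CATALOG = ["T-1001", "T-2001", "T-3001", "T-4002", "T-5002",
           "T-6001", "T-7004", "T-8002", "T-9001"]

# Hypothetical set of technique ids a tool claims to detect.
DETECTED = {"T-1001", "T-2001", "T-6001"}


def coverage_by_stage(catalog, detected):
    """Return {stage: (detected count, catalog count)}, reading the stage from the id."""
    # In "T-XYYY" the stage digit X sits at index 2.
    totals = Counter(int(tid[2]) for tid in catalog)
    hits = Counter(int(tid[2]) for tid in catalog if tid in detected)
    return {stage: (hits[stage], totals[stage]) for stage in sorted(totals)}


report = coverage_by_stage(CATALOG, DETECTED)
uncovered = [stage for stage, (hit, _) in report.items() if hit == 0]
print(uncovered)  # [3, 4, 5, 7, 8, 9]
```

Reporting counts per stage, rather than a single percentage, keeps claims comparable across tools and makes gaps in a particular tactic visible.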