First in a series of blog posts looking at the intersection of AI and cybersecurity - and why Zero Trust matters more than ever.
This blog focuses on the recent threat research by Anthropic mapping out an attack by a Chinese state actor using agentic AI, and how Zero Trust can help defend against such threats.
Advances in AI
AI has been evolving fast. Let’s zoom out for a moment.
- In 2023, LLMs took the spotlight. The Turing Test was finally, unequivocally passed, and “copilots” started to appear. You could now communicate with your computer in natural language. Over the next couple of years new models were released, delivering incremental improvements in quality and cost.
- In 2025, a similar but less obvious step change occurred. The people training the AIs borrowed lessons from other branches of machine learning – Reinforcement Learning from Verifiable Rewards – and used these techniques to train LLMs in a novel way. This training unlocked “reasoning”: strategies that break complex problems down into intermediate steps. Reasoning was the missing capability that allows models to act as agents, chaining tasks together and making decisions with only minimal human input.
Alongside reasoning a few other developments have recently occurred that unlock the potential for bad actors to use agentic AI in their nefarious activities.
- The ability of AI to understand, produce and debug software code skyrocketed.
- This is very helpful for building software (the primary goal of the enhancement); however, it’s also very helpful when attacking networks, because it lets the agent write and execute shell scripts automatically.
- Standardized protocols like the Model Context Protocol (MCP) give AI access to a growing ecosystem of tools – including the typical toolkit of a human pen tester.
- This has led to the recent announcement that the pen-testing distribution Kali Linux has added agentic AI functionality. Users no longer need to drive hacking tools via arcane shell commands; instead they can type natural-language instructions such as “scan this host, tell me about any vulnerabilities, and then exploit them.” This significantly increases the pool of cybercriminals who can effectively compromise enterprise networks, leading to denial of service, ransomware, data exfiltration and more.
From Theory to Reality: Anthropic’s Findings
This isn’t hypothetical. Anthropic recently reported the first AI-orchestrated cyber-espionage campaign, carried out using Claude Code. Here’s how it unfolded:
- Targeting: A human operator gives Claude the target.
- Reconnaissance: Claude maps the attack surface using external tools.
- Payload Generation: Claude creates exploit scripts for known vulnerabilities, and checks they work.
- Exploitation: Credentials are harvested; lateral movement begins.
- Exfiltration: High-priority data is stolen.
- Reporting: Claude auto-generates documentation of the attack.
Anthropic estimates 80–90% of the work was automated. This level of sophistication simply wasn’t possible until 2025.
So how can Zero Trust help?
What can security teams do to protect their enterprises against this new class of attack? The good news is that strong adherence to Zero Trust tenets will defend the network against these new attack techniques, just as it did against the old ones.
You will notice that the Anthropic attack chain summarized above looks very familiar. The key difference between agentic AI attacks and traditional human operations is speed of execution: agents can run far more commands, far more quickly, than human operators can. But the general shape of the attack hasn’t changed – and that means existing Zero Trust controls protect against agentic AI and human actors alike.
Think of the classic hotel analogy:
If one admin key opens every door, a stolen key means total compromise. A human attacker might take hours to exploit that; an AI agent, seconds. But if every room has its own key, the blast radius is contained regardless of the attacker’s speed – they are simply faster at opening the one door they can.
Leaving our hotel analogy and returning to security architecture: let’s consider some points around implementation – how can we implement a key for every room?
The key technical pillars of a zero trust architecture that can do this are:
- A strong ZTNA solution constrains how all humans and agents can access individual assets of an organization in a least privilege and conditional access context.
- The hardened, external-facing side of a ZTNA solution is fantastic at reducing the attack surface – some ZTNA providers even remove all open ports on the edge.
- This leaves no attack surface for an attacker-controlled Claude agent to find
- What you can’t see you can’t attack.
- Moreover, ZTNA solutions are typically tightly integrated with identity-centric security, which provides a rich set of contextual signals for each access session. Context-aware policies can flag anomalies indicating that a service is being accessed by an AI agent rather than a human, allowing you to quickly and automatically lock down sessions that exhibit malicious behaviour.
- However, a core tenet of Zero Trust is to always “assume breach”. So let’s assume the attacking agent reaches an asset somehow. How do we minimize the risk when the likelihood is already close to 100%? We mitigate the impact – and this is where microsegmentation comes into play. Microsegmentation minimizes the number of assets reachable from any single compromised asset, even when ZTNA fails, preventing the attacker from moving laterally.
- Beyond these two key pillars, enhanced detection, monitoring and automation are useful for spotting the rapid actions that indicate an unauthorized AI agent inside your network – and for taking rapid action of your own to stop the agent before it achieves its aim.
- And you should be encouraging your Red Team to embrace the attacking agentic AI model and start testing your controls before a real attacker does.
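To make the context-aware monitoring idea above concrete: one of the simplest signals that a session belongs to an agent rather than a human is sheer command rate. Here is a minimal sketch in Python – the threshold, session structure and event names are all illustrative assumptions, not any vendor’s actual detection logic:

```python
from dataclasses import dataclass, field

# Assumed threshold for illustration: sustained rates above this
# are hard for a human operator at a keyboard to achieve.
MAX_HUMAN_COMMANDS_PER_MINUTE = 30

@dataclass
class Session:
    session_id: str
    command_timestamps: list = field(default_factory=list)  # seconds since epoch

    def record_command(self, ts: float) -> None:
        self.command_timestamps.append(ts)

    def is_suspicious(self, window_seconds: float = 60.0) -> bool:
        """Flag the session if any rolling window holds more commands
        than a human could plausibly issue."""
        ts = sorted(self.command_timestamps)
        for i, start in enumerate(ts):
            in_window = sum(1 for t in ts[i:] if t - start <= window_seconds)
            if in_window > MAX_HUMAN_COMMANDS_PER_MINUTE:
                return True
        return False

# An agent issuing one command per second trips the detector.
agent = Session("sess-001")
for second in range(60):
    agent.record_command(1_700_000_000 + second)
print(agent.is_suspicious())  # True: 60 commands inside one minute
```

In a real deployment this signal would be one of many contextual inputs (device posture, geolocation, time of day) feeding an automated lock-down action, rather than a standalone rule.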
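The microsegmentation principle described above boils down to a default-deny allow-list: every workload-to-workload flow must be explicitly permitted, and everything else is blocked. A minimal sketch – service names, ports and the policy shape are hypothetical; real products express this in their own policy languages:

```python
# Assumed allow-list for illustration: (source, destination, port)
# tuples that are explicitly permitted. Anything not listed is
# denied by default - the Zero Trust stance.
ALLOWED_FLOWS = {
    ("web-frontend", "orders-api", 443),
    ("orders-api", "orders-db", 5432),
}

def is_flow_allowed(src: str, dst: str, port: int) -> bool:
    """Default-deny check: a flow is permitted only if explicitly listed."""
    return (src, dst, port) in ALLOWED_FLOWS

# A compromised frontend can reach its legitimate dependency...
print(is_flow_allowed("web-frontend", "orders-api", 443))   # True
# ...but cannot move laterally straight to the database.
print(is_flow_allowed("web-frontend", "orders-db", 5432))   # False
```

The design point is that the policy constrains the blast radius of a breach: even an agent operating at machine speed can only open the doors the policy grants to the asset it already holds.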
The Bottom Line
Agentic AI is here – and attackers are embracing it faster than most defenders. But a robust Zero Trust architecture remains your best defence. And here at Zero Trust Solutions we can help you plan and implement that defence: check out our Microsegmentation AI Engineer (MAIE), which can build and recommend policies for you.