Malice in the Mesh // 04: Multi-Agent Architectures, Hierarchies, and Swarms

Until this point, we have analyzed the behaviors and dynamics of single agents in isolation. As complex as they may seem, single agent systems are the tip of the iceberg. It is important to understand there are several types of agent systems that increase complexity significantly: multi-agent systems, hierarchical systems, and swarm architectures.

Multi-Agent Architecture

Multi-agent systems add a new dimension: inter-agent communication. AutoGen implements this as a literal chat between agents. CrewAI structures this as a crew of agents with defined roles. While allowing separate agents to specialize in their roles, division of labor produces positive outcomes. Often fruitful yet this extra layer of complexity adds an extra dimension of security implications. A multi-agent system that does simply add more components. They add trust relationships, delegation paths, and cross-agent feedback loops. Compromise not only can propagate downward, but sideways as well.

This means an exponentially growing attack surface. Complexity breeds insecurity. To draw a parallel to traditional IT infrastructure, it is similar to the differences between compromising a single machine vs. an Active Directory network. The network does not “spread out” risk. It becomes the enabler of risk. More opportunity for lateral movement, and privilege escalation becomes easier. Organizationally, as cybersecurity budgets experience deflation in recent years, this becomes a scaling issue as attack surface has an exponential growth trajectory and department headcount does not.

Hierarchical systems “unflatten” the standard multi-agent framework and add explicit control structure. The top level agent is used in a supervisory, or orchestrator role. This extra layer of control comes with even more security implications. The orchestrator agent now becomes the crown jewel. Mapping to traditional IT infrastructure, the orchestrator is akin to a Domain Controller in an Active Directory network.

Orchestration layer compromises are booming in prominence. Recent LiteLLM, n8n, and Langflow CVEs to name a few, all with notable exploitation in major campaigns. Google DeepMind’s CaMeL architecture, which we briefly covered in our first installment, is a direct answer to the “crown jewel” orchestrator problem. Taking a que for it’s trust-boundary pattern from an operating system’s privilege separation, CaMeL separates the orchestrator in two: a privileged agent which sees only the trusted user request and is responsible for deciding the plan or sequence of actions and a quarantined agent which is the only agent allowed to process untrusted content, such as emails, webpages, retrieved docs, or other text. These two agents do not talk freely and communicate strictly through a restricted interface. A kernel-style security pattern.

Swarm architectures are cutting edge. Of the four main architectures, swarm is the most dynamic. With no hierarchy and no predefined roles, swarm agents hand off control to each other based on context. This pushes model reasoning to its limits. While successful swarm architectures represent the pinnacle of autonomous agentic AI with the highest real world impact, they also represent the greatest of concerns for security. With no control structures in place by design, the blast radius is unbounded.

While the orchestration layer presents some of the most difficult challenges for present day agent security, let’s dive deeper. The orchestration layer has three responsibilities that each introduce potential attack surface expansion. They are task decomposition, results aggregation, and failure handling.

The Orchestration Layer

Task decomposition is the process of breaking a high-level user goal into subtasks that individual worker agents can handle. There are several benefits derived from task decomposition. They include allowing for parallel execution, establishing clear procedural checkpoints for review and restore, and reducing cognitive load for each call. 3 main types of task decomposition include sequential, parallel, and hybrid decomposition:

Sequential is straightforward — each subtask depends on the previous and they are executed in order.

For parallel, subtask branches stem out from one main task, run concurrently, and then converge together again to form a result.

Hybrid combines the previous two types, in which several intermediate points of convergence happen in between subtask branches stemming out from those points.

Quite simply, task decomposition can follow any type of algorithmic pattern along the way. Other examples include recursive search and the all powerful map-reduce.

As we venture out from the single-agent mindset and embrace multi-agent architecture it is also important to remember that any of the task decomposition types can also implement a hierarchical framework in which master agent(s) assign subtasks to worker agents. Some sources may consider this a separate type altogether, but it’s important to remember any of our 3 main types can operate in a hierarchical fashion. Faulty logic during task decomposition is a common path for attackers to take advantage of an agent’s reasoning engine.

Result aggregation is the process of combining outputs from multiple worker agents into a coherent result. This responsibility is a precise callback to the PPAO loop. In multi-agent systems, PPAO chains connect: Agent A’s Observe phase produces output that becomes Agent B’s Perceive phase input. The injection crosses at this junction, the Observe-to-Perceive boundary between agents. Similar to attacks on Active Directory infrastructure where the ultimate bullseye is operating as NT AUTHORITY/SYSTEM on a Domain Controller, a common attack path in a multi-agent system is abusing result aggregation with the orchestration agent as the target. This agent typically operates with elevated privileges. From there the attacker can more easily escape the orchestration boundary, establish persistence, or poison downstream execution.

Least privilege should be implemented wherever possible throughout multi-agent systems, especially for the orchestration agent. It should mean splitting authority by function and making privileges explicit at delegation time. This reduces blast radius and also increases auditability for the system. Monitoring is made easier by clearly defined boundaries. An additional architectural design to consider is to split the orchestration head into two agents under dual control. For example, the split could be comprised of planner/decomposer or controller/approver roles. It is important to be mindful of the tradeoffs – as this increases complexity, as well as possible latency and coordination cost.

Failure handling determines what happens when a worker agent fails, times out, or produces unexpected output. Most orchestrators implement retry logic, fallback strategies, and escalation paths. Typically, these fallback routes are found less secured as the primary path, leading to attacker abuse. An often overlooked yet highly consequential aspect of design in multi-agent systems is errors can be amplified via cross-agent communication. If Agent A’s Observe phase produces erroneous output and is picked up in Agent B’s Perceive phase, not only does the error flow downstream, but it is possible Agent B may actually amplify that error. CVE-2025-68664 in LangChain is an example of this, where the attacker can pass along user shaped data as trusted internal data using the ‘lc’ key. When Agent B picks that up in its Perceive phase, it reinterprets the user data as trusted framework structure, which amplifies the problem from a bad observation into the next step of the attack chain.

Claude Code’s failure handling consists of several pieces. First, a centralized model execution component called the QueryEngine. It is responsible for assembling prompts, streaming responses, counting tokens, and applying retry logic. By concentrating these functions in one place, Claude Code can ensure that failure handling is enforced consistently rather than piecemeal. Second, tool errors go back to the LLM, meaning failure handling is routed back through the reasoning engine instead of being handled entirely by deterministic logic. Third, model fallback occurs when limits are hit. If Opus is running and hits its limit, it falls back to Sonnet to ensure the workflow continues. Fourth, automatic context compression and built-in circuit breakers. The former makes sure a workflow continues running instead of crashing and the latter retires a workflow after 3 tries to prevent endless loops racking up API spend.

The orchestration layer is where we see the greatest differences amongst frameworks. LangGraph, AutoGen, n8n, CrewAI all have their own unique approaches. This makes detection and monitoring significantly more difficult. To compound the difficulty, as of April 2026 the orchestration layer is one of the top targets in the AI stack for threat actors. We will dive into the differences and security implications of each during a later installment.

Conclusion

Our first 4 installments laid the ground work for the security discussion we will have going forward. The purpose of this series is to give us the tools for proper detection and defensible architecture of AI agents. In the current day technology space which evolves at lightning speed, it was vitally important we first understood the fundamentals and dissected the stack from the perspective of a technologist. As I stated in the first installment, it is of prime importance we take hold of one phrase:

slow is smooth, and smooth is fast.

Have suggestions or want to collaborate on a future project? Shoot me an email at roccofiorecyber@gmail.com or find me on LinkedIn at the icon below.

The content published on this site reflects personal views and research only. It does not represent the views, positions, or policies of any current or former employer, client, or affiliated organization.

Any references to technologies, vulnerabilities, or security practices are for educational and informational purposes only. Nothing on this site should be interpreted as endorsement, disclosure of confidential information, or professional advice.

All examples are generalized or fictionalized unless explicitly stated otherwise.

Latest Posts

roccofiorecyber@gmail.com

Building an AI Analysis Lab: Part 2May 20, 2026
Welcome to the second installment of our AI Analysis Lab. Part 1 gave a thorough ordering of our host-native stack. Part 2 walks through the build of the Kubernetes observability layer that turns nameless syscalls into named flows. A kind cluster running Cilium as the CNI, Hubble for the flow log, Tetragon for in-container process… Read more: Building an AI Analysis Lab: Part 2
Building an AI Analysis Lab: Part 1May 11, 2026
Welcome to the first installment of our AI Analysis Lab, built for running the practical examples in the Malice in the Mesh series, as well as further continuing analysis. The first thing to know as we jump right in, is the examples below are examples only. While much of the same code and process will… Read more: Building an AI Analysis Lab: Part 1
Building an AI Analysis Lab: IntroductionMay 10, 2026
All rigorous AI Analysis needs a sturdy AI Analysis Lab. There is no better way to learn than to get hands on, so let’s get hand on. If you have not seen some of the practical examples in the Malice in the Mesh series, I highly suggest you go over there and check it out.… Read more: Building an AI Analysis Lab: Introduction
Malice in the Mesh // 06: Behavioral Analysis at the Network and Memory LayersMay 9, 2026
Welcome back. For the foreseeable future, we will be tacking detection engineering for AI Agents one detection surface at a time. In this installment, we begin by diving into behavioral analysis at the network layer. Each analysis type may be comprised of multiple installments, as we move through the discipline in a slow, but smooth… Read more: Malice in the Mesh // 06: Behavioral Analysis at the Network and Memory Layers
Malice in the Mesh // 05: Intro to Detection Engineering for AI Agent SystemsMay 2, 2026
If modern AI agents were as simple as one outbound API call and a model output, there would not be a need for a comprehensive guide regarding detection engineering of these systems. Agents and orchestration tools have grown to now encompass tool use, file access, local execution, memory, workflow orchestration, network access. Not only have… Read more: Malice in the Mesh // 05: Intro to Detection Engineering for AI Agent Systems