Deep Tech

 

June 2026

 

The rise of Multi-Agent Systems: why most businesses aren't ready

For most of 2023 and 2024, the dominant metaphor for AI in enterprises was the copilot: a tireless assistant drafting emails, summarising documents, writing first cuts of code. It was a reassuring image, one in which the human remained firmly in the cockpit. That metaphor is now quietly being retired.

The frontier of enterprise AI has shifted toward multi-agent systems: architectures in which networks of specialised AI models coordinate, delegate tasks, call external tools, and operate over extended time horizons with minimal human supervision. Major generative AI providers and a growing constellation of startups are building frameworks and services around this paradigm. Innovation centres increasingly speak less of "productivity tools" and more of an "autonomous workforce".

This shift is one of the most significant operational opportunities in a generation, the next big step in the information era. It is also one of the most challenging organisational tests ever encountered, perhaps even more so than the advent of AI copilots. Firms that chase the alluring opportunities while ignoring the challenges will put themselves in a tight spot. Let us see why.

 

What Multi-Agent Systems are

A single large language model (the backbone of modern generative AI) operates within a conversation: it receives an input prompt, produces an output response, and forgets. An AI-powered agent goes beyond the chat window: it coordinates multiple models, each with a defined role, and accesses tools such as web search, code execution, or database queries, across sequences of tasks geared towards achieving a goal, not just giving a reply. Such orchestration is not limited to short interactions; it can span minutes, hours, or longer.

Grouping and synchronising multiple AI agents produces AI-powered multi-agent systems (AI-MAS), which bring together the capabilities of numerous agents to complete even more complex tasks. For instance, an AI-MAS asked to conduct competitive intelligence does not simply summarise a prompt. It might autonomously search the web, extract financial filings, cross-reference product announcements, commission a sub-agent to run sentiment analysis, and synthesise results into a structured report, all without human intervention at each step.
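To make the idea concrete, here is a minimal sketch of an orchestrator that passes a goal through a sequence of specialised agents. Everything in it is an assumption made for illustration: the agent functions, their names, and the fixed plan are hypothetical stand-ins, and in a real system each agent would wrap a model call or an external tool rather than return a canned string.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# An "agent" here is just a function from text to text (illustrative assumption).
Agent = Callable[[str], str]


def search_agent(task: str) -> str:
    # Hypothetical stand-in for a web-search or filings-retrieval agent.
    return f"sources for '{task}'"


def sentiment_agent(text: str) -> str:
    # Hypothetical stand-in for a sentiment-analysis sub-agent.
    return f"sentiment of [{text}]"


def report_agent(text: str) -> str:
    # Hypothetical stand-in for the agent that writes the final report.
    return f"REPORT: {text}"


@dataclass
class Orchestrator:
    agents: Dict[str, Agent]
    log: List[Tuple[str, str]] = field(default_factory=list)

    def run(self, goal: str, plan: List[str]) -> str:
        """Run each named agent in order, feeding every output to the next."""
        payload = goal
        for name in plan:
            payload = self.agents[name](payload)
            # Record every hand-off so that the chain can be reviewed later.
            self.log.append((name, payload))
        return payload


orchestrator = Orchestrator(
    {"search": search_agent, "sentiment": sentiment_agent, "report": report_agent}
)
result = orchestrator.run("competitor X", ["search", "sentiment", "report"])
print(result)
```

No human touches the intermediate outputs in this sketch: each agent only sees what the previous one produced, which is exactly what makes such systems both powerful and hard to supervise.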

This is not a marginal productivity improvement. It is a qualitative shift in what machines can be asked to do. The spreadsheet did not merely speed up accounting; it restructured what finance departments were for. Multi-agent systems are potentially capable of doing something similar to knowledge work at large.

 

The business case is real, and larger than most estimates suggest

The organisations moving fastest are not doing so by spreading AI thinly across the enterprise. According to recent McKinsey research, horizontal copilots, although widely adopted, tend to deliver diffuse gains that are not easily visible in top or bottom-line results, while vertical, function-specific deployments have far higher potential for direct economic impact [1]. Even after three years of constant diffusion of AI copilots and agents, the use of AI in companies is far below the projected capacity [2].

Yet something new is likely to come. The most valuable output of multi-agent systems may not be short-term cost reduction or traditional ROI at all. It may be a radical rethinking of workforce allocation, a renewed emphasis on responsibility and quality, and the realignment of human attention toward the judgment, creativity, and relationship-building that organisations have always claimed to value but rarely managed to protect [3].

Another big change in business management is the introduction of compounding effects, associated with agents having memory (or, more precisely, orchestrating data storage, systematisation, and retrieval). When guided by a forward-looking strategy, every deployment of AI-MAS generates data about where pipelines succeed, where they fail, and how workflows need redesigning. Organisations that start now are accumulating institutional knowledge about autonomous AI operations that will be difficult for later movers to replicate. This technology transition, if fully implemented with feedback between agents and companies, makes it possible to build competitive barriers that may be extremely difficult to overcome.

 

Three tensions that separate leaders from followers

Although they may become yet another disruptive force in business, AI-MAS come with limits that must be addressed before full deployment, or virtuous feedback loops may turn into vicious circles.

 

1. Reliability compounds errors, and rewards those who build evaluation culture

Multi-agent systems compound errors in ways single-agent systems do not. When one agent produces a flawed output passed downstream to a second, which feeds a third writing a client-facing document, the original error is laundered through layers of plausible-looking process. The failure mode is not a visible hallucination, but a subtly wrong conclusion wrapped in the aesthetic of rigorous analysis.
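A back-of-the-envelope calculation shows how quickly this compounds. The sketch below assumes, purely for illustration, that every agent in a chain is correct with the same probability and that errors are independent and never caught downstream. Real pipelines are messier, and the 95% figure is an invented example, not a measured one.

```python
def pipeline_reliability(step_accuracy: float, steps: int) -> float:
    """Probability that every step in a chain is correct.

    Simplifying assumption: steps fail independently and no later
    step detects or repairs an earlier error.
    """
    return step_accuracy ** steps


# Illustrative numbers only: each agent is right 95% of the time.
for steps in (1, 5, 10, 20):
    print(steps, round(pipeline_reliability(0.95, steps), 2))
# 1 0.95
# 5 0.77
# 10 0.6
# 20 0.36
```

Under these assumptions, a chain of ten individually reliable agents delivers a fully correct result only about six times in ten, which is why review checkpoints matter more as chains grow longer.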

There is an even subtler problem underneath: the nature of explainability in stateless systems. When a human analyst produces a flawed conclusion, you can ask why. The answer draws on the same cognitive process that produced the original decision. Agents do not work this way. A language model (and any agent built on it) is stateless by design: it processes a prompt, produces an output, and retains nothing. When asked to explain a decision made three steps ago, it does not remember. It reconstructs. It reads the outputs it left behind and confabulates a plausible account of why they were produced. That account may be coherent, even convincing. It is not the actual reasoning: the actual computation is gone.

We have a word for this in human contexts: rationalisation. In AI systems it tends to be called interpretability and treated as a technical problem awaiting a solution. But by treating it as just another technical puzzle, we fail to take seriously how dangerous it could be. Compounding error rates as agent interactions multiply represent one of the primary risks of scaled deployment [4].

This is also where a real competitive moat exists. Rigorous AI quality assurance, systematic and adversarial pipeline testing with human review checkpoints calibrated to actual risk, is today genuinely rare. Organisations that institutionalise it early will operate with a level of trust from regulators, clients, and boards that competitors will find structurally difficult to replicate.

 

2. Accountability structures are unprepared, but redesigning them creates organisational clarity

When an AI orchestrator delegates to a sub-agent, which calls a tool, which triggers an action, the accountability question becomes genuinely complex: which system, or who, should be held accountable for any action?

The question is not just legal, but operational: Gartner predicts that over 40% of agentic AI projects will be cancelled by the end of 2027 due to escalating costs, unclear business value, or inadequate risk controls [5]. Organisations deploying multi-agent systems without mapping their accountability structures are accumulating exactly the kind of exposure that produces those cancellations.

The opportunity hidden inside this pressure is significant. Organisations that map accountability before deployment, deciding which decisions require human sign-off, which can be delegated to supervised automation, and which should never be delegated, emerge with genuine institutional self-knowledge. Many organisations have never been forced to answer these questions rigorously. Multi-agent adoption is a forcing function for the operational redesign that most firms have perpetually deferred. The compliance burden is real; so is the prize on the other side of it.
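One way to picture such a map is as an explicit lookup that every agent action must pass through before it runs. The sketch below is an assumption-laden illustration: the action names, owners, and the rule that unmapped actions are never delegated are all hypothetical choices, not a prescribed standard.

```python
from enum import Enum
from typing import Dict, Tuple


class Oversight(Enum):
    HUMAN_SIGN_OFF = "human sign-off"
    SUPERVISED_AUTOMATION = "supervised automation"
    NEVER_DELEGATE = "never delegate"


# Hypothetical actions and accountable owners, for illustration only.
ACCOUNTABILITY_MAP: Dict[str, Tuple[Oversight, str]] = {
    "draft_internal_summary": (Oversight.SUPERVISED_AUTOMATION, "research lead"),
    "send_client_report": (Oversight.HUMAN_SIGN_OFF, "account manager"),
    "approve_payment": (Oversight.NEVER_DELEGATE, "finance director"),
}


def may_agent_execute(action: str, human_approved: bool = False) -> bool:
    """Return True only if the map allows an agent to carry out the action."""
    # Design choice (assumption): anything not mapped is treated as
    # never delegable, so gaps in the map fail safe.
    tier, _owner = ACCOUNTABILITY_MAP.get(
        action, (Oversight.NEVER_DELEGATE, "unassigned")
    )
    if tier is Oversight.SUPERVISED_AUTOMATION:
        return True
    if tier is Oversight.HUMAN_SIGN_OFF:
        return human_approved
    return False


print(may_agent_execute("draft_internal_summary"))
print(may_agent_execute("send_client_report"))
print(may_agent_execute("send_client_report", human_approved=True))
print(may_agent_execute("approve_payment", human_approved=True))
```

The value lies less in the code than in the exercise of filling in the table: each row forces someone to name an owner and a level of oversight before the system goes live.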

 

3. Organisational capacity is the binding constraint, and the most durable advantage

Designing and maintaining multi-agent systems requires a genuinely new institutional capability: decomposing complex workflows into agent-legible tasks, specifying objectives precisely enough that autonomous systems can pursue them without constant correction, and intervening intelligently when pipelines drift. This is closer to systems engineering than to the prompt-crafting that passes for AI literacy in most enterprises. We identify three new roles now essential to agentic deployments:

  • agent designers who translate business goals into agent behaviours;

  • multi-agent engineers who build systems where several agents interact;

  • governance experts who integrate agents into regulated workflows.

As these three roles make clear, only one is purely technical. The other two require close attention to, and deep knowledge of, a company's structure and workflows, expertise in managing complex systems, critical thinking, the ability to interact with both humans and machines, and ultimately the interpersonal skills needed to integrate IT services into human systems. AI-MAS are not automating employees; they call for a humanism 2.0 in middle managers.

 

The deeper question: what is autonomous work for?

Multi-agent systems do not merely automate tasks. They reorganise what tasks exist. The tacit knowledge organisations accumulate through human experience does not automatically transfer into AI systems. It has to be deliberately captured, codified, and maintained.

Firms that treat this as a design challenge will build AI infrastructure that reflects their strategic priorities. Those that do not will produce sophisticated pipelines that answer the wrong questions very efficiently.

The enterprise AI parallel to China's industrial involution [6] is already visible: companies racing to automate are building elaborate agent architectures that optimise for speed and volume while eroding the quality of thinking that made their output valuable in the first place. McKinsey's research consistently finds that the highest-performing AI companies are those that treat AI as a catalyst to transform their organisations and redesign workflows, setting growth and innovation as objectives rather than efficiency alone [7]. Competitive urgency is real, but it is not a substitute for purpose. Redefining goals, meaning, and the value of quality and critical thinking: these are the deeper challenges that AI-MAS bring, beyond the technology itself.

 

What serious adoption looks like

The organisations extracting durable value share characteristics that have little to do with technology choice. They invest in workflow archaeology, understanding at a granular level where value is genuinely created and where automation introduces disproportionate risk. They build evaluation culture, measuring AI output quality systematically rather than celebrating demos. They map accountability before deployment, not after. And they maintain a clear answer to the question most technology conversations skip: what is this for?

The copilot metaphor has already partially failed, not because it is wrong, but because it is too comfortable. It lets organisations believe that the human remains in control by default, that the machine will wait to be asked, and that everything will run as usual. Multi-agent systems dissolve that assumption. The question is no longer whether AI will act autonomously inside your organisation; it is whether the organisation you have built is one whose values, judgments, and priorities are legible enough to be acted on faithfully.

That is not a technology question. It is a question about institutional self-knowledge that most firms have never had to answer under pressure. The organisations that will lead are not necessarily those with the most sophisticated architectures or the largest model budgets. They are the ones that know, precisely and honestly, what they are trying to become, and have built systems that serve that answer rather than obscure it.

 

[1]  Alexander Sukharevsky, Dave Kerr, Klemens Hjartar, Lari Hämäläinen, Stéphane Bout, Vito Di Leo and Guillaume Dagorret, "Seizing the agentic AI advantage", (New York, McKinsey & Company, 13 June 2025), https://www.mckinsey.com/capabilities/quantumblack/our-insights/seizing-the-agentic-ai-advantage

[2] Maxim Massenkoff and Peter McCrory, "Labor market impacts of AI: A new measure and early evidence", (San Francisco, Anthropic, 5 March 2026), https://www.anthropic.com/research/labor-market-impacts

[3] Alessio Buscemi and Daniele Proverbio, “Leading with AI in the EU: Governance, practices and compliance”, (De Gruyter, in press, 2026)

[4] Tom Coshow and Kiumarse Zamanian, "Multiagent Systems in Enterprise AI: Efficiency, Innovation and Vendor Advantage", (Gartner, 18 December 2025), https://www.gartner.com/en/articles/multiagent-systems

[5] Anushree Verma, "Gartner Predicts Over 40% of Agentic AI Projects Will Be Canceled by End of 2027", (Sydney, Gartner, 25 June 2025), https://www.gartner.com/en/newsroom/press-releases/2025-06-25-gartner-predicts-over-40-percent-of-agentic-ai-projects-will-be-canceled-by-end-of-2027

[6] Federico Cuppoloni, "Europe's industrial policy: why reindustrialization needs purpose", (Paris, Collège des Ingénieurs, March 2026), https://cdi.eu/blog/detail/europe-s-industrial-policy-why-reindustrialization-needs-purpose.html

[7] Alex Singla, Alexander Sukharevsky, Bryce Hall, Lareina Yee and Michael Chui, "The state of AI in 2025: Agents, innovation, and transformation", (New York, McKinsey & Company, 5 November 2025), https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai

Contributors

Alessio Buscemi
CDI Cohort 2022
Daniele Proverbio
CDI Cohort 2023