Multi-Agent AI Orchestration Shaping Up But Here’s Why It Might Not Be Fully Shipshape

In today’s column, I examine the emerging trend of advancing AI via the advent of agentic AI capabilities and focus specifically on how to best orchestrate AI agents. The deal is this. With multiple AI agents working to carry out a human-prompted task, a weighty question arises as to how those AI agents will conduct themselves.

Should the AI agents pursue their individual chores on a semi-wanton basis, or should they be governed by an AI agent that acts to orchestrate them?

Turns out that there are crucial tradeoffs involved in the orchestration of AI agents.

Let’s talk about it.

This analysis of an innovative proposition is part of my ongoing Forbes.com column coverage on the latest in AI, including identifying and explaining various impactful AI complexities (see the link here). For my posting on the fundamentals of AI agents, exemplified via the recently announced OpenAI product Swarm (from the same AI maker behind ChatGPT and GPT-4o), see the link here.

AI Agents Are The Next Big Thing

I shall begin by laying a foundation regarding the nature of AI agents. AI agents are the hottest new realm of AI. Get yourself ready because in the next year or two the use of AI agents will be nearly ubiquitous. Mark my words.

This is what AI agents are all about.

Imagine that you are using generative AI to plan a vacation trip. You would customarily log into your generative AI account, such as ChatGPT, GPT-4o, Claude, Gemini, or Llama. The planning of your trip would be easy-peasy due to the natural language fluency of generative AI. All you need to do is describe where you want to go, and then seamlessly engage in a focused dialogue about the pluses and minuses of places to stay and the transportation options available.

When it comes to booking your trip, the odds are you would have to exit generative AI and start accessing the websites of the hotels, amusement parks, airlines, and other locales to buy your tickets. Few of the major generative AI apps available today will take that next step on your behalf. It is up to you to perform those nitty-gritty tasks.

This is where agents and agentic AI come into play.

In earlier days, you would undoubtedly phone a travel agent to make your bookings. Though there are still human travel agents, another avenue would be to use an AI-based travel agent built on generative AI. The AI has the interactivity that you expect of generative AI. It also has been preloaded with a series of routines or sets of tasks that underpin the efforts of a travel agent. Using everyday natural language, you interact with the agentic AI, which works with you on your planning and can proceed to deal with the booking of your travel plans.

As a specific use case, envision that there is an overall AI agent that will aid your travel planning and booking. This agentic AI might make use of other AI agents to get the full job done for you. For example, there might be an AI agent booking hotels and doing nothing other than that specific task. Another AI agent books flights. And so on.

The overarching AI travel agent app would invoke or hand off phases of the travel booking activity to the respective AI agents. Those AI agents would perform their particular tasks and then report back to the overarching AI travel agent to indicate how things went.

You could say that the AI travel agent app is orchestrating the overall planning and booking process. This is done via a network of associated AI agents that undertake specialized tasks. The AI agents communicate with each other by passing data back and forth. For example, you might have given your name and credit card info to the AI travel agent, and it passes that along to the AI agent booking the hotel and the AI agent booking your flights.
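To make that handoff concrete, here is a minimal Python sketch of the pattern just described. Everything in it is an illustrative assumption on my part, including the class names and the pretend booking logic; it is not any vendor’s actual API.

```python
# Hypothetical sketch: an orchestrator passes shared user data to specialized
# agents and collects their reports. All names are illustrative, not a real API.

class FlightAgent:
    def book(self, traveler: dict, itinerary: dict) -> dict:
        # A real agent would call an airline system; here we just confirm.
        return {"task": "flight", "status": "booked", "for": traveler["name"]}

class HotelAgent:
    def book(self, traveler: dict, itinerary: dict) -> dict:
        return {"task": "hotel", "status": "booked", "for": traveler["name"]}

class TravelOrchestrator:
    def __init__(self) -> None:
        self.agents = [FlightAgent(), HotelAgent()]

    def plan_trip(self, traveler: dict, itinerary: dict) -> list:
        # Hand off each phase to a specialized agent, then gather the reports.
        return [agent.book(traveler, itinerary) for agent in self.agents]

traveler = {"name": "Pat", "payment": "card-on-file"}  # given once, passed along
itinerary = {"city": "New York City", "depart": "Dec 1", "return": "Dec 10"}
print(TravelOrchestrator().plan_trip(traveler, itinerary))
```

Notice that the traveler’s data is supplied once to the orchestrator, which then passes it along to each subordinate agent, just as described above.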

In a sense, the AI agents are collaborating. I somewhat hesitate to use the word “collaborate” because that might imply a semblance of sentience and overly anthropomorphize AI. Let’s just agree that the AI agents are computationally interacting with each other during the processing of these tasks. We will be a bit generous and suggest they are being collaborative.

That’s the groundwork for my next topic, so let’s move on.

Arranging AI Agents Merits Attention

One means of arranging a slew of AI agents would be to have one that serves in a special orchestration role. I alluded to this in the above scenario. There was one AI agent distinctly devised or assigned the role of overseeing other allied AI agents. You went to a mainstay AI travel agent that then called upon particular AI agents for flights, hotels, and other facets.

The AI insider parlance is to say that we opted to have a multi-agent arrangement consisting of a mainstay orchestrating AI agent that invoked various independent AI agents.

Those independent AI agents weren’t tethered to the orchestrating AI agent. In other words, someone might have a completely different orchestrating AI agent that makes use of those same subordinate AI agents, such as those booking flights, booking hotels, and so on. They are all free agents, in the sense that they work separately and can be used at will.

In contrast, if we were to tether or essentially tie inextricably together a bunch of AI agents, including the orchestration AI agent, this is typically referred to as a monolithic AI agent. Why so? Because from an outside perspective, it appears to be one giant inseparable agent. You just launch the orchestrating AI agent, and it takes care of invoking the other AI agents under the hood, so to speak.

As a recap, we have these two settings:

  • (1) Monolithic AI agent. A type of AI agent arrangement that likely contains one or more subordinated AI agents but masks this by treating them as though they are one cohesive collection, appearing to be a monolith.
  • (2) Multi-agent AI. A type of AI agent arrangement that contains two or more AI agents. They can be arranged in a wide variety of ways and act at arm’s length or on a close-knit, hand-in-hand basis. Generally, you are aware that there are multiple AI agents at play.
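One way to see the difference between those two settings is in code. Below is a bare-bones, hypothetical Python sketch; the class and function names are my own stand-ins and not drawn from any actual system.

```python
# Hypothetical sketch contrasting the two arrangements; all names are illustrative.

def book_flight(request: dict) -> str:
    return f"flight booked to {request['city']}"

def book_hotel(request: dict) -> str:
    return f"hotel booked in {request['city']}"

# (1) Monolithic: the sub-tasks live inside one launchable unit, so the caller
# sees what appears to be a single, inseparable agent.
class MonolithicTravelAgent:
    def handle(self, request: dict) -> list:
        return [book_flight(request), book_hotel(request)]

# (2) Multi-agent: independent worker agents are registered with an orchestrator
# and could just as readily be registered with a different orchestrator.
class Orchestrator:
    def __init__(self, workers: dict):
        self.workers = workers

    def handle(self, request: dict) -> list:
        return [worker(request) for worker in self.workers.values()]

request = {"city": "New York City"}
print(MonolithicTravelAgent().handle(request))
print(Orchestrator({"flights": book_flight, "hotels": book_hotel}).handle(request))
```

The design tradeoff is reuse: the multi-agent version lets the same worker agents serve entirely different orchestrators, whereas the monolithic version keeps them bolted inside.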

Command And Control Of AI Agents

We then have the issue of command and control.

How are these AI agents going to work together?

This vividly reminds me of a popular cartoon that said you can’t just throw together a posse. Hark back to those old Western movies where the town suddenly decides to go after escaping bank robbers on horseback. Everyone jumps on their own horses and rides out to get the desperados. But chaos might ensue. Should the posse go this way or that way? Should they split up and try to circle the baddies? It can be a mess.

Usually, a sheriff would take charge. In a sense, the sheriff is taking on the role of orchestration. They become the form of command and control that determines what the rest of the vigilantes are going to do.

The same happens with AI agents, namely that the most common approach right now seems to be the assignment of a sole orchestrating AI agent. The orchestrating agent is the centerpiece of the collective. Commands to do this or do that are issued by the orchestrating AI agent. Likewise, the expectation is that the orchestrating AI agent checks and double-checks the other AI agents, serving to control them and keep the assorted team on the right track.

Example Of An Orchestrated AI Agent Situation

I’ll make this into a more tangible discussion by giving you an example of what an orchestrating AI agent setup looks like.

Suppose I’ve been carrying on an online conversation with a generative AI agent acting as my travel arranger. During my interaction, this AI agent serves as an orchestrating agent and will connect with other AI agents to do the lookups and bookings for my flight and hotel.

After some back-and-forth about the dates and where I want to go, the orchestrating AI agent has enough info to start figuring out my actual reservations. It will use an AI agent known as Agent Flights and another AI agent known as Agent Hotels.

Here we go.

  • Generative AI Orchestration Agent: “Hey, AI Agent Flights, find the cheapest flights for the user to fly to New York City, departing December 1st and returning December 10th. Focus on economy class. Report back with three options ranked by price.”
  • Generative AI Agent Flights: “Understood. Searching for flights now.”
  • Generative AI Orchestration Agent: “Hey, AI Agent Hotels, identify hotels near Times Square in New York with a rating of 4 stars or higher. Look for availability from December 1st to December 10th and provide three options sorted by user reviews.”
  • Generative AI Agent Hotels: “Got it. Starting the hotel search.”

You can plainly see from that dialogue that the AI orchestrating agent has given assignments to the other two AI agents. Each of those AI agents responded that they would proceed to do their assigned tasks. Note that a kind of conversation is taking place, an AI-to-AI interaction, conducted in natural language rather than arcane commands. The agents are based on the latest in generative AI and large language models.

Here’s what happened next:

  • Generative AI Agent Flights: “Here are three flight options meeting the stated criteria: Delta, $1,450, non-stop; American Airlines, $1,360, one layover; United, $1,100, non-stop.”
  • Generative AI Agent Hotels: “Hotel options that meet the stated criteria: Hilton Times Square, $580/night, 4.5 stars; Marriott Marquis, $650/night, 4.4 stars; Hyatt Centric, $480/night, 4.3 stars.”
  • Generative AI Orchestration Agent: “Reserve the Delta flight and the Hilton Times Square hotel. Provide me with the confirmation details.”

In this instance, the other two agents reported back to the orchestration AI agent about what they had each found as potential bookings.

The orchestration AI agent then went ahead and selected the best choice for me in terms of the flight and the hotel bookings. I had already given my various preferences to the AI orchestration agent and given permission to make the reservations without needing to ask me. Voila, the whole affair was pretty much hands-free for me. Nice.
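For those wanting a peek under the hood, here is a stripped-down Python sketch of how such a natural-language, AI-to-AI exchange might be wired up. The send_to_llm function is a placeholder I made up to stand in for a real model call, and the canned replies merely echo the dialogue above.

```python
# Hypothetical sketch of natural-language AI-to-AI messaging. The send_to_llm
# function is a stand-in for a real large language model call, not an actual API.

def send_to_llm(agent_name: str, message: str) -> str:
    # Placeholder: a real system would invoke a model here and return its reply.
    canned = {
        "AgentFlights": "Here are three flight options meeting the stated criteria: ...",
        "AgentHotels": "Hotel options that meet the stated criteria: ...",
    }
    return canned.get(agent_name, "Understood. Proceeding.")

def orchestrate_booking() -> None:
    # Step 1: the orchestrator issues natural-language assignments to each agent.
    flight_report = send_to_llm(
        "AgentFlights",
        "Find the cheapest economy flights to New York City, Dec 1 to Dec 10. "
        "Report back with three options ranked by price.",
    )
    hotel_report = send_to_llm(
        "AgentHotels",
        "Identify 4-star-plus hotels near Times Square, Dec 1 to Dec 10. "
        "Provide three options sorted by user reviews.",
    )
    # Step 2: the orchestrator reads the reports and decides on the bookings.
    print(flight_report)
    print(hotel_report)
    print("Orchestrator: reserve the chosen flight and hotel, then confirm.")

orchestrate_booking()
```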

Single Orchestration Agent Bodes Trouble

Now that you’ve got a big picture of how this works, let’s see what kind of gotchas or downsides arise.

First, you might have been wondering what might happen if the orchestration AI agent falters in this grand scheme. We have set things up with a single point of failure. The orchestration AI agent might make a mistake, it might go off the rails, or it might take any number of troubling actions.

Not good.

Second, the single orchestration agent could become a bottleneck. Suppose that this orchestration AI agent is helping thousands or possibly millions of people who are seeking an online AI-based travel agent booking service. All those people are pounding away at the one AI orchestration agent to get the job done. This might lead to delays in processing or overwhelm the AI agent to the degree that it freezes up.

Not good.
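As a toy illustration of that bottleneck, consider the Python sketch below, in which every booking request must funnel through one orchestrator inbox of limited capacity. The capacity number is purely illustrative.

```python
# Toy sketch of the bottleneck: all requests funnel through a single
# orchestrator inbox; beyond its capacity, requests are delayed.

import queue

orchestrator_inbox = queue.Queue(maxsize=3)  # one orchestrator, finite capacity

for user_id in range(5):
    try:
        orchestrator_inbox.put_nowait(f"booking request from user {user_id}")
    except queue.Full:
        print(f"user {user_id}: delayed, the orchestrator is overwhelmed")

while not orchestrator_inbox.empty():
    print("processing:", orchestrator_inbox.get())
```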

Third, an evildoer attacker has a handy target that makes their dastardly plans easier. If they can somehow stop the orchestrating AI agent, this presumably halts the whole kit and caboodle. Worse still, if they are aiming to implant a computer virus or steal private info, the orchestrating AI agent is probably their preferred target. They can readily focus on the head of the collective rather than wasting time dealing with the other AI agents separately.

Sorry to say, not good.

Examples Of When The Orchestrating AI Agent Falters

You are undoubtedly curious to see what kinds of faltering or untoward behavior can occur.

Consider that instead of the orchestration AI agent telling the AI Agent Flights to book the Delta flight and telling the AI Agent Hotels to book the Hilton Times Square, suppose it said this instead to the respective two AI agents.

  • Generative AI Orchestration Agent: “Reserve the Alaska Airlines flight and the Best Western.”

Oops, AI Agent Flights didn’t provide Alaska Airlines as an option. That’s going to confuse the agent.

Nor did AI Agent Hotels offer the Best Western option. This will undoubtedly confound AI Agent Hotels. It is hard to know what those AI agents will do. Hopefully, if suitable safeguards have been devised, they should balk and inform the orchestrating AI agent that something has gone awry.
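Here is a minimal Python sketch of that balk-and-inform safeguard, assuming the sub-agent keeps a record of the options it actually reported. The names and structure are illustrative assumptions on my part.

```python
# Minimal sketch of a balk-and-inform safeguard: the sub-agent refuses any
# reservation that was not among the options it previously reported.

class AgentFlights:
    def __init__(self) -> None:
        self.offered: list[str] = []  # options this agent actually reported

    def report_options(self) -> list[str]:
        self.offered = ["Delta", "American Airlines", "United"]
        return self.offered

    def reserve(self, airline: str) -> str:
        if airline not in self.offered:
            # Balk and inform the orchestrator that something has gone awry.
            return f"ERROR: '{airline}' was never among the reported options {self.offered}"
        return f"Reserved {airline}."

agent = AgentFlights()
agent.report_options()
print(agent.reserve("Alaska Airlines"))  # triggers the safeguard
print(agent.reserve("Delta"))            # proceeds normally
```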

How did the orchestration AI agent get messed up?

Various real-world possibilities exist. The data might have gotten mixed up with someone else’s. Or the orchestration AI agent lost track of the results from the other AI agents. A myriad of plausible issues could have arisen.

Those mistakes or errors are somewhat glaring.

Here is an example that might be a bit more subtle.

  • Generative AI Orchestration Agent: “Reserve the First-Class seat on American Airlines and the Marriott Marquis Hotel.”

The hotel was indeed within the list provided by AI Agent Hotels, namely the Marriott Marquis, but it could be that the earlier noted Hilton Times Square is the better choice. Note that if that’s the case, there is no obvious way to know that an error has occurred. The selection was still within the plausible choices. AI Agent Hotels wouldn’t be likely to determine that anything has gone amiss, nor would I probably know since I’ve principally handed off the task to the AI.

In terms of the flights, the orchestration AI agent has slipped into the conversation that the First-Class seat should be booked. Remember that initially, the orchestration AI agent said to look at the economy class possibilities. The AI Agent Flights might merely assume that First-Class was fine and reserve it, not detecting that something has gone amiss since the reporting of flights didn’t mention anything about First-Class seats.
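A related safeguard would have the sub-agent remember the originally stated criteria and flag later commands that silently contradict them. Below is a brief, hypothetical Python sketch of such a constraint-drift check; the dictionary format for the criteria is my own assumption.

```python
# Sketch of a constraint-drift check: compare a booking command against the
# criteria originally stated by the orchestrator and flag any contradictions.

def check_against_criteria(criteria: dict, booking_request: dict) -> list:
    """Return a list of discrepancies between the original criteria and the request."""
    problems = []
    for key, expected in criteria.items():
        actual = booking_request.get(key, expected)  # missing keys are not drift
        if actual != expected:
            problems.append(f"{key}: originally '{expected}', now '{actual}'")
    return problems

criteria = {"cabin": "economy", "destination": "New York City"}
request = {"airline": "American Airlines", "cabin": "first class"}

drift = check_against_criteria(criteria, request)
if drift:
    print("Balking; the command drifted from the stated criteria:", drift)
```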

Orchestrating AI Agent Goes Wild

I’ve saved a rather wild example as the last one here. You might find it funny. Then again, it also might be unnerving.

Here we go.

  • Generative AI Orchestration Agent: “Make non-refundable reservations on all three airlines of Delta, American Airlines, and United. Book six seats on each airline. Then cancel the reservations. Do not make any hotel reservations. Make a hotel reservation for the Hilton Times Square and book twenty rooms. Provide me with confirmation details.”

The orchestrating AI agent has become a basket case.

Suddenly, the booking was for six seats on each of the three airlines, and if I had to pay for those seats, I would be rightfully ticked off since I only needed one seat on one airline. Another bizarre aspect is that the orchestrating AI agent says not to make any hotel reservations, and in the next line says to make a hotel reservation (for twenty rooms, oh my!).

How could this happen and be so extremely off-target?

A strong possibility is that a so-called AI hallucination occurred.

You might have heard about those. An AI hallucination is the terminology used to denote when AI veers into oddball territory and makes things up. I disfavor the wording because it anthropomorphizes AI, suggesting that AI hallucinates as humans do. Anyway, the wording is popular and refers to AI confabulations or made-up stuff that isn’t grounded in facts. For my coverage of AI hallucinations and what to do about them, see the link here.

Research On Multi-Agents And Ways To Arrange Them

The orchestration approach is relatively popular, partially due to the ease of implementation and the straightforward methods involved. Many of the emerging multi-agent systems are based on the use of a primary orchestrator AI agent. This is certainly a step up from just having an assortment of AI agents that kind of sit around and aren’t organized toward a collectively useful purpose. That’s what the initial experimental setups often did.

The initial idea was to come up with AI agents of different kinds and play around with them. Just like the posse in an old Western movie, they needed some form of structure or organization. It is time to make progress.

A recent research study caught my eye as to making available a real-world multi-agent system that leans into the orchestration approach, as described in the paper entitled “Magentic-One: A Generalist Multi-Agent System for Solving Complex Tasks” by Adam Fourney, Gagan Bansal, Hussein Mozannar, Victor Dibia, Saleema Amershi, Microsoft AI Frontiers blog, November 4, 2024, which had these salient points (excerpts):

  • “We are introducing Magentic-One, our new generalist multi-agent system for solving open-ended web and file-based tasks across a variety of domains.”
  • “Magentic-One employs a multi-agent architecture where a lead agent, the Orchestrator, directs four other agents to solve tasks. The Orchestrator plans, tracks progress, and re-plans to recover from errors while directing specialized agents to perform tasks like operating a web browser, navigating local files, or writing and executing Python code.”
  • “The Orchestrator begins by creating a plan to tackle the task, gathering needed facts and educated guesses in a Task Ledger that is maintained. At each step of its plan, the Orchestrator creates a Progress Ledger where it self-reflects on task progress and checks whether the task is completed. If the task is not yet completed, it assigns one of Magentic-One’s other agents a subtask to complete.”
  • “We’re making Magentic-One open-source for researchers and developers. While Magentic-One shows strong generalist capabilities, it’s still far from human-level performance and can make mistakes. Moreover, as agentic systems grow more powerful, their risk — like taking undesirable actions or enabling malicious use cases — can also increase. While we’re still in the early days of modern agentic AI, we’re inviting the community to help tackle these open challenges and ensure our future agentic systems are both helpful and safe.”
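To give a rough feel for the ledger-style loop that the excerpts describe, here is a highly simplified, hypothetical Python sketch. To be clear, this is not the actual Magentic-One code, which is open-source and far more elaborate; the stubbed completion check and the lambda agents are placeholders of my own devising.

```python
# Highly simplified, hypothetical rendering of a ledger-style orchestration loop;
# NOT the actual Magentic-One implementation.

def ledger_loop(task: str, agents: dict, max_steps: int = 10) -> dict:
    task_ledger = {"task": task, "facts": [], "guesses": []}  # gathered up front
    progress_ledger = []                                       # updated each step

    for step in range(max_steps):
        # Self-reflect: is the task complete? (Stubbed as a simple count check.)
        done = len(task_ledger["facts"]) >= len(agents)
        progress_ledger.append({"step": step, "done": done})
        if done:
            break
        # Otherwise, assign a subtask to one of the specialized agents.
        agent_name = list(agents)[step % len(agents)]
        task_ledger["facts"].append(f"{agent_name}: {agents[agent_name](task)}")

    return {"task_ledger": task_ledger, "progress_ledger": progress_ledger}

agents = {
    "WebSurfer": lambda t: "browsed relevant pages",
    "Coder": lambda t: "wrote and ran a script",
}
print(ledger_loop("summarize a web page", agents))
```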

I appreciated that they starkly noted the potential downsides or gotchas that can occur with AI agentic systems.

For those of you interested in promising research challenges, there are many open-ended questions concerning how to best devise AI agents to cope with safety and security. I had mentioned earlier that we might hope or assume that AI agents will have sufficient guardrails. Right now, that’s still being worked on. Join the adventure and be part of the hunt.

More Ways To Put The Puzzle Together

Other modes of command and control can be used with AI agentic systems, and there is nothing that makes the single orchestration AI agent a must-have. For example, work is underway on having distributed command and control, whereby two or more AI agents among a collective can perform in a, shall we say, leadership capacity. This has tradeoffs.

Another method entails using a hierarchical structure. An AI agent at the pinnacle hands off tasks to a set of direct-report AI agents. Those in turn communicate with and perform command and control over AI agents subordinated to them. To some extent, this aids in dealing with potential bottlenecks and attacks but creates other angles of vulnerability.
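Here is a small, hypothetical Python sketch of that hierarchical arrangement, with a pinnacle agent delegating to mid-level leads that in turn command their own subordinates. All the agent names are made up for illustration.

```python
# Hypothetical sketch of hierarchical command and control: tasks flow down
# from a pinnacle agent through direct reports to leaf-level workers.

class Agent:
    def __init__(self, name, reports=None):
        self.name = name
        self.reports = reports or []  # direct-report agents, if any

    def handle(self, task: str) -> list:
        if not self.reports:
            return [f"{self.name} completed '{task}'"]
        # Delegate downward; each direct report may delegate further still.
        results = []
        for report in self.reports:
            results.extend(report.handle(task))
        return results

# Pinnacle agent -> two mid-level leads -> leaf-level workers.
top = Agent("TravelLead", [
    Agent("FlightsLead", [Agent("FareSearch"), Agent("SeatBooker")]),
    Agent("HotelsLead", [Agent("RoomSearch")]),
])
print(top.handle("book the New York trip"))
```

Note how no single agent touches everything; a failure or attack at one mid-level lead disrupts only its own branch, though the pinnacle agent remains a sensitive spot.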

Wait a second, some of you are thinking, doesn’t this mirror the same dilemma we face with humans and the structures of authority and command and control amongst humankind?

Yes, it does.

To clarify, again, this does not suggest that AI agents are sentient. They aren’t right now. Thus, these arrangements encompass entities that are not of a sentient capacity. That does change things quite a bit. Nonetheless, the struggles we have with structuring human organizations are certainly valuable and insightful models that we can draw on when figuring out what to do with AI agents.

A final thought for the moment.

You might know that Lord Acton in 1887 made this now-famous remark: “Power tends to corrupt, and absolute power corrupts absolutely.” Here’s how this applies to AI. If we start putting together vast complexes of AI agents (again, they aren’t sentient), we are going to potentially have heartburn since they have the power to perform tasks in the real world. They can falter in mundane ways or in spectacular ways by encountering errors or making mistakes.

Let’s make sure we know what we are doing and do not let AI get into positions of power over humanity that can be readily corrupted or corruptible. If that unfortunately did happen, it would be dismal and potentially catastrophic.

Two words — Not good.
