TLDR: This research proposes a multi-agent AI framework for Beyond 5G and 6G networks, moving away from centralized O-RAN RIC control. Specialized, collaborating agents handle tasks from data collection to policy deployment, with a crucial independent verification layer. Experiments on a traffic steering use case demonstrate that this agentic system prevents unsafe policy deployment under network drift, preserving global network stability, unlike naive AI approaches that can destabilize neighboring cells for local gains.
As our mobile networks evolve beyond 5G and towards 6G, they are becoming incredibly complex. We’re talking about massive numbers of devices, diverse service needs, and very strict performance expectations. To manage this, Artificial Intelligence and Machine Learning (AI/ML) are increasingly used for tasks like traffic steering and resource allocation. Current network architectures, like those based on O-RAN’s RIC (RAN Intelligent Controller), have helped make networks more programmable. However, these centralized approaches have significant drawbacks.
The main issue with RIC-centric systems is that decision-making is concentrated in one place. This creates bottlenecks and limits how autonomous the network can be at the edge. AI models in these systems can also become unreliable over time as the network environment drifts, potentially leading to unsafe or biased actions. And there’s often no independent check that a new policy is safe before it is put into action. Together, these problems raise concerns about scalability, safety, and whether operators can fully trust such systems in critical network operations.
A New Vision: Multi-Agent AI for Network Autonomy
To overcome these limitations, a new approach is proposed: a multi-agentic architecture. Instead of a single, centralized controller, this framework uses many specialized AI agents that work together in a distributed way. Each agent has a specific role, such as collecting data, training models, making predictions, generating policies, verifying those policies, deploying them, and ensuring security and auditability. This collaborative setup aims to achieve true autonomy, resilience, explainability, and system-wide safety in the network.
The design of this multi-agent system is guided by five core principles:
- Autonomy: Each agent can perform its tasks and recover from issues independently without disrupting the entire network.
- Interoperability: Agents communicate seamlessly using standardized methods, allowing diverse functions to coordinate effectively.
- Explainability: Every decision made by the system comes with a clear, human-understandable reason, building trust with network operators.
- Resilience: If one agent fails, the entire system isn’t compromised; tasks can be dynamically rerouted.
- Feedback loops: Agents continuously refine processes based on ongoing performance and detected issues.
Imagine a team of experts, each with a specific job, all working together to manage the network. For example, an Orchestrator Agent kicks off tasks, a Data Collector Agent gathers raw network information, a Preprocessor and Feature Agent cleans and prepares this data, and a Model Trainer Agent builds AI models. Crucially, a Predictor Agent forecasts network behavior, and a Policy Generator Agent suggests actions. But before any action is taken, a Verifier Agent steps in to ensure safety.
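To make this concrete, here is a minimal Python sketch of how such a pipeline might be wired together. The paper describes the architecture, not an implementation, so all class names, the message format, and the toy KPI values below are our own illustrative assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class Message:
    """Standardized envelope so heterogeneous agents can interoperate."""
    sender: str
    payload: dict
    trace: list = field(default_factory=list)  # record of which agents handled it

class Agent:
    name = "agent"
    def handle(self, msg: Message) -> Message:
        msg.trace.append(self.name)  # every hop is logged, aiding auditability
        msg.payload = self.work(msg.payload)
        return msg
    def work(self, data: dict) -> dict:
        raise NotImplementedError

class DataCollectorAgent(Agent):
    name = "data_collector"
    def work(self, data):
        # Toy KPIs standing in for raw network measurements.
        data["raw_kpis"] = {"cell_a_load": 0.92, "cell_b_load": 0.35}
        return data

class PreprocessorAgent(Agent):
    name = "preprocessor"
    def work(self, data):
        data["features"] = {k: round(v, 2) for k, v in data["raw_kpis"].items()}
        return data

class OrchestratorAgent:
    """Kicks off a task and routes it through the specialist agents in order."""
    def __init__(self, pipeline):
        self.pipeline = pipeline
    def run(self, task: dict) -> Message:
        msg = Message(sender="orchestrator", payload=task)
        for agent in self.pipeline:
            msg = agent.handle(msg)
        return msg

result = OrchestratorAgent([DataCollectorAgent(), PreprocessorAgent()]).run(
    {"task": "traffic_steering"}
)
print(result.trace)                # ['data_collector', 'preprocessor']
print(result.payload["features"])  # cleaned features ready for the Model Trainer
```

The same handler interface would extend to the Model Trainer, Predictor, and Policy Generator agents, with the shared message envelope giving each one a standard way to pass results (and an audit trail) downstream.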
Ensuring Safety with Policy Verification
One of the most significant contributions of this multi-agent architecture is its independent verification layer. In traditional AI-driven network control, policies suggested by AI models are often deployed directly without thorough checking. While these policies might improve local network performance in the short term, they can unintentionally harm neighboring cells, cause interference, or degrade overall service quality, especially when network conditions change unexpectedly.
The multi-agent system addresses this by introducing a critical step: policy verification. After a Predictor Agent forecasts network behavior and a Policy Generator Agent suggests actions (like shifting traffic), a Simulator/Baseline Agent independently creates a reference of what safe or expected network behavior should look like. Then, the Verifier Agent compares the predicted outcomes of the proposed policy against these safe baselines. If a policy is found to violate safety limits (e.g., overloading a neighboring cell), it is rejected. This triggers a feedback loop, where a Drift Detector Agent might initiate retraining of models if the rejection is due to the AI model becoming outdated.
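A simplified sketch of what that verification step could look like in code is below. The KPI names, the 10% safety margin, and the toy load values are illustrative assumptions, not details taken from the paper.

```python
# The Verifier compares the Predictor's forecast for a proposed policy
# against the Simulator/Baseline Agent's reference and rejects anything
# outside the safety envelope.
SAFETY_MARGIN = 0.10  # assumed: allow 10% deviation from the safe baseline

def verify(predicted_kpis: dict, baseline_kpis: dict) -> tuple[bool, str]:
    """Return (approved, human-readable reason) for the Audit Agent."""
    for kpi, baseline in baseline_kpis.items():
        predicted = predicted_kpis[kpi]
        limit = baseline * (1 + SAFETY_MARGIN)
        if predicted > limit:
            return False, f"{kpi}: predicted {predicted:.2f} exceeds safe bound {limit:.2f}"
    return True, "all predicted KPIs within safety envelope"

# Proposed offloading policy under drift: the neighbor cell has degraded,
# so the forecast post-offload load breaches the baseline.
approved, reason = verify(
    predicted_kpis={"neighbor_cell_load": 0.97},
    baseline_kpis={"neighbor_cell_load": 0.80},
)
if not approved:
    print("policy rejected:", reason)  # signal feeds back to the Drift Detector Agent
```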
Only policies that pass this rigorous verification process are handed over to the Deployment Agent for execution in the live network. An Audit and Explainability Agent then records the reasons behind approvals or rejections, ensuring transparency, while a Security Agent maintains the integrity of all communications between agents.
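For intuition, an audit record produced at this stage might look something like the following. The field names and the hash-based integrity check are our own assumptions about how the Audit and Security Agents could cooperate.

```python
import hashlib
import json
import time

def audit_record(policy_id: str, approved: bool, reason: str) -> dict:
    record = {
        "policy_id": policy_id,
        "approved": approved,
        "reason": reason,          # human-readable explanation of the decision
        "timestamp": time.time(),
    }
    # Integrity digest so tampering with the record can be detected later.
    record["digest"] = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()
    ).hexdigest()
    return record

print(audit_record("ts-policy-042", False, "neighbor_cell_load exceeds safe bound"))
```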
Real-World Demonstration: Traffic Steering
To prove the effectiveness of this approach, the researchers conducted an experiment using a traffic steering scenario. Traffic steering involves moving users from a congested cell to a less busy neighboring cell to balance the load. While this can help the congested cell, a naive AI approach might overload the neighbor, especially if the network conditions change unexpectedly (a “drift” scenario).
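A toy calculation makes the risk concrete. The load figures and capacity threshold below are invented for illustration; they are not the paper’s experimental values.

```python
# Why naive steering is risky under drift: offloading helps the congested
# cell, but can push a silently degraded neighbor past its capacity.

def offload(congested_load: float, neighbor_load: float, fraction: float):
    """Move `fraction` of the congested cell's load to the neighbor."""
    moved = congested_load * fraction
    return congested_load - moved, neighbor_load + moved

# Nominal conditions: the neighbor has headroom, so offloading is fine.
print(offload(0.90, 0.40, 0.30))  # -> (0.63, 0.67)

# Drift scenario: the neighbor has degraded, shrinking its effective capacity,
# so the very same policy now overloads it.
degraded_capacity = 0.60
_, neighbor_after = offload(0.90, 0.40, 0.30)
print(neighbor_after > degraded_capacity)  # True: unsafe without verification
```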
The experiment compared two scenarios: a “No-Agent” approach where AI policies were deployed directly, and an “Agentic” approach where policies went through the independent verification stage. Under conditions of a user surge and a simulated degradation (drift) in the neighboring cell, the results were clear. The No-Agent scenario showed improvements in the congested cell (e.g., fewer connected users, higher throughput) but at a severe cost to the neighboring cell (e.g., collapsed throughput, degraded signal quality).
In stark contrast, the Agentic system, through its Verifier Agent, blocked the unsafe offloading policy. This meant that while the congested cell didn’t see immediate local gains, the global network health and stability of the neighboring cell were preserved. This demonstrates that even powerful AI predictors can make risky recommendations under unexpected conditions, and independent verification is crucial to prevent network destabilization.
This research highlights that the future of autonomous networks in Beyond 5G and 6G cannot rely solely on centralized AI or tightly coupled controllers. Instead, it requires a collaborative framework of distributed agents that work together to ensure safety, resilience, and explainability. For more details, you can read the full research paper here.


