
MOASEI Competition Unveils Promising AI Strategies for Dynamic Open-World Environments

TLDR: The inaugural MOASEI Competition evaluated multi-agent AI systems in dynamic, open-world environments across Wildfire, Rideshare, and Cybersecurity tracks. It showcased diverse solutions, including GNNs, CNNs, and LLM-driven meta-optimization, highlighting their adaptability and robustness. The University of Tehran, Markov Mayhem, and Zana Cyber teams were recognized as winners, demonstrating effective strategies for handling unpredictable agent and task changes. Future competitions will increase complexity and introduce new forms of openness.

The inaugural Methods for Open Agent Systems Evaluation Initiative (MOASEI) Competition, held at AAMAS 2025, marked a significant step in evaluating artificial intelligence systems designed to operate in complex, unpredictable environments. This international benchmarking event focused on multi-agent AI systems under “open-world conditions,” where agents and tasks can dynamically appear, disappear, or change their behavior over time. The competition was built upon the free-range-zoo environment suite, providing a practical testbed for exploring how AI agents adapt to dynamic and evolving scenarios. You can find the full technical report here.

Competition Structure and Challenges

The 2025 competition featured three distinct tracks, each designed to highlight different dimensions of openness and coordination complexity:

The Wildfire track incorporated both agent and task openness. Here, participants had to manage shifting agent availability and dynamically appearing fires in a high-stakes, coordination-heavy setting. Agents needed to efficiently suppress fires while adapting to new fires emerging and other agents potentially joining or leaving the operation.

The Rideshare track focused primarily on task openness. In this scenario, driver agents were tasked with delivering passengers who entered the system unpredictably over time. The challenge lay in prioritizing and fulfilling these dynamic requests efficiently.

The Cybersecurity track emphasized agent openness. Defender agents had to adapt to changes in the collective strength of both their allies and opposing attacker agents. This simulated the real-world challenge of managing rotating personnel or automated defense systems within a network.

Evaluation metrics for all tracks centered on expected utility, robustness to environmental changes, and responsiveness to agent or task availability. Additional domain-specific metrics, such as efficiency and action preferences, were also used.
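The report does not publish the exact scoring formulas, but the headline metric is straightforward to sketch. The following is a minimal illustration (function names and the variance-based robustness proxy are assumptions, not the competition's official definitions):

```python
import statistics

def expected_utility(episode_rewards):
    """Estimate expected utility as the mean cumulative reward per episode.

    `episode_rewards` is a list of per-step reward lists, one per episode.
    """
    cumulative = [sum(steps) for steps in episode_rewards]
    return statistics.mean(cumulative)

def robustness(cumulative_rewards):
    """One simple robustness proxy: mean reward penalized by its spread
    across episodes, so policies that degrade under environmental change
    score lower."""
    return statistics.mean(cumulative_rewards) - statistics.pstdev(cumulative_rewards)

# Example: three episodes of step rewards
episodes = [[1.0, 2.0], [0.5, 1.5], [2.0, 2.0]]
print(expected_utility(episodes))  # mean of [3.0, 2.0, 4.0] -> 3.0
```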

Participation and Key Findings

Eleven teams from international institutions registered for the competition. Four of these teams submitted diverse solutions for the Wildfire and Cybersecurity tracks, demonstrating significant interest in open-system challenges. Notably, the Rideshare track did not receive any submissions in this inaugural cycle.

The competition revealed several promising strategies for generalization and adaptation in open environments:

Graph Neural Networks (GNNs) proved highly adaptable, especially in scenarios with agent dropout and dynamic tasks. These policies leveraged context-aware embeddings to capture relational structures, leading to better task selection and emergent coordination.
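The teams' code is not included in the report, but the core idea behind such relational policies can be sketched. Below is a deliberately simplified mean-aggregation message-passing step in plain Python (names and graph layout are hypothetical); because each node averages over whatever neighbours currently exist, the embedding remains well-defined when agents drop out or new tasks appear:

```python
def message_passing_step(node_feats, edges):
    """One round of mean-aggregation message passing.

    node_feats: {node_id: [float, ...]} feature vectors
    edges: list of (src, dst) pairs; messages flow src -> dst
    Each node's new embedding is the element-wise mean of its own
    features and those of its in-neighbours.
    """
    incoming = {n: [feats] for n, feats in node_feats.items()}
    for src, dst in edges:
        if src in node_feats and dst in incoming:
            incoming[dst].append(node_feats[src])
    updated = {}
    for n, msgs in incoming.items():
        dim = len(node_feats[n])
        updated[n] = [sum(m[i] for m in msgs) / len(msgs) for i in range(dim)]
    return updated

# One agent and one fire task; a departed agent is simply an absent
# node, and the aggregation still works for the remaining graph.
feats = {"a1": [1.0, 0.0], "fire1": [0.0, 2.0]}
out = message_passing_step(feats, [("a1", "fire1")])
print(out["fire1"])  # mean of [0.0, 2.0] and [1.0, 0.0] -> [0.5, 1.0]
```

A real GNN policy would learn the aggregation weights (e.g., with a message-passing library) rather than using a fixed mean, but the openness-tolerance comes from this same structural property.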

Convolutional Neural Networks (CNNs) also showed strong performance, matching the peak results of GNN-based policies. Teams using spatial convolutions for task selection demonstrated that CNNs can be a viable and lightweight alternative to graph-based methods.
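To see why spatial convolutions suit grid-structured tasks like wildfire cells, here is a rough stand-in (the kernel and grid are invented for illustration; a trained CNN would learn its kernels): a fixed kernel is slid over a fire-intensity grid, and the agent targets the highest-scoring window.

```python
def conv_scores(grid, kernel):
    """Valid-mode 2D cross-correlation: slide `kernel` over `grid` and
    sum element-wise products, giving each window a spatial score."""
    gh, gw = len(grid), len(grid[0])
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(grid[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(gw - kw + 1)]
            for i in range(gh - kh + 1)]

def pick_task(grid, kernel):
    """Select the window with the hottest weighted fire intensity."""
    scores = conv_scores(grid, kernel)
    return max(((i, j) for i in range(len(scores)) for j in range(len(scores[0]))),
               key=lambda ij: scores[ij[0]][ij[1]])

fire = [[0, 1, 0],
        [0, 3, 1],
        [0, 0, 0]]
smooth = [[1, 1], [1, 1]]  # 2x2 summing kernel
print(pick_task(fire, smooth))  # prints (0, 1): the window covering the intensity-3 cell
```

The appeal noted in the findings is exactly this: the same small set of kernel weights applies anywhere on the grid, so the policy is lightweight and indifferent to where new fires appear.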

Some teams found success by utilizing augmented loss functions and pre-trained predictors. By pretraining predictors with supervised learning, agents could better model the intentions of other agents, leading to improved coordination and mitigating issues caused by partial observability.
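The shape of such an augmented objective can be sketched as a policy loss plus a weighted supervised term (the function, the squared-error form, and the weight are illustrative assumptions, not the teams' actual formulation):

```python
def augmented_loss(policy_loss, predicted_intents, observed_intents, weight=0.5):
    """Policy loss plus a supervised intention-prediction term.

    predicted_intents / observed_intents: per-teammate probabilities
    that each teammate targets a given task. The auxiliary squared-error
    term pushes the agent to model its teammates, which eases
    coordination under partial observability.
    """
    aux = sum((p - o) ** 2 for p, o in zip(predicted_intents, observed_intents))
    aux /= max(len(predicted_intents), 1)
    return policy_loss + weight * aux

print(round(augmented_loss(1.0, [0.8, 0.2], [1.0, 0.0]), 4))  # 1.02
```

In practice the predictor would be pretrained on logged trajectories (the supervised-learning step the report describes) before being plugged into this combined objective.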

A particularly interesting finding was the viability of Large Language Models (LLMs) for handling open environments. One team successfully used an LLM to iteratively refine a baseline policy through self-prompting and episodic feedback. This approach demonstrated that LLMs could act as “meta-policy optimizers,” guiding policy adaptation over time in response to observed openness, rather than serving as direct decision-makers. This opens up possibilities for hybrid designs combining LLM-guided meta-adaptation with other policy architectures.
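The meta-optimizer loop described above can be sketched as follows. Everything here is a hypothetical stand-in (the `llm_suggest` stub in particular replaces a real LLM call); the point is the control flow, in which the LLM proposes policy revisions from episodic feedback but never selects actions itself:

```python
def refine_policy(base_params, evaluate, llm_suggest, rounds=3):
    """Iteratively refine policy parameters with an LLM as meta-optimizer:
    the LLM sees episodic feedback and proposes the next parameter set.

    `evaluate(params)` runs episodes and returns a score;
    `llm_suggest(prompt)` is a stand-in for a real LLM call.
    """
    params, best_score = base_params, evaluate(base_params)
    for _ in range(rounds):
        prompt = f"params={params} score={best_score}; propose new params"
        candidate = llm_suggest(prompt)
        score = evaluate(candidate)
        if score > best_score:  # keep only improving revisions
            params, best_score = candidate, score
    return params, best_score

# Toy stand-ins: maximize -(x - 2)^2; the "LLM" nudges the parameter up.
evaluate = lambda p: -(p - 2.0) ** 2
llm_suggest = lambda prompt: float(prompt.split("params=")[1].split(" ")[0]) + 0.5
print(refine_policy(1.0, evaluate, llm_suggest))
```

A hybrid design of the kind the report envisions would plug a GNN or CNN policy in as `params`/`evaluate` and let the outer LLM loop steer adaptation as openness is observed.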

Competition Winners

In the Wildfire track, the University of Tehran (using a CNN-based approach) and Markov Mayhem (using a GNN-based approach) were recognized as the winners, achieving the highest cumulative episodic rewards. The BIT Student team (using an LLM-based approach) received an honorable mention.

For the Cybersecurity track, the Zana Cyber team, which utilized a weighted scoring approach, emerged as the winner, demonstrating superior performance in resisting attacker agents and managing network states effectively.

Looking Ahead to MOASEI 2026

Building on the success of the 2025 competition, organizers are planning several enhancements for MOASEI 2026. The Wildfire domain will increase in spatial scale with more agents and tasks, and will introduce “frame openness,” where agent capabilities can dynamically change over time (e.g., equipment degradation or altered skill sets). The Cybersecurity domain will feature enhanced attacker policies and an increased number of defensive agents and subnetworks, encouraging deeper cooperative strategies.

The Rideshare domain will be reintroduced with a stronger emphasis on direct competition, where submitted agent policies will compete for passengers, fostering policies robust against adversarial interference. A public archive of past submissions and leaderboards will also be released to support research continuity and community engagement.

The inaugural MOASEI Competition successfully demonstrated the potential of benchmarking AI policies in multi-agent environments with recognized forms of openness. It highlighted diverse solution approaches and established a robust infrastructure for evaluating AI systems under real-world conditions of uncertainty and dynamism, laying a strong foundation for future research.

Karthik Mehta
Karthik Mehta is a data journalist known for his data-rich, insightful coverage of AI news and developments. Armed with a degree in Data Science from IIT Bombay and years of newsroom experience, Karthik merges storytelling with metrics to surface deeper narratives in AI-related events. His writing cuts through hype, revealing the real-world impact of Generative AI on industries, policy, and society. You can reach him at: [email protected]
