AI-Agent School: Simulating Educational Dynamics with Evolving AI Teachers and Students

TLDR: The AI-Agent School (AAS) is a multi-agent simulation system that uses large language models (LLMs) to create high-fidelity educational scenarios. It features a “Zero-Exp” strategy with a dual memory system (experience and knowledge, each with short-term and long-term components) that allows AI teacher and student agents to autonomously evolve through interactions. Experiments show that this system effectively simulates complex educational dynamics, fostering advanced agent cognitive abilities and generating realistic behavioral data, moving education towards an “Era of Simulation.”

Large language models (LLMs) are increasingly being used to simulate and understand complex human systems. A new research paper introduces a system called the AI-Agent School (AAS), designed to simulate educational dynamics between teachers and students with high fidelity.

The core challenge addressed by this research is the difficulty in systematically modeling the teaching process and the limitations of current AI agents in accurately simulating the diverse behaviors of students and teachers in educational settings. To overcome these hurdles, the AAS system proposes a self-evolving mechanism that allows AI agents to learn and adapt within a simulated school environment.

The Zero-Exp Strategy and Dual Memory System

Central to the AAS is the “Zero-Exp” strategy, which guides agents to evolve from a state of no experience to expert-level behavior. This strategy is built upon a continuous cycle of “experience-reflection-optimization” and is powered by a dual memory base. This memory system is divided into two main components: an Experience Base, which stores records of past events and interactions, and a Knowledge Base, containing structured information like academic knowledge or teaching methodologies.

Both the Experience and Knowledge Bases are further organized into short-term and long-term memory components. Short-term memory holds information deemed most relevant for current tasks, mimicking human attention, while long-term memory serves as a comprehensive repository of all accumulated experiences and knowledge. This hierarchical and dual memory structure allows agents to retain vast amounts of information, enabling long-term learning, reflection, and decision-making crucial for their autonomous evolution.
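
The two-by-two layout described here (experience and knowledge, each with a short-term and a long-term tier) can be pictured as a small data structure. The following is a minimal sketch, not the paper's implementation: the class names, the `short_term_size` capacity, and the use of recency as a stand-in for relevance are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TieredMemory:
    """One memory base with a small short-term tier and a full long-term tier."""
    short_term_size: int = 3  # hypothetical capacity, not taken from the paper
    short_term: list = field(default_factory=list)
    long_term: list = field(default_factory=list)

    def add(self, item: str) -> None:
        # Long-term memory keeps every item ever stored.
        self.long_term.append(item)
        # Short-term memory keeps only the newest items. Recency is used here
        # as a simple stand-in for "most relevant to the current task".
        self.short_term.append(item)
        if len(self.short_term) > self.short_term_size:
            self.short_term.pop(0)


@dataclass
class DualMemory:
    """Experience Base and Knowledge Base, each with its own two tiers."""
    experience: TieredMemory = field(default_factory=TieredMemory)
    knowledge: TieredMemory = field(default_factory=TieredMemory)


memory = DualMemory()
for i in range(5):
    memory.experience.add(f"event {i}")
memory.knowledge.add("fractions need a common denominator before adding")

print(memory.experience.short_term)      # newest three events only
print(len(memory.experience.long_term))  # all five events
```

Keeping the two bases as separate objects means an event log never crowds structured knowledge out of the short-term tier, which is one plausible reading of why the article treats them as distinct.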

A Simulated School Environment

The AAS environment is a detailed virtual school, inspired by real-world layouts, featuring 25 distinct areas such as classrooms, libraries, laboratories, and sports fields. Within this environment, two main types of interactive roles exist: teacher agents and student agents. These roles are meticulously designed using LLMs to generate rich and diverse backgrounds, personality traits, and specific characteristics.

Agents perform a variety of actions tailored to their roles. Teacher agents engage in teaching practices, reflection, and guidance, while student agents participate in classroom learning, laboratory work, peer interaction, self-directed learning, and extracurricular activities. These actions drive the simulation and generate valuable behavioral data.

How Agents Evolve

The Zero-Exp mechanism ensures that agents continuously improve their behaviors. At each step of the simulation, agents process the current environment and their roles. They retrieve relevant information, prioritizing short-term memory, and then integrate it with their working memory (previous interaction history). The agent’s response and the outcomes of their actions trigger a crucial process of memory update and self-reflection. This means that new insights and optimized strategies are added to their memory bases, and their internal role settings (like teaching methods or study habits) are dynamically updated.
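
One simulation step as described above can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the authors' code: the `Agent` class, keyword-overlap retrieval, the `role_settings` dictionary, and the `stub_llm` stand-in for a real model call are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


def stub_llm(prompt: str) -> str:
    # Stand-in for a real LLM call (assumption); echoes the prompt's last line.
    return f"[model output for: {prompt.splitlines()[-1]}]"


@dataclass
class Agent:
    role: str
    role_settings: Dict[str, str] = field(default_factory=dict)
    short_term: List[str] = field(default_factory=list)
    long_term: List[str] = field(default_factory=list)
    working_memory: List[str] = field(default_factory=list)  # interaction history

    def retrieve(self, query: str, k: int = 2) -> List[str]:
        """Return up to k memories sharing a word with the query, short-term first."""
        words = set(query.lower().split())
        hits: List[str] = []
        for item in self.short_term + self.long_term:
            if words & set(item.lower().split()) and item not in hits:
                hits.append(item)
        return hits[:k]

    def step(self, observation: str, llm: Callable[[str], str]) -> str:
        # 1. Retrieve relevant memories, prioritizing the short-term tier.
        recalled = self.retrieve(observation)
        # 2. Combine role, recalled memories, and working memory into a prompt.
        lines = [f"Role: {self.role}"]
        lines += [f"Recalled: {m}" for m in recalled]
        lines += [f"History: {h}" for h in self.working_memory]
        lines.append(f"Now: {observation}")
        response = llm("\n".join(lines))
        # 3. Memory update: record what happened.
        record = f"{observation} -> {response}"
        self.working_memory.append(record)
        self.long_term.append(record)
        self.short_term = (self.short_term + [record])[-3:]
        # 4. Self-reflection: store the insight and update the role settings.
        insight = llm(f"Reflect on this interaction and suggest a better strategy.\n{record}")
        self.long_term.append(insight)
        self.role_settings["strategy"] = insight
        return response


agent = Agent(
    role="student",
    short_term=["fractions need common denominators"],
    long_term=["fractions need common denominators", "recess is at noon"],
)
reply = agent.step("practice fractions today", stub_llm)
print(reply)
```

In a real system the keyword match would likely be replaced by embedding-based retrieval and `stub_llm` by an actual model client; the sketch only shows the order of operations.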

Experimental Validation

To evaluate the effectiveness of the AAS and its Zero-Exp mechanism, researchers conducted extensive experiments using various LLMs as agents, including GPT-4o, Qwen3-235B-A22B, and Qwen3-8B. They designed nine different memory configurations to analyze the impact of the dual memory structure and the short-term/long-term hierarchy.

The dataset for these simulations was created through a multi-step process involving LLM generation and rigorous expert refinement, ensuring realistic initial conditions and high-fidelity interaction sequences. Evaluation was performed using both automated metrics (ROUGE-L scores for text similarity) and human evaluation by educational experts.
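
For readers unfamiliar with ROUGE-L, it scores a generated text against a reference by the length of their longest common subsequence (LCS) of tokens. The sketch below computes the F1 variant over lowercase whitespace tokens. The article does not say which ROUGE-L variant or tokenizer the researchers used, so both choices here are assumptions, and the example sentences are invented.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate: str, reference: str) -> float:
    """ROUGE-L F1 over lowercase whitespace tokens (simplified tokenization)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)


score = rouge_l_f1(
    "the student asks a question",
    "the student asks the teacher a question",
)
print(round(score, 3))  # 0.833
```

Here all five candidate tokens appear in order in the seven-token reference, so precision is 1, recall is 5/7, and the F1 score is 5/6.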

The full model, incorporating both the dual experience/knowledge base and the short-term/long-term hierarchy, consistently achieved the highest ROUGE-L scores. This points to three benefits: external memory itself, the separate organization of experience and knowledge, and the prioritization of salient memories in short-term memory.

Human evaluation further corroborated these findings. Educational experts judged the interactions generated by the full model as significantly more realistic than those from baseline configurations. Over time, the perceived realism of the full model’s agents approached that of expert-curated ground truth data, indicating a strong learning and adaptation curve.

Pioneering Computational Education Science

This research marks a significant step towards a new paradigm of “Computational Education Science,” integrating traditional educational research with advanced AI technologies. The AAS environment and Zero-Exp mechanism provide a verifiable technical model for developing educational digital twins and generating valuable behavioral data. This work helps propel the education field from the “Era of Experience” to the “Era of Simulation,” offering foundational elements for future educational systems, teacher training platforms, and policy simulation tools.

While promising, the research acknowledges limitations such as the current simulation scale (50 agents over 5 days) and the reliance on LLMs without visual perception capabilities. Future work aims to scale the environment, incorporate multimodal models, and apply the generated data to specific educational applications like personalized learning.
