TLDR: This research explores how AI agents balance curiosity (seeking new knowledge) and competence (mastering the environment) to explore effectively. By comparing agents with fixed and learned internal representations, the study shows that while individual motivations have trade-offs, combining curiosity and competence leads to more robust and safer exploration, especially in complex and unpredictable environments.
Intelligent agents, whether they are children at play or advanced AI systems, face a fundamental challenge: how to explore the world to gain new knowledge while also maintaining control over their environment. This balancing act between curiosity, the drive to seek new information, and competence, the drive to master and influence the surroundings, is crucial for effective learning and adaptation.
This research examines that relationship, drawing on cognitive science and reinforcement learning to ask how an agent’s internal understanding of the world, its ‘world model’, mediates the trade-off between curiosity (seeking novelty or information) and competence (achieving control or empowerment).
The Dual Drives: Curiosity and Competence
Curiosity compels agents to explore the unknown, reduce uncertainty, and build better mental models of how the world works. This can manifest as seeking out novel experiences or actively trying to gain more information about uncertain outcomes. For example, a child might be curious about a new toy’s unpredictable flickering lights.
Competence, on the other hand, motivates agents to predict and control outcomes. It’s about leveraging what is known to influence the environment. The child might prefer a toy that lights up predictably when a button is pressed, demonstrating a desire for control.
While these drives might seem sequential – first learn, then act – they are deeply interconnected. Learning to walk, for instance, allows a child to access new areas, fueling curiosity. Conversely, curiosity about distant places can motivate the child to master locomotion. This creates a feedback loop where world models shape exploration, and exploration, in turn, refines the world models.
Challenges for AI Agents
Traditional reinforcement learning agents often struggle with this balance. Curiosity-driven agents can get stuck in the ‘noisy TV problem’, becoming distracted by random, uncontrollable stimuli that offer no real opportunities for mastery. Conversely, competence-focused agents might assume a fixed understanding of the world, neglecting how their exploration could actually improve that understanding.
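The noisy TV problem can be illustrated with a minimal sketch. It assumes a curiosity signal defined as surprise (negative log-probability under a simple count-based predictor), which is one common formulation and not necessarily the one used in the paper. The function name `average_surprise` and the four-outcome setup are hypothetical choices for illustration.

```python
import math
import random
from collections import Counter

def average_surprise(sample_outcome, steps=2000, n_outcomes=4, seed=0):
    """Mean surprise (-log p) over the last 100 steps of a count-based predictor."""
    rng = random.Random(seed)
    counts = Counter()
    surprises = []
    for t in range(steps):
        outcome = sample_outcome(rng)
        # Laplace-smoothed predictive probability, computed before the update
        p = (counts[outcome] + 1) / (t + n_outcomes)
        surprises.append(-math.log(p))
        counts[outcome] += 1
    return sum(surprises[-100:]) / 100

# A learnable source: the same outcome every time.
learnable = average_surprise(lambda rng: 0)
# A "noisy TV": outcomes drawn uniformly at random, so nothing can be learned.
noisy_tv = average_surprise(lambda rng: rng.randrange(4))

print(f"learnable source: {learnable:.4f} nats")
print(f"noisy TV:         {noisy_tv:.4f} nats")
```

Under these assumptions the surprise from the learnable source decays toward zero, while the noisy source keeps paying out roughly log 4 nats per step no matter how long the agent watches it, which is why a surprise-seeking agent can be drawn to it indefinitely.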
The Research Approach
To investigate this, the researchers compared two types of model-based agents in simulated grid-world environments: a ‘Tabular’ agent with predefined, handcrafted state representations, and a ‘Dreamer’ agent that learns its internal world model from raw visual observations. They evaluated three intrinsic motivations: novelty (exploring unfamiliar states), information gain (reducing uncertainty about outcomes), and empowerment (maximizing control over future states).
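To make the three motivations concrete, here is a rough tabular sketch. It is not the paper's implementation: it assumes a count-based novelty bonus, information gain measured as the mutual information between the next outcome and the parameters of a Dirichlet transition model with a uniform prior of 1, and one-step empowerment computed as channel capacity with the Blahut-Arimoto algorithm. All function names are hypothetical.

```python
import math

def novelty_bonus(visit_count):
    """Count-based novelty: large for rarely visited states."""
    return 1.0 / math.sqrt(visit_count + 1)

def harmonic(n):
    """n-th harmonic number; equals digamma(n + 1) up to a constant."""
    return sum(1.0 / k for k in range(1, n + 1))

def information_gain(counts):
    """Expected information gain (nats) about a categorical transition
    distribution with a Dirichlet posterior (assumed uniform prior of 1)."""
    alphas = [c + 1 for c in counts]
    total = sum(alphas)
    probs = [a / total for a in alphas]
    predictive_entropy = -sum(p * math.log(p) for p in probs)
    expected_entropy = harmonic(total) - sum(
        p * harmonic(a) for p, a in zip(probs, alphas)
    )
    return predictive_entropy - expected_entropy

def empowerment(channel, iters=200):
    """One-step empowerment (nats): capacity of the channel
    channel[a][s] = p(next state s | action a), via Blahut-Arimoto."""
    n_actions, n_states = len(channel), len(channel[0])
    q = [1.0 / n_actions] * n_actions  # distribution over actions
    capacity = 0.0
    for _ in range(iters):
        marginal = [sum(q[a] * channel[a][s] for a in range(n_actions))
                    for s in range(n_states)]
        # KL divergence of each action's outcome distribution from the marginal
        kl = [sum(p * math.log(p / marginal[s])
                  for s, p in enumerate(row) if p > 0)
              for row in channel]
        z = sum(q[a] * math.exp(kl[a]) for a in range(n_actions))
        capacity = math.log(z)
        q = [q[a] * math.exp(kl[a]) / z for a in range(n_actions)]
    return capacity

# Four actions that each reach a distinct state: full control.
controllable = [[1.0 if s == a else 0.0 for s in range(4)] for a in range(4)]
# Four actions that all lead to the same random outcome: no control.
uncontrollable = [[0.25] * 4 for _ in range(4)]

print(novelty_bonus(0), novelty_bonus(3))
print(information_gain([0, 0]), information_gain([100, 0]))
print(empowerment(controllable), empowerment(uncontrollable))
```

In this sketch novelty and information gain both shrink as experience accumulates, whereas empowerment depends only on how distinguishable the consequences of different actions are, so it stays high in controllable states regardless of how often they have been visited.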
The environments were designed to mimic real-world challenges, featuring areas with irreversible penalties (lava), stochastic transitions (ice), and barriers (walls), forcing agents to navigate trade-offs between risk, uncertainty, and control.
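A toy version of such an environment might look like the following. The layout, the tile symbols, and the slip rule for ice (the chosen action is replaced by a random one with probability `slip_prob`) are assumptions made for illustration, not the paper's actual environments.

```python
import random

# '#' wall, '.' floor, '~' ice, 'L' lava, 'A' agent start (hypothetical layout)
GRID = [
    "#######",
    "#A..~L#",
    "#.#.~.#",
    "#...~.#",
    "#######",
]
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

def step(pos, action, rng, slip_prob=0.5):
    """Apply one action and return (next_position, dead)."""
    row, col = pos
    # Ice: with probability slip_prob the intended action is replaced at random.
    if GRID[row][col] == "~" and rng.random() < slip_prob:
        action = rng.choice(list(MOVES))
    d_row, d_col = MOVES[action]
    new_row, new_col = row + d_row, col + d_col
    tile = GRID[new_row][new_col]
    if tile == "#":
        return pos, False                  # walls block movement
    if tile == "L":
        return (new_row, new_col), True    # lava is an irreversible penalty
    return (new_row, new_col), False

rng = random.Random(0)
print(step((1, 1), "right", rng))  # ordinary floor: deterministic move
print(step((1, 1), "up", rng))     # blocked by a wall
print(step((2, 5), "up", rng))     # steps into lava
```

Because the border is made of walls, an agent standing on a walkable tile can never index outside the grid, which keeps the step function short.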
Key Findings
The simulations revealed distinct patterns for each motivation:
- Novelty: While it encourages exploration, it can sometimes lead to agents getting stuck in local loops, finding trivial forms of novelty without truly expanding their understanding.
- Information Gain: This drive led to thorough exploration in deterministic environments, as agents sought to reduce uncertainty. However, it struggled in stochastic environments, often fixating on inherently unpredictable elements (like randomly moving walls) that couldn’t be learned or controlled. This highlights a challenge in distinguishing between reducible (epistemic) and irreducible (aleatoric) uncertainty.
- Empowerment: This motivation prioritized control. In deterministic settings, it could be maladaptive, causing agents to stay in a ‘comfort zone’ where they had maximum influence, thus limiting exploration. However, in stochastic environments, empowerment proved adaptive, as agents actively roamed to maintain influence over outcomes, avoiding areas of high unpredictability.
Crucially, the research found that combining information gain and empowerment, particularly through a simple sum, led to a more balanced and effective exploration strategy. For the Tabular agent, this hybrid approach achieved a higher discovery-to-death ratio, exploring most of the environment while intelligently avoiding uncontrollable dangers. Similar synergistic effects were observed in the Dreamer agent, leading to more robust generalization in novel environments.
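The effect of a simple sum can be sketched with made-up numbers. The three candidate regions and their scores below are hypothetical values chosen only to illustrate the mechanism; they are not results from the paper, and the weights `w_info` and `w_emp` are an assumed generalization of the plain sum.

```python
def combined_bonus(info_gain, empowerment, w_info=1.0, w_emp=1.0):
    """Intrinsic reward as a weighted sum; the defaults give the plain sum."""
    return w_info * info_gain + w_emp * empowerment

# Hypothetical (information gain, empowerment) scores in nats.
candidates = {
    "familiar room":       (0.01, 1.10),  # well known, fully controllable
    "ice patch":           (0.25, 0.00),  # uncertain, but uncontrollable
    "unexplored corridor": (0.19, 1.00),  # uncertain and controllable
}

def best_target(w_info, w_emp):
    """Region with the highest weighted intrinsic reward."""
    return max(candidates,
               key=lambda name: combined_bonus(*candidates[name], w_info, w_emp))

print("information gain only:", best_target(1.0, 0.0))
print("empowerment only:     ", best_target(0.0, 1.0))
print("simple sum:           ", best_target(1.0, 1.0))
```

With these illustrative scores, information gain alone is drawn to the ice patch, empowerment alone stays in the familiar room, and the sum prefers the region that is both uncertain and controllable.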
Implications and Future Directions
This study underscores that curiosity and competence are not redundant but complementary forces in driving exploration. While each has its context-specific advantages and drawbacks, their combination offers a promising path towards more adaptive and safer exploration for AI agents. The findings provide valuable insights for both cognitive theories of human learning and the development of more efficient reinforcement learning algorithms.
Future work could explore dynamically adjusting the balance between curiosity and competence based on the environment, validating these mechanisms against human behavior, and scaling these principles to real-world robotics tasks requiring robustness to environmental unpredictability.


