Google’s Gemini AI Struggles with Classic Pokémon, Taking Hundreds of Hours to Complete Game Due to ‘Agent Panic’

TLDR: Google DeepMind’s Gemini AI demonstrated significant inefficiencies in playing the classic game Pokémon Blue, requiring over 800 hours for its initial completion. This prolonged playtime was attributed to a phenomenon dubbed ‘Agent Panic,’ where the AI’s reasoning capabilities degraded under pressure, coupled with a fixation on a non-existent in-game item.

A recent case study highlighted by Google DeepMind reveals that its Gemini AI model exhibited surprisingly poor performance when attempting to play the classic Nintendo game, Pokémon Blue. The project, initiated by independent engineer Joel Zhang on the Twitch channel ‘Gemini_Plays_Pokemon,’ saw the AI undertake two full playthroughs of the game, consistently choosing Squirtle as its starter Pokémon.

During these runs, the Gemini team observed a peculiar behavior they termed ‘Agent Panic.’ This phenomenon occurred when the AI’s Pokémon party was low on health or Power Points (PP). Under such conditions, the model’s reasoning capability appeared to degrade significantly, leading to observable errors such as ‘completely forgetting to use the pathfinder tool in stretches of gameplay.’

The initial attempt by the Gemini 2.5 Pro model to complete Pokémon Blue was remarkably lengthy, clocking in at over 813 hours. This extended duration was due not only to ‘Agent Panic’ but also to a curious fixation on a ‘hallucinated Tea item,’ which exists in remakes of the game but not in the original 1990s version the AI was playing.

Following some adjustments by Zhang, the AI finished its second run in 406.5 hours, roughly half the time of the first. That is still more than 15 times the average human completion time for the main story of Pokémon Blue, which is approximately 26 hours according to ‘How Long to Beat.’

Jess Kinghorn of PC Gamer, who reported on the findings, noted the irony of an advanced AI struggling with a quarter-century-old children’s video game. Kinghorn also expressed reservations about the term ‘Agent Panic,’ emphasizing that AI agents do not experience emotions like panic or truly ‘think.’ Instead, these seemingly hasty decisions are likely the Gemini model mimicking patterns found in its training data. The report also questioned the overall value of using video games for AI benchmarking, suggesting that such attempts often provide limited insights into AI capabilities.

Karthik Mehta
https://blogs.edgentiq.com
Karthik Mehta is a data journalist known for his data-rich, insightful coverage of AI news and developments. Armed with a degree in Data Science from IIT Bombay and years of newsroom experience, Karthik merges storytelling with metrics to surface deeper narratives in AI-related events. His writing cuts through hype, revealing the real-world impact of Generative AI on industries, policy, and society. You can reach him at: [email protected]
