TLDR: Google DeepMind has launched Gemini Robotics 1.5, a new suite of AI models designed to bring advanced agentic capabilities to robots. These models enable robots to perceive, reason, plan, use tools, and interact with humans to solve complex tasks in the physical world, marking a significant step towards achieving Artificial General Intelligence (AGI) in robotics.
Google DeepMind has announced a significant leap in artificial intelligence with the introduction of Gemini Robotics 1.5, a new generation of AI models aimed at empowering robots with advanced agentic capabilities in the physical world. This development is poised to transform how robots understand their environments and execute intricate tasks, even those they haven’t been explicitly trained for.
Carolina Parada, Head of Robotics at Google DeepMind, emphasized the importance of this release, stating, ‘Gemini Robotics 1.5 marks an important milestone toward solving AGI in the physical world.’ She added that by integrating agentic capabilities, Google DeepMind is moving beyond reactive models to create systems that can ‘truly reason, plan, actively use tools and act to solve complex tasks.’
At the core of the release is a dual-model approach: Gemini Robotics 1.5, a Vision-Language-Action (VLA) model, paired with Gemini Robotics-ER 1.5, an Embodied Reasoning (ER) model. The VLA model is designed to ‘see’ (vision), ‘understand’ (language), and ‘act’ (action) in the physical world, turning visual inputs and user prompts into actions that generalize across different robotic embodiments. Complementing it, the ER model provides state-of-the-art embodied reasoning, enabling robots to ‘think before acting’ and make informed decisions.
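To make the division of labor concrete, here is a minimal orchestration sketch of the plan-then-act pattern the announcement describes. The class and method names (`EmbodiedReasoner`, `VisionLanguageActor`, `plan`, `act`) are illustrative assumptions, not Google DeepMind’s actual interfaces:

```python
# Minimal sketch of the dual-model pattern: the ER model plans, the VLA
# model acts. All names here are illustrative assumptions, not the real API.
from dataclasses import dataclass


@dataclass
class Step:
    description: str  # natural-language instruction for one sub-task


class EmbodiedReasoner:
    """Stands in for the ER model: reasons about the scene and plans."""

    def plan(self, goal: str, scene_summary: str) -> list[Step]:
        # In the real system this would be a call to the ER model;
        # a canned plan keeps the sketch runnable.
        return [Step(f"inspect scene for: {goal}"),
                Step(f"carry out: {goal}")]


class VisionLanguageActor:
    """Stands in for the VLA model: turns one instruction into motion."""

    def act(self, step: Step, camera_frame: bytes) -> bool:
        print(f"executing: {step.description}")
        return True  # report success so the orchestrator can continue


def run_task(goal: str, camera_frame: bytes) -> None:
    reasoner, actor = EmbodiedReasoner(), VisionLanguageActor()
    for step in reasoner.plan(goal, scene_summary="tabletop with objects"):
        if not actor.act(step, camera_frame):
            break  # on failure, a real system would replan with the ER model


run_task("sort the cans into the recycling bin", camera_frame=b"")
```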
These models allow robots to break down complex goals into manageable steps, formulate plans of action, and autonomously carry out each stage. A key feature is the ability to natively call digital tools, such as Google Search, to look up information and adapt their behavior to new situations. For instance, a robot could be asked to sort objects into compost, recycling, and trash bins based on local guidelines. It would then use the internet to find the rules, analyze the objects, and execute the sorting process.
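As a rough illustration of that sorting example, the sketch below stubs out the tool call: `lookup_guidelines` stands in for the native Google Search call, and the returned plan would drive the VLA model’s pick-and-place actions. The function names and rules table are hypothetical.

```python
# Hedged sketch of the tool-use flow in the sorting example. The
# lookup_guidelines helper and its rules are stand-ins; the real ER model
# calls Google Search natively rather than a local stub.
def lookup_guidelines(city: str) -> dict[str, str]:
    # Stand-in for a web search: maps object -> destination bin.
    return {"banana peel": "compost", "soda can": "recycling",
            "chip bag": "trash"}


def sort_objects(objects: list[str], city: str) -> dict[str, str]:
    rules = lookup_guidelines(city)            # step 1: fetch local rules
    return {obj: rules.get(obj, "trash")       # step 2: classify each object
            for obj in objects}                # step 3: each assignment
                                               # becomes a VLA pick-and-place


print(sort_objects(["banana peel", "soda can"], city="San Francisco"))
```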
Availability differs between the two models: Gemini Robotics 1.5 is currently limited to select partners, while Gemini Robotics-ER 1.5 is available to developers through the Gemini API in Google AI Studio. This staged rollout aims to foster innovation and broader adoption of the new systems.
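For developers, access to the ER model would look roughly like the following, using the google-genai Python SDK. The model ID string is an assumption based on preview naming conventions; check Google AI Studio for the current identifier.

```python
# Sketch of calling Gemini Robotics-ER 1.5 through the Gemini API with the
# google-genai Python SDK. The model ID below is an assumption.
from google import genai

client = genai.Client()  # reads the API key from the environment

response = client.models.generate_content(
    model="gemini-robotics-er-1.5-preview",  # assumed preview model ID
    contents="Given a tabletop image, list the steps to clear it safely.",
)
print(response.text)
```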
The introduction of Gemini Robotics 1.5 also aligns with Google’s broader push to make AI more trustworthy. The company recently launched the Data Commons Model Context Protocol (MCP) Server, which lets AI systems query verified public datasets in plain language. Prem Ramaswami, Head of Data Commons at Google, highlighted that MCP ‘is letting us use the intelligence of the large language model to pick the right data at the right time, without having to understand how we model the data, how our API works.’ The initiative helps AI models rely less on potentially ‘messy internet text’ and more on structured, reliable facts, addressing concerns about autonomous AI systems operating in high-stakes environments.
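A hedged sketch of what a plain-language Data Commons query could look like over MCP, using the Model Context Protocol Python SDK, appears below. The server launch command and the tool name and arguments are assumptions for illustration, not the documented Data Commons interface.

```python
# Hedged sketch of querying a Data Commons MCP server via the MCP Python
# SDK. The server command, tool name, and arguments are assumptions.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main() -> None:
    # Assumed launch command for the Data Commons MCP server.
    params = StdioServerParameters(command="datacommons-mcp", args=["serve"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()  # discover available tools
            print([t.name for t in tools.tools])
            # Hypothetical tool name and argument schema:
            result = await session.call_tool(
                "query", {"question": "What is the population of Kenya?"}
            )
            print(result)


asyncio.run(main())
```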


