Google DeepMind has unveiled Gemini Robotics 1.5 and Gemini Robotics-ER 1.5, two advancements that mark a leap toward reasoning robots able to operate in physical environments with greater autonomy and understanding. These models are designed to overcome one of the biggest challenges in robotics: performing complex multi-step tasks that demand combined spatial, linguistic, and motor reasoning.
Model collaboration for complex physical tasks
Gemini Robotics 1.5 is based on a vision-language-action (VLA) architecture that enables robots to translate visual inputs and natural-language commands into coordinated physical movements. In parallel, Gemini Robotics-ER 1.5 acts as a high-level embodied-reasoning model: it plans multi-step tasks, can call digital tools such as search, and hands each step to the VLA model as a natural-language instruction for execution.
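This split can be pictured as a simple planner-executor loop. The sketch below is illustrative only: it calls the ER model through the public Gemini API Python SDK, while the VLA executor is a hypothetical stub, since Gemini Robotics 1.5 itself is not publicly callable; the model ID shown is the preview name and may change.

```python
# Sketch of the two-model split: Gemini Robotics-ER 1.5 produces a
# natural-language plan, and a VLA executor turns each step into motor
# commands. The executor below is a hypothetical stand-in.
from google import genai

client = genai.Client()  # reads the API key from the environment

def plan_task(instruction: str) -> list[str]:
    """Ask the ER model to break a task into short, executable steps."""
    response = client.models.generate_content(
        model="gemini-robotics-er-1.5-preview",  # assumed model ID
        contents=f"Break this robot task into numbered steps: {instruction}",
    )
    # Keep non-empty lines; real code would request structured output.
    return [line for line in response.text.splitlines() if line.strip()]

def execute_step(step: str) -> None:
    """Hypothetical VLA call: visual input + step -> motor commands."""
    print(f"[robot] executing: {step}")

for step in plan_task("Sort the laundry on the table by color"):
    execute_step(step)
```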
Spatial understanding and transferable learning
The ER 1.5 model has demonstrated state-of-the-art performance on benchmarks such as ERQA, Point-Bench, and MindCube, highlighting its ability to interpret spatial and physical context and respond accurately.
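As an illustration of the spatial grounding that benchmarks like Point-Bench measure, the ER model can be asked to point at objects in a camera frame. The sketch below assumes the preview model ID and the [y, x] coordinate convention (normalized to 0-1000) used in DeepMind's published examples; verify both against the current documentation.

```python
# Minimal pointing query: ask the ER model to locate objects in an image.
import json
from google import genai
from google.genai import types

client = genai.Client()

with open("workbench.jpg", "rb") as f:  # any robot camera frame
    image = types.Part.from_bytes(data=f.read(), mime_type="image/jpeg")

response = client.models.generate_content(
    model="gemini-robotics-er-1.5-preview",  # assumed model ID
    contents=[
        image,
        "Point to every screwdriver. Answer as bare JSON: "
        '[{"point": [y, x], "label": "<name>"}] with coordinates '
        "normalized to 0-1000.",
    ],
)
# Assumes bare JSON back; production code should strip markdown fences.
for item in json.loads(response.text):
    print(item["label"], item["point"])
```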
In addition, Gemini Robotics 1.5 incorporates the ability to learn across embodiments, allowing skills acquired by one robot to be reused on other physical systems without specific retraining.
Real-life applications for reasoning robots
The new architecture allows robots to “think before acting” by generating internal reasoning in natural language, bringing transparency to the decision-making process.
From sorting clothes by color to separating waste according to local regulations, robots can reason, search for information, and perform physical actions tailored to each environment.
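The waste-sorting case maps naturally onto a "reason, search, act" loop. In the hedged example below, the ER model is given the Gemini API's Google Search tool so it can look up local rules before proposing a plan; the model ID, city, and prompt are assumptions for illustration, and a real robot would hand the resulting steps to the VLA model.

```python
# "Reason, search, then act": the ER model consults Google Search for
# local recycling rules before proposing the physical sorting actions.
from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-robotics-er-1.5-preview",  # assumed model ID
    contents=(
        "I am a robot in Munich with bins for paper, glass, and residual "
        "waste. Look up the local sorting rules, then list which bin each "
        "of these goes into: pizza box, wine bottle, yogurt cup."
    ),
    config=types.GenerateContentConfig(
        tools=[types.Tool(google_search=types.GoogleSearch())],
    ),
)
print(response.text)  # the plan a downstream VLA model would execute
```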
Gemini Robotics 1.5 takes robots to the next level. Source: Google DeepMind
Safe and responsible approach
DeepMind has built semantic and physical safeguards into these models. Gemini Robotics-ER 1.5 is evaluated against safety frameworks such as the ASIMOV benchmark to prevent critical errors in physical decision-making. The models also follow high-level safety policies and incorporate real-time mechanisms such as collision avoidance.
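One minimal illustration of a semantic safeguard, not DeepMind's actual implementation, is a gate that asks the reasoning model to vet each step against a plain-language policy before it reaches the motors:

```python
# Illustrative safety gate in the spirit of the ASIMOV-style semantic
# checks mentioned above; the model ID and prompt are assumptions.
from google import genai

client = genai.Client()

def step_is_safe(step: str) -> bool:
    verdict = client.models.generate_content(
        model="gemini-robotics-er-1.5-preview",  # assumed model ID
        contents=(
            "You review robot actions. Answer only SAFE or UNSAFE.\n"
            "Policy: never harm people, animals, or property.\n"
            f"Proposed action: {step}"
        ),
    )
    return verdict.text.strip().upper().startswith("SAFE")

for step in ["pick up the empty mug", "swing the knife toward the user"]:
    print(step, "->", "execute" if step_is_safe(step) else "blocked")
```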
Gemini Robotics-ER 1.5 is now available through the Gemini API in Google AI Studio. Gemini Robotics 1.5 is currently in initial deployment with strategic partners. This availability will allow developers to integrate intelligent physical agents into real-world products.
Source and photo: Google DeepMind