The development of robots (especially humanoids) has stimulated the imagination of technology enthusiasts for a long time, and many consider it one of the main directions in which artificial intelligence will develop. However, the robots we have dealt with so far have had many limitations: they were programmed to perform specific activities, or were controlled by humans. Robots equipped with AI are supposed to be different. The models just announced by Google (more precisely, by Google DeepMind, the part of Google that specializes in artificial intelligence) are, according to the company, meant to lay the foundations for a new generation of robots.
Robots with AI: They see, hear, and respond to changes in real time
According to Google DeepMind, artificial intelligence models for robots must meet three main criteria:
- interactivity, i.e. understanding and responding to instructions and to changes in the environment;
- generality, i.e. the ability to draw general conclusions and adapt to new situations;
- dexterity, i.e. the ability to perform activities that people usually carry out with their hands and fingers (for example, carefully manipulating objects).
“While our previous work has shown progress in these areas,
Gemini Robotics represents a significant leap forward in performance in all
three areas, bringing us closer to truly universal robots,” we read on the Google DeepMind blog.
What can Google’s AI-equipped robots do?
Google’s AI-equipped robots respond to verbal commands, perform tasks in changing circumstances (reacting to visual signals from the environment and demonstrating spatial awareness), and carry out tasks for which they have not been previously trained.
What does this look like in practice? In the videos presented by Google we see, for example, a robot that responds correctly to short commands (e.g. putting bananas into a container) and can still carry them out when an assistant keeps moving the container during the task. The robots’ manual skills (their dexterity), in turn, are shown in a video of folding an origami figure. The robots can also place objects on a conveyor belt, play tic-tac-toe, arrange tools, or play cards.
Artificial intelligence for robots from Google: Gemini
Robotics and Gemini Robotics-ER
The AI for robots created by Google DeepMind comes as two models: Gemini Robotics and Gemini Robotics-ER (ER stands for embodied reasoning).
Gemini Robotics is an advanced VLA (vision-language-action) model. It was built on Gemini 2.0, extended with a new modality: direct control of a robot in the physical world.
The second model, Gemini Robotics-ER, is distinguished by its advanced understanding of space, including 3D space. The model can perform, in real time, all the steps needed to control a robot: perception, state estimation, spatial understanding, planning, and code generation. “In this comprehensive setting, the model achieves a success rate two to three times greater than Gemini 2.0. And where code generation is not enough, Gemini Robotics-ER can even leverage the power of in-context learning, following the patterns of a few human demonstrations to find a solution,” the company writes.
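Google has not published an API for these models, but the pipeline described above (perception, state estimation, spatial understanding, planning, and code generation, with human demonstrations as an in-context fallback) can be illustrated with a rough sketch. Everything in the Python below is hypothetical: StubModel, StubRobot, control_loop, and plan_and_generate_code are illustrative stand-ins, not Google code.

# A minimal, hypothetical sketch of the control loop the article describes:
# perceive -> estimate state -> plan -> generate code -> act.
# Nothing here is a real Google API; the model and robot are stubs.

from dataclasses import dataclass

@dataclass
class Observation:
    image: bytes                # camera frame from the robot
    joint_angles: list[float]   # proprioceptive state

class StubModel:
    """Stands in for a vision-language model with a code/action output."""
    def plan_and_generate_code(self, instruction: str, obs: Observation,
                               demos: list[str]) -> str:
        # A real model would fuse the image and joint state, reason about
        # 3D space, and emit executable control code. When code generation
        # alone is not enough, fall back to imitating demonstrations
        # provided in context.
        if demos:
            return f"# imitate demonstration: {demos[0]}"
        return f"# move toward target for task: {instruction}"

class StubRobot:
    """Stands in for a robot platform such as a dual-arm manipulator."""
    def observe(self) -> Observation:
        return Observation(image=b"", joint_angles=[0.0] * 7)
    def execute(self, code: str) -> None:
        print("executing:", code)
    def task_done(self) -> bool:
        return True  # stub: report success immediately

def control_loop(model: StubModel, robot: StubRobot, instruction: str,
                 demos: list[str] | None = None, max_steps: int = 100) -> None:
    demos = demos or []
    for _ in range(max_steps):
        obs = robot.observe()                                  # perception
        code = model.plan_and_generate_code(instruction, obs, demos)
        robot.execute(code)                                    # act on generated code
        if robot.task_done():                                  # re-check after each step
            break

control_loop(StubModel(), StubRobot(), "put the bananas into the container")

The detail worth noting in the quoted description is the fallback order: generated control code comes first, and imitation of a few human demonstrations is used only when code generation is not enough.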
Google’s AI tailored to different types of robots
Gemini Robotics models are designed to be easily adaptable to different types of robots. “We trained the model primarily on data from the dual-arm ALOHA 2 robot platform, but we also showed that it can control a dual-arm platform based on the Franka arms used in many academic labs. Gemini Robotics could even be specialized for more complex embodiments, such as the humanoid Apollo robot developed by Apptronik, which aims to perform real-world tasks,” Google DeepMind reports.
Google DeepMind is developing its robotics models in partnership with Apptronik. “Our Gemini Robotics-ER model is also available to trusted testers, including Agile Robots, Agility Robotics, Boston Dynamics, and Enchanted Tools,” we read.