Project Astra: Google wants to make multimodal AI useful in everyday life

An AI that is “universal and useful every day”: these are the terms in which Google has just described Project Astra, a multimodal AI project meant to be genuinely useful day to day, for individuals and professionals alike. What sets it apart? Its ability to understand and respond to a complex, changing environment much as a human would, while handling natural-language exchanges with the user without glitches or excessive latency.

Based on Gemini, the system can memorize elements seen in the user's environment, both to accumulate context for their requests and to give more precise answers when questioned. Google describes Project Astra as “proactive, teachable and personal”. How? By relying on a smartphone (or augmented reality glasses) and the video stream from its camera.

To achieve a convincing result, Google says it has developed new Gemini-based agents that synthesize information very quickly. They continuously encode the frames of the video stream and combine them with the user's speech into a timeline of events, which they then draw on to make interactions as natural as possible. Project Astra can also take the user's intonation into account to respond more precisely, without losing the context of the conversation or request.
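
Google has not published implementation details, so the following is only a rough Python sketch of the general idea of merging encoded frames and speech into one time-ordered timeline. Every name here (`Event`, `encode_frame`, `build_timeline`, `recent_context`) is hypothetical, and `encode_frame` is a placeholder rather than a real image encoder.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Event:
    # Events are ordered by timestamp only; the other fields are ignored
    # when comparing.
    timestamp: float
    kind: str = field(compare=False)      # "frame" or "speech"
    payload: str = field(compare=False)


def encode_frame(description: str) -> str:
    """Hypothetical stand-in for a real image encoder."""
    return f"<frame:{description}>"


def build_timeline(frames, utterances):
    """Merge two time-ordered streams into a single list of events.

    Assumes `frames` and `utterances` are each already sorted by time and
    hold (timestamp, content) pairs.
    """
    frame_events = (Event(t, "frame", encode_frame(d)) for t, d in frames)
    speech_events = (Event(t, "speech", text) for t, text in utterances)
    return list(heapq.merge(frame_events, speech_events))


def recent_context(timeline, now, window):
    """Return the events that occurred in the `window` seconds before `now`."""
    return [e for e in timeline if now - window <= e.timestamp <= now]


frames = [(0.0, "desk"), (1.0, "glasses on desk")]
utterances = [(0.5, "what do you see?")]
timeline = build_timeline(frames, utterances)
for event in timeline:
    print(event.timestamp, event.kind, event.payload)
```

A single ordered timeline lets a question asked at a given moment be answered from what was seen just before it, which is the kind of recall the demonstration shows.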

A demonstration (which Google says was filmed in a single take and in real time) gives a sense of the work accomplished by the Mountain View giant's teams. The resulting AI can detect visible elements in the user's environment with no noticeable latency, answer questions, including complex ones, about a particular element of that environment, and even remember where the user left their glasses.

The most convincing moment of the demonstration? Probably the one where the user asks the AI how a server system, drawn as a simple sketch, could be improved. The AI then suggests adding a cache between the server and its database to speed the system up.
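
The demonstration does not specify what kind of cache is meant. As an illustration only, a minimal read-through cache placed in front of a database might look like the Python sketch below. `SlowDatabase`, `ReadThroughCache`, and the `ttl_seconds` parameter are all hypothetical names chosen for this example.

```python
import time


class SlowDatabase:
    """Hypothetical stand-in for the database in the sketch."""

    def __init__(self, rows):
        self._rows = dict(rows)
        self.reads = 0  # counts how often the database is actually queried

    def get(self, key):
        self.reads += 1
        return self._rows.get(key)


class ReadThroughCache:
    """Serve repeated reads from memory; fall back to the database on a miss."""

    def __init__(self, db, ttl_seconds=60.0, clock=time.monotonic):
        self._db = db
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (value, expiry time)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]                      # cache hit, still fresh
        value = self._db.get(key)                # miss or expired entry
        self._entries[key] = (value, now + self._ttl)
        return value


db = SlowDatabase({"user:1": "Ada"})
cache = ReadThroughCache(db)
cache.get("user:1")
cache.get("user:1")
print(db.reads)  # the second read was served from the cache
```

The server calls the cache instead of the database, so repeated reads of the same key reach the database only once per expiry period. The trade-off is that a cached value can be stale for up to `ttl_seconds`.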
