Google I/O 2024: DeepMind Showcases Real-Time Computer Vision-Based AI Interaction With Project Astra


Google I/O 2024’s keynote session allowed the company to showcase the impressive lineup of artificial intelligence (AI) models and tools that it has been working on for a while. Most of the introduced features will make their way to public previews in the coming months. However, the most interesting technology previewed at the event will not be here for a while. Developed by Google DeepMind, the new AI assistant is called Project Astra, and it showcased real-time, computer vision-based AI interaction.

Project Astra is an AI model that can perform tasks that are extremely advanced for existing chatbots. Google follows a system where it uses its largest and most powerful AI models to train its production-ready models. Highlighting one such AI model that is currently in training, Google DeepMind co-founder and CEO Demis Hassabis showcased Project Astra. Introducing it, he said, “Today, we have some exciting new progress to share about the future of AI assistants that we are calling Project Astra. For a long time, we wanted to build a universal AI agent that can be truly helpful in everyday life.”

Hassabis also listed a set of requirements the company had set for such AI agents. They need to understand and respond to the complex and dynamic real-world environment, and they need to remember what they see to develop context and take action. Further, they also need to be teachable and personal so they can learn new skills and have conversations without delays.

With that description, the DeepMind CEO showcased a demo video in which a user could be seen holding up a smartphone with its camera app open. The user speaks with an AI, and the AI instantly responds, answering various vision-based queries. The AI was also able to use the visual information for context and answer related questions that required generative capabilities. For instance, the user showed the AI some crayons and asked it to describe them with alliteration. Without any lag, the chatbot said, “Creative crayons colour cheerfully. They certainly craft colourful creations.”

But that was not all. Further in the video, the user points towards the window, from which some buildings and roads can be seen. When asked about the neighbourhood, the AI promptly gives the correct answer. This shows the capability of the AI model’s computer vision processing and the massive visual dataset it would have taken to train it. But perhaps the most interesting demonstration was when the AI was asked about the user’s glasses. They had appeared on the screen for only a few seconds and had already left the frame. Yet, the AI could remember their position and guide the user to them.

Project Astra is not available in either public or private preview. Google is still working on the model, and it has to figure out the use cases for the AI feature and decide how to make it available to users. This demonstration would have been the most remarkable feat by AI so far, but OpenAI’s Spring Update event a day earlier took away some of its thunder. During its event, OpenAI unveiled GPT-4o, which showcased similar capabilities and emotive voices that made the AI sound more human.
