Google Teases Computer Vision, Conversational Capabilities of Gemini AI Ahead of Google I/O Event


Google shared a video on its social media platforms on Monday, teasing new capabilities of its artificial intelligence (AI)-powered chatbot Gemini. The video was released just a day before the company’s annual developer-focused Google I/O event. It is believed that the tech giant could make several announcements around AI and unveil new features and possibly new AI models. Besides AI, Android 15 and Wear OS 5 are also likely to take centre stage, as both could be unveiled during the event.

In a short video posted on X (formerly known as Twitter), Google's official account teased new capabilities of its in-house AI chatbot. The 50-second video highlighted marked improvements in Gemini's speech, giving it a more emotive voice and modulations that make it sound more human-like. The video also showcased new computer vision capabilities: the AI could pick up on visuals on the screen and analyse them.

Gemini could also access the smartphone's camera, a capability it does not possess at present. The user moved the camera across the space and asked the AI to describe what it saw. Almost without any time lag, the chatbot described the setting as a stage, and when prompted, it even recognised the Google I/O logo and shared information about it.

The video shared no further details about the AI, and instead asked people to watch the event to find out more. Some questions might be answered during the event, such as whether Google is using a new large language model (LLM) for computer vision or whether it is an upgraded version of Gemini 1.5 Pro. Further, Google may also reveal what else the AI can do with its computer vision capabilities. Notably, there are rumours that the tech giant might introduce Gems, said to be chatbot agents that can be designed for particular tasks, similar to OpenAI’s GPTs.

While Google’s event is expected to introduce new features to Gemini, OpenAI held its Spring Update event on Monday and unveiled its latest GPT-4o AI model, which adds features to ChatGPT similar to those shown in Google's video. The new model gives ChatGPT conversational speech, computer vision, real-time language translation, and more.
