Google enters its Gemini era

Author: Editorial
Date: 15.05.2024.
Photo: Shutterstock

Android, Wear OS, and Pixel may be household names, but it was Gemini, Google’s new AI technology, that grabbed all the attention at this year’s conference.

The company’s annual software celebration sets the stage for everything Google has planned for the coming year, with CEO Sundar Pichai unequivocally declaring that Google is entering the “Gemini Era.” From AI searches in your Google Photos to virtual AI assistants that will work together with you, Google is integrating Gemini into absolutely everything, with significant consequences. We bring you an overview of everything Google has announced this year.

The conference was opened by Marc Rebillet, an American electronic musician and YouTuber, who used Google’s new generative models to create a new song live on stage.

Gemini – Your new AI assistant

Google’s AI model was the focus of the I/O event. Gemini 1.5 Pro will be available in over 35 languages to all developers and power users starting today. One of the main talking points was that Gemini 1.5 Pro now has a context window of 1 million tokens, so it can process, for example, five scientific papers at once, and it accepts several input types in a single query. This means we will be able to attach a PDF file, a text query, a video and an image all at the same time, and the AI will give us a complete answer that draws on that wide context.

Gemini and Google Photos

Users upload more than six billion photos to Google Photos every day, so it’s no surprise that they could use some help browsing through them. Gemini will come to Google Photos this summer, adding new search capabilities via the Ask Photos feature.

For example, ask it “What’s my license plate?” and it will search your photos to find the most likely answer, saving you from having to go through them manually. This feature will greatly speed up photo searches, and it will only become smarter and more useful over time.

AI in Android

Naturally, Google will also integrate Gemini into its mobile operating system. Android will be the first mobile OS to include such an advanced AI model, aiming to become the primary platform for all AI enthusiasts. Circle to Search was the first part of this integration, but this year Google will also add Gemini as a standard AI assistant on Android – and add more AI functions in the background.

Think of Gemini on Android as Google Assistant on steroids. It’ll be able to contextually understand the content on your screen, including summarizing YouTube videos, generating images you can use in replies, and answering any questions you have, all without leaving the screen you’re on.

Accessibility is another key area where AI helps. The TalkBack feature has been around for a while, but now, thanks to Gemini, images can be described in detail, making the phone easier to use for visually impaired people. And since Gemini runs on the device, it is fast, efficient and secure because the data is processed locally and not in the cloud.

Gemini will help in the fight against unwanted and fraudulent (scam) calls. Gemini will listen to your calls and alert you when it detects suspicious activity – and because it’s all on-device, the information won’t leave your phone. This feature is still under development and won’t be available for the time being.

NotebookLM

Google’s software that helps educators and parents educate children will also get Gemini integration, taking its AI capabilities to an even higher level. During the presentation, Google showed how it adapts a physics lesson using basketball as an example. This personalization of learning is likely to become more common in the future.

Gemini Agents

Gemini isn’t just about asking questions and summarizing data: Google wants it to actually do the work for you. Although it can’t vacuum or take out the trash, Agents is a new kind of AI assistant to which you can assign tasks. Google demonstrated this by taking a photo of a pair of shoes and telling the agent to return them. Using AI to identify the shoes, it searched Gmail for the receipt and offered to initiate a refund via email. Agents can also be used to plan vacations, business trips, and other tasks related to organization and information.

Project Astra

Another experimental Google project is Astra, which connects Gemini with cameras and allows it to understand and interpret the world around it. In the demo, Astra was able to identify a speaker, point out which part of the speaker produces the sound, and read program code and explain what individual code snippets do. Astra could also be used in conjunction with smart glasses, allowing you to ask questions about the things you see without having to pull out your phone.

This is one of the projects for which no exact deadline has been defined, so we do not know when it will be available on our market.

Generative AI for multimedia

Generative artificial intelligence is the best-known branch of AI, and Google is not neglecting it. Its latest AI model for creating images is called Imagen 3, and Google claims it’s the best model it has ever built for creating images from text, as well as for understanding queries.

In addition to images, Google is working intensively on an AI model for music generation, as well as on Veo, an AI model that can create very impressive HD videos. Text prompts can also be used to edit existing videos, so you don’t have to create videos from scratch every time, and the video examples shown definitely look better than most AI-generated videos.

If you’re worried that the generated images, sounds, and videos could be used for malicious purposes, Google has an answer: it has added SynthID to iterations of Gemini. SynthID is an invisible watermark indicating that content was created using artificial intelligence, specifically in the ImageFX and VideoFX tools.

Gemini and Google Workspace

Gemini has been available in Google’s enterprise software for a while now, but Google is ready to take it to the next level. The final version of the language model integrated into the Workspace side panel will be available as early as next month. Gemini is also coming to Google Meet in multiple languages.

As expected, Gemini will also expand to Gmail. Ask it to summarize information from your child’s school and it will do so, or it will simply condense long emails so you don’t have to read them in full. Type a question or query and Gemini will be able to answer or perform an action. For example, it will be able to collect different bids for construction work, compile them into a list, and turn that list into a table giving an overview of all bids and costs. It will also be able to create a Google Sheet with all the receipts from your email.

Smart Replies are also getting an upgrade with Contextual Smart Replies. These capabilities will be available to Workspace Labs users this summer.

You could soon be working alongside artificial intelligence. Google introduced AI Teammate, tasked with tracking resources for a team within a company. This “coworker” can answer questions in Google Workspace chats, remember when decisions were made, and track the progress of specific projects. It can also give us answers that we would previously have had to scroll through e-mails and group conversations to find.

Gemini app

The Gemini model will become part of the average user’s everyday life through the Google Gemini app, which will now reply almost instantly after a question is asked. Through this app, we can also create a Gem, a personalized version of the assistant tailored to each user’s individual needs.

So if you repeatedly use Gemini in certain ways, you can create a Gem to save time when you need it again. For example, you could customize a Gem to tell you stories in the style you prefer, instead of constantly typing the same queries to a generic AI chatbot.

The Gemini app can do many of the things you’d expect from Gemini, including planning trips and creating itineraries. This feature will be available this summer.

Google Search

Generative artificial intelligence will also appear in Google Search. AI Overviews will summarize the results at the top of your search, instead of sending you to various websites. Multi-step reasoning will break down your queries, using Google’s indexes to provide you with the most relevant information. It can even help you plan your trip.

