AI Mode now supports multimodal input and understanding: through Google Lens, users can take a new photo or upload an existing one and ask complex, context-aware questions about what they see.
Behind the scenes, Google leverages Gemini's multimodal capabilities to interpret the entire scene, analyzing the context, the relationships between objects, and their materials, colors, shapes, and arrangement within the image.
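Google hasn't published how AI Mode does this internally, but the general pattern is available to developers through the public Gemini API. Below is a minimal sketch using the `google-generativeai` Python SDK; the model name, image path, and prompt are illustrative placeholders, not the queries AI Mode actually runs.

```python
# A minimal sketch of multimodal prompting with the public Gemini SDK
# (google-generativeai). Model name and image path are placeholders.
import os

import google.generativeai as genai
from PIL import Image

genai.configure(api_key=os.environ["GEMINI_API_KEY"])

# Any multimodal Gemini model works here; "gemini-1.5-flash" is one example.
model = genai.GenerativeModel("gemini-1.5-flash")

image = Image.open("bookshelf.jpg")  # placeholder image
response = model.generate_content([
    image,
    "Describe this scene: which objects are present, how are they "
    "arranged, and what are their colors and materials?",
])
print(response.text)
```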
Google Lens identifies the objects in the photo. Using a “query fan-out” technique, AI Mode then issues multiple queries about the image as a whole and about the specific items within it, surfacing richer, more in-depth information than a traditional Google search. The result is a detailed, context-aware answer that helps users take their next step with confidence.
For example, if you photograph a bookshelf, AI Mode can identify each book, run related queries about them, and suggest highly rated alternatives. The final output might be a list of recommended titles with links to learn more or buy, and you can keep the conversation going with follow-up questions.
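Google has not detailed query fan-out beyond describing it as running multiple related searches, but the idea can be illustrated with a toy sketch. Everything here, from the `run_search` stub to the query templates, is a hypothetical stand-in, not Google's internal API.

```python
# A toy illustration of the "query fan-out" idea: one visual query is
# expanded into scene-level and per-object searches, and the results
# are collected for the model to synthesize into a single answer.
from dataclasses import dataclass


@dataclass
class SearchResult:
    query: str
    snippet: str


def run_search(query: str) -> SearchResult:
    """Stand-in for a real search backend."""
    return SearchResult(query=query, snippet=f"<top result for '{query}'>")


def fan_out(scene: str, objects: list[str]) -> list[SearchResult]:
    # One question about the image as a whole...
    queries = [f"best {scene} recommendations"]
    # ...plus follow-up questions about each detected object.
    for obj in objects:
        queries.append(f"{obj} reviews")
        queries.append(f"books similar to {obj}")
    return [run_search(q) for q in queries]


# Example: books Lens might detect on a photographed shelf (made up).
detected = ["The Pragmatic Programmer", "Clean Code"]
for result in fan_out("bookshelf", detected):
    print(result.query, "->", result.snippet)
```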
Google Lens within AI Mode is now available on both Android and iOS. To try it, open the AI Mode home page and tap the new Lens icon below the search field, which brings up the familiar Google Lens interface. You can also hold the shutter button to ask your question aloud.
After a month of public testing, Google has also shared some early feedback. Users have praised the clean design, fast responses, and the ability to handle complex, nuanced queries. On average, AI Mode queries are reportedly twice as long as traditional Google searches, and people are using the feature for more sophisticated tasks, such as comparing products, working through how-to questions, and even planning trips.