Google’s TalkBack now lets users ask questions about on-screen images using Gemini AI

On the occasion of Global Accessibility Awareness Day, Google has announced a suite of new artificial intelligence (AI) and accessibility tools aimed at improving digital experiences for people with vision and hearing impairments. The updates will be introduced across Android devices and the Chrome browser, enhancing inclusivity through smart, user-friendly features.

In a blog post published on Thursday, the California-based tech company revealed that it is enhancing Android’s screen reader, TalkBack, with expanded Gemini AI capabilities. Originally launched last year to generate descriptive captions for images lacking alt text, the updated feature now allows users not only to hear descriptions but also to ask questions about the images or content on their screen. This interactive function is designed to provide deeper visual context for those with vision impairments.
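TalkBack reads whatever description an app attaches to an on-screen image; Gemini's captions come into play when that description is missing. As a rough illustration of the metadata side only (not Google's Gemini pipeline), the hypothetical snippet below shows how an Android app would label an image so TalkBack can announce it:

```kotlin
// Hypothetical Android snippet: giving TalkBack a description to read.
// When no such description exists, the updated TalkBack can fall back to
// Gemini-generated captions (that server-side step is not shown here).
import android.widget.ImageView

fun labelProductPhoto(imageView: ImageView) {
    // TalkBack announces this text when the user focuses the image.
    imageView.contentDescription = "Red canvas sneakers with white laces"
}
```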

Google also announced a wider release of its Expressive Captions feature. Debuted in the US in late 2024, this tool is part of the Live Captions system and aims to convey tone, emotion, and ambient sounds more effectively through AI-generated subtitles. By reflecting vocal emphasis—such as stretching the word “no” to “noooooo” to denote despair—the feature adds a layer of expressiveness to otherwise plain text. Expressive Captions are now being rolled out in English to users in Australia, Canada, the UK, and the US on devices running Android 15 and later.

Chrome users are also set to benefit from new accessibility enhancements. A major update enables the desktop browser to support screen readers for scanned PDF files, addressing a longstanding limitation. The improvement is powered by optical character recognition (OCR), which lets Chrome identify the text within scanned documents so it can be read aloud, highlighted, copied, and searched.
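Chrome's implementation is internal to the browser, but the general OCR step described here, turning the pixels of a scanned page into selectable text, can be sketched with an off-the-shelf recogniser. The example below uses Google's ML Kit on-device text-recognition API purely as an illustration; the class names and flow are ML Kit's, not Chrome's.

```kotlin
// Illustrative OCR sketch using ML Kit's on-device text recogniser
// (not Chrome's internal implementation).
import android.graphics.Bitmap
import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.text.TextRecognition
import com.google.mlkit.vision.text.latin.TextRecognizerOptions

fun extractTextFromScan(scannedPage: Bitmap) {
    val image = InputImage.fromBitmap(scannedPage, 0)
    val recognizer = TextRecognition.getClient(TextRecognizerOptions.DEFAULT_OPTIONS)

    recognizer.process(image)
        .addOnSuccessListener { result ->
            // Each block corresponds to a paragraph-like region of the scan;
            // once text exists, it can be read aloud, copied, or searched.
            result.textBlocks.forEach { block -> println(block.text) }
        }
        .addOnFailureListener { error -> println("OCR failed: $error") }
}
```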

Additionally, Chrome on Android is introducing Page Zoom—a tool that enlarges text on web pages without distorting the overall layout. This feature is tailored to users with low vision, offering an improved reading experience without the inconvenience of constantly panning across the screen.
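Because only the text is scaled, the page keeps its structure and the reader does not have to pan horizontally. Android's WebView exposes a similar text-only zoom control, shown below as an analogy; this is a standard WebView setting, not how Chrome itself implements Page Zoom.

```kotlin
// Analogous text-only zoom on an Android WebView: enlarges text without
// scaling the whole layout, so no horizontal panning is needed.
import android.webkit.WebView

fun applyLargeText(webView: WebView) {
    // 100 is the default; 150 renders text at 150% of its normal size.
    webView.settings.textZoom = 150
}
```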

Google has also emphasised its commitment to supporting developers creating speech recognition tools by releasing new resources to aid innovation in the field.

