What’s All the Fuss About…Seeing
A new app that says what it sees
What is it?
A new iPhone app for blind and visually impaired people that describes the world around them. Built by Microsoft, it uses artificial intelligence (AI) to narrate what the phone’s camera is seeing. The company’s ambition is to “turn the visual world into an audible experience”.
Does it recognise people?
Yes, if you save them to a list of contacts. For strangers, it tries to identify their gender, age and even their emotion. In the main screenshot (right), for example, Seeing AI recognises the subject as a “28 year old female wearing glasses looking happy”. Spot on.
What else can it recognise?
Objects, short text (such as the address on an envelope or a street sign), long documents, currency and product barcodes (to identify what the product is). It has several clever tricks that make it easier to use, including audio cues when you’re looking for a barcode on a product: the app beeps more quickly as the barcode comes into view. Also, when you scan a document, you’re told where to position it so no words are cut off, something OCR (optical character recognition) software lacks.
But perhaps the most useful tool is the “experimental” scene description, which tells you what’s happening in a photo you’ve taken.
The all-Seeing AI knows exactly what this is
How accurate is that?
Fairly accurate, at least according to an online demonstration by Saqib Shaikh, a software developer for Microsoft who lost his sight aged seven. To watch his video visit www.snipca.com/24947, then scroll down and click ‘Scene Demo’. Saying that the app feels like “science fiction”, Shaikh shows us how the app can recognise “a man sitting on a couch using a laptop” and “a bus that is parked on the side of a road”. More impressively, another video on the site shows the app recognising “a young girl throwing a Frisbee in the park” (see screenshot left). Spot on again.
Is it free?
Yes, unlike similar apps such as Aipoly (http://aipoly.com), which costs £4.99 a month. But it’s not yet available in the UK, only in the US, Canada, Hong Kong, India, New Zealand and Singapore.
Will there be an Android version?
Microsoft won’t say, but it would be crazy to ignore hundreds of millions of Android users, particularly as its main tech rivals – Google, Apple and Facebook – are also investing heavily in AI. All three are working on tools that help the blind and visually impaired.
Seeing AI describes the world for the blind – and it feels like science fiction
What are they doing?
Google’s ‘Show and Tell’ technology (www.snipca.com/24958), available for app developers, generates captions for photos. Apple’s screen reader VoiceOver (www.snipca.com/24950) tells you what’s on your iPad and iPhone when you tap the screen in a particular way (actions called ‘gestures’). Facebook is using ‘automatic alternative text’ (www.snipca.com/24955) to describe the contents of photos posted to the site. These could improve the lives of millions of people, but Microsoft’s ambition for AI is bigger.
It wants to secure the future of the planet, no less, having also launched ‘AI for Earth’ (www.microsoft.com/en-us/aiforearth), a project “empowering people and organizations to solve global environmental challenges”. But its aims are not entirely selfless. The company that dominates AI in future will make billions. Recognising frisbees is just the start.