Microsoft's Seeing AI embraces LiDAR and Spatial Audio


Microsoft's 'Seeing AI' app for iOS allows blind or partially sighted people to 'see' information around them, using a phone's camera and plenty of server-side and on-device Artificial Intelligence. It's been available for a while, but has now been significantly improved by using the LiDAR scanners on the new iPhone 12 Pro range to measure the distance to objects in a room and feed this back through audio descriptions presented in stereo 'Spatial Audio'. It's good to see Seeing AI's steady evolution, and there will be much more in 2021.

LiDAR support on the iPhone 12 Pro and 12 Pro Max unlocks extra capabilities, as shown in the full changelog from the iOS App Store:

  • The new World channel, available on devices with a LiDAR scanner running iOS 14, enables you to explore an unfamiliar space in 3D, using spatial audio. When wearing headphones, you will hear objects around you announced from their location in the room. You can also find a particular object by placing an audio beacon on it. We are keen to hear your feedback on this early experiment, and invite you to work with us as we explore this new area together with the community.
  • On iPhone 12 Pro and Pro Max, the haptic proximity sensor enables you to point the LiDAR scanner and feel the distance to things around you
  • The main screen has been visually redesigned to improve contrast and widen the camera's field of view
  • Improvements to image descriptions on the Scene channel, and when browsing photos on your phone
  • Improved text recognition accuracy on the Document channel
  • Seeing AI is now available in seven additional languages: Czech, Danish, Finnish, Greek, Hungarian, Polish, and Swedish
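To give a feel for what the World channel and the haptic proximity sensor are doing, here is a minimal Python sketch of the two basic ideas: turning an object's direction into left/right headphone volume, and turning a LiDAR distance into a vibration strength. Nothing here comes from Microsoft. The function names, the constant-power panning and the 5 m maximum range are all illustrative assumptions, and Seeing AI's real implementation is not public.

```python
import math

def stereo_gains(azimuth_deg):
    """Hypothetical helper: constant-power stereo pan.

    azimuth_deg is the object's direction relative to the camera:
    -90 = hard left, 0 = straight ahead, +90 = hard right.
    Returns (left_gain, right_gain), each between 0 and 1.
    """
    clamped = max(-90.0, min(90.0, azimuth_deg))
    angle = (clamped + 90.0) / 180.0 * (math.pi / 2)  # maps to 0..pi/2
    return math.cos(angle), math.sin(angle)

def haptic_intensity(distance_m, max_range_m=5.0):
    """Hypothetical helper: closer objects give a stronger buzz.

    Returns 1.0 at contact, falling linearly to 0.0 at max_range_m
    (an assumed useful LiDAR range) and beyond.
    """
    if distance_m <= 0:
        return 1.0
    return max(0.0, 1.0 - distance_m / max_range_m)
```

Calling `stereo_gains(-45)` would favour the left ear for a book off to your left, while `haptic_intensity(1.0)` gives a stronger value than `haptic_intensity(4.0)`. A linear mapping like this is the simplest choice; a real app may well use a different curve to make small distance changes easier to feel.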

A possible life saver for the blind, Seeing AI is also fun to experiment with if you're sighted, to see what's possible with real time object and text recognition. It's fair to say that the LiDAR and Spatial Audio features are in their early days, but they do work. Walk into an office, for example, and there's a chorus of "Book!" "Book!" "Book!" from 'all around' you (on headphones)!

Having stereo awareness is a nice touch, though not exactly essential, as a blind person already knows the orientation of their own phone and camera. The LiDAR haptics integration also works, but you have to be very tuned in to the vibration variations in order to use it properly - in my tests, the phone just felt like it was 'buzzing' and I couldn't tell the differences. But then I'm not blind! I'm sure these facilities will carry on improving and I'll keep the app installed for testing in 2021.

Anyone else used or played with this?

PS. Seeing AI also works on older iPhones with simpler camera set-ups, though with slightly fewer features. The text recognition is stunning, mind you. Reading signs and menus is astonishingly fast and good.