View The Document

Delivering Sensory and Semantic Visual Information via Auditory Feedback on Mobile Technology


This project seeks to create and assess new visual-assistive smartphone apps for fully blind end users. These apps convey information gathered by sensors (color, distance, heat) and artificial intelligence (object recognition) through both spoken verbal feedback and 'musical' audio that intuitively conveys the locations, visual properties, and identities of objects in the environment. Our research purpose is to produce new apps tailored to blind end users that increase the accessibility of visual information, enhance daily functionality, and enable new interactions of interest to them. In terms of scope, this 2-year project focuses on the development of novel technologies in the first year, followed by at-home beta-testing with fully blind subjects in the second year. Technology development centers on new iPhone sensors (LiDAR range-finding, plug-in thermal cameras) and their support for state-of-the-art object-recognition techniques that run in real time, locally on the iPhone. During year 1, we found that DeepLabV3, which accurately segments object shapes from live visual images, provides new interaction possibilities: the location, shape, size, and identity of recognized objects in a scene can be rapidly presented to users through musical feedback. This represents scenes at the semantic level (objects), going beyond prior technologies that operate at the 'sensory level' (e.g., brightness, distance, heat) to provide a more intuitive understanding of the environment that remains stable across variable conditions. Furthermore, since object identity is known, we provide optional verbal feedback that tells users each object's name and describes its location in the image, providing live user support and training within the app. Building on this, we are preparing for user testing in year 2, in which blind end users will beta-test our apps that convey information at various levels (e.g., sensory, semantic).
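The object-to-music mapping described above can be illustrated with a minimal sketch. The report does not specify its actual sonification scheme, so every choice below is a hypothetical assumption for illustration only: the function name, and the mapping of horizontal centroid to stereo pan, vertical centroid to pitch, and mask area to loudness.

```python
# Hypothetical sketch of mapping one segmented object's mask to audio
# parameters. The real app's sonification rules are not published; the
# pan/pitch/gain mappings here are illustrative assumptions.

def mask_to_audio_params(mask):
    """Map a boolean segmentation mask (list of rows) to audio parameters."""
    h, w = len(mask), len(mask[0])
    pixels = [(y, x) for y in range(h) for x in range(w) if mask[y][x]]
    if not pixels:
        # No object detected: silence.
        return {"pan": 0.0, "pitch_hz": 0.0, "gain": 0.0}
    cy = sum(y for y, _ in pixels) / len(pixels)  # vertical centroid
    cx = sum(x for _, x in pixels) / len(pixels)  # horizontal centroid
    pan = (cx / (w - 1)) * 2.0 - 1.0              # -1 = far left, +1 = far right
    pitch_hz = 880.0 - (cy / (h - 1)) * 660.0     # top of image = 880 Hz, bottom = 220 Hz
    gain = len(pixels) / (h * w)                  # larger object = louder tone
    return {"pan": pan, "pitch_hz": pitch_hz, "gain": gain}

# Example: an object filling the upper-left quadrant of a 100 x 100 image
# pans left, sounds high-pitched, and plays at a quarter of full loudness.
mask = [[y < 50 and x < 50 for x in range(100)] for y in range(100)]
print(mask_to_audio_params(mask))
```

In a real pipeline, a per-class mask from a segmentation model such as DeepLabV3 would feed a mapping of this kind for each recognized object, with the class label driving the optional spoken feedback.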



Distribution Statement:

Approved For Public Release
