Exploration of Behavioral, Physiological, and Computational Approaches to Auditory Scene Analysis
OHIO STATE UNIV COLUMBUS DEPT OF COMPUTER AND INFORMATION SCIENCE
We present an overview of the study of auditory perception and scene analysis through the three main approaches researchers have used to study perception in general: behavioral, physiological, and computational. At the behavioral level, we discuss the principles and origins of auditory scene analysis and establish its relationship to auditory masking. Within auditory masking, we note the coexistence of informational and energetic masking, and use ideal time-frequency binary masks in a series of speech intelligibility experiments to isolate the energetic component of speech-on-speech masking. At the physiological level, we propose adopting the two-dimensional time-frequency oscillatory correlation representation as a main representation in auditory perception, after reviewing several theories and experiments in neurophysiology in an effort to find support for it. Finally, at the computational level, we extend an existing implementation of oscillatory correlation, LEGION [144], to simulate the major behavioral principles in alternating-tone sequences. Most notably, the decision boundaries of the temporal coherence boundary (TCB) and fission boundary (FB), first observed by Van Noorden [135], are automatically generated by the model. The results are compared to several existing implementations designed to simulate alternating-tone sequences [11, 104, 139]. Throughout this thesis, we use the three levels of analysis proposed by Marr in vision [89]. We emphasize the importance of balance at each level of analysis, and the relationship of these levels to the three approaches in the study of auditory perception.
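The abstract mentions ideal time-frequency binary masks without defining them. As a minimal sketch under a commonly used definition (not necessarily the exact procedure of this thesis), a mask unit is set to 1 when the target energy in that time-frequency unit exceeds the masker energy by more than a local threshold in dB, and to 0 otherwise. The function name `ideal_binary_mask`, the parameter `lc_db`, and the toy energy values are hypothetical and chosen only for illustration.

```python
import numpy as np


def ideal_binary_mask(target_tf, masker_tf, lc_db=0.0):
    """Return a 0/1 mask over time-frequency units (hypothetical helper).

    target_tf, masker_tf: arrays of nonnegative energies with the same
    shape (frequency channels x time frames), computed from the target
    and masker signals before mixing.
    lc_db: assumed local SNR threshold in dB; 0 dB keeps the units where
    the target is stronger than the masker.
    """
    target_tf = np.asarray(target_tf, dtype=float)
    masker_tf = np.asarray(masker_tf, dtype=float)
    if target_tf.shape != masker_tf.shape:
        raise ValueError("target and masker must have the same shape")
    # A tiny offset avoids division by zero and log of zero in silent units.
    eps = np.finfo(float).tiny
    local_snr_db = 10.0 * np.log10((target_tf + eps) / (masker_tf + eps))
    return (local_snr_db > lc_db).astype(int)


# Toy energies: 2 frequency channels x 2 time frames.
target = np.array([[4.0, 1.0], [0.5, 9.0]])
masker = np.array([[1.0, 1.0], [2.0, 3.0]])
mask = ideal_binary_mask(target, masker)

# Keep only the mixture units where the target dominates.
masked_mixture = mask * (target + masker)
```

Under this definition, units dominated by the masker are discarded from the mixture, so only the target-dominated units reach the listener.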