An Approach to Co-Channel Talker Interference Suppression Using a Sinusoidal Model for Speech
MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB
This report describes a new approach to co-channel talker interference suppression based on a sinusoidal representation of speech. The approach has been applied effectively in situations where both the desired and interfering speech waveforms are vocalic. The technique fits a sinusoidal model to summed vocalic speech segments so as to minimize the mean squared error between the model and the combined waveform. Enhancement is achieved by synthesizing a waveform from the sine waves attributed to the desired speaker. Least squares estimation is applied to obtain the sine wave amplitudes and phases of both talkers, based on either a priori sine wave frequencies or a priori fundamental frequency contours. When the frequencies of the two waveforms are closely spaced, the least squares approach can have difficulty tracking the sine wave parameters. In these cases, performance is significantly improved by an interpolation technique that predicts the time evolution of the sinusoidal parameters across multiple analysis frames. The approach yielded good suppression of the interfering speech and enhancement of the target speech over a wide range of target-to-interferer ratios, from 9 to -16 dB. The least squared error approach is also extended to estimate the fundamental frequency contours of both speakers from the summed waveform, and is applied further to estimate the remaining sinusoidal parameters.
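The least squares step summarized in the abstract can be illustrated with a minimal sketch. It is not the report's implementation: it assumes the sine wave frequencies of both talkers are known a priori, handles a single noiseless analysis frame, and omits the multi-frame interpolation and the fundamental frequency estimation. The function name `fit_two_talker_sines`, the 8 kHz sampling rate, the 25 ms frame, and the 120 Hz and 190 Hz fundamentals are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_two_talker_sines(frame, freqs_a, freqs_b, fs):
    """Jointly fit sine waves at known frequencies for two talkers (hypothetical helper).

    Each component is modeled as A*cos(2*pi*f*t + phi), which is linear in the
    coefficients c = A*cos(phi) and s = -A*sin(phi), so ordinary least squares applies.
    """
    t = np.arange(len(frame)) / fs
    freqs = np.concatenate([freqs_a, freqs_b])
    arg = 2.0 * np.pi * np.outer(t, freqs)
    basis = np.hstack([np.cos(arg), np.sin(arg)])      # columns: all cosines, then all sines
    coef, *_ = np.linalg.lstsq(basis, frame, rcond=None)
    k, ka = len(freqs), len(freqs_a)
    c, s = coef[:k], coef[k:]
    amps = np.hypot(c, s)                              # amplitudes of both talkers
    phases = np.arctan2(-s, c)                         # phases of both talkers
    # Resynthesize only the sine waves attributed to talker A (the desired speaker).
    est_a = basis[:, :ka] @ c[:ka] + basis[:, k:k + ka] @ s[:ka]
    return amps, phases, est_a

# Synthetic demonstration; every value below is an assumption, not data from the report.
fs = 8000.0
t = np.arange(200) / fs                                # one 25 ms frame
rng = np.random.default_rng(0)
freqs_a = 120.0 * np.arange(1, 6)                      # target harmonics
freqs_b = 190.0 * np.arange(1, 6)                      # interferer harmonics

def harmonic_sum(freqs, amps, phases):
    """Sum of cosines with the given frequencies, amplitudes, and phases."""
    return (amps * np.cos(2.0 * np.pi * np.outer(t, freqs) + phases)).sum(axis=1)

amps_a, phases_a = rng.uniform(0.5, 1.0, 5), rng.uniform(-np.pi, np.pi, 5)
amps_b, phases_b = rng.uniform(0.5, 1.0, 5), rng.uniform(-np.pi, np.pi, 5)
target = harmonic_sum(freqs_a, amps_a, phases_a)
# Scale the interferer's component amplitudes up by 16 dB relative to the target's.
interferer = 10.0 ** (16.0 / 20.0) * harmonic_sum(freqs_b, amps_b, phases_b)
mixture = target + interferer

amps, phases, est_target = fit_two_talker_sines(mixture, freqs_a, freqs_b, fs)
print("max reconstruction error:", np.max(np.abs(est_target - target)))
```

Because the cosine and sine coefficients enter the model linearly, both talkers' parameters come from one joint solve, and enhancement reduces to resynthesis from the target's columns alone. As two frequencies approach each other the corresponding columns become nearly collinear and the solve becomes ill-conditioned, which is consistent with the tracking difficulty the abstract notes for closely spaced frequencies.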
- Non-Radio Communications