Optimum Classification of Voiced Speech, Unvoiced Speech and Silence in the Presence of Noise and Interference.
MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB
The problem of determining whether a given interval of speech signal should be classified as voiced speech, unvoiced speech or silence is formulated as a test of statistical hypotheses. A robust detector structure is obtained by modelling the background noise as a correlated Gaussian random process and the interference as a deterministic periodic waveform. The unvoiced speech signal is also modelled as a Gaussian random process for which an estimate of the spectral properties is known. Voiced speech is characterized as a quasi-periodic deterministic waveform for which general spectral properties are also known. The methods of statistical decision theory are then applied to these models to synthesize an optimum, minimum-probability-of-error classifier. The detector consists essentially of a bank of least-squares filters, each tuned to the general properties of the noise, unvoiced speech and voiced speech waveforms. A suboptimal speech classifier is proposed that considerably simplifies the computational requirements.
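The hypothesis-testing formulation can be illustrated with a heavily simplified sketch. It is not the report's detector: it assumes white (uncorrelated) Gaussian noise rather than correlated noise, omits the periodic interference, and replaces the quasi-periodic voiced model with a known cosine of fixed period and amplitude. Under those assumptions, the minimum-probability-of-error rule reduces to choosing the hypothesis with the largest log-likelihood plus log prior. All names and parameter values (`classify`, `NOISE_VAR`, `PERIOD`, and so on) are hypothetical.

```python
import math
import random


def gaussian_loglik(samples, variance):
    """Log-likelihood of i.i.d. zero-mean Gaussian samples with the given variance."""
    n = len(samples)
    energy = sum(x * x for x in samples)
    return -0.5 * n * math.log(2.0 * math.pi * variance) - energy / (2.0 * variance)


def voiced_template(n_samples, period, amplitude):
    """Assumed voiced waveform: a known cosine (a stand-in for the quasi-periodic model)."""
    return [amplitude * math.cos(2.0 * math.pi * n / period) for n in range(n_samples)]


def classify(frame, noise_var, unvoiced_var, period, amplitude, priors=None):
    """Pick the hypothesis maximizing log-likelihood + log prior (MAP decision)."""
    if priors is None:
        priors = {"silence": 1 / 3, "unvoiced": 1 / 3, "voiced": 1 / 3}
    template = voiced_template(len(frame), period, amplitude)
    residual = [x - s for x, s in zip(frame, template)]
    loglik = {
        # Silence: noise only.
        "silence": gaussian_loglik(frame, noise_var),
        # Unvoiced: independent Gaussian speech plus noise, so variances add.
        "unvoiced": gaussian_loglik(frame, noise_var + unvoiced_var),
        # Voiced: known deterministic waveform plus noise.
        "voiced": gaussian_loglik(residual, noise_var),
    }
    scores = {label: ll + math.log(priors[label]) for label, ll in loglik.items()}
    return max(scores, key=scores.get)


# Synthetic test frames (illustrative values only).
N, PERIOD, AMP = 200, 40, 1.0
NOISE_VAR, UNVOICED_VAR = 0.01, 1.0
rng = random.Random(0)


def noise():
    """Draw one frame of white Gaussian background noise."""
    return [rng.gauss(0.0, math.sqrt(NOISE_VAR)) for _ in range(N)]


silence_frame = noise()
unvoiced_frame = [rng.gauss(0.0, math.sqrt(UNVOICED_VAR)) + w for w in noise()]
voiced_frame = [s + w for s, w in zip(voiced_template(N, PERIOD, AMP), noise())]

for name, frame in [("silence", silence_frame),
                    ("unvoiced", unvoiced_frame),
                    ("voiced", voiced_frame)]:
    print(name, "->", classify(frame, NOISE_VAR, UNVOICED_VAR, PERIOD, AMP))
```

With correlated noise, each likelihood would involve the noise covariance (or a whitening filter) instead of a single variance. That is the role the abstract assigns to the bank of least-squares filters.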
- Voice Communications