Continuous Speech Recognition Using Segmental Neural Nets
BBN SYSTEMS AND TECHNOLOGIES CORP CAMBRIDGE MA
Pagination or Media Count:
We present the concept of a Segmental Neural Net SNN for phonetic modeling in continuous speech recognition. The SNN takes as input all the frames of a phonetic segment and gives as output an estimate of the probability of each of the phonemes, given the input segment. By tak- ing into account all the frames of a phonetic seg- ment simultaneously, the SNN overcomes the well- known conditional-independence limitation of hid- den Markov models HMM. However, the prob- lem of automatic segmentation with neural nets is a formidable computing task compared to HMMs. Therefore, to take advantage of the training and decoding speed of HMMs, we have developed a novel hybrid SNNHMM system that combines the advantages of both types of approaches. In this hy- brid system, use is made of the N-best paradigm to generate likely phonetic segmentations, which are then scored by the SNN. The HMM and SNN scores are then combined to optimize performance. In this manner, the recognition accuracy is guaran- teed to be no worse than the HMM system alone.
- Voice Communications