Robust Coarticulatory Modeling for Continuous Speech Recognition.
Abstract:
The purpose of this project is to perform research into algorithms for the automatic recognition of individual sounds, or phonemes, in continuous speech. The algorithms developed should be appropriate for understanding large-vocabulary continuous speech input and are to be made available to the Strategic Computing Program for incorporation in a complete word recognition system. This report describes progress to date in developing phonetic models that are appropriate for continuous speech recognition. In continuous speech, the acoustic realization of each phoneme depends heavily on the preceding and following phonemes, a process known as coarticulation. Thus, while there are relatively few phonemes in English (on the order of fifty or so), the number of possible different acoustic realizations is in the thousands. Therefore, to develop high-accuracy recognition algorithms, one may need to develop literally thousands of relatively distinct phonetic models to represent the various phonetic contexts adequately. Developing a large number of models usually necessitates having a large amount of speech to provide reliable estimates of the model parameters. The major contributions of this work are the development of: (1) a simple but powerful formalism for modeling phonemes in context; (2) robust training methods for the reliable estimation of model parameters by utilizing the available speech training data in a maximally effective way; and (3) efficient search strategies for phonetic recognition while maintaining high recognition accuracy.
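
To make the idea of phonemes in context concrete, the following is a minimal illustrative sketch, not the formalism developed in this report. It assumes that a context is simply the immediately preceding and following phoneme (a triple), that "#" marks an utterance boundary, and that the three short transcriptions are hypothetical examples. The function name context_units and all symbols are likewise assumptions made for illustration. With about fifty phonemes, such triples have an upper bound of 50 x 50 x 50 = 125,000, of which only a fraction would actually occur in speech.

```python
from collections import Counter


def context_units(phonemes, boundary="#"):
    """Return (left, phoneme, right) triples for one phoneme sequence.

    The boundary symbol pads both ends so that the first and last
    phonemes also receive a left and right context.
    """
    padded = [boundary] + list(phonemes) + [boundary]
    return [
        (padded[i - 1], padded[i], padded[i + 1])
        for i in range(1, len(padded) - 1)
    ]


# Hypothetical phoneme transcriptions (illustrative only).
utterances = [
    ["s", "t", "aa", "p"],  # "stop"
    ["t", "aa", "p"],       # "top"
    ["p", "aa", "t"],       # "pot"
]

# Count how often each context-dependent unit occurs in the data.
counts = Counter(unit for utt in utterances for unit in context_units(utt))

# The same phoneme "aa" appears in more than one distinct context.
contexts_of_aa = sorted(unit for unit in counts if unit[1] == "aa")

if __name__ == "__main__":
    for unit, n in sorted(counts.items()):
        print(unit, n)
    print("distinct contexts of 'aa':", len(contexts_of_aa))
```

Counting occurrences per context in this way also shows why training data becomes a concern: the more finely contexts are distinguished, the fewer examples each individual model sees.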