RECOGNITION OF SPOKEN DIGITS UTILIZING SEQUENTIAL PATTERNS.
TEXAS UNIV AUSTIN ELECTRONICS RESEARCH CENTER
In linguistics, each spoken word can be broken down into a series of phonemes, or units of sound. Ideally, each phoneme could be characterized by the fundamental frequency and the positions of the formants in the speech spectrum. This is not possible in practice because each phoneme is affected by the phonemes before and after it. Furthermore, the fundamental frequency and formant positions differ from speaker to speaker. However, by using the fundamental frequency and the formant resonances to characterize each phoneme to some degree, and then using sequences of these characterizations, recognition can be achieved. The characterizations used in this study were determined by comparing signal intensities in different frequency bands. The voice spectrum was divided into 22 bands using a bandpass filter set with 1/3-octave spacing of the center frequencies. The lowest center frequency was 100 hertz and the highest 12500 hertz. Each of the 22 analog signals was then rectified, smoothed, multiplexed, and digitized. The digitized data were used to determine the sequential patterns of signal intensity concentrations used for recognition. Using a data base of 25 speakers, 14 female and 11 male, a recognition rate of 99.5 with a misrecognition rate of 8.2 was achieved. (Author)
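As a rough illustration of the filter-bank layout described above, the following Python sketch computes 22 center frequencies at 1/3-octave spacing starting from 100 hertz. It also shows one hypothetical way to turn per-frame band intensities into a sequential pattern, by tracking which band is strongest over time. The function names and the peak-band rule are assumptions made for illustration only; the abstract does not specify how intensities were compared. The exact base-2 values also drift slightly from the rounded nominal labels: the top band computes to 12800 hertz here, against the nominal 12500 hertz.

```python
def third_octave_centers(f_low=100.0, n_bands=22):
    """Exact base-2 one-third-octave center frequencies (Hz), starting at f_low."""
    return [f_low * 2.0 ** (k / 3.0) for k in range(n_bands)]


def peak_band_sequence(frames):
    """Hypothetical characterization (not taken from the report):
    take the index of the strongest band in each frame and collapse
    consecutive repeats, leaving a sequence of intensity concentrations."""
    sequence = []
    for frame in frames:
        # Index of the band with the highest rectified, smoothed intensity.
        peak = max(range(len(frame)), key=lambda i: frame[i])
        if not sequence or sequence[-1] != peak:
            sequence.append(peak)
    return sequence


if __name__ == "__main__":
    centers = third_octave_centers()
    print(len(centers), round(centers[0]), round(centers[-1]))
    # Three toy frames with three bands each (made-up numbers).
    print(peak_band_sequence([[1, 5, 2], [0, 9, 1], [7, 1, 1]]))
```

A real front end would feed `peak_band_sequence` one 22-element frame per sampling instant of the multiplexed, digitized filter outputs; the toy frames above only demonstrate the collapsing step.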
- Voice Communications