Computer Identification of Phonemes in Continuous Speech.
AIR FORCE INST OF TECH WRIGHT-PATTERSON AFB OHIO SCHOOL OF ENGINEERING
The purpose of this investigation was to identify phoneme segments as they appeared in continuous speech. The input device was an audio tape recorder, from which the analog speech signal was digitized and transformed with a fast Fourier transform. The amplitudes of the transformed signal were combined in a logarithmic manner and printed out as a 16-channel digitized spectrogram. Sixty-one prototypes were selected to represent the phonemes of the English language. These prototypes were stored and used in a running cross-correlation with the unknown speech signal. The amplitude values resulting from the correlation process were used to predict phoneme locations, and the values were compared in order to identify the correct phoneme. The phoneme prototypes were selected from Speaker A's speech signal, and tests were conducted to analyze utterances from Speaker A and Speaker B. For Speaker A, location was rated at 81 percent and identification at 45 percent. For Speaker B, location was 70 percent and identification 40 percent. Spatial filtering techniques, uniform-length prototypes, and various normalization procedures were then investigated, with the result of improving location for Speaker B. (Author)
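The processing chain described in the abstract (transform, 16-channel log spectrogram, running cross-correlation against stored prototypes) can be illustrated with a minimal sketch. This is not the report's program. The frame length, the hop size, the equal-width channel grouping, the log10 compression, the normalized correlation score, and all function names are assumptions chosen for illustration, and a naive DFT stands in for the FFT to keep the example dependency-free.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitude spectrum of one frame (naive DFT standing in for an FFT)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2)]

def log_channels(mags, n_channels=16):
    """Assumed grouping: equal-width bands, summed and log-compressed."""
    width = len(mags) // n_channels
    return [math.log10(1.0 + sum(mags[c * width:(c + 1) * width]))
            for c in range(n_channels)]

def spectrogram(signal, frame_len=64, hop=32):
    """List of 16-channel frames; frame_len and hop are assumed values."""
    return [log_channels(dft_magnitudes(signal[s:s + frame_len]))
            for s in range(0, len(signal) - frame_len + 1, hop)]

def normalized_correlation(a, b):
    """Zero-mean, unit-norm correlation of two equal-length vectors."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) *
                    sum((y - mb) ** 2 for y in b))
    return num / den if den > 0 else 0.0

def running_correlation(spec, proto):
    """Slide a prototype along the spectrogram, one score per frame offset."""
    flat_proto = [v for frame in proto for v in frame]
    scores = []
    for start in range(len(spec) - len(proto) + 1):
        window = [v for frame in spec[start:start + len(proto)] for v in frame]
        scores.append(normalized_correlation(window, flat_proto))
    return scores

def identify(spec, prototypes):
    """Return (label, frame offset, score) for the best-scoring prototype."""
    best = None
    for label, proto in prototypes.items():
        scores = running_correlation(spec, proto)
        if not scores:
            continue
        offset = max(range(len(scores)), key=scores.__getitem__)
        if best is None or scores[offset] > best[2]:
            best = (label, offset, scores[offset])
    return best

# Synthetic stand-in for speech: a low tone followed by a high tone.
signal = ([math.sin(2 * math.pi * 3 * t / 64) for t in range(256)] +
          [math.sin(2 * math.pi * 20 * t / 64) for t in range(256)])
spec = spectrogram(signal)

# Hypothetical two-frame prototypes cut from the same "speaker".
prototypes = {"low": spec[1:3], "high": spec[9:11]}
high_scores = running_correlation(spec, prototypes["high"])
label, offset, score = identify(spec[8:12], prototypes)
```

In this sketch, peaks in the running correlation play the role of phoneme location, and comparing the peak scores across prototypes plays the role of identification. Cutting prototypes from one speaker and scoring another speaker's utterance would be done by building `prototypes` and `spec` from different signals.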
- Computer Programming and Software
- Voice Communications