Rule-Based Frequency Domain Speech Coding
AIR FORCE INST OF TECH WRIGHT-PATTERSON AFB OH SCHOOL OF ENGINEERING
A speech processing system is designed to simulate the transmission of speech signals using a speech coding scheme. The transmitter portion of the simulation extracts, for each speech timeslice, a minimal set of frequencies in Fourier space that captures the essence of that timeslice. These parameters are then adaptively quantized and transmitted to the receiver portion of the coding scheme, which generates an estimate of the original timeslice from the transmitted parameters using a sinusoidal speech model. After the initial design, the thesis investigates how each design parameter affects the human-perceived quality of the speech. This is done with listening tests in which volunteers listen to a series of speech reconstructions. Each reconstruction is the result of the coding scheme acting on the same speech input file with the design parameters varied. The parameters varied are the number of frequencies used in the sinusoidal speech model for reconstruction, the number of bits used to encode amplitude information, and the number of bits used to encode phase information. The final design parameters for the coding scheme were selected based on the results of the listening tests. Post-design listening tests showed that the system was capable of 4800 bps speech transmission with a quality rating of five on a scale from zero ("not understandable") to ten ("sounds just like original speech").
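The transmit/receive loop described above can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the naive DFT, the largest-magnitude peak selection, and the per-frame gain scaling (standing in for the adaptive quantizer) are all assumptions made for illustration. The three arguments `num_freqs`, `amp_bits`, and `phase_bits` correspond to the three design parameters varied in the listening tests.

```python
import cmath
import math


def dft(frame):
    """Naive O(N^2) discrete Fourier transform; adequate for a short frame."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def quantize(value, lo, hi, bits):
    """Uniform quantizer: map a value in [lo, hi] to an integer code."""
    levels = (1 << bits) - 1
    clipped = min(max(value, lo), hi)
    return round((clipped - lo) / (hi - lo) * levels)


def dequantize(code, lo, hi, bits):
    """Inverse of quantize: map an integer code back to a value in [lo, hi]."""
    levels = (1 << bits) - 1
    return lo + code / levels * (hi - lo)


def encode_frame(frame, num_freqs, amp_bits, phase_bits):
    """Transmitter side: keep the num_freqs strongest bins and quantize them.

    Assumption: the amplitude range is rescaled to each frame's peak
    amplitude, a simple stand-in for adaptive quantization. The gain
    would have to be sent as side information.
    """
    n = len(frame)
    spectrum = dft(frame)
    candidates = range(1, n // 2)  # positive-frequency bins, DC excluded
    top = sorted(candidates, key=lambda k: abs(spectrum[k]), reverse=True)
    top = sorted(top[:num_freqs])
    amps = {k: 2 * abs(spectrum[k]) / n for k in top}
    gain = max(amps.values(), default=0.0) or 1.0  # avoid a zero range
    params = [
        (
            k,
            quantize(amps[k], 0.0, gain, amp_bits),
            quantize(cmath.phase(spectrum[k]), -math.pi, math.pi, phase_bits),
        )
        for k in top
    ]
    return gain, params


def decode_frame(n, gain, params, amp_bits, phase_bits):
    """Receiver side: rebuild the frame as a sum of quantized sinusoids."""
    frame = [0.0] * n
    for k, amp_code, phase_code in params:
        amp = dequantize(amp_code, 0.0, gain, amp_bits)
        phase = dequantize(phase_code, -math.pi, math.pi, phase_bits)
        for t in range(n):
            frame[t] += amp * math.cos(2 * math.pi * k * t / n + phase)
    return frame
```

Varying `num_freqs`, `amp_bits`, and `phase_bits` in such a sketch changes both the reconstruction error and the number of bits needed per frame, which is the trade-off the listening tests were designed to evaluate perceptually.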
- Computer Programming and Software
- Recording and Playback Devices
- Voice Communications