Clustering Techniques in Speaker Recognition
AIR FORCE INST OF TECH WRIGHT-PATTERSON AFB OH SCHOOL OF ENGINEERING
This thesis compares, by identification rate, three clustering techniques applied to cepstral features for speaker identification. LBG vector quantization, as developed by Linde, Buzo and Gray, provides the benchmark performance. It is compared with fuzzy clustering based on the unsupervised fuzzy partition-optimal number of classes (UFP-ONC) algorithm by Gath and Geva, and with an artificial neural network, the multilayer perceptron. Cepstral features from the TIMIT, King and AFIT93 speaker databases are used to produce speaker-identification classifiers with each of the clustering algorithms. The reported experiment evaluates speaker-identification performance using 20-dimensional cepstral features extracted directly from the databases. The speaker databases come from different recording environments: TIMIT is studio quality, AFIT93 was recorded in an office environment, and King consists of recorded telephone conversations. The results therefore indicate the merit of each clustering technique across a range of typical recording environments. This thesis demonstrates the application of fuzzy clustering to speaker identification. The UFP-ONC algorithm is shown to be capable of identification rates equal to those of the LBG vector quantization system. LBG vector quantization provides the best overall performance of the three clustering techniques.
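As a rough illustration of the benchmark approach named in the abstract, the sketch below trains one LBG-style codebook per speaker and identifies a test utterance by the codebook with the lowest average distortion. It is not the thesis's implementation. The function names (lbg_codebook, distortion, identify), the codebook size of 4, the additive split perturbation, and the synthetic Gaussian vectors standing in for 20-dimensional cepstral features are all assumptions made for the example.

```python
import numpy as np

def lbg_codebook(features, size=4, eps=0.01, iters=20):
    """Grow a codebook by repeated splitting (size assumed to be a power of 2)."""
    codebook = features.mean(axis=0, keepdims=True)
    while codebook.shape[0] < size:
        # Split every codeword into a slightly perturbed pair.
        codebook = np.vstack([codebook + eps, codebook - eps])
        for _ in range(iters):
            # Assign each vector to its nearest codeword (squared Euclidean).
            dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
            nearest = dists.argmin(axis=1)
            # Move each codeword to the centroid of its cell.
            for k in range(codebook.shape[0]):
                members = features[nearest == k]
                if len(members):  # leave a codeword unchanged if its cell is empty
                    codebook[k] = members.mean(axis=0)
    return codebook

def distortion(features, codebook):
    """Average squared distance from each vector to its nearest codeword."""
    dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return dists.min(axis=1).mean()

def identify(features, codebooks):
    """Return the speaker whose codebook gives the lowest average distortion."""
    return min(codebooks, key=lambda name: distortion(features, codebooks[name]))

# Synthetic stand-ins for cepstral features (assumption: not real speech data).
rng = np.random.default_rng(0)
train = {
    "A": rng.normal(0.0, 1.0, size=(200, 20)),
    "B": rng.normal(3.0, 1.0, size=(200, 20)),
}
codebooks = {name: lbg_codebook(vectors) for name, vectors in train.items()}

test_utterance = rng.normal(3.0, 1.0, size=(50, 20))  # drawn like speaker "B"
predicted = identify(test_utterance, codebooks)
print(predicted)
```

With real data, each speaker's training vectors would be the cepstral features extracted from that speaker's recordings, and the identification rate would be the fraction of test utterances assigned to the correct speaker.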
APPROVED FOR PUBLIC RELEASE