Iterated Class-Specific Subspaces for Speaker-Dependent Phoneme Classification

Active / Technical Report | Accession Number: ADA494622

Abstract:

Features based on the MEL cepstrum have long dominated probabilistic methods in automatic speech recognition (ASR). This feature set has evolved to maximize general ASR performance within a Bayesian classifier framework that uses a common feature space. With the advent of the PDF projection theorem (PPT) and the class-specific method (CSM), however, it is now possible to design features separately for each phoneme and still compare log-likelihood values fairly across the different feature sets. In this paper, class-dependent features are found by optimizing, individually for each class, a set of frequency-band functions used to project the spectral vectors, analogous to the MEL frequency-band functions. Using this method, we show significant improvement over standard MEL cepstrum methods in speaker- and phoneme-specific recognition.
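
The way the PPT makes log-likelihoods comparable across class-specific feature sets can be illustrated with a minimal sketch. It rests on simplifying assumptions that are not taken from the report: each class uses linear features z = A^T x, the reference hypothesis H0 is a standard Gaussian on the raw vector x, and each class models its features with a single Gaussian. All function and variable names are hypothetical. Under these assumptions the projected log-likelihood on the raw data is log p(x|H0) - log p(z|H0) + log p(z|class), so every class is scored in the same raw-data space even when feature dimensions differ.

```python
import numpy as np

def gauss_logpdf(v, mean, cov):
    """Log-density of a multivariate Gaussian evaluated at v."""
    d = v - mean
    _, logdet = np.linalg.slogdet(cov)
    quad = d @ np.linalg.solve(cov, d)
    return -0.5 * (len(v) * np.log(2 * np.pi) + logdet + quad)

def projected_loglik(x, A, feat_mean, feat_cov):
    """Projected log-likelihood of raw x for one class (illustrative only).

    Assumes features z = A.T @ x and reference hypothesis H0: x ~ N(0, I),
    under which z ~ N(0, A.T @ A).
    """
    n = len(x)
    z = A.T @ x
    log_ref_x = gauss_logpdf(x, np.zeros(n), np.eye(n))      # log p(x | H0)
    log_ref_z = gauss_logpdf(z, np.zeros(len(z)), A.T @ A)   # log p(z | H0)
    log_feat = gauss_logpdf(z, feat_mean, feat_cov)          # log p(z | class)
    return log_ref_x - log_ref_z + log_feat

def classify(x, models):
    """Pick the class with the largest projected log-likelihood."""
    scores = {name: projected_loglik(x, *m) for name, m in models.items()}
    return max(scores, key=scores.get), scores

# Two hypothetical classes with feature sets of different dimension.
eye4 = np.eye(4)
models = {
    "a": (eye4[:, :2], np.array([3.0, 0.0]), np.eye(2)),  # 2-D features
    "b": (eye4[:, 3:], np.array([3.0]), np.eye(1)),       # 1-D features
}
label, scores = classify(np.array([3.0, 0.0, 0.0, 0.0]), models)
```

When a class's feature PDF equals the reference feature PDF, the last two terms cancel and the score reduces to log p(x|H0), which is a quick sanity check on any implementation of this construction.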

Security Markings

DOCUMENT & CONTEXTUAL SUMMARY

Distribution:
Approved For Public Release
Distribution Statement:
Approved For Public Release; Distribution Is Unlimited.

RECORD

Collection: TR