Integrated Feature Normalization and Enhancement for Robust Speaker Recognition Using Acoustic Factor Analysis (Preprint)
TEXAS UNIV AT DALLAS CENTER FOR ROBUST SPEECH SYSTEMS (CRSS)
State-of-the-art factor analysis based channel compensation methods for speaker recognition assume that speaker- and utterance-dependent Gaussian Mixture Model (GMM) mean super-vectors can be constrained to lie in a lower dimensional subspace. This assumption does not consider that conventional acoustic features may also be constrained in a similar way in the feature space. In this study, motivated by the low-rank covariance structure of cepstral features, we propose a factor analysis model in the acoustic feature space instead of the super-vector domain and derive a mixture-dependent feature transformation. We demonstrate that the proposed Acoustic Factor Analysis (AFA) transformation performs feature dimensionality reduction, decorrelation, variance normalization and enhancement at the same time. The transform applies a square-root Wiener gain along the acoustic feature eigenvector directions, and is similar to signal subspace based speech enhancement schemes. We also propose several methods of adaptively selecting the AFA parameter for each mixture. The proposed feature transformation is applied using a probabilistic mixture alignment, and is integrated with a conventional i-Vector system. Experimental results on the telephone trials of the NIST SRE 2010 demonstrate the effectiveness of the proposed scheme.
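The combination of operations named in the abstract can be illustrated with a minimal sketch. It is not taken from the report: it assumes a probabilistic-PCA style model per mixture, in which the noise variance is the mean of the discarded eigenvalues and the transform is the posterior-mean projection onto the top q eigenvectors. The function names `afa_like_transform` and `apply_soft_alignment`, and the choice of a single q shared by all mixtures, are hypothetical.

```python
import numpy as np


def afa_like_transform(cov, q):
    """Build a q x d transform from one mixture's d x d feature covariance.

    Assumption (not from the report): a probabilistic-PCA style model,
    with noise variance equal to the mean of the d - q smallest eigenvalues.
    """
    d = cov.shape[0]
    if not 0 < q < d:
        raise ValueError("q must satisfy 0 < q < d")
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]           # re-sort to descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    noise_var = eigvals[q:].mean()
    lam = eigvals[:q]
    # Square-root Wiener gain along each retained eigenvector direction.
    gain = np.sqrt((lam - noise_var) / lam)
    # Dividing by sqrt(lam) normalizes the variance of each direction.
    scale = gain / np.sqrt(lam)
    # Rows of eigvecs[:, :q].T project onto the top-q eigenvectors,
    # which reduces the dimension and decorrelates the features.
    return scale[:, None] * eigvecs[:, :q].T, noise_var


def apply_soft_alignment(x, means, transforms, posteriors):
    """Combine per-mixture transformed frames, weighted by mixture posteriors.

    Assumption: every transform has the same output dimension q.
    """
    return sum(p * (A @ (x - mu))
               for p, mu, A in zip(posteriors, means, transforms))
```

Under these assumptions, each transformed direction has variance (λ − σ²)/λ, where λ is the retained eigenvalue and σ² the estimated noise variance: the square of the Wiener-type gain applied to a variance-normalized, decorrelated feature.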
- Statistics and Probability
- Voice Communications