Accession Number:
AD1031245
Title:
Performance Evaluation of Glottal Inverse Filtering Algorithms Using a Physiologically Based Articulatory Speech Synthesizer
Descriptive Note:
Journal Article - Open Access
Corporate Author:
MASSACHUSETTS INST OF TECH LEXINGTON LEXINGTON United States
Report Date:
2017-01-05
Pagination or Media Count:
11
Abstract:
Glottal inverse filtering aims to estimate the glottal airflow signal from a speech signal for applications such as speaker recognition and clinical assessment. Nonetheless, evaluation of inverse filtering performance has been challenging due to the practical difficulty of measuring the true glottal signals while speech signals are recorded. Apart from this, it is suspected that the performance of many methods degrades in conditions that are of great interest, such as breathy voice, high pitch, soft/loud voice, and running speech. This paper presents a comprehensive, objective, and comparative evaluation of state-of-the-art inverse filtering algorithms that takes advantage of speech and glottal signals generated by a physiologically relevant speech synthesizer. The synthesizer provides a realistic simulation of the voice production process, and thus an adequate test bed for revealing the temporal and spectral performance characteristics of each algorithm. Included in the synthetic data are continuous running speech utterances and sustained vowels, which are produced with multiple voice qualities (pressed, slightly pressed, modal, slightly breathy, and breathy) and subglottal pressure levels to simulate the natural variations in real speech. In evaluating the accuracy of a glottal flow estimate, multiple error measures are used, including an error in the estimated signal that measures overall waveform deviation, as well as an error in each of several clinically relevant features extracted from the glottal flow estimate. For two vowel-specific data subsets that were isolated for two open vowels and analyzed with three closed-phase approaches, the resulting waveform errors had mean and standard deviation values below 20% and 10%, respectively, of the true glottal source amplitude. These approaches also showed remarkable stability across different voice qualities and subglottal pressure levels. Results of data subset analysis suggest that analysis of close rounded vowels
Distribution Statement:
APPROVED FOR PUBLIC RELEASE