Robust Speech Processing & Recognition: Speaker ID, Language ID, Speech Recognition/Keyword Spotting, Diarization/Co-Channel/Environmental Characterization, Speaker State Assessment
Final technical rept. Apr 2011-Apr 2015
TEXAS UNIV AT DALLAS RICHARDSON
This study focused on five complementary research tasks in the domain of audio, speech, language, and speaker recognition and processing. In the area of speaker recognition/identification (SID), advancements have been realized to address acoustic mismatch due to speaker overlap, language mismatch, channel/microphone/additive noise, speaker style (spoken vs. singing), speaker state (physical task stress), distant speech, and environment-based room reverberation. In language ID (LID), advancements have been shown for improved out-of-set language rejection, as well as integrated spectral and prosody-based LID solutions. For co-channel and diarization, new algorithms based on gammatone subband frequency modulation were developed. In diarization, robust speech activity detection based on a combination (Combo-SAD) feature stream was developed. New keyword spotting technology using phonological features was also developed, along with audio stream assessment for peak clipping and speaker height estimation. All algorithms were evaluated on various speech corpora from AFRL, CRSS-UTDallas, and publicly available sources.
- Voice Communications