Accession Number:

ADA568901

Title:

Familiar Speaker Recognition

Descriptive Note:

Conference paper

Corporate Author:

AIR FORCE RESEARCH LAB ROME NY

Report Date:

2012-05-01

Pagination or Media Count:

5.0

Abstract:

Speaker recognition by machines can be quite good for large groups, as demonstrated in NIST speaker recognition evaluations. However, speaker recognition by machines can be fragile in changing environments. This research examines how robust humans are at recognizing familiar speakers in changing environments. The short-term goal of the research was to learn what frequency information is important for the recognition of familiar speakers by masking out certain frequency information. The long-term goal of the research is to use this information to develop more robust speaker recognition features. The authors used additive speech-shaped noise (long-term average speech spectrum, LTASS) to degrade particular frequency regions of the speech signal. This way, the signal still sounded natural and the performance of listeners could be tied to the degradation of particular frequencies. If performance decreased when a set of frequencies was masked by an interfering signal, it would indicate that the frequency range was important. The main conclusion of the research is that the distributions of the Normal Hearing and Hearing Deficit groups were statistically different for each listening condition, both for the performance values and the average elapsed time. Additional analysis is being performed to identify factors that may impact a listener's ability to recognize a person's identity. All the band-limited noise conditions resulted in lower performance compared to the clean (no-noise) conditions. This research was a cursory look at what frequency information is important for speaker identification. More listening experiments with better randomization of stimuli and phonetic consideration are required.

Subject Categories:

  • Psychology
  • Acoustics
  • Voice Communications

Distribution Statement:

APPROVED FOR PUBLIC RELEASE