Accession Number:

ADA510822

Title:

Towards Link Characterization from Content

Descriptive Note:

Conference paper

Corporate Author:

RUTGERS - THE STATE UNIV PISCATAWAY NJ

Personal Author(s):

Report Date:

2008-01-01

Pagination or Media Count:

5

Abstract:

In processing large volumes of speech and language data, we are often interested in the distribution of languages, speakers, topics, etc. For large data sets, these distributions are typically estimated at a given point in time using pattern classification technology. Such estimates can be highly biased, especially for rare classes. While these biases have been addressed in some applications, they have thus far been ignored in the speech and language literature. This neglect causes significant error for low-frequency classes. Correcting this biased distribution involves exploiting uncertain knowledge of the classifier error patterns. The Metropolis-Hastings algorithm allows us to construct a Bayes estimator for the true class proportions. We experimentally evaluate this algorithm for a speaker recognition task.
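
The correction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It makes several simplifying assumptions that the abstract does not state: the classifier's confusion matrix is treated as known and fixed (the paper treats it as uncertain), the prior over class proportions is flat, and the proposal is a Dirichlet distribution centered near the current state. All function names, parameter names, and the numbers in the example are hypothetical.

```python
import math
import random


def log_likelihood(pi, confusion, counts):
    """Multinomial log-likelihood of the observed classifier outputs.

    pi[i]           assumed true proportion of class i
    confusion[i][j] P(classifier outputs j | true class is i), assumed known
    counts[j]       number of items the classifier labeled as class j
    """
    total = 0.0
    for j, n_j in enumerate(counts):
        q_j = sum(pi[i] * confusion[i][j] for i in range(len(pi)))
        total += n_j * math.log(q_j)
    return total


def sample_dirichlet(alpha, rng):
    """Draw one point on the simplex from Dirichlet(alpha)."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    s = sum(draws)
    return [d / s for d in draws]


def log_dirichlet_pdf(x, alpha):
    """Log density of Dirichlet(alpha) at x (all x[i] must be > 0)."""
    return (math.lgamma(sum(alpha))
            - sum(math.lgamma(a) for a in alpha)
            + sum((a - 1.0) * math.log(xi) for a, xi in zip(alpha, x)))


def estimate_proportions(counts, confusion, n_iter=20000, burn_in=5000,
                         concentration=200.0, seed=0):
    """Posterior-mean estimate of true class proportions via Metropolis-Hastings.

    Assumes a flat prior on the simplex, so the posterior is proportional
    to the likelihood. The posterior mean is the Bayes estimator under
    squared-error loss.
    """
    rng = random.Random(seed)
    k = len(confusion)
    pi = [1.0 / k] * k
    log_post = log_likelihood(pi, confusion, counts)
    running = [0.0] * k
    kept = 0
    for step in range(n_iter):
        # Propose from a Dirichlet centered near the current state.
        alpha_fwd = [concentration * p + 1.0 for p in pi]
        cand = sample_dirichlet(alpha_fwd, rng)
        if min(cand) > 0.0:
            alpha_rev = [concentration * p + 1.0 for p in cand]
            cand_log_post = log_likelihood(cand, confusion, counts)
            # Hastings ratio: the proposal is asymmetric, so include
            # both the forward and the reverse proposal densities.
            log_ratio = (cand_log_post - log_post
                         + log_dirichlet_pdf(pi, alpha_rev)
                         - log_dirichlet_pdf(cand, alpha_fwd))
            if math.log(1.0 - rng.random()) < log_ratio:
                pi, log_post = cand, cand_log_post
        if step >= burn_in:
            for i in range(k):
                running[i] += pi[i]
            kept += 1
    return [r / kept for r in running]


if __name__ == "__main__":
    # Hypothetical two-class example with a rare second class.
    confusion = [[0.9, 0.1],
                 [0.2, 0.8]]
    counts = [865, 135]  # classifier output counts, made up for illustration
    naive = [c / sum(counts) for c in counts]
    corrected = estimate_proportions(counts, confusion)
    print("naive:", naive)
    print("corrected:", corrected)
```

In this made-up example the raw classifier output puts 13.5% of the items in the rare class, while the sampled posterior mean falls well below that figure, because most items labeled as the rare class are explained by false alarms from the common class. Handling uncertainty in the confusion matrix, as the abstract describes, would require sampling its entries as well.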

Subject Categories:

  • Voice Communications

Distribution Statement:

APPROVED FOR PUBLIC RELEASE