Accession Number:

ADA460230

Title:

Recent Progress in Robust Vocabulary-Independent Speech Recognition

Descriptive Note:

Corporate Author:

CARNEGIE-MELLON UNIV PITTSBURGH PA SCHOOL OF COMPUTER SCIENCE

Personal Author(s):

Report Date:

1991-01-01

Pagination or Media Count:

7

Abstract:

This paper reports recent efforts to improve the performance of CMU's robust vocabulary-independent (VI) speech recognition systems on the DARPA speaker-independent resource management task. The improvements are evaluated on 320 sentences randomly selected from the DARPA June 88, February 89 and October 89 test sets. Our first improvement involves more detailed acoustic modeling. We incorporated more dynamic features computed from the LPC cepstra and reduced error by 15% over the baseline system. Our second improvement comes from a larger training database. With more training data, our third improvement comes from more detailed subword modeling. We incorporated the word boundary context into our VI subword modeling, which resulted in a 30% error reduction. Finally, we used decision-tree allophone clustering to find more suitable models for the subword units not covered in the training set and further reduced error by 17%. All the techniques combined reduced the VI error rate on the resource management task from 11.1% to 5.4%, and from 15.4% to 7.4% when training and testing were under different recording environments. This vocabulary-independent performance has exceeded our vocabulary-dependent performance.

[...] first-order differenced cepstra and power. Here, we add second-order differenced cepstra and power. We also incorporate both 40 msec and 80 msec differenced cepstra. These new features yielded a 15% error rate reduction, about the same as was achieved on vocabulary-dependent tasks [7]. Our second improvement involves the collection of more general English data, from which we can model more phonetic variabilities, such as the word boundary context. Our experiment shows that adding 5,000 sentences to an original 15,000-sentence training set gives only a 3% error reduction.
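As a quick arithmetic check on the figures quoted above, the following minimal sketch computes the overall relative error reduction implied by the reported error rates. It also estimates what the individual reductions would give if they compounded. The function names are illustrative and do not come from the report. Treating the four stage-wise reductions as sequential and multiplicative is an assumption; the abstract does not say how they combine.

```python
def relative_reduction(before, after):
    """Relative error reduction as a fraction of the starting error rate."""
    return (before - after) / before


def compound(rate, reductions):
    """Apply a sequence of relative reductions (fractions) to a starting rate.

    Assumption: each reduction acts on the error left by the previous stage.
    """
    for r in reductions:
        rate *= 1.0 - r
    return rate


# Error rates (percent) quoted in the abstract.
same_env = relative_reduction(11.1, 5.4)    # same recording environment
cross_env = relative_reduction(15.4, 7.4)   # different recording environments

# Stage-wise reductions quoted in the abstract: dynamic features (15%),
# extra training data (3%), word boundary context (30%), tree clustering (17%).
predicted = compound(11.1, [0.15, 0.03, 0.30, 0.17])

print(f"same environment:  {same_env:.1%} relative reduction")
print(f"cross environment: {cross_env:.1%} relative reduction")
print(f"compounded estimate from 11.1%: {predicted:.2f}%")
```

Under that assumption the compounded estimate is close to the reported 5.4% final error rate, and both test conditions show a relative reduction of roughly one half.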

Subject Categories:

  • Linguistics
  • Voice Communications

Distribution Statement:

APPROVED FOR PUBLIC RELEASE