Accession Number:

ADA459584

Title:

Language Modeling With Sentence-Level Mixtures

Descriptive Note:

Corporate Author:

BOSTON UNIV MA

Report Date:

1994-01-01

Pagination or Media Count:

7

Abstract:

This paper introduces a simple mixture language model that attempts to capture long-distance constraints in a sentence or paragraph. The model is an m-component mixture of trigram models. The models were constructed using a 5K vocabulary and trained using a 76 million word Wall Street Journal text corpus. Using the BU recognition system, experiments show a 7% improvement in recognition accuracy with the mixture trigram models as compared to using a trigram model.

Subject Categories:

  • Linguistics

Distribution Statement:

APPROVED FOR PUBLIC RELEASE