Accession Number:

ADA459912

Title:

Estimating Grammar Parameters using Bounded Memory

Descriptive Note:

Corporate Author:

MARYLAND UNIV BALTIMORE DEPT OF COMPUTER SCIENCE AND ELECTRICAL ENGINEERING

Personal Author(s):

Report Date:

2002-01-01

Pagination or Media Count:

15.0

Abstract:

Estimating the parameters of stochastic context-free grammars SCFGs from data is an important, well-studied problem. Almost without exception, existing approaches make repeated passes over the training data. The memory requirements of such algorithms are ill-suited for embedded agents exposed to large amounts of training data over long periods of time. We present a novel algorithm, called HOLA, for estimating the parameters of SCFGs that computes summary statistics for each string as it is observed and then discards the string. The memory used by HOLA is bounded by the size of the grammar, not by the amount of training data. Empirical results show that HOLA performs as well as the Inside-Outside algorithm on a variety of standard problems, despite the fact that it has access to much less information.

Subject Categories:

  • Linguistics
  • Numerical Mathematics

Distribution Statement:

APPROVED FOR PUBLIC RELEASE