TINA: A Probabilistic Syntactic Parser for Speech Understanding Systems
MASSACHUSETTS INST OF TECH CAMBRIDGE LAB FOR COMPUTER SCIENCE
A new natural language system, TINA, has been developed for applications involving speech understanding tasks. It integrates key ideas from context-free grammars, Augmented Transition Networks (ATNs) [1], and Lexical Functional Grammars (LFGs) [2]. The parser uses a best-first search strategy, with probability assignments on all arcs obtained automatically from a set of example sentences. An initial context-free grammar, derived from the example sentences, is first converted to a probabilistic network structure. Control includes both top-down and bottom-up cycles, and key parameters are passed among nodes to deal with long-distance movement and agreement constraints. The probabilities provide a natural mechanism for exploring more common grammatical constructions first. Arc probabilities also reduced test-set perplexity by nearly an order of magnitude. Also included is a new strategy for dealing with movement, which efficiently handles nested and chained gaps and rejects crossed gaps.
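
The two ideas of arc probabilities learned from examples and best-first exploration can be illustrated with a minimal Python sketch. It is not TINA's implementation: it assumes a simplified network in which each arc probability depends only on the current node, it assumes the network is acyclic, and the node names, the toy training paths, and the functions `estimate_arc_probs` and `best_first_paths` are hypothetical names introduced here for illustration.

```python
import heapq
import math
from collections import Counter, defaultdict


def estimate_arc_probs(paths):
    """Relative-frequency estimate of P(next node | current node).

    `paths` is a list of node sequences, standing in for the example
    sentences (assumption: one node sequence per example).
    """
    counts = defaultdict(Counter)
    for path in paths:
        for cur, nxt in zip(path, path[1:]):
            counts[cur][nxt] += 1
    return {
        cur: {nxt: c / sum(ctr.values()) for nxt, c in ctr.items()}
        for cur, ctr in counts.items()
    }


def best_first_paths(arc_probs, start, end):
    """Yield (probability, path) pairs from most to least probable.

    Uses a priority queue keyed on negative log probability, so the
    partial path with the highest probability is always extended first.
    Assumes the network has no cycles.
    """
    heap = [(0.0, [start])]
    while heap:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == end:
            yield math.exp(-cost), path
            continue
        for nxt, p in arc_probs.get(node, {}).items():
            heapq.heappush(heap, (cost - math.log(p), path + [nxt]))


# Toy training data (hypothetical): four example node sequences.
examples = [
    ["START", "subject", "verb", "object", "END"],
    ["START", "subject", "verb", "object", "END"],
    ["START", "subject", "verb", "END"],
    ["START", "verb", "object", "END"],
]

probs = estimate_arc_probs(examples)
ranked = list(best_first_paths(probs, "START", "END"))
for p, path in ranked:
    print(f"{p:.4f}  {' '.join(path)}")
```

In this toy network the most frequent construction (subject, verb, object) is enumerated first, which mirrors the point that the probabilities let the parser explore more common constructions before rarer ones.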