Coping with Ambiguity and Unknown Words through Probabilistic Models
BBN SYSTEMS AND TECHNOLOGIES CORP CAMBRIDGE MA
From spring 1990 through fall 1991, we performed a battery of small experiments to test the effectiveness of supplementing knowledge-based techniques with probabilistic models. This paper reports our experiments in predicting the parts of speech of highly ambiguous words, predicting the intended interpretation of an utterance when more than one interpretation satisfies all known syntactic and semantic constraints, and learning case frame information for verbs from example uses. From these experiments, we are convinced that probabilistic models based on annotated corpora, used to supplement knowledge-based techniques, can effectively reduce ambiguity in processing text and can be used to acquire lexical information from a corpus. Based on the results of those experiments, we have constructed a new natural language system, PLUM, for extracting data from text, e.g., newswire text.