Cohesion in Computer Text Generation: Lexical Substitution.
MASSACHUSETTS INST OF TECH CAMBRIDGE LAB FOR COMPUTER SCIENCE
This report describes Paul, a computer text generation system designed to create cohesive text. The device used to achieve this cohesion is lexical substitution. Using syntactic and semantic information, the system determines which type of lexical substitution provides enough information to generate an understandable reference without providing so much that the reference becomes confusing or unnatural. Specifically, Paul is designed to choose deterministically among pronominalization, superordinate substitution, and definite noun phrase reiteration. The system identifies a strength of antecedence recovery for each of the lexical substitutions and matches these against the strength of potential antecedence of each element in the text to select the proper substitution for that element. There are five classes of potential antecedence, based on the element's current and previous syntactic roles, its semantic case roles, and the current focus of the discourse. Through the use of these lexical substitutions, Paul is able to generate a cohesive text that exhibits the binding of sentences through presupposition dependencies, the marking of old information as distinct from new, and the avoidance of unnecessary and tedious repetition.
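The matching step described above can be illustrated with a minimal Python sketch. It is not Paul's implementation: the abstract does not say how the five classes are numbered or which substitution each class selects, so the numbering (class 1 as the strongest potential antecedent, class 5 as the weakest), the cut-off table `WEAKEST_CLASS_HANDLED`, and the names `choose_substitution` and `realize` are all assumptions made for illustration. The only idea taken from the abstract is that the choice is deterministic and that a less informative substitution is used when the element is a stronger potential antecedent.

```python
from enum import Enum


class Substitution(Enum):
    PRONOUN = "pronoun"
    SUPERORDINATE = "superordinate"
    DEFINITE_NP = "definite noun phrase"


# Hypothetical table: each substitution is paired with the weakest antecedence
# class (1 = strongest, 5 = weakest) for which it is assumed to be recoverable.
# The cut-offs 2, 4 and 5 are illustrative assumptions, not values from the report.
WEAKEST_CLASS_HANDLED = [
    (Substitution.PRONOUN, 2),
    (Substitution.SUPERORDINATE, 4),
    (Substitution.DEFINITE_NP, 5),
]


def choose_substitution(antecedence_class):
    """Return the least informative substitution assumed to be recoverable."""
    if antecedence_class not in range(1, 6):
        raise ValueError("antecedence class must be an integer from 1 to 5")
    # The table is ordered from least to most informative, so the first
    # entry whose cut-off covers the class is the one selected.
    for substitution, weakest_class in WEAKEST_CLASS_HANDLED:
        if antecedence_class <= weakest_class:
            return substitution


def realize(noun, superordinate, pronoun, antecedence_class):
    """Render a reference to an element using the chosen substitution."""
    choice = choose_substitution(antecedence_class)
    if choice is Substitution.PRONOUN:
        return pronoun
    if choice is Substitution.SUPERORDINATE:
        return "the " + superordinate
    return "the " + noun
```

Under these assumptions, `realize("poodle", "dog", "it", 1)` yields a pronoun, while the same element in class 5 is reiterated as a full definite noun phrase. A lookup table keeps the choice deterministic, which is the property the abstract emphasizes.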