Efficient Language Independent Generation from Lexical Conceptual Structure
MARYLAND UNIV COLLEGE PARK INST FOR ADVANCED COMPUTER STUDIES
This paper describes a system for generating natural-language sentences from an interlingual representation, Lexical Conceptual Structure (LCS). The system has been developed as part of a Chinese-English Machine Translation (MT) system; however, it is designed to be used for many other MT language pairs and Natural Language applications. The contributions of this work include: (1) development of a language-independent generation system that maximizes efficiency through the use of a hybrid rule-based/statistical module; (2) enhancements to an interlingual representation and associated algorithms for interpretation of multiply ambiguous input sentences; (3) development of an efficient, reusable, language-independent linearization module with a grammar description language that can be used with other systems; (4) improvements to an earlier algorithm for hierarchically mapping thematic roles to surface positions; and (5) development of a diagnostic tool for lexicon coverage and correctness, and use of the tool for verification of English, Spanish, and Chinese lexicons. An evaluation of translation quality shows performance comparable to that of a commercial translation system. The generation system can also be straightforwardly extended to other languages, and this is demonstrated and evaluated for Spanish.