Handling Translation Divergences in Generation-Heavy Hybrid Machine Translation

Active / Technical Report | Accession Number: ADA458778

Abstract:

This paper describes a novel approach for handling translation divergences in a Generation-Heavy Hybrid Machine Translation (GHMT) system. The approach depends on the existence of rich target language resources such as word lexical semantics, including information about categorical variations and subcategorization frames. These resources are used to generate multiple structural variations from a target-glossed lexico-syntactic representation of the source language sentence. The multiple structural variations account for different translation divergences. The overgeneration of the approach is constrained by a target-language model using corpus-based statistics. The exploitation of target language resources (symbolic and statistical) to handle a problem usually reserved to Transfer and Interlingual MT is useful for translation from structurally divergent source languages with scarce linguistic resources. A preliminary evaluation on the application of this approach to Spanish-English MT proves this approach extremely promising. The approach, however, is not limited to MT, as it can be extended to monolingual NLG applications such as summarization.
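The overgenerate-and-rank idea in the abstract can be illustrated with a minimal Python sketch. It is not the report's implementation: the toy corpus, the hand-written candidate list, the add-one-smoothed bigram model, and the names `train_bigrams`, `avg_log_prob`, and `rank` are all assumptions made for illustration. The candidates stand in for structural variations that a symbolic generator might produce for a Spanish sentence such as "Juan cruzó el río nadando"; the statistical model then prefers the variant most typical of the target language.

```python
import math
from collections import Counter

# Hypothetical toy target-language corpus; stands in for corpus-based statistics.
CORPUS = [
    "john swam across the river",
    "mary swam across the lake",
    "john ran across the road",
    "mary crossed the road",
]

def train_bigrams(sentences):
    """Count unigram contexts and bigrams over sentences padded with <s> and </s>."""
    unigrams, bigrams = Counter(), Counter()
    for s in sentences:
        tokens = ["<s>"] + s.split() + ["</s>"]
        unigrams.update(tokens[:-1])              # counts of each left-hand context
        bigrams.update(zip(tokens, tokens[1:]))
    vocab = {t for s in sentences for t in s.split()} | {"<s>", "</s>"}
    return unigrams, bigrams, len(vocab)

def avg_log_prob(sentence, model):
    """Average add-one-smoothed bigram log-probability per transition."""
    unigrams, bigrams, v = model
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    pairs = list(zip(tokens, tokens[1:]))
    total = sum(math.log((bigrams[p] + 1) / (unigrams[p[0]] + v)) for p in pairs)
    return total / len(pairs)

def rank(candidates, model):
    """Order overgenerated candidates from most to least fluent under the model."""
    return sorted(candidates, key=lambda c: avg_log_prob(c, model), reverse=True)

# Hand-written structural variations (assumed output of a symbolic generator).
CANDIDATES = [
    "john crossed the river swimming",
    "john swam across the river",
    "john crossed swimming the river",
]

MODEL = train_bigrams(CORPUS)
RANKED = rank(CANDIDATES, MODEL)
```

With this toy corpus, `RANKED[0]` is "john swam across the river": the variant that realizes the manner of motion as the main verb outranks the two that mirror the Spanish structure. A realistic system would use a much larger corpus and a stronger language model; the sketch only shows how statistical ranking can prune symbolic overgeneration.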

Security Markings

DOCUMENT & CONTEXTUAL SUMMARY

Distribution:
Approved For Public Release
Distribution Statement:
Approved For Public Release; Distribution Is Unlimited.

RECORD

Collection: TR