Accession Number:

ADA604450

Title:

Improved Phrase Translation Modeling Using Maximum A-Posteriori (MAP) Adaptation

Descriptive Note:

Interim rept. 1 Jul 2011-30 Jun 2012

Corporate Author:

AIR FORCE RESEARCH LAB WRIGHT-PATTERSON AFB OH HUMAN PERFORMANCE WING (711TH) HUMAN EFFECTIVENESS DIRECTORATE/HUMAN CENTERED ISR DIV

Report Date:

2013-07-01

Pagination or Media Count:

15

Abstract:

In this paper, we explore several methods of improving the estimation of translation model probabilities for phrase-based statistical machine translation when in-domain data are sparse. We introduce a hierarchical variant of MAP adaptation for domain adaptation with an arbitrary number of out-of-domain models. We compare this adaptation technique to linear interpolation and phrase table fill-up. Additionally, we note that domain adaptation can have a smoothing effect, and we explore the interaction between smoothing and the incorporation of out-of-domain data. We find that the relative contributions of smoothing and interpolation depend on the datasets used. For both the IWSLT 2011 and WMT 2011 English-French datasets, the MAP adaptation method we present improves on a baseline system by 1.5 BLEU points.

Subject Categories:

  • Linguistics
  • Statistics and Probability
  • Cybernetics

Distribution Statement:

APPROVED FOR PUBLIC RELEASE