Accession Number:

ADA466330

Title:

A Survey of Statistical Machine Translation

Descriptive Note:

Technical rept.

Corporate Author:

MARYLAND UNIV COLLEGE PARK DEPT OF COMPUTER SCIENCE

Personal Author(s):

Report Date:

2007-04-01

Pagination or Media Count:

51

Abstract:

Statistical machine translation (SMT) treats the translation of natural language as a machine learning problem. By examining many samples of human-produced translation, SMT algorithms automatically learn how to translate. SMT has made tremendous strides in less than two decades, and many popular techniques have only emerged within the last few years. This survey presents a tutorial overview of state-of-the-art SMT at the beginning of 2007. We begin with the context of the current research, and then move to a formal problem description and an overview of the four main subproblems: translational equivalence modeling, mathematical modeling, parameter estimation, and decoding. Along the way, we present a taxonomy of some different approaches within these areas. We conclude with an overview of evaluation and notes on future directions.

Subject Categories:

  • Numerical Mathematics
  • Computer Programming and Software
  • Cybernetics

Distribution Statement:

APPROVED FOR PUBLIC RELEASE