Accession Number:

ADA461133

Title:

Modeling Syntax for Parsing and Translation

Descriptive Note:

Doctoral thesis

Corporate Author:

CARNEGIE-MELLON UNIV PITTSBURGH PA DEPT OF COMPUTER SCIENCE

Personal Author(s):

Report Date:

2003-12-15

Pagination or Media Count:

131

Abstract:

Syntactic structure is an important component of natural language utterances, for both form and content. Therefore, a variety of applications can benefit from the integration of syntax into their statistical models of language. In this thesis, two new syntax-based models are presented, along with their training algorithms: a monolingual generative model of sentence structure, and a model of the relationship between the structure of a sentence in one language and the structure of its translation into another language. After these models are trained and tested on the respective tasks of monolingual parsing and word-level bilingual corpus alignment, they are demonstrated in two additional applications. First, a new statistical parser is automatically induced for a language in which none was available, using a bilingual corpus. Second, a statistical translation system is augmented with syntax-based models. Thus the contributions of this thesis include a statistical parsing system; a bilingual parsing system, which infers a structural relationship between two languages using a bilingual corpus; a method for automatically building a parser for a language where no parser is available; and a translation model that incorporates phrase structure.

Subject Categories:

  • Linguistics
  • Statistics and Probability

Distribution Statement:

APPROVED FOR PUBLIC RELEASE