Accession Number:

ADA456312

Title:

English-Chinese Information Retrieval at IBM

Descriptive Note:

Corporate Author:

IBM THOMAS J WATSON RESEARCH CENTER YORKTOWN HEIGHTS NY

Report Date:

2006-01-01

Pagination or Media Count:

7

Abstract:

We describe TREC-9 experiments with an IR system that incorporates statistical machine translation, trained on sentence-aligned parallel corpora, for both query translation (English-to-Chinese) and document translation (Chinese-to-English). These systems are contrasted with monolingual Chinese retrieval and with query translation based on a widely available commercial machine translation package. These systems use both words and characters as retrieval features. Comparisons with a baseline from TREC-5 and TREC-6 enable our experiments to address issues related to the differences between Beijing and Hong Kong dialects.

Subject Categories:

  • Information Science
  • Cybernetics

Distribution Statement:

APPROVED FOR PUBLIC RELEASE