English-Chinese Information Retrieval at IBM
IBM THOMAS J WATSON RESEARCH CENTER YORKTOWN HEIGHTS NY
We describe TREC-9 experiments with an IR system that incorporates statistical machine translation trained on sentence-aligned parallel corpora, for both query translation (English-to-Chinese) and document translation (Chinese-to-English). These systems are contrasted with monolingual Chinese retrieval and with query translation based on a widely available commercial machine translation package. These systems use both words and characters as retrieval features. Comparisons with a baseline from TREC-5/6 enable our experiments to address issues related to the differences between Beijing and Hong Kong dialects.
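As a rough illustration of what using both words and characters as retrieval features can look like, the sketch below counts word-level and character-level features for a Chinese text and compares a query with a document by feature overlap. It is not the system described in the report: the function names `mixed_features` and `overlap_score`, the `w:` and `c:` prefixes, and the overlap scoring are all hypothetical, and the input is assumed to be already word-segmented with whitespace between words.

```python
from collections import Counter


def mixed_features(segmented_text: str) -> Counter:
    """Count word and character features for one text.

    Assumption: the text is already word-segmented, with words
    separated by whitespace. Word features get a "w:" prefix and
    character features a "c:" prefix so the two kinds never collide.
    """
    features = Counter()
    for word in segmented_text.split():
        features["w:" + word] += 1
        for ch in word:
            features["c:" + ch] += 1
    return features


def overlap_score(query: Counter, doc: Counter) -> int:
    """Sum, over query features, the count shared with the document.

    This is a deliberately simple stand-in for a real ranking formula.
    """
    return sum(min(count, doc[feat]) for feat, count in query.items())
```

With this sketch, a document whose segmentation differs from the query's (for example `信息检索` kept as one word) shares no word features with the query `信息 检索` but still matches on all four characters, which is one reason character features can help when segmentation is inconsistent.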
- Information Science