Accession Number:
ADA455144
Title:
A Statistical Word-Level Translation Model for Comparable Corpora
Descriptive Note:
Corporate Author:
MARYLAND UNIV COLLEGE PARK INST FOR ADVANCED COMPUTER STUDIES
Report Date:
2000-06-01
Pagination or Media Count:
12.0
Abstract:
In this paper, we present a model of statistical word-level mapping for comparable corpora. The approach is based on the assumption that if two terms have close distributional profiles, their corresponding translations' distributional profiles should also be close in a comparable corpus. The proposed model is described. A preliminary investigation on intralanguage comparable corpora is laid out. The preliminary results are 92% accurate, suggesting the feasibility of the model. The model needs some improvements and should be tested cross-linguistically before its significance can be assessed.
Distribution Statement:
APPROVED FOR PUBLIC RELEASE
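
The distributional assumption stated in the abstract can be illustrated with a minimal sketch. This record does not describe the report's actual model, so everything here is an assumption made for illustration: the function names `profile` and `cosine`, the fixed co-occurrence window, the use of raw counts, and cosine similarity as the closeness measure. The toy sentences are invented and are not data from the report.

```python
from collections import Counter
import math


def profile(tokens, target, window=2):
    """Count the words that co-occur with `target` within +/- `window` tokens.

    Hypothetical helper: a raw-count context vector stands in for whatever
    distributional profile the report actually uses.
    """
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[tokens[j]] += 1
    return counts


def cosine(p, q):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(p[w] * q[w] for w in p.keys() & q.keys())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


# Two tiny "comparable" texts in the same language (invented toy data).
corpus_a = "the doctor treated the patient in the hospital".split()
corpus_b = "the physician treated the patient in the clinic".split()

doctor = profile(corpus_a, "doctor")
physician = profile(corpus_b, "physician")
hospital = profile(corpus_a, "hospital")

# Terms used in similar contexts get closer profiles than unrelated terms.
sim_related = cosine(doctor, physician)
sim_unrelated = cosine(doctor, hospital)
```

In this toy setting, `doctor` and `physician` share the same context words, so their profiles are closer than those of `doctor` and `hospital`. A cross-language version would additionally need a way to compare context words across languages, such as a seed lexicon, which this sketch does not attempt.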