Accession Number:

ADA525800

Title:

Domain-Specific Term-List Expansion Using Existing Linguistic Resources

Descriptive Note:

Technical rept.

Corporate Author:

MARYLAND UNIV COLLEGE PARK IACS LANGUAGE AND MEDIA PROCESSING LAB

Personal Author(s):

Report Date:

2002-08-01

Pagination or Media Count:

13

Abstract:

This report describes a series of experiments involving expansion of a domain-specific, human-generated seed list using available linguistic resources. The resources used for the expansion are intended to be general purpose: two large-scale Chinese-English dictionaries and a Chinese lexical knowledge base, HowNet. The methodology involves three steps: (1) hand extraction of head words from each entry in the human-generated seed list; (2) automatic comparison of these head words against entries in the linguistic resources, where an entry matches if the head word matches the entry exactly or is included in its semantic definition; and (3) collection of any resulting matching entries into a larger term list. The terms extracted by this process were verified manually to confirm whether they were relevant to the topic of a specific domain. An important contribution of this work is the finding that the use of a bilingual term list for the expansion process does not provide a significant improvement over the use of a simpler, more easily produced, monolingual term list.
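
Steps 2 and 3 of the methodology can be illustrated with a minimal Python sketch. It is not the report's implementation: the Entry type, the expand_term_list function, and the toy English data are hypothetical stand-ins, and the "included in its semantic definition" test is assumed here to be a plain substring check. The hand-extracted head words from step 1 are taken as given input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical resource entry: a term plus its semantic definition."""
    term: str
    definition: str


def expand_term_list(head_words, resource):
    """Collect resource entries that match any head word.

    An entry matches if a head word equals the entry's term exactly or
    occurs inside its definition (assumed: simple substring containment).
    """
    matches = []
    for entry in resource:
        for head in head_words:
            if entry.term == head or head in entry.definition:
                matches.append(entry)
                break  # one matching head word is enough for this entry
    return matches


# Toy data for illustration only; not drawn from the report's resources.
resource = [
    Entry("vaccine", "preparation that prevents disease"),
    Entry("hospital", "institution for treating disease"),
    Entry("river", "large natural stream of water"),
]
head_words = ["vaccine", "disease"]  # step 1 output, extracted by hand

expanded = expand_term_list(head_words, resource)
expanded_terms = [entry.term for entry in expanded]
```

The resulting expanded_terms list would then go to manual verification for domain relevance, as the abstract describes.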

Subject Categories:

  • Linguistics

Distribution Statement:

APPROVED FOR PUBLIC RELEASE