Accession Number:

ADA581494

Title:

PRIS at 2012 TREC Medical Track: Query Expansion, Retrieval and Ranking

Descriptive Note:

Conference paper

Corporate Author:

BEIJING UNIV OF POSTS AND TELECOMMUNICATIONS (CHINA)

Report Date:

2012-11-01

Pagination or Media Count:

4

Abstract:

The official datasets are in XML format, so they must be parsed before indexing. We chose Lucene as our tool for indexing and searching, and selected Jakarta Commons Digester (hereafter referred to as Digester) to parse the XML documents. Digester converts each XML document into a Java object, from which we then read the fields we need. In addition, we process the reporttext tag in the XML documents to extract the patient's age and sex, which are very important fields for the search task.
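
As a rough illustration of the reporttext step described above, the sketch below pulls age and sex out of a report's free text. It is not the authors' code: it uses the JDK's built-in DOM parser in place of Digester so that it runs without extra libraries, and the class name ReportFields, the sample XML layout, and both regular expressions are assumptions made for this example only.

```java
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: extracts age and sex from the <reporttext> element.
class ReportFields {
    // Assumed phrasing such as "78-year-old", "78 yr", "78 y/o".
    private static final Pattern AGE =
        Pattern.compile("(\\d{1,3})[\\s-]*(?:year|yr|y/o|yo)", Pattern.CASE_INSENSITIVE);
    // Assumed sex keywords; \b keeps "male" from matching inside "female".
    private static final Pattern SEX =
        Pattern.compile("\\b(female|male|woman|man)\\b", Pattern.CASE_INSENSITIVE);

    // Parses the XML string and returns the text of the first <reporttext>, or "".
    static String reportText(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        NodeList nodes = doc.getElementsByTagName("reporttext");
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent();
    }

    // Returns the first age mentioned in the text, or null if none is found.
    static Integer age(String text) {
        Matcher m = AGE.matcher(text);
        return m.find() ? Integer.valueOf(m.group(1)) : null;
    }

    // Returns "F" or "M" for the first sex keyword found, or null if none is found.
    static String sex(String text) {
        Matcher m = SEX.matcher(text);
        if (!m.find()) {
            return null;
        }
        String word = m.group(1).toLowerCase();
        return (word.equals("female") || word.equals("woman")) ? "F" : "M";
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample record; real track documents contain more elements.
        String xml = "<report><reporttext>A 78-year-old female with chest pain."
                   + "</reporttext></report>";
        String text = reportText(xml);
        System.out.println("age=" + age(text) + " sex=" + sex(text));
    }
}
```

The extracted values could then be stored as separate fields on each indexed document, so that a query specifying an age range or a sex can filter on them directly.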

Subject Categories:

  • Information Science

Distribution Statement:

APPROVED FOR PUBLIC RELEASE