IRRA at TREC 2012: Divergence From Independence (DFI)
MUGLA UNIV (TURKEY) DEPT OF ELECTRONIC AND COMPUTER EDUCATION
The IRRA (IR-Ra) group participated in the 2012 Web track with a system implementing a non-parametric term weighting method based on measuring the divergence from independence (DFI). This is the third year of participation for the IRRA group, following its participation in the TREC 2009 and 2010 Web tracks. This year, the aim is to evaluate a new DFI-based term weighting model developed on the basis of Shannon's information theory (Shannon, 1949), along with a heuristic approach that is expected to improve early precision when used together with DFI term weighting. The TERRIER retrieval platform, version 3.0 (Ounis et al., 2007), is used to index and search the ClueWeb09-T09B data set (the Category B data set, a subset of about 50 million Web pages in English). During indexing and searching, terms are stemmed (Porter's stemmer as implemented in TERRIER) but not stopped. The result sets are filtered using the fusion of two spam-page lists provided by Cormack et al. (2010) for the ClueWeb09 document collection.
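To illustrate the general idea behind divergence from independence, the following minimal Python sketch scores a term by how far its observed frequency in a document exceeds the frequency expected if term and document were independent. It is not the new Shannon-based model evaluated in this report, whose formula is not given in this abstract. It assumes a standardization-style DFI variant, and the function and parameter names (`expected_tf`, `dfi_score`, `collection_tf`, `collection_len`, `query_tf`) are hypothetical.

```python
import math


def expected_tf(collection_tf, doc_len, collection_len):
    """Frequency expected in the document if term and document were independent.

    collection_tf: total occurrences of the term in the collection
    doc_len: number of tokens in the document
    collection_len: number of tokens in the whole collection
    """
    return collection_tf * doc_len / collection_len


def dfi_score(tf, collection_tf, doc_len, collection_len, query_tf=1.0):
    """Illustrative DFI-style score (assumed standardization variant).

    Returns 0 when the term occurs no more often than independence predicts,
    so only terms over-represented in the document contribute to the score.
    """
    e = expected_tf(collection_tf, doc_len, collection_len)
    if tf <= e:
        return 0.0
    # Standardized deviation of the observed frequency from the expected one.
    z = (tf - e) / math.sqrt(e)
    return query_tf * math.log2(z + 1.0)
```

For example, a term with 1000 occurrences in a 100000-token collection is expected to occur 4 times in a 400-token document; an observed frequency of 5 then gives a small positive score, while an observed frequency of 3 scores zero. No collection-dependent parameter needs tuning, which is the sense in which such a weighting is non-parametric.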
- Information Science
- Computer Programming and Software