Entity Retrieval by Hierarchical Relevance Model, Exploiting the Structure of Tables and Learning Homepage Classifiers
PURDUE UNIV LAFAYETTE IN DEPT OF COMPUTER SCIENCES
This paper gives an overview of our work for the TREC 2009 Entity track. We propose a hierarchical relevance retrieval model for entity ranking. The model examines three levels of relevance: document, passage, and entity. The final ranking score is a linear combination of the relevance scores from the three levels. Furthermore, we exploit the structure of tables and lists to identify target entities in them by making a joint decision on all entities that share the same attribute. To find entity homepages, we train a logistic regression model for each entity type. A set of templates and filtering rules is also used to identify target entities. The key lessons we learned from participating in this year's Entity track are: (1) our special treatment of table and list data pays off well; (2) high accuracy in homepage finding is crucial in this track; and (3) Wikipedia can serve as a valuable knowledge resource for different aspects of the related entity finding task.
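The linear combination of document, passage, and entity relevance described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the weights, the scoring functions, or any data structures, so the `Candidate` class, the `combined_score` and `rank` functions, and the weight values below are hypothetical placeholders chosen only for illustration, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate entity with one relevance score per level (all hypothetical)."""
    name: str
    doc_score: float      # relevance of the supporting document
    passage_score: float  # relevance of the supporting passage
    entity_score: float   # relevance of the entity itself


def combined_score(c, weights=(0.3, 0.3, 0.4)):
    """Linear combination of the three level scores.

    The default weights are assumed values; the abstract does not report them.
    """
    w_doc, w_pas, w_ent = weights
    return w_doc * c.doc_score + w_pas * c.passage_score + w_ent * c.entity_score


def rank(candidates, weights=(0.3, 0.3, 0.4)):
    """Return candidates sorted by combined score, highest first."""
    return sorted(candidates, key=lambda c: combined_score(c, weights), reverse=True)


if __name__ == "__main__":
    pool = [
        Candidate("entity_a", doc_score=0.9, passage_score=0.2, entity_score=0.1),
        Candidate("entity_b", doc_score=0.5, passage_score=0.5, entity_score=0.8),
    ]
    for c in rank(pool):
        print(c.name, round(combined_score(c), 3))
```

In this sketch a candidate with a strong document score but weak passage and entity scores ranks below one with balanced scores, which is the kind of trade-off a weighted combination across levels is meant to capture. In practice the weights would be tuned on training topics.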
- Information Science
- Computer Programming and Software
- Test Facilities, Equipment and Methods