POSTECH at TREC 2009 Blog Track: Top Stories Identification
POHANG UNIV OF SCIENCE AND TECHNOLOGY (SOUTH KOREA) DEPT OF ELECTRONIC AND ELECTRICAL ENGINEERING
This paper describes our participation in the TREC 2009 Blog Track. Our system consists of a query likelihood component and a news headline prior component, both based on the language model framework. For the query likelihood, we propose several approaches to estimating the query language model and the news headline language model. We also suggest two approaches to choosing the 10 supporting relevant posts: Feed-Based Selection and Cluster-Based Selection. Furthermore, we propose two criteria for estimating the news headline prior for a given day. Experimental results show that using the prior significantly improves performance on the task.
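As a rough illustration of how a query likelihood and a headline prior can be combined in a language model framework, the sketch below ranks headlines by log P(query | headline) + log P(headline). It is not the system described in the paper. The Dirichlet smoothing, the parameter `mu`, the function names, and the toy data are all assumptions made for this example; the abstract does not specify how either component is estimated.

```python
import math
from collections import Counter


def dirichlet_log_likelihood(query_terms, doc_terms, collection_counts,
                             collection_len, mu=1000.0):
    """Return log P(query | doc) under a unigram language model.

    Dirichlet smoothing is an assumption of this sketch, not a detail
    taken from the paper.
    """
    doc_counts = Counter(doc_terms)
    doc_len = len(doc_terms)
    log_p = 0.0
    for term in query_terms:
        # Background (collection) probability, floored to avoid log(0).
        p_bg = collection_counts.get(term, 0) / collection_len
        p_bg = max(p_bg, 1e-9)
        p = (doc_counts[term] + mu * p_bg) / (doc_len + mu)
        log_p += math.log(p)
    return log_p


def rank_headlines(query_terms, headlines, priors, collection_counts,
                   collection_len, mu=1000.0):
    """Rank headline ids by log-likelihood plus log-prior, best first.

    `headlines` maps a hypothetical headline id to its token list and
    `priors` maps the same ids to a positive prior probability.
    """
    scored = []
    for headline_id, terms in headlines.items():
        log_lik = dirichlet_log_likelihood(
            query_terms, terms, collection_counts, collection_len, mu)
        scored.append((headline_id, log_lik + math.log(priors[headline_id])))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data, invented for illustration only.
headlines = {
    "h1": ["obama", "election"],
    "h2": ["sports", "game"],
}
collection_counts = Counter(t for terms in headlines.values() for t in terms)
collection_len = sum(collection_counts.values())

uniform = rank_headlines(["obama"], headlines, {"h1": 0.5, "h2": 0.5},
                         collection_counts, collection_len)
skewed = rank_headlines(["obama"], headlines, {"h1": 0.1, "h2": 0.9},
                        collection_counts, collection_len)
```

With a uniform prior the ranking is decided by the likelihood alone, whereas a strongly skewed prior can reorder the headlines, which is the kind of effect a headline prior is meant to have.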
- Information Science
- Computer Programming and Software
- Test Facilities, Equipment and Methods