University of Glasgow at TREC 2008: Experiments in Blog, Enterprise, and Relevance Feedback Tracks with Terrier
GLASGOW UNIV (UNITED KINGDOM)
In TREC 2008, we participate in the Blog, Enterprise, and Relevance Feedback tracks. In all tracks, we continue the research and development of the Terrier platform, centred on extending state-of-the-art weighting models based on the Divergence From Randomness (DFR) framework. In particular, we investigate two main themes: proximity-based models, and collection and profile enrichment techniques based on several resources. In the Blog track, we aim to improve our opinion detection techniques and to integrate various new blog-specific features into our Voting Model. For the baseline ad-hoc task, we aim to build strongly performing baselines by applying two different techniques. The first boosts documents in which query terms co-occur within a given window size, and the second applies query expansion using collection enrichment. Non-English documents are also removed from the retrieved results. In the opinion-finding task, we experiment with two main opinion detection approaches. The first improves our TREC 2007 dictionary-based approach by automatically building an internal opinion dictionary from the collection itself. We measure the opinionated discriminability of each term using an information-theoretic divergence measure based on the relevance assessments of previous years. The second approach is based on the OpinionFinder tool, which identifies subjective sentences in text. In particular, we introduce a novel method to measure the informativeness of query terms occurring in close proximity to subjective sentences. In the blog distillation task, we have two research themes.
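The window-based boosting mentioned for the baseline ad-hoc task can be illustrated with a minimal sketch. This is not the DFR proximity model used in the experiments, and it does not use the Terrier API; it is a simple additive boost written for illustration only. The function names, the default window size, and the `weight` parameter are all assumptions.

```python
from itertools import combinations


def cooccurrence_windows(tokens, term_a, term_b, window_size):
    """Count sliding windows of `window_size` tokens that contain both terms."""
    if window_size < 2 or not tokens:
        return 0
    # A document shorter than the window is treated as a single window.
    n_windows = max(len(tokens) - window_size + 1, 1)
    count = 0
    for start in range(n_windows):
        window = tokens[start:start + window_size]
        if term_a in window and term_b in window:
            count += 1
    return count


def proximity_boost(tokens, query_terms, window_size=5):
    """Sum the co-occurrence window counts over all unordered query term pairs."""
    pairs = combinations(sorted(set(query_terms)), 2)
    return sum(cooccurrence_windows(tokens, a, b, window_size) for a, b in pairs)


def boosted_score(base_score, tokens, query_terms, window_size=5, weight=0.1):
    """Hypothetical additive combination of a base retrieval score and the boost."""
    return base_score + weight * proximity_boost(tokens, query_terms, window_size)


if __name__ == "__main__":
    doc = "the terrier platform supports blog search and blog opinion search".split()
    print(boosted_score(1.0, doc, ["blog", "search"], window_size=3, weight=0.5))
```

In this sketch, a document in which the query terms appear close together in many windows receives a larger boost than one in which they are scattered; a real weighting model would normalise these counts rather than add them directly.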
- Information Science