A Generative Theory of Relevance
MASSACHUSETTS UNIV AMHERST
We present a new theory of relevance for the field of Information Retrieval. Relevance is viewed as a generative process, and we hypothesize that both user queries and relevant documents represent random observations from that process. Based on this view, we develop a formal retrieval model that has direct applications to a wide range of search scenarios. The new model substantially outperforms strong baselines on the tasks of ad-hoc retrieval, cross-language retrieval, handwriting retrieval, automatic image annotation, video retrieval, and topic detection and tracking. The empirical success of our approach is due to a new technique we propose for modeling exchangeable sequences of discrete random variables. The new technique represents an attractive counterpart to existing formulations, such as multinomial mixtures, pLSI and LDA: it is effective, easy to train, and makes no assumptions about the geometric structure of the data.
- Information Science