Accession Number:

ADA440135

Title:

A Generative Theory of Relevance

Descriptive Note:

Doctoral thesis

Corporate Author:

MASSACHUSETTS UNIV AMHERST

Personal Author(s):

Report Date:

2004-09-01

Pagination or Media Count:

131

Abstract:

We present a new theory of relevance for the field of Information Retrieval. Relevance is viewed as a generative process, and we hypothesize that both user queries and relevant documents represent random observations from that process. Based on this view, we develop a formal retrieval model that has direct applications to a wide range of search scenarios. The new model substantially outperforms strong baselines on the tasks of ad-hoc retrieval, cross-language retrieval, handwriting retrieval, automatic image annotation, video retrieval, and topic detection and tracking. The empirical success of our approach is due to a new technique we propose for modeling exchangeable sequences of discrete random variables. The new technique represents an attractive counterpart to existing formulations, such as multinomial mixtures, pLSI and LDA: it is effective, easy to train, and makes no assumptions about the geometric structure of the data.

Subject Categories:

  • Information Science

Distribution Statement:

APPROVED FOR PUBLIC RELEASE