Accession Number:

ADA509176

Title:

Use of Probabilistic Topic Models for Search

Descriptive Note:

Master's thesis

Corporate Author:

NAVAL POSTGRADUATE SCHOOL MONTEREY CA

Personal Author(s):

Report Date:

2009-09-01

Pagination or Media Count:

91

Abstract:

This thesis addresses a common problem in search applications: the user typically does not know exactly which terms appear in the document being sought. Several attempts have been made to overcome this problem by augmenting the document model and/or the query. In this thesis, a probabilistic topic model augments the document model. Probabilistic document models are formally introduced and inference methods are derived. It is shown how these models can be used for information retrieval tasks and how a search application can be implemented. A prototype was implemented, then tested and evaluated on benchmark corpora. The evaluation provides empirical evidence that probabilistic document models improve retrieval performance significantly, and shows which preprocessing steps should be taken before the model is applied.

Subject Categories:

  • Information Science
  • Statistics and Probability

Distribution Statement:

APPROVED FOR PUBLIC RELEASE