Accession Number:

ADA517770

Title:

Sparse Matrix Factorization: Applications to Latent Semantic Indexing

Descriptive Note:

Conference paper

Corporate Author:

SASKATCHEWAN UNIV SASKATOON

Report Date:

2009-11-01

Pagination or Media Count:

9

Abstract:

This article describes the use of Latent Semantic Indexing (LSI) and some of its variants for the TREC Legal batch task. Both folding-in and Essential Dimensions of LSI (EDLSI) appeared as if they might be successful for recall-focused retrieval on a collection of this size. Furthermore, we developed a new LSI technique, one which replaces the Singular Value Decomposition (SVD) with another technique for matrix factorization, the sparse column-row approximation (SCRA). We were able to conclude that all three LSI techniques have similar performance. Although our 2009 results showed significant improvement when compared to our 2008 results, the use of a better method for selection of the parameter K, which is the ranking that results in the best balance between precision and recall, appears to have provided the most benefit.

Subject Categories:

  • Information Science
  • Linguistics

Distribution Statement:

APPROVED FOR PUBLIC RELEASE