QUANTIFICATION OF INFORMATION STORAGE AND RETRIEVAL METHODOLOGIES
Interim rept. 15 Sep 1969-14 May 1970
ANALYTICS INC WILLOW GROVE PA
This paper presents the results of a set of Monte Carlo computations designed to show how the efficiency of probabilistic information retrieval systems behaves as a function of human-variability noise. The independent variable is the total amount of noise, that is, the combined noise introduced in indexing documents and in formulating requests. The effect of noise is measured by the fraction of the file that must be retrieved in order to obtain the document that would be retrieved first in the absence of noise. Computations are made for an idealized system in which the index and request vectors are normalized and uniformly distributed; however, the method could accommodate other distributions. The results show how, for a fixed amount of noise, the depth of file search decreases as the number of index categories increases, for each constant ratio of query terms to index categories in the space. They also show how, for a fixed number of index categories, the fraction of the file searched decreases as the number of index terms in the query increases.
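The kind of computation described above can be illustrated with a minimal Python sketch. It is not the report's procedure: the abstract does not state the matching function, the noise model, or the file size, so the sketch assumes dot-product matching of unit-length vectors, additive Gaussian noise, and hypothetical names throughout (`search_depth`, `mean_search_depth`, `index_noise`, `request_noise`). It measures the fraction of the file that must be retrieved to reach the document that ranks first without noise.

```python
import numpy as np


def normalize(v):
    """Scale each vector (along the last axis) to unit Euclidean length."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)


def search_depth(n_docs, n_categories, n_query_terms,
                 index_noise, request_noise, rng):
    """One Monte Carlo trial: fraction of the file retrieved to reach the target.

    Assumptions (not taken from the report): dot-product matching and
    additive Gaussian noise on both index and request vectors.
    """
    # Noise-free index vectors: uniform components, then normalized.
    docs = normalize(rng.uniform(0.0, 1.0, size=(n_docs, n_categories)))

    # Noise-free request: weights on a random subset of the categories.
    terms = rng.choice(n_categories, size=n_query_terms, replace=False)
    query = np.zeros(n_categories)
    query[terms] = rng.uniform(0.0, 1.0, size=n_query_terms)
    query = normalize(query)

    # The target is the document ranked first when there is no noise.
    target = int(np.argmax(docs @ query))

    # Perturb indexing and request formulation, then renormalize.
    noisy_docs = normalize(docs + index_noise * rng.normal(size=docs.shape))
    noisy_query = normalize(query + request_noise * rng.normal(size=query.shape))

    # Rank of the target in the noisy ordering (1 = retrieved first).
    scores = noisy_docs @ noisy_query
    rank = 1 + int(np.sum(scores > scores[target]))
    return rank / n_docs


def mean_search_depth(n_docs, n_categories, n_query_terms,
                      index_noise, request_noise, n_trials=100, seed=0):
    """Average the search depth over independent trials."""
    rng = np.random.default_rng(seed)
    depths = [
        search_depth(n_docs, n_categories, n_query_terms,
                     index_noise, request_noise, rng)
        for _ in range(n_trials)
    ]
    return float(np.mean(depths))


if __name__ == "__main__":
    # Hold the ratio of query terms to index categories at 1/4 and vary the
    # number of categories, with the noise level fixed.
    for n_categories in (8, 16, 32, 64):
        depth = mean_search_depth(500, n_categories, n_categories // 4,
                                  index_noise=0.1, request_noise=0.1)
        print(n_categories, round(depth, 4))
```

With both noise levels set to zero, the target is always retrieved first, so the search depth equals one document out of the file. Other distributions can be tried by replacing the `rng.uniform` calls, in line with the remark that the method could accommodate them.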
- Information Science