ADAPTIVE TECHNIQUES AS APPLIED TO TEXTUAL DATA RETRIEVAL.
DOUGLAS AIRCRAFT CO INC NEWPORT BEACH CALIF
Document screening is considered as the formation of a classification vector whose coordinates relate the pertinence of the document to various subject fields. This vector is the resultant of vectors assigned to key words and phrases in the document. Adaptive techniques are described which assign vectors to key words on the basis of their occurrence in a corpus of preclassified documents. The results of experiments, conducted to determine the accuracy with which these adaptive techniques approximate key word relevance vectors assigned by expert judgment, are presented. An overall system design philosophy was developed which includes an automatic input mechanism and a device for monitoring the key word relevance vectors to retard system obsolescence. Methods for the hardware implementation of the system are discussed, and computer simulation results on one type of extremely efficient character recognition logic are described. A statistical model was formulated for the analysis of the document screening device. The model is used to predict system error rates as a function of the document corpora, the obsolescence of the relevance vectors, the accuracy of the input device, and the variation of other system design parameters. (Author)
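
The screening scheme summarized in the abstract can be illustrated with a minimal sketch. The abstract does not state the adaptive rule actually used, so this sketch assumes a simple one: each key word's relevance vector is the fraction of its containing documents that fall in each subject field of a preclassified corpus. The document's classification vector is then the resultant (component-wise sum) of the vectors of its key words. The function names, the corpus layout, and the example words are all hypothetical and are not taken from the report.

```python
from collections import defaultdict


def learn_keyword_vectors(corpus, num_fields):
    """Assign a relevance vector to every key word in a preclassified corpus.

    corpus: list of (tokens, field_index) pairs, one per document.
    ASSUMPTION: the vector is the relative document frequency of the word
    in each subject field; the report's actual adaptive rule is not given.
    """
    counts = defaultdict(lambda: [0] * num_fields)
    for tokens, field in corpus:
        # Count each word once per document, regardless of repetitions.
        for word in set(tokens):
            counts[word][field] += 1

    vectors = {}
    for word, per_field in counts.items():
        total = sum(per_field)
        vectors[word] = [c / total for c in per_field]
    return vectors


def classification_vector(tokens, vectors, num_fields):
    """Form the resultant of the relevance vectors of a document's key words.

    Words with no learned vector are ignored.
    """
    resultant = [0.0] * num_fields
    for word in tokens:
        vec = vectors.get(word)
        if vec is not None:
            for i, component in enumerate(vec):
                resultant[i] += component
    return resultant


if __name__ == "__main__":
    # Hypothetical two-field corpus: field 0 and field 1.
    corpus = [
        (["radar", "antenna"], 0),
        (["radar", "signal"], 0),
        (["enzyme", "protein"], 1),
        (["signal", "protein"], 1),
    ]
    vectors = learn_keyword_vectors(corpus, num_fields=2)
    print(classification_vector(["radar", "signal", "unknown"], vectors, 2))
```

In this sketch the largest coordinate of the resultant indicates the subject field to which the document is most pertinent. Re-running `learn_keyword_vectors` on a refreshed corpus is one simple way to picture the monitoring of relevance vectors against obsolescence, though the report's own monitoring device may work differently.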