Accession Number : ADA466965


Title :   Content Based Retrieval Database Management System with Support for Similarity Searching and Query Refinement


Descriptive Note : Doctoral thesis


Corporate Author : ILLINOIS UNIV AT URBANA DEPT OF COMPUTER SCIENCE


Personal Author(s) : Ortega-Binderberger, Michael


Full Text : https://apps.dtic.mil/dtic/tr/fulltext/u2/a466965.pdf


Report Date : Jan 2002


Pagination or Media Count : 184


Abstract : With the emergence of many application domains that require imprecise, similarity-based access to information, techniques to support such a retrieval paradigm over database systems have emerged as a critical area of research. This thesis explores how to enhance database systems with content-based search over arbitrary abstract data types in a similarity-based framework with query refinement. This scope raises a number of challenges not previously faced by databases, among them:
* Extension of abstract data types to support arbitrary similarity functions and query refinement. (Intra-type similarity and feedback)
* Extension of the query refinement models already developed under the MARS system to a general multi-table relational model. (Inter-type similarity and feedback)
* Extension of query processing from a set-based model, in which tuples either satisfy the query predicate or do not, to a model in which the degree to which tuples satisfy a predicate is represented by their similarity values. (Similarity predicates)
* Return of only the best k matches, based on the similarity values. This implies sorting on the similarity values and leaves ample room for optimization: lazy evaluation can compute only those answers that the user will see. (Ranked retrieval)
* Optimization of query execution under the similarity conditions, which requires access to specialized indices. Building on earlier work in the MARS project, composite predicates can be merged efficiently by computing the similarity value for a predicate from independent streams rather than using the value directly. (Incremental top-k merging)
We are building a prototype system that implements the proposed functionality efficiently, and we evaluate the quality of the answers returned to the user.
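Illustrative note (not part of the catalogued record): the ranked-retrieval idea in the abstract can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the thesis's or the MARS system's implementation. The names `top_k` and `closeness` and the sample data are hypothetical. The sketch still scores every item once; it avoids only a full sort by keeping the k best candidates, and it uses no specialized index.

```python
import heapq
from typing import Callable, Iterable, List, Tuple


def top_k(
    items: Iterable[float],
    query: float,
    similarity: Callable[[float, float], float],
    k: int,
) -> List[Tuple[float, float]]:
    """Return the k (score, item) pairs with the highest similarity to query.

    Hypothetical helper: instead of a boolean predicate, each item receives
    a graded similarity score, and only the best k are kept.
    """
    # Generator: scores are computed one at a time as nlargest consumes them.
    scored = ((similarity(query, item), item) for item in items)
    # nlargest keeps at most k candidates in memory and returns them
    # ordered from highest to lowest score.
    return heapq.nlargest(k, scored, key=lambda pair: pair[0])


def closeness(a: float, b: float) -> float:
    """Assumed toy similarity in (0, 1]: 1.0 for equal values, lower when far apart."""
    return 1.0 / (1.0 + abs(a - b))


if __name__ == "__main__":
    for score, item in top_k([1, 5, 9, 12], query=10, similarity=closeness, k=2):
        print(item, round(score, 3))
```

In this toy setting the similarity function is passed in as a parameter, which loosely mirrors the abstract's point that data types should support arbitrary similarity functions.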


Descriptors :   *DATA BASES , *INFORMATION RETRIEVAL , ALGORITHMS , OPTIMIZATION , MODELS , THESES


Subject Categories : Information Science


Distribution Statement : APPROVED FOR PUBLIC RELEASE