The Hybrid Tree: An Index Structure for High Dimensional Feature Spaces (Preprint)

Chakrabarti, Kaushik; Mehrotra, Sharad

Active / Technical Report | Accession Number: ADA466131

Abstract:

Feature based similarity search is emerging as an important search paradigm in database systems. The technique used is to map the data items as points into a high dimensional feature space, which is indexed using a multidimensional data structure. Similarity search then corresponds to a range search over the data structure. Although several data structures have been proposed for feature indexing, none of them is known to scale beyond 10-15 dimensional spaces. This paper introduces the hybrid tree, a multidimensional data structure for indexing high dimensional feature spaces. Unlike other multidimensional data structures, the hybrid tree cannot be classified as either a pure data partitioning (DP) index structure (e.g., R-tree, SS-tree, SR-tree) or a pure space partitioning (SP) one (e.g., KDB-tree, hB-tree); rather, it combines positive aspects of the two types of index structures into a single data structure to achieve search performance more scalable to high dimensionalities than either of the above techniques (hence the name "hybrid"). Furthermore, unlike many data structures (e.g., distance based index structures like SS-tree, SR-tree), the hybrid tree can support queries based on arbitrary distance functions. Our experiments on real high dimensional large size feature databases demonstrate that the hybrid tree scales well to high dimensionality and large database sizes. It significantly outperforms both purely DP-based and SP-based index mechanisms as well as linear scan at all dimensionalities for large sized databases.
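
The search paradigm the abstract describes (data items mapped to points in a feature space, similarity search expressed as a range search, and a distance function chosen per query) can be illustrated with a minimal sketch. This is not the hybrid tree: it is the linear scan baseline the abstract mentions, written only to show what a range query with a pluggable distance function computes. The names `range_search`, `l1`, `l2`, and the sample `points` are assumptions made for illustration and do not come from the report.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Distance = Callable[[Vector, Vector], float]


def l1(a: Vector, b: Vector) -> float:
    """Manhattan (L1) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def l2(a: Vector, b: Vector) -> float:
    """Euclidean (L2) distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def range_search(points: List[Vector], query: Vector,
                 radius: float, dist: Distance) -> List[int]:
    """Linear scan baseline: indices of all points within `radius` of `query`.

    The distance function is a parameter, so the same data can be
    queried under different notions of similarity.
    """
    return [i for i, p in enumerate(points) if dist(p, query) <= radius]


# Hypothetical 3-dimensional feature vectors (illustrative data only).
points = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (0.2, 0.1, 0.0), (3.0, 0.0, 0.0)]
query = (0.0, 0.0, 0.0)

near_l1 = range_search(points, query, 2.0, l1)
near_l2 = range_search(points, query, 2.0, l2)
```

With the same radius, the two distance functions select different result sets, which is why an index tied to a single metric cannot serve both queries. A linear scan costs one distance computation per stored point; an index structure aims to answer the same query while examining far fewer points.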

Author(s):

Chakrabarti, Kaushik ; Mehrotra, Sharad

Author Organization(s):

ILLINOIS UNIV AT URBANA DEPT OF COMPUTER SCIENCE

Descriptive Note:

Conference paper

Supplementary Note:

Sponsored in part by the National Science Foundation, Grant No. IIS-9734300. A version of this paper was presented at the International Conference on Data Engineering (15th), held in Sydney, Australia on 23-26 Mar 1999, and published in the proceedings of the same, p440-447, 1999.

Pagination:

0009

Security Markings

DOCUMENT & CONTEXTUAL SUMMARY

Distribution:

Approved For Public Release

Distribution Statement:

Approved For Public Release; Distribution Is Unlimited.

RECORD

Collection: TR

Identifying Numbers

Contract/Grant Number(s):

DAAL01-96-2-0003

Monitor Series:

ARL

Subject Terms

Joint Capability Areas:

JCA_1_Force Support; JCA_1.2.7_Experimentation; JCA_1.3_Human Capital Management; JCA_1.3.1_Personnel and Family Support; JCA_5_Command and Control; JCA_5.3_Planning; JCA_6_Net Centric; JCA_6.2.3_Core Enterprise Services; JCA_6.2_Enterprise Services; JCA_8_Building Partnerships; JCA_1.2_Force Preparation; JCA_8.1.1_Inform Domestic and Foreign Audiences; JCA_8.1_Communicate

Communities of Interest:

Air Platforms

Descriptor(s):

*DATA MANAGEMENT, *INFORMATION RETRIEVAL, DATA BASES, COMPUTER LOGIC

Field(s)/Group(s):

Operations Research, Information Science, Computer Programming and Software

Keyword(s):

FEATURE BASED SIMILARITY SEARCHES, MULTIMEDIA FEATURE INDEXING

Report Date:

1998 Jan 01

Creation Date:

2007 Jun 13