Accession Number:

ADA461521

Title:

Five Performance Enhancements for Hybrid Hash Join

Descriptive Note:

Corporate Author:

COLORADO UNIV AT BOULDER DEPT OF COMPUTER SCIENCE

Personal Author(s):

Report Date:

1992-07-01

Pagination or Media Count:

32.0

Abstract:

In this paper, we focus on set matching algorithms such as intersection, difference, union, and relational join, using join as a representative for all these matching problems. We discuss five performance enhancements for hybrid hash join algorithms, namely data compression, large cluster sizes and multi-level recursion, role reversal of build and probe inputs, histogram methods to exploit non-uniform data and hash value distributions skew, and join algorithms for multiple inputs. While each of the enhancements is fairly simple, the most surprising result is that hash value skew can be exploited and improve performance rather than being a danger to hybrid hash join performance as conventionally thought. Our design for hash-based N-way matching algorithms is a dual to pipelining data without intermediate sorting between multiple merge-joins on the same attribute interesting orderings, and exceeds its performance advantages. Each of the performance enhancements can be used by itself or they can be combined with each other as well as with parallel query execution techniques. The cumulative effect of the optimizations is that hybrid hash join will almost always be the set matching algorithm of choice, even in situations for which earlier research had recommended sorting and merge-join. DATABASE QUERY PROCESSING, SET MATCHING, HYBRID HASH JOIN, TUNING, DATA COMPRESSION, IO SPEED, FAN-OUT, RECURSION DEPTH, ROLE REVERSAL, NON-UNIFORMITY, HISTOGRAMS, INTERESTING ORDERINGS, N-WAY PARTITIONING,

Subject Categories:

  • Computer Programming and Software

Distribution Statement:

APPROVED FOR PUBLIC RELEASE