Accession Number:

ADA164025

Title:

Absolute Bounds on Set Intersection and Union Sizes from Distribution Information,

Descriptive Note:

Corporate Author:

NAVAL POSTGRADUATE SCHOOL MONTEREY CA

Personal Author(s):

Report Date:

1985-09-01

Pagination or Media Count:

51.0

Abstract:

Estimation of set intersection and union sizes is important for access method selection for a database. Absolute bounds on sizes are often much easier to compute than size estimates, requiring no distributional or independence assumptions, and can answer many of the same needs. We present a large compendium of quick closed-form bounds on set intersection and union sizes, each applying to a different situation they can be expressed as rules, and managed by rule-based or knowledge-base architecture. These methods use general-purpose statistics precomputed on the data, and exploit homomorphisms onto mappings of the data items onto distributions that can be more easily analyzed. Our methods can be used anytime, but tend to work best when there are strong or complex correlations in the data. This circumstance is poorly addressed by the standard methods of independence-assumption and distributional-assumption estimates, and hence our methods fill a need. Keywords Databases, Query processing, Statistical computing, Statistical inequalities, Sets, Boolean algebra, Estimation.

Subject Categories:

  • Computer Hardware

Distribution Statement:

APPROVED FOR PUBLIC RELEASE