The Harmonic Sieve: A Novel Application of Fourier Analysis to Machine Learning Theory and Practice.
CARNEGIE-MELLON UNIV PITTSBURGH PA SCHOOL OF COMPUTER SCIENCE
Pagination or Media Count:
This thesis presents new positive results--both theoretical and empirical--in machine learning. The primary learning-theoretic contribution is the Harmonic Sieve, the first efficient algorithm for learning the well-studied class of Disjunctive Normal Form DNF expressions learning is accomplished within the Probably Approximately Correct model with respect to the uniform distribution using membership queries. Of particular interest is the novel use of Fourier methods within the algorithm. Specifically, all prior Fourier-based learning algorithms focused on finding large Fourier coefficients of the function to be learned the target. The Harmonic Sieve departs from this paradigm it instead learns by finding large coefficients of certain functions other than the target. The robustness of this new Fourier technique is illustrated by applying it to prove learnability of noisy DNF expressions, of a circuit class that is even more expressive than DNF, and of an interesting class of geometric concepts. Empirically, the thesis demonstrates the significant practical potential of a classification-learning algorithm closely related to the Harmonic Sieve. The Boosting-based Perceptron BBP learning algorithm produces classifiers that are nonlinear perceptrons weighted thresholds over higher-order features. On several previously-studied machine learning benchmarks, the BBP algorithm produces classifiers that achieve accuracies essentially equivalent to or even better than the best previously-reported classifiers. Additionally, the perceptrons produced by the BBP algorithm tend to be relatively intelligible, an important feature in many machine learning applications. In a related vein, BBP and the Harmonic Sieve are applied successfully to the problem of rule extraction, that is, the problem of approximating an unintelligible classifier by a more intelligible function.