Accession Number:

ADA361637

Title:

Parallel Data Mining with the Message Passing Interface Standard on Clusters of Personal Computers.

Descriptive Note:

Master's thesis

Corporate Author:

AIR FORCE INST OF TECH WRIGHT-PATTERSON AFB OH SCHOOL OF ENGINEERING

Personal Author(s):

Report Date:

1999-03-01

Pagination or Media Count:

137

Abstract:

Piles of personal computers (PoPCs) have begun to challenge the performance of the traditional Massively Parallel Processors (MPPs) and the less traditional networks of workstations (NOWs) as platforms for parallel computing. Large clusters of PCs have reached and at times exceeded the performance of modern MPPs at a fraction of the cost. Built with commodity components, these clusters can be constructed for about half the cost of a comparable NOW. The primary competing operating systems (OSs) in use on PoPCs are Linux and Windows NT. This thesis compares the performance of an NT cluster with that of a Linux cluster, a NOW, and an MPP. It also compares the MPI tools available for NT. These comparisons are made using the Pallas benchmark suite for MPI and a parallel data mining algorithm. This data mining technique, known as the Genetic Rule and Classifier Construction Environment (GRaCCE), uses a genetic algorithm to mine decision rules from data. Experimentation and statistical analysis produced three important conclusions. First, NT clusters are viable, cost-effective alternatives to Linux clusters, NOWs, and MPPs for parallel computing. Second, the two primary communication libraries currently available for NT, PaTENT MPI and MPIPro, are statistically equivalent in performance. Third, the parallel GRaCCE algorithm is capable of relatively good speedup and efficiency, even for significantly unbalanced processor workloads, if the effects of first-loop-iteration caching are ignored.

Subject Categories:

  • Computer Hardware

Distribution Statement:

APPROVED FOR PUBLIC RELEASE