Data Reduction System for Improving Classifier Performance
Patent application, Filed 18 Mar 99
DEPARTMENT OF THE NAVY WASHINGTON DC
Pagination or Media Count:
A data reduction method for a classification system using quantized feature vectors for each class with a plurality of features and levels. The reduction algorithm consisting of applying a Bayesian data reduction algorithm to the classification system for developing reduced feature vectors. Test data is then quantified into the reduced feature vectors. The reduced classification system is then tested using the quantized test data. A Bayesian data reduction algorithm is further provided having by computing an initial probability of error for the classification system. Adjacent levels are merged for each feature in the quantized feature vectors. Level based probabilities of error are then calculated for these merged levels among the plurality of features. The system then selects and applies the merged adjacent levels having the minimum level based probability of error to create an intermediate classification system. Steps of merging, selecting and applying are performed until either the probability of error stops improving or the features and levels are incapable of further reduction.