RECOGNITION OF CLASS MEMBERSHIP BY MEANS OF WEAK, STATISTICALLY DEPENDENT FEATURES.
Final report, May 1964-March 1966
RCA LABS PRINCETON N J
A method of automatic classification is developed for the case in which the features used to determine the class of an unknown object x are individually weak. The features are weak in the sense that any subset of the universe defined by a single feature value of x contains many objects belonging to a class different from that of x. The classes are defined by a small collection of examples, which are objects whose class membership and feature values are known. The basic problem in classification by example is the estimation of probabilities from a small number of examples. Ideally, the class of x should be determined by estimating the class probabilities in the subset J defined by the conjunction of all of the feature values of x. But J usually will contain no examples on which to base an estimate of the class probabilities. The recognition procedure offered here employs a subset D defined by the conjunction of some, but not all, of the feature values of x. It is postulated that the features are statistically dependent in such a way that the particular kind of dependence exhibited by the features defining D can be used to infer the class of x. This recognition procedure includes a hypothesis-testing procedure in which the competing hypotheses are related to the dependence of the features defining D and are only indirectly related to the classes to which x could possibly belong. In this approach the class of x can be decided without requiring that D contain any examples. These principles are demonstrated by experiments in character recognition.
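The sparsity problem described above can be illustrated with a small sketch. It does not implement the report's dependence-based hypothesis test, whose details the abstract does not give. It only shows, on made-up data, how the subset J (all feature values of x) can contain no examples, how a subset D built from fewer features may contain some, and what it means for each single feature to be weak. The toy examples, the binary features, the class labels "A" and "B", the choice of features for D, and the helper names subset and class_counts are all assumptions introduced for illustration.

```python
from collections import Counter

# Hypothetical toy data: each example is (feature_values, class_label).
EXAMPLES = [
    ((1, 0, 1, 0), "A"),
    ((1, 1, 0, 0), "A"),
    ((0, 1, 1, 1), "B"),
    ((1, 0, 0, 1), "B"),
    ((0, 0, 1, 1), "A"),
    ((1, 1, 1, 1), "B"),
]


def subset(examples, x, feature_indices):
    """Return the examples that agree with x on every listed feature."""
    return [
        (features, label)
        for features, label in examples
        if all(features[i] == x[i] for i in feature_indices)
    ]


def class_counts(examples, x, feature_indices):
    """Count class labels among the examples in the subset defined by x."""
    return Counter(label for _, label in subset(examples, x, feature_indices))


x = (1, 0, 1, 1)  # unknown object (illustrative)

# J: conjunction of all feature values of x.
J = class_counts(EXAMPLES, x, range(len(x)))

# D: conjunction of some, but not all, feature values (features 0 and 2,
# an arbitrary choice made here for illustration only).
D = class_counts(EXAMPLES, x, [0, 2])

# Subsets defined by one feature value each: a feature is "weak" when its
# subset still mixes examples from several classes.
single = {i: class_counts(EXAMPLES, x, [i]) for i in range(len(x))}

print("J:", dict(J))
print("D:", dict(D))
for i, counts in single.items():
    print("feature", i, dict(counts))
```

In this toy data J is empty, so no class probabilities can be estimated from it directly, while every single-feature subset contains both classes. The report's procedure goes further than this counting sketch: by its own description it can decide the class of x without requiring that D contain any examples.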