Statistical Sensitive Data Protection and Inference Prevention with Decision Tree Methods
NAVAL RESEARCH LAB WASHINGTON DC
We present a new approach for protecting sensitive data in a relational table (columns: attributes; rows: records). If unauthorized users can infer sensitive data from non-sensitive data, we have the inference problem. We consider inference as correct classification and approach it with decision tree methods. As in our previous work, sensitive data are viewed as the classes of the test data, and non-sensitive data are the remaining attribute values. In general, however, sensitive data may not be associated with a single attribute (i.e., the class) but may be distributed among many attributes. We present a generalized decision tree method for distributed sensitive data. This method takes each attribute in turn as the class and analyzes the corresponding classification error. Attribute values that maximize an integrated error measure are selected for modification. Our analysis shows that modified attribute values can be restored and hence that sensitive data are not securely protected. This result implies that modified values must themselves be subjected to protection. We present methods for this ramified protection problem and also discuss other statistical attacks.
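The idea of taking each attribute in turn as the class can be illustrated with a minimal Python sketch. It is not the report's algorithm. The tree builder, the misclassification-count split criterion, the leave-one-out evaluation, and all function names (majority, build_tree, predict, classification_errors) are assumptions chosen for illustration. The integrated error measure mentioned in the abstract is not defined there, so the sketch stops at per-attribute classification error.

```python
from collections import Counter


def majority(labels):
    """Return the most common label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def build_tree(rows, target, features):
    """Grow a small decision tree predicting `target` from `features`.

    Assumption: rows are dicts of categorical, non-tuple values, and the
    split criterion is the misclassification count (not the report's).
    """
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1 or not features:
        return majority(labels)  # leaf node: a plain label

    def split_error(feature):
        # Records misclassified if each branch predicts its majority label.
        groups = {}
        for r in rows:
            groups.setdefault(r[feature], []).append(r[target])
        return sum(len(g) - Counter(g).most_common(1)[0][1]
                   for g in groups.values())

    best = min(features, key=split_error)
    rest = [f for f in features if f != best]
    parts = {}
    for r in rows:
        parts.setdefault(r[best], []).append(r)
    children = {v: build_tree(p, target, rest) for v, p in parts.items()}
    # Internal node: (feature, children, fallback label for unseen values).
    return (best, children, majority(labels))


def predict(tree, row):
    """Walk the tree until a leaf label is reached."""
    while isinstance(tree, tuple):
        feature, children, default = tree
        tree = children.get(row[feature], default)
    return tree


def classification_errors(rows):
    """Treat each attribute in turn as the class.

    Returns the leave-one-out classification error per attribute.
    Requires at least two rows. A low error means the attribute is
    easy to infer from the remaining attributes.
    """
    attributes = list(rows[0])
    errors = {}
    for target in attributes:
        features = [a for a in attributes if a != target]
        misses = 0
        for i, held_out in enumerate(rows):
            tree = build_tree(rows[:i] + rows[i + 1:], target, features)
            misses += predict(tree, held_out) != held_out[target]
        errors[target] = misses / len(rows)
    return errors
```

Calling classification_errors on a table given as a list of dictionaries, one per record, returns one error rate per attribute. Under the assumptions above, attributes with low error are the ones an unauthorized user could most easily infer, so they are the natural candidates for further protection.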
- Information Science
- Computer Systems Management and Standards