Efficient Matrix Models for Relational Learning
CARNEGIE-MELLON UNIV PITTSBURGH PA MACHINE LEARNING DEPT
Relational learning deals with settings where one has multiple sources of data, each describing different properties of the same set of entities. We are concerned primarily with settings where the properties are pairwise relations between entities and attributes of entities. We want to predict the values of relations and attributes, but relations between entities violate the basic statistical assumption of exchangeable data points (entities). Furthermore, we desire models that scale gracefully as the number of entities and relations increases. This thesis rests on two claims: (i) that Collective Matrix Factorization can effectively integrate different sources of data to improve prediction, and (ii) that training scales well as the number of entities and observations increases. We consider two real-world data sets in experimental support of these claims: augmented collaborative filtering and augmented brain imaging. In augmented collaborative filtering, we show that genre information about movies can be used to increase the predictive accuracy of users' ratings. In augmented brain imaging, we show that word co-occurrence information can be used to increase the predictive accuracy of a model of changes in brain activity in response to word stimuli, even in regions of the brain that were never included in the training data.
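The core idea of Collective Matrix Factorization is that when two relations share an entity type (e.g., movies appear in both a user-movie ratings matrix and a movie-genre matrix), both relations are factorized with one shared low-rank factor for that entity, so evidence from each relation informs the other. The sketch below illustrates this on hypothetical toy data using alternating least squares; the sizes, the ridge term, and the noiseless synthetic matrices are assumptions for illustration, not the thesis's actual training procedure (which uses per-row Newton-style updates and other loss functions).

```python
import numpy as np

# A minimal sketch of Collective Matrix Factorization, assuming a toy
# noiseless setup: X (users x movies) and Y (movies x genres) are
# generated from one shared latent movie factor, so sharing helps.
rng = np.random.default_rng(0)
n_users, n_movies, n_genres, k = 20, 15, 5, 3

V_true = rng.normal(size=(n_movies, k))
X = rng.normal(size=(n_users, k)) @ V_true.T   # ratings relation
Y = V_true @ rng.normal(size=(k, n_genres))    # genre relation

# Factor both matrices with a SHARED movie factor V:
#   X ~ U V^T   and   Y ~ V W^T
lam = 1e-3                     # small ridge regularizer (assumed value)
I = lam * np.eye(k)
V = rng.normal(size=(n_movies, k))
for _ in range(50):
    # Ridge regression updates; each solve uses a symmetric k x k system.
    U = np.linalg.solve(V.T @ V + I, (X @ V).T).T
    W = np.linalg.solve(V.T @ V + I, (Y.T @ V).T).T
    # The shared factor V pools normal-equation terms from BOTH
    # relations -- this is where the data integration happens.
    V = np.linalg.solve(U.T @ U + W.T @ W + I, (X.T @ U + Y @ W).T).T

err_x = np.linalg.norm(U @ V.T - X) / np.linalg.norm(X)
err_y = np.linalg.norm(V @ W.T - Y) / np.linalg.norm(Y)
print(err_x, err_y)            # both relative errors become small
```

Because V must reconstruct both relations at once, a movie with few ratings can still be placed sensibly in the latent space via its genres, which is the mechanism behind the augmented collaborative filtering result above.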