Accession Number : AD1018376

Title :   A Semi-supervised Heat Kernel Pagerank MBO Algorithm for Data Classification

Descriptive Note : Technical Report

Corporate Author : University of California, Los Angeles Los Angeles United States

Personal Author(s) : Merkurjev,Ekatherina ; Bertozzi,Andrea L ; Chung,Fan

Full Text :

Report Date : 01 Jul 2016

Pagination or Media Count : 20

Abstract : We present a very efficient semi-supervised graph-based algorithm for classification of high-dimensional data that is motivated by the MBO method of Garcia-Cardona (2014) and derived using the similarity graph. Our procedure is an elegant combination of heat kernel page rank and the MBO method applied to study semi-supervised problems. The timing of our algorithm is highly dependent on how quickly the page rank can be computed; we use two different yet very efficient approaches to calculate the page rank, one of which proceeds by simulating random walks of bounded length. Overall, our method is advantageous for very big, sparse data, in which the graph has few edges, and it produces good accuracy even if the number of labeled instances is very small. In fact, the accuracy of the procedure is comparable with or better than that of state-of-the-art methods and is demonstrated on benchmark data sets. In addition to experimental results, we include a thorough comparison of our algorithm to that of Garcia-Cardona (2014) and describe the advantages of both methods.

Descriptors :   algorithms , classification , kernel functions , rank order statistics , learning machines , computer vision , clustering , relational data bases , graphs , network architecture , probability , matrices (mathematics) , data mining

Distribution Statement : APPROVED FOR PUBLIC RELEASE