Large-Scale Linear Optimization through Machine Learning: From Theory to Practical System Design and Implementation
Technical Report, 28 May 2014 – 27 May 2016
Korea Advanced Institute of Science and Technology, Taejon, South Korea
Linear programming (LP) is one of the most widely used optimization tools in data analytics as well as in various scientific fields. However, current state-of-the-art algorithms suffer from scalability issues when processing Big Data: for example, the commercial optimization software IBM CPLEX cannot handle an LP with more than hundreds of thousands of variables or constraints. Existing algorithms are fundamentally hard to scale because they are too complex to parallelize. To address this issue, we study the possibility of using the Belief Propagation (BP) algorithm as an LP solver. BP has shown remarkable performance on various machine learning tasks, and it naturally lends itself to fast parallel implementations. Despite this, very little work has been done in this direction. In particular, while it is generally believed that BP implicitly solves an optimization problem, it is not well understood under what conditions the BP solution converges to that of the corresponding LP formulation.

Our efforts consist of two main parts. First, we perform a theoretical study and establish the conditions under which BP can solve LP [1, 2]. Although several prior works have studied the relation between BP and LP for particular instances, our work provides a generic condition that unifies all prior results for generic LPs. Second, building on our theoretical results, we develop a practical BP-based parallel algorithm for solving generic LPs; it achieves a 71x speedup while sacrificing only 0.1% accuracy compared to the state-of-the-art exact algorithm. As a result of this study, the PIs have published two conference papers, and two follow-up journal papers are under submission. We refer the reader to our published work for details.
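To make the BP–LP correspondence concrete, the following is a minimal sketch (not the report's actual algorithm) of min-sum belief propagation on a small chain-structured MAP problem. On tree graphs the LP relaxation of the MAP problem is tight, so BP's fixed point recovers the LP optimum — the simplest instance of the kind of correspondence studied here. All problem sizes and cost values below are made up for illustration.

```python
import itertools
import numpy as np

# Min-sum BP on a chain MRF: minimize
#   sum_i unary[i][x_i] + sum_i pair[i][x_i][x_{i+1}]
# On a tree/chain this coincides with the LP-relaxation optimum.
rng = np.random.default_rng(0)
n, k = 5, 3                                # 5 variables, 3 states each (arbitrary)
unary = rng.uniform(0, 1, (n, k))          # node costs theta_i(x_i)
pair = rng.uniform(0, 1, (n - 1, k, k))    # edge costs theta_ij(x_i, x_j)

# Forward min-sum messages: fwd[i][x] = cost of best prefix ending in state x.
fwd = np.zeros((n, k))
for i in range(1, n):
    fwd[i] = np.min(fwd[i - 1][:, None] + unary[i - 1][:, None] + pair[i - 1], axis=0)

# Backward min-sum messages: bwd[i][x] = cost of best suffix starting in state x.
bwd = np.zeros((n, k))
for i in range(n - 2, -1, -1):
    bwd[i] = np.min(bwd[i + 1][None, :] + unary[i + 1][None, :] + pair[i], axis=1)

# Beliefs are min-marginals; decoding each node independently gives the MAP.
beliefs = unary + fwd + bwd
assignment = beliefs.argmin(axis=1)
bp_cost = unary[np.arange(n), assignment].sum() + sum(
    pair[i, assignment[i], assignment[i + 1]] for i in range(n - 1)
)

# Brute-force check that BP's decoded assignment attains the global minimum.
best = min(
    sum(unary[i, x[i]] for i in range(n))
    + sum(pair[i, x[i], x[i + 1]] for i in range(n - 1))
    for x in itertools.product(range(k), repeat=n)
)
assert abs(bp_cost - best) < 1e-9
```

Note that each forward (or backward) message depends only on its neighbor's message, which is what makes BP amenable to the parallel implementations discussed above; the hard theoretical question the report addresses is when this convergence-to-LP behavior extends beyond trees to generic LPs.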