Accession Number : ADA584084


Title :   Parallel and Distributed Systems for Probabilistic Reasoning


Descriptive Note : Doctoral thesis


Corporate Author : CARNEGIE-MELLON UNIV PITTSBURGH PA MACHINE LEARNING DEPT


Personal Author(s) : Gonzalez, Joseph


Full Text : https://apps.dtic.mil/dtic/tr/fulltext/u2/a584084.pdf


Report Date : Dec 2012


Pagination or Media Count : 183


Abstract : Scalable probabilistic reasoning is the key to unlocking the full potential of the age of big data. From untangling the biological processes that govern cancer to effectively targeting products and advertisements, probabilistic reasoning is how we make sense of noisy data and turn information into understanding and action. Unfortunately, the algorithms and tools for sophisticated structured probabilistic reasoning were developed for the sequential Von Neumann architecture and have therefore been unable to scale with big data. In this thesis we propose a simple set of design principles to guide the development of new parallel and distributed algorithms and systems for scalable probabilistic reasoning. We then apply these design principles to develop a series of new algorithms for inference in probabilistic graphical models and derive theoretical tools to characterize the parallel properties of statistical inference. We implement and assess the efficiency and scalability of the new inference algorithms in the multicore and distributed settings demonstrating the substantial gains from applying the thesis methodology to real-world probabilistic reasoning. Based on the lessons learned in statistical inference we introduce the GraphLab parallel abstraction which generalizes the thesis methodology and enable the rapid development of new efficient and scalable parallel and distributed algorithms for probabilistic reasoning. We demonstrate how the GraphLab abstraction can be used to rapidly develop new scalable algorithms for probabilistic reasoning and assess their performance on real-world problems in both the multicore and distributed settings. Finally, we identify a unique challenge associated with the underlying graphical structure in a wide range of probabilistic reasoning tasks. To address this challenge we introduce PowerGraph which refines the GraphLab abstraction and achieves orders of magnitude improvements in performance relative to existing systems.


Descriptors :   *ALGORITHMS , PROBABILITY , REASONING , THESES


Subject Categories : Numerical Mathematics


Distribution Statement : APPROVED FOR PUBLIC RELEASE