Stochastic Algorithms for Learning with Incomplete Data: An Application to Bayesian Networks
AIR FORCE INST OF TECH WRIGHT-PATTERSON AFB OH
A goal of machine learning is to develop intelligent systems that improve with experience. To achieve this goal, the field has drawn on many diverse disciplines; because of this diversity, there is no common framework within which the field can develop a research vision. This research begins by proposing a machine learning framework, from which three hypotheses pertaining to learning Bayesian networks are derived. The current state-of-the-art paradigm for inducing Bayesian networks from incomplete data relies on deterministic greedy hill-climbing algorithms, which become trapped at the nearest local maximum; to work around this problem, researchers use random restarts. The first hypothesis is that stochastic population-based algorithms find networks with good predictive performance without getting stuck at the nearest local maximum. I demonstrate this hypothesis with an evolutionary algorithm and a Markov chain Monte Carlo algorithm. A drawback of the Markov chain Monte Carlo algorithm is that it converges slowly to its stationary distribution. The second hypothesis developed from the framework is that using global information to propose new states speeds convergence. Finally, because the population-based algorithms have multiple models readily available, I explore the hypothesis that multiple models have better predictive capability than the single best model. I demonstrate these hypotheses empirically on three carefully selected datasets.
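The contrast the abstract draws between greedy hill-climbing and stochastic search can be illustrated with a minimal sketch. The score function and one-dimensional state space below are toy stand-ins, not the dissertation's Bayesian-network scoring or its actual evolutionary or MCMC algorithms: a deterministic hill-climber halts at the nearest peak, while a Metropolis-style sampler occasionally accepts downhill moves and can cross the valley to a better one.

```python
import math
import random

def score(x):
    # Toy multimodal objective: a local maximum near x = 2 and a
    # higher global maximum near x = 8 (stand-in for a network score).
    return math.exp(-((x - 2) ** 2) / 8) + 2.0 * math.exp(-((x - 8) ** 2) / 8)

def greedy_hill_climb(x):
    # Deterministic greedy search: move to the best neighbor until no
    # neighbor improves the score -- it stops at the nearest local maximum.
    while True:
        best = max((x - 1, x, x + 1), key=score)
        if best == x:
            return x
        x = best

def metropolis_search(x, steps=5000, seed=0):
    # Stochastic (Metropolis) search: downhill proposals are accepted with
    # probability score(x')/score(x), so the chain can escape local maxima.
    rng = random.Random(seed)
    best = x
    for _ in range(steps):
        proposal = x + rng.choice((-1, 1))
        if rng.random() < min(1.0, score(proposal) / score(x)):
            x = proposal
        if score(x) > score(best):
            best = x
    return best

print(greedy_hill_climb(0))   # halts at the inferior peak
print(metropolis_search(0))   # can cross the valley to the better peak
```

Starting both searches at the same point shows the behavior the abstract describes: the greedy climber returns the nearer, lower peak, while the stochastic search reaches a state with a strictly higher score.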
- Operations Research