The Optimal Control of Partially Observable Markov Processes.
STANFORD UNIV CALIF STANFORD ELECTRONICS LABS
The report studies the control of a finite-state, discrete-time Markov process characterized by incomplete state observation. The process is viewed through a set of outputs such that the probability of observing a given output depends on the current state of the Markov process. The observed stochastic process, consisting of the time sequence of outputs generated by the imbedded Markov process, is termed a partially observable Markov process. A finite number of alternative parameter sets for the partially observable process are available. Associated with each alternative is a set of costs for making transitions between the states of the Markov process and for producing the various outputs. At each time period an observer must select a control alternative to minimize the total expected operating costs for the process. The thesis consists of two major sections. In the first section the state of the partially observable Markov process is proved to be the vector of state occupancy probabilities for the Markov process. Using this concept of state, an algorithm is developed to solve for the optimal control as a function of a finite operating time. The algorithm produces an exact solution for the optimal control over the complete state space of a general partially observable Markov process, and is applicable to both discounted and nondiscounted problems. The second section deals with the case of infinite operating time, and is subdivided into the cases of discounted and nondiscounted costs. (Author)
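The notion of state described in the abstract, the vector of state occupancy probabilities, can be illustrated with a single Bayes update. What follows is a minimal sketch and not the report's algorithm: the function name `belief_update`, the matrices `P` (transition probabilities) and `R` (output probabilities), and all numeric values are hypothetical, and the sketch assumes the output is generated by the state occupied after the transition.

```python
import numpy as np

def belief_update(belief, P, R, output):
    """Update the state occupancy probabilities after one observed output.

    belief : current occupancy probabilities, shape (n_states,)
    P      : P[i, j] = probability of moving from state i to state j
    R      : R[j, k] = probability of output k when the process is in state j
    output : index of the output actually observed
    Assumption: the output depends on the state reached after the transition.
    """
    predicted = belief @ P                    # occupancy after one transition
    unnormalized = predicted * R[:, output]   # weight by likelihood of the output
    total = unnormalized.sum()
    if total == 0.0:
        raise ValueError("observed output has zero probability under this belief")
    return unnormalized / total               # renormalize to a probability vector

# Hypothetical two-state, two-output process under one control alternative.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
R = np.array([[0.7, 0.3],
              [0.4, 0.6]])
belief = np.array([0.5, 0.5])

new_belief = belief_update(belief, P, R, output=1)
print(new_belief)
```

With one pair of `P` and `R` per control alternative, the same update would be applied using the matrices of whichever alternative was selected in that period.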
- Statistics and Probability
- Operations Research