A MODIFIED DYNAMIC PROGRAMMING METHOD FOR MARKOVIAN DECISION PROBLEMS
CALIFORNIA UNIV LOS ANGELES WESTERN MANAGEMENT SCIENCE INST
At the beginning of each period, a system is in a certain state. Given the state, the choice of an action determines the income for that period and the transition probabilities for moving to the next state. The problem is to choose, at the beginning of each period, the action that maximizes total discounted future income. A modified dynamic programming method for this problem is described. The method improves error control by showing how to compute upper and lower bounds on the optimal return that are, respectively, monotone decreasing and monotone increasing, and that converge to the optimal return. The convergence appears to be quite rapid.
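The kind of bounding scheme described in the abstract can be illustrated with a minimal sketch. It assumes ordinary successive approximation (value iteration) with a discount factor beta strictly between 0 and 1, and it brackets the optimal return using the smallest and largest change between consecutive iterates, scaled by beta / (1 - beta). This is a standard construction and may differ in detail from the report's own method. The function names, the two-state example, and all of its numbers are hypothetical and chosen only for illustration.

```python
def bellman_step(rewards, trans, beta, v):
    """Apply one dynamic programming step:
    (Tv)(s) = max over a of [ r(s, a) + beta * sum_j p(j | s, a) * v(j) ]."""
    new_v = []
    for s in range(len(v)):
        best = max(
            rewards[s][a] + beta * sum(p * vj for p, vj in zip(trans[s][a], v))
            for a in range(len(rewards[s]))
        )
        new_v.append(best)
    return new_v


def bounded_value_iteration(rewards, trans, beta, tol=1e-8, max_iter=10_000):
    """Iterate from v = 0 and record (lower, upper) bounds on the optimal
    return after every step. Stops once the bounds are within tol."""
    v = [0.0] * len(rewards)
    scale = beta / (1.0 - beta)
    history = []
    for _ in range(max_iter):
        new_v = bellman_step(rewards, trans, beta, v)
        diffs = [n - o for n, o in zip(new_v, v)]
        # Shift the new iterate by the scaled extreme changes to get bounds.
        lower = [n + scale * min(diffs) for n in new_v]
        upper = [n + scale * max(diffs) for n in new_v]
        history.append((lower, upper))
        v = new_v
        if scale * (max(diffs) - min(diffs)) < tol:
            break
    return history


# Hypothetical problem: rewards[s][a] is the income in state s under action a,
# and trans[s][a][j] is the probability of moving from s to j under a.
rewards = [[6.0, 4.0], [-3.0, -5.0]]
trans = [
    [[0.5, 0.5], [0.8, 0.2]],
    [[0.4, 0.6], [0.7, 0.3]],
]
beta = 0.9

history = bounded_value_iteration(rewards, trans, beta)
final_lower, final_upper = history[-1]
print("iterations:", len(history))
print("lower bounds:", final_lower)
print("upper bounds:", final_upper)
```

In this sketch the gap between the two bounds is the same in every state, so a single number gives the error of the current estimate. This allows the iteration to stop as soon as the bracket is tight enough, without relying on a worst-case iteration count.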
- Operations Research
- Logistics, Military Facilities and Supplies