FINITE STATE CONTINUOUS TIME MARKOV DECISION PROCESSES WITH A FINITE PLANNING HORIZON.
RAND CORP SANTA MONICA CALIF
The system considered may be in one of n states at any point in time, and its probability law is a Markov process that depends on the control policy chosen. The return to the system over a given planning horizon is the integral over that horizon of a return rate that depends on both the policy and the sample path of the process. The objective is to find a policy that maximizes the expected return over the given planning horizon. A necessary and sufficient condition for optimality is obtained, and a constructive proof is given that there is a piecewise constant policy which is optimal. A bound on the number of switch points (the points where the piecewise constant policy jumps) is obtained for the case where there are two states. (Author)
- Operations Research
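
The problem described in the abstract can be illustrated numerically. The sketch below is not the report's constructive proof. It assumes the standard dynamic-programming equation for this problem class, -dv_i/dt = max_a [ r_i(a) + sum_j q_ij(a) v_j(t) ] with v_i(T) = 0, and integrates it backward in time with a plain Euler scheme. The function names, the two-state rates and the return rates are all hypothetical and chosen only for illustration.

```python
def solve_finite_horizon_ctmdp(rates, rewards, horizon, steps):
    """Backward Euler-style sweep for a finite-state, finite-horizon CTMDP.

    rates[i][a][j]  : transition rate from state i to j under action a
                      (each row sums to zero); hypothetical input format.
    rewards[i][a]   : return rate earned in state i under action a.
    Returns (v, policy): v[i] approximates the optimal expected return from
    state i at time 0, and policy[k][i] is the greedy action in state i on
    the k-th time step (k = 0 is the earliest step).
    """
    n = len(rates)
    dt = horizon / steps
    v = [0.0] * n          # terminal condition v(T) = 0
    policy = []
    for _ in range(steps):  # march from t = T back to t = 0
        new_v, actions = [], []
        for i in range(n):
            best_a, best_val = 0, float("-inf")
            for a in range(len(rates[i])):
                val = rewards[i][a] + sum(rates[i][a][j] * v[j] for j in range(n))
                if val > best_val:
                    best_a, best_val = a, val
            new_v.append(v[i] + dt * best_val)
            actions.append(best_a)
        v = new_v
        policy.append(actions)
    policy.reverse()        # reorder so index 0 is the earliest time step
    return v, policy


def switch_times(policy, horizon):
    """Approximate times at which the greedy action vector changes."""
    dt = horizon / len(policy)
    return [k * dt for k in range(1, len(policy)) if policy[k] != policy[k - 1]]


# Hypothetical two-state example with two actions per state.
example_rates = [
    [[-1.0, 1.0], [-3.0, 3.0]],    # state 0: slow or fast move to state 1
    [[2.0, -2.0], [0.5, -0.5]],    # state 1: fast or slow return to state 0
]
example_rewards = [
    [1.0, 0.5],                    # state 0 return rates
    [3.0, 2.5],                    # state 1 return rates
]
values, greedy = solve_finite_horizon_ctmdp(example_rates, example_rewards, 1.0, 1000)
switches = switch_times(greedy, 1.0)
print("approximate values at t = 0:", values)
print("approximate switch times:", switches)
```

On this toy data the computed policy is constant between a small number of switch times, which matches the piecewise constant structure the abstract describes. The Euler step is the simplest possible choice; a finer grid or a higher-order integrator would locate the switch points more accurately.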