Accession Number:

ADA454595

Title:

Steering Policies for Markov Decision Processes Under a Recurrence Condition

Descriptive Note:

Corporate Author:

MARYLAND UNIV COLLEGE PARK SYSTEMS RESEARCH CENTER

Personal Author(s):

Report Date:

1988-01-01

Pagination or Media Count:

28

Abstract:

This paper presents a class of adaptive policies for Markov decision processes (MDPs) with long-run average performance measures. Under a recurrence condition, the proposed policy alternates between two stationary policies so as to adaptively track a sample average cost to a desired value. Direct sample path arguments are used to investigate the convergence of the sample average costs, and the performance of the adaptive policy is discussed. The results are particularly useful for constrained MDPs with a single constraint. Applications include a wide class of constrained MDPs with finite state space (Beutler and Ross 1985), an optimal flow control problem (Ma and Makowski 1987), and an optimal resource allocation problem (Nain and Ross 1986).
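
The steering idea in the abstract can be illustrated with a minimal simulation sketch. Everything in it is an assumption made for illustration and is not taken from the report: the two-state chain, the transition probabilities, the cost of 1 per visit to state 1, the function name steer, and the rule of re-selecting the policy at every step. The report's own switching rule and recurrence condition may differ. In this toy chain the "low" policy has a long-run average cost of about 0.22 and the "high" policy about 0.73, so any target between those values should be trackable.

```python
import random


def steer(target, steps, seed=0):
    """Toy steering policy: track a target sample average cost.

    Hypothetical two-state chain (states 0 and 1), not from the report.
    Each stationary policy gives P(next state = 1 | current state)
    and a per-state cost.
    """
    rng = random.Random(seed)
    policies = {
        "low": {"p_to_1": [0.2, 0.3], "cost": [0.0, 1.0]},
        "high": {"p_to_1": [0.8, 0.7], "cost": [0.0, 1.0]},
    }
    state = 0
    total = 0.0
    for t in range(1, steps + 1):
        # Sample average cost over the first t - 1 steps.
        avg = total / (t - 1) if t > 1 else 0.0
        # Assumed rule: use the cheaper policy when above the target,
        # the costlier one otherwise.
        name = "low" if avg > target else "high"
        policy = policies[name]
        total += policy["cost"][state]
        # Move to the next state under the selected policy.
        state = 1 if rng.random() < policy["p_to_1"][state] else 0
    return total / steps


if __name__ == "__main__":
    print(steer(target=0.5, steps=200_000))
```

Run with a target between the two policies' average costs, the returned sample average should settle close to that target; a target outside that range cannot be reached by alternating between these two policies.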

Subject Categories:

  • Statistics and Probability

Distribution Statement:

APPROVED FOR PUBLIC RELEASE