Approximate Dynamic Programming Algorithms for United States Air Force Officer Sustainment
AIR FORCE INSTITUTE OF TECHNOLOGY WRIGHT-PATTERSON AFB OH GRADUATE SCHOOL OF ENGINEERING AND MANAGEMENT
The United States Air Force (USAF) officer sustainment system involves making accession and promotion decisions for nearly 64,000 officers annually. We formulate a discrete-time stochastic Markov decision process model to examine this military workforce planning problem. The large size of the motivating problem suggests that conventional exact dynamic programming algorithms are inappropriate. As such, we propose two approximate dynamic programming (ADP) algorithms to solve the problem. We employ a least-squares approximate policy iteration (API) algorithm with instrumental variables Bellman error minimization to determine approximate policies. In this API algorithm, we use a modified version of the Bellman equation based on the post-decision state variable. Approximating the value function using a post-decision state variable allows us to find the best policy for a given approximation using a decomposable mixed-integer nonlinear programming formulation. We also propose an approximate value iteration algorithm using concave adaptive value estimation (CAVE). The CAVE algorithm identifies an improved policy for a test problem based on the current USAF officer sustainment system. The CAVE algorithm obtains a statistically significant 2.8% improvement over the currently employed USAF policy, which serves as the benchmark.
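As an illustration of the policy-evaluation step mentioned above, the following is a minimal sketch of least-squares value estimation with instrumental variables for a fixed policy, assuming a linear value function approximation. It is not the report's implementation: the function name `lstd_iv`, the toy two-state chain, the discount factor, and the use of the basis functions themselves as instruments are all assumptions made for the example. In the report's setting the features would be functions of the post-decision state; here they are generic feature rows.

```python
import numpy as np

def lstd_iv(phi, phi_next, rewards, gamma):
    """Estimate theta in V(s) ~ phi(s) @ theta for a fixed policy.

    Hypothetical helper: rows of phi are features of sampled states,
    rows of phi_next are features of the states that followed them.
    """
    phi = np.asarray(phi, dtype=float)
    phi_next = np.asarray(phi_next, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    # With the features used as instruments, the estimate solves
    # [Phi^T (Phi - gamma * Phi')] theta = Phi^T r.
    a = phi.T @ (phi - gamma * phi_next)
    b = phi.T @ rewards
    return np.linalg.solve(a, b)

# Toy data (assumed): a deterministic two-state chain with one-hot features.
# State 0 moves to state 1 and earns 1; state 1 moves to state 0 and earns 0.
phi = np.eye(2)
phi_next = np.array([[0.0, 1.0], [1.0, 0.0]])
rewards = np.array([1.0, 0.0])

theta = lstd_iv(phi, phi_next, rewards, gamma=0.9)
print(theta)  # estimated values of state 0 and state 1
```

Because the toy features are one-hot, the estimate coincides with the exact solution of the two Bellman equations for this chain, which makes the sketch easy to check by hand. A full API algorithm would alternate this evaluation step with a policy improvement step.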
- Personnel Management and Labor Relations
- Numerical Mathematics
- Military Forces and Organizations