Approximate Dynamic Programming for the United States Air Force Officer Manpower Planning Problem
Abstract:
The United States Air Force (USAF) makes officer accession and promotion decisions annually. Optimal manpower planning of the commissioned officer corps is vital: a manpower system that is neither over-manned nor under-manned is the most cost-effective. The Air Force Officer Manpower Planning Problem (AFO-MPP) is introduced, which models officer accessions, promotions, and the uncertainty in retention rates. The objective for the AFO-MPP is to identify the policy for accession and promotion decisions that minimizes the expected total discounted cost of maintaining the required number of officers in the system over an infinite time horizon. The AFO-MPP is formulated as an infinite-horizon Markov decision problem, and a policy is found using approximate dynamic programming. A least-squares temporal differencing (LSTD) algorithm is employed to determine the best approximate policies possible. Six computational experiments are conducted with varying retention rates and officer manning starting conditions. The LSTD algorithm results are compared to the benchmark policy (i.e., the policy currently practiced by the USAF). Results indicate that
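For readers unfamiliar with LSTD, the following is a minimal sketch of the core computation: fitting linear value-function weights from sampled state transitions by solving the LSTD normal equations. All data here (feature dimensions, random transitions, costs) are hypothetical placeholders, not the paper's actual AFO-MPP model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical problem sizes for illustration only.
n_samples, n_features = 200, 5
gamma = 0.95  # discount factor for the infinite-horizon discounted cost

phi = rng.normal(size=(n_samples, n_features))       # basis features of state s_t
phi_next = rng.normal(size=(n_samples, n_features))  # basis features of s_{t+1}
cost = rng.normal(size=n_samples)                    # observed one-step costs

# LSTD solves A w = b, where
#   A = Phi^T (Phi - gamma * Phi'),   b = Phi^T c
A = phi.T @ (phi - gamma * phi_next)
b = phi.T @ cost
w = np.linalg.solve(A, b)

# The approximate value of a state s is then phi(s) @ w, which a policy
# can use to rank candidate accession/promotion decisions.
print(w.shape)
```

In practice the transitions would be sampled by simulating the manpower system under a policy, and the fitted weights would feed back into policy improvement; this sketch shows only the least-squares fit itself.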