Accession Number:

ADA318671

Title:

Convergence Behavior of Temporal Difference Learning.

Descriptive Note:

Final rept.

Corporate Author:

WRIGHT LAB WRIGHT-PATTERSON AFB OH AVIONICS DIRECTORATE

Personal Author(s):

Report Date:

1996-05-01

Pagination or Media Count:

9

Abstract:

Temporal difference learning is an important class of incremental learning procedures that learn to predict the outcomes of sequential processes through experience. Although these algorithms have been used in a number of notable intelligent systems, such as Samuel's checker player and Tesauro's backgammon program, their convergence properties remain poorly understood. This paper provides a brief summary of the theoretical basis for these algorithms and documents their observed convergence performance in a variety of experiments. The implications of these results are also briefly discussed.
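
For context, the following is a minimal sketch of the tabular TD(0) prediction update that this family of algorithms is built on; it is an illustrative assumption on my part, not a reproduction of the specific algorithms or experiments studied in the report. The helper names (td0_prediction, random_walk_episode) and the random-walk test problem are hypothetical examples.

```python
import random

def td0_prediction(episodes, alpha=0.1, gamma=1.0):
    """Tabular TD(0): learn to predict returns from observed transitions.

    NOTE: illustrative sketch only; not the report's implementation.
    Each episode is a list of (state, reward, next_state) tuples,
    with next_state = None at the terminal step.
    """
    V = {}  # state -> predicted return
    for episode in episodes:
        for state, reward, next_state in episode:
            v_next = V.get(next_state, 0.0) if next_state is not None else 0.0
            # TD error: bootstrapped target minus current prediction
            td_error = reward + gamma * v_next - V.get(state, 0.0)
            V[state] = V.get(state, 0.0) + alpha * td_error
    return V

def random_walk_episode(n_states=5):
    """Generate one episode of a simple bounded random walk
    (a hypothetical stand-in for a sequential prediction task)."""
    s = n_states // 2
    transitions = []
    while True:
        s_next = s + random.choice([-1, 1])
        if s_next < 0:
            transitions.append((s, 0.0, None))   # left terminal, reward 0
            return transitions
        if s_next >= n_states:
            transitions.append((s, 1.0, None))   # right terminal, reward 1
            return transitions
        transitions.append((s, 0.0, s_next))
        s = s_next

# Example use: estimate state values from 1000 simulated episodes.
V = td0_prediction([random_walk_episode() for _ in range(1000)])
```
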

Subject Categories:

  • Cybernetics
  • Operations Research

Distribution Statement:

APPROVED FOR PUBLIC RELEASE