Accession Number:
ADA318671
Title:
Convergence Behavior of Temporal Difference Learning.
Descriptive Note:
Final rept.,
Corporate Author:
WRIGHT LAB WRIGHT-PATTERSON AFB OH AVIONICS DIRECTORATE
Personal Author(s):
Report Date:
1996-05-01
Pagination or Media Count:
9
Abstract:
Temporal difference learning is an important class of incremental learning procedures that learn to predict the outcomes of sequential processes through experience. Although these algorithms have been used in a variety of well-known intelligent systems, such as Samuel's checkers player and Tesauro's backgammon program, their convergence properties remain poorly understood. This paper provides a brief summary of the theoretical basis for these algorithms and documents observed convergence performance in a variety of experiments. The implications of these results are also briefly discussed.
Descriptors:
Subject Categories:
- Cybernetics
- Operations Research