Accession Number : ADA268288


Title :   A Hierarchical Network of Provably Optimal Learning Control Systems: Extensions of the Associative Control Process (ACP) Network


Descriptive Note : Final rept. 1 Jun 91-1 Oct 92


Corporate Author : WRIGHT LAB WRIGHT-PATTERSON AFB OH


Personal Author(s) : Baird, Leemon C , III ; Klopf, A H


Full Text : https://apps.dtic.mil/dtic/tr/fulltext/u2/a268288.pdf


Report Date : Jan 1993


Pagination or Media Count : 35


Abstract : An associative control process (ACP) network is a learning control system that can reproduce a variety of animal learning results from classical and instrumental conditioning experiments (Klopf, Morgan, and Weaver, 1993; see also the article, 'A Hierarchical Network of Control Systems that Learn'). The ACP networks proposed and tested by Klopf, Morgan, and Weaver are not guaranteed, however, to learn optimal policies for maximizing reinforcement. Optimal behavior is guaranteed for a reinforcement learning system such as Q-learning (Watkins, 1989), but simple Q-learning is incapable of reproducing the animal learning results that ACP networks reproduce. We propose two new models that reproduce the animal learning results and are provably optimal. The first model, the modified ACP network, embodies the smallest number of changes to the ACP network necessary to guarantee that optimal policies will be learned while still reproducing the animal learning results. The second model, the single-layer ACP network, embodies the smallest number of changes to Q-learning necessary to guarantee that it reproduces the animal learning results while still learning optimal policies.
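
Illustrative note (not part of the report): the abstract refers to simple Q-learning (Watkins, 1989) as the baseline whose learned policy is optimal. The sketch below shows plain tabular Q-learning only; it is not the ACP network, the modified ACP network, or the single-layer ACP network. The five-state chain environment, the function names `q_learning` and `greedy_policy`, and all parameter values are assumptions chosen for illustration.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical deterministic chain.

    States are 0..n_states-1, actions are 0 (left) and 1 (right).
    Entering the last state ends the episode with reward 1.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy choice; ties are broken at random.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            # Deterministic transition; moving left from state 0 stays put.
            s2 = max(s - 1, 0) if a == 0 else s + 1
            r = 1.0 if s2 == goal else 0.0
            # One-step target: no bootstrapping from the terminal state.
            target = r if s2 == goal else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
            s = s2
    return q


def greedy_policy(q):
    """Greedy action for each non-terminal state."""
    return [0 if row[0] > row[1] else 1 for row in q[:-1]]
```

In this assumed environment the greedy policy read from the learned table moves right in every non-terminal state, which is the reinforcement-maximizing behavior for the chain.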


Descriptors :   *CONTROL SYSTEMS , *LEARNING , REPRINTS , REINFORCEMENT(STRUCTURES) , NETWORKS , HIERARCHIES


Subject Categories : Psychology


Distribution Statement : APPROVED FOR PUBLIC RELEASE